Sample records for constructed-response cr items

  1. Dynamic Testing of Analogical Reasoning in 5- to 6-Year-Olds: Multiple-Choice versus Constructed-Response Training Items

    ERIC Educational Resources Information Center

    Stevenson, Claire E.; Heiser, Willem J.; Resing, Wilma C. M.

    2016-01-01

    Multiple-choice (MC) analogy items are often used in cognitive assessment. However, in dynamic testing, where the aim is to provide insight into potential for learning and the learning process, constructed-response (CR) items may be of benefit. This study investigated whether training with CR or MC items leads to differences in the strategy…

  2. Single- versus Double-Scoring of Trend Responses in Trend Score Equating with Constructed-Response Tests. Research Report. ETS RR-10-12

    ERIC Educational Resources Information Center

    Tan, Xuan; Ricker, Kathryn L.; Puhan, Gautam

    2010-01-01

    This study examines the differences in equating outcomes between two trend score equating designs resulting from two different scoring strategies for trend scoring when operational constructed-response (CR) items are double-scored--the single group (SG) design, where each trend CR item is double-scored, and the nonequivalent groups with anchor…

  3. Measurement Properties of Two Innovative Item Formats in a Computer-Based Test

    ERIC Educational Resources Information Center

    Wan, Lei; Henly, George A.

    2012-01-01

    Many innovative item formats have been proposed over the past decade, but little empirical research has been conducted on their measurement properties. This study examines the reliability, efficiency, and construct validity of two innovative item formats--the figural response (FR) and constructed response (CR) formats used in a K-12 computerized…

  4. The Effect of Year-to-Year Rater Variation on IRT Linking

    ERIC Educational Resources Information Center

    Yen, Shu Jing; Ochieng, Charles; Michaels, Hillary; Friedman, Greg

    2005-01-01

    Year-to-year rater variation may result in constructed response (CR) parameter changes, making CR items inappropriate to use in anchor sets for linking or equating. This study demonstrates how rater severity affected the writing and reading scores. Rater adjustments were made to statewide results using an item response theory (IRT) methodology…

  5. The Assignment of Raters to Items: Controlling for Rater Effects.

    ERIC Educational Resources Information Center

    Sykes, Robert C.; Heidorn, Mark; Lee, Guemin

    A study was conducted to evaluate the effect of different modes (modalities) of assigning raters to test items. The impact on total constructed response (c.r.) score, and subsequently on total test score, of assigning a single versus multiple raters to an examination reading of a student's set of c.r. responses was evaluated for several mixed-item…

  6. Multiple-Choice versus Constructed-Response Tests in the Assessment of Mathematics Computation Skills.

    ERIC Educational Resources Information Center

    Gadalla, Tahany M.

    The equivalence of multiple-choice (MC) and constructed response (discrete) (CR-D) response formats as applied to mathematics computation at grade levels two to six was tested. The difference between total scores from the two response formats was tested for statistical significance, and the factor structure of items in both response formats was…

  7. Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

    ERIC Educational Resources Information Center

    Wang, Wei

    2013-01-01

    Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

  8. Examining Two Strategies to Link Mixed-Format Tests Using Multiple-Choice Anchors. Research Report. ETS RR-10-18

    ERIC Educational Resources Information Center

    Walker, Michael E.; Kim, Sooyeon

    2010-01-01

    This study examined the use of an all multiple-choice (MC) anchor for linking mixed format tests containing both MC and constructed-response (CR) items, in a nonequivalent groups design. An MC-only anchor could effectively link two such test forms if either (a) the MC and CR portions of the test measured the same construct, so that the MC anchor…

  9. Applying Item Response Theory methods to design a learning progression-based science assessment

    NASA Astrophysics Data System (ADS)

    Chen, Jing

    Learning progressions are used to describe how students' understanding of a topic progresses over time and to classify the progress of students into steps or levels. This study applies Item Response Theory (IRT) based methods to investigate how to design learning progression-based science assessments. The research questions of this study are: (1) how to use items in different formats to classify students into levels on the learning progression, (2) how to design a test to give good information about students' progress through the learning progression of a particular construct and (3) what characteristics of test items support their use for assessing students' levels. Data used for this study were collected from 1500 elementary and secondary school students during 2009--2010. The written assessment was developed in several formats such as the Constructed Response (CR) items, Ordered Multiple Choice (OMC) and Multiple True or False (MTF) items. The following are the main findings from this study. The OMC, MTF and CR items might measure different components of the construct. A single construct explained most of the variance in students' performances. However, additional dimensions in terms of item format can explain a certain amount of the variance in student performance. So additional dimensions need to be considered when we want to capture the differences in students' performances on different types of items targeting the understanding of the same underlying progression. Items in each item format need to be improved in certain ways to classify students more accurately into the learning progression levels. This study establishes some general steps that can be followed to design other learning progression-based tests as well. For example, first, the boundaries between levels on the IRT scale can be defined by using the means of the item thresholds across a set of good items.
Second, items in multiple formats can be selected to achieve the information criterion at all the defined boundaries. This ensures the accuracy of the classification. Third, when item threshold parameters vary a bit, the scoring rubrics and the items need to be reviewed to make the threshold parameters similar across items. This is because one important design criterion of the learning progression-based items is that ideally, a student should be at the same level across items, which means that the item threshold parameters (d1, d2 and d3) should be similar across items. To design a learning progression-based science assessment, we need to understand whether the assessment measures a single construct or several constructs and how items are associated with the constructs being measured. Results from dimension analyses indicate that items of different carbon transforming processes measure different aspects of the carbon cycle construct. However, items of different practices assess the same construct. In general, there are high correlations among different processes or practices. It is not clear whether the strong correlations are due to the inherent links among these process/practice dimensions or due to the fact that the student sample does not show much variation in these process/practice dimensions. Future data are needed to examine the dimensionalities in terms of process/practice in detail. Finally, based on item characteristics analysis, recommendations are made to write more discriminative CR items and better OMC and MTF options. Item writers can follow these recommendations to write better learning progression-based items.
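
    The first design step in the abstract above (level boundaries defined as the means of the item thresholds across items) can be illustrated with a minimal sketch. The threshold values, the function names, and the rule for assigning a level from an ability estimate are assumptions made for illustration; they are not taken from the study.

```python
from statistics import mean


def level_boundaries(item_thresholds):
    """Average each threshold position (d1, d2, d3, ...) across items."""
    n = len(item_thresholds[0])
    if any(len(t) != n for t in item_thresholds):
        raise ValueError("every item needs the same number of thresholds")
    return [mean(t[k] for t in item_thresholds) for k in range(n)]


def classify(theta, boundaries):
    """Assumed rule: level = 1 + number of boundaries at or below theta."""
    return 1 + sum(theta >= b for b in boundaries)


# Hypothetical thresholds (d1, d2, d3) for three items on the IRT scale.
items = [
    [-1.0, 0.0, 1.0],
    [-1.5, 0.5, 1.5],
    [-0.5, -0.5, 2.0],
]
bounds = level_boundaries(items)
print(bounds)
print(classify(0.2, bounds))
```

    Items whose thresholds sit far from these averages would be the ones the abstract suggests reviewing, since the design goal is for d1, d2 and d3 to be similar across items.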

  10. Using a MaxEnt Classifier for the Automatic Content Scoring of Free-Text Responses

    NASA Astrophysics Data System (ADS)

    Sukkarieh, Jana Z.

    2011-03-01

    Criticisms against multiple-choice item assessments in the USA have prompted researchers and organizations to move towards constructed-response (free-text) items. Constructed-response (CR) items pose many challenges to the education community—one of which is that they are expensive to score by humans. At the same time, there has been widespread movement towards computer-based assessment and hence, assessment organizations are competing to develop automatic content scoring engines for such items types—which we view as a textual entailment task. This paper describes how MaxEnt Modeling is used to help solve the task. MaxEnt has been used in many natural language tasks but this is the first application of the MaxEnt approach to textual entailment and automatic content scoring.

  11. Constructing the Exact Significance Level for a Person-Fit Statistic.

    ERIC Educational Resources Information Center

    Liou, Michelle; Chang, Chih-Hsin

    1992-01-01

    An extension is proposed for the network algorithm introduced by C.R. Mehta and N.R. Patel to construct exact tail probabilities for testing the general hypothesis that item responses are distributed according to the Rasch model. A simulation study indicates the efficiency of the algorithm. (SLD)

  12. Investigating Assessment Bias for Constructed Response Explanation Tasks: Implications for Evaluating Performance Expectations for Scientific Practice

    NASA Astrophysics Data System (ADS)

    Federer, Meghan Rector

    Assessment is a key element in the process of science education teaching and research. Understanding sources of performance bias in science assessment is a major challenge for science education reforms. Prior research has documented several limitations of instrument types on the measurement of students' scientific knowledge (Liu et al., 2011; Messick, 1995; Popham, 2010). Furthermore, a large body of work has been devoted to reducing assessment biases that distort inferences about students' science understanding, particularly in multiple-choice [MC] instruments. Despite the above documented biases, much has yet to be determined for constructed response [CR] assessments in biology and their use for evaluating students' conceptual understanding of scientific practices (such as explanation). Understanding differences in science achievement provides important insights into whether science curricula and/or assessments are valid representations of student abilities. Using the integrative framework put forth by the National Research Council (2012), this dissertation aimed to explore whether assessment biases occur for assessment practices intended to measure students' conceptual understanding and proficiency in scientific practices. Using a large corpus of undergraduate biology students' explanations, three studies were conducted to examine whether known biases of MC instruments were also apparent in a CR instrument designed to assess students' explanatory practice and understanding of evolutionary change (ACORNS: Assessment of COntextual Reasoning about Natural Selection). The first study investigated the challenge of interpreting and scoring lexically ambiguous language in CR answers. The incorporation of 'multivalent' terms into scientific discourse practices often results in statements or explanations that are difficult to interpret and can produce faulty inferences about student knowledge. 
The results of this study indicate that many undergraduate biology majors frequently incorporate multivalent concepts into explanations of change, resulting in explanatory practices that were scientifically non-normative. However, use of follow-up question approaches was found to resolve this source of bias and thereby increase the validity of inferences about student understanding. The second study focused on issues of item and instrument structure, specifically item feature effects and item position effects, which have been shown to influence measures of student performance across assessment tasks. Results indicated that, along the instrument item sequence, items with similar surface features produced greater sequencing effects than sequences of items with dissimilar surface features. This bias could be addressed by use of a counterbalanced design (i.e., Latin Square) at the population level of analysis. Explanation scores were also highly correlated with student verbosity, despite verbosity being an intrinsically trivial aspect of explanation quality. Attempting to standardize student response length was one proposed solution to the verbosity bias. The third study explored gender differences in students' performance on constructed-response explanation tasks using impact (i.e., mean raw scores) and differential item function (i.e., item difficulties) patterns. While prior research in science education has suggested that females tend to perform better on constructed-response items, the results of this study revealed no overall differences in gender achievement. However, evaluation of specific item features patterns suggested that female respondents have a slight advantage on unfamiliar explanation tasks. That is, male students tended to incorporate fewer scientifically normative concepts (i.e., key concepts) than females for unfamiliar taxa. 
Conversely, females tended to incorporate more scientifically non-normative ideas (i.e., naive ideas) than males for familiar taxa. Together these results indicate that gender achievement differences for this CR instrument may be a result of differences in how males and females interpret and respond to combinations of item features. Overall, the results presented in the subsequent chapters suggest that as science education shifts toward the evaluation of fused scientific knowledge and practice (e.g., explanation), it is essential that educators and researchers investigate potential sources of bias inherent to specific assessment practices. This dissertation revealed significant sources of CR assessment bias, and provided solutions to address these problems.
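
    The counterbalanced (Latin Square) design mentioned in the abstract above can be sketched as follows. This is a generic cyclic construction with hypothetical item labels, not the ordering scheme used in the dissertation.

```python
def latin_square(items):
    """Cyclic Latin square: row r is the item list rotated left by r places."""
    n = len(items)
    return [[items[(r + c) % n] for c in range(n)] for r in range(n)]


# Hypothetical item labels; each row is the item order given to one group.
orders = latin_square(["A", "B", "C"])
for row in orders:
    print(row)
```

    Each item appears exactly once in every serial position across the groups, which is how such a design balances position effects at the population level.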

  13. Studies of a Latent Class Signal Detection Model for Constructed Response Scoring II: Incomplete and Hierarchical Designs. Research Report. ETS RR-10-08

    ERIC Educational Resources Information Center

    DeCarlo, Lawrence T.

    2010-01-01

    A basic consideration in large-scale assessments that use constructed response (CR) items, such as essays, is how to allocate the essays to the raters that score them. Designs that are used in practice are incomplete, in that each essay is scored by only a subset of the raters, and also unbalanced, in that the number of essays scored by each rater…

  14. Does Linking Mixed-Format Tests Using a Multiple-Choice Anchor Produce Comparable Results for Male and Female Subgroups? Research Report. ETS RR-11-44

    ERIC Educational Resources Information Center

    Kim, Sooyeon; Walker, Michael E.

    2011-01-01

    This study examines the use of subpopulation invariance indices to evaluate the appropriateness of using a multiple-choice (MC) item anchor in mixed-format tests, which include both MC and constructed-response (CR) items. Linking functions were derived in the nonequivalent groups with anchor test (NEAT) design using an MC-only anchor set for 4…

  15. Examining Gender Differences in Written Assessment Tasks in Biology: A Case Study of Evolutionary Explanations

    PubMed Central

    Federer, Meghan Rector; Nehm, Ross H.; Pearl, Dennis K.

    2016-01-01

    Understanding sources of performance bias in science assessment provides important insights into whether science curricula and/or assessments are valid representations of student abilities. Research investigating assessment bias due to factors such as instrument structure, participant characteristics, and item types is well documented across a variety of disciplines. However, the relationships among these factors are unclear for tasks evaluating understanding through performance on scientific practices, such as explanation. Using item-response theory (Rasch analysis), we evaluated differences in performance by gender on a constructed-response (CR) assessment about natural selection (ACORNS). Three isomorphic item strands of the instrument were administered to a sample of undergraduate biology majors and nonmajors (Group 1: n = 662 [female = 51.6%]; G2: n = 184 [female = 55.9%]; G3: n = 642 [female = 55.1%]). Overall, our results identify relationships between item features and performance by gender; however, the effect is small in the majority of cases, suggesting that males and females tend to incorporate similar concepts into their CR explanations. These results highlight the importance of examining gender effects on performance in written assessment tasks in biology. PMID:26865642

  16. Measuring cancer caregiver health literacy: Validation of the Health Literacy of Caregivers Scale-Cancer (HLCS-C) in an Australian population.

    PubMed

    Yuen, Eva; Knight, Tess; Dodson, Sarity; Chirgwin, Jacqueline; Busija, Lucy; Ricciardelli, Lina A; Burney, Susan; Parente, Phillip; Livingston, Patricia M

    2018-05-01

    Caregivers have been largely neglected in health literacy measurement. We assess the construct validity and internal consistency of the Health Literacy of Caregivers Scale-Cancer (HLCS-C), and present a revised, psychometrically robust scale. Using data from 297 cancer caregivers (12.4% response rate) recruited from Melbourne, Australia between January-July 2014, confirmatory factor analysis (CFA) was conducted to evaluate the HLCS-C's proposed factor structure. Items were evaluated for: item difficulty, unidimensionality and overall item fit within their domain. Item-threshold-ordering was examined through one-parameter Item Response Theory models. Internal consistency was assessed using Raykov's reliability coefficient. CFA results identified 42 poorly performing/redundant items which were subsequently removed. A 10-factor model was fitted to 46 acceptable items with no correlated residuals or factor cross-loadings accepted. Adequate fit was revealed (χ²(WLSMV) = 1463.807 [df = 944], p < .001, RMSEA = 0.043, CFI = 0.980, TLI = 0.978, WRMR = 1.00). Ten domains were identified: Proactivity and determination to seek information; Adequate information about cancer and cancer management; Supported by healthcare providers (HCP) to understand information; Social support; Cancer-related communication with the care recipient (CR); Understanding CR needs and preferences; Self-care; Understanding the healthcare system; Capacity to process health information; and Active engagement with HCP. Internal consistency was adequate across domains (0.78-0.92). The revised HLCS-C demonstrated good structural, convergent, and discriminant validity, and high internal consistency. The scale may be useful for the development and evaluation of caregiver interventions. © 2017 John Wiley & Sons Ltd.

  17. Psychometrics of a Child Report Measure of Maternal Support Following Disclosure of Sexual Abuse

    PubMed Central

    Smith, Daniel W.; Sawyer, Genelle K.; Heck, Nicholas C.; Zajac, Kristyn; Solomon, David; Self-Brown, Shannon; Danielson, Carla K.; Ralston, M. Elizabeth

    2018-01-01

    Objective The purpose of this study was to develop a psychometrically sound child-report measure of maternal support following disclosure of child sexual abuse. Maternal support following disclosure of child sexual abuse is an important predictor of child adjustment; however, this construct is not well defined, and a psychometrically sound method to assess maternal support from a child’s perspective does not exist. Methods Demographic and abuse-specific information was collected via structured interview from 146 mother-child dyads presenting for an initial forensic evaluation at a child advocacy center. Mothers completed the Maternal Self-report Support Questionnaire, and children completed the Trauma Symptom Checklist for Children and 32 items considered for inclusion in a new measure known as the Maternal Support Questionnaire – Child Report (MSQ-CR). Results Exploratory factor analysis of the MSQ-CR resulted in a three factor solution: Emotional Support (9 items), Skeptical Preoccupation (5 items), and Protection/Retaliation (6 items). Each factor demonstrated adequate internal consistency reliability. Analyses with the Maternal Self-report Support Questionnaire and the Trauma Symptom Checklist supported the construct and concurrent validity of the new measure. Conclusions The MSQ-CR demonstrated sound psychometric properties. Future research is needed to determine whether the MSQ-CR provides a more sensitive approximation of maternal support following disclosure of sexual abuse, relative to measures of global parent-child relations. Additional research is needed to contextualize discrepancies between mother and child ratings of maternal support. Important limitations of the investigation are reviewed. PMID:28471341

  18. Development and psychometric evaluation of a cardiovascular risk and disease management knowledge assessment tool.

    PubMed

    Rosneck, James S; Hughes, Joel; Gunstad, John; Josephson, Richard; Noe, Donald A; Waechter, Donna

    2014-01-01

    This article describes the systematic construction and psychometric analysis of a knowledge assessment instrument for phase II cardiac rehabilitation (CR) patients measuring risk modification disease management knowledge and behavioral outcomes derived from national standards relevant to secondary prevention and management of cardiovascular disease. First, using adult curriculum based on disease-specific learning outcomes and competencies, a systematic test item development process was completed by clinical staff. Second, a panel of educational and clinical experts used an iterative process to identify test content domain and arrive at consensus in selecting items meeting criteria. Third, the resulting 31-question instrument, the Cardiac Knowledge Assessment Tool (CKAT), was piloted in CR patients to ensure use of application. Validity and reliability analyses were performed on 3638 adults before test administrations with additional focused analyses on 1999 individuals completing both pretreatment and posttreatment administrations within 6 months. Evidence of CKAT content validity was substantiated, with 85% agreement among content experts. Evidence of construct validity was demonstrated via factor analysis identifying key underlying factors. Estimates of internal consistency, for example, Cronbach's α = .852 and Spearman-Brown split-half reliability = 0.817 on pretesting, support test reliability. Item analysis, using point biserial correlation, measured relationships between performance on single items and total score (P < .01). Analyses using item difficulty and item discrimination indices further verified item stability and validity of the CKAT. A knowledge instrument specifically designed for an adult CR population was systematically developed and tested in a large representative patient population, satisfying psychometric parameters, including validity and reliability.
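
    The internal-consistency estimates named in the abstract above (Cronbach's α and the Spearman-Brown split-half correction) follow standard textbook formulas, sketched below. The function names and the response matrix are hypothetical; this is not the authors' analysis code and it does not reproduce their reported values.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; scores has one row per respondent, one column per item."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [variance(col) for col in zip(*scores)]   # per-item sample variance
    total_var = variance([sum(row) for row in scores])    # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def spearman_brown(r_half):
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)


# Hypothetical 0/1 item scores for four respondents on three items.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
print(cronbach_alpha(responses))
print(spearman_brown(0.5))
```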

  19. A HO-IRT Based Diagnostic Assessment System with Constructed Response Items

    ERIC Educational Resources Information Center

    Yang, Chih-Wei; Kuo, Bor-Chen; Liao, Chen-Huei

    2011-01-01

    The aim of the present study was to develop an on-line assessment system with constructed response items in the context of elementary mathematics curriculum. The system recorded the problem solving process of constructed response items and transferred the process to response codes for further analyses. An inference mechanism based on artificial…

  20. An Evaluation of "Intentional" Weighting of Extended-Response or Constructed-Response Items in Tests with Mixed Item Types.

    ERIC Educational Resources Information Center

    Ito, Kyoko; Sykes, Robert C.

    This study investigated the practice of weighting a type of test item, such as constructed response, more than other types of items, such as selected response, to compute student scores for a mixed-item type of test. The study used data from statewide writing field tests in grades 3, 5, and 8 and considered two contexts, that in which a single…

  1. Reporting of Subscores Using Multidimensional Item Response Theory

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; Sinharay, Sandip

    2010-01-01

    Recently, there has been increasing interest in reporting subscores. This paper examines reporting of subscores using multidimensional item response theory (MIRT) models (e.g., Reckase in "Appl. Psychol. Meas." 21:25-36, 1997; C.R. Rao and S. Sinharay (Eds), "Handbook of Statistics, vol. 26," pp. 607-642, North-Holland, Amsterdam, 2007; Beguin &…

  2. Item Construction and Psychometric Models Appropriate for Constructed Responses

    DTIC Science & Technology

    1991-08-01

    which involve only one attribute per item. This is especially true when we are dealing with constructed-response items, we have to measure much more...

  3. Comparison of Integrated Testlet and Constructed-Response Question Formats

    ERIC Educational Resources Information Center

    Slepkov, Aaron D.; Shiell, Ralph C.

    2014-01-01

    Constructed-response (CR) questions are a mainstay of introductory physics textbooks and exams. However, because of the time, cost, and scoring reliability constraints associated with this format, CR questions are being increasingly replaced by multiple-choice (MC) questions in formal exams. The integrated testlet (IT) is a recently developed…

  4. Applying the Nominal Response Model within a Longitudinal Framework to Construct the Positive Family Relationships Scale

    ERIC Educational Resources Information Center

    Preston, Kathleen Suzanne Johnson; Parral, Skye N.; Gottfried, Allen W.; Oliver, Pamella H.; Gottfried, Adele Eskeles; Ibrahim, Sirena M.; Delany, Danielle

    2015-01-01

    A psychometric analysis was conducted using the nominal response model under the item response theory framework to construct the Positive Family Relationships scale. Using data from the Fullerton Longitudinal Study, this scale was constructed within a long-term longitudinal framework spanning middle childhood through adolescence. Items tapping…

  5. Estimating the Effect on Grades of Using Multiple-Choice versus Constructive-Response Questions: Data from the Classroom

    ERIC Educational Resources Information Center

    Hickson, Stephen; Reed, W. Robert; Sander, Nicholas

    2012-01-01

    This study investigates the degree to which grades based solely on constructed-response (CR) questions differ from grades based solely on multiple-choice (MC) questions. If CR questions are to justify their higher costs, they should produce different grade outcomes than MC questions. We use a data set composed of thousands of observations on…

  6. Detecting Differential Item Discrimination (DID) and the Consequences of Ignoring DID in Multilevel Item Response Models

    ERIC Educational Resources Information Center

    Lee, Woo-yeol; Cho, Sun-Joo

    2017-01-01

    Cross-level invariance in a multilevel item response model can be investigated by testing whether the within-level item discriminations are equal to the between-level item discriminations. Testing the cross-level invariance assumption is important to understand constructs in multilevel data. However, in most multilevel item response model…

  7. The Effect of the Multiple-Choice Item Format on the Measurement of Knowledge of Language Structure

    ERIC Educational Resources Information Center

    Currie, Michael; Chiramanee, Thanyapa

    2010-01-01

    Noting the widespread use of multiple-choice items in tests in English language education in Thailand, this study compared their effect against that of constructed-response items. One hundred and fifty-two university undergraduates took a test of English structure first in constructed-response format, and later in three, stem-equivalent…

  8. A Quasi-Parametric Method for Fitting Flexible Item Response Functions

    ERIC Educational Resources Information Center

    Liang, Longjuan; Browne, Michael W.

    2015-01-01

    If standard two-parameter item response functions are employed in the analysis of a test with some newly constructed items, it can be expected that, for some items, the item response function (IRF) will not fit the data well. This lack of fit can also occur when standard IRFs are fitted to personality or psychopathology items. When investigating…

  9. Developing a Machine-Supported Coding System for Constructed-Response Items in PISA. Research Report. ETS RR-17-47

    ERIC Educational Resources Information Center

    Yamamoto, Kentaro; He, Qiwei; Shin, Hyo Jeong; von Davier, Mattias

    2017-01-01

    Approximately a third of the Programme for International Student Assessment (PISA) items in the core domains (math, reading, and science) are constructed-response items and require human coding (scoring). This process is time-consuming, expensive, and prone to error as often (a) humans code inconsistently, and (b) coding reliability in…

  10. Identification of high school students' ability level of constructing free body diagrams to solve restricted and structured response items in force matter

    NASA Astrophysics Data System (ADS)

    Rahmaniar, Andinisa; Rusnayati, Heni; Sutiadi, Asep

    2017-05-01

    Solving physics problems, particularly on the topic of force, requires the ability to construct free body diagrams, which help students analyse every force acting on an object, the length of its vector, and the name of the force. A mixed-method approach was used to explain the results without any special treatment of participants. The participants were 35 first-grade high school students. The purpose of this study is to identify students' ability level in constructing free body diagrams when solving restricted and structured response items. Based on the two types of test, every student was classified into one of four ability levels for constructing free body diagrams, each with different characteristics, and some students were interviewed while solving the test in order to learn how they solved the problems. The results showed that students' ability to construct free body diagrams on restricted response items was about 34.86% at the no evidence level, 24.11% at the inadequate level, 29.14% at the needs improvement level and 4.0% at the adequate level. On structured response items, it was about 16.59% at the no evidence level, 23.99% at the inadequate level, 36% at the needs improvement level, and 13.71% at the adequate level. The researchers found that students who constructed free body diagrams first, and constructed them correctly, were more successful in solving restricted and structured response items.

  11. Construct Validity Evidence for Single-Response Items to Estimate Physical Activity Levels in Large Sample Studies

    ERIC Educational Resources Information Center

    Jackson, Allen W.; Morrow, James R., Jr.; Bowles, Heather R.; FitzGerald, Shannon J.; Blair, Steven N.

    2007-01-01

    Valid measurement of physical activity is important for studying the risks for morbidity and mortality. The purpose of this study was to examine evidence of construct validity of two similar single-response items assessing physical activity via self-report. Both items are based on the stages of change model. The sample was 687 participants (men =…

  12. Cardiac rehabilitation in Canada and Arab countries: comparing availability and program characteristics.

    PubMed

    Turk-Adawi, Karam I; Terzic, Carmen; Bjarnason-Wehrens, Birna; Grace, Sherry L

    2015-11-26

    Despite the high burden of cardiovascular diseases in Arab countries, little is known about cardiac rehabilitation (CR) delivery. This study assessed availability and CR program characteristics in the Arab World, compared to Canada. A questionnaire incorporating items from 4 national/regional published CR program surveys was created for this cross-sectional study. The survey was emailed to all Arab CR program contacts that were identified through published studies, conference abstracts, a snowball sampling strategy, and other key informants from the 22 Arab countries. An online survey link was also emailed to all contacts in the Canadian Association of Cardiovascular Prevention and Rehabilitation directory. Descriptive statistics were used to describe all closed-ended items in the survey. All open-ended responses were coded using an interpretive-descriptive approach. Eight programs were identified in Arab countries, of which 5 (62.5%) participated; 128 programs were identified in Canada, of which 39 (30.5%) participated. There was consistency in core components delivered in Arab countries and Canada; however, Arab programs more often delivered women-only classes. Lack of human resources was perceived as the greatest barrier to CR provision in all settings, with space also a barrier in Arab settings, and financial resources in Canada. The median number of patients served per program was 300 for Canada vs. 200 for Arab countries. Availability of CR programs in Arab countries is incredibly limited, despite the fact that most responses stemmed from high-income countries. Where available, CR programs in Arab countries appear to be delivered in a manner consistent with Canada.

  13. Missouri Assessment Program (MAP), Spring 2000: Elementary Health/Physical Education, Released Items, Grade 5.

    ERIC Educational Resources Information Center

    Missouri State Dept. of Elementary and Secondary Education, Jefferson City.

    This document presents 10 released items from the Health/Physical Education Missouri Assessment Program (MAP) test given in the spring of 2000 to fifth graders. Items from the test sessions include: selected-response (multiple choice), constructed-response, and a performance event. The selected-response items consist of individual questions…

  14. Developing a Strategy for Using Technology-Enhanced Items in Large-Scale Standardized Tests

    ERIC Educational Resources Information Center

    Bryant, William

    2017-01-01

    As large-scale standardized tests move from paper-based to computer-based delivery, opportunities arise for test developers to make use of items beyond traditional selected and constructed response types. Technology-enhanced items (TEIs) have the potential to provide advantages over conventional items, including broadening construct measurement,…

  15. Evaluation of the Psychometric Properties of the Asian Adolescent Depression Scale and Construction of a Short Form: An Item Response Theory Analysis.

    PubMed

    Lo, Barbara Chuen Yee; Zhao, Yue; Kwok, Alice Wai Yee; Chan, Wai; Chan, Calais Kin Yuen

    2017-07-01

    The present study applied item response theory to examine the psychometric properties of the Asian Adolescent Depression Scale and to construct a short form among 1,084 teenagers recruited from secondary schools in Hong Kong. Findings suggested that some items of the full form reflected higher levels of severity and were more discriminating than others, and the Asian Adolescent Depression Scale was useful in measuring a broad range of depressive severity in community youths. Differential item functioning emerged in several items where females reported higher depressive severity than males. In the short form construction, preliminary validation suggested that, relative to the 20-item full form, our derived short form offered significantly greater diagnostic performance and stronger discriminatory ability in differentiating depressed and nondepressed groups, and simultaneously maintained adequate measurement precision with a reduced response burden in assessing depression in the Asian adolescents. Cultural variance in depressive symptomatology and clinical implications are discussed.

  16. [Eight-step structured decision-making process to assign criminal responsibility and seven focal points for describing relationship between psychopathology and offense].

    PubMed

    Okada, Takayuki

    2013-01-01

    The author suggested that it is essential for lawyers and psychiatrists to have a common understanding of the mutual division of roles between them when determining criminal responsibility (CR) and, for this purpose, proposed an 8-step structured CR decision-making process. The 8 steps are: (1) gathering of information related to mental function and condition, (2) recognition of mental function and condition, (3) psychiatric diagnosis, (4) description of the relationship between psychiatric symptom or psychopathology and index offense, (5) focus on capacities of differentiation between right and wrong and behavioral control, (6) specification of elements of cognitive/volitional prong in legal context, (7) legal evaluation of degree of cognitive/volitional prong, and (8) final interpretation of CR as a legal conclusion. The author suggested that the CR decision-making process should proceed not in a step-like pattern from (1) to (2) to (3) to (8), but in a step-like pattern from (1) to (2) to (4) to (5) to (6) to (7) to (8), and that not steps after (5), which require the interpretation or the application of section 39 of the Penal Code, but Step (4), must be the core of psychiatric expert evidence. When explaining the relationship between the mental disorder and offense described in Step (4), the Seven Focal Points (7FP) are often used. The author urged basic precautions to prevent the misuse of 7FP, which are: (a) the priority of each item is not equal and the relative importance differs from case to case; (b) each item is not exclusively independent, there may be overlap between items; (c) the criminal responsibility shall not be judged because one item is applicable or because a number of items are applicable, i.e., 7FP are not "criteria," for example, the aim is not to decide such things as 'the motive is understandable' or 'the conduct is appropriate', but should be to describe how psychopathological factors affected the offense specifically in the context of understandability of motive or appropriateness of conduct; (d) it is essential to evaluate each item from a neutral point of view rather than only from one perspective, for example, looking at the case from the aspects of both comprehensibility and incomprehensibility of motive or from aspects of both oriented, purposeful, organized behavior and disoriented, purposeless, disorganized behavior during the offense; (e) depending on the case, there are some items that do not require any consideration (there are some cases in which there are less than seven items); (f) 7FP are not exhaustive and there are instances in which, depending on the case, there should be a focus on points that are not included in these.

  17. Missouri Assessment Program (MAP), Spring 2000: High School Health/Physical Education, Released Items, Grade 9.

    ERIC Educational Resources Information Center

    Missouri State Dept. of Elementary and Secondary Education, Jefferson City.

    This document presents 10 released items from the Health/Physical Education Missouri Assessment Program (MAP) test given in the spring of 2000 to ninth graders. Items from the test sessions include: selected-response (multiple choice), constructed-response, and a performance event. The selected-response items consist of individual questions…

  18. Validation of Automated Scoring of Science Assessments

    ERIC Educational Resources Information Center

    Liu, Ou Lydia; Rios, Joseph A.; Heilman, Michael; Gerard, Libby; Linn, Marcia C.

    2016-01-01

    Constructed response items can both measure the coherence of student ideas and serve as reflective experiences to strengthen instruction. We report on new automated scoring technologies that can reduce the cost and complexity of scoring constructed-response items. This study explored the accuracy of c-rater-ML, an automated scoring engine…

  19. Constructed-Response Problems

    ERIC Educational Resources Information Center

    Swinford, Ashleigh

    2016-01-01

    With rigor outlined in state and Common Core standards and the addition of constructed-response test items to most state tests, math constructed-response questions have become increasingly popular in today's classroom. Although constructed-response problems can present a challenge for students, they do offer a glimpse of students' learning through…

  20. Firestar-"D": Computerized Adaptive Testing Simulation Program for Dichotomous Item Response Theory Models

    ERIC Educational Resources Information Center

    Choi, Seung W.; Podrabsky, Tracy; McKinney, Natalie

    2012-01-01

    Computerized adaptive testing (CAT) enables efficient and flexible measurement of latent constructs. The majority of educational and cognitive measurement constructs are based on dichotomous item response theory (IRT) models. An integral part of developing various components of a CAT system is conducting simulations using both known and empirical…

  1. Developing and Evaluating a Machine-Scorable, Constrained Constructed-Response Item.

    ERIC Educational Resources Information Center

    Braun, Henry I.; And Others

    The use of constructed response items in large scale standardized testing has been hampered by the costs and difficulties associated with obtaining reliable scores. The advent of expert systems may signal the eventual removal of this impediment. This study investigated the accuracy with which expert systems could score a new, non-multiple choice…

  2. Practical methods for dealing with 'not applicable' item responses in the AMC Linear Disability Score project

    PubMed Central

    Holman, Rebecca; Glas, Cees AW; Lindeboom, Robert; Zwinderman, Aeilko H; de Haan, Rob J

    2004-01-01

    Background: Whenever questionnaires are used to collect data on constructs, such as functional status or health related quality of life, it is unlikely that all respondents will respond to all items. This paper examines ways of dealing with responses in a 'not applicable' category to items included in the AMC Linear Disability Score (ALDS) project item bank. Methods: The data examined in this paper come from the responses of 392 respondents to 32 items and form part of the calibration sample for the ALDS item bank. The data are analysed using the one-parameter logistic item response theory model. The four practical strategies for dealing with this type of response are: cold deck imputation; hot deck imputation; treating the missing responses as if these items had never been offered to those individual patients; and using a model which takes account of the 'tendency to respond to items'. Results: The item and respondent population parameter estimates were very similar for the strategies involving hot deck imputation; treating the missing responses as if these items had never been offered to those individual patients; and using a model which takes account of the 'tendency to respond to items'. The estimates obtained using the cold deck imputation method were substantially different. Conclusions: The cold deck imputation method was not considered suitable for use in the ALDS item bank. The other three methods described can be usefully implemented in the ALDS item bank, depending on the purpose of the data analysis to be carried out. These three methods may be useful for other data sets examining similar constructs, when item response theory based methods are used. PMID:15200681
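
    As a rough illustration of one of the strategies named in this abstract, the sketch below shows a simple random hot deck: each missing response is filled with a value borrowed from another respondent who did answer the same item. The abstract does not say how donors were chosen in the ALDS analysis, so the donor rule, the function name `hot_deck_impute`, and the toy data are all assumptions made for illustration only.

```python
import random


def hot_deck_impute(responses, missing=None, seed=0):
    """Fill each missing entry with a value drawn at random from the
    respondents who did answer that item (a simple random hot deck).

    `responses` is a list of rows (one per respondent), each a list of
    item scores; `missing` is the marker used for absent responses.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n_items = len(responses[0])
    imputed = [list(row) for row in responses]  # leave the input untouched
    for j in range(n_items):
        # Donor pool: every observed response to item j in the same data set.
        donors = [row[j] for row in responses if row[j] is not missing]
        if not donors:
            continue  # nothing to borrow for this item; leave the gaps
        for row in imputed:
            if row[j] is missing:
                row[j] = rng.choice(donors)
    return imputed


# Hypothetical toy data: 3 respondents, 3 dichotomous items, None = not applicable.
data = [
    [1, 0, None],
    [1, None, 1],
    [0, 0, 1],
]
completed = hot_deck_impute(data)
```

    Cold deck imputation differs in that the donor values come from an external source rather than from the data set being analysed, which may help explain why its estimates could diverge from the other strategies.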

  3. An Extension of IRT-Based Equating to the Dichotomous Testlet Response Theory Model

    ERIC Educational Resources Information Center

    Tao, Wei; Cao, Yi

    2016-01-01

    Current procedures for equating number-correct scores using traditional item response theory (IRT) methods assume local independence. However, when tests are constructed using testlets, one concern is the violation of the local item independence assumption. The testlet response theory (TRT) model is one way to accommodate local item dependence.…

  4. Measuring Constructs in Family Science: How Can Item Response Theory Improve Precision and Validity?

    PubMed Central

    Gordon, Rachel A.

    2014-01-01

    This article provides family scientists with an understanding of contemporary measurement perspectives and the ways in which item response theory (IRT) can be used to develop measures with desired evidence of precision and validity for research uses. The article offers a nontechnical introduction to some key features of IRT, including its orientation toward locating items along an underlying dimension and toward estimating precision of measurement for persons with different levels of that same construct. It also offers a didactic example of how the approach can be used to refine conceptualization and operationalization of constructs in the family sciences, using data from the National Longitudinal Survey of Youth 1979 (n = 2,732). Three basic models are considered: (a) the Rasch and (b) two-parameter logistic models for dichotomous items and (c) the Rating Scale Model for multicategory items. Throughout, the author highlights the potential for researchers to elevate measurement to a level on par with theorizing and testing about relationships among constructs. PMID:25663714
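
    For readers unfamiliar with the dichotomous models named in this abstract, the following minimal sketch shows the standard two-parameter logistic response function, with the Rasch model recovered by fixing the discrimination at 1. The symbols `theta` (person location), `b` (item location), and `a` (item discrimination) are conventional notation assumed here; they do not come from the article itself.

```python
import math


def p_endorse(theta, b, a=1.0):
    """Probability of a positive response under the two-parameter logistic
    model; with a = 1.0 this reduces to the Rasch model.

    theta: person location on the latent dimension
    b:     item location (difficulty) on the same dimension
    a:     item discrimination (slope)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# A person located exactly at the item's location has a 50% chance.
p_at_location = p_endorse(theta=0.0, b=0.0)

# One unit above the item: Rasch versus a more discriminating 2PL item.
p_rasch = p_endorse(theta=1.0, b=0.0)
p_2pl = p_endorse(theta=1.0, b=0.0, a=2.0)
```

    Placing persons and items on the same scale is what allows items to be "located" along the underlying dimension, as the abstract describes.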

  5. Developing Form Assembly Specifications for Exams with Multiple Choice and Constructed Response Items: Balancing Reliability and Validity Concerns

    ERIC Educational Resources Information Center

    Hendrickson, Amy; Patterson, Brian; Ewing, Maureen

    2010-01-01

    The psychometric considerations and challenges associated with including constructed response items on tests are discussed along with how these issues affect the form assembly specifications for mixed-format exams. Reliability and validity, security and fairness, pretesting, content and skills coverage, test length and timing, weights, statistical…

  6. FY2017 Defense Spending Under an Interim Continuing Resolution (CR): In Brief

    DTIC Science & Technology

    2016-11-07

    Appropriations and Military Construction, Veterans Affairs, and Related Agencies Appropriations Act, 2017, and Zika Response and Preparedness Act...Appropriations and Military Construction, Veterans Affairs, and Related Agencies Appropriations Act, 2017, and Zika Response and Preparedness Act, into

  7. Fixed or mixed: a comparison of three, four and mixed-option multiple-choice tests in a Fetal Surveillance Education Program

    PubMed Central

    2013-01-01

    Background: Despite the widespread use of multiple-choice assessments in medical education assessment, current practice and published advice concerning the number of response options remains equivocal. This article describes an empirical study contrasting the quality of three 60 item multiple-choice test forms within the Royal Australian and New Zealand College of Obstetricians and Gynaecologists (RANZCOG) Fetal Surveillance Education Program (FSEP). The three forms are described below. Methods: The first form featured four response options per item. The second form featured three response options, having removed the least functioning option from each item in the four-option counterpart. The third test form was constructed by retaining the best performing version of each item from the first two test forms. It contained both three and four option items. Results: Psychometric and educational factors were taken into account in formulating an approach to test construction for the FSEP. The four-option test performed better than the three-option test overall, but some items were improved by the removal of options. The mixed-option test demonstrated better measurement properties than the fixed-option tests, and has become the preferred test format in the FSEP program. The criteria used were reliability, errors of measurement and fit to the item response model. Conclusions: The position taken is that decisions about the number of response options be made at the item level, with plausible options being added to complete each item on both psychometric and educational grounds rather than complying with a uniform policy. The point is to construct the better performing item in providing the best psychometric and educational information. PMID:23453056

  8. Fixed or mixed: a comparison of three, four and mixed-option multiple-choice tests in a Fetal Surveillance Education Program.

    PubMed

    Zoanetti, Nathan; Beaves, Mark; Griffin, Patrick; Wallace, Euan M

    2013-03-04

    Despite the widespread use of multiple-choice assessments in medical education assessment, current practice and published advice concerning the number of response options remains equivocal. This article describes an empirical study contrasting the quality of three 60 item multiple-choice test forms within the Royal Australian and New Zealand College of Obstetricians and Gynaecologists (RANZCOG) Fetal Surveillance Education Program (FSEP). The three forms are described below. The first form featured four response options per item. The second form featured three response options, having removed the least functioning option from each item in the four-option counterpart. The third test form was constructed by retaining the best performing version of each item from the first two test forms. It contained both three and four option items. Psychometric and educational factors were taken into account in formulating an approach to test construction for the FSEP. The four-option test performed better than the three-option test overall, but some items were improved by the removal of options. The mixed-option test demonstrated better measurement properties than the fixed-option tests, and has become the preferred test format in the FSEP program. The criteria used were reliability, errors of measurement and fit to the item response model. The position taken is that decisions about the number of response options be made at the item level, with plausible options being added to complete each item on both psychometric and educational grounds rather than complying with a uniform policy. The point is to construct the better performing item in providing the best psychometric and educational information.

  9. Assessing Construct Validity Using Multidimensional Item Response Theory.

    ERIC Educational Resources Information Center

    Ackerman, Terry A.

    The concept of a user-specified validity sector is discussed. The idea of the validity sector combines the work of M. D. Reckase (1986) and R. Shealy and W. Stout (1991). Reckase developed a methodology to represent an item in a multidimensional latent space as a vector. Item vectors are computed using multidimensional item response theory item…

  10. Evaluation of Internal Construct Validity and Unidimensionality of the Brachial Assessment Tool, A Patient-Reported Outcome Measure for Brachial Plexus Injury.

    PubMed

    Hill, Bridget; Pallant, Julie; Williams, Gavin; Olver, John; Ferris, Scott; Bialocerkowski, Andrea

    2016-12-01

    To evaluate the internal construct validity and dimensionality of a new patient-reported outcome measure for people with traumatic brachial plexus injury (BPI) based on the International Classification of Functioning, Disability and Health definition of activity. Cross-sectional study. Outpatient clinics. Adults (age range, 18-82y) with a traumatic BPI (N=106). There were 106 people with BPI who completed a 51-item 5-response questionnaire. Responses were analyzed in 4 phases (missing responses, item correlations, exploratory factor analysis, and Rasch analysis) to evaluate the properties of fit to the Rasch model, threshold response, local dependency, dimensionality, differential item functioning, and targeting. Not applicable, as this study addresses the development of an outcome measure. Six items were deleted for missing responses, and 10 were deleted for high interitem correlations >.81. The remaining 35 items, while demonstrating fit to the Rasch model, showed evidence of local dependency and multidimensionality. Items were divided into 3 subscales: dressing and grooming (8 items), arm and hand (17 items), and no hand (6 items). All 3 subscales demonstrated fit to the model with no local dependency, minimal disordered thresholds, no unidimensionality or differential item functioning for age, time postinjury, or self-selected dominance. Subscales were combined into 3 subtests and demonstrated fit to the model, no misfit, and unidimensionality, allowing calculation of a summary score. This preliminary analysis supports the internal construct validity of the Brachial Assessment Tool, a unidimensional targeted 4-response patient-reported outcome measure designed to solely assess activity after traumatic BPI regardless of level of injury, age at recruitment, premorbid limb dominance, and time postinjury. Further examination is required to determine test-retest reliability and responsiveness. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  11. The Construction of a Long Variable of Conceptual Development in Social Education.

    ERIC Educational Resources Information Center

    Doig, Brian

    This paper demonstrates a method for constructing long variables using items that elicit partially correct responses across ages. Long variables may be defined by students at different ages (year levels) attempting common items within a test containing other items considered to be appropriate for each age or year level. A developmental model of…

  12. Evaluating and Refining the Construct of Sexual Quality With Item Response Theory: Development of the Quality of Sex Inventory.

    PubMed

    Shaw, Amanda M; Rogge, Ronald D

    2016-02-01

    This study took a critical look at the construct of sexual quality. The 65 items of four well-validated self-report measures of sexual satisfaction (the Index of Sexual Satisfaction [ISS], Hudson, Harrison, & Crosscup, 1981; the Global Measure of Sexual Satisfaction [GMSEX], Lawrance & Byers, 1995; the Pinney Sexual Satisfaction Inventory [PSSI], Pinney, Gerrard, & Denney, 1987; the Young Sexual Satisfaction Scale [YSSS], Young, Denny, Luquis, & Young, 1998) and an additional 74 potential sexual quality items were given to 3060 online participants. Using Item Response Theory (IRT), we demonstrated that the ISS, YSSS, and PSSI scales provided suboptimal levels of precision in assessing sexual quality, particularly given the length of those scales. Exploratory factor analyses, IRT, differential item functioning analyses, and longitudinal responsiveness analyses were used to develop and evaluate the Quality of Sex Inventory. Results suggested that, in comparison to existing scales, the QSI (1) offers investigators and clinicians more theoretically focused scales, (2) distinguishes sexual satisfaction from sexual dissatisfaction, and (3) offers greater precision and power for detecting differences with (4) comparably high levels of responsiveness for detecting change over time despite being notably shorter than most of the existing scales. The QSI-satisfaction subscales demonstrated strong convergent validity with other measures of sexual satisfaction and excellent construct validity with anchor scales from the nomological net surrounding that construct, suggesting that they continue to assess the same theoretical construct as prior scales. Implications for research are discussed.

  13. Extreme Response Style: Which Model Is Best?

    ERIC Educational Resources Information Center

    Leventhal, Brian

    2017-01-01

    More robust and rigorous psychometric models, such as multidimensional Item Response Theory models, have been advocated for survey applications. However, item responses may be influenced by construct-irrelevant variance factors such as preferences for extreme response options. Through empirical and simulation methods, this study evaluates the use…

  14. Calibration and Validation of the Dutch-Flemish PROMIS Pain Interference Item Bank in Patients with Chronic Pain

    PubMed Central

    Crins, Martine H. P.; Roorda, Leo D.; Smits, Niels; de Vet, Henrica C. W.; Westhovens, Rene; Cella, David; Cook, Karon F.; Revicki, Dennis; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Terwee, Caroline B.

    2015-01-01

    The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translations compared to the original English items, and to evaluate their reliability and construct validity. The 40 items in the bank were completed by 1085 Dutch chronic pain patients. Before calibrating the items, IRT model assumptions were evaluated using confirmatory factor analysis (CFA). Items were calibrated using the graded response model (GRM), an IRT model appropriate for items with more than two response options. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was examined. Reliability was evaluated based on standard errors and Cronbach’s alpha. To evaluate construct validity correlations with scores on legacy instruments (e.g., the Disabilities of the Arm, Shoulder and Hand Questionnaire) were calculated. Unidimensionality of the Dutch-Flemish PROMIS Pain Interference item bank was supported by CFA tests of model fit (CFI = 0.986, TLI = 0.986). Furthermore, the data fit the GRM and showed good coverage across the pain interference continuum (threshold-parameters range: -3.04 to 3.44). The Dutch-Flemish PROMIS Pain Interference item bank has good cross-cultural validity (only two out of 40 items showing DIF), good reliability (Cronbach’s alpha = 0.98), and good construct validity (Pearson correlations between 0.62 and 0.75). A computer adaptive test (CAT) and Dutch-Flemish PROMIS short forms of the Dutch-Flemish PROMIS Pain Interference item bank can now be developed. PMID:26214178
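
    The reliability coefficient reported in this abstract, Cronbach's alpha, can be computed directly from an item-score matrix. The sketch below uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name `cronbach_alpha` and the toy scores are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondent rows, each a list of
    item scores (all rows the same length, at least two items).

    Uses sample variances throughout; the ratio is unaffected as long as
    the same variance definition is used for items and totals.
    """
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose to per-item columns
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical toy data: 4 respondents, 3 items that rise and fall together,
# so the items are perfectly consistent with one another.
toy = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [4, 5, 6],
]
alpha = cronbach_alpha(toy)
```

    In the toy data every item orders the respondents identically, so alpha reaches its upper limit of 1; real item banks, such as the 40-item bank described here, fall below that limit.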

  15. Calibration and Validation of the Dutch-Flemish PROMIS Pain Interference Item Bank in Patients with Chronic Pain.

    PubMed

    Crins, Martine H P; Roorda, Leo D; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Terwee, Caroline B

    2015-01-01

    The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translations compared to the original English items, and to evaluate their reliability and construct validity. The 40 items in the bank were completed by 1085 Dutch chronic pain patients. Before calibrating the items, IRT model assumptions were evaluated using confirmatory factor analysis (CFA). Items were calibrated using the graded response model (GRM), an IRT model appropriate for items with more than two response options. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was examined. Reliability was evaluated based on standard errors and Cronbach's alpha. To evaluate construct validity correlations with scores on legacy instruments (e.g., the Disabilities of the Arm, Shoulder and Hand Questionnaire) were calculated. Unidimensionality of the Dutch-Flemish PROMIS Pain Interference item bank was supported by CFA tests of model fit (CFI = 0.986, TLI = 0.986). Furthermore, the data fit the GRM and showed good coverage across the pain interference continuum (threshold-parameters range: -3.04 to 3.44). The Dutch-Flemish PROMIS Pain Interference item bank has good cross-cultural validity (only two out of 40 items showing DIF), good reliability (Cronbach's alpha = 0.98), and good construct validity (Pearson correlations between 0.62 and 0.75). A computer adaptive test (CAT) and Dutch-Flemish PROMIS short forms of the Dutch-Flemish PROMIS Pain Interference item bank can now be developed.

  16. Discriminant content validity: a quantitative methodology for assessing content of theory-based measures, with illustrative applications.

    PubMed

    Johnston, Marie; Dixon, Diane; Hart, Jo; Glidewell, Liz; Schröder, Carin; Pollard, Beth

    2014-05-01

    In studies involving theoretical constructs, it is important that measures have good content validity and that there is not contamination of measures by content from other constructs. While reliability and construct validity are routinely reported, to date, there has not been a satisfactory, transparent, and systematic method of assessing and reporting content validity. In this paper, we describe a methodology of discriminant content validity (DCV) and illustrate its application in three studies. Discriminant content validity involves six steps: construct definition, item selection, judge identification, judgement format, single-sample test of content validity, and assessment of discriminant items. In three studies, these steps were applied to a measure of illness perceptions (IPQ-R) and control cognitions. The IPQ-R performed well with most items being purely related to their target construct, although timeline and consequences had small problems. By contrast, the study of control cognitions identified problems in measuring constructs independently. In the final study, direct estimation response formats for theory of planned behaviour constructs were found to have as good DCV as Likert format. The DCV method allowed quantitative assessment of each item and can therefore inform the content validity of the measures assessed. The methods can be applied to assess content validity before or after collecting data to select the appropriate items to measure theoretical constructs. Further, the data reported for each item in Appendix S1 can be used in item or measure selection. Statement of contribution: What is already known on this subject? There are agreed methods of assessing and reporting construct validity of measures of theoretical constructs, but not their content validity. Content validity is rarely reported in a systematic and transparent manner. What does this study add? The paper proposes discriminant content validity (DCV), a systematic and transparent method of assessing and reporting whether items assess the intended theoretical construct and only that construct. In three studies, DCV was applied to measures of illness perceptions, control cognitions, and theory of planned behaviour response formats. Appendix S1 gives content validity indices for each item of each questionnaire investigated. Discriminant content validity is ideally applied while the measure is being developed, before using to measure the construct(s), but can also be applied after using a measure. © 2014 The British Psychological Society.

  17. EXAMINING THE GENERALITY OF CHILDREN'S PREFERENCE FOR CONTINGENT REINFORCEMENT VIA EXTENSION TO DIFFERENT RESPONSES, REINFORCERS, AND SCHEDULES

    PubMed Central

    Luczynski, Kevin C; Hanley, Gregory P

    2010-01-01

    Studies that have assessed whether children prefer contingent reinforcement (CR) or noncontingent reinforcement (NCR) have shown that they prefer CR. Preference for CR has, however, been evaluated only under continuous reinforcement (CRF) schedules. The prevalence of intermittent reinforcement (INT) warrants an evaluation of whether preference for CR persists as the schedule of reinforcement is thinned. In the current study, we evaluated 2 children's preference for contingent versus noncontingent delivery of highly preferred edible items for academic task completion under CRF and INT schedules. Children (a) preferred CR to NCR under the CRF schedule, (b) continued to prefer CR as the schedule of reinforcement became intermittent, and (c) exhibited a shift in preference from CR to NCR as the schedule became increasingly thin. These findings extend the generality of and provide one set of limits to the preference for CR. Applied implications, variables controlling preferences, and future research are discussed. PMID:21358901

  18. Psychometrics of a Child Report Measure of Maternal Support following Disclosure of Sexual Abuse.

    PubMed

    Smith, Daniel W; Sawyer, Genelle K; Heck, Nicholas C; Zajac, Kristyn; Solomon, David; Self-Brown, Shannon; Danielson, Carla K; Ralston, M Elizabeth

    2017-04-01

    The study examined a new child report measure of maternal support following child sexual abuse. One hundred and forty-six mother-child dyads presenting for a forensic evaluation completed assessments including standardized measures of adjustment. Child participants also responded to 32 items considered for inclusion in a new measure, the Maternal Support Questionnaire-Child Report (MSQ-CR). Exploratory factor analysis of the Maternal Support Questionnaire-Child Report resulted in a three factor, 20-item solution: Emotional Support (9 items), Skeptical Preoccupation (5 items), and Protection/Retaliation (6 items). Each factor demonstrated adequate internal consistency. Construct and concurrent validity of the new measure were supported in comparison to other trauma-specific measures. The Maternal Support Questionnaire-Child Report demonstrated sound psychometric properties. Future research is needed to determine whether the Maternal Support Questionnaire-Child Report provides a more sensitive approximation of maternal support following disclosure of sexual abuse, relative to measures of global parent-child relations and to contextualize discrepancies between mother and child ratings of maternal support.

  19. Effect of response format on cognitive reflection: Validating a two- and four-option multiple choice question version of the Cognitive Reflection Test.

    PubMed

    Sirota, Miroslav; Juanchich, Marie

    2018-03-27

    The Cognitive Reflection Test, measuring intuition inhibition and cognitive reflection, has become extremely popular because it reliably predicts reasoning performance, decision-making, and beliefs. Across studies, the response format of CRT items sometimes differs, based on the assumed construct equivalence of tests with open-ended versus multiple-choice items (the equivalence hypothesis). Evidence and theoretical reasons, however, suggest that the cognitive processes measured by these response formats and their associated performances might differ (the nonequivalence hypothesis). We tested the two hypotheses experimentally by assessing the performance in tests with different response formats and by comparing their predictive and construct validity. In a between-subjects experiment (n = 452), participants answered stem-equivalent CRT items in an open-ended, a two-option, or a four-option response format and then completed tasks on belief bias, denominator neglect, and paranormal beliefs (benchmark indicators of predictive validity), as well as on actively open-minded thinking and numeracy (benchmark indicators of construct validity). We found no significant differences between the three response formats in the numbers of correct responses, the numbers of intuitive responses (with the exception of the two-option version, which had a higher number than the other tests), and the correlational patterns of the indicators of predictive and construct validity. All three test versions were similarly reliable, but the multiple-choice formats were completed more quickly. We speculate that the specific nature of the CRT items helps build construct equivalence among the different response formats. We recommend using the validated multiple-choice version of the CRT presented here, particularly the four-option CRT, for practical and methodological reasons. Supplementary materials and data are available at https://osf.io/mzhyc/ .

  20. What Do You Think You Are Measuring? A Mixed-Methods Procedure for Assessing the Content Validity of Test Items and Theory-Based Scaling

    PubMed Central

    Koller, Ingrid; Levenson, Michael R.; Glück, Judith

    2017-01-01

    The valid measurement of latent constructs is crucial for psychological research. Here, we present a mixed-methods procedure for improving the precision of construct definitions, determining the content validity of items, evaluating the representativeness of items for the target construct, generating test items, and analyzing items on a theoretical basis. To illustrate the mixed-methods content-scaling-structure (CSS) procedure, we analyze the Adult Self-Transcendence Inventory, a self-report measure of wisdom (ASTI, Levenson et al., 2005). A content-validity analysis of the ASTI items was used as the basis of psychometric analyses using multidimensional item response models (N = 1215). We found that the new procedure produced important suggestions concerning five subdimensions of the ASTI that were not identifiable using exploratory methods. The study shows that the application of the suggested procedure leads to a deeper understanding of latent constructs. It also demonstrates the advantages of theory-based item analysis. PMID:28270777

  1. Optimal Item Selection with Credentialing Examinations.

    ERIC Educational Resources Information Center

    Hambleton, Ronald K.; And Others

    The study compared two promising item response theory (IRT) item-selection methods, optimal and content-optimal, with two non-IRT item selection methods, random and classical, for use in fixed-length certification exams. The four methods were used to construct 20-item exams from a pool of approximately 250 items taken from a 1985 certification…

  2. Development and Standardization of the Diagnostic Adaptive Behavior Scale: Application of Item Response Theory to the Assessment of Adaptive Behavior

    ERIC Educational Resources Information Center

    Tassé, Marc J.; Schalock, Robert L.; Thissen, David; Balboni, Giulia; Bersani, Henry, Jr.; Borthwick-Duffy, Sharon A.; Spreat, Scott; Widaman, Keith F.; Zhang, Dalun; Navas, Patricia

    2016-01-01

    The Diagnostic Adaptive Behavior Scale (DABS) was developed using item response theory (IRT) methods and was constructed to provide the most precise and valid adaptive behavior information at or near the cutoff point of making a decision regarding a diagnosis of intellectual disability. The DABS initial item pool consisted of 260 items. Using IRT…

  3. Translation, Cross-cultural Adaptation and Psychometric Validation of the Korean-Language Cardiac Rehabilitation Barriers Scale (CRBS-K).

    PubMed

    Baek, Sora; Park, Hee-Won; Lee, Yookyung; Grace, Sherry L; Kim, Won-Seok

    2017-10-01

    To perform a translation and cross-cultural adaptation of the Cardiac Rehabilitation Barriers Scale (CRBS) for use in Korea, followed by psychometric validation. The CRBS was developed to assess patients' perception of the degree to which patient, provider and health system-level barriers affect their cardiac rehabilitation (CR) participation. The CRBS consists of 21 items (barriers to adherence) rated on a 5-point Likert scale. The first phase was to translate and cross-culturally adapt the CRBS to the Korean language. After back-translation, both versions were reviewed by a committee. The face validity was assessed in a sample of Korean patients (n=53) with a history of acute myocardial infarction who did not participate in CR, through semi-structured interviews. The second phase was to assess the construct and criterion validity of the Korean translation as well as internal reliability, through administration of the translated version in 104 patients, principal component analysis with varimax rotation and cross-referencing against CR use, respectively. The length, readability, and clarity of the questionnaire were rated well, demonstrating face validity. Analysis revealed a six-factor solution, demonstrating construct validity. Cronbach's alpha was greater than 0.65. Barriers rated highest included not knowing about CR and not being contacted by a program. The mean CRBS score was significantly higher among non-attendees (2.71±0.26) than CR attendees (2.51±0.18) (p<0.01). The Korean version of CRBS has demonstrated face, content and criterion validity, suggesting it may be useful for assessing barriers to CR utilization in Korea.
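
    For readers unfamiliar with the internal-reliability statistic quoted in this record, the following is a minimal sketch of how Cronbach's alpha is conventionally computed from a respondents-by-items score matrix. It is an illustration only and is not taken from the study; the function name `cronbach_alpha` and the toy data are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; it assumes complete data and k >= 2.
    """
    k = len(scores[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the respondents' total scores.
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (made up): three respondents, three perfectly consistent items.
toy = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
alpha = cronbach_alpha(toy)
```

    Population variances are used here; because the same divisor appears in the numerator and the denominator, sample variances would give the same ratio.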

  4. Development and Application of Methods for Estimating Operating Characteristics of Discrete Test Item Responses without Assuming any Mathematical Form.

    ERIC Educational Resources Information Center

    Samejima, Fumiko

    In latent trait theory the latent space, or space of the hypothetical construct, is usually represented by some unidimensional or multi-dimensional continuum of real numbers. Like the latent space, the item response can either be treated as a discrete variable or as a continuous variable. Latent trait theory relates the item response to the latent…

  5. Phenomenological aspects of the cognitive rumination construct.

    PubMed

    Meyer, Leonardo Fernandez; Taborda, José Geraldo Vernet; da Costa, Fábio Antônio; Soares, Ana Luiza Alfaya Galego; Mecler, Kátia; Valença, Alexandre Martins

    2015-01-01

    To evaluate the importance of phenomenological aspects of the cognitive rumination (CR) construct in current empirical psychiatric research. We searched SciELO, Scopus, ScienceDirect, MEDLINE, OneFile (GALE), SpringerLink, Cambridge Journals and Web of Science between February and March of 2014 for studies whose title and topic included the following keywords: cognitive rumination; rumination response scale; and self-reflection. The inclusion criteria were: empirical clinical study; CR as the main object of investigation; and study that included a conceptual definition of CR. The studies selected were published in English in biomedical journals in the last 10 years. Our phenomenological analysis was based on Karl Jaspers' General Psychopathology. Most current empirical studies adopt phenomenological cognitive elements in conceptual definitions. However, these elements do not seem to be carefully examined and are indistinctly understood as objective empirical factors that may be measured, which may contribute to misunderstandings about CR, erroneous interpretations of results and problematic theoretical models. Empirical studies fail when evaluating phenomenological aspects of the cognitive elements of the CR construct. Psychopathology and phenomenology may help define the characteristics of CR elements and may contribute to their understanding and hierarchical organization as a construct. A review of the psychopathology principles established by Jaspers may clarify some of these issues.

  6. Perceived freedom-responsibility covariation among Cypriot adolescents.

    PubMed

    Frangou, Georgia; Wilkerson, Keith; McGahan, Joseph R

    2008-04-01

    Participants were 67 Cypriot adolescents who responded to propositions regarding positive, negative, and noncontingent relations between freedom and responsibility. The authors framed items so that half dealt with freedom given responsibility, and the other half dealt with responsibility given freedom. Results indicated participants were more likely to endorse positive-contingency items than they were negative and noncontingency items when items were framed around freedom given responsibility. However, when items were framed around responsibility given freedom, no such differences emerged. The authors discuss results relative to cultural and sociopolitical differences and similarities between children in Cyprus and participants in the United States and implications concerning the present study and previous studies regarding these constructs.

  7. A signal detection-item response theory model for evaluating neuropsychological measures.

    PubMed

    Thomas, Michael L; Brown, Gregory G; Gur, Ruben C; Moore, Tyler M; Patt, Virginie M; Risbrough, Victoria B; Baker, Dewleen G

    2018-02-05

    Models from signal detection theory are commonly used to score neuropsychological test data, especially tests of recognition memory. Here we show that certain item response theory models can be formulated as signal detection theory models, thus linking two complementary but distinct methodologies. We then use the approach to evaluate the validity (construct representation) of commonly used research measures, demonstrate the impact of conditional error on neuropsychological outcomes, and evaluate measurement bias. Signal detection-item response theory (SD-IRT) models were fitted to recognition memory data for words, faces, and objects. The sample consisted of U.S. Infantry Marines and Navy Corpsmen participating in the Marine Resiliency Study. Data comprised item responses to the Penn Face Memory Test (PFMT; N = 1,338), Penn Word Memory Test (PWMT; N = 1,331), and Visual Object Learning Test (VOLT; N = 1,249), and self-report of past head injury with loss of consciousness. SD-IRT models adequately fitted recognition memory item data across all modalities. Error varied systematically with ability estimates, and distributions of residuals from the regression of memory discrimination onto self-report of past head injury were positively skewed towards regions of larger measurement error. Analyses of differential item functioning revealed little evidence of systematic bias by level of education. SD-IRT models benefit from the measurement rigor of item response theory, which permits the modeling of item difficulty and examinee ability, and from signal detection theory, which provides an interpretive framework encompassing the experimentally validated constructs of memory discrimination and response bias. We used this approach to validate the construct representation of commonly used research measures and to demonstrate how nonoptimized item parameters can lead to erroneous conclusions when interpreting neuropsychological test data. Future work might include the development of computerized adaptive tests and integration with mixture and random-effects models.
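
    The two signal detection constructs named in this record, memory discrimination and response bias, are classically summarized by the indices d' and c. The sketch below shows those textbook equal-variance Gaussian indices only; it is not the SD-IRT model fitted in the study, and the function name and example rates are hypothetical.

```python
from statistics import NormalDist


def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Classical equal-variance Gaussian indices for a recognition test.

    Illustrative helper (an assumption, not the study's model). Both rates
    must lie strictly between 0 and 1, otherwise inv_cdf raises an error.
    """
    z = NormalDist().inv_cdf          # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa            # discrimination
    criterion = -0.5 * (z_hit + z_fa)  # response bias
    return d_prime, criterion


# Made-up rates chosen so that z(hit) = 1 and z(false alarm) = -1.
hit = NormalDist().cdf(1.0)
fa = NormalDist().cdf(-1.0)
d_prime, criterion = dprime_and_criterion(hit, fa)
```

    In practice, hit or false-alarm rates of exactly 0 or 1 need a correction before the transform can be applied; which correction to use is a separate choice not covered here.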

  8. Algorithms for the Construction of Parallel Tests by Zero-One Programming. Project Psychometric Aspects of Item Banking No. 7. Research Report 86-7.

    ERIC Educational Resources Information Center

    Boekkooi-Timminga, Ellen

    Nine methods for automated test construction are described. All are based on the concepts of information from item response theory. Two general kinds of methods for the construction of parallel tests are presented: (1) sequential test design; and (2) simultaneous test design. Sequential design implies that the tests are constructed one after the…

  9. Methodology for Developing and Evaluating the PROMIS® Smoking Item Banks

    PubMed Central

    Cai, Li; Stucky, Brian D.; Tucker, Joan S.; Shadel, William G.; Edelen, Maria Orlando

    2014-01-01

    Introduction: This article describes the procedures used in the PROMIS® Smoking Initiative for the development and evaluation of item banks, short forms (SFs), and computerized adaptive tests (CATs) for the assessment of 6 constructs related to cigarette smoking: nicotine dependence, coping expectancies, emotional and sensory expectancies, health expectancies, psychosocial expectancies, and social motivations for smoking. Methods: Analyses were conducted using response data from a large national sample of smokers. Items related to each construct were subjected to extensive item factor analyses and evaluation of differential item functioning (DIF). Final item banks were calibrated, and SF assessments were developed for each construct. The performance of the SFs and the potential use of the item banks for CAT administration were examined through simulation study. Results: Item selection based on dimensionality assessment and DIF analyses produced item banks that were essentially unidimensional in structure and free of bias. Simulation studies demonstrated that the constructs could be accurately measured with a relatively small number of carefully selected items, either through fixed SFs or CAT-based assessment. Illustrative results are presented, and subsequent articles provide detailed discussion of each item bank in turn. Conclusions: The development of the PROMIS smoking item banks provides researchers with new tools for measuring smoking-related constructs. The use of the calibrated item banks and suggested SF assessments will enhance the quality of score estimates, thus advancing smoking research. Moreover, the methods used in the current study, including innovative approaches to item selection and SF construction, may have general relevance to item bank development and evaluation. PMID:23943843

  10. Estimating the Nominal Response Model under Nonnormal Conditions

    ERIC Educational Resources Information Center

    Preston, Kathleen Suzanne Johnson; Reise, Steven Paul

    2014-01-01

    The nominal response model (NRM), a much understudied polytomous item response theory (IRT) model, provides researchers the unique opportunity to evaluate within-item category distinctions. Polytomous IRT models, such as the NRM, are frequently applied to psychological assessments representing constructs that are unlikely to be normally…

  11. Item Estimates under Low-Stakes Conditions: How Should Omits Be Treated?

    ERIC Educational Resources Information Center

    DeMars, Christine

    Using data from a pilot test of science and math from students in 30 high schools, item difficulties were estimated with a one-parameter model (partial-credit model for the multi-point items). Some items were multiple-choice items, and others were constructed-response items (open-ended). Four sets of estimates were obtained: estimates for males…

  12. [Application of decision curve on evaluation of MRI predictive model for early assessing pathological complete response to neoadjuvant therapy in breast cancer].

    PubMed

    He, Y J; Li, X T; Fan, Z Q; Li, Y L; Cao, K; Sun, Y S; Ouyang, T

    2018-01-23

    Objective: To construct a dynamic enhanced MR based predictive model for early assessing pathological complete response (pCR) to neoadjuvant therapy in breast cancer, and to evaluate the clinical benefit of the model by using decision curve. Methods: From December 2005 to December 2007, 170 patients with breast cancer treated with neoadjuvant therapy were identified and their MR images before neoadjuvant therapy and at the end of the first cycle of neoadjuvant therapy were collected. Logistic regression model was used to detect independent factors for predicting pCR and construct the predictive model accordingly, then receiver operating characteristic (ROC) curve and decision curve were used to evaluate the predictive model. Results: ΔArea(max) and Δslope(max) were independent predictive factors for pCR, OR =0.942 (95% CI : 0.918-0.967) and 0.961 (95% CI : 0.940-0.987), respectively. The area under ROC curve (AUC) for the constructed model was 0.886 (95% CI : 0.820-0.951). Decision curve showed that in the range of the threshold probability above 0.4, the predictive model presented increased net benefit as the threshold probability increased. Conclusions: The constructed predictive model for pCR is of potential clinical value, with an AUC>0.85. Meanwhile, decision curve analysis indicates the constructed predictive model has net benefit from 3 to 8 percent in the likely range of probability threshold from 80% to 90%.
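
    The decision curve mentioned in this record plots net benefit against the threshold probability. As a minimal sketch, assuming the standard definition of net benefit (true positives per patient minus false positives per patient weighted by the odds of the threshold), the calculation at a single threshold could look as follows. The function name and the toy data are hypothetical and do not come from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted probability >= threshold.

    y_true: 0/1 outcomes; y_prob: predicted probabilities; 0 < threshold < 1.
    Illustrative helper, not the authors' implementation.
    """
    n = len(y_true)
    pairs = list(zip(y_true, y_prob))
    true_pos = sum(1 for y, p in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in pairs if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


# Made-up outcomes and predicted probabilities for four patients.
outcomes = [1, 0, 1, 0]
predicted = [0.9, 0.8, 0.3, 0.1]
nb_half = net_benefit(outcomes, predicted, 0.5)
nb_low = net_benefit(outcomes, predicted, 0.2)
```

    Evaluating this function over a grid of thresholds, alongside the "treat all" and "treat none" strategies, gives the curves that a decision curve analysis compares.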

  13. A Multidimensional Partial Credit Model with Associated Item and Test Statistics: An Application to Mixed-Format Tests

    ERIC Educational Resources Information Center

    Yao, Lihua; Schwarz, Richard D.

    2006-01-01

    Multidimensional item response theory (IRT) models have been proposed for better understanding the dimensional structure of data or to define diagnostic profiles of student learning. A compensatory multidimensional two-parameter partial credit model (M-2PPC) for constructed-response items is presented that is a generalization of those proposed to…

  14. Covariates of the Rating Process in Hierarchical Models for Multiple Ratings of Test Items

    ERIC Educational Resources Information Center

    Mariano, Louis T.; Junker, Brian W.

    2007-01-01

    When constructed response test items are scored by more than one rater, the repeated ratings allow for the consideration of individual rater bias and variability in estimating student proficiency. Several hierarchical models based on item response theory have been introduced to model such effects. In this article, the authors demonstrate how these…

  15. Measurement of activity limitations and participation restrictions: examination of ICF-linked content and scale properties of the FIM and PC-PART instruments.

    PubMed

    Darzins, Susan W; Imms, Christine; Di Stefano, Marilyn

    2017-05-01

    To explore the operationalization of activity and participation-related measurement constructs through comparison of item phrasing, item response categories and scoring (scale properties) for two separate instruments targeting activities of daily living. Personal Care Participation Assessment and Resource Tool (PC-PART) item content was linked to ICF categories using established linking rules. Previously reported ICF-linked FIM content categories and ICF-linked PC-PART content categories were compared to identify common ICF categories between the instruments. Scale properties of both instruments were compared using a patient scenario to explore the instruments' separate measurement constructs. The PC-PART and FIM shared 15 of the 53 level two ICF-linked categories identified across both instruments. Examination of the instruments' scale properties for items with overlapping ICF content, and exploration through a patient scenario, provided supportive evidence that the instruments measure different constructs. While the PC-PART and FIM share common ICF-linked content, they measure separate constructs. Measurement construct was influenced by the instruments' scale properties. The FIM was observed to measure activity limitations and the PC-PART measured participation restrictions. Scrutiny of instruments' scale properties in addition to item content is critical in the operationalization of activity and participation-related measurement constructs. Implications for Rehabilitation: When selecting outcome measures for use in rehabilitation it is necessary to examine both the content of the instruments' items and item phrasing, response categories and scoring, to clarify the construct being measured. Measurement of activity limitations as well as participation restrictions in activities of daily living required for community life provides a more comprehensive measurement of rehabilitation outcomes than measurement of either construct alone. To measure the effects of interventions used in rehabilitation, it is necessary to select measures with relevant content and scale properties that enable evaluation of change in the constructs that are expected to change as a result of the rehabilitation intervention.

  16. 23 CFR 635.116 - Subcontracting and contractor responsibilities.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... TRAFFIC OPERATIONS CONSTRUCTION AND MAINTENANCE Contract Procedures § 635.116 Subcontracting and contractor responsibilities. (a) Contracts for projects shall specify the minimum percentage of work that a... total original contract price excluding any identified specialty items. Specialty items may be performed...

  17. Item response theory scoring and the detection of curvilinear relationships.

    PubMed

    Carter, Nathan T; Dalal, Dev K; Guan, Li; LoPilato, Alexander C; Withrow, Scott A

    2017-03-01

    Psychologists are increasingly positing theories of behavior that suggest psychological constructs are curvilinearly related to outcomes. However, results from empirical tests for such curvilinear relations have been mixed. We propose that correctly identifying the response process underlying responses to measures is important for the accuracy of these tests. Indeed, past research has indicated that item responses to many self-report measures follow an ideal point response process, wherein respondents agree only to items that reflect their own standing on the measured variable, as opposed to a dominance process, wherein stronger agreement, regardless of item content, is always indicative of higher standing on the construct. We test whether item response theory (IRT) scoring appropriate for the underlying response process to self-report measures results in more accurate tests for curvilinearity. In 2 simulation studies, we show that, regardless of the underlying response process used to generate the data, using the traditional sum-score generally results in high Type 1 error rates or low power for detecting curvilinearity, depending on the distribution of item locations. With few exceptions, appropriate power and Type 1 error rates are achieved when dominance-based and ideal point-based IRT scoring are correctly used to score dominance and ideal point response data, respectively. We conclude that (a) researchers should be theory-guided when hypothesizing and testing for curvilinear relations; (b) correctly identifying whether responses follow an ideal point versus dominance process, particularly when items are not extreme, is critical; and (c) IRT model-based scoring is crucial for accurate tests of curvilinearity. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  18. Methodology for developing and evaluating the PROMIS smoking item banks.

    PubMed

    Hansen, Mark; Cai, Li; Stucky, Brian D; Tucker, Joan S; Shadel, William G; Edelen, Maria Orlando

    2014-09-01

    This article describes the procedures used in the PROMIS Smoking Initiative for the development and evaluation of item banks, short forms (SFs), and computerized adaptive tests (CATs) for the assessment of 6 constructs related to cigarette smoking: nicotine dependence, coping expectancies, emotional and sensory expectancies, health expectancies, psychosocial expectancies, and social motivations for smoking. Analyses were conducted using response data from a large national sample of smokers. Items related to each construct were subjected to extensive item factor analyses and evaluation of differential item functioning (DIF). Final item banks were calibrated, and SF assessments were developed for each construct. The performance of the SFs and the potential use of the item banks for CAT administration were examined through simulation study. Item selection based on dimensionality assessment and DIF analyses produced item banks that were essentially unidimensional in structure and free of bias. Simulation studies demonstrated that the constructs could be accurately measured with a relatively small number of carefully selected items, either through fixed SFs or CAT-based assessment. Illustrative results are presented, and subsequent articles provide detailed discussion of each item bank in turn. The development of the PROMIS smoking item banks provides researchers with new tools for measuring smoking-related constructs. The use of the calibrated item banks and suggested SF assessments will enhance the quality of score estimates, thus advancing smoking research. Moreover, the methods used in the current study, including innovative approaches to item selection and SF construction, may have general relevance to item bank development and evaluation. © The Author 2013. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  19. Ramsay-Curve Differential Item Functioning

    ERIC Educational Resources Information Center

    Woods, Carol M.

    2011-01-01

    Differential item functioning (DIF) occurs when an item on a test, questionnaire, or interview has different measurement properties for one group of people versus another, irrespective of true group-mean differences on the constructs being measured. This article is focused on item response theory based likelihood ratio testing for DIF (IRT-LR or…

  20. CrMAPK3 regulates the expression of iron-deficiency-responsive genes in Chlamydomonas reinhardtii.

    PubMed

    Fei, Xiaowen; Yu, Junmei; Li, Yajun; Deng, Xiaodong

    2017-05-16

    Under iron-deficient conditions, Chlamydomonas exhibits high affinity for iron absorption. Nevertheless, the response, transmission, and regulation of downstream gene expression in algae cells have not yet been investigated. Considering that the MAPK pathway is essential for abiotic stress responses, we determined whether this pathway is involved in iron deficiency signal transduction in Chlamydomonas. Arabidopsis MAPK gene sequences were used as entry data to search for homologous genes in the Chlamydomonas reinhardtii genome database to investigate the functions of the mitogen-activated protein kinase (MAPK) gene family in C. reinhardtii under iron-free conditions. Results revealed 16 C. reinhardtii MAPK genes labeled CrMAPK2-CrMAPK17 with TXY conserved domains and low homology to MAPK in yeast, Arabidopsis, and humans. The expression levels of these genes were then analyzed through qRT-PCR and exposure to high salt (150 mM NaCl), low nitrogen, or iron-free conditions. The expression levels of these genes were also subjected to adverse stress conditions. The mRNA levels of CrMAPK2, CrMAPK3, CrMAPK4, CrMAPK5, CrMAPK6, CrMAPK8, CrMAPK9, and CrMAPK11 were remarkably upregulated under iron-deficient stress. The increase in CrMAPK3 expression was 43-fold greater than that in the control. An RNA interference vector was constructed and transformed into C. reinhardtii 2A38, an algal strain with an exogenous FOX1:ARS chimeric gene, to silence CrMAPK3. After this gene was silenced, the mRNA levels and ARS activities of FOX1:ARS chimeric gene and endogenous CrFOX1 were decreased. The mRNA levels of iron-responsive genes, such as CrNRAMP2, CrATX1, CrFTR1, and CrFEA1, were also remarkably reduced. CrMAPK3 regulates the expression of iron-deficiency-responsive genes in C. reinhardtii.

  1. Translation, Cross-cultural Adaptation and Psychometric Validation of the Korean-Language Cardiac Rehabilitation Barriers Scale (CRBS-K)

    PubMed Central

    2017-01-01

    Objective: To perform a translation and cross-cultural adaptation of the Cardiac Rehabilitation Barriers Scale (CRBS) for use in Korea, followed by psychometric validation. The CRBS was developed to assess patients' perception of the degree to which patient, provider and health system-level barriers affect their cardiac rehabilitation (CR) participation. Methods: The CRBS consists of 21 items (barriers to adherence) rated on a 5-point Likert scale. The first phase was to translate and cross-culturally adapt the CRBS to the Korean language. After back-translation, both versions were reviewed by a committee. The face validity was assessed in a sample of Korean patients (n=53) with a history of acute myocardial infarction who did not participate in CR, through semi-structured interviews. The second phase was to assess the construct and criterion validity of the Korean translation as well as internal reliability, through administration of the translated version in 104 patients, principal component analysis with varimax rotation and cross-referencing against CR use, respectively. Results: The length, readability, and clarity of the questionnaire were rated well, demonstrating face validity. Analysis revealed a six-factor solution, demonstrating construct validity. Cronbach's alpha was greater than 0.65. Barriers rated highest included not knowing about CR and not being contacted by a program. The mean CRBS score was significantly higher among non-attendees (2.71±0.26) than CR attendees (2.51±0.18) (p<0.01). Conclusion: The Korean version of CRBS has demonstrated face, content and criterion validity, suggesting it may be useful for assessing barriers to CR utilization in Korea. PMID:29201826

  2. The Work Instability Scale for Rheumatoid Arthritis (RA-WIS): Does it work in osteoarthritis?

    PubMed

    Tang, Kenneth; Beaton, Dorcas E; Lacaille, Diane; Gignac, Monique A M; Zhang, Wei; Anis, Aslam H; Bombardier, Claire

    2010-09-01

    To validate the 23-item Work Instability Scale for Rheumatoid Arthritis (RA-WIS) for use in osteoarthritis (OA) using both classical test theory and item response theory approaches. Baseline and 12-month follow-up data were collected from workers with OA recruited from community and clinical settings (n = 130). Fit of RA-WIS data to the Rasch model was evaluated by item- and person-fit statistics (size of residual, chi-sq), assessments of differential item functioning, and tests of unidimensionality and local independence. Internal consistency was assessed by KR-20. Convergent construct validity (Spearman r, known-groups) was evaluated against theoretical constructs that assess impact of health on work. Responsiveness to global indicators of change was assessed by standardized response means (SRM) and area under the receiver operating characteristic curves. Data structure of the RA-WIS showed adequate fit to the Rasch model (chi-sq = 83.2, P = 0.03) after addressing local dependency in three item pairs by creating testlets. High internal consistency (KR-20 = 0.93) and convergent validity with work-oriented constructs (|r| = 0.55-0.77) were evident. The RA-WIS correlated most strongly with the concept of illness intrusiveness (r = 0.77) and was highly responsive to changes (SRM = 1.05 [deterioration]; -0.78 [improvement]). Although developed for RA, the RA-WIS is psychometrically sound for OA and demonstrates interval-level property.
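    The abstract above reports KR-20 and standardized response means (SRM) without stating how they are computed. The following is a minimal sketch of the textbook definitions, not the authors' code: the function names and toy data are hypothetical, and the KR-20 function assumes the population (divide-by-n) variance of total scores, which is one of two common conventions.

```python
from statistics import mean, stdev
from typing import Sequence


def kr20(responses: Sequence[Sequence[int]]) -> float:
    """Kuder-Richardson 20 for dichotomous (0/1) items.

    Rows are respondents, columns are items. Uses the population
    variance of total scores (an assumed convention).
    """
    n = len(responses)
    k = len(responses[0])
    # Proportion of respondents scoring 1 on each item.
    p = [sum(row[j] for row in responses) / n for j in range(k)]
    sum_pq = sum(pj * (1.0 - pj) for pj in p)
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1.0 - sum_pq / var_total)


def standardized_response_mean(change_scores: Sequence[float]) -> float:
    """SRM: mean change divided by the sample SD of the change scores."""
    return mean(change_scores) / stdev(change_scores)


# Hypothetical toy data: 4 respondents, 3 items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(toy))
print(standardized_response_mean([1.0, 2.0, 3.0]))
```

    The sign of an SRM follows the direction of the change scores, which is why the abstract reports a positive value for deterioration and a negative one for improvement.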

  3. Confirmatory Factor Analysis of the Finnish Job Content Questionnaire (JCQ) in 590 Professional Musicians.

    PubMed

    Vastamäki, Heidi; Vastamäki, Martti; Laimi, Katri; Saltychev, Michail

    2017-07-01

    Poorly functioning work environments may lead to dissatisfaction for the employees and financial loss for the employers. The Job Content Questionnaire (JCQ) was designed to measure social and psychological characteristics of work environments. To investigate the factor construct of the Finnish 14-item version of JCQ when applied to professional orchestra musicians. In a cross-sectional survey, the questionnaire was sent by mail to 1550 orchestra musicians and students. 630 responses were received. Full data were available for 590 respondents (response rate 38%). The questionnaire also contained questions on demographics, job satisfaction, health status, health behaviors, and intensity of playing music. Confirmatory factor analysis of the 2-factor model of JCQ was conducted. Of the 5 estimates, JCQ items in the "job demand" construct, the "conflicting demands" (question 5) explained most of the total variance in this construct (79%) demonstrating almost perfect correlation of 0.63. In the construct of "job control," "opinions influential" (question 10) demonstrated a perfect correlation index of 0.84 and the items "little decision freedom" (question 14) and "allows own decisions" (question 6) showed substantial correlations of 0.77 and 0.65. The 2-factor model of the Finnish 14-item version of JCQ proposed in this study fitted well into the observed data. The "conflicting demands," "opinions influential," "little decision freedom," and "allows own decisions" items demonstrated the strongest correlations with latent factors suggesting that in a population similar to the studied one, especially these items should be taken into account when observed in the response of a population.

  4. Forced-Choice Assessment of Work-Related Maladaptive Personality Traits: Preliminary Evidence From an Application of Thurstonian Item Response Modeling.

    PubMed

    Guenole, Nigel; Brown, Anna A; Cooper, Andrew J

    2018-06-01

    This article describes an investigation of whether Thurstonian item response modeling is a viable method for assessment of maladaptive traits. Forced-choice responses from 420 working adults to a broad-range personality inventory assessing six maladaptive traits were considered. The Thurstonian item response model's fit to the forced-choice data was adequate, while the fit of a counterpart item response model to responses to the same items but arranged in a single-stimulus design was poor. Monotrait heteromethod correlations indicated corresponding traits in the two formats overlapped substantially, although they did not measure equivalent constructs. A better goodness of fit and higher factor loadings for the Thurstonian item response model, coupled with a clearer conceptual alignment to the theoretical trait definitions, suggested that the single-stimulus item responses were influenced by biases that the independent clusters measurement model did not account for. Researchers may wish to consider forced-choice designs and appropriate item response modeling techniques such as Thurstonian item response modeling for personality questionnaire applications in industrial psychology, especially when assessing maladaptive traits. We recommend further investigation of this approach in actual selection situations and with different assessment instruments.

  5. Item Vector Plots for the Multidimensional Three-Parameter Logistic Model

    ERIC Educational Resources Information Center

    Bryant, Damon; Davis, Larry

    2011-01-01

    This brief technical note describes how to construct item vector plots for dichotomously scored items fitting the multidimensional three-parameter logistic model (M3PLM). As multidimensional item response theory (MIRT) shows promise of being a very useful framework in the test development life cycle, graphical tools that facilitate understanding…

  6. Environmental Knowledge and Beliefs among Grade 10 Students in Australia.

    ERIC Educational Resources Information Center

    Eyers, Vivian George

    To develop environmental education in Australia, a survey of tenth-grade students was undertaken. Thirty knowledge items and ten belief items were constructed. A panel of environmentalists and educators identified best responses for the knowledge items, and a common reference point, preservation of homo sapiens, for the belief items, so a…

  7. Measuring organizational effectiveness in information and communication technology companies using item response theory.

    PubMed

    Trierweiller, Andréa Cristina; Peixe, Blênio César Severo; Tezza, Rafael; Pereira, Vera Lúcia Duarte do Valle; Pacheco, Waldemar; Bornia, Antonio Cezar; de Andrade, Dalton Francisco

    2012-01-01

    The aim of this paper is to measure the effectiveness of Information and Communication Technology (ICT) organizations from the point of view of the manager, using Item Response Theory (IRT). There is a need to verify the effectiveness of these organizations, which are normally associated with complex, dynamic, and competitive environments. In academic literature, there is disagreement surrounding the concept of organizational effectiveness and its measurement. A construct was elaborated based on dimensions of effectiveness to guide the construction of the questionnaire items, which were submitted to specialists for evaluation. It proved viable for measuring the organizational effectiveness of ICT companies from the point of view of a manager using the Two-Parameter Logistic Model (2PLM) of IRT. This modeling permits us to evaluate the quality and properties of each item and to place items and respondents on a single scale, which is not possible when using other similar tools.
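    The two-parameter logistic model named in the abstract above can be illustrated briefly. This is a sketch of the standard 2PL item response function and its item information, not the authors' analysis; the function names and parameter values are hypothetical, and the optional scaling constant D = 1.7 used in some formulations is omitted.

```python
import math


def p_2pl(theta: float, a: float, b: float) -> float:
    """Probability of endorsing an item under the 2PL model.

    theta: respondent's latent trait level
    a:     item discrimination
    b:     item difficulty (location on the same scale as theta)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical item with discrimination 2.0 and difficulty 0.5.
for theta in (-1.0, 0.5, 2.0):
    print(theta, p_2pl(theta, 2.0, 0.5), item_information(theta, 2.0, 0.5))
```

    Because theta and b sit on the same axis, items and respondents can be compared directly, which is the "single scale" property the abstract refers to.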

  8. Preschool Gifted Education: Perceived Challenges Associated with Program Development

    ERIC Educational Resources Information Center

    Kettler, Todd; Oveross, Mattie E.; Salman, Rania C.

    2017-01-01

    This descriptive study investigated the challenges related to implementing gifted education services in preschool centers. Participants were 254 licensed preschool center directors in a southern state. Participants completed a researcher-created survey including both selected response items and constructed response items to examine the perceived…

  9. Using the Rasch Measurement Model in Psychometric Analysis of the Family Effectiveness Measure

    PubMed Central

    McCreary, Linda L.; Conrad, Karen M.; Conrad, Kendon J.; Scott, Christy K; Funk, Rodney R.; Dennis, Michael L.

    2013-01-01

    Background Valid assessment of family functioning can play a vital role in optimizing client outcomes. Because family functioning is influenced by family structure, socioeconomic context, and culture, existing measures of family functioning--primarily developed with nuclear, middle class European American families--may not be valid assessments of families in diverse populations. The Family Effectiveness Measure was developed to address this limitation. Objectives To test the Family Effectiveness Measure with data from a primarily low-income African American convenience sample, using the Rasch measurement model. Method A sample of 607 adult women completed the measure. Rasch analysis was used to assess unidimensionality, response category functioning, item fit, person reliability, differential item functioning by race and parental status, and item hierarchy. Criterion-related validity was tested using correlations with five other variables related to family functioning. Results The Family Effectiveness Measure measures two separate constructs: The effective family functioning construct was a psychometrically sound measure of the target construct that was more efficient due to the deletion of 22 items. The ineffective family functioning construct consisted of 16 of those deleted items but was not as strong psychometrically. Items in both constructs evidenced no differential item functioning by race. Criterion-related validity was supported for both. Discussion In contrast to the prevailing conceptualization that family functioning is a single construct, assessed by positively and negatively worded items, use of the Rasch analysis suggested the existence of two constructs. While the effective family functioning is a strong and efficient measure of family functioning, the ineffective family functioning will require additional item development and psychometric testing. PMID:23636342

  10. Calibration of the Dutch-Flemish PROMIS Pain Behavior item bank in patients with chronic pain.

    PubMed

    Crins, M H P; Roorda, L D; Smits, N; de Vet, H C W; Westhovens, R; Cella, D; Cook, K F; Revicki, D; van Leeuwen, J; Boers, M; Dekker, J; Terwee, C B

    2016-02-01

    The aims of the current study were to calibrate the item parameters of the Dutch-Flemish PROMIS Pain Behavior item bank using a sample of Dutch patients with chronic pain and to evaluate cross-cultural validity between the Dutch-Flemish and the US PROMIS Pain Behavior item banks. Furthermore, reliability and construct validity of the Dutch-Flemish PROMIS Pain Behavior item bank were evaluated. The 39 items in the bank were completed by 1042 Dutch patients with chronic pain. To evaluate unidimensionality, a one-factor confirmatory factor analysis (CFA) was performed. A graded response model (GRM) was used to calibrate the items. To evaluate cross-cultural validity, Differential item functioning (DIF) for language (Dutch vs. English) was evaluated. Reliability of the item bank was also examined and construct validity was studied using several legacy instruments, e.g. the Roland Morris Disability Questionnaire. CFA supported the unidimensionality of the Dutch-Flemish PROMIS Pain Behavior item bank (CFI = 0.960, TLI = 0.958), the data also fit the GRM, and demonstrated good coverage across the pain behavior construct (threshold parameters range: -3.42 to 3.54). Analysis showed good cross-cultural validity (only six DIF items), reliability (Cronbach's α = 0.95) and construct validity (all correlations ≥0.53). The Dutch-Flemish PROMIS Pain Behavior item bank was found to have good cross-cultural validity, reliability and construct validity. The development of the Dutch-Flemish PROMIS Pain Behavior item bank will serve as the basis for Dutch-Flemish PROMIS short forms and computer adaptive testing (CAT). © 2015 European Pain Federation - EFIC®

  11. Measuring the ICF components of impairment, activity limitation and participation restriction: an item analysis using classical test theory and item response theory

    PubMed Central

    Pollard, Beth; Dixon, Diane; Dieppe, Paul; Johnston, Marie

    2009-01-01

    Background The International Classification of Functioning, Disability and Health (ICF) proposes three main health outcomes, Impairment (I), Activity Limitation (A) and Participation Restriction (P), but good measures of these constructs are needed. The aim of this study was to use both Classical Test Theory (CTT) and Item Response Theory (IRT) methods to carry out an item analysis to improve measurement of these three components in patients having joint replacement surgery mainly for osteoarthritis (OA). Methods A geographical cohort of patients about to undergo lower limb joint replacement was invited to participate. Five hundred and twenty four patients completed ICF items that had been previously identified as measuring only a single ICF construct in patients with osteoarthritis. There were 13 I, 26 A and 20 P items. The SF-36 was used to explore the construct validity of the resultant I, A and P measures. The CTT and IRT analyses were run separately to identify items for inclusion or exclusion in the measurement of each construct. The results from both analyses were compared and contrasted. Results Overall, the item analysis resulted in the removal of 4 I items, 9 A items and 11 P items. CTT and IRT identified the same 14 items for removal, with CTT additionally excluding 3 items, and IRT a further 7 items. In a preliminary exploration of reliability and validity, the new measures appeared acceptable. Conclusion New measures were developed that reflect the ICF components of Impairment, Activity Limitation and Participation Restriction for patients with advanced arthritis. The resulting Aberdeen IAP measures (Ab-IAP) comprising I (Ab-I, 9 items), A (Ab-A, 17 items), and P (Ab-P, 9 items) met the criteria of conventional psychometric (CTT) analyses and the additional criteria (information and discrimination) of IRT. The use of both methods was more informative than the use of only one of these methods. Thus combining CTT and IRT appears to be a valuable tool in the development of measures. PMID:19422677

  12. A Multivariate Multilevel Approach to the Modeling of Accuracy and Speed of Test Takers

    ERIC Educational Resources Information Center

    Klein Entink, R. H.; Fox, J. P.; van der Linden, W. J.

    2009-01-01

    Response times on test items are easily collected in modern computerized testing. When collecting both (binary) responses and (continuous) response times on test items, it is possible to measure the accuracy and speed of test takers. To study the relationships between these two constructs, the model is extended with a multivariate multilevel…

  13. Test of Achievement in Quantitative Economics for Secondary Schools: Construction and Validation Using Item Response Theory

    ERIC Educational Resources Information Center

    Eleje, Lydia I.; Esomonu, Nkechi P. M.

    2018-01-01

    A test to measure achievement in quantitative economics among secondary school students was developed and validated in this study. The test is made up of 20 multiple-choice test items constructed based on quantitative economics sub-skills. Six research questions guided the study. Preliminary validation was done by two experienced teachers in…

  14. Measuring the quality of life in hypertension according to Item Response Theory

    PubMed Central

    Borges, José Wicto Pereira; Moreira, Thereza Maria Magalhães; Schmitt, Jeovani; de Andrade, Dalton Francisco; Barbetta, Pedro Alberto; de Souza, Ana Célia Caetano; Lima, Daniele Braz da Silva; Carvalho, Irialda Saboia

    2017-01-01

    ABSTRACT OBJECTIVE To analyze the Miniquestionário de Qualidade de Vida em Hipertensão Arterial (MINICHAL – Mini-questionnaire of Quality of Life in Hypertension) using the Item Response Theory. METHODS This is an analytical study conducted with 712 persons with hypertension treated in thirteen primary health care units of Fortaleza, State of Ceará, Brazil, in 2015. The steps of the analysis by the Item Response Theory were: evaluation of dimensionality, estimation of parameters of items, and construction of scale. The study of dimensionality was carried out on the polychoric correlation matrix and confirmatory factor analysis. To estimate the item parameters, we used the Gradual Response Model of Samejima. The analyses were conducted using the free software R with the aid of psych and mirt. RESULTS The analysis has allowed the visualization of item parameters and their individual contributions in the measurement of the latent trait, generating more information and allowing the construction of a scale with an interpretative model that demonstrates the evolution of the worsening of the quality of life in five levels. Regarding the item parameters, the items related to the somatic state have had a good performance, as they have presented better power to discriminate individuals with worse quality of life. The items related to mental state have been those which contributed with less psychometric data in the MINICHAL. CONCLUSIONS We conclude that the instrument is suitable for the identification of the worsening of the quality of life in hypertension. The analysis of the MINICHAL using the Item Response Theory has allowed us to identify new sides of this instrument that have not yet been addressed in previous studies. PMID:28492764

  15. Comparison of integrated testlet and constructed-response question formats

    NASA Astrophysics Data System (ADS)

    Slepkov, Aaron D.; Shiell, Ralph C.

    2014-12-01

    Constructed-response (CR) questions are a mainstay of introductory physics textbooks and exams. However, because of the time, cost, and scoring reliability constraints associated with this format, CR questions are being increasingly replaced by multiple-choice (MC) questions in formal exams. The integrated testlet (IT) is a recently developed question structure designed to provide a proxy of the pedagogical advantages of CR questions while procedurally functioning as a set of MC questions. ITs utilize an answer-until-correct response format that provides immediate confirmatory or corrective feedback, and they thus allow not only for the granting of partial credit in cases of initially incorrect reasoning, but, furthermore, the ability to build cumulative question structures. Here, we report on a study that directly compares the functionality of ITs and CR questions in introductory physics exams. To do this, CR questions were converted to concept-equivalent ITs, and both sets of questions were deployed in midterm and final exams. We find that both question types provide adequate discrimination between stronger and weaker students, with CR questions discriminating slightly better than the ITs. There is some indication that any difference in discriminatory power may result from the baseline score for guessing that is inherent in MC testing. Meanwhile, an analysis of interrater scoring of the CR questions raises serious concerns about the reliability of the granting of partial credit when this traditional assessment technique is used in a realistic (but nonoptimized) setting. Furthermore, we show evidence that partial credit is granted in a valid manner in the ITs. Thus, together with consideration of the vastly reduced costs of administering IT-based examinations compared to CR-based examinations, our findings indicate that ITs are viable replacements for CR questions in formal examinations where it is desirable both to assess concept integration and to reward partial knowledge, while efficiently scoring examinations.

  16. The Performance of IRT Model Selection Methods with Mixed-Format Tests

    ERIC Educational Resources Information Center

    Whittaker, Tiffany A.; Chang, Wanchen; Dodd, Barbara G.

    2012-01-01

    When tests consist of multiple-choice and constructed-response items, researchers are confronted with the question of which item response theory (IRT) model combination will appropriately represent the data collected from these mixed-format tests. This simulation study examined the performance of six model selection criteria, including the…

  17. The Arabic Version of The Depression Anxiety Stress Scale-21: Cumulative scaling and discriminant-validation testing.

    PubMed

    Ali, Amira Mohammed; Ahmed, Anwar; Sharaf, Amira; Kawakami, Norito; Abdeldayem, Samia M; Green, Joseph

    2017-12-01

    This study aimed to examine the validity of the Arabic version of the Depression Anxiety Stress Scale-21 (DASS-21) in 149 illicit drug users. We calculated α coefficient, inter-item and item-total correlations, coefficients of reproducibility and scalability (CR and CS), item difficulty and discrimination indices. The DASS-21 had an acceptable reliability; but values of the CR and the CS were less than acceptable. Items varied in difficulty and discrimination; some items are candidates for elimination. The DASS-21 is a probabilistic and not a deterministic measure of distress; it has problematic items and needs further investigations. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. A double-blind, placebo-controlled, exploratory trial of chromium picolinate in atypical depression: effect on carbohydrate craving.

    PubMed

    Docherty, John P; Sack, David A; Roffman, Mark; Finch, Manley; Komorowski, James R

    2005-09-01

    In a small pilot trial, patients with atypical depression demonstrated significant positive therapeutic response to chromium picolinate. This finding is of interest because of the demonstrated link between depression, decreased insulin sensitivity, and subsequent diabetes and chromium picolinate's insulin enhancing effect. In this double-blind, multicenter, 8-week replication study, 113 adult outpatients with atypical depression were randomized 2:1 to receive 600 μg/day of elemental chromium, as provided by chromium picolinate (CrPic), or placebo. Primary efficacy measures were the 29-item Hamilton Depression Rating Scale (HAM-D-29) and the Clinical Global Impressions Improvement Scale (CGI-I). Of the 113 randomized patients, 110 (70 CrPic, 40 placebo) constituted the intent-to-treat (ITT) population (i.e., received at least one dose of study medication and completed at least one efficacy evaluation) and 75 (50 CrPic, 25 placebo) were evaluable (i.e., took at least 80% of study drug with no significant protocol deviations). In the evaluable population, mean age was 46 years, 69% were female, 81% were Caucasian, and mean body mass index (BMI) was 29.7. There was no significant difference between the CrPic and placebo groups in both the ITT and evaluable populations on the primary efficacy measures, with both groups showing significant improvement from baseline on total HAM-D-29 scores during the course of treatment (p < 0.0001). However, in the evaluable population, the CrPic group showed significant improvements from baseline compared with the placebo group on 4 HAM-D-29 items: appetite increase, increased eating, carbohydrate craving, and diurnal variation of feelings. A supplemental analysis of data from the subset of 41 patients in the ITT population with high carbohydrate craving (26 CrPic, 15 placebo; mean BMI = 31.1) showed that the CrPic patients had significantly greater response on total HAM-D-29 scores than the placebo group (65% vs. 33%; p < 0.05) as well as significantly greater improvements on the following HAM-D-29 items: appetite increase, increased eating, carbohydrate craving, and genital symptoms (e.g., level of libido). Chromium treatment was well-tolerated. The study did not include a placebo run-in period, did not require minimum duration or severity of depression, and enrolled patients with major depression, dysthymia, or depression NOS. In a population of adults with atypical depression, most of whom were overweight or obese, CrPic produced improvement on the following HAM-D-29 items: appetite increase, increased eating, carbohydrate craving, and diurnal variation of feelings. In a subpopulation of patients with high carbohydrate craving, overall HAM-D-29 scores improved significantly in patients treated with CrPic compared with placebo. The results of this study suggest that the main effect of chromium was on carbohydrate craving and appetite regulation in depressed patients and that 600 μg of elemental chromium may be beneficial for patients with atypical depression who also have severe carbohydrate craving. Further studies are needed to evaluate chromium in depressed patients specifically selected for symptoms of increased appetite and carbohydrate craving as well as to determine whether a higher dose of chromium would have an effect on mood.

  19. Psychometric properties of the Triarchic Psychopathy Measure: An item response theory approach.

    PubMed

    Shou, Yiyun; Sellbom, Martin; Xu, Jing

    2018-05-01

    There is cumulative evidence for the cross-cultural validity of the Triarchic Psychopathy Measure (TriPM; Patrick, 2010) among non-Western populations. Recent studies using correlational and regression analyses show promising construct validity of the TriPM in Chinese samples. However, little is known about the efficiency of items in TriPM in assessing the proposed latent traits. The current study evaluated the psychometric properties of the Chinese TriPM at the item level using item response theory analyses. It also examined the measurement invariance of the TriPM between the Chinese and the U.S. student samples by applying differential item functioning analyses under the item response theory framework. The results supported the unidimensional nature of the Disinhibition and Meanness scales. Both scales had a greater level of precision in the respective underlying constructs at the positive ends. The two scales, however, had several items that were weakly associated with their respective latent traits in the Chinese student sample. Boldness, on the other hand, was found to be multidimensional, and reflected a more normally distributed range of variation. The examination of measurement bias via differential item functioning analyses revealed that a number of items of the TriPM were not equivalent across the Chinese and the U.S. Some modification and adaptation of items might be considered for improving the precision of the TriPM for Chinese participants. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  20. Construct Validity and Reliability of the Beliefs Toward Mental Illness Scale for American, Japanese, and Korean Women.

    PubMed

    Saint Arnault, Denise M; Gang, Moonhee; Woo, Seoyoon

    2017-11-01

    The aim of this study was to evaluate the psychometric properties of the Beliefs Toward Mental Illness Scale (BMI) across women from the United States, Japan, and South Korea. A cross-sectional study design was employed. The sample was 564 women aged 21-64 years old who were recruited in the United States and Korea (American = 127, Japanese immigrants in the United States = 204, and Korean = 233). We carried out item analysis, construct validity by confirmatory factor analysis (CFA), and internal consistency using SPSS Version 22 and AMOS Version 22. An acceptable model fit for a 20-item BMI (Beliefs Toward Mental Illness Scale-Revised [BMI-R]) with 3 factors was confirmed using CFA. Construct validity of the BMI-R showed to be all acceptable; convergent validity (average variance extracted [AVE] ≥0.5, construct reliability [CR] ≥0.7) and discriminant validity (r = .65-.89, AVE >.79). The Cronbach's alpha of the BMI-R was .92. These results showed that the BMI was a reliable tool to study beliefs about mental illness across cultures. Our findings also suggested that continued efforts to reduce stigma in culturally specific contexts within and between countries are necessary to promote help-seeking for those suffering from psychological distress.
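    The convergent-validity thresholds quoted above (AVE ≥0.5, CR ≥0.7) are usually derived from standardized factor loadings. The sketch below uses the commonly cited Fornell-Larcker style formulas and assumes uncorrelated errors, with each error variance taken as 1 minus the squared loading; the function names and loadings are hypothetical and are not taken from the study.

```python
from typing import Sequence


def average_variance_extracted(loadings: Sequence[float]) -> float:
    """AVE: mean of the squared standardized loadings of one factor."""
    return sum(l * l for l in loadings) / len(loadings)


def construct_reliability(loadings: Sequence[float]) -> float:
    """Composite (construct) reliability for one factor.

    CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    with each error variance assumed to be 1 - loading^2.
    """
    squared_sum = sum(loadings) ** 2
    error_sum = sum(1.0 - l * l for l in loadings)
    return squared_sum / (squared_sum + error_sum)


# Hypothetical standardized loadings for a three-item factor.
loadings = [0.8, 0.8, 0.8]
print(average_variance_extracted(loadings))
print(construct_reliability(loadings))
```

    Note that "CR" here means construct reliability, unlike the constructed-response sense used in other records in this list.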

  1. Pesticide applicators questionnaire content validation: A fuzzy delphi method.

    PubMed

    Manakandan, S K; Rosnah, I; Mohd Ridhuan, J; Priya, R

    2017-08-01

    The most crucial step in forming a survey questionnaire is deciding the appropriate items in a construct. Retaining irrelevant items and removing important items will certainly mislead the direction of a particular study. This article demonstrates the Fuzzy Delphi method as one of the scientific analysis techniques to consolidate consensus agreement within a panel of experts pertaining to each item's appropriateness. This method reduces the ambiguity, diversity, and discrepancy of the opinions among the experts and hence enhances the quality of the selected items. The main purpose of this study was to obtain experts' consensus on the suitability of the preselected items on the questionnaire. The panel consisted of sixteen experts from the Occupational and Environmental Health Unit of the Ministry of Health, the Vector-borne Disease Control Unit of the Ministry of Health and the Occupational and Safety Health Units of both public and private universities. A set of questionnaires related to noise and chemical exposure was compiled based on the literature search. There were six constructs with 60 items in total: three constructs for knowledge, attitude, and practice of noise exposure and three constructs for knowledge, attitude, and practice of chemical exposure. The validation process replicated a recent Fuzzy Delphi method that uses the concept of Triangular Fuzzy Numbers and a defuzzification process. A 100% response rate was obtained from all the sixteen experts with an average Likert scoring of four to five. Post FDM analysis, the first prerequisite was fulfilled with a threshold value (d) ≤ 0.2, hence all the six constructs were accepted. For the second prerequisite, three items (21%) from the noise-attitude construct and four items (40%) from the chemical-practice construct had expert consensus of less than 75%, amounting to about 12% of the total items in the questionnaire. The third prerequisite was used to rank the items within the constructs by calculating the average fuzzy numbers. The seven items which did not fulfill the second prerequisite similarly had lower ranks during the analysis, therefore those items were discarded from the final draft. Post FDM analysis, the experts' consensus on the suitability of the pre-selected items on the questionnaire set was obtained, hence it is now ready for a further construct validation process.
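    The three prerequisites described above (threshold d ≤ 0.2, expert consensus of at least 75%, and ranking by average fuzzy number) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the Likert-to-triangular-fuzzy-number mapping, the function name, and the ratings are hypothetical, and the distance formula is the one commonly used in Fuzzy Delphi studies.

```python
import math
from typing import Dict, List, Tuple

Fuzzy = Tuple[float, float, float]

# Assumed mapping from a 5-point Likert rating to a triangular fuzzy number.
LIKERT_TO_FUZZY: Dict[int, Fuzzy] = {
    1: (0.0, 0.0, 0.2),
    2: (0.0, 0.2, 0.4),
    3: (0.2, 0.4, 0.6),
    4: (0.4, 0.6, 0.8),
    5: (0.6, 0.8, 1.0),
}


def evaluate_item(ratings: List[int], d_max: float = 0.2) -> Tuple[float, float, float]:
    """Return (mean distance d, consensus fraction, defuzzified score) for one item."""
    fuzzy = [LIKERT_TO_FUZZY[r] for r in ratings]
    n = len(fuzzy)
    # Average fuzzy number across experts, component by component.
    avg = tuple(sum(f[i] for f in fuzzy) / n for i in range(3))
    # Distance between each expert's fuzzy number and the average.
    dists = [
        math.sqrt(sum((avg[i] - f[i]) ** 2 for i in range(3)) / 3.0)
        for f in fuzzy
    ]
    mean_d = sum(dists) / n                         # first prerequisite: d <= 0.2
    consensus = sum(d <= d_max for d in dists) / n  # second: at least 0.75
    score = sum(avg) / 3.0                          # third: defuzzified value for ranking
    return mean_d, consensus, score


# Hypothetical ratings from four experts for two items.
print(evaluate_item([5, 5, 5, 5]))
print(evaluate_item([5, 5, 5, 1]))
```

    In this sketch an item would be retained when its consensus fraction is at least 0.75, and retained items would then be ranked by their defuzzified score.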

  2. Development of a PROMIS item bank to measure pain interference.

    PubMed

    Amtmann, Dagmar; Cook, Karon F; Jensen, Mark P; Chen, Wen-Hung; Choi, Seung; Revicki, Dennis; Cella, David; Rothrock, Nan; Keefe, Francis; Callahan, Leigh; Lai, Jin-Shei

    2010-07-01

    This paper describes the psychometric properties of the PROMIS-pain interference (PROMIS-PI) bank. An initial candidate item pool (n=644) was developed and evaluated based on the review of existing instruments, interviews with patients, and consultation with pain experts. From this pool, a candidate item bank of 56 items was selected and responses to the items were collected from large community and clinical samples. A total of 14,848 participants responded to all or a subset of candidate items. The responses were calibrated using an item response theory (IRT) model. A final 41-item bank was evaluated with respect to IRT assumptions, model fit, differential item function (DIF), precision, and construct and concurrent validity. Items of the revised bank had good fit to the IRT model (CFI and NNFI/TLI ranged from 0.974 to 0.997), and the data were strongly unidimensional (e.g., ratio of first and second eigenvalue=35). Nine items exhibited statistically significant DIF. However, adjusting for DIF had little practical impact on score estimates and the items were retained without modifying scoring. Scores provided substantial information across levels of pain; for scores in the T-score range 50-80, the reliability was equivalent to 0.96-0.99. Patterns of correlations with other health outcomes supported the construct validity of the item bank. The scores discriminated among persons with different numbers of chronic conditions, disabling conditions, levels of self-reported health, and pain intensity (p<0.0001). The results indicated that the PROMIS-PI items constitute a psychometrically sound bank. Computerized adaptive testing and short forms are available. Copyright 2010 International Association for the Study of Pain. All rights reserved.

  3. Development and Evaluation of the PROMIS® Pediatric Positive Affect Item Bank, Child-Report and Parent-Proxy Editions.

    PubMed

    Forrest, Christopher B; Ravens-Sieberer, Ulrike; Devine, Janine; Becker, Brandon D; Teneralli, Rachel; Moon, JeanHee; Carle, Adam; Tucker, Carole A; Bevans, Katherine B

    2018-03-01

    The purpose of this study is to describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Positive Affect item bank, child-report and parent-proxy editions. The initial item pool comprising 53 items, previously developed using qualitative methods, was administered to 1,874 children 8-17 years old and 909 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and construct validity. A total of 14 items were deleted, because of poor psychometric performance, and an 8-item short form constructed from the remaining 39 items was administered to a national sample of 1,004 children 8-17 years old, and 1,306 parents of children 5-17 years old. The combined sample was used in item response theory (IRT) calibration analyses. The final item bank appeared unidimensional, the items appeared locally independent, and the items were free from differential item functioning. The scales showed excellent reliability and convergent and discriminant validity. Positive affect decreased with children's age and was lower for those with a special health care need. After IRT calibration, we found that 4 and 8 item short forms had a high degree of precision (reliability) across a wide range of the latent trait (>4 SD units). The PROMIS Pediatric Positive Affect item bank and its short forms provide an efficient, precise, and valid assessment of positive affect in children and youth.

  4. Capturing the true burden of dystonia on patients: the Cervical Dystonia Impact Profile (CDIP-58).

    PubMed

    Cano, S J; Warner, T T; Linacre, J M; Bhatia, K P; Thompson, A J; Fitzpatrick, R; Hobart, J C

    2004-11-09

    To develop a new rating scale for measuring the health impact of cervical dystonia (CD) that includes patients' perceptions and complements existing observer dependent clinician rating scales. Scale development was in three stages. In Stage 1, a large pool of items was generated from patient interviews (n = 25), expert opinion, and literature review. In Stage 2, these items were administered by postal survey to people with CD. The resulting data were analyzed using Rasch item analysis to construct, from the item pool, a rating scale that satisfied criteria for rigorous measurement. In Stage 3, the measurement properties of this rating scale were examined in an independent sample of people with CD. In Stage 1, 150 items concerning the health impact of CD were generated. In Stage 2, 556 people completed questionnaires (87% response rate) and a 58-item rating scale measuring the health impact of CD in eight areas was constructed (CD Impact Profile, CDIP-58). In Stage 3, CDIP-58 data from 391 people (87% response rate) were received. Analyses supported the measurement of eight unidimensional constructs (infit mean square range 0.62 to 1.50), item calibration (33.37 to 67.56), and patient separation statistics (2.59 to 3.38). Items demonstrated stable calibrations in subgroups of people with CD supporting the stability of the CDIP-58. The CDIP-58 is a reliable and valid patient-based rating scale measuring the health impact of CD in eight health dimensions.

  5. Applying modern psychometric techniques to melodic discrimination testing: Item response theory, computerised adaptive testing, and automatic item generation.

    PubMed

    Harrison, Peter M C; Collins, Tom; Müllensiefen, Daniel

    2017-06-15

    Modern psychometric theory provides many useful tools for ability testing, such as item response theory, computerised adaptive testing, and automatic item generation. However, these techniques have yet to be integrated into mainstream psychological practice. This is unfortunate, because modern psychometric techniques can bring many benefits, including sophisticated reliability measures, improved construct validity, avoidance of exposure effects, and improved efficiency. In the present research we therefore use these techniques to develop a new test of a well-studied psychological capacity: melodic discrimination, the ability to detect differences between melodies. We calibrate and validate this test in a series of studies. Studies 1 and 2 respectively calibrate and validate an initial test version, while Studies 3 and 4 calibrate and validate an updated test version incorporating additional easy items. The results support the new test's viability, with evidence for strong reliability and construct validity. We discuss how these modern psychometric techniques may also be profitably applied to other areas of music psychology and psychological science in general.

  6. Immediate list recall as a measure of short-term episodic memory: insights from the serial position effect and item response theory.

    PubMed

    Gavett, Brandon E; Horwitz, Julie E

    2012-03-01

    The serial position effect shows that two interrelated cognitive processes underlie immediate recall of a supraspan word list. The current study used item response theory (IRT) methods to determine whether the serial position effect poses a threat to the construct validity of immediate list recall as a measure of verbal episodic memory. Archival data were obtained from a national sample of 4,212 volunteers aged 28-84 in the Midlife Development in the United States study. Telephone assessment yielded item-level data for a single immediate recall trial of the Rey Auditory Verbal Learning Test (RAVLT). Two parameter logistic IRT procedures were used to estimate item parameters and the Q(1) statistic was used to evaluate item fit. A two-dimensional model better fit the data than a unidimensional model, supporting the notion that list recall is influenced by two underlying cognitive processes. IRT analyses revealed that 4 of the 15 RAVLT items (1, 12, 14, and 15) were misfit (p < .05). Item characteristic curves for items 14 and 15 decreased monotonically, implying an inverse relationship between the ability level and the probability of recall. Elimination of the four misfit items provided better fit to the data and met necessary IRT assumptions. Performance on a supraspan list learning test is influenced by multiple cognitive abilities; failure to account for the serial position of words decreases the construct validity of the test as a measure of episodic memory and may provide misleading results. IRT methods can ameliorate these problems and improve construct validity.
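
    Note: a minimal sketch of the two-parameter logistic (2PL) item characteristic curve referred to in the abstract above, assuming the standard form P(theta) = 1 / (1 + exp(-a * (theta - b))). The parameter values are hypothetical and are not estimates from the study; they only show how a negative discrimination parameter produces a monotonically decreasing curve, the pattern reported for items 14 and 15.

```python
import math


def icc_2pl(theta: float, a: float, b: float) -> float:
    """Probability of recalling (endorsing) an item under the 2PL model.

    theta: latent ability, a: discrimination, b: difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


abilities = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Hypothetical well-behaved item: probability rises with ability.
typical = [icc_2pl(t, a=1.2, b=0.0) for t in abilities]

# Hypothetical misfitting item: negative discrimination, so the
# probability of recall falls as ability increases.
reversed_item = [icc_2pl(t, a=-0.8, b=0.0) for t in abilities]

for t, p_up, p_down in zip(abilities, typical, reversed_item):
    print(f"theta={t:+.1f}  typical={p_up:.3f}  reversed={p_down:.3f}")
```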

  7. Development of the PROMIS positive emotional and sensory expectancies of smoking item banks.

    PubMed

    Tucker, Joan S; Shadel, William G; Edelen, Maria Orlando; Stucky, Brian D; Li, Zhen; Hansen, Mark; Cai, Li

    2014-09-01

    The positive emotional and sensory expectancies of cigarette smoking include improved cognitive abilities, positive affective states, and pleasurable sensorimotor sensations. This paper describes development of Positive Emotional and Sensory Expectancies of Smoking item banks that will serve to standardize the assessment of this construct among daily and nondaily cigarette smokers. Data came from daily (N = 4,201) and nondaily (N = 1,183) smokers who completed an online survey. To identify a unidimensional set of items, we conducted item factor analyses, item response theory analyses, and differential item functioning analyses. Additionally, we evaluated the performance of fixed-item short forms (SFs) and computer adaptive tests (CATs) to efficiently assess the construct. Eighteen items were included in the item banks (15 common across daily and nondaily smokers, 1 unique to daily, 2 unique to nondaily). The item banks are strongly unidimensional, highly reliable (reliability = 0.95 for both), and perform similarly across gender, age, and race/ethnicity groups. A SF common to daily and nondaily smokers consists of 6 items (reliability = 0.86). Results from simulated CATs indicated that, on average, fewer than 8 items are needed to assess the construct with adequate precision using the item banks. These analyses identified a new set of items that can assess the positive emotional and sensory expectancies of smoking in a reliable and standardized manner. Considerable efficiency in assessing this construct can be achieved by using the item bank SF, employing computer adaptive tests, or selecting subsets of items tailored to specific research or clinical purposes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco.

  8. Automatically Scoring Short Essays for Content. CRESST Report 836

    ERIC Educational Resources Information Center

    Kerr, Deirdre; Mousavi, Hamid; Iseli, Markus R.

    2013-01-01

    The Common Core assessments emphasize short essay constructed response items over multiple choice items because they are more precise measures of understanding. However, such items are too costly and time consuming to be used in national assessments unless a way is found to score them automatically. Current automatic essay scoring techniques are…

  9. A Two-Parameter Latent Trait Model. Methodology Project.

    ERIC Educational Resources Information Center

    Choppin, Bruce

    On well-constructed multiple-choice tests, the most serious threat to measurement is not variation in item discrimination, but the guessing behavior that may be adopted by some students. Ways of ameliorating the effects of guessing are discussed, especially for problems in latent trait models. A new item response model, including an item parameter…

  10. A Multidimensional Scaling Approach to Dimensionality Assessment for Measurement Instruments Modeled by Multidimensional Item Response Theory

    ERIC Educational Resources Information Center

    Toro, Maritsa

    2011-01-01

    The statistical assessment of dimensionality provides evidence of the underlying constructs measured by a survey or test instrument. This study focuses on educational measurement, specifically tests comprised of items described as multidimensional. That is, items that require examinee proficiency in multiple content areas and/or multiple cognitive…

  11. Development and Standardization of the Diagnostic Adaptive Behavior Scale: Application of Item Response Theory to the Assessment of Adaptive Behavior.

    PubMed

    Tassé, Marc J; Schalock, Robert L; Thissen, David; Balboni, Giulia; Bersani, Henry Hank; Borthwick-Duffy, Sharon A; Spreat, Scott; Widaman, Keith F; Zhang, Dalun; Navas, Patricia

    2016-03-01

    The Diagnostic Adaptive Behavior Scale (DABS) was developed using item response theory (IRT) methods and was constructed to provide the most precise and valid adaptive behavior information at or near the cutoff point of making a decision regarding a diagnosis of intellectual disability. The DABS initial item pool consisted of 260 items. Using IRT modeling and a nationally representative standardization sample, the item set was reduced to 75 items that provide the most precise adaptive behavior information at the cutoff area determining the presence or not of significant adaptive behavior deficits across conceptual, social, and practical skills. The standardization of the DABS is described and discussed.

  12. Measuring Constructs in Family Science: How Can Item Response Theory Improve Precision and Validity?

    ERIC Educational Resources Information Center

    Gordon, Rachel A.

    2015-01-01

    This article provides family scientists with an understanding of contemporary measurement perspectives and the ways in which item response theory (IRT) can be used to develop measures with desired evidence of precision and validity for research uses. The article offers a nontechnical introduction to some key features of IRT, including its…

  13. Looking Closer at the Effects of Framing on Risky Choice: An Item Response Theory Analysis.

    PubMed

    Sickar; Highhouse

    1998-07-01

    Item response theory (IRT) methodology allowed an in-depth examination of several issues that would be difficult to explore using traditional methodology. IRT models were estimated for 4 risky-choice items, answered by students under either a gain or loss frame. Results supported the typical framing finding of risk-aversion for gains and risk-seeking for losses but also suggested that a latent construct we label preference for risk was influential in predicting risky choice. Also, the Asian Disease item, most often used in framing research, was found to have anomalous statistical properties when compared to other framing items. Copyright 1998 Academic Press.

  14. INTRODUCTION TO PATIENT-REPORTED OUTCOME ITEM BANKS: ISSUES IN MINORITY AGING RESEARCH

    PubMed Central

    Templin, Thomas N; Hays, Ron D; Gershon, Richard C; Rothrock, Nan; Jones, Richard N; Teresi, Jeanne A; Stewart, Anita; Weech-Maldonado, Robert; Wallace, Steve

    2014-01-01

    In 2004 NIH awarded contracts to initiate the development of high quality psychological and neuropsychological outcome measures for improved assessment of health-related outcomes. The workshop introduced these measurement development initiatives, the measures created, and the NIH supported resource (Assessment Center) for internet or tablet-based test administration and scoring. Presentation covered: (a) item response theory (IRT) and assessment of test bias, (b) construction of item banks and computerized adaptive testing, and (c) the different ways in which qualitative analyses contribute to the definition of construct domains and the refinement of outcome constructs. The panel discussion included questions about representativeness of samples, and assessment of cultural bias. PMID:23570428

  15. The Usability of CAT System for Assessing the Depressive Level of Japanese-A Study on Psychometric Properties and Response Behavior.

    PubMed

    Iwata, Noboru; Kikuchi, Kenichi; Fujihara, Yuya

    2016-08-01

    An innovative measurement system using computerized adaptive testing (CAT) based on item response theory has been expanding to measure mental health status. However, little is known in detail about its measurement properties based on empirical data. Moreover, response time (RT) data, which are unavailable in paper-and-pencil measurement but available in computerized measurement, would be worth investigating to explore response behavior. We aimed to construct a CAT to measure depressive symptomatology in a community population and to explore its measurement properties. We also examined the relationships between RTs, individual item responses, and depressive levels. To construct the CAT system, the responses of 2061 workers and university students to 24 depression scale items plus four negatively revised positive affect items were subjected to a polytomous IRT analysis. The stopping rule was set at a standard error of estimation < 0.30 or a maximum of 15 items displayed. The CAT and a non-adaptive computer-based test (CBT) were administered to 209 undergraduates, and 168 of them completed the tests again after 1 week. On average, the CAT converged after 10.4 items. The θ values estimated by CAT and CBT were highly correlated with each other (r = 0.94 and 0.95 for the 1st and 2nd measurements) and with the traditional scoring procedures (r's > 0.90). The test-retest reliability was at a satisfactory level (r = 0.86). RTs to some items correlated significantly with the θ estimates. The mean RT varied with item content and wording, i.e., responses to positive affect items took 2 s or more longer than responses to the other subscale items. The CAT would be a reliable and practical measurement tool for various purposes, including stress checks in the workplace.
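
    Note: the stopping rule stated in the abstract above (standard error of estimation below 0.30, or a maximum of 15 items) can be sketched as follows. The function names and the sequence of standard errors are hypothetical; the sketch illustrates only the stopping logic, not item selection or θ estimation, and is not the authors' implementation.

```python
def should_stop(standard_error: float, items_administered: int,
                se_threshold: float = 0.30, max_items: int = 15) -> bool:
    """Stop when the SE falls below the threshold or the item cap is reached."""
    return standard_error < se_threshold or items_administered >= max_items


def items_until_stop(se_after_each_item: list[float]) -> int:
    """Return how many items are administered before the rule fires.

    se_after_each_item[i] is the (hypothetical) SE after item i + 1.
    If the rule never fires, every item in the sequence is administered.
    """
    for n, se in enumerate(se_after_each_item, start=1):
        if should_stop(se, n):
            return n
    return len(se_after_each_item)


# Hypothetical SE trajectory for one respondent.
trajectory = [0.90, 0.62, 0.45, 0.36, 0.31, 0.28, 0.26]
print(items_until_stop(trajectory))
```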

  16. Reevaluation of the Amsterdam Inventory for Auditory Disability and Handicap Using Item Response Theory.

    PubMed

    Boeschen Hospers, J Mirjam; Smits, Niels; Smits, Cas; Stam, Mariska; Terwee, Caroline B; Kramer, Sophia E

    2016-04-01

    We reevaluated the psychometric properties of the Amsterdam Inventory for Auditory Disability and Handicap (AIADH; Kramer, Kapteyn, Festen, & Tobi, 1995) using item response theory. Item response theory describes item functioning along an ability continuum. Cross-sectional data from 2,352 adults with and without hearing impairment, ages 18-70 years, were analyzed. They completed the AIADH in the web-based prospective cohort study "Netherlands Longitudinal Study on Hearing." A graded response model was fitted to the AIADH data. Category response curves, item information curves, and the standard error as a function of self-reported hearing ability were plotted. The graded response model showed a good fit. Item information curves were most reliable for adults who reported having hearing disability and less reliable for adults with normal hearing. The standard error plot showed that self-reported hearing ability is most reliably measured for adults reporting mild up to moderate hearing disability. This is one of the few item response theory studies on audiological self-reports. All AIADH items could be hierarchically placed on the self-reported hearing ability continuum, meaning they measure the same construct. This provides a promising basis for developing a clinically useful computerized adaptive test, where item selection adapts to the hearing ability of individuals, resulting in efficient assessment of hearing disability.
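
    Note: a minimal sketch of category response probabilities under a graded response model, assuming Samejima's standard formulation in which the cumulative probability of responding in category k or higher is logistic in a * (theta - b_k). The discrimination and threshold values are hypothetical and are not AIADH estimates.

```python
import math


def grm_category_probs(theta: float, a: float, thresholds: list[float]) -> list[float]:
    """Category probabilities for one item under the graded response model.

    thresholds must be in increasing order; an item with K response
    categories has K - 1 thresholds.
    """
    # Cumulative probabilities P(X >= k), padded with 1 (lowest) and 0 (beyond highest).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative values.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical four-category item evaluated at three ability levels.
for theta in (-1.5, 0.0, 1.5):
    probs = grm_category_probs(theta, a=1.5, thresholds=[-1.0, 0.0, 1.0])
    print(theta, [round(p, 3) for p in probs])
```

    Plotting these probabilities against theta for each category gives the category response curves mentioned in the abstract.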

  17. The Nature of Science Instrument-Elementary (NOSI-E): Using Rasch principles to develop a theoretically grounded scale to measure elementary student understanding of the nature of science

    NASA Astrophysics Data System (ADS)

    Peoples, Shelagh

    The purpose of this study was to determine which of three competing models will provide reliable, interpretable, and responsive measures of elementary students' understanding of the nature of science (NOS). The Nature of Science Instrument-Elementary (NOSI-E), a 28-item Rasch-based instrument, was used to assess students' NOS understanding. The NOS construct was conceptualized using five construct dimensions (Empirical, Inventive, Theory-laden, Certainty and Socially & Culturally Embedded). The competing models represent three internal models for the NOS construct. One postulate is that the NOS construct is unidimensional where one latent construct explains the relationship between the 28 items of the NOSI-E. Alternatively, the NOS construct is composed of five independent unidimensional constructs (the consecutive approach). Lastly, the NOS construct is multidimensional and composed of five inter-related but separate dimensions. A validity argument was developed that hypothesized that the internal structure of the NOS construct is best represented by the multidimensional Rasch model. Four sets of analyses were performed in which the three representations were compared. These analyses addressed five validity aspects (content, substantive, generalizability, structural and external) of construct validity. The vast body of evidence supported the claim that the NOS construct is composed of five separate but inter-related dimensions and is best represented by the multidimensional Rasch model. The results of the multidimensional analyses indicated that the items of the five subscales were of excellent technical quality, exhibited no differential item functioning (based on gender), had an item hierarchy that conformed to theoretical expectations, and together formed subscales of reasonable reliability (> 0.7 on each subscale) that were responsive to change in the construct. Theory-laden scores from the multidimensional model predicted students' science achievement with scores from all five NOS dimensions significantly predicting students' perceptions of the constructivist nature of their classroom learning environment. The NOSI-E instrument is a theoretically grounded scale that can measure elementary students' NOS understanding and appears suitable for use in science education research.

  18. Profile-likelihood Confidence Intervals in Item Response Theory Models.

    PubMed

    Chalmers, R Philip; Pek, Jolynn; Liu, Yang

    2017-01-01

    Confidence intervals (CIs) are fundamental inferential devices which quantify the sampling variability of parameter estimates. In item response theory, CIs have been primarily obtained from large-sample Wald-type approaches based on standard error estimates, derived from the observed or expected information matrix, after parameters have been estimated via maximum likelihood. An alternative approach to constructing CIs is to quantify sampling variability directly from the likelihood function with a technique known as profile-likelihood confidence intervals (PL CIs). In this article, we introduce PL CIs for item response theory models, compare PL CIs to classical large-sample Wald-type CIs, and demonstrate important distinctions among these CIs. CIs are then constructed for parameters directly estimated in the specified model and for transformed parameters which are often obtained post-estimation. Monte Carlo simulation results suggest that PL CIs perform consistently better than Wald-type CIs for both non-transformed and transformed parameters.

  19. Performance of Men and Women on Multiple-Choice and Constructed-Response Tests for Beginning Teachers. Research Report. ETS RR-04-48

    ERIC Educational Resources Information Center

    Livingston, Samuel A.; Rupp, Stacie L.

    2004-01-01

    Some previous research results imply that women tend to perform better, relative to men, on constructed-response (CR) tests than on multiple-choice (MC) tests in the same subjects. An analysis of data from several tests used in the licensing of beginning teachers supported this hypothesis, to varying degrees, in most of the tests investigated. The…

  20. An item response theory analysis of the narcissistic personality inventory.

    PubMed

    Ackerman, Robert A; Donnellan, M Brent; Robins, Richard W

    2012-01-01

    This research uses item response theory methods to evaluate the Narcissistic Personality Inventory (NPI; Raskin & Terry, 1988). Analyses using the 2-parameter logistic model were conducted on the total score and the Corry, Merritt, Mrug, and Pamp (2008) and Ackerman et al. (2011) subscales for the NPI. In addition to offering precise information about the psychometric properties of the NPI item pool, these analyses generated insights that can be used to develop new measures of the personality constructs embedded within this frequently used inventory.

  1. IRT-LR-DIF with Estimation of the Focal-Group Density as an Empirical Histogram

    ERIC Educational Resources Information Center

    Woods, Carol M.

    2008-01-01

    Item response theory-likelihood ratio-differential item functioning (IRT-LR-DIF) is used to evaluate the degree to which items on a test or questionnaire have different measurement properties for one group of people versus another, irrespective of group-mean differences on the construct. Usually, the latent distribution is presumed normal for both…

  2. Asymptotic Standard Errors of Observed-Score Equating with Polytomous IRT Models

    ERIC Educational Resources Information Center

    Andersson, Björn

    2016-01-01

    In observed-score equipercentile equating, the goal is to make scores on two scales or tests measuring the same construct comparable by matching the percentiles of the respective score distributions. If the tests consist of different items with multiple categories for each item, a suitable model for the responses is a polytomous item response…

  3. Sexual Assault Prevention and Response Climate DEOCS 4.1 Construct Validity Summary

    DTIC Science & Technology

    2017-08-01

    DEOCS, (7) examining variance and descriptive statistics (8) examining the relationship among items/areas to reduce multicollinearity, and (9...selecting items that demonstrate the strongest scale properties. Included is a review of the 4.0 description and items, followed by the proposed...Tables 1 – 7 for the description of each measure and corresponding items. Table 1. DEOCS 4.0 Perceptions of Safety Measure Description

  4. The Effects of Item Format and Cognitive Domain on Students' Science Performance in TIMSS 2011

    NASA Astrophysics Data System (ADS)

    Liou, Pey-Yan; Bulut, Okan

    2017-12-01

    The purpose of this study was to examine eighth-grade students' science performance in terms of two test design components, item format, and cognitive domain. The portion of Taiwanese data came from the 2011 administration of the Trends in International Mathematics and Science Study (TIMSS), one of the major international large-scale assessments in science. The item difficulty analysis was initially applied to show the proportion of correct items. A regression-based cumulative link mixed modeling (CLMM) approach was further utilized to estimate the impact of item format, cognitive domain, and their interaction on the students' science scores. The results of the proportion-correct statistics showed that constructed-response items were more difficult than multiple-choice items, and that the reasoning cognitive domain items were more difficult compared to the items in the applying and knowing domains. In terms of the CLMM results, students tended to obtain higher scores when answering constructed-response items as well as items in the applying cognitive domain. When the two predictors and the interaction term were included together, the directions and magnitudes of the predictors on student science performance changed substantially. Plausible explanations for the complex nature of the effects of the two test-design predictors on student science performance are discussed. The results provide practical, empirical-based evidence for test developers, teachers, and stakeholders to be aware of the differential function of item format, cognitive domain, and their interaction in students' science performance.

  5. Overview of Classical Test Theory and Item Response Theory for Quantitative Assessment of Items in Developing Patient-Reported Outcome Measures

    PubMed Central

    Cappelleri, Joseph C.; Lundy, J. Jason; Hays, Ron D.

    2014-01-01

    Introduction: The U.S. Food and Drug Administration’s patient-reported outcome (PRO) guidance document defines content validity as “the extent to which the instrument measures the concept of interest” (FDA, 2009, p. 12). “Construct validity is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity” (Strauss & Smith, 2009, p. 7). Hence both qualitative and quantitative information are essential in evaluating the validity of measures. Methods: We review classical test theory and item response theory approaches to evaluating PRO measures including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized “difficulty” (severity) order of items is represented by observed responses. Conclusion: Classical test theory and item response theory can be useful in providing a quantitative assessment of items and scales during the content validity phase of patient-reported outcome measures. Depending on the particular type of measure and the specific circumstances, either one or both approaches should be considered to help maximize the content validity of PRO measures. PMID:24811753

  6. Modeling the Severity of Drinking Consequences in First-Year College Women: An Item Response Theory Analysis of the Rutgers Alcohol Problem Index

    PubMed Central

    Cohn, Amy M.; Hagman, Brett T.; Graff, Fiona S.; Noel, Nora E.

    2011-01-01

    Objective: The present study examined the latent continuum of alcohol-related negative consequences among first-year college women using methods from item response theory and classical test theory. Method: Participants (N = 315) were college women in their freshman year who reported consuming any alcohol in the past 90 days and who completed assessments of alcohol consumption and alcohol-related negative consequences using the Rutgers Alcohol Problem Index. Results: Item response theory analyses showed poor model fit for five items identified in the Rutgers Alcohol Problem Index. Two-parameter item response theory logistic models were applied to the remaining 18 items to examine estimates of item difficulty (i.e., severity) and discrimination parameters. The item difficulty parameters ranged from 0.591 to 2.031, and the discrimination parameters ranged from 0.321 to 2.371. Classical test theory analyses indicated that the omission of the five misfit items did not significantly alter the psychometric properties of the construct. Conclusions: Findings suggest that those consequences that had greater severity and discrimination parameters may be used as screening items to identify female problem drinkers at risk for an alcohol use disorder. PMID:22051212

  7. Test-retest reliability and construct validity of the ENERGY-child questionnaire on energy balance-related behaviours and their potential determinants: the ENERGY-project.

    PubMed

    Singh, Amika S; Vik, Froydis N; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Verloigne, Maïté; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; Martens, Marloes; Brug, Johannes

    2011-12-09

    Insight in children's energy balance-related behaviours (EBRBs) and their determinants is important to inform obesity prevention research. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. To examine the test-retest reliability and construct validity of the child questionnaire used in the ENERGY-project, measuring EBRBs and their potential determinants among 10-12 year old children. We collected data among 10-12 year old children (n = 730 in the test-retest reliability study; n = 96 in the construct validity study) in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent face-to-face interview was assessed using ICC and percentage agreement. Of the 150 questionnaire items, 115 (77%) showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Test-retest reliability was moderate for 34 items (23%) and poor for one item. Construct validity appeared to be good to excellent for 70 (47%) of the 150 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. From the other 80 items, construct validity was moderate for 39 (26%) and poor for 41 items (27%). Our results demonstrate that the ENERGY-child questionnaire, assessing EBRBs of the child as well as personal, family, and school-environmental determinants related to these EBRBs, has good test-retest reliability and moderate to good construct validity for the large majority of items.
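
    Note: a minimal sketch of the percentage-agreement statistic and the "good to excellent" criterion stated in the abstract above (ICC > .60 or percentage agreement ≥ 75%). The function names and the response data are hypothetical. The abstract does not give cut-offs for the "moderate" and "poor" categories, so those are not modelled here.

```python
def percent_agreement(first: list, second: list) -> float:
    """Percentage of respondents giving the same answer on both occasions."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty response lists of equal length")
    matches = sum(1 for x, y in zip(first, second) if x == y)
    return 100.0 * matches / len(first)


def meets_good_criterion(icc: float, pct_agreement: float) -> bool:
    """Criterion from the abstract: ICC > .60 or percentage agreement >= 75%."""
    return icc > 0.60 or pct_agreement >= 75.0


# Hypothetical answers to one item from eight children, one week apart.
week_1 = [1, 2, 2, 3, 1, 4, 2, 3]
week_2 = [1, 2, 3, 3, 1, 4, 2, 2]
agreement = percent_agreement(week_1, week_2)
print(agreement, meets_good_criterion(icc=0.55, pct_agreement=agreement))
```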

  8. Test-retest reliability and construct validity of the ENERGY-child questionnaire on energy balance-related behaviours and their potential determinants: the ENERGY-project

    PubMed Central

    2011-01-01

    Background: Insight in children's energy balance-related behaviours (EBRBs) and their determinants is important to inform obesity prevention research. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. Objective: To examine the test-retest reliability and construct validity of the child questionnaire used in the ENERGY-project, measuring EBRBs and their potential determinants among 10-12 year old children. Methods: We collected data among 10-12 year old children (n = 730 in the test-retest reliability study; n = 96 in the construct validity study) in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent face-to-face interview was assessed using ICC and percentage agreement. Results: Of the 150 questionnaire items, 115 (77%) showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Test-retest reliability was moderate for 34 items (23%) and poor for one item. Construct validity appeared to be good to excellent for 70 (47%) of the 150 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. From the other 80 items, construct validity was moderate for 39 (26%) and poor for 41 items (27%). Conclusions: Our results demonstrate that the ENERGY-child questionnaire, assessing EBRBs of the child as well as personal, family, and school-environmental determinants related to these EBRBs, has good test-retest reliability and moderate to good construct validity for the large majority of items. PMID:22152048

  9. The Effect of Response Bias on the Personality Inventory for DSM-5 (PID-5).

    PubMed

    McGee Ng, Sarah A; Bagby, R Michael; Goodwin, Brandee E; Burchett, Danielle; Sellbom, Martin; Ayearst, Lindsay E; Dhillon, Sonya; Yiu, Shirley; Ben-Porath, Yossef S; Baker, Spencer

    2016-01-01

    Valid self-report assessment of psychopathology relies on accurate and credible responses to test questions. There are some individuals who, in certain assessment contexts, cannot or choose not to answer in a manner typically representative of their traits or symptoms. This is referred to, most broadly, as test response bias. In this investigation, we explore the effect of response bias on the Personality Inventory for DSM-5 (PID-5; Krueger, Derringer, Markon, Watson, & Skodol, 2013), a self-report instrument designed to assess the pathological personality traits used to inform diagnosis of the personality disorders in Section III of DSM-5. A set of Minnesota Multiphasic Personality Inventory Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2008/2011) validity scales, which are used to assess and identify response bias, were employed to identify individuals who engaged in either noncredible overreporting (OR) or underreporting (UR), or who were deemed to be reporting or responding to the items in a "credible" manner, termed credible responding (CR). A total of 2,022 research participants (1,587 students, 435 psychiatric patients) completed the MMPI-2-RF and PID-5; following protocol screening, these participants were classified into OR, UR, or CR response groups based on MMPI-2-RF validity scale scores. Groups of students and patients in the OR group scored significantly higher on the PID-5 than those students and patients in the CR group, whereas those in the UR group scored significantly lower than those in the CR group. Although future research is needed to explore the effects of response bias on the PID-5, results from this investigation provide initial evidence suggesting that response bias influences scale elevations on this instrument.

  10. Cross-cultural adaptation and construct validity of the Korean version of a physical activity measure for community-dwelling elderly.

    PubMed

    Choi, Bongsam

    2018-01-01

    [Purpose] This study aimed to cross-culturally adapt and validate the Korean version of a physical activity measure (K-PAM) for community-dwelling elderly. [Subjects and Methods] One hundred and thirty-eight community-dwelling elderly people, 32 males and 106 females, participated in the study. All participants were asked to fill out a fifty-one item questionnaire measuring perceived difficulty in the activities of daily living (ADL) for the elderly. A one-parameter model of item response theory (Rasch analysis) was applied to determine the construct validity and to inspect item-level psychometric properties of the 51 ADL items of the K-PAM. [Results] Person separation reliability (analogous to Cronbach's alpha) for internal consistency ranged from 0.93 to 0.94. A total of 16 items misfit the Rasch model. After deletion of the misfitting items, the remaining 35 ADL items of the K-PAM were placed in an empirically meaningful hierarchy from easy to hard. The item-person map analysis showed that item difficulty was well matched for elderly people with moderate and low ability, except for high ceilings. [Conclusion] The cross-culturally adapted K-PAM was shown to be sufficient for establishing construct validity and stable psychometric properties, as confirmed by person separation reliability and fit statistics.

  11. Development of a tool to assess adherence to a model of the division of responsibility in feeding young children: using response mapping to capacitate validation measures.

    PubMed

    Lohse, Barbara; Satter, Ellyn; Arnold, Kristen

    2014-04-01

    Accurate early assessment and targeted intervention with problematic parent/child feeding dynamics is critical for the prevention and treatment of child obesity. The division of responsibility in feeding (sDOR), articulated by the Satter Feeding Dynamics Model (fdSatter), has been demonstrated clinically as an effective approach to reduce child feeding problems, including those leading to obesity. Lack of a tested instrument to examine adherence to fdSatter stimulated initial construction of the Satter Feeding Dynamics Inventory (fdSI). The aim of this project was to refine the item pool to establish translational validity, making the fdSI suitable for advanced psychometric analysis. Cognitive interviews (n = 80) with caregivers of varied socioeconomic strata informed revisions that demonstrated face and content validity. fdSI responses were mapped to interviews using an iterative, multi-phase thematic approach to provide an instrument ready for construct validation. fdSI development required five interview phases over 32 months: Foundational; Refinement; Transitional; Assurance; and Launching. Each phase was associated with item reduction and revision. Thirteen items were removed from the 38-item Foundational phase and seven were revised in the Refinement phase. Revisions, deletions, and additions prompted by Transitional and Assurance phase interviews resulted in the 15-item Launching phase fdSI. Only one Foundational phase item was carried through all development phases, emphasizing the need to test for item comprehension and interpretation before psychometric analyses. Psychometric studies of item pools without encrypted meanings will facilitate progress toward a tool that accurately detects adherence to sDOR. Ability to measure sDOR will facilitate focus on feeding behaviors associated with reduced risk of childhood obesity.

  12. Item response theory - A first approach

    NASA Astrophysics Data System (ADS)

    Nunes, Sandra; Oliveira, Teresa; Oliveira, Amílcar

    2017-07-01

    Item Response Theory (IRT) has become one of the most popular scoring frameworks for measurement data, frequently used in computerized adaptive testing, cognitively diagnostic assessment and test equating. According to Andrade et al. (2000), IRT can be defined as a set of mathematical models (Item Response Models - IRM) constructed to represent the probability of an individual giving the right answer to an item of a particular test. The number of Item Response Models available for measurement analysis has increased considerably in the last fifteen years due to increasing computer power and due to a demand for accuracy and more meaningful inferences grounded in complex data. The developments in modeling with Item Response Theory were related to developments in estimation theory, most remarkably Bayesian estimation with Markov chain Monte Carlo algorithms (Patz & Junker, 1999). The popularity of Item Response Theory has also led to numerous overviews in books and journals, and many connections between IRT and other statistical estimation procedures, such as factor analysis and structural equation modeling, have been made repeatedly (van der Linden & Hambleton, 1997). As stated before, Item Response Theory covers a variety of measurement models, ranging from basic one-dimensional models for dichotomously and polytomously scored items and their multidimensional analogues to models that incorporate information about cognitive sub-processes which influence the overall item response process. The aim of this work is to introduce the main concepts associated with one-dimensional models of Item Response Theory, to specify the logistic models with one, two and three parameters, to discuss some properties of these models and to present the main estimation procedures.
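
    The logistic models with one, two and three parameters mentioned in this abstract can be written as one function. The sketch below uses the standard textbook parameterization (discrimination a, difficulty b, pseudo-guessing c, no 1.7 scaling constant); the function names are illustrative and are not taken from the paper.

```python
import math


def irf_3pl(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the 3PL model.

    theta: person ability; a: discrimination; b: difficulty;
    c: lower asymptote (pseudo-guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def irf_2pl(theta, a, b):
    """2PL model: the 3PL with no guessing (c = 0)."""
    return irf_3pl(theta, a=a, b=b, c=0.0)


def irf_1pl(theta, b):
    """1PL model: the 2PL with a common discrimination, fixed at 1 here."""
    return irf_3pl(theta, a=1.0, b=b, c=0.0)


if __name__ == "__main__":
    # Illustrative parameter values only.
    for theta in (-2.0, 0.0, 2.0):
        print(theta,
              round(irf_1pl(theta, b=0.0), 3),
              round(irf_2pl(theta, a=2.0, b=0.0), 3),
              round(irf_3pl(theta, a=2.0, b=0.0, c=0.2), 3))
```

    Nesting the models this way makes the relationship explicit: each simpler model is the more general one with a parameter held fixed, so when ability equals difficulty the 1PL and 2PL give 0.5 and the 3PL gives c + (1 - c)/2.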

  13. On the Equivalence of Constructed-Response and Multiple-Choice Tests.

    ERIC Educational Resources Information Center

    Traub, Ross E.; Fisher, Charles W.

    Two sets of mathematical reasoning and two sets of verbal comprehension items were cast into each of three formats--constructed response, standard multiple-choice, and Coombs multiple-choice--in order to assess whether tests with identical content but different formats measure the same attribute, except for possible differences in error variance…

  14. Slower is not always better: Response-time evidence clarifies the limited role of miserly information processing in the Cognitive Reflection Test

    PubMed Central

    Pitchford, Melanie; Ball, Linden J.; Hunt, Thomas E.; Steel, Richard

    2017-01-01

    We report a study examining the role of ‘cognitive miserliness’ as a determinant of poor performance on the standard three-item Cognitive Reflection Test (CRT). The cognitive miserliness hypothesis proposes that people often respond incorrectly on CRT items because of an unwillingness to go beyond default, heuristic processing and invest time and effort in analytic, reflective processing. Our analysis (N = 391) focused on people’s response times to CRT items to determine whether predicted associations are evident between miserly thinking and the generation of incorrect, intuitive answers. Evidence indicated only a weak correlation between CRT response times and accuracy. Item-level analyses also failed to demonstrate predicted response-time differences between correct analytic and incorrect intuitive answers for two of the three CRT items. We question whether participants who give incorrect intuitive answers on the CRT can legitimately be termed cognitive misers and whether the three CRT items measure the same general construct. PMID:29099840

  15. Development of a simple 12-item theory-based instrument to assess the impact of continuing professional development on clinical behavioral intentions.

    PubMed

    Légaré, France; Borduas, Francine; Freitas, Adriana; Jacques, André; Godin, Gaston; Luconi, Francesca; Grimshaw, Jeremy

    2014-01-01

    Decision-makers in organizations providing continuing professional development (CPD) have identified the need for routine assessment of its impact on practice. We sought to develop a theory-based instrument for evaluating the impact of CPD activities on health professionals' clinical behavioral intentions. Our multipronged study had four phases. 1) We systematically reviewed the literature for instruments that used socio-cognitive theories to assess healthcare professionals' clinically-oriented behavioral intentions and/or behaviors; we extracted items relating to the theoretical constructs of an integrated model of healthcare professionals' behaviors and removed duplicates. 2) A committee of researchers and CPD decision-makers selected a pool of items relevant to CPD. 3) An international group of experts (n = 70) reached consensus on the most relevant items using electronic Delphi surveys. 4) We created a preliminary instrument with the items found most relevant and assessed its factorial validity, internal consistency and reliability (weighted kappa) over a two-week period among 138 physicians attending a CPD activity. Out of 72 potentially relevant instruments, 47 were analyzed. Of the 1218 items extracted from these, 16% were discarded as improperly phrased and 70% discarded as duplicates. Mapping the remaining items onto the constructs of the integrated model of healthcare professionals' behaviors yielded a minimum of 18 and a maximum of 275 items per construct. The partnership committee retained 61 items covering all seven constructs. Two iterations of the Delphi process produced consensus on a provisional 40-item questionnaire. Exploratory factorial analysis following test-retest resulted in a 12-item questionnaire. Cronbach's coefficients for the constructs varied from 0.77 to 0.85. 
A 12-item theory-based instrument for assessing the impact of CPD activities on health professionals' clinical behavioral intentions showed adequate validity and reliability. Further studies could assess its responsiveness to behavior change following CPD activities and its capacity to predict health professionals' clinical performance.
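
    Internal consistency in this record is reported as Cronbach's coefficients per construct. The standard formula for Cronbach's alpha can be sketched as follows; the data layout (one list of scores per item) and the function name are assumptions made for illustration, and the numbers in the usage example are invented, not the study's data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one construct.

    item_scores: list of items, each a list with one score per respondent.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(item_scores[0])
    if n < 2 or any(len(scores) != n for scores in item_scores):
        raise ValueError("every item needs the same number (>= 2) of respondents")
    # Total score per respondent across all items of the construct.
    totals = [sum(per_respondent) for per_respondent in zip(*item_scores)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    item_var_sum = sum(variance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - item_var_sum / total_var)


if __name__ == "__main__":
    # Two items answered by four respondents (made-up values).
    print(cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]]))
```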

  16. Development of a Simple 12-Item Theory-Based Instrument to Assess the Impact of Continuing Professional Development on Clinical Behavioral Intentions

    PubMed Central

    Légaré, France; Borduas, Francine; Freitas, Adriana; Jacques, André; Godin, Gaston; Luconi, Francesca; Grimshaw, Jeremy

    2014-01-01

    Background Decision-makers in organizations providing continuing professional development (CPD) have identified the need for routine assessment of its impact on practice. We sought to develop a theory-based instrument for evaluating the impact of CPD activities on health professionals' clinical behavioral intentions. Methods and Findings Our multipronged study had four phases. 1) We systematically reviewed the literature for instruments that used socio-cognitive theories to assess healthcare professionals' clinically-oriented behavioral intentions and/or behaviors; we extracted items relating to the theoretical constructs of an integrated model of healthcare professionals' behaviors and removed duplicates. 2) A committee of researchers and CPD decision-makers selected a pool of items relevant to CPD. 3) An international group of experts (n = 70) reached consensus on the most relevant items using electronic Delphi surveys. 4) We created a preliminary instrument with the items found most relevant and assessed its factorial validity, internal consistency and reliability (weighted kappa) over a two-week period among 138 physicians attending a CPD activity. Out of 72 potentially relevant instruments, 47 were analyzed. Of the 1218 items extracted from these, 16% were discarded as improperly phrased and 70% discarded as duplicates. Mapping the remaining items onto the constructs of the integrated model of healthcare professionals' behaviors yielded a minimum of 18 and a maximum of 275 items per construct. The partnership committee retained 61 items covering all seven constructs. Two iterations of the Delphi process produced consensus on a provisional 40-item questionnaire. Exploratory factorial analysis following test-retest resulted in a 12-item questionnaire. Cronbach's coefficients for the constructs varied from 0.77 to 0.85. 
Conclusion A 12-item theory-based instrument for assessing the impact of CPD activities on health professionals' clinical behavioral intentions showed adequate validity and reliability. Further studies could assess its responsiveness to behavior change following CPD activities and its capacity to predict health professionals' clinical performance. PMID:24643173

  17. Evaluating Statistical Targets for Assembling Parallel Mixed-Format Test Forms

    ERIC Educational Resources Information Center

    Debeer, Dries; Ali, Usama S.; van Rijn, Peter W.

    2017-01-01

    Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…
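
    The test characteristic curve (TCC) and test information function (TIF) named in this abstract are both sums over the items on a form. A minimal sketch follows, assuming dichotomous items under a two-parameter logistic model; the study concerns mixed-format forms, which would also need polytomous item models that are not shown here. Function names and item parameters are illustrative.

```python
import math


def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def tcc(theta, items):
    """Test characteristic curve: expected number-correct score at theta."""
    return sum(p_2pl(theta, a, b) for a, b in items)


def tif(theta, items):
    """Test information function: sum of 2PL item informations a^2 * P * (1 - P)."""
    total = 0.0
    for a, b in items:
        p = p_2pl(theta, a, b)
        total += a * a * p * (1.0 - p)
    return total


if __name__ == "__main__":
    form = [(1.0, -0.5), (1.5, 0.0), (0.8, 1.0)]  # (a, b) pairs, made up
    for theta in (-1.0, 0.0, 1.0):
        print(theta, round(tcc(theta, form), 3), round(tif(theta, form), 3))
```

    Two forms can then be compared by evaluating both functions on a grid of theta values and checking how far the new form's curve lies from the target curve.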

  18. Assessment of Computer and Information Literacy in ICILS 2013: Do Different Item Types Measure the Same Construct?

    ERIC Educational Resources Information Center

    Ihme, Jan Marten; Senkbeil, Martin; Goldhammer, Frank; Gerick, Julia

    2017-01-01

    The combination of different item formats is found quite often in large scale assessments, and analyses on the dimensionality often indicate multi-dimensionality of tests regarding the task format. In ICILS 2013, three different item types (information-based response tasks, simulation tasks, and authoring tasks) were used to measure computer and…

  19. Comparing Science Achievement Constructs: Targeted and Achieved

    ERIC Educational Resources Information Center

    Ferrara, Steve; Duncan, Teresa

    2011-01-01

    This article illustrates how test specifications based solely on academic content standards, without attention to other cognitive skills and item response demands, can fall short of their targeted constructs. First, the authors inductively describe the science achievement construct represented by a statewide sixth-grade science proficiency test.…

  20. Multivariate Generalizability Analysis of Automated Scoring for Short Answer Items of Social Studies in Large-Scale Assessment

    ERIC Educational Resources Information Center

    Sung, Kyung Hee; Noh, Eun Hee; Chon, Kyong Hee

    2017-01-01

    With increased use of constructed response items in large scale assessments, the cost of scoring has been a major consideration (Noh et al. in KICE Report RRE 2012-6, 2012; Wainer and Thissen in "Applied Measurement in Education" 6:103-118, 1993). In response to the scoring cost issues, various forms of automated system for scoring…

  1. The Effects of Rater Severity and Rater Distribution on Examinees' Ability Estimation for Constructed-Response Items. Research Report. ETS RR-13-23

    ERIC Educational Resources Information Center

    Wang, Zhen; Yao, Lihua

    2013-01-01

    The current study used simulated data to investigate the properties of a newly proposed method (Yao's rater model) for modeling rater severity and its distribution under different conditions. Our study examined the effects of rater severity, distributions of rater severity, the difference between item response theory (IRT) models with rater effect…

  2. Psychometric properties of the Epworth Sleepiness Scale: A factor analysis and item-response theory approach.

    PubMed

    Pilcher, June J; Switzer, Fred S; Munc, Alec; Donnelly, Janet; Jellen, Julia C; Lamm, Claus

    2018-04-01

    The purpose of this study is to examine the psychometric properties of the Epworth Sleepiness Scale (ESS) in two languages, German and English. Students from a university in Austria (N = 292; 55 males; mean age = 18.71 ± 1.71 years; 237 females; mean age = 18.24 ± 0.88 years) and a university in the US (N = 329; 128 males; mean age = 18.71 ± 0.88 years; 201 females; mean age = 21.59 ± 2.27 years) completed the ESS. An exploratory-factor analysis was completed to examine dimensionality of the ESS. Item response theory (IRT) analyses were used to provide information about the response rates on the items on the ESS and provide differential item functioning (DIF) analyses to examine whether the items were interpreted differently between the two languages. The factor analyses suggest that the ESS measures two distinct sleepiness constructs. These constructs indicate that the ESS is probing sleepiness in settings requiring active versus passive responding. The IRT analyses found that overall, the items on the ESS perform well as a measure of sleepiness. However, Item 8 and to a lesser extent Item 6 were being interpreted differently by respondents in comparison to the other items. In addition, the DIF analyses showed that the responses between German and English were very similar indicating that there are only minor measurement differences between the two language versions of the ESS. These findings suggest that the ESS provides a reliable measure of propensity to sleepiness; however, it does convey a two-factor approach to sleepiness. Researchers and clinicians can use the German and English versions of the ESS but may wish to exclude Item 8 when calculating a total sleepiness score.

  3. Support for an auto-associative model of spoken cued recall: evidence from fMRI.

    PubMed

    de Zubicaray, Greig; McMahon, Katie; Eastburn, Mathew; Pringle, Alan J; Lorenz, Lina; Humphreys, Michael S

    2007-03-02

    Cued recall and item recognition are considered the standard episodic memory retrieval tasks. However, only the neural correlates of the latter have been studied in detail with fMRI. Using an event-related fMRI experimental design that permits spoken responses, we tested hypotheses from an auto-associative model of cued recall and item recognition [Chappell, M., & Humphreys, M. S. (1994). An auto-associative neural network for sparse representations: Analysis and application to models of recognition and cued recall. Psychological Review, 101, 103-128]. In brief, the model assumes that cues elicit a network of phonological short term memory (STM) and semantic long term memory (LTM) representations distributed throughout the neocortex as patterns of sparse activations. This information is transferred to the hippocampus which converges upon the item closest to a stored pattern and outputs a response. Word pairs were learned from a study list, with one member of the pair serving as the cue at test. Unstudied words were also intermingled at test in order to provide an analogue of yes/no recognition tasks. Compared to incorrectly rejected studied items (misses) and correctly rejected (CR) unstudied items, correctly recalled items (hits) elicited increased responses in the left hippocampus and neocortical regions including the left inferior prefrontal cortex (LIPC), left mid lateral temporal cortex and inferior parietal cortex, consistent with predictions from the model. This network was very similar to that observed in yes/no recognition studies, supporting proposals that cued recall and item recognition involve common rather than separate mechanisms.

  4. Sexual Harassment DEOCS 4.1 Construct Validity Summary

    DTIC Science & Technology

    2017-08-01

    These items were modified to provide additional clarity regarding chain of command actions and response in the final survey. ** These items were...modified to provide additional clarity regarding individuals from the respondent’s workplace in the final survey. 4 Conclusion The revised sexual

  5. Artificial Neural Network-Based Three-dimensional Continuous Response Relationship Construction of 3Cr20Ni10W2 Heat-Resisting Alloy and Its Application in Finite Element Simulation

    NASA Astrophysics Data System (ADS)

    Li, Le; Wang, Li-yong

    2018-04-01

    The application of an accurate constitutive relationship in finite element simulation contributes significantly to accurate simulation results, which play a critical role in process design and optimization. In this investigation, the true stress-strain data of 3Cr20Ni10W2 heat-resisting alloy were obtained from a series of isothermal compression tests conducted in a wide temperature range of 1203-1403 K and strain rate range of 0.01-10 s⁻¹ on a Gleeble 1500 testing machine. Then the constitutive relationship was modeled by an optimally constructed and well-trained back-propagation artificial neural network (BP-ANN). The evaluation of the BP-ANN model revealed that it has admirable performance in characterizing and predicting the flow behaviors of 3Cr20Ni10W2 heat-resisting alloy. Meanwhile, a comparison between an improved Arrhenius-type constitutive equation and the BP-ANN model shows that the latter has higher accuracy. Consequently, the developed BP-ANN model was used to predict abundant stress-strain data beyond the limited experimental conditions and construct the three-dimensional continuous response relationship for temperature, strain rate, strain, and stress. Finally, the three-dimensional continuous response relationship was applied to the numerical simulation of isothermal compression tests. The results show that such a constitutive relationship can significantly improve the accuracy of numerical simulation for hot forming processes.

  6. Item response theory analysis of the life orientation test-revised: age and gender differential item functioning analyses.

    PubMed

    Steca, Patrizia; Monzani, Dario; Greco, Andrea; Chiesi, Francesca; Primi, Caterina

    2015-06-01

    This study is aimed at testing the measurement properties of the Life Orientation Test-Revised (LOT-R) for the assessment of dispositional optimism by employing item response theory (IRT) analyses. The LOT-R was administered to a large sample of 2,862 Italian adults. First, confirmatory factor analyses demonstrated the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension. Subsequently, IRT analyses for polytomous, ordered response category data were applied to investigate the items' properties. The equivalence of the items across gender and age was assessed by analyzing differential item functioning. Discrimination and severity parameters indicated that all items were able to distinguish people with different levels of optimism and adequately covered the spectrum of the latent trait. Additionally, the LOT-R appears to be gender invariant and, with minor exceptions, age invariant. Results provided evidence that the LOT-R is a reliable and valid measure of dispositional optimism. © The Author(s) 2014.

  7. Application of Item Response Theory to Tests of Substance-related Associative Memory

    PubMed Central

    Shono, Yusuke; Grenard, Jerry L.; Ames, Susan L.; Stacy, Alan W.

    2015-01-01

    A substance-related word association test (WAT) is one of the commonly used indirect tests of substance-related implicit associative memory and has been shown to predict substance use. This study applied an item response theory (IRT) modeling approach to evaluate psychometric properties of the alcohol- and marijuana-related WATs and their items among 775 ethnically diverse at-risk adolescents. After examining the IRT assumptions, item fit, and differential item functioning (DIF) across gender and age groups, the original 18 WAT items were reduced to 14 and 15 items in the alcohol- and marijuana-related WATs, respectively. Thereafter, unidimensional one- and two-parameter logistic models (1PL and 2PL models) were fitted to the revised WAT items. The results demonstrated that both alcohol- and marijuana-related WATs have good psychometric properties. These results were discussed in light of the framework of a unified concept of construct validity (Messick, 1975, 1989, 1995). PMID:25134051

  8. Developing Multiple Choice Tests: Tips & Techniques

    ERIC Educational Resources Information Center

    McCowan, Richard J.

    1999-01-01

    Item writing is a major responsibility of trainers. Too often, qualified staff who prepare lessons carefully and teach conscientiously use inadequate tests that do not validly reflect the true level of trainee achievement. This monograph describes techniques for constructing multiple-choice items that measure student performance accurately. It…

  9. Using Automated Essay Scores as an Anchor When Equating Constructed Response Writing Tests

    ERIC Educational Resources Information Center

    Almond, Russell G.

    2014-01-01

    Assessments consisting of only a few extended constructed response items (essays) are not typically equated using anchor test designs as there are typically too few essay prompts in each form to allow for meaningful equating. This article explores the idea that output from an automated scoring program designed to measure writing fluency (a common…

  10. A Hierarchical Rater Model for Constructed Responses, with a Signal Detection Rater Model

    ERIC Educational Resources Information Center

    DeCarlo, Lawrence T.; Kim, YoungKoung; Johnson, Matthew S.

    2011-01-01

    The hierarchical rater model (HRM) recognizes the hierarchical structure of data that arises when raters score constructed response items. In this approach, raters' scores are not viewed as being direct indicators of examinee proficiency but rather as indicators of essay quality; the (latent categorical) quality of an examinee's essay in turn…

  11. Automatic Short Essay Scoring Using Natural Language Processing to Extract Semantic Information in the Form of Propositions. CRESST Report 831

    ERIC Educational Resources Information Center

    Kerr, Deirdre; Mousavi, Hamid; Iseli, Markus R.

    2013-01-01

    The Common Core assessments emphasize short essay constructed-response items over multiple-choice items because they are more precise measures of understanding. However, such items are too costly and time consuming to be used in national assessments unless a way to score them automatically can be found. Current automatic essay-scoring techniques…

  12. Automatic Generation of Rasch-Calibrated Items: Figural Matrices Test GEOM and Endless-Loops Test EC

    ERIC Educational Resources Information Center

    Arendasy, Martin

    2005-01-01

    The future of test construction for certain psychological ability domains that can be analyzed well in a structured manner may lie--at the very least for reasons of test security--in the field of automatic item generation. In this context, a question that has not been explicitly addressed is whether it is possible to embed an item response theory…

  13. Identification of rice genes associated with cosmic-ray response via co-expression gene network analysis.

    PubMed

    Hwang, Sun-Goo; Kim, Dong Sub; Hwang, Jung Eun; Han, A-Reum; Jang, Cheol Seong

    2014-05-15

    In order to better understand the biological systems that are affected in response to cosmic ray (CR), we conducted weighted gene co-expression network analysis using the module detection method. By using the Pearson's correlation coefficient (PCC) value, we evaluated complex gene-gene functional interactions between 680 CR-responsive probes from integrated microarray data sets, which included large-scale transcriptional profiling of 1000 microarray samples. These probes were divided into 6 distinct modules that contained 20 enriched gene ontology (GO) functions, such as oxidoreductase activity, hydrolase activity, and response to stimulus and stress. In particular, modules 1 and 2 commonly showed enriched annotation categories such as oxidoreductase activity, including enriched cis-regulatory elements known as ROS-specific regulators. These results suggest that the ROS-mediated irradiation response pathway is affected by CR in modules 1 and 2. We found 243 ionizing radiation (IR)-responsive probes that exhibited similarities in expression patterns in various irradiation microarray data sets. The expression patterns of 6 randomly selected IR-responsive genes were evaluated by quantitative reverse transcription polymerase chain reaction following treatment with CR, gamma rays (GR), and ion beam (IB); similar patterns were observed among these genes under these 3 treatments. Moreover, we constructed subnetworks of IR-responsive genes and evaluated the expression levels of their neighboring genes following GR treatment; similar patterns were observed among them. These results of network-based analyses might provide a clue to understanding the complex biological system related to the CR response in plants. Copyright © 2014 Elsevier B.V. All rights reserved.

  14. Item Banks for Measuring Emotional Distress From the Patient-Reported Outcomes Measurement Information System (PROMIS®): Depression, Anxiety, and Anger

    PubMed Central

    Pilkonis, Paul A.; Choi, Seung W.; Reise, Steven P.; Stover, Angela M.; Riley, William T.; Cella, David

    2011-01-01

    The authors report on the development and calibration of item banks for depression, anxiety, and anger as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®). Comprehensive literature searches yielded an initial bank of 1,404 items from 305 instruments. After qualitative item analysis (including focus groups and cognitive interviewing), 168 items (56 for each construct) were written in a first person, past tense format with a 7-day time frame and five response options reflecting frequency. The calibration sample included nearly 15,000 respondents. Final banks of 28, 29, and 29 items were calibrated for depression, anxiety, and anger, respectively, using item response theory. Test information curves showed that the PROMIS item banks provided more information than conventional measures in a range of severity from approximately −1 to +3 standard deviations (with higher scores indicating greater distress). Short forms consisting of seven to eight items provided information comparable to legacy measures containing more items. PMID:21697139

  15. Item banks for measuring emotional distress from the Patient-Reported Outcomes Measurement Information System (PROMIS®): depression, anxiety, and anger.

    PubMed

    Pilkonis, Paul A; Choi, Seung W; Reise, Steven P; Stover, Angela M; Riley, William T; Cella, David

    2011-09-01

    The authors report on the development and calibration of item banks for depression, anxiety, and anger as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®). Comprehensive literature searches yielded an initial bank of 1,404 items from 305 instruments. After qualitative item analysis (including focus groups and cognitive interviewing), 168 items (56 for each construct) were written in a first person, past tense format with a 7-day time frame and five response options reflecting frequency. The calibration sample included nearly 15,000 respondents. Final banks of 28, 29, and 29 items were calibrated for depression, anxiety, and anger, respectively, using item response theory. Test information curves showed that the PROMIS item banks provided more information than conventional measures in a range of severity from approximately -1 to +3 standard deviations (with higher scores indicating greater distress). Short forms consisting of seven to eight items provided information comparable to legacy measures containing more items.

  16. Harmonizing Measures of Cognitive Performance Across International Surveys of Aging Using Item Response Theory.

    PubMed

    Chan, Kitty S; Gross, Alden L; Pezzin, Liliana E; Brandt, Jason; Kasper, Judith D

    2015-12-01

    To harmonize measures of cognitive performance using item response theory (IRT) across two international aging studies. Data for persons ≥65 years from the Health and Retirement Study (HRS, N = 9,471) and the English Longitudinal Study of Aging (ELSA, N = 5,444). Cognitive performance measures varied (HRS fielded 25, ELSA 13); 9 were in common. Measurement precision was examined for IRT scores based on (a) common items, (b) common items adjusted for differential item functioning (DIF), and (c) DIF-adjusted all items. Three common items (day of date, immediate word recall, and delayed word recall) demonstrated DIF by survey. Adding survey-specific items improved precision but mainly for HRS respondents at lower cognitive levels. IRT offers a feasible strategy for harmonizing cognitive performance measures across other surveys and for other multi-item constructs of interest in studies of aging. Practical implications depend on sample distribution and the difficulty mix of in-common and survey-specific items. © The Author(s) 2015.

  17. The multi-faceted assessment of independence in patients with rheumatoid arthritis: preliminary validation from the ATTAIN study.

    PubMed

    Hassett, Afton L; Li, Tracy; Buyske, Steven; Savage, Shantal V; Gignac, Monique A M

    2008-05-01

    To consider the feasibility of assessing multiple facets of independence in rheumatoid arthritis (RA) using a measure developed from existing items and examining its face validity, construct validity and responsiveness to change. The ATTAIN (Abatacept Trial in Treatment of Anti-tumor necrosis factor [TNF] Inadequate responders) database was used. Patients with RA were randomized 2:1, abatacept (n = 258) and placebo (n = 133). A multi-faceted scale to measure physical and psychosocial independence was constructed using items from the Health Assessment Questionnaire (HAQ) and Short Form 36 Health Survey (SF-36). Questions assessing activity limitations and need for outside caregiver help were also examined. Interviews with 20 RA patients assessed face validity. Item Response Theory analysis yielded two traits - 'Psychosocial Independence', derived from the number of days with activity limitations plus the Role Emotional, Social Functioning and Role Physical subscale items from the SF-36; and 'Physical Independence', derived from 15 HAQ items assessing need for help from another. The two traits showed no significant differential item functioning for age or gender and demonstrated good face validity. Changes over 169 days on Psychosocial Independence were greater (mean 0.46 units, 95% confidence interval [CI]: 0.17-0.75) for the abatacept group than for placebo (p = 0.002). Changes in Physical Independence were greater (mean 0.59 units, 95% CI: 0.35-0.82) for the abatacept group than for placebo (p < 0.001). The multi-faceted assessment of independence in RA based on items from commonly used instruments is feasible suggesting promise for evaluating independence in future clinical trials. This approach demonstrated good face and construct validity and responsiveness in RA patients who had previously failed anti-TNF therapy. However, we caution against an interpretation that these data suggest that abatacept improves independence because the component parts of this assessment came from instruments used in the ATTAIN trial where data had been previously analyzed.

  18. The construct validity of the Major Depression Inventory: A Rasch analysis of a self-rating scale in primary care.

    PubMed

    Nielsen, Marie Germund; Ørnbøl, Eva; Vestergaard, Mogens; Bech, Per; Christensen, Kaj Sparle

    2017-06-01

    We aimed to assess the measurement properties of the ten-item Major Depression Inventory when used on clinical suspicion in general practice by performing a Rasch analysis. General practitioners asked consecutive persons to respond to the web-based Major Depression Inventory on clinical suspicion of depression. We included 22 practices and 245 persons. Rasch analysis was performed using RUMM2030 software. The Rasch model fit suggests that all items contribute to a single underlying trait (defined as internal construct validity). Mokken analysis was used to test dimensionality and scalability. Our Rasch analysis showed misfit concerning the sleep and appetite items (items 9 and 10). The response categories were disordered for eight items. After modifying the original six-point to a four-point scoring system for all items, we achieved ordered response categories for all ten items. The person separation reliability was acceptable (0.82) for the initial model. Dimensionality testing did not support combining the ten items to create a total score. The scale appeared to be well targeted to this clinical sample. No significant differential item functioning was observed for gender, age, work status and education. The Rasch and Mokken analyses revealed two dimensions, but the Major Depression Inventory showed fit to one scale if items 9 and 10 were excluded. Our study indicated scalability problems in the current version of the Major Depression Inventory. The conducted analysis revealed better statistical fit when items 9 and 10 were excluded. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. Reliability and validity evidence of the Assessment of Language Use in Social Contexts for Adults (ALUSCA).

    PubMed

    Valente, Ana Rita S; Hall, Andreia; Alvelos, Helena; Leahy, Margaret; Jesus, Luis M T

    2018-04-12

    The appropriate use of language in context depends on the speaker's pragmatic language competencies. A coding system was used to develop a specific, adult-focused, self-administered questionnaire for adults who stutter and adults who do not stutter, the Assessment of Language Use in Social Contexts for Adults, with three categories: precursors, basic exchanges, and extended literal/non-literal discourse. This paper presents the content validity, item analysis, reliability coefficients, and evidence of construct validity of the instrument. Content validity analysis was based on a two-stage process: first, 11 pragmatic questionnaires were assessed to identify items that probe each pragmatic competency and to create the first version of the instrument; second, items were assessed qualitatively by an expert panel composed of adults who stutter and controls, and quantitatively and qualitatively by an expert panel composed of clinicians. A pilot study was conducted with five adults who stutter and five controls to analyse items and calculate reliability. Evidence of construct validity was obtained using the hypothesized relationships method and factor analysis with 28 adults who stutter and 28 controls. Concerning content validity, the questionnaires assessed up to 13 pragmatic competencies. Qualitative and quantitative analysis revealed ambiguities in item construction. Disagreement between experts was resolved through item modification. The pilot study showed that the instrument presented internal consistency and temporal stability. Significant differences between adults who stutter and controls and different response profiles revealed the instrument's underlying construct. The instrument is reliable and presented evidence of construct validity.

  20. Development of the multiple sclerosis (MS) early mobility impairment questionnaire (EMIQ).

    PubMed

    Ziemssen, Tjalf; Phillips, Glenn; Shah, Ruchit; Mathias, Adam; Foley, Catherine; Coon, Cheryl; Sen, Rohini; Lee, Andrew; Agarwal, Sonalee

    2016-10-01

    The Early Mobility Impairment Questionnaire (EMIQ) was developed to facilitate early identification of mobility impairments in multiple sclerosis (MS) patients. We describe the initial development of the EMIQ with a focus on the psychometric evaluation of the questionnaire using classical and item response theory methods. The initial 20-item EMIQ was constructed by clinical specialists and qualitatively tested among people with MS and physicians via cognitive interviews. Data from an observational study was used to make additional updates to the instrument based on exploratory factor analysis (EFA) and item response theory (IRT) analysis, and psychometric analyses were performed to evaluate the reliability and validity of the final instrument's scores and screening properties (i.e., sensitivity and specificity). Based on qualitative interview analyses, a revised 15-item EMIQ was included in the observational study. EFA, IRT and item-to-item correlation analyses revealed redundant items which were removed leading to the final nine-item EMIQ. The nine-item EMIQ performed well with respect to: test-retest reliability (ICC = 0.858); internal consistency (α = 0.893); convergent validity; and known-groups methods for construct validity. A cut-point of 41 on the 0-to-100 scale resulted in sufficient sensitivity and specificity statistics for viably identifying patients with mobility impairment. The EMIQ is a content valid and psychometrically sound instrument for capturing MS patients' experience with mobility impairments in a clinical practice setting. Additional research is suggested to further confirm the EMIQ's screening properties over time.
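
    As an illustration of the screening statistics mentioned in the abstract above, the following is a minimal sketch of how sensitivity and specificity can be computed for a chosen cut-point. The function name, the toy scores, and the assumption that scores at or above the cut-point are flagged as impaired are all hypothetical; none of them come from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float],
    impaired: Sequence[bool],
    cut_point: float,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= cut_point are flagged."""
    tp = fn = tn = fp = 0
    for score, has_condition in zip(scores, impaired):
        # Assumed direction: a higher score means more impairment.
        flagged = score >= cut_point
        if has_condition and flagged:
            tp += 1
        elif has_condition:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one impaired and one unimpaired case")
    # Sensitivity: share of impaired cases flagged; specificity: share of
    # unimpaired cases not flagged.
    return tp / (tp + fn), tn / (tn + fp)


# Made-up toy data on a 0-to-100 scale, not taken from the study.
toy_scores = [10, 30, 45, 60, 80, 41]
toy_impaired = [False, False, False, True, True, True]
sens, spec = sensitivity_specificity(toy_scores, toy_impaired, cut_point=41)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

    Repeating the calculation over a range of candidate cut-points is one common way to choose a threshold that balances the two statistics.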

  1. Differential item functional analysis on pedagogic and content knowledge (PCK) questionnaire for Indonesian teachers using RASCH model

    NASA Astrophysics Data System (ADS)

    Rahmani, B. D.

    2018-01-01

    The purpose of this paper is to evaluate Indonesian senior high school teachers' pedagogical content knowledge and their perceptions of curriculum change in West Java, Indonesia. The data used in this study were derived from a questionnaire survey conducted among teachers in Bandung, West Java. A total of 61 usable responses were collected. Differential item functioning (DIF) analysis was used to examine whether item responses differed by gender, educational background, and school location. The results showed no significant differences in item responses by gender or school location, but a difference by educational background. In conclusion, teachers' educational background influenced their responses to the questionnaire. It is therefore suggested that future questionnaire items be constructed to account for differences among participants, particularly in educational background.

  2. Cognitive Diagnostic Models for Tests with Multiple-Choice and Constructed-Response Items

    ERIC Educational Resources Information Center

    Kuo, Bor-Chen; Chen, Chun-Hua; Yang, Chih-Wei; Mok, Magdalena Mo Ching

    2016-01-01

    Traditionally, teachers evaluate students' abilities via their total test scores. Recently, cognitive diagnostic models (CDMs) have begun to provide information about the presence or absence of students' skills or misconceptions. Nevertheless, CDMs are typically applied to tests with multiple-choice (MC) items, which provide less diagnostic…

  3. Improving measurement of injection drug risk behavior using item response theory.

    PubMed

    Janulis, Patrick

    2014-03-01

    Recent research highlights the multiple steps to preparing and injecting drugs and the resultant viral threats faced by drug users. This research suggests that more sensitive measurement of injection drug HIV risk behavior is required. In addition, growing evidence suggests there are gender differences in injection risk behavior. However, the potential for differential item functioning between genders has not been explored. To explore item response theory as an improved measurement modeling technique that provides empirically justified scaling of injection risk behavior and to examine for potential gender-based differential item functioning. Data is used from three studies in the National Institute on Drug Abuse's Criminal Justice Drug Abuse Treatment Studies. A two-parameter item response theory model was used to scale injection risk behavior and logistic regression was used to examine for differential item functioning. Item fit statistics suggest that item response theory can be used to scale injection risk behavior and these models can provide more sensitive estimates of risk behavior. Additionally, gender-based differential item functioning is present in the current data. Improved measurement of injection risk behavior using item response theory should be encouraged as these models provide increased congruence between construct measurement and the complexity of injection-related HIV risk. Suggestions are made to further improve injection risk behavior measurement. Furthermore, results suggest direct comparisons of composite scores between males and females may be misleading and future work should account for differential item functioning before comparing levels of injection risk behavior.

  4. Vegetable parenting practices scale. Item response modeling analyses

    PubMed Central

    Chen, Tzu-An; O’Connor, Teresia; Hughes, Sheryl; Beltran, Alicia; Baranowski, Janice; Diep, Cassandra; Baranowski, Tom

    2015-01-01

    Objective: To evaluate the psychometric properties of a vegetable parenting practices scale using multidimensional polytomous item response modeling, which enables assessing item fit to latent variables and the distributional characteristics of the items in comparison to the respondents. We also tested for differences in the ways items function (called differential item functioning) across child’s gender, ethnicity, age, and household income groups. Method: Parents of 3–5 year old children completed a self-reported vegetable parenting practices scale online. Vegetable parenting practices consisted of 14 effective vegetable parenting practices and 12 ineffective vegetable parenting practices items, each with three subscales (responsiveness, structure, and control). Multidimensional polytomous item response modeling was conducted separately on effective vegetable parenting practices and ineffective vegetable parenting practices. Results: One effective vegetable parenting practice item did not fit the model well in the full sample or across demographic groups, and another was a misfit in differential item functioning analyses across child’s gender. Significant differential item functioning was detected across children’s age and ethnicity groups, and more among effective vegetable parenting practices than ineffective vegetable parenting practices items. Wright maps showed items only covered parts of the latent trait distribution. The harder- and easier-to-respond ends of the construct were not covered by items for effective vegetable parenting practices and ineffective vegetable parenting practices, respectively. Conclusions: Several effective vegetable parenting practices and ineffective vegetable parenting practices scale items functioned differently on the basis of child’s demographic characteristics; therefore, researchers should use these vegetable parenting practices scales with caution. Item response modeling should be incorporated in analyses of parenting practice questionnaires to better assess differences across demographic characteristics. PMID:25895694

  5. An item response theory analysis of the Executive Interview and development of the EXIT8: A Project FRONTIER Study.

    PubMed

    Jahn, Danielle R; Dressel, Jeffrey A; Gavett, Brandon E; O'Bryant, Sid E

    2015-01-01

    The Executive Interview (EXIT25) is an effective measure of executive dysfunction, but may be inefficient due to the time it takes to complete 25 interview-based items. The current study aimed to examine psychometric properties of the EXIT25, with a specific focus on determining whether a briefer version of the measure could comprehensively assess executive dysfunction. The current study applied a graded response model (a type of item response theory model for polytomous categorical data) to identify items that were most closely related to the underlying construct of executive functioning and best discriminated between varying levels of executive functioning. Participants were 660 adults ages 40 to 96 years living in West Texas, who were recruited through an ongoing epidemiological study of rural health and aging, called Project FRONTIER. The EXIT25 was the primary measure examined. Participants also completed the Trail Making Test and Controlled Oral Word Association Test, among other measures, to examine the convergent validity of a brief form of the EXIT25. Eight items were identified that provided the majority of the information about the underlying construct of executive functioning; total scores on these items were associated with total scores on other measures of executive functioning and were able to differentiate between cognitively healthy, mildly cognitively impaired, and demented participants. In addition, cutoff scores were recommended based on sensitivity and specificity of scores. A brief, eight-item version of the EXIT25 may be an effective and efficient screening for executive dysfunction among older adults.

  6. Improving the Measurement of Cognitive Skills through Automated Conversations

    ERIC Educational Resources Information Center

    Jackson, G. Tanner; Castellano, Katherine E.; Brockway, Debra; Lehman, Blair

    2018-01-01

    Open-ended, short-answer questions, referred to as constructed responses (CR), allow students to express knowledge and skills through their own words. While CRs can reduce the likelihood of guessing correct answers, they also enable students to provide errant responses due to a lack of knowledge or a misunderstanding of the question.…

  7. Using item response theory to address vulnerabilities in FFQ.

    PubMed

    Kazman, Josh B; Scott, Jonathan M; Deuster, Patricia A

    2017-09-01

    The limitations for self-reporting of dietary patterns are widely recognised as a major vulnerability of FFQ and the dietary screeners/scales derived from FFQ. Such instruments can yield inconsistent results to produce questionable interpretations. The present article discusses the value of psychometric approaches and standards in addressing these drawbacks for instruments used to estimate dietary habits and nutrient intake. We argue that a FFQ or screener that treats diet as a 'latent construct' can be optimised for both internal consistency and the value of the research results. Latent constructs, a foundation for item response theory (IRT)-based scales (e.g. Patient Reported Outcomes Measurement Information System) are typically introduced in the design stage of an instrument to elicit critical factors that cannot be observed or measured directly. We propose an iterative approach that uses such modelling to refine FFQ and similar instruments. To that end, we illustrate the benefits of psychometric modelling by using items and data from a sample of 12 370 Soldiers who completed the 2012 US Army Global Assessment Tool (GAT). We used factor analysis to build the scale incorporating five out of eleven survey items. An IRT-driven assessment of response category properties indicates likely problems in the ordering or wording of several response categories. Group comparisons, examined with differential item functioning (DIF), provided evidence of scale validity across each Army sub-population (sex, service component and officer status). Such an approach holds promise for future FFQ.

  8. The Health Education Impact Questionnaire (heiQ): an outcomes and evaluation measure for patient education and self-management interventions for people with chronic conditions.

    PubMed

    Osborne, Richard H; Elsworth, Gerald R; Whitfield, Kathryn

    2007-05-01

    This paper describes the development and validation of the Health Education Impact Questionnaire (heiQ). The aim was to develop a user-friendly, relevant, and psychometrically sound instrument for the comprehensive evaluation of patient education programs, which can be applied across a broad range of chronic conditions. Item development for the heiQ was guided by a Program Logic Model, Concept Mapping, interviews with stakeholders and psychometric analyses. Construction (N=591) and confirmatory (N=598) samples were drawn from consumers of patient education programs and hospital outpatients. The properties of the heiQ were investigated using item response theory and structural equation modeling. Over 90 candidate items were generated, with 42 items selected for inclusion in the final scale. Eight independent dimensions were derived: Positive and Active Engagement in Life (five items, Cronbach's alpha (alpha)=0.86); Health Directed Behavior (four items, alpha=0.80); Skill and Technique Acquisition (five items, alpha=0.81); Constructive Attitudes and Approaches (five items, alpha=0.81); Self-Monitoring and Insight (seven items, alpha=0.70); Health Service Navigation (five items, alpha=0.82); Social Integration and Support (five items, alpha=0.86); and Emotional Wellbeing (six items, alpha=0.89). The heiQ has high construct validity and is a reliable measure of a broad range of patient education program benefits. The heiQ will provide valuable information to clinicians, researchers, policymakers and other stakeholders about the value of patient education programs in chronic disease management.
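
    The abstract above reports Cronbach's alpha for each dimension. As a hedged illustration of what that coefficient measures, the sketch below computes alpha from a respondents-by-items score matrix using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy responses are hypothetical and are not heiQ data.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a matrix with one row per respondent."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Transpose rows (respondents) into columns (items).
    item_columns = list(zip(*responses))
    sum_item_var = sum(variance(col) for col in item_columns)
    # Variance of each respondent's total score across all items.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum_item_var / total_var)


# Made-up toy responses: four respondents, three items.
toy_data = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 5, 5],
]
alpha = cronbach_alpha(toy_data)
print(f"alpha = {alpha:.3f}")
```

    Sample variances are used throughout; using population variances for both the item and total terms would give the same alpha, because the correction factor cancels in the ratio.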

  9. Analysis of the construct of dignity and content validity of the patient dignity inventory

    PubMed Central

    Albers, Gwenda; Pasman, H Roeline W; Rurup, Mette L; de Vet, Henrica C W; Onwuteaka-Philipsen, Bregje D

    2011-01-01

    Background: Maintaining dignity, the quality of being worthy of esteem or respect, is considered as a goal of palliative care. The aim of this study was to analyse the construct of personal dignity and to assess the content validity of the Patient Dignity Inventory (PDI) in people with an advance directive in the Netherlands. Methods: Data were collected within the framework of an advance directives cohort study. This cohort study is aiming to get a better insight into how decisions are made at the end of life with regard to advance directives in the Netherlands. One half of the cohort (n = 2404) received an open-ended question concerning factors relevant to dignity. Content labels were assigned to issues mentioned in the responses to the open-ended question. The other half of the cohort (n = 2537) received a written questionnaire including the PDI. The relevance and comprehensiveness of the PDI items were assessed with the COSMIN checklist ('COnsensus-based Standards for the selection of health status Measurement INstruments'). Results: The majority of the PDI items were found to be relevant for the construct to be measured, the study population, and the purpose of the study but the items were not completely comprehensive. The responses to the open-ended question indicated that communication and care-related aspects were also important for dignity. Conclusions: This study demonstrated that the PDI items were relevant for people with an advance directive in the Netherlands. The comprehensiveness of the items can be improved by including items concerning communication and care. PMID:21682924

  10. A Psychometric Analysis of the Italian Version of the eHealth Literacy Scale Using Item Response and Classical Test Theory Methods

    PubMed Central

    Dima, Alexandra Lelia; Schulz, Peter Johannes

    2017-01-01

    Background: The eHealth Literacy Scale (eHEALS) is a tool to assess consumers’ comfort and skills in using information technologies for health. Although evidence exists of reliability and construct validity of the scale, less agreement exists on structural validity. Objective: The aim of this study was to validate the Italian version of the eHealth Literacy Scale (I-eHEALS) in a community sample with a focus on its structural validity, by applying psychometric techniques that account for item difficulty. Methods: Two Web-based surveys were conducted among a total of 296 people living in the Italian-speaking region of Switzerland (Ticino). After examining the latent variables underlying the observed variables of the Italian scale via principal component analysis (PCA), fit indices for two alternative models were calculated using confirmatory factor analysis (CFA). The scale structure was examined via parametric and nonparametric item response theory (IRT) analyses accounting for differences between items regarding the proportion of answers indicating high ability. Convergent validity was assessed by correlations with theoretically related constructs. Results: CFA showed a suboptimal model fit for both models. IRT analyses confirmed all items measure a single dimension as intended. Reliability and construct validity of the final scale were also confirmed. The contrasting results of factor analysis (FA) and IRT analyses highlight the importance of considering differences in item difficulty when examining health literacy scales. Conclusions: The findings support the reliability and validity of the translated scale and its use for assessing Italian-speaking consumers’ eHealth literacy. PMID:28400356

  11. Analysis of the construct of dignity and content validity of the patient dignity inventory.

    PubMed

    Albers, Gwenda; Pasman, H Roeline W; Rurup, Mette L; de Vet, Henrica C W; Onwuteaka-Philipsen, Bregje D

    2011-06-19

    Maintaining dignity, the quality of being worthy of esteem or respect, is considered as a goal of palliative care. The aim of this study was to analyse the construct of personal dignity and to assess the content validity of the Patient Dignity Inventory (PDI) in people with an advance directive in the Netherlands. Data were collected within the framework of an advance directives cohort study. This cohort study is aiming to get a better insight into how decisions are made at the end of life with regard to advance directives in the Netherlands. One half of the cohort (n = 2404) received an open-ended question concerning factors relevant to dignity. Content labels were assigned to issues mentioned in the responses to the open-ended question. The other half of the cohort (n = 2537) received a written questionnaire including the PDI. The relevance and comprehensiveness of the PDI items were assessed with the COSMIN checklist ('COnsensus-based Standards for the selection of health status Measurement INstruments'). The majority of the PDI items were found to be relevant for the construct to be measured, the study population, and the purpose of the study but the items were not completely comprehensive. The responses to the open-ended question indicated that communication and care-related aspects were also important for dignity. This study demonstrated that the PDI items were relevant for people with an advance directive in the Netherlands. The comprehensiveness of the items can be improved by including items concerning communication and care.

  12. Emotional Intelligence and Nurse Recruitment: Rasch and confirmatory factor analysis of the trait emotional intelligence questionnaire short form.

    PubMed

    Snowden, Austyn; Watson, Roger; Stenhouse, Rosie; Hale, Claire

    2015-12-01

    To examine the construct validity of the Trait Emotional Intelligence Questionnaire Short form. Emotional intelligence involves the identification and regulation of our own emotions and the emotions of others. It is therefore a potentially useful construct in the investigation of recruitment and retention in nursing and many questionnaires have been constructed to measure it. Secondary analysis of existing dataset of responses to Trait Emotional Intelligence Questionnaire Short form using concurrent application of Rasch analysis and confirmatory factor analysis. First year undergraduate nursing and computing students completed Trait Emotional Intelligence Questionnaire-Short Form in September 2013. Responses were analysed by synthesising results of Rasch analysis and confirmatory factor analysis. Participants (N = 938) completed Trait Emotional Intelligence Questionnaire Short form. Rasch analysis showed the majority of the Trait Emotional Intelligence Questionnaire-Short Form items made a unique contribution to the latent trait of emotional intelligence. Five items did not fit the model and differential item functioning (gender) accounted for this misfit. Confirmatory factor analysis revealed a four-factor structure consisting of: self-confidence, empathy, uncertainty and social connection. All five misfitting items from the Rasch analysis belonged to the 'social connection' factor. The concurrent use of Rasch and factor analysis allowed for novel interpretation of Trait Emotional Intelligence Questionnaire Short form. Much of the response variation in Trait Emotional Intelligence Questionnaire Short form can be accounted for by the social connection factor. Implications for practice are discussed. © 2015 John Wiley & Sons Ltd.

  13. Equal Area Logistic Estimation for Item Response Theory

    NASA Astrophysics Data System (ADS)

    Lo, Shih-Ching; Wang, Kuo-Chang; Chang, Hsin-Li

    2009-08-01

    Item response theory (IRT) models use logistic functions exclusively as item response functions (IRFs). Applications of IRT models require obtaining the set of values for logistic function parameters that best fit an empirical data set. However, success in obtaining such set of values does not guarantee that the constructs they represent actually exist, for the adequacy of a model is not sustained by the possibility of estimating parameters. In this study, an equal area based two-parameter logistic model estimation algorithm is proposed. Two theorems are given to prove that the results of the algorithm are equivalent to the results of fitting data by logistic model. Numerical results are presented to show the stability and accuracy of the algorithm.
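
    For readers unfamiliar with the logistic item response function that the abstract above refers to, here is a minimal sketch of the standard two-parameter logistic (2PL) form, P(theta) = 1 / (1 + exp(-a * (theta - b))), where a is item discrimination and b is item difficulty. It illustrates only the function being fitted, not the equal-area estimation algorithm the authors propose. The function name and parameter values are hypothetical, and the optional scaling constant (often written D = 1.7) is omitted.

```python
import math


def irf_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct/endorsed response under the 2PL model.

    theta: latent trait level of the respondent
    a:     item discrimination (slope)
    b:     item difficulty (location)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Illustrative parameter values only, not estimates from any data set.
for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    p = irf_2pl(theta, a=1.5, b=0.0)
    print(f"theta={theta:+.1f}  P={p:.3f}")
```

    When theta equals b the probability is exactly 0.5, and a larger a makes the curve steeper around that point.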

  14. Construct Validity of the Spanish Versions of the Memorial Symptom Assessment Scale Short Form and Condensed Form: Rasch Analysis of Responses in Oncology Outpatients.

    PubMed

    Llamas-Ramos, Inés; Llamas-Ramos, Rocío; Buz, José; Cortés-Rodríguez, María; Martín-Nogueras, Ana María

    2018-06-01

    The Memorial Symptom Assessment Scale (MSAS) is a self-rating instrument for the assessment of symptom distress in cancer patients. The Spanish version of the MSAS has recently been validated. However, we lack evidence of the internal construct validity of the shorter versions (short form [MSAS-SF] and condensed form [CMSAS]). In addition, rigorous testing of these scales with modern psychometric methods is needed. The aim of this study was to evaluate the internal construct validity and reliability of the Spanish versions of the MSAS-SF and CMSAS in oncology outpatients using Rasch analysis. Data from a convenience sample of oncology outpatients receiving chemotherapy (n = 306; mean age 60 years; 63% women) at a university hospital were analyzed. The Rasch unidimensional measurement model was used to examine response category functioning, item hierarchy, targeting, unidimensionality, reliability, and differential item functioning by age, gender, and marital status. The response category structure of the symptom distress items was improved by collapsing two categories. The scales were adequately targeted to the study patients, showed overall Rasch model fit (mean Infit MnSq ranged from 0.98 to 1.05), met criteria for unidimensionality, and the reliability of scores was good (person reliability > 0.80), except for the CMSAS prevalence scale. Only four items showed differential item functioning. The present study demonstrated that the Spanish versions of the MSAS-SF and CMSAS have adequate psychometric properties to evaluate symptom distress in oncology outpatients. Additional studies of the CMSAS are recommended. Copyright © 2018 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.

  15. Item bank development, calibration and validation for patient-reported outcomes in female urinary incontinence

    PubMed Central

    Sung, Vivian W.; Griffith, James W.; Rogers, Rebecca G.; Raker, Christina A.; Clark, Melissa A.

    2016-01-01

    Purpose: Current patient-reported outcomes for female urinary incontinence (UI) are limited by their inability to be tailored. Our objective is to describe the development and field-testing of 7 item banks designed to measure domains identified as important UI in females (UIf). We also describe the calibration and validation properties of the UIf-item banks, which allow for more efficient computerized-adaptive testing (CAT) in the future. Methods: The UIf-measures included 168 items covering 7 domains: Stress UI (SUI), Overactive Bladder (OAB), Urinary Frequency, Physical, Social and Emotional Health Impact, and Adaptation. Items underwent rigorous qualitative development and psychometric testing across 2 sites. Items were calibrated using item response theory and evaluated for internal consistency, construct validity and responsiveness. Results: 750 women (249 SUI, 249 OAB, and 252 mixed UI) participated. Mean age was 55±14 years; 23% were Hispanic and 80% were white. In addition to face and content validity, the measures demonstrated good internal consistency (coefficient alpha 0.92-0.98) and unidimensionality. There was evidence for construct validity with moderate to strong correlations with the UDI (r's ≥ 0.6) and IIQ (r's ≥ 0.6) scales. The measures were responsive to change for SUI treatment (paired t-test p < .001, ES range = 1.3 to 2.9; SRM range = 1.3 to 2.5) and OAB treatment (paired t-test p < .05 for all domains except Social Health Impact and Adaptation, ES range = 0.3 to 1.5, SRM range = 0.4 to 1.0). The measures were responsive based on concurrent changes with the UDI and IIQ (p < 0.05). CAT versions were developed and pilot tested. Conclusions: The UIf-item banks demonstrate good psychometric characteristics and are a sufficiently valid set of customizable tools for measuring UI symptoms and life impact. PMID:26732514

  16. Diagnostic Opportunities Using Rasch Measurement in the Context of a Misconceptions-Based Physical Science Assessment

    ERIC Educational Resources Information Center

    Wind, Stefanie A.; Gale, Jessica D.

    2015-01-01

    Multiple-choice (MC) items that are constructed such that distractors target known misconceptions for a particular domain provide useful diagnostic information about student misconceptions (Herrmann-Abell & DeBoer, 2011, 2014; Sadler, 1998). Item response theory models can be used to examine misconceptions distractor-driven multiple-choice…

  17. Using Hospital Anxiety and Depression Scale (HADS) on patients with epilepsy: Confirmatory factor analysis and Rasch models.

    PubMed

    Lin, Chung-Ying; Pakpour, Amir H

    2017-02-01

    The problems of mood disorders are critical in people with epilepsy. Therefore, there is a need to validate a useful tool for the population. The Hospital Anxiety and Depression Scale (HADS) has been used in this population and has been shown to be a satisfactory screening tool. However, more evidence on its construct validity is needed. A total of 1041 people with epilepsy were recruited in this study, and each completed the HADS. Confirmatory factor analysis (CFA) and Rasch analysis were used to understand the construct validity of the HADS. In addition, internal consistency was tested using Cronbach's α, person separation reliability, and item separation reliability. Ordering of the response descriptors and the differential item functioning (DIF) were examined using the Rasch models. Based on the HADS cutoffs, 55.3% of our participants had anxiety and 56.0% had depression. CFA and Rasch analyses both showed the satisfactory construct validity of the HADS; the internal consistency was also acceptable (α=0.82 in anxiety and 0.79 in depression; person separation reliability=0.82 in anxiety and 0.73 in depression; item separation reliability=0.98 in anxiety and 0.91 in depression). The difficulties of the four-point Likert scale used in the HADS increased monotonically, which indicates no disordered response categories. No DIF items across male and female patients and across types of epilepsy were displayed in the HADS. The HADS has promising psychometric properties on construct validity in people with epilepsy. Moreover, the additive item score is supported for calculating the cutoff. Copyright © 2016 British Epilepsy Association. Published by Elsevier Ltd. All rights reserved.

  18. The development of an instrument to assess chemistry perceptions

    NASA Astrophysics Data System (ADS)

    Wells, Raymond R.

    The instrument, developed in this study, attempted to correct the deficiencies of previous instruments. Statements of belief and opinion can be validly included under the construct of chemistry perceptions. Further, statements that might be better characterized as science attitudes, math attitudes, or attitudes toward a specific course or program were not included. Eliminating statements of math anxiety and test anxiety insured that responses to statements of anxiety were perceptions of anxiety solely related to chemistry. The results of the expert judges' responses to the Validation of Proposed Perception Statements forms were detailed to establish construct and content validity. The nature of Likert scale construction and calculation of internal consistency also supported the validity of the instrument. A pilot Chemistry Perception Questionnaire (CPQ) was then constructed based on agreement of the appropriate subscale and mean importance of the perception statements. The pilot CPQ results were subjected to an item analysis based on three sets of statistics: the frequency of each response and the percentage of respondents making each response for each perception statement, the mean and standard deviations for each item, and the item discrimination index which correlated the item scores with the subscale scores. With no zero or negative correlations to the subscale scores, it was not necessary to replace any of the perception statements contained in the pilot instrument. Therefore, the piloted Chemistry Perception Questionnaire became the final instrument. Factor analysis confirmed the multidimensionality of the instrument. The instrument was administered twice with a separation interval of approximately one month in order to perform a test-retest reliability analysis. One hundred and forty-one pairs were matched and results detailed. The correlation between forms, for the total instrument, was 0.9342. 
    The mean coefficient alpha, for the total instrument, was 0.9495. With test-retest correlations and alphas exceeding 0.70 for all seven subscales and the total instrument, it was determined that the Chemistry Perception Questionnaire instrument achieved reasonably high reliability estimations.
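
    The item analysis and internal-consistency statistics described in this record can be sketched as follows. This is a minimal illustration only, not the author's procedure: the function names and the response data are hypothetical, and the discrimination index is assumed here to be a plain Pearson correlation between each item and the (uncorrected) subscale total.

```python
from statistics import mean, variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def item_total_correlations(scores):
    """Correlate each item with the subscale total (assumed discrimination index)."""
    totals = [sum(row) for row in scores]
    k = len(scores[0])
    return [pearson([row[i] for row in scores], totals) for i in range(k)]


# Hypothetical 5-point Likert responses: 5 respondents x 3 statements.
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]

print("alpha:", round(cronbach_alpha(responses), 3))
print("item-total r:", [round(r, 3) for r in item_total_correlations(responses)])
```

    Under this sketch, an item with a zero or negative item-total correlation would be a candidate for replacement, which mirrors the decision rule the abstract describes.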

  19. Psychometric properties of the neck disability index amongst patients with chronic neck pain using item response theory.

    PubMed

    Saltychev, Mikhail; Mattie, Ryan; McCormick, Zachary; Laimi, Katri

    2017-05-13

    The Neck Disability Index (NDI) is commonly used for clinical and research assessment for chronic neck pain, yet the original version of this tool has not undergone significant validity testing, and in particular, there has been minimal assessment using Item Response Theory. The goal of the present study was to investigate the psychometric properties of the original version of the NDI in a large sample of individuals with chronic neck pain by defining its internal consistency, construct structure and validity, and its ability to discriminate between different degrees of functional limitation. This is a cross-sectional cohort study of 585 consecutive patients with chronic neck pain seen in a university hospital rehabilitation clinic. Internal consistency was evaluated using Cronbach's alpha, construct structure was evaluated by exploratory factor analysis, and discrimination ability was determined by Item Response Theory. The NDI demonstrated good internal consistency assessed by Cronbach's alpha (0.87). The exploratory factor analysis identified only one factor with eigenvalue considered significant (cutoff 1.0). When analyzed by Item Response Theory, eight out of 10 items demonstrated almost ideal difficulty parameter estimates. In addition, eight out of 10 items showed high to perfect estimates of discrimination ability (overall range 0.8 to 2.9). Amongst patients with chronic neck pain, the NDI was found to have good internal consistency, have unidimensional properties, and an excellent ability to distinguish patients with different levels of perceived disability. Implications for Rehabilitation The Neck Disability Index has good internal consistency, unidimensional properties, and an excellent ability to distinguish patients with different levels of perceived disability. The Neck Disability Index is recommended for use when selecting patients for rehabilitation, setting rehabilitation goals, and measuring the outcome of intervention.
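
    To make the difficulty and discrimination parameters mentioned in this record concrete, here is a minimal sketch of a two-parameter logistic (2PL) item response function. The 2PL form is an assumption chosen for illustration; the abstract does not state which IRT model was fitted, and the NDI items are polytomous, so the authors may well have used a different model. The two discrimination values are the endpoints of the range quoted in the abstract; the function name and the difficulty of 0.0 are hypothetical.

```python
import math


def two_pl_probability(theta, a, b):
    """Probability of endorsing an item under a 2PL model.

    theta: person's latent trait level
    a:     item discrimination (slope)
    b:     item difficulty (location)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Compare a weakly and a strongly discriminating item at the same difficulty.
for a in (0.8, 2.9):
    probs = [round(two_pl_probability(theta, a, b=0.0), 3)
             for theta in (-1.0, 0.0, 1.0)]
    print(f"a={a}: {probs}")
```

    A larger discrimination value gives a steeper curve around the item's difficulty, so the item separates people just below and just above that trait level more sharply.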

  20. The Trunk Impairment Scale - modified to ordinal scales in the Norwegian version.

    PubMed

    Gjelsvik, Bente; Breivik, Kyrre; Verheyden, Geert; Smedal, Tori; Hofstad, Håkon; Strand, Liv Inger

    2012-01-01

    To translate the Trunk Impairment Scale (TIS), a measure of trunk control in patients after stroke, into Norwegian (TIS-NV), and to explore its construct validity, internal consistency, intertester and test-retest reliability. TIS was translated according to international guidelines. The validity study was performed on data from 201 patients with acute stroke. Fifty patients with stroke and acquired brain injury were recruited to examine intertester and test-retest reliability. Construct validity was analyzed with exploratory and confirmatory factor analysis and item response theory, internal consistency with Cronbach's alpha test, and intertester and test-retest reliability with kappa and intraclass correlation coefficient tests. The back-translated version of TIS-NV was validated by the original developer. The subscale Static sitting balance was removed. By combining items from the subscales Dynamic sitting balance and Coordination, six ordinal superitems (testlets) were constructed. The TIS-NV was renamed the modified TIS-NV (TIS-modNV). After modifications the TIS-modNV fitted well to a locally dependent unidimensional item response theory model. It demonstrated good construct validity, excellent internal consistency, and high intertester and test-retest reliability for the total score. This study supports that the TIS-modNV is a valid and reliable scale for use in clinical practice and research.

  1. Inferring the effective TOR-dependent network: a computational study in yeast

    PubMed Central

    2013-01-01

    Background Calorie restriction (CR) is one of the most conserved non-genetic interventions that extends healthspan in evolutionarily distant species, ranging from yeast to mammals. The target of rapamycin (TOR) has been shown to play a key role in mediating healthspan extension in response to CR by integrating different signals that monitor nutrient-availability and orchestrating various components of cellular machinery in response. Both genetic and pharmacological interventions that inhibit the TOR pathway exhibit a similar phenotype, which is not further amplified by CR. Results In this paper, we present the first comprehensive, computationally derived map of TOR downstream effectors, with the objective of discovering key lifespan mediators, their crosstalk, and high-level organization. We adopt a systematic approach for tracing information flow from the TOR complex and use it to identify relevant signaling elements. By constructing a high-level functional map of TOR downstream effectors, we show that our approach is not only capable of recapturing previously known pathways, but also suggests potential targets for future studies. Information flow scores provide an aggregate ranking of relevance of proteins with respect to the TOR signaling pathway. These rankings must be normalized for degree bias, appropriately interpreted, and mapped to associated roles in pathways. We propose a novel statistical framework for integrating information flow scores, the set of differentially expressed genes in response to rapamycin treatment, and the transcriptional regulatory network. We use this framework to identify the most relevant transcription factors in mediating the observed transcriptional response, and to construct the effective response network of the TOR pathway. This network is hypothesized to mediate life-span extension in response to TOR inhibition. Conclusions Our approach, unlike experimental methods, is not limited to specific aspects of cellular response. 
    Rather, it predicts transcriptional changes and post-translational modifications in response to TOR inhibition. The constructed effective response network greatly enhances understanding of the mechanisms underlying the aging process and helps in identifying new targets for further investigation of anti-aging regimes. It also allows us to identify potential network biomarkers for diagnosis and prognosis of age-related pathologies. PMID:24005029

  2. The Mindful Attention Awareness Scale: Further Examination of Dimensionality, Reliability, and Concurrent Validity Estimates.

    PubMed

    Osman, Augustine; Lamis, Dorian A; Bagge, Courtney L; Freedenthal, Stacey; Barnes, Sean M

    2016-01-01

    We examined the factor structure and psychometric properties of the Mindful Attention Awareness Scale (MAAS) in a sample of 810 undergraduate students. Using common exploratory factor analysis (EFA), we obtained evidence for a 1-factor solution (41.84% common variance). To confirm unidimensionality of the 15-item MAAS, we conducted a 1-factor confirmatory factor analysis (CFA). Results of the EFA and CFA, respectively, provided support for a unidimensional model. Using differential item functioning analysis methods within item response theory modeling (IRT-based DIF), we found that individuals with high and low levels of nonattachment responded similarly to the MAAS items. Following a detailed item analysis, we proposed a 5-item short version of the instrument and present descriptive statistics and composite score reliability for the short and full versions of the MAAS. Finally, correlation analyses showed that scores on the full and short versions of the MAAS were associated with measures assessing related constructs. The 5-item MAAS is as useful as the original MAAS in enhancing our understanding of the mindfulness construct.

  3. Applying Item Response Theory Methods to Examine the Impact of Different Response Formats

    ERIC Educational Resources Information Center

    Hohensinn, Christine; Kubinger, Klaus D.

    2011-01-01

    In aptitude and achievement tests, different response formats are usually used. A fundamental distinction must be made between the class of multiple-choice formats and the constructed response formats. Previous studies have examined the impact of different response formats applying traditional statistical approaches, but these influences can also…

  4. Combination of classical test theory (CTT) and item response theory (IRT) analysis to study the psychometric properties of the French version of the Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form (Q-LES-Q-SF).

    PubMed

    Bourion-Bédès, Stéphanie; Schwan, Raymund; Epstein, Jonathan; Laprevote, Vincent; Bédès, Alex; Bonnet, Jean-Louis; Baumann, Cédric

    2015-02-01

    The study aimed to examine the construct validity and reliability of the Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form (Q-LES-Q-SF) according to both classical test and item response theories. The psychometric properties of the French version of this instrument were investigated in a cross-sectional, multicenter study. A total of 124 outpatients with a substance dependence diagnosis participated in the study. Psychometric evaluation included descriptive analysis, internal consistency, test-retest reliability, and validity. The dimensionality of the instrument was explored using a combination of the classical test, confirmatory factor analysis (CFA), and an item response theory analysis, the Person Separation Index (PSI), in a complementary manner. The results of the Q-LES-Q-SF revealed that the questionnaire was easy to administer and the acceptability was good. The internal consistency and the test-retest reliability were 0.9 and 0.88, respectively. All items were significantly correlated with the total score and the SF-12 used in the study. The CFA with one factor model was good, and for the unidimensional construct, the PSI was found to be 0.902. The French version of the Q-LES-Q-SF yielded valid and reliable clinical assessments of the quality of life for future research and clinical practice involving French substance abusers. In response to recent questioning regarding the unidimensionality or bidimensionality of the instrument and according to the underlying theoretical unidimensional construct used for its development, this study suggests the Q-LES-Q-SF as a one-dimension questionnaire in French QoL studies.

  5. Development and psychometric validation of a scale to assess information needs in cardiac rehabilitation: the INCR Tool.

    PubMed

    Ghisi, Gabriela Lima de Melo; Grace, Sherry L; Thomas, Scott; Evans, Michael F; Oh, Paul

    2013-06-01

    To develop and psychometrically validate a tool to assess information needs in cardiac rehabilitation (CR) patients. After a literature search, 60 information items divided into 11 areas of needs were identified. To establish content validity, they were reviewed by an expert panel (N=10). Refined items were pilot-tested in 34 patients on a 5-point Likert-scale from 1 "really not helpful" to 5 "very important". A final version was generated and psychometrically tested in 203 CR patients. Test-retest reliability was assessed via the intraclass correlation coefficient (ICC), the internal consistency using Cronbach's alpha, and criterion validity was assessed with regard to patient's education and duration in CR. Five items were excluded after ICC analysis as well as one area of needs. All 10 areas were considered internally consistent (Cronbach's alpha>0.7). Criterion validity was supported by significant differences in mean scores by educational level (p<0.05) and duration in CR (p<0.001). The mean total score was 4.08 ± 0.53. Patients rated safety as their greatest information need. The INCR Tool was demonstrated to have good reliability and validity. This is an appropriate tool for application in clinical and research settings, assessing patients' needs during CR and as part of education programming. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  6. Negative Mood and Obsessive-Compulsive Related Clinical Constructs: An Examination of Underlying Factors

    PubMed Central

    Britton, Gary I.; Davey, Graham C. L.

    2017-01-01

    Emerging evidence suggests that many of the clinical constructs used to help understand and explain obsessive-compulsive (OC) symptoms, and negative mood, may be causally interrelated. One approach to understanding this interrelatedness is a motivational systems approach. This approach suggests that rather than considering clinical constructs and negative affect as separable entities, they are all features of an integrated threat management system, and as such are highly coordinated and interdependent. The aim of the present study was to examine if clinical constructs related to OC symptoms and negative mood are best treated as separable or, alternatively, if these clinical constructs and negative mood are best seen as indicators of an underlying superordinate variable, as would be predicted by a motivational systems approach. A sample of 370 student participants completed measures of mood and the clinical constructs of inflated responsibility, intolerance of uncertainty, not just right experiences, and checking stop rules. An exploratory factor analysis suggested two plausible factor structures, one where all construct items and negative mood items loaded onto one underlying superordinate variable, and a second structure comprising of five factors, where each item loaded onto a factor representative of what the item was originally intended to measure. A confirmatory factor analysis showed that the five factor model was preferential to the one factor model, suggesting the four constructs and negative mood are best conceptualized as separate variables. Given the predictions of a motivational systems approach were not supported in the current study, other possible explanations for the causal interrelatedness between clinical constructs and negative mood are discussed. PMID:28959224

  7. Cancer-Related Direct-to-Consumer Advertising: Awareness, Perceptions, and Reported Impact Among Patients Undergoing Active Cancer Treatment

    PubMed Central

    Abel, Gregory A.; Burstein, Harold J.; Hevelone, Nathanael D.; Weeks, Jane C.

    2009-01-01

    Purpose Although cancer-related direct-to-consumer advertising (CR-DTCA) is prevalent, little is known about cancer patients' experiences with this controversial medium of medical communication. Methods We administered a 41-item, mailed questionnaire to consecutive patients with breast and hematologic malignancies who were undergoing active treatment at our institution. We assessed awareness of CR-DTCA within the prior year, perceptions of CR-DTCA, and CR-DTCA–prompted patient and provider behaviors. Results We received 348 completed questionnaires (response rate, 75.0%). Overall, 86.2% reported being aware of CR-DTCA, most frequently from television (77.7%). Awareness did not vary with clinical or sociodemographic factors except that patients were more likely to be aware of CR-DTCA for products specific to their cancer types (P < .0001). A majority of those aware reported that CR-DTCA made them “aware of treatments they did not know about” (62.2%), provided information in “a balanced manner” (65.2%), and helped them to have “better discussions” with their provider (56.8%). These perceptions were significantly more favorable among those who had not graduated from college (P < .05 for each). Overall, 11.2% reported that CR-DTCA made them “less confident” in their providers' judgment. Of those aware, 17.3% reported talking to their provider about an advertised medication, although less than one fifth of those reported receiving a prescription for the advertised medication. Conclusion The patients in our cohort were highly aware of CR-DTCA. CR-DTCA was found to be accessible and useful; however, it decreased some patients' confidence in their providers' judgment. CR-DTCA prompted a modest amount of patient-provider discussion but infrequent patient-reported changes in therapy. PMID:19652071

  8. Career decisions and the structure of training: an American Board Of Colon and Rectal Surgery survey of colorectal residents.

    PubMed

    Schmitz, Constance C; Rothenberger, David A; Trudel, Judith L; Wolff, Bruce G

    2009-07-01

    To investigate potential impacts of restructuring general surgery training on colorectal (CR) surgery recruitment and expertise. In response to the American Surgical Association Blue Ribbon Committee report on surgical education (2004), the American Board of Colon and Rectal Surgery, working with the Accreditation Council for Graduate Medical Education and American Board of Surgery, established a committee (2006) to review residency training curricula and study new pathways to certification as a CR surgeon. To address concerns related to shortened general surgery residency, the American Board of Colon and Rectal Surgery committee surveyed recent, current, and entering CR residents on the timing and factors associated with their career choice and opinions regarding restructuring. A 10-item, online survey of 189 CR surgeons enrolled in the class years of 2005, 2006, and 2007 was administered and analyzed May to July 2007. One hundred forty-five CR residents responded (77%); results were consistent across class years and types of general surgery training program. Seventy percent of respondents had rotated onto a CR service by the end of their PGY-2 year. Most identified CR as a career interest in their PGY-3 or PGY-4 year. Overall interest in CR surgery, the influence of CR mentors and teachers, and positive exposure to CR as PGY-3, PGY-4, or PGY-5 residents were the top cited factors influencing choice decisions. Respondents were opposed to restructuring by a 2:1 ratio, primarily because of concerns about inadequate training and lack of time to develop technical expertise. Shortening general surgery residency would not necessarily limit exposure to CR rotations and mentors unless such rotations are cut. The details of proposed restructuring are critical.

  9. Cancer-related direct-to-consumer advertising: awareness, perceptions, and reported impact among patients undergoing active cancer treatment.

    PubMed

    Abel, Gregory A; Burstein, Harold J; Hevelone, Nathanael D; Weeks, Jane C

    2009-09-01

    Although cancer-related direct-to-consumer advertising (CR-DTCA) is prevalent, little is known about cancer patients' experiences with this controversial medium of medical communication. We administered a 41-item, mailed questionnaire to consecutive patients with breast and hematologic malignancies who were undergoing active treatment at our institution. We assessed awareness of CR-DTCA within the prior year, perceptions of CR-DTCA, and CR-DTCA-prompted patient and provider behaviors. We received 348 completed questionnaires (response rate, 75.0%). Overall, 86.2% reported being aware of CR-DTCA, most frequently from television (77.7%). Awareness did not vary with clinical or sociodemographic factors except that patients were more likely to be aware of CR-DTCA for products specific to their cancer types (P < .0001). A majority of those aware reported that CR-DTCA made them "aware of treatments they did not know about" (62.2%), provided information in "a balanced manner" (65.2%), and helped them to have "better discussions" with their provider (56.8%). These perceptions were significantly more favorable among those who had not graduated from college (P < .05 for each). Overall, 11.2% reported that CR-DTCA made them "less confident" in their providers' judgment. Of those aware, 17.3% reported talking to their provider about an advertised medication, although less than one fifth of those reported receiving a prescription for the advertised medication. The patients in our cohort were highly aware of CR-DTCA. CR-DTCA was found to be accessible and useful; however, it decreased some patients' confidence in their providers' judgment. CR-DTCA prompted a modest amount of patient-provider discussion but infrequent patient-reported changes in therapy.

  10. Construction of a memory battery for computerized administration, using item response theory.

    PubMed

    Ferreira, Aristides I; Almeida, Leandro S; Prieto, Gerardo

    2012-10-01

    In accordance with Item Response Theory, a computer memory battery with six tests was constructed for use in the Portuguese adult population. A factor analysis was conducted to assess the internal structure of the tests (N = 547 undergraduate students). According to the literature, several confirmatory factor models were evaluated. Results showed better fit of a model with two independent latent variables corresponding to verbal and non-verbal factors, reproducing the initial battery organization. Internal consistency reliability for the six tests ranged from alpha = .72 to .89. IRT analyses (Rasch and partial credit models) yielded good Infit and Outfit measures and high precision for parameter estimation. The potential utility of these memory tasks for psychological research and practice will be discussed.

  11. Test-retest reliability and construct validity of the ENERGY-parent questionnaire on parenting practices, energy balance-related behaviours and their potential behavioural determinants: the ENERGY-project.

    PubMed

    Singh, Amika S; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Vik, Froydis N; van Lippevelde, Wendy; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; van der Sluijs, Maria; Terwee, Caroline; Brug, Johannes

    2012-08-13

    Insight into parental energy balance-related behaviours, their determinants and parenting practices is important to inform childhood obesity prevention. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. The objective of the current study was to examine the test-retest reliability and construct validity of the parent questionnaire used in the ENERGY-project, assessing parental energy balance-related behaviours, their determinants, and parenting practices among parents of 10-12 year old children. We collected data among parents (n = 316 in the test-retest reliability study; n = 109 in the construct validity study) of 10-12 year-old children in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent interview was assessed using ICC and percentage agreement. All but one item showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Construct validity appeared to be good to excellent for 92 out of 121 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. Of the other 29 items, construct validity was moderate for 24 and poor for 5 items. The reliability and construct validity of the items of the ENERGY-parent questionnaire on multiple energy balance-related behaviours, their potential determinants, and parenting practices appear to be good. Based on the results of the validity study, we strongly recommend adapting parts of the ENERGY-parent questionnaire if used in future research.
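
    The percentage-agreement criterion used in this record can be illustrated with a short sketch. The function names and the answer data are hypothetical; only the 75% threshold comes from the abstract, and the ICC calculation is not shown.

```python
def percent_agreement(first, second):
    """Percentage of paired answers that are identical across two administrations."""
    if len(first) != len(second) or not first:
        raise ValueError("both administrations need the same non-zero number of answers")
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return 100.0 * matches / len(first)


def meets_threshold(first, second, threshold=75.0):
    """True when agreement reaches the threshold (75% in the abstract)."""
    return percent_agreement(first, second) >= threshold


# Hypothetical answers from eight parents to one item, one week apart.
week_1 = [1, 2, 2, 3, 1, 4, 2, 3]
week_2 = [1, 2, 3, 3, 1, 4, 2, 2]

print(percent_agreement(week_1, week_2), meets_threshold(week_1, week_2))
```

    Percentage agreement is easy to read but does not correct for chance agreement, which is one reason it is usually reported alongside a coefficient such as the ICC.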

  12. A Psychometric Analysis of the Italian Version of the eHealth Literacy Scale Using Item Response and Classical Test Theory Methods.

    PubMed

    Diviani, Nicola; Dima, Alexandra Lelia; Schulz, Peter Johannes

    2017-04-11

    The eHealth Literacy Scale (eHEALS) is a tool to assess consumers' comfort and skills in using information technologies for health. Although evidence exists of reliability and construct validity of the scale, less agreement exists on structural validity. The aim of this study was to validate the Italian version of the eHealth Literacy Scale (I-eHEALS) in a community sample with a focus on its structural validity, by applying psychometric techniques that account for item difficulty. Two Web-based surveys were conducted among a total of 296 people living in the Italian-speaking region of Switzerland (Ticino). After examining the latent variables underlying the observed variables of the Italian scale via principal component analysis (PCA), fit indices for two alternative models were calculated using confirmatory factor analysis (CFA). The scale structure was examined via parametric and nonparametric item response theory (IRT) analyses accounting for differences between items regarding the proportion of answers indicating high ability. Convergent validity was assessed by correlations with theoretically related constructs. CFA showed a suboptimal model fit for both models. IRT analyses confirmed all items measure a single dimension as intended. Reliability and construct validity of the final scale were also confirmed. The contrasting results of factor analysis (FA) and IRT analyses highlight the importance of considering differences in item difficulty when examining health literacy scales. The findings support the reliability and validity of the translated scale and its use for assessing Italian-speaking consumers' eHealth literacy. ©Nicola Diviani, Alexandra Lelia Dima, Peter Johannes Schulz. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 11.04.2017.

  13. Construct and Differential Item Functioning in the Assessment of Prescription Opioid Use Disorders among American Adolescents

    ERIC Educational Resources Information Center

    Wu, Li-Tzy; Ringwalt, Christopher L.; Yang, Chongming; Reeve, Bryce B.; Pan, Jeng-Jong; Blazer, Dan G.

    2009-01-01

    DSM-IV's hierarchical distinction between abuse of and dependence on prescription opioids is not supported since the symptoms of abuse in adolescents are not less severe than dependence. The finding is based on the examination of the DSM-IV criteria for opioid use disorders using item response theory.

  14. The tinnitus functional index: development of a new clinical measure for chronic, intrusive tinnitus.

    PubMed

    Meikle, Mary B; Henry, James A; Griest, Susan E; Stewart, Barbara J; Abrams, Harvey B; McArdle, Rachel; Myers, Paula J; Newman, Craig W; Sandridge, Sharon; Turk, Dennis C; Folmer, Robert L; Frederick, Eric J; House, John W; Jacobson, Gary P; Kinney, Sam E; Martin, William H; Nagler, Stephen M; Reich, Gloria E; Searchfield, Grant; Sweetow, Robert; Vernon, Jack A

    2012-01-01

    Chronic subjective tinnitus is a prevalent condition that causes significant distress to millions of Americans. Effective tinnitus treatments are urgently needed, but evaluating them is hampered by the lack of standardized measures that are validated for both intake assessment and evaluation of treatment outcomes. This work was designed to develop a new self-report questionnaire, the Tinnitus Functional Index (TFI), that would have documented validity both for scaling the severity and negative impact of tinnitus for use in intake assessment and for measuring treatment-related changes in tinnitus (responsiveness) and that would provide comprehensive coverage of multiple tinnitus severity domains. To use preexisting knowledge concerning tinnitus-related problems, an Item Selection Panel (17 expert judges) surveyed the content (175 items) of nine widely used tinnitus questionnaires. From those items, the Panel identified 13 separate domains of tinnitus distress and selected 70 items most likely to be responsive to treatment effects. Eliminating redundant items while retaining good content validity and adding new items to achieve the recommended minimum of 3 to 4 items per domain yielded 43 items, which were then used for constructing TFI Prototype 1. Prototype 1 was tested at five clinics. The 326 participants included consecutive patients receiving tinnitus treatment who provided informed consent, constituting a convenience sample. Construct validity of Prototype 1 as an outcome measure was evaluated by measuring responsiveness of the overall scale and its individual items at 3 and 6 mo follow-up with 65 and 42 participants, respectively. Using a predetermined list of criteria, the 30 best-functioning items were selected for constructing TFI Prototype 2. Prototype 2 was tested at four clinics with 347 participants, including 155 and 86 who provided 3 and 6 mo follow-up data, respectively. Analyses were the same as for Prototype 1. Results were used to select the 25 best-functioning items for the final TFI. Both prototypes and the final TFI displayed strong measurement properties, with few missing data, high validity for scaling of tinnitus severity, and good reliability. All TFI versions exhibited the same eight factors characterizing tinnitus severity and negative impact. Responsiveness, evaluated by computing effect sizes for responses at follow-up, was satisfactory in all TFI versions. In the final TFI, Cronbach's alpha was 0.97 and test-retest reliability 0.78. Convergent validity (r = 0.86 with Tinnitus Handicap Inventory [THI]; r = 0.75 with Visual Analog Scale [VAS]) and discriminant validity (r = 0.56 with Beck Depression Inventory-Primary Care [BDI-PC]) were good. The final TFI was successful at detecting improvement from the initial clinic visit to 3 mo with moderate to large effect sizes and from initial to 6 mo with large effect sizes. Effect sizes for the TFI were generally larger than those obtained for the VAS and THI. After careful evaluation, a 13-point reduction was considered a preliminary criterion for meaningful reduction in TFI outcome scores. The TFI should be useful in both clinical and research settings because of its responsiveness to treatment-related change, validity for scaling the overall severity of tinnitus, and comprehensive coverage of multiple domains of tinnitus severity.

  15. Development of the Sexual Minority Adolescent Stress Inventory

    PubMed Central

    Schrager, Sheree M.; Goldbach, Jeremy T.; Mamey, Mary Rose

    2018-01-01

    Although construct measurement is critical to explanatory research and intervention efforts, rigorous measure development remains a notable challenge. For example, though the primary theoretical model for understanding health disparities among sexual minority (e.g., lesbian, gay, bisexual) adolescents is minority stress theory, nearly all published studies of this population rely on minority stress measures with poor psychometric properties and development procedures. In response, we developed the Sexual Minority Adolescent Stress Inventory (SMASI) with N = 346 diverse adolescents ages 14–17, using a comprehensive approach to de novo measure development designed to produce a measure with desirable psychometric properties. After exploratory factor analysis on 102 candidate items informed by a modified Delphi process, we applied item response theory techniques to the remaining 72 items. Discrimination and difficulty parameters and item characteristic curves were estimated overall, within each of 12 initially derived factors, and across demographic subgroups. Two items were removed for excessive discrimination and three were removed following reliability analysis. The measure demonstrated configural and scalar invariance for gender and age; a three-item factor was excluded for demonstrating substantial differences by sexual identity and race/ethnicity. The final 64-item measure comprised 11 subscales and demonstrated excellent overall (α = 0.98), subscale (α range 0.75–0.96), and test–retest (scale r > 0.99; subscale r range 0.89–0.99) reliabilities. Subscales represented a mix of proximal and distal stressors, including domains of internalized homonegativity, identity management, intersectionality, and negative expectancies (proximal) and social marginalization, family rejection, homonegative climate, homonegative communication, negative disclosure experiences, religion, and work domains (distal). Thus, the SMASI development process illustrates a method to incorporate information from multiple sources, including item response theory models, to guide item selection in building a psychometrically sound measure. We posit that similar methods can be used to improve construct measurement across all areas of psychological research, particularly in areas where a strong theoretical framework exists but existing measures are limited. PMID:29599737

  16. Exploratory Item Classification Via Spectral Graph Clustering

    PubMed Central

    Chen, Yunxiao; Li, Xiaoou; Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang

    2017-01-01

    Large-scale assessments are supported by a large item pool. An important task in test development is to assign items into scales that measure different characteristics of individuals, and a popular approach is cluster analysis of items. Classical methods in cluster analysis, such as the hierarchical clustering, K-means method, and latent-class analysis, often induce a high computational overhead and have difficulty handling missing data, especially in the presence of high-dimensional responses. In this article, the authors propose a spectral clustering algorithm for exploratory item cluster analysis. The method is computationally efficient, effective for data with missing or incomplete responses, easy to implement, and often outperforms traditional clustering algorithms in the context of high dimensionality. The spectral clustering algorithm is based on graph theory, a branch of mathematics that studies the properties of graphs. The algorithm first constructs a graph of items, characterizing the similarity structure among items. It then extracts item clusters based on the graphical structure, grouping similar items together. The proposed method is evaluated through simulations and an application to the revised Eysenck Personality Questionnaire. PMID:29033476
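
    The two steps named in this abstract (build a similarity graph of items, then extract clusters from the graph) can be illustrated with a minimal sketch. It is not the authors' algorithm: the similarity matrix below is invented, the function name fiedler_bipartition is hypothetical, and the sketch only performs a two-way split using the unnormalized graph Laplacian and plain power iteration.

```python
import math

def fiedler_bipartition(sim, iters=2000):
    """Split items into two clusters by the sign of the Fiedler vector.

    sim: symmetric matrix (list of lists) of non-negative item similarities.
    Illustrative only; assumes the graph is connected.
    """
    n = len(sim)
    # Graph Laplacian L = D - W, ignoring self-similarity on the diagonal
    deg = [sum(sim[i][j] for j in range(n) if j != i) for i in range(n)]
    lap = [[deg[i] if i == j else -sim[i][j] for j in range(n)] for i in range(n)]
    # Power iteration on (c*I - L) converges to the smallest eigenvectors of L;
    # c = 2 * max degree is an upper bound on the largest eigenvalue of L.
    c = 2.0 * max(deg) + 1e-9
    v = [math.sin(i + 1.0) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        # Project out the constant vector (the trivial eigenvector of L)
        mean = sum(v) / n
        v = [x - mean for x in v]
        w = [c * v[i] - sum(lap[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mean = sum(v) / n
    v = [x - mean for x in v]
    return [0 if x < 0 else 1 for x in v]

# Made-up similarities: items 0-2 resemble each other, as do items 3-5
sim = [[0.9 if (i < 3) == (j < 3) else 0.1 for j in range(6)] for i in range(6)]
labels = fiedler_bipartition(sim)
```

    Which cluster is labelled 0 or 1 is arbitrary; only the grouping matters. A practical analysis would more likely use several eigenvectors and a dedicated linear algebra library.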

  17. Exploring the Validity of the Affect Balance Scale With a Sample of Family Caregivers

    PubMed Central

    Perkinson, Margaret A.; Albert, Steven M.; Luborsky, Mark; Moss, Miriam; Glicksman, Allen

    2014-01-01

    Open-ended responses of caregiving daughters and daughters-in-law were generated by a modified random probe technique to investigate the construct validity of the two subscales of the Affect Balance Scale (ABS), i.e., the 5-item Positive Affect Scale (PAS) and the 5-item Negative Affect Scale (NAS). A set of criteria were developed to distinguish between responses that did and did not correspond to Bradburn’s assumptions concerning affect. While most responses met at least one of the criteria, very few met all. In exploring the nature of affect, we found that positive affect was based to a large extent on personal accomplishments and the recognition of others. The assessment of negative affect was a more interior, or self-focused process. For a significant subset of the sample, a negative response to a closed-ended PAS or NAS item implied disagreement or discontent with the wording or the implications of the item itself, rather than an absence of affect. Not all of the ABS items were equally valid measures of affect. PMID:8056955

  18. Toward a Measure of Accountability in Nursing: A Three-Stage Validation Study.

    PubMed

    Drach-Zahavy, Anat; Leonenko, Marina; Srulovici, Einav

    2018-06-04

    To develop and psychometrically evaluate a three-dimensional questionnaire suitable for evaluating personal and organizational accountability in nurses. Accountability is defined as a three-dimensional value, directing professionals to take responsibility for their decisions and actions, to be willing to explain them (transparency) and to be judged according to society's accepted values (answerability). Despite the relatively clear definition, measurement of accountability lags well behind. Existing self-report questionnaires do not fully capture the complexity of the concept; nor do they capture the different sources of accountability (e.g., personal accountability, organizational accountability). A three-stage measure development. Data were collected during 2015-2016. In Phase 1, an initial database of items (N = 74) was developed, based on literature review and qualitative study, establishing face and content validity. In Phase 2, the face, content, construct and criterion-related validity of the initial questionnaires (19 items for personal and organizational accountability questionnaire) was established with a sample of 229 nurses. In Phase 3, the final questionnaires (19 items each) were validated with a new sample of 329 nurses and established construct validity. The final version of the instruments comprised 19 items, suitable for assessing personal and organizational accountability. The questionnaire referred to the dimensions of responsibility, transparency and answerability. The findings established the instrument's content, construct and criterion-related validity, as well as good internal reliability. The questionnaire portrays accountability in nursing, by capturing nurses' subjective perceptions of accountability dimensions (responsibility, transparency, answerability), as demonstrated by personal and organizational values. This article is protected by copyright. All rights reserved.

  19. Item Selection, Evaluation, and Simple Structure in Personality Data

    PubMed Central

    Pettersson, Erik; Turkheimer, Eric

    2010-01-01

    We report an investigation of the genesis and interpretation of simple structure in personality data using two very different self-reported data sets. The first consists of a set of relatively unselected lexical descriptors, whereas the second is based on responses to a carefully constructed instrument. In both data sets, we explore the degree of simple structure by comparing factor solutions to solutions from simulated data constructed to have either strong or weak simple structure. The analysis demonstrates that there is little evidence of simple structure in the unselected items, and a moderate degree among the selected items. In both instruments, however, much of the simple structure that could be observed originated in a strong dimension of positive vs. negative evaluation. PMID:20694168

  20. Two objective measures of self-esteem.

    PubMed

    Lorr, M; Wunderlich, R A

    1986-01-01

    Two scales were constructed to assess self-esteem, conceptualized as reflecting (a) feelings of competence and efficacy, and (b) perceived positive appraisal from significant others. To control for response bias a paired choice format was chosen for the items constructed. A buffer scale designed to measure social assertiveness was also included. Data were collected on three samples of high school boys. The item intercorrelations were subjected to principal component analyses followed by Varimax rotations. In each of the three analyses factors of Confidence, Popularity (Social Approval), and Social Assertiveness emerged. The revised self-esteem scales, each defined by 11 items, have been shown to have acceptable reliability and some concurrent validity based on correlations with the well-known Rosenberg Self-Esteem Scale.

  1. The Utrecht questionnaire (U-CEP) measuring knowledge on clinical epidemiology proved to be valid.

    PubMed

    Kortekaas, Marlous F; Bartelink, Marie-Louise E L; de Groot, Esther; Korving, Helen; de Wit, Niek J; Grobbee, Diederick E; Hoes, Arno W

    2017-02-01

    Knowledge on clinical epidemiology is crucial to practice evidence-based medicine. We describe the development and validation of the Utrecht questionnaire on knowledge on Clinical epidemiology for Evidence-based Practice (U-CEP); an assessment tool to be used in the training of clinicians. The U-CEP was developed in two formats: two sets of 25 questions and a combined set of 50. The validation was performed among postgraduate general practice (GP) trainees, hospital trainees, GP supervisors, and experts. Internal consistency, internal reliability (item-total correlation), item discrimination index, item difficulty, content validity, construct validity, responsiveness, test-retest reliability, and feasibility were assessed. The questionnaire was externally validated. Internal consistency was good with a Cronbach alpha of 0.8. The median item-total correlation and mean item discrimination index were satisfactory. Both sets were perceived as relevant to clinical practice. Construct validity was good. Both sets were responsive but failed on test-retest reliability. One set took 24 minutes and the other 33 minutes to complete, on average. External GP trainees had comparable results. The U-CEP is a valid questionnaire to assess knowledge on clinical epidemiology, which is a prerequisite for practicing evidence-based medicine in daily clinical practice. Copyright © 2016 Elsevier Inc. All rights reserved.
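
    Several of the classical test theory statistics listed in this abstract (Cronbach alpha, item difficulty, item-total correlation) follow from simple formulas. The sketch below is a minimal illustration on an invented 5-respondent, 4-item response matrix; the function names are hypothetical and none of the numbers come from the U-CEP study.

```python
import math

def sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def cronbach_alpha(scores):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = len(scores[0])
    item_vars = [sample_variance([row[j] for row in scores]) for j in range(k)]
    total_var = sample_variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def item_difficulty(scores, j):
    """Proportion of respondents answering item j correctly (0/1 scoring)."""
    return sum(row[j] for row in scores) / len(scores)

def corrected_item_total(scores, j):
    """Correlation of item j with the total score of the remaining items."""
    item = [row[j] for row in scores]
    rest = [sum(row) - row[j] for row in scores]
    return pearson(item, rest)

# Invented data: rows are respondents, columns are items scored 0/1
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(scores)
difficulties = [item_difficulty(scores, j) for j in range(4)]
```

    The sketch uses sample variances throughout; other conventions for alpha exist, and the abstract does not say which the authors applied.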

  2. Constructing Multiple-Choice Items to Measure Higher-Order Thinking

    ERIC Educational Resources Information Center

    Scully, Darina

    2017-01-01

    Across education, certification and licensure, there are repeated calls for the development of assessments that target "higher-order thinking," as opposed to mere recall of facts. A common assumption is that this necessitates the use of constructed response or essay-style test questions; however, empirical evidence suggests that this may…

  3. A Non-Parametric Item Response Theory Evaluation of the CAGE Instrument Among Older Adults.

    PubMed

    Abdin, Edimansyah; Sagayadevan, Vathsala; Vaingankar, Janhavi Ajit; Picco, Louisa; Chong, Siow Ann; Subramaniam, Mythily

    2018-02-23

    The validity of the CAGE using item response theory (IRT) has not yet been examined in an older adult population. This study aims to investigate the psychometric properties of the CAGE using both non-parametric and parametric IRT models, assess whether there is any differential item functioning (DIF) by age, gender and ethnicity and examine the measurement precision at the cut-off scores. We used data from the Well-being of the Singapore Elderly study to conduct Mokken scaling analysis (MSA), dichotomous Rasch and 2-parameter logistic IRT models. The measurement precision at the cut-off scores was evaluated using classification accuracy (CA) and classification consistency (CC). The MSA showed the overall scalability H index was 0.459, indicating a medium performing instrument. All items were found to be homogeneous, measuring the same construct and able to discriminate well between respondents with high levels of the construct and the ones with lower levels. The item discrimination ranged from 1.07 to 6.73 while the item difficulty ranged from 0.33 to 2.80. Significant DIF was found for two items across ethnic groups. More than 90% (CC and CA ranged from 92.5% to 94.3%) of the respondents were consistently and accurately classified by the CAGE cut-off scores of 2 and 3. The current study provides new evidence on the validity of the CAGE from the IRT perspective. This study provides valuable information on each item in the assessment of the overall severity of alcohol problems and the precision of the cut-off scores in an older adult population.
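
    The discrimination and difficulty values reported here are the two parameters of the 2-parameter logistic (2PL) model. As a minimal sketch of what they mean, the function below evaluates the standard 2PL item response function; the function name is hypothetical, and pairing the reported range endpoints on a single item is an assumption made purely for illustration.

```python
import math

def p_endorse_2pl(theta, a, b):
    """2PL item response function: P(endorse | theta) = 1 / (1 + exp(-a * (theta - b))).

    theta: latent trait level, a: discrimination, b: difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Range endpoints from the abstract, paired arbitrarily for illustration
low_a, high_a = 1.07, 6.73
b = 0.33

# At theta == b the endorsement probability is one half, whatever the discrimination
p_at_difficulty = p_endorse_2pl(b, low_a, b)

# Half a unit above the difficulty, the more discriminating item is endorsed more often
p_flat = p_endorse_2pl(b + 0.5, low_a, b)
p_steep = p_endorse_2pl(b + 0.5, high_a, b)
```

    A higher discrimination makes the curve steeper around the difficulty point, which is why highly discriminating items separate respondents sharply near one trait level.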

  4. Measuring Graph Comprehension, Critique, and Construction in Science

    NASA Astrophysics Data System (ADS)

    Lai, Kevin; Cabrera, Julio; Vitale, Jonathan M.; Madhok, Jacquie; Tinker, Robert; Linn, Marcia C.

    2016-08-01

    Interpreting and creating graphs plays a critical role in scientific practice. The K-12 Next Generation Science Standards call for students to use graphs for scientific modeling, reasoning, and communication. To measure progress on this dimension, we need valid and reliable measures of graph understanding in science. In this research, we designed items to measure graph comprehension, critique, and construction and developed scoring rubrics based on the knowledge integration (KI) framework. We administered the items to over 460 middle school students. We found that the items formed a coherent scale and had good reliability using both item response theory and classical test theory. The KI scoring rubric showed that most students had difficulty linking graph features to science concepts, especially when asked to critique or construct graphs. In addition, students with limited access to computers as well as those who speak a language other than English at home have less integrated understanding than others. These findings point to the need to increase the integration of graphing into science instruction. The results suggest directions for further research leading to comprehensive assessments of graph understanding.

  5. Evaluating Job Demands and Control Measures for Use in Farm Worker Health Surveillance.

    PubMed

    Alterman, Toni; Gabbard, Susan; Grzywacz, Joseph G; Shen, Rui; Li, Jia; Nakamoto, Jorge; Carroll, Daniel J; Muntaner, Carles

    2015-10-01

    Workplace stress likely plays a role in health disparities; however, applying standard measures to studies of immigrants requires thoughtful consideration. The goal of this study was to determine the appropriateness of two measures of occupational stressors ('decision latitude' and 'job demands') for use with mostly immigrant Latino farm workers. Cross-sectional data from a pilot module containing a four-item measure of decision latitude and a two-item measure of job demands were obtained from a subsample (N = 409) of farm workers participating in the National Agricultural Workers Survey. Responses to items for both constructs were clustered toward the low end of the structured response-set. Percentages of responses of 'very often' and 'always' for each of the items were examined by educational attainment, birth country, dominant language spoken, task, and crop. Cronbach's α, when stratified by subgroups of workers, for the decision latitude items were (0.65-0.90), but were less robust for the job demands items (0.25-0.72). The four-item decision latitude scale can be applied to occupational stress research with immigrant farm workers, and potentially other immigrant Latino worker groups. The short job demands scale requires further investigation and evaluation before suggesting widespread use.

  6. A central review of histopathology reports after breast cancer neoadjuvant chemotherapy in the neo-tango trial

    PubMed Central

    Provenzano, E; Vallier, A-L; Champ, R; Walland, K; Bowden, S; Grier, A; Fenwick, N; Abraham, J; Iddawela, M; Caldas, C; Hiller, L; Dunn, J; Earl, H M

    2013-01-01

    Background: Neo-tAnGo, a National Cancer Research Network (NCRN) multicentre randomised neoadjuvant chemotherapy trial in early breast cancer, enrolled 831 patients in the United Kingdom. We report a central review of post-chemotherapy histopathology reports on the surgical specimens, to assess the presence and degree of response. Methods: A central independent two-reader review (EP and HME) of histopathology reports from post-treatment surgical specimens was performed. The quality and completeness of pathology reporting across all centres were assessed. The reviews included pathological response to chemotherapy (pathological complete response (pCR); minimal residual disease (MRD); and lesser degrees of response), laterality, the number of axillary metastases and axillary nodes, and the type of surgery. A consensus was reached after discussion. Results: In all, 825 surgical reports from 816 patients were available for review. Out of 4125 data items there were 347 discrepant results (8.4% of classifications), which involved 281 patients. These involved grading of breast response (169 but only 9 involving pCR vs MRD); laterality (6); presence of axillary metastasis (35); lymph node counts (108); and type of axillary surgery (29). Excluding cases with pCR, only 45% of reports included any comment regarding response in the breast and 30% in the axillary lymph nodes. Conclusion: We found considerable variability in the completeness of reporting of surgical specimens within this national neoadjuvant breast cancer trial. This highlights the need for consensus guidelines among trial groups on histopathology reporting, and the participation of histopathologists throughout the development and analysis of neoadjuvant trials. PMID:23299526
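
    The discrepancy figures in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that each of the 825 reports contributed five reviewed data items, which is an inference from 4125 / 825 and from the five categories listed, not something the abstract states; the dictionary keys are shorthand labels chosen here.

```python
# Discrepancy counts per category, as listed in the abstract
discrepancies = {
    "breast response grading": 169,
    "laterality": 6,
    "axillary metastasis": 35,
    "lymph node counts": 108,
    "type of axillary surgery": 29,
}

reports = 825
items_per_report = 5  # assumption: inferred from 4125 / 825

total_items = reports * items_per_report
total_discrepant = sum(discrepancies.values())
discrepancy_rate_pct = round(100 * total_discrepant / total_items, 1)
```

    Under that assumption the category counts sum to the reported 347 discrepant results, and the rate matches the reported 8.4% of classifications.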

  7. Measuring Alexithymia via Trait Approach-I: A Alexithymia Scale Item Selection and Formation of Factor Structure

    PubMed Central

    TATAR, Arkun; SALTUKOĞLU, Gaye; ALİOĞLU, Seda; ÇİMEN, Sümeyye; GÜVEN, Hülya; AY, Çağla Ebru

    2017-01-01

    Introduction: It is not clear in the literature whether available instruments are sufficient to measure alexithymia because of its theoretical structure. Moreover, it has been reported that several measuring instruments are needed to measure this construct, and all the instruments have different error sources. The old and the new forms of Toronto Alexithymia Scale are the only instruments available in Turkish. Thus, the purpose of this study was to develop a new scale to measure alexithymia, selecting items and constructing the factor structure. Methods: A total of 1117 patients aged from 19 to 82 years (mean = 35.05 years) were included. A 100-item pool was prepared and applied to 628 women and 489 men. Data were analyzed using Explanatory Factor Analysis, Confirmatory Factor Analysis, and Item Response Theory and 28 items were selected. The new form of 28 items was applied to 415 university students, including 271 women and 144 men aged from 18 to 30 (mean = 21.44). Results: The results of Explanatory Factor Analysis revealed a five-factor construct of “Solving and Expressing Affective Experiences,” “External Locused Cognitive Style,” “Tendency to Somatize Affections,” “Imaginary Life and Visualization,” and “Acting Impulsively,” along with a two-factor construct representing the “Affective” and “Cognitive” components. All the components of the construct showed good model fit and high internal consistency. The new form was tested in terms of internal consistency, test-retest reliability, and concurrent validity using Toronto Alexithymia Scale as criteria and discriminative validity using Five-Factor Personality Inventory Short Form. Conclusion: The results showed that the new scale met the basic psychometric requirements. Results have been discussed in line with related studies. PMID:29033633

  8. Synthesis and characterization of transition metal clusters: From the isolation of ligand-stabilized solid fragments to the tuning of magnetic anisotropy and host-guest selectivity, and, Approaches to science teaching: Development of an observation instrument with a measurement model based on item response theory

    NASA Astrophysics Data System (ADS)

    Hee, Allan George

    Part I. The work presented herein describes efforts to develop general techniques for the synthesis of transition metal clusters and the manipulation of their properties. In Chapter 2, it is demonstrated that a modified metal atom reactor allows for the vaporization, passivation, and isolation of metal-chalcogenide clusters from their parent binary solids. Among the clusters produced by this method were Cr6S8(PEt3)6, Fe4S4(PEt3)4, Co6S8(PEt3)6, Cu6S4(PEt3)6, Cu12S6(PEt3)8, and Cu26Se13(PEt3)14. To create single-molecule magnets with higher demagnetization barriers, we are developing metal-cyanide systems which exhibit highly adjustable magnetic behavior. Chapter 3 reports an attempt to introduce magnetic anisotropy into a MnCr6 cluster. Replacement of CrIII with MoIII resulted in the assembly of K[(Me3tacn)6MnMo6(CN)18](ClO4)3 (Me3tacn = N,N',N″-trimethyl-1,4,7-triazacyclononane), the first well-documented example of a cyano-bridged single-molecule magnet. Recently, it was demonstrated that replacing Me3tacn with the less sterically hindering tach (tach = cis,cis-1,3,5-triaminocyclohexane) in the face-centered cubic cluster [(tach)8Cr8Ni6(CN)24]Br12 provides greater access to the cluster cavity. Chapter 4 describes my efforts to probe the selectivity of this cluster toward inclusion of various guests. Part II. Successful implementation of student-centered curricula reforms requires the creation of a measurement instrument for monitoring whether the curricula are being used as intended. The creation and development of an observation instrument would greatly contribute to this effort. To develop a theoretically sound construct map, it is necessary to review the literature and conduct our own investigations of approaches to science teaching. Chapter 2 presents the findings of these investigations and their contributions to our understanding of the construct. Using these findings, the Science Teaching Observation Protocol (STOP) was created and designed to measure two subconstructs: intentions and strategies. Chapter 3 details the first pilot test of STOP and analysis of the collected data. In Chapter 4, the theoretical shortcomings of the instrument are analyzed and discussed. Modified versions of the intention and strategy subconstruct maps are presented.

  9. Rasch-built Overall Disability Scale for Multifocal motor neuropathy (MMN-RODS©).

    PubMed

    Vanhoutte, Els K; Faber, Catharina G; van Nes, Sonja I; Cats, Elisabeth A; Van der Pol, W-Ludo; Gorson, Kenneth C; van Doorn, Pieter A; Cornblath, David R; van den Berg, Leonard H; Merkies, Ingemar S J

    2015-09-01

    Clinical trials in multifocal motor neuropathy (MMN) have often used ordinal-based measures that may not accurately capture changes. We aimed to construct a disability interval outcome measure specifically for MMN using the Rasch model and to examine its clinimetric properties. A total of 146 preliminary activity and participation items were assessed twice (reliability studies) in 96 clinically stable MMN patients. These patients also assessed the ordinal-based overall disability sum score (construct, sample-dependent validity). The final Rasch-built overall disability scale for MMN (MMN-RODS©) was serially applied in 26 patients with newly diagnosed or relapsing MMN, treated with intravenous immunoglobulin (IVIg) (1-year follow-up; responsiveness study). The magnitude of change for each patient was calculated using the minimum clinically important difference technique related to the individually obtained standard errors. A total of 121 items not fulfilling Rasch requirements were removed. The final 25-item MMN-RODS© fulfilled all Rasch model expectations and showed acceptable reliability and validity including good discriminatory capacity. Most serially examined patients improved, but the magnitude of improvement was low, reflecting poor responsiveness. The constructed MMN-RODS© is a disease-specific, interval measure to detect activity limitations in patients with MMN and overcomes the shortcomings of ordinal scales. However, future clinimetric studies are needed to improve the responsiveness of the MMN-RODS© by longer observations and/or more rigorous treatment regimens. © 2015 Peripheral Nerve Society.

  10. Measuring the Quality of Life of Visually Impaired Children: First Stage Psychometric Evaluation of the Novel VQoL_CYP Instrument.

    PubMed

    Tadić, Valerija; Cooper, Andrew; Cumberland, Phillippa; Lewando-Hundt, Gillian; Rahi, Jugnoo S

    2016-01-01

    To report piloting and initial validation of the VQoL_CYP, a novel age-appropriate vision-related quality of life (VQoL) instrument for self-reporting by children with visual impairment (VI). Participants were a random patient sample of children with VI aged 10-15 years. 69 patients, drawn from patient databases at Great Ormond Street Hospital and Moorfields Eye Hospital, United Kingdom, participated in piloting of the draft 47-item VQoL instrument, which enabled preliminary item reduction. Subsequent administration of the instrument, alongside functional vision (FV) and generic health-related quality of life (HRQoL) self-report measures, to 101 children with VI comprising a nationally representative sample enabled further item reduction and evaluation of psychometric properties using Rasch analysis. Construct validity was assessed through Pearson correlation coefficients. Item reduction through piloting (8 items removed for skewness and individual item response pattern) and validation (1 item removed for skewness and 3 for misfit in Rasch) produced a 35-item scale, with fit values within acceptable limits, no notable differential item functioning, good measurement precision, ordered response categories and acceptable targeting in Rasch. The VQoL_CYP showed good construct validity, correlating strongly with HRQoL scores, moderately with FV scores but not with acuity. Robust child-appropriate self-report VQoL measures for children with VI are necessary for understanding the broader impacts of living with a visual disability, distinguishing these from limited functioning per se. Future planned use in larger patient samples will allow further psychometric development of the VQoL_CYP as an adjunct to objective outcomes assessment.

  11. A New Tool for Nutrition App Quality Evaluation (AQEL): Development, Validation, and Reliability Testing

    PubMed Central

    Huang, Wenhao; Chapman-Novakofski, Karen M

    2017-01-01

    Background The extensive availability and increasing use of mobile apps for nutrition-based health interventions makes evaluation of the quality of these apps crucial for integration of apps into nutritional counseling. Objective The goal of this research was the development, validation, and reliability testing of the app quality evaluation (AQEL) tool, an instrument for evaluating apps’ educational quality and technical functionality. Methods Items for evaluating app quality were adapted from website evaluations, with additional items added to evaluate the specific characteristics of apps, resulting in 79 initial items. Expert panels of nutrition and technology professionals and app users reviewed items for face and content validation. After recommended revisions, nutrition experts completed a second AQEL review to ensure clarity. On the basis of 150 sets of responses using the revised AQEL, principal component analysis was completed, reducing AQEL into 5 factors that underwent reliability testing, including internal consistency, split-half reliability, test-retest reliability, and interrater reliability (IRR). Two additional modifiable constructs for evaluating apps based on the age and needs of the target audience as selected by the evaluator were also tested for construct reliability. IRR testing using intraclass correlations (ICC) with all 7 constructs was conducted, with 15 dietitians evaluating one app. Results Development and validation resulted in the 51-item AQEL. These were reduced to 25 items in 5 factors after principal component analysis, plus 9 modifiable items in two constructs that were not included in principal component analysis. Internal consistency and split-half reliability of the following constructs derived from principal components analysis was good (Cronbach alpha >.80, Spearman-Brown coefficient >.80): behavior change potential, support of knowledge acquisition, app function, and skill development. 
App purpose split half-reliability was .65. Test-retest reliability showed no significant change over time (P>.05) for all but skill development (P=.001). Construct reliability was good for items assessing age appropriateness of apps for children, teens, and a general audience. In addition, construct reliability was acceptable for assessing app appropriateness for various target audiences (Cronbach alpha >.70). For the 5 main factors, ICC (1,k) was >.80, with a P value of <.05. When 15 nutrition professionals evaluated one app, ICC (2,15) was .98, with a P value of <.001 for all 7 constructs when the modifiable items were specified for adults seeking weight loss support. Conclusions Our preliminary effort shows that AQEL is a valid, reliable instrument for evaluating nutrition apps’ qualities for clinical interventions by nutrition clinicians, educators, and researchers. Further efforts in validating AQEL in various contexts are needed. PMID:29079554
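
    The internal-consistency statistics quoted in this abstract (Cronbach alpha, Spearman-Brown coefficient) can be illustrated with a minimal sketch. The function names and the toy data layout (one list of item scores per respondent) are assumptions made for illustration; this is not the AQEL authors' analysis code.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`: one list of k item scores per respondent."""
    k = len(rows[0])
    # Variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Variance of the summed scale score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def spearman_brown(r_half):
    """Step a correlation between two test halves up to full-length reliability."""
    return 2 * r_half / (1 + r_half)
```

    Under these definitions, perfectly consistent items give an alpha of 1, and a half-test correlation of 0.5 steps up to about 0.67.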

  12. Development of a Self-Determination Measure for College Students: Validity Evidence for the Basic Needs Satisfaction at College Scale

    ERIC Educational Resources Information Center

    Jenkins-Guarnieri, Michael A.; Vaughan, Angela L.; Wright, Stephen L.

    2015-01-01

    We adapted a work self-determination measure to create the Basic Needs Satisfaction at College Scale. Confirmatory factor analysis and item response theory analyses with data from 525 adults supported a 3-factor model with 13 items most sensitive for lower to middle range levels of the autonomy, competence, and relatedness constructs.

  13. Automated Scoring for the "TOEFL Junior"® Comprehensive Writing and Speaking Test. Research Report. ETS RR-15-09

    ERIC Educational Resources Information Center

    Evanini, Keelan; Heilman, Michael; Wang, Xinhao; Blanchard, Daniel

    2015-01-01

    This report describes the initial automated scoring results that were obtained using the constructed responses from the Writing and Speaking sections of the pilot forms of the "TOEFL Junior"® Comprehensive test administered in late 2011. For all of the items except one (the edit item in the Writing section), existing automated scoring…

  14. The Communicative Participation Item Bank (CPIB): Item bank calibration and development of a disorder-generic short form

    PubMed Central

    Baylor, Carolyn; Yorkston, Kathryn; Eadie, Tanya; Kim, Jiseon; Chung, Hyewon; Amtmann, Dagmar

    2015-01-01

    Purpose The purpose of this study was to calibrate the items for the Communicative Participation Item Bank (CPIB) using Item Response Theory (IRT). One overriding objective was to examine if the IRT item parameters would be consistent across different diagnostic groups, thereby allowing creation of a disorder-generic instrument. The intended outcomes were the final item bank and a short form ready for clinical and research applications. Methods Self-report data were collected from 701 individuals representing four diagnoses: multiple sclerosis, Parkinson’s disease, amyotrophic lateral sclerosis and head and neck cancer. Participants completed the CPIB and additional self-report questionnaires. CPIB data were analyzed using the IRT Graded Response Model (GRM). Results The initial set of 94 candidate CPIB items were reduced to an item bank of 46 items demonstrating unidimensionality, local independence, good item fit, and good measurement precision. Differential item function (DIF) analyses detected no meaningful differences across diagnostic groups. A 10-item, disorder-generic short form was generated. Conclusions The CPIB provides speech-language pathologists with a unidimensional, self-report outcomes measurement instrument dedicated to the construct of communicative participation. This instrument may be useful to clinicians and researchers wanting to implement measures of communicative participation in their work. PMID:23816661
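
    For readers unfamiliar with the Graded Response Model named above, the following is a minimal sketch of how it turns a latent trait value into response-category probabilities. The parameter names (`a` for discrimination, `thresholds` for the ordered category boundaries) are illustrative assumptions, not values or code from the CPIB calibration.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under Samejima's graded response model.

    P(X >= k) is a logistic function of a * (theta - b_k); each category
    probability is the difference of adjacent cumulative probabilities.
    `thresholds` must be in increasing order.
    """
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]
```

    An item with m thresholds yields m + 1 category probabilities that sum to 1.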

  15. Older adults' drug benefit beliefs: construct definition and measure development.

    PubMed

    Cline, Richard R; Gupta, Kiran; Singh, Reshmi L

    2008-03-01

    The Medicare Prescription Drug, Improvement and Modernization Act of 2003 provides coverage of outpatient prescription drugs for Medicare beneficiaries. Although much has been learned since the program's implementation, a context within which this information can be understood is lacking. The purpose of this study was to develop a reliable and valid multi-item instrument measuring beliefs about Medicare prescription drug benefits. Survey items were generated using focus group transcripts, other surveys on the Medicare Part "D" program, and past studies of choice and satisfaction in drug insurance programs. Using data from the survey pilot test, item and reliability analyses were used to reduce and refine an initial pool of items. Data then were collected from a cross-sectional, mail survey of older adults living in Minnesota. Data were analyzed using exploratory factor analysis. Summated rating scales then were constructed and assessed further using reliability analyses. Construct validity of summated scales was examined by comparing scale scores across response categories of survey items that collected information on general political attitudes, perceptions of the Medicare Part "D" program, health status, and health care utilization and demographics. The adjusted response rate for the main survey was 55.98% (744/1329). Iterative factor analysis produced 2 interpretable scales. The first, termed "access/equity" (13 items, Cronbach's alpha=0.89) measures beliefs that a Medicare drug benefit should both provide affordable prescription drugs for beneficiaries and do this in a manner that is equitable for all participants. The second, termed "comprehensibility" (6 items, Cronbach's alpha=0.80) assesses beliefs that regulations governing a Medicare drug benefit should be easily understood. Discriminant validity tests suggest that these measures behave in a manner consistent with related research in these areas. 
Measures of 2 facets of older adults' drug benefit beliefs were developed using a multiple step procedure. Future research could focus on developing a better understanding of other facets of these beliefs and sound methods of measurement.

  16. [Construction of a physiological aging scale for healthy people based on a modified Delphi method].

    PubMed

    Long, Yao; Zhou, Xuan; Deng, Pengfei; Liao, Xiong; Wu, Lei; Zhou, Jianming; Huang, Helang

    2016-04-01

    To build a physiological aging scale for healthy people. We collected age-related physiologic items through literature screening and expert interviews. Two rounds of Delphi consultation were implemented. The importance, feasibility, and degree of authority for the physiological index system were graded. Using the analytic hierarchy process, we determined the weights of dimensions and items. Using the Delphi method, 17 physiological and other professional experts produced the following results: the coefficient of expert authority (Cr) was 0.86±0.03, and the coordination coefficients for the first and second rounds were 0.264 (χ2=229.691, P<0.001) and 0.293 (χ2=228.474, P<0.001), respectively. The consistency was good. The aging scale for healthy people included 3 dimensions, namely physical form, feeling movement and functional status. Each dimension had 8 items. The weight coefficients for the 3 dimensions were 0.54, 0.16, and 0.30, respectively. The Cronbach's α coefficient of the scale was 0.893, the reliability was 0.796, and the variance of the common factor was 58.17%. The physiological aging scale built with the modified Delphi method is satisfactory and can provide a reference for the evaluation of aging.

  17. Construction of a Cr3C2-C Peritectic Point Cell for Thermocouple Calibration

    NASA Astrophysics Data System (ADS)

    Ogura, Hideki; Deuze, Thierry; Morice, Ronan; Ridoux, Pascal; Filtz, Jean-Remy

    The melting points of Cr3C2-C peritectic (1826°C) and Cr7C3-Cr3C2 eutectic (1742°C) alloys as materials for high-temperature fixed point cells are investigated for the use of thermocouple calibration. Pretests are performed to establish a suitable procedure for constructing contact thermometry cells based on such chromium-carbon mixtures. Two cells are constructed following two different possible procedures. The above two melting points are successfully observed for one of these cells using tungsten-rhenium alloy thermocouples.

  18. Psychometric Evaluation of the Ford Insomnia Response to Stress Test (FIRST) in Early Pregnancy.

    PubMed

    Gelaye, Bizu; Zhong, Qiu-Yue; Barrios, Yasmin V; Redline, Susan; Drake, Christopher L; Williams, Michelle A

    2016-04-15

    To evaluate the construct validity and factor structure of the Spanish-language version of the Ford Insomnia Response to Stress Test questionnaire (FIRST-S) when used in early pregnancy. A cohort of 647 women were interviewed at ≤ 16 weeks of gestation to collect information regarding lifestyle, demographic, and sleep characteristics. The factorial structure of the FIRST-S was tested through exploratory and confirmatory factor analyses (EFA and CFA). Internal consistency and construct validity were also assessed by evaluating the association between the FIRST-S with symptoms of depression, anxiety, and sleep quality. Item response theory (IRT) analyses were conducted to complement classical test theory (CTT) analytic approaches. The mean score of the FIRST-S was 13.8 (range: 9-33). The results of the EFA showed that the FIRST-S contained a one-factor solution that accounted for 69.8% of the variance. The FIRST-S items showed good internal consistency (Cronbach α = 0.81). CFA results corroborated the one-factor structure finding from the EFA; and yielded measures indicating goodness of fit (comparative fit index of 0.902) and accuracy (root mean square error of approximation of 0.057). The FIRST-S had good construct validity as demonstrated by statistically significant associations of FIRST-S scores with sleep quality, antepartum depression and anxiety symptoms. Finally, results from IRT analyses suggested excellent item infit and outfit measures. The FIRST-S was found to have good construct validity and internal consistency for assessing vulnerability to insomnia during early pregnancy. © 2016 American Academy of Sleep Medicine.

  19. National Reading Tests in Denmark, Norway, and Sweden: A Comparison of Construct Definitions, Cognitive Targets, and Response Formats

    ERIC Educational Resources Information Center

    Tengberg, Michael

    2017-01-01

    Reading comprehension tests are often assumed to measure the same, or at least similar, constructs. Yet, reading is not a single but a multidimensional form of processing, which means that variations in terms of reading material and item design may emphasize one aspect of the construct at the cost of another. The educational systems in Denmark,…

  20. Incorporating Response Times in Item Response Theory Models of Reading Comprehension Fluency

    ERIC Educational Resources Information Center

    Su, Shiyang

    2017-01-01

    With the online assessment becoming mainstream and the recording of response times becoming straightforward, the importance of response times as a measure of psychological constructs has been recognized and the literature of modeling times has been growing during the last few decades. Previous studies have tried to formulate models and theories to…

  1. Item-saving assessment of self-care performance in children with developmental disabilities: A prospective caregiver-report computerized adaptive test

    PubMed Central

    Chen, Cheng-Te; Chen, Yu-Lan; Lin, Yu-Ching; Hsieh, Ching-Lin; Tzeng, Jeng-Yi

    2018-01-01

    Objective The purpose of this study was to construct a computerized adaptive test (CAT) for measuring self-care performance (the CAT-SC) in children with developmental disabilities (DD) aged from 6 months to 12 years in a content-inclusive, precise, and efficient fashion. Methods The study was divided into 3 phases: (1) item bank development, (2) item testing, and (3) a simulation study to determine the stopping rules for the administration of the CAT-SC. A total of 215 caregivers of children with DD were interviewed with the 73-item CAT-SC item bank. An item response theory model was adopted for examining the construct validity to estimate item parameters after investigation of the unidimensionality, equality of slope parameters, item fitness, and differential item functioning (DIF). In the last phase, the reliability and concurrent validity of the CAT-SC were evaluated. Results The final CAT-SC item bank contained 56 items. The stopping rules suggested were (a) reliability coefficient greater than 0.9 or (b) 14 items administered. The results of simulation also showed that 85% of the estimated self-care performance scores would reach a reliability higher than 0.9 with a mean test length of 8.5 items, and the mean reliability for the rest was 0.86. Administering the CAT-SC could reduce the number of items administered by 75% to 84%. In addition, self-care performances estimated by the CAT-SC and the full item bank were very similar to each other (Pearson r = 0.98). Conclusion The newly developed CAT-SC can efficiently measure self-care performance in children with DD whose performances are comparable to those of TD children aged from 6 months to 12 years as precisely as the whole item bank. The item bank of the CAT-SC has good reliability and a unidimensional self-care construct, and the CAT can estimate self-care performance with less than 25% of the items in the item bank. 
Therefore, the CAT-SC could be useful for measuring self-care performance in children with DD in clinical and research settings. PMID:29561879
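
    The two stopping rules reported in this abstract (reliability above 0.9, or 14 items administered) can be sketched as a single check. The conversion from standard error to reliability assumes a latent trait scaled to unit variance, which the abstract does not state; the function and parameter names are hypothetical.

```python
def should_stop(standard_error, items_administered,
                min_reliability=0.9, max_items=14):
    """Return True when a CAT session should end.

    Assumption: the latent trait has unit variance, so the reliability of
    the current estimate is approximated as 1 - SE^2.
    """
    reliability = 1.0 - standard_error ** 2
    return reliability > min_reliability or items_administered >= max_items
```

    With these defaults, a session ends early once the standard error of the estimate falls below roughly 0.316, and otherwise after the 14th item.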

  2. Item-saving assessment of self-care performance in children with developmental disabilities: A prospective caregiver-report computerized adaptive test.

    PubMed

    Chen, Cheng-Te; Chen, Yu-Lan; Lin, Yu-Ching; Hsieh, Ching-Lin; Tzeng, Jeng-Yi; Chen, Kuan-Lin

    2018-01-01

    The purpose of this study was to construct a computerized adaptive test (CAT) for measuring self-care performance (the CAT-SC) in children with developmental disabilities (DD) aged from 6 months to 12 years in a content-inclusive, precise, and efficient fashion. The study was divided into 3 phases: (1) item bank development, (2) item testing, and (3) a simulation study to determine the stopping rules for the administration of the CAT-SC. A total of 215 caregivers of children with DD were interviewed with the 73-item CAT-SC item bank. An item response theory model was adopted for examining the construct validity to estimate item parameters after investigation of the unidimensionality, equality of slope parameters, item fitness, and differential item functioning (DIF). In the last phase, the reliability and concurrent validity of the CAT-SC were evaluated. The final CAT-SC item bank contained 56 items. The stopping rules suggested were (a) reliability coefficient greater than 0.9 or (b) 14 items administered. The results of simulation also showed that 85% of the estimated self-care performance scores would reach a reliability higher than 0.9 with a mean test length of 8.5 items, and the mean reliability for the rest was 0.86. Administering the CAT-SC could reduce the number of items administered by 75% to 84%. In addition, self-care performances estimated by the CAT-SC and the full item bank were very similar to each other (Pearson r = 0.98). The newly developed CAT-SC can efficiently measure self-care performance in children with DD whose performances are comparable to those of TD children aged from 6 months to 12 years as precisely as the whole item bank. The item bank of the CAT-SC has good reliability and a unidimensional self-care construct, and the CAT can estimate self-care performance with less than 25% of the items in the item bank. 
Therefore, the CAT-SC could be useful for measuring self-care performance in children with DD in clinical and research settings.

  3. Development and psychometric properties of the Suicidality of Adolescent Screening Scale (SASS) using Multidimensional Item Response Theory.

    PubMed

    Sukhawaha, Supattra; Arunpongpaisal, Suwanna; Hurst, Cameron

    2016-09-30

    Suicide prevention in adolescents by early detection using screening tools to identify high suicidal risk is a priority. Our objective was to build a multidimensional scale, namely the "Suicidality of Adolescent Screening Scale (SASS)", to identify adolescents at risk of suicide. An initial pool of items was developed by using in-depth interviews, focus groups and a literature review. Initially, 77 items were administered to 307 adolescents and analyzed using exploratory Multidimensional Item Response Theory (MIRT) to remove unnecessary items. A subsequent exploratory factor analysis revealed 35 items that collected into 4 factors: Stressors, Pessimism, Suicidality and Depression. To confirm this structure, a new sample of 450 adolescents was collected and confirmatory MIRT factor analysis was performed. The resulting scale was shown to be both construct valid and able to discriminate well between adolescents who had and had not previously attempted suicide. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  4. A modular approach for item response theory modeling with the R package flirt.

    PubMed

    Jeon, Minjeong; Rijmen, Frank

    2016-06-01

    The new R package flirt is introduced for flexible item response theory (IRT) modeling of psychological, educational, and behavior assessment data. flirt integrates a generalized linear and nonlinear mixed modeling framework with graphical model theory. The graphical model framework allows for efficient maximum likelihood estimation. The key feature of flirt is its modular approach to facilitate convenient and flexible model specifications. Researchers can construct customized IRT models by simply selecting various modeling modules, such as parametric forms, number of dimensions, item and person covariates, person groups, link functions, etc. In this paper, we describe major features of flirt and provide examples to illustrate how flirt works in practice.

  5. Development and validation of a measure of workplace climate for healthy weight maintenance.

    PubMed

    Sliter, Katherine A

    2013-07-01

    Due to the obesity epidemic, an increasing amount of research is being conducted to better understand the antecedents and consequences of excess employee weight. One construct often of interest to researchers in this area is organizational climate. Unfortunately, a viable measure of climate, as related to employee weight, does not exist. The purpose of this study was to remedy this by developing and validating a concise, psychometrically sound measure of climate for healthy weight. An item pool was developed based on surveys of full-time employees, and a sorting task was used to eliminate ambiguous items. Items were pilot tested by a sample of 338 full-time employees, and the item pool was reduced through item response theory (IRT) and reliability analyses. Finally, the retained 14 items, comprising 3 subscales, were completed by a sample of 360 full-time employees, representing 26 different organizations from across the United States. Multilevel modeling indicated that sufficient variance was explained by group membership to support aggregation, and confirmatory factor analysis (CFA) supported the hypothesized model of 3 subscale factors and an overall climate factor. Nine hypotheses specific to construct validation were tested. Scores on the new scale correlated significantly with individual-level reports of psychological constructs (e.g., health motivation, general leadership support for health) and physiological phenomena (e.g., body mass index [BMI], physical health problems) to which they should theoretically relate, supporting construct validity. Implications for the use of this scale in both applied and research settings are discussed. PsycINFO Database Record (c) 2013 APA, all rights reserved.

  6. Predicting sugar-sweetened behaviours with theory of planned behaviour constructs: Outcome and process results from the SIPsmartER behavioural intervention.

    PubMed

    Zoellner, Jamie M; Porter, Kathleen J; Chen, Yvonnes; Hedrick, Valisa E; You, Wen; Hickman, Maja; Estabrooks, Paul A

    2017-05-01

    Guided by the theory of planned behaviour (TPB) and health literacy concepts, SIPsmartER is a six-month multicomponent intervention effective at improving SSB behaviours. Using SIPsmartER data, this study explores prediction of SSB behavioural intention (BI) and behaviour from TPB constructs using: (1) cross-sectional and prospective models and (2) 11 single-item assessments from interactive voice response (IVR) technology. Quasi-experimental design, including pre- and post-outcome data and repeated-measures process data of 155 intervention participants. Validated multi-item TPB measures, single-item TPB measures, and self-reported SSB behaviours. Hypothesised relationships were investigated using correlation and multiple regression models. TPB constructs explained 32% of the variance cross sectionally and 20% prospectively in BI; and explained 13-20% of variance cross sectionally and 6% prospectively. Single-item scale models were significant, yet explained less variance. All IVR models predicting BI (average 21%, range 6-38%) and behaviour (average 30%, range 6-55%) were significant. Findings are interpreted in the context of other cross-sectional, prospective and experimental TPB health and dietary studies. Findings advance experimental application of the TPB, including understanding constructs at outcome and process time points and applying theory in all intervention development, implementation and evaluation phases.

  7. Development and psychometric characteristics of the SCI-QOL Ability to Participate and Satisfaction with Social Roles and Activities item banks and short forms.

    PubMed

    Heinemann, Allen W; Kisala, Pamela A; Hahn, Elizabeth A; Tulsky, David S

    2015-05-01

    To develop a spinal cord injury (SCI)-focused version of PROMIS and Neuro-QOL social domain item banks; evaluate the psychometric properties of items developed for adults with SCI; and report information to facilitate clinical and research use. We used a mixed-methods design to develop and evaluate Ability to Participate in Social Roles and Activities and Satisfaction with Social Roles and Activities items. Focus groups helped define the constructs; cognitive interviews helped revise items; and confirmatory factor analysis and item response theory methods helped calibrate item banks and evaluate differential item functioning related to demographic and injury characteristics. Five SCI Model System sites and one Veterans Administration medical center. The calibration sample consisted of 641 individuals; a reliability sample consisted of 245 individuals residing in the community. A subset of 27 Ability to Participate and 35 Satisfaction items demonstrated good measurement properties and negligible differential item functioning related to demographic and injury characteristics. The SCI-specific measures correlate strongly with the PROMIS and Neuro-QOL versions. Ten item short forms correlate >0.96 with the full banks. Variable-length CATs with a minimum of 4 items, variable-length CATs with a minimum of 8 items, fixed-length CATs of 10 items, and the 10-item short forms demonstrate construct coverage and measurement error that is comparable to the full item bank. The Ability to Participate and Satisfaction with Social Roles and Activities CATs and short forms demonstrate excellent psychometric properties and are suitable for clinical and research applications.

  8. Bees Algorithm for Construction of Multiple Test Forms in E-Testing

    ERIC Educational Resources Information Center

    Songmuang, Pokpong; Ueno, Maomi

    2011-01-01

    The purpose of this research is to automatically construct multiple equivalent test forms that have equivalent qualities indicated by test information functions based on item response theory. There has been a trade-off in previous studies between the computational costs and the equivalent qualities of test forms. To alleviate this problem, we…

  9. TEST BOOKLET FOR HIGH SCHOOL BIOLOGY, EXPERIMENTAL MATERIALS FOR USE 1966-1968.

    ERIC Educational Resources Information Center

    Biological Sciences Curriculum Study, Boulder, CO.

    SUPPLEMENTARY TEST QUESTIONS FOR USE BY SECONDARY BIOLOGICAL SCIENCES CURRICULUM STUDY GREEN VERSION BIOLOGY TEACHERS IN THE CONSTRUCTION OF EXAMINATIONS ARE CONTAINED IN THIS EXPERIMENTAL MANUAL. THE ITEMS WERE PREPARED BY THE BIOLOGICAL SCIENCES CURRICULUM STUDY TEST CONSTRUCTION COMMITTEE IN RESPONSE TO TEACHER REQUESTS FOR SHORT-RANGE TESTS.…

  10. Psychometric properties of the SDM-Q-9 questionnaire for shared decision-making in multiple sclerosis: item response theory modelling and confirmatory factor analysis.

    PubMed

    Ballesteros, Javier; Moral, Ester; Brieva, Luis; Ruiz-Beato, Elena; Prefasi, Daniel; Maurino, Jorge

    2017-04-22

    Shared decision-making is a cornerstone of patient-centred care. The 9-item Shared Decision-Making Questionnaire (SDM-Q-9) is a brief self-assessment tool for measuring patients' perceived level of involvement in decision-making related to their own treatment and care. Information related to the psychometric properties of the SDM-Q-9 for multiple sclerosis (MS) patients is limited. The objective of this study was to assess the performance of the items composing the SDM-Q-9 and its dimensional structure in patients with relapsing-remitting MS. A non-interventional, cross-sectional study in adult patients with relapsing-remitting MS was conducted in 17 MS units throughout Spain. A nonparametric item response theory (IRT) analysis was used to assess the latent construct and dimensional structure underlying the observed responses. A parametric IRT model, General Partial Credit Model, was fitted to obtain estimates of the relationship between the latent construct and item characteristics. The unidimensionality of the SDM-Q-9 instrument was assessed by confirmatory factor analysis. A total of 221 patients were studied (mean age = 42.1 ± 9.9 years, 68.3% female). Median Expanded Disability Status Scale score was 2.5 ± 1.5. Most patients reported taking part in each step of the decision-making process. Internal reliability of the instrument was high (Cronbach's α = 0.91) and the overall scale scalability score was 0.57, indicative of a strong scale. All items, except for the item 1, showed scalability indices higher than 0.30. Four items (items 6 through to 9) conveyed more than half of the SDM-Q-9 overall information (67.3%). The SDM-Q-9 was a good fit for a unidimensional latent structure (comparative fit index = 0.98, root-mean-square error of approximation = 0.07). All freely estimated parameters were statistically significant (P < 0.001). 
All items presented standardized parameter estimates with salient loadings (>0.40) with the exception of item 1 which presented the lowest loading (0.26). Items 6 through to 8 were the most relevant items for shared decision-making. The SDM-Q-9 presents appropriate psychometric properties and is therefore useful for assessing different aspects of shared decision-making in patients with multiple sclerosis.

  11. The Development of a Multiple-Item Annoyance Scale (MIAS) for Transportation Noise Annoyance

    PubMed Central

    Belke, Christin; Spilski, Jan

    2018-01-01

    In 2001, Team#6 of the International Commission on Biological Effects of Noise (ICBEN) recommended the use of two single international standardised questions and response scales. This recommendation has been widely accepted in the scientific community. Nevertheless, annoyance can be regarded as a multidimensional construct comprising the three elements: (1) experience of an often repeated noise-related disturbance and the behavioural response to cope with it, (2) an emotional/attitudinal response to the sound and its disturbing impact, and (3) the perceived control or coping capacity with regard to the noise situation. The psychometric properties of items reflecting these three elements have been explored for aircraft noise annoyance. Analyses were conducted using data of the NORAH-Study (Noise-Related Annoyance, Cognition, and Health), and a multi-item noise annoyance scale (MIAS) has been developed and tested post hoc by using a stepwise process (exploratory and confirmatory factor analyses). Preliminary results were presented to the 12th ICBEN Congress in 2017. In this study, the validation of MIAS is done for aircraft noise and extended to railway and road traffic noise. The results largely confirm the concept of MIAS as a second-order construct of annoyance for all of the investigated transportation noise sources; however, improvements can be made, in particular with regard to items addressing the perceived coping capacity. PMID:29757228

  12. Development and validation of the Measure of Indigenous Racism Experiences (MIRE)

    PubMed Central

    Paradies, Yin C; Cunningham, Joan

    2008-01-01

    Background In recent decades there has been increasing evidence of a relationship between self-reported racism and health. Although a plethora of instruments to measure racism have been developed, very few have been described conceptually or psychometrically. Furthermore, this research field has been limited by a dearth of instruments that examine reactions/responses to racism and by a restricted focus on African American populations. Methods In response to these limitations, the 31-item Measure of Indigenous Racism Experiences (MIRE) was developed to assess self-reported racism for Indigenous Australians. This paper describes the development of the MIRE together with an opportunistic examination of its content, construct and convergent validity in a population health study involving 312 Indigenous Australians. Results Focus group research supported the content validity of the MIRE, and inter-item/scale correlations suggested good construct validity. A good fit with a priori conceptual dimensions was demonstrated in factor analysis, and convergence with a separate item on discrimination was satisfactory. Conclusion The MIRE has considerable utility as an instrument that can assess multiple facets of racism together with responses/reactions to racism among indigenous populations and, potentially, among other ethnic/racial groups. PMID:18426602

  13. The reliability and validity of the SF-8 with a conflict-affected population in northern Uganda.

    PubMed

    Roberts, Bayard; Browne, John; Ocaka, Kaducu Felix; Oyok, Thomas; Sondorp, Egbert

    2008-12-02

    The SF-8 is a health-related quality of life instrument that could provide a useful means of assessing general physical and mental health amongst populations affected by conflict. The purpose of this study was to test the validity and reliability of the SF-8 with a conflict-affected population in northern Uganda. A cross-sectional multi-staged, random cluster survey was conducted with 1206 adults in camps for internally displaced persons in Gulu and Amuru districts of northern Uganda. Data quality was assessed by analysing the number of incomplete responses to SF-8 items. Response distribution was analysed using aggregate endorsement frequency. Test-retest reliability was assessed in a separate smaller survey using the intraclass correlation test. Construct validity was measured using principal component analysis, and the Pearson Correlation test for item-summary score correlation and inter-instrument correlations. Known groups validity was assessed using a two sample t-test to evaluate the ability of the SF-8 to discriminate between groups known to have, and not have, physical and mental health problems. The SF-8 showed excellent data quality. It showed acceptable item response distribution based upon analysis of aggregate endorsement frequencies. Test-retest showed a good intraclass correlation of 0.61 for PCS and 0.68 for MCS. The principal component analysis indicated strong construct validity and concurred with the results of the validity tests by the SF-8 developers. The SF-8 also showed strong construct validity between the 8 items and PCS and MCS summary score, moderate inter-instrument validity, and strong known groups validity. This study provides evidence on the reliability and validity of the SF-8 amongst IDPs in northern Uganda.

  14. The reliability and validity of the SF-8 with a conflict-affected population in northern Uganda

    PubMed Central

    Roberts, Bayard; Browne, John; Ocaka, Kaducu Felix; Oyok, Thomas; Sondorp, Egbert

    2008-01-01

    Background The SF-8 is a health-related quality of life instrument that could provide a useful means of assessing general physical and mental health amongst populations affected by conflict. The purpose of this study was to test the validity and reliability of the SF-8 with a conflict-affected population in northern Uganda. Methods A cross-sectional multi-staged, random cluster survey was conducted with 1206 adults in camps for internally displaced persons in Gulu and Amuru districts of northern Uganda. Data quality was assessed by analysing the number of incomplete responses to SF-8 items. Response distribution was analysed using aggregate endorsement frequency. Test-retest reliability was assessed in a separate smaller survey using the intraclass correlation test. Construct validity was measured using principal component analysis, and the Pearson Correlation test for item-summary score correlation and inter-instrument correlations. Known groups validity was assessed using a two sample t-test to evaluate the ability of the SF-8 to discriminate between groups known to have, and not have, physical and mental health problems. Results The SF-8 showed excellent data quality. It showed acceptable item response distribution based upon analysis of aggregate endorsement frequencies. Test-retest showed a good intraclass correlation of 0.61 for PCS and 0.68 for MCS. The principal component analysis indicated strong construct validity and concurred with the results of the validity tests by the SF-8 developers. The SF-8 also showed strong construct validity between the 8 items and PCS and MCS summary score, moderate inter-instrument validity, and strong known groups validity. Conclusion This study provides evidence on the reliability and validity of the SF-8 amongst IDPs in northern Uganda. PMID:19055716

  15. Evaluation of the Fecal Incontinence Quality of Life Scale (FIQL) using item response theory reveals limitations and suggests revisions.

    PubMed

    Peterson, Alexander C; Sutherland, Jason M; Liu, Guiping; Crump, R Trafford; Karimuddin, Ahmer A

    2018-06-01

    The Fecal Incontinence Quality of Life Scale (FIQL) is a commonly used patient-reported outcome measure for fecal incontinence, often used in clinical trials, yet has not been validated in English since its initial development. This study uses modern methods to thoroughly evaluate the psychometric characteristics of the FIQL and its potential for differential functioning by gender. This study analyzed prospectively collected patient-reported outcome data from a sample of patients prior to colorectal surgery. Patients were recruited from 14 general and colorectal surgeons in Vancouver Coastal Health hospitals in Vancouver, Canada. Confirmatory factor analysis was used to assess construct validity. Item response theory was used to evaluate test reliability, describe item-level characteristics, identify local item dependence, and test for differential functioning by gender. 236 patients were included for analysis, with mean age 58 and approximately half female. Factor analysis failed to identify the lifestyle, coping, depression, and embarrassment domains, suggesting lack of construct validity. Items demonstrated low difficulty, indicating that the test has the highest reliability among individuals who have low quality of life. Five items are suggested for removal or replacement. Differential test functioning was minimal. This study has identified specific improvements that can be made to each domain of the Fecal Incontinence Quality of Life Scale and to the instrument overall. Formatting, scoring, and instructions may be simplified, and items with higher difficulty developed. The lifestyle domain can be used as is. The embarrassment domain should be significantly revised before use.

  16. Solar power satellite system definition study. Part 2, volume 5: Space operations (construction and transportation)

    NASA Technical Reports Server (NTRS)

    Miller, K.; Davis, E. E.

    1977-01-01

    Construction and transportation systems and operations are described for the following combinations: (1) silicon photovoltaic CR=1 satellite constructed primarily in low earth orbit (LEO); (2) silicon photovoltaic CR=1 satellite constructed in geosynchronous earth orbit (GEO); (3) Rankine thermal engine satellite constructed primarily in LEO; and (4) Rankine thermal engine satellite constructed in GEO.

  17. A multi-agent safety response model in the construction industry.

    PubMed

    Meliá, José L

    2015-01-01

    The construction industry is one of the sectors with the highest accident rates and the most serious accidents. A multi-agent safety response approach provides a useful diagnostic tool for understanding factors affecting risk and accidents. The special features of the construction sector can influence the relationships among safety responses along the model of safety influences. The purpose of this paper is to test a model explaining risk and work-related accidents in the construction industry as a result of the safety responses of the organization, the supervisors, the co-workers and the worker. 374 construction employees belonging to 64 small Spanish construction companies working for two main companies participated in the study. Safety responses were measured using a 45-item Likert-type questionnaire. The structure of the measure was analyzed using factor analysis and the model of effects was tested using a structural equation model. Factor analysis clearly identifies the multi-agent safety dimensions hypothesized. The proposed safety response model of work-related accidents, involving construction specific results, showed a good fit. The multi-agent safety response approach to safety climate is a useful framework for the assessment of organizational and behavioral risks in construction.

  18. [Design and validation of a questionnaire for psychosocial nursing diagnosis in Primary Care].

    PubMed

    Brito-Brito, Pedro Ruymán; Rodríguez-Álvarez, Cristobalina; Sierra-López, Antonio; Rodríguez-Gómez, José Ángel; Aguirre-Jaime, Armando

    2012-01-01

    To develop a valid, reliable and easy-to-use questionnaire for a psychosocial nursing diagnosis. The study was performed in two phases: first phase, questionnaire design and construction; second phase, validity and reliability tests. A bank of items was constructed using the NANDA classification as a theoretical framework. Each item was assigned a Likert scale or dichotomous response. The combination of responses to the items constituted the diagnostic rules to assign up to 28 labels. A group of experts carried out the validity test for content. Other validated scales were used as reference standards for the criterion validity tests. Forty-five nurses provided the questionnaire to the patients on three separate occasions over a period of three weeks, and the other validated scales only once to 188 randomly selected patients in Primary Care centres in Tenerife (Spain). Validity tests for construct confirmed the six dimensions of the questionnaire with 91% of total variance explained. Validity tests for criterion showed a specificity of 66%-100%, and showed high correlations with the reference scales when the questionnaire was assigning nursing diagnoses. Reliability tests showed agreement of 56%-91% (P<.001), and a 93% internal consistency. The Questionnaire for Psychosocial Nursing Diagnosis was called CdePS, and included 61 items. The CdePS is a valid, reliable and easy-to-use tool in Primary Care centres to improve the assigning of a psychosocial nursing diagnosis. Copyright © 2011 Elsevier España, S.L. All rights reserved.

  19. Pedagogy of Science Teaching Tests: Formative assessments of science teaching orientations

    NASA Astrophysics Data System (ADS)

    Cobern, William W.; Schuster, David; Adams, Betty; Skjold, Brandy Ann; Zeynep Muğaloğlu, Ebru; Bentz, Amy; Sparks, Kelly

    2014-09-01

    A critical aspect of teacher education is gaining pedagogical content knowledge of how to teach science for conceptual understanding. Given the time limitations of college methods courses, it is difficult to touch on more than a fraction of the science topics potentially taught across grades K-8, particularly in the context of relevant pedagogies. This research and development work centers on constructing a formative assessment resource to help expose pre-service teachers to a greater number of science topics within teaching episodes using various modes of instruction. To this end, 100 problem-based, science pedagogy assessment items were developed via expert group discussions and pilot testing. Each item contains a classroom vignette followed by response choices carefully crafted to include four basic pedagogies (didactic direct, active direct, guided inquiry, and open inquiry). The brief but numerous items allow a substantial increase in the number of science topics that pre-service students may consider. The intention is that students and teachers will be able to share and discuss particular responses to individual items, or else record their responses to collections of items and thereby create a snapshot profile of their teaching orientations. Subsets of items were piloted with students in pre-service science methods courses, and the quantitative results of student responses were spread sufficiently to suggest that the items can be effective for their intended purpose.

  20. Item response theory, computerized adaptive testing, and PROMIS: assessment of physical function.

    PubMed

    Fries, James F; Witter, James; Rose, Matthias; Cella, David; Khanna, Dinesh; Morgan-DeWitt, Esi

    2014-01-01

    Patient-reported outcome (PRO) questionnaires record health information directly from research participants because observers may not accurately represent the patient perspective. Patient-reported Outcomes Measurement Information System (PROMIS) is a US National Institutes of Health cooperative group charged with bringing PRO to a new level of precision and standardization across diseases by item development and use of item response theory (IRT). With IRT methods, improved items are calibrated on an underlying concept to form an item bank for a "domain" such as physical function (PF). The most informative items can be combined to construct efficient "instruments" such as 10-item or 20-item PF static forms. Each item is calibrated on the basis of the probability that a given person will respond at a given level, and the ability of the item to discriminate people from one another. Tailored forms may cover any desired level of the domain being measured. Computerized adaptive testing (CAT) selects the best items to sharpen the estimate of a person's functional ability, based on prior responses to earlier questions. PROMIS item banks have been improved with experience from several thousand items, and are calibrated on over 21,000 respondents. In areas tested to date, PROMIS PF instruments are superior or equal to Health Assessment Questionnaire and Medical Outcome Study Short Form-36 Survey legacy instruments in clarity, translatability, patient importance, reliability, and sensitivity to change. Precise measures, such as PROMIS, efficiently incorporate patient self-report of health into research, potentially reducing research cost by lowering sample size requirements. The advent of routine IRT applications has the potential to transform PRO measurement.
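
The item-selection step described in the abstract above can be illustrated with a minimal sketch. It assumes a dichotomous two-parameter logistic (2PL) model, which is a simplification and not necessarily the model PROMIS uses for its polytomous items; the item bank values and the function names are hypothetical, chosen only for illustration. Each item has a discrimination a and a difficulty b, and the next item is the unadministered one with the largest Fisher information at the current ability estimate.

```python
import math

def prob_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = prob_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta, bank, administered):
    """Return the index of the unadministered item with maximum information."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))

# Hypothetical item bank: (discrimination a, difficulty b) pairs.
ITEM_BANK = [(1.0, -2.0), (1.5, 0.0), (2.0, 1.0), (0.8, 2.5)]

if __name__ == "__main__":
    first = select_next_item(0.0, ITEM_BANK, set())
    second = select_next_item(0.0, ITEM_BANK, {first})
    print(first, second)
```

In an operational adaptive test the ability estimate would be updated after every response before the next selection; that update step is omitted here to keep the sketch short.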

  1. Hexavalent chromium exposure and control in welding tasks.

    PubMed

    Meeker, John D; Susi, Pam; Flynn, Michael R

    2010-11-01

    Studies of exposure to the lung carcinogen hexavalent chromium (CrVI) from welding tasks are limited, especially within the construction industry where overexposure may be common. In addition, despite the OSHA requirement that the use of engineering controls such as local exhaust ventilation (LEV) first be considered before relying on other strategies to reduce worker exposure to CrVI, data on the effectiveness of LEV to reduce CrVI exposures from welding are lacking. The goal of the present study was to characterize breathing zone air concentrations of CrVI during welding tasks and primary contributing factors in four datasets: (1) OSHA compliance data; (2) a publicly available database from The Welding Institute (TWI); (3) field survey data of construction welders collected by the Center for Construction Research and Training (CPWR); and (4) controlled welding trials conducted by CPWR to assess the effectiveness of a portable LEV unit to reduce CrVI exposure. In the OSHA (n = 181) and TWI (n = 124) datasets, which included very few samples from the construction industry, the OSHA permissible exposure level (PEL) for CrVI (5 μg/m³) was exceeded in 9% and 13% of samples, respectively. CrVI concentrations measured in the CPWR field surveys (n = 43) were considerably higher, and 25% of samples exceeded the PEL. In the TWI and CPWR datasets, base metal, welding process, and LEV use were important predictors of CrVI concentrations. Only weak-to-moderate correlations were found between total particulate matter and CrVI, suggesting that total particulate matter concentrations are not a good surrogate for CrVI exposure in retrospective studies. Finally, in the controlled welding trials, LEV reduced median CrVI concentrations by 68% (p = 0.02). In conclusion, overexposure to CrVI in stainless steel welding is likely widespread, especially in certain operations such as shielded metal arc welding, which is commonly used in construction. However, exposure could be substantially reduced with proper use of LEV.

  2. A Comparison of Three IRT Approaches to Examinee Ability Change Modeling in a Single-Group Anchor Test Design

    ERIC Educational Resources Information Center

    Paek, Insu; Park, Hyun-Jeong; Cai, Li; Chi, Eunlim

    2014-01-01

    Typically a longitudinal growth modeling based on item response theory (IRT) requires repeated measures data from a single group with the same test design. If operational or item exposure problems are present, the same test may not be employed to collect data for longitudinal analyses and tests at multiple time points are constructed with unique…

  3. Construct validity of the Heart Failure Screening Tool (Heart-FaST) to identify heart failure patients at risk of poor self-care: Rasch analysis.

    PubMed

    Reynolds, Nicholas A; Ski, Chantal F; McEvedy, Samantha M; Thompson, David R; Cameron, Jan

    2018-02-14

    The aim of this study was to psychometrically evaluate the Heart Failure Screening Tool (Heart-FaST) via: (1) examination of internal construct validity; (2) testing of scale function in accordance with design; and (3) recommendation for change/s, if items are not well adjusted, to improve psychometric credential. Self-care is vital to the management of heart failure. The Heart-FaST may provide a prospective assessment of risk, regarding the likelihood that patients with heart failure will engage in self-care. Psychometric validation of the Heart-FaST using Rasch analysis. The Heart-FaST was administered to 135 patients (median age = 68, IQR = 59-78 years; 105 males) enrolled in a multidisciplinary heart failure management program. The Heart-FaST is a nurse-administered tool for screening patients with HF at risk of poor self-care. A Rasch analysis of responses was conducted which tested data against Rasch model expectations, including whether items serve as unbiased, non-redundant indicators of risk and measure a single construct and that rating scales operate as intended. The results showed that data met Rasch model expectations after rescoring or deleting items due to poor discrimination, disordered thresholds, differential item functioning, or response dependence. There was no evidence of multidimensionality which supports the use of total scores from Heart-FaST as indicators of risk. Aggregate scores from this modified screening tool rank heart failure patients according to their "risk of poor self-care" demonstrating that the Heart-FaST items constitute a meaningful scale to identify heart failure patients at risk of poor engagement in heart failure self-care. © 2018 John Wiley & Sons Ltd.

  4. Evaluating Job Demands and Control Measures for Use in Farm Worker Health Surveillance

    PubMed Central

    Alterman, Toni; Gabbard, Susan; Grzywacz, Joseph G.; Shen, Rui; Li, Jia; Nakamoto, Jorge; Carroll, Daniel J.; Muntaner, Carles

    2015-01-01

    Workplace stress likely plays a role in health disparities; however, applying standard measures to studies of immigrants requires thoughtful consideration. The goal of this study was to determine the appropriateness of two measures of occupational stressors (‘decision latitude’ and ‘job demands’) for use with mostly immigrant Latino farm workers. Cross-sectional data from a pilot module containing a four-item measure of decision latitude and a two-item measure of job demands were obtained from a subsample (N = 409) of farm workers participating in the National Agricultural Workers Survey. Responses to items for both constructs were clustered toward the low end of the structured response-set. Percentages of responses of ‘very often’ and ‘always’ for each of the items were examined by educational attainment, birth country, dominant language spoken, task, and crop. Cronbach’s α values, when stratified by subgroups of workers, ranged from 0.65 to 0.90 for the decision latitude items, but were less robust for the job demands items (0.25–0.72). The four-item decision latitude scale can be applied to occupational stress research with immigrant farm workers, and potentially other immigrant Latino worker groups. The short job demands scale requires further investigation and evaluation before suggesting widespread use. PMID:25138138

  5. The Dutch-Flemish PROMIS Physical Function item bank exhibited strong psychometric properties in patients with chronic pain.

    PubMed

    Crins, Martine H P; Terwee, Caroline B; Klausch, Thomas; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis A; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Roorda, Leo D

    2017-07-01

    The objective of this study was to assess the psychometric properties of the Dutch-Flemish Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank in Dutch patients with chronic pain. A bank of 121 items was administered to 1,247 Dutch patients with chronic pain. Unidimensionality was assessed by fitting a one-factor confirmatory factor analysis and evaluating resulting fit statistics. Items were calibrated with the graded response model and its fit was evaluated. Cross-cultural validity was assessed by testing items for differential item functioning (DIF) based on language (Dutch vs. English). Construct validity was evaluated by calculating correlations between scores on the Dutch-Flemish PROMIS Physical Function measure and scores on generic and disease-specific measures. Results supported the Dutch-Flemish PROMIS Physical Function item bank's unidimensionality (Comparative Fit Index = 0.976, Tucker Lewis Index = 0.976) and model fit. Item thresholds targeted a wide range of the physical function construct (threshold-parameters range: -4.2 to 5.6). Cross-cultural validity was good as only four items showed DIF for language and their impact on item scores was minimal. Physical Function scores were strongly associated with scores on all other measures (all correlations ≤ -0.60 as expected). The Dutch-Flemish PROMIS Physical Function item bank exhibited good psychometric properties. Development of a computer adaptive test based on the large bank is warranted. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Exploring the Full-Information Bifactor Model in Vertical Scaling with Construct Shift

    ERIC Educational Resources Information Center

    Li, Ying; Lissitz, Robert W.

    2012-01-01

    To address the lack of attention to construct shift in item response theory (IRT) vertical scaling, a multigroup, bifactor model was proposed to model the common dimension for all grades and the grade-specific dimensions. Bifactor model estimation accuracy was evaluated through a simulation study with manipulated factors of percentage of common…

  7. Managing What We Can Measure: Quantifying the Susceptibility of Automated Scoring Systems to Gaming Behavior

    ERIC Educational Resources Information Center

    Higgins, Derrick; Heilman, Michael

    2014-01-01

    As methods for automated scoring of constructed-response items become more widely adopted in state assessments, and are used in more consequential operational configurations, it is critical that their susceptibility to gaming behavior be investigated and managed. This article provides a review of research relevant to how construct-irrelevant…

  8. Development of a survey instrument to measure connectivity to evaluate national public health preparedness and response performance.

    PubMed

    Dorn, Barry C; Savoia, Elena; Testa, Marcia A; Stoto, Michael A; Marcus, Leonard J

    2007-01-01

    Survey instruments for evaluating public health preparedness have focused on measuring the structure and capacity of local, state, and federal agencies, rather than linkages among structure, process, and outcomes. To focus evaluation on the latter, we evaluated the linkages among individuals, organizations, and systems using the construct of "connectivity" and developed a measurement instrument. Results from focus groups of emergency preparedness first responders generated 62 items used in the development sample of 187 respondents. Item reduction and factor analyses were conducted to confirm the scale's components. The 62 items were reduced to 28. Five scales explained 70% of the total variance (number of items, percent variance explained, Cronbach's alpha) including connectivity with the system (8, 45%, 0.94), coworkers (7, 7%, 0.91), organization (7, 12%, 0.93), and perceptions (6, 6%, 0.90). Discriminant validity was found to be consistent with the factor structure. We developed a Connectivity Measurement Tool for the public health workforce consisting of a 34-item questionnaire found to be a reliable measure of connectivity with preliminary evidence of construct validity.

  9. Are life satisfaction and self-esteem distinct constructs? A black South African perspective.

    PubMed

    Westaway, Margaret S; Maluka, Constance S

    2005-10-01

    As part of a longitudinal project on Quality of Life, a study was undertaken to extend the applicability of the 5-item Satisfaction With Life Scale, developed in the USA, in South Africa. Data on basic sociodemographic characteristics, the scale, and the 10-item Rosenberg Self-esteem scale were available for 360 Black South Africans (151 men and 209 women), ages 21 to 83 years (M = 38.6 yr., SD = 10.3). Factor analysis applied to scale scores gave two factors, accounting for 71% of the variance. Factor I was loaded by 10 Self-esteem items and Factor II by four of the five Life Satisfaction items. Coefficient alpha was .77 for the Satisfaction With Life Scale and .97 for the Rosenberg Self-esteem Scale. Life Satisfaction was related to Self-esteem (r = .17, p < .01). It was concluded that Life Satisfaction and Self-esteem appear to be distinct, unitary constructs, but responses to Item 5 on the Satisfaction With Life Scale require cautious interpretation and may contribute to the weak r, although so may the collectivist culture of Black South Africans.

  10. Factor structure of a conceptual model of oral health tested among 65-year olds in Norway and Sweden.

    PubMed

    Astrøm, Anne Nordrehaug; Ekbäck, Gunnar; Ordell, Sven

    2010-04-01

    No studies have tested oral health-related quality of life models in dentate older adults across different populations. To test the factor structure of oral health outcomes within Gilbert's conceptual model among 65-year olds in Sweden and Norway. It was hypothesized that responses to 14 observed indicators could be explained by three correlated factors, symptom status, functional limitations and oral disadvantages, that each observed oral health indicator would associate more strongly with the factor it is supposed to measure than with competing factors and that the proposed 3-factor structure would possess satisfactory cross-national stability with 65-year olds in Norway and Sweden. In 2007, 6078 Swedish and 4062 Norwegian adults born in 1942 completed mailed questionnaires including oral symptoms, functional limitations and the eight-item Oral Impacts on Daily Performances inventory. Model generation analysis was restricted to the Norwegian study group and the model achieved was tested without modifications in Swedish 65-year olds. A modified 3-factor solution with cross-loadings improved the fit to the data compared with a 2-factor model and the initially proposed 3-factor model among the Norwegian [comparative fit index (CFI) = 0.97] and Swedish (CFI = 0.98) participants. All factor loadings for the modified 3-factor model were in the expected direction and were statistically significant at CR > 1. Multiple group confirmatory factor analyses, with Norwegian and Swedish data simultaneously, revealed acceptable fit for the unconstrained model (CFI = 0.97), whereas unconstrained and constrained models were statistically significantly different in nested model comparison. Within-construct validity of Gilbert's model was supported with Norwegian and Swedish 65-year olds, indicating that the 14-item questionnaire reflected three constructs: symptom status, functional limitation and oral disadvantage. Measurement invariance was confirmed at the level of factor structure, suggesting that the 3-factor model is comparable to some extent across 65-year olds in Norway and Sweden.

  11. A New Tool for Nutrition App Quality Evaluation (AQEL): Development, Validation, and Reliability Testing.

    PubMed

    DiFilippo, Kristen Nicole; Huang, Wenhao; Chapman-Novakofski, Karen M

    2017-10-27

    The extensive availability and increasing use of mobile apps for nutrition-based health interventions makes evaluation of the quality of these apps crucial for integration of apps into nutritional counseling. The goal of this research was the development, validation, and reliability testing of the app quality evaluation (AQEL) tool, an instrument for evaluating apps' educational quality and technical functionality. Items for evaluating app quality were adapted from website evaluations, with additional items added to evaluate the specific characteristics of apps, resulting in 79 initial items. Expert panels of nutrition and technology professionals and app users reviewed items for face and content validation. After recommended revisions, nutrition experts completed a second AQEL review to ensure clarity. On the basis of 150 sets of responses using the revised AQEL, principal component analysis was completed, reducing AQEL into 5 factors that underwent reliability testing, including internal consistency, split-half reliability, test-retest reliability, and interrater reliability (IRR). Two additional modifiable constructs for evaluating apps based on the age and needs of the target audience as selected by the evaluator were also tested for construct reliability. IRR testing using intraclass correlations (ICC) with all 7 constructs was conducted, with 15 dietitians evaluating one app. Development and validation resulted in the 51-item AQEL. These were reduced to 25 items in 5 factors after principal component analysis, plus 9 modifiable items in two constructs that were not included in principal component analysis. Internal consistency and split-half reliability of the following constructs derived from principal component analysis were good (Cronbach alpha >.80, Spearman-Brown coefficient >.80): behavior change potential, support of knowledge acquisition, app function, and skill development. App purpose split-half reliability was .65. Test-retest reliability showed no significant change over time (P>.05) for all but skill development (P=.001). Construct reliability was good for items assessing age appropriateness of apps for children, teens, and a general audience. In addition, construct reliability was acceptable for assessing app appropriateness for various target audiences (Cronbach alpha >.70). For the 5 main factors, ICC (1,k) was >.80, with a P value of <.05. When 15 nutrition professionals evaluated one app, ICC (2,15) was .98, with a P value of <.001 for all 7 constructs when the modifiable items were specified for adults seeking weight loss support. Our preliminary effort shows that AQEL is a valid, reliable instrument for evaluating nutrition apps' qualities for clinical interventions by nutrition clinicians, educators, and researchers. Further efforts in validating AQEL in various contexts are needed. ©Kristen Nicole DiFilippo, Wenhao Huang, Karen M. Chapman-Novakofski. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 27.10.2017.

  12. A natural language screening measure for motivation to change.

    PubMed

    Miller, William R; Johnson, Wendy R

    2008-09-01

    Client motivation for change, a topic of high interest to addiction clinicians, is multidimensional and complex, and many different approaches to measurement have been tried. The current effort drew on psycholinguistic research on natural language that is used by clients to describe their own motivation. Seven addiction treatment sites participated in the development of a simple scale to measure client motivation. Twelve items were drafted to represent six potential dimensions of motivation for change that occur in natural discourse. The maximum self-rating of motivation (10 on a 0-10 scale) was the median score on all items, and 43% of respondents rated 10 on all 12 items - a substantial ceiling effect. From 1035 responses, three factors emerged representing importance, ability, and commitment - constructs that are also reflected in several theoretical models of motivation. A 3-item version of the scale, with one marker item for each of these constructs, accounted for 81% of variance in the full scale. The three items are: 1. It is important for me to . . . 2. I could . . . and 3. I am trying to . . . This offers a quick (1-minute) assessment of clients' self-reported motivation for change.

  13. Development and psychometric evaluation of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions.

    PubMed

    Forrest, Christopher B; Devine, Janine; Bevans, Katherine B; Becker, Brandon D; Carle, Adam C; Teneralli, Rachel E; Moon, JeanHee; Tucker, Carole A; Ravens-Sieberer, Ulrike

    2018-01-01

    To describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions. A pool of 55 life satisfaction items was administered to 1992 children 8-17 years old and 964 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and assessment of construct validity. Thirteen items were deleted because of poor psychometric performance. An 8-item short form was administered to a national sample of 996 children 8-17 years old, and 1294 parents of children 5-17 years old. The combined sample (2988 children and 2258 parents) was used in item response theory (IRT) calibration analyses. The final item banks were unidimensional, the items were locally independent, and the items were free from impactful differential item functioning. The 8-item and 4-item short form scales showed excellent reliability, convergent validity, and discriminant validity. Life satisfaction decreased with declining socio-economic status, presence of a special health care need, and increasing age for girls, but not boys. After IRT calibration, we found that 4- and 8-item short forms had a high degree of precision (reliability) across a wide range (>4 SD units) of the latent variable. The PROMIS Pediatric Life Satisfaction item banks and their short forms provide efficient, precise, and valid assessments of life satisfaction in children and youth.

  14. A Maximin Model for Test Design with Practical Constraints. Project Psychometric Aspects of Item Banking No. 25. Research Report 87-10.

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Boekkooi-Timminga, Ellen

    A "maximin" model for item response theory based test design is proposed. In this model only the relative shape of the target test information function is specified. It serves as a constraint subject to which a linear programming algorithm maximizes the information in the test. In the practice of test construction there may be several…
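
    As a rough illustration of the maximin idea in the record above, and not the authors' linear programming formulation, the sketch below assumes a small hypothetical pool of two-parameter logistic items and a target that fixes only the relative heights of the information function at three ability points. It enumerates every test of a fixed length and keeps the one whose smallest ratio of test information to target height is largest. The pool, the ability points, and the function names are all invented for illustration.

```python
import math
from itertools import combinations


def item_information(a, b, theta):
    """Fisher information of a 2PL item (discrimination a, difficulty b) at theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def maximin_test(pool, thetas, relative_target, test_length):
    """Brute-force maximin selection over all tests of the given length.

    Returns the item indices whose test information, divided by the relative
    target height at each theta, has the largest minimum, plus that minimum.
    """
    best_value, best_items = -1.0, None
    for items in combinations(range(len(pool)), test_length):
        value = min(
            sum(item_information(*pool[i], theta) for i in items) / height
            for theta, height in zip(thetas, relative_target)
        )
        if value > best_value:
            best_value, best_items = value, items
    return best_items, best_value


# Hypothetical pool of (discrimination, difficulty) pairs, invented for illustration.
POOL = [(1.2, -1.5), (0.8, -1.0), (1.5, -0.5), (1.0, 0.0),
        (1.4, 0.3), (0.9, 0.8), (1.3, 1.2), (1.1, 1.8)]
THETAS = [-1.0, 0.0, 1.0]
RELATIVE_TARGET = [1.0, 2.0, 1.0]  # only the shape matters, not the scale

selected, height = maximin_test(POOL, THETAS, RELATIVE_TARGET, test_length=4)
print(selected, round(height, 3))
```

    Exhaustive enumeration is only workable for toy pools; it is used here so the example needs nothing beyond the standard library. Multiplying the relative target by a constant leaves the selected items unchanged, which is the sense in which only the shape of the target is specified.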

  15. Dimensionality of the Knee Numeric-Entity Evaluation Score (KNEES-ACL): a condition-specific questionnaire.

    PubMed

    Comins, J D; Krogsgaard, M R; Kreiner, S; Brodersen, J

    2013-10-01

    The benefit of anterior cruciate ligament (ACL) reconstruction has been questioned based on patient-reported outcome measures (PROMs). Valid interpretation of such results requires confirmation of the psychometric properties of the PROM. Rasch analysis is the gold standard for validation of PROMs, yet PROMs used for ACL reconstruction have not been validated using Rasch analysis. We used Rasch analysis to investigate the psychometric properties of the Knee Numeric-Entity Evaluation Score (KNEES-ACL), a newly developed PROM for patients treated for ACL deficiency. Two-hundred forty-two patients pre- and post-ACL reconstruction completed the pilot PROM. Rasch models were used to assess the psychometric properties (e.g., unidimensionality, local response dependency, and differential item functioning). Forty-one items distributed across seven unidimensional constructs measuring impairment, functional limitations, and psychosocial consequences were confirmed to fit Rasch models. Fourteen items were removed because of statistical lack of fit and inadequate face validity. Local response dependency and differential item functioning were identified and adjusted. The KNEES-ACL is the first Rasch-validated condition-specific PROM constructed for patients with ACL deficiency and patients with ACL reconstruction. Thus, this instrument can be used for within- and between-group comparisons. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  16. Which kind of psychometrics is adequate for patient satisfaction questionnaires?

    PubMed

    Konerding, Uwe

    2016-01-01

    The construction and psychometric analysis of patient satisfaction questionnaires are discussed. The discussion is based upon the classification of multi-item questionnaires into scales or indices. Scales consist of items that describe the effects of the latent psychological variable to be measured, and indices consist of items that describe the causes of this variable. Whether patient satisfaction questionnaires should be constructed and analyzed as scales or as indices depends upon the purpose for which these questionnaires are required. If the final aim is improving care with regard to patients' preferences, then these questionnaires should be constructed and analyzed as indices. This implies two requirements: 1) items for patient satisfaction questionnaires should be selected in such a way that the universe of possible causes of patient satisfaction is covered optimally and 2) Cronbach's alpha, principal component analysis, exploratory factor analysis, confirmatory factor analysis, and analyses with models from item response theory, such as the Rasch Model, should not be applied for psychometric analyses. Instead, multivariate regression analyses with a direct rating of patient satisfaction as the dependent variable and the individual questionnaire items as independent variables should be performed. The coefficients produced by such an analysis can be applied for selecting the best items and for weighting the selected items when a sum score is determined. The lower boundaries of the validity of the unweighted and the weighted sum scores can be estimated by their correlations with the direct satisfaction rating. While the first requirement is fulfilled in the majority of the previous patient satisfaction questionnaires, the second one deviates from previous practice. 
    Hence, if patient satisfaction is actually measured with the final aim of improving care with regard to patients' preferences, then future practice should be changed so that the second requirement is also fulfilled.

  17. Using Rasch rating scale model to reassess the psychometric properties of the Persian version of the PedsQL™ 4.0 Generic Core Scales in school children.

    PubMed

    Jafari, Peyman; Bagheri, Zahra; Ayatollahi, Seyyed Mohamad Taghi; Soltani, Zahra

    2012-03-13

    Item response theory (IRT) is extensively used to develop adaptive instruments of health-related quality of life (HRQoL). However, each IRT model has its own function to estimate item and category parameters, and hence different results may be found using the same response categories with different IRT models. The present study used the Rasch rating scale model (RSM) to examine and reassess the psychometric properties of the Persian version of the PedsQL™ 4.0 Generic Core Scales. The PedsQL™ 4.0 Generic Core Scales was completed by 938 Iranian school children and their parents. Convergent, discriminant and construct validity of the instrument were assessed by classical test theory (CTT). The RSM was applied to investigate person and item reliability, item statistics and ordering of response categories. The CTT method showed that the scaling success rate for convergent and discriminant validity were 100% in all domains with the exception of physical health in the child self-report. Moreover, confirmatory factor analysis supported a four-factor model similar to its original version. The RSM showed that 22 out of 23 items had acceptable infit and outfit statistics (<1.4, >0.6), person reliabilities were low, item reliabilities were high, and item difficulty ranged from -1.01 to 0.71 and -0.68 to 0.43 for child self-report and parent proxy-report, respectively. Also the RSM showed that successive response categories for all items were not located in the expected order. This study revealed that, in all domains, the five response categories did not perform adequately. It is not known whether this problem is a function of the meaning of the response choices in the Persian language or an artifact of a mostly healthy population that did not use the full range of the response categories. The response categories should be evaluated in further validation studies, especially in large samples of chronically ill patients.
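
    To make the phrase "ordering of response categories" in the record above more concrete, the following sketch computes category probabilities under the Rasch rating scale model in its usual Andrich parameterisation and checks whether a set of thresholds is ordered. The threshold values are hypothetical and are not taken from the study; five response categories imply four thresholds.

```python
import math


def rsm_category_probabilities(theta, delta, taus):
    """Rating scale model probabilities for categories 0..len(taus).

    theta: person location, delta: item location, taus: category thresholds
    shared by all items. Category k has log-weight sum_{j<=k}(theta - delta - tau_j).
    """
    cumulative = [0.0]
    for tau in taus:
        cumulative.append(cumulative[-1] + (theta - delta - tau))
    largest = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - largest) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def thresholds_ordered(taus):
    """True when each threshold is strictly higher than the one before it."""
    return all(earlier < later for earlier, later in zip(taus, taus[1:]))


# Hypothetical thresholds, invented for illustration only.
ORDERED = [-1.5, -0.5, 0.5, 1.5]
DISORDERED = [-1.5, 0.5, -0.5, 1.5]

for name, taus in (("ordered", ORDERED), ("disordered", DISORDERED)):
    probs = rsm_category_probabilities(theta=0.0, delta=0.0, taus=taus)
    print(name, thresholds_ordered(taus), [round(p, 3) for p in probs])
```

    Disordered thresholds are one common way of expressing the finding that successive categories are not located in the expected order; the study may have used additional diagnostics.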

  18. Cardiac rehabilitation using the Family-Centered Empowerment Model versus home-based cardiac rehabilitation in patients with myocardial infarction: a randomised controlled trial

    PubMed Central

    Vahedian-Azimi, Amir; Hajiesmaieli, Mohammadreza; Kangasniemi, Mari; Alhani, Fatemah; Jelvehmoghaddam, Hosseinali; Fathi, Mohammad; Farzanegan, Behrooz; Ardehali, Seyed H; Hatamian, Sevak; Gahremani, Mehdi; Mosavinasab, Seyed M M; Rostami, Zohreh; Madani, Seyed J; Izadi, Morteza

    2016-01-01

    Objective To determine if a hybrid cardiac rehabilitation (CR) programme using the Family-Centered Empowerment Model (FCEM) as compared with standard CR will improve patient quality of life, perceived stress and state anxiety of patients with myocardial infarction (MI). Methods We conducted a randomised controlled trial in which patients received either standard home CR or CR using the FCEM strategy. Patient empowerment was measured with FCEM questionnaires preintervention and postintervention for a total of 9 assessments. Quality of life, perceived stress, and state and trait anxiety were assessed using the 36-Item Short Form Health Survey (SF-36), the 14-item Perceived Stress, and the 20-item State and 20-item Trait Anxiety questionnaires, respectively. Results 70 patients were randomised. Baseline characteristics were similar. Ejection fraction was significantly higher in the intervention group at measurements 2 (p=0.01) and 3 (p=0.001). Exercise tolerance measured as walking distance was significantly improved in the intervention group throughout the study. The quality of life results in the FCEM group showed significant improvement both within the group over time (p<0.0001) and when compared with control (p<0.0001). Similarly, the perceived stress and state anxiety results showed significant improvement both within the FCEM group over time (p<0.0001) and when compared with control (p<0.0001). No significant difference was found either within or between groups for trait anxiety. Conclusions The family-centred empowerment model may be an effective hybrid cardiac rehabilitation method for improving the physical and mental health of patients post-MI; however, further study is needed to validate these findings. Clinical Trials.gov identifier NCT02402582. Trial registration number NCT02402582. PMID:27110376

  19. Psychological distress in cancer survivors: the further development of an item bank.

    PubMed

    Smith, Adam B; Armes, Jo; Richardson, Alison; Stark, Dan P

    2013-02-01

    Assessment of psychological distress by patient report is necessary to meet patients' needs throughout the cancer journey. We have previously developed an item bank to assess psychological distress but not evaluated it for cancer survivors. Our first aim in this study was to test whether we could extend our item bank to include cancer survivors. The second aim was to examine whether the item bank could assess positive affect as a single construct alongside negative psychological symptoms. Responses from 1315 cancer survivors to the Hospital Anxiety and Depression Scale (HADS) and the Positive and Negative Affect Scale (PANAS) were considered for inclusion in a pre-existing item bank created from a heterogeneous sample of 4914 cancer patients. Differential item functioning (DIF) was used to assess whether HADS responses drawn from the two samples were equivalent. Common-item equating was used to anchor the shared (HADS) items, whilst the PANAS items were added. Item fit was evaluated at each stage, and misfitting items were removed. Unidimensionality was assessed with a principal components factor analysis. The DIF analysis did not reveal any differences between the HADS item locations from the two samples. Three misfitting PANAS items were removed, resulting in a final unidimensional bank of 80 items with good internal reliability (α = 0.85). The new item bank is valid for use across the cancer journey, including cancer survivors, and modestly improves the assessment of all levels of psychological distress and positive psychological function. Copyright © 2011 John Wiley & Sons, Ltd.

  20. Human phase response curve to a single 6.5 h pulse of short-wavelength light

    PubMed Central

    Rüger, Melanie; St Hilaire, Melissa A; Brainard, George C; Khalsa, Sat-Bir S; Kronauer, Richard E; Czeisler, Charles A; Lockley, Steven W

    2013-01-01

    The photic resetting response of the human circadian pacemaker depends on the timing of exposure, and the direction and magnitude of the resulting shift is described by a phase response curve (PRC). Previous PRCs in humans have utilized high-intensity polychromatic white light. Given that the circadian photoreception system is maximally sensitive to short-wavelength visible light, the aim of the current study was to construct a PRC to blue (480 nm) light and compare it to a 10,000 lux white light PRC constructed previously using a similar protocol. Eighteen young healthy participants (18–30 years) were studied for 9–10 days in a time-free environment. The protocol included three baseline days followed by a constant routine (CR) to assess initial circadian phase. Following this CR, participants were exposed to a 6.5 h 480 nm light exposure (11.8 μW cm−2, 11.2 lux) following mydriasis via a modified Ganzfeld dome. A second CR was conducted following the light exposure to re-assess circadian phase. Phase shifts were calculated from the difference in dim light melatonin onset (DLMO) between CRs. Exposure to 6.5 h of 480 nm light resets the circadian pacemaker according to a conventional type 1 PRC with fitted maximum delays and advances of −2.6 h and 1.3 h, respectively. The 480 nm PRC induced ∼75% of the response of the 10,000 lux white light PRC. These results may contribute to a re-evaluation of dosing guidelines for clinical light therapy and the use of light as a fatigue countermeasure. PMID:23090946

  1. Predicting sugar-sweetened behaviours with theory of planned behaviour constructs: Outcome and process results from the SIPsmartER behavioural intervention

    PubMed Central

    Zoellner, Jamie M.; Porter, Kathleen J.; Chen, Yvonnes; Hedrick, Valisa E.; You, Wen; Hickman, Maja; Estabrooks, Paul A.

    2017-01-01

    Objective Guided by the theory of planned behaviour (TPB) and health literacy concepts, SIPsmartER is a six-month multicomponent intervention effective at improving SSB behaviours. Using SIPsmartER data, this study explores prediction of SSB behavioural intention (BI) and behaviour from TPB constructs using: (1) cross-sectional and prospective models and (2) 11 single-item assessments from interactive voice response (IVR) technology. Design Quasi-experimental design, including pre- and post-outcome data and repeated-measures process data of 155 intervention participants. Main Outcome Measures Validated multi-item TPB measures, single-item TPB measures, and self-reported SSB behaviours. Hypothesised relationships were investigated using correlation and multiple regression models. Results TPB constructs explained 32% of the variance cross sectionally and 20% prospectively in BI; and explained 13–20% of variance cross sectionally and 6% prospectively. Single-item scale models were significant, yet explained less variance. All IVR models predicting BI (average 21%, range 6–38%) and behaviour (average 30%, range 6–55%) were significant. Conclusion Findings are interpreted in the context of other cross-sectional, prospective and experimental TPB health and dietary studies. Findings advance experimental application of the TPB, including understanding constructs at outcome and process time points and applying theory in all intervention development, implementation and evaluation phases. PMID:28165771

  2. Eligibility of Indoor Plumbing Under Alaska Sanitation Infrastructure Grant Program

    EPA Pesticide Factsheets

    Memorandum response to questions that relate to whether indoor plumbing of homes, as part of a wastewater construction project, is an eligible cost item under the EPA Alaska Sanitation Infrastructure Grant Program.

  3. The feeding practices and structure questionnaire: construction and initial validation in a sample of Australian first-time mothers and their 2-year olds.

    PubMed

    Jansen, Elena; Mallan, Kimberley M; Nicholson, Jan M; Daniels, Lynne A

    2014-06-04

    Early feeding practices lay the foundation for children's eating habits and weight gain. Questionnaires are available to assess parental feeding but overlapping and inconsistent items, subscales and terminology limit conceptual clarity and between study comparisons. Our aim was to consolidate a range of existing items into a parsimonious and conceptually robust questionnaire for assessing feeding practices with very young children (<3 years). Data were from 462 mothers and children (age 21-27 months) from the NOURISH trial. Items from five questionnaires and two study-specific items were submitted to a priori item selection, allocation and verification, before theoretically-derived factors were tested using Confirmatory Factor Analysis. Construct validity of the new factors was examined by correlating these with child eating behaviours and weight. Following expert review 10 factors were specified. Of these, 9 factors (40 items) showed acceptable model fit and internal reliability (Cronbach's α: 0.61-0.89). Four factors reflected non-responsive feeding practices: 'Distrust in Appetite', 'Reward for Behaviour', 'Reward for Eating', and 'Persuasive Feeding'. Five factors reflected structure of the meal environment and limits: 'Structured Meal Setting', 'Structured Meal Timing', 'Family Meal Setting', 'Overt Restriction' and 'Covert Restriction'. Feeding practices generally showed the expected pattern of associations with child eating behaviours but none with weight. The Feeding Practices and Structure Questionnaire (FPSQ) provides a new reliable and valid measure of parental feeding practices, specifically maternal responsiveness to children's hunger/satiety signals facilitated by routine and structure in feeding. Further validation in more diverse samples is required.
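
    The internal reliability figures quoted above are Cronbach's α values. A minimal sketch of the standard formula, α = k/(k − 1) · (1 − Σ item variances / variance of the total score), is given below; the response matrix is hypothetical and has nothing to do with the FPSQ data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: five respondents, three items on a 1-5 scale.
RESPONSES = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]

alpha = cronbach_alpha(RESPONSES)
print(round(alpha, 3))
```

    Sample variances are used for both the items and the total; using population variances throughout gives the same α because the correction factor cancels.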

  4. Evaluation of the Irritable Bowel Syndrome Quality of Life (IBS-QOL) questionnaire in diarrheal-predominant irritable bowel syndrome patients

    PubMed Central

    2013-01-01

    Background Diarrhea-predominant irritable bowel syndrome (IBS-d) significantly diminishes the health-related quality of life (HRQOL) of patients. Psychological and social impacts are common with many IBS-d patients reporting comorbid depression, anxiety, decreased intimacy, and lost working days. The Irritable Bowel Syndrome Quality of Life (IBS-QOL) questionnaire is a 34-item instrument developed and validated for measurement of HRQOL in non-subtyped IBS patients. The current paper assesses this previously-validated instrument employing data collected from 754 patients who participated in a randomized clinical trial of a novel treatment, eluxadoline, for IBS-d. Methods Psychometric methods common to HRQOL research were employed to evaluate the IBS-QOL. Many of the historical analyses of the IBS-QOL validations were used. Other techniques that extended the original methods were applied where more appropriate for the current dataset. In IBS-d patients, we analyzed the items and substructure of the IBS-QOL via item reduction, factor structure, internal consistency, reproducibility, construct validity, and ability to detect change. Results This study supports the IBS-QOL as a psychometrically valid measure. Factor analyses suggested that IBS-specific QOL as measured by the IBS-QOL is a unidimensional construct. Construct validity was further buttressed by significant correlations between IBS-QOL total scores and related measures of IBS-d severity including the historically-relevant Irritable Bowel Syndrome Adequate Relief (IBS-AR) item and the FDA’s Clinical Responder definition. The IBS-QOL also showed a significant ability to detect change as evidenced by analysis of treatment effects. A minority of the items, unrelated to the IBS-d, performed less well by the standards set by the original authors. Conclusions We established that the IBS-QOL total score is a psychometrically valid measure of HRQOL in IBS-d patients enrolled in this study. 
    Our analyses suggest that the IBS-QOL items demonstrate very good construct validity and ability to detect changes due to treatment effects. Furthermore, our analyses suggest that the IBS-QOL items measure a univariate construct and we believe further modeling of the IBS-QOL from an item response theory (IRT) approach under both non-treatment and treatment conditions would greatly further our understanding as item-based methods could be used to develop a short form. PMID:24330412

  5. Overview of classical test theory and item response theory for the quantitative assessment of items in developing patient-reported outcomes measures.

    PubMed

    Cappelleri, Joseph C; Jason Lundy, J; Hays, Ron D

    2014-05-01

    The US Food and Drug Administration's guidance for industry document on patient-reported outcomes (PRO) defines content validity as "the extent to which the instrument measures the concept of interest" (FDA, 2009, p. 12). According to Strauss and Smith (2009), construct validity "is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity" (p. 7). Hence, both qualitative and quantitative information are essential in evaluating the validity of measures. We review classical test theory and item response theory (IRT) approaches to evaluating PRO measures, including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized "difficulty" (severity) order of items is represented by observed responses. If a researcher has few qualitative data and wants to get preliminary information about the content validity of the instrument, then descriptive assessments using classical test theory should be the first step. As the sample size grows during subsequent stages of instrument development, confidence in the numerical estimates from Rasch and other IRT models (as well as those of classical test theory) would also grow. Classical test theory and IRT can be useful in providing a quantitative assessment of items and scales during the content-validity phase of PRO-measure development. Depending on the particular type of measure and the specific circumstances, the classical test theory and/or the IRT should be considered to help maximize the content validity of PRO measures. Copyright © 2014 Elsevier HS Journals, Inc. All rights reserved.
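
    Two of the classical test theory descriptives listed in the record above, the frequency of responses in each category and floor and ceiling effects, can be sketched in a few lines. The data and function names below are hypothetical and serve only to show the arithmetic.

```python
from collections import Counter


def category_frequencies(item_responses):
    """Count how often each response category was chosen for one item."""
    return dict(sorted(Counter(item_responses).items()))


def floor_ceiling_percentages(scale_scores, minimum, maximum):
    """Percentage of respondents at the lowest and highest possible scale score."""
    n = len(scale_scores)
    floor = 100.0 * sum(score == minimum for score in scale_scores) / n
    ceiling = 100.0 * sum(score == maximum for score in scale_scores) / n
    return floor, ceiling


# Hypothetical data, invented for illustration.
ITEM_RESPONSES = [0, 1, 1, 2, 2, 2, 3, 4, 4, 4]
SCALE_SCORES = [0, 10, 10, 5, 10, 3, 0, 10]

print(category_frequencies(ITEM_RESPONSES))
print(floor_ceiling_percentages(SCALE_SCORES, minimum=0, maximum=10))
```

    What counts as a problematic floor or ceiling percentage is a judgment the record does not specify, so no cut-off is built into the sketch.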

  6. Validation of a condition-specific measure for women having an abnormal screening mammography.

    PubMed

    Brodersen, John; Thorsen, Hanne; Kreiner, Svend

    2007-01-01

    The aim of this study is to assess the validity of a new condition-specific instrument measuring psychosocial consequences of abnormal screening mammography (PCQ-DK33). The draft version of the PCQ-DK33 was completed on two occasions by 184 women who had received an abnormal screening mammography and on one occasion by 240 women who had received a normal screening result. Item Response Theories and Classical Test Theories were used to analyze data. Construct validity, concurrent validity, known group validity, objectivity and reliability were established by item analysis examining the fit between item responses and Rasch models. Six dimensions covering anxiety, behavioral impact, sense of dejection, impact on sleep, breast examination, and sexuality were identified. One item belonging to the dejection dimension had uniform differential item functioning. Two items not fitting the Rasch models were retained because of high face validity. A sick leave item added useful information when measuring side effects and socioeconomic consequences of breast cancer screening. Five "poor items" were identified and should be deleted from the final instrument. Preliminary evidence for a valid and reliable condition-specific measure for women having an abnormal screening mammography was established. The measure includes 27 "good" items measuring different attributes of the same overall latent structure-the psychosocial consequences of abnormal screening mammography.

  7. Compound washing remediation and response surface analysis of lead-contaminated soil in mining area by fermentation broth and saponin.

    PubMed

    Zhang, Hongjiao; Wang, Zhengwei; Gao, Yuntao

    2018-03-01

    Developing an effective eluent is the key to soil washing remediation. In this study, a compound eluent was prepared from citric acid fermentation broth and saponin. It showed good washing performance for Pb, Cu, Cr, and Cd in red soil, and the removal rates, especially for Pb, improved compared with a single eluent. On this basis, the compound eluent was applied to the remediation of Pb-contaminated soil from a mining area; the desorption of Pb is a heterogeneous diffusion process, and Pb in soil of large particle size is relatively easy to remove. A usable response surface model was established: the model is highly significant (P < 0.0001), and the P value of the lack-of-fit term is 0.1152. The influence of the three significant factors on Pb removal decreases in the order liquid-to-solid ratio > washing time > saponin concentration, and liquid-to-solid ratio and washing time interact. Moreover, the Pb removal rate can reach 56.20% under the optimized conditions (0.25% saponin concentration, 20 mL/g liquid-to-solid ratio, and 320-min washing time), which is close to the predicted value of 56.20% with a difference of 1.41%. In addition, most of the active Pb was removed and environmental risks were lowered after washing.

  8. Person Heterogeneity of the BDI-II-C and Its Effects on Dimensionality and Construct Validity: Using Mixture Item Response Models

    ERIC Educational Resources Information Center

    Wu, Pei-Chen; Huang, Tsai-Wei

    2010-01-01

    This study applied the mixed Rasch model to investigate person heterogeneity of the Beck Depression Inventory-II-Chinese version (BDI-II-C) and its effects on dimensionality and construct validity. Person heterogeneity was reflected by two latent classes that differ qualitatively. Additionally, person heterogeneity adversely affected the…

  9. Motivations for Older Adults' Participation in Distance Education: A Study at the National Open University of Taiwan

    ERIC Educational Resources Information Center

    Mulenga, Derek; Liang, Jr-Shiuan

    2008-01-01

    This study investigated the factor structure of motivational constructs as expressed by older adult learners and examined how these constructs correlated with selected socio-demographic characteristics at the National Open University of Taiwan (NOUT). Results were based on the responses of 371 elders to the 32-item Reasons for Participation Scale…

  10. Investigating Cognitive Effort and Response Quality of Question Formats in Web Surveys Using Paradata

    ERIC Educational Resources Information Center

    Höhne, Jan Karem; Schlosser, Stephan; Krebs, Dagmar

    2017-01-01

    Measuring attitudes and opinions employing agree/disagree (A/D) questions is a common method in social research because it appears to be possible to measure different constructs with identical response scales. However, theoretical considerations suggest that A/D questions require a considerable cognitive processing. Item-specific (IS) questions,…

  11. Assessing children’s competence to consent in research by a standardized tool: a validity study

    PubMed Central

    2012-01-01

    Background Currently over 50% of drugs prescribed to children have not been evaluated properly for use in their age group. One key reason why children have been excluded from clinical trials is that they are not considered able to exercise meaningful autonomy over the decision to participate. Dutch law states that competence to consent can be presumed present at the age of 12 and above; however, in pediatric practice children’s competence is not that clearly presented and the transition from assent to active consent is gradual. A gold standard for competence assessment in children does not exist. In this article we describe a study protocol on the development of a standardized tool for assessing competence to consent in research in children and adolescents. Methods/design In this study we modified the MacCAT-CR, the best evaluated competence assessment tool for adults, for use in children and adolescents. We will administer the tool prospectively to a cohort of pediatric patients from 6 to 18 years during the selection stages of ongoing clinical trials. The outcomes of the MacCAT-CR interviews will be compared to a reference standard, established by the judgments of clinical investigators, and an expert panel consisting of child psychiatrists, child psychologists and medical ethicists. The reliability, criterion-related validity and reproducibility of the tool will be determined. As MacCAT-CR is a multi-item scale consisting of 13 items, power was justified at 130–190 subjects, providing a minimum of 10–15 observations per item. MacCAT-CR outcomes will be correlated with age, life experience, IQ, ethnicity, socio-economic status and competence judgment of the parent(s). It is anticipated that 160 participants will be recruited over 2 years to complete enrollment. Discussion A validity study on an assessment tool of competence to consent is strongly needed in research practice, particularly in the child and adolescent population. In this study we will establish a reference standard of children’s competence to consent, combined with validation of an assessment instrument. Results can facilitate responsible involvement of children in clinical trials by further development of guidelines, health-care policies and legal policies. PMID:23009102

  12. The Social Physique Anxiety Scale: an example of the potential consequence of negatively worded items in factorial validity studies.

    PubMed

    Motl, R W; Conroy, D E; Horan, P M

    2000-01-01

    Social physique anxiety (SPA) based on Hart, Leary, and Rejeski's (1989) Social Physique Anxiety Scale (SPAS) was originally conceptualized to be a unidimensional construct. Empirical evidence on the factorial validity of the SPAS has been contradictory, yielding both one- and two-factor models. The two-factor model, which consists of separate factors associated with positively and negatively worded items, has stimulated an ongoing debate about the dimensionality and content of the SPAS. The present study employed confirmatory factor analysis (CFA) to examine whether the two-factor solution to the 12-item SPAS was substantively meaningful or a methodological artifact. Results of the CFAs, which were performed on responses from four different samples (Eklund, Kelley, and Wilson, 1997; Eklund, Mack, and Hart, 1996), supported the existence of a single substantive SPA factor underlying responses to the 12-item SPAS. There were, in addition, method effects associated with the negatively worded items that could be modeled to achieve good fit. Therefore, it was concluded that a single substantive factor and a non-substantive method effect primarily related to the negatively worded items best represented the 12-item SPAS.

  13. The prioritization of symptom beliefs over illness beliefs: The development and validation of the Pain Perception Questionnaire for Young People.

    PubMed

    Ghio, Daniela; Thomson, Wendy; Calam, Rachel; Ulph, Fiona; Baildam, Eileen M; Hyrich, Kimme; Cordingley, Lis

    2018-02-01

    To investigate the suitability of the revised Illness Perception Questionnaire (IPQ-R) for use with adolescents with a long-term pain condition and to validate a new questionnaire for use with this age group. A three-phase mixed-methods study. Phase 1 comprised in-depth qualitative analyses of audio-recorded cognitive interviews with 20 adolescents with juvenile idiopathic arthritis who were answering IPQ-R items. Transcripts were coded using framework analysis. A content analysis of their intended responses to individual items was also conducted. In Phase 2, a new questionnaire was developed and its linguistic and face validity were assessed with 18 adolescents without long-term conditions. In Phase 3, the construct validity of the new questionnaire was assessed with 240 adolescents with juvenile idiopathic arthritis. A subset of 43 adolescents completed the questionnaire a second time to assess test-retest reliability. All participants were aged 11-16 years. Participants described both conceptual and response format difficulties when answering IPQ-R items. In response, the Pain Perception Questionnaire for Young People (PPQ-YP) was designed which incorporated significant modifications to both wording and response formats when compared with the IPQ-R. A principal component analysis of the PPQ-YP identified ten constructs in the new questionnaire. Emotional representations were separated into two constructs, responsive and anticipatory emotions. The PPQ-YP showed high test-retest reliability. Symptom beliefs appear to be more salient to adolescents with a long-term pain condition than beliefs about the illness as a whole. A new questionnaire to assess pain beliefs of adolescents was designed. Further validation work may be needed to assess its suitability for use with other pain conditions. Statement of contribution What is already known on this subject? 
Versions of the adult Revised Illness Perception Questionnaire (IPQ-R) have been adapted for adolescents and children by changing item wording; however, research to assess the degree to which the underlying IPQ-R constructs are relevant to adolescents with a long-term condition had not been performed. What the present study adds? In adolescents, beliefs about symptoms of their condition are more salient than beliefs about the illness as a whole. Question response formats for children and young people need to take account of age-specific abilities. A new questionnaire has been designed for adolescents with pain. It is theoretically congruent with the CS-SRM. © 2017 The Authors. British Journal of Health Psychology published by John Wiley & Sons Ltd on behalf of British Psychological Society.

  14. Tier One Performance Screen Initial Operational Test and Evaluation: 2012 Interim Report

    DTIC Science & Technology

    2013-12-01

    are known to predict outcomes in work settings. Because the TAPAS uses item response theory (IRT) methods to construct and score items, it can be...Qualification Test (AFQT), to select new Soldiers. Although the AFQT is useful for selecting new Soldiers, other personal attributes are important to...to be and will continue to serve as a useful metric for selecting new Soldiers, other personal attributes, in particular non-cognitive attributes

  15. Psychometric properties of a revised version of the Assisting Hand Assessment (Kids-AHA 5.0).

    PubMed

    Holmefur, Marie M; Krumlinde-Sundholm, Lena

    2016-06-01

    The aim of this study was to scrutinize the Assisting Hand Assessment (AHA) version 4.4 for possible improvements and to evaluate the psychometric properties regarding internal scale validity and aspects of reliability of a revised version of the AHA. In collaboration with experts, scoring criteria were changed for four items, and one fully new item was constructed. Twenty-two original, one new, and four revised items were scored for 164 assessments of children with unilateral cerebral palsy aged 18 months to 12 years. Rasch measurement analysis was used to evaluate internal scale validity by exploring rating-scale functioning, item and person goodness-of-fit, and principal component analysis. Targeting and scale reliability were also evaluated. After removal of misfitting items, a 20-item scale showed satisfactory goodness-of-fit. Unidimensionality was confirmed by principal component analysis. The rating scale functioned well for the 20 items, and the item difficulty was well suited to the ability level of the sample. The person reliability coefficient was 0.98, indicating high separation ability of the scale. A conversion table of AHA scores between the previous version (4.4) and the new version (5.0) was constructed. The new, 20-item version of the Kids-AHA (version 5.0), demonstrated excellent internal scale validity, suggesting improved responsiveness to changes and shortened scoring time. For comparison of scores from version 4.4 to 5.0, a transformation table is presented. © 2015 Mac Keith Press.

  16. Adjusting for cross-cultural differences in computer-adaptive tests of quality of life.

    PubMed

    Gibbons, C J; Skevington, S M

    2018-04-01

    Previous studies using the WHOQOL measures have demonstrated that the relationship between individual items and the underlying quality of life (QoL) construct may differ between cultures. If unaccounted for, these differing relationships can lead to measurement bias which, in turn, can undermine the reliability of results. We used item response theory (IRT) to assess differential item functioning (DIF) in WHOQOL data from diverse language versions collected in UK, Zimbabwe, Russia, and India (total N = 1332). Data were fitted to the partial credit 'Rasch' model. We used four item banks previously derived from the WHOQOL-100 measure, which provided excellent measurement for physical, psychological, social, and environmental quality of life domains (40 items overall). Cross-cultural differential item functioning was assessed using analysis of variance for item residuals and post hoc Tukey tests. Simulated computer-adaptive tests (CATs) were conducted to assess the efficiency and precision of the four item banks. Splitting item parameters by DIF resulted in four linked item banks without DIF or other breaches of IRT model assumptions. Simulated CATs were more precise and efficient than longer paper-based alternatives. Assessing differential item functioning using item response theory can identify a lack of measurement invariance between cultures which, if uncontrolled, may undermine accurate comparisons in computer-adaptive testing assessments of QoL. We demonstrate how compensating for DIF using item anchoring allowed data from all four countries to be compared on a common metric, thus facilitating assessments which were both sensitive to cultural nuance and comparable between countries.

  17. The Impact of Non-attempted and Dually-Attempted Items on Person Abilities Using Item Response Theory

    PubMed Central

    Sideridis, Georgios D.; Tsaousis, Ioannis; Al Harbi, Khaleel

    2016-01-01

    The purpose of the present study was to relate response strategy with person ability estimates. Two behavioral strategies were examined: (a) the strategy to skip items in order to save time on timed tests, and (b) the strategy to select two responses on an item, with the hope that one of them may be considered correct. Participants were 4,422 individuals who were administered a standardized achievement measure related to math, biology, chemistry, and physics. In the present evaluation, only the physics subscale was employed. Two analyses were conducted: (a) a person-based one to identify differences between groups and potential correlates of those differences, and (b) a measure-based analysis in order to identify the parts of the measure that were responsible for potential group differentiation. For (a) person abilities the 2-PL model was employed and later the 3-PL and 4-PL models in order to estimate upper and lower asymptotes of person abilities. For (b) differential item functioning, differential test functioning, and differential distractor functioning were investigated. Results indicated that there were significant differences between groups with completers having the highest ability compared to both non-attempters and dual responders. There were no significant differences between non-attempters and dual responders. The present findings have implications for response strategy efficacy and measure evaluation, revision, and construction. PMID:27790174

  18. The Impact of Non-attempted and Dually-Attempted Items on Person Abilities Using Item Response Theory.

    PubMed

    Sideridis, Georgios D; Tsaousis, Ioannis; Al Harbi, Khaleel

    2016-01-01

    The purpose of the present study was to relate response strategy with person ability estimates. Two behavioral strategies were examined: (a) the strategy to skip items in order to save time on timed tests, and (b) the strategy to select two responses on an item, with the hope that one of them may be considered correct. Participants were 4,422 individuals who were administered a standardized achievement measure related to math, biology, chemistry, and physics. In the present evaluation, only the physics subscale was employed. Two analyses were conducted: (a) a person-based one to identify differences between groups and potential correlates of those differences, and (b) a measure-based analysis in order to identify the parts of the measure that were responsible for potential group differentiation. For (a) person abilities the 2-PL model was employed and later the 3-PL and 4-PL models in order to estimate upper and lower asymptotes of person abilities. For (b) differential item functioning, differential test functioning, and differential distractor functioning were investigated. Results indicated that there were significant differences between groups with completers having the highest ability compared to both non-attempters and dual responders. There were no significant differences between non-attempters and dual responders. The present findings have implications for response strategy efficacy and measure evaluation, revision, and construction.
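
    As a reading aid for the 2-PL, 3-PL, and 4-PL models named in the abstract above, the following is a minimal sketch of the standard four-parameter logistic item response function. The function name and parameter names (theta, a, b, c, d) are illustrative assumptions and are not taken from the study; setting c = 0 and d = 1 gives the 2-PL form, and d = 1 alone gives the 3-PL form.

```python
import math


def irf_4pl(theta, a, b, c=0.0, d=1.0):
    """Probability of a correct response under a 4-PL logistic model.

    theta: person ability (hypothetical input)
    a: discrimination, b: difficulty
    c: lower asymptote, d: upper asymptote
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


# With the default asymptotes this is the 2-PL model; at theta == b
# the probability sits halfway between the two asymptotes.
p_2pl = irf_4pl(theta=0.0, a=1.0, b=0.0)
p_3pl = irf_4pl(theta=0.0, a=1.0, b=0.0, c=0.2)
p_4pl = irf_4pl(theta=0.0, a=1.0, b=0.0, c=0.2, d=0.9)
```

    The sketch only evaluates the response curve; it does not estimate parameters from data, which the study would have done with dedicated IRT software.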

  19. Integrated Cr(VI) removal using constructed wetlands and composting.

    PubMed

    Sultana, Mar-Yam; Chowdhury, Abu Khayer Md Muktadirul Bari; Michailides, Michail K; Akratos, Christos S; Tekerlekopoulou, Athanasia G; Vayenas, Dimitrios V

    2015-01-08

    The present work was conducted to study integrated chromium removal from aqueous solutions in horizontal subsurface flow (HSF) constructed wetlands. Two pilot-scale HSF constructed wetland (CW) units were built and operated. One unit was planted with common reeds (Phragmites australis) and one was kept unplanted. Influent concentrations of Cr(VI) ranged from 0.5 to 10 mg/L. The effects of temperature and hydraulic residence time (HRT, 8-0.5 days) on Cr(VI) removal were studied. Temperature was shown to affect Cr(VI) removal in both units. In the planted unit, maximum Cr(VI) removal efficiencies of 100% were recorded at HRTs of 1 day with Cr(VI) concentrations of 5, 2.5 and 1 mg/L, while a significantly lower removal rate was recorded in the unplanted unit. Harvested reed biomass from the CWs was co-composted with olive mill wastes. The final product had excellent physicochemical characteristics (C/N: 14.1-14.7, germination index (GI): 145-157%, Cr: 8-10 mg/kg dry mass), fulfills EU requirements and can be used as a fertilizer in organic farming. Copyright © 2014 Elsevier B.V. All rights reserved.

  20. Construct Validation of the Self-Efficacy Teaching and Knowledge Instrument for Science Teachers-Revised (SETAKIST-R): Lessons Learned

    NASA Astrophysics Data System (ADS)

    Pruski, Linda A.; Blanco, Sharon L.; Riggs, Rosemary A.; Grimes, Kandi K.; Fordtran, Chase W.; Barbola, Gina M.; Cornell, John E.; Lichtenstein, Michael J.

    2013-11-01

    Described herein are the academic lineage and independent validation of the Self-Efficacy Teaching and Knowledge Instrument for Science Teachers-Revised (SETAKIST-R). Data from 334 K-12 science teachers were analyzed using Partial Credit Rasch models. Principal components analysis on the person-item residuals suggests two latent dimensions: Knowledge and Teaching Self-Efficacies. Item-fit statistics were used to select items for each subscale. Person and item separation (reliability) indices were quite low, and we noted disordered response patterns on the person-item maps that revealed problems with item content and/or scaling for both subscales. These issues include the presence of verbal negatives, ambiguous modifiers, counter-intuitive scaling, and an "undecided/uncertain" option. The SETAKIST-R, in its current form, cannot be recommended as a measure of science teacher self-efficacy.

  1. Novelty, Age, and IQ: A Theoretical Look at Human Preference for Novelty.

    ERIC Educational Resources Information Center

    Eaves, Ronald C.; Glen, Roderick

    1996-01-01

    A study of 86 children (ages 5-16) investigated the relationship between age, IQ, and preferences for novelty. Children with higher IQs spent significantly more time responding to novel items than lower-IQ children. Older children with high IQs showed the most interest in novel items. (CR)

  2. Development and validation of the coping with terror scale.

    PubMed

    Stein, Nathan R; Schorr, Yonit; Litz, Brett T; King, Lynda A; King, Daniel W; Solomon, Zahava; Horesh, Danny

    2013-10-01

    Terrorism creates lingering anxiety about future attacks. In prior terror research, the conceptualization and measurement of coping behaviors were constrained by the use of existing coping scales that index reactions to daily hassles and demands. The authors created and validated the Coping with Terror Scale to fill the measurement gap. The authors emphasized content validity, leveraging the knowledge of terror experts and groups of Israelis. A multistep approach involved construct definition and item generation, trimming and refining the measure, exploring the factor structure underlying item responses, and garnering evidence for reliability and validity. The final scale comprised six factors that were generally consistent with the authors' original construct specifications. Scores on items linked to these factors demonstrate good reliability and validity. Future studies using the Coping with Terror Scale with other populations facing terrorist threats are needed to test its ability to predict resilience, functional impairment, and psychological distress.

  3. Development and validation of a socioculturally competent trust in physician scale for a developing country setting.

    PubMed

    Gopichandran, Vijayaprasad; Wouters, Edwin; Chetlapalli, Satish Kumar

    2015-05-03

    Trust in physicians is the unwritten covenant between the patient and the physician that the physician will do what is in the best interest of the patient. This forms the undercurrent of all healthcare relationships. Several scales exist for assessment of trust in physicians in developed healthcare settings, but to our knowledge none of these have been developed in a developing country context. To develop and validate a new trust in physician scale for a developing country setting. Dimensions of trust in physicians, which were identified in a previous qualitative study in the same setting, were used to develop a scale. This scale was administered among 616 adults selected from urban and rural areas of Tamil Nadu, south India, using a multistage sampling cross sectional survey method. The individual items were analysed using a classical test approach as well as item response theory. Cronbach's α was calculated and the item to total correlation of each item was assessed. After testing for unidimensionality and absence of local dependence, a 2 parameter logistic Samejima's graded response model was fit and item characteristics assessed. Competence, assurance of treatment, respect for the physician and loyalty to the physician were important dimensions of trust. A total of 31 items were developed using these dimensions. Of these, 22 were selected for final analysis. The Cronbach's α was 0.928. The item to total correlations were acceptable for all the 22 items. The item response analysis revealed good item characteristic curves and item information for all the items. Based on the item parameters and item information, a final 12 item scale was developed. The scale performs optimally in the low to moderate trust range. The final 12 item trust in physician scale has a good construct validity and internal consistency. Published by the BMJ Publishing Group Limited.

  4. Development and validation of a socioculturally competent trust in physician scale for a developing country setting

    PubMed Central

    Gopichandran, Vijayaprasad; Wouters, Edwin; Chetlapalli, Satish Kumar

    2015-01-01

    Trust in physicians is the unwritten covenant between the patient and the physician that the physician will do what is in the best interest of the patient. This forms the undercurrent of all healthcare relationships. Several scales exist for assessment of trust in physicians in developed healthcare settings, but to our knowledge none of these have been developed in a developing country context. Objectives To develop and validate a new trust in physician scale for a developing country setting. Methods Dimensions of trust in physicians, which were identified in a previous qualitative study in the same setting, were used to develop a scale. This scale was administered among 616 adults selected from urban and rural areas of Tamil Nadu, south India, using a multistage sampling cross sectional survey method. The individual items were analysed using a classical test approach as well as item response theory. Cronbach's α was calculated and the item to total correlation of each item was assessed. After testing for unidimensionality and absence of local dependence, a 2 parameter logistic Samejima's graded response model was fit and item characteristics assessed. Results Competence, assurance of treatment, respect for the physician and loyalty to the physician were important dimensions of trust. A total of 31 items were developed using these dimensions. Of these, 22 were selected for final analysis. The Cronbach's α was 0.928. The item to total correlations were acceptable for all the 22 items. The item response analysis revealed good item characteristic curves and item information for all the items. Based on the item parameters and item information, a final 12 item scale was developed. The scale performs optimally in the low to moderate trust range. Conclusions The final 12 item trust in physician scale has a good construct validity and internal consistency. PMID:25941182
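
    The classical test statistics mentioned in the abstract above (Cronbach's α and the item to total correlation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: the function names are hypothetical, the input is assumed to be a complete respondents-by-items table of numeric scores, and sample variances (n - 1 denominator) are used.

```python
import math
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows = respondents, columns = items."""
    k = len(scores[0])
    items = list(zip(*scores))                 # one tuple of scores per item
    totals = [sum(row) for row in scores]      # total score per respondent
    item_var_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1.0 - item_var_sum / variance(totals))


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def item_total_correlations(scores):
    """Uncorrected correlation of each item with the total score."""
    totals = [sum(row) for row in scores]
    return [pearson(col, totals) for col in zip(*scores)]


# Hypothetical toy data: 3 respondents answering 2 items.
toy = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy)
r_it = item_total_correlations(toy)
```

    The item-total correlation here includes the item itself in the total; a corrected version would correlate each item with the sum of the remaining items instead.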

  5. Exploring the impact of disability on self-determination measurement.

    PubMed

    Mumbardó-Adam, Cristina; Guàrdia-Olmos, Joan; Giné, Climent

    2018-07-01

    Self-determination is a psychological construct that applies to both the general population and to individuals with disabilities, who can be self-determined with adequate accommodations and opportunities. As the relevance of self-determination-related skills in life has been recently acknowledged, researchers have created a measure to assess self-determination in adolescents and young adults with and without disabilities. The Self-Determination Inventory: Student Report (Spanish interim version) is being empirically validated in Spanish. As this scale is the first assessment addressed to all youth, further exploration of its psychometric properties is required to ensure the reliability of the self-determination measurement and gain further insight into the construct when applied to youth with and without disabilities. More than 600 participants were asked to complete the scale. The impact of disability on the item response distributions across the dimensions of self-determination was explored. Differential item functioning (DIF) was found in only 5 of the scale's 45 items. Differences primarily favored youth without disabilities. The weak presence of DIF across the items supports the instrument's psychometric robustness when measuring self-determination in youth with and without disabilities and provides further understanding of the self-determination construct. Implications and future research directions are also discussed. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. Development of an item bank for food parenting practices based on published instruments and reports from Canadian and US parents.

    PubMed

    O'Connor, Teresia M; Pham, Truc; Watts, Allison W; Tu, Andrew W; Hughes, Sheryl O; Beauchamp, Mark R; Baranowski, Tom; Mâsse, Louise C

    2016-08-01

    Research to understand how parents influence their children's dietary intake and eating behaviors has expanded in the past decades and a growing number of instruments are available to assess food parenting practices. Unfortunately, there is no consensus on how constructs should be defined or operationalized, making comparison of results across studies difficult. The aim of this study was to develop a food parenting practice item bank with items from published scales and supplement with parenting practices that parents report using. Items from published scales were identified from two published systematic reviews along with an additional systematic review conducted for this study. Parents (n = 135) with children 5-12 years old from the US and Canada, stratified to represent the demographic distribution of each country, were recruited to participate in an online semi-qualitative survey on food parenting. Published items and parent responses were coded using the same framework to reduce the number of items into representative concepts using a binning and winnowing process. The literature contributed 1392 items and parents contributed 1985 items, which were reduced to 262 different food parenting concepts (26% exclusive from literature, 12% exclusive from parents, and 62% represented in both). Food parenting practices related to 'Structure of Food Environment' and 'Behavioral and Educational' were emphasized more by parent responses, while practices related to 'Consistency of Feeding Environment' and 'Emotional Regulation' were more represented among published items. The resulting food parenting item bank should next be calibrated with item response modeling for scientists to use in the future. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. Improving Measurement Efficiency of the Inner EAR Scale with Item Response Theory.

    PubMed

    Jessen, Annika; Ho, Andrew D; Corrales, C Eduardo; Yueh, Bevan; Shin, Jennifer J

    2018-02-01

    Objectives (1) To assess the 11-item Inner Effectiveness of Auditory Rehabilitation (Inner EAR) instrument with item response theory (IRT). (2) To determine whether the underlying latent ability could also be accurately represented by a subset of the items for use in high-volume clinical scenarios. (3) To determine whether the Inner EAR instrument correlates with pure tone thresholds and word recognition scores. Design IRT evaluation of prospective cohort data. Setting Tertiary care academic ambulatory otolaryngology clinic. Subjects and Methods Modern psychometric methods, including factor analysis and IRT, were used to assess unidimensionality and item properties. Regression methods were used to assess prediction of word recognition and pure tone audiometry scores. Results The Inner EAR scale is unidimensional, and items varied in their location and information. Information parameter estimates ranged from 1.63 to 4.52, with higher values indicating more useful items. The IRT model provided a basis for identifying 2 sets of items with relatively lower information parameters. Item information functions demonstrated which items added insubstantial value over and above other items; these were removed in stages, creating an 8-item and a 3-item Inner EAR scale for more efficient assessment. The 8-item version accurately reflected the underlying construct. All versions correlated moderately with word recognition scores and pure tone averages. Conclusion The 11-, 8-, and 3-item versions of the Inner EAR scale have strong psychometric properties, and there is correlational validity evidence for the observed scores. Modern psychometric methods can help streamline care delivery by maximizing relevant information per item administered.

  8. Psychometric properties of the PROMIS Physical Function item bank in patients receiving physical therapy.

    PubMed

    Crins, Martine H P; van der Wees, Philip J; Klausch, Thomas; van Dulmen, Simone A; Roorda, Leo D; Terwee, Caroline B

    2018-01-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS) is a universally applicable set of instruments, including item banks, short forms and computer adaptive tests (CATs), measuring patient-reported health across different patient populations. PROMIS CATs are highly efficient and the use in practice is considered feasible with little administration time, offering standardized and routine patient monitoring. Before an item bank can be used as CAT, the psychometric properties of the item bank have to be examined. Therefore, the objective was to assess the psychometric properties of the Dutch-Flemish PROMIS Physical Function item bank (DF-PROMIS-PF) in Dutch patients receiving physical therapy. Cross-sectional study. 805 patients >18 years, who received any kind of physical therapy in primary care in the past year, completed the full DF-PROMIS-PF (121 items). Unidimensionality was examined by Confirmatory Factor Analysis and local dependence and monotonicity were evaluated. A Graded Response Model was fitted. Construct validity was examined with correlations between DF-PROMIS-PF T-scores and scores on two legacy instruments (SF-36 Health Survey Physical Functioning scale [SF36-PF10] and the Health Assessment Questionnaire Disability-Index [HAQ-DI]). Reliability (standard errors of theta) was assessed. The results for unidimensionality were mixed (scaled CFI = 0.924, TLI = 0.923, RMSEA = 0.045, first factor explained 61.5% of variance). Some local dependence was found (8.2% of item pairs). The item bank showed a broad coverage of the physical function construct (threshold-parameters range: -4.28 to 2.33) and good construct validity (correlation with SF36-PF10 = 0.84 and HAQ-DI = -0.85). Furthermore, the DF-PROMIS-PF showed greater reliability over a broader score-range than the SF36-PF10 and HAQ-DI. The psychometric properties of the DF-PROMIS-PF item bank are sufficient. 
The DF-PROMIS-PF can now be used as short forms or CAT to measure the level of physical function of physiotherapy patients.

  9. Construction of an efficient evaluative instrument for myasthenia gravis: the MG composite.

    PubMed

    Burns, Ted M; Conaway, Mark R; Cutter, Gary R; Sanders, Donald B

    2008-12-01

    We assessed the performance of items from the Quantitative Myasthenia Gravis (QMG), MMT (Manual Muscle Test), and MG-ADL (Myasthenia Gravis - Activities of Daily Living) scales, using data from two recently completed treatment trials of generalized MG. Items were selected that were relevant to manifestations of MG, meaningful to both the physician and the patient, and responsive to clinical change. After the 10 items were chosen, they were weighted based on input from MG experts from around the world, considering factors such as quality of life, disease severity, risk, prognosis, validity, and reliability. The MG Composite is easy to administer, takes less than 5 minutes to complete, and requires no equipment. Weighting of the response options of the 10 items should result in ordinal scores that better represent MG status and are more responsive to meaningful clinical change. To better determine its suitability for clinical use and for treatment trials, the MG Composite will be tested prospectively at several academic medical centers and will be used as a secondary measure of efficacy in pending clinical trials of MG.

  10. Development of the Oxford Participation and Activities Questionnaire: constructing an item pool

    PubMed Central

    Kelly, Laura; Jenkinson, Crispin; Dummett, Sarah; Dawson, Jill; Fitzpatrick, Ray; Morley, David

    2015-01-01

    Purpose The Oxford Participation and Activities Questionnaire is a patient-reported outcome measure in development that is grounded on the World Health Organization International Classification of Functioning, Disability, and Health (ICF). The study reported here aimed to inform and generate an item pool for the new measure, which is specifically designed for the assessment of participation and activity in patients experiencing a range of health conditions. Methods Items were informed through in-depth interviews conducted with 37 participants spanning a range of conditions. Interviews aimed to identify how their condition impacted their ability to participate in meaningful activities. Conditions included arthritis, cancer, chronic back pain, diabetes, motor neuron disease, multiple sclerosis, Parkinson’s disease, and spinal cord injury. Transcripts were analyzed using the framework method. Statements relating to ICF themes were recast as questionnaire items and shown for review to an expert panel. Cognitive debrief interviews (n=13) were used to assess items for face and content validity. Results ICF themes relevant to activities and participation in everyday life were explored, and a total of 222 items formed the initial item pool. This item pool was refined by the research team and 28 generic items were mapped onto all nine chapters of the ICF construct, detailing activity and participation. Cognitive interviewing confirmed the questionnaire instructions, items, and response options were acceptable to participants. Conclusion Using a clear conceptual basis to inform item generation, 28 items have been identified as suitable to undergo further psychometric testing. A large-scale postal survey will follow in order to refine the instrument further and to assess its psychometric properties. The final instrument is intended for use in clinical trials and interventions targeted at maintaining or improving activity and participation. PMID:26056503

  11. Development of the Oxford Participation and Activities Questionnaire: constructing an item pool.

    PubMed

    Kelly, Laura; Jenkinson, Crispin; Dummett, Sarah; Dawson, Jill; Fitzpatrick, Ray; Morley, David

    2015-01-01

    The Oxford Participation and Activities Questionnaire is a patient-reported outcome measure in development that is grounded on the World Health Organization International Classification of Functioning, Disability, and Health (ICF). The study reported here aimed to inform and generate an item pool for the new measure, which is specifically designed for the assessment of participation and activity in patients experiencing a range of health conditions. Items were informed through in-depth interviews conducted with 37 participants spanning a range of conditions. Interviews aimed to identify how their condition impacted their ability to participate in meaningful activities. Conditions included arthritis, cancer, chronic back pain, diabetes, motor neuron disease, multiple sclerosis, Parkinson's disease, and spinal cord injury. Transcripts were analyzed using the framework method. Statements relating to ICF themes were recast as questionnaire items and shown for review to an expert panel. Cognitive debrief interviews (n=13) were used to assess items for face and content validity. ICF themes relevant to activities and participation in everyday life were explored, and a total of 222 items formed the initial item pool. This item pool was refined by the research team and 28 generic items were mapped onto all nine chapters of the ICF construct, detailing activity and participation. Cognitive interviewing confirmed the questionnaire instructions, items, and response options were acceptable to participants. Using a clear conceptual basis to inform item generation, 28 items have been identified as suitable to undergo further psychometric testing. A large-scale postal survey will follow in order to refine the instrument further and to assess its psychometric properties. The final instrument is intended for use in clinical trials and interventions targeted at maintaining or improving activity and participation.

  12. An Evaluation of the Kernel Equating Method: A Special Study with Pseudotests Constructed from Real Test Data. Research Report. ETS RR-06-02

    ERIC Educational Resources Information Center

    von Davier, Alina A.; Holland, Paul W.; Livingston, Samuel A.; Casabianca, Jodi; Grant, Mary C.; Martin, Kathleen

    2006-01-01

    This study examines how closely the kernel equating (KE) method (von Davier, Holland, & Thayer, 2004a) approximates the results of other observed-score equating methods--equipercentile and linear equatings. The study used pseudotests constructed of item responses from a real test to simulate three equating designs: an equivalent groups (EG)…

  13. Item-level psychometrics and predictors of performance for Spanish/English bilingual speakers on an object and action naming battery.

    PubMed

    Edmonds, Lisa A; Donovan, Neila J

    2012-04-01

    There is a pressing need for psychometrically sound naming materials for Spanish/English bilingual adults. To address this need, in this study the authors examined the psychometric properties of An Object and Action Naming Battery (An O&A Battery; Druks & Masterson, 2000) in bilingual speakers. Ninety-one Spanish/English bilinguals named O&A Battery items in English and Spanish. Responses underwent a Rasch analysis. Using correlation and regression analyses, the authors evaluated the effect of psycholinguistic (e.g., imageability) and participant (e.g., proficiency ratings) variables on accuracy. Rasch analysis determined unidimensionality across English and Spanish nouns and verbs and robust item-level psychometric properties, evidence for content validity. Few items did not fit the model, there were no ceiling or floor effects after uninformative and misfit items were removed, and items reflected a range of difficulty. Reliability coefficients were high, and the number of statistically different ability levels provided indices of sensitivity. Regression analyses revealed significant correlations between psycholinguistic variables and accuracy, providing preliminary construct validity. The participant variables that contributed most to accuracy were proficiency ratings and time of language use. Results suggest adequate content and construct validity of O&A items retained in the analysis for Spanish/English bilingual adults and support future efforts to evaluate naming in older bilinguals and persons with bilingual aphasia.

  14. Development of a scale for attitude toward condom use for migrant workers in India.

    PubMed

    Talukdar, Arunansu; Bal, Runa; Sanyal, Debasis; Roy, Krishnendu; Talukdar, Payel Sengupta

    2008-02-01

    The promotion of condom use remains one of the mainstays of prevention of human immunodeficiency virus (HIV) transmission. In spite of the proven efficacy of condoms, some moral, social and psychological obstacles are still prevalent, hindering their use. The study tried to construct a short condom-attitude scale for use among migrant workers, a major bridge population in India. The study was conducted among male migrant workers who were 18-49 years old, sexually active, had heard about condoms and were engaged in nonformal jobs. We recruited 234 and 280 candidates for Phase 1 and Phase 2 respectively. Ten items from the original 40-item Brown's ATC (attitude towards condom) scale were selected in Phase 1. After analysis of Phase 1 results using principal component analysis, six items were found appropriate for measuring attitude towards condom use. These six items were then administered to another group in Phase 2. Utilizing Pearson's correlations, scale items were examined in terms of their mean response scores and the correlation matrix between items. Cronbach's alpha and construct validity were also assessed for the entire sample. Study subjects were categorized as condom users and nonusers. The scale structure was explored by analyzing response scores with respect to the items, using principal component analysis followed by varimax rotation. Principal component analysis revealed that the first factor accounted for 71% of the variance, with an eigenvalue greater than one. The eigenvalue of the second factor was less than one. Application of the scree test suggested that only one factor was dominant. The mean score on the six items was 20.45 among condom users and 16.67 among nonusers, a statistically significant difference (P<0.01). Cronbach's alpha coefficient was 0.92. This tailor-made attitude-toward-condom-use scale, targeted at the most vulnerable people in India, can be included in any rapid survey for assessing existing beliefs and attitudes toward condoms and also for evaluating the efficacy of an intervention program.
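
    As an illustration of the internal-consistency statistic reported in the record above, the following is a minimal sketch of Cronbach's alpha computed from the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy responses are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 6 items (illustrative only).
toy_scores = [
    [4, 4, 3, 4, 4, 3],
    [2, 3, 2, 2, 3, 2],
    [3, 3, 3, 4, 3, 3],
    [1, 2, 1, 1, 2, 1],
    [4, 3, 4, 4, 4, 4],
]
print(round(cronbach_alpha(toy_scores), 2))
```

    Items that move together across respondents push the ratio of summed item variances to total-score variance down, and alpha toward 1.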

  15. Scale Refinement and Initial Evaluation of a Behavioral Health Function Measurement Tool for Work Disability Evaluation

    PubMed Central

    Marfeo, Elizabeth E.; Ni, Pengsheng; Bogusz, Kara; Meterko, Mark; McDonough, Christine M.; Chan, Leighton; Rasch, Elizabeth K.; Brandt, Diane E.; Jette, Alan M.

    2014-01-01

    Objectives: To use item response theory (IRT) data simulations to construct and perform initial psychometric testing of a newly developed instrument, the Social Security Administration Behavioral Health Function (SSA-BH) instrument, that aims to assess behavioral health functioning relevant to the context of work. Design: Cross-sectional survey followed by item response theory (IRT) calibration data simulations. Setting: Community. Participants: A sample of individuals applying for SSA disability benefits, claimants (N=1015), and a normative comparative sample of US adults (N=1000). Interventions: None. Main Outcome Measure: Social Security Administration Behavioral Health Function (SSA-BH) measurement instrument. Results: Item response theory analyses supported the unidimensionality of four SSA-BH scales: Mood and Emotions (35 items), Self-Efficacy (23 items), Social Interactions (6 items), and Behavioral Control (15 items). All SSA-BH scales demonstrated strong psychometric properties including reliability, accuracy, and breadth of coverage. High correlations of the simulated 5- or 10-item CATs with the full item bank indicated robust ability of the CAT approach to comprehensively characterize behavioral health function along four distinct dimensions. Conclusions: Initial testing and evaluation of the SSA-BH instrument demonstrated good accuracy, reliability, and content coverage along all four scales. Behavioral function profiles of SSA claimants were generated and compared to age and sex matched norms along four scales: Mood and Emotions, Behavioral Control, Social Interactions, and Self-Efficacy. Utilizing the CAT based approach offers the ability to collect standardized, comprehensive functional information about claimants in an efficient way, which may prove useful in the context of the SSA’s work disability programs. PMID:23542404

  16. Research applications for an Object and Action Naming Battery to assess naming skills in adult Spanish-English bilingual speakers.

    PubMed

    Edmonds, Lisa A; Donovan, Neila J

    2014-06-01

    Virtually no valid materials are available to evaluate confrontation naming in Spanish-English bilingual adults in the U.S. In a recent study, a large group of young Spanish-English bilingual adults were evaluated on An Object and Action Naming Battery (Edmonds & Donovan in Journal of Speech, Language, and Hearing Research 55:359-381, 2012). Rasch analyses of the responses resulted in evidence for the content and construct validity of the retained items. However, the scope of that study did not allow for extensive examination of individual item characteristics, group analyses of participants, or the provision of testing and scoring materials or raw data, thereby limiting the ability of researchers to administer the test to Spanish-English bilinguals and to score the items with confidence. In this study, we present the in-depth information described above on the basis of further analyses, including (1) online searchable spreadsheets with extensive empirical (e.g., accuracy and name agreeability) and psycholinguistic item statistics; (2) answer sheets and instructions for scoring and interpreting the responses to the Rasch items; (3) tables of alternative correct responses for English and Spanish; (4) ability strata determined for all naming conditions (English and Spanish nouns and verbs); and (5) comparisons of accuracy across proficiency groups (i.e., Spanish dominant, English dominant, and balanced). These data indicate that the Rasch items from An Object and Action Naming Battery are valid and sensitive for the evaluation of naming in young Spanish-English bilingual adults. Additional information based on participant responses for all of the items on the battery can provide researchers with valuable information to aid in stimulus development and response interpretation for experimental studies in this population.

  17. Measurement versus prediction in the construction of patient-reported outcome questionnaires: can we have our cake and eat it?

    PubMed

    Smits, Niels; van der Ark, L Andries; Conijn, Judith M

    2017-11-02

    Two important goals when using questionnaires are (a) measurement: the questionnaire is constructed to assign numerical values that accurately represent the test taker's attribute, and (b) prediction: the questionnaire is constructed to give an accurate forecast of an external criterion. Construction methods aimed at measurement prescribe that items should be reliable. In practice, this leads to questionnaires with high inter-item correlations. By contrast, construction methods aimed at prediction typically prescribe that items have a high correlation with the criterion and low inter-item correlations. The latter approach has often been said to produce a paradox concerning the relation between reliability and validity [1-3], because it is often assumed that good measurement is a prerequisite of good prediction. To answer four questions: (1) Why are measurement-based methods suboptimal for questionnaires that are used for prediction? (2) How should one construct a questionnaire that is used for prediction? (3) Do questionnaire-construction methods that optimize measurement and prediction lead to the selection of different items in the questionnaire? (4) Is it possible to construct a questionnaire that can be used for both measurement and prediction? An empirical data set consisting of scores of 242 respondents on questionnaire items measuring mental health is used to select items by means of two methods: a method that optimizes the predictive value of the scale (i.e., forecast a clinical diagnosis), and a method that optimizes the reliability of the scale. We show that for the two scales different sets of items are selected and that a scale constructed to meet the one goal does not show optimal performance with reference to the other goal. The answers are as follows: (1) Because measurement-based methods tend to maximize inter-item correlations by which predictive validity reduces. 
    (2) Through selecting items that correlate highly with the criterion and lowly with the remaining items. (3) Yes, these methods may lead to different item selections. (4) For a single questionnaire: Yes, but it is problematic because reliability cannot be estimated accurately. For a test battery: Yes, but it is very costly. Implications for the construction of patient-reported outcome questionnaires are discussed.

  18. The Generic Short Patient Experiences Questionnaire (GS-PEQ): identification of core items from a survey in Norway

    PubMed Central

    2011-01-01

    Background: Questionnaires are commonly used to collect patient, or user, experiences with health care encounters; however, their adaption to specific target groups limits comparison between groups. We present the construction of a generic questionnaire (maximum of ten questions) for user evaluation across a range of health care services. Methods: Based on previous testing of six group-specific questionnaires, we first constructed a generic questionnaire with 23 items related to user experiences. All questions included a "not applicable" response option, as well as a follow-up question about the item's importance. Nine user groups from one health trust were surveyed. Seven groups received questionnaires by mail and two by personal distribution. Selection of core questions was based on three criteria: applicability (proportion "not applicable"), importance (mean scores on follow-up questions), and comprehensiveness (content coverage, maximum two items per dimension). Results: 1324 questionnaires were returned, providing subsample sizes ranging from 52 to 323. Ten questions were excluded because the proportion of "not applicable" responses exceeded 20% in at least one user group. The number of remaining items was reduced to ten by applying the two other criteria. The final short questionnaire included items on outcome (2), clinician services (2), user involvement (2), incorrect treatment (1), information (1), organisation (1), and accessibility (1). Conclusion: The Generic Short Patient Experiences Questionnaire (GS-PEQ) is a short, generic set of questions on user experiences with specialist health care that covers important topics for a range of groups. It can be used alone or with other instruments in quality assessment or in research. The psychometric properties and the relevance of the GS-PEQ in other health care settings and countries need further evaluation. PMID:21510871

  19. The feeding practices and structure questionnaire: construction and initial validation in a sample of Australian first-time mothers and their 2-year olds

    PubMed Central

    2014-01-01

    Background: Early feeding practices lay the foundation for children’s eating habits and weight gain. Questionnaires are available to assess parental feeding but overlapping and inconsistent items, subscales and terminology limit conceptual clarity and between-study comparisons. Our aim was to consolidate a range of existing items into a parsimonious and conceptually robust questionnaire for assessing feeding practices with very young children (<3 years). Methods: Data were from 462 mothers and children (age 21–27 months) from the NOURISH trial. Items from five questionnaires and two study-specific items were submitted to a priori item selection, allocation and verification, before theoretically-derived factors were tested using Confirmatory Factor Analysis. Construct validity of the new factors was examined by correlating these with child eating behaviours and weight. Results: Following expert review 10 factors were specified. Of these, 9 factors (40 items) showed acceptable model fit and internal reliability (Cronbach’s α: 0.61-0.89). Four factors reflected non-responsive feeding practices: ‘Distrust in Appetite’, ‘Reward for Behaviour’, ‘Reward for Eating’, and ‘Persuasive Feeding’. Five factors reflected structure of the meal environment and limits: ‘Structured Meal Setting’, ‘Structured Meal Timing’, ‘Family Meal Setting’, ‘Overt Restriction’ and ‘Covert Restriction’. Feeding practices generally showed the expected pattern of associations with child eating behaviours but none with weight. Conclusion: The Feeding Practices and Structure Questionnaire (FPSQ) provides a new reliable and valid measure of parental feeding practices, specifically maternal responsiveness to children’s hunger/satiety signals facilitated by routine and structure in feeding. Further validation in more diverse samples is required. PMID:24898364

  20. Development of a Symptom-Focused Patient-Reported Outcome Measure for Functional Dyspepsia: The Functional Dyspepsia Symptom Diary (FDSD)

    PubMed Central

    Taylor, Fiona; Higgins, Sophie; Carson, Robyn T; Eremenco, Sonya; Foley, Catherine; Lacy, Brian E; Parkman, Henry P; Reasner, David S; Shields, Alan L; Tack, Jan; Talley, Nicholas J

    2018-01-01

    Objectives: The Functional Dyspepsia Symptom Diary (FDSD) was developed to address the lack of symptom-focused, patient-reported outcome (PRO) measures designed for use in functional dyspepsia (FD) patients and meeting Food and Drug Administration recommendations for PRO instrument development. Methods: Concept elicitation interviews were conducted with FD participants to identify symptoms important and relevant to FD patients. A preliminary version of the FDSD was constructed, then completed by FD participants on an electronic device in cognitive interviews to evaluate the readability, comprehensibility, relevance, and comprehensiveness of the FDSD, and to preliminarily evaluate its measurement properties. Results: During concept elicitation interviews, 45 participants spontaneously reported 19 symptom concepts. Of those, seven symptoms were selected for assessment by the eight-item FDSD. Cognitive interviews with 57 participants confirmed that participants were able to comprehend and provide meaningful responses to the FDSD, and that the handheld electronic FDSD format was suitable for use in the target population. Scores of the FDSD were well-distributed among response options, item discrimination indices suggested that the FDSD items differentiate among patients with varying degrees of FD severity, and inter-item correlations suggested that no items of the FDSD were capturing redundant information. Internal consistency estimates (0.87) and construct-related validity estimates using known-groups methods were within acceptable ranges. Conclusions: The FDSD is a content-valid PRO measure, with preliminary psychometric evidence providing support for the FDSD’s items and total score. Further psychometric evaluations are recommended to more fully test the FDSD’s score performance and other measurement properties in the target patient population. PMID:28925989

  1. The Nursing Home Physical Performance Test: A Secondary Data Analysis of Women in Long-Term Care Using Item Response Theory.

    PubMed

    Perera, Subashan; Nace, David A; Resnick, Neil M; Greenspan, Susan L

    2017-04-11

    The Nursing Home Physical Performance Test (NHPPT) was developed to measure function among nursing home residents using sit-to-stand, scooping applesauce, face washing, dialing phone, putting on sweater, and ambulating tasks. Using item response theory, we explore its measurement characteristics at item level and opportunities for improvements. We used data from long-term care women. We fitted a graded response model, estimated parameters, and constructed probability and information curves. We identified items to be targeted toward lower and higher functioning persons to increase the range of abilities to which the instrument is applicable. We revised the scoring by making sit-to-stand and sweater items harder and dialing phone easier. We examined changes to concurrent validity with activities of daily living (ADL), frailty, and cognitive function. Participants were 86 years old, had more than three comorbidities, and a NHPPT of 19.4. All items had high discrimination and were targeted toward the lower middle range of performance continuum. After revision, sit-to-stand and sweater items demonstrated greater discrimination among the higher functioning and/or greater spread of thresholds for response categories. The overall test showed discrimination over a wider range of individuals. Concurrent validity correlation improved from 0.60 to 0.68 for instrumental ADL and explained variability (R2) from 22% to 36% for frailty. NHPPT has good measurement characteristics at the item level. NHPPT can be improved, implemented in computerized adaptive testing, and combined with self-report for greater utility, but a definitive study is needed. © The Author 2017. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
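
    The record above mentions fitting a graded response model and constructing probability curves. The following is a minimal sketch of how category probabilities are obtained under Samejima's graded response model, assuming the usual logistic form P(X >= k) = 1 / (1 + exp(-a(theta - b_k))). The function name and the parameter values are hypothetical, not estimates from the study.

```python
import math


def grm_category_probs(theta, discrimination, thresholds):
    """Category probabilities under a logistic graded response model.

    thresholds must be sorted in ascending order; the result has
    len(thresholds) + 1 entries, one per ordered response category.
    """
    # Cumulative probabilities P(X >= k), padded with 1 (lowest) and 0 (beyond highest).
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (theta - b))))
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical item: discrimination 1.0, thresholds at -1, 0 and 1 (illustrative only).
for theta in (-2.0, 0.0, 2.0):
    print(theta, [round(p, 3) for p in grm_category_probs(theta, 1.0, [-1.0, 0.0, 1.0])])
```

    Evaluating the function over a grid of theta values gives the category probability curves; shifting a threshold upward makes the corresponding category harder to reach.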

  2. Developing an Adapted Cardiac Rehabilitation Training for Home Care Clinicians

    PubMed Central

    Russell, David; Mola, Ana; Bowles, Kathryn H.; Lipman, Terri H.

    2017-01-01

    Purpose: There is limited evidence that home care clinicians receive education on the core competencies of cardiac rehabilitation (CR). This article describes the development and implementation of a CR training program adapted for home care clinicians, which incorporated the viewpoints of homebound patients with cardiovascular disease. Methods: Literature and guideline reviews were performed to glean curriculum content, supplemented with themes identified among patients and clinicians. Semistructured interviews were conducted with homebound patients regarding their perspectives on living with cardiovascular disease and focus groups were held with home care clinicians regarding their perspectives on caring for these patients. Transcripts were analyzed with the constant comparative method. A 15-item questionnaire was administered to home care nurses and rehabilitation therapists pre- and posttraining, and responses were analyzed using a paired sample t test. Results: Three themes emerged among patients: (1) awareness of heart disease; (2) motivation and caregivers' importance; and (3) barriers to attendance at outpatient CR; and 2 additional themes among clinicians: (4) gaps in care transitions; and (5) educational needs. Questionnaire results demonstrated significantly increased knowledge posttraining compared with pretraining among home care clinicians (pretest mean = 12.81; posttest mean = 14.63, P < .001). There was no significant difference between scores for nurses and rehabilitation therapists. Conclusions: Home care clinicians respond well to an adapted CR training to improve care for homebound patients with cardiovascular disease. Clinicians who participated in the training demonstrated an increase in their knowledge and skills of the core competencies for CR. PMID:28033165
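
    The pre/post comparison in the record above relies on a paired sample t test. A minimal sketch of the test statistic follows: the mean of the within-person differences divided by its standard error, with n - 1 degrees of freedom. The function name and the scores are hypothetical and do not reproduce the study data.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired pre/post scores."""
    # Work on within-person differences so each participant is their own control.
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical questionnaire scores for four clinicians (illustrative only).
pre_scores = [10, 12, 14, 11]
post_scores = [12, 15, 15, 14]
t_stat, df = paired_t(pre_scores, post_scores)
print(round(t_stat, 2), df)
```

    The P value is then read from a t distribution with the returned degrees of freedom.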

  3. Expectations for Visual Function: An Initial Evaluation of a New Clinical Instrument.

    ERIC Educational Resources Information Center

    Corn, Anne L.; Webne, Steve L.

    2001-01-01

    A study explored the internal consistency of items in a visual screening instrument developed by Project PAVE: Expectations for Visual Functioning (EVF). The test includes 20 items that evaluate a child's functional use of vision. A pilot test involving 129 teachers indicates the EVF is internally consistent. (Contains three references.) (CR)

  4. 78 FR 63279 - Public Notice for Waiver of Aeronautical Land-Use Assurance

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-10-23

    ... Commissioners of Orange County for the construction of County Road CR 300 South/Airport Road to facilitate... property for a nominal sum of One Dollar and zero cents ($1.00) for the construction of County Road CR 300 South/Airport Road. Construction of the road will facilitate access to the airport. The aforementioned...

  5. Next Generation Flow for highly sensitive and standardized detection of minimal residual disease in multiple myeloma.

    PubMed

    Flores-Montero, J; Sanoja-Flores, L; Paiva, B; Puig, N; García-Sánchez, O; Böttcher, S; van der Velden, V H J; Pérez-Morán, J-J; Vidriales, M-B; García-Sanz, R; Jimenez, C; González, M; Martínez-López, J; Corral-Mateos, A; Grigore, G-E; Fluxá, R; Pontes, R; Caetano, J; Sedek, L; Del Cañizo, M-C; Bladé, J; Lahuerta, J-J; Aguilar, C; Bárez, A; García-Mateo, A; Labrador, J; Leoz, P; Aguilera-Sanz, C; San-Miguel, J; Mateos, M-V; Durie, B; van Dongen, J J M; Orfao, A

    2017-10-01

    Flow cytometry has become a highly valuable method to monitor minimal residual disease (MRD) and evaluate the depth of complete response (CR) in bone marrow (BM) of multiple myeloma (MM) after therapy. However, current flow-MRD has lower sensitivity than molecular methods and lacks standardization. Here we report on a novel next generation flow (NGF) approach for highly sensitive and standardized MRD detection in MM. An optimized 2-tube 8-color antibody panel was constructed in five cycles of design-evaluation-redesign. In addition, a bulk-lysis procedure was established for acquisition of ≥10^7 cells/sample, and novel software tools were constructed for automatic plasma cell gating. Multicenter evaluation of 110 follow-up BM from MM patients in very good partial response (VGPR) or CR showed a higher sensitivity for NGF-MRD vs conventional 8-color flow-MRD (MRD-positive rate of 47 vs 34%; P=0.003). Thus, 25% of patients classified as MRD-negative by conventional 8-color flow were MRD-positive by NGF, translating into a significantly longer progression-free survival for MRD-negative vs MRD-positive CR patients by NGF (75% progression-free survival not reached vs 7 months; P=0.02). This study establishes EuroFlow-based NGF as a highly sensitive, fully standardized approach for MRD detection in MM which overcomes the major limitations of conventional flow-MRD methods and is ready for implementation in routine diagnostics.

  6. Reliability and Validity of the Alcohol Short Index of Problems and a Newly Constructed Drug Short Index of Problems*

    PubMed Central

    Alterman, Arthur I.; Cacciola, John S.; Ivey, Megan A.; Lynch, Kevin G.

    2009-01-01

    Objective: This study evaluated the psychometric properties of the 15-item alcohol Short Index of Problems (SIP) instrument and those of a newly constructed 15-item drug Short Index of Problems (SIP-D) instrument in 277 newly entered substance-abuse patients. Method: The SIP is derived from the longer, 50-item Drinker Inventory of Consequences (DrInC), which was designed to assess adverse consequences of alcohol use. The SIP-D was constructed by substituting the term “drug use” for the term “drinking” in each SIP item. A 3-month recall interval was employed. Results: Factor analyses of each of the instruments revealed similar solutions, with only one main factor accounting for the majority of variance. Nonparametric item response theory methods produced the same finding. Internal consistency reliability estimates for the SIP and SIP-D total scores were .98 and .97, respectively. Concurrent validity was demonstrated by examining the correlations of the total scores for each of the instruments with the recent summary indexes of the newly revised Addiction Severity Index (ASI-Version 6): alcohol, drug, medical, economic, legal, family/social, and psychiatric problems. Conclusions: This study is the first to confirm the psychometric validity of the SIP when used as an independent instrument unembedded within the DrInC. The study also supports the use of the SIP-D as a brief measure of adverse consequences of drug use. The findings strongly support the unidimensional structure of both measures. PMID:19261243

  7. Reliability and validity of the alcohol short index of problems and a newly constructed drug short index of problems.

    PubMed

    Alterman, Arthur I; Cacciola, John S; Ivey, Megan A; Habing, Brian; Lynch, Kevin G

    2009-03-01

    This study evaluated the psychometric properties of the 15-item alcohol Short Index of Problems (SIP) instrument and those of a newly constructed 15-item drug Short Index of Problems (SIP-D) instrument in 277 newly entered substance-abuse patients. The SIP is derived from the longer, 50-item Drinker Inventory of Consequences (DrInC), which was designed to assess adverse consequences of alcohol use. The SIP-D was constructed by substituting the term "drug use" for the term "drinking" in each SIP item. A 3-month recall interval was employed. Factor analyses of each of the instruments revealed similar solutions, with only one main factor accounting for the majority of variance. Nonparametric item response theory methods produced the same finding. Internal consistency reliability estimates for the SIP and SIP-D total scores were .98 and .97, respectively. Concurrent validity was demonstrated by examining the correlations of the total scores for each of the instruments with the recent summary indexes of the newly revised Addiction Severity Index (ASI-Version 6): alcohol, drug, medical, economic, legal, family/social, and psychiatric problems. This study is the first to confirm the psychometric validity of the SIP when used as an independent instrument unembedded within the DrInC. The study also supports the use of the SIP-D as a brief measure of adverse consequences of drug use. The findings strongly support the unidimensional structure of both measures.

  8. Measurement Invariance and the Five-Factor Model of Personality: Asian International and Euro American Cultural Groups.

    PubMed

    Rollock, David; Lui, P Priscilla

    2016-10-01

    This study examined measurement invariance of the NEO Five-Factor Inventory (NEO-FFI), assessing the five-factor model (FFM) of personality among Euro American (N = 290) and Asian international (N = 301) students (47.8% women, Mage = 19.69 years). The full 60-item NEO-FFI data fit the expected five-factor structure for both groups using exploratory structural equation modeling, and achieved configural invariance. Only 37 items significantly loaded onto the FFM-theorized factors for both groups and demonstrated metric invariance. Threshold invariance was not supported with this reduced item set. Groups differed the most in the item-factor relationships for Extraversion and Agreeableness, as well as in response styles. Asian internationals were more likely to use midpoint responses than Euro Americans. While the FFM can characterize broad nomothetic patterns of personality traits, metric invariance with only the subset of NEO-FFI items identified limits direct group comparisons of correlation coefficients among personality domains and with other constructs, and of mean differences on personality domains. © The Author(s) 2015.

  9. What do Demand-Control and Effort-Reward work stress questionnaires really measure? A discriminant content validity study of relevance and representativeness of measures.

    PubMed

    Bell, Cheryl; Johnston, Derek; Allan, Julia; Pollard, Beth; Johnston, Marie

    2017-05-01

    The Demand-Control (DC) and Effort-Reward Imbalance (ERI) models predict health in a work context. Self-report measures of the four key constructs (demand, control, effort, and reward) have been developed and it is important that these measures have good content validity uncontaminated by content from other constructs. We assessed relevance (whether items reflect the constructs) and representativeness (whether all aspects of the construct are assessed, and all items contribute to that assessment) across the instruments and items. Two studies examined fourteen demand/control items from the Job Content Questionnaire and seventeen effort/reward items from the Effort-Reward Imbalance measure using discriminant content validation and a third study developed new methods to assess instrument representativeness. Both methods use judges' ratings and construct definitions to get transparent quantitative estimates of construct validity. Study 1 used dictionary definitions while studies 2 and 3 used published phrases to define constructs. Overall, 3/5 demand items, 4/9 control items, 1/6 effort items, and 7/11 reward items were uniquely classified to the appropriate theoretical construct and were therefore 'pure' items with discriminant content validity (DCV). All pure items measured a defining phrase. However, both the DC and ERI assessment instruments failed to assess all defining aspects. Finding good discriminant content validity for demand and reward measures means these measures are usable and our quantitative results can guide item selection. By contrast, effort and control measures had limitations (in relevance and representativeness) presenting a challenge to the implementation of the theories. Statement of contribution What is already known on this subject? 
    While the reliability and construct validity of Demand-Control and Effort-Reward-Imbalance (DC and ERI) work stress measures are routinely reported, there has not been adequate investigation of their content validity. This paper investigates their content validity in terms of both relevance and representativeness and provides a model for the investigation of content validity of measures in health psychology more generally. What does this study add? A new application of an existing method, discriminant content validity, and a new method of assessing instrument representativeness. 'Pure' DC and ERI items are identified, as are constructs that are not fully represented by their assessment instruments. The findings are important for studies attempting to distinguish between the main DC and ERI work stress constructs. The quantitative results can be used to guide item selection for future studies. © 2017 The British Psychological Society.

  10. Mokken scale analysis of mental health and well-being questionnaire item responses: a non-parametric IRT method in empirical research for applied health researchers

    PubMed Central

    2012-01-01

    Background: Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items. The stochastic cumulative scaling model offered by this approach is ideally suited when the intention is to score an underlying latent trait by simple addition of the item response values. In our experience, the Mokken model appears to be less well-known than for example the (related) Rasch model, but is seeing increasing use in contemporary clinical research and public health. Mokken's method is a generalisation of Guttman scaling that can assist in the determination of the dimensionality of tests or scales, and enables consideration of reliability, without reliance on Cronbach's alpha. This paper provides a practical guide to the application and interpretation of this non-parametric item response theory method in empirical research with health and well-being questionnaires. Methods: Scalability of data from 1) a cross-sectional health survey (the Scottish Health Education Population Survey) and 2) a general population birth cohort study (the National Child Development Study) illustrate the method and modeling steps for dichotomous and polytomous items respectively. The questionnaire data analyzed comprise responses to the 12 item General Health Questionnaire, under the binary recoding recommended for screening applications, and the ordinal/polytomous responses to the Warwick-Edinburgh Mental Well-being Scale. Results and conclusions: After an initial analysis example in which we select items by phrasing (six positive versus six negatively worded items) we show that all items from the 12-item General Health Questionnaire (GHQ-12) – when binary scored – were scalable according to the double monotonicity model, in two short scales comprising six items each (Bech’s “well-being” and “distress” clinical scales). An illustration of ordinal item analysis confirmed that all 14 positively worded items of the Warwick-Edinburgh Mental Well-being Scale (WEMWBS) met criteria for the monotone homogeneity model but four items violated double monotonicity with respect to a single underlying dimension. Software availability and commands used to specify unidimensionality and reliability analysis and graphical displays for diagnosing monotone homogeneity and double monotonicity are discussed, with an emphasis on current implementations in freeware. PMID:22686586

  11. Mokken scale analysis of mental health and well-being questionnaire item responses: a non-parametric IRT method in empirical research for applied health researchers.

    PubMed

    Stochl, Jan; Jones, Peter B; Croudace, Tim J

    2012-06-11

    Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items. The stochastic cumulative scaling model offered by this approach is ideally suited when the intention is to score an underlying latent trait by simple addition of the item response values. In our experience, the Mokken model appears to be less well-known than for example the (related) Rasch model, but is seeing increasing use in contemporary clinical research and public health. Mokken's method is a generalisation of Guttman scaling that can assist in the determination of the dimensionality of tests or scales, and enables consideration of reliability, without reliance on Cronbach's alpha. This paper provides a practical guide to the application and interpretation of this non-parametric item response theory method in empirical research with health and well-being questionnaires. Scalability of data from 1) a cross-sectional health survey (the Scottish Health Education Population Survey) and 2) a general population birth cohort study (the National Child Development Study) illustrate the method and modeling steps for dichotomous and polytomous items respectively. The questionnaire data analyzed comprise responses to the 12 item General Health Questionnaire, under the binary recoding recommended for screening applications, and the ordinal/polytomous responses to the Warwick-Edinburgh Mental Well-being Scale. After an initial analysis example in which we select items by phrasing (six positive versus six negatively worded items) we show that all items from the 12-item General Health Questionnaire (GHQ-12)--when binary scored--were scalable according to the double monotonicity model, in two short scales comprising six items each (Bech's "well-being" and "distress" clinical scales). 
An illustration of ordinal item analysis confirmed that all 14 positively worded items of the Warwick-Edinburgh Mental Well-being Scale (WEMWBS) met criteria for the monotone homogeneity model but four items violated double monotonicity with respect to a single underlying dimension. Software availability and commands used to specify unidimensionality and reliability analysis and graphical displays for diagnosing monotone homogeneity and double monotonicity are discussed, with an emphasis on current implementations in freeware.
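
    Scalability in Mokken analysis is commonly summarized with Loevinger's H coefficient, which the abstract does not spell out. The following is a minimal sketch for dichotomous items only, assuming a complete 0/1 response matrix; the function name and the example data are hypothetical and are not taken from the paper or from any Mokken software package.

```python
from itertools import combinations


def loevinger_h(data):
    """Scale-level Loevinger's H for a complete matrix of 0/1 item scores.

    data: list of respondents, each a list of 0/1 scores (one per item).
    H is the sum of observed inter-item covariances divided by the sum of
    the maximum covariances attainable given the item marginals.
    """
    n = len(data)
    k = len(data[0])
    # Proportion of positive responses per item.
    p = [sum(row[i] for row in data) / n for i in range(k)]
    cov_sum = 0.0
    max_sum = 0.0
    for i, j in combinations(range(k), 2):
        p_ij = sum(row[i] * row[j] for row in data) / n
        cov_sum += p_ij - p[i] * p[j]             # observed covariance
        max_sum += min(p[i], p[j]) - p[i] * p[j]  # maximum given marginals
    if max_sum == 0:
        raise ValueError("H is undefined when no item pair can covary")
    return cov_sum / max_sum


# Hypothetical data: a perfect Guttman pattern, then one added Guttman error.
perfect = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
with_error = perfect + [[0, 1, 0]]
h_perfect = loevinger_h(perfect)
h_error = loevinger_h(with_error)
```

    A perfect Guttman pattern gives H = 1, and each response pattern that breaks the item ordering pulls H below 1. Polytomous items such as those of the WEMWBS need the item-step generalization, which this sketch does not cover.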

  12. A 67-Item Stress Resilience item bank showing high content validity was developed in a psychosomatic sample.

    PubMed

    Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias

    2018-04-10

    To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading <.5, 4 residual correlations >.3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.

  13. Using Rasch rating scale model to reassess the psychometric properties of the Persian version of the PedsQL™ 4.0 Generic Core Scales in school children

    PubMed Central

    2012-01-01

    Background Item response theory (IRT) is extensively used to develop adaptive instruments of health-related quality of life (HRQoL). However, each IRT model has its own function to estimate item and category parameters, and hence different results may be found using the same response categories with different IRT models. The present study used the Rasch rating scale model (RSM) to examine and reassess the psychometric properties of the Persian version of the PedsQL™ 4.0 Generic Core Scales. Methods The PedsQL™ 4.0 Generic Core Scales was completed by 938 Iranian school children and their parents. Convergent, discriminant and construct validity of the instrument were assessed by classical test theory (CTT). The RSM was applied to investigate person and item reliability, item statistics and ordering of response categories. Results The CTT method showed that the scaling success rates for convergent and discriminant validity were 100% in all domains with the exception of physical health in the child self-report. Moreover, confirmatory factor analysis supported a four-factor model similar to its original version. The RSM showed that 22 out of 23 items had acceptable infit and outfit statistics (<1.4, >0.6), person reliabilities were low, item reliabilities were high, and item difficulty ranged from -1.01 to 0.71 and -0.68 to 0.43 for child self-report and parent proxy-report, respectively. The RSM also showed that successive response categories for all items were not located in the expected order. Conclusions This study revealed that, in all domains, the five response categories did not perform adequately. It is not known whether this problem is a function of the meaning of the response choices in the Persian language or an artifact of a mostly healthy population that did not use the full range of the response categories. The response categories should be evaluated in further validation studies, especially in large samples of chronically ill patients. PMID:22414135
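
    For readers unfamiliar with the rating scale model, the sketch below shows how category probabilities follow from a person location, an item difficulty and a shared set of thresholds, and how threshold ordering can be checked. All parameter values are hypothetical; nothing here reproduces the study's estimates or the software the authors used.

```python
import math


def rsm_category_probs(theta, delta, taus):
    """Category probabilities under the Rasch rating scale model.

    theta: person location; delta: item difficulty;
    taus: thresholds shared by all items (m thresholds give m + 1 categories).
    """
    # Cumulative sums of (theta - delta - tau_j); category 0 has sum 0.
    cumulative = [0.0]
    for tau in taus:
        cumulative.append(cumulative[-1] + (theta - delta - tau))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def thresholds_ordered(taus):
    """True when each threshold is strictly higher than the previous one."""
    return all(a < b for a, b in zip(taus, taus[1:]))


# Hypothetical thresholds for a five-category item (not the study's values).
ordered_taus = [-1.5, -0.5, 0.5, 1.5]
disordered_taus = [-1.5, 0.5, -0.5, 1.5]
probs = rsm_category_probs(theta=0.0, delta=0.0, taus=ordered_taus)
```

    Disordered thresholds, as reported in the abstract, mean that at least one middle category is never the most probable response at any point on the latent scale.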

  14. Feasibility of constructed wetland planted with Leersia hexandra Swartz for removing Cr, Cu and Ni from electroplating wastewater.

    PubMed

    You, Shao-Hong; Zhang, Xue-Hong; Liu, Jie; Zhu, Yi-Nian; Gu, Chen

    2014-01-01

    As a low-cost treatment technology for effluent, constructed wetlands can be applied to remove heavy metals from wastewater. Leersia hexandra Swartz is a metal-accumulating hygrophyte with great potential to remove heavy metals from water. In this study, two pilot-scale constructed wetlands planted with L. hexandra (CWL) were set up in a greenhouse to treat electroplating wastewater containing Cr, Cu and Ni. The treatment performance of CWL under different hydraulic loading rates (HLR) and initial metal concentrations was also evaluated. The results showed that CWL significantly reduced the concentrations of Cr, Cu and Ni in wastewater by 84.4%, 97.1% and 94.3%, respectively. High HLR decreased the removal efficiencies of Cr, Cu and Ni; however, the heavy metal concentrations in effluent met the Emission Standard of Pollutants for Electroplating in China (ESPE) at HLR less than 0.3 m³/(m²·d). For the influent of 5 mg/L Cr, 10 mg/L Cu and 8 mg/L Ni, effluent concentrations were below maximum allowable concentrations in ESPE, indicating that the removal of Cr, Cu and Ni by CWL was feasible at considerably high influent metal concentrations. Mass balance showed that the primary sink for the retention of contaminants within the constructed wetland system was the sediment, which accounted for 59.5%, 83.5%, and 73.9% of the Cr, Cu and Ni, respectively. The data from the pilot wetlands support the view that CWL could be used to successfully remove Cr, Cu and Ni from electroplating wastewater.

  15. Genes, Culture and Conservatism-A Psychometric-Genetic Approach.

    PubMed

    Schwabe, Inga; Jonker, Wilfried; van den Berg, Stéphanie M

    2016-07-01

    The Wilson-Patterson conservatism scale was psychometrically evaluated using homogeneity analysis and item response theory models. Results showed that this scale actually measures two different aspects in people: on the one hand people vary in their agreement with either conservative or liberal catch-phrases and on the other hand people vary in their use of the "?" response category of the scale. A 9-item subscale was constructed, consisting of items that seemed to measure liberalism, and this subscale was subsequently used in a biometric analysis including genotype-environment interaction, correcting for non-homogeneous measurement error. Biometric results showed significant genetic and shared environmental influences, and significant genotype-environment interaction effects, suggesting that individuals with a genetic predisposition for conservatism show more non-shared variance but less shared variance than individuals with a genetic predisposition for liberalism.

  16. Partial validation of a French version of the ADHD-rating scale IV on a French population of children with ADHD and epilepsy. Factorial structure, reliability, and responsiveness.

    PubMed

    Mercier, Catherine; Roche, Sylvain; Gaillard, Ségolène; Kassai, Behrouz; Arzimanoglou, Alexis; Herbillon, Vania; Roy, Pascal; Rheims, Sylvain

    2016-05-01

    Attention deficit hyperactivity disorder (ADHD) is a well-known comorbidity in children with epilepsy. In English-speaking countries, the scores of the original ADHD-rating scale IV are currently used as main outcomes in various clinical trials in children with epilepsy. In French-speaking countries, several French versions are in use though none has been fully validated yet. We sought a partial validation of a French version of the ADHD-RS IV regarding construct validity, internal consistency (i.e., scale reliability), item reliability, and responsiveness in a group of French children with ADHD and epilepsy. The study involved 167 children aged 6-15 years in 10 French neuropediatric units. The factorial structure and item reliability were assessed with a confirmatory factorial analysis for ordered categorical variables. The dimensions' internal consistency was assessed with Guttman's lambda 6 coefficient. The responsiveness was assessed by the change in score under methylphenidate and in comparison with a control group. The results confirmed the original two-dimensional factorial structure (inattention, hyperactivity/impulsivity) and showed a satisfactory reliability of most items, a good dimension internal consistency, and a good responsiveness of the total score and the two subscores. The studied French version of the ADHD-RS IV is thus validated regarding construct validity, reliability, and responsiveness. It can now be used in French-speaking countries in clinical trials of treatments involving children with ADHD and epilepsy. The full validation requires further investigations. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. [Construction and expression of a eukaryotic expression vector containing human CR2-Fc fusion protein].

    PubMed

    Li, Xinxin; Wu, Zhihao; Zhang, Chuanfu; Jia, Leili; Song, Hongbin; Xu, Yuanyong

    2014-01-01

    To construct a eukaryotic expression vector containing human complement receptor 2 (CR2)-Fc and express the CR2-Fc fusion protein in Chinese hamster ovary (CHO) cells. The extracellular domain of human CR2 and IgG1 Fc were respectively amplified, ligated and inserted into the eukaryotic expression vector PCI-neo. After verification by restriction enzyme digestion and sequencing, the recombinant plasmid was transfected into CHO K1 cells. Cells with stable expression of the fusion protein were obtained by means of G418 selection. The expression of the CR2-Fc fusion protein was detected and confirmed by SDS-PAGE and Western blotting. Restriction enzyme digestion and sequencing demonstrated that the recombinant plasmid was valid. SDS-PAGE showed that the relative molecular mass (Mr) of the purified product was consistent with the expected value. Western blotting further confirmed a single band at the same position. We constructed the eukaryotic expression vector of CR2-Fc/PCI-neo successfully. The obtained fusion protein was active and can be used for the further study of the role in HIV control.

  18. Construction of a web-based questionnaire for longitudinal investigation of work exposure, musculoskeletal pain and performance impairments in high-performance marine craft populations

    PubMed Central

    de Alwis, Manudul Pahansen; Äng, Björn Olov; Garme, Karl

    2017-01-01

    Objective High-performance marine craft personnel (HPMCP) are regularly exposed to vibration and repeated shock (VRS) levels exceeding maximum limitations stated by international legislation. Whereas such exposure reportedly is detrimental to health and performance, the epidemiological data necessary to link these adverse effects causally to VRS are not available in the scientific literature, and no suitable tools for acquiring such data exist. This study therefore constructed a questionnaire for longitudinal investigations in HPMCP. Methods A consensus panel defined content domains, identified relevant items and outlined a questionnaire. The relevance and simplicity of the questionnaire’s content were then systematically assessed by expert raters in three consecutive stages, each followed by revisions. An item-level content validity index (I-CVI) was computed as the proportion of experts rating an item as relevant and simple, and a scale-level content validity index (S-CVI/Ave) as the average I-CVI across items. The thresholds for acceptable content validity were 0.78 and 0.90, respectively. Finally, a dynamic web version of the questionnaire was constructed and pilot tested over a 1-month period during a marine exercise in a study population sample of eight subjects, while accelerometers simultaneously quantified VRS exposure. Results Content domains were defined as work exposure, musculoskeletal pain and human performance, and items were selected to reflect these constructs. Ratings from nine experts yielded S-CVI/Ave of 0.97 and 1.00 for relevance and simplicity, respectively, and the pilot test suggested that responses were sensitive to change in acceleration and that the questionnaire, following some adjustments, was feasible for its intended purpose. Conclusions A dynamic web-based questionnaire for longitudinal survey of key variables in HPMCP was constructed. 
Expert ratings supported that the questionnaire content is relevant, simple and sufficiently comprehensive, and the pilot test suggested that the questionnaire is feasible for longitudinal measurements in the study population. PMID:28729320
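
    The two content validity indices defined in the abstract are simple proportions and averages. The sketch below follows those definitions, assuming each expert rating has already been reduced to acceptable/not acceptable; the function names and the ratings matrix are hypothetical, and only the 0.78 and 0.90 thresholds come from the abstract.

```python
from typing import Sequence

I_CVI_THRESHOLD = 0.78      # item-level threshold stated in the abstract
S_CVI_AVE_THRESHOLD = 0.90  # scale-level threshold stated in the abstract


def item_cvi(ratings: Sequence[bool]) -> float:
    """Proportion of experts who rated one item as acceptable."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    return sum(ratings) / len(ratings)


def scale_cvi_ave(matrix: Sequence[Sequence[bool]]) -> float:
    """Average of the item-level indices across all items."""
    values = [item_cvi(row) for row in matrix]
    return sum(values) / len(values)


# Hypothetical ratings: 3 items rated by 9 experts (True = acceptable).
ratings = [
    [True] * 9,
    [True] * 8 + [False],
    [True] * 7 + [False] * 2,
]
i_cvis = [item_cvi(row) for row in ratings]
s_cvi = scale_cvi_ave(ratings)
# Indices of items falling below the item-level threshold.
flagged = [i for i, value in enumerate(i_cvis) if value < I_CVI_THRESHOLD]
```

    With nine raters, an item needs at least eight favourable ratings to clear the 0.78 threshold, since 7/9 is roughly 0.778.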

  19. Australian Defence Force Requirements for a Group-feeding Ration Pack

    DTIC Science & Technology

    2010-04-01

    items were instant noodles and pasta (20% and 18% of respondents, respectively). 3.1.6 Items Commonly Discarded Beef and Pasta, Fruit Pudding...sheet). Although not usually required, there is provision to supplement the CR5M with a cereal adjunct such as bread, rice, pasta or noodles [6]. It is...of all the drink items. The Chocolate Drink Powder had the highest acceptability; its consumption was second to that of the Instant Coffee. The

  20. Unidimensional IRT Item Parameter Estimates across Equivalent Test Forms with Confounding Specifications within Dimensions

    ERIC Educational Resources Information Center

    Matlock, Ki Lynn; Turner, Ronna

    2016-01-01

    When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…

  1. Clinical Validation of the Nursing Outcome "Swallowing Status" in People with Stroke: Analysis According to the Classical and Item Response Theories.

    PubMed

    Oliveira-Kumakura, Ana Railka de Souza; de Araujo, Thelma Leite; Costa, Alice Gabrielle de Sousa; Cavalcante, Tahissa Frota; Lopes, Marcos Venícios de Oliveira; Carvalho, Emilia Campos

    2017-09-19

    To validate clinically the nursing outcome "Swallowing status". The fit of the nursing outcome was investigated according to the Classical and Item Response Theories. The models were compared regarding information loss, goodness-of-fit, and differential item functioning. Stability and internal consistency were examined. The nursing outcome has the best fit in the generalized partial credit model with different discrimination parameters. Strong correlations among the scores of each indicator were observed. There was no differential item functioning of the outcome indicators. The scale presented high internal consistency (Cronbach's α = .954) and stability (> .800). This study presents a valid nursing outcome. More accurate monitoring of sensitivity to an intervention. To clinically validate the nursing outcome "Swallowing Status". METHODS: The fit of the outcome was investigated according to the Classical and Item Response theories. The models were compared assuming crossed item parameters of equal discrimination. Goodness-of-fit properties, differential item functioning, stability and internal consistency were investigated. The outcome fit best under the generalized partial credit model, which demonstrated unidimensionality of the outcome and a strong correlation among the scores of each indicator. There was no differential functioning of the indicators. Internal consistency for the overall scale (Cronbach's α = .954) and stability (> .800) remained high. CONCLUSION: The study presents a valid nursing outcome. RELEVANCE TO CLINICAL PRACTICE: Greater accuracy for monitoring sensitivity to the intervention. © 2017 NANDA International, Inc.

  2. Questionnaire to assess patient satisfaction with pharmaceutical care in Spanish language.

    PubMed

    Traverso, María Luz; Salamano, Mercedes; Botta, Carina; Colautti, Marisel; Palchik, Valeria; Pérez, Beatriz

    2007-08-01

    To develop and validate a questionnaire, in Spanish, for assessing patient satisfaction with pharmaceutical care received in community pharmacies. Selection and translation of questionnaire's items; definition of response scale and demographic questions. Evaluation of face and content validity, feasibility, factor structure, reliability and construct validity. Forty-one community pharmacies of the province of Santa Fe, Argentina. Questionnaire administered to patients receiving pharmaceutical care or traditional pharmacy services. Pilot test to assess feasibility. Factor analysis used principal components and varimax rotation. Reliability established using internal consistency with Cronbach's alpha. Construct validity determined with extreme group method. A self-administered questionnaire with 27 items, 5-point Likert response scale and demographic questions was designed considering multidimensional structure of patient satisfaction. Questionnaire evaluates cumulative experience of patients with comprehensive pharmaceutical care practice in community pharmacies. Two hundred and seventy-four complete questionnaires were obtained. Factor analysis resulted in three factors: Managing therapy, Interpersonal relationship and General satisfaction, with a cumulative variance of 62.51%. Cronbach's alpha for the whole questionnaire was 0.96, and 0.95, 0.88 and 0.76 for the three factors, respectively. Mann-Whitney test for construct validity did not show significant differences between pharmacies that provide pharmaceutical care and those that do not; however, 23 items showed significant differences between the two groups of pharmacies. The questionnaire developed can be a reliable and valid instrument to assess patient satisfaction with pharmaceutical care in community pharmacies in Spanish. Further research is needed to deepen the validation process.
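
    Internal consistency is reported here, as in several other records in this list, as Cronbach's alpha. A minimal sketch of the standard formula follows, assuming complete responses and sample variances; the function name and the Likert data are hypothetical and unrelated to the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a complete respondents-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point Likert responses: 5 respondents, 3 items.
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)

# Items that are exact copies of each other give alpha = 1.
identical = [[1, 1, 1], [3, 3, 3], [5, 5, 5], [2, 2, 2]]
alpha_identical = cronbach_alpha(identical)
```

    Alpha rises with the number of items as well as with their intercorrelation, which is worth keeping in mind when comparing the whole-questionnaire value (27 items) with the per-factor values.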

  3. Ultrasound wave assisted adsorption of congo red using gold-magnetic nanocomposite loaded on activated carbon: Optimization of process parameters.

    PubMed

    Dil, Ebrahim Alipanahpour; Ghaedi, Mehrorang; Asfaram, Arash; Bazrafshan, Ali Akbar

    2018-09-01

    In this study, a gold-magnetic nanocomposite was synthesized with ultrasound assistance and loaded on activated carbon (Au-Fe3O4-NCs-AC) by a simple, fast and low-cost process. This novel material was applied for ultrasound-assisted adsorption of congo red (CR), as a model toxic and even carcinogenic substance, from aqueous solution. The morphology and identity of Au-Fe3O4-AC were characterized by SEM and TEM techniques, and the correlation between the response and variables such as pH (2-10), adsorbent mass (0.005-0.025 g), initial CR concentration (10-30 mg L⁻¹) and ultrasound time (2-6 min) was investigated by response surface methodology (RSM) under central composite design (CCD). Analysis of variance (ANOVA) exhibited a high R² value of 0.999 and confirmed the suitability of the constructed second-order regression model for evaluation and prediction of the experimental data. The interactions, main factors and optimum conditions of the process under study were determined from response surface plots based on the desirability function. The maximum CR adsorption was achieved at a pH of 4, 15 mg L⁻¹ of CR, 0.017 g of Au-Fe3O4-AC and 5 min sonication, which, owing to its 99.49% removal efficiency, is highly recommended for future CR removal from different matrixes. Adsorption kinetics follow a second-order rate expression in combination with inter particle diffusion, and the equilibrium adsorption data are best represented by the Langmuir isotherm with a maximum mono-layer adsorption capacity of 43.88 mg g⁻¹. Copyright © 2018 Elsevier B.V. All rights reserved.
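
    The two quantities quoted in the abstract, removal efficiency and Langmuir monolayer capacity, can be illustrated with the standard textbook expressions. In the sketch below only the capacity of 43.88 mg/g comes from the abstract; the Langmuir constant K_L and the function names are hypothetical, since the abstract does not report them.

```python
def langmuir_uptake(c_eq, q_max, k_l):
    """Equilibrium uptake q_e (mg/g) from the Langmuir isotherm.

    q_e = q_max * K_L * C_e / (1 + K_L * C_e)
    c_eq: equilibrium concentration (mg/L); k_l: Langmuir constant (L/mg).
    """
    if c_eq < 0 or q_max <= 0 or k_l <= 0:
        raise ValueError("concentration must be >= 0, q_max and k_l > 0")
    return q_max * k_l * c_eq / (1.0 + k_l * c_eq)


def removal_efficiency(c_initial, c_final):
    """Percentage of dye removed from solution."""
    if c_initial <= 0:
        raise ValueError("initial concentration must be positive")
    return (c_initial - c_final) / c_initial * 100.0


Q_MAX = 43.88  # mg/g, monolayer capacity reported in the abstract
K_L = 0.5      # L/mg, hypothetical value chosen only for illustration

# Uptake at the half-saturation concentration C_e = 1 / K_L.
half_saturation = langmuir_uptake(1.0 / K_L, Q_MAX, K_L)
```

    At C_e = 1/K_L the predicted uptake is half the monolayer capacity, which gives a quick sanity check on any fitted K_L.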

  4. A test of the International Personality Item Pool representation of the Revised NEO Personality Inventory and development of a 120-item IPIP-based measure of the five-factor model.

    PubMed

    Maples, Jessica L; Guan, Li; Carter, Nathan T; Miller, Joshua D

    2014-12-01

    There has been a substantial increase in the use of personality assessment measures constructed using items from the International Personality Item Pool (IPIP) such as the 300-item IPIP-NEO (Goldberg, 1999), a representation of the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992). The IPIP-NEO is free to use and can be modified to accommodate its users' needs. Despite the substantial interest in this measure, there is still a dearth of data demonstrating its convergence with the NEO PI-R. The present study represents an investigation of the reliability and validity of scores on the IPIP-NEO. Additionally, we used item response theory (IRT) methodology to create a 120-item version of the IPIP-NEO. Using an undergraduate sample (n = 359), we examined the reliability, as well as the convergent and criterion validity, of scores from the 300-item IPIP-NEO, a previously constructed 120-item version of the IPIP-NEO (Johnson, 2011), and the newly created IRT-based IPIP-120 in comparison to the NEO PI-R across a range of outcomes. Scores from all 3 IPIP measures demonstrated strong reliability and convergence with the NEO PI-R and a high degree of similarity with regard to their correlational profiles across the criterion variables (rICC = .983, .972, and .976, respectively). The replicability of these findings was then tested in a community sample (n = 757), and the results closely mirrored the findings from Sample 1. These results provide support for the use of the IPIP-NEO and both 120-item IPIP-NEO measures as assessment tools for measurement of the five-factor model. (c) 2014 APA, all rights reserved.

  5. A targeted complement-dependent strategy to improve the outcome of mAb therapy, and characterization in a murine model of metastatic cancer

    PubMed Central

    Elvington, Michelle; Huang, Yuxiang; Morgan, B. Paul; Qiao, Fei; van Rooijen, Nico; Atkinson, Carl

    2012-01-01

    Complement inhibitors expressed on tumor cells provide an evasion mechanism against mAb therapy and may modulate the development of an acquired antitumor immune response. Here we investigate a strategy to amplify mAb-targeted complement activation on a tumor cell, independent of a requirement to target and block complement inhibitor expression or function, which is difficult to achieve in vivo. We constructed a murine fusion protein, CR2Fc, and demonstrated that the protein targets to C3 activation products deposited on a tumor cell by a specific mAb, and amplifies mAb-dependent complement activation and tumor cell lysis in vitro. In syngeneic models of metastatic lymphoma (EL4) and melanoma (B16), CR2Fc significantly enhanced the outcome of mAb therapy. Subsequent studies using the EL4 model with various genetically modified mice and macrophage-depleted mice revealed that CR2Fc enhanced the therapeutic effect of mAb therapy via both macrophage-dependent FcγR-mediated antibody-dependent cellular cytotoxicity, and by direct complement-mediated lysis. Complement activation products can also modulate adaptive immunity, but we found no evidence that either mAb or CR2Fc treatment had any effect on an antitumor humoral or cellular immune response. CR2Fc represents a potential adjuvant treatment to increase the effectiveness of mAb therapy of cancer. PMID:22442351

  6. Evaluation of the Hospital Anxiety and Depression Scale (HADS) in screening stroke patients for symptoms: Item Response Theory (IRT) analysis.

    PubMed

    Ayis, Salma A; Ayerbe, Luis; Ashworth, Mark; DA Wolfe, Charles

    2018-03-01

    Variations have been reported in the number of underlying constructs and choice of thresholds that determine caseness of anxiety and/or depression using the Hospital Anxiety and Depression scale (HADS). This study examined the properties of each item of HADS as perceived by stroke patients, and assessed the information these items convey about anxiety and depression between 3 months and 5 years after stroke. The study included 1443 stroke patients from the South London Stroke Register (SLSR). The dimensionality of HADS was examined using factor analysis methods, and items' properties up to 5 years after stroke were tested using Item Response Theory (IRT) methods, including graded response models (GRMs). The presence of two dimensions of HADS (anxiety and depression) for stroke patients was confirmed. Items that accurately reflected the severity of anxiety and depression, and offered good discrimination of caseness, were identified as "I can laugh and see the funny side of things" (Q4) and "I get sudden feelings of panic" (Q13), with discrimination 2.44 (se = 0.26) and 3.34 (se = 0.35), respectively. Items that shared properties, and hence replicate inference, were: "I get a sort of frightened feeling as if something awful is about to happen" (Q3), "I get a sort of frightened feeling like butterflies in my stomach" (Q6), and "Worrying thoughts go through my mind" (Q9). Item properties were maintained over time. Approximately 20% of patients were lost to follow up. A more concise selection of items based on their properties would provide a precise approach for screening patients and for an optimal allocation of patients into clinical trials. Copyright © 2017 Elsevier B.V. All rights reserved.
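
    The role of the discrimination parameter in a graded response model can be seen in a short sketch. It assumes the usual logistic form for a four-category item scored 0 to 3; the discrimination of 2.44 is the value reported for Q4, while the thresholds and the function name are hypothetical.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a logistic graded response model.

    theta: latent trait level; a: item discrimination;
    thresholds: ordered boundary locations b_1 < ... < b_m (m + 1 categories).
    """
    # P(X >= k) for k = 1..m, padded with P(X >= 0) = 1 and P(X >= m+1) = 0.
    boundaries = [1.0]
    boundaries += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    boundaries.append(0.0)
    # Each category probability is the drop between adjacent boundaries.
    return [boundaries[k] - boundaries[k + 1] for k in range(len(boundaries) - 1)]


A_Q4 = 2.44                   # discrimination reported for Q4 in the abstract
THRESHOLDS = [-1.0, 0.0, 1.0]  # hypothetical thresholds, illustration only

probs = grm_category_probs(theta=0.0, a=A_Q4, thresholds=THRESHOLDS)
```

    A higher discrimination makes the boundary curves steeper, so response categories separate respondents more sharply around each threshold; this is the sense in which Q4 and Q13 are described as discriminating well.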

  7. Development and validation of brief scales to measure emotional and behavioural problems among Chinese adolescents

    PubMed Central

    Shen, Minxue; Hu, Ming; Sun, Zhenqiu

    2017-01-01

    Objectives To develop and validate brief scales to measure common emotional and behavioural problems among adolescents in the examination-oriented education system and collectivistic culture of China. Setting Middle schools in Hunan province. Participants 5442 middle school students aged 11–19 years were sampled. 4727 valid questionnaires were collected and used for validation of the scales. The final sample included 2408 boys and 2319 girls. Primary and secondary outcome measures The tools were assessed by the item response theory, classical test theory (reliability and construct validity) and differential item functioning. Results Four scales to measure anxiety, depression, study problem and sociality problem were established. Exploratory factor analysis showed that each scale had two solutions. Confirmatory factor analysis showed acceptable to good model fit for each scale. Internal consistency and test–retest reliability of all scales were above 0.7. Item response theory showed that all items had acceptable discrimination parameters and most items had appropriate difficulty parameters. 10 items demonstrated differential item functioning with respect to gender. Conclusions Four brief scales were developed and validated among adolescents in middle schools of China. The scales have good psychometric properties with minor differential item functioning. They can be used in middle school settings, and will help school officials to assess the students’ emotional/behavioural problems. PMID:28062469

  8. Pressure ulcers: development and psychometric evaluation of the attitude towards pressure ulcer prevention instrument (APuP).

    PubMed

    Beeckman, D; Defloor, T; Demarré, L; Van Hecke, A; Vanderwee, K

    2010-11-01

    Pressure ulcers continue to be a significant problem in hospitals, nursing homes and community care settings. Pressure ulcer incidence is widely accepted as an indicator for the quality of care. Negative attitudes towards pressure ulcer prevention may result in suboptimal preventive care. A reliable and valid instrument to assess attitudes towards pressure ulcer prevention is lacking. Development and psychometric evaluation of the Attitude towards Pressure ulcer Prevention instrument (APuP). Prospective psychometric instrument validation study. A literature review was performed to design the instrument. Content validity was evaluated by nine European pressure ulcer experts and five experts in psychometric instrument validation in a double Delphi procedure. A convenience sample of 258 nurses and 291 nursing students from Belgium and The Netherlands participated in order to evaluate construct validity and stability reliability of the instrument. The data were collected between February and May 2008. A factor analysis indicated the construct of a 13 item instrument in a five factor solution: (1) attitude towards personal competency to prevent pressure ulcers (three items); (2) attitude towards the priority of pressure ulcer prevention (three items); (3) attitude towards the impact of pressure ulcers (three items); (4) attitude towards personal responsibility in pressure ulcer prevention (two items); and (5) attitude towards confidence in the effectiveness of prevention (two items). This five factor solution accounted for 61.4% of the variance in responses related to attitudes towards pressure ulcer prevention. All items demonstrated factor loadings over 0.60. The instrument produced similar results during stability testing [ICC=0.88 (95% CI=0.84-0.91, P<0.001)]. For the total instrument, the internal consistency (Cronbach's alpha) was 0.79.
The APuP is a psychometrically sound instrument that can be used to effectively assess attitudes towards pressure ulcer prevention in patient care, education, and research. In further research, the association between attitude, knowledge and clinical performance should be explored. Copyright 2010 Elsevier Ltd. All rights reserved.
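
    The internal-consistency figure quoted in this record (Cronbach's alpha) follows a standard formula: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), for k items. The sketch below is illustrative only; the function name and the toy data are assumptions and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score table.

    scores: list of rows, one row per respondent, one column per item.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data (hypothetical): 4 respondents answering 3 items.
toy = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2]]
print(round(cronbach_alpha(toy), 2))
```

    Items that rise and fall together across respondents push alpha towards 1; unrelated items push it towards 0.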

  9. Development, pilot testing and psychometric validation of a short version of the coronary artery disease education questionnaire: The CADE-Q SV.

    PubMed

    Ghisi, Gabriela Lima de Melo; Sandison, Nicole; Oh, Paul

    2016-03-01

    To develop, pilot test and psychometrically validate a shorter version of the coronary artery disease education questionnaire (CADE-Q), called CADE-Q SV. Based on previous versions of the CADE-Q, cardiac rehabilitation (CR) experts developed 20 items divided into 5 knowledge domains to comprise the first version of the CADE-Q SV. To establish content validity, they were reviewed by an expert panel (N=12). Refined items were pilot-tested in 20 patients, in which clarity was provided. A final version was generated and psychometrically-tested in 132 CR patients. Test-retest reliability was assessed via the intraclass correlation coefficient (ICC), the internal consistency using Cronbach's alpha, and criterion validity with regard to patients' education and duration in CR. All ICC coefficients met the minimum recommended standard. All domains were considered internally consistent (α>0.7). Criterion validity was supported by significant differences in mean scores by educational level (p<0.01) and duration in CR (p<0.05). Knowledge about exercise and nutrition was higher than knowledge about medical condition. The CADE-Q SV was demonstrated to have good reliability and validity. This is a short, quick and appropriate tool for application in clinical and research settings, assessing patients' knowledge during CR and as part of education programming. Copyright © 2015. Published by Elsevier Ireland Ltd.

  10. The initial development of the WebMedQual scale: domain assessment of the construct of quality of health web sites.

    PubMed

    Provost, Mélanie; Koompalum, Dayin; Dong, Diane; Martin, Bradley C

    2006-01-01

    To develop a comprehensive instrument assessing quality of health-related web sites. Phase I consisted of a literature review to identify constructs thought to indicate web site quality and to identify items. During content analysis, duplicate items were eliminated and items that were not clear, meaningful, or measurable were reworded or removed. Some items were generated by the authors. Phase II: a panel consisting of six healthcare and MIS reviewers was convened to assess each item for its relevance and importance to the construct and to assess item clarity and measurement feasibility. Three hundred and eighty-four items were generated from 26 sources. The initial content analysis reduced the scale to 104 items. Four of the six expert reviewers responded; high concordance on the relevance, importance and measurement feasibility of each item was observed: 3 out of 4, or all raters agreed on 76-85% of items. Based on the panel ratings, 9 items were removed, 3 added, and 10 revised. The WebMedQual consists of 8 categories, 8 sub-categories, 95 items and 3 supplemental items to assess web site quality. The constructs are: content (19 items), authority of source (18 items), design (19 items), accessibility and availability (6 items), links (4 items), user support (9 items), confidentiality and privacy (17 items), e-commerce (6 items). The "WebMedQual" represents a first step toward a comprehensive and standard quality assessment of health web sites. This scale will allow relatively easy assessment of quality with possible numeric scoring.

  11. The Patient Assessment Questionnaire: initial validation of a measure of treatment effectiveness for patients with schizophrenia and schizoaffective disorder.

    PubMed

    Mojtabai, Ramin; Corey-Lisle, Patricia K; Ip, Edward Hak-Sing; Kopeykina, Irina; Haeri, Sophia; Cohen, Lisa Janet; Shumaker, Sally

    2012-12-30

    Investigation of patients' subjective perspective regarding the effectiveness - as opposed to efficacy - of antipsychotic medication has been hampered by a relative shortage of self-report measures of global clinical outcome. This paper presents data supporting the feasibility, inter-item consistency, and construct validity of the Patient Assessment Questionnaire (PAQ)-a self-report measure of psychiatric symptoms, medication side effects and general wellbeing, ultimately intended to assess effectiveness of interventions for schizophrenia-spectrum patients. The original 53-item instrument was developed by a multidisciplinary team which utilized brainstorming sessions for item generation and content analysis, patient focus groups, and expert panel reviews. This instrument and additional validation measures were administered, via Audio Computer-Assisted Self-Interviewing (ACASI), to 300 stable, medicated outpatients diagnosed with schizophrenia or schizoaffective disorder. Item elimination was based on psychometric properties and Item-Response Theory information functions and characteristic curves. Exploratory factor analysis of the resulting 40-item scale yielded a five factor solution. The five subscales (General Distress, Side Effects, Psychotic Symptoms, Cognitive Symptoms, Sleep) showed robust convergent (β's=0.34-0.75, average β=0.49) and discriminant validity. The PAQ demonstrates feasibility, reliability, and construct validity as a self-report measure of multiple domains pertinent to effectiveness. Future research needs to establish the PAQ's sensitivity to change. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  12. Perceiving Cardiac Rehabilitation Staff as Mainly Responsible for Exercise: A Dilemma for Future Self-Management.

    PubMed

    Flora, Parminder K; McMahon, Casey J; Locke, Sean R; Brawley, Lawrence R

    2018-03-01

    Cardiac rehabilitation (CR) exercise therapy facilitates patient recovery and better health following a cardiovascular event. However, post-CR adherence to self-managed (SM)-exercise is suboptimal. Part of this problem may be participants' view of CR staff as mainly responsible for help and program structure. Does post-CR exercise adherence for those perceiving high CR staff responsibility suffer as a consequence? Participants in this prospective, observational study were followed over 12 weeks of CR and one month afterward. High perceived staff responsibility individuals were examined for a decline in the strength of adherence-related social cognitions and exercise. Those high and low in perceived staff responsibility were also compared. High perceived staff responsibility individuals reported significant declines in anticipated exercise persistence (d = .58) and number of different SM-exercise options (d = .44). High versus low responsibility comparisons revealed a significant difference in one-month post-CR SM-exercise volume (d = .67). High perceived staff responsibility individuals exercised half of the amount of low responsibility counterparts at one month post-CR. Perceived staff responsibility and CR SRE significantly predicted SM-exercise volume, adjusted R² = .10, and persistence, adjusted R² = .18, one month post-CR. Viewing helpful well-trained CR staff as mainly responsible for participant behavior may be problematic for post-CR exercise maintenance among those more staff dependent. © 2017 The International Association of Applied Psychology.

  13. Measuring Access to Information and Technology: Environmental Factors Affecting Persons With Neurologic Disorders.

    PubMed

    Hahn, Elizabeth A; Garcia, Sofia F; Lai, Jin-Shei; Miskovic, Ana; Jerousek, Sara; Semik, Patrick; Wong, Alex; Heinemann, Allen W

    2016-08-01

    To develop and validate a patient-reported measure of access to information and technology (AIT) for persons with spinal cord injury, stroke, or traumatic brain injury. A mixed-methods approach was used to develop items, refine them through cognitive interviews, and evaluate their psychometric properties. Item responses were evaluated with the Rasch rating scale model. Correlational and analysis-of-variance methods were used to evaluate construct validity. Community-dwelling individuals participated in telephone interviews or traveled to the academic medical centers where this research took place. Individuals with a diagnosis of spinal cord injury, stroke, or traumatic brain injury (aged ≥18y, English speaking) participated in cognitive interviews (n=12 persons), field testing of the items (n=305 persons), and validation testing of the final set of items (n=604 persons). Not applicable. A set of items to measure AIT for people with disabilities. A user-friendly multimedia touchscreen was used for self-administration of the items. A 23-item AIT measure demonstrated good evidence of internal consistency reliability, and content and construct validity. This new AIT measure will enable researchers and clinicians to determine to what extent environmental factors influence health outcomes and social participation in people with disabilities. The AIT measure could also provide disability advocates with more specific and detailed information about environmental factors to lobby for elimination of barriers. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  14. Evaluating Instrument Quality in Science Education: Rasch-based analyses of a Nature of Science test

    NASA Astrophysics Data System (ADS)

    Neumann, Irene; Neumann, Knut; Nehm, Ross

    2011-07-01

    Given the central importance of the Nature of Science (NOS) and Scientific Inquiry (SI) in national and international science standards and science learning, empirical support for the theoretical delineation of these constructs is of considerable significance. Furthermore, tests of the effects of varying magnitudes of NOS knowledge on domain-specific science understanding and belief require the application of instruments validated in accordance with AERA, APA, and NCME assessment standards. Our study explores three interrelated aspects of a recently developed NOS instrument: (1) validity and reliability; (2) instrument dimensionality; and (3) item scales, properties, and qualities within the context of Classical Test Theory and Item Response Theory (Rasch modeling). A construct analysis revealed that the instrument did not match published operationalizations of NOS concepts. Rasch analysis of the original instrument-as well as a reduced item set-indicated that a two-dimensional Rasch model fit significantly better than a one-dimensional model in both cases. Thus, our study revealed that NOS and SI are supported as two separate dimensions, corroborating theoretical distinctions in the literature. To identify items with unacceptable fit values, item quality analyses were used. A Wright Map revealed that few items sufficiently distinguished high performers in the sample and excessive numbers of items were present at the low end of the performance scale. Overall, our study outlines an approach for how Rasch modeling may be used to evaluate and improve Likert-type instruments in science education.

  15. Questionnaire Construction Manual

    DTIC Science & Technology

    1976-07-01

    (2) All questionnaire items should be grammatically correct. (3) All...kept in mind: a. All response alternatives should follow the stem both grammatically and logically, and if possible, be parallel in structure. b

  16. A Music-Related Quality of Life Measure to Guide Music Rehabilitation for Adult Cochlear Implant Users.

    PubMed

    Dritsakis, Giorgos; van Besouw, Rachel M; Kitterick, Pádraig; Verschuur, Carl A

    2017-09-18

    A music-related quality of life (MuRQoL) questionnaire was developed for the evaluation of music rehabilitation for adult cochlear implant (CI) users. The present studies were aimed at refinement and validation. Twenty-four experts reviewed the MuRQoL items for face validity. A refined version was completed by 147 adult CI users, and psychometric techniques were used for item selection, assessment of reliability, and definition of the factor structure. The same participants completed the Short Form Health Survey for construct validation. MuRQoL responses from 68 CI users were compared with those of a matched group of adults with normal hearing. Eighteen items measuring music perception and engagement and 18 items measuring their importance were selected; they grouped together into 2 domains. The final questionnaire has high internal consistency and repeatability. Significant differences between CI users and adults with normal hearing and a correlation between music engagement and quality of life support construct validity. Scores of music perception and engagement and importance for the 18 items can be combined to assess the impact of music on the quality of life. The MuRQoL questionnaire is a reliable and valid measure of self-reported music perception, engagement, and their importance for adult CI users with potential to guide music aural rehabilitation.

  17. Category-Specific Neural Oscillations Predict Recall Organization During Memory Search

    PubMed Central

    Morton, Neal W.; Kahana, Michael J.; Rosenberg, Emily A.; Baltuch, Gordon H.; Litt, Brian; Sharan, Ashwini D.; Sperling, Michael R.; Polyn, Sean M.

    2013-01-01

    Retrieved-context models of human memory propose that as material is studied, retrieval cues are constructed that allow one to target particular aspects of past experience. We examined the neural predictions of these models by using electrocorticographic/depth recordings and scalp electroencephalography (EEG) to characterize category-specific oscillatory activity, while participants studied and recalled items from distinct, neurally discriminable categories. During study, these category-specific patterns predict whether a studied item will be recalled. In the scalp EEG experiment, category-specific activity during study also predicts whether a given item will be recalled adjacent to other same-category items, consistent with the proposal that a category-specific retrieval cue is used to guide memory search. Retrieved-context models suggest that integrative neural circuitry is involved in the construction and maintenance of the retrieval cue. Consistent with this hypothesis, we observe category-specific patterns that rise in strength as multiple same-category items are studied sequentially, and find that individual differences in this category-specific neural integration during study predict the degree to which a participant will use category information to organize memory search. Finally, we track the deployment of this retrieval cue during memory search: Category-specific patterns are stronger when participants organize their responses according to the category of the studied material. PMID:22875859

  18. Intelligent topical sentiment analysis for the classification of e-learners and their topics of interest.

    PubMed

    Ravichandran, M; Kulanthaivel, G; Chellatamilan, T

    2015-01-01

    Every day, huge numbers of instant tweets (messages) are published on Twitter, one of the major social media platforms for e-learner interaction. The options regarding various interesting topics to be studied are discussed among the learners and teachers through the capture of ideal sources in Twitter. The common sentiment behavior towards these topics is received through the massive number of instant messages about them. In this paper, rather than using the opinion polarity of each message relevant to the topic, the authors focus on sentence level opinion classification using an unsupervised algorithm named bigram item response theory (BIRT). It differs from the traditional classification and document level classification algorithms. The investigation presented in this paper is threefold: (1) lexicon based sentiment polarity of tweet messages; (2) the bigram cooccurrence relationship using naïve Bayesian; (3) the bigram item response theory (BIRT) on various topics. It has been proposed that a model using item response theory is constructed for topical classification inference. The performance has been improved remarkably using this bigram item response theory when compared with other supervised algorithms. The experiment has been conducted on a real life dataset containing different sets of tweets and topics.

  19. Psychometric properties of responses by clinicians and older adults to a 6-item Hebrew version of the Hamilton Depression Rating Scale (HAM-D6)

    PubMed Central

    2013-01-01

    Background The Hamilton Depression Rating Scale (HAM-D) is commonly used as a screening instrument, as a continuous measure of change in depressive symptoms over time, and as a means to compare the relative efficacy of treatments. Among several abridged versions, the 6-item HAM-D6 is used most widely in large degree because of its good psychometric properties. The current study compares both self-report and clinician-rated versions of the Hebrew version of this scale. Methods A total of 153 Israelis 75 years of age on average participated in this study. The HAM-D6 was examined using confirmatory factor analytic (CFA) models separately for both patient and clinician responses. Results Responses to the HAM-D6 suggest that this instrument measures a unidimensional construct with each of the scale's six items contributing significantly to the measurement. Comparisons between self-report and clinician versions indicate that responses do not significantly differ for 4 of the 6 items. Moreover, 100% sensitivity (and 91% specificity) was found between patient HAM-D6 responses and clinician diagnoses of depression. Conclusion These results indicate that the Hebrew HAM-D6 can be used to measure and screen for depressive symptoms among elderly patients. PMID:23281688

  20. Internal consistency and validity of a new physical workload questionnaire

    PubMed Central

    Bot, S; Terwee, C; van der Windt, D A W M; Feleus, A; Bierma-Zeinstra, S; Knol, D; Bouter, L; Dekker, J

    2004-01-01

    Aims: To examine the dimensionality, internal consistency, and construct validity of a new physical workload questionnaire in employees with musculoskeletal complaints. Methods: Factor analysis was applied to the responses in three study populations with musculoskeletal disorders (n = 406, 300, and 557) on 26 items related to physical workload. The internal consistency of the resulting subscales was examined. It was hypothesised that physical workload would vary among different occupational groups. The occupations of all subjects were classified into four groups on the basis of expected workload (heavy physical load; long lasting postures and repetitive movements; both; no physical load). Construct validity of the subscales created was tested by comparing the subscale scores among these occupational groups. Results: The pattern of the factor loadings of items was almost identical for the three study populations. Two interpretable factors were found: items related to heavy physical workload loaded highly on the first factor, and items related to static postures or repetitive work loaded highly on the second factor. The first constructed subscale "heavy physical work" had a Cronbach's α of 0.92 to 0.93 and the second subscale "long lasting postures and repetitive movements", of 0.86 to 0.87. Six of eight hypotheses regarding the construct validity of the subscales were confirmed. Conclusions: The results support the internal structure, internal consistency, and validity of the new physical workload questionnaire. Testing this questionnaire in non-symptomatic employees and comparing its performance with objective assessments of physical workload are important next steps in the validation process. PMID:15550603

  1. Rasch analysis of the Italian Lower Extremity Functional Scale: insights on dimensionality and suggestions for an improved 15-item version.

    PubMed

    Bravini, Elisabetta; Giordano, Andrea; Sartorio, Francesco; Ferriero, Giorgio; Vercelli, Stefano

    2017-04-01

    To investigate dimensionality and the measurement properties of the Italian Lower Extremity Functional Scale using both classical test theory and Rasch analysis methods, and to provide insights for an improved version of the questionnaire. Rasch analysis of individual patient data. Rehabilitation centre. A total of 135 patients with musculoskeletal diseases of the lower limb. Patients were assessed with the Lower Extremity Functional Scale before and after the rehabilitation. Rasch analysis showed some problems related to rating scale category functioning, item fit, and item redundancy. After an iterative process, which resulted in the reduction of rating scale categories from 5 to 4, and in the deletion of 5 items, the psychometric properties of the Italian Lower Extremity Functional Scale improved. The retained 15 items with a 4-level response format fitted the Rasch model (internal construct validity), and demonstrated unidimensionality and good reliability indices (person-separation reliability 0.92; Cronbach's alpha 0.94). Then, the analysis showed differential item functioning for six of the retained items. The sensitivity to change of the Italian 15-item Lower Extremity Functional Scale was nearly equal to that of the original version (effect size: 0.93 and 0.98; standardized response mean: 1.20 and 1.28, respectively for the 15-item and 20-item versions). The Italian Lower Extremity Functional Scale had unsatisfactory measurement properties. However, removing five items and simplifying the scoring from 5 to 4 levels resulted in a more valid measure with good reliability and sensitivity to change.
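
    The two sensitivity-to-change indices named in this record are usually defined as follows: effect size = mean change / standard deviation of baseline scores, and standardized response mean = mean change / standard deviation of the change scores. The sketch below assumes those conventional definitions (the abstract does not spell them out); the function names and the scores are hypothetical.

```python
from statistics import mean, stdev


def effect_size(before, after):
    """Mean change divided by the SD of the baseline scores."""
    change = [a - b for b, a in zip(before, after)]
    return mean(change) / stdev(before)


def standardized_response_mean(before, after):
    """Mean change divided by the SD of the change scores."""
    change = [a - b for b, a in zip(before, after)]
    return mean(change) / stdev(change)


# Hypothetical pre- and post-rehabilitation scores for three patients.
pre = [10, 12, 14]
post = [13, 14, 18]
print(effect_size(pre, post), standardized_response_mean(pre, post))
```

    Because the two indices use different denominators, they can diverge when patients improve by very similar or very different amounts, which is why studies such as this one often report both.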

  2. Psychometric assessment of the IBS-D Daily Symptom Diary and Symptom Event Log.

    PubMed

    Rosa, Kathleen; Delgado-Herrera, Leticia; Zeiher, Bernie; Banderas, Benjamin; Arbuckle, Rob; Spears, Glen; Hudgens, Stacie

    2016-12-01

    Diarrhea-predominant irritable bowel syndrome (IBS-D) can considerably impact patients' lives. Patient-reported symptoms are crucial in understanding the diagnosis and progression of IBS-D. This study psychometrically evaluates the newly developed IBS-D Daily Symptom Diary and Symptom Event Log (hereafter, "Event Log") according to US regulatory recommendations. A US-based observational field study was conducted to understand cross-sectional psychometric properties of the IBS-D Daily Symptom Diary and Event Log. Analyses included item descriptive statistics, item-to-item correlations, reliability, and construct validity. The IBS-D Daily Symptom Diary and Event Log had no items with excessive missing data. With the exception of two items ("frequency of gas" and "accidents"), moderate to high inter-item correlations were observed among all items of the IBS-D Daily Symptom Diary and Event Log (day 1 range 0.67-0.90). Item scores demonstrated reliability, with the exception of the "frequency of gas" and "accidents" items of the Diary and "incomplete evacuation" item of the Event Log. The pattern of correlations of the IBS-D Daily Symptom Diary and Event Log item scores with generic and disease-specific measures was as expected, moderate for similar constructs and low for dissimilar constructs, supporting construct validity. Known-groups methods showed statistically significant differences and monotonic trends in each of the IBS-D Daily Symptom Diary item scores among groups defined by patients' IBS-D severity ratings ("none"/"mild," "moderate," or "severe"/"very severe"), supporting construct validity. Initial psychometric results support the reliability and validity of the items of the IBS-D Daily Symptom Diary and Event Log.

  3. A multi-level differential item functioning analysis of trends in international mathematics and science study: Potential sources of gender and minority difference among U.S. eighth graders' science achievement

    NASA Astrophysics Data System (ADS)

    Qian, Xiaoyu

    Science is an area where a large achievement gap has been observed between White and minority, and between male and female students. The science minority gap has continued as indicated by the National Assessment of Educational Progress and the Trends in International Mathematics and Science Study (TIMSS). TIMSS also shows a gender gap favoring males emerging at the eighth grade. Both gaps continue to be wider in the number of doctoral degrees and full professorships awarded (NSF, 2008). The current study investigated both minority and gender achievement gaps in science utilizing a multi-level differential item functioning (DIF) methodology (Kamata, 2001) within a fully Bayesian framework. All dichotomously coded items from the TIMSS 2007 science assessment at eighth grade were analyzed. Both gender DIF and minority DIF were studied. Multi-level models were employed to identify DIF items and sources of DIF at both student and teacher levels. The study found that several student variables were potential sources of achievement gaps. It was also found that gender DIF favoring male students was more noticeable in the content areas of physics and earth science than biology and chemistry. In terms of item type, the majority of these gender DIF items were multiple choice rather than constructed response items. Female students also performed less well on items requiring visual-spatial ability. Minority students performed significantly worse on physics and earth science items as well. A higher percentage of minority DIF items in earth science and biology were constructed response than multiple choice items, indicating that literacy may be the cause of minority DIF. Three-level model results suggested that some teacher variables may be the cause of DIF variations from teacher to teacher.
It is essential for both middle school science teachers and science educators to find instructional methods that work more effectively to improve science achievement of both female and minority students. Physics and earth science are two areas to be improved for both groups. Curriculum and instruction need to enhance female students' learning interests and give them opportunities to improve their visual perception skills. Science instruction should address improving minority students' literacy skills while teaching science.

  4. Validating Measurement of Knowledge Integration in Science Using Multiple-Choice and Explanation Items

    ERIC Educational Resources Information Center

    Lee, Hee-Sun; Liu, Ou Lydia; Linn, Marcia C.

    2011-01-01

    This study explores measurement of a construct called knowledge integration in science using multiple-choice and explanation items. We use construct and instructional validity evidence to examine the role multiple-choice and explanation items play in measuring students' knowledge integration ability. For construct validity, we analyze item…

  5. The conceptualization and measurement of cognitive reserve using common proxy indicators: Testing some tenable reflective and formative models.

    PubMed

    Ikanga, Jean; Hill, Elizabeth M; MacDonald, Douglas A

    2017-02-01

    The examination of cognitive reserve (CR) literature reveals a lack of consensus regarding conceptualization and pervasive problems with its measurement. This study aimed at examining the conceptual nature of CR through the analysis of reflective and formative models using eight proxies commonly employed in the CR literature. We hypothesized that all CR proxies would significantly contribute to a one-factor reflective model and that educational and occupational attainment would produce the strongest loadings on a single CR factor. The sample consisted of 149 participants (82 male/67 female), with 18.1 average years of education and ages of 45-99 years. Participants were assessed for eight proxies of CR (parent socioeconomic status, intellectual functioning, level of education, health literacy, occupational prestige, life leisure activities, physical activities, and spiritual and religious activities). Primary statistical analyses consisted of confirmatory factor analysis (CFA) to test reflective models and structural equation modeling (SEM) to evaluate multiple indicators multiple causes (MIMIC) models. CFA did not produce compelling support for a unitary CR construct when using all eight of our CR proxy variables in a reflective model but fairly cogent evidence for a one-factor model with four variable proxies. A second three-factor reflective model based upon an exploratory principal components analysis of the eight proxies was tested using CFA. Though all eight indicators significantly loaded on their assigned factors, evidence in support of overall model fit was mixed. Based upon the results involving the three-factor reflective model, two alternative formative models were developed and evaluated. While some support was obtained for both, the model in which the formative influences were specified as latent variables appeared to best account for the contributions of all eight proxies to the CR construct. 
While the findings provide partial support for our hypothesis regarding CR as a one-dimensional reflective construct, the results strongly suggest that the construct is more complex than what can be captured in a reflective model alone. There is a need for theory to better identify and differentiate formative from reflective indicators and to articulate the mechanisms by which CR develops and operates.

  6. The Multidimensional Assessment of Interoceptive Awareness (MAIA)

    PubMed Central

    Mehling, Wolf E.; Price, Cynthia; Daubenmier, Jennifer J.; Acree, Mike; Bartmess, Elizabeth; Stewart, Anita

    2012-01-01

    This paper describes the development of a multidimensional self-report measure of interoceptive body awareness. The systematic mixed-methods process involved reviewing the current literature, specifying a multidimensional conceptual framework, evaluating prior instruments, developing items, and analyzing focus group responses to scale items by instructors and patients of body awareness-enhancing therapies. Following refinement by cognitive testing, items were field-tested in students and instructors of mind-body approaches. Final item selection was achieved by submitting the field test data to an iterative process using multiple validation methods, including exploratory cluster and confirmatory factor analyses, comparison between known groups, and correlations with established measures of related constructs. The resulting 32-item multidimensional instrument assesses eight concepts. The psychometric properties of these final scales suggest that the Multidimensional Assessment of Interoceptive Awareness (MAIA) may serve as a starting point for research and further collaborative refinement. PMID:23133619

  7. NEPA, a new fixed combination of netupitant and palonosetron, is a cost-effective intervention for the prevention of chemotherapy-induced nausea and vomiting in the UK

    PubMed Central

    Cawston, Helene; Bourhis, Francois; Eriksson, Jennifer; Ruffo, Pierfrancesco; D’Agostino, Paolo; Turini, Marco; Schwartzberg, Lee; McGuire, Alistair

    2017-01-01

    Background The objective was to evaluate the cost-effectiveness of NEPA, an oral fixed combination netupitant (NETU, 300 mg) and palonosetron (PA, 0.5 mg) compared with aprepitant and palonosetron (APPA) or palonosetron (PA) alone, to prevent chemotherapy-induced nausea and vomiting (CINV) in patients undergoing treatment with highly or moderately emetogenic chemotherapy (HEC or MEC) in the UK. Scope A systematic literature review and meta-analysis were undertaken to compare NEPA with currently recommended anti-emetics. Relative effectiveness was estimated over the acute (day 1) and overall treatment (days 1–5) phases, taking complete response (CR, no emesis and no rescue medication) and complete protection (CP, CR and no more than mild nausea [VAS scale <25 mm]) as primary efficacy outcomes. A three-health-state Markov cohort model, including CP, CR and incomplete response (no CR) for HEC and MEC, was constructed. A five-day time horizon and UK NHS perspective were adopted. Transition probabilities were obtained by combining the response rates of CR and CP from NEPA trials and odds ratios from the meta-analysis. Utilities of 0.90, 0.70 and 0.24 were defined for CP, CR and incomplete response, respectively. Costs included medications and management of CINV-related events and were obtained from the British National Formulary and NHS Reference Costs. The expected budgetary impact of NEPA was also evaluated. Findings In HEC patients, the NEPA strategy was more effective than APPA (quality-adjusted life days [QALDs] of 4.263 versus 4.053; incremental emesis-free and CINV-free days of +0.354 and +0.237, respectively) and was less costly (£80 versus £124), resulting in NEPA being the dominant strategy. In MEC patients, NEPA was cost effective, cumulating in an estimated 0.182 extra QALDs at an incremental cost of £6.65 compared with PA. 
Conclusion Despite study limitations (study setting, time horizon, utility measure), the results suggest NEPA is cost effective for preventing CINV associated with HEC and MEC in the UK. PMID:28392826
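
    The arithmetic behind these figures can be sketched as follows. This is a minimal illustration, not the authors' model: the utilities and the cost and QALD figures come from the abstract, while the function names `qalds` and `compare` and the all-CP cohort are hypothetical, and transition probabilities are not modelled at all.

```python
# Utilities per health state as reported in the abstract.
UTILITIES = {"CP": 0.90, "CR": 0.70, "incomplete": 0.24}


def qalds(occupancy_by_day):
    """Quality-adjusted life days: utility-weighted state occupancy summed over days."""
    return sum(
        UTILITIES[state] * share
        for day in occupancy_by_day
        for state, share in day.items()
    )


def compare(cost_new, cost_old, effect_new, effect_old):
    """Classify a strategy as dominant or dominated, or return its ICER."""
    d_cost = cost_new - cost_old
    d_effect = effect_new - effect_old
    if d_effect == 0:
        return ("equal_effect", None)  # no ratio can be formed
    if d_cost <= 0 and d_effect > 0:
        return ("dominant", None)      # cheaper (or equal) and more effective
    if d_cost >= 0 and d_effect < 0:
        return ("dominated", None)     # dearer (or equal) and less effective
    return ("icer", d_cost / d_effect)


# Hypothetical cohort: everyone stays in complete protection for all five days.
all_cp = [{"CP": 1.0, "CR": 0.0, "incomplete": 0.0}] * 5
upper_bound = qalds(all_cp)

# HEC figures from the abstract: NEPA versus APPA (costs in GBP, effects in QALDs).
hec = compare(80.0, 124.0, 4.263, 4.053)

# MEC figures from the abstract are already increments versus PA.
mec = compare(6.65, 0.0, 0.182, 0.0)
```

    With a five-day horizon and a top utility of 0.90, no strategy can exceed 4.5 QALDs, which is consistent with the reported 4.263 and 4.053. The MEC ratio comes out per quality-adjusted life day, so it would need rescaling before comparison with a per-QALY threshold.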

  8. Calibrating well-being, quality of life and common mental disorder items: psychometric epidemiology in public mental health research.

    PubMed

    Böhnke, Jan R; Croudace, Tim J

    2016-08-01

    The assessment of 'general health and well-being' in public mental health research stimulates debates around relative merits of questionnaire instruments and their items. Little evidence regarding alignment or differential advantages of instruments or items has appeared to date. Population-based psychometric study of items employed in public mental health narratives. Multidimensional item response theory was applied to General Health Questionnaire (GHQ-12), Warwick-Edinburgh Mental Well-being Scale (WEMWBS) and EQ-5D items (Health Survey for England, 2010-2012; n = 19 290). A bifactor model provided the best account of the data and showed that the GHQ-12 and WEMWBS items assess mainly the same construct. Only one item of the EQ-5D showed relevant overlap with this dimension (anxiety/depression). Findings were corroborated by comparisons with alternative models and cross-validation analyses. The consequences of this lack of differentiation (GHQ-12 v. WEMWBS) for mental health and well-being narratives deserve discussion to enrich debates on priorities in public mental health and its assessment. © The Royal College of Psychiatrists 2015.

  9. Comparing five depression measures in depressed Chinese patients using item response theory: an examination of item properties, measurement precision and score comparability.

    PubMed

    Zhao, Yue; Chan, Wai; Lo, Barbara Chuen Yee

    2017-04-04

    Item response theory (IRT) has been increasingly applied to patient-reported outcome (PRO) measures. The purpose of this study is to apply IRT to examine item properties (discrimination and severity of depressive symptoms), measurement precision and score comparability across five depression measures, which is the first study of its kind in the Chinese context. A clinical sample of 207 Hong Kong Chinese outpatients was recruited. Data analyses were performed including classical item analysis, IRT concurrent calibration and IRT true score equating. The IRT assumptions of unidimensionality and local independence were tested respectively using confirmatory factor analysis and chi-square statistics. The IRT linking assumptions of construct similarity, equity and subgroup invariance were also tested. The graded response model was applied to concurrently calibrate all five depression measures in a single IRT run, resulting in the item parameter estimates of these measures being placed onto a single common metric. IRT true score equating was implemented to perform the outcome score linking and construct score concordances so as to link scores from one measure to corresponding scores on another measure for direct comparability. Findings suggested that (a) symptoms on depressed mood, suicidality and feeling of worthlessness served as the strongest discriminating indicators, and symptoms concerning suicidality, changes in appetite, depressed mood, feeling of worthlessness and psychomotor agitation or retardation reflected high levels of severity in the clinical sample. (b) The five depression measures contributed to various degrees of measurement precision at varied levels of depression. (c) After outcome score linking was performed across the five measures, the cut-off scores led to either consistent or discrepant diagnoses for depression. 
The study provides additional evidence regarding the psychometric properties and clinical utility of the five depression measures, offers methodological contributions to the appropriate use of IRT in PRO measures, and helps elucidate cultural variation in depressive symptomatology. The approach of concurrently calibrating and linking multiple PRO measures can be applied to the assessment of PROs other than the depression context.

  10. [KON-2006--Neurotic Personality Questionnaire].

    PubMed

    Aleksandrowicz, Jerzy W; Klasa, Katarzyna; Sobański, Jerzy A; Stolarska, Dorota

    2007-01-01

    Construction of a questionnaire describing personality traits connected to the occurrence and persistence of neurotic disorders. Responses of 794 patients (before treatment) and 520 persons from the control group on items of the constructed personality questionnaire and the symptom checklist KO"0". Analyses of subscale reliability and item-scale correlations, test-retest and split-half reliability. Factor analyses estimating internal reliability of the questionnaire. Cross-validation with the KO"0" symptom checklist. Psychometric properties of the KON-2006 questionnaire indicate that it is sufficiently consistent and reliable. Validity analyses indicate a large probability that the X-KON coefficient informs on personality dysfunctions related to neurotic disorders. The Neurotic Personality Questionnaire KON-2006 may serve to estimate personality traits connected to the occurrence and persistence of neurotic disorders as well as changes resulting from psychotherapy.

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brandt, C.C.; Benson, S.B.; Beeler, D.A.

    The Clinch River Remedial Investigation (CRRI) is designed to address the transport, fate, and distribution of waterborne contaminants (radionuclides, metals, and organic compounds) released from the US Department of Energy's (DOE's) Oak Ridge Reservation (ORR) and to assess potential risks to human health and the environment associated with these contaminants. The remedial investigation is entering Phase 2, which has the following items as its objectives: define the nature and extent of the contamination in areas downstream from the DOE ORR, evaluate the human health and ecological risks posed by these contaminants, and perform preliminary identification and evaluation of potential remediation alternatives. This plan describes the requirements, responsibilities, and roles of personnel during sampling, analysis, and data review for the Clinch River Environmental Restoration Program (CR-ERP). The purpose of the plan is to formalize the process for obtaining analytical services, tracking sampling and analysis documentation, and assessing the overall quality of the CR-ERP data collection program to ensure that it will provide the necessary building blocks for the program decision-making process.

  12. Computer-adaptive test to measure community reintegration of Veterans.

    PubMed

    Resnik, Linda; Tian, Feng; Ni, Pengsheng; Jette, Alan

    2012-01-01

    The Community Reintegration of Injured Service Members (CRIS) measure consists of three scales measuring extent of, perceived limitations in, and satisfaction with community reintegration. Length of the CRIS may be a barrier to its widespread use. Using item response theory (IRT) and computer-adaptive test (CAT) methodologies, this study developed and evaluated a briefer community reintegration measure called the CRIS-CAT. Large item banks for each CRIS scale were constructed. A convenience sample of 517 Veterans responded to all items. Exploratory and confirmatory factor analyses (CFAs) were used to identify the dimensionality within each domain, and IRT methods were used to calibrate items. Accuracy and precision of CATs of different lengths were compared with the full-item bank, and data were examined for differential item functioning (DIF). CFAs supported unidimensionality of scales. Acceptable item fit statistics were found for final models. Accuracy of 10-, 15-, 20-, and variable-item CATs for all three scales was 0.88 or above. CAT precision increased with number of items administered and decreased at the upper ranges of each scale. Three items exhibited moderate DIF by sex. The CRIS-CAT demonstrated promising measurement properties and is recommended for use in community reintegration assessment.

  13. Experimentally Manipulating Items Informs on the (Limited) Construct and Criterion Validity of the Humor Styles Questionnaire

    PubMed Central

    Ruch, Willibald; Heintz, Sonja

    2017-01-01

    How strongly does humor (i.e., the construct-relevant content) in the Humor Styles Questionnaire (HSQ; Martin et al., 2003) determine the responses to this measure (i.e., construct validity)? Also, how much does humor influence the relationships of the four HSQ scales, namely affiliative, self-enhancing, aggressive, and self-defeating, with personality traits and subjective well-being (i.e., criterion validity)? The present paper answers these two questions by experimentally manipulating the 32 items of the HSQ to only (or mostly) contain humor (i.e., construct-relevant content) or to substitute the humor content with non-humorous alternatives (i.e., only assessing construct-irrelevant context). Study 1 (N = 187) showed that the HSQ affiliative scale was mainly determined by humor, self-enhancing and aggressive were determined by both humor and non-humorous context, and self-defeating was primarily determined by the context. This suggests that humor is not the primary source of the variance in three of the HSQ scales, thereby limiting their construct validity. Study 2 (N = 261) showed that the relationships of the HSQ scales to the Big Five personality traits and subjective well-being (positive affect, negative affect, and life satisfaction) were consistently reduced (personality) or vanished (subjective well-being) when the non-humorous contexts in the HSQ items were controlled for. For the HSQ self-defeating scale, the pattern of relationships to personality was also altered, supporting a positive rather than a negative view of the humor in this humor style. The present findings thus call for a reevaluation of the role that humor plays in the HSQ (construct validity) and in the relationships to personality and well-being (criterion validity). PMID:28473794

  14. Validation of the Middlesex Elderly Assessment of Mental State (MEAMS) as a cognitive screening test in patients with acquired brain injury in Turkey.

    PubMed

    Kutlay, Sehim; Kuçukdeveci, Ayse A; Elhan, Atilla H; Yavuzer, Gunes; Tennant, Alan

    2007-02-28

    Assessment of cognitive impairment with a valid cognitive screening tool is essential in neurorehabilitation. The aim of this study was to test the reliability and validity of the Turkish-adapted version of the Middlesex Elderly Assessment of Mental State (MEAMS) among acquired brain injury patients in Turkey. Some 155 patients with acquired brain injury admitted for rehabilitation were assessed by the adapted version of MEAMS at admission and discharge. Reliability was tested by internal consistency, intra-class correlation coefficient (ICC) and person separation index; internal construct validity by Rasch analysis; external construct validity by associations with physical and cognitive disability (FIM); and responsiveness by Effect Size. Reliability was found to be good with Cronbach's alpha of 0.82 at both admission and discharge; and likewise an ICC of 0.80. Person separation index was 0.813. Internal construct validity was good by fit of the data to the Rasch model (mean item fit -0.178; SD 1.019). Items were substantially free of differential item functioning. External construct validity was confirmed by expected associations with physical and cognitive disability. Effect size was 0.42 compared with 0.22 for cognitive FIM. The reliability and validity of the Turkish version of MEAMS as a cognitive impairment screening tool in acquired brain injury has been demonstrated.
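
    For readers unfamiliar with the internal-consistency statistic reported above, a minimal sketch of Cronbach's alpha follows. The function name `cronbach_alpha` and the toy score matrix are hypothetical, and the study's own software and data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: three respondents, three perfectly consistent items.
toy_scores = [
    [1.0, 1.0, 1.0],
    [2.0, 2.0, 2.0],
    [3.0, 3.0, 3.0],
]
alpha_toy = cronbach_alpha(toy_scores)
```

    Perfectly consistent items give an alpha of 1, which puts the reported 0.82 in context. The sketch assumes the total scores are not all identical, since a zero total variance would make the ratio undefined.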

  15. Evaluation of diagnostic criteria for panic attack using item response theory: findings from the National Comorbidity Survey in USA.

    PubMed

    Ietsugu, Tetsuji; Sukigara, Masune; Furukawa, Toshiaki A

    2007-12-01

    The dichotomous diagnostic systems such as the Diagnostic and Statistical Manual of Mental Disorders (DSM) and International Classification of Diseases (ICD) lose much important information concerning what each symptom can offer. This study explored the characteristics and performances of DSM-IV and ICD-10 diagnostic criteria items for panic attack using modern item response theory (IRT). The National Comorbidity Survey used the Composite International Diagnostic Interview to assess 14 DSM-IV and ICD-10 panic attack diagnostic criteria items in the general population in the USA. The dimensionality and measurement properties of these items were evaluated using dichotomous factor analysis and the two-parameter IRT model. A total of 1213 respondents reported at least one subsyndromal or syndromal panic attack in their lifetime. Factor analysis indicated that all items constitute a unidimensional construct. The two-parameter IRT model produced meaningful and interpretable results. Among items with high discrimination parameters, the difficulty parameter for "palpitation" was relatively low, while those for "choking," "fear of dying" and "paresthesia" were relatively high. Several items including "dry mouth" and "fear of losing control" had low discrimination parameters. The item characteristics of diagnostic criteria among help-seeking clinical populations may be different from those that we observed in the general population and deserve further examination. "Paresthesia," "choking" and "fear of dying" can be thought to be good indicators of severe panic attacks, while "palpitation" can discriminate well between cases and non-cases at low level of panic attack severity. Items such as "dry mouth" would contribute less to the discrimination.
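
    The two-parameter model referred to above can be sketched in a few lines. The parameter values below are hypothetical and chosen only to mirror the qualitative pattern described in the abstract (high discrimination with low difficulty for "palpitation", high discrimination with high difficulty for "choking", low discrimination for "dry mouth"); they are not the study's estimates.

```python
import math


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item at severity theta.

    a is the discrimination parameter, b the difficulty (severity) parameter.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical (a, b) pairs, for illustration only.
ITEMS = {
    "palpitation": (2.0, -1.0),
    "choking": (2.0, 1.5),
    "dry mouth": (0.5, 0.0),
}

# Endorsement probability of each item at a mild severity level.
mild = {name: p_endorse(-1.0, a, b) for name, (a, b) in ITEMS.items()}

# Peak information sits at theta == b and equals a^2 / 4.
peaks = {name: item_information(b, a, b) for name, (a, b) in ITEMS.items()}
```

    Under these assumed values, "palpitation" is endorsed half the time at mild severity while "choking" is rarely endorsed there, and the low-discrimination item carries far less peak information than the other two.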

  16. The development and psychometric validation of the Ethical Awareness Scale.

    PubMed

    Milliken, Aimee; Ludlow, Larry; DeSanto-Madeya, Susan; Grace, Pamela

    2018-04-19

    To develop and psychometrically assess the Ethical Awareness Scale using Rasch measurement principles and a Rasch item response theory model. Critical care nurses must be equipped to provide good (ethical) patient care. This requires ethical awareness, which involves recognizing the ethical implications of all nursing actions. Ethical awareness is imperative in successfully addressing patient needs. Evidence suggests that the ethical import of everyday issues may often go unnoticed by nurses in practice. Assessing nurses' ethical awareness is a necessary first step in preparing nurses to identify and manage ethical issues in the highly dynamic critical care environment. A cross-sectional design was used in two phases of instrument development. Using Rasch principles, an item bank representing nursing actions was developed (33 items). Content validity testing was performed. Eighteen items were selected for face validity testing. Two rounds of operational testing were performed with critical care nurses in Boston between February-April 2017. A Rasch analysis suggests sufficient item invariance across samples and sufficient construct validity. The analysis further demonstrates a progression of items uniformly along a hierarchical continuum; items that match respondent ability levels; response categories that are sufficiently used; and adequate internal consistency. Mean ethical awareness scores were in the low/moderate range. The results suggest the Ethical Awareness Scale is a psychometrically sound, reliable and valid measure of ethical awareness in critical care nurses. © 2018 John Wiley & Sons Ltd.

  17. A longitudinal evaluation of the Center for Epidemiologic Studies-Depression scale (CES-D) in a Rheumatoid Arthritis Population using Rasch Analysis

    PubMed Central

    Covic, Tanya; Pallant, Julie F; Conaghan, Philip G; Tennant, Alan

    2007-01-01

    Background The aim of this study was to test the internal validity of the total Center for Epidemiologic Studies-Depression (CES-D) scale using Rasch analysis in a rheumatoid arthritis (RA) population. Methods CES-D was administered to 157 patients with RA over three time points within a 12 month period. Rasch analysis was applied using RUMM2020 software to assess the overall fit of the model, the response scale used, individual item fit, differential item functioning (DIF) and person separation. Results Pooled data across three time points was shown to fit the Rasch model with removal of seven items from the original 20-item CES-D scale. It was necessary to rescore the response format from four to three categories in order to improve the scale's fit. Two items demonstrated some DIF for age and gender but were retained within the 13-item CES-D scale. A new cut point for depression score of 9 was found to correspond to the original cut point score of 16 in the full CES-D scale. Conclusion This Rasch analysis of the CES-D in a longstanding RA cohort resulted in the construction of a modified 13-item scale with good internal validity. Further validation of the modified scale is recommended particularly in relation to the new cut point for depression. PMID:17629902

  18. Role of Cognitive Testing in the Development of the CAHPS® Hospital Survey

    PubMed Central

    Levine, Roger E; Fowler, Floyd J; Brown, Julie A

    2005-01-01

    Objective To describe how cognitive testing results were used to inform the modification and selection of items for the Consumer Assessment of Health Providers and Systems (CAHPS®) Hospital Survey pilot test instrument. Data Sources Cognitive interviews were conducted on 31 subjects in two rounds of testing: in December 2002–January 2003 and in February 2003. In both rounds, interviews were conducted in northern California, southern California, Massachusetts, and North Carolina. Study Design A common protocol served as the basis for cognitive testing activities in each round. This protocol was modified to enable testing of the items as interviewer-administered and self-administered items and to allow members of each of three research teams to use their preferred cognitive research tools. Data Collection/Extraction Methods Each research team independently summarized, documented, and reported their findings. Item-specific and general issues were noted. The results were reviewed and discussed by senior staff from each research team after each round of testing, to inform the acceptance, modification, or elimination of candidate items. Principal Findings Many candidate items required modification because respondents lacked the information required to answer them, respondents failed to understand them consistently, the items were not measuring the constructs they were intended to measure, the items were based on erroneous assumptions about what respondents wanted or experienced during their hospitalization, or the items were asking respondents to make distinctions that were too fine for them to make. Cognitive interviewing enabled the detection of these problems; an understanding of the etiology of the problem informed item revisions. However, for some constructs, the revisions proved to be inadequate. 
Accordingly, items could not be developed to provide acceptable measures of certain constructs such as shared decision making, coordination of care, and delays in the admissions process. Conclusions Cognitive testing is the most direct way of finding out whether respondents understand questions consistently, have the information needed to answer the questions, and can use the response alternatives provided to describe their experiences or their opinions accurately. Many of the candidate questions failed to meet these standards. Cognitive testing only evaluates the way in which respondents understand and answer questions. Although it does not directly assess the validity of the answers, it is a reasonable premise that cognitive problems will seriously compromise validity and reliability. PMID:16316437

  19. An item response theory analysis of nicotine dependence symptoms in recent onset adolescent smokers.

    PubMed

    Rose, Jennifer S; Dierker, Lisa C

    2010-07-01

    Given absence of a "gold standard" for measuring self-reported nicotine dependence, particularly among less experienced smokers, there is a need to evaluate existing measures to determine how well symptoms measure the underlying nicotine dependence construct and whether symptoms function differently for less experienced smokers. Study aims were to determine (1) likelihood of endorsement of individual symptoms at different levels of a nicotine dependence construct and the ability of symptoms to discriminate between different levels of this construct and (2) whether these symptom properties varied between nondaily and daily smokers. We used multiple group item response theory analysis to evaluate nicotine dependence symptoms from the nicotine dependence syndrome scale based on a nationally representative sample of 8081 recent onset adolescent smokers from the national surveys on drug use and health. After controlling for age, gender, smoking quantity and length of smoking exposure, symptoms assessing tolerance were invariant across nondaily and daily smokers, and discriminated well between levels of the nicotine dependence construct. However, the majority of symptoms functioned differently for nondaily and daily smokers. These symptoms did not discriminate as well between levels of the nicotine dependence construct and were more likely to be endorsed at lower levels of this construct for daily smokers. A measure that encompasses a range of symptoms tapping different aspects of smoking may be ideally suited for nondaily adolescent smokers, while an ideal measure of nicotine dependence for daily smokers might also include more core diagnostic features of nicotine dependence such as withdrawal and tolerance. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.

  20. RESEARCH ON ROBUST METHODS FOR EXTRACTING AND RECOGNIZING PHOTOGRAPHY MANAGEMENT ITEMS FROM VARIOUS IMAGE DATA OF CONSTRUCTION

    NASA Astrophysics Data System (ADS)

    Kitagawa, Etsuji; Tanaka, Shigenori; Abiko, Satoshi; Wakabayashi, Katsuma; Jiang, Wenyuan

    Recently, the Ministry of Land, Infrastructure, Transport and Tourism has introduced electronic delivery of various documents in the construction field. Among these are image data of construction photographs, which must be delivered together with photography management items such as the construction name and type of work. However, transcribing the contents of these items from the printed and handwritten characters on the blackboard shown in each image is costly. In this research, we develop a system that extracts the contents of these items from construction photographs taken in various scenes by preprocessing the image, recognizing characters with OCR, and correcting errors with natural language processing. We confirm the effectiveness of the system through experiments on each function and on the system as a whole.

  1. Behavior determinants among cardiac rehabilitation patients receiving educational interventions: an application of the health action process approach.

    PubMed

    Ghisi, Gabriela Lima de Melo; Grace, Sherry L; Thomas, Scott; Oh, Paul

    2015-05-01

    To (1) test the effect of a health action process approach (HAPA) theory-based education program in cardiac rehabilitation (CR) compared to traditional education on patient knowledge and HAPA constructs; and, (2) investigate the theoretical correlates of exercise behavior among CR patients receiving theory-based education. CR patients were exposed to an existing or HAPA-based 6 month education curriculum in this quasi-experimental study. Participants completed a survey assessing exercise behavior, HAPA constructs, and knowledge pre and post-program. 306 patients consented to participate, of which 146 (47.7%) were exposed to the theory-based educational curriculum. There was a significant improvement in patients' overall knowledge pre- to post-CR, as well as in some HAPA constructs and exercise behavior, regardless of curriculum (p < 0.05). Path analysis revealed that knowledge was significantly related to intention formation, and intentions to engage in exercise were not directly related to behavior, which required action planning. The theoretically-informed education curriculum was not associated with greater knowledge or exercise behavior as expected. Education in CR improves knowledge, and theoretical constructs related to exercise behavior. Educational curricula should be designed to not only increase patients' knowledge, but also enhance intentions, self-efficacy, and action planning. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  2. Prognostic value of tumour regression grade in locally advanced rectal cancer: a systematic review and meta-analysis.

    PubMed

    Kong, J C; Guerra, G R; Warrier, S K; Lynch, A Craig; Michael, M; Ngan, S Y; Phillips, W; Ramsay, G; Heriot, A G

    2018-03-27

    The current standard of care for locally advanced rectal cancer involves neoadjuvant chemoradiotherapy (CRT) followed by total mesorectal excision. There is a spectrum of response to neoadjuvant therapy; however, the prognostic value of tumour regression grade (TRG) in predicting disease-free survival (DFS) or overall survival (OS) is inconsistent in the literature. This study was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. A systematic search was undertaken using Ovid MEDLINE, Embase and Google Scholar. Inclusion criteria were Stage II and III locally advanced rectal cancer treated with long-course CRT followed by radical surgery. The aim of the meta-analysis was to assess the prognostic implication of each TRG for rectal cancer following neoadjuvant CRT. Long-term prognosis was assessed. The main outcome measures were DFS and OS. A random effects model was performed to pool the hazard ratio (HR) from all included studies. There were 4875 patients from 17 studies, with 775 (15.9%) attaining a pathological complete response (pCR) and 719 (29.9%) with no response. A significant association with OS was identified from a pooled-estimated HR for pCR (HR = 0.47, P = 0.002) and nonresponding tumours (HR = 2.97; P < 0.001). Previously known tumour characteristics, such as ypN, lymphovascular invasion and perineural invasion, were also significantly associated with DFS and OS, with estimated pooled HRs of 2.2, 1.4 and 2.3, respectively. In conclusion, the degree of TRG was of prognostic value in predicting long-term outcomes. The current challenge is the development of high-validity tests to predict pCR. Colorectal Disease © 2018 The Association of Coloproctology of Great Britain and Ireland.
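
    Random-effects pooling of hazard ratios can be sketched as below. The abstract does not say which estimator was used; the DerSimonian-Laird estimator here is an assumption, as are the function name `pool_hazard_ratios` and the input triples, which are made up and are not the included studies.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a 95% interval


def pool_hazard_ratios(studies):
    """Pool (hr, ci_low, ci_high) triples on the log scale.

    Uses the DerSimonian-Laird between-study variance (an assumed choice).
    Returns (pooled_hr, ci_low, ci_high, tau2).
    """
    y = [math.log(hr) for hr, _, _ in studies]
    # Standard error recovered from the width of the 95% CI on the log scale.
    se = [(math.log(hi) - math.log(lo)) / (2 * Z95) for _, lo, hi in studies]
    w = [1.0 / s ** 2 for s in se]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    # Cochran's Q and the DerSimonian-Laird scaling constant.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    if len(studies) < 2 or c <= 0:
        tau2 = 0.0
    else:
        tau2 = max(0.0, (q - (len(studies) - 1)) / c)
    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return (
        math.exp(pooled),
        math.exp(pooled - Z95 * se_pooled),
        math.exp(pooled + Z95 * se_pooled),
        tau2,
    )


# Hypothetical inputs: three identical studies, then two discordant ones.
homogeneous = pool_hazard_ratios([(0.5, 0.25, 1.0)] * 3)
heterogeneous = pool_hazard_ratios([(0.4, 0.3, 0.53), (0.9, 0.7, 1.16)])
```

    When the studies agree, the between-study variance is zero and the result matches a fixed-effect pool. When they disagree, the added variance evens out the weights and widens the pooled interval.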

  3. Rasch Analysis of the Adult Strabismus Quality of Life Questionnaire (AS-20) among Chinese Adult Patients with Strabismus.

    PubMed

    Wang, Zonghua; Zhou, Juan; Luo, Xingli; Xu, Yan; She, Xi; Chen, Ling; Yin, Honghua; Wang, Xianyuan

    2015-01-01

    The impact of strabismus on visual function, self-image, self-esteem, and social interactions decreases health-related quality of life (HRQoL). The purpose of this study was to evaluate and refine the adult strabismus quality of life questionnaire (AS-20) by using Rasch analysis among Chinese adult patients with strabismus. We evaluated the fit of the AS-20 to the Rasch model in a Chinese population by assessing unidimensionality, infit and outfit, person and item separation index and reliability, response ordering, targeting and differential item functioning (DIF). The overall AS-20 did not demonstrate unidimensionality; however, it was achieved separately in the two Rasch-revised subscales: the psychosocial subscale (11 items) and the function subscale (9 items). The features of good targeting, optimal item infit and outfit, and no notable local dependence were found for each of the subscales. The rating scale was appropriate for the psychosocial subscale but a reduction to four response categories was required for the function subscale. No significant DIF was revealed for any demographic and clinical factors (e.g., age, gender, and strabismus types). The AS-20 was demonstrated by Rasch analysis to be a rigorous instrument for measuring health-related quality of life in Chinese strabismus patients if some revisions were made regarding the subscale construct and response options.

  4. Adaptive Testing without IRT.

    ERIC Educational Resources Information Center

    Yan, Duanli; Lewis, Charles; Stocking, Martha

    It is unrealistic to suppose that standard item response theory (IRT) models will be appropriate for all new and currently considered computer-based tests. In addition to developing new models, researchers will need to give some attention to the possibility of constructing and analyzing new tests without the aid of strong models. Computerized…

  5. Procrastination Revisited: The Constructive Use of Delayed Response.

    ERIC Educational Resources Information Center

    Subotnik, Rena F.; And Others

    This study investigated patterns of procrastination in the domains of health, relationships, employment, and creative outlets in 19 former Westinghouse Science Talent Search winners, age 32 years. A model was synthesized from the available literature and an interview schedule of 14 open-ended items was developed to elicit self-assessments of…

  6. Investigating Psychometric Isomorphism for Traditional and Performance-Based Assessment

    ERIC Educational Resources Information Center

    Fay, Derek M.; Levy, Roy; Mehta, Vandhana

    2018-01-01

    A common practice in educational assessment is to construct multiple forms of an assessment that consists of tasks with similar psychometric properties. This study utilizes a Bayesian multilevel item response model and descriptive graphical representations to evaluate the psychometric similarity of variations of the same task. These approaches for…

  7. Civic Engagement in College Students: Connections between Involvement and Attitudes

    ERIC Educational Resources Information Center

    O'Leary, Lisa S.

    2014-01-01

    This chapter describes how canonical correlation was used in conjunction with an item response theory model to address the relationship between college students' civic engagement involvement and attitudes as undergraduates. The constructs of interest were students' participation in civic, political, and expressive activities, as well as…

  8. Comparison of Automated Scoring Methods for a Computerized Performance Assessment of Clinical Judgment

    ERIC Educational Resources Information Center

    Harik, Polina; Baldwin, Peter; Clauser, Brian

    2013-01-01

    Growing reliance on complex constructed response items has generated considerable interest in automated scoring solutions. Many of these solutions are described in the literature; however, relatively few studies have been published that "compare" automated scoring strategies. Here, comparisons are made among five strategies for…

  9. 24 CFR 570.207 - Ineligible activities.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... to carry out the regular responsibilities of the unit of general local government are not eligible... construction equipment for use as part of a solid waste disposal facility is eligible under § 570.201(c). (ii... grant payments made to an individual or family for items such as food, clothing, housing (rent or...

  10. 24 CFR 570.207 - Ineligible activities.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... to carry out the regular responsibilities of the unit of general local government are not eligible... construction equipment for use as part of a solid waste disposal facility is eligible under § 570.201(c). (ii... grant payments made to an individual or family for items such as food, clothing, housing (rent or...

  11. 24 CFR 570.207 - Ineligible activities.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... to carry out the regular responsibilities of the unit of general local government are not eligible... construction equipment for use as part of a solid waste disposal facility is eligible under § 570.201(c). (ii... grant payments made to an individual or family for items such as food, clothing, housing (rent or...

  12. Rewards of bridging the divide between measurement and clinical theory: demonstration of a bifactor model for the Brief Symptom Inventory.

    PubMed

    Thomas, Michael L

    2012-03-01

    There is growing evidence that psychiatric disorders maintain hierarchical associations where general and domain-specific factors play prominent roles (see D. Watson, 2005). Standard, unidimensional measurement models can fail to capture the meaningful nuances of such complex latent variable structures. The present study examined the ability of the multidimensional item response theory bifactor model (see R. D. Gibbons & D. R. Hedeker, 1992) to improve construct validity by serving as a bridge between measurement and clinical theories. Archival data consisting of 688 outpatients' psychiatric diagnoses and item-level responses to the Brief Symptom Inventory (BSI; L. R. Derogatis, 1993) were extracted from files at a university mental health clinic. The bifactor model demonstrated superior fit for the internal structure of the BSI and improved overall diagnostic accuracy in the sample (73%) compared with unidimensional (61%) and oblique simple structure (65%) models. Consistent with clinical theory, multiple sources of item variance were drawn from individual test items. Test developers and clinical researchers are encouraged to consider model-based measurement in the assessment of psychiatric distress.

  13. An item response theory analysis of the Psychological Inventory of Criminal Thinking Styles: comparing male and female probationers and prisoners.

    PubMed

    Walters, Glenn D

    2014-09-01

    An item response theory (IRT) analysis of the Psychological Inventory of Criminal Thinking Styles (PICTS) was performed on 26,831 (19,067 male and 7,764 female) federal probationers and compared with results obtained on 3,266 (3,039 male and 227 female) prisoners from previous research. Despite the fact male and female federal probationers scored significantly lower on the PICTS thinking style scales than male and female prisoners, discrimination and location parameter estimates for the individual PICTS items were comparable across sex and setting. Consistent with the results of a previous IRT analysis conducted on the PICTS, the current results did not support sentimentality as a component of general criminal thinking. Findings from this study indicate that the discriminative power of the individual PICTS items is relatively stable across sex (male, female) and correctional setting (probation, prison) and that the PICTS may be measuring the same criminal thinking construct in male and female probationers and prisoners. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  14. What is the Ability Emotional Intelligence Test (MSCEIT) good for? An evaluation using item response theory.

    PubMed

    Fiori, Marina; Antonietti, Jean-Philippe; Mikolajczak, Moira; Luminet, Olivier; Hansenne, Michel; Rossier, Jérôme

    2014-01-01

    The ability approach has been indicated as promising for advancing research in emotional intelligence (EI). However, there is scarcity of tests measuring EI as a form of intelligence. The Mayer Salovey Caruso Emotional Intelligence Test, or MSCEIT, is among the few available and the most widespread measure of EI as an ability. This implies that conclusions about the value of EI as a meaningful construct and about its utility in predicting various outcomes mainly rely on the properties of this test. We tested whether individuals who have the highest probability of choosing the most correct response on any item of the test are also those who have the strongest EI ability. Results showed that this is not the case for most items: The answer indicated by experts as the most correct in several cases was not associated with the highest ability; furthermore, items appeared too easy to challenge individuals high in EI. Overall results suggest that the MSCEIT is best suited to discriminate persons at the low end of the trait. Results are discussed in light of applied and theoretical considerations.

  15. Soil retention of hexavalent chromium released from construction and demolition waste in a road-base-application scenario.

    PubMed

    Butera, Stefania; Trapp, Stefan; Astrup, Thomas F; Christensen, Thomas H

    2015-11-15

    We investigated the retention of Cr(VI) in three subsoils with low organic matter content in laboratory experiments at concentration levels relevant to represent leachates from construction and demolition waste (C&DW) reused as unbound material in road construction. The retention mechanism appeared to be reduction and subsequent precipitation as Cr(III) on the soil. The reduction process was slow and in several experiments it was still proceeding at the end of the six-month experimental period. The overall retention reaction fit well with a second-order reaction governed by actual Cr(VI) concentration and reduction capacity of the soil. The experimentally determined reduction capacities and second-order kinetic parameters were used to model, for a 100-year period, the one-dimensional migration of Cr(VI) in the subsoil under a layer of C&DW. The resulting Cr(VI) concentration would be negligible below 7-70 cm depth. However, in rigid climates and with high water infiltration through the road pavement, the reduction reaction could be so slow that Cr(VI) might migrate as deep as 200 cm under the road. The reaction parameters and the model can form the basis for systematically assessing under which scenarios Cr(VI) from C&DW could lead to an environmental issue for ground- and receiving surface waters. Copyright © 2015 Elsevier B.V. All rights reserved.
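
    The second-order retention kinetics described above can be illustrated with a short numerical sketch. It assumes a rate law of the form dC/dt = -k·C·R, where C is the Cr(VI) concentration and R is the remaining reduction capacity of the soil, with a 1:1 consumption of both in consistent units. The function name, parameter values, and forward-Euler scheme are illustrative assumptions and are not taken from the paper.

```python
def simulate_second_order(c0, r0, k, dt, steps):
    """Forward-Euler integration of dC/dt = dR/dt = -k * C * R.

    c0: initial Cr(VI) concentration (hypothetical units)
    r0: initial reduction capacity of the soil (same units, assumed 1:1)
    k:  second-order rate constant (hypothetical value)
    Returns a list of (time, C, R) tuples.
    """
    c, r = c0, r0
    history = [(0.0, c, r)]
    for i in range(1, steps + 1):
        # Amount reduced in this step, capped so neither pool goes negative.
        consumed = min(k * c * r * dt, c, r)
        c -= consumed
        r -= consumed
        history.append((i * dt, c, r))
    return history


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the shape of the decay.
    for t, c, r in simulate_second_order(1.0, 5.0, 0.1, 0.1, 100)[::25]:
        print(f"t={t:5.1f}  C={c:.4f}  R={r:.4f}")
```

    Because both pools are consumed together, the decay slows as either the Cr(VI) concentration or the soil's reduction capacity runs down, which is consistent with the slow, still-proceeding reduction the abstract reports.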

  16. Do Self Concept Tests Test Self Concept? An Evaluation of the Validity of Items on the Piers Harris and Coopersmith Measures.

    ERIC Educational Resources Information Center

    Lynch, Mervin D.; Chaves, John

    Items from Piers-Harris and Coopersmith self-concept tests were evaluated against independent measures on three self-constructs: idealized, empathic, and worth. Construct measurements were obtained with the semantic differential and D statistic. Ratings were obtained from 381 children, grades 4-6. For each test, item ratings and construct measures…

  17. Perceptions of team members working in cleft services in the United Kingdom: a pilot study.

    PubMed

    Scott, Julia K; Leary, Sam D; Ness, Andy R; Sandy, Jonathan R; Persson, Martin; Kilpatrick, Nicky; Waylen, Andrea E

    2015-01-01

    Cleft care provision in the United Kingdom has been centralized over the past 15 years to improve outcomes for children born with cleft lip and palate. However, to date, there have been no investigations to examine how well these multidisciplinary teams are performing. In this pilot study, a cross-sectional questionnaire surveyed members of all health care specialties working to provide cleft care in 11 services across the United Kingdom. Team members were asked to complete the Team Work Assessment (TWA) to investigate perceptions of team working in cleft services. The TWA comprises 55 items measuring seven constructs: team foundation, function, performance and skills, team climate and atmosphere, team leadership, and team identity; individual constructs were also aggregated to provide an overall TWA score. Items were measured using five-point Likert-type scales and were converted into percentage agreement for analysis. Responses were received from members of every cleft team. Ninety-nine of 138 cleft team questionnaires (71.7%) were returned and analyzed. The median (interquartile range) percentage of maximum possible score across teams was 75.5% (70.8, 88.2) for the sum of all items. Team performance and team identity were viewed most positively, with 82.0% (75.0, 88.2) and 88.4% (82.2, 91.4), respectively. Team foundation and leadership were viewed least positively with 79.0% (72.6, 84.6) and 76.6% (70.6, 85.4), respectively. Cleft team members perceive that their teams work well, but there are variations in response according to construct.

  18. Simple construct evaluation with latent class analysis: An investigation of Facebook addiction and the development of a short form of the Facebook Addiction Test (F-AT).

    PubMed

    Dantlgraber, Michael; Wetzel, Eunike; Schützenberger, Petra; Stieger, Stefan; Reips, Ulf-Dietrich

    2016-09-01

    In psychological research, there is a growing interest in using latent class analysis (LCA) for the investigation of quantitative constructs. The aim of this study is to illustrate how LCA can be applied to gain insights on a construct and to select items during test development. We show the added benefits of LCA beyond factor-analytic methods, namely being able (1) to describe groups of participants that differ in their response patterns, (2) to determine appropriate cutoff values, (3) to evaluate items, and (4) to evaluate the relative importance of correlated factors. As an example, we investigated the construct of Facebook addiction using the Facebook Addiction Test (F-AT), an adapted version of the Internet Addiction Test (I-AT). Applying LCA facilitates the development of new tests and short forms of established tests. We present a short form of the F-AT based on the LCA results and validate the LCA approach and the short F-AT with several external criteria, such as chatting, reading newsfeeds, and posting status updates. Finally, we discuss the benefits of LCA for evaluating quantitative constructs in psychological research.

  19. Rasch validation of the Arabic version of the lower extremity functional scale.

    PubMed

    Alnahdi, Ali H

    2018-02-01

    The purpose of this study was to examine the internal construct validity of the Arabic version of the Lower Extremity Functional Scale (20-item Arabic LEFS) using Rasch analysis. Patients (n = 170) with lower extremity musculoskeletal dysfunction were recruited. Rasch analysis of the 20-item Arabic LEFS was performed. Once the initial Rasch analysis indicated that the 20-item Arabic LEFS did not fit the Rasch model, follow-up analyses were conducted to improve the fit of the scale to the Rasch measurement model. These modifications included removing misfitting individuals, changing item scoring structure, removing misfitting items, and addressing bias caused by response dependency between items and differential item functioning (DIF). Initial analysis indicated deviation of the 20-item Arabic LEFS from the Rasch model. Disordered thresholds in eight items and response dependency between six items were detected, and the scale as a whole did not meet the requirement of unidimensionality. Refinements led to a 15-item Arabic LEFS that demonstrated excellent internal consistency (person separation index [PSI] = 0.92) and satisfied all the requirements of the Rasch model. Rasch analysis did not support the 20-item Arabic LEFS as a unidimensional measure of lower extremity function. The refined 15-item Arabic LEFS met all the requirements of the Rasch model and hence is a valid objective measure of lower extremity function. The Rasch-validated 15-item Arabic LEFS needs to be further tested in an independent sample to confirm its fit to the Rasch measurement model. Implications for Rehabilitation: The validity of the 20-item Arabic Lower Extremity Functional Scale to measure lower extremity function is not supported. The 15-item Arabic version of the LEFS is a valid measure of lower extremity function and can be used to quantify lower extremity function in patients with lower extremity musculoskeletal disorders.

  20. The Academic Resilience Scale (ARS-30): A New Multidimensional Construct Measure.

    PubMed

    Cassidy, Simon

    2016-01-01

    Resilience is a psychological construct observed in some individuals that accounts for success despite adversity. Resilience reflects the ability to bounce back, to beat the odds and is considered an asset in human characteristic terms. Academic resilience contextualizes the resilience construct and reflects an increased likelihood of educational success despite adversity. The paper provides an account of the development of a new multidimensional construct measure of academic resilience. The 30-item Academic Resilience Scale (ARS-30) explores process, as opposed to outcome, aspects of resilience, providing a measure of academic resilience based on students' specific adaptive cognitive-affective and behavioral responses to academic adversity. Findings from the study involving a sample of undergraduate students (N = 532) demonstrate that the ARS-30 has good internal reliability and construct validity. It is suggested that a measure such as the ARS-30, which is based on adaptive responses, aligns more closely with the conceptualisation of resilience and provides a valid construct measure of academic resilience relevant for research and practice in university student populations.

  1. Validity and Reliability of General Nutrition Knowledge Questionnaire for Adults in Uganda

    PubMed Central

    Bukenya, Richard; Ahmed, Abhiya; Andrade, Jeanette M.; Grigsby-Toussaint, Diana S.; Muyonga, John; Andrade, Juan E.

    2017-01-01

    This study sought to develop and validate a general nutrition knowledge questionnaire (GNKQ) for Ugandan adults. The initial draft consisted of 133 items on five constructs associated with nutrition knowledge: expert recommendations (16 items), food groups (70 items), selecting food (10 items), nutrition and disease relationship (23 items), and food fortification in Uganda (14 items). The questionnaire validity was evaluated in three studies. For the content validity (study 1), a panel of five content matter nutrition experts reviewed the GNKQ draft before and after face validity. For the face validity (study 2), head teachers and health workers (n = 27) completed the questionnaire before attending one of three focus groups to review the clarity of the items. For the construct and test-retest reliability (study 3), head teachers (n = 40) from private and public primary schools and nutrition (n = 52) and engineering (n = 49) students from Makerere University took the questionnaire twice (two weeks apart). Experts agreed (content validity index, CVI > 0.9; reliability, Gwet’s AC1 > 0.85) that all constructs were relevant to evaluate nutrition knowledge. After the focus groups, 29 items were identified as unclear, requiring major (n = 5) and minor (n = 24) reviews. The final questionnaire had acceptable internal consistency (Cronbach α > 0.95), test-retest reliability (r = 0.89), and differentiated (p < 0.001) nutrition knowledge scores between nutrition (67 ± 5) and engineering (39 ± 11) students. Only the construct on nutrition recommendations was unreliable (Cronbach α = 0.51, test-retest r = 0.55), which requires further optimization. The final questionnaire included topics on food groups (41 items), selecting food (2 items), nutrition and disease relationship (14 items), and food fortification in Uganda (22 items) and had good content, construct, and test-retest reliability to evaluate nutrition knowledge among Ugandan adults. PMID:28230779
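
    The internal-consistency statistic reported above, Cronbach's α, can be computed from a respondents-by-items score matrix as α = k/(k-1) · (1 - Σ item variances / variance of total scores). The sketch below is a minimal standard-library illustration; the function name and the toy data are hypothetical and are not drawn from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for both the items and the
    total score, which is the usual convention.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Toy data: four respondents answering three items (hypothetical values).
    toy = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [3, 3, 4]]
    print(round(cronbach_alpha(toy), 3))
```

    The function assumes complete data and a total score with non-zero variance; a real analysis would need to handle missing responses and reverse-scored items first.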

  2. More relevant, precise, and efficient items for assessment of physical function and disability: moving beyond the classic instruments

    PubMed Central

    Fries, J F; Bruce, B; Bjorner, J; Rose, M

    2006-01-01

    Objectives: Patient reported outcomes (PROs) have become standard study endpoints. However, little attention has been given to using item improvement to advance PRO performance which could improve precision, clarity, patient relevance, and information content of “physical function/disability” items and thus the performance of resulting instruments. Methods: The present study included 1860 physical function/disability items from 165 instruments. Item formulations were assessed by frequency of use, modified Delphi consensus, respondent judgement of clarity and importance, and item response theory (IRT). Data from 1100 rheumatoid arthritis, osteoarthritis, and normal ageing subjects, using qualitative item review, focus groups, cognitive interviews, and patient survey were used to achieve a unique item pool that was clear, reliable, sensitive to change, readily translatable, devoid of floor and ceiling limitations, contained unidimensional subdomains, and had maximal information content. Results: A “present tense” time frame was used most frequently, better understood, more readily translated, and more directly estimated the latent trait of disability. Items in the “past tense” had 80–90% false negatives (p<0.001). The best items were brief, clear, and contained a single construct. Responses with four to five options were preferred by both experts and respondents. The term physical function may be preferable to the term disability because of fewer floor effects. IRT analyses of “disability” suggest four independent subdomains (mobility, dexterity, axial, and compound) with factor loadings of 0.81–0.99. Conclusions: Major improvement in performance of items and instruments is possible, and may have the effect of substantially reducing sample size requirements for clinical trials. PMID:17038464

  3. Missing data methods for dealing with missing items in quality of life questionnaires. A comparison by simulation of personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques applied to the SF-36 in the French 2003 decennial health survey.

    PubMed

    Peyre, Hugo; Leplège, Alain; Coste, Joël

    2011-03-01

    Missing items are common in quality of life (QoL) questionnaires and present a challenge for research in this field. It remains unclear which of the various methods proposed to deal with missing data performs best in this context. We compared personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques using various realistic simulation scenarios of item missingness in QoL questionnaires constructed within the framework of classical test theory. Samples of 300 and 1,000 subjects were randomly drawn from the 2003 INSEE Decennial Health Survey (of 23,018 subjects representative of the French population and having completed the SF-36) and various patterns of missing data were generated according to three different item non-response rates (3, 6, and 9%) and three types of missing data (Little and Rubin's "missing completely at random," "missing at random," and "missing not at random"). The missing data methods were evaluated in terms of accuracy and precision for the analysis of one descriptive and one association parameter for three different scales of the SF-36. For all item non-response rates and types of missing data, multiple imputation and full information maximum likelihood appeared superior to the personal mean score and especially to hot deck in terms of accuracy and precision; however, the use of personal mean score was associated with insignificant bias (relative bias <2%) in all studied situations. Whereas multiple imputation and full information maximum likelihood are confirmed as reference methods, the personal mean score appears nonetheless appropriate for dealing with items missing from completed SF-36 questionnaires in most situations of routine use. These results can reasonably be extended to other questionnaires constructed according to classical test theory.
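
    Of the methods compared above, the personal mean score is the simplest to state: each missing item is replaced by the mean of the items the same respondent did answer on that scale. The sketch below illustrates the idea. The function name is hypothetical, and the requirement that at least half of the items be answered is an assumption (a commonly used rule of thumb), not a detail given in the abstract.

```python
def personal_mean_score(items, min_answered_fraction=0.5):
    """Impute missing items (None) with the respondent's own mean on the scale.

    items: one respondent's item scores for a single scale, None if missing.
    min_answered_fraction: assumed threshold below which no imputation is done.
    Returns the completed list, or None if too few items were answered.
    """
    answered = [x for x in items if x is not None]
    if not items or len(answered) / len(items) < min_answered_fraction:
        return None
    mean = sum(answered) / len(answered)
    return [mean if x is None else x for x in items]


if __name__ == "__main__":
    print(personal_mean_score([1, None, 3]))      # one missing item is filled in
    print(personal_mean_score([None, None, 3]))   # too sparse: left unscored
```

    Unlike multiple imputation or full information maximum likelihood, this approach uses no information from other respondents, which is why it is easy to apply in routine scoring.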

  4. Development, content validity, and piloting of an instrument designed to measure managers' attitude toward workplace breastfeeding support.

    PubMed

    Chow, Tan; Wolfe, Edward W; Olson, Beth H

    2012-07-01

    Manager attitude is influential in female employees' perceptions of workplace breastfeeding support. Currently, no instrument is available to assess manager attitude toward supporting women who wish to combine breastfeeding with work. We developed and piloted an instrument to measure manager attitudes toward workplace breastfeeding support entitled the "Managers' Attitude Toward Breastfeeding Support Questionnaire," an instrument that measures four constructs using 60 items that are rated agree/disagree on a 4-point Likert rating scale. We established the content validity of the Managers' Attitude Toward Breastfeeding Support Questionnaire measures through expert content review (n=22), expert assessment of item fit (n=11), and cognitive interviews (n=8). Data were collected from a purposive sample of 185 front-line managers who had experience supervising female employees, and responses were scaled using the Multidimensional Random Coefficients Multinomial Logit Model. Dimensionality analyses supported the proposed four-construct model. Reliability ranged from 0.75 to 0.86, and correlations between the constructs were moderately strong (0.47 to 0.71). Four items in two constructs exhibited model-to-data misfit and/or a low score-measure correlation. One item was revised and the other three items were retained in the Managers' Attitude Toward Breastfeeding Support Questionnaire. Findings of this study suggest that the Managers' Attitude Toward Breastfeeding Support Questionnaire measures are reliable and valid indicators of manager attitude toward workplace breastfeeding support, and future research should be conducted to establish external validity. The Managers' Attitude Toward Breastfeeding Support Questionnaire could be used to collect data in a standardized manner within and across companies to measure and compare manager attitudes toward supporting breastfeeding. 
    Organizations can subsequently develop targeted strategies to improve support for breastfeeding employees through efforts influencing managerial attitude. Copyright © 2012 Academy of Nutrition and Dietetics. Published by Elsevier Inc. All rights reserved.

  5. The construction of categorization judgments: using subjective confidence and response latency to test a distributed model.

    PubMed

    Koriat, Asher; Sorka, Hila

    2015-01-01

    The classification of objects to natural categories exhibits cross-person consensus and within-person consistency, but also some degree of between-person variability and within-person instability. What is more, the variability in categorization is also not entirely random but discloses systematic patterns. In this study, we applied the Self-Consistency Model (SCM, Koriat, 2012) to category membership decisions, examining the possibility that confidence judgments and decision latency track the stable and variable components of categorization responses. The model assumes that category membership decisions are constructed on the fly depending on a small set of clues that are sampled from a commonly shared population of pertinent clues. The decision and confidence are based on the balance of evidence in favor of a positive or a negative response. The results confirmed several predictions derived from SCM. For each participant, consensual responses to items were more confident than non-consensual responses, and for each item, participants who made the consensual response tended to be more confident than those who made the nonconsensual response. The difference in confidence between consensual and nonconsensual responses increased with the proportion of participants who made the majority response for the item. A similar pattern was observed for response speed. The pattern of results obtained for cross-person consensus was replicated by the results for response consistency when the responses were classified in terms of within-person agreement across repeated presentations. These results accord with the sampling assumption of SCM, that confidence and response speed should be higher when the decision is consistent with what follows from the entire population of clues than when it deviates from it. 
    Results also suggested that the context for classification can bias the sample of clues underlying the decision, and that confidence judgments mirror the effects of context on categorization decisions. The model and results offer a principled account of the stable and variable contributions to categorization behavior within a decision-making framework. Copyright © 2014 Elsevier B.V. All rights reserved.

  6. Examination of an eHealth literacy scale and a health literacy scale in a population with moderate to high cardiovascular risk: Rasch analyses.

    PubMed

    Richtering, Sarah S; Morris, Rebecca; Soh, Sze-Ee; Barker, Anna; Bampi, Fiona; Neubeck, Lis; Coorey, Genevieve; Mulley, John; Chalmers, John; Usherwood, Tim; Peiris, David; Chow, Clara K; Redfern, Julie

    2017-01-01

    Electronic health (eHealth) strategies are evolving making it important to have valid scales to assess eHealth and health literacy. Item response theory methods, such as the Rasch measurement model, are increasingly used for the psychometric evaluation of scales. This paper aims to examine the internal construct validity of an eHealth and health literacy scale using Rasch analysis in a population with moderate to high cardiovascular disease risk. The first 397 participants of the CONNECT study completed the electronic health Literacy Scale (eHEALS) and the Health Literacy Questionnaire (HLQ). Overall Rasch model fit as well as five key psychometric properties were analysed: unidimensionality, response thresholds, targeting, differential item functioning and internal consistency. The eHEALS had good overall model fit (χ2 = 54.8, p = 0.06), ordered response thresholds, reasonable targeting and good internal consistency (person separation index (PSI) 0.90). It did, however, appear to measure two constructs of eHealth literacy. The HLQ subscales (except subscale 5) did not fit the Rasch model (χ2: 18.18-60.60, p: 0.00-0.58) and had suboptimal targeting for most subscales. Subscales 6 to 9 displayed disordered thresholds indicating participants had difficulty distinguishing between response options. All subscales did, nonetheless, demonstrate moderate to good internal consistency (PSI: 0.62-0.82). Rasch analyses demonstrated that the eHEALS has good measures of internal construct validity although it appears to capture different aspects of eHealth literacy (e.g. using eHealth and understanding eHealth). Whilst further studies are required to confirm this finding, it may be necessary for these constructs of the eHEALS to be scored separately. The nine HLQ subscales were shown to measure a single construct of health literacy. 
    However, participants' scores may not represent their actual level of ability, as distinction between response categories was unclear for the last four subscales. Reducing the response categories of these subscales may improve the ability of the HLQ to distinguish between different levels of health literacy.

  7. Rasch analyses of the Activities-specific Balance Confidence Scale with individuals 50 years and older with lower limb amputations

    PubMed Central

    Sakakibara, Brodie M.; Miller, William C.; Backman, Catherine L.

    2012-01-01

    Objective: To explore shortened response formats for use with the Activities-specific Balance Confidence scale and then: 1) evaluate the unidimensionality of the scale; 2) evaluate the item difficulty; 3) evaluate the scale for redundancy and content gaps; and 4) evaluate the item standard error of measurement (SEM) and internal consistency reliability among aging individuals (≥50 years) with a lower-limb amputation living in the community. Design: Secondary analysis of cross-sectional survey and chart review data. Setting: Out-patient amputee clinics, Ontario, Canada. Participants: Four hundred forty-eight community living adults, at least 50 years old (mean = 68 years), who have used a prosthesis for at least 6 months for a major unilateral lower limb amputation. Three hundred twenty-five (72.5%) were men. Intervention: N/A. Main Outcome Measure(s): Activities-specific Balance Confidence Scale. Results: A 5-option response format outperformed 4- and 6-option formats. Factor analyses confirmed a unidimensional scale. The distance between response options is not the same for all items on the scale, evident by the Partial Credit Model (PCM) having a better fit to the data than the Rating Scale Model. Two items, however, did not fit the PCM within statistical reason. Revising the wording of the two items may resolve the misfit, and improve the construct validity and lower the SEM. Overall, the difficulty of the scale’s items is appropriate for use with aging individuals with lower-limb amputation, and is most reliable (Cronbach α = 0.94) for use with individuals with moderately low balance confidence levels. Conclusions: The ABC-scale with a simplified 5-option response format is a valid and reliable measure of balance confidence for use with individuals aging with a lower limb amputation. PMID:21704978
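
    The distinction drawn above between the Partial Credit Model and the Rating Scale Model comes down to whether each item has its own set of category thresholds. A minimal sketch of the PCM category probabilities follows, using the standard form in which the probability of category k is proportional to exp(Σ_{j≤k} (θ - δ_j)). The function name and the numeric thresholds are hypothetical; a 5-option item has four thresholds.

```python
import math


def pcm_category_probs(theta, thresholds):
    """Category probabilities for one item under the Partial Credit Model.

    theta: person location on the latent trait (logits).
    thresholds: item-specific step parameters delta_1..delta_m (logits).
    Returns probabilities for categories 0..m.
    """
    # Cumulative sums of (theta - delta_j); category 0 has an empty sum of 0.
    logits = [0.0]
    for delta in thresholds:
        logits.append(logits[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical 5-option item: four ordered thresholds.
    probs = pcm_category_probs(0.3, [-1.5, -0.5, 0.4, 1.6])
    print([round(p, 3) for p in probs])
```

    Under the Rating Scale Model the same function applies, but every item's thresholds would be built as an item location plus one shared set of step offsets, so the spacing between response options is identical across items.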

  8. Influences on the Consumption of Australian Ration Packs: Review of a Contextual Model and Application to Australian Defence Force Data

    DTIC Science & Technology

    2011-03-01

    to consuming the same product types contained in the ration pack (e.g. instant noodles). We can speculate that this may be a practice that supports... instant rice. Unpopular items were cheese with bacon, noodles and mulligatawny. Items needing improvement were meat bars cheese and instant milk. 35...complimentary food items (e.g. BBQ Beef might be consumed with rice or noodles) ... the CR5M were found to be of no

  9. Development of a bioassay to screen for chemicals mimicking the anti-aging effects of calorie restriction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chiba, Takuya, E-mail: takuya@nagasaki-u.ac.jp; Tsuchiya, Tomoshi; Komatsu, Toshimitsu

    2010-10-15

    Research highlights: We identified four sequence motifs lying upstream of putative pro-longevity genes. One of these motifs binds to HNF-4α. HNF-4α/PGC-1α could up-regulate the transcription of a reporter gene linked to this motif. The reporter system described here could be used to screen candidate anti-aging molecules. Abstract: Suppression of the growth hormone/insulin-like growth factor-I pathway in Ames dwarf (DF) mice, and caloric restriction (CR) in normal mice extend lifespan and delay the onset of age-related disorders. In combination, these interventions have an additive effect on lifespan in Ames DF mice. Therefore, common signaling pathways regulated by DF and CR could have additive effects on longevity. In this study, we tried to identify the signaling mechanism and develop a system to assess pro-longevity status in cells and mice. We previously identified genes up-regulated in the liver of DF and CR mice by DNA microarray analysis. Motif analysis of the upstream sequences of those genes revealed four major consensus sequence motifs, which have been named dwarfism and calorie restriction-responsive elements (DFCR-REs). One of the synthesized sequences bound to hepatocyte nuclear factor-4α (HNF-4α), an important transcription factor involved in liver metabolism. Furthermore, using this sequence information, we developed a highly sensitive bioassay to identify chemicals mimicking the anti-aging effects of CR. When the reporter construct, containing an element upstream of a secreted alkaline phosphatase (SEAP) gene, was co-transfected with HNF-4α and its regulator peroxisome proliferator-activated receptor (PPAR) γ coactivator-1α (PGC-1α), SEAP activity was increased compared with untransfected controls. Moreover, transient transgenic mice established using this construct showed increased SEAP activity in CR mice compared with ad libitum-fed mice. These data suggest that because of its rapidity, ease of use, and specificity, our bioassay will be more useful than the systems currently employed to screen for CR mimetics, which mimic the beneficial effects of CR. Our system will be particularly useful for high-throughput screening of natural and synthetic candidate molecules.

  10. Results of a community-based survey of construction safety climate for Hispanic workers.

    PubMed

    Marin, Luz S; Cifuentes, Manuel; Roelofs, Cora

    2015-01-01

    Hispanic construction workers experience high rates of occupational injury, likely influenced by individual, organizational, and social factors. We aimed to characterize the safety climate of Hispanic construction workers using worker, contractor, and supervisor perceptions of the workplace. We developed a 40-item interviewer-assisted survey with six safety climate dimensions and administered it in Spanish and English to construction workers, contractors, and supervisors. A safety climate model, comparing responses and assessing contributing factors, was created based on survey responses. While contractors and construction supervisors' (n = 128) scores were higher, all respondents shared a negative perception of safety climate. Construction workers had statistically significantly lower safety climate scores compared to supervisors and contractors (30.6 vs 46.5%, P<0.05). Safety climate scores were not associated with English language ability or years lived in the United States. We found that Hispanic construction workers in this study experienced a poor safety climate. The Hispanic construction safety climate model we propose can serve as a framework to guide organizational safety interventions and evaluate safety climate improvements.

  11. Results of a community-based survey of construction safety climate for Hispanic workers

    PubMed Central

    Marin, Luz S; Cifuentes, Manuel; Roelofs, Cora

    2015-01-01

    Background: Hispanic construction workers experience high rates of occupational injury, likely influenced by individual, organizational, and social factors. Objectives: To characterize the safety climate of Hispanic construction workers using worker, contractor, and supervisor perceptions of the workplace. Methods: We developed a 40-item interviewer-assisted survey with six safety climate dimensions and administered it in Spanish and English to construction workers, contractors, and supervisors. A safety climate model, comparing responses and assessing contributing factors was created based on survey responses. Results: While contractors and construction supervisors’ (n = 128) scores were higher, all respondents shared a negative perception of safety climate. Construction workers had statistically significantly lower safety climate scores compared to supervisors and contractors (30.6 vs 46.5%, P<0.05). Safety climate scores were not associated with English language ability or years lived in the United States. Conclusions: We found that Hispanic construction workers in this study experienced a poor safety climate. The Hispanic construction safety climate model we propose can serve as a framework to guide organizational safety interventions and evaluate safety climate improvements. PMID:26145454

  12. Analysis of tandem E-box motifs within human Complement receptor 2 (CR2/CD21) promoter reveals cell specific roles for RP58, E2A, USF and localized chromatin accessibility.

    PubMed

    Cruickshank, Mark N; Dods, James; Taylor, Rhonda L; Karimi, Mahdad; Fenwick, Emily J; Quail, Elizabeth A; Rea, Alexander J; Holers, V Michael; Abraham, Lawrence J; Ulgiati, Daniela

    2015-07-01

    Complement receptor 2 (CR2/CD21) plays an important role in the generation of normal B cell immune responses. As transcription appears to be the prime mechanism via which surface CR2/CD21 expression is controlled, understanding transcriptional regulation of this gene will have broader implications to B cell biology. Here we report opposing, cell-context specific control of CR2/CD21 promoter activity by tandem E-box elements, spaced 22 bp apart and within 70 bp of the transcription initiation site. We have identified E2A and USF transcription factors as binding to the distal and proximal E-box sites respectively in CR2-positive B-cells, at a site that is hypersensitive to restriction enzyme digestion compared to non-expressing K562 cells. However, additional unidentified proteins have also been found to bind these functionally important elements. By utilizing a proteomics approach we have identified a repressor protein, RP58, binding the distal E-box motif. Co-transfection experiments using RP58 overexpression constructs demonstrated a specific 10-fold repression of CR2/CD21 transcriptional activity mediated through the distal E-box repressor element. Taken together, our results indicate that repression of the CR2/CD21 promoter can occur through one of the E-box motifs via recruitment of RP58 and other factors to bring about a silenced chromatin context within CR2/CD21 non-expressing cells. Copyright © 2015 Elsevier Ltd. All rights reserved.

  13. Biological Cr(VI) removal using bio-filters and constructed wetlands.

    PubMed

    Michailides, Michail K; Sultana, Mar-Yam; Tekerlekopoulou, Athanasia G; Akratos, Christos S; Vayenas, Dimitrios V

    2013-01-01

    The bioreduction of hexavalent chromium from aqueous solution was carried out using suspended growth and packed-bed reactors under a draw-fill operating mode, and horizontal subsurface constructed wetlands. Reactors were inoculated with industrial sludge from the Hellenic Aerospace Industry using sugar as substrate. In the suspended growth reactors, the maximum Cr(VI) reduction rate (about 2 mg/L h) was achieved for an initial concentration of 12.85 mg/L, while in the attached growth reactors, a similar reduction rate was achieved even with high initial concentrations (109 mg/L), thus confirming the advantage of these systems. Two horizontal subsurface constructed wetlands (CWs) pilot-scale units were also built and operated. The units contained fine gravel. One unit was planted with common reeds and one was kept unplanted. The mean influent concentrations of Cr(VI) were 5.61 and 5.47 mg/L for the planted and unplanted units, respectively. The performance of the planted CW units was very effective as mean Cr(VI) removal efficiency was 85% and efficiency maximum reached 100%. On the contrary, the unplanted CW achieved very low Cr(VI) removal with a mean value of 26%. Both attached growth reactors and CWs proved efficient and viable means for Cr(VI) reduction.

  14. Use of non-parametric item response theory to develop a shortened version of the Positive and Negative Syndrome Scale (PANSS).

    PubMed

    Khan, Anzalee; Lewis, Charles; Lindenmayer, Jean-Pierre

    2011-11-16

    Nonparametric item response theory (IRT) was used to examine (a) the performance of the 30 Positive and Negative Syndrome Scale (PANSS) items and their options (levels of severity), (b) the effectiveness of various subscales to discriminate among differences in symptom severity, and (c) the development of an abbreviated PANSS (Mini-PANSS) based on IRT and a method to link scores to the original PANSS. Baseline PANSS scores from 7,187 patients with Schizophrenia or Schizoaffective disorder who were enrolled between 1995 and 2005 in psychopharmacology trials were obtained. Option characteristic curves (OCCs) and Item Characteristic Curves (ICCs) were constructed to examine the probability of rating each of seven options within each of 30 PANSS items as a function of subscale severity, and summed-score linking was applied to items selected for the Mini-PANSS. The majority of items forming the Positive and Negative subscales (i.e. 19 items) performed very well and discriminated better along symptom severity compared to the General Psychopathology subscale. Six of the seven Positive Symptom items, six of the seven Negative Symptom items, and seven out of the 16 General Psychopathology items were retained for inclusion in the Mini-PANSS. Summed-score linking and linear interpolation were able to produce a translation table for comparing total subscale scores of the Mini-PANSS to total subscale scores on the original PANSS. Results show scores on the subscales of the Mini-PANSS can be linked to scores on the original PANSS subscales, with very little bias. The study demonstrated the utility of non-parametric IRT in examining the item properties of the PANSS and in allowing selection of items for an abbreviated PANSS scale. The comparisons between the 30-item PANSS and the Mini-PANSS revealed that the shorter version is comparable to the 30-item PANSS, but when applying IRT, the Mini-PANSS is also a good indicator of illness severity.

  15. Use of NON-PARAMETRIC Item Response Theory to develop a shortened version of the Positive and Negative Syndrome Scale (PANSS)

    PubMed Central

    2011-01-01

    Background: Nonparametric item response theory (IRT) was used to examine (a) the performance of the 30 Positive and Negative Syndrome Scale (PANSS) items and their options (levels of severity), (b) the effectiveness of various subscales to discriminate among differences in symptom severity, and (c) the development of an abbreviated PANSS (Mini-PANSS) based on IRT and a method to link scores to the original PANSS. Methods: Baseline PANSS scores from 7,187 patients with Schizophrenia or Schizoaffective disorder who were enrolled between 1995 and 2005 in psychopharmacology trials were obtained. Option characteristic curves (OCCs) and Item Characteristic Curves (ICCs) were constructed to examine the probability of rating each of seven options within each of 30 PANSS items as a function of subscale severity, and summed-score linking was applied to items selected for the Mini-PANSS. Results: The majority of items forming the Positive and Negative subscales (i.e. 19 items) performed very well and discriminated better along symptom severity compared to the General Psychopathology subscale. Six of the seven Positive Symptom items, six of the seven Negative Symptom items, and seven out of the 16 General Psychopathology items were retained for inclusion in the Mini-PANSS. Summed-score linking and linear interpolation were able to produce a translation table for comparing total subscale scores of the Mini-PANSS to total subscale scores on the original PANSS. Results show scores on the subscales of the Mini-PANSS can be linked to scores on the original PANSS subscales, with very little bias. Conclusions: The study demonstrated the utility of non-parametric IRT in examining the item properties of the PANSS and in allowing selection of items for an abbreviated PANSS scale. The comparisons between the 30-item PANSS and the Mini-PANSS revealed that the shorter version is comparable to the 30-item PANSS, but when applying IRT, the Mini-PANSS is also a good indicator of illness severity. PMID:22087503

  16. Development of the Comprehensive General Parenting Questionnaire for caregivers of 5-13 year olds.

    PubMed

    Sleddens, Ester F C; O'Connor, Teresia M; Watson, Kathleen B; Hughes, Sheryl O; Power, Thomas G; Thijs, Carel; De Vries, Nanne K; Kremers, Stef P J

    2014-02-10

    Despite the large number of parenting questionnaires, considerable disagreement exists about how to best assess parenting. Most of the instruments only assess limited aspects of parenting. To overcome this shortcoming, the "Comprehensive General Parenting Questionnaire" (CGPQ) was systematically developed. Such a measure is frequently requested in the area of childhood overweight. First, an item bank of existing parenting measures was created assessing five key parenting constructs that have been identified across multiple theoretical approaches to parenting (Nurturance, Overprotection, Coercive control, Behavioral control, and Structure). Caregivers of 5- to 13-year-olds were asked to complete the online survey in the Netherlands (N = 821), Belgium (N = 435) and the United States (N = 241). In addition, a questionnaire regarding personality characteristics ("Big Five") of the caregiver was administered and parents were asked to report about their child's height and weight. Factor analyses and Item-Response Modeling (IRM) techniques were used to assess the underlying parenting constructs and for item reduction. Correlation analyses were performed to assess the relations between general parenting and personality of the caregivers, adjusting for socio-economic status (SES) indicators, to establish criterion validity. Multivariate linear regressions were performed to examine the associations of SES indicators and parenting with child BMI z-scores. Additionally, we assessed whether scores on the parenting constructs and child BMI z-scores differed depending on SES indicators. The reduced questionnaire (62 items) revealed acceptable fit of our parenting model and acceptable IRM item fit statistics. Caregiver personality was related as hypothesized with the CGPQ parenting constructs. While correcting for SES, overprotection was positively related to child BMI. The negative relationship between structure and BMI was borderline significant. Parents with a high level of education were less likely to use overly controlling forms of parenting (i.e., coercive control and overprotection) and more likely to have children with lower BMI. Based on several author review meetings and cognitive interviews the questionnaire was further modified to an 85-item questionnaire. The CGPQ may facilitate research exploring how parenting influences children's weight-related behaviors. The contextual influence of general parenting is likely to be more profound than its direct relationship with weight status.

  17. Construction of a web-based questionnaire for longitudinal investigation of work exposure, musculoskeletal pain and performance impairments in high-performance marine craft populations.

    PubMed

    Lo Martire, Riccardo; de Alwis, Manudul Pahansen; Äng, Björn Olov; Garme, Karl

    2017-07-20

    High-performance marine craft personnel (HPMCP) are regularly exposed to vibration and repeated shock (VRS) levels exceeding maximum limitations stated by international legislation. Whereas such exposure reportedly is detrimental to health and performance, the epidemiological data necessary to link these adverse effects causally to VRS are not available in the scientific literature, and no suitable tools for acquiring such data exist. This study therefore constructed a questionnaire for longitudinal investigations in HPMCP. A consensus panel defined content domains, identified relevant items and outlined a questionnaire. The relevance and simplicity of the questionnaire's content were then systematically assessed by expert raters in three consecutive stages, each followed by revisions. An item-level content validity index (I-CVI) was computed as the proportion of experts rating an item as relevant and simple, and a scale-level content validity index (S-CVI/Ave) as the average I-CVI across items. The thresholds for acceptable content validity were 0.78 and 0.90, respectively. Finally, a dynamic web version of the questionnaire was constructed and pilot tested over a 1-month period during a marine exercise in a study population sample of eight subjects, while accelerometers simultaneously quantified VRS exposure. Content domains were defined as work exposure, musculoskeletal pain and human performance, and items were selected to reflect these constructs. Ratings from nine experts yielded S-CVI/Ave of 0.97 and 1.00 for relevance and simplicity, respectively, and the pilot test suggested that responses were sensitive to change in acceleration and that the questionnaire, following some adjustments, was feasible for its intended purpose. A dynamic web-based questionnaire for longitudinal survey of key variables in HPMCP was constructed. 
Expert ratings supported that the questionnaire content is relevant, simple and sufficiently comprehensive, and the pilot test suggested that the questionnaire is feasible for longitudinal measurements in the study population. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
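
    The two content validity indices defined in the abstract above reduce to simple proportions, so a short sketch may help. The following is a minimal illustration, not the authors' code: the item names and ratings are hypothetical, and it assumes each expert's judgment has already been dichotomized into acceptable (True) or not (False). Only the definitions and the thresholds (0.78 for I-CVI, 0.90 for S-CVI/Ave) come from the abstract.

```python
from statistics import mean

# Thresholds quoted in the abstract.
I_CVI_THRESHOLD = 0.78
S_CVI_AVE_THRESHOLD = 0.90


def item_cvi(ratings):
    """I-CVI: proportion of experts who rated the item as acceptable."""
    return sum(ratings) / len(ratings)


def scale_cvi_ave(ratings_by_item):
    """S-CVI/Ave: average of the I-CVI values across all items."""
    return mean(item_cvi(r) for r in ratings_by_item.values())


# Hypothetical dichotomized ratings from nine experts (illustrative only).
ratings = {
    "item_a": [True] * 9,
    "item_b": [True] * 8 + [False],
    "item_c": [True] * 7 + [False] * 2,
}

i_cvi = {name: item_cvi(r) for name, r in ratings.items()}
s_cvi = scale_cvi_ave(ratings)

# Items falling below the item-level threshold would be candidates for revision.
flagged = [name for name, value in i_cvi.items() if value < I_CVI_THRESHOLD]
scale_ok = s_cvi >= S_CVI_AVE_THRESHOLD
```

    In this made-up data set, an item endorsed by seven of nine experts falls just under the 0.78 threshold, which shows how sensitive the item-level index is to a single rater when the panel is small.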

  18. Development and psychometric evaluation of a health-related quality of life instrument for individuals with adult-onset hearing loss.

    PubMed

    Stika, Carren J; Hays, Ron D

    2015-07-01

    Self-reports of 'hearing handicap' are available, but a comprehensive measure of health-related quality of life (HRQOL) for individuals with adult-onset hearing loss (AOHL) does not exist. Our objective was to develop and evaluate a multidimensional HRQOL instrument for individuals with AOHL. The Impact of Hearing Loss Inventory Tool (IHEAR-IT) was developed using results of focus groups, a literature review, advisory expert panel input, and cognitive interviews. The 73-item field-test instrument was completed by 409 adults (22-91 years old) with varying degrees of AOHL and from different areas of the USA. Multitrait scaling analysis supported four multi-item scales and five individual items. Internal consistency reliabilities ranged from 0.93 to 0.96 for the scales. Construct validity was supported by correlations between the IHEAR-IT scales and scores on the 36-item Short Form Health Survey, version 2.0 (SF-36v2) mental composite summary (r = 0.32-0.64) and the Hearing Handicap Inventory for the Elderly/Adults (HHIE/HHIA) (r ≥ -0.70). The field test provides initial support for the reliability and construct validity of the IHEAR-IT for evaluating HRQOL of individuals with AOHL. Further research is needed to evaluate the responsiveness to change of the IHEAR-IT scales and identify items for a short-form.

  19. Preliminary Study of the Autism Self-Efficacy Scale for Teachers (ASSET).

    PubMed

    Ruble, Lisa A; Toland, Michael D; Birdwhistell, Jessica L; McGrew, John H; Usher, Ellen L

    2013-09-01

    The purpose of the current study was to evaluate a new measure, the Autism Self-Efficacy Scale for Teachers (ASSET), for its dimensionality, internal consistency, and construct validity derived in a sample of special education teachers (N = 44) of students with autism. Results indicate that all items reflect one dominant factor, teachers' responses to items were internally consistent within the sample, and compared to a 100-point scale, a 6-point response scale is adequate. ASSET scores were found to be negatively correlated with scores on two subscale measures of teacher stress (i.e., self-doubt/need for support and disruption of the teaching process) but uncorrelated with teacher burnout scores. The ASSET is a promising tool that requires replication with larger samples.

  20. [Design and validation of the CSR-Hospital-SP scale to measure corporate social responsibility].

    PubMed

    Mira, José Joaquín; Lorenzo, Susana; Navarro, Isabel; Pérez-Jover, Virtudes; Vitaller, Julián

    2013-01-01

    To design and validate a scale (CSR-Hospital-SP) to determine health professionals' views on the approach of management to corporate social responsibility (CSR) in their hospital. The literature was reviewed to identify the main CSR scales and select the dimensions to be evaluated. The initial version of the scale consisted of 25 items. A convenience sample of a minimum of 224 health professionals working in five public hospitals in five autonomous regions were invited to respond. Floor and ceiling effects, internal consistency, reliability, and construct validity were analyzed. A total of 233 health professionals responded. The CSR-Hospital-SP scale had 20 items grouped into four factors. The item-total correlation was higher than 0.30; all factor loadings were greater than 0.50; 59.57% of the variance was explained; Cronbach's alpha was 0.90; Spearman-Brown's coefficient was 0.82. The CSR-Hospital-SP scale is a tool designed for hospitals that implement accountability mechanisms and promote socially responsible management approaches. Copyright © 2012 SESPAS. Published by Elsevier Espana. All rights reserved.
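
    The abstract above reports Cronbach's alpha without showing how it is obtained. As a hedged illustration only, the sketch below applies the standard textbook formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The response data are hypothetical and are not taken from the CSR-Hospital-SP study, and the use of sample variances is an assumption.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances throughout (an assumption; population variances
    give the same ratio as long as they are used consistently).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point responses: five respondents, three items.
hypothetical_rows = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 4, 5],
    [2, 2, 3],
    [1, 2, 1],
]
alpha = cronbach_alpha(hypothetical_rows)

# Items that move in perfect lockstep give the maximum value of 1.
lockstep_alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

    Alpha rises when items covary strongly relative to their individual variances, which is why it is read as an index of internal consistency rather than of unidimensionality.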

  1. Worldwide Emerging Environmental Issues Affecting the U.S. Military

    DTIC Science & Technology

    2010-08-01

    Brazil protecting non-Amazonian tropical forests . A summary of the consultations across the region are presented in the UNDP LAC Regional...North-American Environmental Integration……..13 6.6 World’s Humid Tropical Forests to Suffer Considerable Biodiversity Change by 2100….…13 6.7 Latin...35456&Cr=sanitation&Cr1 Item 2. Food Security Concerns Increase Around the World The Food Security Risk Index 2010 reveals that the countries most

  2. Method for automatic measurement of second language speaking proficiency

    NASA Astrophysics Data System (ADS)

    Bernstein, Jared; Balogh, Jennifer

    2005-04-01

    Spoken language proficiency is intuitively related to effective and efficient communication in spoken interactions. However, it is difficult to derive a reliable estimate of spoken language proficiency by situated elicitation and evaluation of a person's communicative behavior. This paper describes the task structure and scoring logic of a group of fully automatic spoken language proficiency tests (for English, Spanish and Dutch) that are delivered via telephone or Internet. Test items are presented in spoken form and require a spoken response. Each test is automatically-scored and primarily based on short, decontextualized tasks that elicit integrated listening and speaking performances. The tests present several types of tasks to candidates, including sentence repetition, question answering, sentence construction, and story retelling. The spoken responses are scored according to the lexical content of the response and a set of acoustic base measures on segments, words and phrases, which are scaled with IRT methods or parametrically combined to optimize fit to human listener judgments. Most responses are isolated spoken phrases and sentences that are scored according to their linguistic content, their latency, and their fluency and pronunciation. The item development procedures and item norming are described.

  3. Transformational, transactional and laissez-faire leadership among physician executives.

    PubMed

    Xirasagar, Sudha

    2008-01-01

    The purpose of this paper is to examine the empirical validity of transformational, transactional and laissez-faire leadership and their sub-scales among physician managers. A nation-wide, anonymous mail survey was carried out in the United States, requesting community health center executive directors to provide ratings of their medical director's leadership behaviors (34 items) and effectiveness (nine items), using the Multifactor Leadership Questionnaire 5X-Short, on a five-point Likert scale. The survey response rate was 40.9 percent, for a total of 269 responses. Exploratory factor analysis was done, using principal factor extraction followed by promax rotation. The data yielded a three-factor structure, generally aligned with Bass and Avolio's constructs of transformational, transactional and laissez-faire leadership. Data do not support the factorial independence of their subscales (idealized influence, inspirational motivation, individualized consideration, and intellectual stimulation under transformational leadership; contingent reward, management-by-exception active, and management-by-exception passive under transactional leadership). Two contingent reward items loaded on transformational leadership, and all items of management-by-exception passive loaded on laissez-faire. A key limitation is that supervisors were surveyed for ratings of the medical directors' leadership style. Although past research in other fields has shown that supervisor ratings are strongly correlated with subordinate ratings, further research is needed to validate the findings by surveying physician and other clinical subordinates. Such research will also help to develop appropriate content of leadership training for clinical leaders. This study represents an important step towards establishing the empirical evidence for the full range of leadership constructs among physician leaders.

  4. Construct validity of the Swedish version of the revised piper fatigue scale in an oncology sample--a Rasch analysis.

    PubMed

    Lundgren-Nilsson, Asa; Dencker, Anna; Jakobsson, Sofie; Taft, Charles; Tennant, Alan

    2014-06-01

    Fatigue is a common and distressing symptom in cancer patients due to both the disease and its treatments. The concept of fatigue is multidimensional and includes both physical and mental components. The 22-item Revised Piper Fatigue Scale (RPFS) is a multidimensional instrument developed to assess cancer-related fatigue. This study reports on the construct validity of the Swedish version of the RPFS from the perspective of Rasch measurement. The Swedish version of the RPFS was answered by 196 cancer patients fatigued after 4 to 5 weeks of curative radiation therapy. Data from the scale were fitted to the Rasch measurement model. This involved testing a series of assumptions, including the stochastic ordering of items, local response dependency, and unidimensionality. A series of fit statistics were computed, differential item functioning (DIF) was tested, and local response dependency was accommodated through testlets. The Behavioral, Affective and Sensory domains all satisfied the Rasch model expectations. No DIF was observed, and all domains were found to be unidimensional. The Mood/Cognitive scale failed to fit the model, and substantial multidimensionality was found. Splitting the scale between Mood and Cognitive items resolved fit to the Rasch model, and new domains were unidimensional without DIF. The current Rasch analyses add to the evidence of measurement properties of the scale and show that the RPFS has good psychometric properties and works well to measure fatigue. The original four-factor structure, however, was not supported. Copyright © 2014 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  5. Construction and Validation of a Women's Autonomy Measurement Scale with Reference to Utilization of Maternal Health Care Services in Nepal.

    PubMed

    Bhandari, T R; Dangal, G; Sarma, P S; Kutty, V R

    2014-01-01

    Women's autonomy is one of the predictors of maternal health care service utilization. This study aimed to construct and validate a scale for measuring women's autonomy with relevance to developing countries. We conducted a study for construction and validation of a scale in Rupandehi and further validated it in Kapilvastu district of Nepal. Initially, we administered a 24-item preliminary scale and finalized a 23-item scale using psychometric tests. After defining the construct of women's autonomy, we pooled 194 items and selected 24 items to develop a preliminary scale. The scale development process followed different steps, i.e. definition of construct, generation of item pool, pretesting, analysis of psychometric tests and further validation. The new scale was strongly supported by Cronbach's alpha value (0.84), test-retest Pearson correlation (0.87), average content validity ratio (0.8) and overall agreement Kappa value of the items (0.83), all of which were found satisfactory. From factor analysis, we selected 23 items for the final scale, which showed good convergent and discriminant validity. From the preliminary draft, we removed one item; the remaining 23 items were loaded in five factors. All five factors had single loading items when absolute coefficient values less than 0.45 were suppressed, and the average coefficient of each factor was more than 0.60. Similarly, the factors and loaded items had good convergent and discriminant validity, which further showed the strong measurement capacity of the scale. The new scale is a reliable tool for assessing women's autonomy in developing countries. We recommend further use and validation of the scale to ensure its measurement capacity.

  6. Piers Harris and Coopersmith Measure of Self-Esteem: A Comparative Analysis

    ERIC Educational Resources Information Center

    Lynch, Mervin D.; Foley-Peres, Kathleen D.; Sullivan, Stefanie S.

    2008-01-01

    The purposes of this study were to see if the items from the Piers Harris Self Concept Scale and the Coopersmith Self Esteem Inventory had construct and predictive validity. Items used in this study were 50 items from the Coopersmith Self-Esteem Inventory and 80 items from the Piers Harris Self-Concept Scale. Construct measures were obtained using…

  7. Development of Elderly Quality of Life Index – Eqoli: Item Reduction and Distribution into Dimensions

    PubMed Central

    Paschoal, Sérgio Márcio Pacheco; Filho, Wilson Jacob; Litvoc, Júlio

    2008-01-01

    OBJECTIVE To describe item reduction and its distribution into dimensions in the construction process of a quality of life evaluation instrument for the elderly. METHODS The sampling method was chosen by convenience through quotas, with selection of elderly subjects from four programs to achieve heterogeneity in the “health status”, “functional capacity”, “gender”, and “age” variables. The Clinical Impact Method was used, consisting of the spontaneous and elicited selection by the respondents of relevant items to the construct Quality of Life in Old Age from a previously elaborated item pool. The respondents rated each item’s importance using a 5-point Likert scale. The product of the proportion of elderly selecting the item as relevant (frequency) and the mean importance score they attributed to it (importance) represented the overall impact of that item in their quality of life (impact). The items were ordered according to their impact scores and the top 46 scoring items were grouped in dimensions by three experts. A review of the negative items was performed. RESULTS One hundred and ninety three people (122 women and 71 men) were interviewed. Experts distributed the 46 items into eight dimensions. Closely related items were grouped and dimensions not reaching the minimum expected number of items received additional items resulting in eight dimensions and 43 items. DISCUSSION The sample was heterogeneous and similar to what was expected. The dimensions and items demonstrated the multidimensionality of the construct. The Clinical Impact Method was appropriate to construct the instrument, which was named Elderly Quality of Life Index - EQoLI. An accuracy process will be examined in the future. PMID:18438571
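
    The impact score described in the methods above is the product of two quantities: the proportion of respondents who selected an item as relevant (frequency) and the mean importance rating those respondents gave it (importance). The sketch below illustrates that calculation and the ranking step under stated assumptions. The item names, ratings, and sample size are hypothetical, and the function names are illustrative rather than taken from the study.

```python
def impact_scores(ratings_by_item, n_respondents):
    """Impact = frequency x importance for each item.

    ratings_by_item maps an item to the 1-5 importance ratings given by the
    respondents who selected it as relevant; respondents who did not select
    the item contribute no rating.
    """
    scores = {}
    for item, ratings in ratings_by_item.items():
        if not ratings:
            scores[item] = 0.0
            continue
        frequency = len(ratings) / n_respondents
        importance = sum(ratings) / len(ratings)
        scores[item] = frequency * importance
    return scores


def top_items(scores, k):
    """Return the k items with the highest impact, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical data for ten respondents (illustrative only).
hypothetical_ratings = {
    "health": [5, 5, 4, 4, 5, 5, 4, 5],  # selected by 8 of 10
    "family": [5, 5, 5, 5, 5],           # selected by 5 of 10
    "leisure": [3, 4],                   # selected by 2 of 10
}
scores = impact_scores(hypothetical_ratings, n_respondents=10)
ranked = top_items(scores, k=2)
```

    Because frequency and importance are multiplied, an item that few respondents select scores low even when those few rate it highly, so the ranking favors items that are both widely endorsed and judged important.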

  8. Item analysis of the Spanish version of the Boston Naming Test with a Spanish speaking adult population from Colombia.

    PubMed

    Kim, Stella H; Strutt, Adriana M; Olabarrieta-Landa, Laiene; Lequerica, Anthony H; Rivera, Diego; De Los Reyes Aragon, Carlos Jose; Utria, Oscar; Arango-Lasprilla, Juan Carlos

    2018-02-23

    The Boston Naming Test (BNT) is a widely used measure of confrontation naming ability that has been criticized for its questionable construct validity for non-English speakers. This study investigated item difficulty and construct validity of the Spanish version of the BNT to assess cultural and linguistic impact on performance. Subjects were 1298 healthy Spanish speaking adults from Colombia. They were administered the 60- and 15-item Spanish version of the BNT. A Rasch analysis was computed to assess dimensionality, item hierarchy, targeting, reliability, and item fit. Both versions of the BNT satisfied requirements for unidimensionality. Although internal consistency was excellent for the 60-item BNT, order of difficulty did not increase consistently with item number and there were a number of items that did not fit the Rasch model. For the 15-item BNT, a total of 5 items changed position on the item hierarchy with 7 poor fitting items. Internal consistency was acceptable. Construct validity of the BNT remains a concern when it is administered to non-English speaking populations. Similar to previous findings, the order of item presentation did not correspond with increasing item difficulty, and both versions were inadequate at assessing high naming ability.

  9. Development of a questionnaire for assessing the childbirth experience (QACE).

    PubMed

    Carquillat, Pierre; Vendittelli, Françoise; Perneger, Thomas; Guittier, Marie-Julia

    2017-08-30

    Due to its potential impact on women's psychological health, assessing perceptions of their childbirth experience is important. The aim of this study was to develop a multidimensional self-reporting questionnaire to evaluate the childbirth experience. Factors influencing the childbirth experience were identified from a literature review and the results of a previous qualitative study. A total of 25 items were combined from existing instruments or were created de novo. A draft version was pilot tested for face validity with 30 women and submitted for evaluation of its construct validity to 477 primiparous women at one-month post-partum. The recruitment took place in two obstetric clinics from Swiss and French university hospitals. To evaluate the content validity, we compared item responses to general childbirth experience assessments on a numeric, 0 to 10 rating scale. We dichotomized two group assessment scores: "0 to 7" and "8 to 10". We performed an exploratory factor analysis to identify underlying dimensions. In total, 291 women completed the questionnaire (response rate = 61%). The responses to 22 items were statistically significant between the 0 to 7 and 8 to 10 groups for the general childbirth experience assessments. An exploratory factor analysis yielded four sub-scales, which were labelled "relationship with staff" (4 items), "emotional status" (3 items), "first moments with the new born," (3 items) and "feelings at one month postpartum" (3 items). All 4 scales had satisfactory internal consistency levels (alpha coefficients from 0.70 to 0.85). The full 25-item version can be used to analyse each item by itself, and the short 4-dimension version can be scored to summarize the general assessment of the childbirth experience. The Questionnaire for Assessing the Childbirth Experience (QACE) could be useful as a screening instrument to identify women with negative childbirth experiences. It can be used as both a research instrument in its short version and a questionnaire for use in clinical practice in its full version.

  10. Use of a safety climate questionnaire in UK health care: factor structure, reliability and usability.

    PubMed

    Hutchinson, A; Cooper, K L; Dean, J E; McIntosh, A; Patterson, M; Stride, C B; Laurence, B E; Smith, C M

    2006-10-01

    To explore the factor structure, reliability, and potential usefulness of a patient safety climate questionnaire in UK health care. Four acute hospital trusts and nine primary care trusts in England. The questionnaire used was the 27 item Teamwork and Safety Climate Survey. Thirty three healthcare staff commented on the wording and relevance. The questionnaire was then sent to 3650 staff within the 13 NHS trusts, seeking to achieve at least 600 responses as the basis for the factor analysis. 1307 questionnaires were returned (36% response). Factor analyses and reliability analyses were carried out on 897 responses from staff involved in direct patient care, to explore how consistently the questions measured the underlying constructs of safety climate and teamwork. Some questionnaire items related to multiple factors or did not relate strongly to any factor. Five items were discarded. Two teamwork factors were derived from the remaining 11 teamwork items and three safety climate factors were derived from the remaining 11 safety items. Internal consistency reliabilities were satisfactory to good (Cronbach's alpha ≥ 0.69 for all five factors). This is one of the few studies to undertake a detailed evaluation of a patient safety climate questionnaire in UK health care and possibly the first to do so in primary as well as secondary care. The results indicate that a 22 item version of this safety climate questionnaire is useable as a research instrument in both settings, but also demonstrates a more general need for thorough validation of safety climate questionnaires before widespread usage.
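
    The internal consistency figures quoted above are Cronbach's alpha values. As a minimal sketch of how the coefficient is computed from an item-score matrix, assuming complete data and sample (n-1) variances, the function below is illustrative only; the name `cronbach_alpha` and the toy scores are hypothetical and not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for `scores`, a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: 4 respondents answering 3 items.
toy_scores = [
    [3, 4, 3],
    [4, 5, 5],
    [2, 2, 3],
    [5, 4, 4],
]
alpha = cronbach_alpha(toy_scores)
```

    Real questionnaire data usually contain missing responses, which this sketch does not handle.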

  11. The development of the 'Quality-of-life for Respiratory Illness Questionnaire (QOL-RIQ)': a disease-specific quality-of-life questionnaire for patients with mild to moderate chronic non-specific lung disease.

    PubMed

    Maillé, A R; Koning, C J; Zwinderman, A H; Willems, L N; Dijkman, J H; Kaptein, A A

    1997-05-01

    Chronic non-specific lung disease (CNSLD) encompasses asthma as well as chronic obstructive pulmonary disease (COPD). Recently in health care, there has been increasing awareness in the functional, psychological and social aspects of the health of patients; their quality of life (QOL). Quality-of-life research addressing CNSLD patients has been rather underdeveloped for a long period of time. Recently, however, the importance of QOL is being increasingly recognized, and several research groups have started to study QOL in CNSLD patients in more detail. This paper describes the construction of a disease-specific QOL instrument for patients with mild to moderately severe CNSLD. Items relating to several domains of QOL were listed, and 171 CNSLD patients in general practice were asked how much of a problem each item had been (assessed on a seven-point Likert scale). After applying an item-selection procedure, a uni-dimensional QOL questionnaire was constructed consisting of 55 items divided into seven domain subscales: breathing problems, physical problems, emotions, situations triggering or enhancing breathing problems, general activities, daily and domestic activities, and social activities, relationships and sexuality. Reliability estimates of the domain subscales of the constructed questionnaire varied from 0.68 to 0.89, and was 0.92 for the QOL for Respiratory Illness Questionnaire (QOL-RIQ) total scale. A first impression of the construct validity of the questionnaire was gained by investigation of the relationship between the QOL domain subscales and several indicators of illness severity, as well as the relative contribution of illness severity variables, background characteristics and symptoms to QOL, using regression analysis. Further research to validate the questionnaire to a greater extent (construct validity, test-retest reliability and responsiveness to change) is currently taking place.

  12. Validity and Reliability of the US National Cancer Institute's Patient-Reported Outcomes Version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE).

    PubMed

    Dueck, Amylou C; Mendoza, Tito R; Mitchell, Sandra A; Reeve, Bryce B; Castro, Kathleen M; Rogak, Lauren J; Atkinson, Thomas M; Bennett, Antonia V; Denicoff, Andrea M; O'Mara, Ann M; Li, Yuelin; Clauser, Steven B; Bryant, Donna M; Bearden, James D; Gillis, Theresa A; Harness, Jay K; Siegel, Robert D; Paul, Diane B; Cleeland, Charles S; Schrag, Deborah; Sloan, Jeff A; Abernethy, Amy P; Bruner, Deborah W; Minasian, Lori M; Basch, Ethan

    2015-11-01

    To integrate the patient perspective into adverse event reporting, the National Cancer Institute developed a patient-reported outcomes version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE). To assess the construct validity, test-retest reliability, and responsiveness of PRO-CTCAE items. A total of 975 adults with cancer undergoing outpatient chemotherapy and/or radiation therapy enrolled in this questionnaire-based study between January 2011 and February 2012. Eligible participants could read English and had no clinically significant cognitive impairment. They completed PRO-CTCAE items on tablet computers in clinic waiting rooms at 9 US cancer centers and community oncology practices at 2 visits 1 to 6 weeks apart. A subset completed PRO-CTCAE items during an additional visit 1 business day after the first visit. Primary comparators were clinician-reported Eastern Cooperative Oncology Group Performance Status (ECOG PS) and the European Organisation for Research and Treatment of Cancer Core Quality of Life Questionnaire (QLQ-C30). A total of 940 of 975 (96.4%) and 852 of 940 (90.6%) participants completed PRO-CTCAE items at visits 1 and 2, respectively. At least 1 symptom was reported by 938 of 940 (99.8%) participants. Participants' median age was 59 years; 57.3% were female, 32.4% had a high school education or less, and 17.1% had an ECOG PS of 2 to 4. All PRO-CTCAE items had at least 1 correlation in the expected direction with a QLQ-C30 scale (111 of 124, P<.05 for all). Stronger correlations were seen between PRO-CTCAE items and conceptually related QLQ-C30 domains. Scores for 94 of 124 PRO-CTCAE items were higher in the ECOG PS 2 to 4 vs 0 to 1 group (58 of 124, P<.05 for all). Overall, 119 of 124 items met at least 1 construct validity criterion. Test-retest reliability was 0.7 or greater for 36 of 49 prespecified items (median [range] intraclass correlation coefficient, 0.76 [0.53-.96]). Correlations between PRO-CTCAE item changes and corresponding QLQ-C30 scale changes were statistically significant for 27 prespecified items (median [range] r=0.43 [0.10-.56]; all P≤.006). Evidence demonstrates favorable validity, reliability, and responsiveness of PRO-CTCAE in a large, heterogeneous US sample of patients undergoing cancer treatment. Studies evaluating other measurement properties of PRO-CTCAE are under way to inform further development of PRO-CTCAE and its inclusion in cancer trials.

  13. Improving the evaluation of therapeutic interventions in multiple sclerosis: the role of new psychometric methods.

    PubMed

    Hobart, J; Cano, S

    2009-02-01

    In this monograph we examine the added value of new psychometric methods (Rasch measurement and Item Response Theory) over traditional psychometric approaches by comparing and contrasting their psychometric evaluations of existing sets of rating scale data. We have concentrated on Rasch measurement rather than Item Response Theory because we believe that it is the more advantageous method for health measurement from a conceptual, theoretical and practical perspective. Our intention is to provide an authoritative document that describes the principles of Rasch measurement and the practice of Rasch analysis in a clear, detailed, non-technical form that is accurate and accessible to clinicians and researchers in health measurement. A comparison was undertaken of traditional and new psychometric methods in five large sets of rating scale data: (1) evaluation of the Rivermead Mobility Index (RMI) in data from 666 participants in the Cannabis in Multiple Sclerosis (CAMS) study; (2) evaluation of the Multiple Sclerosis Impact Scale (MSIS-29) in data from 1725 people with multiple sclerosis; (3) evaluation of test-retest reliability of MSIS-29 in data from 150 people with multiple sclerosis; (4) examination of the use of Rasch analysis to equate scales purporting to measure the same health construct in 585 people with multiple sclerosis; and (5) comparison of relative responsiveness of the Barthel Index and Functional Independence Measure in data from 1400 people undergoing neurorehabilitation. Both Rasch measurement and Item Response Theory are conceptually and theoretically superior to traditional psychometric methods. Findings from each of the five studies show that Rasch analysis is empirically superior to traditional psychometric methods for evaluating rating scales, developing rating scales, analysing rating scale data, understanding and measuring stability and change, and understanding the health constructs we seek to quantify. There is considerable added value in using Rasch analysis rather than traditional psychometric methods in health measurement. Future research directions include the need to reproduce our findings in a range of clinical populations, detailed head-to-head comparisons of Rasch analysis and Item Response Theory, and the application of Rasch analysis to clinical practice.

  14. The Instructional Effects of Matching or Mismatching Lesson and Posttest Screen Color

    ERIC Educational Resources Information Center

    Clariana, Roy B.

    2004-01-01

    This investigation considers the instructional effects of color as an over-arching context variable when learning from computer displays. The purpose of this investigation is to examine the posttest retrieval effects of color as a local, extra-item non-verbal lesson context variable for constructed-response versus multiple-choice posttest…

  15. An Alternative Methodology for Creating Parallel Test Forms Using the IRT Information Function.

    ERIC Educational Resources Information Center

    Ackerman, Terry A.

    The purpose of this paper is to report results on the development of a new computer-assisted methodology for creating parallel test forms using the item response theory (IRT) information function. Recently, several researchers have approached test construction from a mathematical programming perspective. However, these procedures require…

  16. Using Rasch Analysis to Identify Uncharacteristic Responses to Undergraduate Assessments

    ERIC Educational Resources Information Center

    Edwards, Antony; Alcock, Lara

    2010-01-01

    Rasch Analysis is a statistical technique that is commonly used to analyse both test data and Likert survey data, to construct and evaluate question item banks, and to evaluate change in longitudinal studies. In this article, we introduce the dichotomous Rasch model, briefly discussing its assumptions. Then, using data collected in an…
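
    For orientation, the dichotomous Rasch model mentioned above gives the probability of a correct response as a logistic function of the difference between person ability and item difficulty. The sketch below shows only that standard formula; the function names and the example values are hypothetical and do not come from the article.

```python
import math


def rasch_probability(theta, b):
    """P(correct) under the dichotomous Rasch model.

    theta: person ability in logits; b: item difficulty in logits.
    P = exp(theta - b) / (1 + exp(theta - b))
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_score(theta, difficulties):
    """Expected number-correct score for one person over a set of items."""
    return sum(rasch_probability(theta, b) for b in difficulties)


# Hypothetical example: an average person (theta = 0) on three items.
item_difficulties = [-1.0, 0.0, 1.0]
p_each = [rasch_probability(0.0, b) for b in item_difficulties]
score = expected_score(0.0, item_difficulties)
```

    A response pattern that departs strongly from these model probabilities (for example, failing easy items while passing hard ones) is the kind of response that a fit analysis would flag.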

  17. Self-Rating and Respondent Anonymity

    ERIC Educational Resources Information Center

    Goh, Jonathan W. P.; Lee, Ong Kim; Salleh, Hairon

    2010-01-01

    Background: Most empirical investigations in survey research have been conducted using self-reported or self-evaluated item responses. Such measures are common because they are relatively easy to obtain and are often the only feasible way to assess constructs of interest. In order to improve on the validity of self-reports it has become a common…

  18. 46 CFR 298.21 - Limits.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... customarily be capitalized as Vessel or Shipyard Project construction costs such as designing, engineering...) Cost items include those items usually specified in Vessel or Shipyard Project construction contracts... fees and interest on the Obligations or other borrowings incurred during the construction period...

  19. 46 CFR 298.21 - Limits.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... customarily be capitalized as Vessel or Shipyard Project construction costs such as designing, engineering...) Cost items include those items usually specified in Vessel or Shipyard Project construction contracts... fees and interest on the Obligations or other borrowings incurred during the construction period...

  20. 46 CFR 298.21 - Limits.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... customarily be capitalized as Vessel or Shipyard Project construction costs such as designing, engineering...) Cost items include those items usually specified in Vessel or Shipyard Project construction contracts... fees and interest on the Obligations or other borrowings incurred during the construction period...

  1. Development of the Assessment of Belief Conflict in Relationship-14 (ABCR-14).

    PubMed

    Kyougoku, Makoto; Teraoka, Mutsumi; Masuda, Noriko; Ooura, Mariko; Abe, Yasushi

    2015-01-01

    Nurses and other healthcare workers frequently experience belief conflict, one of the most important, new stress-related problems in both academic and clinical fields. In this study, using a sample of 1,683 nursing practitioners, we developed The Assessment of Belief Conflict in Relationship-14 (ABCR-14), a new scale that assesses belief conflict in the healthcare field. Standard psychometric procedures were used to develop and test the scale, including a qualitative framework concept and item-pool development, item reduction, and scale development. We analyzed the psychometric properties of ABCR-14 according to entropy, polyserial correlation coefficient, exploratory factor analysis, confirmatory factor analysis, average variance extracted, Cronbach's alpha, Pearson product-moment correlation coefficient, and multidimensional item response theory (MIRT). The results of the analysis supported a three-factor model consisting of 14 items. The validity and reliability of ABCR-14 was suggested by evidence from high construct validity, structural validity, hypothesis testing, internal consistency reliability, and concurrent validity. The result of the MIRT offered strong support for good item response of item slope parameters and difficulty parameters. However, the ABCR-14 Likert scale might need to be explored from the MIRT point of view. Yet, as mentioned above, there is sufficient evidence to support that ABCR-14 has high validity and reliability. The ABCR-14 demonstrates good psychometric properties for nursing belief conflict. Further studies are recommended to confirm its application in clinical practice.

  2. The Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES): item response theory findings.

    PubMed

    Grigg, Kaine; Manderson, Lenore

    2016-03-17

    Racism and associated discrimination are pervasive and persistent challenges with multiple cumulative deleterious effects contributing to inequities in various health outcomes. Globally, research over the past decade has shown consistent associations between racism and negative health concerns. Such research confirms that race endures as one of the strongest predictors of poor health. Due to the lack of validated Australian measures of racist attitudes, RACES (Racism, Acceptance, and Cultural-Ethnocentrism Scale) was developed. Here, we examine RACES' psychometric properties, including the latent structure, utilising Item Response Theory (IRT). Unidimensional and Multidimensional Rating Scale Model (RSM) Rasch analyses were utilised with 296 Victorian primary school students and 182 adolescents and 220 adults from the Australian community. RACES was demonstrated to be a robust 24-item three-dimensional scale of Accepting Attitudes (12 items), Racist Attitudes (8 items), and Ethnocentric Attitudes (4 items). RSM Rasch analyses provide strong support for the instrument as a robust measure of racist attitudes in the Australian context, and for the overall factorial and construct validity of RACES across primary school children, adolescents, and adults. RACES provides a reliable and valid measure that can be utilised across the lifespan to evaluate attitudes towards all racial, ethnic, cultural, and religious groups. A core function of RACES is to assess the effectiveness of interventions to reduce community levels of racism and in turn inequities in health outcomes within Australia.

  3. Is the Berg Balance Scale an effective tool for the measurement of early postural control impairments in patients with Parkinson's disease? Evidence from Rasch analysis.

    PubMed

    La Porta, F; Giordano, A; Caselli, S; Foti, C; Franchignoni, F

    2015-12-01

    It is unclear whether the BBS is an effective tool for the measurement of early postural control impairments in patients with Parkinson's disease (PD). The aim of this paper was to evaluate the BBS's content validity, internal construct validity, reliability and targeting in patients with PD within the Rasch analysis framework. Observational, cross-sectional study. Outpatient Rehabilitation Unit. A sample of 285 outpatients with PD. The content validity of the BBS was assessed using standard linking techniques. The BBS was administered by trained physiotherapists. The data collected then underwent Rasch analysis. Content validity analysis showed a lack of items assessing postural responses to tripping and slips and stability during walking. On Rasch analysis, the BBS failed the requirements of monotonicity, local independence, unidimensionality and invariance. After rescoring 7 items, grouping of locally dependent items into testlets, and deletion of the static sitting balance item because it was mistargeted and underdiscriminating, the Rasch-modified BBS for PD (BBS-PD) showed adequate internal construct validity (χ²(24) = 39.693; P = 0.023), including absence of differential item functioning (DIF) across gender and age, and was, as a whole, sufficiently precise for individual person measurement (PSI = 0.894). However, the scale was not well targeted to the sample in view of the prevalence of higher scores. This study demonstrated the internal construct validity and reliability of the BBS-PD as a measurement tool for patients with PD within the Rasch analysis framework. However, the lack of items critical to the assessment of postural control impairments typical of PD negatively affected the targeting, so that a significant percentage of patients was located in the higher ability range of the measurement continuum, where precision of measurement is reduced. These findings suggest that the BBS, even if modified, may not be an effective tool for the measurement of early postural control in patients with PD.

  4. Stroke Self-efficacy Questionnaire: a Rasch-refined measure of confidence post stroke.

    PubMed

    Riazi, Afsane; Aspden, Trefor; Jones, Fiona

    2014-05-01

    Measuring self-efficacy during rehabilitation provides an important insight into understanding recovery post stroke. A Rasch analysis of the Stroke Self-efficacy Questionnaire (SSEQ) was undertaken to establish its use as a clinically meaningful and scientifically rigorous measure. One hundred and eighteen stroke patients completed the SSEQ with the help of an interviewer. Participants were recruited from local acute stroke units and community stroke rehabilitation teams. Data were analysed with confirmatory factor analysis conducted using AMOS and Rasch analysis conducted using RUMM2030 software. Confirmatory factor analysis and Rasch analyses demonstrated the presence of two separate scales that measure stroke survivors' self-efficacy with: i) self-management and ii) functional activities. Guided by Rasch analyses, the response categories of these two scales were collapsed from an 11-point to a 4-point scale. Modified scales met the expectations of the Rasch model. Items satisfied the Rasch requirements (overall and individual item fit, local response independence, differential item functioning, unidimensionality). Furthermore, the two subscales showed evidence of good construct validity. The new SSEQ has good psychometric properties and is a clinically useful assessment of self-efficacy after stroke. The scale measures stroke survivors' self-efficacy with self-management and activities as two unidimensional constructs. It is recommended for use in clinical and research interventions, and in evaluating stroke self-management interventions.

  5. Efficient Algorithms for Segmentation of Item-Set Time Series

    NASA Astrophysics Data System (ADS)

    Chundi, Parvathi; Rosenkrantz, Daniel J.

    We propose a special type of time series, which we call an item-set time series, to facilitate the temporal analysis of software version histories, email logs, stock market data, etc. In an item-set time series, each observed data value is a set of discrete items. We formalize the concept of an item-set time series and present efficient algorithms for segmenting a given item-set time series. Segmentation of a time series partitions the time series into a sequence of segments where each segment is constructed by combining consecutive time points of the time series. Each segment is associated with an item set that is computed from the item sets of the time points in that segment, using a function which we call a measure function. We then define a concept called the segment difference, which measures the difference between the item set of a segment and the item sets of the time points in that segment. The segment difference values are required to construct an optimal segmentation of the time series. We describe novel and efficient algorithms to compute segment difference values for each of the measure functions described in the paper. We outline a dynamic programming based scheme to construct an optimal segmentation of the given item-set time series. We use the item-set time series segmentation techniques to analyze the temporal content of three different data sets—Enron email, stock market data, and a synthetic data set. The experimental results show that an optimal segmentation of item-set time series data captures much more temporal content than a segmentation constructed based on the number of time points in each segment, without examining the item set data at the time points, and can be used to analyze different types of temporal data.
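
    The dynamic programming scheme outlined above can be sketched as follows. The abstract does not specify the measure functions or the exact form of the segment difference, so this sketch assumes a union measure function and a segment difference equal to the summed size of the symmetric difference between the segment's item set and each time point's item set. All names (`union_measure`, `segment_difference`, `optimal_segmentation`) are hypothetical.

```python
def union_measure(points):
    """Assumed measure function: the segment's item set is the union of its time points."""
    merged = set()
    for point in points:
        merged |= point
    return frozenset(merged)


def segment_difference(points, measure=union_measure):
    """Assumed segment difference: total symmetric-difference size over the segment."""
    segment_items = measure(points)
    return sum(len(segment_items ^ point) for point in points)


def optimal_segmentation(series, n_segments, measure=union_measure):
    """Split `series` (a list of sets) into `n_segments` contiguous segments
    minimising the summed segment difference. Returns (cost, [(start, end), ...])
    with half-open index ranges."""
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    inf = float("inf")
    # Precompute the difference of every candidate segment series[i:j].
    cost = {(i, j): segment_difference(series[i:j], measure)
            for i in range(n) for j in range(i + 1, n + 1)}
    # best[k][j]: minimal cost of covering the first j points with k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = best[k - 1][i] + cost[(i, j)]
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    back[k][j] = i
    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    j = n
    for k in range(n_segments, 0, -1):
        i = back[k][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[n_segments][n], bounds


# Hypothetical toy series with an obvious change point after the second time point.
toy_series = [{"a", "b"}, {"a", "b"}, {"c"}, {"c"}]
total_cost, segments = optimal_segmentation(toy_series, n_segments=2)
```

    This naive version recomputes each segment difference from scratch; the paper's contribution is precisely more efficient algorithms for that step, which are not reproduced here.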

  6. Do the Guideline Violations Influence Test Difficulty of High-Stake Test?: An Investigation on University Entrance Examination in Turkey

    ERIC Educational Resources Information Center

    Atalmis, Erkan Hasan

    2016-01-01

    Multiple-choice (MC) items are commonly used in high-stake tests. Thus, each item of such tests should be meticulously constructed to increase the accuracy of decisions based on test results. Haladyna and his colleagues (2002) addressed the valid item-writing guidelines to construct high quality MC items in order to increase test reliability and…

  7. Construction cost forecast model : model documentation and technical notes.

    DOT National Transportation Integrated Search

    2013-05-01

    Construction cost indices are generally estimated with Laspeyres, Paasche, or Fisher indices that allow changes in the quantities of construction bid items, as well as changes in price, to change the cost indices of those items. These cost indices...
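
    As a reminder of how the three index formulas named above differ, the sketch below computes each from base-period and current-period prices and quantities using the standard textbook definitions. The bid-item prices and quantities are hypothetical and are not taken from the report.

```python
import math


def laspeyres(p0, q0, p1):
    """Laspeyres index: current prices weighted by base-period quantities."""
    return sum(a * b for a, b in zip(p1, q0)) / sum(a * b for a, b in zip(p0, q0))


def paasche(p0, p1, q1):
    """Paasche index: current prices weighted by current-period quantities."""
    return sum(a * b for a, b in zip(p1, q1)) / sum(a * b for a, b in zip(p0, q1))


def fisher(p0, q0, p1, q1):
    """Fisher index: geometric mean of the Laspeyres and Paasche indices."""
    return math.sqrt(laspeyres(p0, q0, p1) * paasche(p0, p1, q1))


# Hypothetical unit prices and quantities for two bid items in two periods.
base_prices, base_quantities = [10.0, 20.0], [5.0, 2.0]
current_prices, current_quantities = [12.0, 22.0], [4.0, 3.0]

l_index = laspeyres(base_prices, base_quantities, current_prices)
p_index = paasche(base_prices, current_prices, current_quantities)
f_index = fisher(base_prices, base_quantities, current_prices, current_quantities)
```

    Because the Fisher index is a geometric mean, it always lies between the Laspeyres and Paasche values.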

  8. Developing an Assessment Method of Active Aging: University of Jyvaskyla Active Aging Scale.

    PubMed

    Rantanen, Taina; Portegijs, Erja; Kokko, Katja; Rantakokko, Merja; Törmäkangas, Timo; Saajanaho, Milla

    2018-01-01

    To develop an assessment method of active aging for research on older people. A multiphase process that included drafting by an expert panel, a pilot study for item analysis and scale validity, a feedback study with focus groups and questionnaire respondents, and a test-retest study. Altogether 235 people aged 60 to 94 years provided responses and/or feedback. We developed a 17-item University of Jyvaskyla Active Aging Scale with four aspects in each item (goals, ability, opportunity, and activity; range 0-272). The psychometric and item properties are good and the scale assesses a unidimensional latent construct of active aging. Our scale assesses older people's striving for well-being through activities pertaining to their goals, abilities, and opportunities. The University of Jyvaskyla Active Aging Scale provides a quantifiable measure of active aging that may be used in postal questionnaires or interviews in research and practice.

  9. Development and initial validation of a brief self-report measure of cognitive dysfunction in fibromyalgia.

    PubMed

    Kratz, Anna L; Schilling, Stephen G; Goesling, Jenna; Williams, David A

    2015-06-01

    Pain is often the focus of research and clinical care in fibromyalgia (FM); however, cognitive dysfunction is also a common, distressing, and disabling symptom in FM. Current efforts to address this problem are limited by the lack of a comprehensive, valid measure of subjective cognitive dysfunction in FM that is easily interpretable, accessible, and brief. The purpose of this study was to leverage cognitive functioning item banks that were developed as part of the Patient Reported Outcomes Measurement Information System (PROMIS) to devise a 10-item short form measure of cognitive functioning for use in FM. In study 1, a nationwide (U.S.) sample of 1,035 adults with FM (age range = 18-82, 95.2% female) completed 2 cognitive item pools. Factor analyses and item response theory analyses were used to identify dimensionality and optimally performing items. A recommended 10-item measure, called the Multidimensional Inventory of Subjective Cognitive Impairment (MISCI) was created. In study 2, 232 adults with FM completed the MISCI and a legacy measure of cognitive functioning that is used in FM clinical trials, the Multiple Ability Self-Report Questionnaire (MASQ). The MISCI showed excellent internal reliability, low ceiling/floor effects, and good convergent validity with the MASQ (r = -.82). This paper presents the MISCI, a 10-item measure of cognitive dysfunction in FM, developed through classical test theory and item response theory. This brief but comprehensive measure shows evidence of excellent construct validity through large correlations with a lengthy legacy measure of cognitive functioning. Copyright © 2015 American Pain Society. Published by Elsevier Inc. All rights reserved.

  10. Cognitive ability of preschool, primary and secondary school children in Costa Rica.

    PubMed

    Rindermann, Heiner; Stiegmaier, Eva-Maria; Meisenberg, Gerhard

    2015-05-01

    Cognitive abilities of children in Costa Rica and Austria were compared using three age groups (N = 385/366). Cognitive ability tests (mental speed, culture reduced/fluid intelligence, literacy/crystallized intelligence) were applied that differed in the extent to which they refer to school-related knowledge. Preschool children (kindergarten, 5-6 years old, N(CR) = 80, N(Au) = 51) were assessed with the Coloured Progressive Matrices (CPM), primary school children (4th grade, 9-11 years old, N(CR) = 71, N(Au) = 71) with ZVT (a trail-making test), Standard Progressive Matrices (SPM) and items from PIRLS-Reading and TIMSS-Mathematics, and secondary school students (15-16 years old, N(CR) = 48, N(Au) = 48) with ZVT, Advanced Progressive Matrices (APM) and items from PISA-Reading and PISA-Mathematics. Additionally, parents and pupils were given questionnaires covering family characteristics and instruction. Average cognitive abilities were higher in Austria (Greenwich-IQ M(CR) = 87 and M(Au) = 99, d(IQ) = 12 points) and differences were smaller in preschool than in secondary school (d(IQ) = 7 vs 20 points). Differences in crystallized intelligence were larger than in fluid intelligence (mental speed: d(IQ) = 12, Raven: d(IQ) = 10, student achievement tests: d(IQ) = 17 IQ points). Differences were larger in comparisons at the level of g-factors. Austrian children were also taller (6.80 cm, d = 1.07 SD), but had lower body mass index (BMI(CR) = 19.35 vs BMI(Au) = 17.59, d = -0.89 SD). Different causal hypotheses explaining these differences are compared.

  11. Psychometric properties of the Italian version of the Cognitive Reserve Scale (I-CRS).

    PubMed

    Altieri, Manuela; Siciliano, Mattia; Pappacena, Simona; Roldán-Tapia, María Dolores; Trojano, Luigi; Santangelo, Gabriella

    2018-05-04

    The original definition of cognitive reserve (CR) refers to the individual differences in cognitive performance after a brain damage or pathology. Several proxies were proposed to evaluate CR (education, occupational attainment, premorbid IQ, leisure activities). Recently, some scales were developed to measure CR taking into account several cognitively stimulating activities. The aim of this study is to adapt the Cognitive Reserve Scale (I-CRS) for the Italian population and to explore its psychometric properties. I-CRS was administered to 547 healthy participants, ranging from 18 to 89 years old, along with neuropsychological and behavioral scales to evaluate cognitive functioning, depressive symptoms, and apathy. Cronbach's α, corrected item-total correlations, and the inter-item correlation matrix were calculated to evaluate the psychometric properties of the scale. Linear regression analysis was performed to build a correction grid of the I-CRS according to demographic variables. Correlational analyses were performed to explore the relationships between I-CRS and neuropsychological and behavioral scales. We found that age, sex, and education influenced the I-CRS score. Young adults and adults obtained higher I-CRS scores than elderly adults; women and participants with high educational attainment scored higher on I-CRS than men and participants with low education. I-CRS score correlated poorly with cognitive and depression scale scores, but moderately with apathy scale scores. I-CRS showed good psychometric properties and seemed to be a useful tool to assess CR in every adult life stage. Moreover, our findings suggest that apathy rather than depressive symptoms may interfere with the building of CR across the lifespan.

  12. Cultural Resources Survey of Three Iberville Parish Levee Enlargement and Revetment Construction Items

    DTIC Science & Technology

    1993-09-22

    Survey of Three Iberville Parish Levee Enlargement and Revetment Construction Items. September 1993. Final Report. R. Christopher Goodwin... Personal author(s): R. Christopher Goodwin, Ph.D., Rebecca E. Bruce, Lawrence L. Hewitt, and E... Subject terms: Acadian Coast; Historic Archeology; Rice; Antebellum; Iberville Parish; Saw Mill; Plantation; Carville Leprosarium; Ophelia

  13. The measurement of threat orientations.

    PubMed

    Thompson, Suzanne C; Schlehofer, Michèle M; Bovin, Michelle J

    2006-01-01

    To develop measures of 3 threat orientations that affect responses to health behavior messages. In Study 1, college students (N = 47) completed items assessing threat orientations and health behaviors. In Study 2, college students and community adults (N = 110) completed the threat orientation items and measures of convergent and discriminant validity. In Study 1, the control-based, denial-based, and heightened-sensitivity-based threat orientation scales demonstrated good internal consistency and correlated with engagement in health behaviors. In Study 2, the convergent and discriminant validity of the 3 measures was established. The 3 scales have good internal reliability and construct validity.

  14. Does laboratory cue reactivity correlate with real-world craving and smoking responses to cues?

    PubMed

    Shiffman, Saul; Li, Xiaoxue; Dunbar, Michael S; Tindle, Hilary A; Scholl, Sarah M; Ferguson, Stuart G

    2015-10-01

    Laboratory cue reactivity (CR) assessments are used to assess smokers' responses to cues. Likewise, EMA recording is used to characterize real-world response to cues. Understanding the relationship between CR and EMA responses addresses the ecological validity of CR. In 190 daily smokers not currently quitting, craving and smoking responses to cues were assessed in laboratory CR and by real-world EMA recording. Separate CR sessions involved 5 smoking-relevant cues (smoking, alcohol, negative affect, positive affect, smoking prohibitions), and a neutral cue. Subjects used EMA to monitor smoking situations for 3 weeks, completing parallel situational assessments (presence of others smoking, alcohol consumption, negative affect, positive affect, and smoking prohibitions, plus current craving) in smoking and non-smoking occasions (averaging 70 and 60 occasions each). Analyses correlated CR craving and smoking cue responses with EMA craving and smoking correlations with similar cues. Although some cues did not show main effects on average craving or smoking, a wide range of individual differences in response to cues was apparent in both CR and EMA data, providing the necessary context to assess their relationship. Laboratory CR measures of cue response were not correlated with real-world cue responses assessed by EMA. The average correlation was 0.03; none exceeded 0.32. One of 40 correlations examined was significantly greater than 0. Laboratory CR measures do not correlate with EMA-assessed craving or smoking in response to cues, suggesting that CR measures are not accurate predictors of how smokers react to relevant stimuli in the real world. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  15. The (mis)measurement of the Dark Triad Dirty Dozen: exploitation at the core of the scale

    PubMed Central

    Kajonius, Petri J.; Persson, Björn N.; Rosenberg, Patricia

    2016-01-01

    Background. The dark side of human character has been conceptualized in the Dark Triad Model: Machiavellianism, psychopathy, and narcissism. These three dark traits are often measured using single long instruments for each one of the traits. Nevertheless, there is a necessity of short and valid personality measures in psychological research. As an independent research group, we replicated the factor structure, convergent validity and item response for one of the most recent and widely used short measures to operationalize these malevolent traits, namely, Jonason’s Dark Triad Dirty Dozen. We aimed to expand the understanding of what the Dirty Dozen really captures because of the mixed results on construct validity in previous research. Method. We used the largest sample to date to respond to the Dirty Dozen (N = 3,698). We firstly investigated the factor structure using Confirmatory Factor Analysis and an exploratory distribution analysis of the items in the Dirty Dozen. Secondly, using a sub-sample (n = 500) and correlation analyses, we investigated the Dirty Dozen dark traits' convergent validity to Machiavellianism measured by the Mach-IV, psychopathy measured by Eysenck’s Personality Questionnaire Revised, narcissism using the Narcissism Personality Inventory, and both neuroticism and extraversion from the Eysenck’s questionnaire. Finally, besides these Classical Test Theory analyses, we analyzed the responses for each Dirty Dozen item using Item Response Theory (IRT). Results. The results confirmed previous findings of a bi-factor model fit: one latent core dark trait and three dark traits. All three Dirty Dozen traits had a striking bi-modal distribution, which might indicate unconcealed social undesirability with the items. The three Dirty Dozen traits did converge too, although not strongly, with the contiguous single Dark Triad scales (r between .41 and .49). The probabilities of filling out steps on the Dirty Dozen narcissism-items were much higher than on the Dirty Dozen items for Machiavellianism and psychopathy. Overall, the Dirty Dozen instrument delivered the most predictive value with persons with average and high Dark Triad traits (theta > −0.5). Moreover, the Dirty Dozen scale was better conceptualized as a combined Machiavellianism-psychopathy factor, not narcissism, and is well captured with item 4: ‘I tend to exploit others towards my own end.’ Conclusion. The Dirty Dozen showed a consistent factor structure and a relatively convergent validity similar to that found in earlier studies. Narcissism measured using the Dirty Dozen, however, did not contribute information to the core of the Dirty Dozen construct. More importantly, the results imply that the core of the Dirty Dozen scale, a manipulative and anti-social trait, can be measured by a Single Item Dirty Dark Dyad (SIDDD). PMID:26966673

  16. The (mis)measurement of the Dark Triad Dirty Dozen: exploitation at the core of the scale.

    PubMed

    Kajonius, Petri J; Persson, Björn N; Rosenberg, Patricia; Garcia, Danilo

    2016-01-01

    Background. The dark side of human character has been conceptualized in the Dark Triad Model: Machiavellianism, psychopathy, and narcissism. These three dark traits are often measured using single long instruments for each one of the traits. Nevertheless, there is a necessity of short and valid personality measures in psychological research. As an independent research group, we replicated the factor structure, convergent validity and item response for one of the most recent and widely used short measures to operationalize these malevolent traits, namely, Jonason's Dark Triad Dirty Dozen. We aimed to expand the understanding of what the Dirty Dozen really captures because of the mixed results on construct validity in previous research. Method. We used the largest sample to date to respond to the Dirty Dozen (N = 3,698). We firstly investigated the factor structure using Confirmatory Factor Analysis and an exploratory distribution analysis of the items in the Dirty Dozen. Secondly, using a sub-sample (n = 500) and correlation analyses, we investigated the Dirty Dozen dark traits' convergent validity to Machiavellianism measured by the Mach-IV, psychopathy measured by Eysenck's Personality Questionnaire Revised, narcissism using the Narcissism Personality Inventory, and both neuroticism and extraversion from the Eysenck's questionnaire. Finally, besides these Classical Test Theory analyses, we analyzed the responses for each Dirty Dozen item using Item Response Theory (IRT). Results. The results confirmed previous findings of a bi-factor model fit: one latent core dark trait and three dark traits. All three Dirty Dozen traits had a striking bi-modal distribution, which might indicate unconcealed social undesirability with the items. The three Dirty Dozen traits did converge too, although not strongly, with the contiguous single Dark Triad scales (r between .41 and .49). The probabilities of filling out steps on the Dirty Dozen narcissism-items were much higher than on the Dirty Dozen items for Machiavellianism and psychopathy. Overall, the Dirty Dozen instrument delivered the most predictive value with persons with average and high Dark Triad traits (theta > -0.5). Moreover, the Dirty Dozen scale was better conceptualized as a combined Machiavellianism-psychopathy factor, not narcissism, and is well captured with item 4: 'I tend to exploit others towards my own end.' Conclusion. The Dirty Dozen showed a consistent factor structure and a relatively convergent validity similar to that found in earlier studies. Narcissism measured using the Dirty Dozen, however, did not contribute information to the core of the Dirty Dozen construct. More importantly, the results imply that the core of the Dirty Dozen scale, a manipulative and anti-social trait, can be measured by a Single Item Dirty Dark Dyad (SIDDD).

  17. Rasch-built Overall Disability Scale (R-ODS) for immune-mediated peripheral neuropathies.

    PubMed

    van Nes, S I; Vanhoutte, E K; van Doorn, P A; Hermans, M; Bakkers, M; Kuitwaard, K; Faber, C G; Merkies, I S J

    2011-01-25

    To develop a patient-based, linearly weighted scale that captures activity and social participation limitations in patients with Guillain-Barré syndrome (GBS), chronic inflammatory demyelinating polyradiculoneuropathy (CIDP), and gammopathy-related polyneuropathy (MGUSP). A preliminary Rasch-built Overall Disability Scale (R-ODS) containing 146 activity and participation items was constructed, based on the WHO International Classification of Functioning, Disability and Health, literature search, and patient interviews. The preliminary R-ODS was assessed twice (interval: 2-4 weeks; test-retest reliability studies) in 294 patients who experienced GBS in the past (n = 174) or currently have stable CIDP (n = 80) or MGUSP (n = 40). Data were analyzed using the Rasch unidimensional measurement model (RUMM2020). The preliminary R-ODS did not meet the Rasch model expectations. Based on disordered thresholds, misfit statistics, item bias, and local dependency, items were systematically removed to improve the model fit, regularly controlling the class intervals and model statistics. Finally, we succeeded in constructing a 24-item scale that fulfilled all Rasch requirements. "Reading a newspaper/book" and "eating" were the 2 easiest items; "standing for hours" and "running" were the most difficult ones. Good validity and reliability were obtained. The R-ODS is a linearly weighted scale that specifically captures activity and social participation limitations in patients with GBS, CIDP, and MGUSP. Compared to the Overall Disability Sum Score, the R-ODS represents a wider range of item difficulties, thereby better targeting patients with different ability levels. If responsive, the R-ODS will be valuable for future clinical trials and follow-up studies in these conditions.

  18. The PROMIS fatigue item bank has good measurement properties in patients with fibromyalgia and severe fatigue.

    PubMed

    Yost, Kathleen J; Waller, Niels G; Lee, Minji K; Vincent, Ann

    2017-06-01

    Efficient management of fibromyalgia (FM) requires precise measurement of FM-specific symptoms. Our objective was to assess the measurement properties of the Patient-Reported Outcome Measurement Information System (PROMIS) fatigue item bank (FIB) in people with FM. We applied classical psychometric and item response theory methods to cross-sectional PROMIS-FIB data from two samples. Data on the clinical FM sample were obtained at a tertiary medical center. Data for the U.S. general population sample were obtained from the PROMIS network. The full 95-item bank was administered to both samples. We investigated dimensionality of the item bank in both samples by separately fitting a bifactor model with two group factors; experience and impact. We assessed measurement invariance between samples, and we explored an alternate factor structure with the normative sample and subsequently confirmed that structure in the clinical sample. Finally, we assessed whether reporting FM subdomain scores added value over reporting a single total score. The item bank was dominated by a general fatigue factor. The fit of the initial bifactor model and evidence of measurement invariance indicated that the same constructs were measured across the samples. An alternative bifactor model with three group factors demonstrated slightly improved fit. Subdomain scores add value over a total score. We demonstrated that the PROMIS-FIB is appropriate for measuring fatigue in clinical samples of FM patients. The construct can be presented by a single score; however, subdomain scores for the three group factors identified in the alternative model may also be reported.

  19. The Val30Met familial amyloid polyneuropathy specific Rasch-built overall disability scale (FAP-RODS(©)).

    PubMed

    Pruppers, Mariëlle H J; Merkies, Ingemar S J; Faber, Catharina G; Da Silva, Ana M; Costa, Vanessa; Coelho, Teresa

    2015-09-01

    Familial amyloid polyneuropathy (FAP) is a chronic debilitating multi-organic disorder, mainly assessed using ordinal-based impairment measures. To date, no outcome measure at the activity and participation level has been constructed in FAP. The current study aimed to design an interval activity/participation scale for FAP through Rasch methodology. A preliminary FAP Rasch-built overall disability scale (pre-FAP-RODS) containing 146 activity/participation items was assessed twice (interval: 2-4 weeks; test-retest reliability) in 248 patients with Val30Met FAP examined in Porto, Portugal, of whom 65.7% had received liver transplantation. An ordinal-based 24-item FAP-symptoms inventory questionnaire (FAP-SIQ) was also assessed (validity purposes). The pre-FAP-RODS and FAP-SIQ data were subjected to Rasch analyses. The pre-FAP-RODS did not meet the model's expectations. On the basis of requirements such as misfit statistics, differential item functioning, and local dependency, items were systematically removed until a final 34-item FAP-RODS(©) was constructed fulfilling all Rasch requirements. Acceptable reliability/validity scores were demonstrated. In conclusion, the 34-item FAP-RODS(©) is a disease-specific interval measure suitable for detecting activity and participation restrictions in patients with FAP. The use of the FAP-RODS(©) is recommended for future international clinical trials in patients with Val30Met FAP determining its responsiveness and its cross-cultural validation. Its expansion to other forms of FAP should also be a focus of future clinical studies. © 2015 Peripheral Nerve Society.

  20. Development of the Primary Care Quality-Homeless (PCQ-H) Instrument: A Practical Survey of Patients' Experiences in Primary Care

    PubMed Central

    Kertesz, Stefan G.; Pollio, David E.; Jones, Richard N.; Steward, Jocelyn; Stringfellow, Erin J.; Gordon, Adam J.; Johnson, Nancy K.; Kim, Theresa A.; Granstaff, Unita; Austin, Erika L.; Young, Alexander S.; Golden, Joya; Davis, Lori L.; Roth, David L.; Holt, Cheryl L.

    2015-01-01

    Background: Homeless patients face unique challenges in obtaining primary care responsive to their needs and context. Patient experience questionnaires could permit assessment of patient-centered medical homes for this population, but standard instruments may not reflect homeless patients' priorities and concerns. Objectives: This report describes (a) the content and psychometric properties of a new primary care questionnaire for homeless patients and (b) the methods utilized in its development. Methods: Starting with quality-related constructs from the Institute of Medicine, we identified relevant themes by interviewing homeless patients and experts in their care. A multidisciplinary team drafted a preliminary set of 78 items. This was administered to homeless-experienced clients (n=563) across 3 VA facilities and 1 non-VA Health Care for the Homeless Program. Using Item Response Theory, we examined Test Information Function curves to eliminate less informative items and devise plausibly distinct subscales. Results: The resulting 33-item instrument (Primary Care Quality-Homeless, PCQ-H) has four subscales: Patient-Clinician Relationship (15 items), Cooperation among Clinicians (3 items), Access/Coordination (11 items) and Homeless-Specific Needs (4 items). Evidence for divergent and convergent validity is provided. Test Information Function (TIF) graphs showed adequate informational value to permit inferences about groups for 3 subscales (Relationship, Cooperation and Access/Coordination). The 3-item Cooperation subscale had lower informational value (TIF<5) but had good internal consistency (alpha=0.75) and patients frequently reported problems in this aspect of care. Conclusions: Systematic application of qualitative and quantitative methods supported the development of a brief patient-reported questionnaire focused on the primary care of homeless patients and offers guidance for future population-specific instrument development. PMID:25023918

  1. Are Attitudes Toward Writing and Reading Separable Constructs? A Study With Primary Grade Children

    PubMed Central

    Graham, Steve; Berninger, Virginia; Abbott, Robert

    2012-01-01

    This study examined whether or not attitude towards writing is a unique and separable construct from attitude towards reading for young, beginning writers. Participants were 128 first-grade children (70 girls and 58 boys) and 113 third-grade students (57 girls and 56 boys). Each child was individually administered a 24-item attitude measure, which contained 12 items assessing attitude towards writing and 12 parallel items for reading. Students also wrote a narrative about a personal event in their life. A factor analysis of the 24-item attitude measure provided evidence that generally supports the contention that writing and reading attitudes are separable constructs for young beginning writers, as it yielded three factors: a writing attitude factor with 9 items, a reading attitude factor with 9 parallel items, and an attitude about literacy interactions with others factor containing 4 items (2 items in writing and 2 parallel items in reading). Further validation that attitude towards writing is a separable construct from attitude towards reading was obtained at the third-grade level, where writing attitude made a unique and significant contribution, beyond the other two attitude measures, to the prediction of three measures of writing: quality, length, and longest correct word sequence. At the first-grade level, none of the 3 attitude measures predicted students’ writing performance. Finally, girls had more positive attitudes concerning reading and writing than boys. PMID:22736933

  2. The Incremental Value of Subjective and Quantitative Assessment of 18F-FDG PET for the Prediction of Pathologic Complete Response to Preoperative Chemoradiotherapy in Esophageal Cancer.

    PubMed

    van Rossum, Peter S N; Fried, David V; Zhang, Lifei; Hofstetter, Wayne L; van Vulpen, Marco; Meijer, Gert J; Court, Laurence E; Lin, Steven H

    2016-05-01

    A reliable prediction of a pathologic complete response (pathCR) to chemoradiotherapy before surgery for esophageal cancer would enable investigators to study the feasibility and outcome of an organ-preserving strategy after chemoradiotherapy. So far no clinical parameters or diagnostic studies are able to accurately predict which patients will achieve a pathCR. The aim of this study was to determine whether subjective and quantitative assessment of baseline and postchemoradiation (18)F-FDG PET can improve the accuracy of predicting pathCR to preoperative chemoradiotherapy in esophageal cancer beyond clinical predictors. This retrospective study was approved by the institutional review board, and the need for written informed consent was waived. Clinical parameters along with subjective and quantitative parameters from baseline and postchemoradiation (18)F-FDG PET were derived from 217 esophageal adenocarcinoma patients who underwent chemoradiotherapy followed by surgery. The associations between these parameters and pathCR were studied in univariable and multivariable logistic regression analysis. Four prediction models were constructed and internally validated using bootstrapping to study the incremental predictive values of subjective assessment of (18)F-FDG PET, conventional quantitative metabolic features, and comprehensive (18)F-FDG PET texture/geometry features, respectively. The clinical benefit of (18)F-FDG PET was determined using decision-curve analysis. A pathCR was found in 59 (27%) patients. A clinical prediction model (corrected c-index, 0.67) was improved by adding (18)F-FDG PET-based subjective assessment of response (corrected c-index, 0.72). This latter model was slightly improved by the addition of 1 conventional quantitative metabolic feature only (i.e., postchemoradiation total lesion glycolysis; corrected c-index, 0.73), and even more by subsequently adding 4 comprehensive (18)F-FDG PET texture/geometry features (corrected c-index, 0.77). 
However, at a decision threshold of 0.9 or higher, representing a clinically relevant predictive value for pathCR at which one may be willing to omit surgery, there was no clear incremental value. Subjective and quantitative assessment of (18)F-FDG PET provides statistical incremental value for predicting pathCR after preoperative chemoradiotherapy in esophageal cancer. However, the discriminatory improvement beyond clinical predictors does not translate into a clinically relevant benefit that could change decision making. © 2016 by the Society of Nuclear Medicine and Molecular Imaging, Inc.

  3. Preliminary Study of the Autism Self-Efficacy Scale for Teachers (ASSET)

    PubMed Central

    Ruble, Lisa A.; Toland, Michael D.; Birdwhistell, Jessica L.; McGrew, John H.; Usher, Ellen L.

    2013-01-01

    The purpose of the current study was to evaluate a new measure, the Autism Self-Efficacy Scale for Teachers (ASSET) for its dimensionality, internal consistency, and construct validity derived in a sample of special education teachers (N = 44) of students with autism. Results indicate that all items reflect one dominant factor, teachers’ responses to items were internally consistent within the sample, and compared to a 100-point scale, a 6-point response scale is adequate. ASSET scores were found to be negatively correlated with scores on two subscale measures of teacher stress (i.e., self-doubt/need for support and disruption of the teaching process) but uncorrelated with teacher burnout scores. The ASSET is a promising tool that requires replication with larger samples. PMID:23976899

  4. Measuring the quality of life in hypertension according to Item Response Theory.

    PubMed

    Borges, José Wicto Pereira; Moreira, Thereza Maria Magalhães; Schmitt, Jeovani; Andrade, Dalton Francisco de; Barbetta, Pedro Alberto; Souza, Ana Célia Caetano de; Lima, Daniele Braz da Silva; Carvalho, Irialda Saboia

    2017-05-04

    To analyze the Miniquestionário de Qualidade de Vida em Hipertensão Arterial (MINICHAL - Mini-questionnaire of Quality of Life in Hypertension) using the Item Response Theory. This is an analytical study conducted with 712 persons with hypertension treated in thirteen primary health care units of Fortaleza, State of Ceará, Brazil, in 2015. The steps of the analysis by the Item Response Theory were: evaluation of dimensionality, estimation of parameters of items, and construction of scale. The study of dimensionality was carried out on the polychoric correlation matrix and confirmatory factor analysis. To estimate the item parameters, we used Samejima's Graded Response Model. The analyses were conducted using the free software R with the aid of the psych and mirt packages. The analysis has allowed the visualization of item parameters and their individual contributions in the measurement of the latent trait, generating more information and allowing the construction of a scale with an interpretative model that demonstrates the evolution of the worsening of the quality of life in five levels. Regarding the item parameters, the items related to the somatic state have had a good performance, as they have presented better power to discriminate individuals with worse quality of life. The items related to mental state have been those which contributed the least psychometric information in the MINICHAL. We conclude that the instrument is suitable for the identification of the worsening of the quality of life in hypertension. The analysis of the MINICHAL using the Item Response Theory has allowed us to identify new facets of this instrument that have not yet been addressed in previous studies.
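
    To make the model named in this abstract more concrete, the sketch below evaluates category response probabilities under the standard logistic form of Samejima's graded response model. The function name and all parameter values are assumptions chosen for illustration; they are not MINICHAL estimates, and the study itself was fitted in R with mirt rather than with this code.

```python
import numpy as np


def grm_category_probs(theta, a, b):
    """Category probabilities for one item under the graded response model.

    theta : latent trait level of the respondent
    a     : item discrimination
    b     : ordered thresholds b_1 < ... < b_{K-1} for K response categories
    """
    b = np.asarray(b, dtype=float)
    # Cumulative curves P(X >= k) for k = 1..K-1 (logistic form).
    cum = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    # Pad with P(X >= 0) = 1 and P(X >= K) = 0, then difference adjacent curves.
    cum = np.concatenate(([1.0], cum, [0.0]))
    return cum[:-1] - cum[1:]


if __name__ == "__main__":
    # Hypothetical 4-category item, values invented for illustration.
    print(grm_category_probs(theta=0.3, a=1.5, b=[-1.0, 0.0, 1.0]))
```

    A larger discrimination `a` makes the cumulative curves steeper, which is what the abstract means when it says some items discriminate better between respondents.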

  5. Hope and General Self-efficacy: Two Measures of the Same Construct?

    PubMed

    Zhou, Mingming; Kam, Chester Chun Seng

    2016-07-03

    The aim of this study was to test the extent to which a measure of hope is equivalent to a measure of general self-efficacy (GSE). Questionnaire data on these two constructs and other external variables were collected from 199 Chinese college students. The factor analytic results suggested that hope and self-efficacy items measured the same construct. The unidimensional model combining hope items and GSE items fit the data as well as the bidimensional model, indicating that their corresponding items measured the same underlying construct. Further analyses showed that hope and GSE did not correlate with external variables differently in a systematic manner. Most of these correlational differences were non-significant and negligible. These findings suggested that the literatures on GSE and hope could be integrated and that researchers need to recognize and acknowledge the conceptual and operational similarities among these constructs in the literature.

  6. Validity and measurement precision of the PROMIS physical function item bank and a content validity-driven 20-item short form in rheumatoid arthritis compared with traditional measures.

    PubMed

    Oude Voshaar, Martijn A H; Ten Klooster, Peter M; Glas, Cees A W; Vonkeman, Harald E; Taal, Erik; Krishnan, Eswar; Bernelot Moens, Hein J; Boers, Maarten; Terwee, Caroline B; van Riel, Piet L C M; van de Laar, Mart A F J

    2015-12-01

    To evaluate the content validity and measurement properties of the Patient-Reported Outcome Measurement Information System (PROMIS) physical function item bank and a 20-item short form in patients with RA in comparison with the HAQ disability index (HAQ-DI) and 36-item Short Form Health Survey (SF-36) physical functioning scale (PF-10). The content validity of the instruments was evaluated by linking their items to the International Classification of Functioning, Disability and Health (ICF) core set for RA. The measures were administered to 690 RA patients enrolled in the Dutch Rheumatoid Arthritis Monitoring registry. Measurement precision was evaluated using item response theory methods and construct validity was evaluated by correlating physical function scores with other clinical and patient-reported outcome measures. All 207 health concepts identified in the physical function measures referred to activities that are featured in the ICF. Twenty-three of 26 ICF RA core set domains are featured in the full PROMIS physical function item bank compared with 13 and 8 for the HAQ-DI and PF-10, respectively. As hypothesized, all three physical function instruments were highly intercorrelated (r 0.74-0.84), moderately correlated with disease activity measures (r 0.44-0.63) and weakly correlated with age (rs 0.07-0.14). Item response theory-based analysis revealed that a 20-item PROMIS physical function short form covered a wider range of physical function levels than the HAQ-DI or PF-10. The PROMIS physical function item bank demonstrated excellent measurement properties in RA. A content-driven 20-item short form may be a useful tool for assessing physical function in RA. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  7. Evaluation of a preliminary physical function item bank supported the expected advantages of the Patient-Reported Outcomes Measurement Information System (PROMIS).

    PubMed

    Rose, M; Bjorner, J B; Becker, J; Fries, J F; Ware, J E

    2008-01-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS) was initiated to improve precision, reduce respondent burden, and enhance the comparability of health outcomes measures. We used item response theory (IRT) to construct and evaluate a preliminary item bank for physical function assuming four subdomains. Data from seven samples (N=17,726) using 136 items from nine questionnaires were evaluated. A generalized partial credit model was used to estimate item parameters, which were normed to a mean of 50 (SD=10) in the US population. Item bank properties were evaluated through Computerized Adaptive Test (CAT) simulations. IRT requirements were fulfilled by 70 items covering activities of daily living, lower extremity, and central body functions. The original item context partly affected parameter stability. Items on upper body function, and need for aid or devices did not fit the IRT model. In simulations, a 10-item CAT eliminated floor and decreased ceiling effects, achieving a small standard error (< 2.2) across scores from 20 to 50 (reliability >0.95 for a representative US sample). This precision was not achieved over a similar range by any comparable fixed length item sets. The methods of the PROMIS project are likely to substantially improve measures of physical function and to increase the efficiency of their administration using CAT.
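
    The standard error and reliability figures in this abstract can be related with a short calculation. The sketch below assumes the common classical relation reliability = 1 - SE²/SD², which the abstract does not state explicitly, and uses the SD of 10 from the norming described above; the function name is made up for this example.

```python
def reliability_from_se(se, sd):
    """Reliability implied by a standard error on a scale with the given SD.

    Assumes reliability = 1 - (SE / SD)^2 (error variance over total variance).
    """
    return 1.0 - (se / sd) ** 2


if __name__ == "__main__":
    # SE < 2.2 on a metric normed to SD = 10, as reported in the abstract.
    print(round(reliability_from_se(2.2, 10.0), 4))  # 0.9516
```

    Under that assumption, a standard error of 2.2 corresponds to a reliability of about 0.95, in line with the ">0.95" quoted for standard errors below 2.2.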

  8. Item Response Theory Modeling and Categorical Regression Analyses of the Five-Factor Model Rating Form: A Study on Italian Community-Dwelling Adolescent Participants and Adult Participants.

    PubMed

    Fossati, Andrea; Widiger, Thomas A; Borroni, Serena; Maffei, Cesare; Somma, Antonella

    2017-06-01

    To extend the evidence on the reliability and construct validity of the Five-Factor Model Rating Form (FFMRF) in its self-report version, two independent samples of Italian participants, which were composed of 510 adolescent high school students and 457 community-dwelling adults, respectively, were administered the FFMRF in its Italian translation. Adolescent participants were also administered the Italian translation of the Borderline Personality Features Scale for Children-11 (BPFSC-11), whereas adult participants were administered the Italian translation of the Triarchic Psychopathy Measure (TriPM). Cronbach α values were consistent with previous findings; in both samples, average interitem r values indicated acceptable internal consistency for all FFMRF scales. A multidimensional graded item response theory model indicated that the majority of FFMRF items had adequate discrimination parameters; information indices supported the reliability of the FFMRF scales. Both categorical (i.e., item-level) and scale-level regression analyses suggested that the FFMRF scores may predict a nonnegligible amount of variance in the BPFSC-11 total score in adolescent participants, and in the TriPM scale scores in adult participants.

  9. Testing measurement invariance of the patient-reported outcomes measurement information system pain behaviors score between the US general population sample and a sample of individuals with chronic pain.

    PubMed

    Chung, Hyewon; Kim, Jiseon; Cook, Karon F; Askew, Robert L; Revicki, Dennis A; Amtmann, Dagmar

    2014-02-01

    In order to test the difference between group means, the construct measured must have the same meaning for all groups under investigation. This study examined the measurement invariance of responses to the patient-reported outcomes measurement information system (PROMIS) pain behavior (PB) item bank in two samples: the PROMIS calibration sample (Wave 1, N = 426) and a sample recruited from the American Chronic Pain Association (ACPA, N = 750). The ACPA data were collected to increase the number of participants with higher levels of pain. Multi-group confirmatory factor analysis (MG-CFA) and two item response theory (IRT)-based differential item functioning (DIF) approaches were employed to evaluate the existence of measurement invariance. MG-CFA results supported metric invariance of the PROMIS-PB, indicating that unstandardized factor loadings were equal across samples. DIF analyses revealed that the impact of 6 DIF items was negligible. Based on the results of both MG-CFA and IRT-based DIF approaches, we recommend retaining the original parameter estimates obtained from the combined samples.

  10. Factors affecting cardiac rehabilitation referral by physician specialty.

    PubMed

    Grace, Sherry L; Grewal, Keerat; Stewart, Donna E

    2008-01-01

    Cardiac rehabilitation (CR) is widely underutilized because of multiple factors including physician referral practices. Previous research has shown CR referral varies by type of provider, with cardiologists more likely to refer than primary care physicians. The objective of this study was to compare factors affecting CR referral in primary care physicians versus cardiac specialists. A cross-sectional survey of a stratified random sample of 510 primary care physicians and cardiac specialists (cardiologists or cardiovascular surgeons) in Ontario identified through the Canadian Medical Directory Online was administered. One hundred four primary care physicians and 81 cardiac specialists responded to the 26-item investigator-generated survey examining medical, demographic, attitudinal, and health system factors affecting CR referral. Primary care physicians were more likely to endorse lack of familiarity with CR site locations (P < .001), lack of standardized referral forms (P < .001), inconvenience (P = .04), program quality (P = .004), and lack of discharge communication from CR (P = .001) as factors negatively impacting CR referral practices than cardiac specialists. Cardiac specialists were significantly more likely to perceive that their colleagues and department would regularly refer patients to CR than primary care physicians (P < .001). Where differences emerged, primary care physicians were more likely to perceive factors that would impede CR referral, some of which are modifiable. Marketing CR site locations, provision of standardized referral forms, and ensuring discharge summaries are communicated to primary care physicians may improve their willingness to refer to CR.

  11. Rasch Analysis of the Student Refractive Error and Eyeglass Questionnaire

    PubMed Central

    Crescioni, Mabel; Messer, Dawn H.; Warholak, Terri L.; Miller, Joseph M.; Twelker, J. Daniel; Harvey, Erin M.

    2014-01-01

    Purpose. To evaluate and refine a newly developed instrument, the Student Refractive Error and Eyeglasses Questionnaire (SREEQ), designed to measure the impact of uncorrected and corrected refractive error on vision-related quality of life (VRQoL) in school-aged children. Methods. A 38-statement instrument consisting of two parts was developed: Part A relates to perceptions regarding uncorrected vision and Part B relates to perceptions regarding corrected vision and includes other statements regarding VRQoL with spectacle correction. The SREEQ was administered to 200 Native American 6th through 12th grade students known to have previously worn and who currently require eyeglasses. Rasch analysis was conducted to evaluate the functioning of the SREEQ. Statements on Part A and Part B were analyzed to examine the dimensionality and constructs of the questionnaire, how well the items functioned, and the appropriateness of the response scale used. Results. Rasch analysis suggested two items be eliminated and the measurement scale for matching items be reduced from a 4-point response scale to a 3-point response scale. With these modifications, categorical data were converted to interval level data, to conduct an item and person analysis. A shortened version of the SREEQ was constructed with these modifications, the SREEQ-R, which included the statements that were able to capture changes in VRQoL associated with spectacle wear for those with significant refractive error in our study population. Conclusions. While the SREEQ Part B appears to have less than optimal reliability to assess the impact of spectacle correction on VRQoL in our student population, it is also able to detect statistically significant differences from pretest to posttest on both the group and individual levels to show that the instrument can assess the impact that glasses have on VRQoL. Further modifications to the questionnaire, such as those included in the SREEQ-R, could enhance its functionality. PMID:24811844

  12. Students' perceptions of a blended learning experience in dental education.

    PubMed

    Varthis, S; Anderson, O R

    2018-02-01

    "Flipped" instructional sequencing is a new instructional method where online instruction precedes the group meeting, allowing for more sophisticated learning through discussion and critical thinking during the in-person class session; a novel approach studied in this research. The purpose of this study was to document dental students' perceptions of flipped-based blended learning and to apply a new method of displaying their perceptions based on Likert-scale data analysis using a network diagramming method known as an item correlation network diagram (ICND). In addition, this article aimed to encourage institutions or course directors to consider self-regulated learning and social constructivism as a theoretical framework when blended learning is incorporated in dental curricula. Twenty (second year) dental students at a Northeastern Regional Dental School in the United States participated in this study. A Likert scale was administered before and after the learning experience to obtain evidence of their perceptions of its quality and educational merits. Item correlation network diagrams, based on the intercorrelations amongst the responses to the Likert-scale items, were constructed to display students' changes in perceptions before and after the learning experience. Students reported positive perceptions of the blended learning, and the ICND analysis of their responses before and after the learning experience provided insights into their social (group-based) cognition about the learning experience. The ICNDs are considered evidence of social or group-based cognition, because they are constructed from evidence obtained using intercorrelations of the total group responses to the Likert-scale items. The students positively received blended learning in dental education, and the ICND analyses demonstrated marked changes in their social cognition of the learning experience based on the pre- and post-Likert survey data. 
Self-regulated learning and social constructivism are encouraged as useful theoretical frameworks for a blended learning approach. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  13. Dual Inhibition of Ca(+2) Influx and Phosphodiesterase Enzyme Provides Scientific Base for the Medicinal Use of Chrozophora prostrata Dalz. in Respiratory Disorders.

    PubMed

    Arshad, Usman; Bashir, Samra; -Ur-Rehman, Najeeb; Yaqub, Tahir; Gilani, Anwarul-Hassan

    2016-06-01

    The crude ethanolic extract of Chrozophora prostrata (Cp.Cr) was tested using in vivo and ex vivo assays for its possible bronchodilatory effects in order to validate its medicinal use in respiratory disorders, like asthma and cough. Cp.Cr exhibited dose-dependent inhibition of carbachol (CCh)-induced bronchospasm in anesthetized rats, similar to aminophylline. When tested on guinea-pig tracheal preparations, Cp.Cr caused relaxation of both CCh (1 μM) and high K(+) (80 mM)-induced contractions with comparable potencies, similar to papaverine, a dual inhibitor of phosphodiesterase (PDE) and Ca(+2) influx. Pre-treatment of the tracheal tissues with Cp.Cr resulted in potentiation of the inhibitory effect of isoprenaline on CCh-induced contractions, like that caused by papaverine, indicative of PDE inhibitory activity, which was confirmed when Cp.Cr concentration dependently (1 and 3 mg/mL) increased intracellular cAMP levels of the tracheal preparations, like papaverine. Cp.Cr shifted concentration-response curves of Ca(+2) constructed in guinea-pig tracheal preparation towards the right with suppression of the maximum response, similar to both verapamil and papaverine. These data indicate bronchodilator activity of Chrozophora prostrata mediated possibly through dual inhibition of PDE and Ca(+2) influx, thus showing therapeutic potential in asthma with effect enhancing and side-effect neutralizing potential. Copyright © 2016 John Wiley & Sons, Ltd.

  14. Calibration of the Test of Relational Reasoning.

    PubMed

    Dumas, Denis; Alexander, Patricia A

    2016-10-01

    Relational reasoning, or the ability to discern meaningful patterns within a stream of information, is a critical cognitive ability associated with academic and professional success. Importantly, relational reasoning has been described as taking multiple forms, depending on the type of higher order relations being drawn between and among concepts. However, the reliable and valid measurement of such a multidimensional construct of relational reasoning has been elusive. The Test of Relational Reasoning (TORR) was designed to tap 4 forms of relational reasoning (i.e., analogy, anomaly, antinomy, and antithesis). In this investigation, the TORR was calibrated and scored using multidimensional item response theory in a large, representative undergraduate sample. The bifactor model was identified as the best-fitting model, and used to estimate item parameters and construct reliability. To improve the usefulness of the TORR to educators, scaled scores were also calculated and presented. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  15. A Comparison of Latent Growth Models for Constructs Measured by Multiple Items

    ERIC Educational Resources Information Center

    Leite, Walter L.

    2007-01-01

    Univariate latent growth modeling (LGM) of composites of multiple items (e.g., item means or sums) has been frequently used to analyze the growth of latent constructs. This study evaluated whether LGM of composites yields unbiased parameter estimates, standard errors, chi-square statistics, and adequate fit indexes. Furthermore, LGM was compared…

  16. Guide to English Language Arts/Literacy Released Items: Understanding Scoring

    ERIC Educational Resources Information Center

    Partnership for Assessment of Readiness for College and Careers, 2016

    2016-01-01

    The Partnership for Assessment of Readiness for College and Careers (PARCC) is a group of states working together to develop a set of assessments that measure whether students are on track to be successful in college and careers. Administrations of the PARCC assessment included three Prose Constructed Responses (PCR), one per task for English…

  17. Using the Graded Response Model to Control Spurious Interactions in Moderated Multiple Regression

    ERIC Educational Resources Information Center

    Morse, Brendan J.; Johanson, George A.; Griffeth, Rodger W.

    2012-01-01

    Recent simulation research has demonstrated that using simple raw score to operationalize a latent construct can result in inflated Type I error rates for the interaction term of a moderated statistical model when the interaction (or lack thereof) is proposed at the latent variable level. Rescaling the scores using an appropriate item response…

  18. Longitudinal Construct Validity of Brief Symptom Inventory Subscales in Schizophrenia

    ERIC Educational Resources Information Center

    Long, Jeffrey D.; Harring, Jeffrey R.; Brekke, John S.; Test, Mary Ann; Greenberg, Jan

    2007-01-01

    Longitudinal validity of Brief Symptom Inventory subscales was examined in a sample (N = 318) with schizophrenia-related illness measured at baseline and every 6 months for 3 years. Nonlinear factor analysis of items was used to test graded response models (GRMs) for subscales in isolation. The models varied in their within-time and between-times…

  19. A Critical Analysis of the Body of Work Method for Setting Cut-Scores

    ERIC Educational Resources Information Center

    Radwan, Nizam; Rogers, W. Todd

    2006-01-01

    The recent increase in the use of constructed-response items in educational assessment and the dissatisfaction with the nature of the decision that the judges must make using traditional standard-setting methods created a need to develop new and effective standard-setting procedures for tests that include both multiple-choice and…

  20. Validation Study of a Gatekeeping Attitude Index for Social Work Education

    ERIC Educational Resources Information Center

    Tam, Dora M. Y.; Coleman, Heather

    2011-01-01

    This article reports on a study designed to validate the Gatekeeping Attitude Index, a 14-item Likert scaling index. The authors collected data from a convenience sample of social work field instructors (N = 188) with a response rate of 74.0%. Construct validation by exploratory factor analysis identified a 2-factor solution on the index after…

  1. State of Modern Measurement Approaches in Social Work Research Literature

    ERIC Educational Resources Information Center

    Unick, George J.; Stone, Susan

    2010-01-01

    The need to develop measures that tap into constructs of interest to social work, refine existing measures, and ensure that measures function adequately across diverse populations of interest is critical. Item response theory (IRT) is a modern measurement approach that is increasingly seen as an essential tool in a number of allied professions.…

  2. Curve of Factors Model: A Latent Growth Modeling Approach for Educational Research

    ERIC Educational Resources Information Center

    Isiordia, Marilu; Ferrer, Emilio

    2018-01-01

    A first-order latent growth model assesses change in an unobserved construct from a single score and is commonly used across different domains of educational research. However, examining change using a set of multiple response scores (e.g., scale items) affords researchers several methodological benefits not possible when using a single score. A…

  3. Investigating Criteria That Seventh Graders Use to Evaluate the Quality of Online Information

    ERIC Educational Resources Information Center

    Coiro, Julie; Coscarelli, Carla; Maykel, Cheryl; Forzani, Elena

    2015-01-01

    This article presents qualitative findings from a study that examined the types of criteria that middle school students use to evaluate the quality of online information and sources for a Web-based research assignment. Open-constructed responses from four critical evaluation items were compiled from diverse seventh graders in a representative,…

  4. Development and Psychometric Evaluation of a Health-Related Quality of Life Instrument for Individuals with Adult-Onset Hearing Loss

    PubMed Central

    Stika, Carren J.; Hays, Ron D.

    2016-01-01

    Objective. Self-reports of “hearing handicap” are available, but a comprehensive measure of health-related quality of life (HRQOL) for individuals with adult-onset hearing loss (AOHL) does not exist. Our objective was to develop and evaluate a multidimensional HRQOL instrument for individuals with AOHL. Design. The Impact of Hearing Loss Inventory Tool (IHEAR-IT) was developed using results of focus groups, a literature review, Advisory Expert Panel input, and cognitive interviews. Study Sample. The 73-item field-test instrument was completed by 409 adults (22-91 years old) with varying degrees of AOHL and from different areas of the US. Results. Multitrait scaling analysis supported four multi-item scales and five individual items. Internal consistency reliabilities ranged from 0.93 to 0.96 for the scales. Construct validity was supported by correlations between the IHEAR-IT scales and scores on the 36-Item Short Form Health Survey, Version 2.0 (SF-36v2) Mental Composite Summary (r’s = 0.32 – 0.64) and the Hearing Handicap Inventory for the Elderly/Adults (HHIE/HHIA) (r’s > −0.70). Conclusions. The field test provides initial support for the reliability and construct validity of the IHEAR-IT for evaluating HRQOL of individuals with AOHL. Further research is needed to evaluate the responsiveness to change of the IHEAR-IT scales and identify items for a short-form. PMID:27104754

  5. The Academic Resilience Scale (ARS-30): A New Multidimensional Construct Measure

    PubMed Central

    Cassidy, Simon

    2016-01-01

    Resilience is a psychological construct observed in some individuals that accounts for success despite adversity. Resilience reflects the ability to bounce back, to beat the odds and is considered an asset in human characteristic terms. Academic resilience contextualizes the resilience construct and reflects an increased likelihood of educational success despite adversity. The paper provides an account of the development of a new multidimensional construct measure of academic resilience. The 30 item Academic Resilience Scale (ARS-30) explores process—as opposed to outcome—aspects of resilience, providing a measure of academic resilience based on students’ specific adaptive cognitive-affective and behavioral responses to academic adversity. Findings from the study involving a sample of undergraduate students (N = 532) demonstrate that the ARS-30 has good internal reliability and construct validity. It is suggested that a measure such as the ARS-30, which is based on adaptive responses, aligns more closely with the conceptualisation of resilience and provides a valid construct measure of academic resilience relevant for research and practice in university student populations. PMID:27917137

  6. Developing and investigating the use of single-item measures in organizational research.

    PubMed

    Fisher, Gwenith G; Matthews, Russell A; Gibbons, Alyssa Mitchell

    2016-01-01

    The validity of organizational research relies on strong research methods, which include effective measurement of psychological constructs. The general consensus is that multiple-item measures have better psychometric properties than single-item measures. However, due to practical constraints (e.g., survey length, respondent burden) there are situations in which certain single items may be useful for capturing information about constructs that might otherwise go unmeasured. We evaluated 37 items, including 18 newly developed items as well as 19 single items selected from existing multiple-item scales based on psychometric characteristics, to assess 18 constructs frequently measured in organizational and occupational health psychology research. We examined evidence of reliability; convergent, discriminant, and content validity assessments; and test-retest reliabilities at 1- and 3-month time lags for single-item measures using a multistage and multisource validation strategy across 3 studies, including data from N = 17 occupational health subject matter experts and N = 1,634 survey respondents across 2 samples. Items selected from existing scales generally demonstrated better internal consistency reliability and convergent validity, whereas these particular new items generally had higher levels of content validity. We offer recommendations regarding when use of single items may be more or less appropriate, as well as 11 items that seem acceptable, 14 items with mixed results that might be used with caution, and 12 items we do not recommend using as single-item measures. Although multiple-item measures are preferable from a psychometric standpoint, in some circumstances single-item measures can provide useful information. (c) 2016 APA, all rights reserved.

  7. Validation and psychometric properties of the Somatic and Psychological HEalth REport (SPHERE) in a young Australian-based population sample using non-parametric item response theory.

    PubMed

    Couvy-Duchesne, Baptiste; Davenport, Tracey A; Martin, Nicholas G; Wright, Margaret J; Hickie, Ian B

    2017-08-01

    The Somatic and Psychological HEalth REport (SPHERE) is a 34-item self-report questionnaire that assesses symptoms of mental distress and persistent fatigue. As it was developed as a screening instrument for use mainly in primary care-based clinical settings, its validity and psychometric properties have not been studied extensively in population-based samples. We used non-parametric Item Response Theory to assess scale validity and item properties of the SPHERE-34 scales, collected through four waves of the Brisbane Longitudinal Twin Study (N = 1707, mean age = 12, 51% females; N = 1273, mean age = 14, 50% females; N = 1513, mean age = 16, 54% females; N = 1263, mean age = 18, 56% females). We estimated the heritability of the new scores, their genetic correlation, and their predictive ability in a sub-sample (N = 1993) who completed the Composite International Diagnostic Interview. After excluding items most responsible for noise, sex or wave bias, the SPHERE-34 questionnaire was reduced to 21 items (SPHERE-21), comprising a 14-item scale for anxiety-depression and a 10-item scale for chronic fatigue (3 items overlapping). These new scores showed high internal consistency (alpha > 0.78), moderate three-month reliability (ICC = 0.47-0.58) and item scalability (Hi > 0.23), and were positively correlated (phenotypic correlations r = 0.57-0.70; rG = 0.77-1.00). Heritability estimates ranged from 0.27 to 0.51. In addition, both scores were associated with later DSM-IV diagnoses of MDD, social anxiety and alcohol dependence (OR 1.23-1.47). Finally, a post-hoc comparison showed that several psychometric properties of the SPHERE-21 were similar to those of the Beck Depression Inventory. The scales of SPHERE-21 measure valid and comparable constructs across sex and age groups (from 9 to 28 years). SPHERE-21 scores are heritable, genetically correlated and show good predictive ability of mental health in an Australian-based population sample of young people.

  8. A Mixed Effects Randomized Item Response Model

    ERIC Educational Resources Information Center

    Fox, J.-P.; Wyrick, Cheryl

    2008-01-01

    The randomized response technique ensures that individual item responses, denoted as true item responses, are randomized before observing them and so-called randomized item responses are observed. A relationship is specified between randomized item response data and true item response data. True item response data are modeled with a (non)linear…

  9. Dutch-Flemish translation of nine pediatric item banks from the Patient-Reported Outcomes Measurement Information System (PROMIS)®.

    PubMed

    Haverman, Lotte; Grootenhuis, Martha A; Raat, Hein; van Rossum, Marion A J; van Dulmen-den Broeder, Eline; Hoppenbrouwers, Karel; Correia, Helena; Cella, David; Roorda, Leo D; Terwee, Caroline B

    2016-03-01

    The Patient-Reported Outcomes Measurement Information System (PROMIS(®)) is a new, state-of-the-art assessment system for measuring patient-reported health and well-being of adults and children. It has the potential to be more valid, reliable, and responsive than existing PROMs. The item banks are designed to be self-reported and completed by children aged 8-18 years. The PROMIS items can be administered in short forms or through computerized adaptive testing. This paper describes the translation and cultural adaptation of nine PROMIS item banks (151 items) for children in Dutch-Flemish. The translation was performed by FACITtrans using standardized PROMIS methodology and approved by the PROMIS Statistical Center. The translation included four forward translations, two back-translations, three independent reviews (at least two Dutch, one Flemish), and pretesting in 24 children from the Netherlands and Flanders. For some items, it was necessary to have separate translations for Dutch and Flemish: physical function-mobility (three items), anger (one item), pain interference (two items), and asthma impact (one item). Challenges faced in the translation process included scarcity or overabundance of possible translations, unclear item descriptions, constructs broader/smaller in the target language, difficulties in rank ordering items, differences in unit of measurement, irrelevant items, or differences in performance of activities. By addressing these challenges, acceptable translations were obtained for all items. The Dutch-Flemish PROMIS items are linguistically equivalent to the original USA version. Short forms are now available for use, and entire item banks are ready for cross-cultural validation in the Netherlands and Flanders.

  10. [Case Study] CityCenter and Cosmopolitan Construction Projects, Las Vegas, Nevada: lessons learned from the use of multiple sources and mixed methods in a safety needs assessment.

    PubMed

    Gittleman, Janie L; Gardner, Paige C; Haile, Elizabeth; Sampson, Julie M; Cigularov, Konstantin P; Ermann, Erica D; Stafford, Pete; Chen, Peter Y

    2010-06-01

    The present study describes a response to eight tragic deaths over an eighteen-month time span on a fast track construction project on the largest commercial development project in U.S. history. Four versions of a survey were distributed to workers, foremen, superintendents, and senior management. In addition to standard Likert-scale safety climate scale items, an open-ended item was included at the end of the survey. Safety climate perceptions differed by job level. Specifically, management perceived a more positive safety climate as compared to workers. Content analysis of the open-ended item was used to identify important safety and health concerns which might have been overlooked with the qualitative portion of the survey. The surveys were conducted to understand workforce issues of concern with the aim of improving site safety conditions. Such efforts can require minimal investment of resources and time and result in critical feedback for developing interventions affecting organizational structure, management processes, and communication. The most important lesson learned was that gauging differences in perception about site safety can provide critical feedback at all levels of a construction organization. Implementation of multi-level organizational perception surveys can identify major safety issues of concern. Feedback, if acted upon, can potentially result in fewer injuries and fatal events. (c) 2010 Elsevier Ltd. All rights reserved.

  11. Validity of a questionnaire measuring the world health organization concept of health system responsiveness with respect to perinatal services in the Dutch obstetric care system.

    PubMed

    van der Kooy, Jacoba; Valentine, Nicole B; Birnie, Erwin; Vujkovic, Marijana; de Graaf, Johanna P; Denktaş, Semiha; Steegers, Eric A P; Bonsel, Gouke J

    2014-12-03

    The concept of responsiveness, introduced by the World Health Organization (WHO), addresses non-clinical aspects of health service quality that are relevant regardless of provider, country, health system or health condition. Responsiveness refers to "aspects related to the way individuals are treated and the environment in which they are treated" during health system interactions. This paper assesses the psychometric properties of a newly developed responsiveness questionnaire dedicated to evaluating maternal experiences of perinatal care services, called the Responsiveness in Perinatal and Obstetric Health Care Questionnaire (ReproQ), using the eight-domain WHO concept. The ReproQ was developed between October 2009 and February 2010 by adapting the WHO Responsiveness Questionnaire items to the perinatal care context. The psychometric properties of feasibility, construct validity, and discriminative validity were empirically assessed in a sample of Dutch women two weeks postpartum. A total of 171 women consented to participation. Feasibility: the interviews lasted between 20 and 40 minutes and the overall missing rate was 8%. Construct validity: mean Cronbach's alphas for the antenatal, birth and postpartum phase were: 0.73 (range 0.57-0.82), 0.84 (range 0.66-0.92), and 0.87 (range 0.62-0.95) respectively. The item-own scale correlations within all phases were considerably higher than most of the item-other scale correlations. Within the antenatal care, birth care and postpartum phases, the eight factors explained 69%, 69%, and 76% of variance respectively. Discriminative validity: overall responsiveness mean sum scores were higher for women whose children were not admitted. This confirmed the hypothesis that dissatisfaction with health outcomes is transferred to their judgement on responsiveness of the perinatal services. The ReproQ interview-based questionnaire demonstrated satisfactory psychometric properties to describe the quality of perinatal care in the Netherlands, with the potential to discriminate between different levels of quality of care. In view of the relatively small sample, further testing and research is recommended.

  12. Measurement Properties of the Psoriasis Symptom Inventory Electronic Daily Diary in Patients with Moderate to Severe Plaque Psoriasis.

    PubMed

    Viswanathan, Hema N; Mutebi, Alex; Milmont, Cassandra E; Gordon, Kenneth; Wilson, Hilary; Zhang, Hao; Klekotka, Paul A; Revicki, Dennis A; Augustin, Matthias; Kricorian, Gregory; Nirula, Ajay; Strober, Bruce

    2017-09-01

    The Psoriasis Symptom Inventory (PSI) is a patient-reported outcome instrument that measures the severity of psoriasis signs and symptoms. This study evaluated measurement properties of the PSI in patients with moderate to severe plaque psoriasis. This secondary analysis used pooled data from a phase 3 brodalumab clinical trial (AMAGINE-1). Outcome measures included the PSI, Psoriasis Area and Severity Index (PASI), static Physician's Global Assessment (sPGA), psoriasis-affected body surface area, 36-item Short-Form Health Survey version 2, and the Dermatology Life Quality Index (DLQI). The PSI was evaluated for dimensionality, item performance, reliability (internal consistency and test-retest), construct validity, ability to detect change, and agreement between PSI response and response measures based on the PASI, sPGA, and DLQI. Results supported unidimensionality, good item fit, ordered responses, and PSI scoring. The PSI demonstrated reliability: baseline Cronbach's alpha ≥ 0.92 and intraclass correlation coefficients ≥ 0.95. Correlations between PSI total score and DLQI item 1 (r = 0.86), DLQI symptoms and feelings (r = 0.87), and 36-item Short-Form Health Survey version 2 bodily pain (r = -0.61) supported convergent validity. PSI scores differed significantly (P < 0.001) among severity groups based on the PASI (< 12/≥ 12), sPGA (0-1/2-3/4-5), body surface area (< 5%/5%-10%/> 10%), and DLQI (≤ 5/> 5) at weeks 8 and 12. At week 12, the PSI detected significant changes in severity based on PASI responses (< 50/50- < 75/≥ 75) and sPGA (0-1/≥ 2), and showed good agreement (k ≥ 0.66) between PSI response and PASI, sPGA, and DLQI responses. The PSI demonstrated excellent validity, reliability, and ability to detect change in the severity of psoriasis signs and symptoms. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  13. Evaluating construct validity of the second version of the Copenhagen Psychosocial Questionnaire through analysis of differential item functioning and differential item effect.

    PubMed

    Bjorner, Jakob Bue; Pejtersen, Jan Hyld

    2010-02-01

    To evaluate the construct validity of the Copenhagen Psychosocial Questionnaire II (COPSOQ II) by means of tests for differential item functioning (DIF) and differential item effect (DIE). We used a Danish general population postal survey (n = 4,732 with 3,517 wage earners) with a one-year register based follow up for long-term sickness absence. DIF was evaluated against age, gender, education, social class, public/private sector employment, and job type using ordinal logistic regression. DIE was evaluated against job satisfaction and self-rated health (using ordinal logistic regression), against depressive symptoms, burnout, and stress (using multiple linear regression), and against long-term sick leave (using a proportional hazards model). We used a cross-validation approach to counter the risk of significant results due to multiple testing. Out of 1,052 tests, we found 599 significant instances of DIF/DIE, 69 of which showed both practical and statistical significance across two independent samples. Most DIF occurred for job type (in 20 cases), while we found little DIF for age, gender, education, social class and sector. DIE seemed to pertain to particular items, which showed DIE in the same direction for several outcome variables. The results allowed a preliminary identification of items that have a positive impact on construct validity and items that have negative impact on construct validity. These results can be used to develop better shortform measures and to improve the conceptual framework, items and scales of the COPSOQ II. We conclude that tests of DIF and DIE are useful for evaluating construct validity.

  14. Measurement of multiple nicotine dependence domains among cigarette, non-cigarette and poly-tobacco users: Insights from item response theory.

    PubMed

    Strong, David R; Messer, Karen; Hartman, Sheri J; Conway, Kevin P; Hoffman, Allison C; Pharris-Ciurej, Nikolas; White, Martha; Green, Victoria R; Compton, Wilson M; Pierce, John

    2015-07-01

    Nicotine dependence (ND) is a key construct that organizes physiological and behavioral symptoms associated with persistent nicotine intake. Measurement of ND has focused primarily on cigarette smokers. Thus, validation of brief instruments that apply to a broad spectrum of tobacco product users is needed. We examined multiple domains of ND in a longitudinal national study of the United States population, the United States National Epidemiological Survey of Alcohol and Related Conditions (NESARC). We used methods based in item response theory to identify and validate increasingly brief measures of ND that included symptoms to assess ND similarly among cigarette, cigar, smokeless, and poly tobacco users. Confirmatory factor analytic models supported a single, primary dimension underlying symptoms of ND across tobacco use groups. Differential Item Functioning (DIF) analysis generated little support for systematic differences in response to symptoms of ND across tobacco use groups. We established significant concurrent and predictive validity of brief 3- and 5-symptom indices for measuring ND. Measuring ND across tobacco use groups with a common set of symptoms facilitates evaluation of tobacco use in an evolving marketplace of tobacco and nicotine products. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  15. For a Revival of Feminist Consciousness-Raising: Horizontal Transformation of Epistemologies and Transgression of Neoliberal Timespace

    ERIC Educational Resources Information Center

    Firth, Rhiannon; Robinson, Andrew

    2016-01-01

    This paper looks back on the methodology and experience of feminist consciousness-raising (CR) in the 1970s, in relation to the current re-emergence of feminism. It constructs an argument that a new wave of CR is desirable so as to construct new forms of feminist pedagogy and activism. The paper will argue that contemporary feminism in the UK and…

  16. What Does a Verbal Test Measure? A New Approach to Understanding Sources of Item Difficulty.

    ERIC Educational Resources Information Center

    Berk, Eric J. Vanden; Lohman, David F.; Cassata, Jennifer Coyne

    Assessing the construct relevance of mental test results continues to present many challenges, and it has proven to be particularly difficult to assess the construct relevance of verbal items. This study was conducted to gain a better understanding of the conceptual sources of verbal item difficulty using a unique approach that integrates…

  17. Item Pool Construction Using Mixed Integer Quadratic Programming (MIQP). GMAC® Research Report RR-14-01

    ERIC Educational Resources Information Center

    Han, Kyung T.; Rudner, Lawrence M.

    2014-01-01

    This study uses mixed integer quadratic programming (MIQP) to construct multiple highly equivalent item pools simultaneously, and compares the results from mixed integer programming (MIP). Three different MIP/MIQP models were implemented and evaluated using real CAT item pool data with 23 different content areas and a goal of equal information…

  18. Construction and Analysis of Educational Tests Using Abductive Machine Learning

    ERIC Educational Resources Information Center

    El-Alfy, El-Sayed M.; Abdel-Aal, Radwan E.

    2008-01-01

    Recent advances in educational technologies and the wide-spread use of computers in schools have fueled innovations in test construction and analysis. As the measurement accuracy of a test depends on the quality of the items it includes, item selection procedures play a central role in this process. Mathematical programming and the item response…

  19. Constructing CrIII-centered heterometallic complexes: [NiCrIII] and [CoCrIII] wheels.

    PubMed

    Kakaroni, Foteini E; Collet, Alexandra; Sakellari, Eirini; Tzimopoulos, Demetrios I; Siczek, Milosz; Lis, Tadeusz; Murrie, Mark; Milios, Constantinos J

    2017-12-19

    The solvothermal reaction between Cr(acac) 3 , MCl 2 ·6H 2 O (M = Ni, Co) and 2-hydroxy-4-methyl-6-phenyl-pyridine-3-amidoxime (H 2 L), under basic conditions, led to the synthesis of the heterometallic heptanuclear clusters [MCr(HL zw ) 6 (HL) 6 ]·3Cl (M = Ni, 1; Co, 2), with the nickel analogue displaying an S = 9/2 spin ground-state.

  20. Development of the Assessment of Belief Conflict in Relationship-14 (ABCR-14)

    PubMed Central

    Kyougoku, Makoto; Teraoka, Mutsumi; Masuda, Noriko; Ooura, Mariko; Abe, Yasushi

    2015-01-01

    Purpose Nurses and other healthcare workers frequently experience belief conflict, one of the most important, new stress-related problems in both academic and clinical fields. Methods In this study, using a sample of 1,683 nursing practitioners, we developed The Assessment of Belief Conflict in Relationship-14 (ABCR-14), a new scale that assesses belief conflict in the healthcare field. Standard psychometric procedures were used to develop and test the scale, including a qualitative framework concept and item-pool development, item reduction, and scale development. We analyzed the psychometric properties of ABCR-14 according to entropy, polyserial correlation coefficient, exploratory factor analysis, confirmatory factor analysis, average variance extracted, Cronbach’s alpha, Pearson product-moment correlation coefficient, and multidimensional item response theory (MIRT). Results The results of the analysis supported a three-factor model consisting of 14 items. The validity and reliability of ABCR-14 was suggested by evidence from high construct validity, structural validity, hypothesis testing, internal consistency reliability, and concurrent validity. The result of the MIRT offered strong support for good item response of item slope parameters and difficulty parameters. However, the ABCR-14 Likert scale might need to be explored from the MIRT point of view. Yet, as mentioned above, there is sufficient evidence to support that ABCR-14 has high validity and reliability. Conclusion The ABCR-14 demonstrates good psychometric properties for nursing belief conflict. Further studies are recommended to confirm its application in clinical practice. PMID:26247356

  1. Consequences of screening in lung cancer: development and dimensionality of a questionnaire.

    PubMed

    Brodersen, John; Thorsen, Hanne; Kreiner, Svend

    2010-08-01

    The objective of this study was to extend the Consequences of Screening (COS) Questionnaire for use in a lung cancer screening by testing for comprehension, content coverage, dimensionality, and reliability. In interviews, the suitability, content coverage, and relevance of the COS were tested on participants in a lung cancer screening program. The results were thematically analyzed to identify the key consequences of abnormal and false-positive screening results. Item Response Theory and Classical Test Theory were used to analyze data. Dimensionality, objectivity, and reliability were established by item analysis, examining the fit between item responses and Rasch models. Eight themes specifically relevant for participants in lung cancer screening results were identified: "self-blame," "focus on symptoms," "stigmatization," "introvert," "harm of smoking," "impulsivity," "empathy," and "regretful of still smoking." Altogether, 26 new items for part I and 16 new items for part II were generated. These themes were confirmed to fit a partial-credit Rasch model measuring different constructs including several of the new items. In conclusion, the reliability and the dimensionality of a condition-specific measure with high content validity for persons having abnormal or false-positive lung cancer screening results have been demonstrated. This new questionnaire called Consequences of Screening in Lung Cancer (COS-LC) covers in two parts the psychosocial experience in lung cancer screening. Part I: "anxiety," "behavior," "dejection," "sleep," "self-blame," "focus on airway symptoms," "stigmatization," "introvert," and "harm of smoking." Part II: "calm/relax," "social network," "existential values," "impulsivity," "empathy," and "regretful of still smoking."
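
    The abstract above refers to a partial-credit Rasch model. As a rough illustration of what such a model computes (a sketch under the standard textbook parameterization, not the authors' analysis; the function name and the threshold values are hypothetical), the probability of each response category depends on the cumulative differences between a person's latent level and the item's step thresholds:

```python
import math


def pcm_probabilities(theta, thresholds):
    """Category probabilities under a partial credit model.

    theta: person's latent trait level.
    thresholds: step difficulties delta_1..delta_m for one item
    (hypothetical values in the example below).
    Returns probabilities for categories 0..m.
    """
    # Category x has log-weight sum_{j<=x} (theta - delta_j); category 0 has 0.
    logits = [0.0]
    for delta in thresholds:
        logits.append(logits[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


# A four-category item with made-up thresholds, for a person at theta = 0.5.
probs = pcm_probabilities(0.5, [-1.0, 0.0, 1.5])
print([round(p, 3) for p in probs])
```

    With a single threshold the expression reduces to the dichotomous Rasch model, which is one quick way to sanity-check an implementation.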

  2. Validity aspects of the patient feedback questionnaire on consultation skills (PFC), a promising learning instrument in medical education.

    PubMed

    Reinders, Marcel E; Blankenstein, Annette H; Knol, Dirk L; de Vet, Henrica C W; van Marwijk, Harm W J

    2009-08-01

    A focus on the communicator competency is considered to be an important requirement to help physicians to acquire consultation skills. A feedback questionnaire, in which patients assess consultation skills might be a useful learning tool. An existing questionnaire on patient perception of patient-centeredness (PPPC) was adapted to cover the 'communicator' items in the competency profile. We assessed the face and content validity, the construct validity and the internal consistency of this new patient feedback on consultation skills (PFC) questionnaire. We assessed the face validity of the PFC by interviewing patients and general practice trainees (GPTs) during the developmental process. The content validity was determined by experts (n=10). First-year GPTs (23) collected 222 PFCs, from which the data were used to assess the construct validity (factor analysis), internal consistency, response rates and ceiling effects. The PFC adequately covers the corresponding 'communicator' competency (face and content validity). Factor analysis showed a one-dimensional construct. The internal consistency was high (Cronbach's alpha 0.89). For the single items, the response rate varied from 89.2% to 100%; the maximum score (ceiling effect) varied from 45.5% to 89.2%. The PFC appears to be a valid, internally consistent instrument. The PFC may be a valuable learning tool with which GPTs, other physicians and medical students can acquire feedback from patients regarding their consultation skills.

  3. Development of the Comprehensive General Parenting Questionnaire for caregivers of 5-13 year olds

    PubMed Central

    2014-01-01

    Background Despite the large number of parenting questionnaires, considerable disagreement exists about how to best assess parenting. Most of the instruments only assess limited aspects of parenting. To overcome this shortcoming, the “Comprehensive General Parenting Questionnaire” (CGPQ) was systematically developed. Such a measure is frequently requested in the area of childhood overweight. Methods First, an item bank of existing parenting measures was created assessing five key parenting constructs that have been identified across multiple theoretical approaches to parenting (Nurturance, Overprotection, Coercive control, Behavioral control, and Structure). Caregivers of 5- to 13-year-olds were asked to complete the online survey in the Netherlands (N = 821), Belgium (N = 435) and the United States (N = 241). In addition, a questionnaire regarding personality characteristics (“Big Five”) of the caregiver was administered and parents were asked to report about their child’s height and weight. Factor analyses and Item-Response Modeling (IRM) techniques were used to assess the underlying parenting constructs and for item reduction. Correlation analyses were performed to assess the relations between general parenting and personality of the caregivers, adjusting for socio-economic status (SES) indicators, to establish criterion validity. Multivariate linear regressions were performed to examine the associations of SES indicators and parenting with child BMI z-scores. Additionally, we assessed whether scores on the parenting constructs and child BMI z-scores differed depending on SES indicators. Results The reduced questionnaire (62 items) revealed acceptable fit of our parenting model and acceptable IRM item fit statistics. Caregiver personality was related as hypothesized with the CGPQ parenting constructs. While correcting for SES, overprotection was positively related to child BMI. The negative relationship between structure and BMI was borderline significant. Parents with a high level of education were less likely to use overly controlling forms of parenting (i.e., coercive control and overprotection) and more likely to have children with lower BMI. Based on several author review meetings and cognitive interviews the questionnaire was further modified to an 85-item questionnaire. Conclusions The CGPQ may facilitate research exploring how parenting influences children’s weight-related behaviors. The contextual influence of general parenting is likely to be more profound than its direct relationship with weight status. PMID:24512450

  4. The Childhood Cancer Survivor Study-Neurocognitive Questionnaire (CCSS-NCQ) Revised: Item Response Analysis and Concurrent Validity

    PubMed Central

    Kenzik, Kelly M.; Huang, I-Chan; Brinkman, Tara M.; Baughman, Brandon; Ness, Kirsten K.; Shenkman, Elizabeth A.; Hudson, Melissa M.; Robison, Leslie L.; Krull, Kevin R.

    2014-01-01

    Objective Childhood cancer survivors are at risk for neurocognitive impairment related to cancer diagnosis or treatment. This study refined and further validated the Childhood Cancer Survivor Study Neurocognitive Questionnaire (CCSS-NCQ), a scale developed to screen for impairment in long-term survivors of childhood cancer. Method Items related to task efficiency, memory, organization and emotional regulation domains were examined using item response theory (IRT). Data were collected from 833 adult survivors of childhood cancer in the St. Jude Lifetime Cohort Study who completed self-report and direct neurocognitive testing. The revision process included: 1) content validity mapping of items to domains, 2) constructing a revised CCSS-NCQ, 3) selecting items within specific domains using IRT, and 4) evaluating concordance between the revised CCSS-NCQ and direct neurocognitive assessment. Results Using content and measurement properties, 32 items were retained (8 items in 4 domains). Items captured low to middle levels of neurocognitive concerns. The latent domain scores demonstrated poor convergent/divergent validity with the direct assessments. Adjusted effect sizes (Cohen's d) for agreement between self-reported memory and direct memory assessment were moderate for total recall (ES=0.66), long-term memory (ES=0.63), and short-term memory (ES=0.55). Effect sizes between self-rated task efficiency and direct assessment of attention were moderate for focused attention (ES=0.70) and attention span (ES=0.50), but small for sustained attention (ES=0.36). Cranial radiation therapy and female gender were associated with lower self-reported neurocognitive function. Conclusion The revised CCSS-NCQ demonstrates adequate measurement properties for assessing day-to-day neurocognitive concerns in childhood cancer survivors, and adds useful information to direct assessment. PMID:24933482

  5. Morphological and transcriptional responses of Lycopersicon esculentum to hexavalent chromium in agricultural soil.

    PubMed

    Li, Shi-Guo; Hou, Jing; Liu, Xin-Hui; Cui, Bao-Shan; Bai, Jun-Hong

    2016-07-01

    The carcinogenic, teratogenic, and mutagenic effects of hexavalent chromium (Cr[VI]) on living organisms through the food chain raise the immediate need to assess the potential toxicological impacts of Cr(VI) on human health. Therefore, the concentration-dependent responses of 12 Cr(VI)-responsive genes selected from a high-throughput Lycopersicon esculentum complementary DNA microarray were examined at different Cr concentrations. The results indicated that most of the genes were differentially expressed from 0.1 mg Cr/kg soil, whereas the lowest-observable-adverse-effect concentrations of Cr(VI) were 1.6 mg Cr/kg soil, 6.4 mg Cr/kg soil, 3.2 mg Cr/kg soil, and 0.4 mg Cr/kg soil for seed germination, root elongation, root biomass, and root morphology, respectively, implying that the transcriptional method was more sensitive than the traditional method in detecting Cr(VI) toxicity. Dose-dependent responses were observed for the relative expression of expansin (p = 0.778), probable chalcone-flavonone isomerase 3 (p = -0.496), and 12S seed storage protein CRD (p = -0.614); therefore, the authors propose the 3 genes as putative biomarkers in Cr(VI)-contaminated soil. Environ Toxicol Chem 2016;35:1751-1758. © 2015 SETAC.

  6. Validity and reliability of a Malay version of the brief illness perception questionnaire for patients with type 2 diabetes mellitus.

    PubMed

    Chew, Boon-How; Vos, Rimke C; Heijmans, Monique; Shariff-Ghazali, Sazlina; Fernandez, Aaron; Rutten, Guy E H M

    2017-08-03

    Illness perceptions involve the personal beliefs that patients have about their illness and may influence health behaviours considerably. Since an instrument to measure these perceptions for the Malay population in Malaysia is lacking, we translated and examined the psychometric properties of the Malay version of the Brief Illness Perception Questionnaire (MBIPQ) in adult patients with type 2 diabetes mellitus. The MBIPQ has nine items; all use a 0-10 response scale except the ninth item, about causal factors, which is open-ended. A standard procedure was used to translate and adapt the English BIPQ into the Malay language. Construct validity was examined by comparing item scores and scores on the Diabetes Management Self-Efficacy Scale, the Morisky Medication Adherence Scale, the World Health Organization Quality of Life-brief, the 9-item Patient Health Questionnaire, the 17-item Diabetes Distress Scale, HbA1c and the presence of complications. In addition, 2-week and 4-week test-retest reliability were studied. A total of 312 patients completed the MBIPQ. Of these, 97 and 215 patients completed the 2-week and 4-week test-retest reliability questionnaires, respectively. Moderate inter-item correlations were observed between illness perception dimensions (r = -0.31 to 0.53). MBIPQ items showed the expected correlations with self-efficacy (r = 0.35), medication adherence (r = 0.29), quality of life (r = -0.17 to 0.31) and depressive symptoms (r = -0.18 to 0.21). People with severe diabetes-related distress were also more concerned (t-test = 4.01, p < 0.001) and experienced lower personal control (t-test = 2.07, p = 0.031). People with any diabetes-related complication perceived the consequences as more serious (t-test = 2.04, p = 0.044). The 2-week and 4-week test-retest reliabilities varied between ICC agreement 0.39 to 0.70 and 0.58 to 0.78, respectively. The psychometric properties of items in the MBIPQ are moderate. The MBIPQ showed good cross-cultural validity and moderate construct validity. Test-retest reliability was moderate. Despite the moderate psychometric properties, the MBIPQ may be useful in clinical practice as an instrument to elicit and communicate about patients' personal thoughts and feelings. Future research is needed to establish its responsiveness and predictive validity. ClinicalTrials.gov NCT02730754 registered on March 29, 2016; NCT02730078 registered on March 29, 2016.

  7. Point Defect Structure of Cr2O3

    DTIC Science & Technology

    1987-10-01

    Calculation of Electron Hole Mobility ... Construction of the Defect Concentration vs. Oxygen Pressure Diagram ... 1000° to 1600°C ... Calculated diffusion coefficient vs. oxygen partial pressure diagram for pure Cr2O3 at 1100°C ... Calculated parabolic rate constant vs. oxygen partial pressure diagram for pure Cr2O3 at

  8. Space construction system analysis. Part 2: Cost and programmatics

    NASA Technical Reports Server (NTRS)

    Vonflue, F. W.; Cooper, W.

    1980-01-01

    Cost and programmatic elements of the space construction systems analysis study are discussed. The programmatic aspects of the ETVP program define a comprehensive plan for the development of a space platform, the construction system, and the space shuttle operations/logistics requirements. The cost analysis identified significant items of cost on ETVP development, ground, and flight segments, and detailed the items of space construction equipment and operations.

  9. Confirmatory factor analysis and measurement invariance of the Child Feeding Questionnaire in low-income Hispanic and African-American mothers with preschool-age children.

    PubMed

    Kong, Angela; Vijayasiri, Ganga; Fitzgibbon, Marian L; Schiffer, Linda A; Campbell, Richard T

    2015-07-01

    Validation work of the Child Feeding Questionnaire (CFQ) in low-income minority samples suggests a need for further conceptual refinement of this instrument. Using confirmatory factor analysis, this study evaluated 5- and 6-factor models on a large sample of African-American and Hispanic mothers with preschool-age children (n = 962). The 5-factor model included: 'perceived responsibility', 'concern about child's weight', 'restriction', 'pressure to eat', and 'monitoring' and the 6-factor model also tested 'food as a reward'. Multi-group analysis assessed measurement invariance by race/ethnicity. In the 5-factor model, two low-loading items from 'restriction' and one low-variance item from 'perceived responsibility' were dropped to achieve fit. Only removal of the low-variance item was needed to achieve fit in the 6-factor model. Invariance analyses demonstrated differences in factor loadings. This finding suggests African-American and Hispanic mothers may vary in their interpretation of some CFQ items and use of cognitive interviews could enhance item interpretation. Our results also demonstrated that 'food as a reward' is a plausible construct among a low-income minority sample and adds to the evidence that this factor resonates conceptually with parents of preschoolers; however, further testing is needed to determine the validity of this factor with older age groups. Copyright © 2015 Elsevier Ltd. All rights reserved.

  10. Assessing Hopelessness in Terminally Ill Cancer Patients: Development of the Hopelessness Assessment in Illness Questionnaire

    PubMed Central

    Rosenfeld, Barry; Pessin, Hayley; Lewis, Charles; Abbey, Jennifer; Olden, Megan; Sachs, Emily; Amakawa, Lia; Kolva, Elissa; Brescia, Robert; Breitbart, William

    2013-01-01

    Hopelessness has become an increasingly important construct in palliative care research, yet concerns exist regarding the utility of existing measures when applied to patients with a terminal illness. This article describes a series of studies focused on the exploration, development, and analysis of a measure of hopelessness specifically intended for use with terminally ill cancer patients. The 1st stage of measure development involved interviews with 13 palliative care experts and 30 terminally ill patients. Qualitative analysis of the patient interviews culminated in the development of a set of potential questionnaire items. In the 2nd study phase, we evaluated these preliminary items with a sample of 314 participants, using item response theory and classical test theory to identify optimal items and response format. These analyses generated an 8-item measure that we tested in a final study phase, using a 3rd sample (n = 228) to assess reliability and concurrent validity. These analyses demonstrated strong support for the Hopelessness Assessment in Illness Questionnaire providing greater explanatory power than existing measures of hopelessness and found little evidence that this assessment was confounded by illness-related variables (e.g., prognosis). In summary, these 3 studies suggest that this brief measure of hopelessness is particularly useful for palliative care settings. Further research is needed to assess the applicability of the measure to other populations and contexts. PMID:21443366

  11. An Item Response Theory Analysis of DSM-IV Cannabis Abuse and Dependence Criteria in Adolescents

    PubMed Central

    Hartman, Christie A.; Gelhorn, Heather; Crowley, Thomas J.; Sakai, Joseph T.; Stallings, Michael; Young, Susan E.; Rhee, Soo Hyun; Corley, Robin; Hewitt, John K.; Hopfer, Christian J.

    2008-01-01

    Objective To examine three aspects of adolescent cannabis problems: 1) do DSM-IV cannabis abuse and dependence criteria represent two different levels of severity of substance involvement, 2) to what degree do each of the 11 abuse and dependence criteria assess adolescent cannabis problems, and 3) do the DSM-IV items function similarly across different adolescent populations? Method We examined 5587 adolescents aged 11–19, including 615 youth in treatment for substance use disorders, 179 adjudicated youth, and 4793 youth from the community. All subjects were assessed with a structured diagnostic interview. Item response theory was utilized to analyze symptom endorsement patterns. Results Abuse and dependence criteria were not found to represent different levels of severity of problem cannabis use in any of the samples. Among the 11 abuse and dependence criteria, Problems cutting down and Legal problems were the least informative for distinguishing problem users. Two dependence criteria and three of the four abuse criteria indicated different severities of cannabis problems across samples. Conclusions We found little evidence to support the idea that abuse and dependence are separate constructs for adolescent cannabis problems. Furthermore, certain abuse criteria may indicate severe substance problems while specific dependence items may indicate less severe problems. The abuse items in particular need further study. These results have implications for the refinement of the current substance use disorder criteria for DSM-V. PMID:18176333

  12. The reliability, validity and responsiveness of the Thai version of Systemic Lupus Erythematosus Quality of Life (SLEQOL-TH) instrument.

    PubMed

    Kasitanon, N; Wangkaew, S; Puntana, S; Sukitawut, W; Leong, K P; Louthrenoo, W

    2013-03-01

    The English version of the Systemic Lupus Erythematosus Quality of Life Questionnaire (SLEQOL) is a validated disease-specific quality of life instrument. The aim of this study was to evaluate the psychometric properties of the Thai version of the SLEQOL (SLEQOL-TH). Two independent translators translated the SLEQOL into Thai. The back translation of this version was performed by two other independent translators. The final version, SLEQOL-TH, was completed after resolving the discrepancies revealed by the back translation. One hundred and nine patients with SLE were enrolled to test the reliability, construct validity, floor and ceiling effects, and sensitivity to the changes of the SLEQOL-TH at six months. The differential item functioning (DIF) between the Thai and English versions was analyzed using the partial gamma. The internal consistency of the SLEQOL-TH was satisfactory with the overall Cronbach's alpha of 0.86. The test-retest reliability of the SLEQOL-TH was acceptable with the intra-class correlation coefficient of 0.86. Low correlations between the SLEQOL-TH and SLEDAI were observed. The total score of the SLEQOL-TH was moderately responsive to changes in quality of life, with a standardized response mean of 0.50. When comparing the SLEQOL-TH from Thai SLE patients with the original SLEQOL version obtained from Singapore SLE patients, 11 out of 40 items showed a moderate to large DIF. The SLEQOL-TH has acceptable psychometric properties and shows construct validity. In comparison with the English version of SLEQOL, there are some items that showed DIF. The applicability of the SLEQOL-TH in real-life clinical practice and clinical trials needs to be determined.
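
    The abstract above summarizes responsiveness with a standardized response mean (SRM). As a minimal sketch of the usual definition (mean change divided by the standard deviation of the change scores; the function name and the scores below are hypothetical and are not taken from the study):

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean of paired change scores / SD of those change scores.

    baseline, follow_up: equal-length lists of scores for the same
    patients at two time points (hypothetical layout).
    """
    if len(baseline) != len(follow_up):
        raise ValueError("paired scores must have the same length")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Made-up quality-of-life scores for four patients.
before = [10, 12, 14, 16]
after = [11, 14, 15, 18]
srm = standardized_response_mean(before, after)
print(round(srm, 3))
```

    Because the denominator is the variability of the change itself, the SRM can be large even when the average change is small, provided patients change consistently.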

  13. [Measuring job satisfaction: development of a multidimensional scale].

    PubMed

    Faraci, Palmira; Valenti, Giusy

    2016-01-01

    Although numerous studies have addressed job satisfaction, Italian research still lacks psychometric instruments built specifically for this construct. The present paper aims to develop a scale to measure job satisfaction in the Italian cultural context. Participants were 222 workers (36.5% males, 63.5% females) with an average age of 38.39 years (SD = 10.91). The items were selected from a large item pool on the basis of evaluation by a group of expert judges and an item analysis procedure. To establish test validity, the following instruments were also administered: Occupational Stress Indicator, Satisfaction With Life Scale, Rosenberg Self-Esteem Scale, Multidimensional Scale of Perceived Social Support, and Beck Depression Inventory. Both exploratory and confirmatory factor analyses highlighted a 6-factor structure. Those factors accounted for 51.30% of the total variance. Reliability analyses indicated satisfactory internal consistency (alpha ranging from .73 to .86). Construct validity was supported by correlations with the theoretically associated variables. Our findings suggest promising psychometric properties for the presented measure. The instrument could be used in specific programs developed to promote well-being in work settings.

  14. Improving Measures via Examining the Behavior of Distractors in Multiple-Choice Tests

    PubMed Central

    Sideridis, Georgios; Tsaousis, Ioannis; Al Harbi, Khaleel

    2017-01-01

    The purpose of the present article was to illustrate, using an example from a national assessment, the value of analyzing the behavior of distractors in measures that use the multiple-choice format. A secondary purpose was to illustrate four remedial actions that can potentially improve the measurement of the construct(s) under study. Participants were 2,248 individuals who took a national examination of chemistry. The behavior of the distractors was analyzed within the Rasch model. Potentially informative distractors were (a) further modeled using the partial credit model, (b) split into separate items and retested for model fit and parsimony, (c) combined to form a “super” item or testlet, and (d) reexamined after deleting low-ability individuals who likely guessed on those informative, albeit erroneous, distractors. Results indicated that all but the item-split strategy were associated with better model fit compared with the original model. The best-fitting model, however, involved modeling and crediting informative distractors via the partial credit model or eliminating the responses of low-ability individuals who likely guessed on informative distractors. The implications, advantages, and disadvantages of modeling informative distractors for measurement purposes are discussed. PMID:29795904

  15. The breastfeeding self-efficacy scale: psychometric assessment of the short form.

    PubMed

    Dennis, Cindy-Lee

    2003-01-01

    The purpose of this study was to reduce the number of items on the original Breastfeeding Self-Efficacy Scale (BSES) and psychometrically assess the revised BSES-Short Form (BSES-SF). As part of a longitudinal study, participants completed mailed questionnaires at 1, 4, and 8 weeks postpartum. The setting was a health region in British Columbia, and the population-based sample comprised 491 breastfeeding mothers. The measures were the BSES, Edinburgh Postnatal Depression Scale, Rosenberg Self-Esteem Scale, and Perceived Stress Scale. Internal consistency statistics with the original BSES suggested item redundancy. As such, 18 items were deleted, using explicit reduction criteria. Based on the encouraging reliability analysis of the new 14-item BSES-SF, construct validity was assessed using principal components factor analysis, comparison of contrasted groups, and correlations with measures of similar constructs. Support for predictive validity was demonstrated through significant mean differences between breastfeeding and bottle feeding mothers at 4 (p < .001) and 8 (p < .001) weeks postpartum. Demographic response patterns suggested the BSES-SF is a unique tool to identify mothers at risk of prematurely discontinuing breastfeeding. These psychometric results indicate the BSES-SF is an excellent measure of breastfeeding self-efficacy and can be considered ready for clinical use to (a) identify breastfeeding mothers at high risk, (b) assess breastfeeding behaviors and cognitions to individualize confidence-building strategies, and (c) evaluate the effectiveness of various interventions and guide program development.

  16. Development of the Parent Responses to School Functioning Questionnaire.

    PubMed

    Barber Garcia, Brittany N; Gray, Laura S; Simons, Laura E; Logan, Deirdre E

    2017-10-01

    Parents play an important role in supporting school functioning in youth with chronic pain, but no validated tool exists to assess parental responses to child and adolescent pain behaviors in the school context. Such a tool would be useful in identifying targets of change to reduce pain-related school impairment. The goal of this study was to develop and preliminarily validate the Parent Responses to School Functioning Questionnaire (PRSF), a parent self-report measure of this construct. After initial expert review and pilot testing, the measure was administered to 418 parents of children (ages 6-17 years) seen for initial multidisciplinary chronic pain clinic evaluation. The final 16-item PRSF showed evidence of good internal consistency (α = .82) and 2-week test-retest reliability (intraclass correlation coefficient = .87). Criterion validity was demonstrated by significant correlations with school absence rates and overall school functioning, and construct validity was demonstrated by correlations with general parental responses to pain. Three subscales emerged capturing parents' personal distress, parents' level of distrust of the school, and parents' expectations and behaviors related to their child's management of challenging school situations. These results provide preliminary support for the PRSF as a psychometrically sound tool to assess parents' responses to child pain in the school setting. The 16-item PRSF measures parental responses to their child's chronic pain in the school context. This clinically useful measure can inform interventions aimed at reducing functional disability in children with chronic pain by enhancing parents' ability to respond adaptively to child pain behaviors. Copyright © 2017 American Pain Society. Published by Elsevier Inc. All rights reserved.

  17. Development and validation of a scale for mouth handicap in systemic sclerosis: the Mouth Handicap in Systemic Sclerosis scale

    PubMed Central

    Mouthon, L; Rannou, F; Bérezné, A; Pagnoux, C; Arène, J‐P; Foïs, E; Cabane, J; Guillevin, L; Revel, M; Fermanian, J; Poiraudeau, S

    2007-01-01

    Objective: To develop and assess the reliability and construct validity of a scale assessing disability involving the mouth in systemic sclerosis (SSc). Methods: We generated a 34‐item provisional scale from mailed responses of patients (n = 74), expert consensus (n = 10) and literature analysis. A total of 71 other SSc patients were recruited. The test–retest reliability was assessed using the intraclass correlation coefficient and divergent validity using the Spearman correlation coefficient. Factor analysis followed by varimax rotation was performed to assess the factorial structure of the scale. Results: The item reduction process retained 12 items with 5 levels of answers (total score range 0–48). The mean total score of the scale was 20.3 (SD 9.7). The test–retest reliability was 0.96. Divergent validity was confirmed for global disability (Health Assessment Questionnaire (HAQ), r = 0.33), hand function (Cochin Hand Function Scale, r = 0.37), inter‐incisor distance (r = −0.34), handicap (McMaster‐Toronto Arthritis questionnaire (MACTAR), r = 0.24), depression (Hospital Anxiety and Depression (HAD); HADd, r = 0.26) and anxiety (HADa, r = 0.17). Factor analysis extracted 3 factors with eigenvalues of 4.26, 1.76 and 1.47, explaining 63% of the variance. These 3 factors could be clinically characterised. The first factor (5 items) represents handicap induced by the reduction in mouth opening, the second (5 items) handicap induced by sicca syndrome and the third (2 items) aesthetic concerns. Conclusion: We propose a new scale, the Mouth Handicap in Systemic Sclerosis (MHISS) scale, which has excellent reliability and good construct validity, and specifically assesses disability involving the mouth in patients with SSc. PMID:17502364

  18. Development and validation of the simulation-based learning evaluation scale.

    PubMed

    Hung, Chang-Chiao; Liu, Hsiu-Chen; Lin, Chun-Chih; Lee, Bih-O

    2016-05-01

    The instruments that evaluate a student's perception of receiving simulated training are English versions and have not been tested for reliability or validity. The aim of this study was to develop and validate a Chinese version Simulation-Based Learning Evaluation Scale (SBLES). Four stages were conducted to develop and validate the SBLES. First, specific desired competencies were identified according to the National League for Nursing and Taiwan Nursing Accreditation Council core competencies. Next, the initial item pool was comprised of 50 items related to simulation that were drawn from the literature of core competencies. Content validity was established by use of an expert panel. Finally, exploratory factor analysis and confirmatory factor analysis were conducted for construct validity, and Cronbach's coefficient alpha determined the scale's internal consistency reliability. Two hundred and fifty students who had experienced simulation-based learning were invited to participate in this study. Two hundred and twenty-five students completed and returned questionnaires (response rate=90%). Six items were deleted from the initial item pool and one was added after an expert panel review. Exploratory factor analysis with varimax rotation revealed 37 items remaining in five factors which accounted for 67% of the variance. The construct validity of SBLES was substantiated in a confirmatory factor analysis that revealed a good fit of the hypothesized factor structure. The findings tally with the criterion of convergent and discriminant validity. The range of internal consistency for five subscales was .90 to .93. Items were rated on a 5-point scale from 1 (strongly disagree) to 5 (strongly agree). The results of this study indicate that the SBLES is valid and reliable. The authors recommend that the scale could be applied in the nursing school to evaluate the effectiveness of simulation-based learning curricula. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. Validation of the CMT Pediatric Scale as an outcome measure of disability

    PubMed Central

    Burns, Joshua; Ouvrier, Robert; Estilow, Tim; Shy, Rosemary; Laurá, Matilde; Pallant, Julie F.; Lek, Monkol; Muntoni, Francesco; Reilly, Mary M.; Pareyson, Davide; Acsadi, Gyula; Shy, Michael E.; Finkel, Richard S.

    2012-01-01

    Objective: Charcot-Marie-Tooth disease (CMT) is a common heritable peripheral neuropathy. There is no treatment for any form of CMT although clinical trials are increasingly occurring. Patients usually develop symptoms during the first two decades of life but there are no established outcome measures of disease severity or response to treatment. We identified a set of items that represent a range of impairment levels and conducted a series of validation studies to build a patient-centered multi-item rating scale of disability for children with CMT. Methods: As part of the Inherited Neuropathies Consortium, patients aged 3–20 years with a variety of CMT types were recruited from the USA, UK, Italy and Australia. Initial development stages involved: definition of the construct, item pool generation, peer review and pilot testing. Based on data from 172 patients, a series of validation studies were conducted, including: item and factor analysis, reliability testing, Rasch modeling and sensitivity analysis. Results: Seven areas for measurement were identified (strength, dexterity, sensation, gait, balance, power, endurance), and a psychometrically robust 11-item scale constructed (Charcot-Marie-Tooth disease Pediatric Scale: CMTPedS). Rasch analysis supported the viability of the CMTPedS as a unidimensional measure of disability in children with CMT. It showed good overall model fit, no evidence of misfitting items, no person misfit and it was well targeted for children with CMT. Interpretation: The CMTPedS is a well-tolerated outcome measure that can be completed in 25 minutes. It is a reliable, valid and sensitive global measure of disability for children with CMT from the age of 3 years. PMID:22522479

  20. A New Statistic for Evaluating Item Response Theory Models for Ordinal Data. CRESST Report 839

    ERIC Educational Resources Information Center

    Cai, Li; Monroe, Scott

    2014-01-01

    We propose a new limited-information goodness of fit test statistic C_2 for ordinal IRT models. The construction of the new statistic lies formally between the M_2 statistic of Maydeu-Olivares and Joe (2006), which utilizes first and second order marginal probabilities, and the M*_2 statistic of Cai and Hansen…
