unidimensional irt model: Topics by Science.gov

Sample records for unidimensional irt model

Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions.

PubMed

Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan

2016-01-01

This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of the Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD under a mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need for caution and for evaluating IPD within a mixture IRT framework to understand its effects on item parameters and examinee ability.
Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions

PubMed Central

Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan

2016-01-01

This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of the Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD under a mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need for caution and for evaluating IPD within a mixture IRT framework to understand its effects on item parameters and examinee ability. PMID:26941699
Bayesian Estimation of Multi-Unidimensional Graded Response IRT Models

ERIC Educational Resources Information Center

Kuo, Tzu-Chun

2015-01-01

Item response theory (IRT) has gained increasing popularity in large-scale educational and psychological testing situations because of its theoretical advantages over classical test theory. Unidimensional graded response models (GRMs) are useful when polytomous response items are designed to measure a unified latent trait. They are limited in…
Unidimensional Interpretations for Multidimensional Test Items

ERIC Educational Resources Information Center

Kahraman, Nilufer

2013-01-01

This article considers potential problems that can arise in estimating a unidimensional item response theory (IRT) model when some test items are multidimensional (i.e., show a complex factorial structure). More specifically, this study examines (1) the consequences of model misfit on IRT item parameter estimates due to unintended minor item-level…
Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions.

PubMed

Haberman, Shelby J; Sinharay, Sandip; Chon, Kyong Hee

2013-07-01

Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.
Using Unidimensional IRT Models for Dichotomous Classification via Computerized Classification Testing with Multidimensional Data.

ERIC Educational Resources Information Center

Lau, Che-Ming Allen; And Others

This study focused on the robustness of unidimensional item response theory (UIRT) models in computerized classification testing against violation of the unidimensionality assumption. The study addressed whether UIRT models remain acceptable under various testing conditions and dimensionality strengths. Monte Carlo simulation techniques were used…
IRTPRO 2.1 for Windows (Item Response Theory for Patient-Reported Outcomes)

ERIC Educational Resources Information Center

Paek, Insu; Han, Kyung T.

2013-01-01

This article reviews a new item response theory (IRT) model estimation program, IRTPRO 2.1, for Windows that is capable of unidimensional and multidimensional IRT model estimation for existing and user-specified constrained IRT models for dichotomously and polytomously scored item response data. (Contains 1 figure and 2 notes.)
PROC IRT: A SAS Procedure for Item Response Theory

PubMed Central

Matlock Cole, Ki; Paek, Insu

2017-01-01

This article reviews the item response theory procedure (PROC IRT) in SAS/STAT 14.1 for conducting item response theory (IRT) analyses of dichotomous and polytomous datasets that are unidimensional or multidimensional. The review provides an overview of available features, including models, estimation procedures, interfacing, input, and output files. A small-scale simulation study evaluates the IRT model parameter recovery of PROC IRT. The IRT procedure in Statistical Analysis Software (SAS) may be useful for researchers who frequently use SAS for analyses, research, and teaching.
Exploring Unidimensional Proficiency Classification Accuracy from Multidimensional Data in a Vertical Scaling Context

ERIC Educational Resources Information Center

Kroopnick, Marc Howard

2010-01-01

When Item Response Theory (IRT) is operationally applied for large scale assessments, unidimensionality is typically assumed. This assumption requires that the test measures a single latent trait. Furthermore, when tests are vertically scaled using IRT, the assumption of unidimensionality would require that the battery of tests across grades…
Item Response Theory with Estimation of the Latent Density Using Davidian Curves

ERIC Educational Resources Information Center

Woods, Carol M.; Lin, Nan

2009-01-01

Davidian-curve item response theory (DC-IRT) is introduced, evaluated with simulations, and illustrated using data from the Schedule for Nonadaptive and Adaptive Personality Entitlement scale. DC-IRT is a method for fitting unidimensional IRT models with maximum marginal likelihood estimation, in which the latent density is estimated,…
Comparing Three Estimation Methods for the Three-Parameter Logistic IRT Model

ERIC Educational Resources Information Center

Lamsal, Sunil

2015-01-01

Different estimation procedures have been developed for the unidimensional three-parameter item response theory (IRT) model. These techniques include marginal maximum likelihood estimation, fully Bayesian estimation using Markov chain Monte Carlo simulation techniques, and Metropolis-Hastings Robbins-Monro estimation. With each…
Ramsay-Curve Item Response Theory for the Three-Parameter Logistic Item Response Model

ERIC Educational Resources Information Center

Woods, Carol M.

2008-01-01

In Ramsay-curve item response theory (RC-IRT), the latent variable distribution is estimated simultaneously with the item parameters of a unidimensional item response model using marginal maximum likelihood estimation. This study evaluates RC-IRT for the three-parameter logistic (3PL) model with comparisons to the normal model and to the empirical…
The Asymptotic Distribution of Ability Estimates: Beyond Dichotomous Items and Unidimensional IRT Models

ERIC Educational Resources Information Center

Sinharay, Sandip

2015-01-01

The maximum likelihood estimate (MLE) of the ability parameter of an item response theory model with known item parameters was proved to be asymptotically normally distributed under a set of regularity conditions for tests involving dichotomous items and a unidimensional ability parameter (Klauer, 1990; Lord, 1983). This article first considers…
An Application of Unfolding and Cumulative Item Response Theory Models for Noncognitive Scaling: Examining the Assumptions and Applicability of the Generalized Graded Unfolding Model

ERIC Educational Resources Information Center

Sgammato, Adrienne N.

2009-01-01

This study examined the applicability of a relatively new unidimensional, unfolding item response theory (IRT) model called the generalized graded unfolding model (GGUM; Roberts, Donoghue, & Laughlin, 2000). A total of four scaling methods were applied. Two commonly used cumulative IRT models for polytomous data, the Partial Credit Model and…
Assessing Model Data Fit of Unidimensional Item Response Theory Models in Simulated Data

ERIC Educational Resources Information Center

Kose, Ibrahim Alper

2014-01-01

The purpose of this paper is to give an example of how to assess the model-data fit of unidimensional IRT models in simulated data. Also, the present research aims to explain the importance of fit and the consequences of misfit by using simulated data sets. Responses of 1000 examinees to a dichotomously scored 20-item test were simulated with 25…
Simultaneous Estimation of Overall and Domain Abilities: A Higher-Order IRT Model Approach

ERIC Educational Resources Information Center

de la Torre, Jimmy; Song, Hao

2009-01-01

Assessments consisting of different domains (e.g., content areas, objectives) are typically multidimensional in nature but are commonly assumed to be unidimensional for estimation purposes. The different domains of these assessments are further treated as multi-unidimensional tests for the purpose of obtaining diagnostic information. However, when…
Unidimensional and Multidimensional Models for Item Response Theory.

ERIC Educational Resources Information Center

McDonald, Roderick P.

This paper provides an up-to-date review of the relationship between item response theory (IRT) and (nonlinear) common factor theory and draws out of this relationship some implications for current and future research in IRT. Nonlinear common factor analysis yields a natural embodiment of the weak principle of local independence in appropriate…
On the Complexity of Item Response Theory Models.

PubMed

Bonifay, Wes; Cai, Li

2017-01-01

Complexity in item response theory (IRT) has traditionally been quantified by simply counting the number of freely estimated parameters in the model. However, complexity is also contingent upon the functional form of the model. We examined four popular IRT models (exploratory factor analytic, bifactor, DINA, and DINO) with different functional forms but the same number of free parameters. In comparison, a simpler (unidimensional 3PL) model was specified such that it had 1 more parameter than the previous models. All models were then evaluated according to the minimum description length principle. Specifically, each model was fit to 1,000 data sets that were randomly and uniformly sampled from the complete data space and then assessed using global and item-level fit and diagnostic measures. The findings revealed that the factor analytic and bifactor models possess a strong tendency to fit any possible data. The unidimensional 3PL model displayed minimal fitting propensity, despite the fact that it included an additional free parameter. The DINA and DINO models did not demonstrate a proclivity to fit any possible data, but they did fit well to distinct data patterns. Applied researchers and psychometricians should therefore consider functional form, and not goodness-of-fit alone, when selecting an IRT model.
Coma Recovery Scale-Revised: evidentiary support for hierarchical grading of level of consciousness.

PubMed

Gerrard, Paul; Zafonte, Ross; Giacino, Joseph T

2014-12-01

Objective: To investigate the neurobehavioral pattern of recovery of consciousness as reflected by performance on the subscales of the Coma Recovery Scale-Revised (CRS-R). Design: Retrospective item response theory (IRT) and factor analysis. Setting: Inpatient rehabilitation facilities. Participants: Rehabilitation inpatients (N=180) with posttraumatic disturbance in consciousness who participated in a double-blinded, randomized, controlled drug trial. Interventions: Not applicable. Main Outcome Measures: Scores on CRS-R subscales. Results: The CRS-R was found to fit factor analytic models adhering to the assumptions of unidimensionality and monotonicity. In addition, subscales were mutually independent based on residual correlations. Nonparametric IRT reaffirmed the finding of monotonicity. A highly constrained confirmatory factor analysis model, which imposed equal factor loadings on all items, was found to fit the data well and was used to estimate a 1-parameter IRT model. Conclusions: This study provides evidence of the unidimensionality of the CRS-R and supports the hierarchical structure of the CRS-R subscales, suggesting that it is an effective tool for establishing diagnosis and monitoring recovery of consciousness after severe traumatic brain injury. Copyright © 2014 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Analyses related to the development of DSM-5 criteria for substance use related disorders: 1. Toward amphetamine, cocaine and prescription drug use disorder continua using Item Response Theory.

PubMed

Saha, Tulshi D; Compton, Wilson M; Chou, S Patricia; Smith, Sharon; Ruan, W June; Huang, Boji; Pickering, Roger P; Grant, Bridget F

2012-04-01

Prior research has demonstrated the dimensionality of alcohol, nicotine and cannabis use disorders criteria. The purpose of this study was to examine the unidimensionality of DSM-IV cocaine, amphetamine and prescription drug abuse and dependence criteria and to determine the impact of elimination of the legal problems criterion on the information value of the aggregate criteria. Factor analyses and Item Response Theory (IRT) analyses were used to explore the unidimensionality and psychometric properties of the illicit drug use criteria using a large representative sample of the U.S. population. All illicit drug abuse and dependence criteria formed unidimensional latent traits. For amphetamines, cocaine, sedatives, tranquilizers and opioids, IRT models fit better for models without legal problems criterion than models with legal problems criterion and there were no differences in the information value of the IRT models with and without the legal problems criterion, supporting the elimination of that criterion. Consistent with findings for alcohol, nicotine and cannabis, amphetamine, cocaine, sedative, tranquilizer and opioid abuse and dependence criteria reflect underlying unitary dimensions of severity. The legal problems criterion associated with each of these substance use disorders can be eliminated with no loss in informational value and an advantage of parsimony. Taken together, these findings support the changes to substance use disorder diagnoses recommended by the American Psychiatric Association's DSM-5 Substance and Related Disorders Workgroup. Published by Elsevier Ireland Ltd.

An Information-Correction Method for Testlet-Based Test Analysis: From the Perspectives of Item Response Theory and Generalizability Theory. Research Report. ETS RR-17-27

ERIC Educational Resources Information Center

Li, Feifei

2017-01-01

An information-correction method for testlet-based tests is introduced. This method takes advantage of both generalizability theory (GT) and item response theory (IRT). The measurement error for the examinee proficiency parameter is often underestimated when a unidimensional conditional-independence IRT model is specified for a testlet dataset. By…
Order-Constrained Bayes Inference for Dichotomous Models of Unidimensional Nonparametric IRT

ERIC Educational Resources Information Center

Karabatsos, George; Sheu, Ching-Fan

2004-01-01

This study introduces an order-constrained Bayes inference framework useful for analyzing data containing dichotomous scored item responses, under the assumptions of either the monotone homogeneity model or the double monotonicity model of nonparametric item response theory (NIRT). The framework involves the implementation of Gibbs sampling to…
Comparison of Unidimensional and Multidimensional Approaches to IRT Parameter Estimation. Research Report. ETS RR-04-44

ERIC Educational Resources Information Center

Zhang, Jinming

2004-01-01

It is common to assume during statistical analysis of a multiscale assessment that the assessment has simple structure or that it is composed of several unidimensional subtests. Under this assumption, both the unidimensional and multidimensional approaches can be used to estimate item parameters. This paper theoretically demonstrates that these…
Assessing the equivalence of Web-based and paper-and-pencil questionnaires using differential item and test functioning (DIF and DTF) analysis: a case of the Four-Dimensional Symptom Questionnaire (4DSQ).

PubMed

Terluin, Berend; Brouwers, Evelien P M; Marchand, Miquelle A G; de Vet, Henrica C W

2018-05-01

Many paper-and-pencil (P&P) questionnaires have been migrated to electronic platforms. Differential item and test functioning (DIF and DTF) analysis constitutes a superior research design to assess measurement equivalence across modes of administration. The purpose of this study was to demonstrate an item response theory (IRT)-based DIF and DTF analysis to assess the measurement equivalence of a Web-based version and the original P&P format of the Four-Dimensional Symptom Questionnaire (4DSQ), measuring distress, depression, anxiety, and somatization. The P&P group (n = 2031) and the Web group (n = 958) consisted of primary care psychology clients. Unidimensionality and local independence of the 4DSQ scales were examined using IRT and Yen's Q3. Bifactor modeling was used to assess the scales' essential unidimensionality. Measurement equivalence was assessed using IRT-based DIF analysis using a 3-stage approach: linking on the latent mean and variance, selection of anchor items, and DIF testing using the Wald test. DTF was evaluated by comparing expected scale scores as a function of the latent trait. The 4DSQ scales proved to be essentially unidimensional in both modalities. Five items, belonging to the distress and somatization scales, displayed small amounts of DIF. DTF analysis revealed that the impact of DIF on the scale level was negligible. IRT-based DIF and DTF analysis is demonstrated as a way to assess the equivalence of Web-based and P&P questionnaire modalities. Data obtained with the Web-based 4DSQ are equivalent to data obtained with the P&P version.
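Illustrative note on the record above: Yen's Q3, which the abstract uses to check local independence, is the correlation across respondents between the model residuals of two items. The sketch below is a minimal illustration only. It assumes dichotomous items under a two-parameter logistic model with already-estimated abilities and item parameters, which is a simplification and not necessarily the model used in the study; the function names `p_2pl`, `pearson`, and `q3` are hypothetical.

```python
import math

def p_2pl(theta, a, b):
    """Probability of endorsing an item under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def q3(responses, thetas, items, j, k):
    """Q3 for items j and k: correlation of their residuals.

    responses[i][j] is the 0/1 response of person i to item j,
    thetas[i] is that person's ability estimate, and
    items[j] is an (a, b) pair of item parameter estimates.
    """
    res_j = [u[j] - p_2pl(t, *items[j]) for u, t in zip(responses, thetas)]
    res_k = [u[k] - p_2pl(t, *items[k]) for u, t in zip(responses, thetas)]
    return pearson(res_j, res_k)
```

Values of Q3 far from zero for an item pair suggest that the pair shares variance beyond the latent trait; any cut-off for "far" is a separate judgment that this sketch does not make.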
Analysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique

NASA Astrophysics Data System (ADS)

Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan

2016-12-01

This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discrimination, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test's distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.
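Illustrative note on the record above: the three-parameter logistic (3PL) model gives the probability of a correct response as a function of ability and the item's discrimination, difficulty, and guessing parameters. The sketch below shows the standard logistic form without the optional scaling constant; the function name `p_3pl` and the item values are hypothetical and are not taken from the study.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: discrimination, b: difficulty, c: lower asymptote (guessing)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: moderate discrimination, average difficulty,
# and a guessing floor of 0.25 (e.g. a four-option item).
item = {"a": 1.2, "b": 0.0, "c": 0.25}
for theta in (-3.0, 0.0, 3.0):
    print(theta, round(p_3pl(theta, **item), 3))
```

At `theta == b` the probability sits halfway between the guessing floor `c` and 1, which is why difficulty in the 3PL model does not correspond to a 50% chance of success unless `c` is zero.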
Effect Size Measures for Differential Item Functioning in a Multidimensional IRT Model

ERIC Educational Resources Information Center

Suh, Youngsuk

2016-01-01

This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…
Graded Response Model Based on the Logistic Positive Exponent Family of Models for Dichotomous Responses

ERIC Educational Resources Information Center

Samejima, Fumiko

2008-01-01

Samejima ("Psychometrika" 65:319-335, 2000) proposed the logistic positive exponent family of models (LPEF) for dichotomous responses in the unidimensional latent space. The objective of the present paper is to propose and discuss a graded response model that is expanded from the LPEF, in the context of item response theory (IRT). This…
Checking Dimensionality in Item Response Models with Principal Component Analysis on Standardized Residuals

ERIC Educational Resources Information Center

Chou, Yeh-Tai; Wang, Wen-Chung

2010-01-01

Dimensionality is an important assumption in item response theory (IRT). Principal component analysis on standardized residuals has been used to check dimensionality, especially under the family of Rasch models. It has been suggested that a first eigenvalue greater than 1.5 signifies a violation of unidimensionality when there…
When Cognitive Diagnosis Meets Computerized Adaptive Testing: CD-CAT

ERIC Educational Resources Information Center

Cheng, Ying

2009-01-01

Computerized adaptive testing (CAT) is a mode of testing which enables more efficient and accurate recovery of one or more latent traits. Traditionally, CAT is built upon Item Response Theory (IRT) models that assume unidimensionality. However, the problem of how to build CAT upon latent class models (LCM) has not been investigated until recently,…
Robust Measurement via A Fused Latent and Graphical Item Response Theory Model.

PubMed

Chen, Yunxiao; Li, Xiaoou; Liu, Jingchen; Ying, Zhiliang

2018-03-12

Item response theory (IRT) plays an important role in psychological and educational measurement. Unlike the classical testing theory, IRT models aggregate the item level information, yielding more accurate measurements. Most IRT models assume local independence, an assumption not likely to be satisfied in practice, especially when the number of items is large. Results in the literature and simulation studies in this paper reveal that misspecifying the local independence assumption may result in inaccurate measurements and differential item functioning. To provide more robust measurements, we propose an integrated approach by adding a graphical component to a multidimensional IRT model that can offset the effect of unknown local dependence. The new model contains a confirmatory latent variable component, which measures the targeted latent traits, and a graphical component, which captures the local dependence. An efficient proximal algorithm is proposed for the parameter estimation and structure learning of the local dependence. This approach can substantially improve the measurement, given no prior information on the local dependence structure. The model can be applied to measure both a unidimensional latent trait and multidimensional latent traits.
The Robustness of IRT-Based Vertical Scaling Methods to Violation of Unidimensionality

ERIC Educational Resources Information Center

Yin, Liqun

2013-01-01

In recent years, many states have adopted Item Response Theory (IRT) based vertically scaled tests due to their compelling features in a growth-based accountability context. However, selection of a practical and effective calibration/scaling method and proper understanding of issues with possible multidimensionality in the test data is critical to…
Between-Person and Within-Person Subscore Reliability: Comparison of Unidimensional and Multidimensional IRT Models

ERIC Educational Resources Information Center

Bulut, Okan

2013-01-01

The importance of subscores in educational and psychological assessments is undeniable. Subscores yield diagnostic information that can be used for determining how each examinee's abilities/skills vary over different content domains. One of the most common criticisms about reporting and using subscores is insufficient reliability of subscores.…
Lord-Wingersky Algorithm Version 2.0 for Hierarchical Item Factor Models with Applications in Test Scoring, Scale Alignment, and Model Fit Testing. CRESST Report 830

ERIC Educational Resources Information Center

Cai, Li

2013-01-01

Lord and Wingersky's (1984) recursive algorithm for creating summed score based likelihoods and posteriors has a proven track record in unidimensional item response theory (IRT) applications. Extending the recursive algorithm to handle multidimensionality is relatively simple, especially with fixed quadrature because the recursions can be defined…
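Illustrative note on the record above: in the unidimensional, dichotomous case, the Lord-Wingersky recursion builds the distribution of the summed score at a fixed ability value by adding one item at a time. The sketch below covers only that basic case, not the hierarchical extension the report describes; the function name `lord_wingersky` and the example probabilities are hypothetical.

```python
def lord_wingersky(probs):
    """Summed-score likelihoods for dichotomous items at one ability value.

    probs[j] is the probability of a correct response to item j at a
    fixed theta. Returns a list dist where dist[s] is the probability
    of obtaining summed score s at that theta.
    """
    dist = [1.0]  # with no items, score 0 has probability 1
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for s, mass in enumerate(dist):
            new[s] += mass * (1.0 - p)   # item answered incorrectly
            new[s + 1] += mass * p       # item answered correctly
        dist = new
    return dist

# Hypothetical correct-response probabilities for three items at one theta.
print(lord_wingersky([0.2, 0.7, 0.9]))
```

Repeating the recursion at each quadrature point and weighting by the latent density would give marginal summed-score probabilities; that step is omitted here.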
Application of a General Polytomous Testlet Model to the Reading Section of a Large-Scale English Language Assessment. Research Report. ETS RR-10-21

ERIC Educational Resources Information Center

Li, Yanmei; Li, Shuhong; Wang, Lin

2010-01-01

Many standardized educational tests include groups of items based on a common stimulus, known as "testlets". Standard unidimensional item response theory (IRT) models are commonly used to model examinees' responses to testlet items. However, it is known that local dependence among testlet items can lead to biased item parameter estimates…
The Communicative Participation Item Bank (CPIB): Item bank calibration and development of a disorder-generic short form

PubMed Central

Baylor, Carolyn; Yorkston, Kathryn; Eadie, Tanya; Kim, Jiseon; Chung, Hyewon; Amtmann, Dagmar

2015-01-01

Purpose: The purpose of this study was to calibrate the items for the Communicative Participation Item Bank (CPIB) using Item Response Theory (IRT). One overriding objective was to examine if the IRT item parameters would be consistent across different diagnostic groups, thereby allowing creation of a disorder-generic instrument. The intended outcomes were the final item bank and a short form ready for clinical and research applications. Methods: Self-report data were collected from 701 individuals representing four diagnoses: multiple sclerosis, Parkinson’s disease, amyotrophic lateral sclerosis and head and neck cancer. Participants completed the CPIB and additional self-report questionnaires. CPIB data were analyzed using the IRT Graded Response Model (GRM). Results: The initial set of 94 candidate CPIB items was reduced to an item bank of 46 items demonstrating unidimensionality, local independence, good item fit, and good measurement precision. Differential item functioning (DIF) analyses detected no meaningful differences across diagnostic groups. A 10-item, disorder-generic short form was generated. Conclusions: The CPIB provides speech-language pathologists with a unidimensional, self-report outcomes measurement instrument dedicated to the construct of communicative participation. This instrument may be useful to clinicians and researchers wanting to implement measures of communicative participation in their work. PMID:23816661
Multidimensional student skills with collaborative filtering

NASA Astrophysics Data System (ADS)

Bergner, Yoav; Rayyan, Saif; Seaton, Daniel; Pritchard, David E.

2013-01-01

Despite the fact that a physics course typically culminates in one final grade for the student, many instructors and researchers believe that there are multiple skills that students acquire to achieve mastery. Assessment validation and data analysis in general may thus benefit from extension to multidimensional ability. This paper introduces an approach for model determination and dimensionality analysis using collaborative filtering (CF), which is related to factor analysis and item response theory (IRT). Model selection is guided by machine learning perspectives, seeking to maximize the accuracy in predicting which students will answer which items correctly. We apply the CF to response data for the Mechanics Baseline Test and combine the results with prior analysis using unidimensional IRT.
Improving Measurement Efficiency of the Inner EAR Scale with Item Response Theory.

PubMed

Jessen, Annika; Ho, Andrew D; Corrales, C Eduardo; Yueh, Bevan; Shin, Jennifer J

2018-02-01

Objectives: (1) To assess the 11-item Inner Effectiveness of Auditory Rehabilitation (Inner EAR) instrument with item response theory (IRT). (2) To determine whether the underlying latent ability could also be accurately represented by a subset of the items for use in high-volume clinical scenarios. (3) To determine whether the Inner EAR instrument correlates with pure tone thresholds and word recognition scores. Design: IRT evaluation of prospective cohort data. Setting: Tertiary care academic ambulatory otolaryngology clinic. Subjects and Methods: Modern psychometric methods, including factor analysis and IRT, were used to assess unidimensionality and item properties. Regression methods were used to assess prediction of word recognition and pure tone audiometry scores. Results: The Inner EAR scale is unidimensional, and items varied in their location and information. Information parameter estimates ranged from 1.63 to 4.52, with higher values indicating more useful items. The IRT model provided a basis for identifying 2 sets of items with relatively lower information parameters. Item information functions demonstrated which items added insubstantial value over and above other items and were removed in stages, creating an 8- and 3-item Inner EAR scale for more efficient assessment. The 8-item version accurately reflected the underlying construct. All versions correlated moderately with word recognition scores and pure tone averages. Conclusion: The 11-, 8-, and 3-item versions of the Inner EAR scale have strong psychometric properties, and there is correlational validity evidence for the observed scores. Modern psychometric methods can help streamline care delivery by maximizing relevant information per item administered.
The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory

ERIC Educational Resources Information Center

Sahin, Alper; Anil, Duygu

2017-01-01

This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of…
Item response theory analysis of Centers for Disease Control and Prevention Health-Related Quality of Life (CDC HRQOL) items in adults with arthritis.

PubMed

Mielenz, Thelma J; Callahan, Leigh F; Edwards, Michael C

2016-03-12

Examine the feasibility of performing an item response theory (IRT) analysis on two of the Centers for Disease Control and Prevention health-related quality of life (CDC HRQOL) modules - the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy Days Symptoms Module (HDSM). Previous principal components analyses confirm that the two scales both assess a mix of mental (CDC-MH) and physical health (CDC-PH). The purpose is to conduct item response theory (IRT) analysis on the CDC-MH and CDC-PH scales separately. 2182 patients with self-reported or physician-diagnosed arthritis completed a cross-sectional survey including HDCM and HDSM items. Besides global health, the other 8 items ask the number of days that some statement was true; we chose to recode the data into 8 categories based on observed clustering. The IRT assumptions were assessed using confirmatory factor analysis and the data could be modeled using a unidimensional IRT model. The graded response model was used for IRT analyses and CDC-MH and CDC-PH scales were analyzed separately in flexMIRT. The IRT parameter estimates for the five-item CDC-PH all appeared reasonable. The three-item CDC-MH did not have reasonable parameter estimates. The CDC-PH scale is amenable to IRT analysis but the existing CDC-MH scale is not. We suggest either using the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy Days Symptoms Module (HDSM) as they currently stand or the CDC-PH scale alone if the primary goal is to measure physical health related HRQOL.
Modern psychometrics applied in rheumatology--a systematic review.

PubMed

Siemons, Liseth; Ten Klooster, Peter M; Taal, Erik; Glas, Cees Aw; Van de Laar, Mart Afj

2012-10-31

Although item response theory (IRT) appears to be increasingly used within health care research in general, a comprehensive overview of the frequency and characteristics of IRT analyses within the rheumatic field is lacking. An overview of the use and application of IRT in rheumatology to date may give insight into future research directions and highlight new possibilities for the improvement of outcome assessment in rheumatic conditions. Therefore, this study systematically reviewed the application of IRT to patient-reported and clinical outcome measures in rheumatology. Literature searches in PubMed, Scopus and Web of Science resulted in 99 original English-language articles which used some form of IRT-based analysis of patient-reported or clinical outcome data in patients with a rheumatic condition. Both general study information and IRT-specific information were assessed. Most studies used Rasch modeling for developing or evaluating new or existing patient-reported outcomes in rheumatoid arthritis or osteoarthritis patients. Outcomes of principal interest were physical functioning and quality of life. Since the last decade, IRT has also been applied to clinical measures more frequently. IRT was mostly used for evaluating model fit, unidimensionality and differential item functioning, the distribution of items and persons along the underlying scale, and reliability. Less frequently used IRT applications were the evaluation of local independence, the threshold ordering of items, and the measurement precision along the scale. IRT applications have markedly increased within rheumatology over the past decades. To date, IRT has primarily been applied to patient-reported outcomes; however, applications to clinical measures are gaining interest.
Useful IRT applications not yet widely used within rheumatology include the cross-calibration of instrument scores and the development of computerized adaptive tests which may reduce the measurement burden for both the patient and the clinician. Also, the measurement precision of outcome measures along the scale was only evaluated occasionally. Performed IRT analyses should be adequately explained, justified, and reported. A global consensus about uniform guidelines should be reached concerning the minimum number of assumptions which should be met and best ways of testing these assumptions, in order to stimulate the quality appraisal of performed IRT analyses.

Evaluation properties of the French version of the OUT-PATSAT35 satisfaction with care questionnaire according to classical and item response theory analyses.

PubMed

Panouillères, M; Anota, A; Nguyen, T V; Brédart, A; Bosset, J F; Monnier, A; Mercier, M; Hardouin, J B

2014-09-01

The present study investigates the properties of the French version of the OUT-PATSAT35 questionnaire, which evaluates the outpatients' satisfaction with care in oncology using classical analysis (CTT) and item response theory (IRT). This cross-sectional multicenter study includes 692 patients who completed the questionnaire at the end of their ambulatory treatment. CTT analyses tested the main psychometric properties (convergent and divergent validity, and internal consistency). IRT analyses were conducted separately for each OUT-PATSAT35 domain (the doctors, the nurses or the radiation therapists and the services/organization) by models from the Rasch family. We examined the fit of the data to the model expectations and tested whether the model assumptions of unidimensionality, monotonicity and local independence were respected. A total of 605 (87.4%) respondents were analyzed with a mean age of 64 years (range 29-88). Internal consistency for all scales separately and for the three main domains was good (Cronbach's α 0.74-0.98). IRT analyses were performed with the partial credit model. No disordered thresholds of polytomous items were found. Each domain showed high reliability but fitted poorly to the Rasch models. Three items in particular, the item about "promptness" in the doctors' domain and the items about "accessibility" and "environment" in the services/organization domain, presented the highest default of fit. A correct fit of the Rasch model can be obtained by dropping these items. Most of the local dependence concerned items about "information provided" in each domain. A major deviation of unidimensionality was found in the nurses' domain. CTT showed good psychometric properties of the OUT-PATSAT35. However, the Rasch analysis revealed some misfitting and redundant items. Taking the above problems into consideration, it could be interesting to refine the questionnaire in a future study.
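The internal-consistency figures reported above are Cronbach's alpha values. As a minimal sketch, alpha can be computed from a respondents-by-items score matrix as k/(k-1) * (1 - sum of item variances / variance of the total score). The function name and the example responses below are hypothetical and are not data from the study.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    The ratio of variances is the same whether population or sample variance
    is used, as long as the choice is consistent; population variance is used here.
    """
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)

# Hypothetical responses of four people to a three-item scale.
responses = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 1]]
print(cronbach_alpha(responses))
```

A high alpha alone says nothing about fit to a Rasch model, which is consistent with the abstract's finding of high reliability alongside poor model fit.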
An analysis of the DuPage County Regional Office of Education physics exam

NASA Astrophysics Data System (ADS)

Muehsler, Hans

In 2009, the DuPage County Regional Office of Education (ROE) tasked volunteer physics teachers with creating a basic skills physics exam reflecting what the participants valued and shared in common across curricula. Mechanics, electricity & magnetism (E&M), and wave phenomena emerged as the primary constructs. The resulting exam was intended for first-exposure physics students. The most recently completed version was psychometrically assessed for unidimensionality within the constructs using a robust WLS structural equation model and for reliability. An item analysis using a 3-PL IRT model was performed on the mechanics items and a 2-PL IRT model was performed on the E&M and waves items; a distractor analysis was also performed on all items. Lastly, differential item functioning (DIF) and differential test functioning (DTF) analyses, using the Mantel-Haenszel procedure, were performed using gender, ethnicity, year in school, ELL, physics level, and math level as groupings.
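The Mantel-Haenszel procedure mentioned above compares reference and focal groups within strata of matched total score. A minimal sketch of the common odds ratio follows, together with the conventional ETS delta transformation (-2.35 times the natural log of the odds ratio). The function names and the counts are hypothetical, and the abstract does not say which effect-size metric the author reported.

```python
import math

def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across matched score strata.

    Each stratum is a tuple (a, b, c, d) of counts:
    a = reference correct, b = reference incorrect,
    c = focal correct,     d = focal incorrect.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty strata
        num += a * d / n
        den += b * c / n
    return num / den

def mh_d_dif(alpha):
    """ETS delta-scale DIF statistic; negative values favor the reference group."""
    return -2.35 * math.log(alpha)

# Hypothetical counts for one item at three matched score levels.
strata = [(30, 10, 20, 20), (25, 5, 22, 8), (40, 2, 38, 4)]
alpha = mantel_haenszel_odds_ratio(strata)
print(alpha, mh_d_dif(alpha))
```

An odds ratio near 1 (delta near 0) indicates that, at equal total score, the two groups have similar odds of answering the item correctly.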
Application of Item Response Theory to Tests of Substance-related Associative Memory

PubMed Central

Shono, Yusuke; Grenard, Jerry L.; Ames, Susan L.; Stacy, Alan W.

2015-01-01

A substance-related word association test (WAT) is one of the commonly used indirect tests of substance-related implicit associative memory and has been shown to predict substance use. This study applied an item response theory (IRT) modeling approach to evaluate psychometric properties of the alcohol- and marijuana-related WATs and their items among 775 ethnically diverse at-risk adolescents. After examining the IRT assumptions, item fit, and differential item functioning (DIF) across gender and age groups, the original 18 WAT items were reduced to 14 and 15 items in the alcohol- and marijuana-related WATs, respectively. Thereafter, unidimensional one- and two-parameter logistic models (1PL and 2PL models) were fitted to the revised WAT items. The results demonstrated that both alcohol- and marijuana-related WATs have good psychometric properties. These results were discussed in light of the framework of a unified concept of construct validity (Messick, 1975, 1989, 1995). PMID:25134051
Assessment of health surveys: fitting a multidimensional graded response model.

PubMed

Depaoli, Sarah; Tiemensma, Jitske; Felt, John M

The multidimensional graded response model, an item response theory (IRT) model, can be used to improve the assessment of surveys, even when sample sizes are restricted. Typically, health-based survey development utilizes classical statistical techniques (e.g. reliability and factor analysis). In a review of four prominent journals within the field of Health Psychology, we found that IRT-based models were used in less than 10% of the studies examining scale development or assessment. However, implementing IRT-based methods can provide more details about individual survey items, which is useful when determining the final item content of surveys. An example using a quality of life survey for Cushing's syndrome (CushingQoL) highlights the main components for implementing the multidimensional graded response model. Patients with Cushing's syndrome (n = 397) completed the CushingQoL. Results from the multidimensional graded response model supported a 2-subscale scoring process for the survey. All items were deemed as worthy contributors to the survey. The graded response model can accommodate unidimensional or multidimensional scales, be used with relatively lower sample sizes, and is implemented in free software (example code provided in online Appendix). Use of this model can help to improve the quality of health-based scales being developed within the Health Sciences.
Non-ignorable missingness item response theory models for choice effects in examinee-selected items.

PubMed

Liu, Chen-Wei; Wang, Wen-Chung

2017-11-01

Examinee-selected item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set, always yields incomplete data (i.e., when only the selected items are answered, data are missing for the others) that are likely non-ignorable in likelihood inference. Standard item response theory (IRT) models become infeasible when ESI data are missing not at random (MNAR). To solve this problem, the authors propose a two-dimensional IRT model that posits one unidimensional IRT model for observed data and another for nominal selection patterns. The two latent variables are assumed to follow a bivariate normal distribution. In this study, the mirt freeware package was adopted to estimate parameters. The authors conduct an experiment to demonstrate that ESI data are often non-ignorable and to determine how to apply the new model to the data collected. Two follow-up simulation studies are conducted to assess the parameter recovery of the new model and the consequences for parameter estimation of ignoring MNAR data. The results of the two simulation studies indicate good parameter recovery of the new model and poor parameter recovery when non-ignorable missing data were mistakenly treated as ignorable. © 2017 The British Psychological Society.
IRTs of the ABCs: Children's Letter Name Acquisition

PubMed Central

Piasta, Shayne B.; Anthony, Jason L.; Lonigan, Christopher J.; Francis, David J.

2015-01-01

We examined the developmental sequence of letter name knowledge acquisition by children from 2 to 5 years of age. Data from 2 samples representing diverse regions, ethnicity, and socioeconomic backgrounds (ns = 1074 & 500) were analyzed using item response theory (IRT) and differential item functioning techniques. Results from factor analyses indicated that letter name knowledge represented a unidimensional skill; IRT results yielded significant differences between letters in both difficulty and discrimination. Results also indicated an approximate developmental sequence in letter name learning for the simplest and most challenging to learn letters -- but with no clear sequence between these extremes. Findings also suggested that children were most likely to first learn their first initial. We discuss implications for assessment and instruction. PMID:22710016
An item response theory analysis of the Olweus Bullying scale.

PubMed

Breivik, Kyrre; Olweus, Dan

2014-12-02

In the present article, we used IRT (graded response) modeling as a useful technology for a detailed and refined study of the psychometric properties of the various items of the Olweus Bullying scale and the scale itself. The sample consisted of a very large number of Norwegian 4th-10th grade students (n = 48 926). The IRT analyses revealed that the scale was essentially unidimensional and had excellent reliability in the upper ranges of the latent bullying tendency trait, as intended and desired. Gender DIF effects were identified with regard to girls' use of indirect bullying by social exclusion and boys' use of physical bullying by hitting and kicking but these effects were small and worked in opposite directions, having negligible effects at the scale level. Also scale scores adjusted for DIF effects differed very little from non-adjusted scores. In conclusion, the empirical data were well characterized by the chosen IRT model and the Olweus Bullying scale was considered well suited for the conduct of fair and reliable comparisons involving different gender-age groups. Information Aggr. Behav. 9999:XX-XX, 2014. © 2014 Wiley Periodicals, Inc.
An item response theory analysis of the Olweus Bullying scale.

PubMed

Breivik, Kyrre; Olweus, Dan

2015-01-01

In the present article, we used IRT (graded response) modeling as a useful technology for a detailed and refined study of the psychometric properties of the various items of the Olweus Bullying scale and the scale itself. The sample consisted of a very large number of Norwegian 4th-10th grade students (n = 48 926). The IRT analyses revealed that the scale was essentially unidimensional and had excellent reliability in the upper ranges of the latent bullying tendency trait, as intended and desired. Gender DIF effects were identified with regard to girls' use of indirect bullying by social exclusion and boys' use of physical bullying by hitting and kicking but these effects were small and worked in opposite directions, having negligible effects at the scale level. Also scale scores adjusted for DIF effects differed very little from non-adjusted scores. In conclusion, the empirical data were well characterized by the chosen IRT model and the Olweus Bullying scale was considered well suited for the conduct of fair and reliable comparisons involving different gender-age groups. Information Aggr. Behav. 41:1-13, 2015. © 2014 Wiley Periodicals, Inc.
Immediate list recall as a measure of short-term episodic memory: insights from the serial position effect and item response theory.

PubMed

Gavett, Brandon E; Horwitz, Julie E

2012-03-01

The serial position effect shows that two interrelated cognitive processes underlie immediate recall of a supraspan word list. The current study used item response theory (IRT) methods to determine whether the serial position effect poses a threat to the construct validity of immediate list recall as a measure of verbal episodic memory. Archival data were obtained from a national sample of 4,212 volunteers aged 28-84 in the Midlife Development in the United States study. Telephone assessment yielded item-level data for a single immediate recall trial of the Rey Auditory Verbal Learning Test (RAVLT). Two-parameter logistic IRT procedures were used to estimate item parameters and the Q(1) statistic was used to evaluate item fit. A two-dimensional model better fit the data than a unidimensional model, supporting the notion that list recall is influenced by two underlying cognitive processes. IRT analyses revealed that 4 of the 15 RAVLT items (1, 12, 14, and 15) misfit the model (p < .05). Item characteristic curves for items 14 and 15 decreased monotonically, implying an inverse relationship between the ability level and the probability of recall. Elimination of the four misfit items provided better fit to the data and met necessary IRT assumptions. Performance on a supraspan list learning test is influenced by multiple cognitive abilities; failure to account for the serial position of words decreases the construct validity of the test as a measure of episodic memory and may provide misleading results. IRT methods can ameliorate these problems and improve construct validity.
Item response theory analysis of the Lichtenberg Financial Decision Screening Scale.

PubMed

Teresi, Jeanne A; Ocepek-Welikson, Katja; Lichtenberg, Peter A

2017-01-01

The focus of these analyses was to examine the psychometric properties of the Lichtenberg Financial Decision Screening Scale (LFDSS). The purpose of the screen was to evaluate the decisional abilities and vulnerability to exploitation of older adults. Adults aged 60 and over were interviewed by social, legal, financial, or health services professionals who underwent in-person training on the administration and scoring of the scale. Professionals provided a rating of the decision-making abilities of the older adult. The analytic sample included 213 individuals with an average age of 76.9 (SD = 10.1). The majority (57%) were female. Data were analyzed using item response theory (IRT) methodology. The results supported the unidimensionality of the item set. Several IRT models were tested. Ten ordinal and binary items evidenced a slightly higher reliability estimate (0.85) than other versions and better coverage in terms of the range of reliable measurement across the continuum of financial incapacity.
Evaluation of psychometric properties and differential item functioning of 8-item Child Perceptions Questionnaires using item response theory.

PubMed

Yau, David T W; Wong, May C M; Lam, K F; McGrath, Colman

2015-08-19

A four-factor structure of the two 8-item short forms of Child Perceptions Questionnaire CPQ11-14 (RSF:8 and ISF:8) has been confirmed. However, the sum scores are typically reported in practice as a proxy of Oral health-related Quality of Life (OHRQoL), which implies a unidimensional structure. This study first assessed the unidimensionality of 8-item short forms of CPQ11-14. Item response theory (IRT) was employed to offer an alternative and complementary approach of validation and to overcome the limitations of classical test theory assumptions. A random sample of 649 12-year-old school children in Hong Kong was analyzed. Unidimensionality of the scale was tested by confirmatory factor analysis (CFA), principal component analysis (PCA) and local dependency (LD) statistic. A graded response model was fitted to the data. Contribution of each item to the scale was assessed by item information function (IIF). Reliability of the scale was assessed by test information function (TIF). Differential item functioning (DIF) across gender was identified by Wald test and expected score functions. Both CPQ11-14 RSF:8 and ISF:8 did not deviate much from the unidimensionality assumption. Results from CFA indicated acceptable fit of the one-factor model. PCA indicated that the first principal component explained >30 % of the total variation with high factor loadings for both RSF:8 and ISF:8. Almost all LD statistics were <10, indicating the absence of local dependency. Flat and low IIFs were observed in the oral symptoms items, suggesting little contribution of information to the scale, and item removal caused little practical impact. Comparing the TIFs, RSF:8 showed slightly better information than ISF:8. In addition to oral symptoms items, the item "Concerned with what other people think" demonstrated a uniform DIF (p < 0.001). The expected score functions were not much different between boys and girls.
Items related to oral symptoms were not informative to OHRQoL and deletion of these items is suggested. The impact of DIF across gender on the overall score was minimal. CPQ11-14 RSF:8 performed slightly better than ISF:8 in measurement precision. The 6-item short forms suggested by IRT validation should be further investigated to ensure their robustness, responsiveness and discriminative performance.
Measuring stigma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Stigma item bank and short form.

PubMed

Kisala, Pamela A; Tulsky, David S; Pace, Natalie; Victorson, David; Choi, Seung W; Heinemann, Allen W

2015-05-01

To develop a calibrated item bank and computer adaptive test (CAT) to assess the effects of stigma on health-related quality of life in individuals with spinal cord injury (SCI). Grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, and item response theory (IRT)-based psychometric analyses. Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. SCI-QOL Stigma Item Bank A sample of 611 individuals with traumatic SCI completed 30 items assessing SCI-related stigma. After 7 items were iteratively removed, factor analyses confirmed a unidimensional pool of items. Graded Response Model IRT analyses were used to estimate slopes and thresholds for the final 23 items. The SCI-QOL Stigma item bank is unique not only in the assessment of SCI-related stigma but also in the inclusion of individuals with SCI in all phases of its development. Use of confirmatory factor analytic and IRT methods provide flexibility and precision of measurement. The item bank may be administered as a CAT or as a 10-item fixed-length short form and can be used for research and clinical applications.
Measuring stigma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Stigma item bank and short form

PubMed Central

Kisala, Pamela A.; Tulsky, David S.; Pace, Natalie; Victorson, David; Choi, Seung W.; Heinemann, Allen W.

2015-01-01

Objective To develop a calibrated item bank and computer adaptive test (CAT) to assess the effects of stigma on health-related quality of life in individuals with spinal cord injury (SCI). Design Grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, and item response theory (IRT)-based psychometric analyses. Setting Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Participants Adults with traumatic SCI. Main Outcome Measures SCI-QOL Stigma Item Bank Results A sample of 611 individuals with traumatic SCI completed 30 items assessing SCI-related stigma. After 7 items were iteratively removed, factor analyses confirmed a unidimensional pool of items. Graded Response Model IRT analyses were used to estimate slopes and thresholds for the final 23 items. Conclusions The SCI-QOL Stigma item bank is unique not only in the assessment of SCI-related stigma but also in the inclusion of individuals with SCI in all phases of its development. Use of confirmatory factor analytic and IRT methods provide flexibility and precision of measurement. The item bank may be administered as a CAT or as a 10-item fixed-length short form and can be used for research and clinical applications. PMID:26010973
Comparing five depression measures in depressed Chinese patients using item response theory: an examination of item properties, measurement precision and score comparability.

PubMed

Zhao, Yue; Chan, Wai; Lo, Barbara Chuen Yee

2017-04-04

Item response theory (IRT) has been increasingly applied to patient-reported outcome (PRO) measures. The purpose of this study is to apply IRT to examine item properties (discrimination and severity of depressive symptoms), measurement precision and score comparability across five depression measures, which is the first study of its kind in the Chinese context. A clinical sample of 207 Hong Kong Chinese outpatients was recruited. Data analyses were performed including classical item analysis, IRT concurrent calibration and IRT true score equating. The IRT assumptions of unidimensionality and local independence were tested respectively using confirmatory factor analysis and chi-square statistics. The IRT linking assumptions of construct similarity, equity and subgroup invariance were also tested. The graded response model was applied to concurrently calibrate all five depression measures in a single IRT run, resulting in the item parameter estimates of these measures being placed onto a single common metric. IRT true score equating was implemented to perform the outcome score linking and construct score concordances so as to link scores from one measure to corresponding scores on another measure for direct comparability. Findings suggested that (a) symptoms on depressed mood, suicidality and feeling of worthlessness served as the strongest discriminating indicators, and symptoms concerning suicidality, changes in appetite, depressed mood, feeling of worthlessness and psychomotor agitation or retardation reflected high levels of severity in the clinical sample. (b) The five depression measures contributed to various degrees of measurement precision at varied levels of depression. (c) After outcome score linking was performed across the five measures, the cut-off scores led to either consistent or discrepant diagnoses for depression. 
The study provides additional evidence regarding the psychometric properties and clinical utility of the five depression measures, offers methodological contributions to the appropriate use of IRT in PRO measures, and helps elucidate cultural variation in depressive symptomatology. The approach of concurrently calibrating and linking multiple PRO measures can be applied to the assessment of PROs other than the depression context.
Unidimensional IRT Item Parameter Estimates across Equivalent Test Forms with Confounding Specifications within Dimensions

ERIC Educational Resources Information Center

Matlock, Ki Lynn; Turner, Ronna

2016-01-01

When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…
The Mindful Attention Awareness Scale: Further Examination of Dimensionality, Reliability, and Concurrent Validity Estimates.

PubMed

Osman, Augustine; Lamis, Dorian A; Bagge, Courtney L; Freedenthal, Stacey; Barnes, Sean M

2016-01-01

We examined the factor structure and psychometric properties of the Mindful Attention Awareness Scale (MAAS) in a sample of 810 undergraduate students. Using common exploratory factor analysis (EFA), we obtained evidence for a 1-factor solution (41.84% common variance). To confirm unidimensionality of the 15-item MAAS, we conducted a 1-factor confirmatory factor analysis (CFA). Results of the EFA and CFA, respectively, provided support for a unidimensional model. Using differential item functioning analysis methods within item response theory modeling (IRT-based DIF), we found that individuals with high and low levels of nonattachment responded similarly to the MAAS items. Following a detailed item analysis, we proposed a 5-item short version of the instrument and present descriptive statistics and composite score reliability for the short and full versions of the MAAS. Finally, correlation analyses showed that scores on the full and short versions of the MAAS were associated with measures assessing related constructs. The 5-item MAAS is as useful as the original MAAS in enhancing our understanding of the mindfulness construct.
Validating a multiple mini-interview question bank assessing entry-level reasoning skills in candidates for graduate-entry medicine and dentistry programmes.

PubMed

Roberts, Chris; Zoanetti, Nathan; Rothnie, Imogene

2009-04-01

The multiple mini-interview (MMI) was initially designed to test non-cognitive characteristics related to professionalism in entry-level students. However, it may be testing cognitive reasoning skills. Candidates to medical and dental schools come from diverse backgrounds and it is important for the validity and fairness of the MMI that these background factors do not impact on their scores. A suite of advanced psychometric techniques drawn from item response theory (IRT) was used to validate an MMI question bank in order to establish the conceptual equivalence of the questions. Bias against candidate subgroups of equal ability was investigated using differential item functioning (DIF) analysis. All 39 questions had a good fit to the IRT model. Of the 195 checklist items, none were found to have significant DIF after visual inspection of expected score curves, consideration of the number of applicants per category, and evaluation of the magnitude of the DIF parameter estimates. The question bank contains items that have been studied carefully in terms of model fit and DIF. Questions appear to measure a cognitive unidimensional construct, 'entry-level reasoning skills in professionalism', as suggested by goodness-of-fit statistics. The lack of items exhibiting DIF is encouraging in a contemporary high-stakes admission setting where candidates of diverse personal, cultural and academic backgrounds are assessed by common means. This IRT approach has potential to provide assessment designers with a quality control procedure that extends to the level of checklist items.
Lord-Wingersky Algorithm Version 2.0 for Hierarchical Item Factor Models with Applications in Test Scoring, Scale Alignment, and Model Fit Testing.

PubMed

Cai, Li

2015-06-01

Lord and Wingersky's (Appl Psychol Meas 8:453-461, 1984) recursive algorithm for creating summed score based likelihoods and posteriors has a proven track record in unidimensional item response theory (IRT) applications. Extending the recursive algorithm to handle multidimensionality is relatively simple, especially with fixed quadrature because the recursions can be defined on a grid formed by direct products of quadrature points. However, the increase in computational burden remains exponential in the number of dimensions, making the implementation of the recursive algorithm cumbersome for truly high-dimensional models. In this paper, a dimension reduction method that is specific to the Lord-Wingersky recursions is developed. This method can take advantage of the restrictions implied by hierarchical item factor models, e.g., the bifactor model, the testlet model, or the two-tier model, such that a version of the Lord-Wingersky recursive algorithm can operate on a dramatically reduced set of quadrature points. For instance, in a bifactor model, the dimension of integration is always equal to 2, regardless of the number of factors. The new algorithm not only provides an effective mechanism to produce summed score to IRT scaled score translation tables properly adjusted for residual dependence, but leads to new applications in test scoring, linking, and model fit checking as well. Simulated and empirical examples are used to illustrate the new applications.
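For orientation, the original unidimensional Lord-Wingersky recursion that this paper extends can be sketched in a few lines of Python. The sketch assumes dichotomous items and a single fixed ability value (one quadrature point); it is not the dimension-reduced Version 2.0 algorithm described in the abstract, and the function name and probabilities are hypothetical.

```python
def summed_score_likelihoods(probs):
    """Distribution of the summed score given item success probabilities.

    probs[i] is P(item i correct | theta) at one fixed theta.
    Returns a list L with L[s] = P(summed score == s | theta).
    """
    dist = [1.0]  # before any item is added, the score is 0 with probability 1
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for s, mass in enumerate(dist):
            new[s] += mass * (1.0 - p)   # item answered incorrectly: score stays at s
            new[s + 1] += mass * p       # item answered correctly: score moves to s + 1
        dist = new
    return dist

# Hypothetical item probabilities at a single ability value.
print(summed_score_likelihoods([0.5, 0.5]))
```

Repeating this recursion at every quadrature point and weighting by the prior gives summed-score posteriors. With a full multidimensional grid the number of points grows exponentially with the number of dimensions, which is the cost the paper's dimension reduction is designed to avoid.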
Calibration and Validation of the Dutch-Flemish PROMIS Pain Interference Item Bank in Patients with Chronic Pain

PubMed Central

Crins, Martine H. P.; Roorda, Leo D.; Smits, Niels; de Vet, Henrica C. W.; Westhovens, Rene; Cella, David; Cook, Karon F.; Revicki, Dennis; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Terwee, Caroline B.

2015-01-01

The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translations compared to the original English items, and to evaluate their reliability and construct validity. The 40 items in the bank were completed by 1085 Dutch chronic pain patients. Before calibrating the items, IRT model assumptions were evaluated using confirmatory factor analysis (CFA). Items were calibrated using the graded response model (GRM), an IRT model appropriate for items with more than two response options. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was examined. Reliability was evaluated based on standard errors and Cronbach’s alpha. To evaluate construct validity, correlations with scores on legacy instruments (e.g., the Disabilities of the Arm, Shoulder and Hand Questionnaire) were calculated. Unidimensionality of the Dutch-Flemish PROMIS Pain Interference item bank was supported by CFA tests of model fit (CFI = 0.986, TLI = 0.986). Furthermore, the data fit the GRM and showed good coverage across the pain interference continuum (threshold-parameters range: -3.04 to 3.44). The Dutch-Flemish PROMIS Pain Interference item bank has good cross-cultural validity (only two out of 40 items showing DIF), good reliability (Cronbach’s alpha = 0.98), and good construct validity (Pearson correlations between 0.62 and 0.75). A computer adaptive test (CAT) and Dutch-Flemish PROMIS short forms of the Dutch-Flemish PROMIS Pain Interference item bank can now be developed. PMID:26214178
Calibration and Validation of the Dutch-Flemish PROMIS Pain Interference Item Bank in Patients with Chronic Pain.

PubMed

Crins, Martine H P; Roorda, Leo D; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Terwee, Caroline B

2015-01-01

The Dutch-Flemish PROMIS Group translated the adult PROMIS Pain Interference item bank into Dutch-Flemish. The aims of the current study were to calibrate the parameters of these items using an item response theory (IRT) model, to evaluate the cross-cultural validity of the Dutch-Flemish translations compared to the original English items, and to evaluate their reliability and construct validity. The 40 items in the bank were completed by 1085 Dutch chronic pain patients. Before calibrating the items, IRT model assumptions were evaluated using confirmatory factor analysis (CFA). Items were calibrated using the graded response model (GRM), an IRT model appropriate for items with more than two response options. To evaluate cross-cultural validity, differential item functioning (DIF) for language (Dutch vs. English) was examined. Reliability was evaluated based on standard errors and Cronbach's alpha. To evaluate construct validity, correlations with scores on legacy instruments (e.g., the Disabilities of the Arm, Shoulder and Hand Questionnaire) were calculated. Unidimensionality of the Dutch-Flemish PROMIS Pain Interference item bank was supported by CFA tests of model fit (CFI = 0.986, TLI = 0.986). Furthermore, the data fit the GRM and showed good coverage across the pain interference continuum (threshold-parameters range: -3.04 to 3.44). The Dutch-Flemish PROMIS Pain Interference item bank has good cross-cultural validity (only two out of 40 items showing DIF), good reliability (Cronbach's alpha = 0.98), and good construct validity (Pearson correlations between 0.62 and 0.75). A computer adaptive test (CAT) and Dutch-Flemish PROMIS short forms of the Dutch-Flemish PROMIS Pain Interference item bank can now be developed.
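For readers unfamiliar with the graded response model named in this abstract, the sketch below shows how it turns one slope and a set of ordered thresholds into category probabilities for a polytomous item. The function name and all parameter values are hypothetical; they are not the calibrated Dutch-Flemish item parameters.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under Samejima's graded response model.

    `thresholds` must be strictly increasing (b_1 < ... < b_{K-1}) and
    yields K ordered response categories. All values are illustrative.
    """
    # Cumulative probabilities P(X >= k), padded with 1 below and 0 above.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulative curves.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Hypothetical five-category item with symmetric thresholds.
probs = grm_category_probs(theta=0.0, a=2.0, thresholds=[-2.0, -0.5, 0.5, 2.0])
```

With thresholds placed symmetrically around the respondent's trait level, the middle category is the most likely response and the two extreme categories are equally unlikely.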

Evaluation of diagnostic criteria for panic attack using item response theory: findings from the National Comorbidity Survey in USA.

PubMed

Ietsugu, Tetsuji; Sukigara, Masune; Furukawa, Toshiaki A

2007-12-01

The dichotomous diagnostic systems such as the Diagnostic and Statistical Manual of Mental Disorders (DSM) and International Classification of Diseases (ICD) lose much important information concerning what each symptom can offer. This study explored the characteristics and performances of DSM-IV and ICD-10 diagnostic criteria items for panic attack using modern item response theory (IRT). The National Comorbidity Survey used the Composite International Diagnostic Interview to assess 14 DSM-IV and ICD-10 panic attack diagnostic criteria items in the general population in the USA. The dimensionality and measurement properties of these items were evaluated using dichotomous factor analysis and the two-parameter IRT model. A total of 1213 respondents reported at least one subsyndromal or syndromal panic attack in their lifetime. Factor analysis indicated that all items constitute a unidimensional construct. The two-parameter IRT model produced meaningful and interpretable results. Among items with high discrimination parameters, the difficulty parameter for "palpitation" was relatively low, while those for "choking," "fear of dying" and "paresthesia" were relatively high. Several items including "dry mouth" and "fear of losing control" had low discrimination parameters. The item characteristics of diagnostic criteria among help-seeking clinical populations may be different from those that we observed in the general population and deserve further examination. "Paresthesia," "choking" and "fear of dying" can be thought to be good indicators of severe panic attacks, while "palpitation" can discriminate well between cases and non-cases at low level of panic attack severity. Items such as "dry mouth" would contribute less to the discrimination.
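The interpretation of discrimination and difficulty parameters in this abstract can be made concrete with a small sketch of the two-parameter logistic model and its item information function. The three parameter sets below are hypothetical stand-ins for an easy, highly discriminating item, a hard, highly discriminating item, and a weakly discriminating item; they are not the estimates reported in the study.

```python
import math

def irf_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information_2pl(theta, a, b):
    """Fisher information of a 2PL item at trait level theta: a^2 * P * (1 - P)."""
    p = irf_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical (a, b) values, chosen only to illustrate the pattern.
easy_item = (2.0, -1.0)   # high discrimination, low difficulty
hard_item = (2.0, 1.5)    # high discrimination, high difficulty
weak_item = (0.5, 0.0)    # low discrimination

info_low = {name: item_information_2pl(-1.0, *params)
            for name, params in [("easy", easy_item), ("hard", hard_item), ("weak", weak_item)]}
info_high = {name: item_information_2pl(1.5, *params)
             for name, params in [("easy", easy_item), ("hard", hard_item), ("weak", weak_item)]}
```

Under these assumed values the easy item is the most informative at low severity and the hard item at high severity, while the weakly discriminating item contributes little anywhere on the continuum.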
Development of a PROMIS item bank to measure pain interference.

PubMed

Amtmann, Dagmar; Cook, Karon F; Jensen, Mark P; Chen, Wen-Hung; Choi, Seung; Revicki, Dennis; Cella, David; Rothrock, Nan; Keefe, Francis; Callahan, Leigh; Lai, Jin-Shei

2010-07-01

This paper describes the psychometric properties of the PROMIS-pain interference (PROMIS-PI) bank. An initial candidate item pool (n=644) was developed and evaluated based on the review of existing instruments, interviews with patients, and consultation with pain experts. From this pool, a candidate item bank of 56 items was selected and responses to the items were collected from large community and clinical samples. A total of 14,848 participants responded to all or a subset of candidate items. The responses were calibrated using an item response theory (IRT) model. A final 41-item bank was evaluated with respect to IRT assumptions, model fit, differential item functioning (DIF), precision, and construct and concurrent validity. Items of the revised bank had good fit to the IRT model (CFI and NNFI/TLI ranged from 0.974 to 0.997), and the data were strongly unidimensional (e.g., ratio of first and second eigenvalue=35). Nine items exhibited statistically significant DIF. However, adjusting for DIF had little practical impact on score estimates and the items were retained without modifying scoring. Scores provided substantial information across levels of pain; for scores in the T-score range 50-80, the reliability was equivalent to 0.96-0.99. Patterns of correlations with other health outcomes supported the construct validity of the item bank. The scores discriminated among persons with different numbers of chronic conditions, disabling conditions, levels of self-reported health, and pain intensity (p<0.0001). The results indicated that the PROMIS-PI items constitute a psychometrically sound bank. Computerized adaptive testing and short forms are available. Copyright 2010 International Association for the Study of Pain. All rights reserved.
Measuring disability across cultures — the psychometric properties of the WHODAS II in older people from seven low- and middle-income countries. The 10/66 Dementia Research Group population-based survey

PubMed Central

Sousa, Renata M; Dewey, Michael E; Acosta, Daisy; Jotheeswaran, AT; Castro-Costa, Erico; Ferri, Cleusa P; Guerra, Mariella; Huang, Yueqin; Jacob, KS; Pichardo, Juana Guillermina Rodriguez; Ramírez, Nayeli Garcia; Rodriguez, Juan Llibre; Rodriguez, Marina Calvo; Salas, Aquiles; Sosa, Ana Luisa; Williams, Joseph; Prince, Martin J

2010-01-01

We evaluated the psychometric properties of the 12-item interviewer-administered screener version of the World Health Organization Disability Assessment Schedule – version II (WHODAS II) among older people living in seven low- and middle-income countries. Principal component analysis (PCA), confirmatory factor analysis (CFA) and Mokken analyses were carried out to test for unidimensionality, hierarchical structure, and measurement invariance across 10/66 Dementia Research Group sites. PCA generated a one-factor solution in most sites. In CFA, the two-factor solution generated in Dominican Republic fitted better for all sites other than rural China. The two factors were not easily interpretable, and may have been an artefact of differing item difficulties. Strong internal consistency and high factor loadings for the one-factor solution supported unidimensionality. Furthermore, the WHODAS II was found to be a ‘strong’ Mokken scale. Measurement invariance was supported by the similarity of factor loadings across sites, and by the high between-site correlations in item difficulties. The Mokken results strongly support that the WHODAS II 12-item screener is a unidimensional and hierarchical scale conforming to item response theory (IRT) principles, at least at the monotone homogeneity model level. More work is needed to assess the generalizability of our findings to different populations. Copyright © 2010 John Wiley & Sons, Ltd. PMID:20104493
Development and psychometric characteristics of the SCI-QOL Pressure Ulcers scale and short form.

PubMed

Kisala, Pamela A; Tulsky, David S; Choi, Seung W; Kirshblum, Steven C

2015-05-01

To develop a self-reported measure of the subjective impact of pressure ulcers on health-related quality of life (HRQOL) in individuals with spinal cord injury (SCI) as part of the SCI quality of life (SCI-QOL) measurement system. Grounded-theory based qualitative item development methods, large-scale item calibration testing, confirmatory factor analysis (CFA), and item response theory-based psychometric analysis. Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. SCI-QOL Pressure Ulcers scale. 189 individuals with traumatic SCI who experienced a pressure ulcer within the past 7 days completed 30 items related to pressure ulcers. CFA confirmed a unidimensional pool of items. IRT analyses were conducted. A constrained Graded Response Model with a constant slope parameter was used to estimate item thresholds for the 12 retained items. The 12-item SCI-QOL Pressure Ulcers scale is unique in that it is specifically targeted to individuals with spinal cord injury and at every stage of development has included input from individuals with SCI. Furthermore, use of CFA and IRT methods provide flexibility and precision of measurement. The scale may be administered in its entirety or as a 7-item "short form" and is available for both research and clinical practice.
Computer-adaptive test to measure community reintegration of Veterans.

PubMed

Resnik, Linda; Tian, Feng; Ni, Pengsheng; Jette, Alan

2012-01-01

The Community Reintegration of Injured Service Members (CRIS) measure consists of three scales measuring extent of, perceived limitations in, and satisfaction with community reintegration. Length of the CRIS may be a barrier to its widespread use. Using item response theory (IRT) and computer-adaptive test (CAT) methodologies, this study developed and evaluated a briefer community reintegration measure called the CRIS-CAT. Large item banks for each CRIS scale were constructed. A convenience sample of 517 Veterans responded to all items. Exploratory and confirmatory factor analyses (CFAs) were used to identify the dimensionality within each domain, and IRT methods were used to calibrate items. Accuracy and precision of CATs of different lengths were compared with the full-item bank, and data were examined for differential item functioning (DIF). CFAs supported unidimensionality of scales. Acceptable item fit statistics were found for final models. Accuracy of 10-, 15-, 20-, and variable-item CATs for all three scales was 0.88 or above. CAT precision increased with number of items administered and decreased at the upper ranges of each scale. Three items exhibited moderate DIF by sex. The CRIS-CAT demonstrated promising measurement properties and is recommended for use in community reintegration assessment.
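The abstract does not state which item selection rule the CRIS-CAT uses. As a general illustration of how a computer-adaptive test chooses its next item, the sketch below applies maximum-information selection, a common rule, to a small bank of dichotomous 2PL items. The bank, the function names, and the dichotomous item model are all assumptions made for brevity.

```python
import math

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def next_item(theta_hat, bank, administered):
    """Return the index of the unused item that is most informative at theta_hat.

    `bank` is a list of (a, b) pairs and `administered` a set of indices
    already given to the respondent. Both are hypothetical inputs.
    """
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: info_2pl(theta_hat, *bank[i]))

# Hypothetical item bank of (discrimination, difficulty) pairs.
bank = [(1.2, -2.0), (1.2, 0.0), (1.2, 2.0), (0.6, 0.1)]
first = next_item(0.1, bank, administered=set())
second = next_item(0.1, bank, administered={first})
```

In a full CAT the ability estimate would be updated after each response before the next selection. Because few items are highly informative at the extremes of a scale, precision there tends to fall, which is consistent with the reduced precision at the upper ranges reported above.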
Development and psychometric characteristics of the SCI-QOL Pressure Ulcers scale and short form

PubMed Central

Kisala, Pamela A.; Tulsky, David S.; Choi, Seung W.; Kirshblum, Steven C.

2015-01-01

Objective To develop a self-reported measure of the subjective impact of pressure ulcers on health-related quality of life (HRQOL) in individuals with spinal cord injury (SCI) as part of the SCI quality of life (SCI-QOL) measurement system. Design Grounded-theory based qualitative item development methods, large-scale item calibration testing, confirmatory factor analysis (CFA), and item response theory-based psychometric analysis. Setting Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Participants Adults with traumatic SCI. Main Outcome Measures SCI-QOL Pressure Ulcers scale. Results 189 individuals with traumatic SCI who experienced a pressure ulcer within the past 7 days completed 30 items related to pressure ulcers. CFA confirmed a unidimensional pool of items. IRT analyses were conducted. A constrained Graded Response Model with a constant slope parameter was used to estimate item thresholds for the 12 retained items. Conclusions The 12-item SCI-QOL Pressure Ulcers scale is unique in that it is specifically targeted to individuals with spinal cord injury and at every stage of development has included input from individuals with SCI. Furthermore, use of CFA and IRT methods provide flexibility and precision of measurement. The scale may be administered in its entirety or as a 7-item “short form” and is available for both research and clinical practice. PMID:26010965
Item Response Theory Analyses of the Cambridge Face Memory Test (CFMT)

PubMed Central

Cho, Sun-Joo; Wilmer, Jeremy; Herzmann, Grit; McGugin, Rankin; Fiset, Daniel; Van Gulick, Ana E.; Ryan, Katie; Gauthier, Isabel

2014-01-01

We evaluated the psychometric properties of the Cambridge face memory test (CFMT; Duchaine & Nakayama, 2006). First, we assessed the dimensionality of the test with a bi-factor exploratory factor analysis (EFA). This EFA analysis revealed a general factor and three specific factors clustered by targets of CFMT. However, the three specific factors appeared to be minor factors that can be ignored. Second, we fit a unidimensional item response model. This item response model showed that the CFMT items could discriminate individuals at different ability levels and covered a wide range of the ability continuum. We found the CFMT to be particularly precise for a wide range of ability levels. Third, we implemented item response theory (IRT) differential item functioning (DIF) analyses for each gender group and two age groups (Age ≤ 20 versus Age > 21). This DIF analysis suggested little evidence of consequential differential functioning on the CFMT for these groups, supporting the use of the test to compare older to younger, or male to female, individuals. Fourth, we tested for a gender difference on the latent facial recognition ability with an explanatory item response model. We found a significant but small gender difference on the latent ability for face recognition, which was higher for women than men by 0.184, at age mean 23.2, controlling for linear and quadratic age effects. Finally, we discuss the practical considerations of the use of total scores versus IRT scale scores in applications of the CFMT. PMID:25642930
Better assessment of physical function: item improvement is neglected but essential

PubMed Central

2009-01-01

Introduction Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. Methods The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. Results We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. 
IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models having comparable model fits. Correlations between factors in the test data sets were > 0.90. Conclusions Item improvement must underlie attempts to improve outcome assessment. The clear, personally important and relevant, ability-framed items in the PROMIS Physical Function item bank perform well in PRO assessment. They will benefit from further study and application in a wider variety of rheumatic diseases in diverse clinical groups, including those at the extremes of physical functioning, and in different administration modes. PMID:20015354
Better assessment of physical function: item improvement is neglected but essential.

PubMed

Bruce, Bonnie; Fries, James F; Ambrosini, Debbie; Lingala, Bharathi; Gandek, Barbara; Rose, Matthias; Ware, John E

2009-01-01

Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. 
IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models having comparable model fits. Correlations between factors in the test data sets were > 0.90. Item improvement must underlie attempts to improve outcome assessment. The clear, personally important and relevant, ability-framed items in the PROMIS Physical Function item bank perform well in PRO assessment. They will benefit from further study and application in a wider variety of rheumatic diseases in diverse clinical groups, including those at the extremes of physical functioning, and in different administration modes.
Dimensionality of Hallucinogen and Inhalant/Solvent Abuse and Dependence Criteria: Implications for the Diagnostic and Statistical Manual of Mental Disorders – Fifth Edition

PubMed Central

Kerridge, Bradley T.; Saha, Tulshi D.; Smith, Sharon; Chou, Patricia S.; Pickering, Roger P.; Huang, Boji; Ruan, June W.; Pulay, Attila J.

2012-01-01

Background Prior research has demonstrated the dimensionality of Diagnostic and Statistical Manual of Mental Disorders - Fourth Edition (DSM-IV) alcohol, nicotine, cannabis, cocaine and amphetamine abuse and dependence criteria. The purpose of this study was to examine the dimensionality of hallucinogen and inhalant/solvent abuse and dependence criteria. In addition, we assessed the impact of elimination of the legal problems abuse criterion on the information value of the aggregate abuse and dependence criteria, another proposed change for DSM-IV currently lacking empirical justification. Methods Factor analyses and item response theory (IRT) analyses were used to explore the unidimensionality and psychometric properties of hallucinogen and inhalant/solvent abuse and dependence criteria using a large representative sample of the United States (U.S.) general population. Results Hallucinogen and inhalant/solvent abuse and dependence criteria formed unidimensional latent traits. For both substances, IRT models without the legal problems abuse criterion demonstrated better fit than the corresponding model with the legal problem abuse criterion. Further, there were no differences in the information value of the IRT models with and without the legal problems abuse criterion, supporting the elimination of that criterion. No bias in the new diagnoses was observed by sex, age and race-ethnicity. Conclusion Consistent with findings for alcohol, nicotine, cannabis, cocaine and amphetamine abuse and dependence criteria, hallucinogen and inhalant/solvent criteria reflect underlying dimensions of severity. The legal problems criterion associated with each of these substance use disorders can be eliminated with no loss in informational value and an advantage of parsimony.
Taken together, these findings support the changes to substance use disorder diagnoses recommended by the DSM-V Substance and Related Disorders Workgroup, that is, combining DSM-IV abuse and dependence criteria and eliminating the legal problems abuse criterion. PMID:21621334
Dimensionality of hallucinogen and inhalant/solvent abuse and dependence criteria: implications for the Diagnostic and Statistical Manual of Mental Disorders-Fifth Edition.

PubMed

Kerridge, Bradley T; Saha, Tulshi D; Smith, Sharon; Chou, Patricia S; Pickering, Roger P; Huang, Boji; Ruan, June W; Pulay, Attila J

2011-09-01

Prior research has demonstrated the dimensionality of Diagnostic and Statistical Manual of Mental Disorders-Fourth Edition (DSM-IV) alcohol, nicotine, cannabis, cocaine and amphetamine abuse and dependence criteria. The purpose of this study was to examine the dimensionality of hallucinogen and inhalant/solvent abuse and dependence criteria. In addition, we assessed the impact of elimination of the legal problems abuse criterion on the information value of the aggregate abuse and dependence criteria, another proposed change for DSM-IV currently lacking empirical justification. Factor analyses and item response theory (IRT) analyses were used to explore the unidimensionality and psychometric properties of hallucinogen and inhalant/solvent abuse and dependence criteria using a large representative sample of the United States (U.S.) general population. Hallucinogen and inhalant/solvent abuse and dependence criteria formed unidimensional latent traits. For both substances, IRT models without the legal problems abuse criterion demonstrated better fit than the corresponding model with the legal problem abuse criterion. Further, there were no differences in the information value of the IRT models with and without the legal problems abuse criterion, supporting the elimination of that criterion. No bias in the new diagnoses was observed by sex, age and race-ethnicity. Consistent with findings for alcohol, nicotine, cannabis, cocaine and amphetamine abuse and dependence criteria, hallucinogen and inhalant/solvent criteria reflect underlying dimensions of severity. The legal problems criterion associated with each of these substance use disorders can be eliminated with no loss in informational value and an advantage of parsimony.
Taken together, these findings support the changes to substance use disorder diagnoses recommended by the DSM-V Substance and Related Disorders Workgroup, that is, combining DSM-IV abuse and dependence criteria and eliminating the legal problems abuse criterion. Published by Elsevier Ltd.
Combination of classical test theory (CTT) and item response theory (IRT) analysis to study the psychometric properties of the French version of the Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form (Q-LES-Q-SF).

PubMed

Bourion-Bédès, Stéphanie; Schwan, Raymund; Epstein, Jonathan; Laprevote, Vincent; Bédès, Alex; Bonnet, Jean-Louis; Baumann, Cédric

2015-02-01

The study aimed to examine the construct validity and reliability of the Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form (Q-LES-Q-SF) according to both classical test and item response theories. The psychometric properties of the French version of this instrument were investigated in a cross-sectional, multicenter study. A total of 124 outpatients with a substance dependence diagnosis participated in the study. Psychometric evaluation included descriptive analysis, internal consistency, test-retest reliability, and validity. The dimensionality of the instrument was explored using a combination of the classical test, confirmatory factor analysis (CFA), and an item response theory analysis, the Person Separation Index (PSI), in a complementary manner. The results of the Q-LES-Q-SF revealed that the questionnaire was easy to administer and the acceptability was good. The internal consistency and the test-retest reliability were 0.9 and 0.88, respectively. All items were significantly correlated with the total score and the SF-12 used in the study. The CFA with one factor model was good, and for the unidimensional construct, the PSI was found to be 0.902. The French version of the Q-LES-Q-SF yielded valid and reliable clinical assessments of the quality of life for future research and clinical practice involving French substance abusers. In response to recent questioning regarding the unidimensionality or bidimensionality of the instrument and according to the underlying theoretical unidimensional construct used for its development, this study suggests the Q-LES-Q-SF as a one-dimension questionnaire in French QoL studies.
The psychometric properties of the "Reading the Mind in the Eyes" Test: an item response theory (IRT) analysis.

PubMed

Preti, Antonio; Vellante, Marcello; Petretto, Donatella R

2017-05-01

The "Reading the Mind in the Eyes" Test (hereafter: Eyes Test) is considered an advanced task of the Theory of Mind aimed at assessing the performance of the participant in perspective-taking, that is, the ability to sense or understand other people's cognitive and emotional states. In this study, the item response theory analysis was applied to the adult version of the Eyes Test. The Italian version of the Eyes Test was administered to 200 undergraduate students of both genders (males = 46%). Modified parallel analysis (MPA) was used to test unidimensionality. Marginal maximum likelihood estimation was used to fit the 1-, 2-, and 3-parameter logistic (PL) model to the data. Differential Item Functioning (DIF) due to gender was explored with five independent methods. MPA provided evidence in favour of unidimensionality. The Rasch model (1-PL) was superior to the other two models in explaining participants' responses to the Eyes Test. There was no robust evidence of gender-related DIF in the Eyes Test, although some differences may exist for some items as a reflection of real differences by group. The study results support a one-factor model of the Eyes Test. Performance on the Eyes Test is defined by the participant's ability in perspective-taking. Researchers should cease using arbitrarily selected subscores in assessing the performance of participants to the Eyes Test. Lack of gender-related DIF favours the use of the Eyes Test in the investigation of gender differences concerning empathy and social cognition.
Assessment of fatigue in rheumatoid arthritis: a psychometric comparison of single-item, multiitem, and multidimensional measures.

PubMed

Oude Voshaar, Martijn A H; Ten Klooster, Peter M; Bode, Christina; Vonkeman, Harald E; Glas, Cees A W; Jansen, Tim; van Albada-Kuipers, Iet; van Riel, Piet L C M; van de Laar, Mart A F J

2015-03-01

To compare the psychometric functioning of multidimensional disease-specific, multiitem generic, and single-item measures of fatigue in patients with rheumatoid arthritis (RA). Confirmatory factor analysis (CFA) and longitudinal item response theory (IRT) modeling were used to evaluate the measurement structure and local reliability of the Bristol RA Fatigue Multi-Dimensional Questionnaire (BRAF-MDQ), the Medical Outcomes Study Short Form-36 (SF-36) vitality scale, and the BRAF Numerical Rating Scales (BRAF-NRS) in a sample of 588 patients with RA. A 1-factor CFA model yielded a similar fit to a 5-factor model with subscale-specific dimensions, and the items from the different instruments adequately fit the IRT model, suggesting essential unidimensionality in measurement. The SF-36 vitality scale outperformed the BRAF-MDQ at lower levels of fatigue, but was less precise at moderate to higher levels of fatigue. At these levels of fatigue, the living, cognition, and emotion subscales of the BRAF-MDQ provide additional precision. The BRAF-NRS showed a limited measurement range with its highest precision centered on average levels of fatigue. The different instruments appear to access a common underlying domain of fatigue severity, but differ considerably in their measurement precision along the continuum. The SF-36 vitality scale can be used to measure fatigue severity in samples with relatively mild fatigue. For samples expected to have higher levels of fatigue, the multidimensional BRAF-MDQ appears to be a better choice. The BRAF-NRS are not recommended if precise assessment is required, for instance in longitudinal settings.
Development of a Computer-Adaptive Physical Function Instrument for Social Security Administration Disability Determination

PubMed Central

Ni, Pengsheng; McDonough, Christine M.; Jette, Alan M.; Bogusz, Kara; Marfeo, Elizabeth E.; Rasch, Elizabeth K.; Brandt, Diane E.; Meterko, Mark; Chan, Leighton

2014-01-01

Objectives To develop and test an instrument to assess physical function (PF) for Social Security Administration (SSA) disability programs, the SSA-PF. Item Response Theory (IRT) analyses were used to 1) create a calibrated item bank for each of the factors identified in prior factor analyses, 2) assess the fit of the items within each scale, 3) develop separate Computer-Adaptive Test (CAT) instruments for each scale, and 4) conduct initial psychometric testing. Design Cross-sectional data collection; IRT analyses; CAT simulation. Setting Telephone and internet survey. Participants Two samples: 1,017 SSA claimants, and 999 adults from the US general population. Interventions None. Main Outcome Measure Model fit statistics, correlation and reliability coefficients. Results IRT analyses resulted in five unidimensional SSA-PF scales: Changing & Maintaining Body Position, Whole Body Mobility, Upper Body Function, Upper Extremity Fine Motor, and Wheelchair Mobility for a total of 102 items. High CAT accuracy was demonstrated by strong correlations between simulated CAT scores and those from the full item banks. Comparing the simulated CATs to the full item banks, very little loss of reliability or precision was noted, except at the lower and upper ranges of each scale. No difference in response patterns by age or sex was noted. The distributions of claimant scores were shifted to the lower end of each scale compared to those of a sample of US adults. Conclusions The SSA-PF instrument contributes important new methodology for measuring the physical function of adults applying to the SSA disability programs. Initial evaluation revealed that the SSA-PF instrument achieved considerable breadth of coverage in each content domain and demonstrated noteworthy psychometric properties. PMID:23578594
Development of a computer-adaptive physical function instrument for Social Security Administration disability determination.

PubMed

Ni, Pengsheng; McDonough, Christine M; Jette, Alan M; Bogusz, Kara; Marfeo, Elizabeth E; Rasch, Elizabeth K; Brandt, Diane E; Meterko, Mark; Haley, Stephen M; Chan, Leighton

2013-09-01

To develop and test an instrument to assess physical function for Social Security Administration (SSA) disability programs, the SSA-Physical Function (SSA-PF) instrument. Item response theory (IRT) analyses were used to (1) create a calibrated item bank for each of the factors identified in prior factor analyses, (2) assess the fit of the items within each scale, (3) develop separate computer-adaptive testing (CAT) instruments for each scale, and (4) conduct initial psychometric testing. Cross-sectional data collection; IRT analyses; CAT simulation. Telephone and Internet survey. Two samples: SSA claimants (n=1017) and adults from the U.S. general population (n=999). None. Model fit statistics, correlation, and reliability coefficients. IRT analyses resulted in 5 unidimensional SSA-PF scales: Changing & Maintaining Body Position, Whole Body Mobility, Upper Body Function, Upper Extremity Fine Motor, and Wheelchair Mobility for a total of 102 items. High CAT accuracy was demonstrated by strong correlations between simulated CAT scores and those from the full item banks. On comparing the simulated CATs with the full item banks, very little loss of reliability or precision was noted, except at the lower and upper ranges of each scale. No difference in response patterns by age or sex was noted. The distributions of claimant scores were shifted to the lower end of each scale compared with those of a sample of U.S. adults. The SSA-PF instrument contributes important new methodology for measuring the physical function of adults applying to the SSA disability programs. Initial evaluation revealed that the SSA-PF instrument achieved considerable breadth of coverage in each content domain and demonstrated noteworthy psychometric properties. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
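The computer-adaptive testing step described above can be sketched as follows. This is a simplified illustration, not the SSA-PF algorithm: it assumes dichotomous items under a 2-PL model (the actual banks may use polytomous models), a hypothetical three-item bank, and the common rule of administering the unused item with maximum Fisher information at the current ability estimate.

```python
import math

def p2pl(theta, a, b):
    """2-PL probability of endorsing an item (assumed model for this sketch)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2-PL item at ability theta: a^2 * P * (1 - P)."""
    p = p2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta, bank, administered):
    """Index of the not-yet-administered item that is most informative at theta."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))

# Hypothetical bank of (discrimination, difficulty) pairs.
bank = [(1.0, -2.0), (1.5, 0.0), (1.2, 2.0)]
first = select_next_item(0.0, bank, administered=set())
second = select_next_item(2.0, bank, administered={first})
```

Because each item is most informative near its own difficulty, a bank with few items at the extremes yields less precise scores there, which is consistent with the loss of precision the abstract reports at the lower and upper ranges of each scale.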
A dimensional approach to understanding severity estimates and risk correlates of marijuana abuse and dependence in adults

PubMed Central

WU, LI-TZY; WOODY, GEORGE E.; YANG, CHONGMING; PAN, JENG-JONG; REEVE, BRYCE B.; BLAZER, DAN G.

2012-01-01

While item response theory (IRT) research shows a latent severity trait underlying response patterns of substance abuse and dependence symptoms, little is known about IRT-based severity estimates in relation to clinically relevant measures. In response to increased prevalences of marijuana-related treatment admissions, an elevated level of marijuana potency, and the debate on medical marijuana use, we applied dimensional approaches to understand IRT-based severity estimates for marijuana use disorders (MUDs) and their correlates while simultaneously considering gender- and race/ethnicity-related differential item functioning (DIF). Using adult data from the 2008 National Survey on Drug Use and Health (N=37,897), DSM-IV criteria for MUDs among past-year marijuana users were examined by IRT, logistic regression, and multiple indicators–multiple causes (MIMIC) approaches. Among 6,917 marijuana users, 15% met criteria for a MUD; another 24% exhibited subthreshold dependence. Abuse criteria were highly correlated with dependence criteria (correlation=0.90), indicating unidimensionality; item information curves revealed redundancy in multiple criteria. MIMIC analyses showed that MUD criteria were positively associated with weekly marijuana use, early marijuana use, other substance use disorders, substance abuse treatment, and serious psychological distress. African Americans and Hispanics showed higher levels of MUDs than whites, even after adjusting for race/ethnicity-related DIF. The redundancy in multiple criteria suggests an opportunity to improve efficiency in measuring symptom-level manifestations by removing low-informative criteria. Elevated rates of MUDs among African Americans and Hispanics require research to elucidate risk factors and improve assessments of MUDs for different racial/ethnic groups. PMID:22351489
Lawton IADL scale in dementia: can item response theory make it more informative?

PubMed

McGrory, Sarah; Shenkin, Susan D; Austin, Elizabeth J; Starr, John M

2014-07-01

Impairment of functional abilities represents a crucial component of dementia diagnosis. Current functional measures rely on the traditional aggregate method of summing raw scores. While this summary score provides a quick representation of a person's ability, it disregards useful information on the item level. The aim was to use item response theory (IRT) methods to increase the interpretive power of the Lawton Instrumental Activities of Daily Living (IADL) scale by establishing a hierarchy of item 'difficulty' and 'discrimination'. This cross-sectional study applied IRT methods to the analysis of IADL outcomes. Participants were 202 members of the Scottish Dementia Research Interest Register (mean age = 76.39, range = 56-93, SD = 7.89 years) with complete itemised data available. A Mokken scale with good reliability (Molenaar-Sijtsma statistic 0.79) was obtained, satisfying the IRT assumption that the items comprise a single unidimensional scale. The eight items in the scale could be placed on a hierarchy of 'difficulty' (H coefficient = 0.55), with 'Shopping' being the most 'difficult' item and 'Telephone use' being the least 'difficult' item. 'Shopping' was the most discriminating item, differentiating well between patients of different levels of ability. IRT methods are capable of providing more information about functional impairment than a summed score. 'Shopping' and 'Telephone use' were identified as items that reveal key information about a patient's level of ability, and could be useful screening questions for clinicians. © The Author 2013. Published by Oxford University Press on behalf of the British Geriatrics Society. All rights reserved.
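The H coefficient reported above is Loevinger's scalability coefficient, the central statistic of Mokken scaling. As a rough sketch, assuming dichotomous 0/1 item scores (the scoring used in the study may differ) and a hypothetical function name, the scale-level H can be computed as the sum of observed inter-item covariances divided by the sum of the maximum covariances attainable given the item marginals.

```python
from itertools import combinations

def loevinger_h(data):
    """Scale-level Loevinger H for dichotomous (0/1) item scores.

    data: list of respondent rows, each a list of 0/1 item scores.
    Assumes no item is answered identically by every respondent,
    otherwise the denominator can be zero.
    """
    n = len(data)
    k = len(data[0])
    p = [sum(row[j] for row in data) / n for j in range(k)]  # item popularities
    cov_sum = 0.0
    cov_max_sum = 0.0
    for i, j in combinations(range(k), 2):
        p11 = sum(row[i] * row[j] for row in data) / n  # joint endorsement rate
        cov_sum += p11 - p[i] * p[j]
        # Largest possible joint rate given the marginals is min(p_i, p_j).
        cov_max_sum += min(p[i], p[j]) - p[i] * p[j]
    return cov_sum / cov_max_sum

# A perfect Guttman pattern (hypothetical data) has no scaling errors.
guttman = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
h_perfect = loevinger_h(guttman)
# Adding one respondent who passes only the hardest item lowers H.
h_with_error = loevinger_h(guttman + [[0, 0, 1]])
```

By a commonly cited convention, H of at least 0.5 is read as a strong scale, which matches the interpretation given to the reported value of 0.55.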
Psychometric properties of the SDM-Q-9 questionnaire for shared decision-making in multiple sclerosis: item response theory modelling and confirmatory factor analysis.

PubMed

Ballesteros, Javier; Moral, Ester; Brieva, Luis; Ruiz-Beato, Elena; Prefasi, Daniel; Maurino, Jorge

2017-04-22

Shared decision-making is a cornerstone of patient-centred care. The 9-item Shared Decision-Making Questionnaire (SDM-Q-9) is a brief self-assessment tool for measuring patients' perceived level of involvement in decision-making related to their own treatment and care. Information related to the psychometric properties of the SDM-Q-9 for multiple sclerosis (MS) patients is limited. The objective of this study was to assess the performance of the items composing the SDM-Q-9 and its dimensional structure in patients with relapsing-remitting MS. A non-interventional, cross-sectional study in adult patients with relapsing-remitting MS was conducted in 17 MS units throughout Spain. A nonparametric item response theory (IRT) analysis was used to assess the latent construct and dimensional structure underlying the observed responses. A parametric IRT model, the Generalized Partial Credit Model, was fitted to obtain estimates of the relationship between the latent construct and item characteristics. The unidimensionality of the SDM-Q-9 instrument was assessed by confirmatory factor analysis. A total of 221 patients were studied (mean age = 42.1 ± 9.9 years, 68.3% female). Median Expanded Disability Status Scale score was 2.5 ± 1.5. Most patients reported taking part in each step of the decision-making process. Internal reliability of the instrument was high (Cronbach's α = 0.91) and the overall scale scalability score was 0.57, indicative of a strong scale. All items, except for item 1, showed scalability indices higher than 0.30. Four items (items 6 through 9) conveyed more than half of the SDM-Q-9 overall information (67.3%). The SDM-Q-9 was a good fit for a unidimensional latent structure (comparative fit index = 0.98, root-mean-square error of approximation = 0.07). All freely estimated parameters were statistically significant (P < 0.001). All items presented standardized parameter estimates with salient loadings (>0.40) with the exception of item 1, which presented the lowest loading (0.26). Items 6 through 8 were the most relevant items for shared decision-making. The SDM-Q-9 presents appropriate psychometric properties and is therefore useful for assessing different aspects of shared decision-making in patients with multiple sclerosis.
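The internal reliability figure quoted above is Cronbach's α. A minimal sketch of the standard formula follows, assuming complete data arranged as one row of item scores per respondent; the function name and the small data sets are hypothetical and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance).

    scores: list of respondent rows, each a list of k numeric item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)

# Hypothetical data: three perfectly consistent items, then three noisier ones.
alpha_consistent = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]])
alpha_noisy = cronbach_alpha([[1, 2, 1], [2, 1, 3], [3, 4, 2], [4, 3, 4]])
```

Unlike the IRT-based information functions mentioned in the abstract, α is a single summary for the whole scale and does not show where along the latent continuum measurement is most precise.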
Making sense of theory of mind and paranoia: the psychometric properties and reasoning requirements of a false belief sequencing task.

PubMed

Corcoran, Rhiannon; Bentall, Richard P; Rowse, Georgina; Moore, Rosanne; Cummins, Sinead; Blackwood, Nigel; Howard, Robert; Shryane, Nick M

2011-11-01

INTRODUCTION. This study used Item-Response Theory (IRT) to model the psychometric properties of a false belief picture sequencing task. Consistent with the mental time travel hypothesis of paranoia, we anticipated that performance on this deductive theory of mind (ToM) task would not be associated with the presence of persecutory delusions but would be related to other clinical, cognitive, and demographic factors. METHOD. A large (N=237) and diverse clinical and nonclinical sample differing in levels of depression and paranoid ideation performed 2 ToM tasks: the false belief sequencing task and a ToM stories task that was used to assess the validity of the false belief sequencing task as a measure of ToM. RESULTS. A unidimensional IRT model was found to fit the data well. Latent ToM ability as measured by the false belief sequencing task was negatively related to age and positively related to IQ. In contrast to the ToM stories measure, there was no association between clinical diagnosis or symptoms and false belief picture sequencing after controlling for age and IQ. CONCLUSIONS. In line with the mental time travel hypothesis of paranoia (Corcoran, 2010), performance on this deductive nonverbal ToM task is not related to the presence of paranoid symptoms. This measure is best suited for assessing ToM functioning where participants' performance falls just short of the average latent ToM ability. Furthermore, it is sensitive to the effects of increasing age and decreasing IQ.

Measuring grief and loss after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Grief and Loss item bank and short form

PubMed Central

Kalpakjian, Claire Z.; Tulsky, David S.; Kisala, Pamela A.; Bombardier, Charles H.

2015-01-01

Objective To develop an item response theory (IRT) calibrated Grief and Loss item bank as part of the Spinal Cord Injury – Quality of Life (SCI-QOL) measurement system. Design A literature review guided framework development of grief/loss. New items were created from focus groups. Items were revised based on expert review and patient feedback and were then field tested. Analyses included confirmatory factor analysis (CFA), graded response IRT modeling and evaluation of differential item functioning (DIF). Setting We tested a 20-item pool at several rehabilitation centers across the United States, including the University of Michigan, Kessler Foundation, Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Department of Veterans Affairs hospital. Participants A total of 717 individuals with SCI answered the grief and loss questions. Results The final calibrated item bank resulted in 17 retained items. A unidimensional model was observed (CFI = 0.976; RMSEA = 0.078) and measurement precision was good (theta range −1.48 to 2.48). Ten items were flagged for DIF; however, examination of effect sizes showed the DIF to be negligible, with little practical impact on score estimates. Conclusions This study indicates that the SCI-QOL Grief and Loss item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available. PMID:26010969
The Classic Measure of Disability in Activities of Daily Living Is Biased by Age but an Expanded IADL/ADL Measure Is Not

PubMed Central

2010-01-01

Objectives. To evaluate, by age, the performance of 2 disability measures based on needing help: one using 5 classic activities of daily living (ADL) and another using an expanded set of 14 activities including instrumental activities of daily living (IADL), walking, getting outside, and ADL (IADL/ADL). Methods. Guttman and item response theory (IRT) scaling methods are used with a large (N = 25,470) nationally representative household survey of individuals aged 18 years and older. Results. Guttman scalability of the ADL items increases steadily with age, reaching a high level at ages 75 years and older. That is reflected in an IRT model by age-related differential item functioning (DIF) resulting in age-biased measurement of ADL. Guttman scalability of the IADL/ADL items also increases with age but is lower than the ADL. Although age-related DIF also occurs with IADL/ADL items, DIF is lower in magnitude and balances out without causing age bias. Discussion. An IADL/ADL scale measuring need for help is hierarchical, unidimensional, and unbiased by age. It has greater content validity for measuring need for help in the community and shows greater sensitivity by age than the classic ADL measure. As demand for community services is increasing among adults of all ages, an expanded IADL/ADL measure is more useful than ADL. PMID:20100786
A psychometric evaluation of the clinician-rated Quick Inventory of Depressive Symptomatology (QIDS-C16) in patients with bipolar disorder.

PubMed

Bernstein, Ira H; Rush, A John; Suppes, Trisha; Trivedi, Madhukar H; Woo, Ada; Kyutoku, Yasushi; Crismon, M Lynn; Dennehy, Ellen; Carmody, Thomas J

2009-06-01

The clinician-rated, 16-item Quick Inventory of Depressive Symptomatology (QIDS-C16) has been extensively evaluated in patients with major depressive disorder (MDD). This report assesses the psychometric properties of the QIDS-C16 in outpatients with bipolar disorder (BD, N = 405) and MDD (N = 547) and in bipolar patients in the depressed phase only (BD-D) (N = 99) enrolled in the Texas Medication Algorithm Project (TMAP) using classical test theory (CTT) and the Samejima graded item response theory (IRT) model. Values of coefficient alpha were very similar in BD, MDD, and BD-D groups at baseline (alpha = 0.80-0.81) and at exit (alpha = 0.82-0.85). The QIDS-C16 was unidimensional for all three groups. MDD and BD-D patients (n = 99) had comparable symptom levels. The BD-D patients (n = 99) had the most, and bipolar patients in the manic phase had the least depressive symptoms at baseline. IRT analyses indicated that the QIDS-C16 was most sensitive to the measurement of depression for both MDD patients and for BD-D patients in the average range. The QIDS-C16 is suitable for use with patients with BD and can be used as an outcome measure in trials enrolling both BD and MDD patients. John Wiley & Sons, Ltd
Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients.

PubMed

Dirven, Linda; Groenvold, Mogens; Taphoorn, Martin J B; Conroy, Thierry; Tomaszewski, Krzysztof A; Young, Teresa; Petersen, Morten Aa

2017-11-01

The European Organisation of Research and Treatment of Cancer (EORTC) Quality of Life Group is developing computerized adaptive testing (CAT) versions of all EORTC Quality of Life Questionnaire (QLQ-C30) scales with the aim of enhancing measurement precision. Here we present the results of the field testing and psychometric evaluation of the item bank for cognitive functioning (CF). In previous phases (I-III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients. This evaluation included an assessment of dimensionality, fit to the item response theory (IRT) model, differential item functioning (DIF), and measurement properties. A total of 1030 cancer patients completed the 44 candidate items on CF. Of these, 34 items could be included in a unidimensional IRT model, showing an acceptable fit. Although several items showed DIF, these had a negligible impact on CF estimation. Measurement precision of the item bank was much higher than that of the two original QLQ-C30 CF items alone, across the whole continuum. Moreover, CAT measurement may on average reduce study sample sizes by about 35-40% compared to the original QLQ-C30 CF scale, without loss of power. A CF item bank for CAT measurement consisting of 34 items was established, applicable to various cancer patients across countries. This CAT measurement system will facilitate precise and efficient assessment of HRQOL of cancer patients, without loss of comparability of results.
Evaluation of the Patient-Reported Outcomes Information System (PROMIS®) Spanish-language physical functioning items.

PubMed

Paz, Sylvia H; Spritzer, Karen L; Morales, Leo S; Hays, Ron D

2013-09-01

To evaluate the equivalence of the PROMIS® physical functioning item bank by language of administration (English versus Spanish). The PROMIS® wave 1 English-language physical functioning bank consists of 124 items, and 114 of these were translated into Spanish. Item frequencies, means and standard deviations, item-scale correlations, and internal consistency reliability were calculated. The IRT assumption of unidimensionality was evaluated by fitting a single-factor confirmatory factor analytic model. IRT threshold and discrimination parameters were estimated using Samejima's Graded Response Model. DIF by language of administration was evaluated. Item means ranged from 2.53 (SD = 1.36) to 4.62 (SD = 0.82). Coefficient alpha was 0.99, and item-rest correlations ranged from 0.41 to 0.89. A one-factor model fit the data well (CFI = 0.971, TLI = 0.970, and RMSEA = 0.052). The slope parameters ranged from 0.45 ("Are you able to run 10 miles?") to 4.50 ("Are you able to put on a shirt or blouse?"). The threshold parameters ranged from -1.92 ("How much do physical health problems now limit your usual physical activities (such as walking or climbing stairs)?") to 6.06 ("Are you able to run 10 miles?"). Fifty of the 114 items were flagged for DIF based on a criterion of R² of 0.02 or above. The expected total score was higher for Spanish- than English-language respondents. English- and Spanish-speaking subjects with the same level of underlying physical function responded differently to 50 of 114 items. This study has important implications in the study of physical functioning among diverse populations.
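To make the slope and threshold parameters mentioned above concrete, the following sketch computes response-category probabilities under Samejima's Graded Response Model. It is an illustration only: the function name is hypothetical, and the slope and threshold values are invented for the example rather than taken from the PROMIS calibration.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the Graded Response Model.

    theta: latent trait level, a: slope (discrimination),
    thresholds: increasing list of m threshold parameters for m + 1 categories.
    Each cumulative curve P(X >= k) is a 2-PL logistic; a category
    probability is the difference between adjacent cumulative curves.
    """
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical five-category item with a steep slope.
steep = grm_category_probs(3.0, a=4.5, thresholds=[-2.0, -1.0, 0.0, 1.0])
# Hypothetical item with a shallow slope and the same thresholds.
shallow = grm_category_probs(3.0, a=0.45, thresholds=[-2.0, -1.0, 0.0, 1.0])
```

With a steep slope, nearly all of the probability falls in a single category for a respondent well above the highest threshold, whereas a shallow slope spreads probability across categories, so the item distinguishes less sharply between trait levels.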
The ABC’s of Suicide Risk Assessment: Applying a Tripartite Approach to Individual Evaluations

PubMed Central

Harris, Keith M.; Syu, Jia-Jia; Lello, Owen D.; Chew, Y. L. Eileen; Willcox, Christopher H.; Ho, Roger H. M.

2015-01-01

There is considerable need for accurate suicide risk assessment for clinical, screening, and research purposes. This study applied the tripartite affect-behavior-cognition theory, the suicidal barometer model, classical test theory, and item response theory (IRT), to develop a brief self-report measure of suicide risk that is theoretically-grounded, reliable and valid. An initial survey (n = 359) applied an iterative process to an item pool, resulting in the six-item Suicidal Affect-Behavior-Cognition Scale (SABCS). Three additional studies tested the SABCS and a highly endorsed comparison measure. Studies included two online surveys (Ns = 1007 and 713), and one prospective clinical survey (n = 72; Time 2, n = 54). Factor analyses demonstrated SABCS construct validity through unidimensionality. Internal reliability was high (α = .86-.93, split-half = .90-.94). The scale was predictive of future suicidal behaviors and suicidality (r = .68, .73, respectively), showed convergent validity, and the SABCS-4 demonstrated clinically relevant sensitivity to change. IRT analyses revealed the SABCS captured more information than the comparison measure, and better defined participants at low, moderate, and high risk. The SABCS is the first suicide risk measure to demonstrate no differential item functioning by sex, age, or ethnicity. In all comparisons, the SABCS showed incremental improvements over a highly endorsed scale through stronger predictive ability, reliability, and other properties. The SABCS is in the public domain, with this publication, and is suitable for clinical evaluations, public screening, and research. PMID:26030590
Assessing psychological well-being: self-report instruments for the NIH Toolbox.

PubMed

Salsman, John M; Lai, Jin-Shei; Hendrie, Hugh C; Butt, Zeeshan; Zill, Nicholas; Pilkonis, Paul A; Peterson, Christopher; Stoney, Catherine M; Brouwers, Pim; Cella, David

2014-02-01

Psychological well-being (PWB) has a significant relationship with physical and mental health. As a part of the NIH Toolbox for the Assessment of Neurological and Behavioral Function, we developed self-report item banks and short forms to assess PWB. Expert feedback and literature review informed the selection of PWB concepts and the development of item pools for positive affect, life satisfaction, and meaning and purpose. Items were tested with a community-dwelling US Internet panel sample of adults aged 18 and above (N = 552). Classical and item response theory (IRT) approaches were used to evaluate unidimensionality, fit of items to the overall measure, and calibrations of those items, including differential item function (DIF). IRT-calibrated item banks were produced for positive affect (34 items), life satisfaction (16 items), and meaning and purpose (18 items). Their psychometric properties were supported based on the results of factor analysis, fit statistics, and DIF evaluation. All banks measured the concepts precisely (reliability ≥0.90) for more than 98% of participants. These adult scales and item banks for PWB provide the flexibility, efficiency, and precision necessary to promote future epidemiological, observational, and intervention research on the relationship of PWB with physical and mental health.
Caffeine use disorder: An item-response theory analysis of proposed DSM-5 criteria.

PubMed

Ágoston, Csilla; Urbán, Róbert; Richman, Mara J; Demetrovics, Zsolt

2018-06-01

Caffeine is a common psychoactive substance with a documented addictive potential. Caffeine withdrawal has been included in the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), but caffeine use disorder (CUD) is considered to be a condition for further study. The aim of the current study is (1) to test the psychometric properties of the Caffeine Use Disorder Questionnaire (CUDQ) by using a confirmatory factor analysis and an item response theory (IRT) approach, (2) to compare IRT models with varying numbers of parameters and models with or without caffeine consumption criteria, and (3) to examine if the total daily caffeine consumption and the use of different caffeinated products can predict the magnitude of CUD symptomatology. A cross-sectional study was conducted on an adult sample (N = 2259). Participants answered several questions regarding their caffeine consumption habits and completed the CUDQ, which incorporates the nine proposed criteria of the DSM-5 as well as one additional item regarding the suffering caused by the symptoms. Factor analyses demonstrated the unidimensionality of the CUDQ. The suffering criterion had the highest discriminative value at higher levels of the latent trait. The criteria of failure to fulfill obligations and social/interpersonal problems discriminated only at higher values of the CUD latent factor, while the criteria of consuming more caffeine or for longer than intended and of craving were discriminative at a lower level of CUD. Total daily caffeine intake was related to a higher level of CUD. Daily coffee, energy drink, and cola intake as dummy variables were associated with the presence of more CUD symptoms, while daily tea consumption as a dummy variable was related to fewer CUD symptoms. Regular smoking was associated with more CUD symptoms, which was explained by a larger caffeine consumption. The IRT approach helped to determine which CUD symptoms indicate more severity and have a greater discriminative value. The level of CUD is influenced by the type and quantity of caffeine consumption. Copyright © 2018 Elsevier Ltd. All rights reserved.
Development and psychometric characteristics of the SCI-QOL Bladder Management Difficulties and Bowel Management Difficulties item banks and short forms and the SCI-QOL Bladder Complications scale.

PubMed

Tulsky, David S; Kisala, Pamela A; Tate, Denise G; Spungen, Ann M; Kirshblum, Steven C

2015-05-01

To describe the development and psychometric properties of the Spinal Cord Injury – Quality of Life (SCI-QOL) Bladder Management Difficulties and Bowel Management Difficulties item banks and Bladder Complications scale. Using a mixed-methods design, a pool of items assessing bladder and bowel-related concerns was developed using focus groups with individuals with spinal cord injury (SCI) and SCI clinicians, cognitive interviews, and item response theory (IRT) analytic approaches, including tests of model fit and differential item functioning. Thirty-eight bladder items and 52 bowel items were tested at the University of Michigan, Kessler Foundation Research Center, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters VA Medical Center, Bronx, NY. Seven hundred fifty-seven adults with traumatic SCI. The final item banks demonstrated unidimensionality (Bladder Management Difficulties CFI=0.965; RMSEA=0.093; Bowel Management Difficulties CFI=0.955; RMSEA=0.078) and acceptable fit to a graded response IRT model. The final calibrated Bladder Management Difficulties bank includes 15 items, and the final Bowel Management Difficulties item bank consists of 26 items. Additionally, 5 items related to urinary tract infections (UTI) did not fit with the larger Bladder Management Difficulties item bank but performed relatively well independently (CFI=0.992, RMSEA=0.050) and were thus retained as a separate scale. The SCI-QOL Bladder Management Difficulties and Bowel Management Difficulties item banks are psychometrically robust and are available as computer adaptive tests or short forms. The SCI-QOL Bladder Complications scale is a brief, fixed-length outcomes instrument for individuals with a UTI.
Measuring depression after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Depression item bank and linkage with PHQ-9.

PubMed

Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Bombardier, Charles H; Pohlig, Ryan T; Heinemann, Allen W; Carle, Adam; Choi, Seung W

2015-05-01

To develop a calibrated spinal cord injury-quality of life (SCI-QOL) item bank, computer adaptive test (CAT), and short form to assess depressive symptoms experienced by individuals with SCI, transform scores to the Patient Reported Outcomes Measurement Information System (PROMIS) metric, and create a crosswalk to the Patient Health Questionnaire (PHQ)-9. We used grounded-theory based qualitative item development methods, large-scale item calibration field testing, confirmatory factor analysis, item response theory (IRT) analyses, and statistical linking techniques to transform scores to a PROMIS metric and to provide a crosswalk with the PHQ-9. Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Adults with traumatic SCI. Spinal Cord Injury – Quality of Life (SCI-QOL) Depression Item Bank. Individuals with SCI were involved in all phases of SCI-QOL development. A sample of 716 individuals with traumatic SCI completed 35 items assessing depression, 18 of which were PROMIS items. After removing 7 non-PROMIS items, factor analyses confirmed a unidimensional pool of items. We used a graded response IRT model to estimate slopes and thresholds for the 28 retained items. The SCI-QOL Depression measure correlated 0.76 with the PHQ-9. The SCI-QOL Depression item bank provides a reliable and sensitive measure of depressive symptoms with scores reported in terms of general population norms. We provide a crosswalk to the PHQ-9 to facilitate comparisons between measures. The item bank may be administered as a CAT or as a short form and is suitable for research and clinical applications.
Methodology for the development and calibration of the SCI-QOL item banks

PubMed Central

Tulsky, David S.; Kisala, Pamela A.; Victorson, David; Choi, Seung W.; Gershon, Richard; Heinemann, Allen W.; Cella, David

2015-01-01

Objective To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Methods Individual interviews (n = 44) and focus groups (n = 65 individuals with SCI and n = 42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n = 877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n = 245) to assess test-retest reliability and stability. Participants and Procedures A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. Results We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury – Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. Conclusions The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment Center℠. PMID:26010963
Methodology for the development and calibration of the SCI-QOL item banks.

PubMed

Tulsky, David S; Kisala, Pamela A; Victorson, David; Choi, Seung W; Gershon, Richard; Heinemann, Allen W; Cella, David

2015-05-01

To develop a comprehensive, psychometrically sound, and conceptually grounded patient reported outcomes (PRO) measurement system for individuals with spinal cord injury (SCI). Individual interviews (n=44) and focus groups (n=65 individuals with SCI and n=42 SCI clinicians) were used to select key domains for inclusion and to develop PRO items. Verbatim items from other cutting-edge measurement systems (i.e. PROMIS, Neuro-QOL) were included to facilitate linkage and cross-population comparison. Items were field tested in a large sample of individuals with traumatic SCI (n=877). Dimensionality was assessed with confirmatory factor analysis. Local item dependence and differential item functioning were assessed, and items were calibrated using the item response theory (IRT) graded response model. Finally, computer adaptive tests (CATs) and short forms were administered in a new sample (n=245) to assess test-retest reliability and stability. A calibration sample of 877 individuals with traumatic SCI across five SCI Model Systems sites and one Department of Veterans Affairs medical center completed SCI-QOL items in interview format. We developed 14 unidimensional calibrated item banks and 3 calibrated scales across physical, emotional, and social health domains. When combined with the five Spinal Cord Injury – Functional Index physical function banks, the final SCI-QOL system consists of 22 IRT-calibrated item banks/scales. Item banks may be administered as CATs or short forms. Scales may be administered in a fixed-length format only. The SCI-QOL measurement system provides SCI researchers and clinicians with a comprehensive, relevant and psychometrically robust system for measurement of physical-medical, physical-functional, emotional, and social outcomes. All SCI-QOL instruments are freely available on Assessment Center℠.
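The graded response model named in the record above assigns each ordered response category a probability by differencing cumulative logistic curves. The following is a minimal sketch of that textbook form, not the SCI-QOL calibration code; the function name `grm_category_probs` and all parameter values are hypothetical and chosen only for illustration.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under Samejima's graded response model.

    theta: latent trait value; a: item discrimination;
    thresholds: increasing category boundaries b_1 < ... < b_{K-1}.
    Returns a list of K probabilities for categories 0..K-1.
    """
    # Cumulative probabilities P(X >= k), padded with P(X >= 0) = 1 and P(X >= K) = 0.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulative curves.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Hypothetical four-category item evaluated at theta = 0.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
```

Because the thresholds in this example are symmetric around the chosen theta, the two outer categories receive equal probability, as do the two inner ones.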
The Exploration of the Relationship between Guessing and Latent Ability in IRT Models

ERIC Educational Resources Information Center

Gao, Song

2011-01-01

This study explored the relationship between successful guessing and latent ability in IRT models. A new IRT model was developed with a guessing function integrating probability of guessing an item correctly with the examinee's ability and the item parameters. The conventional 3PL IRT model was compared with the new 2PL-Guessing model on…
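For reference, the conventional 3PL model mentioned in the record above treats guessing as a constant lower asymptote that does not depend on ability. The sketch below shows only that standard form; the guessing function of the study's 2PL-Guessing model is not given in the abstract and is not reproduced here. The function name `p_correct_3pl` and the parameter values are hypothetical.

```python
import math

def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the conventional 3PL model.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: pseudo-guessing lower asymptote (constant across ability).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# At theta == b the logistic part equals 0.5, so p = c + (1 - c) / 2.
p_mid = p_correct_3pl(theta=0.0, a=1.2, b=0.0, c=0.2)
# For very low ability the probability approaches the guessing floor c.
p_low = p_correct_3pl(theta=-10.0, a=1.2, b=0.0, c=0.2)
```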
Practical Consequences of Item Response Theory Model Misfit in the Context of Test Equating with Mixed-Format Test Data

PubMed Central

Zhao, Yue; Hambleton, Ronald K.

2017-01-01

In item response theory (IRT) models, assessing model-data fit is an essential step in IRT calibration. While no general agreement has ever been reached on the best methods or approaches to use for detecting misfit, perhaps the more important observation from the research findings is that studies rarely evaluate IRT misfit by focusing on its practical consequences. The study investigated the practical consequences of IRT model misfit in examining the equating performance and the classification of examinees into performance categories in a simulation study that mimics a typical large-scale statewide assessment program with mixed-format test data. The simulation study was implemented by varying three factors, including choice of IRT model, amount of growth/change of examinees’ abilities between two adjacent administration years, and choice of IRT scaling methods. Findings indicated that the extent of significant consequences of model misfit varied over the choice of model and IRT scaling methods. In comparison with mean/sigma (MS) and Stocking and Lord characteristic curve (SL) methods, separate calibration with linking and fixed common item parameter (FCIP) procedure was more sensitive to model misfit and more robust against various amounts of ability shifts between two adjacent administrations regardless of model fit. SL was generally the least sensitive to model misfit in recovering equating conversion and MS was the least robust against ability shifts in recovering the equating conversion when a substantial degree of misfit was present. The key messages from the study are that practical ways are available to study model fit, and that model fit or misfit can have consequences that should be considered when choosing an IRT model. Beyond addressing the consequences of IRT model misfit, we hope the study helps researchers and practitioners find practical ways to study model fit and to investigate the validity of particular IRT models for achieving a specified purpose, to ensure that the successful use of IRT models is realized, and to improve the applications of IRT models with educational and psychological test data. PMID:28421011
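Of the scaling methods compared in the record above, mean/sigma is the simplest to state: the linking slope and intercept come from the means and standard deviations of the common items' difficulty estimates on the two scales. The sketch below illustrates that textbook computation only, not the study's simulation code; the function names and the anchor values are hypothetical.

```python
from statistics import mean, pstdev

def mean_sigma_constants(b_new, b_base):
    """Mean/sigma linking constants from common-item difficulties.

    b_new: difficulties of the common items on the new form's scale;
    b_base: difficulties of the same items on the base scale.
    Returns (A, B) such that theta_base = A * theta_new + B.
    """
    A = pstdev(b_base) / pstdev(b_new)
    B = mean(b_base) - A * mean(b_new)
    return A, B

def rescale_item(a, b, A, B):
    """Place a new-form item's discrimination and difficulty on the base scale."""
    return a / A, A * b + B

# Hypothetical anchors: the base scale is the new scale stretched by 2 and shifted by 0.5.
A, B = mean_sigma_constants([-1.0, 0.0, 1.0], [-1.5, 0.5, 2.5])
a_star, b_star = rescale_item(1.0, 0.0, A, B)
```

Because the constants depend only on the first two moments of the anchor difficulties, a shift in those moments from one administration to the next feeds directly into A and B.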
Development and validation of an item response theory-based Social Responsiveness Scale short form.

PubMed

Sturm, Alexandra; Kuhfeld, Megan; Kasari, Connie; McCracken, James T

2017-09-01

Research and practice in autism spectrum disorder (ASD) rely on quantitative measures, such as the Social Responsiveness Scale (SRS), for characterization and diagnosis. Like many ASD diagnostic measures, SRS scores are influenced by factors unrelated to ASD core features. This study further interrogates the psychometric properties of the SRS using item response theory (IRT), and demonstrates a strategy to create a psychometrically sound short form by applying IRT results. Social Responsiveness Scale analyses were conducted on a large sample (N = 21,426) of youth from four ASD databases. Items were subjected to item factor analyses and evaluation of item bias by gender, age, expressive language level, behavior problems, and nonverbal IQ. Item selection based on item psychometric properties, DIF analyses, and substantive validity produced a reduced item SRS short form that was unidimensional in structure, highly reliable (α = .96), and free of gender, age, expressive language, behavior problems, and nonverbal IQ influence. The short form also showed strong relationships with established measures of autism symptom severity (ADOS, ADI-R, Vineland). Degree of association between all measures varied as a function of expressive language. Results identified specific SRS items that are more vulnerable to non-ASD-related traits. The resultant 16-item SRS short form may possess superior psychometric properties compared to the original scale and emerge as a more precise measure of ASD core symptom severity, facilitating research and practice. Future research using IRT is needed to further refine existing measures of autism symptomatology. © 2017 Association for Child and Adolescent Mental Health.
Psychometrics of the Fitness-to-Drive Screening Measure.

PubMed

Classen, Sherrilene; Velozo, Craig A; Winter, Sandra M; Bédard, Michel; Wang, Yanning

2015-01-01

We employed item response theory (IRT), specifically using Rasch modeling, to determine the measurement precision of the Fitness-to-Drive Screening Measure (FTDS), a tool that can be used by caregivers and occupational therapists to help detect at-risk drivers. We examined unidimensionality through the factor structure (how items contribute to the central construct of fitness to drive), rating scale (use of the categories of the rating scale), item/person-level separation (distinguishing between items with different difficulty levels or persons with different ability levels) and reliability, item hierarchy (easier driving items advancing to more difficult driving items), rater reliability, rater effects (severity vs. leniency of a rater), and criterion validity of the FTDS to an on-road assessment, via three rater groups (n = 200 older drivers; n = 200 caregivers; n = 2 evaluators). The FTDS is unidimensional, its rating scale performed well, and it has good person (> 3.07) and item (> 5.43) separation and good person (> 0.90) and item (> 0.97) reliability, with < 10% misfitting items for two rater groups (caregivers and drivers). The intraclass correlation (ICC) coefficient among the three rater groups was significant (.253, p < .001) and the evaluators were the most severe raters. When comparing the caregivers' FTDS rating with the drivers' on-road assessment, the areas under the curve (index of discriminability; caregivers .726, p < .001) suggested concurrent validity between the FTDS and the on-road assessment. Despite limitations, the FTDS is a reliable and accurate screening measure for caregivers to help identify at-risk older drivers and for occupational therapy practitioners to start conversations about driving.
A PROMIS Measure of Neuropathic Pain Quality

PubMed Central

Askew, Robert L.; Cook, Karon F.; Keefe, Francis J.; Nowinski, Cindy J.; Cella, David; Revicki, Dennis A.; DeWitt, Esi M. Morgan; Michaud, Kaleb; Trence, Dace L.; Amtmann, Dagmar

2016-01-01

Objectives: Neuropathic pain is a consequence of many chronic conditions. This study aimed to develop a unidimensional neuropathic pain scale whose scores represent levels of neuropathic pain and distinguish between individuals with neuropathic and non-neuropathic pain conditions. Methods: A candidate item pool of 42 pain quality descriptors was administered to participants with osteoarthritis, rheumatoid arthritis, diabetic neuropathy, and cancer chemotherapy-induced peripheral neuropathy. A subset of pain quality descriptors (items) that best distinguished between participants with and those without neuropathic pain conditions was identified. Dimensionality of pain descriptors was evaluated in a development sample and cross-validated in a hold-out sample. Item responses were calibrated using an item response theory model, and scores were generated on a T-score metric. Neuropathic pain scale scores were evaluated in terms of reliability, validity, and the ability to distinguish between participants with and without conditions typically associated with neuropathic pain. Results: Of the 42 initial items, 5 were identified for the Patient Reported Outcome Measurement Information System (PROMIS) Neuropathic Pain Quality scale (PROMIS-PQ-Neuro). The IRT-generated T-scores exhibited good discriminatory ability based on receiver operating characteristic analysis. Score thresholds were identified that optimize sensitivity and specificity. Construct, criterion, and discriminant validity, and reliability of scale scores were supported. Conclusions: The 5-item PROMIS PQ-Neuro is a short and practical measure that can be used to identify patients more likely to have neuropathic pain and to distinguish levels of neuropathic pain. The data collected will support future research that targets other unidimensional pain quality domains (e.g., nociceptive pain). PMID:27565279
Comparing Vertical Scales Derived from Dichotomous and Polytomous IRT Models for a Test Composed of Testlets.

ERIC Educational Resources Information Center

Bishop, N. Scott; Omar, Md Hafidz

Previous research has shown that testlet structures often violate important assumptions of dichotomous item response theory (D-IRT) models, applied to item-level scores, that can in turn affect the results of many measurement applications. In this situation, polytomous IRT (P-IRT) models, applied to testlet-level scores, have been used as an…
Scale refinement and initial evaluation of a behavioral health function measurement tool for work disability evaluation.

PubMed

Marfeo, Elizabeth E; Ni, Pengsheng; Haley, Stephen M; Bogusz, Kara; Meterko, Mark; McDonough, Christine M; Chan, Leighton; Rasch, Elizabeth K; Brandt, Diane E; Jette, Alan M

2013-09-01

To use item response theory (IRT) data simulations to construct and perform initial psychometric testing of a newly developed instrument, the Social Security Administration Behavioral Health Function (SSA-BH) instrument, that aims to assess behavioral health functioning relevant to the context of work. Cross-sectional survey followed by IRT calibration data simulations. Community. Sample of individuals applying for Social Security Administration disability benefits: claimants (n=1015) and a normative comparative sample of U.S. adults (n=1000). None. SSA-BH measurement instrument. IRT analyses supported the unidimensionality of 4 SSA-BH scales: mood and emotions (35 items), self-efficacy (23 items), social interactions (6 items), and behavioral control (15 items). All SSA-BH scales demonstrated strong psychometric properties including reliability, accuracy, and breadth of coverage. High correlations of the simulated 5- or 10-item computer adaptive tests with the full item bank indicated robust ability of the computer adaptive testing approach to comprehensively characterize behavioral health function along 4 distinct dimensions. Initial testing and evaluation of the SSA-BH instrument demonstrated good accuracy, reliability, and content coverage along all 4 scales. Behavioral function profiles of Social Security Administration claimants were generated and compared with age- and sex-matched norms along 4 scales: mood and emotions, behavioral control, social interactions, and self-efficacy. Using the computer adaptive test-based approach offers the ability to collect standardized, comprehensive functional information about claimants in an efficient way, which may prove useful in the context of the Social Security Administration's work disability programs. Copyright © 2013 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
An Investigation of Item Fit Statistics for Mixed IRT Models

ERIC Educational Resources Information Center

Chon, Kyong Hee

2009-01-01

The purpose of this study was to investigate procedures for assessing model fit of IRT models for mixed format data. In this study, various IRT model combinations were fitted to data containing both dichotomous and polytomous item responses, and the suitability of the chosen model mixtures was evaluated based on a number of model fit procedures.…

Fitting IRT Models to Dichotomous and Polytomous Data: Assessing the Relative Model-Data Fit of Ideal Point and Dominance Models

ERIC Educational Resources Information Center

Tay, Louis; Ali, Usama S.; Drasgow, Fritz; Williams, Bruce

2011-01-01

This study investigated the relative model-data fit of an ideal point item response theory (IRT) model (the generalized graded unfolding model [GGUM]) and dominance IRT models (e.g., the two-parameter logistic model [2PLM] and Samejima's graded response model [GRM]) to simulated dichotomous and polytomous data generated from each of these models.…
The Utility of IRT in Small-Sample Testing Applications.

ERIC Educational Resources Information Center

Sireci, Stephen G.

The utility of modified item response theory (IRT) models in small sample testing applications was studied. The modified IRT models were modifications of the one- and two-parameter logistic models. One-, two-, and three-parameter models were also studied. Test data were from 4 years of a national certification examination for persons desiring…
Some Observations on the Identification and Interpretation of the 3PL IRT Model

ERIC Educational Resources Information Center

Azevedo, Caio Lucidius Naberezny

2009-01-01

The paper by Maris, G., & Bechger, T. (2009) entitled, "On Interpreting the Model Parameters for the Three Parameter Logistic Model," addressed two important questions concerning the three parameter logistic (3PL) item response theory (IRT) model (and in a broader sense, concerning all IRT models). The first one is related to the model…
An introduction to multidimensional measurement using Rasch models.

PubMed

Briggs, Derek C; Wilson, Mark

2003-01-01

The act of constructing a measure requires a number of important assumptions. Principal among these assumptions is that the construct is unidimensional. In practice there are many instances when the assumption of unidimensionality does not hold, and where the application of a multidimensional measurement model is both technically appropriate and substantively advantageous. In this paper we illustrate the usefulness of a multidimensional approach to measurement with the Multidimensional Random Coefficient Multinomial Logit (MRCML) model, an extension of the unidimensional Rasch model. An empirical example is taken from a collection of embedded assessments administered to 541 students enrolled in middle school science classes with a hands-on science curriculum. Student achievement on these assessments is multidimensional in nature, but can also be treated as consecutive unidimensional estimates, or as is most common, as a composite unidimensional estimate. Structural parameters are estimated for each model using ConQuest, and model fit is compared. Student achievement in science is also compared across models. The multidimensional approach has the best fit to the data, and provides more reliable estimates of student achievement than under the consecutive unidimensional approach. Finally, at an interpretational level, the multidimensional approach may well provide richer information to the classroom teacher about the nature of student achievement.
Application of the IRT and TRT Models to a Reading Comprehension Test

ERIC Educational Resources Information Center

Kim, Weon H.

2017-01-01

The purpose of the present study is to apply the item response theory (IRT) and testlet response theory (TRT) models to a reading comprehension test. This study applied the TRT models and the traditional IRT model to a seventh-grade reading comprehension test (n = 8,815) with eight testlets. These three models were compared to determine the best…
Development of a subjective cognitive decline questionnaire using item response theory: a pilot study.

PubMed

Gifford, Katherine A; Liu, Dandan; Romano, Raymond; Jones, Richard N; Jefferson, Angela L

2015-12-01

Subjective cognitive decline (SCD) may indicate unhealthy cognitive changes, but no standardized SCD measurement exists. This pilot study aims to identify reliable SCD questions. 112 cognitively normal (NC, 76±8 years, 63% female), 43 mild cognitive impairment (MCI; 77±7 years, 51% female), and 33 diagnostically ambiguous participants (79±9 years, 58% female) were recruited from a research registry and completed 57 self-report SCD questions. Psychometric methods were used for item-reduction. Factor analytic models assessed unidimensionality of the latent trait (SCD); 19 items were removed with extreme response distribution or trait-fit. Item response theory (IRT) provided information about question utility; 17 items with low information were dropped. Post-hoc simulation using computerized adaptive test (CAT) modeling selected the most commonly used items (n=9 of 21 items) that represented the latent trait well (r=0.94) and differentiated NC from MCI participants (F(1,146)=8.9, p=0.003). Item response theory and computerized adaptive test modeling identified nine reliable SCD items. This pilot study is a first step toward refining SCD assessment in older adults. Replication of these findings and validation with Alzheimer's disease biomarkers will be an important next step for the creation of a SCD screener.
An IRT Model with a Parameter-Driven Process for Change

ERIC Educational Resources Information Center

Rijmen, Frank; De Boeck, Paul; van der Maas, Han L. J.

2005-01-01

An IRT model with a parameter-driven process for change is proposed. Quantitative differences between persons are taken into account by a continuous latent variable, as in common IRT models. In addition, qualitative inter-individual differences and auto-dependencies are accounted for by assuming within-subject variability with respect to the…
A quantitative comparison of noise reduction across five commercial (hybrid and model-based) iterative reconstruction techniques: an anthropomorphic phantom study.

PubMed

Patino, Manuel; Fuentes, Jorge M; Hayano, Koichi; Kambadakone, Avinash R; Uyeda, Jennifer W; Sahani, Dushyant V

2015-02-01

OBJECTIVE. The objective of our study was to compare the performance of three hybrid iterative reconstruction techniques (IRTs) (ASiR, iDose4, SAFIRE) and their respective strengths for image noise reduction on low-dose CT examinations using filtered back projection (FBP) as the standard reference. Also, we compared the performance of these three hybrid IRTs with two model-based IRTs (Veo and IMR) for image noise reduction on low-dose examinations. MATERIALS AND METHODS. An anthropomorphic abdomen phantom was scanned at 100 and 120 kVp and different tube current-exposure time products (25-100 mAs) on three CT systems (for ASiR and Veo, Discovery CT750 HD; for iDose4 and IMR, Brilliance iCT; and for SAFIRE, Somatom Definition Flash). Images were reconstructed using FBP and using IRTs at various strengths. Nine noise measurements (mean ROI size, 423 mm²) on extracolonic fat for the different strengths of IRTs were recorded and compared with FBP using ANOVA. Radiation dose, which was measured as the volume CT dose index and dose-length product, was also compared. RESULTS. There were no significant differences in radiation dose and image noise among the scanners when FBP was used (p > 0.05). Gradual image noise reduction was observed with each increasing increment of hybrid IRT strength, with a maximum noise suppression of approximately 50% (48.2-53.9%). Similar noise reduction was achieved on the scanners by applying specific hybrid IRT strengths. Maximum noise reduction was higher on model-based IRTs (68.3-81.1%) than hybrid IRTs (48.2-53.9%) (p < 0.05). CONCLUSION. When constant scanning parameters are used, radiation dose and image noise on FBP are similar for CT scanners made by different manufacturers. Significant image noise reduction is achieved on low-dose CT examinations rendered with IRTs. The image noise on various scanners can be matched by applying specific hybrid IRT strengths. Model-based IRTs attain substantially higher noise reduction than hybrid IRTs irrespective of the radiation dose.
A Nonparametric Approach for Assessing Goodness-of-Fit of IRT Models in a Mixed Format Test

ERIC Educational Resources Information Center

Liang, Tie; Wells, Craig S.

2015-01-01

Investigating the fit of a parametric model plays a vital role in validating an item response theory (IRT) model. An area that has received little attention is the assessment of multiple IRT models used in a mixed-format test. The present study extends the nonparametric approach, proposed by Douglas and Cohen (2001), to assess model fit of three…
An Introduction to Item Response Theory and Rasch Models for Speech-Language Pathologists

ERIC Educational Resources Information Center

Baylor, Carolyn; Hula, William; Donovan, Neila J.; Doyle, Patrick J.; Kendall, Diane; Yorkston, Kathryn

2011-01-01

Purpose: To present a primarily conceptual introduction to item response theory (IRT) and Rasch models for speech-language pathologists (SLPs). Method: This tutorial introduces SLPs to basic concepts and terminology related to IRT as well as the most common IRT models. The article then continues with an overview of how instruments are developed…
A Computer-Adaptive Disability Instrument for Lower Extremity Osteoarthritis Research Demonstrated Promising Breadth, Precision and Reliability

PubMed Central

Jette, Alan M.; McDonough, Christine M.; Haley, Stephen M.; Ni, Pengsheng; Olarsch, Sippy; Latham, Nancy; Hambleton, Ronald K.; Felson, David; Kim, Young-jo; Hunter, David

2012-01-01

Objective: To develop and evaluate a prototype measure (OA-DISABILITY-CAT) for osteoarthritis research using Item Response Theory (IRT) and Computer Adaptive Test (CAT) methodologies. Study Design and Setting: We constructed an item bank consisting of 33 activities commonly affected by lower extremity (LE) osteoarthritis. A sample of 323 adults with LE osteoarthritis reported their degree of limitation in performing everyday activities and completed the Health Assessment Questionnaire-II (HAQ-II). We used confirmatory factor analyses to assess scale unidimensionality and IRT methods to calibrate the items and examine the fit of the data. Using CAT simulation analyses, we examined the performance of OA-DISABILITY-CATs of different lengths compared to the full item bank and the HAQ-II. Results: One distinct disability domain was identified. The 10-item OA-DISABILITY-CAT demonstrated a high degree of accuracy compared with the full item bank (r=0.99). The item bank and the HAQ-II scales covered a similar estimated scoring range. In terms of reliability, 95% of OA-DISABILITY reliability estimates were over 0.83 versus 0.60 for the HAQ-II. Except at the highest scores the 10-item OA-DISABILITY-CAT demonstrated superior precision to the HAQ-II. Conclusion: The prototype OA-DISABILITY-CAT demonstrated promising measurement properties compared to the HAQ-II, and is recommended for use in LE osteoarthritis research. PMID:19216052
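A common way for a computer adaptive test to choose its next item is to pick the unadministered item with the greatest Fisher information at the current ability estimate. The sketch below shows that rule for dichotomous 2PL items as a simplifying assumption; it is not the selection procedure of the OA-DISABILITY-CAT, whose items are rated by degree of limitation and whose algorithm the abstract does not describe. The function names and the three-item bank are hypothetical.

```python
import math

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def next_item(theta, bank, administered):
    """Index of the most informative item not yet administered.

    bank: list of (a, b) tuples; administered: set of item indices already used.
    """
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: info_2pl(theta, *bank[i]))

# Hypothetical bank with equal discriminations and difficulties of -1, 0, and 1.
bank = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
first = next_item(0.0, bank, set())   # item whose difficulty is closest to theta = 0
second = next_item(0.9, bank, {1})    # after the estimate moves up, a harder item
```

With equal discriminations, the rule reduces to choosing the item whose difficulty lies closest to the current ability estimate.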
Improving Assessment of Work Related Mental Health Function Using the Work Disability Functional Assessment Battery (WD-FAB).

PubMed

Marfeo, Elizabeth E; Ni, Pengsheng; McDonough, Christine; Peterik, Kara; Marino, Molly; Meterko, Mark; Rasch, Elizabeth K; Chan, Leighton; Brandt, Diane; Jette, Alan M

2018-03-01

Purpose: To improve the mental health component of the Work Disability Functional Assessment Battery (WD-FAB), developed for the US Social Security Administration's (SSA) disability determination process. Specifically, our goal was to expand the WD-FAB scales of mood & emotions, resilience, social interactions, and behavioral control to improve the depth and breadth of the current scales and expand the content coverage to include aspects of cognition & communication function. Methods: Data were collected from a random, stratified sample of 1695 claimants applying for the SSA work disability benefits, and a general population sample of 2025 working age adults. 169 new items were developed to replenish the WD-FAB scales and analyzed using factor analysis and item response theory (IRT) analysis to construct unidimensional scales. We conducted computer adaptive test (CAT) simulations to examine the psychometric properties of the WD-FAB. Results: Analyses supported the inclusion of four mental health subdomains: Cognition & Communication (68 items), Self-Regulation (34 items), Resilience & Sociability (29 items) and Mood & Emotions (34 items). All scales yielded acceptable psychometric properties. Conclusions: IRT methods were effective in expanding the WD-FAB to assess mental health function. The WD-FAB has the potential to enhance work disability assessment both within the context of the SSA disability programs as well as other clinical and vocational rehabilitation settings.
IRT Equating of the MCAT. MCAT Monograph.

ERIC Educational Resources Information Center

Hendrickson, Amy B.; Kolen, Michael J.

This study compared various equating models and procedures for a sample of data from the Medical College Admission Test (MCAT), considering how item response theory (IRT) equating results compare with classical equipercentile results and how the results based on use of various IRT models, observed score versus true score, direct versus linked…
Using item response theory to investigate the structure of anticipated affect: do self-reports about future affective reactions conform to typical or maximal models?

PubMed

Zampetakis, Leonidas A; Lerakis, Manolis; Kafetsios, Konstantinos; Moustakis, Vassilis

2015-01-01

In the present research, we used item response theory (IRT) to examine whether affective predictions (anticipated affect) conform to a typical (i.e., what people usually do) or a maximal (i.e., what people can do) behavior process. The former corresponds to non-monotonic ideal point IRT models, whereas the latter corresponds to monotonic dominance IRT models. A cross-sectional convenience sample of students (N = 1624) was used. Participants were asked to report on anticipated positive and negative affect around a hypothetical event (emotions surrounding the start of a new business). We compared the graded response model (GRM), a dominance IRT model, against the generalized graded unfolding model, an unfolding IRT model. We found that the GRM provided a better fit to the data. Findings suggest that self-report responses to anticipated affect conform to a dominance response process (i.e., maximal behavior). The paper also discusses implications for a growing literature on anticipated affect.
Using item response theory to investigate the structure of anticipated affect: do self-reports about future affective reactions conform to typical or maximal models?

PubMed Central

Zampetakis, Leonidas A.; Lerakis, Manolis; Kafetsios, Konstantinos; Moustakis, Vassilis

2015-01-01

In the present research, we used item response theory (IRT) to examine whether affective predictions (anticipated affect) conform to a typical (i.e., what people usually do) or a maximal (i.e., what people can do) behavior process. The former corresponds to non-monotonic ideal point IRT models, whereas the latter corresponds to monotonic dominance IRT models. A cross-sectional convenience sample of students (N = 1624) was used. Participants were asked to report on anticipated positive and negative affect around a hypothetical event (emotions surrounding the start of a new business). We compared the graded response model (GRM), a dominance IRT model, against the generalized graded unfolding model, an unfolding IRT model. We found that the GRM provided a better fit to the data. Findings suggest that self-report responses to anticipated affect conform to a dominance response process (i.e., maximal behavior). The paper also discusses implications for a growing literature on anticipated affect. PMID:26441806
Application of an IRT Polytomous Model for Measuring Health Related Quality of Life

ERIC Educational Resources Information Center

Tejada, Antonio J. Rojas; Rojas, Oscar M. Lozano

2005-01-01

Background: Item Response Theory (IRT) has advantages over Classical Test Theory (CTT) for measuring Health Related Quality of Life (HRQOL). Objectives: To present the results of the application of a polytomous model based on IRT, specifically, the Rating Scale Model (RSM), to measure HRQOL with the EORTC QLQ-C30. Methods: 103…
Effect of Item Response Theory (IRT) Model Selection on Testlet-Based Test Equating. Research Report. ETS RR-14-19

ERIC Educational Resources Information Center

Cao, Yi; Lu, Ru; Tao, Wei

2014-01-01

The local item independence assumption underlying traditional item response theory (IRT) models is often not met for tests composed of testlets. There are 3 major approaches to addressing this issue: (a) ignore the violation and use a dichotomous IRT model (e.g., the 2-parameter logistic [2PL] model), (b) combine the interdependent items to form a…
A Bayesian Beta-Mixture Model for Nonparametric IRT (BBM-IRT)

ERIC Educational Resources Information Center

Arenson, Ethan A.; Karabatsos, George

2017-01-01

Item response models typically assume that the item characteristic (step) curves follow a logistic or normal cumulative distribution function, which are strictly monotone functions of person test ability. Such assumptions can be overly-restrictive for real item response data. We propose a simple and more flexible Bayesian nonparametric IRT model…
Performance of the Generalized S-X² Item Fit Index for Polytomous IRT Models

ERIC Educational Resources Information Center

Kang, Taehoon; Chen, Troy T.

2008-01-01

Orlando and Thissen's S-X² item fit index has performed better than traditional item fit statistics such as Yen's Q₁ and McKinley and Mills' G² for dichotomous item response theory (IRT) models. This study extends the utility of S-X² to polytomous IRT models, including the generalized partial…
Using the Item Response Theory (IRT) for Educational Evaluation through Games

ERIC Educational Resources Information Center

Euzébio Batista, Marcelo Henrique; Victória Barbosa, Jorge Luis; da Rosa Tavares, João Elison; Hackenhaar, Jonathan Luis

2013-01-01

This article shows the application of Item Response Theory (IRT) for educational evaluation using games. The article proposes a computational model to create user profiles, called Psychometric Profile Generator (PPG). PPG uses the IRT mathematical model for exploring the levels of skills and behaviors in the form of items and/or stimuli. The model…

Standard Errors and Confidence Intervals from Bootstrapping for Ramsay-Curve Item Response Theory Model Item Parameters

ERIC Educational Resources Information Center

Gu, Fei; Skorupski, William P.; Hoyle, Larry; Kingston, Neal M.

2011-01-01

Ramsay-curve item response theory (RC-IRT) is a nonparametric procedure that estimates the latent trait using splines, and no distributional assumption about the latent trait is required. For item parameters of the two-parameter logistic (2-PL), three-parameter logistic (3-PL), and polytomous IRT models, RC-IRT can provide more accurate estimates…
The Puzzling Unidimensionality of DSM-5 Substance Use Disorder Diagnoses

PubMed Central

MacCoun, Robert J.

2013-01-01

There is a perennial expert debate about the criteria to be included or excluded for the DSM diagnoses of substance use dependence. Yet analysts routinely report evidence for the unidimensionality of the resulting checklist. If in fact the checklist is unidimensional, the experts are wrong that the criteria are distinct, so either the experts are mistaken or the reported unidimensionality is spurious. I argue for the latter position, and suggest that the traditional reflexive measurement model is inappropriate for the DSM; a formative measurement model would be a more accurate characterization of the institutional process by which the checklist is created, and a network or causal model would be a more appropriate foundation for a scientifically grounded diagnostic system. PMID:24324446
Screening for elevated levels of fear-avoidance beliefs regarding work or physical activities in people receiving outpatient therapy.

PubMed

Hart, Dennis L; Werneke, Mark W; George, Steven Z; Matheson, James W; Wang, Ying-Chih; Cook, Karon F; Mioduski, Jerome E; Choi, Seung W

2009-08-01

Screening people for elevated levels of fear-avoidance beliefs is uncommon, but elevated levels of fear could worsen outcomes. Developing short screening tools might reduce the data collection burden and facilitate screening, which could prompt further testing or management strategy modifications to improve outcomes. The purpose of this study was to develop efficient yet accurate screening methods for identifying elevated levels of fear-avoidance beliefs regarding work or physical activities in people receiving outpatient rehabilitation. A secondary analysis of data collected prospectively from people with a variety of common neuromusculoskeletal diagnoses was conducted. Intake Fear-Avoidance Beliefs Questionnaire (FABQ) data were collected from 17,804 people who had common neuromusculoskeletal conditions and were receiving outpatient rehabilitation in 121 clinics in 26 states (in the United States). Item response theory (IRT) methods were used to analyze the FABQ data, with particular emphasis on differential item functioning among clinically logical groups of subjects, and to identify screening items. The accuracy of screening items for identifying subjects with elevated levels of fear was assessed with receiver operating characteristic analyses. Three items for fear of physical activities and 10 items for fear of work activities represented unidimensional scales with adequate IRT model fit. Differential item functioning was negligible for variables known to affect functional status outcomes: sex, age, symptom acuity, surgical history, pain intensity, condition severity, and impairment. Items that provided maximum information at the median for the FABQ scales were selected as screening items to dichotomize subjects by high versus low levels of fear. The accuracy of the screening items was supported for both scales. This study represents a retrospective analysis, which should be replicated using prospective designs. 
Future prospective studies should assess the reliability and validity of using one FABQ item to screen people for high levels of fear-avoidance beliefs. The lack of differential item functioning in the FABQ scales in the sample tested in this study suggested that FABQ screening could be useful in routine clinical practice and allowed the development of single-item screening for fear-avoidance beliefs that accurately identified subjects with elevated levels of fear. Because screening was accurate and efficient, single IRT-based FABQ screening items are recommended to facilitate improved evaluation and care of heterogeneous populations of people receiving outpatient rehabilitation.
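
The selection rule described in this abstract (pick the item that provides maximum information at a chosen trait level) can be illustrated with a minimal sketch. It assumes a dichotomous two-parameter logistic (2PL) model, which may not be the model the authors fitted to the FABQ; the item names and parameter values are hypothetical and are not taken from the study.

```python
import math

def p_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at trait level theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def most_informative(items, theta):
    """Return the name of the item with the largest information at theta."""
    return max(items, key=lambda name: info_2pl(theta, *items[name]))

# Hypothetical (discrimination a, difficulty b) pairs, for illustration only.
items = {
    "item_1": (1.2, -1.0),
    "item_2": (1.8, 0.1),
    "item_3": (0.9, 0.0),
}

# Take theta = 0.0 as a stand-in for the median of the trait distribution.
screening_item = most_informative(items, 0.0)
```

Under these made-up parameters the highly discriminating item located near the chosen trait level is selected, which is the general behaviour one would expect from an information-based screening rule.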
A Combined IRT and SEM Approach for Individual-Level Assessment in Test-Retest Studies

ERIC Educational Resources Information Center

Ferrando, Pere J.

2015-01-01

The standard two-wave multiple-indicator model (2WMIM) commonly used to analyze test-retest data provides information at both the group and item level. Furthermore, when applied to binary and graded item responses, it is related to well-known item response theory (IRT) models. In this article the IRT-2WMIM relations are used to obtain additional…
Correspondence between the RAND-Negative Impact of Asthma on Quality of Life item bank and the Marks Asthma Quality of Life Questionnaire.

PubMed

Edelen, Maria Orlando; Stucky, Brian D; Sherbourne, Cathy; Eberhart, Nicole; Lara, Marielena

2014-05-01

In many research and clinical settings in which patient-reported outcome (PRO) measures are used, it is often desirable to link scores across disparate measures or to use scores from 1 measure to describe scores on a separate measure. However, PRO measures are scored by using a variety of metrics, making such comparisons difficult. The objective of this article was to provide an example of how to transform scores across disparate measures (the Marks Asthma Quality of Life Questionnaire [AQLQ-Marks] and the newly developed RAND-Negative Impact of Asthma on Quality of Life item bank [RAND-IAQL-Bank]) by using an item response theory (IRT)-based linking method. Our sample of adults with asthma (N = 2032) completed 2 measures of asthma-specific quality of life: the AQLQ-Marks and the RAND-IAQL-Bank. We use IRT-based co-calibration of the 2 measures to provide a linkage, or a common metric, between the 2 measures. Co-calibration refers to the process of using IRT to estimate item parameters that describe the responses to the scales' items according to a common metric; in this case, a normal distribution transformed to a T scale with a mean of 50 and an SD of 10. Respondents had an average age of 43 (15), were 60% female, and predominantly non-Hispanic White (56%), with 19% African American, 14% Hispanic, and 11% Asian. Most had at least some college education (83%), and 90% had experienced an asthma attack during the last 12 months. Our results indicate that the AQLQ-Marks and RAND-IAQL-Bank scales measured highly similar constructs and were sufficiently unidimensional for IRT co-calibration. Once linked, scores from the 2 measures were invariant across subgroups. A crosswalk is provided that allows researchers and clinicians using AQLQ-Marks to crosswalk to the RAND-IAQL toolkit. 
The ability to translate scores from the RAND-IAQL toolkit to other "legacy" (ie, commonly used) measures increases the value of the new toolkit, aids in interpretation, and will hopefully facilitate adoption by asthma researchers and clinicians. More generally, the techniques we illustrate can be applied to other newly developed or existing measures in the PRO research field to obtain crosswalks with widely used traditional legacy instruments. Copyright © 2014 Elsevier HS Journals, Inc. All rights reserved.
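
The common metric mentioned in this abstract, a T scale with a mean of 50 and an SD of 10, is a linear transformation of a standardized score. A minimal sketch follows; the function name and the default reference mean and SD are assumptions for illustration, not part of the RAND-IAQL scoring procedure.

```python
def to_t_score(theta, ref_mean=0.0, ref_sd=1.0):
    """Map a latent-trait estimate onto a T scale (mean 50, SD 10).

    ref_mean and ref_sd describe the reference distribution of theta;
    the defaults assume theta is already on a standard normal metric.
    """
    z = (theta - ref_mean) / ref_sd
    return 50.0 + 10.0 * z

# A respondent 1.5 SD above the reference mean maps to a T score of 65.
example = to_t_score(1.5)
```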
Development and psychometric evaluation of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions.

PubMed

Forrest, Christopher B; Devine, Janine; Bevans, Katherine B; Becker, Brandon D; Carle, Adam C; Teneralli, Rachel E; Moon, JeanHee; Tucker, Carole A; Ravens-Sieberer, Ulrike

2018-01-01

To describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Life Satisfaction item banks, child-report, and parent-proxy editions. A pool of 55 life satisfaction items was administered to 1992 children 8-17 years old and 964 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and assessment of construct validity. Thirteen items were deleted because of poor psychometric performance. An 8-item short form was administered to a national sample of 996 children 8-17 years old, and 1294 parents of children 5-17 years old. The combined sample (2988 children and 2258 parents) was used in item response theory (IRT) calibration analyses. The final item banks were unidimensional, the items were locally independent, and the items were free from impactful differential item functioning. The 8-item and 4-item short form scales showed excellent reliability, convergent validity, and discriminant validity. Life satisfaction decreased with declining socio-economic status, presence of a special health care need, and increasing age for girls, but not boys. After IRT calibration, we found that 4- and 8-item short forms had a high degree of precision (reliability) across a wide range (>4 SD units) of the latent variable. The PROMIS Pediatric Life Satisfaction item banks and their short forms provide efficient, precise, and valid assessments of life satisfaction in children and youth.
Development and Evaluation of the PROMIS® Pediatric Positive Affect Item Bank, Child-Report and Parent-Proxy Editions.

PubMed

Forrest, Christopher B; Ravens-Sieberer, Ulrike; Devine, Janine; Becker, Brandon D; Teneralli, Rachel; Moon, JeanHee; Carle, Adam; Tucker, Carole A; Bevans, Katherine B

2018-03-01

The purpose of this study is to describe the psychometric evaluation and item response theory calibration of the PROMIS Pediatric Positive Affect item bank, child-report and parent-proxy editions. The initial item pool comprising 53 items, previously developed using qualitative methods, was administered to 1,874 children 8-17 years old and 909 parents of children 5-17 years old. Analyses included descriptive statistics, reliability, factor analysis, differential item functioning, and construct validity. A total of 14 items were deleted, because of poor psychometric performance, and an 8-item short form constructed from the remaining 39 items was administered to a national sample of 1,004 children 8-17 years old, and 1,306 parents of children 5-17 years old. The combined sample was used in item response theory (IRT) calibration analyses. The final item bank appeared unidimensional, the items appeared locally independent, and the items were free from differential item functioning. The scales showed excellent reliability and convergent and discriminant validity. Positive affect decreased with children's age and was lower for those with a special health care need. After IRT calibration, we found that 4 and 8 item short forms had a high degree of precision (reliability) across a wide range of the latent trait (>4 SD units). The PROMIS Pediatric Positive Affect item bank and its short forms provide an efficient, precise, and valid assessment of positive affect in children and youth.
Mokken scale analysis of mental health and well-being questionnaire item responses: a non-parametric IRT method in empirical research for applied health researchers

PubMed Central

2012-01-01

Background: Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items. The stochastic cumulative scaling model offered by this approach is ideally suited when the intention is to score an underlying latent trait by simple addition of the item response values. In our experience, the Mokken model appears to be less well-known than for example the (related) Rasch model, but is seeing increasing use in contemporary clinical research and public health. Mokken's method is a generalisation of Guttman scaling that can assist in the determination of the dimensionality of tests or scales, and enables consideration of reliability, without reliance on Cronbach's alpha. This paper provides a practical guide to the application and interpretation of this non-parametric item response theory method in empirical research with health and well-being questionnaires. Methods: Scalability of data from 1) a cross-sectional health survey (the Scottish Health Education Population Survey) and 2) a general population birth cohort study (the National Child Development Study) illustrate the method and modeling steps for dichotomous and polytomous items respectively. The questionnaire data analyzed comprise responses to the 12-item General Health Questionnaire, under the binary recoding recommended for screening applications, and the ordinal/polytomous responses to the Warwick-Edinburgh Mental Well-being Scale. Results and conclusions: After an initial analysis example in which we select items by phrasing (six positive versus six negatively worded items) we show that all items from the 12-item General Health Questionnaire (GHQ-12) – when binary scored – were scalable according to the double monotonicity model, in two short scales comprising six items each (Bech’s “well-being” and “distress” clinical scales).
An illustration of ordinal item analysis confirmed that all 14 positively worded items of the Warwick-Edinburgh Mental Well-being Scale (WEMWBS) met criteria for the monotone homogeneity model but four items violated double monotonicity with respect to a single underlying dimension. Software availability and commands used to specify unidimensionality and reliability analysis and graphical displays for diagnosing monotone homogeneity and double monotonicity are discussed, with an emphasis on current implementations in freeware. PMID:22686586
Mokken scale analysis of mental health and well-being questionnaire item responses: a non-parametric IRT method in empirical research for applied health researchers.

PubMed

Stochl, Jan; Jones, Peter B; Croudace, Tim J

2012-06-11

Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items. The stochastic cumulative scaling model offered by this approach is ideally suited when the intention is to score an underlying latent trait by simple addition of the item response values. In our experience, the Mokken model appears to be less well-known than for example the (related) Rasch model, but is seeing increasing use in contemporary clinical research and public health. Mokken's method is a generalisation of Guttman scaling that can assist in the determination of the dimensionality of tests or scales, and enables consideration of reliability, without reliance on Cronbach's alpha. This paper provides a practical guide to the application and interpretation of this non-parametric item response theory method in empirical research with health and well-being questionnaires. Scalability of data from 1) a cross-sectional health survey (the Scottish Health Education Population Survey) and 2) a general population birth cohort study (the National Child Development Study) illustrate the method and modeling steps for dichotomous and polytomous items respectively. The questionnaire data analyzed comprise responses to the 12 item General Health Questionnaire, under the binary recoding recommended for screening applications, and the ordinal/polytomous responses to the Warwick-Edinburgh Mental Well-being Scale. After an initial analysis example in which we select items by phrasing (six positive versus six negatively worded items) we show that all items from the 12-item General Health Questionnaire (GHQ-12)--when binary scored--were scalable according to the double monotonicity model, in two short scales comprising six items each (Bech's "well-being" and "distress" clinical scales). 
An illustration of ordinal item analysis confirmed that all 14 positively worded items of the Warwick-Edinburgh Mental Well-being Scale (WEMWBS) met criteria for the monotone homogeneity model but four items violated double monotonicity with respect to a single underlying dimension. Software availability and commands used to specify unidimensionality and reliability analysis and graphical displays for diagnosing monotone homogeneity and double monotonicity are discussed, with an emphasis on current implementations in freeware.
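
Scalability in Mokken analysis is usually summarized with Loevinger's H coefficient, which compares observed inter-item covariances with the maximum covariances attainable given the item marginals. The sketch below computes the scale-level H for binary items only; it is a plain illustration of that definition, not the software discussed in the paper, and the response matrix is made up.

```python
from itertools import combinations

def loevinger_h(responses):
    """Scale-level Loevinger's H for binary (0/1) item responses.

    responses: list of rows, one row per person, one column per item.
    H = sum of observed pairwise covariances / sum of maximum covariances,
    where the maximum for a pair is min(p_i, p_j) - p_i * p_j.
    """
    n_persons = len(responses)
    n_items = len(responses[0])
    p = [sum(row[j] for row in responses) / n_persons for j in range(n_items)]
    cov_sum = 0.0
    cov_max_sum = 0.0
    for i, j in combinations(range(n_items), 2):
        p_both = sum(row[i] * row[j] for row in responses) / n_persons
        cov_sum += p_both - p[i] * p[j]
        cov_max_sum += min(p[i], p[j]) - p[i] * p[j]
    return cov_sum / cov_max_sum

# Made-up perfect Guttman pattern: no person passes a harder item
# while failing an easier one, so H reaches its maximum.
guttman = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]
h_guttman = loevinger_h(guttman)
```

A perfect Guttman pattern gives the upper bound of H; real questionnaire data fall below it, and the function is undefined (division by zero) if every item has a proportion of 0 or 1.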
Model Selection Methods for Mixture Dichotomous IRT Models

ERIC Educational Resources Information Center

Li, Feiming; Cohen, Allan S.; Kim, Seock-Ho; Cho, Sun-Joo

2009-01-01

This study examines model selection indices for use with dichotomous mixture item response theory (IRT) models. Five indices are considered: Akaike's information criterion (AIC), Bayesian information criterion (BIC), deviance information criterion (DIC), pseudo-Bayes factor (PsBF), and posterior predictive model checks (PPMC). The five…
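
Two of the indices named in this abstract have simple closed forms once a model's maximized log-likelihood is available. The sketch below shows the standard definitions of AIC and BIC; the log-likelihoods, parameter counts, and sample size in the usage lines are hypothetical and do not come from the study.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike's information criterion: -2 log L + 2k (smaller is better)."""
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 log L + k log(n) (smaller is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical comparison of a one-class and a two-class mixture model.
one_class = {"log_likelihood": -5210.0, "n_params": 40}
two_class = {"log_likelihood": -5150.0, "n_params": 81}
n_examinees = 1000

aic_prefers_two = aic(**two_class) < aic(**one_class)
bic_prefers_two = (bic(n_obs=n_examinees, **two_class)
                   < bic(n_obs=n_examinees, **one_class))
```

With these made-up numbers AIC favours the two-class model while BIC favours the one-class model, because BIC's penalty per parameter grows with the sample size. Disagreements of this kind are one reason to compare such indices.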
Do Concept Inventories Actually Measure Anything?

ERIC Educational Resources Information Center

Wallace, Colin S.; Bailey, Janelle M.

2010-01-01

Although concept inventories are among the most frequently used tools in the physics and astronomy education communities, they are rarely evaluated using item response theory (IRT). When IRT models fit the data, they offer sample-independent estimates of item and person parameters. IRT may also provide a way to measure students' learning gains…
Classification Consistency and Accuracy for Complex Assessments Using Item Response Theory

ERIC Educational Resources Information Center

Lee, Won-Chan

2010-01-01

In this article, procedures are described for estimating single-administration classification consistency and accuracy indices for complex assessments using item response theory (IRT). This IRT approach was applied to real test data comprising dichotomous and polytomous items. Several different IRT model combinations were considered. Comparisons…
The Value of Item Response Theory in Clinical Assessment: A Review

ERIC Educational Resources Information Center

Thomas, Michael L.

2011-01-01

Item response theory (IRT) and related latent variable models represent modern psychometric theory, the successor to classical test theory in psychological assessment. Although IRT has become prevalent in the measurement of ability and achievement, its contributions to clinical domains have been less extensive. Applications of IRT to clinical…
Exploring the Robustness of a Unidimensional Item Response Theory Model with Empirically Multidimensional Data

ERIC Educational Resources Information Center

Anderson, Daniel; Kahn, Joshua D.; Tindal, Gerald

2017-01-01

Unidimensionality and local independence are two common assumptions of item response theory. The former implies that all items measure a common latent trait, while the latter implies that responses are independent, conditional on respondents' location on the latent trait. Yet, few tests are truly unidimensional. Unmodeled dimensions may result in…
Model Selection Indices for Polytomous Items

ERIC Educational Resources Information Center

Kang, Taehoon; Cohen, Allan S.; Sung, Hyun-Jung

2009-01-01

This study examines the utility of four indices for use in model selection with nested and nonnested polytomous item response theory (IRT) models: a cross-validation index and three information-based indices. Four commonly used polytomous IRT models are considered: the graded response model, the generalized partial credit model, the partial credit…
The nutrition for sport knowledge questionnaire (NSKQ): development and validation using classical test theory and Rasch analysis.

PubMed

Trakman, Gina Louise; Forsyth, Adrienne; Hoye, Russell; Belski, Regina

2017-01-01

Appropriate dietary intake can have a significant influence on athletic performance. There is a growing consensus on sports nutrition and professionals working with athletes often provide dietary education. However, due to the limitations of existing sports nutrition knowledge questionnaires, previous reports of athletes' nutrition knowledge may be inaccurate. An updated questionnaire has been developed based on a recent review of sports nutrition guidelines. The tool has been validated using a robust methodology that incorporates relevant techniques from classical test theory (CTT) and item response theory (IRT), namely, Rasch analysis. The final questionnaire has 89 questions and six sub-sections (weight management, macronutrients, micronutrients, sports nutrition, supplements, and alcohol). The content and face validity of the tool have been confirmed based on feedback from expert sports dietitians and university sports students, respectively. The internal reliability of the questionnaire as a whole is high (KR = 0.88), and most sub-sections achieved an acceptable internal reliability. Construct validity has been confirmed, with an independent t-test revealing a significant (p < 0.001) difference in knowledge scores of nutrition (64 ± 16%) and non-nutrition students (51 ± 19%). Test-retest reliability has been assured, with a strong correlation (r = 0.92, p < 0.001) between individuals' scores on two attempts of the test, 10 days to 2 weeks apart. Three of the sub-sections fit the Rasch Unidimensional Model. The final version of the questionnaire represents a significant improvement over previous tools. Each nutrition sub-section is unidimensional, and therefore researchers and practitioners can use these individually, as required. Use of the questionnaire will allow researchers to draw conclusions about the effectiveness of nutrition education programs, and differences in knowledge across athletes of varying ages, genders, and athletic calibres.
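
The "KR" reliability reported in this abstract is presumably a Kuder-Richardson coefficient; that reading is an assumption, since the abstract does not name the formula. Under that assumption, a minimal sketch of KR-20 for right/wrong scored items is shown below, using the population variance of total scores (one common convention) and a made-up response matrix.

```python
from statistics import pvariance

def kr20(responses):
    """Kuder-Richardson formula 20 for binary (0/1) scored items.

    responses: list of rows, one row per person, one column per item.
    KR-20 = k / (k - 1) * (1 - sum(p_j * q_j) / variance of total scores).
    """
    n_persons = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)  # population variance, by assumption
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_persons
        pq_sum += p * (1.0 - p)
    return (n_items / (n_items - 1)) * (1.0 - pq_sum / var_total)

# Made-up data: four respondents, three items.
scores = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]
reliability = kr20(scores)
```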
Stochastic Ordering Using the Latent Trait and the Sum Score in Polytomous IRT Models.

ERIC Educational Resources Information Center

Hemker, Bas T.; Sijtsma, Klaas; Molenaar, Ivo W.; Junker, Brian W.

1997-01-01

Stochastic ordering properties are investigated for a broad class of item response theory (IRT) models for which the monotone likelihood ratio does not hold. A taxonomy is given for nonparametric and parametric models for polytomous models based on the hierarchical relationship between the models. (SLD)
Modelling Mathematics Problem Solving Item Responses Using a Multidimensional IRT Model

ERIC Educational Resources Information Center

Wu, Margaret; Adams, Raymond

2006-01-01

This research examined students' responses to mathematics problem-solving tasks and applied a general multidimensional IRT model at the response category level. In doing so, cognitive processes were identified and modelled through item response modelling to extract more information than would be provided using conventional practices in scoring…
Item Response Theory as an Efficient Tool to Describe a Heterogeneous Clinical Rating Scale in De Novo Idiopathic Parkinson's Disease Patients.

PubMed

Buatois, Simon; Retout, Sylvie; Frey, Nicolas; Ueckert, Sebastian

2017-10-01

This manuscript aims to precisely describe the natural disease progression of Parkinson's disease (PD) patients and evaluate approaches to increase the drug effect detection power. An item response theory (IRT) longitudinal model was built to describe the natural disease progression of 423 de novo PD patients followed for 48 months while taking into account the heterogeneous nature of the MDS-UPDRS. Clinical trial simulations were then used to compare the drug effect detection power of IRT-based analysis and analysis based on the sum of item scores under different analysis endpoints and drug effects. The IRT longitudinal model accurately describes the evolution of patients with and without PD medications while estimating different progression rates for the subscales. When comparing analysis methods, the IRT-based one consistently provided the highest power. IRT is a powerful tool that makes it possible to capture the heterogeneous nature of the MDS-UPDRS.
The value of item response theory in clinical assessment: a review.

PubMed

Thomas, Michael L

2011-09-01

Item response theory (IRT) and related latent variable models represent modern psychometric theory, the successor to classical test theory in psychological assessment. Although IRT has become prevalent in the measurement of ability and achievement, its contributions to clinical domains have been less extensive. Applications of IRT to clinical assessment are reviewed to appraise its current and potential value. Benefits of IRT include comprehensive analyses and reduction of measurement error, creation of computer adaptive tests, meaningful scaling of latent variables, objective calibration and equating, evaluation of test and item bias, greater accuracy in the assessment of change due to therapeutic intervention, and evaluation of model and person fit. The theory may soon reinvent the manner in which tests are selected, developed, and scored. Although challenges remain to the widespread implementation of IRT, its application to clinical assessment holds great promise. Recommendations for research, test development, and clinical practice are provided.

Tracking functional status across the spinal cord injury lifespan: linking pediatric and adult patient-reported outcome scores.

PubMed

Tian, Feng; Ni, Pengsheng; Mulcahey, M J; Hambleton, Ronald K; Tulsky, David; Haley, Stephen M; Jette, Alan M

2014-11-01

To use item response theory (IRT) methods to link scores from 2 recently developed contemporary functional outcome measures, the adult Spinal Cord Injury-Functional Index (SCI-FI) and the Pedi SCI (both the parent version and the child version). Secondary data analysis of the physical functioning items of the adult SCI-FI and the Pedi SCI instruments. We used a nonequivalent group design with items common to both instruments and the Stocking-Lord method for the linking. Linking was conducted so that the adult SCI-FI and Pedi SCI scaled scores could be compared. Community. This study included a total sample of 1558 participants. Pedi SCI items were administered to a sample of children (n=381) with SCI aged 8 to 21 years, and of parents/caregivers (n=322) of children with SCI aged 4 to 21 years. Adult SCI-FI items were administered to a sample of adults (n=855) with SCI aged 18 to 92 years. Not applicable. Five scales common to both instruments were included in the analysis: Wheelchair, Daily Routine/Self-care, Daily Routine/Fine Motor, Ambulation, and General Mobility functioning. Confirmatory factor analysis and exploratory factor analysis results indicated that the 5 scales are unidimensional. A graded response model was used to calibrate the items. Misfitting items were identified and removed from the item banks. Items that function differently between the adult and child samples (ie, exhibit differential item functioning) were identified and removed from the common items used for linking. Domain scores from the Pedi SCI instruments were transformed onto the adult SCI-FI metric. This IRT linking allowed estimation of adult SCI-FI scale scores based on Pedi SCI scale scores and vice versa; therefore, it provides clinicians with a means of tracking long-term functional data for children with an SCI across their entire lifespan. Copyright © 2014 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
A psychometric investigation of the hypersexual disorder screening inventory among highly sexually active gay and bisexual men: an item response theory analysis.

PubMed

Parsons, Jeffrey T; Rendina, H Jonathon; Ventuneac, Ana; Cook, Karon F; Grov, Christian; Mustanski, Brian

2013-12-01

The Hypersexual Disorder Screening Inventory (HDSI) was designed as an instrument for the screening of hypersexuality by the American Psychiatric Association's taskforce for the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders. Our study sought to conduct a psychometric analysis of the HDSI, including an investigation of its underlying structure and reliability utilizing item response theory (IRT) modeling, and an examination of its polythetic scoring criteria in comparison to a standard dimensionally based cutoff score. We examined a diverse group of 202 highly sexually active gay and bisexual men in New York City. We conducted psychometric analyses of the HDSI, including both confirmatory factor analysis of its structure and IRT analysis of the item and scale reliabilities. We utilized the HDSI. The HDSI adequately fit a single-factor solution, although there was evidence that two of the items may measure a second factor that taps into sex as a form of coping. The scale showed evidence of strong reliability across much of the continuum of hypersexuality, and results suggested that, in addition to the proposed polythetic scoring criteria, a cutoff score of 20 on the severity index might be used for preliminary classification of HD. The HDSI was found to be highly reliable, and results suggested that a unidimensional, quantitative conception of hypersexuality with a clinically relevant cutoff score may be more appropriate than a qualitative syndrome comprised of multiple distinct clusters of problems. However, we also found preliminary evidence that three clusters of symptoms may constitute an HD syndrome as opposed to the two clusters initially proposed. Future research is needed to determine which of these issues are characteristic of the hypersexuality and HD constructs themselves and which are more likely to be methodological artifacts of the HDSI. © 2013 International Society for Sexual Medicine.
An Introduction to Item Response Theory for Health Behavior Researchers

ERIC Educational Resources Information Center

Warne, Russell T.; McKyer, E. J. Lisako; Smith, Matthew L.

2012-01-01

Objective: To introduce item response theory (IRT) to health behavior researchers by contrasting it with classical test theory and providing an example of IRT in health behavior. Method: Demonstrate IRT by fitting the 2PL model to substance-use survey data from the Adolescent Health Risk Behavior questionnaire (n = 1343 adolescents). Results: An…
Building an Evaluation Scale using Item Response Theory.

PubMed

Lalor, John P; Wu, Hao; Yu, Hong

2016-11-01

Evaluation of NLP methods requires testing against a previously vetted gold-standard test set and reporting standard metrics (accuracy/precision/recall/F1). The current assumption is that all items in a given test set are equal with regards to difficulty and discriminating power. We propose Item Response Theory (IRT) from psychometrics as an alternative means for gold-standard test-set generation and NLP system evaluation. IRT is able to describe characteristics of individual items - their difficulty and discriminating power - and can account for these characteristics in its estimation of human intelligence or ability for an NLP task. In this paper, we demonstrate IRT by generating a gold-standard test set for Recognizing Textual Entailment. By collecting a large number of human responses and fitting our IRT model, we show that our IRT model compares NLP systems with the performance in a human population and is able to provide more insight into system performance than standard evaluation metrics. We show that a high accuracy score does not always imply a high IRT score, which depends on the item characteristics and the response pattern.
Building an Evaluation Scale using Item Response Theory

PubMed Central

Lalor, John P.; Wu, Hao; Yu, Hong

2016-01-01

Evaluation of NLP methods requires testing against a previously vetted gold-standard test set and reporting standard metrics (accuracy/precision/recall/F1). The current assumption is that all items in a given test set are equal with regards to difficulty and discriminating power. We propose Item Response Theory (IRT) from psychometrics as an alternative means for gold-standard test-set generation and NLP system evaluation. IRT is able to describe characteristics of individual items - their difficulty and discriminating power - and can account for these characteristics in its estimation of human intelligence or ability for an NLP task. In this paper, we demonstrate IRT by generating a gold-standard test set for Recognizing Textual Entailment. By collecting a large number of human responses and fitting our IRT model, we show that our IRT model compares NLP systems with the performance in a human population and is able to provide more insight into system performance than standard evaluation metrics. We show that a high accuracy score does not always imply a high IRT score, which depends on the item characteristics and the response pattern. PMID:28004039
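The closing claim of this abstract, that equal accuracy need not imply equal IRT scores, can be illustrated with a minimal sketch. It assumes a two-parameter logistic (2PL) model and a simple grid-search maximum-likelihood estimate; the item parameters, the function names, and the two response patterns are hypothetical and are not taken from the paper.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def ml_theta(responses, items):
    """Grid-search maximum-likelihood ability estimate for a 0/1 response pattern."""
    grid = [g / 100.0 for g in range(-400, 401)]  # theta from -4.00 to 4.00

    def loglik(theta):
        total = 0.0
        for u, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            total += math.log(p) if u == 1 else math.log(1.0 - p)
        return total

    return max(grid, key=loglik)

# Hypothetical items as (discrimination a, difficulty b) pairs.
ITEMS = [(0.5, -1.0), (0.5, 0.0), (2.0, 0.0), (2.0, 1.0)]

# Two hypothetical systems, each with 2 of 4 items correct (same accuracy).
theta_easy_items = ml_theta([1, 1, 0, 0], ITEMS)  # correct on weakly discriminating items
theta_hard_items = ml_theta([0, 0, 1, 1], ITEMS)  # correct on highly discriminating items

print(theta_easy_items, theta_hard_items)
```

Under the 2PL model the ability estimate depends on which items were answered correctly, so the second pattern receives the higher estimate even though both patterns have 50% accuracy.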
Estimating the Nominal Response Model under Nonnormal Conditions

ERIC Educational Resources Information Center

Preston, Kathleen Suzanne Johnson; Reise, Steven Paul

2014-01-01

The nominal response model (NRM), a much understudied polytomous item response theory (IRT) model, provides researchers the unique opportunity to evaluate within-item category distinctions. Polytomous IRT models, such as the NRM, are frequently applied to psychological assessments representing constructs that are unlikely to be normally…
Projective Item Response Model for Test-Independent Measurement

ERIC Educational Resources Information Center

Ip, Edward Hak-Sing; Chen, Shyh-Huei

2012-01-01

The problem of fitting unidimensional item-response models to potentially multidimensional data has been extensively studied. The focus of this article is on response data that contains a major dimension of interest but that may also contain minor nuisance dimensions. Because fitting a unidimensional model to multidimensional data results in…
Item response theory and the measurement of motor behavior.

PubMed

Safrit, M J; Cohen, A S; Costa, M G

1989-12-01

Item response theory (IRT) has been the focus of intense research and development activity in educational and psychological measurement during the past decade. Because this theory can provide more precise information about test items than other theories usually used in measuring motor behavior, the application of IRT in physical education and exercise science merits investigation. In IRT, the difficulty level of each item (e.g., trial or task) can be estimated and placed on the same scale as the ability of the examinee. Using this information, the test developer can determine the ability levels at which the test functions best. Equating the scores of individuals on two or more items or tests can be handled efficiently by applying IRT. The precision of the identification of performance standards in a mastery test context can be enhanced, as can adaptive testing procedures. In this tutorial, several potential benefits of applying IRT to the measurement of motor behavior were described. An example is provided using bowling data and applying the graded-response form of the Rasch IRT model. The data were calibrated and the goodness of fit was examined. This analysis is described in a step-by-step approach. Limitations to using an IRT model with a test consisting of repeated measures were noted.
Cognitive psychology meets psychometric theory: on the relation between process models for decision making and latent variable models for individual differences.

PubMed

van der Maas, Han L J; Molenaar, Dylan; Maris, Gunter; Kievit, Rogier A; Borsboom, Denny

2011-04-01

This article analyzes latent variable models from a cognitive psychology perspective. We start by discussing work by Tuerlinckx and De Boeck (2005), who proved that a diffusion model for 2-choice response processes entails a 2-parameter logistic item response theory (IRT) model for individual differences in the response data. Following this line of reasoning, we discuss the appropriateness of IRT for measuring abilities and bipolar traits, such as pro versus contra attitudes. Surprisingly, if a diffusion model underlies the response processes, IRT models are appropriate for bipolar traits but not for ability tests. A reconsideration of the concept of ability that is appropriate for such situations leads to a new item response model for accuracy and speed based on the idea that ability has a natural zero point. The model implies fundamentally new ways to think about guessing, response speed, and person fit in IRT. We discuss the relation between this model and existing models as well as implications for psychology and psychometrics. 2011 APA, all rights reserved
Scale Model Icing Research Tunnel

NASA Technical Reports Server (NTRS)

Canacci, Victor A.

1997-01-01

NASA Lewis Research Center's Icing Research Tunnel (IRT) is the world's largest refrigerated wind tunnel and one of only three icing wind tunnel facilities in the United States. The IRT was constructed in the 1940's and has been operated continually since it was built. In this facility, natural icing conditions are duplicated to test the effects of inflight icing on actual aircraft components as well as on models of airplanes and helicopters. IRT tests have been used successfully to reduce flight test hours for the certification of ice-detection instrumentation and ice protection systems. To ensure that the IRT will remain the world's premier icing facility well into the next century, Lewis is making some renovations and is planning others. These improvements include modernizing the control room, replacing the fan blades with new ones to increase the test section maximum velocity to 430 mph, installing new spray bars to increase the size and uniformity of the artificial icing cloud, and replacing the facility heat exchanger. Most of the improvements will have a first-order effect on the IRT's airflow quality. To help us understand these effects and evaluate potential improvements to the flow characteristics of the IRT, we built a modular 1/10th-scale aerodynamic model of the facility. This closed-loop scale-model pilot tunnel was fabricated onsite in the various shops of Lewis' Fabrication Support Division. The tunnel's rectangular sections are composed of acrylic walls supported by an aluminum angle framework. Its turning vanes are made of tubing machined to the contour of the IRT turning vanes. The fan leg of the tunnel, which transitions from rectangular to circular and back to rectangular cross sections, is fabricated of fiberglass sections. The contraction section of the tunnel is constructed from sheet aluminum. 
A 12-bladed aluminum fan is coupled to a turbine powered by high-pressure air capable of driving the maximum test section velocity to 550 ft/sec (Mach 0.45). The air turbine and instrumentation are housed inside a fiberglass nacelle. Total and static pressure measurements can be taken around the loop, and velocity and flow angularity measurements can be taken with hot-wire and five-hole probes at specific locations. The Scale Model Icing Research Tunnel (SMIRT) is undergoing checkout tests to determine how its airflow characteristics compare with the IRT. Near-term uses for this scale-model tunnel include determining the aerodynamic effects of replacing the 52-year-old W-shaped heat exchanger with a flat-faced heat exchanger. SMIRT is an integral part of the improvements planned for the IRT because testing the proposed IRT improvements in a scale-model tunnel will lower costs and improve productivity.
On the Bayesian Nonparametric Generalization of IRT-Type Models

ERIC Educational Resources Information Center

San Martin, Ernesto; Jara, Alejandro; Rolin, Jean-Marie; Mouchart, Michel

2011-01-01

We study the identification and consistency of Bayesian semiparametric IRT-type models, where the uncertainty on the abilities' distribution is modeled using a prior distribution on the space of probability measures. We show that for the semiparametric Rasch Poisson counts model, simple restrictions ensure the identification of a general…
Measuring self-esteem after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Self-esteem item bank and short form

PubMed Central

Kalpakjian, Claire Z.; Tate, Denise G.; Kisala, Pamela A.; Tulsky, David S.

2015-01-01

Objective: To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Design: Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item response theory (IRT)-based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. Setting: We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. Participants: A total of 717 individuals with SCI completed the self-esteem items. Results: A unidimensional model was observed (CFI = 0.946; RMSEA = 0.087) and measurement precision was good (theta range between −2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. Conclusion: This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available. PMID:26010972
Measuring self-esteem after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Self-esteem item bank and short form.

PubMed

Kalpakjian, Claire Z; Tate, Denise G; Kisala, Pamela A; Tulsky, David S

2015-05-01

To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item-response theory-(IRT) based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. A total of 717 individuals with SCI completed the self-esteem items. A unidimensional model was observed (CFI=0.946; RMSEA=0.087) and measurement precision was good (theta range between -2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.
A Decision-Tree Approach to Cost Comparison of Newborn Screening Strategies for Cystic Fibrosis

PubMed Central

Wells, Janelle; Rosenberg, Marjorie; Hoffman, Gary; Anstead, Michael

2012-01-01

OBJECTIVE: Because cystic fibrosis can be difficult to diagnose and treat early, newborn screening programs have rapidly developed nationwide but methods vary widely. We therefore investigated the costs and consequences or specific outcomes of the 2 most commonly used methods. METHODS: With available data on screening and follow-up, we used a simulation approach with decision trees to compare immunoreactive trypsinogen (IRT) screening followed by a second IRT test against an IRT/DNA analysis. By using a Monte Carlo simulation program, variation in the model parameters for counts at various nodes of the decision trees, as well as for costs, are included and applied to fictional cohorts of 100 000 newborns. The outcome measures included the numbers of newborns given a diagnosis of cystic fibrosis and costs of screening strategy at each branch and cost per newborn. RESULTS: Simulations revealed a substantial number of potential missed diagnoses for the IRT/IRT system versus IRT/DNA. Although the IRT/IRT strategy with commonly used cutoff values offers an average overall cost savings of $2.30 per newborn, a breakdown of costs by societal segments demonstrated higher out-of-pocket costs for families. Two potential system failures causing delayed diagnoses were identified relating to the screening protocols and the follow-up system. CONCLUSIONS: The IRT/IRT screening algorithm reduces the costs to laboratories and insurance companies but has more system failures. IRT/DNA offers other advantages, including fewer delayed diagnoses and lower out-of-pocket costs to families. PMID:22291119
Extended Mixed-Effects Item Response Models with the MH-RM Algorithm

ERIC Educational Resources Information Center

Chalmers, R. Philip

2015-01-01

A mixed-effects item response theory (IRT) model is presented as a logical extension of the generalized linear mixed-effects modeling approach to formulating explanatory IRT models. Fixed and random coefficients in the extended model are estimated using a Metropolis-Hastings Robbins-Monro (MH-RM) stochastic imputation algorithm to accommodate for…
Bayesian Estimation of the Logistic Positive Exponent IRT Model

ERIC Educational Resources Information Center

Bolfarine, Heleno; Bazan, Jorge Luis

2010-01-01

A Bayesian inference approach using Markov Chain Monte Carlo (MCMC) is developed for the logistic positive exponent (LPE) model proposed by Samejima and for a new skewed Logistic Item Response Theory (IRT) model, named Reflection LPE model. Both models lead to asymmetric item characteristic curves (ICC) and can be appropriate because a symmetric…
Hybrid Model of IRT and Latent Class Models.

ERIC Educational Resources Information Center

Yamamoto, Kentaro

This study developed a hybrid of item response theory (IRT) models and latent class models, which combined the strengths of each type of model. The primary motivation for developing the new model is to describe characteristics of examinees' knowledge at the time of the examination. Hence, the application of the model lies mainly in so-called…
Stability of Rasch Scales over Time

ERIC Educational Resources Information Center

Taylor, Catherine S.; Lee, Yoonsun

2010-01-01

Item response theory (IRT) methods are generally used to create score scales for large-scale tests. Research has shown that IRT scales are stable across groups and over time. Most studies have focused on items that are dichotomously scored. Now Rasch and other IRT models are used to create scales for tests that include polytomously scored items.…
Score Equating and Item Response Theory: Some Practical Considerations.

ERIC Educational Resources Information Center

Cook, Linda L.; Eignor, Daniel R.

The purposes of this paper are five-fold to discuss: (1) when item response theory (IRT) equating methods should provide better results than traditional methods; (2) which IRT model, the three-parameter logistic or the one-parameter logistic (Rasch), is the most reasonable to use; (3) what unique contributions IRT methods can offer the equating…
A Zero- and K-Inflated Mixture Model for Health Questionnaire Data

PubMed Central

Finkelman, Matthew D.; Green, Jennifer Greif; Gruber, Michael J.; Zaslavsky, Alan M.

2011-01-01

In psychiatric assessment, Item Response Theory (IRT) is a popular tool to formalize the relation between the severity of a disorder and associated responses to questionnaire items. Practitioners of IRT sometimes make the assumption of normally distributed severities within a population; while convenient, this assumption is often violated when measuring psychiatric disorders. Specifically, there may be a sizable group of respondents whose answers place them at an extreme of the latent trait spectrum. In this article, a zero- and K-inflated mixture model is developed to account for the presence of such respondents. The model is fitted using an expectation-maximization (E-M) algorithm to estimate the percentage of the population at each end of the continuum, concurrently analyzing the remaining “graded component” via IRT. A method to perform factor analysis for only the graded component is introduced. In assessments of oppositional defiant disorder and conduct disorder, the zero- and K-inflated model exhibited better fit than the standard IRT model. PMID:21365673

Goodness of Model-Data Fit and Invariant Measurement

ERIC Educational Resources Information Center

Engelhard, George, Jr.; Perkins, Aminah

2013-01-01

In this commentary, Engelhard and Perkins remark that Maydeu-Olivares has presented a framework for evaluating the goodness of model-data fit for item response theory (IRT) models and correctly points out that overall goodness-of-fit evaluations of IRT models and data are not generally explored within most applications in educational and…
Applying Kaplan-Meier to Item Response Data

ERIC Educational Resources Information Center

McNeish, Daniel

2018-01-01

Some IRT models can be equivalently modeled in alternative frameworks such as logistic regression. Logistic regression can also model time-to-event data, which concerns the probability of an event occurring over time. Using the relation between time-to-event models and logistic regression and the relation between logistic regression and IRT, this…
IRT Model Selection Methods for Dichotomous Items

ERIC Educational Resources Information Center

Kang, Taehoon; Cohen, Allan S.

2007-01-01

Fit of the model to the data is important if the benefits of item response theory (IRT) are to be obtained. In this study, the authors compared model selection results using the likelihood ratio test, two information-based criteria, and two Bayesian methods. An example illustrated the potential for inconsistency in model selection depending on…
Estimating a Noncompensatory IRT Model Using Metropolis within Gibbs Sampling

ERIC Educational Resources Information Center

Babcock, Ben

2011-01-01

Relatively little research has been conducted with the noncompensatory class of multidimensional item response theory (MIRT) models. A Monte Carlo simulation study was conducted exploring the estimation of a two-parameter noncompensatory item response theory (IRT) model. The estimation method used was a Metropolis-Hastings within Gibbs algorithm…
Adaptive Testing without IRT.

ERIC Educational Resources Information Center

Yan, Duanli; Lewis, Charles; Stocking, Martha

It is unrealistic to suppose that standard item response theory (IRT) models will be appropriate for all new and currently considered computer-based tests. In addition to developing new models, researchers will need to give some attention to the possibility of constructing and analyzing new tests without the aid of strong models. Computerized…
The Performance of IRT Model Selection Methods with Mixed-Format Tests

ERIC Educational Resources Information Center

Whittaker, Tiffany A.; Chang, Wanchen; Dodd, Barbara G.

2012-01-01

When tests consist of multiple-choice and constructed-response items, researchers are confronted with the question of which item response theory (IRT) model combination will appropriately represent the data collected from these mixed-format tests. This simulation study examined the performance of six model selection criteria, including the…
A Comparison of General Diagnostic Models (GDM) and Bayesian Networks Using a Middle School Mathematics Test

ERIC Educational Resources Information Center

Wu, Haiyan

2013-01-01

General diagnostic models (GDMs) and Bayesian networks are mathematical frameworks that cover a wide variety of psychometric models. Both extend latent class models, and while GDMs also extend item response theory (IRT) models, Bayesian networks can be parameterized using discretized IRT. The purpose of this study is to examine similarities and…
Detection of Differential Item Functioning with Nonlinear Regression: A Non-IRT Approach Accounting for Guessing

ERIC Educational Resources Information Center

Drabinová, Adéla; Martinková, Patrícia

2017-01-01

In this article we present a general approach not relying on item response theory models (non-IRT) to detect differential item functioning (DIF) in dichotomous items with presence of guessing. The proposed nonlinear regression (NLR) procedure for DIF detection is an extension of method based on logistic regression. As a non-IRT approach, NLR can…
Relationships among Classical Test Theory and Item Response Theory Frameworks via Factor Analytic Models

ERIC Educational Resources Information Center

Kohli, Nidhi; Koran, Jennifer; Henn, Lisa

2015-01-01

There are well-defined theoretical differences between the classical test theory (CTT) and item response theory (IRT) frameworks. It is understood that in the CTT framework, person and item statistics are test- and sample-dependent. This is not the perception with IRT. For this reason, the IRT framework is considered to be theoretically superior…
Scale Refinement and Initial Evaluation of a Behavioral Health Function Measurement Tool for Work Disability Evaluation

PubMed Central

Marfeo, Elizabeth E.; Ni, Pengsheng; Bogusz, Kara; Meterko, Mark; McDonough, Christine M.; Chan, Leighton; Rasch, Elizabeth K.; Brandt, Diane E.; Jette, Alan M.

2014-01-01

Objectives: To use item response theory (IRT) data simulations to construct and perform initial psychometric testing of a newly developed instrument, the Social Security Administration Behavioral Health Function (SSA-BH) instrument, that aims to assess behavioral health functioning relevant to the context of work. Design: Cross-sectional survey followed by item response theory (IRT) calibration data simulations. Setting: Community. Participants: A sample of individuals applying for SSA disability benefits, claimants (N=1015), and a normative comparative sample of US adults (N=1000). Interventions: None. Main Outcome Measure: Social Security Administration Behavioral Health Function (SSA-BH) measurement instrument. Results: Item response theory analyses supported the unidimensionality of four SSA-BH scales: Mood and Emotions (35 items), Self-Efficacy (23 items), Social Interactions (6 items), and Behavioral Control (15 items). All SSA-BH scales demonstrated strong psychometric properties including reliability, accuracy, and breadth of coverage. High correlations of the simulated 5- or 10-item CATs with the full item bank indicated robust ability of the CAT approach to comprehensively characterize behavioral health function along four distinct dimensions. Conclusions: Initial testing and evaluation of the SSA-BH instrument demonstrated good accuracy, reliability, and content coverage along all four scales. Behavioral function profiles of SSA claimants were generated and compared to age and sex matched norms along four scales: Mood and Emotions, Behavioral Control, Social Interactions, and Self-Efficacy. Utilizing the CAT based approach offers the ability to collect standardized, comprehensive functional information about claimants in an efficient way, which may prove useful in the context of the SSA’s work disability programs. PMID:23542404
The Protective Behavioral Strategies for Marijuana Scale: Further examination using item response theory.

PubMed

Pedersen, Eric R; Huang, Wenjing; Dvorak, Robert D; Prince, Mark A; Hummer, Justin F

2017-08-01

Given recent state legislation legalizing marijuana for recreational purposes and majority popular opinion favoring these laws, we developed the Protective Behavioral Strategies for Marijuana scale (PBSM) to identify strategies that may mitigate the harms related to marijuana use among those young people who choose to use the drug. In the current study, we expand on the initial exploratory study of the PBSM to further validate the measure with a large and geographically diverse sample (N = 2,117; 60% women, 30% non-White) of college students from 11 different universities across the United States. We sought to develop a psychometrically sound item bank for the PBSM and to create a short assessment form that minimizes respondent burden and time. Quantitative item analyses, including exploratory and confirmatory factor analyses with item response theory (IRT) and evaluation of differential item functioning (DIF), revealed an item bank of 36 items that was examined for unidimensionality and good content coverage, as well as a short form of 17 items that is free of bias in terms of gender (men vs. women), race (White vs. non-White), ethnicity (Hispanic vs. non-Hispanic), and recreational marijuana use legal status (state recreational marijuana was legal for 25.5% of participants). We also provide a scoring table for easy transformation from sum scores to IRT scale scores. The PBSM item bank and short form associated strongly and negatively with past month marijuana use and consequences. The measure may be useful to researchers and clinicians conducting intervention and prevention programs with young adults. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Mixture Rasch model for guessing group identification

NASA Astrophysics Data System (ADS)

Siow, Hoo Leong; Mahdi, Rasidah; Siew, Eng Ling

2013-04-01

Several alternative dichotomous Item Response Theory (IRT) models have been introduced to account for guessing effect in multiple-choice assessment. The guessing effect in these models has been considered to be item-related. In the most classic case, pseudo-guessing in the three-parameter logistic IRT model is modeled to be the same for all the subjects but may vary across items. This is not realistic because subjects can guess worse or better than the pseudo-guessing. Derivation from the three-parameter logistic IRT model improves the situation by incorporating ability in guessing. However, it does not model non-monotone function. This paper proposes to study guessing from a subject-related aspect which is guessing test-taking behavior. Mixture Rasch model is employed to detect latent groups. A hybrid of mixture Rasch and 3-parameter logistic IRT model is proposed to model the behavior based guessing from the subjects' ways of responding to the items. The subjects are assumed to simply choose a response at random. An information criterion is proposed to identify the behavior based guessing group. Results show that the proposed model selection criterion provides a promising method to identify the guessing group modeled by the hybrid model.
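For readers unfamiliar with the item-related pseudo-guessing parameter this abstract refers to, the following is a minimal sketch of the standard three-parameter logistic (3PL) item response function. It is not the hybrid mixture model proposed in the paper; the function name and the parameter values are hypothetical.

```python
import math

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response.

    a: discrimination, b: difficulty, c: pseudo-guessing (lower asymptote).
    c belongs to the item, so it is the same for every examinee.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical four-option multiple-choice item with c = 0.25.
A, B, C = 1.0, 0.0, 0.25

p_very_low_ability = p_3pl(-10.0, A, B, C)  # approaches the lower asymptote c
p_at_difficulty = p_3pl(B, A, B, C)         # equals c + (1 - c) / 2

print(p_very_low_ability, p_at_difficulty)
```

Because the lower asymptote is fixed per item, the function cannot represent examinees who guess better or worse than c, which is the limitation the abstract raises.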
Flow Quality Measurements in an Aerodynamic Model of NASA Lewis' Icing Research Tunnel

NASA Technical Reports Server (NTRS)

Canacci, Victor A.; Gonsalez, Jose C.

1999-01-01

As part of an ongoing effort to improve the aerodynamic flow characteristics of the Icing Research Tunnel (IRT), a modular scale model of the facility was fabricated. This 1/10th-scale model was used to gain further understanding of the flow characteristics in the IRT. The model was outfitted with instrumentation and data acquisition systems to determine pressures, velocities, and flow angles in the settling chamber and test section. Parametric flow quality studies involving the insertion and removal of a model of the IRT's distinctive heat exchanger (cooler) and/or of a honeycomb in the settling chamber were performed. These experiments illustrate the resulting improvement or degradation in flow quality.
Geometrical Characteristics of Cd-Rich Inclusion Defects in CdZnTe Materials

NASA Astrophysics Data System (ADS)

Xu, Chao; Sheng, Fengfeng; Yang, Jianrong

2017-08-01

The geometrical characteristics of Cd-rich inclusion defects in CdZnTe crystals have been investigated by infrared transmission (IRT) microscopy and chemical etching methods, revealing that they are composed of a Cd-rich inclusion core zone with high dislocation density and defect extension belts. Based on the experimental results, the orientation and shape of these belts were determined, showing that their extension directions in three-dimensional (3-D) space are along <211> crystal orientation. To explain the observed IRT images of Cd-rich inclusion defects, a 3-D model with plate-shaped structure for dislocation extension belts is proposed. Greyscale IRT images of dislocation extension belts thus depend on their absorption layer thickness. Assuming that defects can be discerned by IRT microscopy only when their absorption layer thickness is greater than twice that of the plate-shaped dislocation extension belts, this 3-D defect model can rationalize the IRT images of Cd-rich inclusion defects.
Item Response Theory and Health Outcomes Measurement in the 21st Century

PubMed Central

Hays, Ron D.; Morales, Leo S.; Reise, Steve P.

2006-01-01

Item response theory (IRT) has a number of potential advantages over classical test theory in assessing self-reported health outcomes. IRT models yield invariant item and latent trait estimates (within a linear transformation), standard errors conditional on trait level, and trait estimates anchored to item content. IRT also facilitates evaluation of differential item functioning, inclusion of items with different response formats in the same scale, and assessment of person fit and is ideally suited for implementing computer adaptive testing. Finally, IRT methods can be helpful in developing better health outcome measures and in assessing change over time. These issues are reviewed, along with a discussion of some of the methodological and practical challenges in applying IRT methods. PMID:10982088
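The phrase "standard errors conditional on trait level" can be made concrete with a short sketch. It assumes a 2PL model, for which the item information at trait level theta is a² · P · (1 − P) and the standard error is the inverse square root of the summed information; the item parameters below are hypothetical.

```python
import math

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def standard_error(theta, items):
    """Conditional standard error: 1 / sqrt(test information at theta)."""
    info = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(info)

# Hypothetical scale: (discrimination, difficulty) pairs clustered near theta = 0.
ITEMS = [(1.5, -0.5), (1.2, 0.0), (1.8, 0.5)]

se_center = standard_error(0.0, ITEMS)   # trait level the items target
se_extreme = standard_error(3.0, ITEMS)  # trait level far from every item

print(se_center, se_extreme)
```

Unlike a single reliability coefficient from classical test theory, the standard error here varies with trait level and is smallest where the items are located.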
An NCME Instructional Module on Item-Fit Statistics for Item Response Theory Models

ERIC Educational Resources Information Center

Ames, Allison J.; Penfield, Randall D.

2015-01-01

Drawing valid inferences from item response theory (IRT) models is contingent upon a good fit of the data to the model. Violations of model-data fit have numerous consequences, limiting the usefulness and applicability of the model. This instructional module provides an overview of methods used for evaluating the fit of IRT models. Upon completing…
Factor Structure of the Quality of Life Scale for Mental Disorders in Patients With Schizophrenia.

PubMed

Chiu, En-Chi; Lee, Shu-Chun

2018-06-01

The Quality of Life for Mental Disorders (QOLMD) scale was designed to measure health-related quality of life (HRQOL) in patients with mental illness, especially schizophrenia. The QOLMD contains 45 items, which are divided into eight domains. However, the factor structure of the QOLMD has not been evaluated, which restricts the interpretations of the results of this scale. The purpose of this study was to evaluate the factor structures (i.e., unidimensionality, eight-factor structure, and second-order model) of the QOLMD in patients with schizophrenia. Two hundred thirty-eight outpatients with schizophrenia participated. We first conducted confirmatory factor analysis to evaluate the unidimensionality of each domain. After the unidimensionality of the eight individual domains was supported, we examined the eight-factor structure and second-order model. The results of unidimensionality showed sufficient model fit in all of the domains with the exception of the autonomy domain. A good model fit was confirmed for the autonomy domain after deleting two of the original items. The eight-factor structure for the 43-item QOLMD showed an acceptable model fit, although the second-order model showed poor model fit. Our results supported the unidimensionality and eight-factor structure of the 43-item QOLMD. The sum score for each of the domains may be used to reflect its domain-specific function. We recommend using the 43-item QOLMD to capture the multiple domains of HRQOL. However, the second-order model showed an unsatisfactory model fit. Furthermore, caution is advised when interpreting overall HRQOL using the total score for the eight domains.
The e-MSWS-12: improving the multiple sclerosis walking scale using item response theory.

PubMed

Engelhard, Matthew M; Schmidt, Karen M; Engel, Casey E; Brenton, J Nicholas; Patek, Stephen D; Goldman, Myla D

2016-12-01

The Multiple Sclerosis Walking Scale (MSWS-12) is the predominant patient-reported measure of multiple sclerosis (MS)-related walking ability, yet it had not been analyzed using item response theory (IRT), the emerging standard for patient-reported outcome (PRO) validation. This study aims to reduce MSWS-12 measurement error and facilitate computerized adaptive testing by creating an IRT model of the MSWS-12 and distributing it online. MSWS-12 responses from 284 subjects with MS were collected by mail and used to fit and compare several IRT models. Following model selection and assessment, subpopulations based on age and sex were tested for differential item functioning (DIF). Model comparison favored a one-dimensional graded response model (GRM). This model met fit criteria and explained 87% of response variance. The performance of each MSWS-12 item was characterized using category response curves (CRCs) and item information. IRT-based MSWS-12 scores correlated with traditional MSWS-12 scores (r = 0.99) and timed 25-foot walk (T25FW) speed (r = -0.70). Item 2 showed DIF based on age (χ² = 19.02, df = 5, p < 0.01), and Item 11 showed DIF based on sex (χ² = 13.76, df = 5, p = 0.02). MSWS-12 measurement error depends on walking ability, but could be lowered by improving or replacing items with low information or DIF. The e-MSWS-12 includes IRT-based scoring, error checking, and an estimated T25FW derived from MSWS-12 responses. It is available at https://ms-irt.shinyapps.io/e-MSWS-12 .
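To illustrate the category response curves mentioned in the abstract above, here is a minimal sketch of the graded response model for a single item. The function name, the discrimination value, and the thresholds are hypothetical and are not taken from the MSWS-12 study.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta: latent trait value.
    a: item discrimination (hypothetical value in the example below).
    thresholds: increasing list of K-1 boundary locations (hypothetical).
    """
    # Cumulative curves P(X >= k), padded with 1 for the lowest category
    # and 0 past the highest category.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulative curves.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Illustrative item with four ordered response categories.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
```

Evaluating the function over a grid of theta values traces out one curve per response category; the probabilities at any single theta sum to one.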
How Often Is the Misfit of Item Response Theory Models Practically Significant?

ERIC Educational Resources Information Center

Sinharay, Sandip; Haberman, Shelby J.

2014-01-01

Standard 3.9 of the Standards for Educational and Psychological Testing (1999) demands evidence of model fit when item response theory (IRT) models are applied to data from tests. Hambleton and Han (2005) and Sinharay (2005) recommended the assessment of practical significance of misfit of IRT models, but…
An Estimation Procedure for the Structural Parameters of the Unified Cognitive/IRT Model.

ERIC Educational Resources Information Center

Jiang, Hai; And Others

L. V. DiBello, W. F. Stout, and L. A. Roussos (1993) have developed a new item response model, the Unified Model, which brings together the discrete, deterministic aspects of cognition favored by cognitive scientists, and the continuous, stochastic aspects of test response behavior that underlie item response theory (IRT). The Unified Model blends…

Fitting a Mixture Item Response Theory Model to Personality Questionnaire Data: Characterizing Latent Classes and Investigating Possibilities for Improving Prediction

ERIC Educational Resources Information Center

Maij-de Meij, Annette M.; Kelderman, Henk; van der Flier, Henk

2008-01-01

Mixture item response theory (IRT) models aid the interpretation of response behavior on personality tests and may provide possibilities for improving prediction. Heterogeneity in the population is modeled by identifying homogeneous subgroups that conform to different measurement models. In this study, mixture IRT models were applied to the…
A Primer on the 2- and 3-Parameter Item Response Theory Models.

ERIC Educational Resources Information Center

Thornton, Artist

Item response theory (IRT) is a useful and effective tool for item response measurement if used in the proper context. This paper discusses the sets of assumptions under which responses can be modeled while exploring the framework of the IRT models relative to response testing. The one parameter model, or one parameter logistic model, is perhaps…
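For readers unfamiliar with the models named in this record, the following is a minimal sketch of the standard three-parameter logistic item response function, with the two-parameter model as a special case. The function name and the parameter values are illustrative assumptions, not material from the paper.

```python
import math

def irf_3pl(theta, a, b, c=0.0):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: lower asymptote (pseudo-guessing). Setting c = 0 gives the 2PL.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: at theta == b the 2PL probability is exactly one half,
# and a guessing parameter of 0.2 raises it to c + (1 - c) / 2.
p_2pl = irf_3pl(theta=0.0, a=1.0, b=0.0)
p_3pl = irf_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)
```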
Distinguishing Continuous and Discrete Approaches to Multilevel Mixture IRT Models: A Model Comparison Perspective

ERIC Educational Resources Information Center

Zhu, Xiaoshu

2013-01-01

The current study introduced a general modeling framework, multilevel mixture IRT (MMIRT) which detects and describes characteristics of population heterogeneity, while accommodating the hierarchical data structure. In addition to introducing both continuous and discrete approaches to MMIRT, the main focus of the current study was to distinguish…
Finite Mixture Multilevel Multidimensional Ordinal IRT Models for Large Scale Cross-Cultural Research

ERIC Educational Resources Information Center

de Jong, Martijn G.; Steenkamp, Jan-Benedict E. M.

2010-01-01

We present a class of finite mixture multilevel multidimensional ordinal IRT models for large scale cross-cultural research. Our model is proposed for confirmatory research settings. Our prior for item parameters is a mixture distribution to accommodate situations where different groups of countries have different measurement operations, while…
IRT-ZIP Modeling for Multivariate Zero-Inflated Count Data

ERIC Educational Resources Information Center

Wang, Lijuan

2010-01-01

This study introduces an item response theory-zero-inflated Poisson (IRT-ZIP) model to investigate psychometric properties of multiple items and predict individuals' latent trait scores for multivariate zero-inflated count data. In the model, two link functions are used to capture two processes of the zero-inflated count data. Item parameters are…
Diagnosing cystic fibrosis in newborn screening in Poland - 15 years of experience.

PubMed

Sands, Dorota; Zybert, Katarzyna; Mierzejewska, Ewa; Ołtarzewski, Mariusz

2015-01-01

Early diagnosis of cystic fibrosis (CF) made possible by the introduction of CF NBS (Cystic Fibrosis Newborn Screening) provides the opportunity to undertake preventive measures and provide treatment before the development of irreversible changes in the respiratory tract and other complications. CF NBS was conducted as a pilot programme in four Polish districts in the period 1999-2003. In 2006 CF NBS started again and was gradually extended across the country. The aim of this study was to show the evolution of the Polish CF NBS strategies and assess the diagnostic consequences of this programme. The study involved children diagnosed and treated only in the IMiD Centre. The strategy in Polish CF NBS was modified over time. First, the IRT/IRT and IRT/IRT/DNA models with one mutation were implemented, followed by IRT/DNA with a gradually expanding number of CFTR mutations (tab. I). Newborns with positive results of CF NBS were called to the CF IMiD Centre, and sweat tests were performed. The children diagnosed and children with mutations in both alleles of the CFTR gene (even if at least one of them had undefined pathogenicity) were taken under IMiD Centre care. Sensitivity, specificity and positive predictive values during subsequent stages of CF NBS were calculated (tab. III). During the 1999-2003 pilot study 444 063 newborns underwent CF NBS and in 74 cases CF was diagnosed. 582 693 newborns were screened from September 2006 to December 2011 in four regions and 100 children were diagnosed with CF. The frequencies of CF in the Polish population in both screening periods were 1:5767 and 1:5712 respectively. Firstly, the IRT/IRT model was implemented, but the number of newborns called to the CF Centre was high - the PPV was 7.6%. In the next step CF NBS DNA analysis was used. Here sensitivity and specificity were high - nearly 100%. In the following years the number of mutations detected was expanded (including 16 most common ones in the Polish population). Due to the panel changes, the number of calls declined and the PPV (positive predictive value) improved (to 26.1%) after the application of expanded genetic analysis. Expanding the panel of mutations resulted in an increased number of carriers and observational subjects. IRT/DNA strategy with expanded DNA analysis provides the opportunity for earlier CF diagnosis even in children with normal sweat test values. However, this model caused frequent carrier detection and inconclusive diagnosis in comparison to IRT/IRT or IRT/IRT/DNA with a limited number of mutations. Further research and changes in Polish CF NBS are needed to increase the PPV, while preserving high sensitivity and specificity.
Unidimensional factor models imply weaker partial correlations than zero-order correlations.

PubMed

van Bork, Riet; Grasman, Raoul P P P; Waldorp, Lourens J

2018-06-01

In this paper we present a new implication of the unidimensional factor model. We prove that the partial correlation between two observed variables that load on one factor given any subset of other observed variables that load on this factor lies between zero and the zero-order correlation between these two observed variables. We implement this result in an empirical bootstrap test that rejects the unidimensional factor model when partial correlations are identified that are either stronger than the zero-order correlation or have a different sign than the zero-order correlation. We demonstrate the use of the test in an empirical data example with data consisting of fourteen items that measure extraversion.
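The implication described above can be checked numerically in a small case. This is a minimal sketch, assuming standardized variables, three hypothetical positive loadings, and conditioning on a single third variable (the paper's result covers any subset of the other variables). It is not the authors' bootstrap test.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y given z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical standardized loadings on a single common factor.
loadings = [0.8, 0.7, 0.6]

# Under a unidimensional factor model with standardized variables,
# the model-implied correlation between two items is the product of their loadings.
r12 = loadings[0] * loadings[1]
r13 = loadings[0] * loadings[2]
r23 = loadings[1] * loadings[2]

# The partial correlation should fall between zero and the zero-order value.
p12_given_3 = partial_corr(r12, r13, r23)
```

A sample partial correlation that is clearly larger than the zero-order correlation, or of the opposite sign, is the kind of evidence against unidimensionality that the abstract describes.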
Methodological issues regarding power of classical test theory (CTT) and item response theory (IRT)-based approaches for the comparison of patient-reported outcomes in two groups of patients - a simulation study

PubMed Central

2010-01-01

Background Patient-Reported Outcomes (PRO) are increasingly used in clinical and epidemiological research. Two main types of analytical strategies can be found for these data: classical test theory (CTT) based on the observed scores and models coming from Item Response Theory (IRT). However, whether IRT or CTT would be the most appropriate method to analyse PRO data remains unknown. The statistical properties of CTT and IRT, regarding power and corresponding effect sizes, were compared. Methods Two-group cross-sectional studies were simulated for the comparison of PRO data using IRT or CTT-based analysis. For IRT, different scenarios were investigated according to whether item or person parameters were assumed to be known, to a certain extent for item parameters, from good to poor precision, or unknown and therefore had to be estimated. The powers obtained with IRT or CTT were compared and parameters having the strongest impact on them were identified. Results When person parameters were assumed to be unknown and item parameters to be either known or not, the powers achieved using IRT or CTT were similar and always lower than the expected power using the well-known sample size formula for normally distributed endpoints. The number of items had a substantial impact on power for both methods. Conclusion Without any missing data, IRT and CTT seem to provide comparable power. The classical sample size formula for CTT seems to be adequate under some conditions but is not appropriate for IRT. In IRT, it seems important to take account of the number of items to obtain an accurate formula. PMID:20338031
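The abstract compares simulated power against the expected power from a sample size formula for normally distributed endpoints. The record does not give that formula, so the sketch below assumes the usual normal approximation for a two-sided, two-group comparison with equal group sizes; the function name and the example values are hypothetical.

```python
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-group comparison of means.

    Assumes a normally distributed endpoint, equal group sizes, and a
    standardized effect size (mean difference divided by the common SD).
    The negligible contribution of the opposite tail is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * (n_per_group / 2) ** 0.5
    return z.cdf(noncentrality - z_crit)

# Hypothetical scenario: standardized effect 0.5 with 64 patients per group.
power = approx_power(effect_size=0.5, n_per_group=64)
```

The study's finding is that power achieved with latent-trait or observed-score analyses of questionnaire items can fall below this kind of benchmark, with the number of items playing a substantial role.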
The Organization of Controller Motifs Leading to Robust Plant Iron Homeostasis

PubMed Central

Agafonov, Oleg; Selstø, Christina Helen; Thorsen, Kristian; Xu, Xiang Ming; Drengstig, Tormod; Ruoff, Peter

2016-01-01

Iron is an essential element needed by all organisms for growth and development. Because iron becomes toxic at higher concentrations, it is under homeostatic control. Plants also face the problem that iron in the soil is tightly bound to oxygen and difficult to access. Plants have therefore developed special mechanisms for iron uptake and regulation. In recent years key components of plant iron regulation have been identified. How these components integrate and maintain robust iron homeostasis is presently not well understood. Here we use a computational approach to identify mechanisms for robust iron homeostasis in non-graminaceous plants. In comparison with experimental results certain control arrangements can be eliminated, among them that iron homeostasis is solely based on an iron-dependent degradation of the transporter IRT1. Recent IRT1 overexpression experiments suggested that IRT1-degradation is iron-independent. This suggestion appears to be misleading. We show that iron signaling pathways under IRT1 overexpression conditions become saturated, leading to a breakdown in iron regulation and to the observed iron-independent degradation of IRT1. A model that complies with experimental data places the regulation of cytosolic iron at the transcript level of the transcription factor FIT. Including the experimental observation that FIT induces inhibition of IRT1 turnover, we found a significant improvement in the system’s response time, suggesting a functional role for the FIT-mediated inhibition of IRT1 degradation. By combining iron uptake with storage and remobilization mechanisms a model is obtained which in a concerted manner integrates iron uptake, storage and remobilization. In agreement with experiments the model does not store iron during its high-affinity uptake. As an iron biofortification approach we discuss how iron could be accumulated even during high-affinity uptake. PMID:26800438
Using SAS PROC MCMC for Item Response Theory Models

PubMed Central

Samonte, Kelli

2014-01-01

Interest in using Bayesian methods for estimating item response theory models has grown at a remarkable rate in recent years. This attentiveness to Bayesian estimation has also inspired a growth in available software such as WinBUGS, R packages, BMIRT, MPLUS, and SAS PROC MCMC. This article intends to provide an accessible overview of Bayesian methods in the context of item response theory to serve as a useful guide for practitioners in estimating and interpreting item response theory (IRT) models. Included is a description of the estimation procedure used by SAS PROC MCMC. Syntax is provided for estimation of both dichotomous and polytomous IRT models, as well as a discussion on how to extend the syntax to accommodate more complex IRT models. PMID:29795834
A novel method for expediting the development of patient-reported outcome measures and an evaluation across several populations

PubMed Central

Garrard, Lili; Price, Larry R.; Bott, Marjorie J.; Gajewski, Byron J.

2016-01-01

Item response theory (IRT) models provide an appropriate alternative to the classical ordinal confirmatory factor analysis (CFA) during the development of patient-reported outcome measures (PROMs). Current literature has identified the assessment of IRT model fit as both challenging and underdeveloped (Sinharay & Johnson, 2003; Sinharay, Johnson, & Stern, 2006). This study evaluates the performance of Ordinal Bayesian Instrument Development (OBID), a Bayesian IRT model with a probit link function approach, through applications in two breast cancer-related instrument development studies. The primary focus is to investigate an appropriate method for comparing Bayesian IRT models in PROMs development. An exact Bayesian leave-one-out cross-validation (LOO-CV) approach (Vehtari & Lampinen, 2002) is implemented to assess prior selection for the item discrimination parameter in the IRT model and subject content experts’ bias (in a statistical sense and not to be confused with psychometric bias as in differential item functioning) toward the estimation of item-to-domain correlations. Results support the utilization of content subject experts’ information in establishing evidence for construct validity when sample size is small. However, the incorporation of subject experts’ content information in the OBID approach can be sensitive to the level of expertise of the recruited experts. More stringent efforts need to be invested in the appropriate selection of subject experts to efficiently use the OBID approach and reduce potential bias during PROMs development. PMID:27667878
A Comparison of Item Fit Statistics for Mixed IRT Models

ERIC Educational Resources Information Center

Chon, Kyong Hee; Lee, Won-Chan; Dunbar, Stephen B.

2010-01-01

In this study we examined procedures for assessing model-data fit of item response theory (IRT) models for mixed format data. The model fit indices used in this study include PARSCALE's G², Orlando and Thissen's S-X² and S-G², and Stone's χ²* and G²*. To investigate the…
Comparing the Fit of Item Response Theory and Factor Analysis Models

ERIC Educational Resources Information Center

Maydeu-Olivares, Alberto; Cai, Li; Hernandez, Adolfo

2011-01-01

Linear factor analysis (FA) models can be reliably tested using test statistics based on residual covariances. We show that the same statistics can be used to reliably test the fit of item response theory (IRT) models for ordinal data (under some conditions). Hence, the fit of an FA model and of an IRT model to the same data set can now be…
A Note on the Equivalence between Observed and Expected Information Functions with Polytomous IRT Models

ERIC Educational Resources Information Center

Magis, David

2015-01-01

The purpose of this note is to study the equivalence of observed and expected (Fisher) information functions with polytomous item response theory (IRT) models. It is established that observed and expected information functions are equivalent for the class of divide-by-total models (including partial credit, generalized partial credit, rating…
An Extension of Least Squares Estimation of IRT Linking Coefficients for the Graded Response Model

ERIC Educational Resources Information Center

Kim, Seonghoon

2010-01-01

The three types (generalized, unweighted, and weighted) of least squares methods, proposed by Ogasawara, for estimating item response theory (IRT) linking coefficients under dichotomous models are extended to the graded response model. A simulation study was conducted to confirm the accuracy of the extended formulas, and a real data study was…
Rasch Analysis for Binary Data with Nonignorable Nonresponses

ERIC Educational Resources Information Center

Bertoli-Barsotti, Lucio; Punzo, Antonio

2013-01-01

This paper introduces a two-dimensional Item Response Theory (IRT) model to deal with nonignorable nonresponses in tests with dichotomous items. One dimension provides information about the omitting behavior, while the other dimension is related to the person's "ability". The idea of embedding an IRT model for missingness into the measurement…
Invariance Properties for General Diagnostic Classification Models

ERIC Educational Resources Information Center

Bradshaw, Laine P.; Madison, Matthew J.

2016-01-01

In item response theory (IRT), the invariance property states that item parameter estimates are independent of the examinee sample, and examinee ability estimates are independent of the test items. While this property has long been established and understood by the measurement community for IRT models, the same cannot be said for diagnostic…
Applying item response theory and computer adaptive testing: the challenges for health outcomes assessment.

PubMed

Fayers, Peter M

2007-01-01

We review the papers presented at the NCI/DIA conference, to identify areas of controversy and uncertainty, and to highlight those aspects of item response theory (IRT) and computer adaptive testing (CAT) that require theoretical or empirical research in order to justify their application to patient reported outcomes (PROs). IRT and CAT offer exciting potential for the development of a new generation of PRO instruments. However, most of the research into these techniques has been in non-healthcare settings, notably in education. Educational tests are very different from PRO instruments, and consequently problematic issues arise when adapting IRT and CAT to healthcare research. Clinical scales differ appreciably from educational tests, and symptoms have characteristics distinctly different from examination questions. This affects the transfer of IRT technology. Particular areas of concern when applying IRT to PROs include inadequate software, difficulties in selecting models and communicating results, insufficient testing of local independence and other assumptions, and a need for guidelines for estimating sample size requirements. Similar concerns apply to differential item functioning (DIF), which is an important application of IRT. Multidimensional IRT is likely to be advantageous only for closely related PRO dimensions. Although IRT and CAT provide appreciable potential benefits, there is a need for circumspection. Not all PRO scales are necessarily appropriate targets for this methodology. Traditional psychometric methods, and especially qualitative methods, continue to have an important role alongside IRT. Research should be funded to address the specific concerns that have been identified.
Georg Rasch and Benjamin Wright's Struggle with the Unidimensional Polytomous Model with Sufficient Statistics

ERIC Educational Resources Information Center

Andrich, David

2016-01-01

This article reproduces correspondence between Georg Rasch of The University of Copenhagen and Benjamin Wright of The University of Chicago in the period from January 1966 to July 1967. This correspondence reveals their struggle to operationalize a unidimensional measurement model with sufficient statistics for responses in a set of ordered…

An Extension of IRT-Based Equating to the Dichotomous Testlet Response Theory Model

ERIC Educational Resources Information Center

Tao, Wei; Cao, Yi

2016-01-01

Current procedures for equating number-correct scores using traditional item response theory (IRT) methods assume local independence. However, when tests are constructed using testlets, one concern is the violation of the local item independence assumption. The testlet response theory (TRT) model is one way to accommodate local item dependence.…
New Method of Calibrating IRT Models.

ERIC Educational Resources Information Center

Jiang, Hai; Tang, K. Linda

This discussion of new methods for calibrating item response theory (IRT) models looks into new optimization procedures, such as the Genetic Algorithm (GA), to improve on the use of the Newton-Raphson procedure. The advantage of using a global optimization procedure like GA is that this kind of procedure is not easily affected by local optima and…
More relevant, precise, and efficient items for assessment of physical function and disability: moving beyond the classic instruments

PubMed Central

Fries, J F; Bruce, B; Bjorner, J; Rose, M

2006-01-01

Objectives Patient reported outcomes (PROs) have become standard study endpoints. However, little attention has been given to using item improvement to advance PRO performance, which could improve precision, clarity, patient relevance, and information content of “physical function/disability” items and thus the performance of resulting instruments. Methods The present study included 1860 physical function/disability items from 165 instruments. Item formulations were assessed by frequency of use, modified Delphi consensus, respondent judgement of clarity and importance, and item response theory (IRT). Data from 1100 rheumatoid arthritis, osteoarthritis, and normal ageing subjects, using qualitative item review, focus groups, cognitive interviews, and patient survey were used to achieve a unique item pool that was clear, reliable, sensitive to change, readily translatable, devoid of floor and ceiling limitations, contained unidimensional subdomains, and had maximal information content. Results A “present tense” time frame was used most frequently, better understood, more readily translated, and more directly estimated the latent trait of disability. Items in the “past tense” had 80–90% false negatives (p<0.001). The best items were brief, clear, and contained a single construct. Responses with four to five options were preferred by both experts and respondents. The term physical function may be preferable to the term disability because of fewer floor effects. IRT analyses of “disability” suggest four independent subdomains (mobility, dexterity, axial, and compound) with factor loadings of 0.81–0.99. Conclusions Major improvement in performance of items and instruments is possible, and may have the effect of substantially reducing sample size requirements for clinical trials. PMID:17038464
A Multi-Study Analysis of Conceptual and Measurement Issues Related to Health Research on Acculturation in Latinos

PubMed Central

Andrews, Arthur R.; Bridges, Ana J.; Gomez, Debbie

2014-01-01

Purpose The aim of the study was to evaluate the orthogonality of acculturation for Latinos. Design Regression analyses were used to examine acculturation in two Latino samples (N = 77; N = 40). In a third study (N = 673), confirmatory factor analyses compared unidimensional and bidimensional models. Method Acculturation was assessed with the ARSMA-II (Studies 1 and 2), and language proficiency items from the Children of Immigrants Longitudinal Study (Study 3). Results In Studies 1 and 2, the bidimensional model accounted for slightly more variance (Study 1: R² = .11; Study 2: R² = .21) than the unidimensional model (Study 1: R² = .10; Study 2: R² = .19). In Study 3, the bidimensional model evidenced better fit (Akaike information criterion = 167.36) than the unidimensional model (Akaike information criterion = 1204.92). Discussion/Conclusions Acculturation is multidimensional. Implications for Practice Care providers should examine acculturation as a bidimensional construct. PMID:23361579
A Comparison of the One-, the Modified Three-, and the Three-Parameter Item Response Theory Models in the Test Development Item Selection Process.

ERIC Educational Resources Information Center

Eignor, Daniel R.; Douglass, James B.

This paper attempts to provide some initial information about the use of a variety of item response theory (IRT) models in the item selection process; its purpose is to compare the information curves derived from the selection of items characterized by several different IRT models and their associated parameter estimation programs. These…
Pretest-Posttest-Posttest Multilevel IRT Modeling of Competence Growth of Students in Higher Education in Germany

ERIC Educational Resources Information Center

Schmidt, Susanne; Zlatkin-Troitschanskaia, Olga; Fox, Jean-Paul

2016-01-01

Longitudinal research in higher education faces several challenges. Appropriate methods of analyzing competence growth of students are needed to deal with those challenges and thereby obtain valid results. In this article, a pretest-posttest-posttest multivariate multilevel IRT model for repeated measures is introduced which is designed to address…
Linking Parameters Estimated with the Generalized Graded Unfolding Model: A Comparison of the Accuracy of Characteristic Curve Methods

ERIC Educational Resources Information Center

Anderson Koenig, Judith; Roberts, James S.

2007-01-01

Methods for linking item response theory (IRT) parameters are developed for attitude questionnaire responses calibrated with the generalized graded unfolding model (GGUM). One class of IRT linking methods derives the linking coefficients by comparing characteristic curves, and three of these methods---test characteristic curve (TCC), item…
Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression

ERIC Educational Resources Information Center

Elosua, Paula; Wells, Craig

2013-01-01

The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…
Identifying Aberrant Responding: Use of Multiple Measures

ERIC Educational Resources Information Center

Steinkamp, Susan Christa

2017-01-01

For test scores that rely on the accurate estimation of ability via an IRT model, their use and interpretation depend upon the assumption that the IRT model fits the data. Examinees who do not put forth full effort in answering test questions, have prior knowledge of test content, or do not approach a test with the intent of answering…
A Systematic Comparison between Classical Optimal Scaling and the Two-Parameter IRT Model

ERIC Educational Resources Information Center

Warrens, Matthijs J.; de Gruijter, Dato N. M.; Heiser, Willem J.

2007-01-01

In this article, the relationship between two alternative methods for the analysis of multivariate categorical data is systematically explored. It is shown that the person score of the first dimension of classical optimal scaling correlates strongly with the latent variable for the two-parameter item response theory (IRT) model. Next, under the…
Examining the Effectiveness of Test Accommodation Using DIF and a Mixture IRT Model

ERIC Educational Resources Information Center

Cho, Hyun-Jeong; Lee, Jaehoon; Kingston, Neal

2012-01-01

This study examined the validity of test accommodation in third-eighth graders using differential item functioning (DIF) and mixture IRT models. Two data sets were used for these analyses. With the first data set (N = 51,591) we examined whether item type (i.e., story, explanation, straightforward) or item features were associated with item…
Mixture IRT Model with a Higher-Order Structure for Latent Traits

ERIC Educational Resources Information Center

Huang, Hung-Yu

2017-01-01

Mixture item response theory (IRT) models have been suggested as an efficient method of detecting the different response patterns derived from latent classes when developing a test. In testing situations, multiple latent traits measured by a battery of tests can exhibit a higher-order structure, and mixtures of latent classes may occur on…
Defining the developmental parameters of temper loss in early childhood: implications for developmental psychopathology

PubMed Central

Wakschlag, Lauren S.; Choi, Seung W.; Carter, Alice S.; Hullsiek, Heide; Burns, James; McCarthy, Kimberly; Leibenluft, Ellen; Briggs-Gowan, Margaret J.

2013-01-01

Background Temper modulation problems are both a hallmark of early childhood and a common mental health concern. Thus, characterizing specific behavioral manifestations of temper loss along a dimension from normative misbehaviors to clinically significant problems is an important step toward identifying clinical thresholds. Methods Parent-reported patterns of temper loss were delineated in a diverse community sample of preschoolers (n = 1,490). A developmentally sensitive questionnaire, the Multidimensional Assessment of Preschool Disruptive Behavior (MAP-DB), was used to assess temper loss in terms of tantrum features and anger regulation. Specific aims were: (a) document the normative distribution of temper loss in preschoolers from normative misbehaviors to clinically concerning temper loss behaviors, and test for sociodemographic differences; (b) use Item Response Theory (IRT) to model a Temper Loss dimension; and (c) examine associations of temper loss and concurrent emotional and behavioral problems. Results Across sociodemographic subgroups, a unidimensional Temper Loss model fit the data well. Nearly all (83.7%) preschoolers had tantrums sometimes but only 8.6% had daily tantrums. Normative misbehaviors occurred more frequently than clinically concerning temper loss behaviors. Milder behaviors tended to reflect frustration in expectable contexts, whereas clinically concerning problem indicators were unpredictable, prolonged, and/or destructive. In multivariate models, Temper Loss was associated with emotional and behavioral problems. Conclusions Parent reports on a developmentally informed questionnaire, administered to a large and diverse sample, distinguished normative and problematic manifestations of preschool temper loss. A developmental, dimensional approach shows promise for elucidating the boundaries between normative early childhood temper loss and emergent psychopathology. PMID:22928674
The Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES): item response theory findings.

PubMed

Grigg, Kaine; Manderson, Lenore

2016-03-17

Racism and associated discrimination are pervasive and persistent challenges with multiple cumulative deleterious effects contributing to inequities in various health outcomes. Globally, research over the past decade has shown consistent associations between racism and negative health concerns. Such research confirms that race endures as one of the strongest predictors of poor health. Due to the lack of validated Australian measures of racist attitudes, RACES (Racism, Acceptance, and Cultural-Ethnocentrism Scale) was developed. Here, we examine RACES' psychometric properties, including the latent structure, utilising Item Response Theory (IRT). Unidimensional and Multidimensional Rating Scale Model (RSM) Rasch analyses were utilised with 296 Victorian primary school students and 182 adolescents and 220 adults from the Australian community. RACES was demonstrated to be a robust 24-item three-dimensional scale of Accepting Attitudes (12 items), Racist Attitudes (8 items), and Ethnocentric Attitudes (4 items). RSM Rasch analyses provide strong support for the instrument as a robust measure of racist attitudes in the Australian context, and for the overall factorial and construct validity of RACES across primary school children, adolescents, and adults. RACES provides a reliable and valid measure that can be utilised across the lifespan to evaluate attitudes towards all racial, ethnic, cultural, and religious groups. A core function of RACES is to assess the effectiveness of interventions to reduce community levels of racism and in turn inequities in health outcomes within Australia.
Factor Models for Ordinal Variables With Covariate Effects on the Manifest and Latent Variables: A Comparison of LISREL and IRT Approaches

ERIC Educational Resources Information Center

Moustaki, Irini; Joreskog, Karl G.; Mavridis, Dimitris

2004-01-01

We consider a general type of model for analyzing ordinal variables with covariate effects and 2 approaches for analyzing data for such models, the item response theory (IRT) approach and the PRELIS-LISREL (PLA) approach. We compare these 2 approaches on the basis of 2 examples, 1 involving only covariate effects directly on the ordinal variables…
Going Places No Infrared Temperature Devices Have Gone Before

NASA Technical Reports Server (NTRS)

2003-01-01

Exergen's IRt/c is a self-powered sensor that matches a thermocouple within specified temperature ranges and provides a predictable and repeatable signal outside of this specified range. Possessing an extremely fast time constant, the infrared technology allows users to measure product temperature without touching the product. The IRt/c uses a device called a thermopile to measure temperature and generate current. Traditionally, these devices are not available in a size that would be compatible with the Exergen IRt/c, based on NASA's quarter-inch specifications. After going through five circuit designs to find a thermopile that would suit the IRt/c design and match the signal needed for output, Exergen maintains that it developed a model that totaled just 20 percent of the volume of the previous smallest detector in the world. Following completion of the project with Glenn, Exergen continued development of the IRt/c for other customers, spinning off a new product line called the micro IRt/c. This latest development has broadened applications for industries that previously could not use infrared thermometers due to size constraints. The first commercial use of the micro IRt/c involved an original equipment manufacturer that makes laminating machinery consisting of heated rollers in very tight spots. Accurate temperature measurement for this application requires close proximity to the heated rollers. With the micro IRt/c's 50-millisecond time constant, the manufacturer is able to gain closer access to the intended temperature targets for exact readings, thereby increasing productivity and staying ahead of competition. In a separate application, the infrared temperature sensor is being utilized for avalanche warnings in Switzerland. The IRt/c is mounted about 5 meters above the ground to measure the snow cover throughout the mountainous regions of the country.
Testing item response theory invariance of the standardized Quality-of-life Disease Impact Scale (QDIS®) in acute coronary syndrome patients: differential functioning of items and test.

PubMed

Deng, Nina; Anatchkova, Milena D; Waring, Molly E; Han, Kyung T; Ware, John E

2015-08-01

The Quality-of-life (QOL) Disease Impact Scale (QDIS®) standardizes the content and scoring of QOL impact attributed to different diseases using item response theory (IRT). This study examined the IRT invariance of the QDIS-standardized IRT parameters in an independent sample. The differential functioning of items and test (DFIT) of a static short-form (QDIS-7) was examined across two independent sources: patients hospitalized for acute coronary syndrome (ACS) in the TRACE-CORE study (N = 1,544) and chronically ill US adults in the QDIS standardization sample. "ACS-specific" IRT item parameters were calibrated and linearly transformed to compare to "standardized" IRT item parameters. Differences in IRT model-expected item, scale and theta scores were examined. The DFIT results were also compared in a standard logistic regression differential item functioning analysis. Item parameters estimated in the ACS sample showed lower discrimination parameters than the standardized discrimination parameters, but only small differences were found for threshold parameters. In DFIT, results on the non-compensatory differential item functioning index (range 0.005-0.074) were all below the threshold of 0.096. Item differences were further canceled out at the scale level. IRT-based theta scores for ACS patients using standardized and ACS-specific item parameters were highly correlated (r = 0.995, root-mean-square difference = 0.09). Using standardized item parameters, ACS patients scored one-half standard deviation higher (indicating greater QOL impact) compared to chronically ill adults in the standardization sample. The study showed sufficient IRT invariance to warrant the use of standardized IRT scoring of QDIS-7 for studies comparing the QOL impact attributed to acute coronary disease and other chronic conditions.
Anchor Selection Strategies for DIF Analysis: Review, Assessment, and New Approaches

ERIC Educational Resources Information Center

Kopf, Julia; Zeileis, Achim; Strobl, Carolin

2015-01-01

Differential item functioning (DIF) indicates the violation of the invariance assumption, for instance, in models based on item response theory (IRT). For item-wise DIF analysis using IRT, a common metric for the item parameters of the groups that are to be compared (e.g., for the reference and the focal group) is necessary. In the Rasch model,…
Generalized IRT Models for Extreme Response Style

ERIC Educational Resources Information Center

Jin, Kuan-Yu; Wang, Wen-Chung

2014-01-01

Extreme response style (ERS) is a systematic tendency for a person to endorse extreme options (e.g., strongly disagree, strongly agree) on Likert-type or rating-scale items. In this study, we develop a new class of item response theory (IRT) models to account for ERS so that the target latent trait is free from the response style and the tendency…
Rasch Model Parameter Estimation in the Presence of a Nonnormal Latent Trait Using a Nonparametric Bayesian Approach

ERIC Educational Resources Information Center

Finch, Holmes; Edwards, Julianne M.

2016-01-01

Standard approaches for estimating item response theory (IRT) model parameters generally work under the assumption that the latent trait being measured by a set of items follows the normal distribution. Estimation of IRT parameters in the presence of nonnormal latent traits has been shown to generate biased person and item parameter estimates. A…

Accuracy and Variability of Item Parameter Estimates from Marginal Maximum a Posteriori Estimation and Bayesian Inference via Gibbs Samplers

ERIC Educational Resources Information Center

Wu, Yi-Fang

2015-01-01

Item response theory (IRT) uses a family of statistical models for estimating stable characteristics of items and examinees and defining how these characteristics interact in describing item and test performance. With a focus on the three-parameter logistic IRT (Birnbaum, 1968; Lord, 1980) model, the current study examines the accuracy and…
Measuring Anxiety in Visually-Impaired People: A Comparison between the Linear and the Nonlinear IRT Approaches

ERIC Educational Resources Information Center

Ferrando, Pere J.; Pallero, Rafael; Anguiano-Carrasco, Cristina

2013-01-01

The present study has two main interests. First, some pending issues about the psychometric properties of the CTAC (an anxiety questionnaire for blind and visually-impaired people) are assessed using item response theory (IRT). Second, the linear model is compared to the graded response model (GRM) in terms of measurement precision, sensitivity…
Estimation of a Ramsay-Curve Item Response Theory Model by the Metropolis-Hastings Robbins-Monro Algorithm. CRESST Report 834

ERIC Educational Resources Information Center

Monroe, Scott; Cai, Li

2013-01-01

In Ramsay curve item response theory (RC-IRT, Woods & Thissen, 2006) modeling, the shape of the latent trait distribution is estimated simultaneously with the item parameters. In its original implementation, RC-IRT is estimated via Bock and Aitkin's (1981) EM algorithm, which yields maximum marginal likelihood estimates. This method, however,…
Estimation of a Ramsay-Curve Item Response Theory Model by the Metropolis-Hastings Robbins-Monro Algorithm

ERIC Educational Resources Information Center

Monroe, Scott; Cai, Li

2014-01-01

In Ramsay curve item response theory (RC-IRT) modeling, the shape of the latent trait distribution is estimated simultaneously with the item parameters. In its original implementation, RC-IRT is estimated via Bock and Aitkin's EM algorithm, which yields maximum marginal likelihood estimates. This method, however, does not produce the…
Use of a Scale Model in the Design of Modifications to the NASA Glenn Icing Research Tunnel

NASA Technical Reports Server (NTRS)

Canacci, Victor A.; Gonsalez, Jose C.; Spera, David A.; Burke, Thomas (Technical Monitor)

2001-01-01

Major modifications were made in 1999 to the 6- by 9-Foot (1.8- by 2.7-m) Icing Research Tunnel (IRT) at the NASA Glenn Research Center, including replacement of its heat exchanger and associated ducts and turning vanes, and the addition of fan outlet guide vanes (OGV's). A one-tenth scale model of the IRT (designated as the SMIRT) was constructed with and without these modifications and tested to increase confidence in obtaining expected improvements in flow quality around the tunnel loop. The SMIRT is itself an aerodynamic test facility whose flow patterns without modifications have been shown to be accurate, scaled representations of those measured in the IRT prior to the 1999 upgrade program. In addition, tests in the SMIRT equipped with simulated OGV's indicated that these devices in the IRT might reduce flow distortions immediately downstream of the fan by two thirds. Flow quality parameters measured in the SMIRT were projected to the full-size modified IRT, and quantitative estimates of improvements in flow quality were given prior to construction. In this paper, the results of extensive flow quality studies conducted in the SMIRT are documented. Samples of these are then compared with equivalent measurements made in the full-scale IRT, both before and after its configuration was upgraded. Airspeed, turbulence intensity, and flow angularity distributions are presented for cross sections downstream of the drive fan, both upstream and downstream of the replacement flat heat exchanger, in the stilling chamber, in the test section, and in the wakes of the new corner turning vanes with their unique expanding and contracting designs. Lessons learned from these scale-model studies are discussed.
www.common-metrics.org: a web application to estimate scores from different patient-reported outcome measures on a common scale.

PubMed

Fischer, H Felix; Rose, Matthias

2016-10-19

Recently, a growing number of Item-Response Theory (IRT) models has been published, which allow estimation of a common latent variable from data derived by different Patient Reported Outcomes (PROs). When using data from different PROs, direct estimation of the latent variable has some advantages over the use of sum score conversion tables. It requires substantial proficiency in the field of psychometrics to fit such models using contemporary IRT software. We developed a web application ( http://www.common-metrics.org ), which allows estimation of latent variable scores more easily using IRT models calibrating different measures on instrument independent scales. Currently, the application allows estimation using six different IRT models for Depression, Anxiety, and Physical Function. Based on published item parameters, users of the application can directly estimate latent trait estimates using expected a posteriori (EAP) for sum scores as well as for specific response patterns, Bayes modal (MAP), Weighted likelihood estimation (WLE) and Maximum likelihood (ML) methods and under three different prior distributions. The obtained estimates can be downloaded and analyzed using standard statistical software. This application enhances the usability of IRT modeling for researchers by allowing comparison of the latent trait estimates over different PROs, such as the Patient Health Questionnaire Depression (PHQ-9) and Anxiety (GAD-7) scales, the Center of Epidemiologic Studies Depression Scale (CES-D), the Beck Depression Inventory (BDI), PROMIS Anxiety and Depression Short Forms and others. Advantages of this approach include comparability of data derived with different measures and tolerance against missing values. The validity of the underlying models needs to be investigated in the future.
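As an illustration of the expected a posteriori (EAP) scoring mentioned in this record, the following is a minimal sketch, not the application's actual implementation. It assumes dichotomous items under a two-parameter logistic model, a standard normal prior, and a simple evenly spaced quadrature grid; the function names and the item parameters in the example are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """Probability of endorsing an item under the two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def eap(responses, items, n_points=81, lo=-4.0, hi=4.0):
    """Return (EAP estimate, posterior SD) for one response pattern.

    responses: list of 0/1 item scores.
    items: list of (a, b) pairs, i.e. discrimination and difficulty,
           taken as known (e.g. published) item parameters.
    The prior is standard normal, evaluated on an evenly spaced grid.
    """
    step = (hi - lo) / (n_points - 1)
    grid = [lo + i * step for i in range(n_points)]
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for u, (a, b) in zip(responses, items):
            p = p_2pl(theta, a, b)
            w *= p if u == 1 else 1.0 - p  # likelihood of this response
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)


# Hypothetical three-item scale with made-up parameters.
example_items = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0)]
print(eap([1, 1, 0], example_items))
```

Because the prior pulls estimates toward zero, EAP scores stay finite even for all-correct or all-incorrect patterns, which is one reason the method tolerates short or incomplete response patterns.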
The Rosenberg Self-Esteem Scale: a bifactor answer to a two-factor question?

PubMed

McKay, Michael T; Boduszek, Daniel; Harvey, Séamus A

2014-01-01

Despite its long-standing and widespread use, disagreement remains regarding the structure of the Rosenberg Self-Esteem Scale (RSES). In particular, concern remains regarding the degree to which the scale assesses self-esteem as a unidimensional or multidimensional (positive and negative self-esteem) construct. Using a sample of 3,862 high school students in the United Kingdom, 4 models were tested: (a) a unidimensional model, (b) a correlated 2-factor model in which the 2 latent variables are represented by positive and negative self-esteem, (c) a hierarchical model, and (d) a bifactor model. The totality of results including item loadings, goodness-of-fit indexes, reliability estimates, and correlations with self-efficacy measures all supported the bifactor model, suggesting that the 2 hypothesized factors are better understood as "grouping" factors rather than as representative of latent constructs. Accordingly, this study supports the unidimensionality of the RSES and the scoring of all 10 items to produce a global self-esteem score.
Molecular Imaging and Therapy of Prostate Cancer

DTIC Science & Technology

2015-10-01

arsenic-based, IGF1R-targeted radiopharmaceuticals can allow for PET imaging, IRT, and monitoring the therapeutic response of PCa. Specific Aims: Aim 1: To...models with PET imaging. Aim 3: To monitor the efficacy of 76As-based IRT of PCa with multimodality imaging.
An Instructional Module on Mokken Scale Analysis

ERIC Educational Resources Information Center

Wind, Stefanie A.

2017-01-01

Mokken scale analysis (MSA) is a probabilistic-nonparametric approach to item response theory (IRT) that can be used to evaluate fundamental measurement properties with less strict assumptions than parametric IRT models. This instructional module provides an introduction to MSA as a probabilistic-nonparametric framework in which to explore…
Analysis Test of Understanding of Vectors with the Three-Parameter Logistic Model of Item Response Theory and Item Response Curves Technique

ERIC Educational Resources Information Center

Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan

2016-01-01

This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming…
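For readers unfamiliar with the three-parameter logistic model named in this record, a minimal sketch of its item response function follows. It uses the common logistic form without the 1.7 scaling constant, which is an assumption here, and the item parameters are hypothetical rather than values from the study.

```python
import math


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model.

    theta: examinee ability
    a: discrimination, b: difficulty, c: guessing (lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: discrimination 1.2, difficulty 0.5, guessing 0.2.
for theta in (-2.0, 0.5, 3.0):
    print(theta, round(p_3pl(theta, 1.2, 0.5, 0.2), 3))
```

At theta equal to the difficulty b, the probability is halfway between the guessing parameter and 1, which is how the difficulty parameter is usually read off an item response curve.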
Comparing Different Approaches of Bias Correction for Ability Estimation in IRT Models. Research Report. ETS RR-08-13

ERIC Educational Resources Information Center

Lee, Yi-Hsuan; Zhang, Jinming

2008-01-01

The method of maximum-likelihood is typically applied to item response theory (IRT) models when the ability parameter is estimated while conditioning on the true item parameters. In practice, the item parameters are unknown and need to be estimated first from a calibration sample. Lewis (1985) and Zhang and Lu (2007) proposed the expected response…
A Note on Stochastic Ordering of the Latent Trait Using the Sum of Polytomous Item Scores

ERIC Educational Resources Information Center

van der Ark, L. Andries; Bergsma, Wicher P.

2010-01-01

In contrast to dichotomous item response theory (IRT) models, most well-known polytomous IRT models do not imply stochastic ordering of the latent trait by the total test score (SOL). This has been thought to make the ordering of respondents on the latent trait using the total test score questionable and throws doubt on the justifiability of using…
A Comparison of Three IRT Approaches to Examinee Ability Change Modeling in a Single-Group Anchor Test Design

ERIC Educational Resources Information Center

Paek, Insu; Park, Hyun-Jeong; Cai, Li; Chi, Eunlim

2014-01-01

Typically a longitudinal growth modeling based on item response theory (IRT) requires repeated measures data from a single group with the same test design. If operational or item exposure problems are present, the same test may not be employed to collect data for longitudinal analyses and tests at multiple time points are constructed with unique…
Performance of DIMTEST- and NOHARM-Based Statistics for Testing Unidimensionality

ERIC Educational Resources Information Center

Finch, Holmes; Habing, Brian

2007-01-01

This Monte Carlo study compares the ability of the parametric bootstrap version of DIMTEST with three goodness-of-fit tests calculated from a fitted NOHARM model to detect violations of the assumption of unidimensionality in testing data. The effectiveness of the procedures was evaluated for different numbers of items, numbers of examinees,…
Developing an Essentially Unidimensional Test with Cognitively Designed Items

ERIC Educational Resources Information Center

Bryant, Damon U.; Wooten, William

2006-01-01

The purpose of this study was to demonstrate how cognitive and measurement principles can be integrated to create an essentially unidimensional test. Two studies were conducted. In Study 1, test questions were created by using the feature integration theory of attention to develop a cognitive model of performance and then manipulating complexity…
Least Squares Metric, Unidimensional Scaling of Multivariate Linear Models.

ERIC Educational Resources Information Center

Poole, Keith T.

1990-01-01

A general approach to least-squares unidimensional scaling is presented. Ordering information contained in the parameters is used to transform the standard squared error loss function into a discrete rather than continuous form. Monte Carlo tests with 38,094 ratings of 261 senators, and 1,258 representatives demonstrate the procedure's…
Calibration of Response Data Using MIRT Models with Simple and Mixed Structures

ERIC Educational Resources Information Center

Zhang, Jinming

2012-01-01

It is common to assume during a statistical analysis of a multiscale assessment that the assessment is composed of several unidimensional subtests or that it has simple structure. Under this assumption, the unidimensional and multidimensional approaches can be used to estimate item parameters. These two approaches are equivalent in parameter…
The use of the bi-factor model to test the uni-dimensionality of a battery of reasoning tests.

PubMed

Primi, Ricardo; Rocha da Silva, Marjorie Cristina; Rodrigues, Priscila; Muniz, Monalisa; Almeida, Leandro S

2013-02-01

The Battery of Reasoning Tests 5 (BPR-5) aims to assess the reasoning ability of individuals, using sub-tests with different formats and contents that require basic processes of inductive and deductive reasoning for their resolution. The BPR has three sequential forms: BPR-5i (for children from first to fifth grade), BPR-5 - Form A (for children from sixth to eighth grade) and BPR-5 - form B (for high school and undergraduate students). The present study analysed 412 questionnaires concerning BPR-5i, 603 questionnaires concerning BPR-5 - Form A and 1748 questionnaires concerning BPR-5 - Form B. The main goal was to test the uni-dimensionality of the battery and its tests in relation to items using the bi-factor model. Results suggest that the g factor loadings (extracted by the uni-dimensional model) do not change when the data is adjusted for a more flexible multi-factor model (bi-factor model). A general reasoning factor underlying different contents items is supported.
A Review of PROC IRT in SAS

ERIC Educational Resources Information Center

Choi, Jinnie

2017-01-01

This article reviews PROC IRT, which was added to Statistical Analysis Software in 2014. We provide an introductory overview of a free version of SAS, describe what PROC IRT offers for item response theory (IRT) analysis and how one can use PROC IRT, and discuss how other SAS macros and procedures may compensate the IRT functionalities of PROC IRT.
Detecting Local Item Dependence in Polytomous Adaptive Data

ERIC Educational Resources Information Center

Mislevy, Jessica L.; Rupp, Andre A.; Harring, Jeffrey R.

2012-01-01

A rapidly expanding arena for item response theory (IRT) is in attitudinal and health-outcomes survey applications, often with polytomous items. In particular, there is interest in computer adaptive testing (CAT). Meeting model assumptions is necessary to realize the benefits of IRT in this setting, however. Although initial investigations of…

Practical Guide to Conducting an Item Response Theory Analysis

ERIC Educational Resources Information Center

Toland, Michael D.

2014-01-01

Item response theory (IRT) is a psychometric technique used in the development, evaluation, improvement, and scoring of multi-item scales. This pedagogical article provides the necessary information needed to understand how to conduct, interpret, and report results from two commonly used ordered polytomous IRT models (Samejima's graded…
IRT Item Parameter Recovery with Marginal Maximum Likelihood Estimation Using Loglinear Smoothing Models

ERIC Educational Resources Information Center

Casabianca, Jodi M.; Lewis, Charles

2015-01-01

Loglinear smoothing (LLS) estimates the latent trait distribution while making fewer assumptions about its form and maintaining parsimony, thus leading to more precise item response theory (IRT) item parameter estimates than standard marginal maximum likelihood (MML). This article provides the expectation-maximization algorithm for MML estimation…
Likelihood-Ratio DIF Testing: Effects of Nonnormality

ERIC Educational Resources Information Center

Woods, Carol M.

2008-01-01

Differential item functioning (DIF) occurs when an item has different measurement properties for members of one group versus another. Likelihood-ratio (LR) tests for DIF based on item response theory (IRT) involve statistically comparing IRT models that vary with respect to their constraints. A simulation study evaluated how violation of the…
IRT-Estimated Reliability for Tests Containing Mixed Item Formats

ERIC Educational Resources Information Center

Shu, Lianghua; Schwarz, Richard D.

2014-01-01

As a global measure of precision, item response theory (IRT) estimated reliability is derived for four coefficients (Cronbach's α, Feldt-Raju, stratified α, and marginal reliability). Models with different underlying assumptions concerning test-part similarity are discussed. A detailed computational example is presented for the targeted…
Interval timing under a behavioral microscope: Dissociating motivational and timing processes in fixed-interval performance.

PubMed

Daniels, Carter W; Sanabria, Federico

2017-03-01

The distribution of latencies and interresponse times (IRTs) of rats was compared between two fixed-interval (FI) schedules of food reinforcement (FI 30 s and FI 90 s), and between two levels of food deprivation. Computational modeling revealed that latencies and IRTs were well described by mixture probability distributions embodying two-state Markov chains. Analysis of these models revealed that only a subset of latencies is sensitive to the periodicity of reinforcement, and prefeeding only reduces the size of this subset. The distribution of IRTs suggests that behavior in FI schedules is organized in bouts that lengthen and ramp up in frequency with proximity to reinforcement. Prefeeding slowed down the lengthening of bouts and increased the time between bouts. When concatenated, latency and IRT models adequately reproduced sigmoidal FI response functions. These findings suggest that behavior in FI schedules fluctuates in and out of schedule control; an account of such fluctuation suggests that timing and motivation are dissociable components of FI performance. These mixture-distribution models also provide novel insights on the motivational, associative, and timing processes expressed in FI performance. These processes may be obscured, however, when performance in timing tasks is analyzed in terms of mean response rates.
Prophylactic Effect of Probiotics on the Development of Experimental Autoimmune Myasthenia Gravis

PubMed Central

Chae, Chang-Suk; Kwon, Ho-Keun; Hwang, Ji-Sun; Kim, Jung-Eun; Im, Sin-Hyeog

2012-01-01

Probiotics are live bacteria that confer health benefits to the host physiology. Although protective roles of probiotics have been reported in diverse diseases, no information is available on whether probiotics can modulate neuromuscular immune disorders. We have recently demonstrated that IRT5 probiotics, a mixture of 5 probiotics, could suppress diverse experimental disorders in mouse models. In this study we further investigated whether IRT5 probiotics could modulate the progression of experimental autoimmune myasthenia gravis (EAMG). Myasthenia gravis (MG) is a T cell dependent antibody mediated autoimmune disorder in which acetylcholine receptor (AChR) at the neuromuscular junction is the major auto-antigen. Oral administration of IRT5 probiotics significantly reduced clinical symptoms of EAMG such as weight loss, body trembling and grip strength. Prophylactic effect of IRT5 probiotics on EAMG is mediated by down-regulation of effector function of AChR-reactive T cells and B cells. Administration of IRT5 probiotics decreased AChR-reactive lymphocyte proliferation, anti-AChR reactive IgG levels and inflammatory cytokine levels such as IFN-γ, TNF-α, IL-6 and IL-17. Down-regulation of inflammatory mediators in AChR-reactive lymphocytes by IRT5 probiotics is mediated by the generation of regulatory dendritic cells (rDCs) that express increased levels of IL-10, TGF-β, arginase 1 and aldh1a2. Furthermore, DCs isolated from IRT5 probiotics-fed group effectively converted CD4+ T cells into CD4+Foxp3+ regulatory T cells compared with control DCs. Our data suggest that IRT5 probiotics could be applicable to modulate antibody mediated autoimmune diseases including myasthenia gravis. PMID:23284891
Parallel Analysis with Unidimensional Binary Data

ERIC Educational Resources Information Center

Weng, Li-Jen; Cheng, Chung-Ping

2005-01-01

The present simulation investigated the performance of parallel analysis for unidimensional binary data. Single-factor models with 8 and 20 indicators were examined, and sample size (50, 100, 200, 500, and 1,000), factor loading (.45, .70, and .90), response ratio on two categories (50/50, 60/40, 70/30, 80/20, and 90/10), and types of correlation…
Circular Unidimensional Scaling: A New Look at Group Differences in Interest Structure

ERIC Educational Resources Information Center

Armstrong, Patrick Ian; Hubert, Lawrence; Rounds, James

2003-01-01

The fit of J. L. Holland's (1959, 1997) RIASEC model to U.S. racial-ethnic groups was assessed using circular unidimensional scaling. Samples of African American, Asian American, Caucasian American and Hispanic American high school students and employed adults who completed either the UNIACT Interest Inventory (K. B. Swaney, 1995) or the Strong…
Statistical Indexes for Monitoring Item Behavior under Computer Adaptive Testing Environment.

ERIC Educational Resources Information Center

Zhu, Renbang; Yu, Feng; Liu, Su

A computerized adaptive test (CAT) administration usually requires a large supply of items with accurately estimated psychometric properties, such as item response theory (IRT) parameter estimates, to ensure the precision of examinee ability estimation. However, an estimated IRT model of a given item in any given pool does not always correctly…
Five Methods for Estimating Angoff Cut Scores with IRT

ERIC Educational Resources Information Center

Wyse, Adam E.

2017-01-01

This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a posteriori (EAP), modal a posteriori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
Random Item IRT Models

ERIC Educational Resources Information Center

De Boeck, Paul

2008-01-01

It is common practice in IRT to consider items as fixed and persons as random. Both continuous and categorical person parameters are most often random variables, whereas for items only continuous parameters are used and they are commonly of the fixed type, although exceptions occur. It is shown in the present article that random item parameters…
The Information Function for the One-Parameter Logistic Model: Is it Reliability?

ERIC Educational Resources Information Center

Doran, Harold C.

2005-01-01

The information function is an important statistic in item response theory (IRT) applications. Although the information function is often described as the IRT version of reliability, it differs from the classical notion of reliability from a critical perspective: replication. This article first explores the information function for the…
Alternative Matching Scores to Control Type I Error of the Mantel-Haenszel Procedure for DIF in Dichotomously Scored Items Conforming to 3PL IRT and Nonparametric 4PBCB Models

ERIC Educational Resources Information Center

Monahan, Patrick O.; Ankenmann, Robert D.

2010-01-01

When the matching score is either less than perfectly reliable or not a sufficient statistic for determining latent proficiency in data conforming to item response theory (IRT) models, Type I error (TIE) inflation may occur for the Mantel-Haenszel (MH) procedure or any differential item functioning (DIF) procedure that matches on summed-item…
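For readers unfamiliar with the procedure named here, the following is a minimal sketch of the standard Mantel-Haenszel common odds ratio and chi-square computed over matching-score strata. It does not implement the alternative matching scores the article studies; the function names, the tuple layout, and the example counts are assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple

# One stratum (one matching-score level), as counts:
# (reference correct, reference incorrect, focal correct, focal incorrect)
Stratum = Tuple[int, int, int, int]


def mh_odds_ratio(strata: Sequence[Stratum]) -> float:
    """Mantel-Haenszel common odds ratio pooled across strata."""
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # empty stratum contributes nothing
        num += a * d / n
        den += b * c / n
    return num / den


def mh_chi_square(strata: Sequence[Stratum]) -> float:
    """MH chi-square with continuity correction (1 degree of freedom)."""
    sum_a = sum_e = sum_v = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # variance is undefined for fewer than 2 examinees
        n_ref, n_foc = a + b, c + d
        m1, m0 = a + c, b + d
        sum_a += a
        sum_e += n_ref * m1 / n
        sum_v += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))
    return (abs(sum_a - sum_e) - 0.5) ** 2 / sum_v


def mh_d_dif(strata: Sequence[Stratum]) -> float:
    """ETS delta-scale effect size: -2.35 * ln(common odds ratio)."""
    return -2.35 * math.log(mh_odds_ratio(strata))


# Hypothetical counts: equal performance in both groups within each stratum.
no_dif = [(30, 20, 30, 20), (40, 10, 40, 10)]
# Hypothetical counts: the reference group does better within the stratum.
with_dif = [(40, 10, 30, 20)]
```

An odds ratio near 1 (D-DIF near 0) indicates no DIF under this procedure; the Type I error inflation described in the abstract concerns cases where the strata themselves are formed from an imperfect matching score.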
An Investigation of the Performance of the Generalized S-X² Item-Fit Index for Polytomous IRT Models. ACT Research Report Series, 2007-1

ERIC Educational Resources Information Center

Kang, Taehoon; Chen, Troy T.

2007-01-01

Orlando and Thissen (2000, 2003) proposed an item-fit index, S-X², for dichotomous item response theory (IRT) models, which has performed better than traditional item-fit statistics such as Yen's (1981) Q₁ and McKinley and Mills' (1985) G². This study extends the utility of S-X² to polytomous…
Development and evaluation of the PI-G: a three-scale measure based on the German translation of the PROMIS® pain interference item bank.

PubMed

Farin, Erik; Nagl, Michaela; Gramm, Lukas; Heyduck, Katja; Glattacker, Manuela

2014-05-01

The aim of the study was to translate the PROMIS® pain interference (PI) item bank (41 items) into German, test its psychometric properties in patients with chronic low back pain, and develop static subforms. We surveyed N = 262 patients undergoing rehabilitation who were asked to fill out questionnaires at the beginning and 2 weeks after the end of rehabilitation, applying the Oswestry Disability Index (ODI) and Pain Disability Index (PDI) in addition to the PROMIS® PI items. For psychometric testing, a 1-parameter item response theory (IRT) model was used. Exploratory and confirmatory factor analyses as well as reliability and construct validity analyses were conducted. The assumptions regarding IRT scaling of the translated PROMIS® PI item bank as a whole were not confirmed. However, we succeeded in devising three static subforms (PI-G scales: PI mental 13 items, PI functional 11 items, PI physical 4 items), revealing good psychometric properties. The PI-G scales in their static form can be recommended for use in German-speaking countries. Their strengths versus the ODI and PDI are that pain interference is assessed in a differentiated manner and that several psychometric values are somewhat better than those associated with the ODI and PDI (distribution properties, IRT model fit, reliability). To develop an IRT-scaled item bank of the German translations of the PROMIS® PI items, it would be useful to have additional studies (e.g., with larger sample sizes and using a 2-parameter IRT model).
Biases and power for groups comparison on subjective health measurements.

PubMed

Hamel, Jean-François; Hardouin, Jean-Benoit; Le Neel, Tanguy; Kubis, Gildas; Roquelaure, Yves; Sébille, Véronique

2012-01-01

Subjective health measurements are increasingly used in clinical research, particularly for comparisons between patient groups. Two main types of analytical strategies can be used for such data: classical test theory (CTT), which relies on observed scores, and models from item response theory (IRT), which rely on a response model relating the item responses to a latent parameter, often called the latent trait. Whether IRT or CTT would be the most appropriate method to compare two independent groups of patients on a patient-reported outcome measure remains unknown and was investigated using simulations. For CTT-based analyses, the group comparison was performed using a t-test on the scores. For IRT-based analyses, several methods were compared, according to whether the Rasch model was considered with random effects or with fixed effects, and whether the group effect was included as a covariate or not. Individual latent trait values were estimated using either a deterministic method or stochastic approaches. Latent traits were then compared with a t-test. Finally, a two-step method was performed to compare the latent trait distributions, and a Wald test was performed to test the group effect in the Rasch model including group covariates. The only unbiased IRT-based method was the group covariate Wald test, performed on the random effects Rasch model. This model displayed the highest observed power, which was similar to the power using the score t-test. These results need to be extended to the case frequently encountered in practice where data are missing and possibly informative.
In-flight investigations of the unsteady behaviour of the boundary layer with infrared thermography

NASA Astrophysics Data System (ADS)

Szewczyk, Mariusz; Smusz, Robert; de Groot, Klaus; Meyer, Joerg; Kucaba-Pietal, Anna; Rzucidlo, Pawel

2017-04-01

Infrared thermography (IRT) has been well established in wind tunnel and flight tests for the last decade. Earlier applications of IRT were focused, in nearly all cases, on steady measurements. In recent years, requirements for unsteady IRT measurements (up to 10 Hz) have been formulated, but the problem of a very slow thermal response of common materials of wind tunnel models or airplane components has to be overcome by finding a surface modification with a fast thermal response (low heat capacity, low thermal conductivity and high thermal diffusivity). Therefore, lab investigations of potential material combinations and flight tests with a ‘low cost’ aircraft, i.e. a glider with a modified wing surface, were conducted. In order to induce unsteady conditions (rapid change of laminar-turbulent boundary layer transition), special maneuvers of the glider were performed during IRT measurements.
Improving the Sensitivity and Positive Predictive Value in a Cystic Fibrosis Newborn Screening Program Using a Repeat Immunoreactive Trypsinogen and Genetic Analysis.

PubMed

Sontag, Marci K; Lee, Rachel; Wright, Daniel; Freedenberg, Debra; Sagel, Scott D

2016-08-01

The objective was to evaluate the performance of a new cystic fibrosis (CF) newborn screening algorithm, comprising immunoreactive trypsinogen (IRT) in first (24-48 hours of life) and second (7-14 days of life) dried blood spots plus DNA on the second dried blood spot, relative to existing algorithms. The study was a retrospective review of the IRT/IRT/DNA algorithm implemented in Colorado, Wyoming, and Texas. A total of 1 520 079 newborns were screened; 32 557 (2.1%) had an abnormal first IRT and 8794 (0.54%) an abnormal second IRT. Furthermore, 14 653 mutation analyses were performed; 1391 newborns were referred for diagnostic testing; 274 newborns were diagnosed; and 201/274 (73%) of newborns had 2 mutations on the newborn screening CFTR panel. Sensitivity was 96.2%, compared with sensitivity of 76.1% observed with IRT/IRT (105 ng/mL cut-offs, P < .0001). The ratio of newborns with CF to heterozygote carriers was 1:2.5, and the ratio of newborns with CF to newborns with CFTR-related metabolic syndrome was 10.8:1. The overall positive predictive value was 20%. The median age of diagnosis was 28, 30, and 39.5 days in the 3 states. IRT/IRT/DNA is more sensitive than IRT/IRT because of lower cut-offs (∼97 percentile or 60 ng/mL); higher cut-offs in IRT/IRT programs (>99 percentile, 105 ng/mL) would not achieve sufficient sensitivity. Carrier identification and identification of newborns with CFTR-related metabolic syndrome are less common in IRT/IRT/DNA compared with IRT/DNA. The time to diagnosis is nominally longer, but diagnosis can be achieved in the neonatal period and opportunities to further improve timeliness have been enacted. The IRT/IRT/DNA algorithm should be considered by programs with 2 routine screens. Copyright © 2016 Elsevier Inc. All rights reserved.
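Several of the percentages in this abstract can be checked directly from the reported counts. The short sketch below does that arithmetic; it assumes, without confirmation from the source, that the positive predictive value was computed as newborns diagnosed divided by newborns referred for diagnostic testing. Variable names are illustrative only.

```python
# Counts as reported in the abstract.
screened = 1_520_079
abnormal_first_irt = 32_557
referred = 1_391
diagnosed = 274
two_panel_mutations = 201


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Share of screened newborns with an abnormal first IRT (reported as 2.1%).
first_irt_rate = pct(abnormal_first_irt, screened)

# Assumed definition: PPV = diagnosed / referred (reported as 20%).
ppv = pct(diagnosed, referred)

# Share of diagnosed newborns with 2 panel mutations (reported as 73%).
two_mutation_share = pct(two_panel_mutations, diagnosed)
```

Under that assumed definition the computed positive predictive value is about 19.7%, which agrees with the reported 20% after rounding.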
Evaluation of the IRT Parameter Invariance Property for the MCAT.

ERIC Educational Resources Information Center

Kelkar, Vinaya; Wightman, Linda F.; Luecht, Richard M.

The purpose of this study was to investigate the viability of the property of parameter invariance for the one-parameter (1P), two-parameter (2P), and three-parameter (3P) item response theory (IRT) models for the Medical College Admissions Tests (MCAT). Invariance of item parameters across different gender, ethnic, and language groups and the…
Comparing the IRT Pre-equating and Section Pre-equating: A Simulation Study.

ERIC Educational Resources Information Center

Hwang, Chi-en; Cleary, T. Anne

The results obtained from two basic types of pre-equatings of tests were compared: the item response theory (IRT) pre-equating and section pre-equating (SPE). The simulated data were generated from a modified three-parameter logistic model with a constant guessing parameter. Responses of two replication samples of 3000 examinees on two 72-item…
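To make the data-generation step concrete, here is a minimal sketch of simulating dichotomous responses from a three-parameter logistic model in which every item shares one guessing parameter. The abstract does not give the parameter values or the exact form of the "modified" model, so the scaling constant D = 1.7, the guessing value 0.2, the standard normal abilities, and the item-parameter ranges below are all assumptions chosen for illustration.

```python
import math
import random
from typing import List, Tuple


def p_3pl(theta: float, a: float, b: float, c: float, d_scale: float = 1.7) -> float:
    """3PL probability of a correct response; c is the lower asymptote (guessing)."""
    return c + (1.0 - c) / (1.0 + math.exp(-d_scale * a * (theta - b)))


def simulate_responses(
    n_examinees: int,
    items: List[Tuple[float, float]],  # (discrimination a, difficulty b) per item
    c: float,
    seed: int = 0,
) -> List[List[int]]:
    """Draw abilities from N(0, 1) and generate 0/1 responses with a constant c."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)
        row = [1 if rng.random() < p_3pl(theta, a, b, c) else 0 for a, b in items]
        data.append(row)
    return data


# Hypothetical 72-item form and a sample of 3000 examinees.
item_rng = random.Random(1)
items = [(item_rng.uniform(0.5, 2.0), item_rng.gauss(0.0, 1.0)) for _ in range(72)]
responses = simulate_responses(3000, items, c=0.2)
```

Holding the guessing parameter constant across items reduces the number of parameters to estimate, which is a common simplification in simulation designs of this kind.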

The Effect of Including or Excluding Students with Testing Accommodations on IRT Calibrations.

ERIC Educational Resources Information Center

Karkee, Thakur; Lewis, Dan M.; Barton, Karen; Haug, Carolyn

This study aimed to determine the degree to which the inclusion of accommodated students with disabilities in the calibration sample affects the characteristics of item parameters and the test results. Investigated were effects on test reliability, item fit to the applicable item response theory (IRT) model, item parameter estimates, and students'…
Assessing the Item Response Theory with Covariate (IRT-C) Procedure for Ascertaining Differential Item Functioning

ERIC Educational Resources Information Center

Tay, Louis; Vermunt, Jeroen K.; Wang, Chun

2013-01-01

We evaluate the item response theory with covariates (IRT-C) procedure for assessing differential item functioning (DIF) without preknowledge of anchor items (Tay, Newman, & Vermunt, 2011). This procedure begins with a fully constrained baseline model, and candidate items are tested for uniform and/or nonuniform DIF using the Wald statistic.…
Creation and validation of the barriers to alcohol reduction (BAR) scale using classical test theory and item response theory.

PubMed

Kunicki, Zachary J; Schick, Melissa R; Spillane, Nichea S; Harlow, Lisa L

2018-06-01

Those who binge drink are at increased risk for alcohol-related consequences when compared to non-binge drinkers. Research shows individuals may face barriers to reducing their drinking behavior, but few measures exist to assess these barriers. This study created and validated the Barriers to Alcohol Reduction (BAR) scale. Participants were college students (n = 230) who endorsed at least one instance of past-month binge drinking (4+ drinks for women or 5+ drinks for men). Using classical test theory, exploratory structural equation modeling found a two-factor structure of personal/psychosocial barriers and perceived program barriers. The sub-factors and full scale had reasonable internal consistency (i.e., coefficient omega = 0.78 (personal/psychosocial), 0.82 (program barriers), and 0.83 (full measure)). The BAR also showed evidence for convergent validity with the Brief Young Adult Alcohol Consequences Questionnaire (r = 0.39, p < .001) and discriminant validity with Barriers to Physical Activity (r = -0.02, p = .81). Item Response Theory (IRT) analysis showed the two factors separately met the unidimensionality assumption, and provided further evidence for severity of the items on the two factors. Results suggest that the BAR measure appears reliable and valid for use in an undergraduate student population of binge drinkers. Future studies may want to re-examine this measure in a more diverse sample.
A Comment on Early Student Blunders on Computer-Based Adaptive Tests

ERIC Educational Resources Information Center

Green, Bert F.

2011-01-01

This article refutes a recent claim that computer-based tests produce biased scores for very proficient test takers who make mistakes on one or two initial items and that the "bias" can be reduced by using a four-parameter IRT model. Because the same effect occurs with pattern scores on nonadaptive tests, the effect results from IRT scoring, not…
Item Response Theory with Covariates (IRT-C): Assessing Item Recovery and Differential Item Functioning for the Three-Parameter Logistic Model

ERIC Educational Resources Information Center

Tay, Louis; Huang, Qiming; Vermunt, Jeroen K.

2016-01-01

In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…
Person Response Functions and the Definition of Units in the Social Sciences

ERIC Educational Resources Information Center

Engelhard, George, Jr.; Perkins, Aminah F.

2011-01-01

Humphry (this issue) has written a thought-provoking piece on the interpretation of item discrimination parameters as scale units in item response theory. One of the key features of his work is the description of an item response theory (IRT) model that he calls the logistic measurement function that combines aspects of two traditions in IRT that…
Applying Item Response Theory to the Development of a Screening Adaptation of the Goldman-Fristoe Test of Articulation-Second Edition

ERIC Educational Resources Information Center

Brackenbury, Tim; Zickar, Michael J.; Munson, Benjamin; Storkel, Holly L.

2017-01-01

Purpose: Item response theory (IRT) is a psychometric approach to measurement that uses latent trait abilities (e.g., speech sound production skills) to model performance on individual items that vary by difficulty and discrimination. An IRT analysis was applied to preschoolers' productions of the words on the Goldman-Fristoe Test of…
Small helium-cooled infrared telescope experiment for Spacelab-2 (IRT)

NASA Technical Reports Server (NTRS)

Fazio, Giovanni G.

1990-01-01

The Infrared Telescope (IRT) experiment, flown on Spacelab-2, was used to make infrared measurements between 2 and 120 microns. The objectives were multidisciplinary in nature with astrophysical goals of mapping the diffuse cosmic emission and extended infrared sources and technical goals of measuring the induced Shuttle environment, studying properties of superfluid helium in space, and testing various infrared telescope system designs. Astrophysically, new data were obtained on the structure of the Galaxy at near-infrared wavelengths. A summary of the large scale diffuse near-infrared observations of the Galaxy by the IRT is presented, as well as a summary of the preliminary results obtained from this data on the structure of the galactic disk and bulge. The importance of combining CO and near-infrared maps of similar resolution to determine a 3-D model of galactic extinction is demonstrated. The IRT data are used, in conjunction with a proposed galactic model, to make preliminary measurements of the global scale parameters of the Galaxy. During the mission substantial amounts of data were obtained concerning the induced Shuttle environment. An experiment was also performed to measure spacecraft glow in the IR.
Rasch Analysis of the General Self-Efficacy Scale in Workers with Traumatic Limb Injuries.

PubMed

Wu, Tzu-Yi; Yu, Wan-Hui; Huang, Chien-Yu; Hou, Wen-Hsuan; Hsieh, Ching-Lin

2016-09-01

Purpose: The purpose of this study was to apply Rasch analysis to examine the unidimensionality and reliability of the General Self-Efficacy Scale (GSE) in workers with traumatic limb injuries. Furthermore, if the items of the GSE fitted the Rasch model's assumptions, we transformed the raw sum ordinal scores of the GSE into Rasch interval scores. Methods: A total of 1076 participants completed the GSE at 1 month post injury. Rasch analysis was used to examine the unidimensionality and person reliability of the GSE. The unidimensionality of the GSE was verified by determining whether the items fit the Rasch model's assumptions: (1) item fit indices: infit and outfit mean square (MNSQ) ranged from 0.6 to 1.4; and (2) the eigenvalue of the first factor extracted from principal component analysis (PCA) for residuals was <2. Person reliability was calculated. Results: The unidimensionality of the 10-item GSE was supported in terms of good item fit statistics (infit and outfit MNSQ ranging from 0.92 to 1.32) and acceptable eigenvalues (1.6) of the first factor of the PCA, with person reliability = 0.89. Consequently, the raw sum scores of the GSE were transformed into Rasch scores. Conclusions: The results indicated that the items of GSE are unidimensional and have acceptable person reliability in workers with traumatic limb injuries. Additionally, the raw sum scores of the GSE can be transformed into Rasch interval scores for prospective users to quantify workers' levels of self-efficacy and to conduct further statistical analyses.
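The infit and outfit mean squares used as the first criterion above can be sketched as follows. The study refers to ordinal scores and so presumably used a polytomous Rasch-family model; the dichotomous Rasch model is used here only to keep the example short, and the function names, person measures, and responses are hypothetical.

```python
import math
from typing import Sequence, Tuple


def rasch_p(theta: float, b: float) -> float:
    """Dichotomous Rasch probability of endorsing an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(
    responses: Sequence[int], thetas: Sequence[float], b: float
) -> Tuple[float, float]:
    """Return (infit MNSQ, outfit MNSQ) for one item.

    Outfit is the mean of squared standardized residuals; infit weights the
    squared residuals by the model variance P * (1 - P).
    """
    sq_resid_sum = 0.0
    variance_sum = 0.0
    z2_sum = 0.0
    for x, theta in zip(responses, thetas):
        p = rasch_p(theta, b)
        w = p * (1.0 - p)          # model variance of the response
        sq = (x - p) ** 2          # squared raw residual
        sq_resid_sum += sq
        variance_sum += w
        z2_sum += sq / w           # squared standardized residual
    infit = sq_resid_sum / variance_sum
    outfit = z2_sum / len(responses)
    return infit, outfit


def within_fit_range(mnsq: float, low: float = 0.6, high: float = 1.4) -> bool:
    """Apply the 0.6 to 1.4 acceptance range quoted in the abstract."""
    return low <= mnsq <= high


# Hypothetical data: four persons at theta = 0 answering an item with b = 0.
infit, outfit = item_fit([1, 0, 1, 0], [0.0, 0.0, 0.0, 0.0], 0.0)
```

Values near 1 indicate that responses vary about as much as the model predicts, which is why the reported range of 0.92 to 1.32 falls inside the acceptance band.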
Item response theory analysis to evaluate reliability and minimal clinically important change of the Roland-Morris Disability Questionnaire in patients with severe disability due to back pain from vertebral compression fractures.

PubMed

Lee, Minji K; Yost, Kathleen J; McDonald, Jennifer S; Dougherty, Ryne W; Vine, Roanna L; Kallmes, David F

2017-06-01

The majority of validation done on the Roland-Morris Disability Questionnaire (RMDQ) has been in patients with mild or moderate disability. There is a paucity of research focusing on the psychometric quality of the RMDQ in patients with severe disability. This observational clinical study evaluated the psychometric quality of the RMDQ in patients with severe disability. The sample consisted of 214 patients with painful vertebral compression fractures who underwent vertebroplasty or kyphoplasty. The 23-item version of the RMDQ was completed at two time points: baseline and 30-day postintervention follow-up. With the two-parameter logistic unidimensional item response theory (IRT) analyses, we derived the range of scores that produced reliable measurement and investigated the minimal clinically important difference (MCID). Scores for 214 (100%) patients at baseline and 108 (50%) patients at follow-up did not meet the reliability criterion of 0.90 or higher, with the majority of patients having disability due to back pain that was too severe to be reliably measured by the RMDQ. Depending on methodology, MCID estimates ranged from 2 to 8 points and the proportion of patients classified as having experienced meaningful improvement ranged from 26% to 68%. A greater change in score was needed at the extreme ends of the score scale to be classified as having achieved MCID using IRT methods. Replacing items measuring moderate disability with items measuring severe disability could yield a version of the RMDQ that better targets patients with severe disability due to back pain. Improved precision in measuring disability would be valuable to clinicians who treat patients with greater functional impairments. Caution is needed when choosing criteria for interpreting meaningful change using the RMDQ. Copyright © 2017 Elsevier Inc. All rights reserved.
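One way to see why scores at the severe end can fail a 0.90 reliability criterion is to convert two-parameter logistic test information into a conditional reliability. The abstract does not say how reliability was computed, so the sketch below assumes the common conversion reliability = 1 - 1/information on a latent scale with variance 1; the 23 identical item parameters are hypothetical and are not the RMDQ estimates.

```python
import math
from typing import Sequence, Tuple


def info_2pl(theta: float, a: float, b: float) -> float:
    """2PL item information at theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def conditional_reliability(theta: float, items: Sequence[Tuple[float, float]]) -> float:
    """Assumed conversion: 1 - SE^2 = 1 - 1/information, floored at 0."""
    information = sum(info_2pl(theta, a, b) for a, b in items)
    return max(0.0, 1.0 - 1.0 / information)


# Hypothetical 23-item scale whose items all target moderate severity (b = 0).
items = [(2.0, 0.0)] * 23

reliability_moderate = conditional_reliability(0.0, items)  # near item locations
reliability_severe = conditional_reliability(3.0, items)    # far above them
meets_criterion_moderate = reliability_moderate >= 0.90
meets_criterion_severe = reliability_severe >= 0.90
```

With these made-up parameters the scale is precise where its items are located and imprecise far above them, which mirrors the abstract's suggestion to replace moderate-severity items with severe-severity ones.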
Two iron-regulated transporter (IRT) genes showed differential expression in poplar trees under iron or zinc deficiency.

PubMed

Huang, Danqiong; Dai, Wenhao

2015-08-15

Two iron-regulated transporter (IRT) genes were cloned from the iron chlorosis resistant (PtG) and susceptible (PtY) Populus tremula 'Erecta' lines. Nucleotide sequence analysis showed no significant difference between PtG and PtY. The predicted proteins contain a conserved ZIP domain with 8 transmembrane (TM) regions. A ZIP signature sequence was found in the fourth TM domain. Phylogenetic analysis revealed that PtIRT1 was clustered with tomato and tobacco IRT genes that are highly responsive to iron deficiency. The PtIRT3 gene was clustered with the AtIRT3 gene that was related to zinc and iron transport in plants. Tissue-specific expression analysis indicated that PtIRT1 was expressed only in the root, while PtIRT3 was constitutively expressed in all tested tissues. Under iron deficiency, the expression of PtIRT1 was dramatically increased and a significantly higher transcript level was detected in PtG than in PtY. Iron deficiency also enhanced the expression of PtIRT3 in PtG. On the other hand, zinc deficiency down-regulated the expression of PtIRT1 and PtIRT3 in both PtG and PtY. Zinc accumulated significantly under iron-deficient conditions, whereas zinc deficiency showed no significant effect on iron accumulation. A yeast complementation test revealed that the PtIRT1 and PtIRT3 genes could restore the iron uptake ability under the iron uptake-deficiency condition. The results will help understand the mechanisms of iron deficiency response in poplar trees and other woody species. Copyright © 2015 Elsevier GmbH. All rights reserved.
Biases and Power for Groups Comparison on Subjective Health Measurements

PubMed Central

Hamel, Jean-François; Hardouin, Jean-Benoit; Le Neel, Tanguy; Kubis, Gildas; Roquelaure, Yves; Sébille, Véronique

2012-01-01

Subjective health measurements are increasingly used in clinical research, particularly for comparisons between patient groups. Two main types of analytical strategies can be used for such data: classical test theory (CTT), which relies on observed scores, and models from item response theory (IRT), which rely on a response model relating the item responses to a latent parameter, often called the latent trait. Whether IRT or CTT would be the most appropriate method to compare two independent groups of patients on a patient-reported outcome measure remains unknown and was investigated using simulations. For CTT-based analyses, the group comparison was performed using a t-test on the scores. For IRT-based analyses, several methods were compared, according to whether the Rasch model was considered with random effects or with fixed effects, and whether the group effect was included as a covariate or not. Individual latent trait values were estimated using either a deterministic method or stochastic approaches. Latent traits were then compared with a t-test. Finally, a two-step method was performed to compare the latent trait distributions, and a Wald test was performed to test the group effect in the Rasch model including group covariates. The only unbiased IRT-based method was the group covariate Wald test, performed on the random effects Rasch model. This model displayed the highest observed power, which was similar to the power using the score t-test. These results need to be extended to the case frequently encountered in practice where data are missing and possibly informative. PMID:23115620
A modular approach for item response theory modeling with the R package flirt.

PubMed

Jeon, Minjeong; Rijmen, Frank

2016-06-01

The new R package flirt is introduced for flexible item response theory (IRT) modeling of psychological, educational, and behavior assessment data. flirt integrates a generalized linear and nonlinear mixed modeling framework with graphical model theory. The graphical model framework allows for efficient maximum likelihood estimation. The key feature of flirt is its modular approach to facilitate convenient and flexible model specifications. Researchers can construct customized IRT models by simply selecting various modeling modules, such as parametric forms, number of dimensions, item and person covariates, person groups, link functions, etc. In this paper, we describe major features of flirt and provide examples to illustrate how flirt works in practice.
General mixture item response models with different item response structures: Exposition with an application to Likert scales.

PubMed

Tijmstra, Jesper; Bolsinova, Maria; Jeon, Minjeong

2018-01-10

This article proposes a general mixture item response theory (IRT) framework that allows for classes of persons to differ with respect to the type of processes underlying the item responses. Through the use of mixture models, nonnested IRT models with different structures can be estimated for different classes, and class membership can be estimated for each person in the sample. If researchers are able to provide competing measurement models, this mixture IRT framework may help them deal with some violations of measurement invariance. To illustrate this approach, we consider a two-class mixture model, where a person's responses to Likert-scale items containing a neutral middle category are either modeled using a generalized partial credit model, or through an IRTree model. In the first model, the middle category ("neither agree nor disagree") is taken to be qualitatively similar to the other categories, and is taken to provide information about the person's endorsement. In the second model, the middle category is taken to be qualitatively different and to reflect a nonresponse choice, which is modeled using an additional latent variable that captures a person's willingness to respond. The mixture model is studied using simulation studies and is applied to an empirical example.
Evaluating the validity of the Work Role Functioning Questionnaire (Canadian French version) using classical test theory and item response theory.

PubMed

Hong, Quan Nha; Coutu, Marie-France; Berbiche, Djamal

2017-01-01

The Work Role Functioning Questionnaire (WRFQ) was developed to assess workers' perceived ability to perform job demands and is used to monitor presenteeism. Still, few studies on its validity can be found in the literature. The purpose of this study was to assess the items and factorial composition of the Canadian French version of the WRFQ (WRFQ-CF). Two measurement approaches were used to test the WRFQ-CF: Classical Test Theory (CTT) and non-parametric Item Response Theory (IRT). A total of 352 completed questionnaires were analyzed. A four-factor model and a three-factor model were tested and showed good fit with 14 items (Root Mean Square Error of Approximation (RMSEA) = 0.06, Standardized Root Mean Square Residual (SRMR) = 0.04, Bentler Comparative Fit Index (CFI) = 0.98) and with 17 items (RMSEA = 0.059, SRMR = 0.048, CFI = 0.98), respectively. Using IRT, 13 problematic items were identified, of which 9 were common with CTT. This study tested different models; fewer problematic items were found in the three-factor model. Using non-parametric IRT and CTT for item purification gave complementary results. IRT is still scarcely used and can be an interesting alternative method to enhance the quality of a measurement instrument. More studies are needed on the WRFQ-CF to refine its items and factorial composition.
Taking the Missing Propensity Into Account When Estimating Competence Scores

PubMed Central

Pohl, Steffi; Carstensen, Claus H.

2014-01-01

When competence tests are administered, subjects frequently omit items. These missing responses pose a threat to correctly estimating the proficiency level. Newer model-based approaches aim to take nonignorable missing data processes into account by incorporating a latent missing propensity into the measurement model. Two assumptions are typically made when using these models: (1) The missing propensity is unidimensional and (2) the missing propensity and the ability are bivariate normally distributed. These assumptions may, however, be violated in real data sets and could, thus, pose a threat to the validity of this approach. The present study focuses on modeling competencies in various domains, using data from a school sample (N = 15,396) and an adult sample (N = 7,256) from the National Educational Panel Study. Our interest was to investigate whether violations of unidimensionality and the normal distribution assumption severely affect the performance of the model-based approach in terms of differences in ability estimates. We propose a model with a competence dimension, a unidimensional missing propensity and a distributional assumption more flexible than a multivariate normal. Using this model for ability estimation results in different ability estimates compared with a model ignoring missing responses. Implications for ability estimation in large-scale assessments are discussed. PMID:29795844
Integration of Infrared Thermography and Photogrammetric Surveying of Built Landscape

NASA Astrophysics Data System (ADS)

Scaioni, M.; Rosina, E.; L'Erario, A.; Dìaz-Vilariño, L.

2017-05-01

The thermal analysis of buildings represents a key step for the reduction of energy consumption, also in the case of Cultural Heritage. Here the complexity of the constructions and the adopted materials might require special analysis and tailored solutions. Infrared Thermography (IRT) is an important non-destructive investigation technique that may aid in the thermal analysis of buildings. The paper reports the application of IRT to a listed building belonging to the Cultural Heritage and to a residential one, as a demonstration that IRT is a suitable and convenient tool for analysing existing buildings. The purposes of the analysis are the assessment of the damage and the energy efficiency of the building envelope. Since in many cases the complex geometry of historic constructions may affect the thermal analysis, the integration of IRT and accurate 3D models has been developed in recent years. Here the authors propose a solution based on up-to-date photogrammetric solutions for purely image-based 3D modelling, including automatic image orientation/sensor calibration using Structure-from-Motion and dense matching. Thus, an almost fully automatic pipeline for the generation of accurate 3D models showing the temperatures on a building skin in a realistic manner is described, where the only manual task is the measurement of a few common points for co-registration of RGB and IR photogrammetric projects.
The Robustness of LOGIST and BILOG IRT Estimation Programs to Violations of Local Independence.

ERIC Educational Resources Information Center

Ackerman, Terry A.

One of the important underlying assumptions of all item response theory (IRT) models is that of local independence. This assumption requires that the response to an item on a test not be influenced by the response to any other items. This assumption is often taken for granted, with little or no scrutiny of the response process required to answer…
Assessing the Utility of Item Response Theory Models: Differential Item Functioning.

ERIC Educational Resources Information Center

Scheuneman, Janice Dowd

The current status of item response theory (IRT) is discussed. Several IRT methods exist for assessing whether an item is biased. Focus is on methods proposed by L. M. Rudner (1975), F. M. Lord (1977), D. Thissen et al. (1988) and R. L. Linn and D. Harnisch (1981). Rudner suggested a measure of the area lying between the two item characteristic…
An Introduction to Item Response Theory for Patient-Reported Outcome Measurement

PubMed Central

Nguyen, Tam H.; Han, Hae-Ra; Kim, Miyong T.

2015-01-01

The growing emphasis on patient-centered care has accelerated the demand for high-quality data from patient-reported outcome (PRO) measures. Traditionally, the development and validation of these measures has been guided by classical test theory. However, item response theory (IRT), an alternate measurement framework, offers promise for addressing practical measurement problems found in health-related research that have been difficult to solve through classical methods. This paper introduces foundational concepts in IRT, as well as commonly used models and their assumptions. Existing data on a combined sample (n = 636) of Korean American and Vietnamese American adults who responded to the High Blood Pressure Health Literacy Scale and the Patient Health Questionnaire-9 are used to exemplify typical applications of IRT. These examples illustrate how IRT can be used to improve the development, refinement, and evaluation of PRO measures. Greater use of methods based on this framework can increase the accuracy and efficiency with which PROs are measured. PMID:24403095

An introduction to item response theory for patient-reported outcome measurement.

PubMed

Nguyen, Tam H; Han, Hae-Ra; Kim, Miyong T; Chan, Kitty S

2014-01-01

The growing emphasis on patient-centered care has accelerated the demand for high-quality data from patient-reported outcome (PRO) measures. Traditionally, the development and validation of these measures has been guided by classical test theory. However, item response theory (IRT), an alternate measurement framework, offers promise for addressing practical measurement problems found in health-related research that have been difficult to solve through classical methods. This paper introduces foundational concepts in IRT, as well as commonly used models and their assumptions. Existing data on a combined sample (n = 636) of Korean American and Vietnamese American adults who responded to the High Blood Pressure Health Literacy Scale and the Patient Health Questionnaire-9 are used to exemplify typical applications of IRT. These examples illustrate how IRT can be used to improve the development, refinement, and evaluation of PRO measures. Greater use of methods based on this framework can increase the accuracy and efficiency with which PROs are measured.
How States Can Reduce the Dropout Rate for Undocumented Immigrant Youth: The Effects of In-State Resident Tuition Policies

PubMed Central

Potochnick, Stephanie

2016-01-01

As of December 2011, 13 states have adopted an in-state resident tuition (IRT) policy that provides in-state tuition to undocumented immigrants and several other states are considering similar legislation. While previous research focuses on how IRT policies affect college entry and attainment, this study examines the effect these policies have on high school dropout behavior. Using the Current Population Survey (CPS) and difference-in-difference models, this paper examines whether IRT policies reduce the likelihood of dropping out of high school for Mexican foreign-born non-citizens (FBNC), a proxy for undocumented youth. The policy is estimated to cause an eight percentage point reduction in the proportion that drops out of high school. The paper develops an integrated framework that combines human capital theory with segmented assimilation theory to provide insight into how IRT policies influence student motivation and educational attainment at the high school level. PMID:24576624
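The core arithmetic behind a difference-in-difference comparison can be sketched as follows. This is a simplified two-group, two-period illustration and not the regression specification used in the study; the function name and all proportions are hypothetical.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the comparison group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical dropout proportions, NOT values from the study.
effect = diff_in_diff(
    treated_pre=0.40, treated_post=0.30,
    control_pre=0.35, control_post=0.32,
)
print(f"{effect:+.2f}")
```

A negative value indicates that dropout fell more in the group exposed to the policy than in the comparison group over the same period.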
A dynamic Thurstonian item response theory of motive expression in the picture story exercise: solving the internal consistency paradox of the PSE.

PubMed

Lang, Jonas W B

2014-07-01

The measurement of implicit or unconscious motives using the picture story exercise (PSE) has long been a target of debate in the psychological literature. Most debates have centered on the apparent paradox that PSE measures of implicit motives typically show low internal consistency reliability on common indices like Cronbach's alpha but nevertheless predict behavioral outcomes. I describe a dynamic Thurstonian item response theory (IRT) model that builds on dynamic system theories of motivation, theorizing on the PSE response process, and recent advancements in Thurstonian IRT modeling of choice data. To assess the models' capability to explain the internal consistency paradox, I first fitted the model to archival data (Gurin, Veroff, & Feld, 1957) and then simulated data based on bias-corrected model estimates from the real data. Simulation results revealed that the average squared correlation reliability for the motives in the Thurstonian IRT model was .74 and that Cronbach's alpha values were similar to the real data (<.35). These findings suggest that PSE motive measures have long been reliable and increase the scientific value of extant evidence from motivational research using PSE motive measures. (c) 2014 APA, all rights reserved.
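For readers unfamiliar with the index discussed above, Cronbach's alpha can be computed from a respondents-by-items score table as in the sketch below. The function name and the data are hypothetical and unrelated to the archival PSE data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the total (summed) score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


# Hypothetical data: 4 respondents x 3 items.
data = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(data)
print(round(alpha, 2))
```

Alpha rises when items covary strongly relative to their individual variances, which is why weakly correlated PSE picture scores produce low values.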
Value-based formulas for purchasing. PEHP's designated service provider program: value-based purchasing through global fees.

PubMed

Emery, D W

1997-01-01

In many circles, managed care and capitation have become synonymous; unfortunately, the assumptions informing capitation are based on a flawed unidimensional model of risk. PEHP of Utah has rejected the unidimensional model and has therefore embraced a multidimensional model of risk that suggests that global fees are the optimal purchasing modality. A globally priced episode of care forms a natural unit of analysis that enhances purchasing clarity, allows providers to more efficiently focus on the Marginal Rate of Technical Substitution, and conforms to the multidimensional reality of risk. Most importantly, global fees simultaneously maximize patient choice and provider cost consciousness.
Standardized assessment of infrared thermographic fever screening system performance

NASA Astrophysics Data System (ADS)

Ghassemi, Pejhman; Pfefer, Joshua; Casamento, Jon; Wang, Quanzeng

2017-03-01

Thermal modalities represent the only currently viable mass fever screening approach for outbreaks of infectious disease pandemics such as Ebola and SARS. Non-contact infrared thermometers (NCITs) and infrared thermographs (IRTs) have been previously used for mass fever screening in transportation hubs such as airports to reduce the spread of disease. While NCITs remain a more popular choice for fever screening in the field and at fixed locations, there has been increasing evidence in the literature that IRTs can provide greater accuracy in estimating core body temperature if appropriate measurement practices are applied - including the use of technically suitable thermographs. Therefore, the purpose of this study was to develop a battery of evaluation test methods for standardized, objective and quantitative assessment of thermograph performance characteristics critical to assessing suitability for clinical use. These factors include stability, drift, uniformity, minimum resolvable temperature difference, and accuracy. Two commercial IRT models were characterized. An external temperature reference source with high temperature accuracy was utilized as part of the screening thermograph. Results showed that both IRTs are relatively accurate and stable (<1% error of reading with stability of +/-0.05°C). Overall, results of this study may facilitate development of standardized consensus test methods to enable consistent and accurate use of IRTs for fever screening.
Equal Area Logistic Estimation for Item Response Theory

NASA Astrophysics Data System (ADS)

Lo, Shih-Ching; Wang, Kuo-Chang; Chang, Hsin-Li

2009-08-01

Item response theory (IRT) models use logistic functions exclusively as item response functions (IRFs). Applications of IRT models require obtaining the set of values for logistic function parameters that best fit an empirical data set. However, success in obtaining such set of values does not guarantee that the constructs they represent actually exist, for the adequacy of a model is not sustained by the possibility of estimating parameters. In this study, an equal area based two-parameter logistic model estimation algorithm is proposed. Two theorems are given to prove that the results of the algorithm are equivalent to the results of fitting data by logistic model. Numerical results are presented to show the stability and accuracy of the algorithm.
The factor structure of the Values in Action Inventory of Strengths (VIA-IS): An item-level exploratory structural equation modeling (ESEM) bifactor analysis.

PubMed

Ng, Vincent; Cao, Mengyang; Marsh, Herbert W; Tay, Louis; Seligman, Martin E P

2017-08-01

The factor structure of the Values in Action Inventory of Strengths (VIA-IS; Peterson & Seligman, 2004) has not been well established as a result of methodological challenges primarily attributable to a global positivity factor, item cross-loading across character strengths, and questions concerning the unidimensionality of the scales assessing character strengths. We sought to overcome these methodological challenges by applying exploratory structural equation modeling (ESEM) at the item level using a bifactor analytic approach to a large sample of 447,573 participants who completed the VIA-IS with all 240 character strengths items and a reduced set of 107 unidimensional character strength items. It was found that a 6-factor bifactor structure generally held for the reduced set of unidimensional character strength items; these dimensions were justice, temperance, courage, wisdom, transcendence, humanity, and an overarching general factor that is best described as dispositional positivity. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Newborn screening for cystic fibrosis in Wisconsin: comparison of biochemical and molecular methods.

PubMed

Gregg, R G; Simantel, A; Farrell, P M; Koscik, R; Kosorok, M R; Laxova, A; Laessig, R; Hoffman, G; Hassemer, D; Mischler, E H; Splaingard, M

1997-06-01

To evaluate neonatal screening for cystic fibrosis (CF), including study of the screening procedures and characteristics of false-positive infants, over the past 10 years in Wisconsin. An important objective evolving from the original design has been to compare use of a single-tier immunoreactive trypsinogen (IRT) screening method with that of a two-tier method using IRT and analyses of samples for the most common cystic fibrosis transmembrane regulator (CFTR) (ΔF508) mutation. We also examined the benefit of including up to 10 additional CFTR mutations in the screening protocol. From 1985 to 1994, using either the IRT or IRT/DNA protocol, 220 862 and 104 308 neonates, respectively, were screened for CF. For the IRT protocol, neonates with an IRT ≥180 ng/mL were considered positive, and the standard sweat chloride test was administered to determine CF status. For the IRT/DNA protocol, samples from the original dried-blood specimen on the Guthrie card of neonates with an IRT ≥110 ng/mL were tested for the presence of the ΔF508 CFTR allele, and if the DNA test revealed one or two ΔF508 alleles, a sweat test was obtained. Both screening procedures had very high specificity. The sensitivity tended to be higher with the IRT/DNA protocol, but the differences were not statistically significant. The positive predictive value of the IRT/DNA screening protocol was 15.2% compared with 6.4% if the same samples had been screened by the IRT method. Assessment of the false-positive IRT/DNA population revealed that the two-tier method eliminates the disproportionate number of infants with low Apgar scores and also the high prevalence of African-Americans identified previously in our study of newborns with high IRT levels. We found that 55% of DNA-positive CF infants were homozygous for ΔF508 and 40% had one ΔF508 allele. Adding analyses for 10 more CFTR mutations has only a small effect on the sensitivity but is likely to add significantly to the cost of screening. Advantages of the IRT/DNA protocol over IRT analysis include improved positive predictive value, reduction of false-positive infants, and more rapid diagnosis with elimination of recall specimens.
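The two screening rules described in this abstract can be sketched as simple decision functions. The cutoffs (180 and 110 ng/mL) and the allele rule come from the abstract; the function names, the example inputs, and the counts passed to the predictive-value helper are hypothetical.

```python
IRT_ONLY_CUTOFF = 180.0  # ng/mL, single-tier protocol (from the abstract)
IRT_DNA_CUTOFF = 110.0   # ng/mL, first tier of the IRT/DNA protocol


def irt_only_refers(irt_ng_ml):
    """Single-tier rule: refer for a sweat test when IRT >= 180 ng/mL."""
    return irt_ng_ml >= IRT_ONLY_CUTOFF


def irt_dna_refers(irt_ng_ml, delta_f508_alleles):
    """Two-tier rule: DNA testing only when IRT >= 110 ng/mL, then refer
    for a sweat test when one or two DeltaF508 alleles are found."""
    if irt_ng_ml < IRT_DNA_CUTOFF:
        return False
    return delta_f508_alleles >= 1


def positive_predictive_value(true_positives, false_positives):
    """PPV = TP / (TP + FP)."""
    return true_positives / (true_positives + false_positives)


# Hypothetical neonate: IRT of 150 ng/mL with one DeltaF508 allele.
print(irt_only_refers(150.0), irt_dna_refers(150.0, 1))
```

The example neonate would be missed by the single-tier cutoff but referred under the two-tier rule, which shows how a lower first-tier threshold combined with a DNA confirmation step can change who is referred.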
Patient self-report section of the ASES questionnaire: a Spanish validation study using classical test theory and the Rasch model.

PubMed

Vrotsou, Kalliopi; Cuéllar, Ricardo; Silió, Félix; Rodriguez, Miguel Ángel; Garay, Daniel; Busto, Gorka; Trancho, Ziortza; Escobar, Antonio

2016-10-18

The aim of the current study was to validate the Spanish version of the self-report section of the American Shoulder and Elbow Surgeons questionnaire (ASES-p). Shoulder pathology patients were recruited and followed up to 6 months post treatment. The ASES-p, Constant, SF-36 and Barthel scales were completed pre and post treatment. Reliability was tested with Cronbach's alpha and convergent validity with Spearman's correlation coefficients. Confirmatory factor analysis (CFA) and the Rasch model were implemented for assessing structural validity and unidimensionality of the scale. Models with and without the pain item were considered. Responsiveness to change was explored via standardised effect sizes. Results were acceptable for both tested models. Cronbach's alpha was 0.91, and total scale correlations with Constant and physical SF-36 dimensions were >0.50. Factor loadings for CFA were >0.40. The Rasch model confirmed unidimensionality of the scale, even though item 10 "do usual sport" was suggested as non-informative. Finally, patients with improved post treatment shoulder function and those receiving surgery had higher standardised effect sizes. The adapted Spanish ASES-p version is a valid and reliable tool for shoulder evaluation and its unidimensionality is supported by the data.
Impact of IRT item misfit on score estimates and severity classifications: an examination of PROMIS depression and pain interference item banks.

PubMed

Zhao, Yue

2017-03-01

In patient-reported outcome research that utilizes item response theory (IRT), using statistical significance tests to detect misfit is usually the focus of IRT model-data fit evaluations. However, such evaluations rarely address the impact/consequence of using misfitting items on the intended clinical applications. This study was designed to evaluate the impact of IRT item misfit on score estimates and severity classifications and to demonstrate a recommended process of model-fit evaluation. Using secondary data sources collected from the Patient-Reported Outcome Measurement Information System (PROMIS) wave 1 testing phase, analyses were conducted based on PROMIS depression (28 items; 782 cases) and pain interference (41 items; 845 cases) item banks. The identification of misfitting items was assessed using Orlando and Thissen's summed-score item-fit statistics and graphical displays. The impact of misfit was evaluated according to the agreement of both IRT-derived T-scores and severity classifications between inclusion and exclusion of misfitting items. The examination of the presence and impact of misfit suggested that item misfit had a negligible impact on the T-score estimates and severity classifications with the general population sample in the PROMIS depression and pain interference item banks, implying that the impact of item misfit was insignificant. Findings support the T-score estimates in the two item banks as robust against item misfit at both the group and individual levels and add confidence to the use of T-scores for severity diagnosis in the studied sample. Recommendations on approaches for identifying item misfit (statistical significance) and assessing the misfit impact (practical significance) are given.
An introduction to mixture item response theory models.

PubMed

De Ayala, R J; Santiago, S Y

2017-02-01

Mixture item response theory (IRT) allows one to address situations that involve a mixture of latent subpopulations that are qualitatively different but within which a measurement model based on a continuous latent variable holds. In this modeling framework, one can characterize students by both their location on a continuous latent variable as well as by their latent class membership. For example, in a study of risky youth behavior this approach would make it possible to estimate an individual's propensity to engage in risky youth behavior (i.e., on a continuous scale) and to use these estimates to identify youth who might be at the greatest risk given their class membership. Mixture IRT can be used with binary response data (e.g., true/false, agree/disagree, endorsement/not endorsement, correct/incorrect, presence/absence of a behavior), Likert response scales, partial correct scoring, nominal scales, or rating scales. In the following, we present mixture IRT modeling and two examples of its use. Data needed to reproduce analyses in this article are available as supplemental online materials at http://dx.doi.org/10.1016/j.jsp.2016.01.002. Copyright © 2016 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
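The combination of a continuous location and a latent class membership can be sketched with a small mixture Rasch example. This is a simplified illustration that holds the person's location fixed and assumes known class proportions and class-specific item difficulties; all names and values are hypothetical and are not drawn from the article's examples.

```python
import math


def rasch_prob(theta, b):
    """Rasch probability of a positive response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def pattern_likelihood(theta, responses, difficulties):
    """Likelihood of a binary response pattern within one latent class."""
    like = 1.0
    for x, b in zip(responses, difficulties):
        p = rasch_prob(theta, b)
        like *= p if x == 1 else (1.0 - p)
    return like


def class_posteriors(theta, responses, class_weights, difficulties_by_class):
    """Posterior class probabilities for a pattern, given a fixed theta."""
    joint = [
        w * pattern_likelihood(theta, responses, diffs)
        for w, diffs in zip(class_weights, difficulties_by_class)
    ]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical two-class setup in which item difficulty order is reversed.
weights = [0.6, 0.4]
difficulties = [[-1.0, 0.0, 1.0], [1.0, 0.0, -1.0]]
posteriors = class_posteriors(0.0, [1, 0, 0], weights, difficulties)
print([round(p, 3) for p in posteriors])
```

In a full mixture IRT analysis the location and the class proportions are estimated jointly from the data; the sketch only shows how a response pattern shifts the class probabilities once those quantities are given.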
Expression of Malus xiaojinensis IRT1 (MxIRT1) protein in transgenic yeast cells leads to degradation through autophagy in the presence of excessive iron.

PubMed

Li, Shuang; Zhang, Xi; Zhang, Xiu-Yue; Xiao, Wei; Berry, James O; Li, Peng; Jin, Si; Tan, Song; Zhang, Peng; Zhao, Wei-Zhong; Yin, Li-Ping

2015-07-01

Iron is essential for plants, but highly toxic when present in excess. Consequently, iron uptake by root transporters must be finely tuned to avoid excess uptake from soil under iron excess. The iron-regulated transporter of Malus xiaojinensis (MxIRT1), induced in roots under iron deficiency, is a highly effective iron(II) transporter. Here, we investigated how the presence of excessive iron leads to MxIRT1 degradation in yeast expressing this plant iron transporter protein. To determine the relationship between iron abundance and MxIRT1 degradation, relative levels of autophagy-related gene-8 (ATG8) mRNA and the active ATG8-phosphatidylethanolamine-conjugated (PE) protein were measured in wild-type yeast and the autophagic mutant strains atg1∆, atg5∆, atg7∆, ypt7∆ and tor1∆ under normal and excessive iron conditions. The data showed that the exposure of MxIRT1-eGFP-transformed wild-type and tor1∆ strains to excessive iron led to significantly increased levels of ATG8 transcript and ATG8-PE protein, which resulted in enhanced MxIRT1 degradation. Co-localization of mCherry-ATG8 and MxIRT1-eGFP provided evidence that these proteins interact during autophagy in yeast. Inhibition of autophagic initiation, autophagosome formation and vacuole fusion all decreased MxIRT1 degradation. PMSF inhibition of autophagy prevented degradation, leading to the accumulation of MxIRT1-containing vesicles in the vacuoles. MxIRT1-vesicles were sorted into autophagosomes for iron-induced degradation in yeast, whereas the endogenous iron(II) transporter Fet4 was degraded in an autophagy-independent manner. Moreover, immunoprecipitation showed that multimono-ubiquitins provided MxIRT1 with the ubiquitination signal. Together, three factors, iron excess, autophagy and mono-ubiquitination, affect the functional activity and stability of exogenous MxIRT1 in yeast, thereby preventing iron uptake via this root transporter. Copyright © 2015 John Wiley & Sons, Ltd.
Dimensionality and predictive validity of the HAM-Nat, a test of natural sciences for medical school admission

PubMed Central

2011-01-01

Background: Knowledge in natural sciences generally predicts study performance in the first two years of the medical curriculum. In order to reduce delay and dropout in the preclinical years, Hamburg Medical School decided to develop a natural science test (HAM-Nat) for student selection. In the present study, two different approaches to scale construction are presented: a unidimensional scale and a scale composed of three subject specific dimensions. Their psychometric properties and relations to academic success are compared. Methods: 334 first year medical students of the 2006 cohort responded to 52 multiple choice items from biology, physics, and chemistry. For the construction of scales we generated two random subsamples, one for development and one for validation. In the development sample, unidimensional item sets were extracted from the item pool by means of weighted least squares (WLS) factor analysis, and subsequently fitted to the Rasch model. In the validation sample, the scales were subjected to confirmatory factor analysis and, again, Rasch modelling. The outcome measure was academic success after two years. Results: Although the correlational structure within the item set is weak, a unidimensional scale could be fitted to the Rasch model. However, psychometric properties of this scale deteriorated in the validation sample. A model with three highly correlated subject specific factors performed better. All summary scales predicted academic success with an odds ratio of about 2.0. Prediction was independent of high school grades and there was a slight tendency for prediction to be better in females than in males. Conclusions: A model separating biology, physics, and chemistry into different Rasch scales seems to be more suitable for item bank development than a unidimensional model, even when these scales are highly correlated and enter into a global score. When such a combination scale is used to select the upper quartile of applicants, the proportion of successful completion of the curriculum after two years is expected to rise substantially. PMID:21999767
Dimensionality and predictive validity of the HAM-Nat, a test of natural sciences for medical school admission.

PubMed

Hissbach, Johanna C; Klusmann, Dietrich; Hampe, Wolfgang

2011-10-14

Knowledge in natural sciences generally predicts study performance in the first two years of the medical curriculum. In order to reduce delay and dropout in the preclinical years, Hamburg Medical School decided to develop a natural science test (HAM-Nat) for student selection. In the present study, two different approaches to scale construction are presented: a unidimensional scale and a scale composed of three subject specific dimensions. Their psychometric properties and relations to academic success are compared. 334 first year medical students of the 2006 cohort responded to 52 multiple choice items from biology, physics, and chemistry. For the construction of scales we generated two random subsamples, one for development and one for validation. In the development sample, unidimensional item sets were extracted from the item pool by means of weighted least squares (WLS) factor analysis, and subsequently fitted to the Rasch model. In the validation sample, the scales were subjected to confirmatory factor analysis and, again, Rasch modelling. The outcome measure was academic success after two years. Although the correlational structure within the item set is weak, a unidimensional scale could be fitted to the Rasch model. However, psychometric properties of this scale deteriorated in the validation sample. A model with three highly correlated subject specific factors performed better. All summary scales predicted academic success with an odds ratio of about 2.0. Prediction was independent of high school grades and there was a slight tendency for prediction to be better in females than in males. A model separating biology, physics, and chemistry into different Rasch scales seems to be more suitable for item bank development than a unidimensional model, even when these scales are highly correlated and enter into a global score. When such a combination scale is used to select the upper quartile of applicants, the proportion of successful completion of the curriculum after two years is expected to rise substantially.
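As background for the Rasch scales mentioned above, the sketch below shows the Rasch response function and a Newton-Raphson maximum-likelihood estimate of one person's ability from known item difficulties. It illustrates the model only, not the scale-construction procedure of the study; the function names and all values are hypothetical.

```python
import math


def rasch_prob(theta, b):
    """Rasch probability of a correct response: logistic in (theta - b)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_ability(responses, difficulties, iterations=50, tol=1e-8):
    """Maximum-likelihood ability estimate for binary responses."""
    raw = sum(responses)
    if raw == 0 or raw == len(responses):
        # The likelihood has no finite maximum for zero or perfect scores.
        raise ValueError("ML estimate is not finite for zero or perfect scores")
    theta = 0.0
    for _ in range(iterations):
        probs = [rasch_prob(theta, b) for b in difficulties]
        gradient = raw - sum(probs)                      # score function
        information = sum(p * (1.0 - p) for p in probs)  # Fisher information
        step = gradient / information
        theta += step
        if abs(step) < tol:
            break
    return theta


# Hypothetical three-item test with two correct answers.
theta_hat = estimate_ability([1, 1, 0], [-1.0, 0.0, 1.0])
print(round(theta_hat, 2))
```

Under the Rasch model the raw score is a sufficient statistic for ability, which is why only the sum of the responses enters the gradient.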
Rasch analysis of the Chedoke-McMaster Attitudes towards Children with Handicaps scale.

PubMed

Armstrong, Megan; Morris, Christopher; Tarrant, Mark; Abraham, Charles; Horton, Mike C

2017-02-01

Aim: To assess whether the Chedoke-McMaster Attitudes towards Children with Handicaps (CATCH) 36-item total scale and subscales fit the unidimensional Rasch model. Method: The CATCH was administered to 1881 children, aged 7-16 years in a cross-sectional survey. Data were used from a random sample of 416 for the initial Rasch analysis. The analysis was performed on the 36-item scale and then separately for each subscale. The analysis explored fit to the Rasch model in terms of overall scale fit, individual item fit, item response categories, and unidimensionality. Item bias for gender and school level was also assessed. Revised scales were then tested on an independent second random sample of 415 children. Results: Analyses indicated that the 36-item overall scale was not unidimensional and did not fit the Rasch model. Two scales of affective attitudes and behavioural intention were retained after four items were removed from each due to misfit to the Rasch model. Additionally, the scaling was improved when the two most negative response categories were aggregated. There was no item bias by gender or school level on the revised scales. Items assessing cognitive attitudes did not fit the Rasch model and had low internal consistency as a scale. Conclusion: Affective attitudes and behavioural intention CATCH sub-scales should be treated separately. Caution should be exercised when using the cognitive subscale. Implications for Rehabilitation: The 36-item Chedoke-McMaster Attitudes towards Children with Handicaps (CATCH) scale as a whole did not fit the Rasch model; thus indicating a multi-dimensional scale. Researchers should use two revised eight-item subscales of affective attitudes and behavioural intentions when exploring interventions aiming to improve children's attitudes towards disabled people or factors associated with those attitudes. Researchers should use the cognitive subscale with caution, as it did not create a unidimensional and internally consistent scale. Therefore, conclusions drawn from this scale may not accurately reflect children's attitudes.
What are the appropriate methods for analyzing patient-reported outcomes in randomized trials when data are missing?

PubMed

Hamel, J F; Sebille, V; Le Neel, T; Kubis, G; Boyer, F C; Hardouin, J B

2017-12-01

Subjective health measurements using Patient Reported Outcomes (PRO) are increasingly used in randomized trials, particularly for patient-group comparisons. Two main types of analytical strategies can be used for such data: Classical Test Theory (CTT) and Item Response Theory (IRT) models. These two strategies display very similar characteristics when data are complete, but in the common case when data are missing, whether IRT or CTT would be the most appropriate remains unknown and was investigated using simulations. We simulated PRO data such as quality of life data. Missing responses to items were simulated as being completely random, depending on an observable covariate or on an unobserved latent trait. The considered CTT-based methods allowed comparing scores using complete-case analysis, personal mean imputations or multiple-imputations based on a two-way procedure. The IRT-based method was the Wald test on a Rasch model including a group covariate. The IRT-based method and the multiple-imputations-based method for CTT displayed the highest observed power and were the only unbiased methods whatever the kind of missing data. Online software and Stata® modules compatible with the built-in mi impute suite are provided for performing such analyses. Traditional procedures (listwise deletion and personal mean imputations) should be avoided, due to inevitable problems of biases and lack of power.
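To make one of the compared procedures concrete, the sketch below shows personal mean imputation, which replaces each missing item response with the mean of that person's observed responses. It is shown only to clarify what the method does, since the abstract advises against it; the function name and data are hypothetical.

```python
def personal_mean_impute(responses):
    """Replace missing responses (None) with the person's observed mean."""
    observed = [r for r in responses if r is not None]
    if not observed:
        # Nothing to impute from: return the pattern unchanged.
        return list(responses)
    mean = sum(observed) / len(observed)
    return [mean if r is None else r for r in responses]


# Hypothetical respondent who skipped the second of three items.
print(personal_mean_impute([1, None, 3]))
```

Because the imputed value depends only on the same person's other answers, the method ignores item difficulty and the reason the response is missing, which is one way bias can arise.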
Development and initial evaluation of the SCI-FI/AT

PubMed Central

Jette, Alan M.; Slavin, Mary D.; Ni, Pengsheng; Kisala, Pamela A.; Tulsky, David S.; Heinemann, Allen W.; Charlifue, Susie; Tate, Denise G.; Fyffe, Denise; Morse, Leslie; Marino, Ralph; Smith, Ian; Williams, Steve

2015-01-01

Objectives: To describe the domain structure and calibration of the Spinal Cord Injury Functional Index for samples using Assistive Technology (SCI-FI/AT) and report the initial psychometric properties of each domain. Design: Cross sectional survey followed by computerized adaptive test (CAT) simulations. Setting: Inpatient and community settings. Participants: A sample of 460 adults with traumatic spinal cord injury (SCI) stratified by level of injury, completeness of injury, and time since injury. Interventions: None. Main outcome measure: SCI-FI/AT. Results: Confirmatory factor analysis (CFA) and Item response theory (IRT) analyses identified 4 unidimensional SCI-FI/AT domains: Basic Mobility (41 items), Self-care (71 items), Fine Motor Function (35 items), and Ambulation (29 items). High correlations of full item banks with 10-item simulated CATs indicated high accuracy of each CAT in estimating a person's function, and there was high measurement reliability for the simulated CAT scales compared with the full item bank. SCI-FI/AT item difficulties in the domains of Self-care, Fine Motor Function, and Ambulation were less difficult than the same items in the original SCI-FI item banks. Conclusion: With the development of the SCI-FI/AT, clinicians and investigators have available multidimensional assessment scales that evaluate function for users of AT to complement the scales available in the original SCI-FI. PMID:26010975
Development and initial evaluation of the SCI-FI/AT.

PubMed

Jette, Alan M; Slavin, Mary D; Ni, Pengsheng; Kisala, Pamela A; Tulsky, David S; Heinemann, Allen W; Charlifue, Susie; Tate, Denise G; Fyffe, Denise; Morse, Leslie; Marino, Ralph; Smith, Ian; Williams, Steve

2015-05-01

Objectives: To describe the domain structure and calibration of the Spinal Cord Injury Functional Index for samples using Assistive Technology (SCI-FI/AT) and report the initial psychometric properties of each domain. Design: Cross sectional survey followed by computerized adaptive test (CAT) simulations. Setting: Inpatient and community settings. Participants: A sample of 460 adults with traumatic spinal cord injury (SCI) stratified by level of injury, completeness of injury, and time since injury. Interventions: None. Main outcome measure: SCI-FI/AT. Results: Confirmatory factor analysis (CFA) and Item response theory (IRT) analyses identified 4 unidimensional SCI-FI/AT domains: Basic Mobility (41 items), Self-care (71 items), Fine Motor Function (35 items), and Ambulation (29 items). High correlations of full item banks with 10-item simulated CATs indicated high accuracy of each CAT in estimating a person's function, and there was high measurement reliability for the simulated CAT scales compared with the full item bank. SCI-FI/AT item difficulties in the domains of Self-care, Fine Motor Function, and Ambulation were less difficult than the same items in the original SCI-FI item banks. Conclusion: With the development of the SCI-FI/AT, clinicians and investigators have available multidimensional assessment scales that evaluate function for users of AT to complement the scales available in the original SCI-FI.
Using R and WinBUGS to fit a Generalized Partial Credit Model for developing and evaluating patient-reported outcomes assessments

PubMed Central

Li, Yuelin; Baser, Ray

2013-01-01

The US Food and Drug Administration recently announced the final guidelines on the development and validation of Patient-Reported Outcomes (PROs) assessments in drug labeling and clinical trials. This guidance paper may boost the demand for new PRO survey questionnaires. Henceforth biostatisticians may encounter psychometric methods more frequently, particularly Item Response Theory (IRT) models to guide the shortening of a PRO assessment instrument. This article aims to provide an introduction on the theory and practical analytic skills in fitting a Generalized Partial Credit Model in IRT (GPCM). GPCM theory is explained first, with special attention to a clearer exposition of the formal mathematics than what is typically available in the psychometric literature. Then a worked example is presented, using self-reported responses taken from the International Personality Item Pool. The worked example contains step-by-step guides on using the statistical languages R and WinBUGS in fitting the GPCM. Finally, the Fisher information function of the GPCM model is derived and used to evaluate, as an illustrative example, the usefulness of assessment items by their information contents. This article aims to encourage biostatisticians to apply IRT models in the re-analysis of existing data and in future research. PMID:22362655
Using R and WinBUGS to fit a generalized partial credit model for developing and evaluating patient-reported outcomes assessments.

PubMed

Li, Yuelin; Baser, Ray

2012-08-15

The US Food and Drug Administration recently announced the final guidelines on the development and validation of patient-reported outcomes (PROs) assessments in drug labeling and clinical trials. This guidance paper may boost the demand for new PRO survey questionnaires. Henceforth, biostatisticians may encounter psychometric methods more frequently, particularly item response theory (IRT) models to guide the shortening of a PRO assessment instrument. This article aims to provide an introduction to the theory and practical analytic skills in fitting a generalized partial credit model (GPCM) in IRT. GPCM theory is explained first, with special attention to a clearer exposition of the formal mathematics than what is typically available in the psychometric literature. Then, a worked example is presented, using self-reported responses taken from the International Personality Item Pool. The worked example contains step-by-step guides on using the statistical languages R and WinBUGS in fitting the GPCM. Finally, the Fisher information function of the GPCM model is derived and used to evaluate, as an illustrative example, the usefulness of assessment items by their information contents. This article aims to encourage biostatisticians to apply IRT models in the re-analysis of existing data and in future research. Copyright © 2012 John Wiley & Sons, Ltd.

An uncleaved signal peptide directs the Malus xiaojinensis iron transporter protein Mx IRT1 into the ER for the PM secretory pathway.

PubMed

Zhang, Peng; Tan, Song; Berry, James O; Li, Peng; Ren, Na; Li, Shuang; Yang, Guang; Wang, Wei-Bing; Qi, Xiao-Ting; Yin, Li-Ping

2014-11-07

Malus xiaojinensis iron-regulated transporter 1 (Mx IRT1) is a highly effective inducible iron transporter in the iron efficient plant Malus xiaojinensis. As a multi-pass integral plasma membrane (PM) protein, Mx IRT1 is predicted to consist of eight transmembrane domains, with a putative N-terminal signal peptide (SP) of 1-29 amino acids. To explore the role of the putative SP, constructs expressing Mx IRT1 (with an intact SP) and Mx DsIRT1 (with a deleted SP) were prepared for expression in Arabidopsis and in yeast. Mx IRT1 could rescue the iron-deficiency phenotype of an Arabidopsis irt1 mutant, and complement the iron-limited growth defect of the yeast mutant DEY 1453 (fet3fet4). Furthermore, fluorescence analysis indicated that a chimeric Mx IRT1-eGFP (enhanced Green Fluorescent Protein) construct was translocated into the ER (Endoplasmic reticulum) for the PM sorting pathway. In contrast, the SP-deleted Mx DsIRT1 could not rescue either of the mutant phenotypes, nor direct transport of the GFP signal into the ER. Interestingly, immunoblot analysis indicated that the SP was not cleaved from the mature protein following transport into the ER. Taken together, data presented here provides strong evidence that an uncleaved SP determines ER-targeting of Mx IRT1 during the initial sorting stage, thereby enabling the subsequent transport and integration of this protein into the PM for its crucial role in iron uptake.
Rasch models suggested the satisfactory psychometric properties of the World Health Organization Quality of Life-Brief among lung cancer patients.

PubMed

Lin, Chung-Ying; Yang, Szu-Chun; Lai, Wu-Wei; Su, Wu-Chou; Wang, Jung-Der

2017-03-01

The study examined whether the items of the World Health Organization Quality of Life-Brief questionnaire can assess its four underlying domains (Physical, Psychological, Social, and Environment) in a sample of lung cancer patients. All patients (n = 1150) were recruited from a medical center in Tainan, and each participant completed the World Health Organization Quality of Life-Brief. Several Rasch rating scale models were used to examine the data-model fit, and Rasch analyses corroborated that each domain of the World Health Organization Quality of Life-Brief could be unidimensional. Although three items were found to have a poor fit, all the other items were consistent with unidimensionality and had ordered thresholds.
Testing the multidimensionality of the inventory of school motivation in a Dutch student sample.

PubMed

Korpershoek, Hanke; Xu, Kun; Mok, Magdalena Mo Ching; McInerney, Dennis M; van der Werf, Greetje

2015-01-01

A factor analytic and a Rasch measurement approach were applied to evaluate the multidimensional nature of the school motivation construct among more than 7,000 Dutch secondary school students. The Inventory of School Motivation (McInerney and Ali, 2006) was used, which intends to measure four motivation dimensions (mastery, performance, social, and extrinsic motivation), each comprising two first-order factors. One unidimensional model and three multidimensional models (4-factor, 8-factor, higher order) were fit to the data. Results of both approaches showed that the multidimensional models validly represented the school motivation among Dutch secondary school pupils, whereas model fit of the unidimensional model was poor. The differences in model fit between the three multidimensional models were small, although a different model was favoured by the two approaches. The need for improvement of some of the items and the need to increase measurement precision of several first-order factors are discussed.
A Unified Approach to IRT Scale Linking and Scale Transformations. Research Report. RR-04-09

ERIC Educational Resources Information Center

von Davier, Matthias; von Davier, Alina A.

2004-01-01

This paper examines item response theory (IRT) scale transformations and IRT scale linking methods used in the Non-Equivalent Groups with Anchor Test (NEAT) design to equate two tests, X and Y. It proposes a unifying approach to the commonly used IRT linking methods: mean-mean, mean-var linking, concurrent calibration, Stocking and Lord and…
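Of the linking methods named above, the moment-based ones are the simplest to illustrate. The sketch below shows mean-sigma linking (assumed here to be what the abstract calls "mean-var"; that mapping is an assumption, as are all function names and the anchor-item values). It is not taken from the report and does not reproduce its unified approach.

```python
import statistics

def mean_sigma_constants(b_x, b_y):
    """Return (A, B) such that theta_Y = A * theta_X + B.

    b_x and b_y are difficulty estimates of the same anchor items,
    calibrated separately on scale X and on scale Y.
    """
    A = statistics.stdev(b_y) / statistics.stdev(b_x)
    B = statistics.mean(b_y) - A * statistics.mean(b_x)
    return A, B

def to_scale_y(a, b, A, B):
    """Transform one item's 2PL parameters (a, b) from scale X to scale Y."""
    return a / A, A * b + B

# Hypothetical anchor-item difficulties on the two scales.
anchor_b_x = [-1.0, 0.0, 1.0]
anchor_b_y = [-1.0, 1.0, 3.0]
A, B = mean_sigma_constants(anchor_b_x, anchor_b_y)
a_new, b_new = to_scale_y(a=1.0, b=0.5, A=A, B=B)
```

Moment methods like this use only summary statistics of the anchor-item parameters, whereas characteristic-curve methods such as Stocking and Lord choose A and B by matching test characteristic curves.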
Extending item response theory to online homework

NASA Astrophysics Data System (ADS)

Kortemeyer, Gerd

2014-06-01

Item response theory (IRT) is becoming an increasingly important tool for analyzing "big data" gathered from online educational venues. However, the mechanism was originally developed in traditional exam settings, and several of its assumptions are infringed upon when deployed in the online realm. For a large-enrollment physics course for scientists and engineers, the study compares outcomes from IRT analyses of exam and homework data, and then proceeds to investigate the effects of each confounding factor introduced in the online realm. It is found that IRT yields the correct trends for learner ability and meaningful item parameters, yet overall agreement with exam data is moderate. It is also found that learner ability and item discrimination are robust over a wide range with respect to model assumptions and introduced noise. Item difficulty is also robust, but over a narrower range.
Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations.

PubMed

Teresi, Jeanne A; Ocepek-Welikson, Katja; Cook, Karon F; Kleinman, Marjorie; Ramirez, Mildred; Reid, M Carrington; Siu, Albert

2016-01-01

Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse sociodemographic groups. DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, "How much did pain interfere with enjoyment of social activities?" was excluded in the DIF analyses for all subgroup comparisons.
No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in both primary and sensitivity analyses: ability to concentrate, enjoyment of recreational activities, tasks away from home, participation in social activities, and socializing with others. The magnitude of DIF was small and the impact negligible. Three items were consistently identified with DIF for education: enjoyment of life, ability to concentrate, and enjoyment of recreational activities. No item showed DIF above the magnitude threshold and the impact of DIF on the overall measure was minimal. No item showed gender DIF after correction for multiple comparisons in the primary analyses. Four items showed consistent age DIF: enjoyment of life, ability to concentrate, day to day activities, and enjoyment of recreational activities, none with primary magnitude values above threshold. Conditional on the pain state, Spanish speakers were hypothesized to report less pain interference on one item, enjoyment of life. The DIF findings confirmed the hypothesis; however, the magnitude was small. Using an arbitrary cutoff point of theta ( θ ) ≥ 1.0 to classify respondents with acute pain interference, the highest number of changes were for the education groups analyses. There were 231 respondents (4% of the total sample) who changed from the designation of no acute pain interference to acute interference after the DIF adjustment. There was no change in the designations for race/ethnic subgroups, and a small number of changes for respondents aged 65 to 84. Although significant DIF was observed after correction for multiple comparisons, all DIF was of low magnitude and impact. However, some individual-level impact was observed for low education groups. Reliability estimates were high. 
Thus, the PROMIS short form pain items examined in this ethnically diverse sample performed relatively well; although one item was problematic and removed from the analyses. It is concluded that the majority of the PROMIS pain interference short form items can be recommended for use among ethnically diverse groups, including those in palliative care and with cancer and chronic illness.
Measurement Equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Pain Interference Short Form Items: Application to Ethnically Diverse Cancer and Palliative Care Populations

PubMed Central

Teresi, Jeanne A.; Ocepek-Welikson, Katja; Cook, Karon F.; Kleinman, Marjorie; Ramirez, Mildred; Reid, M. Carrington; Siu, Albert

2017-01-01

Reducing the response burden of standardized pain measures is desirable, particularly for individuals who are frail or live with chronic illness, e.g., those suffering from cancer and those in palliative care. The Patient Reported Outcome Measurement Information System® (PROMIS®) project addressed this issue with the provision of computerized adaptive tests (CAT) and short form measures that can be used clinically and in research. Although there has been substantial evaluation of PROMIS item banks, little is known about the performance of PROMIS short forms, particularly in ethnically diverse groups. Reviewed in this article are findings related to the differential item functioning (DIF) and reliability of the PROMIS pain interference short forms across diverse sociodemographic groups. Methods: DIF hypotheses were generated for the PROMIS short form pain interference items. Initial analyses tested item response theory (IRT) model assumptions of unidimensionality and local independence. Dimensionality was evaluated using factor analytic methods; local dependence (LD) was tested using IRT-based LD indices. Wald tests were used to examine group differences in IRT parameters, and to test DIF hypotheses. A second DIF-detection method used in sensitivity analyses was based on ordinal logistic regression with a latent IRT-derived conditioning variable. Magnitude and impact of DIF were investigated, and reliability and item and scale information statistics were estimated. Results: The reliability of the short form item set was excellent. However, there were a few items with high local dependency, which affected the estimation of the final discrimination parameters. As a result, the item, “How much did pain interfere with enjoyment of social activities?” was excluded in the DIF analyses for all subgroup comparisons.
No items were hypothesized to show DIF for race and ethnicity; however, five items showed DIF after adjustment for multiple comparisons in both primary and sensitivity analyses: ability to concentrate, enjoyment of recreational activities, tasks away from home, participation in social activities, and socializing with others. The magnitude of DIF was small and the impact negligible. Three items were consistently identified with DIF for education: enjoyment of life, ability to concentrate, and enjoyment of recreational activities. No item showed DIF above the magnitude threshold and the impact of DIF on the overall measure was minimal. No item showed gender DIF after correction for multiple comparisons in the primary analyses. Four items showed consistent age DIF: enjoyment of life, ability to concentrate, day to day activities, and enjoyment of recreational activities, none with primary magnitude values above threshold. Conditional on the pain state, Spanish speakers were hypothesized to report less pain interference on one item, enjoyment of life. The DIF findings confirmed the hypothesis; however, the magnitude was small. Using an arbitrary cutoff point of theta (θ) ≥ 1.0 to classify respondents with acute pain interference, the highest number of changes were for the education groups analyses. There were 231 respondents (4% of the total sample) who changed from the designation of no acute pain interference to acute interference after the DIF adjustment. There was no change in the designations for race/ethnic subgroups, and a small number of changes for respondents aged 65 to 84. Conclusions: Although significant DIF was observed after correction for multiple comparisons, all DIF was of low magnitude and impact. However, some individual-level impact was observed for low education groups. Reliability estimates were high.
Thus, the PROMIS short form pain items examined in this ethnically diverse sample performed relatively well; although one item was problematic and removed from the analyses. It is concluded that the majority of the PROMIS pain interference short form items can be recommended for use among ethnically diverse groups, including those in palliative care and with cancer and chronic illness. PMID:28983449
Experimental study on infrared radiation temperature field of concrete under uniaxial compression

NASA Astrophysics Data System (ADS)

Lou, Quan; He, Xueqiu

2018-05-01

Infrared thermography, as a nondestructive, non-contact, real-time monitoring method, is of great value in assessing the stability of concrete structures and monitoring their failure. An in-depth study of the mechanism and application of the infrared radiation (IR) of concrete failing under load is therefore needed. In this paper, concrete specimens measuring 100 × 100 × 100 mm were subjected to uniaxial compression for IR tests. The distribution of IR temperatures (IRTs), the surface topography of the IRT field, and the reconstructed IR images were studied. The results show that the IRT distribution follows a Gaussian distribution, and that the R² of the Gaussian fit changes with loading time. The anomalies in R² and in AE counts display opposite trends. The surface topography of the IRT field resembles a hyperbolic paraboloid, which is related to the stress distribution in the sample. The R² of the hyperbolic paraboloid fit trends upward prior to the fracture, which changes the IRT field significantly; this R² then drops sharply in response to this large destruction. Normalization images of the IRT field, including row and column normalization images, are proposed as auxiliary means of analyzing the IRT field. The row and column normalization images show the transverse and longitudinal distribution of the IRT field, respectively, and they respond clearly to destruction occurring on the sample surface. The new methods and quantitative indices proposed in this paper have theoretical and instructive significance for analyzing the characteristics of the IRT field and for monitoring the instability and failure of concrete structures.
Ground and surface temperature variability for remote sensing of soil moisture in a heterogeneous landscape

USGS Publications Warehouse

Giraldo, M.A.; Bosch, D.; Madden, M.; Usery, L.; Finn, M.

2009-01-01

At the Little River Watershed (LRW), a heterogeneous landscape near Tifton, Georgia, US, an in situ network of stations operated by the US Department of Agriculture-Agriculture Research Service-Southeast Watershed Research Lab (USDA-ARS-SEWRL) was established in 2003 for the long-term study of climatic and soil biophysical processes. To develop an accurate interpolation of the in situ readings that can be used to produce distributed representations of soil moisture (SM) and energy balances at the landscape scale for remote sensing studies, we studied (1) the temporal and spatial variations of ground temperature (GT) and infrared temperature (IRT) within 30 by 30 m plots around selected network stations; (2) the relationship between the readings from the eight 30 by 30 m plots and the point reading of the network stations for the variables SM, GT and IRT; and (3) the spatial and temporal variation of GT and IRT within agricultural land uses: grass, orchard, peanuts, cotton and bare soil in the surrounding landscape. The results showed high correlations between the station readings and the adjacent 30 by 30 m plot average value for SM; high seasonal independent variation in the GT and IRT behavior among the eight 30 by 30 m plots; and site specific, in-field homogeneity in each 30 by 30 m plot. We found statistical differences in the GT and IRT between the different land uses as well as high correlations between GT and IRT regardless of the land use. Greater standard deviations for IRT than for GT (in the range of 2-4) were found within the 30 by 30 m plots, suggesting that when a single point reading for this variable is selected for the validation of either remote sensing data or water-energy models, errors may occur. The results confirmed that in this landscape homogeneous 30 by 30 m plots can be used as landscape spatial units for soil moisture and ground temperature studies.
Under these landscape conditions, small plots can account for local expressions of environmental processes, decreasing the errors and uncertainties in remote sensing estimates caused by landscape heterogeneity.
Overview of the Icing and Flow Quality Improvements Program for the NASA Glenn Icing Research Tunnel

NASA Technical Reports Server (NTRS)

Irvine, Thomas B.; Kevdzija, Susan L.; Sheldon, David W.; Spera, David A.

2001-01-01

Major upgrades were made in 1999 to the 6- by 9-Foot (1.8- by 2.7-m) Icing Research Tunnel (IRT) at the NASA Glenn Research Center. These included replacement of the electronic controls for the variable-speed drive motor, replacement of the heat exchanger, complete replacement and enlargement of the leg of the tunnel containing the new heat-exchanger, the addition of flow-expanding and flow-contracting turning vanes upstream and downstream of the heat exchanger, respectively, and the addition of fan outlet guide vanes (OGV's). This paper describes the rationale behind this latest program of IRT upgrades and the program's requirements and goals. An overview is given of the scope of work undertaken by the design and construction contractors, the scale-model IRT (SMIRT) design verification program, the comprehensive reactivation test program initiated upon completion of construction, and the overall management approach followed.
Salient Features of the Harnischfeger-Wiley Model

ERIC Educational Resources Information Center

Hallinan, Maureen T.

1976-01-01

Explicates the Harnischfeger-Wiley model and points out its properties, underlying assumptions, and location in the literature on achievement. It also describes and critiques an empirical test by Harnischfeger and Wiley of their model. (Author/IRT)
An Item Response Theory Analysis of DSM–IV Diagnostic Criteria for Personality Disorders: Findings From the National Epidemiologic Survey on Alcohol and Related Conditions

PubMed Central

Harford, Thomas C.; Chen, Chiung M.; Saha, Tulshi D.; Smith, Sharon M.; Hasin, Deborah S.; Grant, Bridget F.

2013-01-01

The purpose of this study was to evaluate the psychometric properties of DSM–IV symptom criteria for assessing personality disorders (PDs) in a national population and to compare variations in proposed symptom coding for social and/or occupational dysfunction. Data were obtained from a total sample of 34,653 respondents from Waves 1 and 2 of the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). For each personality disorder, confirmatory factor analysis (CFA) established a 1-factor latent factor structure for the respective symptom criteria. A 2-parameter item response theory (IRT) model was applied to the symptom criteria for each PD to assess the probabilities of symptom item endorsements across different values of the underlying trait (latent factor). Findings were compared with a separate IRT model using an alternative coding of symptom criteria that requires distress/impairment to be related to each criterion. The CFAs yielded a good fit for a single underlying latent dimension for each PD. Findings from the IRT indicated that DSM–IV PD symptom criteria are clustered in the moderate to severe range of the underlying latent dimension for each PD and are peaked, indicating high measurement precision only within a narrow range of the underlying trait and lower measurement precision at lower and higher levels of severity. Compared with the NESARC symptom coding, the IRT results for the alternative symptom coding are shifted toward the more severe range of the latent trait but generally have lower measurement precision for each PD. The IRT findings provide support for a reliable assessment of each PD for both NESARC and alternative coding for distress/impairment. The use of symptom dysfunction for each criterion, however, raises a number of issues and implications for the DSM-5 revision currently proposed for Axis II disorders (American Psychiatric Association, 2010). PMID:22449066
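The "peaked" precision described above follows from the item information function of the 2-parameter logistic model, I(θ) = a²P(θ)(1 − P(θ)), which is largest where the trait level equals the item's severity parameter. The following is a minimal Python sketch; the parameter values are hypothetical and are not NESARC estimates.

```python
import math

def p_2pl(theta, a, b):
    """Probability of endorsing a criterion under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info_2pl(theta, a, b):
    """Item information a^2 * P * (1 - P); it peaks at theta == b."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical criterion: highly discriminating, located in the severe range.
a, b = 2.0, 1.5
grid = [x / 10.0 for x in range(-30, 31)]  # trait levels from -3.0 to 3.0
peak_theta = max(grid, key=lambda t: info_2pl(t, a, b))
peak_info = info_2pl(peak_theta, a, b)
low_end_info = info_2pl(-1.0, a, b)
```

A large discrimination makes the curve tall but narrow, so an item located in the severe range contributes very little information at low trait levels.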
Evaluation of Internal Construct Validity and Unidimensionality of the Brachial Assessment Tool, A Patient-Reported Outcome Measure for Brachial Plexus Injury.

PubMed

Hill, Bridget; Pallant, Julie; Williams, Gavin; Olver, John; Ferris, Scott; Bialocerkowski, Andrea

2016-12-01

To evaluate the internal construct validity and dimensionality of a new patient-reported outcome measure for people with traumatic brachial plexus injury (BPI) based on the International Classification of Functioning, Disability and Health definition of activity. Cross-sectional study. Outpatient clinics. Adults (age range, 18-82y) with a traumatic BPI (N=106). There were 106 people with BPI who completed a 51-item 5-response questionnaire. Responses were analyzed in 4 phases (missing responses, item correlations, exploratory factor analysis, and Rasch analysis) to evaluate the properties of fit to the Rasch model, threshold response, local dependency, dimensionality, differential item functioning, and targeting. Not applicable, as this study addresses the development of an outcome measure. Six items were deleted for missing responses, and 10 were deleted for high interitem correlations >.81. The remaining 35 items, while demonstrating fit to the Rasch model, showed evidence of local dependency and multidimensionality. Items were divided into 3 subscales: dressing and grooming (8 items), arm and hand (17 items), and no hand (6 items). All 3 subscales demonstrated fit to the model with no local dependency, minimal disordered thresholds, no unidimensionality or differential item functioning for age, time postinjury, or self-selected dominance. Subscales were combined into 3 subtests and demonstrated fit to the model, no misfit, and unidimensionality, allowing calculation of a summary score. This preliminary analysis supports the internal construct validity of the Brachial Assessment Tool, a unidimensional targeted 4-response patient-reported outcome measure designed to solely assess activity after traumatic BPI regardless of level of injury, age at recruitment, premorbid limb dominance, and time postinjury. Further examination is required to determine test-retest reliability and responsiveness. Copyright © 2016 American Congress of Rehabilitation Medicine.
Published by Elsevier Inc. All rights reserved.
Imagery rehearsal therapy in addition to treatment as usual for patients with diverse psychiatric diagnoses suffering from nightmares: a randomized controlled trial.

PubMed

van Schagen, Annette M; Lancee, Jaap; de Groot, Izaäk W; Spoormaker, Victor I; van den Bout, Jan

2015-09-01

Nightmares are associated with psychopathology and daily distress. They are highly prevalent in a psychiatric population (30%). Currently, imagery rehearsal therapy (IRT) is the treatment of choice for nightmares. With IRT, the script of the nightmare is changed into a new dream, which is imagined during the day. However, the effects of IRT in a psychiatric population remain unknown. The aim of this study was to determine the effectiveness of IRT in a heterogeneous psychiatric population. Between January 2006 and July 2010, 90 patients with psychiatric disorders (DSM-IV-TR) were randomized to IRT or treatment-as-usual conditions. IRT consisted of 6 individual sessions added to the treatment as usual. Nightmare frequency was assessed using daily nightmare logs and the Nightmare Frequency Questionnaire. Nightmare distress was assessed using the Nightmare Distress Questionnaire and the Nightmare Effects Survey. General psychiatric symptoms were assessed using the Symptom Checklist-90 and a PTSD symptom questionnaire. Assessments were administered at the start of the trial, after the IRT and at follow-up 3 months later. IRT showed a moderate effect (Cohen d = 0.5-0.7, P < .05) on nightmare frequency, nightmare distress, and psychopathology measures compared with treatment as usual. These effects were largely sustained at the 3-month follow-up (Cohen d = 0.4-0.6, P < .10). IRT is an effective treatment for nightmares among patients with comorbid psychiatric disorders and can be employed in addition to the on-going treatment. ClinicalTrials.gov identifier: NCT00291031. © Copyright 2015 Physicians Postgraduate Press, Inc.
Performance of the likelihood ratio difference (G2 Diff) test for detecting unidimensionality in applications of the multidimensional Rasch model.

PubMed

Harrell-Williams, Leigh; Wolfe, Edward W

2014-01-01

Previous research has investigated the influence of sample size, model misspecification, test length, ability distribution offset, and generating model on the likelihood ratio difference test in applications of item response models. This study extended that research to the evaluation of dimensionality using the multidimensional random coefficients multinomial logit model (MRCMLM). Logistic regression analysis of simulated data reveals that sample size and test length have a large effect on the capacity of the LR difference test to correctly identify unidimensionality, with shorter tests and smaller sample sizes leading to smaller Type I error rates. Higher levels of simulated misfit resulted in fewer incorrect decisions than data with no or little misfit. However, Type I error rates indicate that the likelihood ratio difference test is not suitable under any of the simulated conditions for evaluating dimensionality in applications of the MRCMLM.
Item response theory - A first approach

NASA Astrophysics Data System (ADS)

Nunes, Sandra; Oliveira, Teresa; Oliveira, Amílcar

2017-07-01

Item Response Theory (IRT) has become one of the most popular scoring frameworks for measurement data, frequently used in computerized adaptive testing, cognitively diagnostic assessment and test equating. According to Andrade et al. (2000), IRT can be defined as a set of mathematical models (Item Response Models - IRM) constructed to represent the probability of an individual giving the right answer to an item of a particular test. The number of Item Response Models available for measurement analysis has increased considerably in the last fifteen years due to increasing computer power and due to a demand for accuracy and more meaningful inferences grounded in complex data. The developments in modeling with Item Response Theory were related to developments in estimation theory, most remarkably Bayesian estimation with Markov chain Monte Carlo algorithms (Patz & Junker, 1999). The popularity of Item Response Theory has also prompted numerous overviews in books and journals, and many connections between IRT and other statistical estimation procedures, such as factor analysis and structural equation modeling, have been made repeatedly (van der Linden & Hambleton, 1997). As stated before, Item Response Theory covers a variety of measurement models, ranging from basic one-dimensional models for dichotomously and polytomously scored items and their multidimensional analogues to models that incorporate information about cognitive sub-processes which influence the overall item response process. The aim of this work is to introduce the main concepts associated with one-dimensional models of Item Response Theory, to specify the logistic models with one, two and three parameters, to discuss some properties of these models and to present the main estimation procedures.
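The logistic models with one, two and three parameters mentioned above are nested, which a short Python sketch can make concrete. The function and parameter names (`a` for discrimination, `b` for difficulty, `c` for the lower asymptote or "guessing" parameter) follow common usage but are assumptions here, not notation taken from the paper, and the scaling constant D = 1.7 used in some presentations is omitted.

```python
import math

def p_3pl(theta, a=1.0, b=0.0, c=0.0):
    """Three-parameter logistic model: probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def p_2pl(theta, a, b):
    """Two-parameter model: the 3PL with no guessing (c = 0)."""
    return p_3pl(theta, a=a, b=b, c=0.0)

def p_1pl(theta, b):
    """One-parameter model: common discrimination (fixed at 1 here), c = 0."""
    return p_3pl(theta, a=1.0, b=b, c=0.0)

# Hypothetical item of average difficulty, answered by an average examinee.
p_one = p_1pl(theta=0.0, b=0.0)
p_three = p_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)
p_floor = p_3pl(theta=-50.0, a=1.0, b=0.0, c=0.2)
```

In the three-parameter model the probability at theta = b is (1 + c)/2 rather than 1/2, and the curve approaches c instead of 0 for very low ability.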
Polarization of IRON-REGULATED TRANSPORTER 1 (IRT1) to the plant-soil interface plays crucial role in metal homeostasis.

PubMed

Barberon, Marie; Dubeaux, Guillaume; Kolb, Cornelia; Isono, Erika; Zelazny, Enric; Vert, Grégory

2014-06-03

In plants, the controlled absorption of soil nutrients by root epidermal cells is critical for growth and development. IRON-REGULATED TRANSPORTER 1 (IRT1) is the main root transporter taking up iron from the soil and is also the main entry route in plants for potentially toxic metals such as manganese, zinc, cobalt, and cadmium. Previous work demonstrated that the IRT1 protein localizes to early endosomes/trans-Golgi network (EE/TGN) and is constitutively endocytosed through a monoubiquitin- and clathrin-dependent mechanism. Here, we show that the availability of secondary non-iron metal substrates of IRT1 (Zn, Mn, and Co) controls the localization of IRT1 between the outer polar domain of the plasma membrane and EE/TGN in root epidermal cells. We also identify FYVE1, a phosphatidylinositol-3-phosphate-binding protein recruited to late endosomes, as an important regulator of IRT1-dependent metal transport and metal homeostasis in plants. FYVE1 controls IRT1 recycling to the plasma membrane and impacts the polar delivery of this transporter to the outer plasma membrane domain. This work establishes a functional link between the dynamics and the lateral polarity of IRT1 and the transport of its substrates, and identifies a molecular mechanism driving polar localization of a cell surface protein in plants.
Calibration and tests of commercial wireless infrared thermometers

USDA-ARS's Scientific Manuscript database

Applications of infrared thermometers (IRTs) in large agricultural fields require wireless data transmission, and IRT target temperature should have minimal sensitivity to internal detector temperature. To meet these objectives, a prototype wireless IRT system was developed at USDA Agricultural Rese...
Psychometric characteristics of daily diaries for the Patient-Reported Outcomes Measurement Information System (PROMIS®): a preliminary investigation.

PubMed

Schneider, Stefan; Choi, Seung W; Junghaenel, Doerte U; Schwartz, Joseph E; Stone, Arthur A

2013-09-01

The Patient-Reported Outcomes (PRO) Measurement Information System (PROMIS(®)) has developed assessment tools for numerous PROs, most using a 7-day recall format. We examined whether modifying the recall period for use in daily diary research would affect the psychometric characteristics of several PROMIS measures. Daily versions of short-forms for three PROMIS domains (pain interference, fatigue, depression) were administered to a general population sample (n = 100) for 28 days. Analyses used multilevel item response theory (IRT) models. We examined differential item functioning (DIF) across recall periods by comparing the IRT parameters from the daily data with the PROMIS 7-day recall IRT parameters. Additionally, we examined whether the IRT parameters for day-to-day within-person changes are invariant to those for between-person (cross-sectional) differences in PROs. Dimensionality analyses of the daily data suggested a single dimension for each PRO domain, consistent with PROMIS instruments. One-third of the daily items showed uniform DIF when compared with PROMIS 7-day recall, but the impact of DIF on the scale level was minor. IRT parameters for within-person changes differed from between-person parameters for 3 depression items, which were more sensitive for measuring change than between-person differences, but not for pain interference and fatigue items. Notably, mean scores from daily diaries were significantly lower than the PROMIS 7-day recall norms. The results provide initial evidence supporting the adaptation of PROMIS measures for daily diary research. However, scores from daily diaries cannot be directly interpreted on PROMIS norms established for 7-day recall.
The Spinal Cord Injury- Functional Index: Item Banks to Measure Physical Functioning of Individuals with Spinal Cord Injury

PubMed Central

Tulsky, David S.; Jette, Alan; Kisala, Pamela A.; Kalpakjian, Claire; Dijkers, Marcel P.; Whiteneck, Gale; Ni, Pengsheng; Kirshblum, Steven; Charlifue, Susan; Heinemann, Allen W.; Forchheimer, Martin; Slavin, Mary; Houlihan, Bethlyn; Tate, Denise; Dyson-Hudson, Trevor; Fyffe, Denise; Williams, Steve; Zanca, Jeanne

2012-01-01

Objective: To develop a comprehensive set of patient reported items to assess multiple aspects of physical functioning relevant to the lives of people with spinal cord injury (SCI) and to evaluate the underlying structure of physical functioning. Design: Cross-sectional. Setting: Inpatient and community. Participants: Item pools of physical functioning were developed, refined and field tested in a large sample of 855 individuals with traumatic spinal cord injury stratified by diagnosis, severity, and time since injury. Interventions: None. Main Outcome Measure: SCI-FI measurement system. Results: Confirmatory factor analysis (CFA) indicated that a 5-factor model, including basic mobility, ambulation, wheelchair mobility, self care, and fine motor, had the best model fit and was most closely aligned conceptually with feedback received from individuals with SCI and SCI clinicians. When just the items making up basic mobility were tested in CFA, the fit statistics indicate strong support for a unidimensional model. Similar results were demonstrated for each of the other four factors indicating unidimensional models. Conclusions: Though unidimensional or 2-factor (mobility and upper extremity) models of physical functioning make up outcomes measures in the general population, the underlying structure of physical function in SCI is more complex. A 5-factor solution allows for comprehensive assessment of key domain areas of physical functioning. These results informed the structure and development of the SCI-FI measurement system of physical functioning. PMID:22609299

Differential item functioning analysis with ordinal logistic regression techniques. DIFdetect and difwithpar.

PubMed

Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald

2006-11-01

We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
The Isolation of Motivational, Motoric, and Schedule Effects on Operant Performance: A Modeling Approach

PubMed Central

Brackney, Ryan J; Cheung, Timothy H. C; Neisewander, Janet L; Sanabria, Federico

2011-01-01

Dissociating motoric and motivational effects of pharmacological manipulations on operant behavior is a substantial challenge. To address this problem, we applied a response-bout analysis to data from rats trained to lever press for sucrose on variable-interval (VI) schedules of reinforcement. Motoric, motivational, and schedule factors (effort requirement, deprivation level, and schedule requirements, respectively) were manipulated. Bout analysis found that interresponse times (IRTs) were described by a mixture of two exponential distributions, one characterizing IRTs within response bouts, another characterizing intervals between bouts. Increasing effort requirement lengthened the shortest IRT (the refractory period between responses). Adding a ratio requirement increased the length and density of response bouts. Both manipulations also decreased the bout-initiation rate. In contrast, food deprivation only increased the bout-initiation rate. Changes in the distribution of IRTs over time showed that responses during extinction were also emitted in bouts, and that the decrease in response rate was primarily due to progressively longer intervals between bouts. Taken together, these results suggest that changes in the refractory period indicate motoric effects, whereas selective alterations in bout initiation rate indicate incentive-motivational effects. These findings support the use of response-bout analyses to identify the influence of pharmacological manipulations on processes underlying operant performance. PMID:21765544
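The mixture of two exponential distributions described in this abstract can be sketched as a survival function for interresponse times. This is a minimal illustration, assuming a common bi-exponential form shifted by a refractory period; the function name `irt_survival` and the parameter names (`p`, `w`, `b`, `delta`) are hypothetical and may not match the parameterization used by the authors.

```python
import math

def irt_survival(t, p, w, b, delta):
    """Proportion of interresponse times longer than t.

    p: assumed proportion of IRTs that occur within a response bout
    w: assumed within-bout response rate (responses per second)
    b: assumed bout-initiation rate (bouts per second), with b < w
    delta: assumed refractory period, i.e. the shortest possible IRT
    """
    if t <= delta:
        return 1.0  # no IRT can be shorter than the refractory period
    x = t - delta
    return p * math.exp(-w * x) + (1.0 - p) * math.exp(-b * x)

# Hypothetical parameter values, chosen only for illustration.
curve = [irt_survival(t, p=0.8, w=2.0, b=0.05, delta=0.2) for t in (0.2, 1.0, 5.0, 20.0)]
```

In this form, a longer refractory period shifts the whole curve to the right, while a lower bout-initiation rate lengthens only the slow tail, which mirrors the motoric versus motivational distinction drawn in the abstract.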
Using iRT, a normalized retention time for more targeted measurement of peptides

PubMed Central

Escher, Claudia; Reiter, Lukas; MacLean, Brendan; Ossola, Reto; Herzog, Franz; Chilton, John; MacCoss, Michael J.; Rinner, Oliver

2014-01-01

Multiple reaction monitoring (MRM) has recently become the method of choice for targeted quantitative measurement of proteins using mass spectrometry. The method, however, is limited in the number of peptides that can be measured in one run. This number can be markedly increased by scheduling the acquisition if the accurate retention time (RT) of each peptide is known. Here we present iRT, an empirically derived dimensionless peptide-specific value that allows for highly accurate RT prediction. The iRT of a peptide is a fixed number relative to a standard set of reference iRT-peptides that can be transferred across laboratories and chromatographic systems. We show that iRT facilitates the setup of multiplexed experiments with acquisition windows more than 4 times smaller compared to in silico RT predictions resulting in improved quantification accuracy. iRTs can be determined by any laboratory and shared transparently. The iRT concept has been implemented in Skyline, the most widely used software for MRM experiments. PMID:22577012
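One way to picture how a fixed iRT scale transfers across chromatographic systems is a linear calibration: the reference peptides are measured on the local system, a line is fitted between their known iRT values and their observed retention times, and the line is then used to predict the retention time of any other peptide from its iRT. The sketch below assumes such a linear mapping; the helper names and all numeric values are hypothetical and are not taken from the paper or from Skyline.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_rt(irt_value, slope, intercept):
    """Predicted retention time (minutes) for a peptide with the given iRT."""
    return slope * irt_value + intercept

# Hypothetical reference peptides: iRT values and retention times observed locally.
reference_irt = [0.0, 25.0, 50.0, 75.0, 100.0]
observed_rt_min = [5.0, 10.0, 15.0, 20.0, 25.0]

slope, intercept = fit_line(reference_irt, observed_rt_min)
scheduled_rt = predict_rt(60.0, slope, intercept)  # target peptide with iRT 60
```

The predicted retention time would then serve as the center of a narrow acquisition window for the target peptide.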
A Non-Parametric Item Response Theory Evaluation of the CAGE Instrument Among Older Adults.

PubMed

Abdin, Edimansyah; Sagayadevan, Vathsala; Vaingankar, Janhavi Ajit; Picco, Louisa; Chong, Siow Ann; Subramaniam, Mythily

2018-02-23

The validity of the CAGE using item response theory (IRT) has not yet been examined in the older adult population. This study aims to investigate the psychometric properties of the CAGE using both non-parametric and parametric IRT models, assess whether there is any differential item functioning (DIF) by age, gender and ethnicity and examine the measurement precision at the cut-off scores. We used data from the Well-being of the Singapore Elderly study to conduct Mokken scaling analysis (MSA), dichotomous Rasch and 2-parameter logistic IRT models. The measurement precision at the cut-off scores was evaluated using classification accuracy (CA) and classification consistency (CC). The MSA showed the overall scalability H index was 0.459, indicating a medium-performing instrument. All items were found to be homogeneous, measuring the same construct and able to discriminate well between respondents with high levels of the construct and the ones with lower levels. The item discrimination ranged from 1.07 to 6.73 while the item difficulty ranged from 0.33 to 2.80. Significant DIF was found for two items across ethnic groups. More than 90% (CC and CA ranged from 92.5% to 94.3%) of the respondents were consistently and accurately classified by the CAGE cut-off scores of 2 and 3. The current study provides new evidence on the validity of the CAGE from the IRT perspective. This study provides valuable information on each item in the assessment of the overall severity of alcohol problems and the precision of the cut-off scores in the older adult population.
NOD promoter-controlled AtIRT1 expression functions synergistically with NAS and FERRITIN genes to increase iron in rice grains.

PubMed

Boonyaves, Kulaporn; Gruissem, Wilhelm; Bhullar, Navreet K

2016-02-01

Rice is a staple food for over half of the world's population, but it contains only low amounts of bioavailable micronutrients for human nutrition. Consequently, micronutrient deficiency is a widespread health problem among people who depend primarily on rice as their staple food. Iron deficiency anemia is one of the most serious forms of malnutrition. Biofortification of rice grains for increased iron content is an effective strategy to reduce iron deficiency. Unlike other grass species, rice takes up iron as Fe(II) via the IRON REGULATED TRANSPORTER (IRT) in addition to Fe(III)-phytosiderophore chelates. We expressed Arabidopsis IRT1 (AtIRT1) under control of the Medicago sativa EARLY NODULIN 12B promoter in our previously developed high-iron NFP rice lines expressing NICOTIANAMINE SYNTHASE (AtNAS1) and FERRITIN. Transgenic rice lines expressing AtIRT1 alone had significant increases in iron and combined with NAS and FERRITIN increased iron to 9.6 µg/g DW in the polished grains that is 2.2-fold higher as compared to NFP lines. The grains of AtIRT1 lines also accumulated more copper and zinc but not manganese. Our results demonstrate that the concerted expression of AtIRT1, AtNAS1 and PvFERRITIN synergistically increases iron in both polished and unpolished rice grains. AtIRT1 is therefore a valuable transporter for iron biofortification programs when used in combination with other genes encoding iron transporters and/or storage proteins.
Application of Item Response Theory to Modeling of Expanded Disability Status Scale in Multiple Sclerosis.

PubMed

Novakovic, A M; Krekels, E H J; Munafo, A; Ueckert, S; Karlsson, M O

2017-01-01

In this study, we report the development of the first item response theory (IRT) model within a pharmacometrics framework to characterize the disease progression in multiple sclerosis (MS), as measured by Expanded Disability Status Score (EDSS). Data were collected quarterly from a 96-week phase III clinical study by a blinded rater, involving 104,206 item-level observations from 1319 patients with relapsing-remitting MS (RRMS), treated with placebo or cladribine. Observed scores for each EDSS item were modeled describing the probability of a given score as a function of patients' (unobserved) disability using a logistic model. Longitudinal data from placebo arms were used to describe the disease progression over time, and the model was then extended to cladribine arms to characterize the drug effect. Sensitivity with respect to patient disability was calculated as Fisher information for each EDSS item, and the items were ranked according to the amount of information they contained. The IRT model was able to describe baseline and longitudinal EDSS data at the item and total-score levels. The final model suggested that cladribine treatment significantly slows disease-progression rate, with a 20% decrease in disease-progression rate compared to placebo, irrespective of exposure, and effects an additional exposure-dependent reduction in disability progression. Four out of eight items contained 80% of information for the given range of disabilities. This study has illustrated that IRT modeling is specifically suitable for accurate quantification of disease status and description and prediction of disease progression in phase 3 studies on RRMS, by integrating EDSS item-level data in a meaningful manner.
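The idea of ranking items by Fisher information can be sketched for the simplest case, a dichotomous two-parameter logistic item, where the information at disability level theta is a² · P · (1 − P). This is a simplification made for illustration: the EDSS items in the study are scored on more than two levels, and the item names and parameter values below are hypothetical.

```python
import math

def item_information(theta, a, b):
    """Fisher information of a dichotomous two-parameter logistic item at theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def rank_items(theta, items):
    """Item names sorted from most to least informative at theta.

    items: dict mapping item name -> (a, b)
    """
    return sorted(items, key=lambda name: item_information(theta, *items[name]), reverse=True)

# Hypothetical items: (discrimination a, location b).
items = {"item_A": (2.0, 0.0), "item_B": (1.0, 0.0), "item_C": (2.0, 3.0)}
ranking = rank_items(0.0, items)
```

An item is most informative near its own location `b`, and more discriminating items carry more information there, which is why a few items can account for most of the total information over a given disability range.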
[Mokken scaling of the Cognitive Screening Test].

PubMed

Diesfeldt, H F A

2009-10-01

The Cognitive Screening Test (CST) is a twenty-item orientation questionnaire in Dutch, that is commonly used to evaluate cognitive impairment. This study applied Mokken Scale Analysis, a non-parametric set of techniques derived from item response theory (IRT), to CST-data of 466 consecutive participants in psychogeriatric day care. The full item set and the standard short version of fourteen items both met the assumptions of the monotone homogeneity model, with scalability coefficient H = 0.39, which is considered weak. In order to select items that would fulfil the assumption of invariant item ordering or the double monotonicity model, the subjects were randomly partitioned into a training set (50% of the sample) and a test set (the remaining half). By means of an automated item selection eleven items were found to measure one latent trait, with H = 0.67 and item H coefficients larger than 0.51. Cross-validation of the item analysis in the remaining half of the subjects gave comparable values (H = 0.66; item H coefficients larger than 0.56). The selected items involve year, place of residence, birth date, the monarch's and prime minister's names, and their predecessors. Applying optimal discriminant analysis (ODA) it was found that the full set of twenty CST items performed best in distinguishing two predefined groups of patients of lower or higher cognitive ability, as established by an independent criterion derived from the Amsterdam Dementia Screening Test. The chance corrected predictive value or prognostic utility was 47.5% for the full item set, 45.2% for the fourteen items of the standard short version of the CST, and 46.1% for the homogeneous, unidimensional set of selected eleven items. The results of the item analysis support the application of the CST in cognitive assessment, and revealed a more reliable 'short' version of the CST than the standard short version (CST14).
Item response theory analyses of the Delis-Kaplan Executive Function System card sorting subtest.

PubMed

Spencer, Mercedes; Cho, Sun-Joo; Cutting, Laurie E

2018-02-02

In the current study, we examined the dimensionality of the 16-item Card Sorting subtest of the Delis-Kaplan Executive Functioning System assessment in a sample of 264 native English-speaking children between the ages of 9 and 15 years. We also tested for measurement invariance for these items across age and gender groups using item response theory (IRT). Results of the exploratory factor analysis indicated that a two-factor model that distinguished between verbal and perceptual items provided the best fit to the data. Although the items demonstrated measurement invariance across age groups, measurement invariance was violated for gender groups, with two items demonstrating differential item functioning for males and females. Multigroup analysis using all 16 items indicated that the items were more effective for individuals whose IRT scale scores were relatively high. A single-group explanatory IRT model using 14 non-differential item functioning items showed that for perceptual ability, females scored higher than males and that scores increased with age for both males and females; for verbal ability, the observed increase in scores across age differed for males and females. The implications of these findings are discussed.
Equating with Miditests Using IRT

ERIC Educational Resources Information Center

Fitzpatrick, Joseph; Skorupski, William P.

2016-01-01

The equating performance of two internal anchor test structures--miditests and minitests--is studied for four IRT equating methods using simulated data. Originally proposed by Sinharay and Holland, miditests are anchors that have the same mean difficulty as the overall test but less variance in item difficulties. Four popular IRT equating methods…
Modern Psychometric Methodology: Applications of Item Response Theory

ERIC Educational Resources Information Center

Reid, Christine A.; Kolakowsky-Hayner, Stephanie A.; Lewis, Allen N.; Armstrong, Amy J.

2007-01-01

Item response theory (IRT) methodology is introduced as a tool for improving assessment instruments used with people who have disabilities. Need for this approach in rehabilitation is emphasized; differences between IRT and classical test theory are clarified. Concepts essential to understanding IRT are defined, necessary data assumptions are…
Infrared thermography for condition monitoring - A review

NASA Astrophysics Data System (ADS)

Bagavathiappan, S.; Lahiri, B. B.; Saravanan, T.; Philip, John; Jayakumar, T.

2013-09-01

Temperature is one of the most common indicators of the structural health of equipment and components. Faulty machineries, corroded electrical connections, damaged material components, etc., can cause abnormal temperature distribution. By now, infrared thermography (IRT) has become a matured and widely accepted condition monitoring tool where the temperature is measured in real time in a non-contact manner. IRT enables early detection of equipment flaws and faulty industrial processes under operating condition thereby, reducing system down time, catastrophic breakdown and maintenance cost. Last three decades witnessed a steady growth in the use of IRT as a condition monitoring technique in civil structures, electrical installations, machineries and equipment, material deformation under various loading conditions, corrosion damages and welding processes. IRT has also found its application in nuclear, aerospace, food, paper, wood and plastic industries. With the advent of newer generations of infrared camera, IRT is becoming a more accurate, reliable and cost effective technique. This review focuses on the advances of IRT as a non-contact and non-invasive condition monitoring tool for machineries, equipment and processes. Various conditions monitoring applications are discussed in details, along with some basics of IRT, experimental procedures and data analysis techniques. Sufficient background information is also provided for the beginners and non-experts for easy understanding of the subject.
Effects of eight weeks of aerobic interval training and of isoinertial resistance training on risk factors of cardiometabolic diseases and exercise capacity in healthy elderly subjects

PubMed Central

Bruseghini, Paolo; Calabria, Elisa; Tam, Enrico; Milanese, Chiara; Oliboni, Eugenio; Pezzato, Andrea; Pogliaghi, Silvia; Salvagno, Gian Luca; Schena, Federico; Mucelli, Roberto Pozzi; Capelli, Carlo

2015-01-01

We investigated the effect of 8 weeks of high intensity interval training (HIT) and isoinertial resistance training (IRT) on cardiovascular fitness, muscle mass-strength and risk factors of metabolic syndrome in 12 healthy older adults (68 ± 4 years). HIT consisted of 7 two-minute repetitions at 80%–90% of V̇O2max, 3 times/w. After 4 months of recovery, subjects were treated with IRT, which included 4 sets of 7 maximal, bilateral knee extensions/flexions 3 times/w on a leg-press flywheel ergometer. HIT elicited significant: i) modifications of selected anthropometrical features; ii) improvements of cardiovascular fitness; and iii) decrease of systolic pressure. HIT and IRT induced hypertrophy of the quadriceps muscle, which, however, was paralleled by significant increases in strength only after IRT. Neither HIT nor IRT induced relevant changes in blood lipid profile, with the exception of a decrease of LDL and CHO after IRT. Physiological parameters related to aerobic fitness and selected body composition values predicting cardiovascular risk remained stable during detraining and, after IRT, they were complemented by substantial increase of muscle strength, leading to further improvements of quality of life of the subjects. PMID:26046575
Effects of instability versus traditional resistance training on strength, power and velocity in untrained men.

PubMed

Maté-Muñoz, José Luis; Monroy, Antonio J Antón; Jodra Jiménez, Pablo; Garnacho-Castaño, Manuel V

2014-09-01

The purpose of this study was to compare the effects of a traditional and an instability resistance circuit training program on upper and lower limb strength, power, movement velocity and jumping ability. Thirty-six healthy untrained men were assigned to two experimental groups and a control group. Subjects in the experimental groups performed a resistance circuit training program consisting of traditional exercises (TRT, n = 10) or exercises executed in conditions of instability (using BOSU® and TRX®) (IRT, n = 12). Both programs involved three days per week of training for a total of seven weeks. The following variables were determined before and after training: maximal strength (1RM), average (AV) and peak velocity (PV), average (AP) and peak power (PP), all during bench press (BP) and back squat (BS) exercises, along with squat jump (SJ) height and counter movement jump (CMJ) height. All variables were found to significantly improve (p < 0.05) in response to both training programs. Major improvements were observed in SJ height (IRT = 22.1%, TRT = 20.1%), CMJ height (IRT = 17.7%, TRT = 15.2%), 1RM in BS (IRT = 13.03%, TRT = 12.6%), 1RM in BP (IRT = 4.7%, TRT = 4.4%), AP in BS (IRT = 10.5%, TRT = 9.3%), AP in BP (IRT = 2.4%, TRT = 8.1%), PP in BS (IRT = 19.42%, TRT = 22.3%), PP in BP (IRT = 7.6%, TRT = 11.5%), AV in BS (IRT = 10.5%, TRT = 9.4%), and PV in BS (IRT = 8.6%, TRT = 4.5%). Despite such improvements no significant differences were detected in the posttraining variables recorded for the two experimental groups. These data indicate that a circuit training program using two instability training devices is as effective in untrained men as a program executed under stable conditions for improving strength (1RM), power, movement velocity and jumping ability.
Key Points: Similar adaptations in terms of gains in strength, power, movement velocity and jumping ability were produced in response to both training programs. Both the stability and instability approaches seem suitable for healthy, physically-active individuals with or with limited experience in resistance training. RPE emerged as a useful tool to monitor exercise intensity during instability strength training.
Effects of Instability Versus Traditional Resistance Training on Strength, Power and Velocity in Untrained Men

PubMed Central

Maté-Muñoz, José Luis; Monroy, Antonio J. Antón; Jodra Jiménez, Pablo; Garnacho-Castaño, Manuel V.

2014-01-01

The purpose of this study was to compare the effects of a traditional and an instability resistance circuit training program on upper and lower limb strength, power, movement velocity and jumping ability. Thirty-six healthy untrained men were assigned to two experimental groups and a control group. Subjects in the experimental groups performed a resistance circuit training program consisting of traditional exercises (TRT, n = 10) or exercises executed in conditions of instability (using BOSU® and TRX®) (IRT, n = 12). Both programs involved three days per week of training for a total of seven weeks. The following variables were determined before and after training: maximal strength (1RM), average (AV) and peak velocity (PV), average (AP) and peak power (PP), all during bench press (BP) and back squat (BS) exercises, along with squat jump (SJ) height and counter movement jump (CMJ) height. All variables were found to significantly improve (p < 0.05) in response to both training programs. Major improvements were observed in SJ height (IRT = 22.1%, TRT = 20.1%), CMJ height (IRT = 17.7%, TRT = 15.2%), 1RM in BS (IRT = 13.03%, TRT = 12.6%), 1RM in BP (IRT = 4.7%, TRT = 4.4%), AP in BS (IRT = 10.5%, TRT = 9.3%), AP in BP (IRT = 2.4%, TRT = 8.1%), PP in BS (IRT = 19.42%, TRT = 22.3%), PP in BP (IRT = 7.6%, TRT = 11.5%), AV in BS (IRT = 10.5%, TRT = 9.4%), and PV in BS (IRT = 8.6%, TRT = 4.5%). Despite such improvements no significant differences were detected in the posttraining variables recorded for the two experimental groups. These data indicate that a circuit training program using two instability training devices is as effective in untrained men as a program executed under stable conditions for improving strength (1RM), power, movement velocity and jumping ability. Key Points: Similar adaptations in terms of gains in strength, power, movement velocity and jumping ability were produced in response to both training programs.
Both the stability and instability approaches seem suitable for healthy, physically-active individuals with or with limited experience in resistance training. RPE emerged as a useful tool to monitor exercise intensity during instability strength training. PMID:25177170
Comparison of CTT and Rasch-based approaches for the analysis of longitudinal Patient Reported Outcomes.

PubMed

Blanchin, Myriam; Hardouin, Jean-Benoit; Le Neel, Tanguy; Kubis, Gildas; Blanchard, Claire; Mirallié, Eric; Sébille, Véronique

2011-04-15

Health sciences frequently deal with Patient Reported Outcomes (PRO) data for the evaluation of concepts, in particular health-related quality of life, which cannot be directly measured and are often called latent variables. Two approaches are commonly used for the analysis of such data: Classical Test Theory (CTT) and Item Response Theory (IRT). Longitudinal data are often collected to analyze the evolution of an outcome over time. The most adequate strategy to analyze longitudinal latent variables, which can be either based on CTT or IRT models, remains to be identified. This strategy must take into account the latent characteristic of what PROs are intended to measure as well as the specificity of longitudinal designs. A simple and widely used IRT model is the Rasch model. The purpose of our study was to compare CTT and Rasch-based approaches to analyze longitudinal PRO data regarding type I error, power, and time effect estimation bias. Four methods were compared: the Score and Mixed models (SM) method based on the CTT approach, the Rasch and Mixed models (RM), the Plausible Values (PV), and the Longitudinal Rasch model (LRM) methods all based on the Rasch model. All methods have shown comparable results in terms of type I error, all close to 5 per cent. LRM and SM methods presented comparable power and unbiased time effect estimations, whereas RM and PV methods showed low power and biased time effect estimations. This suggests that RM and PV methods should be avoided to analyze longitudinal latent variables. Copyright © 2010 John Wiley & Sons, Ltd.
Generalizability in Item Response Modeling

ERIC Educational Resources Information Center

Briggs, Derek C.; Wilson, Mark

2007-01-01

An approach called generalizability in item response modeling (GIRM) is introduced in this article. The GIRM approach essentially incorporates the sampling model of generalizability theory (GT) into the scaling model of item response theory (IRT) by making distributional assumptions about the relevant measurement facets. By specifying a random…
Some Statistics for Assessing Person-Fit Based on Continuous-Response Models

ERIC Educational Resources Information Center

Ferrando, Pere Joan

2010-01-01

This article proposes several statistics for assessing individual fit based on two unidimensional models for continuous responses: linear factor analysis and Samejima's continuous response model. Both models are approached using a common framework based on underlying response variables and are formulated at the individual level as fixed regression…
Robustness of Hierarchical Modeling of Skill Association in Cognitive Diagnosis Models

ERIC Educational Resources Information Center

Templin, Jonathan L.; Henson, Robert A.; Templin, Sara E.; Roussos, Louis

2008-01-01

Several types of parameterizations of attribute correlations in cognitive diagnosis models use the reduced reparameterized unified model. The general approach presumes an unconstrained correlation matrix with K(K - 1)/2 parameters, whereas the higher order approach postulates K parameters, imposing a unidimensional structure on the correlation…
Structural and photodynamic properties of the anti-cancer drug irinotecan in aqueous solutions of different pHs.

PubMed

di Nunzio, Maria Rosaria; Douhal, Yasmin; Organero, Juan Angel; Douhal, Abderrazzak

2018-05-23

This work reports on photophysical studies of the irinotecan (IRT) anti-cancer drug in water solutions of different acidities (pH = 1.11-9.46). We found that IRT co-exists as mono-cationic (C1), di-cationic (C2), or neutral (N) forms. The population of each prototropic species depends on the pH of the solution. At pH = 1.11-3.01, the C1 and C2 structures are stabilized. At pH = 7.00, the most populated species is C1, while at pH values larger than 9.46 the N form is the most stable species. In the 1.11-2.61 pH range, the C1* emission is efficiently quenched by protons to give rise to the emission from C2*. The dynamic quenching constant, KD, is ∼32 M⁻¹. While the diffusion governs the rate of excited-state proton-transfer (ESPT) under these conditions, the reaction rate increases with the proton concentration. A two-step diffusive Debye-Smoluchowski model was applied at pH = 1.11-2.61 to describe the protonation of C1*. The ESPT time constants derived for C1* are 382 and 1720 ps at pH = 1.11 and 1.95, respectively. We found that one proton species is involved in the protonation of C1* to give C2*, in the analyzed acidic pH range. Under alkaline conditions (pH = 9.46), the N form is the most stable structure of IRT. These results indicate the influence of the pH of the medium on the structural and dynamical properties of IRT in water solution. They may help to provide a better understanding of the relationship between the structure and biological activity of IRT.
Assessing Unidimensionality and Differential Item Functioning in Qualifying Examination for Senior Secondary School Students, Osun State, Nigeria

ERIC Educational Resources Information Center

Ajeigbe, Taiwo Oluwafemi; Afolabi, Eyitayo Rufus Ifedayo

2017-01-01

This study assessed unidimensionality and occurrence of Differential Item Functioning (DIF) in Mathematics and English Language items of Osun State Qualifying Examination. The study made use of secondary data. The results showed that OSQ Mathematics (-0.094 ≤ r ≤ 0.236) and English Language items (-0.095 ≤ r ≤ 0.228) were unidimensional. Also,…

Comparison of immunoreactive serum trypsinogen and lipase in Cystic Fibrosis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lloyd-Still, J.D.; Weiss, S.; Wessel, H.

1984-01-01

The incidence of Cystic Fibrosis (CF) is 1 in 2,000. Early detection and treatment of CF may necessitate newborn screening with a reliable and cost-effective test. Serum immunoreactive trypsinogen (IRT), an enzyme produced by the pancreas, is detectable by radioimmunoassay (RIA) techniques. Recently, it has been shown that IRT is elevated in CF infants for the first few months of life and levels become subnormal as pancreatic insufficiency progresses. Other enzymes produced by the pancreas, such as lipase, are also elevated during this time. The author's earlier work confirmed previous reports of elevated IRT levels in CF infants. The development of a new RIA for lipase (nuclipase) has enabled comparison of these 2 pancreatic enzymes in CF. Serum IRT and lipase determinations were performed on 2 groups of CF patients: infants under 1 year of age, and children between 1 and 18 years of age. Control populations of the same age groups were included. The results showed that both trypsin (161 ± 92 ng/ml, range 20 to 400) and lipase (167 ± 151 ng/ml, range 29 to 500) are elevated in CF in the majority of infants. Control infants had values of IRT ranging from 20 to 29.5 ng/ml and lipase values ranging from 23 to 34 ng/ml. IRT becomes subnormal in most CF patients by 8 years of age as pancreatic function insufficiency increases. Lipase levels and IRT levels correlate well in infancy, but IRT is a more sensitive indicator of pancreatic insufficiency in older patients with CF.
Using Item Response Theory to Evaluate LSCI Learning Gains

NASA Astrophysics Data System (ADS)

Schlingman, Wayne M.; Prather, E. E.; Collaboration of Astronomy Teaching Scholars CATS

2012-01-01

Analyzing the data from the recent national study using the Light and Spectroscopy Concept Inventory (LSCI), this project uses Item Response Theory (IRT) to investigate the learning gains of students as measured by the LSCI. IRT provides a theoretical model to generate parameters accounting for students’ abilities. We use IRT to measure changes in students’ abilities to reason about light from pre- to post-instruction. Changes in students’ abilities are compared by classroom to better understand the learning that is taking place in classrooms across the country. We compare the average change in ability for each classroom to the Interactivity Assessment Score (IAS) to provide further insight into the prior results presented from this data set. This material is based upon work supported by the National Science Foundation under Grant No. 0715517, a CCLI Phase III Grant for the Collaboration of Astronomy Teaching Scholars (CATS). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
A Longitudinal Item Response Theory Model to Characterize Cognition Over Time in Elderly Subjects

PubMed Central

Bornkamp, Björn; Krahnke, Tillmann; Mielke, Johanna; Monsch, Andreas; Quarg, Peter

2017-01-01

For drug development in neurodegenerative diseases such as Alzheimer's disease, it is important to understand which cognitive domains carry the most information on the earliest signs of cognitive decline, and which subject characteristics are associated with a faster decline. A longitudinal Item Response Theory (IRT) model was developed for the Basel Study on the Elderly, in which the Consortium to Establish a Registry for Alzheimer's Disease – Neuropsychological Assessment Battery (with additions) and the California Verbal Learning Test were measured on 1,750 elderly subjects for up to 13.9 years. The model jointly captured the multifaceted nature of cognition and its longitudinal trajectory. The word list learning and delayed recall tasks carried the most information. Greater age at baseline, fewer years of education, and positive APOEɛ4 carrier status were associated with a faster cognitive decline. Longitudinal IRT modeling is a powerful approach for progressive diseases with multifaceted endpoints. PMID:28643388
A new item response theory model to adjust data allowing examinee choice

PubMed Central

Costa, Marcelo Azevedo; Braga Oliveira, Rivert Paulo

2018-01-01

In a typical questionnaire testing situation, examinees are not allowed to choose which items they answer because of a technical issue in obtaining satisfactory statistical estimates of examinee ability and item difficulty. This paper introduces a new item response theory (IRT) model that incorporates information from a novel representation of questionnaire data using network analysis. Three scenarios in which examinees select a subset of items were simulated. In the first scenario, the assumptions required to apply the standard Rasch model are met, thus establishing a reference for parameter accuracy. The second and third scenarios include five increasing levels of violating those assumptions. The results show substantial improvements over the standard model in item parameter recovery. Furthermore, the accuracy was closer to the reference in almost every evaluated scenario. To the best of our knowledge, this is the first proposal to obtain satisfactory IRT statistical estimates in the last two scenarios. PMID:29389996
Practical Issues in Estimating Classification Accuracy and Consistency with R Package cacIRT

ERIC Educational Resources Information Center

Lathrop, Quinn N.

2015-01-01

There are two main lines of research in estimating classification accuracy (CA) and classification consistency (CC) under Item Response Theory (IRT). The R package cacIRT provides computer implementations of both approaches in an accessible and unified framework. Even with available implementations, there remain decisions a researcher faces when…
IRT-Stimulus Contingencies in Chained Schedules: Implications for the Concept of Conditioned Reinforcement

ERIC Educational Resources Information Center

Bejarano, Rafael; Hackenberg, Timothy D.

2007-01-01

Two experiments with pigeons investigated the effects of contingencies between interresponse times (IRTs) and the transitions between the components of 2- and 4-component chained schedules (Experiments 1 and 2, respectively). The probability of component transitions varied directly with the most recent (Lag 0) IRT in some experimental conditions…
Item Response Theory: A Basic Concept

ERIC Educational Resources Information Center

Mahmud, Jumailiyah

2017-01-01

With the development of computing technology, item response theory (IRT) has developed rapidly and has become a user-friendly application in the psychometrics world. Limitations in classical theory are one aspect that encourages the use of IRT. In this study, the basic concept of IRT will be discussed. In addition, it will briefly review the ability…
Item Response Theory: Overview, Applications, and Promise for Institutional Research

ERIC Educational Resources Information Center

Bowman, Nicholas A.; Herzog, Serge; Sharkness, Jessica

2014-01-01

Item Response Theory (IRT) is a measurement theory that is ideal for scale and test development in institutional research, but it is not without its drawbacks. This chapter provides an overview of IRT, describes an example of its use, and highlights the pros and cons of using IRT in applied settings.
Using iRT, a normalized retention time for more targeted measurement of peptides.

PubMed

Escher, Claudia; Reiter, Lukas; MacLean, Brendan; Ossola, Reto; Herzog, Franz; Chilton, John; MacCoss, Michael J; Rinner, Oliver

2012-04-01

Multiple reaction monitoring (MRM) has recently become the method of choice for targeted quantitative measurement of proteins using mass spectrometry. The method, however, is limited in the number of peptides that can be measured in one run. This number can be markedly increased by scheduling the acquisition if the accurate retention time (RT) of each peptide is known. Here we present iRT, an empirically derived dimensionless peptide-specific value that allows for highly accurate RT prediction. The iRT of a peptide is a fixed number relative to a standard set of reference iRT-peptides that can be transferred across laboratories and chromatographic systems. We show that iRT facilitates the setup of multiplexed experiments with acquisition windows more than four times smaller compared to in silico RT predictions resulting in improved quantification accuracy. iRTs can be determined by any laboratory and shared transparently. The iRT concept has been implemented in Skyline, the most widely used software for MRM experiments. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
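The idea of a fixed, dimensionless iRT value that transfers across chromatographic systems can be sketched as a per-run calibration. The sketch below is illustrative only: it assumes a linear relationship between iRT and measured retention time within a run, and the reference values, retention times, and function names are hypothetical, not taken from the paper or from Skyline.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical reference peptides: fixed iRT values and the retention
# times (minutes) observed for them in one particular run.
ref_irt = [0.0, 25.0, 50.0, 75.0, 100.0]
ref_rt = [10.0, 15.0, 20.0, 25.0, 30.0]

# Calibrate this run (assumed linear): RT = slope * iRT + intercept.
slope, intercept = fit_line(ref_irt, ref_rt)


def predict_rt(irt: float) -> float:
    """Predict the retention time in this run for a peptide with a known iRT."""
    return slope * irt + intercept


print(predict_rt(40.0))
```

Under this assumption, only the reference peptides need to be measured on a new system; every other peptide's expected retention time follows from its stored iRT value.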
Spectral Analysis and Experimental Modeling of Ice Accretion Roughness

NASA Technical Reports Server (NTRS)

Orr, D. J.; Breuer, K. S.; Torres, B. E.; Hansman, R. J., Jr.

1996-01-01

A self-consistent scheme for relating wind tunnel ice accretion roughness to the resulting enhancement of heat transfer is described. First, a spectral technique of quantitative analysis of early ice roughness images is reviewed. The image processing scheme uses a spectral estimation technique (SET) which extracts physically descriptive parameters by comparing scan lines from the experimentally-obtained accretion images to a prescribed test function. Analysis using this technique for both streamwise and spanwise directions of data from the NASA Lewis Icing Research Tunnel (IRT) are presented. An experimental technique is then presented for constructing physical roughness models suitable for wind tunnel testing that match the SET parameters extracted from the IRT images. The icing castings and modeled roughness are tested for enhancement of boundary layer heat transfer using infrared techniques in a "dry" wind tunnel.
Assessment of Daily and Weekly Fatigue among African American Cancer Survivors

PubMed Central

Sobel, Rina M.; McSorley, Anna-Michelle M.; Roesch, Scott C.; Malcarne, Vanessa L.; Hawes, Starlyn M.; Sadler, Georgia Robins

2013-01-01

This investigation evaluates two common measures of cancer-related fatigue, one multidimensional/retrospective and one unidimensional/same-day. Fifty-two African American survivors of diverse cancers completed fatigue visual analogue scales once daily, and the Multidimensional Fatigue Symptom Inventory (MFSI) once weekly, for four weeks. Zero-order correlations showed retrospective fatigue was significantly related to average, peak, and most recent same-day fatigue. Multilevel random coefficient modeling showed unidimensional fatigue shared the most variance with the MFSI’s General subscale for three weeks, and with the Vigor subscale for one week. Researchers and clinicians may wish to prioritize multidimensional measures when assessing cancer-related fatigue, if appropriate. PMID:23844922
Log-Multiplicative Association Models as Item Response Models

ERIC Educational Resources Information Center

Anderson, Carolyn J.; Yu, Hsiu-Ting

2007-01-01

Log-multiplicative association (LMA) models, which are special cases of log-linear models, have interpretations in terms of latent continuous variables. Two theoretical derivations of LMA models based on item response theory (IRT) arguments are presented. First, we show that Anderson and colleagues (Anderson & Vermunt, 2000; Anderson & Bockenholt,…
Modeling Diagnostic Assessments with Bayesian Networks

ERIC Educational Resources Information Center

Almond, Russell G.; DiBello, Louis V.; Moulder, Brad; Zapata-Rivera, Juan-Diego

2007-01-01

This paper defines Bayesian network models and examines their applications to IRT-based cognitive diagnostic modeling. These models are especially suited to building inference engines designed to be synchronous with the finer grained student models that arise in skills diagnostic assessment. Aspects of the theory and use of Bayesian network models…
Study of the Forced Response of a Clamped Circular Plate Coupled to a Uni-Dimensional Acoustic Cavity

NASA Astrophysics Data System (ADS)

Curà, F.; Curti, G.; Mantovani, M.

1996-03-01

The subject of this paper is an experimental and analytical study of a structural-acoustical coupling problem. To simplify the issue, the analytical model considered here consists of a uni-dimensional acoustic cavity coupled to a one-degree-of-freedom system (mass, spring and damper). An harmonic excitation force is applied to the mass of the oscillator. In the theoretical analysis, the uni-dimensional cavity is subjected, in correspondence of its end sections, to boundary conditions, which are either the usual ones (closed or open ended) or those deriving from the coupling with the oscillator. This simple model proved to be very useful to investigate the influence of the variation of both the geometrical parameters (i.e., the length of the cavity) and the physical parameters (i.e., mass, damping coefficient and stiffness of the oscillator). The analytical results are compared to those obtained experimentally on a real coupled system, consisting of a cavity enclosed by an acoustically rigid steel cylinder, closed at one end by a movable, acoustically rigid piston and at the other end by a flexible plate, clamped around its edge by the cylinder. Thus the length of the cavity can be varied by simply moving the rigid piston.
Development of indirect ring tension test for fracture characterization of asphalt mixtures

NASA Astrophysics Data System (ADS)

Zeinali Siavashani, Alireza

Low temperature cracking is a major distress in asphalt pavements. Several test configurations have been introduced to characterize the fracture properties of hot mix asphalt (HMA); however, most are considered to be research tools due to the complexity of the test methods or equipment. This dissertation describes the development of the indirect ring tension (IRT) fracture test for HMA, which was designed to be an effective and user-friendly test that could be deployed at the Department of Transportation level. The primary advantages of this innovative and yet practical test include: relatively large fracture surface test zone, simplicity of the specimen geometry, widespread availability of the required test equipment, and ability to test laboratory compacted specimens as well as field cores. Numerical modeling was utilized to calibrate the stress intensity factor formula of the IRT fracture test for various specimen dimensions. The results of this extensive analysis were encapsulated in a single equation. To develop the test procedure, a laboratory study was conducted to determine the optimal test parameters for HMA material. An experimental plan was then developed to evaluate the capability of the test in capturing the variations in the mix properties, asphalt pavement density, asphalt material aging, and test temperature. Five plant-produced HMA mixtures were used in this extensive study, and the results revealed that the IRT fracture test is highly repeatable, and capable of capturing the variations in the fracture properties of HMA. Furthermore, an analytical model was developed based on the viscoelastic properties of HMA to estimate the maximum allowable crack size for the pavements in the experimental study. This analysis indicated that the low-temperature cracking potential of the asphalt mixtures is highly sensitive to the fracture toughness and brittleness of the HMA material. Additionally, the IRT fracture test data seemed to correlate well with the data from the distress survey which was conducted on the pavements after five years of service. The maximum allowable crack size analysis revealed that a significant improvement could be realized in terms of the pavements' performance if the HMA were to be compacted to a higher density. Finally, the IRT fracture test data were compared to the results of the disk-shaped compact [DC(t)] test. The results of the two tests showed a strong correlation; however, the IRT test seemed to be more repeatable. KEYWORDS: Asphalt Pavement, Low-Temperature Cracking, Fracture Mechanics, Material Characterization, Laboratory Testing.
The Motivational Value Systems Questionnaire (MVSQ): Psychometric Analysis Using a Forced Choice Thurstonian IRT Model

PubMed Central

Merk, Josef; Schlotz, Wolff; Falter, Thomas

2017-01-01

This study presents a new measure of value systems, the Motivational Value Systems Questionnaire (MVSQ), which is based on a theory of value systems by psychologist Clare W. Graves. The purpose of the instrument is to help people identify their personal hierarchies of value systems and thus become more aware of what motivates and demotivates them in work-related contexts. The MVSQ is a forced-choice (FC) measure, making it quicker to complete and more difficult to intentionally distort, but also more difficult to assess its psychometric properties due to ipsativity of FC data compared to rating scales. To overcome limitations of ipsative data, a Thurstonian IRT (TIRT) model was fitted to the questionnaire data, based on a broad sample of N = 1,217 professionals and students. Comparison of normative (IRT) scale scores and ipsative scores suggested that MVSQ IRT scores are largely freed from restrictions due to ipsativity and thus allow interindividual comparison of scale scores. Empirical reliability was estimated using a sample-based simulation approach which showed acceptable and good estimates and, on average, slightly higher test-retest reliabilities. Further, validation studies provided evidence on both construct validity and criterion-related validity. Scale score correlations and associations of scores with both age and gender were largely in line with theoretically- and empirically-based expectations, and results of a multitrait-multimethod analysis support convergent and discriminant construct validity. Criterion validity was assessed by examining the relation of value system preferences to departmental affiliation which revealed significant relations in line with prior hypothesizing. These findings demonstrate the good psychometric properties of the MVSQ and support its application in the assessment of value systems in work-related contexts. PMID:28979228
Examining Differential Item Functioning: IRT-Based Detection in the Framework of Confirmatory Factor Analysis

ERIC Educational Resources Information Center

Dimitrov, Dimiter M.

2017-01-01

This article offers an approach to examining differential item functioning (DIF) under its item response theory (IRT) treatment in the framework of confirmatory factor analysis (CFA). The approach is based on integrating IRT- and CFA-based testing of DIF and using bias-corrected bootstrap confidence intervals with a syntax code in Mplus.
Investigation of IRT-Based Equating Methods in the Presence of Outlier Common Items

ERIC Educational Resources Information Center

Hu, Huiqin; Rogers, W. Todd; Vukmirovic, Zarko

2008-01-01

Common items with inconsistent b-parameter estimates may have a serious impact on item response theory (IRT)--based equating results. To find a better way to deal with the outlier common items with inconsistent b-parameters, the current study investigated the comparability of 10 variations of four IRT-based equating methods (i.e., concurrent…
The Relationship between CTT and IRT Approaches in Analyzing Item Characteristics

ERIC Educational Resources Information Center

Abedalaziz, Nabeel; Leng, Chin Hai

2013-01-01

Most of the tests and inventories used by counseling psychologists have been developed using CTT; IRT derives from what is called latent trait theory. A number of important differences exist between CTT- versus IRT-based approaches to both test development and evaluation, as well as the process of scoring the response profiles of individual…

Technology and Teaching: Promoting Active Learning Using Individual Response Technology in Large Introductory Psychology Classes

ERIC Educational Resources Information Center

Poirier, Christopher R.; Feldman, Robert S.

2007-01-01

Individual response technology (IRT), in which students use wireless handsets to communicate real-time responses, permits the recording and display of aggregated student responses during class. In comparison to a traditional class that did not employ IRT, students using IRT performed better on exams and held positive attitudes toward the…
Using IRT Trait Estimates versus Summated Scores in Predicting Outcomes

ERIC Educational Resources Information Center

Xu, Ting; Stone, Clement A.

2012-01-01

It has been argued that item response theory trait estimates should be used in analyses rather than number right (NR) or summated scale (SS) scores. Thissen and Orlando postulated that IRT scaling tends to produce trait estimates that are linearly related to the underlying trait being measured. Therefore, IRT trait estimates can be more useful…
Measurement Invariance in Careers Research: Using IRT to Study Gender Differences in Medical Students' Specialization Decisions

ERIC Educational Resources Information Center

Behrend, Tara S.; Thompson, Lori Foster; Meade, Adam W.; Newton, Dale A.; Grayson, Martha S.

2008-01-01

The current study demonstrates the use of item response theory (IRT) to conduct measurement invariance analyses in careers research. A self-report survey was used to assess the importance 1,363 fourth-year medical students placed on opportunities to provide comprehensive patient care when choosing a career specialty. IRT analyses supported…
"We Freeze to Please": A History of NASA's Icing Research Tunnel and the Quest for Flight Safety

NASA Technical Reports Server (NTRS)

Leary, William M.

2002-01-01

The formation of ice on wings and other control surfaces of airplanes is one of the oldest and most vexing problems that aircraft engineers and scientists continue to face. While no easy, comprehensive answers exist, the staff at NASA's Icing Research Tunnel (IRT) at the Glenn Research Center in Cleveland has done pioneering work to make flight safer for experimental, commercial, and military customers. The National Advisory Committee for Aeronautics (NACA) initiated government research on aircraft icing in the 1930s at its Langley facility in Virginia. Icing research shifted to the NACA's Cleveland facility in the 1940s. Initially there was little focus on icing at either location, as these facilities were more concerned with aerodynamics and engine development. With several high-profile fatal crashes of air mail carriers, however, the NACA soon realized the need for a leading research facility devoted to icing prevention and removal. The IRT began operation in 1944 and, despite renovations and periodic attempts to shut it down, has continued to function productively for almost 60 years. In part because icing has proved so problematic over time, IRT researchers have been unusually open-minded in experimenting with a wide variety of substances, devices, and techniques. Early icing prevention experiments involved grease, pumping hot engine exhaust onto the wings, glycerin soap, mechanical and inflatable "boots," and even corn syrup. The IRT staff also looked abroad for ideas and later tried a German and Soviet technique of electromagnetism, to no avail. More recently, European polymer fluids have been more promising. The IRT even periodically had "amateur nights" in which a dentist's coating for children's teeth proved unequal to the demands of super-cooled water droplets blown at 100 miles per hour. Despite many research dead-ends, IRT researchers have achieved great success over the years. They have developed important computer models, such as the LEWICE software, and made significant contributions to prevent ice buildup on turbine-powered commercial aircraft, helicopters, and military planes.
A Componential IRT Model for Guilt.

ERIC Educational Resources Information Center

Smits, Dirk J. M.; De Boeck, Paul

2003-01-01

Studied the process structure of guilt with an adaptation of the Model with Internal Restrictions on Item Difficulty (R. Butter and others, 1998) administered to 270 high school students. Findings show that this kind of modeling is appropriate to investigate the structure of other emotions. (SLD)
Stellar rotation periods determined from simultaneously measured Ca II H&K and Ca II IRT lines

NASA Astrophysics Data System (ADS)

Mittag, M.; Hempelmann, A.; Schmitt, J. H. M. M.; Fuhrmeister, B.; González-Pérez, J. N.; Schröder, K.-P.

2017-11-01

Aims: Previous studies have shown that, for late-type stars, activity indicators derived from the Ca II infrared-triplet (IRT) lines are correlated with the indicators derived from the Ca II H&K lines. Therefore, the Ca II IRT lines are in principle usable for activity studies, but they may be less sensitive when measuring the rotation period. Our goal is to determine whether the Ca II IRT lines are sufficiently sensitive to measure rotation periods and how any Ca II IRT derived rotation periods compare with periods derived from the "classical" Mount Wilson S-index. Methods: To analyse the Ca II IRT lines' sensitivity and to measure rotation periods, we define an activity index for each of the Ca II IRT lines similar to the Mount Wilson S-index and perform a period analysis for the lines separately and jointly. Results: For eleven late-type stars we can measure the rotation periods using the Ca II IRT indices similar to those found in the Mount Wilson S-index time series and find that a period derived from all four indices gives the most probable rotation period; we find good agreement for stars with already existing literature values. In a few cases the computed periodograms show a complicated structure with multiple peaks, meaning that formally different periods are derived in different indices. We show that in one case, this is due to data sampling effects and argue that denser cadence sampling is necessary to provide credible evidence for differential rotation. However, our TIGRE data for HD 101501 shows good evidence for the presence of differential rotation.
The probiotic mixture IRT5 ameliorates age-dependent colitis in rats.

PubMed

Jeong, Jin-Ju; Woo, Jae-Yeon; Ahn, Young-Tae; Shim, Jae-Hun; Huh, Chul-Sung; Im, Sin-Heog; Han, Myung Joo; Kim, Dong-Hyun

2015-06-01

To investigate the anti-inflammatory effect of probiotics, we orally administered IRT5 (1×10^9 CFU/rat) for 8 weeks to aged (16 months-old) Fischer 344 rats, and measured parameters of colitis. The expression levels of the inflammatory markers inducible NO synthase (iNOS), cyclooxygenase-2 (COX2), tumor necrosis factor (TNF)-α, and interleukin (IL)-1β were higher in the colons of normal aged rats (18 months-old) than in the colons of normal young rats (6 months-old). Treatment with IRT5 suppressed the age-associated increased expression of iNOS, COX2, TNF-α, and IL-1β, and activation of NF-κB and mitogen-activated protein kinases. In a similar manner, the expression of tight junction proteins in the colon of normal aged rats was suppressed more potently than in normal young rats, and treatment of aged rats with IRT5 decreased the age-dependent suppression of tight junction proteins ZO-1, occludin, and claudin-1. Treatment with IRT5 suppressed age-associated increases in expressions of senescence markers p16 and p53 in the colon of aged rats, but increased age-suppressed expression of SIRT1. However, treatment with IRT5 inhibited age-associated increased myeloperoxidase activity in the colon. In addition, treatment with IRT5 lowered the levels of LPS in intestinal fluid and blood of aged rats, and reduced the concentrations of reactive oxygen species, malondialdehyde, and C-reactive protein in the blood. These findings suggest that IRT5 treatment may suppress age-dependent colitis by inhibiting gut microbiota LPS production. Copyright © 2015 Elsevier B.V. All rights reserved.
Validation of self-directed learning instrument and establishment of normative data for nursing students in Taiwan: using polytomous item response theory.

PubMed

Cheng, Su-Fen; Lee-Hsieh, Jane; Turton, Michael A; Lin, Kuan-Chia

2014-06-01

Little research has investigated the establishment of norms for nursing students' self-directed learning (SDL) ability, recognized as an important capability for professional nurses. An item response theory (IRT) approach was used to establish norms for SDL abilities valid for the different nursing programs in Taiwan. The purposes of this study were (a) to use IRT with a graded response model to reexamine the SDL instrument, or the SDLI, originally developed by this research team using confirmatory factor analysis and (b) to establish SDL ability norms for the four different nursing education programs in Taiwan. Stratified random sampling with probability proportional to size was used. A minimum of 15% of students from the four different nursing education degree programs across Taiwan was selected. A total of 7,879 nursing students from 13 schools were recruited. The research instrument was the 20-item SDLI developed by Cheng, Kuo, Lin, and Lee-Hsieh (2010). IRT with the graded response model was used with a two-parameter logistic model (discrimination and difficulty) for the data analysis, calculated using MULTILOG. Norms were established using percentile rank. Analysis of item information and test information functions revealed that 18 items exhibited very high discrimination and two items had high discrimination. The test information function was higher in this range of scores, indicating greater precision in the estimate of nursing student SDL. Reliability fell between .80 and .94 for each domain and the SDLI as a whole. The total information function shows that the SDLI is appropriate for all nursing students, except for the top 2.5%. SDL ability norms were established for each nursing education program and for the nation as a whole. IRT is shown to be a potent and useful methodology for scale evaluation. The norms for SDL established in this research will provide practical standards for nursing educators and students in Taiwan.
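The graded response model with discrimination and difficulty parameters, as mentioned in the abstract above, can be sketched as follows. This is a minimal illustration of Samejima's formulation as commonly written, not the MULTILOG implementation used in the study; the function names and parameter values are hypothetical.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def grm_category_probs(theta: float, a: float, thresholds: list) -> list:
    """Category probabilities for one polytomous item.

    theta: latent trait level; a: item discrimination;
    thresholds: ordered difficulty thresholds b_1 < ... < b_(m-1).
    Returns m probabilities, one per response category.
    """
    # Cumulative probabilities of responding in category k or higher,
    # padded with 1 (lowest category or higher) and 0 (above the highest).
    cumulative = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    # Each category probability is the difference of adjacent cumulatives.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical five-category item (four thresholds), for illustration only.
probs = grm_category_probs(theta=0.5, a=1.8, thresholds=[-1.5, -0.5, 0.4, 1.6])
print(probs)
```

Because the thresholds are ordered, the cumulative probabilities decrease from 1 to 0, so the category probabilities are non-negative and sum to one.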
Handbook of Polytomous Item Response Theory Models

ERIC Educational Resources Information Center

Nering, Michael L., Ed.; Ostini, Remo, Ed.

2010-01-01

This comprehensive "Handbook" focuses on the most used polytomous item response theory (IRT) models. These models help us understand the interaction between examinees and test questions where the questions have various response categories. The book reviews all of the major models and includes discussions about how and where the models…
Stochastic Approximation Methods for Latent Regression Item Response Models

ERIC Educational Resources Information Center

von Davier, Matthias; Sinharay, Sandip

2010-01-01

This article presents an application of a stochastic approximation expectation maximization (EM) algorithm using a Metropolis-Hastings (MH) sampler to estimate the parameters of an item response latent regression model. Latent regression item response models are extensions of item response theory (IRT) to a latent variable model with covariates…
Fitting Item Response Theory Models to Two Personality Inventories: Issues and Insights.

PubMed

Chernyshenko, O S; Stark, S; Chan, K Y; Drasgow, F; Williams, B

2001-10-01

The present study compared the fit of several IRT models to two personality assessment instruments. Data from 13,059 individuals responding to the US-English version of the Fifth Edition of the Sixteen Personality Factor Questionnaire (16PF) and 1,770 individuals responding to Goldberg's 50 item Big Five Personality measure were analyzed. Various issues pertaining to the fit of the IRT models to personality data were considered. We examined two of the most popular parametric models designed for dichotomously scored items (i.e., the two- and three-parameter logistic models) and a parametric model for polytomous items (Samejima's graded response model). Also examined were Levine's nonparametric maximum likelihood formula scoring models for dichotomous and polytomous data, which were previously found to provide good fits to several cognitive ability tests (Drasgow, Levine, Tsien, Williams, & Mead, 1995). The two- and three-parameter logistic models fit some scales reasonably well but not others; the graded response model generally did not fit well. The nonparametric formula scoring models provided the best fit of the models considered. Several implications of these findings for personality measurement and personnel selection were described.
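For readers unfamiliar with the two- and three-parameter logistic models mentioned in this abstract, a minimal Python sketch of their item response functions follows. The function names are hypothetical, and the sketch shows only the standard textbook forms, not the estimation or fit procedures used in the study.

```python
import math

def irf_3pl(theta, a, b, c=0.0):
    """Probability of a keyed response under the three-parameter logistic model.

    a -- discrimination, b -- difficulty, c -- lower asymptote (guessing).
    With c = 0 this reduces to the two-parameter logistic model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def item_information_2pl(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = irf_3pl(theta, a, b)
    return a * a * p * (1.0 - p)
```

Under the 2PL form, information peaks where theta equals the item difficulty, which is why highly discriminating items sharpen measurement only near their own location on the trait scale.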
Some Issues in Item Response Theory: Dimensionality Assessment and Models for Guessing

ERIC Educational Resources Information Center

Smith, Jessalyn

2009-01-01

Currently, standardized tests are widely used as a method to measure how well schools and students meet academic standards. As a result, measurement issues have become an increasingly popular topic of study. Unidimensional item response models are used to model latent abilities and specific item characteristics. This class of models makes…
Consequences of Ignoring Guessing when Estimating the Latent Density in Item Response Theory

ERIC Educational Resources Information Center

Woods, Carol M.

2008-01-01

In Ramsay-curve item response theory (RC-IRT), the latent variable distribution is estimated simultaneously with the item parameters. In extant Monte Carlo evaluations of RC-IRT, the item response function (IRF) used to fit the data is the same one used to generate the data. The present simulation study examines RC-IRT when the IRF is imperfectly…
A Structural View of American Educational History

ERIC Educational Resources Information Center

Maxcy, Spencer J.

1977-01-01

Displays the components of the structuralist views of Levi-Strauss, Michel Foucault, and Thomas S. Kuhn; constructs a model for doing structuralist studies in educational research; and tests the model on the pragmatic/progressive period in American educational history. (Author/IRT)
Forecasting the Movement of Educational Administrators Through Vacancy Flows

ERIC Educational Resources Information Center

Brown, Daniel J.

1976-01-01

Discusses the problem of forecasting manpower flows in administrative hierarchies of educational organizations, reviews groups of manpower models, discusses characteristics of administrative hierarchies and the vacancy model as it relates to those characteristics, and carries out validation and projective tests of the model. (Author/IRT)
On the Performance Characteristics of Latent-Factor and Knowledge Tracing Models

ERIC Educational Resources Information Center

Klingler, Severin; Käser, Tanja; Solenthaler, Barbara; Gross, Markus

2015-01-01

Modeling student knowledge is a fundamental task of an intelligent tutoring system. A popular approach for modeling the acquisition of knowledge is Bayesian Knowledge Tracing (BKT). Various extensions to the original BKT model have been proposed, among them two novel models that unify BKT and Item Response Theory (IRT). Latent Factor Knowledge…
Assessing the Accuracy and Consistency of Language Proficiency Classification under Competing Measurement Models

ERIC Educational Resources Information Center

Zhang, Bo

2010-01-01

This article investigates how measurement models and statistical procedures can be applied to estimate the accuracy of proficiency classification in language testing. The paper starts with a concise introduction of four measurement models: the classical test theory (CTT) model, the dichotomous item response theory (IRT) model, the testlet response…
Looking Closer at the Effects of Framing on Risky Choice: An Item Response Theory Analysis.

PubMed

Sickar; Highhouse

1998-07-01

Item response theory (IRT) methodology allowed an in-depth examination of several issues that would be difficult to explore using traditional methodology. IRT models were estimated for 4 risky-choice items, answered by students under either a gain or loss frame. Results supported the typical framing finding of risk-aversion for gains and risk-seeking for losses but also suggested that a latent construct we label preference for risk was influential in predicting risky choice. Also, the Asian Disease item, most often used in framing research, was found to have anomalous statistical properties when compared to other framing items. Copyright 1998 Academic Press.
Construct validity of the Swedish version of the revised piper fatigue scale in an oncology sample--a Rasch analysis.

PubMed

Lundgren-Nilsson, Asa; Dencker, Anna; Jakobsson, Sofie; Taft, Charles; Tennant, Alan

2014-06-01

Fatigue is a common and distressing symptom in cancer patients due to both the disease and its treatments. The concept of fatigue is multidimensional and includes both physical and mental components. The 22-item Revised Piper Fatigue Scale (RPFS) is a multidimensional instrument developed to assess cancer-related fatigue. This study reports on the construct validity of the Swedish version of the RPFS from the perspective of Rasch measurement. The Swedish version of the RPFS was answered by 196 cancer patients fatigued after 4 to 5 weeks of curative radiation therapy. Data from the scale were fitted to the Rasch measurement model. This involved testing a series of assumptions, including the stochastic ordering of items, local response dependency, and unidimensionality. A series of fit statistics were computed, differential item functioning (DIF) was tested, and local response dependency was accommodated through testlets. The Behavioral, Affective and Sensory domains all satisfied the Rasch model expectations. No DIF was observed, and all domains were found to be unidimensional. The Mood/Cognitive scale failed to fit the model, and substantial multidimensionality was found. Splitting the scale between Mood and Cognitive items resolved fit to the Rasch model, and new domains were unidimensional without DIF. The current Rasch analyses add to the evidence of measurement properties of the scale and show that the RPFS has good psychometric properties and works well to measure fatigue. The original four-factor structure, however, was not supported. Copyright © 2014 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Time change of perceptual reversal of ambiguous figures by rTMS.

PubMed

Nojima, K; Ge, S; Katayama, Y; Iramina, K

2010-01-01

The aim of this study was to investigate the effect of stimulus frequency and number of pulses during rTMS (repetitive transcranial magnetic stimulation) on the phenomenon of perceptual reversal. Particularly, we focused on the temporal dynamics of perceptual reversal in the right SPL (superior parietal lobule), using the spinning wheel illusion. We measured the IRT (inter-reversal time) of perceptual reversal. To investigate whether stimulus frequency or the number of pulses is critical for the rTMS effect, we applied the following schedules over the right SPL and the right PTL (posterior temporal lobe): 0.25 Hz 60 pulses, 0.25 Hz 120 pulses, 0.5 Hz 120 pulses, and 1 Hz 120 pulses biphasic rTMS at 90% of the resting motor threshold. As a control, we included a No-TMS condition. The results showed that rTMS with 0.25 Hz 60 pulses over the right SPL caused shorter IRT. There were no significant differences between IRTs for rTMS with 0.25 Hz 120 pulses, 0.5 Hz 120 pulses or 1 Hz 120 pulses over the right SPL. Comparing these results with those of a previous study, we found that an rTMS condition with 60 pulses causes shorter IRT; 240 pulses causes longer IRT; and 120 pulses does not change IRT. Therefore, when applying rTMS over the right SPL, the IRT of perceptual reversal is primarily affected by the number of pulses.

The Estimation of the IRT Reliability Coefficient and Its Lower and Upper Bounds, with Comparisons to CTT Reliability Statistics

ERIC Educational Resources Information Center

Kim, Seonghoon; Feldt, Leonard S.

2010-01-01

The primary purpose of this study is to investigate the mathematical characteristics of the test reliability coefficient ρ_XX' as a function of item response theory (IRT) parameters and present the lower and upper bounds of the coefficient. Another purpose is to examine relative performances of the IRT reliability statistics and two…
The Reliability and Precision of Total Scores and IRT Estimates as a Function of Polytomous IRT Parameters and Latent Trait Distribution

ERIC Educational Resources Information Center

Culpepper, Steven Andrew

2013-01-01

A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…
Screening for fever by remote-sensing infrared thermographic camera.

PubMed

Chan, Lung-Sang; Cheung, Giselle T Y; Lauder, Ian J; Kumana, Cyrus R

2004-01-01

Following the severe acute respiratory syndrome (SARS) outbreak, remote-sensing infrared thermography (IRT) has been advocated as a possible means of screening for fever in travelers at airports and border crossings, but its applicability has not been established. We therefore set out to evaluate (1) the feasibility of IRT imaging to identify subjects with fever, and (2) the optimal instrumental configuration and validity for such testing. Over a 20-day inclusive period, 176 subjects (49 hospital inpatients without SARS or suspected SARS, 99 health clinic attendees and 28 healthy volunteers) were recruited. Remotely sensed IRT readings were obtained from various parts of the front and side of the face (at distances of 1.5 and 0.5 m), and compared to concurrently determined body temperature measurements using conventional means (aural tympanic IRT and oral mercury thermometry). The resulting data were submitted to linear regression/correlation and sensitivity analyses. All recruits gave prior informed consent and our Faculty Institutional Review Board approved the protocol. Optimal correlations were found between conventionally measured body temperatures and IRT readings from (1) the front of the face at 1.5 m with the mouth open (r=0.80), (2) the ear at 0.5 m (r=0.79), and (3) the side of the face at 1.5 m (r=0.76). Average IRT readings from the forehead and elsewhere were 1 °C to 2 °C lower and correlated less well. Ear IRT readings at 0.5 m yielded the narrowest confidence intervals and could be used to predict conventional body temperature readings of ≤ 38 °C with a sensitivity and specificity of 83% and 88% respectively. IRT readings from the side of the face, especially from the ear at 0.5 m, yielded the most reliable, precise and consistent estimates of conventionally determined body temperatures. Our results have important implications for walk-through IRT scanning/screening systems at airports and border crossings, particularly as the point prevalence of fever in such subjects would be very low.
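The sensitivity and specificity figures quoted in this abstract are ratios taken from a two-by-two classification table. The Python sketch below shows that arithmetic under stated assumptions: the function name is hypothetical and the six paired observations are invented for illustration only, not drawn from the study.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for paired boolean classifications.

    predicted -- screening result per subject (True = flagged)
    actual    -- reference-standard result per subject (True = condition present)
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # flagged, condition present
    fn = sum(1 for p, a in pairs if not p and a)      # missed, condition present
    tn = sum(1 for p, a in pairs if not p and not a)  # cleared, condition absent
    fp = sum(1 for p, a in pairs if p and not a)      # flagged, condition absent
    return tp / (tp + fn), tn / (tn + fp)

# Invented toy data: six subjects, not the study's measurements.
predicted = [True, True, False, False, True, False]
actual = [True, False, False, False, True, True]
sens, spec = sensitivity_specificity(predicted, actual)
```

Neither ratio depends on prevalence, which is why the abstract's closing remark about low point prevalence matters separately: with few true cases, most flagged subjects can still be false positives.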
An Aggregate IRT Procedure for Exploratory Factor Analysis

ERIC Educational Resources Information Center

Camilli, Gregory; Fox, Jean-Paul

2015-01-01

An aggregation strategy is proposed to potentially address practical limitations related to computing resources for two-level multidimensional item response theory (MIRT) models with large data sets. The aggregate model is derived by integration of the normal ogive model, and an adaptation of the stochastic approximation expectation maximization…
Project Simu-School Component Washington State University

ERIC Educational Resources Information Center

Glass, Thomas E.

1976-01-01

This component of the project attempts to facilitate planning by furnishing models that manage cumbersome and complex data, supply an objectivity that identifies all relationships between elements of the model, and provide a quantitative model allowing for various forecasting techniques that describe the long-range impact of decisions. (Author/IRT)
Cognitive Diagnostic Modeling Using R

ERIC Educational Resources Information Center

Ravand, Hamdollah

2015-01-01

Cognitive diagnostic models (CDM) have been around for more than a decade but their application is far from widespread for mainly two reasons: (1) CDMs are novel, as compared to traditional IRT models. Consequently, many researchers lack familiarity with them and their properties, and (2) Software programs doing CDMs have been expensive and not…
Bayesian Analysis of Multidimensional Item Response Theory Models: A Discussion and Illustration of Three Response Style Models

ERIC Educational Resources Information Center

Leventhal, Brian C.; Stone, Clement A.

2018-01-01

Interest in Bayesian analysis of item response theory (IRT) models has grown tremendously due to the appeal of the paradigm among psychometricians, advantages of these methods when analyzing complex models, and availability of general-purpose software. Possible models include models which reflect multidimensionality due to designed test structure,…
Performance of the S-χ² Statistic for Full-Information Bifactor Models

ERIC Educational Resources Information Center

Li, Ying; Rupp, Andre A.

2011-01-01

This study investigated the Type I error rate and power of the multivariate extension of the S-χ² statistic using unidimensional and multidimensional item response theory (UIRT and MIRT, respectively) models as well as full-information bifactor (FI-bifactor) models through simulation. Manipulated factors included test length, sample…
A Binary Programming Approach to Automated Test Assembly for Cognitive Diagnosis Models

ERIC Educational Resources Information Center

Finkelman, Matthew D.; Kim, Wonsuk; Roussos, Louis; Verschoor, Angela

2010-01-01

Automated test assembly (ATA) has been an area of prolific psychometric research. Although ATA methodology is well developed for unidimensional models, its application alongside cognitive diagnosis models (CDMs) is a burgeoning topic. Two suggested procedures for combining ATA and CDMs are to maximize the cognitive diagnostic index and to use a…
Multidimensional Extension of Multiple Indicators Multiple Causes Models to Detect DIF

ERIC Educational Resources Information Center

Lee, Soo; Bulut, Okan; Suh, Youngsuk

2017-01-01

A number of studies have found multiple indicators multiple causes (MIMIC) models to be an effective tool in detecting uniform differential item functioning (DIF) for individual items and item bundles. A recently developed MIMIC-interaction model is capable of detecting both uniform and nonuniform DIF in the unidimensional item response theory…
Rewards of bridging the divide between measurement and clinical theory: demonstration of a bifactor model for the Brief Symptom Inventory.

PubMed

Thomas, Michael L

2012-03-01

There is growing evidence that psychiatric disorders maintain hierarchical associations where general and domain-specific factors play prominent roles (see D. Watson, 2005). Standard, unidimensional measurement models can fail to capture the meaningful nuances of such complex latent variable structures. The present study examined the ability of the multidimensional item response theory bifactor model (see R. D. Gibbons & D. R. Hedeker, 1992) to improve construct validity by serving as a bridge between measurement and clinical theories. Archival data consisting of 688 outpatients' psychiatric diagnoses and item-level responses to the Brief Symptom Inventory (BSI; L. R. Derogatis, 1993) were extracted from files at a university mental health clinic. The bifactor model demonstrated superior fit for the internal structure of the BSI and improved overall diagnostic accuracy in the sample (73%) compared with unidimensional (61%) and oblique simple structure (65%) models. Consistent with clinical theory, multiple sources of item variance were drawn from individual test items. Test developers and clinical researchers are encouraged to consider model-based measurement in the assessment of psychiatric distress.
A Model Fit Statistic for Generalized Partial Credit Model

ERIC Educational Resources Information Center

Liang, Tie; Wells, Craig S.

2009-01-01

Investigating the fit of a parametric model is an important part of the measurement process when implementing item response theory (IRT), but research examining it is limited. A general nonparametric approach for detecting model misfit, introduced by J. Douglas and A. S. Cohen (2001), has exhibited promising results for the two-parameter logistic…
The Graded Unfolding Model: A Unidimensional Item Response Model for Unfolding Graded Responses.

ERIC Educational Resources Information Center

Roberts, James S.; Laughlin, James E.

Binary or graded disagree-agree responses to attitude items are often collected for the purpose of attitude measurement. Although such data are sometimes analyzed with cumulative measurement models, recent investigations suggest that unfolding models are more appropriate (J. S. Roberts, 1995; W. H. Van Schuur and H. A. L. Kiers, 1994). Advances in…
A Unidimensional Item Response Model for Unfolding Responses from a Graded Disagree-Agree Response Scale.

ERIC Educational Resources Information Center

Roberts, James S.; Laughlin, James E.

1996-01-01

A parametric item response theory model for unfolding binary or graded responses is developed. The graded unfolding model (GUM) is a generalization of the hyperbolic cosine model for binary data of D. Andrich and G. Luo (1993). Applicability of the GUM to attitude testing is illustrated with real data. (SLD)
Item response theory in personality assessment: a demonstration using the MMPI-2 depression scale.

PubMed

Childs, R A; Dahlstrom, W G; Kemp, S M; Panter, A T

2000-03-01

Item response theory (IRT) analyses have, over the past 3 decades, added much to our understanding of the relationships among and characteristics of test items, as revealed in examinees' response patterns. Assessment instruments used outside the educational context have only infrequently been analyzed using IRT, however. This study demonstrates the relevance of IRT to personality data through analyses of Scale 2 (the Depression Scale) on the revised Minnesota Multiphasic Personality Inventory (MMPI-2). A rich set of hypotheses regarding the items on this scale, including contrasts among the Harris-Lingoes and Wiener-Harmon subscales and differences in the items' measurement characteristics for men and women, are investigated through the IRT analyses.
Psychometric properties for the Balanced Inventory of Desirable Responding: dichotomous versus polytomous conventional and IRT scoring.

PubMed

Vispoel, Walter P; Kim, Han Yi

2014-09-01

[Correction Notice: An Erratum for this article was reported in Vol 26(3) of Psychological Assessment (see record 2014-16017-001). The mean, standard deviation and alpha coefficient originally reported in Table 1 should be 74.317, 10.214 and .802, respectively. The validity coefficients in the last column of Table 4 are affected as well. Correcting this error did not change the substantive interpretations of the results, but did increase the mean, standard deviation, alpha coefficient, and validity coefficients reported for the Honesty subscale in the text and in Tables 1 and 4. The corrected versions of Tables 1 and 4 are shown in the erratum.] Item response theory (IRT) models were applied to dichotomous and polytomous scoring of the Self-Deceptive Enhancement and Impression Management subscales of the Balanced Inventory of Desirable Responding (Paulhus, 1991, 1999). Two dichotomous scoring methods reflecting exaggerated endorsement and exaggerated denial of socially desirable behaviors were examined. The 1- and 2-parameter logistic models (1PLM, 2PLM, respectively) were applied to dichotomous responses, and the partial credit model (PCM) and graded response model (GRM) were applied to polytomous responses. For both subscales, the 2PLM fit dichotomous responses better than did the 1PLM, and the GRM fit polytomous responses better than did the PCM. Polytomous GRM and raw scores for both subscales yielded higher test-retest and convergent validity coefficients than did PCM, 1PLM, 2PLM, and dichotomous raw scores. Information plots showed that the GRM provided consistently high measurement precision that was superior to that of all other IRT models over the full range of both construct continuums. Dichotomous scores reflecting exaggerated endorsement of socially desirable behaviors provided noticeably weak precision at low levels of the construct continuums, calling into question the use of such scores for detecting instances of "faking bad." Dichotomous models reflecting exaggerated denial of the same behaviors yielded much better precision at low levels of the constructs, but still less precision than the GRM. These results support polytomous over dichotomous scoring in general, alternative dichotomous scoring for detecting faking bad, and extension of GRM scoring to situations in which IRT offers additional practical advantages over classical test theory (adaptive testing, equating, linking, scaling, detecting differential item functioning, and so forth). PsycINFO Database Record (c) 2014 APA, all rights reserved.
The application of item response theory in developing and validating a shortened version of the Emirate Marital Satisfaction Scale.

PubMed

Dodeen, Hamzeh; Al-Darmaki, Fatima

2016-12-01

The aim of this study was to determine the feasibility of generating a shorter version of the Emirati Marital Satisfaction Scale (EMSS) using item response theory (IRT)-based methodology. The EMSS is the first national scale used to provide an understanding of the family function and level of marital satisfaction within the cultural context of the United Arab Emirates. A sample of 1,049 Emirati married individuals from different ages, genders, places of residence, and monthly incomes participated in this study. The IRT was calibrated using X-Calibre 4.2 and the graded response model. The analysis was developed on the basis of a short form of the EMSS (7 items), which constitutes a promising alternative to the original scale for practitioners and researchers. This short version is reliable, valid, and it gives results very similar to the original scale. The results of this study confirmed the usefulness of IRT-based methodology for developing psychological and counseling scales. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Intrinsic Membrane Properties of Pre-oromotor Neurons in the Intermediate Zone of the Medullary Reticular Formation

PubMed Central

Venugopal, Sharmila; Boulant, Jack A.; Chen, Zhixiong; Travers, Joseph B.

2010-01-01

Neurons in the lower brainstem that control consummatory behavior are widely distributed in the reticular formation (RF) of the pons and medulla. The intrinsic membrane properties of neurons within this distributed system shape complex excitatory and inhibitory inputs from both orosensory and central structures implicated in homeostatic control to produce coordinated oromotor patterns. The current study explored the intrinsic membrane properties of neurons in the intermediate subdivision of the medullary reticular formation (IRt). Neurons in the IRt receive input from the overlying (gustatory) nucleus of the solitary tract and project to the oromotor nuclei. Recent behavioral pharmacology studies as well as computational modeling suggest that inhibition in the IRt plays an important role in the transition from a taste-initiated oromotor pattern of ingestion to one of rejection. The present study explored the impact of hyperpolarization on membrane properties. In response to depolarization, neurons responded with either a tonic discharge, an irregular/burst pattern or were spike-adaptive. A hyperpolarizing pre-pulse modulated the excitability of most (82%) IRt neurons to subsequent depolarization. Instances of both increased (30%) and decreased (52%) excitability were observed. Currents induced by the hyperpolarization included an outward 4-AP sensitive K+ current that suppressed excitability and an inward cation current that increased excitability. These currents are also present in other subpopulations of RF neurons that influence the oromotor nuclei and we discuss how these currents could alter firing characteristics to impact pattern generation. PMID:20338224
An experimental and theoretical study of the ice accretion process during artificial and natural icing conditions

NASA Technical Reports Server (NTRS)

Kirby, Mark S.; Hansman, R. John

1988-01-01

Real-time measurements of ice growth during artificial and natural icing conditions were conducted using an ultrasonic pulse-echo technique. This technique allows ice thickness to be measured with an accuracy of ±0.5 mm; in addition, the ultrasonic signal characteristics may be used to detect the presence of liquid on the ice surface and hence discern wet and dry ice growth behavior. Ice growth was measured on the stagnation line of a cylinder exposed to artificial icing conditions in the NASA Lewis Icing Research Tunnel (IRT), and similarly for a cylinder exposed in flight to natural icing conditions. Ice thickness was observed to increase approximately linearly with exposure time during the initial icing period. The ice accretion rate was found to vary with cloud temperature during wet ice growth, and liquid runback from the stagnation region was inferred. A steady-state energy balance model for the icing surface was used to compare heat transfer characteristics for IRT and natural icing conditions. Ultrasonic measurements of wet and dry ice growth observed in the IRT and in flight were compared with icing regimes predicted by a series of heat transfer coefficients. The heat transfer magnitude was generally inferred to be higher for the IRT than for the natural icing conditions encountered in flight. An apparent variation in the heat transfer magnitude was also observed for flights conducted through different natural icing-cloud formations.
Methods for Equating Mental Tests.

DTIC Science & Technology

1984-11-01

1983) compared conventional and IRT methods for equating the Test of English as a Foreign Language (TOEFL) after chaining. Three conventional and...three IRT equating methods were examined in this study; two sections of TOEFL were each (separately) equated. The IRT methods included the following: (a...group. A separate base form was established for each of the six equating methods. Instead of equating the base-form TOEFL to itself, the last (eighth

Aero-Thermal Calibration of the NASA Glenn Icing Research Tunnel (2012 Tests)

NASA Technical Reports Server (NTRS)

Pastor-Barsi, Christine; Allen, Arrington E.

2013-01-01

A full aero-thermal calibration of the NASA Glenn Icing Research Tunnel (IRT) was completed in 2012 following the major modifications to the facility that included replacement of the refrigeration plant and heat exchanger. The calibration test provided data used to fully document the aero-thermal flow quality in the IRT test section and to construct calibration curves for the operation of the IRT.
A new network of faint calibration stars from the near infrared spectrometer (NIRS) on the IRTS

NASA Technical Reports Server (NTRS)

Freund, Minoru M.; Matsuura, Mikako; Murakami, Hiroshi; Cohen, Martin; Noda, Manabu; Matsuura, Shuji; Matsumoto, Toshio

1997-01-01

The point source extraction and calibration of the near infrared spectrometer (NIRS) onboard the Infrared Telescope in Space (IRTS) is described. About 7 percent of the sky was observed during a one month mission in the range of 1.4 micrometers to 4 micrometers. The accuracy of the spectral shape and absolute values of calibration stars provided by the NIRS/IRTS were validated.
Inoculation with Bacillus subtilis and Azospirillum brasilense produces abscisic acid that reduces IRT1-mediated cadmium uptake of roots.

PubMed

Xu, Qianru; Pan, Wei; Zhang, Ranran; Lu, Qi; Xue, Wanlei; Wu, Cainan; Song, Bixiu; Du, Shaoting

2018-05-08

Cadmium (Cd) contamination of agricultural soils represents a serious risk to crop safety. A new strategy using abscisic acid (ABA)-generating bacteria, Bacillus subtilis or Azospirillum brasilense, was developed to reduce the Cd accumulation in plants grown in Cd-contaminated soil. Inoculation with either bacterium resulted in a pronounced increase in the ABA level in wild-type Arabidopsis Col-0 plants, accompanied by a decrease in Cd levels in plant tissues, which mitigated the Cd toxicity. As a consequence, the growth of plants exposed to Cd was improved. Nevertheless, B. subtilis and A. brasilense inoculation had little effect on Cd levels and toxicity in the ABA-insensitive mutant snrk 2.2/2.3, indicating that the action of ABA is required for these bacteria to reduce Cd accumulation in plants. Furthermore, inoculation with either B. subtilis or A. brasilense down-regulated the expression of IRT1 (IRON-REGULATED TRANSPORTER 1) in the roots of wild-type plants and had little effect on Cd levels in the IRT1-knockout mutants irt1-1 and irt1-2. In summary, we conclude that B. subtilis and A. brasilense can reduce Cd levels in plants via an IRT1-dependent ABA-mediated mechanism.
Medical applications of infrared thermography: A review

NASA Astrophysics Data System (ADS)

Lahiri, B. B.; Bagavathiappan, S.; Jayakumar, T.; Philip, John

2012-07-01

Abnormal body temperature is a natural indicator of illness. Infrared thermography (IRT) is a fast, passive, non-contact and non-invasive alternative to conventional clinical thermometers for monitoring body temperature. In addition, IRT can map body surface temperature remotely. The last five decades have witnessed a steady increase in the use of thermal imaging cameras to obtain correlations between thermal physiology and skin temperature. IRT has been successfully used in the diagnosis of breast cancer, diabetes neuropathy and peripheral vascular disorders. It has also been used to detect problems associated with gynecology, kidney transplantation, dermatology, heart, neonatal physiology, fever screening and brain imaging. With the advent of modern infrared cameras and data acquisition and processing techniques, it is now possible to obtain real-time, high-resolution thermographic images, which is likely to spur further research in this field. Present efforts are focused on automatic analysis of the temperature distribution of regions of interest and its statistical analysis for the detection of abnormalities. This critical review focuses on advances in the area of medical IRT. The basics of IRT, essential theoretical background, the procedures adopted for various measurements and applications of IRT in various medical fields are discussed in this review. In addition, background information is provided to give beginners a better understanding of the subject.
Efficacy of a cognitive-behavioral treatment for insomnia and nightmares in Afghanistan and Iraq veterans with PTSD.

PubMed

Margolies, Skye Ochsner; Rybarczyk, Bruce; Vrana, Scott R; Leszczyszyn, David J; Lynch, John

2013-10-01

Sleep disturbances are a core and salient feature of posttraumatic stress disorder (PTSD). Pilot studies have indicated that combined cognitive-behavioral therapy for insomnia (CBT-I) and imagery rehearsal therapy (IRT) for nightmares improves sleep as well as PTSD symptoms. The present study randomized 40 combat veterans (mean age 37.7 years; 90% male and 60% African American) who served in Afghanistan and/or Iraq (Operation Enduring Freedom [OEF]/Operation Iraqi Freedom [OIF]) to 4 sessions of CBT-I with adjunctive IRT or a waitlist control group. Two thirds of participants had nightmares at least once per week and received the optional IRT module. At posttreatment, veterans who participated in CBT-I/IRT reported improved subjectively and objectively measured sleep, a reduction in PTSD symptom severity and PTSD-related nighttime symptoms, and a reduction in depression and distressed mood compared to the waitlist control group. The findings from this first controlled study with OEF/OIF veterans suggest that CBT-I combined with adjunctive IRT may hold promise for reducing both insomnia and PTSD symptoms. Given the fact that only half of the patients with nightmares fully implemented the brief IRT protocol, future studies should determine if this supplement adds differential efficacy to CBT-I alone. © 2013 Wiley Periodicals, Inc.
A Bifactor Multidimensional Item Response Theory Model for Differential Item Functioning Analysis on Testlet-Based Items

ERIC Educational Resources Information Center

Fukuhara, Hirotaka; Kamata, Akihito

2011-01-01

A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into…
A Person Fit Test for IRT Models for Polytomous Items

ERIC Educational Resources Information Center

Glas, C. A. W.; Dagohoy, Anna Villa T.

2007-01-01

A person fit test based on the Lagrange multiplier test is presented for three item response theory models for polytomous items: the generalized partial credit model, the sequential model, and the graded response model. The test can also be used in the framework of multidimensional ability parameters. It is shown that the Lagrange multiplier…
Limits on Log Cross-Product Ratios for Item Response Models. Research Report. ETS RR-06-10

ERIC Educational Resources Information Center

Haberman, Shelby J.; Holland, Paul W.; Sinharay, Sandip

2006-01-01

Bounds are established for log cross-product ratios (log odds ratios) involving pairs of items for item response models. First, expressions for bounds on log cross-product ratios are provided for unidimensional item response models in general. Then, explicit bounds are obtained for the Rasch model and the two-parameter logistic (2PL) model.…
The Log-Linear Cognitive Diagnostic Model (LCDM) as a Special Case of The General Diagnostic Model (GDM). Research Report. ETS RR-14-40

ERIC Educational Resources Information Center

von Davier, Matthias

2014-01-01

Diagnostic models combine multiple binary latent variables in an attempt to produce a latent structure that provides more information about test takers' performance than do unidimensional latent variable models. Recent developments in diagnostic modeling emphasize the possibility that multiple skills may interact in a conjunctive way within the…
Variants in Solute Carrier SLC26A9 Modify Prenatal Exocrine Pancreatic Damage in Cystic Fibrosis

PubMed Central

Miller, Melissa R.; Soave, David; Li, Weili; Gong, Jiafen; Pace, Rhonda G.; Boëlle, Pierre-Yves; Cutting, Garry R.; Drumm, Mitchell L.; Knowles, Michael R.; Sun, Lei; Rommens, Johanna M.; Accurso, Frank; Durie, Peter R.; Corvol, Harriet; Levy, Hara; Sontag, Marci K.; Strug, Lisa J.

2015-01-01

Objectives To test the hypothesis that multiple constituents of the apical plasma membrane residing alongside the causal CF Transmembrane Conductance Regulator (CFTR) protein, including known cystic fibrosis (CF) modifiers SLC26A9, SLC6A14, and SLC9A3, would be associated with prenatal exocrine pancreatic damage as measured by newborn screened (NBS) IRT levels. Study design NBS IRT measures and genome-wide genotype data were available on 111 subjects from Colorado, 37 subjects from Wisconsin, and 80 subjects from France. Multiple linear regression was used to determine whether any of eight SNPs in SLC26A9, SLC6A14 and SLC9A3 were associated with IRT and whether other constituents of the apical plasma membrane contributed to IRT. Results In the Colorado sample, three SLC26A9 SNPs were associated with NBS IRT (min P = 1.16 × 10⁻³; rs7512462), but no SLC6A14 or SLC9A3 SNPs were associated (P > 0.05). The rs7512462 association replicated in the Wisconsin sample (P = 0.03) but not in the French sample (P = 0.76). Furthermore, rs7512462 was the top ranked apical membrane constituent in the combined Colorado and Wisconsin sample. Conclusions NBS IRT is a biomarker of prenatal exocrine pancreatic disease in patients with CF, and a SNP in SLC26A9 accounts for significant IRT variability. This suggests SLC26A9 as a potential therapeutic target to ameliorate exocrine pancreatic disease. PMID:25771386
Can Handheld Thermal Imaging Technology Improve Detection of Poachers in African Bushveldt?

PubMed Central

Dandy, Shantelle; Stubbs, Hannah; MacTavish, Dougal; MacTavish, Lynne

2015-01-01

Illegal hunting (poaching) is a global threat to wildlife. Anti-poaching initiatives are making increasing use of technology, such as infrared thermography (IRT), to support traditional foot and vehicle patrols. To date, the effectiveness of IRT for poacher location has not been tested under field conditions, where thermal signatures are often complex. Here, we test the hypothesis that IRT will increase the distance over which a poacher hiding in African scrub bushveldt can be detected relative to a conventional flashlight. We also test whether any increase in effectiveness is related to the cost and complexity of the equipment by comparing comparatively expensive (22000 USD) and relatively inexpensive (2000 USD) IRT devices. To test these hypotheses we employ a controlled, fully randomised, double-blind procedure to find a poacher in nocturnal field conditions in African bushveldt. Each of our 27 volunteer observers walked three times along a pathway using one detection technology on each pass in randomised order. They searched a prescribed search area of bushveldt within which the target was hiding. Hiding locations were pre-determined, randomised, and changed with each pass. Distances of first detection and positive detection were noted. All technologies could be used to detect the target. Average first detection distance for flashlight was 37.3m, improving by 19.8m to 57.1m using LIRT and by a further 11.2m to 68.3m using HIRT. Although detection distances were significantly greater for both IRTs compared to flashlight, there was no significant difference between LIRT and HIRT. False detection rates were low and there was no significant association between technology and accuracy of detection. 
Although IRT technology should ideally be tested in the specific environment intended before significant investment is made, we conclude that IRT technology is promising for anti-poaching patrols and that for this purpose low cost IRT units are as effective as units ten times more expensive. PMID:26110865
Splicing factor SR34b mutation reduces cadmium tolerance in Arabidopsis by regulating iron-regulated transporter 1 gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Wentao; Du, Bojing; Liu, Di

Highlights: • Arabidopsis splicing factor SR34b gene is cadmium-inducible. • SR34b T-DNA insertion mutant is sensitive to cadmium due to high cadmium uptake. • SR34b is a regulator of cadmium transporter IRT1 at the posttranscriptional level. • These results highlight the roles of splicing factors in cadmium tolerance of plants. - Abstract: Serine/arginine-rich (SR) proteins are important splicing factors. However, the biological functions of plant SR proteins remain unclear, especially in abiotic stresses. Cadmium (Cd) is a non-essential element that negatively affects plant growth and development. In this study, we provided clear evidence for the involvement of an SR gene in Cd tolerance in planta. Systemic expression analysis of 17 Arabidopsis SR genes revealed that SR34b is the only SR gene upregulated by Cd, suggesting its potential role in Arabidopsis Cd tolerance. Consistent with this, an SR34b T-DNA insertion mutant (sr34b) was moderately sensitive to Cd, had a higher Cd²⁺ uptake rate, and accumulated Cd in greater amounts than the wild type. This was due to the altered expression of the iron-regulated transporter 1 (IRT1) gene in the sr34b mutant. Under normal growth conditions, IRT1 mRNAs highly accumulated in the sr34b mutant as a result of increased stability of IRT1 mRNA. Under Cd stress, however, sr34b mutant plants had a splicing defect in the IRT1 gene, which reduced IRT1 mRNA accumulation. Despite this, sr34b mutant plants still constitutively expressed IRT1 proteins under Cd stress, resulting in a Cd stress-sensitive phenotype. We therefore propose an essential role for SR34b in posttranscriptional regulation of IRT1 expression and identify it as a regulator of Arabidopsis Cd tolerance.
An Exploratory Analysis of Functional Staging Using an Item Response Theory Approach

PubMed Central

Tao, Wei; Haley, Stephen M.; Coster, Wendy J.; Ni, Pengsheng; Jette, Alan M.

2009-01-01

Objectives To develop and explore the feasibility of a functional staging system (defined as the process of assigning subjects, according to predetermined standards, into a set of hierarchical levels with regard to their functioning performance in mobility, daily activities, and cognitive skills) based on item response theory (IRT) methods using short-forms of the Activity Measure for Post-Acute Care (AM-PAC); and to compare the criterion validity and sensitivity of the IRT-based staging system to a non-IRT-based staging system developed for the FIM instrument. Design Prospective, longitudinal cohort study of patients interviewed at hospital discharge and 1, 6, and 12 months after inpatient rehabilitation. Setting Follow-up interviews conducted in patients’ homes. Participants Convenience sample of 516 patients (47% men; sample mean age, 68.3y) at baseline (retention at the final follow-up, 65%) with neurologic, lower-extremity orthopedic, or complex medical conditions. Interventions Not applicable Main Outcome Measures AM-PAC basic mobility, daily activity, and applied cognitive activity stages; FIM executive control, mobility, activities of daily living, and sphincter stages. Stages refer to the hierarchical levels assigned to patient’s functioning performance. Results We were able to define IRT-based staging definitions and create meaningful cut scores based on the 3 AM-PAC short-forms. The IRT stages correlated as well or better to the criterion items than the FIM stages. Both the IRT-based stages and the FIM stages were sensitive to changes throughout the 6-month follow-up period. The FIM stages were more sensitive in detecting changes between baseline and 1-month follow-up visit. The AM-PAC stages were more discriminant in the follow-up visits. Conclusions An IRT-based staging approach appeared feasible and effective in classifying patients throughout long-term follow-up. 
Although these stages were developed from short-forms, this staging methodology could also be applied to improve the meaning of scores generated from IRT-based computerized adaptive testing in future work. PMID:18503798
RhinAsthma patient perspective: A Rasch validation study.

PubMed

Molinengo, Giorgia; Baiardini, Ilaria; Braido, Fulvio; Loera, Barbara

2018-02-01

In daily practice, Health-Related Quality of Life (HRQoL) tools are useful for supplementing clinical data with the patient's perspective. To encourage their use by clinicians, the availability of tools that can quickly provide valid results is crucial. A new HRQoL tool has been proposed for patients with asthma and rhinitis: the RhinAsthma Patient Perspective-RAPP. The aim of this study was to evaluate the psychometric robustness of the RAPP using the Item Response Theory (IRT) approach, to evaluate the scalability of items and test whether or not patients use the items response scale correctly. 155 patients (53.5% women, mean age 39.1, range 16-76) were recruited during a multicenter study. RAPP metric properties were investigated using IRT models. Differential item functioning (DIF) was used for gender, age, and asthma control test (ACT). The RAPP adequately fitted the Rating Scale model, demonstrating the equality of the rating scale structure for all items. All statistics on items were satisfactory. The RAPP had adequate internal reliability and showed good ability to discriminate among different groups of participants. DIF analysis indicated that there were no differential item functioning issues for gender. One item showed a DIF by age and four items by ACT. The psychometric evaluation performed using IRT models demonstrated that the RAPP met all the criteria to be considered a reliable and valid method of measurement. From a clinical perspective, this will allow physicians to confidently interpret scores as good indicators of Quality of Life of patients with asthma.
ZIP14 and DMT1 in the liver, pancreas, and heart are differentially regulated by iron deficiency and overload: implications for tissue iron uptake in iron-related disorders

PubMed Central

Nam, Hyeyoung; Wang, Chia-Yu; Zhang, Lin; Zhang, Wei; Hojyo, Shintaro; Fukada, Toshiyuki; Knutson, Mitchell D.

2013-01-01

The liver, pancreas, and heart are particularly susceptible to iron-related disorders. These tissues take up plasma iron from transferrin or non-transferrin-bound iron, which appears during iron overload. Here, we assessed the effect of iron status on the levels of the transmembrane transporters, ZRT/IRT-like protein 14 and divalent metal-ion transporter-1, which have both been implicated in transferrin- and non-transferrin-bound iron uptake. Weanling male rats (n=6/group) were fed an iron-deficient, iron-adequate, or iron-overloaded diet for 3 weeks. ZRT/IRT-like protein 14, divalent metal-ion transporter-1 protein and mRNA levels in liver, pancreas, and heart were determined by using immunoblotting and quantitative reverse transcriptase polymerase chain reaction analysis. Confocal immunofluorescence microscopy was used to localize ZRT/IRT-like protein 14 in the liver and pancreas. ZRT/IRT-like protein 14 and divalent metal-ion transporter-1 protein levels were also determined in hypotransferrinemic mice with genetic iron overload. Hepatic ZRT/IRT-like protein 14 levels were found to be 100% higher in iron-loaded rats than in iron-adequate controls. By contrast, hepatic divalent metal-ion transporter-1 protein levels were 70% lower in iron-overloaded animals and nearly 3-fold higher in iron-deficient ones. In the pancreas, ZRT/IRT-like protein 14 levels were 50% higher in iron-overloaded rats, and in the heart, divalent metal-ion transporter-1 protein levels were 4-fold higher in iron-deficient animals. At the mRNA level, ZRT/IRT-like protein 14 expression did not vary with iron status, whereas divalent metal-ion transporter-1 expression was found to be elevated in iron-deficient livers. Immunofluorescence staining localized ZRT/IRT-like protein 14 to the basolateral membrane of hepatocytes and to acinar cells of the pancreas. 
Hepatic ZRT/IRT-like protein 14, but not divalent metal-ion transporter-1, protein levels were elevated in iron-loaded hypotransferrinemic mice. In conclusion, ZRT/IRT-like protein 14 protein levels are up-regulated in iron-loaded rat liver and pancreas and in hypotransferrinemic mouse liver. Divalent metal-ion transporter-1 protein levels are down-regulated in iron-loaded rat liver, and up-regulated in iron-deficient liver and heart. Our results provide insight into the potential contributions of these transporters to tissue iron uptake during iron deficiency and overload. PMID:23349308
Partially Observed Mixtures of IRT Models: An Extension of the Generalized Partial-Credit Model

ERIC Educational Resources Information Center

Von Davier, Matthias; Yamamoto, Kentaro

2004-01-01

The generalized partial-credit model (GPCM) is used frequently in educational testing and in large-scale assessments for analyzing polytomous data. Special cases of the generalized partial-credit model are the partial-credit model (the Rasch model for ordinal data) and the two-parameter logistic (2PL) model. This article extends the GPCM to the…
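The special-case relationship mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard GPCM parameterization (one slope per item plus one step parameter per category boundary); the function and argument names are hypothetical and are not taken from the article.

```python
import math

def gpcm_probs(theta, a, thresholds):
    """Category probabilities for one item under the GPCM.

    theta: latent trait value; a: item slope; thresholds: step
    parameters b_1..b_m (illustrative names). Returns a list of
    probabilities for categories 0..m.
    """
    # Category k has logit sum_{v<=k} a * (theta - b_v); category 0 has 0.
    logits = [0.0]
    for b in thresholds:
        logits.append(logits[-1] + a * (theta - b))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - peak) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]

def two_pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

With a single step parameter the GPCM has two categories and its upper-category probability coincides with the 2PL; fixing the slope at 1 for every item gives the partial-credit model.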
Back to "the Future": Evidence of a Bifactor Solution for Scores on the Consideration of Future Consequences Scale.

PubMed

McKay, Michael T; Morgan, Grant B; van Exel, N Job; Worrell, Frank C

2015-01-01

Despite its widespread use, disagreement remains regarding the structure of the Consideration of Future Consequences Scale (CFCS). In particular there is disagreement regarding whether the scale assesses future orientation as a unidimensional or multidimensional (immediate and future) construct. Using 2 samples of high school students in the United Kingdom, 4 models were tested. The totality of results including item loadings, goodness-of-fit indexes, and reliability estimates all supported the bifactor model, suggesting that the 2 hypothesized factors are better understood as grouping or method factors rather than as representative of latent constructs. Accordingly this study supports the unidimensionality of the CFCS and the scoring of all 12 items to produce a global future orientation score. Researchers intending to use the CFCS, and those with existing data, are encouraged to examine a bifactor solution for the scale.
Development and Standardization of the Diagnostic Adaptive Behavior Scale: Application of Item Response Theory to the Assessment of Adaptive Behavior.

PubMed

Tassé, Marc J; Schalock, Robert L; Thissen, David; Balboni, Giulia; Bersani, Henry Hank; Borthwick-Duffy, Sharon A; Spreat, Scott; Widaman, Keith F; Zhang, Dalun; Navas, Patricia

2016-03-01

The Diagnostic Adaptive Behavior Scale (DABS) was developed using item response theory (IRT) methods and was constructed to provide the most precise and valid adaptive behavior information at or near the cutoff point of making a decision regarding a diagnosis of intellectual disability. The DABS initial item pool consisted of 260 items. Using IRT modeling and a nationally representative standardization sample, the item set was reduced to 75 items that provide the most precise adaptive behavior information at the cutoff area determining the presence or not of significant adaptive behavior deficits across conceptual, social, and practical skills. The standardization of the DABS is described and discussed.
Parameter Recovery for the 1-P HGLLM with Non-Normally Distributed Level-3 Residuals

ERIC Educational Resources Information Center

Kara, Yusuf; Kamata, Akihito

2017-01-01

A multilevel Rasch model using a hierarchical generalized linear model is one approach to multilevel item response theory (IRT) modeling and is referred to as a one-parameter hierarchical generalized linear logistic model (1-P HGLLM). Although it has the flexibility to model nested structure of data with covariates, the model assumes the normality…
A Bayesian Semiparametric Item Response Model with Dirichlet Process Priors

ERIC Educational Resources Information Center

Miyazaki, Kei; Hoshino, Takahiro

2009-01-01

In Item Response Theory (IRT), item characteristic curves (ICCs) are illustrated through logistic models or normal ogive models, and the probability that examinees give the correct answer is usually a monotonically increasing function of their ability parameters. However, since only limited patterns of shapes can be obtained from logistic models…
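As a rough illustration of the two parametric ICC families named in this abstract, the sketch below evaluates a normal ogive curve and a logistic curve on a grid of ability values. The scaling constant 1.702 is the conventional value used to bring the logistic curve close to the normal ogive; the function names and the grid are assumptions made for the example, not part of the article.

```python
import math

def normal_ogive(theta, a, b):
    """ICC under the normal ogive model: Phi(a * (theta - b))."""
    z = a * (theta - b)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_icc(theta, a, b, scale=1.702):
    """ICC under the logistic model with the conventional scaling constant."""
    return 1.0 / (1.0 + math.exp(-scale * a * (theta - b)))

# Compare the two curves for one item (a = 1, b = 0) on theta in [-6, 6].
grid = [i / 100.0 for i in range(-600, 601)]
max_gap = max(
    abs(normal_ogive(t, 1.0, 0.0) - logistic_icc(t, 1.0, 0.0)) for t in grid
)
```

Both curves increase monotonically in ability and stay within about 0.01 of each other, which is why either family yields only a limited range of curve shapes.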

Introduction to Multilevel Item Response Theory Analysis: Descriptive and Explanatory Models

ERIC Educational Resources Information Center

Sulis, Isabella; Toland, Michael D.

2017-01-01

Item response theory (IRT) models are the main psychometric approach for the development, evaluation, and refinement of multi-item instruments and scaling of latent traits, whereas multilevel models are the primary statistical method when considering the dependence between person responses when primary units (e.g., students) are nested within…
Randomized Item Response Theory Models

ERIC Educational Resources Information Center

Fox, Jean-Paul

2005-01-01

The randomized response (RR) technique is often used to obtain answers on sensitive questions. A new method is developed to measure latent variables using the RR technique because direct questioning leads to biased results. Within the RR technique, the probability of the true response is modeled by an item response theory (IRT) model. The RR…
Goodness-of-Fit Assessment of Item Response Theory Models

ERIC Educational Resources Information Center

Maydeu-Olivares, Alberto

2013-01-01

The article provides an overview of goodness-of-fit assessment methods for item response theory (IRT) models. It is now possible to obtain accurate p-values of the overall fit of the model if bivariate information statistics are used. Several alternative approaches are described. As the validity of inferences drawn on the fitted model…
Structured Constructs Models Based on Change-Point Analysis

ERIC Educational Resources Information Center

Shin, Hyo Jeong; Wilson, Mark; Choi, In-Hee

2017-01-01

This study proposes a structured constructs model (SCM) to examine measurement in the context of a multidimensional learning progression (LP). The LP is assumed to have features that go beyond a typical multidimensional IRT model, in that there are hypothesized to be certain cross-dimensional linkages that correspond to requirements between the…
What IRT Can and Cannot Do

ERIC Educational Resources Information Center

Glas, Cees A. W.

2009-01-01

This author states that, while the article by Gunter Maris and Timo Bechger ("On Interpreting the Model Parameters for the Three Parameter Logistic Model," this issue) is highly interesting, the interest is not so much in the practical implications, but rather in the issue of the meaning and role of statistical models in psychometrics and…
Impact of Diagnosticity on the Adequacy of Models for Cognitive Diagnosis under a Linear Attribute Structure: A Simulation Study

ERIC Educational Resources Information Center

de La Torre, Jimmy; Karelitz, Tzur M.

2009-01-01

Compared to unidimensional item response models (IRMs), cognitive diagnostic models (CDMs) based on latent classes represent examinees' knowledge and item requirements using discrete structures. This study systematically examines the viability of retrofitting CDMs to IRM-based data with a linear attribute structure. The study utilizes a procedure…
Exploring the Full-Information Bifactor Model in Vertical Scaling with Construct Shift

ERIC Educational Resources Information Center

Li, Ying; Lissitz, Robert W.

2012-01-01

To address the lack of attention to construct shift in item response theory (IRT) vertical scaling, a multigroup, bifactor model was proposed to model the common dimension for all grades and the grade-specific dimensions. Bifactor model estimation accuracy was evaluated through a simulation study with manipulated factors of percentage of common…
Stochastic Approximation Methods for Latent Regression Item Response Models. Research Report. ETS RR-09-09

ERIC Educational Resources Information Center

von Davier, Matthias; Sinharay, Sandip

2009-01-01

This paper presents an application of a stochastic approximation EM-algorithm using a Metropolis-Hastings sampler to estimate the parameters of an item response latent regression model. Latent regression models are extensions of item response theory (IRT) to a 2-level latent variable model in which covariates serve as predictors of the…
A Note on Item-Restscore Association in Rasch Models

ERIC Educational Resources Information Center

Kreiner, Svend

2011-01-01

To rule out the need for a two-parameter item response theory (IRT) model during item analysis by Rasch models, it is important to check the Rasch model's assumption that all items have the same item discrimination. Biserial and polyserial correlation coefficients measuring the association between items and restscores are often used in an informal…
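The informal check described in this abstract can be sketched as follows. This is only an illustration: it uses plain Pearson (point-biserial) correlations between each item and its restscore rather than the biserial or polyserial coefficients named in the abstract, and the function names and data layout (one row of 0/1 scores per person) are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)

def item_restscore_correlations(responses):
    """Correlate each item with the total score on the remaining items.

    responses: list of per-person lists of item scores (hypothetical layout).
    """
    n_items = len(responses[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]  # restscore
        correlations.append(pearson(item, rest))
    return correlations
```

Under the equal-discrimination assumption of the Rasch model these correlations should be broadly similar across items; markedly higher or lower values for single items are the kind of informal signal the abstract refers to.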
Using Data Augmentation and Markov Chain Monte Carlo for the Estimation of Unfolding Response Models

ERIC Educational Resources Information Center

Johnson, Matthew S.; Junker, Brian W.

2003-01-01

Unfolding response models, a class of item response theory (IRT) models that assume a unimodal item response function (IRF), are often used for the measurement of attitudes. Verhelst and Verstralen (1993) and Andrich and Luo (1993) independently developed unfolding response models by relating the observed responses to a more common monotone IRT…
Modified Likelihood-Based Item Fit Statistics for the Generalized Graded Unfolding Model

ERIC Educational Resources Information Center

Roberts, James S.

2008-01-01

Orlando and Thissen (2000) developed an item fit statistic for binary item response theory (IRT) models known as S-X². This article generalizes their statistic to polytomous unfolding models. Four alternative formulations of S-X² are developed for the generalized graded unfolding model (GGUM). The GGUM is a…
Psychometric evaluation of the revised Illness Perception Questionnaire (IPQ-R) in cancer patients: confirmatory factor analysis and Rasch analysis.

PubMed

Ashley, Laura; Smith, Adam B; Keding, Ada; Jones, Helen; Velikova, Galina; Wright, Penny

2013-12-01

To provide new insights into the psychometrics of the revised Illness Perception Questionnaire (IPQ-R) in cancer patients. To undertake, for the first time using data from breast, colorectal and prostate cancer patients, a confirmatory factor analysis (CFA) to assess the validity of the IPQ-R's core seven-factor structure. Also, for the first time in any illness group, to undertake Rasch analysis to explore the extent to which the IPQ-R factors form unidimensional scales, with linear measurement properties and no Differential Item Functioning (DIF). Patients with potentially curable breast, colorectal or prostate cancer, within 6 months post-diagnosis, completed the IPQ-R online (N=531). CFA was conducted, including multi-sample analysis, and for each IPQ-R factor fit to the Rasch model was assessed by examining, amongst other things, item fit, DIF and unidimensionality. The CFA showed a moderate fit of the data to the IPQ-R model, and stability across diagnosis, although fit was significantly improved following the removal of selected items. All seven factors achieved fit to the Rasch model, and exhibited unidimensionality and minimal DIF, although in most cases this was after some item rescoring and/or deletion. In both analyses, IPQ-R items 12, 18 and 24 were indicated as misfitting and removed. Given the rigorous standard of Rasch measurement, and the generic nature of the IPQ-R, it stood up well to the demands of the Rasch model in this study. Importantly, the results show that with some relatively minor, pragmatic modifications the IPQ-R could possess Rasch-standard measurement in cancer patients. © 2013.
Time in School: The Case of the Prudent Patron.

ERIC Educational Resources Information Center

Johnson, Thomas

1978-01-01

Explores the properties of a life cycle model of human capital accumulation under the assumptions that the individual cannot borrow to finance his schooling, but may receive an allowance while specializing. (Author/IRT)
Unidimensional versus multidimensional approaches to the assessment of acculturation for Asian American populations.

PubMed

Abe-Kim, J; Okazaki, S; Goto, S G

2001-08-01

This study used generational status and the Suinn-Lew Asian Self-Identity Acculturation scale to examine unidimensional versus multidimensional approaches to the conceptualization and measurement of acculturation and their relationships to relevant cultural indicator variables, including measures of Individualism-Collectivism, Independent-Interdependent Self-Construal, Loss of Face, and Impression Management. Multivariate analyses of covariance and partial correlations were used to examine the relationship between the acculturation models and each set of cultural indicator variables while controlling for socioeconomic status. Given that acculturation differences are often cited as evidence for a culture effect between groups, the present findings of an uneven nature of these relationships as a function of the particular acculturation measurement strategy have important implications for research on Asian Americans.
Use of Laser Speckle Contrast Imaging to Assess Digital Microvascular Function in Primary Raynaud Phenomenon and Systemic Sclerosis: A Comparison Using the Raynaud Condition Score Diary.

PubMed

Pauling, John D; Shipley, Jacqueline A; Hart, Darren J; McGrogan, Anita; McHugh, Neil J

2015-07-01

Evaluate objective assessment of digital microvascular function using laser speckle contrast imaging (LSCI) in a cross-sectional study of patients with primary Raynaud phenomenon (RP) and systemic sclerosis (SSc), comparing LSCI with both infrared thermography (IRT) and subjective assessment using the Raynaud Condition Score (RCS) diary. Patients with SSc (n = 25) and primary RP (n = 18) underwent simultaneous assessment of digital perfusion using LSCI and IRT with a cold challenge on 2 occasions, 2 weeks apart. The RCS diary was completed between assessments. The relationship between objective and subjective assessments of RP was evaluated. Reproducibility of LSCI/IRT was assessed, along with differences between primary RP and SSc, and the effect of sex. There was moderate-to-good correlation between LSCI and IRT (Spearman rho 0.58-0.84, p < 0.01), but poor correlation between objective assessments and the RCS diary (p > 0.05 for all analyses). Reproducibility of IRT and LSCI was moderate at baseline (ICC 0.51-0.63) and immediately following cold challenge (ICC 0.56-0.86), but lower during reperfusion (ICC 0.3-0.7). Neither subjective nor objective assessments differentiated between primary RP and SSc. Men reported lower median daily frequency of RP attacks (0.82 vs 1.93, p = 0.03). Perfusion using LSCI/IRT was higher in men for the majority of assessments. Objective and subjective methods provide differing information on microvascular function in RP. There is good convergent validity of LSCI with IRT and acceptable reproducibility of both modalities. Neither subjective nor objective assessments could differentiate between primary RP and SSc. Influence of sex on subjective and objective assessment of RP warrants further evaluation.
Genetic and clinical features of false-negative infants in a neonatal screening programme for cystic fibrosis.

PubMed

Padoan, R; Genoni, S; Moretti, E; Seia, M; Giunta, A; Corbetta, C

2002-01-01

A study was performed on the delayed diagnosis of cystic fibrosis (CF) in infants who had false-negative results in a neonatal screening programme. The genetic and clinical features of false-negative infants in this screening programme were assessed together with the efficiency of the screening procedure in the Lombardia region. In total, 774,687 newborns were screened using a two-step immunoreactive trypsinogen (IRT) (in the years 1990-1992), IRT/IRT + delF508 (1993-1998) or IRT/IRT + polymerase chain reaction (PCR) and oligonucleotide ligation assay (OLA) protocol (1998-1999). Out of 196 CF children born in the 10 y period 15 were false negative on screening (7.6%) and molecular analysis showed a high variability in the genotypes. The cystic fibrosis transmembrane regulator (CFTR) gene mutations identified were delF508, D1152H, R1066C, R334W, G542X, N1303K, F1052V, A120T, 3849+10kbC→T, 2789+5G→A, 5T-12TG and the novel mutation D110E. In three patients no mutation was identified after denaturing gradient gel electrophoresis of the majority of CFTR gene exons. The clinical phenotypes of CF children diagnosed by their symptoms at different ages were very mild. None of them presented with a severe lung disease. The majority of them did not seem to have been damaged by the delayed diagnosis. The combination of IRT assay plus genotype analysis (1998-1999) appears to be a more reliable method of detecting CF than IRT measurement alone or combined with only the delF508 mutation.
Evaluation of a preliminary physical function item bank supported the expected advantages of the Patient-Reported Outcomes Measurement Information System (PROMIS).

PubMed

Rose, M; Bjorner, J B; Becker, J; Fries, J F; Ware, J E

2008-01-01

The Patient-Reported Outcomes Measurement Information System (PROMIS) was initiated to improve precision, reduce respondent burden, and enhance the comparability of health outcomes measures. We used item response theory (IRT) to construct and evaluate a preliminary item bank for physical function assuming four subdomains. Data from seven samples (N=17,726) using 136 items from nine questionnaires were evaluated. A generalized partial credit model was used to estimate item parameters, which were normed to a mean of 50 (SD=10) in the US population. Item bank properties were evaluated through Computerized Adaptive Test (CAT) simulations. IRT requirements were fulfilled by 70 items covering activities of daily living, lower extremity, and central body functions. The original item context partly affected parameter stability. Items on upper body function, and need for aid or devices did not fit the IRT model. In simulations, a 10-item CAT eliminated floor and decreased ceiling effects, achieving a small standard error (< 2.2) across scores from 20 to 50 (reliability >0.95 for a representative US sample). This precision was not achieved over a similar range by any comparable fixed length item sets. The methods of the PROMIS project are likely to substantially improve measures of physical function and to increase the efficiency of their administration using CAT.
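The link between the reported standard error (< 2.2) and reliability (> 0.95) can be illustrated with the usual conversion on a metric normed to SD = 10. This is a sketch under an assumption the abstract does not spell out: that reliability is computed as 1 − SE²/SD², with SD² taken as the population variance of the score metric. The function names are hypothetical.

```python
import math


def reliability_from_se(se, sd=10.0):
    """Reliability implied by a standard error on a metric with the given SD.

    Assumes reliability = 1 - SE^2 / SD^2 (error variance over total variance).
    """
    return 1.0 - (se / sd) ** 2


def se_from_reliability(reliability, sd=10.0):
    """Inverse conversion: the standard error that yields a given reliability."""
    return sd * math.sqrt(1.0 - reliability)


print(round(reliability_from_se(2.2), 4))   # SE of 2.2 on the T-score metric
print(round(se_from_reliability(0.95), 3))  # SE needed for reliability 0.95
```

With these assumptions a standard error of 2.2 corresponds to a reliability of about 0.952, and a reliability of 0.95 corresponds to a standard error of about 2.24, consistent with the thresholds quoted in the abstract.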
A signal detection-item response theory model for evaluating neuropsychological measures.

PubMed

Thomas, Michael L; Brown, Gregory G; Gur, Ruben C; Moore, Tyler M; Patt, Virginie M; Risbrough, Victoria B; Baker, Dewleen G

2018-02-05

Models from signal detection theory are commonly used to score neuropsychological test data, especially tests of recognition memory. Here we show that certain item response theory models can be formulated as signal detection theory models, thus linking two complementary but distinct methodologies. We then use the approach to evaluate the validity (construct representation) of commonly used research measures, demonstrate the impact of conditional error on neuropsychological outcomes, and evaluate measurement bias. Signal detection-item response theory (SD-IRT) models were fitted to recognition memory data for words, faces, and objects. The sample consisted of U.S. Infantry Marines and Navy Corpsmen participating in the Marine Resiliency Study. Data comprised item responses to the Penn Face Memory Test (PFMT; N = 1,338), Penn Word Memory Test (PWMT; N = 1,331), and Visual Object Learning Test (VOLT; N = 1,249), and self-report of past head injury with loss of consciousness. SD-IRT models adequately fitted recognition memory item data across all modalities. Error varied systematically with ability estimates, and distributions of residuals from the regression of memory discrimination onto self-report of past head injury were positively skewed towards regions of larger measurement error. Analyses of differential item functioning revealed little evidence of systematic bias by level of education. SD-IRT models benefit from the measurement rigor of item response theory, which permits the modeling of item difficulty and examinee ability, and from signal detection theory, which provides an interpretive framework encompassing the experimentally validated constructs of memory discrimination and response bias. We used this approach to validate the construct representation of commonly used research measures and to demonstrate how nonoptimized item parameters can lead to erroneous conclusions when interpreting neuropsychological test data.
Future work might include the development of computerized adaptive tests and integration with mixture and random-effects models.
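The constructs of memory discrimination and response bias mentioned in this abstract are conventionally scored with the classical equal-variance Gaussian signal detection indices. The sketch below shows that classical scoring only. It is not the SD-IRT model fitted by the authors, and the function name and example rates are hypothetical.

```python
from statistics import NormalDist


def sdt_indices(hit_rate, false_alarm_rate):
    """Classical equal-variance SDT indices from a hit rate and a false-alarm rate.

    Returns (d_prime, criterion). Both rates must lie strictly between 0 and 1,
    because the inverse normal CDF is undefined at the endpoints.
    """
    z = NormalDist().inv_cdf
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa              # discrimination: separation of the two distributions
    criterion = -0.5 * (z_hit + z_fa)   # response bias: 0 means no bias toward "old" or "new"
    return d_prime, criterion


# Hypothetical examinee: 80% hits on studied items, 20% false alarms on new items.
d_prime, criterion = sdt_indices(0.80, 0.20)
print(round(d_prime, 3), round(criterion, 3))
```

In this scoring every item contributes equally, which is the simplification that an item-level model such as SD-IRT relaxes by allowing item difficulty to vary.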
Investigating the application of Rasch theory in measuring change in middle school student performance in physical science

NASA Astrophysics Data System (ADS)

Cunningham, Jessica D.

Newton's Universe (NU), an innovative teacher training program, strives to obtain measures from rural, middle school science teachers and their students to determine the impact of its distance learning course on understanding of temperature. No consensus exists on the most appropriate and useful method of analysis to measure change in psychological constructs over time. Several item response theory (IRT) models have been deemed useful in measuring change, which makes the choice of an IRT model not obvious. The appropriateness and utility of each model, including a comparison to a traditional analysis of variance approach, was investigated using middle school science student performance on an assessment over an instructional period. Predetermined criteria were outlined to guide model selection based on several factors including research questions, data properties, and meaningful interpretations to determine the most appropriate model for this study. All methods employed in this study reiterated one common interpretation of the data -- specifically, that the students of teachers with any NU course experience had significantly greater gains in performance over the instructional period. However, clear distinctions were made between an analysis of variance and the racked and stacked analysis using the Rasch model. Although limited research exists examining the usefulness of the Rasch model in measuring change in understanding over time, this study applied these methods and detailed plausible implications for data-driven decisions based upon results for NU and others. Being mindful of the advantages and usefulness of each method of analysis may help others make informed decisions about choosing an appropriate model to depict changes to evaluate other programs. Results may encourage other researchers to consider the meaningfulness of using IRT for this purpose. 
Results have implications for data-driven decisions for future professional development courses, in science education and other disciplines. KEYWORDS: Item Response Theory, Rasch Model, Racking and Stacking, Measuring Change in Student Performance, Newton's Universe teacher training
Further Empirical Results on Parametric Versus Non-Parametric IRT Modeling of Likert-Type Personality Data

ERIC Educational Resources Information Center

Maydeu-Olivares, Albert

2005-01-01

Chernyshenko, Stark, Chan, Drasgow, and Williams (2001) investigated the fit of Samejima's logistic graded model and Levine's non-parametric MFS model to the scales of two personality questionnaires and found that the graded model did not fit well. We attribute the poor fit of the graded model to small amounts of multidimensionality present in…

The Impact of Varied Discrimination Parameters on Mixed-Format Item Response Theory Model Selection

ERIC Educational Resources Information Center

Whittaker, Tiffany A.; Chang, Wanchen; Dodd, Barbara G.

2013-01-01

Whittaker, Chang, and Dodd compared the performance of model selection criteria when selecting among mixed-format IRT models and found that the criteria did not perform adequately when selecting the more parameterized models. It was suggested by M. S. Johnson that the problems when selecting the more parameterized models may be because of the low…
Evidence for a causal relationship between early exocrine pancreatic disease and cystic fibrosis-related diabetes: a Mendelian randomization study.

PubMed

Soave, David; Miller, Melissa R; Keenan, Katherine; Li, Weili; Gong, Jiafen; Ip, Wan; Accurso, Frank; Sun, Lei; Rommens, Johanna M; Sontag, Marci; Durie, Peter R; Strug, Lisa J

2014-06-01

Circulating immunoreactive trypsinogen (IRT), a biomarker of exocrine pancreatic disease in cystic fibrosis (CF), is elevated in most CF newborns. In those with severe CF transmembrane conductance regulator (CFTR) genotypes, IRT declines rapidly in the first years of life, reflecting progressive pancreatic damage. Consistent with this progression, a less elevated newborn IRT measure would reflect more severe pancreatic disease, including compromised islet compartments, and potentially increased risk of CF-related diabetes (CFRD). We show in two independent CF populations that a lower newborn IRT estimate is associated with higher CFRD risk among individuals with severe CFTR genotypes, and we provide evidence to support a causal relationship. Increased log_e(IRT) at birth was associated with decreased CFRD risk in Canadian and Colorado samples (hazard ratio 0.30 [95% CI 0.15-0.61] and 0.39 [0.18-0.81], respectively). Using Mendelian randomization with the SLC26A9 rs7512462 genotype as an instrumental variable since it is known to be associated with IRT birth levels in the CF population, we provide evidence to support a causal contribution of exocrine pancreatic status on CFRD risk. Our findings suggest CFRD risk could be predicted in early life and that maintained ductal fluid flow in the exocrine pancreas could delay the onset of CFRD.
Modeling Math Growth Trajectory--An Application of Conventional Growth Curve Model and Growth Mixture Model to ECLS K-5 Data

ERIC Educational Resources Information Center

Lu, Yi

2016-01-01

To model students' math growth trajectory, three conventional growth curve models and three growth mixture models are applied to the Early Childhood Longitudinal Study Kindergarten-Fifth grade (ECLS K-5) dataset in this study. The results of conventional growth curve model show gender differences on math IRT scores. When holding socio-economic…
Considerations for the independent reaction times and step-by-step methods for radiation chemistry simulations

NASA Astrophysics Data System (ADS)

Plante, Ianik; Devroye, Luc

2017-10-01

Ionizing radiation interacts with the water molecules of the tissues mostly by ionizations and excitations, which result in the formation of the radiation track structure and the creation of radiolytic species such as H•, •OH, H2, H2O2, and e-aq. After their creation, these species diffuse and may chemically react with the neighboring species and with the molecules of the medium. Therefore radiation chemistry is of great importance in radiation biology. As the chemical species are not distributed homogeneously, the use of conventional models of homogeneous reactions cannot completely describe the reaction kinetics of the particles. Actually, many simulations of radiation chemistry are done using the Independent Reaction Time (IRT) method, which is a very fast technique to calculate radiochemical yields but which does not calculate the positions of the radiolytic species as a function of time. Step-by-step (SBS) methods, which are able to provide such information, have been used only sparsely because they are computationally time-consuming. Recent improvements in computer performance now allow the regular use of the SBS method in radiation chemistry. The SBS and IRT methods are both based on the Green's functions of the diffusion equation (GFDE). In this paper, several sampling algorithms of the GFDE and for the IRT method are presented. We show that the IRT and SBS methods are exactly equivalent for two-particle systems for diffusion and partially diffusion-controlled reactions between non-interacting particles. We also show that the results obtained with the SBS simulation method with periodic boundary conditions are in agreement with the predictions by classical reaction kinetics theory, which is an important step towards using this method for modelling of biochemical networks and metabolic pathways involved in oxidative stress. Finally, the first simulation results obtained with the code RITRACKS (Relativistic Ion Tracks) are presented.
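To give a sense of the pair quantities that the IRT method works with, the sketch below evaluates the textbook reaction probability for a single pair of non-interacting particles undergoing a totally diffusion-controlled reaction, W(r0, t) = (R / r0) erfc((r0 − R) / sqrt(4 D t)). This is an illustration rather than the sampling algorithms of the paper: the function name, parameter names, and numerical values are hypothetical, the partially diffusion-controlled case is not covered, and units are arbitrary but must be consistent.

```python
import math


def reaction_probability(r0, t, reaction_radius, diffusion_coeff):
    """Probability that a pair starting at separation r0 has reacted by time t.

    Assumes a totally diffusion-controlled reaction between two non-interacting
    particles, with `diffusion_coeff` the sum of their diffusion coefficients.
    """
    if r0 <= reaction_radius:
        return 1.0  # already in contact, reaction is immediate
    if t <= 0.0:
        return 0.0
    argument = (r0 - reaction_radius) / math.sqrt(4.0 * diffusion_coeff * t)
    return (reaction_radius / r0) * math.erfc(argument)


# Hypothetical pair: initial separation 2, reaction radius 1, D = 1.
for t in (0.1, 1.0, 10.0, 1e6):
    print(t, round(reaction_probability(2.0, t, 1.0, 1.0), 4))
```

The probability rises with time toward the limit R / r0, which is less than 1 because in three dimensions a diffusing pair can escape without ever meeting. An IRT-style simulation would draw a reaction time for each pair from this distribution instead of following positions step by step.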
Method variation in the impact of missing data on response shift detection.

PubMed

Schwartz, Carolyn E; Sajobi, Tolulope T; Verdam, Mathilde G E; Sebille, Veronique; Lix, Lisa M; Guilleux, Alice; Sprangers, Mirjam A G

2015-03-01

Missing data due to attrition or item non-response can result in biased estimates and loss of power in longitudinal quality-of-life (QOL) research. The impact of missing data on response shift (RS) detection is relatively unknown. This overview article synthesizes the findings of three methods tested in this special section regarding the impact of missing data patterns on RS detection in incomplete longitudinal data. The RS detection methods investigated include: (1) Relative importance analysis to detect reprioritization RS in stroke caregivers; (2) Oort's structural equation modeling (SEM) to detect recalibration, reprioritization, and reconceptualization RS in cancer patients; and (3) Rasch-based item response theory (IRT) models as compared to SEM models to detect recalibration and reprioritization RS in hospitalized chronic disease patients. Each method dealt with missing data differently, either with imputation (1), attrition-based multi-group analysis (2), or probabilistic analysis that is robust to missingness due to the specific objectivity property (3). Relative importance analyses were sensitive to the type and amount of missing data and imputation method, with multiple imputation showing the largest RS effects. The attrition-based multi-group SEM revealed differential effects of both the changes in health-related QOL and the occurrence of response shift by attrition stratum, and enabled a more complete interpretation of findings. The IRT RS algorithm found evidence of small recalibration and reprioritization effects in General Health, whereas SEM mostly evidenced small recalibration effects. These differences may be due to differences between the two methods in handling of missing data.
Missing data imputation techniques result in different conclusions about the presence of reprioritization RS using the relative importance method, while the attrition-based SEM approach highlighted different recalibration and reprioritization RS effects by attrition group. The IRT analyses detected more recalibration and reprioritization RS effects than SEM, presumably due to IRT's robustness to missing data. Future research should apply simulation techniques in order to make conclusive statements about the impacts of missing data according to the type and amount of RS.
Assessing Peer Victimization across Adolescence: Measurement Invariance and Developmental Change

ERIC Educational Resources Information Center

Rosen, Lisa H.; Beron, Kurt J.; Underwood, Marion K.

2013-01-01

An upward extension of the Revised Social Experience Questionnaire (Paquette & Underwood, 1999) was tested in a sample of adolescents followed longitudinally from 7th through 10th grade. We hypothesized that a 2-factor model with overt and social victimization factors would fit the data better than would a unidimensional model (a single…
Model-Based Collaborative Filtering Analysis of Student Response Data: Machine-Learning Item Response Theory

ERIC Educational Resources Information Center

Bergner, Yoav; Droschler, Stefan; Kortemeyer, Gerd; Rayyan, Saif; Seaton, Daniel; Pritchard, David E.

2012-01-01

We apply collaborative filtering (CF) to dichotomously scored student response data (right, wrong, or no interaction), finding optimal parameters for each student and item based on cross-validated prediction accuracy. The approach is naturally suited to comparing different models, both unidimensional and multidimensional in ability, including a…
Fitting the Mixed Rasch Model to a Reading Comprehension Test: Identifying Reader Types

ERIC Educational Resources Information Center

Baghaei, Purya; Carstensen, Claus H.

2013-01-01

Standard unidimensional Rasch models assume that persons with the same ability parameters are comparable. That is, the same interpretation applies to persons with identical ability estimates as regards the underlying mental processes triggered by the test. However, research in cognitive psychology shows that persons at the same trait level may…
A Monte Carlo Approach to Unidimensionality Testing in Polytomous Rasch Models

ERIC Educational Resources Information Center

Christensen, Karl Bang; Kreiner, Svend

2007-01-01

Many statistical tests are designed to test the different assumptions of the Rasch model, but only few are directed at detecting multidimensionality. The Martin-Lof test is an attractive approach, the disadvantage being that its null distribution deviates strongly from the asymptotic chi-square distribution for most realistic sample sizes. A Monte…
An Investigation of Sample Size Splitting on ATFIND and DIMTEST

ERIC Educational Resources Information Center

Socha, Alan; DeMars, Christine E.

2013-01-01

Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
Introducing Multidimensional Item Response Modeling in Health Behavior and Health Education Research

ERIC Educational Resources Information Center

Allen, Diane D.; Wilson, Mark

2006-01-01

When measuring participant-reported attitudes and outcomes in the behavioral sciences, there are many instances when the common measurement assumption of unidimensionality does not hold. In these cases, the application of a multidimensional measurement model is both technically appropriate and potentially advantageous in substance. In this paper,…
Evaluation of Weighted Scale Reliability and Criterion Validity: A Latent Variable Modeling Approach

ERIC Educational Resources Information Center

Raykov, Tenko

2007-01-01

A method is outlined for evaluating the reliability and criterion validity of weighted scales based on sets of unidimensional measures. The approach is developed within the framework of latent variable modeling methodology and is useful for point and interval estimation of these measurement quality coefficients in counseling and education…
A Design for the Evaluation of Management Information Systems.

ERIC Educational Resources Information Center

Spuck, Dennis W.; Bozeman, William C.

1980-01-01

This paper has presented a model for the evaluation of management information systems. The three levels of information considered were actual, perceptual, and attitudinal. The dimensions of evaluation discussed were function, utilization, and effects. (Author/IRT)
Multimodal Education: A Model with Promise.

ERIC Educational Resources Information Center

Gerler, Edwin R., Jr.; Locke, Don C.

1980-01-01

Describes a program that uses Lazarus's factors that contribute to human growth and development as the basis for its program. The modalities covered are given the headings behavior, affect, sensation and imagery, cognition, interpersonal, and diet/physiology. (IRT)
Measuring organizational effectiveness in information and communication technology companies using item response theory.

PubMed

Trierweiller, Andréa Cristina; Peixe, Blênio César Severo; Tezza, Rafael; Pereira, Vera Lúcia Duarte do Valle; Pacheco, Waldemar; Bornia, Antonio Cezar; de Andrade, Dalton Francisco

2012-01-01

The aim of this paper is to measure the effectiveness of Information and Communication Technology (ICT) organizations from the point of view of the manager, using Item Response Theory (IRT). There is a need to verify the effectiveness of these organizations, which are normally associated with complex, dynamic, and competitive environments. In the academic literature, there is disagreement surrounding the concept of organizational effectiveness and its measurement. A construct was elaborated based on dimensions of effectiveness to guide the construction of the questionnaire items, which were submitted to specialists for evaluation. The approach proved viable for measuring the organizational effectiveness of ICT companies from the point of view of a manager using the Two-Parameter Logistic Model (2PLM) of IRT. This modeling permits us to evaluate the quality and properties of each item and to place items and respondents on a single scale, which is not possible when using other similar tools.
Intrinsic membrane properties of pre-oromotor neurons in the intermediate zone of the medullary reticular formation.

PubMed

Venugopal, S; Boulant, J A; Chen, Z; Travers, J B

2010-06-16

Neurons in the lower brainstem that control consummatory behavior are widely distributed in the reticular formation (RF) of the pons and medulla. The intrinsic membrane properties of neurons within this distributed system shape complex excitatory and inhibitory inputs from both orosensory and central structures implicated in homeostatic control to produce coordinated oromotor patterns. The current study explored the intrinsic membrane properties of neurons in the intermediate subdivision of the medullary reticular formation (IRt). Neurons in the IRt receive input from the overlying (gustatory) nucleus of the solitary tract and project to the oromotor nuclei. Recent behavioral pharmacology studies as well as computational modeling suggest that inhibition in the IRt plays an important role in the transition from a taste-initiated oromotor pattern of ingestion to one of rejection. The present study explored the impact of hyperpolarization on membrane properties. In response to depolarization, neurons responded with either a tonic discharge, an irregular/burst pattern or were spike-adaptive. A hyperpolarizing pre-pulse modulated the excitability of most (82%) IRt neurons to subsequent depolarization. Instances of both increased (30%) and decreased (52%) excitability were observed. Currents induced by the hyperpolarization included an outward 4-aminopyridine (4-AP) sensitive K+ current that suppressed excitability and an inward cation current that increased excitability. These currents are also present in other subpopulations of RF neurons that influence the oromotor nuclei and we discuss how these currents could alter firing characteristics to impact pattern generation.
Measuring Constructs in Family Science: How Can Item Response Theory Improve Precision and Validity?

PubMed Central

Gordon, Rachel A.

2014-01-01

This article provides family scientists with an understanding of contemporary measurement perspectives and the ways in which item response theory (IRT) can be used to develop measures with desired evidence of precision and validity for research uses. The article offers a nontechnical introduction to some key features of IRT, including its orientation toward locating items along an underlying dimension and toward estimating precision of measurement for persons with different levels of that same construct. It also offers a didactic example of how the approach can be used to refine conceptualization and operationalization of constructs in the family sciences, using data from the National Longitudinal Survey of Youth 1979 (n = 2,732). Three basic models are considered: (a) the Rasch and (b) two-parameter logistic models for dichotomous items and (c) the Rating Scale Model for multicategory items. Throughout, the author highlights the potential for researchers to elevate measurement to a level on par with theorizing and testing about relationships among constructs. PMID:25663714
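The difference between the Rasch and two-parameter logistic models named in this abstract can be sketched with their item response functions for dichotomous items. The sketch assumes the common logistic parameterization without a scaling constant. The function names and parameter values are hypothetical and are not taken from the article.

```python
import math


def p_2pl(theta, difficulty, discrimination=1.0):
    """Probability of endorsing an item under the two-parameter logistic model.

    theta          : person's level on the underlying dimension
    difficulty     : item location on the same dimension
    discrimination : slope, i.e. how sharply the item separates low from high theta
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def p_rasch(theta, difficulty):
    """Rasch model: the 2PL with every item's discrimination fixed at 1."""
    return p_2pl(theta, difficulty, discrimination=1.0)


# A person located exactly at the item's difficulty has a 50% endorsement probability.
print(p_rasch(0.0, 0.0))
# One unit above the item location: a more discriminating item is endorsed more surely.
print(round(p_rasch(1.0, 0.0), 3), round(p_2pl(1.0, 0.0, discrimination=2.0), 3))
```

Because persons and items share the same scale, item locations show which levels of the construct a measure covers well, which is the orientation toward precision that the abstract describes.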
Stepwise Analysis of Differential Item Functioning Based on Multiple-Group Partial Credit Model.

ERIC Educational Resources Information Center

Muraki, Eiji

1999-01-01

Extended an Item Response Theory (IRT) method for detection of differential item functioning to the partial credit model and applied the method to simulated data using a stepwise procedure. Then applied the stepwise DIF analysis based on the multiple-group partial credit model to writing trend data from the National Assessment of Educational…
A Multidimensional Partial Credit Model with Associated Item and Test Statistics: An Application to Mixed-Format Tests

ERIC Educational Resources Information Center

Yao, Lihua; Schwarz, Richard D.

2006-01-01

Multidimensional item response theory (IRT) models have been proposed for better understanding the dimensional structure of data or to define diagnostic profiles of student learning. A compensatory multidimensional two-parameter partial credit model (M-2PPC) for constructed-response items is presented that is a generalization of those proposed to…
Robustness of Value-Added Analysis of School Effectiveness. Research Report. ETS RR-08-22

ERIC Educational Resources Information Center

Braun, Henry; Qu, Yanxuan

2008-01-01

This paper reports on a study conducted to investigate the consistency of the results between 2 approaches to estimating school effectiveness through value-added modeling. Estimates of school effects from the layered model employing item response theory (IRT) scaled data are compared to estimates derived from a discrete growth model based on the…

Modeling Nonignorable Missing Data in Speeded Tests

ERIC Educational Resources Information Center

Glas, Cees A. W.; Pimentel, Jonald L.

2008-01-01

In tests with time limits, items at the end are often not reached. Usually, the pattern of missing responses depends on the ability level of the respondents; therefore, missing data are not ignorable in statistical inference. This study models data using a combination of two item response theory (IRT) models: one for the observed response data and…
IRT Models for Ability-Based Guessing

ERIC Educational Resources Information Center

Martin, Ernesto San; del Pino, Guido; De Boeck, Paul

2006-01-01

An ability-based guessing model is formulated and applied to several data sets regarding educational tests in language and in mathematics. The formulation of the model is such that the probability of a correct guess does not only depend on the item but also on the ability of the individual, weighted with a general discrimination parameter. By so…
Model Choice and Sample Size in Item Response Theory Analysis of Aphasia Tests

ERIC Educational Resources Information Center

Hula, William D.; Fergadiotis, Gerasimos; Martin, Nadine

2012-01-01

Purpose: The purpose of this study was to identify the most appropriate item response theory (IRT) measurement model for aphasia tests requiring 2-choice responses and to determine whether small samples are adequate for estimating such models. Method: Pyramids and Palm Trees (Howard & Patterson, 1992) test data that had been collected from…
Confirming the Multidimensionality of Psychologically Controlling Parenting among Chinese-American Mothers: Love Withdrawal, Guilt Induction, and Shaming.

PubMed

Cheah, Charissa; Yu, Jing; Hart, Craig; Sun, Shuyan; Olsen, Joseph

2015-05-01

Despite the theoretical conceptualization of parental psychological control as a multidimensional construct, the majority of previous studies have examined psychological control as a unidimensional scale. Moreover, the conceptualization of shaming and its associations with love withdrawal and guilt induction are unclear. The current study aimed to fill these gaps by evaluating the latent factor structure underlying 18 items from Olsen et al. (2002) that were conceptually relevant to love withdrawal, guilt induction, and shaming practices in a sample of 169 mothers of Chinese-American preschoolers. A multidimensional three-factor model and bi-factor model were specified based on our formulated operational definitions for the three dimensions of psychological control. Both models were found to be superior to the unidimensional model. In addition, results from the bi-factor model and an additional second-order factor model indicated that psychological control is essentially empirically isomorphic with guilt induction. Although love withdrawal and shaming factors were also fairly strong indicators of psychological control, each exhibited important additional unique variability and mutual distinctiveness. Implications for the conceptualization of love withdrawal, guilt induction, and shaming as well as directions for future studies are discussed.
Confirming the Multidimensionality of Psychologically Controlling Parenting among Chinese-American Mothers: Love Withdrawal, Guilt Induction, and Shaming

PubMed Central

Cheah, Charissa; Yu, Jing; Hart, Craig; Sun, Shuyan; Olsen, Joseph

2014-01-01

Despite the theoretical conceptualization of parental psychological control as a multidimensional construct, the majority of previous studies have examined psychological control as a unidimensional scale. Moreover, the conceptualization of shaming and its associations with love withdrawal and guilt induction are unclear. The current study aimed to fill these gaps by evaluating the latent factor structure underlying 18 items from Olsen et al. (2002) that were conceptually relevant to love withdrawal, guilt induction, and shaming practices in a sample of 169 mothers of Chinese-American preschoolers. A multidimensional three-factor model and bi-factor model were specified based on our formulated operational definitions for the three dimensions of psychological control. Both models were found to be superior to the unidimensional model. In addition, results from the bi-factor model and an additional second-order factor model indicated that psychological control is essentially empirically isomorphic with guilt induction. Although love withdrawal and shaming factors were also fairly strong indicators of psychological control, each exhibited important additional unique variability and mutual distinctiveness. Implications for the conceptualization of love withdrawal, guilt induction, and shaming as well as directions for future studies are discussed. PMID:26052168
Precise determination of the heat delivery during in vivo magnetic nanoparticle hyperthermia with infrared thermography

NASA Astrophysics Data System (ADS)

Rodrigues, Harley F.; Capistrano, Gustavo; Mello, Francyelli M.; Zufelato, Nicholas; Silveira-Lacerda, Elisângela; Bakuzis, Andris F.

2017-05-01

Non-invasive and real-time monitoring of the heat delivery during magnetic nanoparticle hyperthermia (MNH) is of fundamental importance to predict clinical outcomes for cancer treatment. Infrared thermography (IRT) can determine the surface temperature due to three-dimensional heat delivery inside a subcutaneous tumor, an argument that is supported by numerical simulations. However, for precise temperature determination, it is of crucial relevance to use a correct experimental configuration. This work reports an MNH study using a sarcoma 180 murine tumor containing 3.9 mg of intratumorally injected manganese-ferrite nanoparticles. MNH was performed at low field amplitude and non-uniform field configuration. Five 30 min in vivo magnetic hyperthermia experiments were performed, monitoring the surface temperature with a fiber optical sensor and thermal camera at distinct angles with respect to the animal’s surface. The results indicate that temperature errors as large as 7 °C can occur if the experiment is not properly designed. A new IRT error model is found to explain the data. More importantly, we show how to precisely monitor temperature with IRT during hyperthermia, which could positively impact heat dosimetry and clinical planning.
Precise determination of the heat delivery during in vivo magnetic nanoparticle hyperthermia with infrared thermography.

PubMed

Rodrigues, Harley F; Capistrano, Gustavo; Mello, Francyelli M; Zufelato, Nicholas; Silveira-Lacerda, Elisângela; Bakuzis, Andris F

2017-05-21

Non-invasive and real-time monitoring of the heat delivery during magnetic nanoparticle hyperthermia (MNH) is of fundamental importance to predict clinical outcomes for cancer treatment. Infrared thermography (IRT) can determine the surface temperature due to three-dimensional heat delivery inside a subcutaneous tumor, an argument that is supported by numerical simulations. However, for precise temperature determination, it is of crucial relevance to use a correct experimental configuration. This work reports an MNH study using a sarcoma 180 murine tumor containing 3.9 mg of intratumorally injected manganese-ferrite nanoparticles. MNH was performed at low field amplitude and non-uniform field configuration. Five 30 min in vivo magnetic hyperthermia experiments were performed, monitoring the surface temperature with a fiber optical sensor and thermal camera at distinct angles with respect to the animal's surface. The results indicate that temperature errors as large as 7 °C can occur if the experiment is not properly designed. A new IRT error model is found to explain the data. More importantly, we show how to precisely monitor temperature with IRT during hyperthermia, which could positively impact heat dosimetry and clinical planning.
A Psychometric Analysis of the Italian Version of the eHealth Literacy Scale Using Item Response and Classical Test Theory Methods

PubMed Central

Dima, Alexandra Lelia; Schulz, Peter Johannes

2017-01-01

Background: The eHealth Literacy Scale (eHEALS) is a tool to assess consumers’ comfort and skills in using information technologies for health. Although evidence exists of reliability and construct validity of the scale, less agreement exists on structural validity. Objective: The aim of this study was to validate the Italian version of the eHealth Literacy Scale (I-eHEALS) in a community sample with a focus on its structural validity, by applying psychometric techniques that account for item difficulty. Methods: Two Web-based surveys were conducted among a total of 296 people living in the Italian-speaking region of Switzerland (Ticino). After examining the latent variables underlying the observed variables of the Italian scale via principal component analysis (PCA), fit indices for two alternative models were calculated using confirmatory factor analysis (CFA). The scale structure was examined via parametric and nonparametric item response theory (IRT) analyses accounting for differences between items regarding the proportion of answers indicating high ability. Convergent validity was assessed by correlations with theoretically related constructs. Results: CFA showed a suboptimal model fit for both models. IRT analyses confirmed all items measure a single dimension as intended. Reliability and construct validity of the final scale were also confirmed. The contrasting results of factor analysis (FA) and IRT analyses highlight the importance of considering differences in item difficulty when examining health literacy scales. Conclusions: The findings support the reliability and validity of the translated scale and its use for assessing Italian-speaking consumers’ eHealth literacy. PMID:28400356
Icing Simulation Research Supporting the Ice-Accretion Testing of Large-Scale Swept-Wing Models

NASA Technical Reports Server (NTRS)

Yadlin, Yoram; Monnig, Jaime T.; Malone, Adam M.; Paul, Bernard P.

2018-01-01

The work summarized in this report is a continuation of NASA's Large-Scale, Swept-Wing Test Articles Fabrication; Research and Test Support for NASA IRT contract (NNC10BA05 -NNC14TA36T) performed by Boeing under the NASA Research and Technology for Aerospace Propulsion Systems (RTAPS) contract. In the study conducted under RTAPS, a series of icing tests in the Icing Research Tunnel (IRT) have been conducted to characterize ice formations on large-scale swept wings representative of modern commercial transport airplanes. The outcome of that campaign was a large database of ice-accretion geometries that can be used for subsequent aerodynamic evaluation in other experimental facilities and for validation of ice-accretion prediction codes.
Cognitive Psychology Meets Psychometric Theory: On the Relation between Process Models for Decision Making and Latent Variable Models for Individual Differences

ERIC Educational Resources Information Center

van der Maas, Han L. J.; Molenaar, Dylan; Maris, Gunter; Kievit, Rogier A.; Borsboom, Denny

2011-01-01

This article analyzes latent variable models from a cognitive psychology perspective. We start by discussing work by Tuerlinckx and De Boeck (2005), who proved that a diffusion model for 2-choice response processes entails a 2-parameter logistic item response theory (IRT) model for individual differences in the response data. Following this line…
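The link between a two-boundary diffusion process and the 2-parameter logistic (2PL) model mentioned above can be checked numerically. The following is a minimal sketch, not the authors' derivation: it assumes a Wiener process with unit variance, boundaries at 0 and `separation`, and an unbiased starting point at the midpoint. Reading drift as person ability minus item difficulty and boundary separation as item discrimination is an assumption made here for illustration, and all function names are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """2PL item response function: P(correct) given ability theta,
    discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def p_upper_boundary(drift, separation):
    """Probability that a unit-variance Wiener process with the given drift,
    started midway between boundaries 0 and `separation`, hits the upper
    boundary first (standard first-passage result)."""
    if drift == 0.0:
        return 0.5  # no drift: both boundaries are equally likely
    start = separation / 2.0
    num = 1.0 - math.exp(-2.0 * drift * start)
    den = 1.0 - math.exp(-2.0 * drift * separation)
    return num / den


# Illustrative values (assumed): ability 0.9, difficulty 0.2, discrimination 1.3.
theta, b, a = 0.9, 0.2, 1.3
diffusion_p = p_upper_boundary(drift=theta - b, separation=a)
logistic_p = p_2pl(theta, a, b)
```

Under these assumptions the two probabilities coincide, which is the sense in which the choice probabilities of the process model take a 2PL form.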
A Nondeterministic Resource Planning Model in Education

ERIC Educational Resources Information Center

Yoda, Koji

1977-01-01

Discusses a simple technique for stochastic resource planning that, when computerized, can assist educational managers in the process of quantifying the future uncertainty, thereby, helping them make better decisions. The example used is a school lunch program. (Author/IRT)
The Comparative Performance of Conditional Independence Indices

ERIC Educational Resources Information Center

Kim, Doyoung; De Ayala, R. J.; Ferdous, Abdullah A.; Nering, Michael L.

2011-01-01

To realize the benefits of item response theory (IRT), one must have model-data fit. One facet of a model-data fit investigation involves assessing the tenability of the conditional item independence (CII) assumption. In this Monte Carlo study, the comparative performance of 10 indices for identifying conditional item dependence is assessed. The…
A Multilevel Testlet Model for Dual Local Dependence

ERIC Educational Resources Information Center

Jiao, Hong; Kamata, Akihito; Wang, Shudong; Jin, Ying

2012-01-01

The applications of item response theory (IRT) models assume local item independence and that examinees are independent of each other. When a representative sample for psychometric analysis is selected using a cluster sampling method in a testlet-based assessment, both local item dependence and local person dependence are likely to be induced.…
An NCME Instructional Module on Polytomous Item Response Theory Models

ERIC Educational Resources Information Center

Penfield, Randall David

2014-01-01

A polytomous item is one for which the responses are scored according to three or more categories. Given the increasing use of polytomous items in assessment practices, item response theory (IRT) models specialized for polytomous items are becoming increasingly common. The purpose of this ITEMS module is to provide an accessible overview of…
A Test of the Need Hierarchy Concept by a Markov Model of Change in Need Strength.

ERIC Educational Resources Information Center

Rauschenberger, John; And Others

1980-01-01

In this study of 547 high school graduates, Alderfer's and Maslow's need hierarchy theories were expressed in Markov chain form and were subjected to empirical test. Both models were disconfirmed. Corroborative multiwave correlational analysis also failed to support the need hierarchy concept. (Author/IRT)
Firestar-"D": Computerized Adaptive Testing Simulation Program for Dichotomous Item Response Theory Models

ERIC Educational Resources Information Center

Choi, Seung W.; Podrabsky, Tracy; McKinney, Natalie

2012-01-01

Computerized adaptive testing (CAT) enables efficient and flexible measurement of latent constructs. The majority of educational and cognitive measurement constructs are based on dichotomous item response theory (IRT) models. An integral part of developing various components of a CAT system is conducting simulations using both known and empirical…
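One common ingredient of a CAT simulation for dichotomous IRT models is choosing the next item by maximum Fisher information at the current ability estimate. The sketch below illustrates that idea only; it is not the Firestar-"D" program, and the item pool, parameter values, and function names are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def select_next_item(theta, pool, administered):
    """Return the index of the not-yet-administered item with the largest
    information at the current ability estimate."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *pool[i]))


# Hypothetical pool of (discrimination, difficulty) pairs.
pool = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.2)]
first = select_next_item(0.0, pool, administered=set())
second = select_next_item(0.0, pool, administered={first})
```

A full simulation would also update the ability estimate after each response and apply a stopping rule; both are omitted here.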
Instability resistance training across the exercise continuum.

PubMed

Behm, David G; Colado, Juan C

2013-11-01

Instability resistance training (IRT; unstable surfaces and devices to strengthen the core or trunk muscles) is popular in fitness training facilities. This review examines contradictory IRT recommendations for health enthusiasts and rehabilitation. A literature search was performed using MEDLINE, SPORTDiscus, ScienceDirect, Web of Science, and Google Scholar databases from 1990 to 2012. Databases were searched using key terms, including "balance," "stability," "instability," "resistance training," "core," "trunk," and "functional performance." Additionally, relevant articles were extracted from reference lists. To be included, studies had to address the effect of balance or IRT on performance, involve healthy and active participants, report physiologic or performance outcome measures, and be published in English in a peer-reviewed journal. There is a dichotomy of opinions on the effectiveness and application of instability devices and conditions for health and performance training. Balance training without resistance has been shown to improve not only balance but functional performance as well. IRT studies document training adaptations similar to those of stable resistance training programs in recreationally active individuals. Similar progressions with lower resistance may improve balance and stability, increase core activation, and improve motor control. IRT is highly recommended for youth, elderly, recreationally active individuals, and highly trained enthusiasts.
Impact of Violation of the Missing-at-Random Assumption on Full-Information Maximum Likelihood Method in Multidimensional Adaptive Testing

ERIC Educational Resources Information Center

Han, Kyung T.; Guo, Fanmin

2014-01-01

The full-information maximum likelihood (FIML) method makes it possible to estimate and analyze structural equation models (SEM) even when data are partially missing, enabling incomplete data to contribute to model estimation. The cornerstone of FIML is the missing-at-random (MAR) assumption. In (unidimensional) computerized adaptive testing…
Examination of Different Item Response Theory Models on Tests Composed of Testlets

ERIC Educational Resources Information Center

Kogar, Esin Yilmaz; Kelecioglu, Hülya

2017-01-01

The purpose of this research is to first estimate the item and ability parameters and the standard error values related to those parameters obtained from Unidimensional Item Response Theory (UIRT), bifactor (BIF) and Testlet Response Theory models (TRT) in the tests including testlets, when the number of testlets, number of independent items, and…
A Model-Free Diagnostic for Single-Peakedness of Item Responses Using Ordered Conditional Means

ERIC Educational Resources Information Center

Polak, Marike; De Rooij, Mark; Heiser, Willem J.

2012-01-01

In this article we propose a model-free diagnostic for single-peakedness (unimodality) of item responses. Presuming a unidimensional unfolding scale and a given item ordering, we approximate item response functions of all items based on ordered conditional means (OCM). The proposed OCM methodology is based on Thurstone & Chave's (1929) "criterion…

Bully-Victimization Scale: Using Rasch Modeling in the Analysis of a Qualitative Scale

ERIC Educational Resources Information Center

Lehto, Marybeth

2009-01-01

The primary purpose of this study was to determine whether the data from the qualitative study fit Rasch model requirements for the definition of a measure, as well as to address concern in the extant literature regarding the appropriate number of items needed in analysis to assure unidimensionality. The self-report victimization scale was…
Personality Correlates of Aggression: Evidence from Measures of the Five-Factor Model, UPPS Model of Impulsivity, and BIS/BAS

ERIC Educational Resources Information Center

Miller, Joshua D.; Zeichner, Amos; Wilson, Lauren F.

2012-01-01

Although many studies of personality and aggression focus on multidimensional traits and higher order personality disorders (e.g., psychopathy), lower order, unidimensional traits may provide more precision in identifying specific aspects of personality that relate to aggression. The current study includes a comprehensive measurement of lower…
Interval Estimation of Revision Effect on Scale Reliability via Covariance Structure Modeling

ERIC Educational Resources Information Center

Raykov, Tenko

2009-01-01

A didactic discussion of a procedure for interval estimation of change in scale reliability due to revision is provided, which is developed within the framework of covariance structure modeling. The method yields ranges of plausible values for the population gain or loss in reliability of unidimensional composites, which results from deletion or…
The Process of Horizontal Differentiation: Two Models.

ERIC Educational Resources Information Center

Daft, Richard L.; Bradshaw, Patricia J.

1980-01-01

Explores the process of horizontal differentiation by examining events leading to the establishment of 30 new departments in five universities. Two types of horizontal differentiation processes--administrative and academic--were observed and each was associated with different organizational conditions. (Author/IRT)
Confirmatory factor analysis of the Oral Health Impact Profile.

PubMed

John, M T; Feuerstahler, L; Waller, N; Baba, K; Larsson, P; Celebić, A; Kende, D; Rener-Sitar, K; Reissmann, D R

2014-09-01

Previous exploratory analyses suggest that the Oral Health Impact Profile (OHIP) consists of four correlated dimensions and that individual differences in OHIP total scores reflect an underlying higher-order factor. The aim of this report is to corroborate these findings in the Dimensions of Oral Health-Related Quality of Life (DOQ) Project, an international study of general population subjects and prosthodontic patients. Using the project's Validation Sample (n = 5022), we conducted confirmatory factor analyses in a sample of 4993 subjects with sufficiently complete data. In particular, we compared the psychometric performance of three models: a unidimensional model, a four-factor model and a bifactor model that included one general factor and four group factors. Using model-fit criteria and factor interpretability as guides, the four-factor model was deemed best in terms of strong item loadings, model fit (RMSEA = 0·05, CFI = 0·99) and interpretability. These results corroborate our previous findings that four highly correlated factors - which we have named Oral Function, Oro-facial Pain, Oro-facial Appearance and Psychosocial Impact - can be reliably extracted from the OHIP item pool. However, the good fit of the unidimensional model and the high interfactor correlations in the four-factor solution suggest that OHRQoL can also be sufficiently described with one score. © 2014 John Wiley & Sons Ltd.
Fit of Item Response Theory Models: A Survey of Data from Several Operational Tests. Research Report. ETS RR-11-29

ERIC Educational Resources Information Center

Sinharay, Sandip; Haberman, Shelby J.; Jia, Helena

2011-01-01

Standard 3.9 of the "Standards for Educational and Psychological Testing" (American Educational Research Association, American Psychological Association, & National Council for Measurement in Education, 1999) demands evidence of model fit when an item response theory (IRT) model is used to make inferences from a data set. We applied two recently…
A Comparison of Item Exposure Control Procedures with the Generalized Partial Credit Model

ERIC Educational Resources Information Center

Sanchez, Edgar Isaac

2008-01-01

To enhance test security of high stakes tests, it is vital to understand the way various exposure control strategies function under various IRT models. To that end the present dissertation focused on the performance of several exposure control strategies under the generalized partial credit model with an item pool of 100 and 200 items. These…
A Multidimensional Ideal Point Item Response Theory Model for Binary Data

ERIC Educational Resources Information Center

Maydeu-Olivares, Albert; Hernandez, Adolfo; McDonald, Roderick P.

2006-01-01

We introduce a multidimensional item response theory (IRT) model for binary data based on a proximity response mechanism. Under the model, a respondent at the mode of the item response function (IRF) endorses the item with probability one. The mode of the IRF is the ideal point, or in the multidimensional case, an ideal hyperplane. The model…
An Assessment of the Nonparametric Approach for Evaluating the Fit of Item Response Models

ERIC Educational Resources Information Center

Liang, Tie; Wells, Craig S.; Hambleton, Ronald K.

2014-01-01

As item response theory has been more widely applied, investigating the fit of a parametric model becomes an important part of the measurement process. There is a lack of promising solutions to the detection of model misfit in IRT. Douglas and Cohen introduced a general nonparametric approach, RISE (Root Integrated Squared Error), for detecting…
Intensive remote monitoring versus conventional care in type 1 diabetes: A randomized controlled trial.

PubMed

Gandrud, Laura; Altan, Aylin; Buzinec, Paul; Hemphill, Jesse; Chatterton, Jayne; Kelley, Tina; Vojta, Deneen

2018-02-21

While frequent contact with diabetes care providers may improve glycemic control among patients with type 1 diabetes (T1D), in-person visits are labor-intensive and costly. This study was conducted to assess the impact of an intensive remote therapy (IRT) intervention for pediatric patients with T1D. Pediatric patients with T1D were randomized to IRT or conventional care (CC) for 6 months. Both cohorts continued routine quarterly clinic visits and uploaded device data; for the IRT cohort, data were reviewed and patients were contacted if regimen adjustments were indicated. Glycated hemoglobin (HbA1c) change from baseline was assessed at 6 and 9 months. Diabetes-related quality of life (QoL), healthcare services utilization, and hypoglycemic events were also tracked. Among 117 enrollees (60 IRT, 57 CC), mean (SD) 6-month %HbA1c change for IRT vs CC was -0.34 (0.85) (-3.7 mmol/mol) vs -0.05 (0.74) (-0.5 mmol/mol) overall (P = .071); -0.15 (0.67) (-1.6 mmol/mol) vs -0.02 (0.66) (-0.2 mmol/mol) for ages 8 to 12 (P = .541); and -0.50 (0.95) (-5.5 mmol/mol) vs -0.06 (0.80) (-0.7 mmol/mol) for ages 13 to 17 (P = .056). Diabetes-related QoL increased by 6.5 and 1.3 points for IRT and CC, respectively (P = .062). Three months after intervention cessation, %HbA1c changed minimally among treated children aged 8 to 12 but increased by 0.22 (0.89) (2.4 mmol/mol) among those aged 13 to 17. IRT substantially affected diabetes metrics and improved QoL among pediatric patients with T1D. Adolescents experienced a stronger treatment effect, but had difficulty in sustaining improved control after intervention cessation. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
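The paired %HbA1c and mmol/mol figures in this abstract can be cross-checked with the IFCC-NGSP master equation, NGSP(%) = 0.09148 × IFCC(mmol/mol) + 2.152. Because the intercept cancels for a difference, a change in % converts to mmol/mol by dividing by 0.09148 and keeps its sign. The sketch below assumes that equation is the one behind the reported values; the function name is hypothetical.

```python
# Slope of the IFCC-NGSP master equation: NGSP(%) = 0.09148 * IFCC + 2.152
NGSP_SLOPE = 0.09148


def hba1c_change_to_mmol_per_mol(delta_percent):
    """Convert a change in HbA1c from NGSP percentage points to IFCC mmol/mol.
    The intercept of the master equation cancels for differences."""
    return delta_percent / NGSP_SLOPE


# Changes quoted in the abstract (percentage points).
overall_irt = round(hba1c_change_to_mmol_per_mol(-0.34), 1)
teens_irt = round(hba1c_change_to_mmol_per_mol(-0.50), 1)
teens_rebound = round(hba1c_change_to_mmol_per_mol(0.22), 1)
```

These three conversions reproduce the -3.7, -5.5, and 2.4 mmol/mol values given in the abstract.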
Construct Validity of Science Motivation and Beliefs Instrument (SLA-MB): A Case study in Sumedang, Indonesia

NASA Astrophysics Data System (ADS)

Rachmatullah, A.; Octavianda, R. P.; Ha, M.; Rustaman, N. Y.; Diana, S.

2017-02-01

Numerous instruments have been developed and used in science education research, and some have been translated into the local language of the country where they were used. Most researchers who used those translated instruments did not report their quality. One such instrument is the Scientific Literacy Assessment (SLA), which includes the Science Motivation and Beliefs (SLA-MB) scale. In this study, the SLA-MB was translated into Indonesian (Bahasa). The purpose of this study is to investigate the translated SLA-MB instrument in terms of dimensionality, reliability, item quality, and differential item functioning (DIF) based on IRT-Rasch analysis. We used ConQuest and Winsteps for the IRT-Rasch analysis. We employed a quantitative research method with a school survey. The research subjects were 223 Indonesian middle school students (ages 13-16), 64 boys and 159 girls. IRT-Rasch analysis of the Indonesian version of the SLA-MB indicated that a three-dimensional model fit significantly better than a one-dimensional model, and the reliability of each dimension ranged from about 0.60 to 0.82. In addition, fit values of all items were acceptable, and we found no DIF for any of the SLA-MB items. Overall, our study suggests that the Indonesian version of the SLA-MB is acceptable for use as a research instrument in Indonesia.
Electronic signatures of dimerization in IrTe2

NASA Astrophysics Data System (ADS)

Dai, Jixia; Wu, Weida; Oh, Yoon Seok; Cheong, S.-W.; Yang, J. J.

2014-03-01

Recently, the mysterious phase transition around Tc ~ 260 K in IrTe2 has been intensively studied. A structural supermodulation with q = 1/5 was identified below Tc. A variety of microscopic mechanisms have been proposed to account for this transition, including charge-density wave due to Fermi surface nesting, Te p-orbital driven structure instability, anionic depolymerization, ionic dimerization, and so on. However, there has not been a unified picture of the nature of this transition. To address this issue, we have performed low-temperature scanning tunneling microscopy and spectroscopy (STM/STS) experiments on IrTe2 and IrTe2-xSex. Our STM data clearly show a strong bias dependence in both topography and local density of states (STS) maps. High resolution spectroscopic data further confirm the stripe-like electronic state modulation, which provides insight into the ionic dimerization revealed by X-ray diffraction.
[Torrance Tests of Creative Thinking (TTCT): elements for construct validity in Portuguese adolescents].

PubMed

Oliveira, Ema; Almeida, Leandro; Ferrándiz, Carmen; Ferrando, Mercedes; Sainz, Marta; Prieto, María Dolores

2009-11-01

The aim of this work is to study the unidimensional and multidimensional nature of creativity when assessed through divergent thinking tasks, as proposed in Torrance's battery (Torrance Creative Thinking Test, TTCT). This battery is made up of various tasks with verbal and figurative content, aimed at estimating the level of creativity according to the dimensions or cognitive functions of fluency, flexibility, originality and elaboration of the individuals' ideas. This work used a sample of 595 Portuguese students from 5th and 6th grade. The results of confirmatory factor analysis reveal that the unidimensional model (a general factor of creativity) and the model of factors as a function of the cognitive dimensions of creativity, based on task content, do not fit well. The model with the best fit has a hierarchical factor structure, in which the first level comprises the factors for each of the subtests applied and the second level includes verbal or figurative content. The difficulty of verifying the structural validity of the TTCT is noted, as is the need for further studies to achieve, in practice, better individual creativity scores.
Using self-reported callous-unemotional traits to cross-nationally assess the DSM-5 'With Limited Prosocial Emotions' specifier.

PubMed

Kimonis, Eva R; Fanti, Kostas A; Frick, Paul J; Moffitt, Terrie E; Essau, Cecilia; Bijttebier, Patricia; Marsee, Monica A

2015-11-01

The presence of callous-unemotional (CU) traits designates an important subgroup of antisocial youth at risk for severe, persistent, and impairing conduct problems. As a result, the fifth revision of the Diagnostic and Statistical Manual includes a specifier for youth meeting diagnostic criteria for Conduct Disorder who show elevated CU traits. The current study evaluated the DSM-5 criteria using Item Response Theory (IRT) analyses and evaluated two methods for using a self-report measure of CU traits to make this diagnosis. The sample included 2257 adolescent (M age = 15.64, SD = 1.69 years) boys (53%) and girls (47%) from community and incarcerated settings in the United States and the European countries of Belgium, Germany, and Cyprus. IRT analyses suggested that four- or eight-item sets from the self-report measure (comparable to the symptoms used by the DSM-5 specifier) provided good model fit, suggesting that they assess a single underlying CU construct. Further, the most stringent method of scoring the self-report scale (i.e. taking only the most extreme responses) to approximate symptom presence provided the best discrimination in IRT analyses, showed reasonable prevalence rates of the specifier, and designated community adolescents who were highly antisocial, whereas the less stringent method best discriminated detained youth. Refined self-report scales developed on the basis of IRT findings provided good assessments of most of the symptoms used in the DSM-5 criteria. These scales may be used as one component of a multimethod assessment of the 'With Limited Prosocial Emotions' specifier for Conduct Disorder. © 2014 Association for Child and Adolescent Mental Health.
Clinical vs. Self-report Versions of the Quick Inventory of Depressive Symptomatology in a Public Sector Sample

PubMed Central

Bernstein, Ira H.; Rush, A. John; Carmody, Thomas J.; Woo, Ada; Trivedi, Madhukar H.

2007-01-01

Objectives: Recent work using classical test theory (CTT) and item response theory (IRT) has found that the self-report (QIDS-SR16) and clinician-rated (QIDS-C16) versions of the 16-item Quick Inventory of Depressive Symptomatology were generally comparable in outpatients with nonpsychotic major depressive disorder (MDD). This report extends this comparison to a less well-educated, more treatment-resistant sample that included more ethnic/racial minorities using IRT and selected classical test analyses. Methods: The QIDS-SR16 and QIDS-C16 were obtained in a sample of 441 outpatients with nonpsychotic MDD seen in the public sector in the Texas Medication Algorithm Project (TMAP). The Samejima graded response IRT model was used to compare the QIDS-SR16 and QIDS-C16. Results: The nine symptom domains in the QIDS-SR16 and QIDS-C16 related well to overall depression. The slopes of the item response functions (a), which index the strength of relationship between overall depression and each symptom, were extremely similar with the two measures. Likewise, the CTT and IRT indices of symptom frequency (item means and locations of the item response functions, bi) were also similar with these two measures. For example, sad mood and difficulty with concentration/decision making were highly related to the overall depression severity with both the QIDS-C16 and QIDS-SR16. Likewise, sleeping difficulties were commonly reported, even though they were not as strongly related to overall magnitude of depression. Conclusion: In this less educated, socially disadvantaged sample, differences between the QIDS-C16 and QIDS-SR16 were minor. The QIDS-SR16 is a satisfactory substitute for the more time-consuming QIDS-C16 in a broad range of adult, nonpsychotic, depressed outpatients. PMID:16716351
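For readers unfamiliar with the Samejima graded response model named above, the sketch below shows how a slope (a) and ordered location parameters (b) turn into category probabilities for one item. It is an illustration under assumed values, not the study's analysis: the four response categories (0 to 3) and all parameter values and names are hypothetical.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Graded response model category probabilities for one item.

    theta: latent trait level (here, overall depression severity)
    a: item slope (strength of relation between trait and item)
    thresholds: ordered locations b_1 < ... < b_{K-1} for K categories
    """
    # Cumulative probabilities P(X >= k), padded with 1 (k = 0) and 0 (k = K).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical item with four categories scored 0 to 3.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
```

A larger slope makes the category probabilities change more sharply with theta, while the locations determine how severe a respondent must be before higher categories become likely.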
Clinical vs. self-report versions of the quick inventory of depressive symptomatology in a public sector sample.

PubMed

Bernstein, Ira H; Rush, A John; Carmody, Thomas J; Woo, Ada; Trivedi, Madhukar H

2007-01-01

Recent work using classical test theory (CTT) and item response theory (IRT) has found that the self-report (QIDS-SR(16)) and clinician-rated (QIDS-C(16)) versions of the 16-item quick inventory of depressive symptomatology were generally comparable in outpatients with nonpsychotic major depressive disorder (MDD). This report extends this comparison to a less well-educated, more treatment-resistant sample that included more ethnic/racial minorities using IRT and selected classical test analyses. The QIDS-SR(16) and QIDS-C(16) were obtained in a sample of 441 outpatients with nonpsychotic MDD seen in the public sector in the Texas Medication Algorithm Project (TMAP). The Samejima graded response IRT model was used to compare the QIDS-SR(16) and QIDS-C(16). The nine symptom domains in the QIDS-SR(16) and QIDS-C(16) related well to overall depression. The slopes of the item response functions, a, which index the strength of relationship between overall depression and each symptom, were extremely similar with the two measures. Likewise, the CTT and IRT indices of symptom frequency (item means and locations of the item response functions, b(i)) were also similar with these two measures. For example, sad mood and difficulty with concentration/decision making were highly related to the overall depression severity with both the QIDS-C(16) and QIDS-SR(16). Likewise, sleeping difficulties were commonly reported, even though they were not as strongly related to overall magnitude of depression. In this less educated, socially disadvantaged sample, differences between the QIDS-C(16) and QIDS-SR(16) were minor. The QIDS-SR(16) is a satisfactory substitute for the more time-consuming QIDS-C(16) in a broad range of adult, nonpsychotic, depressed outpatients.
Overview of classical test theory and item response theory for the quantitative assessment of items in developing patient-reported outcomes measures.

PubMed

Cappelleri, Joseph C; Jason Lundy, J; Hays, Ron D

2014-05-01

The US Food and Drug Administration's guidance for industry document on patient-reported outcomes (PRO) defines content validity as "the extent to which the instrument measures the concept of interest" (FDA, 2009, p. 12). According to Strauss and Smith (2009), construct validity "is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity" (p. 7). Hence, both qualitative and quantitative information are essential in evaluating the validity of measures. We review classical test theory and item response theory (IRT) approaches to evaluating PRO measures, including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized "difficulty" (severity) order of items is represented by observed responses. If a researcher has few qualitative data and wants to get preliminary information about the content validity of the instrument, then descriptive assessments using classical test theory should be the first step. As the sample size grows during subsequent stages of instrument development, confidence in the numerical estimates from Rasch and other IRT models (as well as those of classical test theory) would also grow. Classical test theory and IRT can be useful in providing a quantitative assessment of items and scales during the content-validity phase of PRO-measure development. Depending on the particular type of measure and the specific circumstances, the classical test theory and/or the IRT should be considered to help maximize the content validity of PRO measures. Copyright © 2014 Elsevier HS Journals, Inc. All rights reserved.
Pancreatic cellular injury after cardiac surgery with cardiopulmonary bypass: frequency, time course and risk factors.

PubMed

Nys, Monique; Venneman, Ingrid; Deby-Dupont, Ginette; Preiser, Jean-Charles; Vanbelle, Sophie; Albert, Adelin; Camus, Gérard; Damas, Pierre; Larbuisson, Robert; Lamy, Maurice

2007-05-01

Although often clinically silent, pancreatic cellular injury (PCI) is relatively frequent after cardiac surgery with cardiopulmonary bypass; and its etiology and time course are largely unknown. We defined PCI as the simultaneous presence of abnormal values of pancreatic isoamylase and immunoreactive trypsin (IRT). The frequency and time evolution of PCI were assessed in this condition using assays for specific exocrine pancreatic enzymes. Correlations with inflammatory markers were searched for preoperative risk factors. One hundred ninety-three patients submitted to cardiac surgery were enrolled prospectively. Blood IRT, amylase, pancreatic isoamylase, lipase, and markers of inflammation (alpha1-protease inhibitor, alpha2-macroglobulin, myeloperoxidase) were measured preoperatively and postoperatively until day 8. The postoperative increase in plasma levels of pancreatic enzymes and urinary IRT was biphasic in all patients: early after surgery and later (from day 4 to 8 after surgery). One hundred thirty-three patients (69%) experienced PCI, with mean IRT, isoamylase, and alpha1-protease inhibitor values higher for each sample than that in patients without PCI. By multiple regression analysis, we found preoperative values of plasma IRT ≥40 ng/mL, amylase ≥42 IU/mL, and pancreatic isoamylase ≥20 IU/L associated with a higher incidence of postsurgery PCI (P < 0.005). In the PCI patients, a significant correlation was found between the 4 pancreatic enzymes and urinary IRT, total calcium, myeloperoxidase, alpha1-protease inhibitor, and alpha2-macroglobulin. These data support a high prevalence of postoperative PCI after cardiac surgery with cardiopulmonary bypass, typically biphasic and clinically silent, especially when pancreatic enzymes were elevated preoperatively.
Differences of Cd uptake and expression of OAS and IRT genes in two varieties of ryegrasses.

PubMed

Chi, Sunlin; Qin, Yuli; Xu, Weihong; Chai, Yourong; Feng, Deyu; Li, Yanhua; Li, Tao; Yang, Mei; He, Zhangmi

2018-06-16

A pot experiment was conducted to study the difference in cadmium uptake and in the expression of OAS and IRT genes between two ryegrass varieties under cadmium stress. The results showed that with the increase of cadmium levels, the dry weights of roots of the two ryegrass varieties, and the dry weights of shoots and plants of Abbott, first increased and then decreased. When exposed to 75 mg kg⁻¹ Cd, the dry weights of shoot and plant of Abbott reached the maximum, increasing by 11.13 and 10.67%, respectively, compared with the control. At 75 mg kg⁻¹ Cd, cadmium concentrations in the shoots of the two ryegrass varieties were higher than the critical value for a Cd hyperaccumulator (100 mg kg⁻¹), at 111.19 mg kg⁻¹ (Bond) and 133.69 mg kg⁻¹ (Abbott), respectively. The OAS gene expression in the leaves of the two ryegrass varieties showed a unimodal curve, which peaked at the cadmium level of 150 mg kg⁻¹ but fell back at high cadmium levels of 300 and 600 mg kg⁻¹. The OAS gene expression in Bond and Abbott roots showed a bimodal curve. The OAS gene expression in Bond root and Abbott stem mainly showed a unimodal curve. The expression of the IRT gene family in the leaves of the ryegrass varieties was basically in line with the characteristics of a unimodal curve, peaking at a cadmium level of 75 or 150 mg kg⁻¹, respectively. The IRT expression in the ryegrass stems showed characteristics of bimodal and unimodal curves, while that in the roots was mainly unimodal. The expression of OAS and IRT genes was higher in Bond than in Abbott due to the genotype difference between the two varieties. The expression of OAS and IRT was greater in leaves than in roots and stems. Ryegrass tolerance to cadmium can be increased by increasing the expression of OAS and IRT genes in roots and stems, and transfer of cadmium from roots and stems to the leaves can be enhanced by increasing the expression of OAS and IRT in leaves.
Rasch measurement: the Arm Activity measure (ArmA) passive function sub-scale.

PubMed

Ashford, Stephen; Siegert, Richard J; Alexandrescu, Roxana

2016-01-01

To evaluate the conformity of the Arm Activity measure (ArmA) passive function sub-scale to the Rasch model. A consecutive cohort of patients (n = 92) undergoing rehabilitation, including upper limb rehabilitation and spasticity management, at two specialist rehabilitation units was included. Rasch analysis was used to examine scaling and conformity to the model. Responses were analysed using Rasch unidimensional measurement models (RUMM 2030). The following aspects were considered: overall model and individual item fit statistics and fit residuals, internal reliability, item response threshold ordering, item bias, local dependency and unidimensionality. ArmA contains both active and passive function sub-scales, but in this analysis only the passive function sub-scale was considered. Four of the seven items in the ArmA passive function sub-scale initially had disordered thresholds. These items were rescored to four response options, which resulted in ordered thresholds for all items. Once the items with disordered thresholds had been rescored, item bias was not identified for age, global disability level or diagnosis, although there was a small difference in difficulty between males and females for one item of the scale. Local dependency was not observed, the unidimensionality of the sub-scale was supported, and good fit to the Rasch model was identified. The person separation index (PSI) was 0.95, indicating that the scale is able to reliably differentiate at least two groups of patients. The ArmA passive function sub-scale was shown in this evaluation to conform to the Rasch model once disordered thresholds had been addressed. Using the logit scores produced by the Rasch model, it was possible to convert these back to the original scale range.
Implications for Rehabilitation: The ArmA passive function sub-scale was shown, in this evaluation, to conform to the Rasch model once disordered thresholds had been addressed and therefore to be a clinically applicable and potentially useful hierarchical measure. Using Rasch logit scores, it has been possible to convert back to the original ordinal scale range and provide an indication of real change to enable evaluation of clinical outcomes of importance to patients and clinicians.
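The notion of ordered versus disordered item thresholds mentioned in this record can be illustrated with a minimal sketch. It assumes a generic Rasch partial credit parameterization; the function names and threshold values are hypothetical and are not taken from the ArmA study or from the RUMM 2030 software.

```python
import math

def pcm_category_probs(theta, thresholds):
    """Category probabilities for one polytomous item under a
    Rasch partial credit model (illustrative parameterization)."""
    # Category k has logit sum_{j<=k} (theta - tau_j); category 0 has 0.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (theta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def thresholds_ordered(thresholds):
    """True when each threshold is strictly higher than the previous one."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))

# Hypothetical threshold values in logits, for illustration only.
ordered_item = [-1.0, 0.0, 1.0]
disordered_item = [0.5, -0.5, 1.0]

probs = pcm_category_probs(0.0, ordered_item)
flags = (thresholds_ordered(ordered_item), thresholds_ordered(disordered_item))
```

Under these assumptions, an item whose thresholds fail the ordering check is a candidate for collapsing adjacent response categories, which is the kind of rescoring the abstract describes.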