ERIC Educational Resources Information Center
Pohl, Steffi; Gräfe, Linda; Rose, Norman
2014-01-01
Data from competence tests usually show a number of missing responses on test items due to both omitted and not-reached items. Different approaches for dealing with missing responses exist, and there are no clear guidelines on which of those to use. While classical approaches rely on an ignorable missing data mechanism, the most recently developed…
Estimation of Item Response Theory Parameters in the Presence of Missing Data
ERIC Educational Resources Information Center
Finch, Holmes
2008-01-01
Missing data are a common problem in a variety of measurement settings, including responses to items on both cognitive and affective assessments. Researchers have shown that such missing data may create problems in the estimation of item difficulty parameters in the Item Response Theory (IRT) context, particularly if they are ignored. At the same…
ERIC Educational Resources Information Center
Finch, Holmes
2011-01-01
Methods of uniform differential item functioning (DIF) detection have been extensively studied in the complete data case. However, less work has been done examining the performance of these methods when missing item responses are present. Research that has been done in this regard appears to indicate that treating missing item responses as…
Missing data in FFQs: making assumptions about item non-response.
Lamb, Karen E; Olstad, Dana Lee; Nguyen, Cattram; Milte, Catherine; McNaughton, Sarah A
2017-04-01
FFQs are a popular method of capturing dietary information in epidemiological studies and may be used to derive dietary exposures such as nutrient intake or overall dietary patterns and diet quality. As FFQs can involve large numbers of questions, participants may fail to respond to all questions, leaving researchers to decide how to deal with missing data when deriving intake measures. The aim of the present commentary is to discuss the current practice for dealing with item non-response in FFQs and to propose a research agenda for reporting and handling missing data in FFQs. Single imputation techniques, such as zero imputation (assuming no consumption of the item) or mean imputation, are commonly used to deal with item non-response in FFQs. However, single imputation methods make strong assumptions about the missing data mechanism and do not reflect the uncertainty created by the missing data. This can lead to incorrect inference about associations between diet and health outcomes. Although the use of multiple imputation methods in epidemiology has increased, these have seldom been used in the field of nutritional epidemiology to address missing data in FFQs. We discuss methods for dealing with item non-response in FFQs, highlighting the assumptions made under each approach. Researchers analysing FFQs should ensure that missing data are handled appropriately and clearly report how missing data were treated in analyses. Simulation studies are required to enable systematic evaluation of the utility of various methods for handling item non-response in FFQs under different assumptions about the missing data mechanism.
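The two single imputation techniques named in this commentary can be sketched as follows. This is a minimal illustration only: the function names, the three-item questionnaire, and the serving counts are all hypothetical and are not taken from the commentary.

```python
from statistics import mean

def zero_impute(responses):
    """Replace missing (None) item frequencies with 0, i.e. assume no consumption."""
    return [0.0 if r is None else float(r) for r in responses]

def mean_impute(responses, item_means):
    """Replace missing item frequencies with the observed sample mean of that item."""
    return [m if r is None else float(r) for r, m in zip(responses, item_means)]

# Hypothetical weekly servings for three FFQ items; None marks item non-response.
sample = [
    [2, 7, None],
    [4, None, 3],
    [0, 5, 1],
]

# Item means computed from the observed responses only.
item_means = [
    mean(row[j] for row in sample if row[j] is not None)
    for j in range(3)
]

# Total intake per participant under each assumption.
zero_totals = [sum(zero_impute(row)) for row in sample]
mean_totals = [sum(mean_impute(row, item_means)) for row in sample]
```

In this toy sample the first participant's total is 9 under zero imputation and 11 under mean imputation, which shows how the choice of assumption alone shifts the derived intake. Neither version carries any of the uncertainty about the missing value, which is the limitation the commentary raises.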
What You Don't Know Can Hurt You: Missing Data and Partial Credit Model Estimates
Thomas, Sarah L.; Schmidt, Karen M.; Erbacher, Monica K.; Bergeman, Cindy S.
2017-01-01
The authors investigated the effect of Missing Completely at Random (MCAR) item responses on partial credit model (PCM) parameter estimates in a longitudinal study of Positive Affect. Participants were 307 adults from the older cohort of the Notre Dame Study of Health and Well-Being (Bergeman and Deboeck, 2014) who completed questionnaires including Positive Affect items for 56 days. Additional missing responses were introduced to the data, randomly replacing 20%, 50%, and 70% of the responses on each item and each day with missing values, in addition to the existing missing data. Results indicated that item locations and person trait level measures diverged from the original estimates as the level of degradation from induced missing data increased. In addition, standard errors of these estimates increased with the level of degradation. Thus, MCAR data does damage the quality and precision of PCM estimates. PMID:26784376
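The degradation step described in this abstract, randomly replacing a fixed share of responses with missing values, might be sketched as below. This is a sketch under stated assumptions, not the authors' procedure: the function name `induce_mcar`, the item scores, and the choice to draw the deleted share from the currently observed entries are all hypothetical.

```python
import random

def induce_mcar(column, rate, rng):
    """Return a copy of `column` in which a fraction `rate` of the observed
    entries is set to None, chosen uniformly at random (MCAR).
    Entries that are already missing stay missing."""
    observed = [i for i, v in enumerate(column) if v is not None]
    n_drop = round(rate * len(observed))
    dropped = set(rng.sample(observed, n_drop))
    return [None if i in dropped else v for i, v in enumerate(column)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical scores on one item for one day; one response is already missing.
item = [3, 4, None, 2, 5, 4, 3, 1, 2, 5, 4]

# One degraded copy per missingness rate used in the study.
degraded = {rate: induce_mcar(item, rate, rng) for rate in (0.2, 0.5, 0.7)}
```

Because deletion ignores both the response values and any respondent characteristics, the induced missingness is completely at random by construction.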
ERIC Educational Resources Information Center
Köhler, Carmen; Pohl, Steffi; Carstensen, Claus H.
2015-01-01
When competence tests are administered, subjects frequently omit items. These missing responses pose a threat to correctly estimating the proficiency level. Newer model-based approaches aim to take nonignorable missing data processes into account by incorporating a latent missing propensity into the measurement model. Two assumptions are typically…
Modeling Nonignorable Missing Data in Speeded Tests
ERIC Educational Resources Information Center
Glas, Cees A. W.; Pimentel, Jonald L.
2008-01-01
In tests with time limits, items at the end are often not reached. Usually, the pattern of missing responses depends on the ability level of the respondents; therefore, missing data are not ignorable in statistical inference. This study models data using a combination of two item response theory (IRT) models: one for the observed response data and…
A Method for Imputing Response Options for Missing Data on Multiple-Choice Assessments
ERIC Educational Resources Information Center
Wolkowitz, Amanda A.; Skorupski, William P.
2013-01-01
When missing values are present in item response data, there are a number of ways one might impute a correct or incorrect response to a multiple-choice item. There are significantly fewer methods for imputing the actual response option an examinee may have provided if he or she had not omitted the item either purposely or accidentally. This…
The Impact of Missing Data on the Detection of Nonuniform Differential Item Functioning
ERIC Educational Resources Information Center
Finch, W. Holmes
2011-01-01
Missing information is a ubiquitous aspect of data analysis, including responses to items on cognitive and affective instruments. Although the broader statistical literature describes missing data methods, relatively little work has focused on this issue in the context of differential item functioning (DIF) detection. Such prior research has…
A model for incomplete longitudinal multivariate ordinal data.
Liu, Li C
2008-12-30
In studies where multiple outcome items are repeatedly measured over time, missing data often occur. A longitudinal item response theory model is proposed for analysis of multivariate ordinal outcomes that are repeatedly measured. Under the MAR assumption, this model accommodates missing data at any level (missing item at any time point and/or missing time point). It allows for multiple random subject effects and the estimation of item discrimination parameters for the multiple outcome items. The covariates in the model can be at any level. Assuming either a probit or logistic response function, maximum marginal likelihood estimation is described utilizing multidimensional Gauss-Hermite quadrature for integration of the random effects. An iterative Fisher-scoring solution, which provides standard errors for all model parameters, is used. A data set from a longitudinal prevention study is used to motivate the application of the proposed model. In this study, multiple ordinal items of health behavior are repeatedly measured over time. Because of a planned missing design, subjects answered only two-thirds of all items at a given point. Copyright 2008 John Wiley & Sons, Ltd.
The Effect of Missing Data Treatment on Mantel-Haenszel DIF Detection
ERIC Educational Resources Information Center
Emenogu, Barnabas C.; Falenchuk, Olesya; Childs, Ruth A.
2010-01-01
Most implementations of the Mantel-Haenszel differential item functioning procedure delete records with missing responses or replace missing responses with scores of 0. These treatments of missing data make strong assumptions about the causes of the missing data. Such assumptions may be particularly problematic when groups differ in their patterns…
Peyre, Hugo; Leplège, Alain; Coste, Joël
2011-03-01
Missing items are common in quality of life (QoL) questionnaires and present a challenge for research in this field. It remains unclear which of the various methods proposed to deal with missing data performs best in this context. We compared personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques using various realistic simulation scenarios of item missingness in QoL questionnaires constructed within the framework of classical test theory. Samples of 300 and 1,000 subjects were randomly drawn from the 2003 INSEE Decennial Health Survey (of 23,018 subjects representative of the French population and having completed the SF-36) and various patterns of missing data were generated according to three different item non-response rates (3, 6, and 9%) and three types of missing data (Little and Rubin's "missing completely at random," "missing at random," and "missing not at random"). The missing data methods were evaluated in terms of accuracy and precision for the analysis of one descriptive and one association parameter for three different scales of the SF-36. For all item non-response rates and types of missing data, multiple imputation and full information maximum likelihood appeared superior to the personal mean score and especially to hot deck in terms of accuracy and precision; however, the use of personal mean score was associated with insignificant bias (relative bias <2%) in all studied situations. Whereas multiple imputation and full information maximum likelihood are confirmed as reference methods, the personal mean score appears nonetheless appropriate for dealing with items missing from completed SF-36 questionnaires in most situations of routine use. These results can reasonably be extended to other questionnaires constructed according to classical test theory.
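The personal mean score approach compared in this study can be illustrated with a short sketch: each missing item is replaced by the mean of the items the same respondent did answer on that scale. The function name and the `min_answered_fraction` parameter are hypothetical. The default threshold of one half reflects a common scoring convention and is an assumption here, since the abstract does not state the rule that was used.

```python
def personal_mean_score(items, min_answered_fraction=0.5):
    """Impute missing (None) items with the respondent's own mean over the
    answered items of the scale.

    Returns the completed item list, or None when fewer than
    `min_answered_fraction` of the items were answered (assumed rule).
    """
    answered = [x for x in items if x is not None]
    if len(answered) < min_answered_fraction * len(items):
        return None
    person_mean = sum(answered) / len(answered)
    return [person_mean if x is None else x for x in items]

# Hypothetical four-item scale with one missing response.
completed = personal_mean_score([4, None, 2, 3])

# Too few answered items: the scale score is left missing.
rejected = personal_mean_score([None, None, None, 5])
```

The method uses only the respondent's own answers, which makes it simple to apply in routine use, in contrast to multiple imputation and full information maximum likelihood, which draw on the whole sample.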
ERIC Educational Resources Information Center
Robitzsch, Alexander; Rupp, Andre A.
2009-01-01
This article describes the results of a simulation study to investigate the impact of missing data on the detection of differential item functioning (DIF). Specifically, it investigates how four methods for dealing with missing data (listwise deletion, zero imputation, two-way imputation, response function imputation) interact with two methods of…
ERIC Educational Resources Information Center
Longford, Nicholas T.
This study is a critical evaluation of the roles for coding and scoring of missing responses to multiple-choice items in educational tests. The focus is on tests in which the test-takers have little or no motivation; in such tests omitting and not reaching (as classified by the currently adopted operational rules) is quite frequent. Data from the…
Impact of Missing Data on Person-Model Fit and Person Trait Estimation
ERIC Educational Resources Information Center
Zhang, Bo; Walker, Cindy M.
2008-01-01
The purpose of this research was to examine the effects of missing data on person-model fit and person trait estimation in tests with dichotomous items. Under the missing-completely-at-random framework, four missing data treatment techniques were investigated including pairwise deletion, coding missing responses as incorrect, hotdeck imputation,…
de Bock, Élodie; Hardouin, Jean-Benoit; Blanchin, Myriam; Le Neel, Tanguy; Kubis, Gildas; Bonnaud-Antignac, Angélique; Dantan, Étienne; Sébille, Véronique
2016-10-01
The objective was to compare classical test theory and Rasch-family models derived from item response theory for the analysis of longitudinal patient-reported outcomes data with possibly informative intermittent missing items. A simulation study was performed in order to assess and compare the performance of classical test theory and Rasch model in terms of bias, control of the type I error and power of the test of time effect. The type I error was controlled for classical test theory and Rasch model whether data were complete or some items were missing. Both methods were unbiased and displayed similar power with complete data. When items were missing, Rasch model remained unbiased and displayed higher power than classical test theory. Rasch model performed better than the classical test theory approach regarding the analysis of longitudinal patient-reported outcomes with possibly informative intermittent missing items mainly for power. This study highlights the interest of Rasch-based models in clinical research and epidemiology for the analysis of incomplete patient-reported outcomes data. © The Author(s) 2013.
Holman, Rebecca; Glas, Cees AW; Lindeboom, Robert; Zwinderman, Aeilko H; de Haan, Rob J
2004-01-01
Background Whenever questionnaires are used to collect data on constructs, such as functional status or health related quality of life, it is unlikely that all respondents will respond to all items. This paper examines ways of dealing with responses in a 'not applicable' category to items included in the AMC Linear Disability Score (ALDS) project item bank. Methods The data examined in this paper come from the responses of 392 respondents to 32 items and form part of the calibration sample for the ALDS item bank. The data are analysed using the one-parameter logistic item response theory model. The four practical strategies for dealing with this type of response are: cold deck imputation; hot deck imputation; treating the missing responses as if these items had never been offered to those individual patients; and using a model which takes account of the 'tendency to respond to items'. Results The item and respondent population parameter estimates were very similar for the strategies involving hot deck imputation; treating the missing responses as if these items had never been offered to those individual patients; and using a model which takes account of the 'tendency to respond to items'. The estimates obtained using the cold deck imputation method were substantially different. Conclusions The cold deck imputation method was not considered suitable for use in the ALDS item bank. The other three methods described can be usefully implemented in the ALDS item bank, depending on the purpose of the data analysis to be carried out. These three methods may be useful for other data sets examining similar constructs, when item response theory based methods are used. PMID:15200681
A Framework for Dimensionality Assessment for Multidimensional Item Response Models
ERIC Educational Resources Information Center
Svetina, Dubravka; Levy, Roy
2014-01-01
A framework is introduced for considering dimensionality assessment procedures for multidimensional item response models. The framework characterizes procedures in terms of their confirmatory or exploratory approach, parametric or nonparametric assumptions, and applicability to dichotomous, polytomous, and missing data. Popular and emerging…
Exploratory Item Classification Via Spectral Graph Clustering
Chen, Yunxiao; Li, Xiaoou; Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang
2017-01-01
Large-scale assessments are supported by a large item pool. An important task in test development is to assign items into scales that measure different characteristics of individuals, and a popular approach is cluster analysis of items. Classical methods in cluster analysis, such as the hierarchical clustering, K-means method, and latent-class analysis, often induce a high computational overhead and have difficulty handling missing data, especially in the presence of high-dimensional responses. In this article, the authors propose a spectral clustering algorithm for exploratory item cluster analysis. The method is computationally efficient, effective for data with missing or incomplete responses, easy to implement, and often outperforms traditional clustering algorithms in the context of high dimensionality. The spectral clustering algorithm is based on graph theory, a branch of mathematics that studies the properties of graphs. The algorithm first constructs a graph of items, characterizing the similarity structure among items. It then extracts item clusters based on the graphical structure, grouping similar items together. The proposed method is evaluated through simulations and an application to the revised Eysenck Personality Questionnaire. PMID:29033476
Non-ignorable missingness item response theory models for choice effects in examinee-selected items.
Liu, Chen-Wei; Wang, Wen-Chung
2017-11-01
Examinee-selected item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set, always yields incomplete data (i.e., when only the selected items are answered, data are missing for the others) that are likely non-ignorable in likelihood inference. Standard item response theory (IRT) models become infeasible when ESI data are missing not at random (MNAR). To solve this problem, the authors propose a two-dimensional IRT model that posits one unidimensional IRT model for observed data and another for nominal selection patterns. The two latent variables are assumed to follow a bivariate normal distribution. In this study, the mirt freeware package was adopted to estimate parameters. The authors conduct an experiment to demonstrate that ESI data are often non-ignorable and to determine how to apply the new model to the data collected. Two follow-up simulation studies are conducted to assess the parameter recovery of the new model and the consequences for parameter estimation of ignoring MNAR data. The results of the two simulation studies indicate good parameter recovery of the new model and poor parameter recovery when non-ignorable missing data were mistakenly treated as ignorable. © 2017 The British Psychological Society.
Maindal, Helle Terkildsen; Sokolowski, Ineta; Vedsted, Peter
2009-06-29
The Patient Activation Measure (PAM) is a measure that assesses patient knowledge, skill, and confidence for self-management. This study validates the Danish translation of the 13-item Patient Activation Measure (PAM13) in a Danish population with dysglycaemia. 358 people with screen-detected dysglycaemia participating in a primary care health education study responded to PAM13. The PAM13 was translated into Danish by a standardised forward-backward translation. Data quality was assessed by mean, median, item response, missing values, floor and ceiling effects, internal consistency (Cronbach's alpha and average inter-item correlation) and item-rest correlations. Scale properties were assessed by Rasch Rating Scale models. The item response was high with a small number of missing values (0.8-4.2%). Floor effect was small (range 0.6-3.6%), but the ceiling effect was above 15% for all items (range 18.6-62.7%). The alpha-coefficient was 0.89 and the average inter-item correlation 0.38. The Danish version formed a unidimensional, probabilistic Guttman-like scale explaining 43.2% of the variance. We did, however, find a different item sequence compared to the original scale. A Danish version of PAM13 with acceptable validity and reliability is now available. Further development should focus on single items, response categories in relation to ceiling effects and further validation of reproducibility and responsiveness.
The Missing Data Assumptions of the NEAT Design and Their Implications for Test Equating
ERIC Educational Resources Information Center
Sinharay, Sandip; Holland, Paul W.
2010-01-01
The Non-Equivalent groups with Anchor Test (NEAT) design involves "missing data" that are "missing by design." Three nonlinear observed score equating methods used with a NEAT design are the "frequency estimation equipercentile equating" (FEEE), the "chain equipercentile equating" (CEE), and the "item-response-theory observed-score-equating" (IRT…
Combining item response theory with multiple imputation to equate health assessment questionnaires.
Gu, Chenyang; Gutman, Roee
2017-09-01
The assessment of patients' functional status across the continuum of care requires a common patient assessment tool. However, assessment tools that are used in various health care settings differ and cannot be easily contrasted. For example, the Functional Independence Measure (FIM) is used to evaluate the functional status of patients who stay in inpatient rehabilitation facilities, the Minimum Data Set (MDS) is collected for all patients who stay in skilled nursing facilities, and the Outcome and Assessment Information Set (OASIS) is collected if they choose home health care provided by home health agencies. All three instruments or questionnaires include functional status items, but the specific items, rating scales, and instructions for scoring different activities vary between the different settings. We consider equating different health assessment questionnaires as a missing data problem, and propose a variant of predictive mean matching method that relies on Item Response Theory (IRT) models to impute unmeasured item responses. Using real data sets, we simulated missing measurements and compared our proposed approach to existing methods for missing data imputation. We show that, for all of the estimands considered, and in most of the experimental conditions that were examined, the proposed approach provides valid inferences, and generally has better coverages, relatively smaller biases, and shorter interval estimates. The proposed method is further illustrated using a real data set. © 2016, The International Biometric Society.
Comparison of methods for dealing with missing values in the EPV-R.
Paniagua, David; Amor, Pedro J; Echeburúa, Enrique; Abad, Francisco J
2017-08-01
The development of an effective instrument to assess the risk of partner violence is a topic of great social relevance. This study evaluates the scale of “Predicción del Riesgo de Violencia Grave Contra la Pareja” –Revisada– (EPV-R - Severe Intimate Partner Violence Risk Prediction Scale-Revised), a tool developed in Spain, which is facing the problem of how to treat the high rate of missing values, as is usual in this type of scale. First, responses to the EPV-R in a sample of 1215 male abusers who were reported to the police were used to analyze the patterns of occurrence of missing values, as well as the factor structure. Second, we analyzed the performance of various imputation methods using simulated data that emulates the missing data mechanism found in the empirical database. The imputation procedure originally proposed by the authors of the scale provides acceptable results, although the application of a method based on the Item Response Theory could provide greater accuracy and offers some additional advantages. Item Response Theory appears to be a useful tool for imputing missing data in this type of questionnaire.
de Bock, Élodie; Hardouin, Jean-Benoit; Blanchin, Myriam; Le Neel, Tanguy; Kubis, Gildas; Sébille, Véronique
2015-01-01
The purpose of this study was to identify the most adequate strategy for group comparison of longitudinal patient-reported outcomes in the presence of possibly informative intermittent missing data. Models coming from classical test theory (CTT) and item response theory (IRT) were compared. Two groups of patients' responses to dichotomous items with three times of assessment were simulated. Different cases were considered: presence or absence of a group effect and/or a time effect, a total of 100 or 200 patients, 4 or 7 items and two different values for the correlation coefficient of the latent trait between two consecutive times (0.4 or 0.9). Cases including informative and non-informative intermittent missing data were compared at different rates (15, 30 %). These simulated data were analyzed with CTT using score and mixed model (SM) and with IRT using longitudinal Rasch mixed model (LRM). The type I error, the power and the bias of the group effect estimations were compared between the two methods. This study showed that LRM performs better than SM. When the rate of missing data rose to 30 %, estimations were biased with SM mainly for informative missing data. Otherwise, LRM and SM methods were comparable concerning biases. However, regardless of the rate of intermittent missing data, power of LRM was higher compared to power of SM. In conclusion, LRM should be favored when the rate of missing data is higher than 15 %. For other cases, SM and LRM provide similar results.
Screening for Moral Injury: The Moral Injury Symptom Scale - Military Version Short Form.
Koenig, Harold G; Ames, Donna; Youssef, Nagy A; Oliver, John P; Volk, Fred; Teng, Ellen J; Haynes, Kerry; Erickson, Zachary D; Arnold, Irina; O'Garo, Keisha; Pearce, Michelle
2018-03-26
To develop a short form (SF) of the 45-item multidimensional Moral Injury Symptom Scale - Military Version (MISS-M) to use when screening for moral injury and monitoring treatment response in veterans and active duty military with PTSD. A total of 427 veterans and active duty military with PTSD symptoms were recruited from VA Medical Centers in Augusta, GA; Los Angeles, CA; Durham, NC; Houston, TX; and San Antonio, TX; and from Liberty University, Lynchburg, Virginia. The sample was randomly split in two. In the first half (n = 214), exploratory factor analysis identified the highest loading item on each of the 10 MISS scales (guilt, shame, moral concerns, loss of meaning, difficulty forgiving, loss of trust, self-condemnation, religious struggle, and loss of religious faith) to form the 10-item MISS-M-SF; confirmatory factor analysis was then performed to replicate results in the second half of the sample (n = 213). Internal reliability, test-retest reliability, and convergent, discriminant, and concurrent validity were examined in the overall sample. The study was approved by the institutional review boards and the Research & Development (R&D) Committees at Veterans Administration medical centers in Durham, Los Angeles, Augusta, Houston, and San Antonio, and the Liberty University and Duke University Medical Center institutional review boards. The 10-item MISS-M-SF had a median of 50 and a range of 12-91 (possible range 10-100). Over 70% scored a 9 or 10 (highest possible) on at least one item. Cronbach's alpha was 0.73 (95% CI 0.69-0.76), and test-retest reliability was 0.87 (95% CI 0.79-0.92). Convergent validity with the 45-item MISS-M was r = 0.92. Discriminant validity was demonstrated by relatively weak correlations with social, religious, and physical health constructs (r = 0.21-0.35), and concurrent validity was indicated by strong correlations with PTSD, depression, and anxiety symptoms (r = 0.54-0.58). 
The MISS-M-SF is a reliable and valid measure of MI symptoms that can be used to screen for MI and monitor response to treatment in veterans and active duty military with PTSD.
Modeling Skipped and Not-Reached Items Using IRTrees
ERIC Educational Resources Information Center
Debeer, Dries; Janssen, Rianne; De Boeck, Paul
2017-01-01
When dealing with missing responses, two types of omissions can be discerned: items can be skipped or not reached by the test taker. When the occurrence of these omissions is related to the proficiency process the missingness is nonignorable. The purpose of this article is to present a tree-based IRT framework for modeling responses and omissions…
Hill, Bridget; Pallant, Julie; Williams, Gavin; Olver, John; Ferris, Scott; Bialocerkowski, Andrea
2016-12-01
To evaluate the internal construct validity and dimensionality of a new patient-reported outcome measure for people with traumatic brachial plexus injury (BPI) based on the International Classification of Functioning, Disability and Health definition of activity. Cross-sectional study. Outpatient clinics. Adults (age range, 18-82y) with a traumatic BPI (N=106). There were 106 people with BPI who completed a 51-item 5-response questionnaire. Responses were analyzed in 4 phases (missing responses, item correlations, exploratory factor analysis, and Rasch analysis) to evaluate the properties of fit to the Rasch model, threshold response, local dependency, dimensionality, differential item functioning, and targeting. Not applicable, as this study addresses the development of an outcome measure. Six items were deleted for missing responses, and 10 were deleted for high interitem correlations >.81. The remaining 35 items, while demonstrating fit to the Rasch model, showed evidence of local dependency and multidimensionality. Items were divided into 3 subscales: dressing and grooming (8 items), arm and hand (17 items), and no hand (6 items). All 3 subscales demonstrated fit to the model with no local dependency, minimal disordered thresholds, no unidimensionality or differential item functioning for age, time postinjury, or self-selected dominance. Subscales were combined into 3 subtests and demonstrated fit to the model, no misfit, and unidimensionality, allowing calculation of a summary score. This preliminary analysis supports the internal construct validity of the Brachial Assessment Tool, a unidimensional targeted 4-response patient-reported outcome measure designed to solely assess activity after traumatic BPI regardless of level of injury, age at recruitment, premorbid limb dominance, and time postinjury. Further examination is required to determine test-retest reliability and responsiveness. Copyright © 2016 American Congress of Rehabilitation Medicine. 
Published by Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Sinharay, Sandip; Holland, Paul W.
2008-01-01
The nonequivalent groups with anchor test (NEAT) design involves missing data that are missing by design. Three popular equating methods that can be used with a NEAT design are the poststratification equating method, the chain equipercentile equating method, and the item-response-theory observed-score-equating method. These three methods each…
Method variation in the impact of missing data on response shift detection.
Schwartz, Carolyn E; Sajobi, Tolulope T; Verdam, Mathilde G E; Sebille, Veronique; Lix, Lisa M; Guilleux, Alice; Sprangers, Mirjam A G
2015-03-01
Missing data due to attrition or item non-response can result in biased estimates and loss of power in longitudinal quality-of-life (QOL) research. The impact of missing data on response shift (RS) detection is relatively unknown. This overview article synthesizes the findings of three methods tested in this special section regarding the impact of missing data patterns on RS detection in incomplete longitudinal data. The RS detection methods investigated include: (1) Relative importance analysis to detect reprioritization RS in stroke caregivers; (2) Oort's structural equation modeling (SEM) to detect recalibration, reprioritization, and reconceptualization RS in cancer patients; and (3) Rasch-based item-response theory-based (IRT) models as compared to SEM models to detect recalibration and reprioritization RS in hospitalized chronic disease patients. Each method dealt with missing data differently, either with imputation (1), attrition-based multi-group analysis (2), or probabilistic analysis that is robust to missingness due to the specific objectivity property (3). Relative importance analyses were sensitive to the type and amount of missing data and imputation method, with multiple imputation showing the largest RS effects. The attrition-based multi-group SEM revealed differential effects of both the changes in health-related QOL and the occurrence of response shift by attrition stratum, and enabled a more complete interpretation of findings. The IRT RS algorithm found evidence of small recalibration and reprioritization effects in General Health, whereas SEM mostly evidenced small recalibration effects. These differences may be due to differences between the two methods in handling of missing data. 
Missing data imputation techniques result in different conclusions about the presence of reprioritization RS using the relative importance method, while the attrition-based SEM approach highlighted different recalibration and reprioritization RS effects by attrition group. The IRT analyses detected more recalibration and reprioritization RS effects than SEM, presumably due to IRT's robustness to missing data. Future research should apply simulation techniques in order to make conclusive statements about the impacts of missing data according to the type and amount of RS.
Handling missing values in the MDS-UPDRS.
Goetz, Christopher G; Luo, Sheng; Wang, Lu; Tilley, Barbara C; LaPelle, Nancy R; Stebbins, Glenn T
2015-10-01
This study was undertaken to define the number of missing values permissible to render valid total scores for each Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS) part. To handle missing values, imputation strategies serve as guidelines to reject an incomplete rating or create a surrogate score. We tested a rigorous, scale-specific, data-based approach to handling missing values for the MDS-UPDRS. From two large MDS-UPDRS datasets, we sequentially deleted item scores, either consistently (same items) or randomly (different items) across all subjects. Lin's Concordance Correlation Coefficient (CCC) compared scores calculated without missing values with prorated scores based on sequentially increasing missing values. The maximal number of missing values retaining a CCC greater than 0.95 determined the threshold for rendering a valid prorated score. A second confirmatory sample was selected from the MDS-UPDRS international translation program. To provide valid part scores applicable across all Hoehn and Yahr (H&Y) stages when the same items are consistently missing, one missing item from Part I, one from Part II, three from Part III, but none from Part IV can be allowed. To provide valid part scores applicable across all H&Y stages when random item entries are missing, one missing item from Part I, two from Part II, seven from Part III, but none from Part IV can be allowed. All cutoff values were confirmed in the validation sample. These analyses are useful for constructing valid surrogate part scores for MDS-UPDRS when missing items fall within the identified threshold and give scientific justification for rejecting partially completed ratings that fall below the threshold. © 2015 International Parkinson and Movement Disorder Society.
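The thresholding procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not define how a prorated score is computed, so the sketch assumes the common rule "mean of observed items times the number of items"; the function names `prorated_total` and `lins_ccc` are hypothetical, and missing items are represented as `None`.

```python
from statistics import fmean


def prorated_total(item_scores, max_missing):
    """Return a prorated part score, or None if too many items are missing.

    Assumption: prorating means (mean of observed items) * (number of items).
    Missing items are encoded as None.
    """
    observed = [s for s in item_scores if s is not None]
    n_missing = len(item_scores) - len(observed)
    if n_missing > max_missing or not observed:
        return None  # rating rejected: above the permitted number of missing items
    return fmean(observed) * len(item_scores)


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for two equal-length lists.

    Uses population (divide-by-n) variances and covariance.
    """
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)
```

In use, one would compute complete-data totals and prorated totals for the same subjects after deleting a growing number of items, and keep the largest number of deletions for which `lins_ccc` stays above 0.95.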
Tsubono, Y; Fukao, A; Hisamichi, S
1994-06-01
A self-administered questionnaire using the mark-sheet method (MSM), in which responses of subjects are computer processed directly through an optical scanning device, has recently been utilized in epidemiologic surveys. Compared to the data coding process for a conventional questionnaire, in which a keypuncher enters the responses manually into a computer (manual method; MM), optical scanning requires less time and cost. Accuracy of the MSM for use in the general population in Japan, however, remains uncertain. Therefore the response rates, frequencies of missing values, validity and reproducibility of the answers in self-administered questionnaires were compared between the MSM and MM. Subjects were 463 residents aged 40-69 years living in 6 local districts of a rural town in northeastern Japan. They were randomly allocated, by district, to the MSM group (n = 242) or the MM group (n = 221). The questionnaire was delivered and collected at the subject's home by volunteers. Two weeks after collecting the original questionnaire, the same type of questionnaire was again distributed to half of the responders, randomly chosen, to investigate reproducibility. The overall response rate did not differ between MSM and MM (96.7% vs 98.2%, p = 0.312). Among questions with a multiple-choice type of answer, proportions of missing values did not differ for most of the items, but they were lower in MSM for all of the 33 food frequency items. Reproducibility of food frequency items, measured by Spearman's rank correlation, did not differ substantially between the two groups. (ABSTRACT TRUNCATED AT 250 WORDS)
Erueti, Chrissy; Glasziou, Paul P
2013-01-01
Objectives To evaluate the completeness of descriptions of non-pharmacological interventions in randomised trials, identify which elements are most frequently missing, and assess whether authors can provide missing details. Design Analysis of consecutive sample of randomised trials of non-pharmacological interventions. Data sources and study selection All reports of randomised trials of non-pharmacological interventions published in 2009 in six leading general medical journals; 133 trial reports, with 137 interventions, met the inclusion criteria. Data collection Using an eight item checklist, two raters assessed the primary full trial report, plus any reference materials, appendices, or websites. Questions about missing details were emailed to corresponding authors, and relevant items were then reassessed. Results Of 137 interventions, only 53 (39%) were adequately described; this was increased to 81 (59%) by using 63 responses from 88 contacted authors. The most frequently missing item was the “intervention materials” (47% complete), but it also improved the most after author response (92% complete). Whereas some authors (27/70) provided materials or further information, other authors (21/70) could not; their reasons included copyright or intellectual property concerns, not having the materials or intervention details, or being unaware of their importance. Although 46 (34%) trial interventions had further information or materials readily available on a website, many were not mentioned in the report, were not freely accessible, or the URL was no longer functioning. Conclusions Missing essential information about interventions is a frequent, yet remediable, contributor to the worldwide waste in research funding. If trial reports do not have a sufficient description of interventions, other researchers cannot build on the findings, and clinicians and patients cannot reliably implement useful interventions. 
Improvement will require action by funders, researchers, and publishers, aided by long term repositories of materials linked to publications. PMID:24021722
Normal Theory Two-Stage ML Estimator When Data Are Missing at the Item Level
Savalei, Victoria; Rhemtulla, Mijke
2017-01-01
In many modeling contexts, the variables in the model are linear composites of the raw items measured for each participant; for instance, regression and path analysis models rely on scale scores, and structural equation models often use parcels as indicators of latent constructs. Currently, no analytic estimation method exists to appropriately handle missing data at the item level. Item-level multiple imputation (MI), however, can handle such missing data straightforwardly. In this article, we develop an analytic approach for dealing with item-level missing data—that is, one that obtains a unique set of parameter estimates directly from the incomplete data set and does not require imputations. The proposed approach is a variant of the two-stage maximum likelihood (TSML) methodology, and it is the analytic equivalent of item-level MI. We compare the new TSML approach to three existing alternatives for handling item-level missing data: scale-level full information maximum likelihood, available-case maximum likelihood, and item-level MI. We find that the TSML approach is the best analytic approach, and its performance is similar to item-level MI. We recommend its implementation in popular software and its further study. PMID:29276371
Godin, Judith; Keefe, Janice; Andrew, Melissa K
2017-04-01
Missing values are commonly encountered on the Mini Mental State Examination (MMSE), particularly when administered to frail older people. This presents challenges for MMSE scoring in research settings. We sought to describe missingness in MMSEs administered in long-term-care facilities (LTCF) and to compare and contrast approaches to dealing with missing items. As part of the Care and Construction project in Nova Scotia, Canada, LTCF residents completed an MMSE. Different methods of dealing with missing values (e.g., use of raw scores, raw scores/number of items attempted, scale-level multiple imputation [MI], and blended approaches) are compared to item-level MI. The MMSE was administered to 320 residents living in 23 LTCF. The sample was predominately female (73%), and 38% of participants were aged >85 years. At least one item was missing from 122 (38.2%) of the MMSEs. Data were not Missing Completely at Random (MCAR), χ²(1110) = 1,351, p < 0.001. Using raw scores for those missing <6 items in combination with scale-level MI resulted in the regression coefficients and standard errors closest to item-level MI. Patterns of missing items often suggest systematic problems, such as trouble with manual dexterity, literacy, or visual impairment. While these observations may be relatively easy to take into account in clinical settings, non-random missingness presents challenges for research and must be considered in statistical analyses. We present suggestions for dealing with missing MMSE data based on the extent of missingness and the goal of analyses. Copyright © 2016 The Authors. Production and hosting by Elsevier B.V. All rights reserved.
A Monte Carlo Study Investigating Missing Data, Differential Item Functioning, and Effect Size
ERIC Educational Resources Information Center
Garrett, Phyllis
2009-01-01
The use of polytomous items in assessments has increased over the years, and as a result, the validity of these assessments has been a concern. Differential item functioning (DIF) and missing data are two factors that may adversely affect assessment validity. Both factors have been studied separately, but DIF and missing data are likely to occur…
Ipsative imputation for a 15-item Geriatric Depression Scale in community-dwelling elderly people.
Imai, Hissei; Furukawa, Toshiaki A; Kasahara, Yoriko; Ishimoto, Yasuko; Kimura, Yumi; Fukutomi, Eriko; Chen, Wen-Ling; Tanaka, Mire; Sakamoto, Ryota; Wada, Taizo; Fujisawa, Michiko; Okumiya, Kiyohito; Matsubayashi, Kozo
2014-09-01
Missing data are inevitable in almost all medical studies. Imputation methods using the probabilistic model are common, but they cannot impute individual data and require special software. In contrast, the ipsative imputation method, which substitutes the missing items by the mean of the remaining items within the individual, is easy, does not need any special software, and can provide individual scores. The aim of the present study was to evaluate the validity of the ipsative imputation method using data involving the 15-item Geriatric Depression Scale. Participants were community-dwelling elderly individuals (n = 1178). A structural equation model was constructed. The model fit indexes were calculated to assess the validity of the imputation method when it is used for individuals who were missing 20% of data or less and 40% of data or less, depending on whether we assumed that their correlation coefficients were the same as the dataset with no missing items. Finally, we compared path coefficients of the dataset imputed by ipsative imputation with those by multiple imputation. When compared with the assumption that the datasets differed, all of the model fit indexes were better under the assumption that the dataset without missing data is the same as the dataset that was missing 20% of data or less. However, under the same assumption, the model fit indexes were worse in the dataset that was missing 40% of data or less. The path coefficients of the dataset imputed by ipsative imputation and by multiple imputation were compatible with each other if the proportion of missing items was 20% or less. Ipsative imputation appears to be a valid imputation method and can be used to impute data in studies using the 15-item Geriatric Depression Scale, if the percentage of its missing items is 20% or less. © 2014 The Authors. Psychogeriatrics © 2014 Japanese Psychogeriatric Society.
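Ipsative imputation as described in this abstract (replace each missing item with the mean of the same person's remaining items) is simple enough to show directly. The following is a minimal sketch, not the authors' code: the function name `ipsative_impute`, the `None` encoding of missing items, and the choice to leave a row untouched when it exceeds the cutoff are all assumptions; the 20% default mirrors the cutoff the abstract reports as acceptable.

```python
def ipsative_impute(row, max_missing_frac=0.20):
    """Fill missing items (None) with the person's mean over answered items.

    Rows with more than max_missing_frac missing items, or with no answered
    items at all, are returned unchanged (an assumption of this sketch).
    """
    observed = [v for v in row if v is not None]
    n_missing = len(row) - len(observed)
    if n_missing == 0 or not observed:
        return list(row)
    if n_missing / len(row) > max_missing_frac:
        return list(row)
    person_mean = sum(observed) / len(observed)
    return [person_mean if v is None else v for v in row]
```

For a 15-item scale, the default allows up to three missing items per respondent. Note that the imputed values are fractional even when the items themselves are scored 0/1.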
Evaluation of Student Performance through a Multidimensional Finite Mixture IRT Model.
Bacci, Silvia; Bartolucci, Francesco; Grilli, Leonardo; Rampichini, Carla
2017-01-01
In the Italian academic system, a student can enroll for an exam immediately after the end of the teaching period or can postpone it; in this second case the exam result is missing. We propose an approach for the evaluation of a student performance throughout the course of study, accounting also for nonattempted exams. The approach is based on an item response theory model that includes two discrete latent variables representing student performance and priority in selecting the exams to take. We explicitly account for nonignorable missing observations as the indicators of attempted exams also contribute to measure the performance (within-item multidimensionality). The model also allows for individual covariates in its structural part.
Giesbrecht, Gerald F; Dewey, Deborah
2014-10-01
The Infant Behavior Questionnaire-Revised (IBQ-R) is a widely used parent report measure of infant temperament. Items marked 'does not apply' (NA) are treated as missing data when calculating scale scores, but the effect of this practice on assessment of infant temperament has not been reported. This study aimed to determine the effect of NA responses on assessment of infant temperament and to evaluate the remedy offered by several missing data strategies, using a prospective, community-based longitudinal cohort study of 401 infants who were born at >37 weeks of gestation. Mothers completed the short form of the IBQ-R when infants were 3 months and 6 months of age. The rate of NA responses at the 3-month assessment was three times as high (22%) as the rate at six months (7%). Internal consistency was appreciably reduced and scale means were inflated in the presence of NA responses, especially at 3 months. The total number of NA items endorsed by individual parents was associated with infant age and parity. None of the missing data strategies completely eliminated problems related to NA responses but the Expectation Maximization algorithm greatly reduced these problems. The findings suggest that researchers should exercise caution when interpreting results obtained from infants at 3 months of age. Careful selection of scales, selecting a full length version of the IBQ-R, and use of a modern missing data technique may help to maintain the quality of data obtained from very young infants. Copyright © 2014 Elsevier Ltd. All rights reserved.
The Rasch Model and Missing Data, with an Emphasis on Tailoring Test Items.
ERIC Educational Resources Information Center
de Gruijter, Dato N. M.
Many applications of educational testing have a missing data aspect (MDA). This MDA is perhaps most pronounced in item banking, where each examinee responds to a different subtest of items from a large item pool and where both person and item parameter estimates are needed. The Rasch model is emphasized, and its non-parametric counterpart (the…
Identifying patterns of item missing survey data using latent groups: an observational study
Barnett, Adrian G; McElwee, Paul; Nathan, Andrea; Burton, Nicola W; Turrell, Gavin
2017-01-01
Objectives To examine whether respondents to a survey of health and physical activity and potential determinants could be grouped according to the questions they missed, known as ‘item missing’. Design Observational study of longitudinal data. Setting Residents of Brisbane, Australia. Participants 6901 people aged 40–65 years in 2007. Materials and methods We used a latent class model with a mixture of multinomial distributions and chose the number of classes using the Bayesian information criterion. We used logistic regression to examine if participants’ characteristics were associated with their modal latent class. We used logistic regression to examine whether the amount of item missing in a survey predicted wave missing in the following survey. Results Four per cent of participants missed almost one-fifth of the questions, and this group missed more questions in the middle of the survey. Eighty-three per cent of participants completed almost every question, but had a relatively high missing probability for a question on sleep time, a question which had an inconsistent presentation compared with the rest of the survey. Participants who completed almost every question were generally younger and more educated. Participants who completed more questions were less likely to miss the next longitudinal wave. Conclusions Examining patterns in item missing data has improved our understanding of how missing data were generated and has informed future survey design to help reduce missing data. PMID:29084795
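The model-selection step mentioned in this abstract (choosing the number of latent classes by the Bayesian information criterion) can be illustrated as follows. This sketch does not fit a latent class model; it assumes each candidate model has already been fitted and that its maximised log-likelihood and number of free parameters are known. The function names and the numbers in the usage note are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * logL (lower is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_by_bic(fits, n_obs):
    """Pick the number of classes with the smallest BIC.

    fits maps number_of_classes -> (log_likelihood, n_params).
    """
    return min(fits, key=lambda k: bic(fits[k][0], fits[k][1], n_obs))
```

For example, with made-up fits `{1: (-1200.0, 10), 2: (-1100.0, 21), 3: (-1095.0, 32)}` and 500 respondents, the two-class model is selected because the small likelihood gain of the third class does not offset its extra parameters.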
Kelly, Nichole R; Cotter, Elizabeth W; Lydecker, Janet A; Mazzeo, Suzanne E
2017-01-01
The aim of this study was to examine relations among missing and discrepant data on the Eating Disorders Examination Questionnaire (EDE-Q; Fairburn & Beglin, 1994) and individual demographic factors and eating disorder symptoms. Data from 3968 men and women collected in five independent studies were examined. Descriptive statistics were used to detect the quantity of missing and discrepant data, as well as independent samples t-tests and chi-square analyses to examine group differences between participants with and without missing or discrepant data. Results indicated significant differences in data completeness by participant race/ethnicity and severity of eating disorder symptoms. White participants were most likely to provide complete survey responses, and Asian American participants were least likely to provide complete survey responses. Participants with incomplete surveys reported greater eating disorder symptoms and behaviors compared with those with complete surveys. Similarly, those with discrepant responses to behavioral items reported greater eating disorder symptoms and behaviors compared with those with congruent responses. Practical implications and recommendations for reducing and addressing incomplete data on the EDE-Q are discussed. Copyright © 2016. Published by Elsevier Ltd.
Nguyen, Cattram D; Strazdins, Lyndall; Nicholson, Jan M; Cooklin, Amanda R
2018-07-01
The long-term effects of employment - a major social determinant - on population health are best understood via longitudinal cohort studies, yet missing data (attrition, item non-response) remain a ubiquitous challenge. Additionally, and unique to the work-family context, is the intermittent participation of parents, particularly mothers, in employment, yielding 'incomplete' data. Missing data are patterned by gender and social circumstances, and the extent and nature of resulting biases are unknown. This study investigates how estimates of the association between work-family conflict and mental health depend on the use of four different approaches to missing data treatment, each of which allows for progressive inclusion of more cases in the analyses. We used 5 waves of data from 4983 mothers participating in the Longitudinal Study of Australian Children. Only 23% had completely observed work-family conflict data across all waves. Participants with and without missing data differed such that complete cases were the most advantaged group. Comparison of the missing data treatments indicates the expected narrowing of confidence intervals as more of the sample was included. However, impact on the estimated strength of association varied by level of exposure: at the lower levels of work-family conflict, estimates strengthened (were larger); at higher levels they weakened (were smaller). Our results suggest that inadequate handling of missing data in extant longitudinal studies of work-family conflict and mental health may have misestimated the adverse effects of work-family conflict, particularly for mothers. Considerable caution should be exercised in interpreting analyses that fail to explore and account for biases arising from missing data. Copyright © 2018. Published by Elsevier Ltd.
Hamel, J F; Sebille, V; Le Neel, T; Kubis, G; Boyer, F C; Hardouin, J B
2017-12-01
Subjective health measurements using Patient Reported Outcomes (PRO) are increasingly used in randomized trials, particularly for comparisons between patient groups. Two main types of analytical strategies can be used for such data: Classical Test Theory (CTT) and Item Response Theory models (IRT). These two strategies display very similar characteristics when data are complete, but in the common case when data are missing, whether IRT or CTT would be the most appropriate remains unknown and was investigated using simulations. We simulated PRO data such as quality of life data. Missing responses to items were simulated as being completely random, depending on an observable covariate or on an unobserved latent trait. The considered CTT-based methods allowed comparing scores using complete-case analysis, personal mean imputations or multiple-imputations based on a two-way procedure. The IRT-based method was the Wald test on a Rasch model including a group covariate. The IRT-based method and the multiple-imputations-based method for CTT displayed the highest observed power and were the only unbiased methods whatever the kind of missing data. Online software and Stata® modules compatible with the innate mi impute suite are provided for performing such analyses. Traditional procedures (listwise deletion and personal mean imputations) should be avoided, due to inevitable problems of biases and lack of power.
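The three missingness mechanisms named in this abstract (completely random, depending on an observed covariate, depending on the latent trait) can be made concrete with a small simulation. This is a hedged sketch, not the authors' simulation design: the dichotomous Rasch data generator, the specific missingness probabilities, the direction of the latent-trait effect (lower trait means more missing) and all function names are assumptions chosen for illustration.

```python
import math
import random


def simulate_rasch(thetas, difficulties, rng):
    """Simulate 0/1 responses under a Rasch model: P(x=1) = logistic(theta - b)."""
    return [
        [1 if rng.random() < 1.0 / (1.0 + math.exp(-(t - b))) else 0
         for b in difficulties]
        for t in thetas
    ]


def missing_probability(mechanism, base_rate, covariate, theta):
    """Per-response probability of being missing (all values are assumptions)."""
    if mechanism == "mcar":        # completely random
        return base_rate
    if mechanism == "covariate":   # depends on an observed binary covariate
        return base_rate * (1.5 if covariate == 1 else 0.5)
    if mechanism == "latent":      # depends on the unobserved latent trait
        logit = math.log(base_rate / (1.0 - base_rate)) - theta
        return 1.0 / (1.0 + math.exp(-logit))
    raise ValueError(f"unknown mechanism: {mechanism}")


def add_missing(responses, mechanism, rng, covariates, thetas, base_rate=0.2):
    """Return a copy of responses with some entries replaced by None."""
    out = []
    for row, g, t in zip(responses, covariates, thetas):
        p = missing_probability(mechanism, base_rate, g, t)
        out.append([None if rng.random() < p else v for v in row])
    return out
```

Each incomplete data set produced this way could then be analysed with the competing strategies (complete-case analysis, personal mean imputation, multiple imputation, or a Rasch model) to compare bias and power.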
Bell, Melanie L; Butow, Phyllis N; Goldstein, David
2013-12-01
Although cancer can seriously affect people's sexual well-being, survivors and patients may be reluctant to answer questions about sex. This reluctance may be stronger for immigrants. This study aimed to investigate missing sex data rates and predictors of missingness in two large studies on immigrants and Anglo-Australian controls with cancer and to investigate whether those with missing sex data may have worse sexual outcomes than those with complete data. We carried out two studies aimed at describing the quality of life (QoL) and unmet needs amongst Arabic, Chinese and Greek immigrant versus Anglo-Australian cancer survivors (n = 596, recruited from cancer registries) and patients (n = 845). Logistic regression was used to model the probability of having missing sex data in either of the questionnaires. We compared the mean of the unmet sex needs responses of those who had missing QoL sex data (but not needs) to those who had completed both, and vice versa. Missing sex data rates were as high as 65%, with immigrants more likely to skip sex items than Anglo-Australians (p = 0.02 for registry study, p < 0.0001 for hospital study). Women, older participants and participants with more advanced disease had increased odds of missingness. There was evidence that data were informatively missing. Additionally, the questionnaire which stated that the sex questions are optional had higher missing data rates. High missing data rates and informatively missing data can lead to biased results. Using questionnaires that state that respondents may skip sex items may lead to an underestimation of sexual problems or an overestimation of quality of life.
Wolfe, Edward W; McGill, Michael T
2011-01-01
This article summarizes a simulation study of the performance of five item quality indicators (the weighted and unweighted versions of the mean square and standardized mean square fit indices and the point-measure correlation) under conditions of relatively high and low amounts of missing data under both random and conditional patterns of missing data for testing contexts such as those encountered in operational administrations of a computerized adaptive certification or licensure examination. The results suggest that weighted fit indices, particularly the standardized mean square index, and the point-measure correlation provide the most consistent information between random and conditional missing data patterns and that these indices perform more comparably for items near the passing score than for items with extreme difficulty values.
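For readers unfamiliar with the fit indices compared in this abstract, the weighted and unweighted mean squares for a dichotomous Rasch item can be computed over observed responses only, which is how they stay defined when data are missing. The sketch below uses the usual textbook definitions and assumes person abilities and the item difficulty are already estimated; it does not compute the standardized versions or the point-measure correlation, and the function names are hypothetical.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def item_mean_squares(responses, thetas, difficulty):
    """Return (weighted, unweighted) mean-square fit for one item.

    responses: 0/1 scores, with None for items a person did not see or answer.
    Weighted   = sum of squared residuals / sum of model variances.
    Unweighted = mean of squared standardized residuals.
    """
    resid_sq_sum = 0.0
    variance_sum = 0.0
    z_sq_sum = 0.0
    n_observed = 0
    for x, theta in zip(responses, thetas):
        if x is None:
            continue  # missing responses contribute nothing
        p = rasch_probability(theta, difficulty)
        variance = p * (1.0 - p)
        resid_sq = (x - p) ** 2
        resid_sq_sum += resid_sq
        variance_sum += variance
        z_sq_sum += resid_sq / variance
        n_observed += 1
    if n_observed == 0:
        raise ValueError("no observed responses for this item")
    return resid_sq_sum / variance_sum, z_sq_sum / n_observed
```

Values near 1 indicate responses about as noisy as the model expects; the unweighted index reacts more strongly to unexpected responses from persons far from the item's difficulty.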
Marfeo, Elizabeth E; Ni, Pengsheng; Chan, Leighton; Rasch, Elizabeth K; Jette, Alan M
2014-07-01
The goal of this article was to investigate the optimal functioning of frequency vs. agreement rating scales in two subdomains of the newly developed Work Disability Functional Assessment Battery: the Mood & Emotions and Behavioral Control scales. A psychometric study comparing rating scale performance embedded in a cross-sectional survey used for developing a new instrument to measure behavioral health functioning among adults applying for disability benefits in the United States was performed. Within the sample of 1,017 respondents, the range of response category endorsement was similar for both frequency and agreement item types for both scales. There were fewer missing values in the frequency items than the agreement items. Both frequency and agreement items showed acceptable reliability. The frequency items demonstrated optimal effectiveness around the mean ± 1-2 standard deviation score range; the agreement items performed better at the extreme score ranges. Findings suggest an optimal response format requires a mix of both agreement-based and frequency-based items. Frequency items perform better in the normal range of responses, capturing specific behaviors, reactions, or situations that may elicit a specific response. Agreement items do better for those whose scores are more extreme and capture subjective content related to general attitudes, behaviors, or feelings of work-related behavioral health functioning. Copyright © 2014 Elsevier Inc. All rights reserved.
Taking the Missing Propensity Into Account When Estimating Competence Scores
Pohl, Steffi; Carstensen, Claus H.
2014-01-01
When competence tests are administered, subjects frequently omit items. These missing responses pose a threat to correctly estimating the proficiency level. Newer model-based approaches aim to take nonignorable missing data processes into account by incorporating a latent missing propensity into the measurement model. Two assumptions are typically made when using these models: (1) The missing propensity is unidimensional and (2) the missing propensity and the ability are bivariate normally distributed. These assumptions may, however, be violated in real data sets and could, thus, pose a threat to the validity of this approach. The present study focuses on modeling competencies in various domains, using data from a school sample (N = 15,396) and an adult sample (N = 7,256) from the National Educational Panel Study. Our interest was to investigate whether violations of unidimensionality and the normal distribution assumption severely affect the performance of the model-based approach in terms of differences in ability estimates. We propose a model with a competence dimension, a unidimensional missing propensity and a distributional assumption more flexible than a multivariate normal. Using this model for ability estimation results in different ability estimates compared with a model ignoring missing responses. Implications for ability estimation in large-scale assessments are discussed. PMID:29795844
Daniels, Vijay J; Bordage, Georges; Gierl, Mark J; Yudkowsky, Rachel
2014-10-01
Objective structured clinical examinations (OSCEs) are used worldwide for summative examinations but often lack acceptable reliability. Research has shown that reliability of scores increases if OSCE checklists for medical students include only clinically relevant items. Also, checklists are often missing evidence-based items that high-achieving learners are more likely to use. The purpose of this study was to determine if limiting checklist items to clinically discriminating items and/or adding missing evidence-based items improved score reliability in an Internal Medicine residency OSCE. Six internists reviewed the traditional checklists of four OSCE stations classifying items as clinically discriminating or non-discriminating. Two independent reviewers augmented checklists with missing evidence-based items. We used generalizability theory to calculate overall reliability of faculty observer checklist scores from 45 first and second-year residents and predict how many 10-item stations would be required to reach a Phi coefficient of 0.8. Removing clinically non-discriminating items from the traditional checklist did not affect the number of stations (15) required to reach a Phi of 0.8 with 10 items. Focusing the checklist on only evidence-based clinically discriminating items increased test score reliability, needing 11 stations instead of 15 to reach 0.8; adding missing evidence-based clinically discriminating items to the traditional checklist modestly improved reliability (needing 14 instead of 15 stations). Checklists composed of evidence-based clinically discriminating items improved the reliability of checklist scores and reduced the number of stations needed for acceptable reliability. Educators should give preference to evidence-based items over non-evidence-based items when developing OSCE checklists.
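The "how many stations are needed to reach a Phi of 0.8" question in this abstract is a decision-study projection from generalizability theory. A minimal sketch follows, assuming a simplified single-facet design in which the absolute error variance for one station is known and shrinks in proportion to the number of stations; the study's actual design (items within stations) is richer than this, and the variance components in the usage note are invented, not the study's estimates.

```python
def phi_coefficient(var_person, var_abs_error, n_stations):
    """Dependability (Phi) for a mean over n_stations stations.

    var_person:    variance of examinees' universe scores
    var_abs_error: absolute error variance for a single station
    """
    return var_person / (var_person + var_abs_error / n_stations)


def stations_needed(var_person, var_abs_error, target=0.8, max_stations=1000):
    """Smallest number of stations whose projected Phi reaches the target."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    for n in range(1, max_stations + 1):
        if phi_coefficient(var_person, var_abs_error, n) >= target:
            return n
    raise ValueError("target not reached within max_stations")
```

With illustrative components `var_person=1.0` and `var_abs_error=3.6`, the projection calls for 15 stations; lowering the error variance to 2.6, as a more discriminating checklist might, brings it down to 11.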
ERIC Educational Resources Information Center
van Ginkel, Joost R.; van der Ark, L. Andries; Sijtsma, Klaas
2007-01-01
The performance of five simple multiple imputation methods for dealing with missing data was compared. In addition, random imputation and multivariate normal imputation were used as lower and upper benchmark, respectively. Test data were simulated and item scores were deleted such that they were either missing completely at random, missing at…
ERIC Educational Resources Information Center
Sweller, Naomi
2015-01-01
Individuals with autism have difficulty generalising information from one situation to another, a process that requires the learning of categories and concepts. Category information may be learned through: (1) classifying items into categories, or (2) predicting missing features of category items. Predicting missing features has to this point been…
Merz, Erin L; Kwakkenbos, Linda; Carrier, Marie-Eve; Gholizadeh, Shadi; Mills, Sarah D; Fox, Rina S; Jewett, Lisa R; Williamson, Heidi; Harcourt, Diana; Assassi, Shervin; Furst, Daniel E; Gottesman, Karen; Mayes, Maureen D; Moss, Tim P; Thombs, Brett D; Malcarne, Vanessa L
2018-01-01
Objective: Valid measures of appearance concern are needed in systemic sclerosis (SSc), a rare, disfiguring autoimmune disease. The Derriford Appearance Scale-24 (DAS-24) assesses appearance-related distress related to visible differences. There is uncertainty regarding its factor structure, possibly due to its scoring method. Design: Cross-sectional survey. Setting: Participants with SSc were recruited from 27 centres in Canada, the USA and the UK. Participants who self-identified as having visible differences were recruited from community and clinical settings in the UK. Participants: Two samples were analysed (n=950 participants with SSc; n=1265 participants with visible differences). Primary and secondary outcome measures: The DAS-24 factor structure was evaluated using two scoring methods. Convergent validity was evaluated with measures of social interaction anxiety, depression, fear of negative evaluation, social discomfort and dissatisfaction with appearance. Results: When items marked by respondents as ‘not applicable’ were scored as 0, per standard DAS-24 scoring, a one-factor model fit poorly; when treated as missing data, the one-factor model fit well. Convergent validity analyses revealed strong correlations that were similar across scoring methods. Conclusions: Treating ‘not applicable’ responses as missing improved the measurement model, but did not substantively influence practical inferences that can be drawn from DAS-24 scores. Indications of item redundancy and poorly performing items suggest that the DAS-24 could be improved and potentially shortened. PMID:29511009
Reliability of dietary information from surrogate respondents.
Hislop, T G; Coldman, A J; Zheng, Y Y; Ng, V T; Labo, T
1992-01-01
A self-administered food frequency questionnaire was included as part of a case-control study of breast cancer in 1980-82. In 1986-87, a second food frequency questionnaire was sent to surviving cases and husbands of deceased cases; 30 spouses (86% response rate) and 263 surviving cases (88% response rate) returned questionnaires. The dietary questions concerned consumption of specific food items by the case before diagnosis of breast cancer. Missing values were less common in the second questionnaire; there was no significant difference in missing values between surviving cases and spouses of deceased cases. Kappa statistics comparing responses in the first and second questionnaires were significantly lower for spouses of deceased cases than for surviving cases. Reported level of confidence by the husbands regarding knowledge about their wives' eating habits did not influence the kappa statistics or the frequencies of missing values. The lack of good agreement has important implications for the use of proxy interviews from husbands in retrospective dietary studies.
Bannon, William
2015-04-01
Missing data typically refer to the absence of one or more values within a study variable(s) contained in a dataset. This absence is often the result of a study participant choosing not to provide a response to a survey item. In general, a greater number of missing values within a dataset reflects a greater challenge to the data analyst. However, if researchers are armed with just a few basic tools, they can quite effectively diagnose how serious the issue of missing data is within a dataset, as well as prescribe the most appropriate solution. Specifically, the keys to effectively assessing and treating missing data values within a dataset involve specifying how missing data will be defined in a study, assessing the amount of missing data, identifying the pattern of the missing data, and selecting the best way to treat the missing data values. I will touch on each of these processes and provide a brief illustration of how the validity of study findings is at great risk if missing data values are not treated effectively. ©2015 American Association of Nurse Practitioners.
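The "assessing the amount of missing data" step named in this abstract can be sketched in a few lines. This is a minimal illustration, assuming missing values are coded as None; the function name, variable names and survey records are all hypothetical and do not come from the article.

```python
def missing_summary(records, variables):
    """Return (proportion missing per variable, number of complete cases).

    Assumes a missing value is stored as None or the key is absent.
    """
    n = len(records)
    per_variable = {
        v: sum(1 for r in records if r.get(v) is None) / n for v in variables
    }
    complete_cases = sum(
        1 for r in records if all(r.get(v) is not None for v in variables)
    )
    return per_variable, complete_cases


# Hypothetical survey records for illustration only.
survey = [
    {"age": 34, "income": 52000, "stress": 3},
    {"age": 41, "income": None, "stress": 4},
    {"age": None, "income": None, "stress": 2},
    {"age": 29, "income": 48000, "stress": None},
]

proportions, complete = missing_summary(survey, ["age", "income", "stress"])
print(proportions, complete)
```

Comparing the per-variable proportions with the complete-case count shows how quickly listwise deletion shrinks a sample even when each variable is only modestly incomplete.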
Generating Multiple Imputations for Matrix Sampling Data Analyzed with Item Response Models.
ERIC Educational Resources Information Center
Thomas, Neal; Gan, Nianci
1997-01-01
Describes and assesses missing data methods currently used to analyze data from matrix sampling designs implemented by the National Assessment of Educational Progress. Several improved methods are developed, and these models are evaluated using an EM algorithm to obtain maximum likelihood estimates followed by multiple imputation of complete data…
Tadić, Valerija; Cooper, Andrew; Cumberland, Phillippa; Lewando-Hundt, Gillian; Rahi, Jugnoo S
2013-12-01
To develop a novel age-appropriate measure of functional vision (FV) for self-reporting by visually impaired (VI) children and young people. Questionnaire development. A representative patient sample of VI children and young people aged 10 to 15 years, visual acuity of the logarithm of the minimum angle of resolution (logMAR) worse than 0.48, and a school-based (nonrandom) expert group sample of VI students aged 12 to 17 years. A total of 32 qualitative semistructured interviews supplemented by narrative feedback from 15 eligible VI children and young people were used to generate draft instrument items. Seventeen VI students were consulted individually on item relevance and comprehensibility, instrument instructions, format, and administration methods. The resulting draft instrument was piloted with 101 VI children and young people comprising a nationally representative sample, drawn from 21 hospitals in the United Kingdom. Initial item reduction was informed by presence of missing data and individual item response pattern. Exploratory factor analysis (FA) and parallel analysis (PA), and Rasch analysis (RA) were applied to test the instrument's psychometric properties. Psychometric indices and validity assessment of the Functional Vision Questionnaire for Children and Young People (FVQ_CYP). A total of 712 qualitative statements became a 56-item draft scale, capturing the level of difficulty in performing vision-dependent activities. After piloting, items were removed iteratively as follows: 11 for high percentage of missing data, 4 for skewness, and 1 for inadequate item infit and outfit values in RA, 3 having shown differential item functioning across age groups and 1 across gender in RA. The remaining 36 items showed item fit values within acceptable limits, good measurement precision and targeting, and ordered response categories. 
The reduced scale has a clear unidimensional structure, with all items having a high factor loading on the single factor in FA and PA. The summary scores correlated significantly with visual acuity. We have developed a novel, psychometrically robust self-report questionnaire for children and young people-the FVQ_CYP-that captures the functional impact of visual disability from their perspective. The 36-item, 4-point unidimensional scale has potential as a complementary adjunct to objective clinical assessments in routine pediatric ophthalmology practice and in research. Copyright © 2013 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
The hippocampus supports both recollection and familiarity when memories are strong
Smith, Christine N.; Wixted, John T.; Squire, Larry R.
2011-01-01
Recognition memory is thought to consist of two component processes – recollection and familiarity. It has been suggested that the hippocampus supports recollection, while adjacent cortex supports familiarity. However, the qualitative experiences of recollection and familiarity are typically confounded with a quantitative difference in memory strength (recollection > familiarity). Thus, the question remains whether the hippocampus might in fact support familiarity-based memories whenever they are as strong as recollection-based memories. We addressed this problem in a novel way using the Remember/Know procedure where we could explicitly match the confidence and accuracy of Remember and Know decisions. As in earlier studies, recollected items had higher accuracy and confidence than familiar items, and hippocampal activity was higher for recollected items than for familiar items. Furthermore hippocampal activity was similar for familiar items, misses, and correct rejections. When the accuracy and confidence of recollected and familiar items were matched, the findings were dramatically different. Hippocampal activity was now similar for recollected and familiar items. Importantly, hippocampal activity was also greater for familiar items than for misses or correct rejections (as well as for recollected items vs. misses or correct rejections). Our findings suggest that the hippocampus supports both recollection and familiarity when memories are strong. PMID:22049412
Patient-reported outcome measures in arthroplasty registries
Bohm, Eric; Franklin, Patricia; Lyman, Stephen; Denissen, Geke; Dawson, Jill; Dunn, Jennifer; Eresian Chenok, Kate; Dunbar, Michael; Overgaard, Søren; Garellick, Göran; Lübbeke, Anne
2016-01-01
Abstract — The International Society of Arthroplasty Registries (ISAR) Patient-Reported Outcome Measures (PROMs) Working Group have evaluated and recommended best practices in the selection, administration, and interpretation of PROMs for hip and knee arthroplasty registries. The 2 generic PROMs in common use are the Short Form health surveys (SF-36 or SF-12) and EuroQol 5-dimension (EQ-5D). The Working Group recommends that registries should choose specific PROMs that have been appropriately developed with good measurement properties for arthroplasty patients. The Working Group recommend the use of a 1-item pain question (“During the past 4 weeks, how would you describe the pain you usually have in your [right/left] [hip/knee]?”; response: none, very mild, mild, moderate, or severe) and a single-item satisfaction outcome (“How satisfied are you with your [right/left] [hip/knee] replacement?”; response: very unsatisfied, dissatisfied, neutral, satisfied, or very satisfied). Survey logistics include patient instructions, paper- and electronic-based data collection, reminders for follow-up, centralized as opposed to hospital-based follow-up, sample size, patient- or joint-specific evaluation, collection intervals, frequency of response, missing values, and factors in establishing a PROMs registry program. The Working Group recommends including age, sex, diagnosis at joint, general health status preoperatively, and joint pain and function score in case-mix adjustment models. Interpretation and statistical analysis should consider the absolute level of pain, function, and general health status as well as improvement, missing data, approaches to analysis and case-mix adjustment, minimal clinically important difference, and minimal detectable change. 
The Working Group recommends data collection immediately before and 1 year after surgery, a threshold of 60% for acceptable frequency of response, documentation of non-responders, and documentation of incomplete or missing data. PMID:27228230
Fransson, Eleonor I; Nyberg, Solja T; Heikkilä, Katriina; Alfredsson, Lars; Bacquer, De Dirk; Batty, G David; Bonenfant, Sébastien; Casini, Annalisa; Clays, Els; Goldberg, Marcel; Kittel, France; Koskenvuo, Markku; Knutsson, Anders; Leineweber, Constanze; Magnusson Hanson, Linda L; Nordin, Maria; Singh-Manoux, Archana; Suominen, Sakari; Vahtera, Jussi; Westerholm, Peter; Westerlund, Hugo; Zins, Marie; Theorell, Töres; Kivimäki, Mika
2012-01-20
Background: Job strain (i.e., high job demands combined with low job control) is a frequently used indicator of harmful work stress, but studies have often used partial versions of the complete multi-item job demands and control scales. Understanding whether the different instruments assess the same underlying concepts has crucial implications for the interpretation of findings across studies, harmonisation of multi-cohort data for pooled analyses, and design of future studies. As part of the 'IPD-Work' (Individual-participant-data meta-analysis in working populations) consortium, we compared different versions of the demands and control scales available in 17 European cohort studies. Methods: Six of the 17 studies had information on the complete scales and 11 on partial scales. Here, we analyse individual level data from 70 751 participants of the studies which had complete scales (5 demand items, 6 job control items). Results: We found high Pearson correlation coefficients between complete scales of job demands and control relative to scales with at least three items (r > 0.90) and for partial scales with two items only (r = 0.76-0.88). In comparison with scores from the complete scales, the agreement between job strain definitions was very good when only one item was missing in either the demands or the control scale (kappa > 0.80); good for job strain assessed with three demand items and all six control items (kappa > 0.68) and moderate to good when items were missing from both scales (kappa = 0.54-0.76). The sensitivity was > 0.80 when only one item was missing from either scale, decreasing when several items were missing in one or both job strain subscales. Conclusions: Partial job demand and job control scales with at least half of the items of the complete scales, and job strain indices based on one complete and one partial scale, seemed to assess the same underlying concepts as the complete survey instruments. PMID:22264402
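Agreement between a job strain classification based on complete scales and one based on partial scales is summarised in this abstract with kappa. As a minimal sketch, assuming the statistic is the standard unweighted Cohen's kappa, it can be computed as follows; the function name and the two classification vectors are hypothetical, not IPD-Work data.

```python
def cohens_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    n = len(a)
    categories = set(a) | set(b)
    # Observed proportion of exact agreement.
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    # Agreement expected by chance from the marginal proportions.
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if expected == 1.0:
        return 1.0  # both ratings constant and identical
    return (observed - expected) / (1.0 - expected)


# Hypothetical job strain classifications (1 = strain, 0 = no strain).
complete_scale = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
partial_scale = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

print(round(cohens_kappa(complete_scale, partial_scale), 3))
```

Kappa corrects raw agreement for chance, so it can be noticeably lower than the proportion of identical classifications when one category dominates.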
Modeling Nonignorable Missing Data with Item Response Theory (IRT). Research Report. ETS RR-10-11
ERIC Educational Resources Information Center
Rose, Norman; von Davier, Matthias; Xu, Xueli
2010-01-01
Large-scale educational surveys are low-stakes assessments of educational outcomes conducted using nationally representative samples. In these surveys, students do not receive individual scores, and the outcome of the assessment is inconsequential for respondents. The low-stakes nature of these surveys, as well as variations in average performance…
Goossens, Eva; Luyckx, Koen; Mommen, Nele; Gewillig, Marc; Budts, Werner; Zupancic, Nele; Moons, Philip
2013-12-01
To optimize long-term outcomes, patients with congenital heart disease (CHD) should adopt health-promoting behaviors. Studies on health behavior in afflicted patients are scarce and comparability of study results is limited. To enlarge the body of evidence, we have developed the Health Behavior Scale-Congenital Heart Disease (HBS-CHD). We examined the psychometric properties of the HBS-CHD by providing evidence for (a) the content validity; (b) validity based on the relationships with other variables; (c) reliability in terms of stability; and (d) responsiveness. Ten experts rated the relevance of the HBS-CHD items. The item content validity index (I-CVI) and the averaged scale content validity index (S-CVI/Ave); the modified multi-rater Kappa and proportion of missing values for each question were calculated. Relationships with other variables were evaluated using six hypotheses that were tested in 429 adolescents with CHD. Stability of the instrument was assessed using Heise's method; and responsiveness was tested by calculating the Guyatt's Responsiveness Index (GRI). Overall, 86.3% of the items had a good to excellent content validity; the S-CVI/Ave (0.81) and multi-rater Kappa (0.78) were adequate. The average proportion of missing values was low (1.2%). Because five out of six hypotheses were confirmed, evidence for the validity of the HBS-CHD based on relationships with other variables was provided. The stability of the instrument could not be confirmed based on our data. The GRI showed good to excellent capacity of the HBS-CHD to detect clinical changes in the health behavior over time. We found that the HBS-CHD is a valid and responsive questionnaire to assess health behaviors in patients with CHD.
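The content validity indices mentioned in this abstract follow a simple recipe under the usual convention, which is assumed here because the abstract does not state it: experts rate each item on a 4-point relevance scale, the I-CVI is the proportion of experts giving a 3 or 4, and the S-CVI/Ave is the mean of the I-CVIs. The function names and ratings below are hypothetical.

```python
def item_cvi(ratings, relevant_from=3):
    """Proportion of expert ratings at or above the relevance cut-off."""
    return sum(1 for r in ratings if r >= relevant_from) / len(ratings)


def scale_cvi_ave(ratings_by_item, relevant_from=3):
    """Average of the item-level content validity indices."""
    cvis = [item_cvi(r, relevant_from) for r in ratings_by_item.values()]
    return sum(cvis) / len(cvis)


# Hypothetical ratings from five experts on three items (1-4 scale).
expert_ratings = {
    "item_a": [4, 4, 3, 4, 3],
    "item_b": [4, 2, 3, 3, 1],
    "item_c": [2, 1, 3, 2, 2],
}

for name, ratings in expert_ratings.items():
    print(name, item_cvi(ratings))
print("S-CVI/Ave", scale_cvi_ave(expert_ratings))
```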
An Introduction to Missing Data in the Context of Differential Item Functioning
ERIC Educational Resources Information Center
Banks, Kathleen
2015-01-01
This article introduces practitioners and researchers to the topic of missing data in the context of differential item functioning (DIF), reviews the current literature on the issue, discusses implications of the review, and offers suggestions for future research. A total of nine studies were reviewed. All of these studies determined what effect…
Chen, Tzu-Ching; Kuo, Wen-Jui; Chiang, Ming-Chang; Tseng, Yi-Jhan; Lin, Yung-Yang
2013-08-01
We evaluated the subsequent memory and forgotten effects for Chinese using event-related fMRI. Sixteen normal subjects were recruited and performed incidental memory tasks in which a semantic decision was required during memory encoding. Consistent with previous studies, our results showed bilateral frontal regions as the main locus for the subsequent memory effect. However, the contrast between miss and hit responses revealed larger activation in the bilateral superior temporal gyrus. We proposed that larger activation in the superior temporal gyrus may reflect an alteration of the self-monitoring process that resulted in unsuccessful memory encoding for the miss items. Copyright © 2013 Elsevier Inc. All rights reserved.
Differences in change blindness to real-life scenes in adults with autism spectrum conditions.
Ashwin, Chris; Wheelwright, Sally; Baron-Cohen, Simon
2017-01-01
People often fail to detect large changes to visual scenes following a brief interruption, an effect known as 'change blindness'. People with autism spectrum conditions (ASC) have superior attention to detail and better discrimination of targets, and often notice small details that are missed by others. Together these predict people with autism should show enhanced perception of changes in simple change detection paradigms, including reduced change blindness. However, change blindness studies to date have reported mixed results in ASC, which have sometimes included no differences to controls or even enhanced change blindness. Attenuated change blindness has only been reported to date in ASC in children and adolescents, with no study reporting reduced change blindness in adults with ASC. The present study used a change blindness flicker task to investigate the detection of changes in images of everyday life in adults with ASC (n = 22) and controls (n = 22) using a simple change detection task design and full range of original scenes as stimuli. Results showed the adults with ASC had reduced change blindness compared to adult controls for changes to items of marginal interest in scenes, with no group difference for changes to items of central interest. There were no group differences in overall response latencies to correctly detect changes nor in the overall number of missed detections in the experiment. However, the ASC group showed greater missed changes for marginal interest changes of location, showing some evidence of greater change blindness as well. These findings show both reduced change blindness to marginal interest changes in ASC, based on response latencies, as well as greater change blindness to changes of location of marginal interest items, based on detection rates. 
The findings of reduced change blindness are consistent with clinical reports that people with ASC often notice small changes to less salient items within their environment, and are in-line with theories of enhanced local processing and greater attention to detail in ASC. The findings of lower detection rates for one of the marginal interest conditions may be related to problems in shifting attention or an overly focused attention spotlight.
Wiklander, Maria; Brännström, Johanna; Svedhem, Veronica; Eriksson, Lars E
2015-11-19
Barriers to HIV testing experienced by individuals at risk for HIV can result in treatment delay and further transmission of the disease. Instruments to systematically measure barriers are scarce, but could contribute to improved strategies for HIV testing. The aims of this study were to develop and test a barriers to HIV testing scale in a Swedish context. An 18-item scale was developed, based on an existing scale with the addition of six new items related to fear of the disease or negative consequences of being diagnosed as HIV-infected. Items were phrased as statements about potential barriers with a three-point response format representing not important, somewhat important, and very important. The scale was evaluated regarding missing values, floor and ceiling effects, exploratory factor analysis, and internal consistencies. The questionnaire was completed by 292 adults recently diagnosed with HIV infection, of whom 7 were excluded (≥9 items missing) and 285 were included (≥12 items completed) in the analyses. The participants were 18-70 years old (mean 40.5, SD 11.5), 39 % were females and 77 % born outside Sweden. Routes of transmission were heterosexual transmission 63 %, male to male sex 20 %, intravenous drug use 5 %, blood product/transfusion 2 %, and unknown 9 %. All scale items had <3 % missing values. The data were suitable for factor analysis (KMO = 0.92) and a four-factor solution was chosen, based on level of explained common variance (58.64 %) and interpretability of factor structure. The factors were interpreted as personal consequences, structural barriers, social and economic security, and confidentiality. Ratings on the minimum level (suggested barrier not important) were common, resulting in substantial floor effects on the scales. The scales were internally consistent (Cronbach's α 0.78-0.91). This study gives preliminary evidence of the scale being feasible, reliable and valid to identify different types of barriers to HIV testing.
Missing Data and Institutional Research
ERIC Educational Resources Information Center
Croninger, Robert G.; Douglas, Karen M.
2005-01-01
Many do not consider the effect that missing data have on their survey results nor do they know how to handle missing data. This chapter offers strategies for handling item-missing data and provides a practical example of how these strategies may affect results. The chapter concludes with recommendations for preventing and dealing with missing…
Parameter Estimation in Rasch Models for Examinee-Selected Items
ERIC Educational Resources Information Center
Liu, Chen-Wei; Wang, Wen-Chung
2017-01-01
The examinee-selected-item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set of items (e.g., choose one item to respond from a pair of items), always yields incomplete data (i.e., only the selected items are answered and the others have missing data) that are likely nonignorable. Therefore, using…
More to it than meets the eye: how eye movements can elucidate the development of episodic memory.
Pathman, Thanujeni; Ghetti, Simona
2016-07-01
The ability to recognise past events along with the contexts in which they occurred is a hallmark of episodic memory, a critical capacity. Eye movements have been shown to track veridical memory for the associations between events and their contexts (relational binding). Such eye-movement effects emerge several seconds before, or in the absence of, explicit response, and are linked to the integrity and function of the hippocampus. Drawing from research from infancy through late childhood, and by comparing to investigations from typical adults, patient populations, and animal models, it seems increasingly clear that eye movements reflect item-item, item-temporal, and item-spatial associations in developmental populations. We analyse this line of work, identify missing pieces in the literature and outline future avenues of research, in order to help elucidate the development of episodic memory.
Modeling individualized coefficient alpha to measure quality of test score data.
Liu, Molei; Hu, Ming; Zhou, Xiao-Hua
2018-05-23
Individualized coefficient alpha is defined. It is item and subject specific and is used to measure the quality of test score data with heterogeneity among the subjects and items. A regression model is developed based on 3 sets of generalized estimating equations. The first set of generalized estimating equations models the expectation of the responses, the second set models the response's variance, and the third set is proposed to estimate the individualized coefficient alpha, defined and used to measure individualized internal consistency of the responses. We also use different techniques to extend our method to handle missing data. Asymptotic property of the estimators is discussed, based on which inference on the coefficient alpha is derived. Performance of our method is evaluated through simulation study and real data analysis. The real data application is from a health literacy study in Hunan province of China. Copyright © 2018 John Wiley & Sons, Ltd.
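For orientation, the conventional (non-individualized) coefficient alpha that this work generalises can be computed directly from complete item scores. The sketch below is the standard Cronbach's alpha, not the authors' estimating-equation method, and it assumes no missing responses; the function name and score matrix are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Standard Cronbach's alpha.

    scores: list of respondents, each a list of k item scores,
    with no missing values (complete cases only).
    """
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: five respondents, three items.
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [3, 3, 4],
    [5, 4, 5],
]

print(round(cronbach_alpha(scores), 3))
```

This single value describes the whole sample, which is the limitation the item- and subject-specific version in the abstract is meant to address.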
Kilanowski, Jill F; Trapl, Erika S
2010-04-01
We describe the feasibility of audio-enhanced personal digital assistants (APDAs) for data collection with 60 Latino migrant farmworkers. All participants chose to complete APDA surveys rather than paper-and-pencil surveys. No one left the study prematurely; two (3%) data cases were lost due to technical difficulties. Across all data, 0.27% missing data were observed: nine missing responses on eight items. Participants took 19 minutes on average to complete the 58-question survey. The factor most influential for completion was education level. APDA methodology enabled both English- and Spanish-speaking Latino migrant farmworkers to become active research participants with minimal loss of data. (c) 2010 Wiley Periodicals, Inc. PMID:20135629
Support for an auto-associative model of spoken cued recall: evidence from fMRI.
de Zubicaray, Greig; McMahon, Katie; Eastburn, Mathew; Pringle, Alan J; Lorenz, Lina; Humphreys, Michael S
2007-03-02
Cued recall and item recognition are considered the standard episodic memory retrieval tasks. However, only the neural correlates of the latter have been studied in detail with fMRI. Using an event-related fMRI experimental design that permits spoken responses, we tested hypotheses from an auto-associative model of cued recall and item recognition [Chappell, M., & Humphreys, M. S. (1994). An auto-associative neural network for sparse representations: Analysis and application to models of recognition and cued recall. Psychological Review, 101, 103-128]. In brief, the model assumes that cues elicit a network of phonological short term memory (STM) and semantic long term memory (LTM) representations distributed throughout the neocortex as patterns of sparse activations. This information is transferred to the hippocampus which converges upon the item closest to a stored pattern and outputs a response. Word pairs were learned from a study list, with one member of the pair serving as the cue at test. Unstudied words were also intermingled at test in order to provide an analogue of yes/no recognition tasks. Compared to incorrectly rejected studied items (misses) and correctly rejected (CR) unstudied items, correctly recalled items (hits) elicited increased responses in the left hippocampus and neocortical regions including the left inferior prefrontal cortex (LIPC), left mid lateral temporal cortex and inferior parietal cortex, consistent with predictions from the model. This network was very similar to that observed in yes/no recognition studies, supporting proposals that cued recall and item recognition involve common rather than separate mechanisms.
Overstreet, Michael F; Healy, Alice F; Neath, Ian
2017-01-01
University of Colorado (CU) students were tested for both order and item information in their semantic memory for the "CU Fight Song". Following an earlier study by Overstreet and Healy [(2011). Item and order information in semantic memory: Students' retention of the "CU fight song" lyrics. Memory & Cognition, 39, 251-259. doi: 10.3758/s13421-010-0018-3 ], a symmetrical bow-shaped serial position function (with both primacy and recency advantages) was found for reconstructing the order of the nine lines in the song, whereas a function with no primacy advantage was found for recalling a missing word from each line. This difference between order and item information was found even though students filled in missing words without any alternatives provided and missing words came from the beginning, middle, or end of each line. Similar results were found for CU students' recall of the sequence of Harry Potter book titles and the lyrics of the Scooby Doo theme song. These findings strengthen the claim that the pronounced serial position function in semantic memory occurs largely because of the retention of order, rather than item, information.
Simons, Claire L; Rivero-Arias, Oliver; Yu, Ly-Mee; Simon, Judit
2015-04-01
Missing data are a well-known and widely documented problem in cost-effectiveness analyses alongside clinical trials using individual patient-level data. Current methodological research recommends multiple imputation (MI) to deal with missing health outcome data, but there is little guidance on whether MI for multi-attribute questionnaires, such as the EQ-5D-3L, should be carried out at domain or at summary score level. In this paper, we evaluated the impact of imputing individual domains versus imputing index values to deal with missing EQ-5D-3L data using a simulation study and developed recommendations for future practice. We simulated missing data in a patient-level dataset with complete EQ-5D-3L data at one point in time from a large multinational clinical trial (n = 1,814). Different proportions of missing data were generated using a missing at random (MAR) mechanism and three different scenarios were studied. The performance of each method was evaluated using root mean squared error and mean absolute error of the actual versus predicted EQ-5D-3L indices. With large sample sizes (n > 500) and a missing data pattern that followed mainly unit non-response, imputing domains or the index produced similar results. However, domain imputation became more accurate than index imputation when the pattern of missingness followed item non-response. For smaller sample sizes (n < 100), index imputation was more accurate. When MI models were misspecified, both domain and index imputations were inaccurate for any proportion of missing data. The decision between imputing the domains or the EQ-5D-3L index scores depends on the observed missing data pattern and the sample size available for analysis. Analysts conducting this type of exercise should also evaluate the sensitivity of the analysis to the MAR assumption and whether the imputation model is correctly specified.
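The two accuracy measures named in this abstract, root mean squared error and mean absolute error of actual versus predicted index values, can be sketched as follows. This is only an illustration under stated assumptions: the function names and the example index values are hypothetical and are not taken from the study.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical index values: the known value before deletion and the
# value produced by some imputation method (made-up numbers).
true_index = [0.80, 0.62, 1.00, 0.52]
imputed_index = [0.76, 0.70, 0.90, 0.52]

print("RMSE:", rmse(true_index, imputed_index))
print("MAE:", mae(true_index, imputed_index))
```

RMSE penalises large individual errors more heavily than MAE, which is one reason a comparison of imputation strategies might report both.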
The Effects of Methods of Imputation for Missing Values on the Validity and Reliability of Scales
ERIC Educational Resources Information Center
Cokluk, Omay; Kayri, Murat
2011-01-01
The main aim of this study is the comparative examination of the factor structures, corrected item-total correlations, and Cronbach-alpha internal consistency coefficients obtained by different methods used in imputation for missing values in conditions of not having missing values, and having missing values of different rates in terms of testing…
Audio-Enhanced Tablet Computers to Assess Children's Food Frequency From Migrant Farmworker Mothers.
Kilanowski, Jill F; Trapl, Erika S; Kofron, Ryan M
2013-06-01
This study sought to improve data collection in children's food frequency surveys for non-English speaking immigrant/migrant farmworker mothers using audio-enhanced tablet computers (ATCs). We hypothesized that by using technological adaptations, we would be able to improve data capture and therefore reduce lost surveys. This Food Frequency Questionnaire (FFQ), a paper-based dietary assessment tool, was adapted for ATCs and assessed consumption of 66 food items asking 3 questions for each food item: frequency, quantity of consumption, and serving size. The tablet-based survey was audio enhanced with each question "read" to participants, accompanied by food item images, together with an embedded short instructional video. Results indicated that respondents were able to complete the 198 questions from the 66 food item FFQ on ATCs in approximately 23 minutes. Compared with paper-based FFQs, ATC-based FFQs had less missing data. Despite overall reductions in missing data by use of ATCs, respondents still appeared to have difficulty with question 2 of the FFQ. The ability to score the FFQ depended on which sections the missing data were located in. Unlike the paper-based FFQs, no ATC-based FFQs were unscored due to amount or location of missing data. An ATC-based FFQ was feasible and increased ability to score this survey on children's food patterns from migrant farmworker mothers. This adapted technology may serve as an exemplar for other non-English speaking immigrant populations.
Rabideau, Dustin J; Nierenberg, Andrew A; Sylvia, Louisa G; Friedman, Edward S.; Bowden, Charles L.; Thase, Michael E.; Ketter, Terence; Ostacher, Michael J.; Reilly-Harrington, Noreen; Iosifescu, Dan V.; Calabrese, Joseph R.; Leon, Andrew C.; Schoenfeld, David A
2014-01-01
Background Missing data are unavoidable in most randomized controlled clinical trials, especially when measurements are taken repeatedly. If strong assumptions about the missing data are not accurate, crude statistical analyses are biased and can lead to false inferences. Furthermore, if we fail to measure all predictors of missing data, we may not be able to model the missing data process sufficiently. In longitudinal randomized trials, measuring a patient's intent to attend future study visits may help to address both of these problems. Leon et al. developed and included the Intent to Attend assessment in the Lithium Treatment—Moderate dose Use Study (LiTMUS), aiming to remove bias due to missing data from the primary study hypothesis [1]. Purpose The purpose of this study is to assess the performance of the Intent to Attend assessment with regard to its use in a sensitivity analysis of missing data. Methods We fit marginal models to assess whether a patient's self-rated intent predicted actual study adherence. We applied inverse probability of attrition weighting (IPAW) coupled with patient intent to assess whether there existed treatment group differences in response over time. We compared the IPAW results to those obtained using other methods. Results Patient-rated intent predicted missed study visits, even when adjusting for other predictors of missing data. On average, the hazard of retention increased by 19% for every one-point increase in intent. We also found that more severe mania, male gender, and a previously missed visit predicted subsequent absence. Although we found no difference in response between the randomized treatment groups, IPAW increased the estimated group difference over time. Limitations LiTMUS was designed to limit missed study visits, which may have attenuated the effects of adjusting for missing data. 
Additionally, IPAW can be less efficient and less powerful than maximum likelihood or Bayesian estimators, given that the parametric model is well-specified. Conclusions In LiTMUS, the Intent to Attend assessment predicted missed study visits. This item was incorporated into our IPAW models and helped reduce bias due to informative missing data. This analysis should both encourage and facilitate future use of the Intent to Attend assessment along with IPAW to address missing data in a randomized trial. PMID:24872362
Safety climate in Swiss hospital units: Swiss version of the Safety Climate Survey
Gehring, Katrin; Mascherek, Anna C.; Bezzola, Paula
2015-01-01
Abstract Rationale, aims and objectives Safety climate measurements are a broadly used element of improvement initiatives. In order to provide a sound and easy‐to‐administer instrument for the use in Swiss hospitals, we translated the Safety Climate Survey into German and French. Methods After translating the Safety Climate Survey into French and German, a cross‐sectional survey study was conducted with health care professionals (HCPs) in operating room (OR) teams and on OR‐related wards in 10 Swiss hospitals. Validity of the instrument was examined by means of Cronbach's alpha and missing rates of the single items. Item‐descriptive statistics, group differences and percentage of ‘problematic responses’ (PPR) were calculated. Results 3153 HCPs completed the survey (response rate: 63.4%). 1308 individuals were excluded from the analyses because of a profession other than doctor or nurse or invalid answers (n = 1845; nurses = 1321, doctors = 523). Internal consistency of the translated Safety Climate Survey was good (Cronbach's alpha, German = 0.86; Cronbach's alpha, French = 0.84). Missing rates at item level were rather low (0.23–4.3%). We found significant group differences in safety climate values regarding profession, managerial function, work area and time spent in direct patient care. At item level, 14 out of 21 items showed a PPR higher than 10%. Conclusions Results indicate that the French and German translations of the Safety Climate Survey might be a useful measurement instrument for safety climate in Swiss hospital units. Analyses at item level allow for differentiating facets of safety climate into more positive and critical safety climate aspects. PMID:25656302
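The two checks reported above, item-level missing rates and Cronbach's alpha, can be sketched in a few lines. This is an illustration only: the abstract does not say how missing responses were handled when alpha was computed, so the listwise deletion used here is an assumption, and the function names and toy responses are hypothetical.

```python
from statistics import variance

def item_missing_rates(rows):
    """Share of respondents with a missing (None) answer, per item."""
    n_items = len(rows[0])
    return [sum(r[j] is None for r in rows) / len(rows) for j in range(n_items)]

def cronbach_alpha(rows):
    """Cronbach's alpha on complete cases only (assumed listwise deletion).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    complete = [r for r in rows if None not in r]
    k = len(complete[0])
    item_vars = [variance([r[j] for r in complete]) for j in range(k)]
    total_var = variance([sum(r) for r in complete])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: one row per respondent, one column per item.
responses = [
    [4, 5, 4],
    [2, 2, None],
    [3, 3, 4],
    [5, 4, 5],
    [1, 2, 1],
]
print(item_missing_rates(responses))
print(round(cronbach_alpha(responses), 3))
```

Listwise deletion is the simplest choice for a sketch; with item-level missing rates as low as those reported, it would discard few respondents, but other handling methods are possible.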
Use of Item Parceling in Structural Equation Modeling with Missing Data
ERIC Educational Resources Information Center
Orcan, Fatih
2013-01-01
Parceling is referred to as a procedure for computing sums or average scores across multiple items. Parcels instead of individual items are then used as indicators of latent factors in the structural equation modeling analysis (Bandalos 2002, 2008; Little et al., 2002; Yang, Nay, & Hoyle, 2010). Item parceling may be applied to alleviate some…
Sajobi, Tolulope T; Lix, Lisa M; Singh, Gurbakhshash; Lowerison, Mark; Engbers, Jordan; Mayo, Nancy E
2015-03-01
Response shift (RS) is an important phenomenon that influences the assessment of longitudinal changes in health-related quality of life (HRQOL) studies. Given that RS effects are often small, missing data due to attrition or item non-response can contribute to failure to detect RS effects. Since missing data are often encountered in longitudinal HRQOL data, effective strategies to deal with missing data are important to consider. This study aims to compare different imputation methods on the detection of reprioritization RS in the HRQOL of caregivers of stroke survivors. Data were from a Canadian multi-center longitudinal study of caregivers of stroke survivors over a one-year period. The Stroke Impact Scale physical function score at baseline, with a cutoff of 75, was used to measure patient stroke severity for the reprioritization RS analysis. Mean imputation, likelihood-based expectation-maximization imputation, and multiple imputation methods were compared in test procedures based on changes in relative importance weights to detect RS in SF-36 domains over a 6-month period. Monte Carlo simulation methods were used to compare the statistical powers of relative importance test procedures for detecting RS in incomplete longitudinal data under different missing data mechanisms and imputation methods. Of the 409 caregivers, 15.9% and 31.3% had missing data at baseline and 6 months, respectively. There were no statistically significant changes in relative importance weights on any of the domains when complete-case analysis was adopted. However, statistically significant changes were detected on physical functioning and/or vitality domains when mean imputation or EM imputation was adopted. There were also statistically significant changes in relative importance weights for physical functioning, mental health, and vitality domains when the multiple imputation method was adopted. 
Our simulations revealed that relative importance test procedures were least powerful under complete-case analysis method and most powerful when a mean imputation or multiple imputation method was adopted for missing data, regardless of the missing data mechanism and proportion of missing data. Test procedures based on relative importance measures are sensitive to the type and amount of missing data and imputation method. Relative importance test procedures based on mean imputation and multiple imputation are recommended for detecting RS in incomplete data.
The Impact of Different Missing Data Handling Methods on DINA Model
ERIC Educational Resources Information Center
Sünbül, Seçil Ömür
2018-01-01
This study aimed to investigate the impact of different missing data handling methods on DINA model parameter estimation and classification accuracy. Simulated data were used, generated by manipulating the number of items and sample size. In the generated data, two different missing data mechanisms…
SPSS Syntax for Missing Value Imputation in Test and Questionnaire Data
ERIC Educational Resources Information Center
van Ginkel, Joost R.; van der Ark, L. Andries
2005-01-01
A well-known problem in the analysis of test and questionnaire data is that some item scores may be missing. Advanced methods for the imputation of missing data are available, such as multiple imputation under the multivariate normal model and imputation under the saturated logistic model (Schafer, 1997). Accompanying software was made available…
Feveile, Helene; Olsen, Ole; Hogh, Annie
2007-01-01
Background Data for health surveys are often collected using either mailed questionnaires, telephone interviews or a combination. Mode of data collection can affect the propensity to refuse to respond and result in different patterns of responses. The objective of this paper is to examine and quantify effects of mode of data collection in health surveys. Methods A stratified sample of 4,000 adults residing in Denmark was randomised to mailed questionnaires or computer-assisted telephone interviews. 45 health-related items were analyzed; four concerning behaviour and 41 concerning self assessment. Odds ratios for more positive answers and more frequent use of extreme response categories (both positive and negative) among telephone respondents compared to questionnaire respondents were estimated. Tests were Bonferroni corrected. Results For the four health behaviour items there were no significant differences in the response patterns. For 32 of the 41 health self assessment items the response pattern was statistically significantly different and extreme response categories were used more frequently among telephone respondents (Median estimated odds ratio: 1.67). For a majority of these mode sensitive items (26/32), a more positive reporting was observed among telephone respondents (Median estimated odds ratio: 1.73). The overall response rate was similar among persons randomly assigned to questionnaires (58.1%) and to telephone interviews (56.2%). A differential nonresponse bias for age and gender was observed. The rate of missing responses was higher for questionnaires (0.73 – 6.00%) than for telephone interviews (0 – 0.51%). The "don't know" option was used more often by mail respondents (10 – 24%) than by telephone respondents (2 – 4%). Conclusion The mode of data collection affects the reporting of self assessed health items substantially. In epidemiological studies, the method effect may be as large as the effects under investigation. 
Caution is needed when comparing prevalences across surveys or when studying time trends. PMID:17592653
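For readers unfamiliar with the two quantities used in this abstract, the sketch below shows how an odds ratio is obtained from a 2x2 table and how a Bonferroni-corrected significance threshold is formed. The cell counts are hypothetical, and the nominal alpha of 0.05 is an assumption; the abstract reports only that 45 items were analysed and that tests were Bonferroni corrected.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b: counts with / without the outcome in group 1 (e.g. telephone)
    c, d: counts with / without the outcome in group 2 (e.g. mail)
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined when b or c is zero")
    return (a * d) / (b * c)

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

# Hypothetical counts: 30 of 100 telephone respondents and 20 of 100 mail
# respondents chose an extreme response category.
print(odds_ratio(30, 70, 20, 80))
# With 45 items tested at an assumed nominal alpha of 0.05:
print(bonferroni_alpha(0.05, 45))
```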
Zúñiga, Franziska; Schubert, Maria; Hamers, Jan P H; Simon, Michael; Schwendimann, René; Engberg, Sandra; Ausserhofer, Dietmar
2016-08-01
To develop and test psychometrically the Basel Extent of Rationing of Nursing Care for Nursing Homes instrument, providing initial evidence on the validity and reliability of the German, French and Italian-language versions. In the hospital setting, implicit rationing of nursing care is defined as the withholding of nursing activities due to lack of resources, such as staffing or time. No instrument existed to measure this concept in nursing homes. Cross-sectional study. We developed the instrument in three phases: (1) adaption and translation; (2) content validity testing; and (3) initial validity and reliability testing. For phase 3, we analysed survey data from 4748 care workers collected between May 2012-April 2013 from a randomly selected sample of 162 nursing homes in the German-, French- and Italian-speaking regions of Switzerland to provide evidence from response processes (e.g. missing), internal structure (exploratory factor analysis), inter-item inconsistencies (e.g. Cronbach's alpha) and interscorer differences (e.g. within-group agreement). Exploratory factor analysis revealed a four-factor structure with good fit statistics. Rationing of nursing care was structured in four domains: (1) activities of daily living; (2) caring, rehabilitation and monitoring; (3) documentation; and (4) social care. Items of the social care subscale showed lower content validity and more missing values than items of other subscales. First evidence indicates that the new instrument can be recommended for research and practice to measure implicit rationing of nursing care in nursing homes. Further refinements of single items are needed. © 2016 John Wiley & Sons Ltd.
Audio-Enhanced Tablet Computers to Assess Children’s Food Frequency From Migrant Farmworker Mothers
Kilanowski, Jill F.; Trapl, Erika S.; Kofron, Ryan M.
2014-01-01
This study sought to improve data collection in children’s food frequency surveys for non-English speaking immigrant/migrant farmworker mothers using audio-enhanced tablet computers (ATCs). We hypothesized that by using technological adaptations, we would be able to improve data capture and therefore reduce lost surveys. This Food Frequency Questionnaire (FFQ), a paper-based dietary assessment tool, was adapted for ATCs and assessed consumption of 66 food items asking 3 questions for each food item: frequency, quantity of consumption, and serving size. The tablet-based survey was audio enhanced with each question “read” to participants, accompanied by food item images, together with an embedded short instructional video. Results indicated that respondents were able to complete the 198 questions from the 66 food item FFQ on ATCs in approximately 23 minutes. Compared with paper-based FFQs, ATC-based FFQs had less missing data. Despite overall reductions in missing data by use of ATCs, respondents still appeared to have difficulty with question 2 of the FFQ. The ability to score the FFQ depended on which sections the missing data were located in. Unlike the paper-based FFQs, no ATC-based FFQs were unscored due to amount or location of missing data. An ATC-based FFQ was feasible and increased ability to score this survey on children’s food patterns from migrant farmworker mothers. This adapted technology may serve as an exemplar for other non-English speaking immigrant populations. PMID:25343004
Planning a Study for Testing the Rasch Model given Missing Values due to the use of Test-booklets.
Yanagida, Takuya; Kubinger, Klaus D; Rasch, Dieter
2015-01-01
Though calibration of an achievement test within a psychological and educational context is very often carried out with the Rasch model, data sampling is rarely designed on statistical foundations. However, Kubinger, Rasch, and Yanagida (2009, 2011) suggested an approach for the determination of sample size according to a given Type-I and Type-II risk and a certain effect of model contradiction when testing the Rasch model. The approach uses a three-way analysis of variance design with mixed classification. So far, their simulation studies have dealt with complete data, meaning every examinee is administered all of the items of an item pool. The simulation study presented in this paper deals with the practically relevant case, in particular for large-scale assessments, in which items are presented in several test-booklets. As a consequence, there are missing values by design. The question to be considered is therefore whether this approach works in this case as well. Besides the fact that the data are not normally distributed but dichotomous (an examinee either solves an item or fails to solve it), only a single entry for each cell exists in the given three-way analysis of variance design, if at all, due to missing values. Hence, the distribution of the required test statistic may not be retained, in contrast to the case of having no missing values. The result of our simulation study, despite applying only to a very special scenario, is that this approach does work: whether test-booklets are used or every examinee is administered all of the items changes nothing with respect to the actual Type-I risk or the power of the test, given almost the same amount of information from examinees per item. However, as the results are limited to a special scenario, we currently recommend that interested researchers simulate the appropriate scenario themselves in advance.
Suzukamo, Yoshimi; Oshika, Tetsuro; Yuzawa, Mitsuko; Tokuda, Yoshihiro; Tomidokoro, Atsuo; Oki, Kotaro; Mangione, Carol M; Green, Joseph; Fukuhara, Shunichi
2005-10-26
The importance of evaluating the outcomes of health care from the standpoint of the patient is now widely recognized. The purpose of this study is to develop and test a Japanese version of the National Eye Institute Visual Function Questionnaire (NEI VFQ-25). A Japanese version was developed with a previously standardized method. The questionnaire and optional items were completed by 245 patients with cataracts, glaucoma, or age-related macular degeneration, by 110 others before and after cataract surgery, and by a reference group (n = 31). We computed rates of missing data, measured reproducibility and internal consistency reliability, and tested for convergent and discriminant validity, concurrent validity, known-groups validity, factor structure, and responsiveness to change. Based on information from the participants, some items were changed to 2-step items (asking if an activity was done, and if it was done, then asking how difficult it was). The near-vision and distance-vision subscales each had 1 item that was endorsed by very few participants, so these items were replaced with items that were optional in the English version. For example, more than 60% of participants did not drive, so the driving question was excluded. Reliability and validity were adequate for all subscales except driving, ocular pain, color vision, and peripheral vision. With cataract surgery, most scores improved by at least 20 points. With minor modifications from the English version, the Japanese NEI VFQ-25 can give reliable, valid, responsive data on vision-related quality of life, for group-level comparisons or for tracking therapeutic outcomes.
Anota, Amélie; Barbieri, Antoine; Savina, Marion; Pam, Alhousseiny; Gourgou-Bourgade, Sophie; Bonnetain, Franck; Bascoul-Mollevi, Caroline
2014-12-31
Health-Related Quality of Life (HRQoL) is an important endpoint in oncology clinical trials aiming to investigate the clinical benefit of new therapeutic strategies for the patient. However, the longitudinal analysis of HRQoL remains complex and unstandardized. There is clearly a need to propose accessible statistical methods and meaningful results for clinicians. The objective of this study was to compare three strategies for longitudinal analyses of HRQoL data in oncology clinical trials through a simulation study. The methods proposed were: the score and mixed model (SM); a survival analysis approach based on the time to HRQoL score deterioration (TTD); and the longitudinal partial credit model (LPCM). Simulations compared the methods in terms of type I error and statistical power of the test of an interaction effect between treatment arm and time. Several simulation scenarios were explored based on the EORTC HRQoL questionnaires and varying the number of patients (100, 200 or 300), items (1, 2 or 4) and response categories per item (4 or 7). Five or 10 measurement times were considered, with correlations ranging from low to high between each measure. The impact of informative missing data on these methods was also studied to reflect the reality of most clinical trials. With complete data, the type I error rate was close to the expected value (5%) for all methods, while the SM method was the most powerful method, followed by LPCM. The power of TTD was low for single-item dimensions, because only four possible values exist for the score. When the number of items increased, the power of the SM approach remained stable, that of the TTD method increased, and the power of LPCM remained stable. With 10 measurement times, the LPCM was less efficient. With informative missing data, the statistical power of SM and TTD tended to decrease, while that of LPCM tended to increase. 
To conclude, the SM model was the most powerful model, irrespective of the scenario considered, and the presence or not of missing data. The TTD method should be avoided for single-item dimensions of the EORTC questionnaire. While the LPCM model was more adapted to this kind of data, it was less efficient than the SM model. These results warrant validation through comparisons on real data.
ERIC Educational Resources Information Center
Daniels, Vijay J.; Bordage, Georges; Gierl, Mark J.; Yudkowsky, Rachel
2014-01-01
Objective structured clinical examinations (OSCEs) are used worldwide for summative examinations but often lack acceptable reliability. Research has shown that reliability of scores increases if OSCE checklists for medical students include only clinically relevant items. Also, checklists are often missing evidence-based items that high-achieving…
Nimmo, J.R.
2010-01-01
Germann's (2010) comment helpfully presents supporting evidence that I have missed, notes items that need clarification or correction, and stimulates discussion of what is needed for improved theory of unsaturated flow. Several points from this comment relate not only to specific features of the content of my paper (Nimmo, 2010), but also to the broader question of what methodology is appropriate for developing an applied earth science. Accordingly, before addressing specific points that Germann identified, I present here some considerations of purpose and background relevant to evaluation of the unsaturated flow model of Nimmo (2010).
Naal, Florian D; Impellizzeri, Franco M; von Eisenhart-Rothe, Rüdiger; Mannion, Anne F; Leunig, Michael
2012-11-01
To evaluate reproducibility, validity, and responsiveness of the Hip Outcome Score (HOS) in patients with end-stage hip osteoarthritis. In a cohort of 157 consecutive patients (mean age 66 years; 79 women) undergoing total hip replacement, the HOS was tested for the following measurement properties: feasibility (percentage of evaluable questionnaires), reproducibility (intraclass correlation coefficient [ICC] and standard error of measurement [SEM]), construct validity (correlation with the Western Ontario and McMaster Universities Osteoarthritis Index [WOMAC], Oxford Hip Score [OHS], Short Form 12 health survey, and University of California, Los Angeles activity scale), internal consistency (Cronbach's alpha), factorial validity (factor analysis), floor and ceiling effects, and internal and external responsiveness at 6 months after surgery (standardized response mean and change score correlations). Missing items occurred frequently. Five percent to 6% of the HOS activities of daily living (ADL) subscales and 20-32% of the sport subscales could not be scored. ICCs were 0.92 for both subscales. SEMs were 1.8 points (ADL subscale) and 2.3 points (sport subscale). Highest correlations were found with the OHS (r = 0.81 for ADL subscale and r = 0.58 for sport subscale) and the WOMAC physical function subscale (r = 0.83 for ADL subscale and r = 0.56 for sport subscale). Cronbach's alpha was 0.93 and 0.88 for the ADL and sport subscales, respectively. Neither unidimensionality of the subscales nor the 2-factor structure was supported by factor analysis. Both subscales showed good internal and external responsiveness. The HOS is reproducible and responsive when assessing patients with end-stage hip osteoarthritis in whom the items are relevant. However, based on the large proportion of missing data and the findings of the factor analysis, we cannot recommend this questionnaire for routine use in this target group. Copyright © 2012 by the American College of Rheumatology.
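The abstract reports an intraclass correlation coefficient together with a standard error of measurement (SEM). One common way to relate the two is SEM = SD × sqrt(1 − ICC), where SD is the standard deviation of the observed scores. Whether the authors used this formula or a variance-components estimate is not stated, so the sketch below is an assumption, and the input values are hypothetical.

```python
import math

def standard_error_of_measurement(sd, icc):
    """SEM from score standard deviation and reliability (ICC).

    Uses the common formula SEM = SD * sqrt(1 - ICC); other estimators
    (e.g. from ANOVA error variance) exist and may give different values.
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    if sd < 0:
        raise ValueError("standard deviation must be non-negative")
    return sd * math.sqrt(1.0 - icc)

# Hypothetical example: score SD of 10 points and a reliability of 0.75.
print(standard_error_of_measurement(10.0, 0.75))
```

Under this formula the SEM shrinks as reliability rises, which is why a high ICC and a small SEM tend to be reported together.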
US Media Coverage of Tobacco Industry Corporate Social Responsibility Initiatives.
McDaniel, Patricia A; Lown, E Anne; Malone, Ruth E
2018-02-01
Media coverage of tobacco industry corporate social responsibility (CSR) initiatives represents a competitive field where tobacco control advocates and the tobacco industry vie to shape public and policymaker understandings about tobacco control and the industry. Through a content analysis of 649 US news items, we examined US media coverage of tobacco industry CSR and identified characteristics of media items associated with positive coverage. Most coverage appeared in local newspapers, and CSR initiatives unrelated to tobacco, with non-controversial beneficiaries, were most commonly mentioned. Coverage was largely positive. Tobacco control advocates were infrequently cited as sources and rarely authored opinion pieces; however, when their voices were included, coverage was less likely to have a positive slant. Media items published in the South, home to several tobacco company headquarters, were more likely than those published in the West to have a positive slant. The absence of tobacco control advocates from media coverage represents a missed opportunity to influence opinion regarding the negative public health implications of tobacco industry CSR. Countering the media narrative of virtuous companies doing good deeds could be particularly beneficial in the South, where the burdens of tobacco-caused disease are greatest, and coverage of tobacco companies more positive.
US Media Coverage of Tobacco Industry Corporate Social Responsibility Initiatives
McDaniel, Patricia A.; Lown, E. Anne; Malone, Ruth E.
2017-01-01
Media coverage of tobacco industry corporate social responsibility (CSR) initiatives represents a competitive field where tobacco control advocates and the tobacco industry vie to shape public and policymaker understandings about tobacco control and the industry. Through a content analysis of 649 US news items, we examined US media coverage of tobacco industry CSR and identified characteristics of media items associated with positive coverage. Most coverage appeared in local newspapers, and CSR initiatives unrelated to tobacco, with non-controversial beneficiaries, were most commonly mentioned. Coverage was largely positive. Tobacco control advocates were infrequently cited as sources and rarely authored opinion pieces; however, when their voices were included, coverage was less likely to have a positive slant. Media items published in the South, home to several tobacco company headquarters, were more likely than those published in the West to have a positive slant. The absence of tobacco control advocates from media coverage represents a missed opportunity to influence opinion regarding the negative public health implications of tobacco industry CSR. Countering the media narrative of virtuous companies doing good deeds could be particularly beneficial in the South, where the burdens of tobacco-caused disease are greatest, and coverage of tobacco companies more positive. PMID:28685318
2010-01-01
Background Patient-Reported Outcomes (PRO) are increasingly used in clinical and epidemiological research. Two main types of analytical strategies can be found for these data: classical test theory (CTT) based on the observed scores and models coming from Item Response Theory (IRT). However, whether IRT or CTT would be the most appropriate method to analyse PRO data remains unknown. The statistical properties of CTT and IRT, regarding power and corresponding effect sizes, were compared. Methods Two-group cross-sectional studies were simulated for the comparison of PRO data using IRT or CTT-based analysis. For IRT, different scenarios were investigated according to whether item or person parameters were assumed to be known, to a certain extent for item parameters, from good to poor precision, or unknown and therefore had to be estimated. The powers obtained with IRT or CTT were compared and parameters having the strongest impact on them were identified. Results When person parameters were assumed to be unknown and item parameters to be either known or not, the powers achieved using IRT or CTT were similar and always lower than the expected power using the well-known sample size formula for normally distributed endpoints. The number of items had a substantial impact on power for both methods. Conclusion Without any missing data, IRT and CTT seem to provide comparable power. The classical sample size formula for CTT seems to be adequate under some conditions but is not appropriate for IRT. In IRT, it seems important to take account of the number of items to obtain an accurate formula. PMID:20338031
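The "well-known sample size formula for normally distributed endpoints" is not written out in the abstract. A standard candidate for a two-sided, two-group comparison of means is n per group = 2(z_{1-α/2} + z_{1-β})² / d², with d the standardized effect size. The sketch below assumes this is the formula meant; the function names and default values are illustrative.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Sample size per group for a two-sided two-sample z-test (rounded up)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

def expected_power(effect_size, n, alpha=0.05):
    """Approximate power of the same test with n subjects per group."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(effect_size * math.sqrt(n / 2) - z_alpha)

# A medium standardized effect (d = 0.5) at 80% power and alpha = 0.05.
n = n_per_group(0.5)
print(n, round(expected_power(0.5, n), 3))
```

The abstract's finding is that simulated power for both IRT and CTT fell below the value such a formula predicts, so a figure like this is better read as an optimistic planning value for questionnaire data.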
[Severe intimate partner violence risk prediction scale-revised].
Echeburúa, Enrique; Amor, Pedro Javier; Loinaz, Ismael; de Corral, Paz
2010-11-01
The aim of this study was to describe the psychometric properties of the Severe Intimate Partner Violence Risk Prediction Scale and to revise it in order to weight the 20 items according to their discriminant capacity and to solve the missing item problem. The sample for this study consisted of 450 male batterers who were reported to the police station. The victims were classified as high-risk (18.2%), moderate-risk (45.8%) and low-risk (36%), depending on the cutoff scores in the original scale. Internal consistency (Cronbach's alpha=.72) and interrater reliability (r=.73) were acceptable. The point biserial correlation coefficient between each item and the corrected total score of the 20-item scale was calculated to determine the most discriminative items, which were associated with the context of intimate partner violence in the last month, with the male batterer's profile and with the victim's vulnerability. A revised scale (EPV-R) with new cutoff scores and guidance on how to deal with missing items was proposed in accordance with these results. This easy-to-use tool appears to be suited to the requirements of criminal justice professionals and is intended for use in safety planning. Implications of these results for further research are discussed.
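The item analysis described above can be sketched as follows. This is an illustration under assumptions, not the authors' code: items are taken to be scored 0/1, and the "corrected total score" is taken to mean the total with the item under study removed. For a binary item, the Pearson correlation computed here equals the point-biserial coefficient.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(responses, item):
    """Correlate one 0/1 item with the total score of all *other* items.

    responses: list of rows, one row of item scores per respondent.
    item: column index of the item under study (hypothetical layout).
    """
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)


if __name__ == "__main__":
    toy = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 1, 0]]
    print(round(corrected_item_total(toy, 0), 3))
```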
Lazy collaborative filtering for data sets with missing values.
Ren, Yongli; Li, Gang; Zhang, Jun; Zhou, Wanlei
2013-12-01
As one of the biggest challenges in research on recommender systems, the data sparsity issue is mainly caused by the fact that users tend to rate a small proportion of items from the huge number of available items. This issue becomes even more problematic for the neighborhood-based collaborative filtering (CF) methods, as there are even lower numbers of ratings available in the neighborhood of the query item. In this paper, we aim to address the data sparsity issue in the context of neighborhood-based CF. For a given query (user, item), a set of key ratings is first identified by taking the historical information of both the user and the item into account. Then, an auto-adaptive imputation (AutAI) method is proposed to impute the missing values in the set of key ratings. We present a theoretical analysis to show that the proposed imputation method effectively improves the performance of the conventional neighborhood-based CF methods. The experimental results show that our new method of CF with AutAI outperforms six existing recommendation methods in terms of accuracy.
Marsh, Herbert W; Martin, Andrew J; Jackson, Susan
2010-08-01
Based on the Physical Self Description Questionnaire (PSDQ) normative archive (n = 1,607 Australian adolescents), 40 of 70 items were selected to construct a new short form (PSDQ-S). The PSDQ-S was evaluated in a new cross-validation sample of 708 Australian adolescents and four additional samples: 349 Australian elite-athlete adolescents, 986 Spanish adolescents, 395 Israeli university students, 760 Australian older adults. Across these six groups, the 11 PSDQ-S factors had consistently high reliabilities and invariant factor structures. Study 1, using a missing-by-design variation of multigroup invariance tests, showed invariance across 40 PSDQ-S items and 70 PSDQ items. Study 2 demonstrated factorial invariance over a 1-year interval (test-retest correlations .57-.90; Mdn = .77), and good convergent and discriminant validity in relation to time. Study 3 showed good and nearly identical support for convergent and discriminant validity of PSDQ and PSDQ-S responses in relation to two other physical self-concept instruments.
Janssen, Eva; Verduyn, Philippe; Waters, Erika A
2018-05-01
Many people report uncertainty when appraising their risk of cancer and other diseases, but prior research about the topic has focused solely on cognitive risk perceptions. We investigated uncertainty related to cognitive and affective risk questions. We also explored whether any differences in uncertainty between cognitive and affective questions varied in magnitude by item-specific or socio-demographic characteristics. Secondary analysis of data collected for a 2 × 2 × 3 full-factorial risk communication experiment (N = 835) that was embedded within an online survey. We investigated the frequency of 'don't know' responses (DKR) to eight perceived risk items that varied according to whether they assessed (1) cognitive versus affective perceived risk, (2) absolute versus comparative risk, and (3) colon cancer versus 'any exercise-related diseases'. Socio-demographics were as follows: sex, age, education, family history, and numeracy. We analysed the data using multilevel logistic regression. The odds of DKR were lower for affective than cognitive perceived risk (OR = 0.64, p < .001). This difference occurred for absolute but not comparative risk perceptions (interaction effect, p = .004), but no interactions for disease type or demographic characteristics were found (ps > .05). Lower uncertainty for affective (vs. cognitive) absolute perceived risk items is consistent with research stating: (1) Risk perceptions are grounded in people's feelings about a hazard, and (2) feelings are easier for people to access than facts. Including affective perceived risk items in health behaviour surveys may reduce missing data and improve data quality. Statement of contribution. What is already known on this subject? Many people report that they don't know their risk (i.e., risk uncertainty). Evidence is growing for the importance of feelings of risk in explaining health behaviour. Feelings are easier for people to access than facts. What does this study add? 'Don't know' responding is higher for absolute cognitive than absolute affective risk questions. This difference does not vary in magnitude by demographic characteristics. Affective perceived risk questions in surveys may reduce missing data and improve data quality. © 2018 The British Psychological Society.
2012-03-01
both a transmitter and receiver antenna. The lower coil was located 42 cm above the ground surface for optimal data collection using the standard wheel ... eccentricity. Over 54% (26 of the 46) had P0x parameter values below the 4,500 Category 3 threshold in order to reduce the risk of missing TOI smaller ... it is not uncommon to have a large eccentricity for an ordnance item. As previously stated, URS used LM as secondary, in that it served to override
Treatment of Not-Administered Items on Individually Administered Intelligence Tests
ERIC Educational Resources Information Center
He, Wei; Wolfe, Edward W.
2012-01-01
In administration of individually administered intelligence tests, items are commonly presented in a sequence of increasing difficulty, and test administration is terminated after a predetermined number of incorrect answers. This practice produces stochastically censored data, a form of nonignorable missing data. By manipulating four factors…
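To make the censoring mechanism concrete, here is a small sketch of a discontinue rule. It assumes the rule counts consecutive incorrect answers, which the abstract does not specify, and the function and parameter names are hypothetical. Items after the stopping point are recorded as not administered (None), and whether they are missing depends on the examinee's own earlier responses, which is what makes the missingness nonignorable.

```python
def administer(answers, max_consecutive_errors):
    """Apply a discontinue rule to a full vector of potential answers.

    answers: 1 = correct, 0 = incorrect, in order of increasing difficulty.
    Returns the observed record, with None for items never administered.
    Assumption: testing stops after `max_consecutive_errors` wrong in a row.
    """
    observed = []
    streak = 0
    stopped = False
    for correct in answers:
        if stopped:
            observed.append(None)  # not administered, hence missing
            continue
        observed.append(correct)
        streak = 0 if correct else streak + 1
        if streak >= max_consecutive_errors:
            stopped = True
    return observed


if __name__ == "__main__":
    print(administer([1, 0, 1, 0, 0, 1, 1], max_consecutive_errors=2))
```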
Gries, Katharine Suzanne; Esser, Dirk; Wiklund, Ingela
2013-09-16
The study objective was to assess the content validity of the Cough and Sputum Assessment Questionnaire (CASA-Q) cough domains and the UCSD Shortness of Breath Questionnaire (SOBQ) for use in patients with Idiopathic Pulmonary Fibrosis (IPF). Cross-sectional, qualitative study with cognitive interviews in patients with IPF. Study outcomes included relevance, comprehension of item meaning, understanding of the instructions, recall period, response options, and concept saturation. Interviews were conducted with 18 IPF patients. The mean age was 68.9 years (SD 11.9), 77.8% were male, and 88.9% were Caucasian. The intended meaning of the CASA-Q cough domain items was clearly understood by most of the participants (89-100%). All participants understood the CASA-Q instructions; the correct recall period was reported by 89% of the patients, and the response options were understood by 76%. The intended meaning of the UCSD-SOBQ items was relevant and clearly understood by all participants. Participants understood the instructions (83%) and all patients understood the response options (100%). The reported recall period varied based on the type of activity performed. No concepts were missing, suggesting that saturation was demonstrated for both measures. This study provides evidence for content validity for the CASA-Q cough domains and the UCSD-SOBQ for patients with IPF. Items of both questionnaires were understood and perceived as relevant to measure the key symptoms of IPF. The results of this study support the use of these instruments in IPF clinical trials as well as further studies of their psychometric properties.
Hislop, T G; Lamb, C W; Ng, V T
1990-01-01
Cases (n = 263) and controls (n = 200) returned self-administered food frequency questionnaires in 1980-1982 and again in 1986 as part of a case-control study of breast cancer. The questionnaire asked about consumption of specific food items as recalled for four different age periods. Kappa statistics comparing responses in the first and second questionnaires were generally similar for cases and controls and were consistent across the different age periods. The influence of recent dietary change on dietary recall diminished for the more distant past. The food frequency questionnaire was found to be more reliable for specific food items for the distant past than for the more recent past. Differential misclassification bias between cases and controls was less apparent for the more distant past. The frequency and interpretation of missing values are discussed.
The Handling of Missing Binary Data in Language Research
ERIC Educational Resources Information Center
Pichette, François; Béland, Sébastien; Jolani, Shahab; Lesniewska, Justyna
2015-01-01
Researchers are frequently confronted with unanswered questions or items on their questionnaires and tests, due to factors such as item difficulty, lack of testing time, or participant distraction. This paper first presents results from a poll confirming previous claims (Rietveld & van Hout, 2006; Schafer & Graham, 2002) that data…
Haugum, Mona; Iversen, Hilde Hestad; Bjertnaes, Oyvind; Lindahl, Anne Karin
2017-02-20
Patient experiences are an important aspect of health care quality, but there is a lack of validated instruments for their measurement in the substance dependence literature. A new questionnaire to measure inpatients' experiences of interdisciplinary treatment for substance dependence has been developed in Norway. The aim of this study was to psychometrically test the new questionnaire, using data from a national survey in 2013. The questionnaire was developed based on a literature review, qualitative interviews with patients, expert group discussions and pretesting. Data were collected in a national survey covering all residential facilities with inpatients in treatment for substance dependence in 2013. Data quality and psychometric properties were assessed, including ceiling effects, item missing, exploratory factor analysis, and tests of internal consistency reliability, test-retest reliability and construct validity. The sample included 978 inpatients present at 98 residential institutions. After correcting for excluded patients (n = 175), the response rate was 91.4%. 28 out of 33 items had less than 20.5% of missing data or replies in the "not applicable" category. All but one item met the ceiling effect criterion of less than 50.0% of the responses in the most favorable category. Exploratory factor analysis resulted in three scales: "treatment and personnel", "milieu" and "outcome". All scales showed satisfactory internal consistency reliability (Cronbach's alpha ranged from 0.75-0.91) and test-retest reliability (ICC ranged from 0.82-0.85). 17 of 18 significant associations between single variables and the scales supported construct validity of the PEQ-ITSD. The content validity of the PEQ-ITSD was secured by a literature review, consultations with an expert group and qualitative interviews with patients. 
The PEQ-ITSD was used in a national survey in Norway in 2013 and psychometric testing showed that the instrument had satisfactory internal consistency reliability and construct validity.
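The item-level data quality checks mentioned in this abstract (item missingness and ceiling effects) can be sketched as below. This is an illustration, not the survey's actual procedure: it assumes missing and "not applicable" replies are both coded as None, and that the ceiling percentage is computed among those who answered. Only the 50.0% ceiling criterion comes from the abstract.

```python
def item_quality(responses, top_category):
    """Return (percent missing, percent in the most favorable category).

    responses: one item's replies; None marks missing or "not applicable"
    (an assumed coding). The ceiling share is computed among answered replies.
    """
    n = len(responses)
    answered = [r for r in responses if r is not None]
    pct_missing = 100 * (n - len(answered)) / n
    if answered:
        pct_ceiling = 100 * sum(r == top_category for r in answered) / len(answered)
    else:
        pct_ceiling = 0.0
    return pct_missing, pct_ceiling


def meets_ceiling_criterion(pct_ceiling, limit=50.0):
    """Criterion from the abstract: less than 50.0% in the top category."""
    return pct_ceiling < limit


if __name__ == "__main__":
    missing, ceiling = item_quality([5, 5, 4, None, 3], top_category=5)
    print(missing, ceiling, meets_ceiling_criterion(ceiling))
```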
Quinn, Gwendolyn P; Huang, I-Chan; Murphy, Devin; Zidonik-Eddelton, Katie; Krull, Kevin R
2013-02-01
Young adult survivors of childhood cancer (YASCC) are an ever-growing cohort of survivors due to increasing advances in technology. Today, there is a shift of focus to not just ensuring survivorship but also the quality of survivorship, which can be assessed with standardized instruments. The majority of standardized health related quality of life (HRQoL) instruments, however, are non-specific to this age group and the unique late effects within YASCC populations. The purpose of this study was to investigate the relevance and accuracy of standardized HRQoL instruments used with YASCC. In a previous study, HRQoL items from several instruments (SF-36, QLACS, QLS-CS) were examined for relevance with a population of YASCC. Participants (n = 30) from this study were recruited for a follow-up qualitative interview to expand on their perceptions of missing content from existing instruments. Respondents reported missing, relevant content among all three of the HRQoL instruments. Results identified three content areas of missing information: (1) Perceived sense of self, (2) Relationships, and (3) Parenthood. Existing HRQoL instruments do not take into account the progression and interdependence of emotional development impacted by a cancer diagnosis. The themes derived from our qualitative interviews may serve as a foundation for the generation of new items in future HRQoL instruments for YASCC populations. Further testing is required to examine the prevalence, frequency, and breadth of these items in a larger sample.
Decision-related factors in pupil old/new effects: Attention, response execution, and false memory.
Brocher, Andreas; Graf, Tim
2017-07-28
In this study, we investigate the effects of decision-related factors on recognition memory in pupil old/new paradigms. In Experiment 1, we used an old/new paradigm with words and pseudowords and participants made lexical decisions during recognition rather than old/new decisions. Importantly, participants were instructed to focus on the nonword-likeness of presented items, not their word-likeness. We obtained no old/new effects. In Experiment 2, participants discriminated old from new words and old from new pseudowords during recognition, and they did so as quickly as possible. We found old/new effects for both words and pseudowords. In Experiment 3, we used materials and an old/new design known to elicit a large number of incorrect responses. For false alarms ("old" response for new word), we found larger pupils than for correctly classified new items, starting at the point at which response execution was allowed (2750ms post stimulus onset). In contrast, pupil size for misses ("new" response for old word) was statistically indistinguishable from pupil size in correct rejections. Taken together, our data suggest that pupil old/new effects result more from the intentional use of memory than from its automatic use. Copyright © 2017 Elsevier Ltd. All rights reserved.
Optimal Assignment Methods in Three-Form Planned Missing Data Designs for Longitudinal Panel Studies
ERIC Educational Resources Information Center
Jorgensen, Terrence D.; Rhemtulla, Mijke; Schoemann, Alexander; McPherson, Brent; Wu, Wei; Little, Todd D.
2014-01-01
Planned missing designs are becoming increasingly popular, but because there is no consensus on how to implement them in longitudinal research, we simulated longitudinal data to distinguish between strategies of assigning items to forms and of assigning forms to participants across measurement occasions. Using relative efficiency as the criterion,…
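For readers unfamiliar with the three-form design, the sketch below shows its basic layout: items are split into a common block X and three rotating blocks A, B and C, and each form omits exactly one rotating block. The random assignment of forms to participants is only one possible strategy and is an assumption here; the study compares several strategies that are not reproduced.

```python
import random

# Standard three-form layout: every form carries X plus two of A, B, C.
FORMS = {
    1: ("X", "A", "B"),  # omits block C
    2: ("X", "A", "C"),  # omits block B
    3: ("X", "B", "C"),  # omits block A
}


def assign_forms(participant_ids, seed=0):
    """Randomly assign one form per participant (illustrative strategy only)."""
    rng = random.Random(seed)
    return {pid: rng.choice(sorted(FORMS)) for pid in participant_ids}


def missing_block(form):
    """Return the single rotating block that a form leaves out by design."""
    (omitted,) = set("ABC") - set(FORMS[form])
    return omitted


if __name__ == "__main__":
    assignment = assign_forms(range(6))
    for pid, form in assignment.items():
        print(pid, form, FORMS[form], "omits", missing_block(form))
```

Because every pair of blocks appears together on at least one form, all pairwise covariances remain estimable even though no participant answers every item.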
Repetition Blindness for Rotated Objects
ERIC Educational Resources Information Center
Hayward, William G.; Zhou, Guomei; Man, Wai-Fung; Harris, Irina M.
2010-01-01
Repetition blindness (RB) is the finding that observers often miss the repetition of an item within a rapid stream of words or objects. Recent studies have shown that RB for objects is largely unaffected by variations in viewpoint between the repeated items. In 5 experiments, we tested RB under different axes of rotation, with different types of…
Clustering with Missing Values: No Imputation Required
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri
2004-01-01
Clustering algorithms can identify groups in large data sets, such as star catalogs and hyperspectral images. In general, clustering methods cannot analyze items that have missing data values. Common solutions either fill in the missing values (imputation) or ignore the missing data (marginalization). Imputed values are treated as just as reliable as the truly observed data, but they are only as good as the assumptions used to create them. In contrast, we present a method for encoding partially observed features as a set of supplemental soft constraints and introduce the KSC algorithm, which incorporates constraints into the clustering process. In experiments on artificial data and data from the Sloan Digital Sky Survey, we show that soft constraints are an effective way to enable clustering with missing values.
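As a point of comparison for the two common solutions named in the abstract, here is a minimal sketch of a marginalization-style distance that simply skips features missing in either item. It is not the KSC algorithm, whose soft-constraint encoding is not described in enough detail here to reproduce; None as the missing-value marker is an assumption.

```python
from math import sqrt


def marginalized_distance(a, b):
    """Euclidean distance over features observed in both items.

    None marks a missing value. Returns None when the two items share
    no observed feature, since no distance can then be computed.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return None
    return sqrt(sum((x - y) ** 2 for x, y in pairs))


if __name__ == "__main__":
    print(marginalized_distance([0.0, None, 3.0], [4.0, 5.0, 0.0]))
```

A distance of this kind discards whatever the partially observed feature could have contributed, which is the limitation the soft-constraint approach is meant to address.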
Discrete-Slots Models of Visual Working-Memory Response Times
Donkin, Christopher; Nosofsky, Robert M.; Gold, Jason M.; Shiffrin, Richard M.
2014-01-01
Much recent research has aimed to establish whether visual working memory (WM) is better characterized by a limited number of discrete all-or-none slots or by a continuous sharing of memory resources. To date, however, researchers have not considered the response-time (RT) predictions of discrete-slots versus shared-resources models. To complement the past research in this field, we formalize a family of mixed-state, discrete-slots models for explaining choice and RTs in tasks of visual WM change detection. In the tasks under investigation, a small set of visual items is presented, followed by a test item in 1 of the studied positions for which a change judgment must be made. According to the models, if the studied item in that position is retained in 1 of the discrete slots, then a memory-based evidence-accumulation process determines the choice and the RT; if the studied item in that position is missing, then a guessing-based accumulation process operates. Observed RT distributions are therefore theorized to arise as probabilistic mixtures of the memory-based and guessing distributions. We formalize an analogous set of continuous shared-resources models. The model classes are tested on individual subjects with both qualitative contrasts and quantitative fits to RT-distribution data. The discrete-slots models provide much better qualitative and quantitative accounts of the RT and choice data than do the shared-resources models, although there is some evidence for “slots plus resources” when memory set size is very small. PMID:24015956
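The mixture idea in this abstract can be written down compactly. In the sketch below, the observed RT density is a weighted sum of a memory-based density and a guessing density, with the weight equal to the probability that the probed item is held in a slot. The two component densities are placeholders supplied by the caller; the paper derives them from evidence-accumulation processes, which are not reproduced here.

```python
def mixture_density(t, p_memory, memory_pdf, guess_pdf):
    """Density of response time t under a two-state mixture.

    p_memory: probability that the probed item occupies a slot.
    memory_pdf, guess_pdf: callables giving each state's RT density
    (hypothetical stand-ins for the paper's accumulator models).
    """
    if not 0.0 <= p_memory <= 1.0:
        raise ValueError("p_memory must lie in [0, 1]")
    return p_memory * memory_pdf(t) + (1.0 - p_memory) * guess_pdf(t)


if __name__ == "__main__":
    # Toy constant densities, used only to show the weighting.
    print(mixture_density(0.5, 0.75, lambda t: 2.0, lambda t: 1.0))
```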
Planned Missing Designs to Optimize the Efficiency of Latent Growth Parameter Estimates
ERIC Educational Resources Information Center
Rhemtulla, Mijke; Jia, Fan; Wu, Wei; Little, Todd D.
2014-01-01
We examine the performance of planned missing (PM) designs for correlated latent growth curve models. Using simulated data from a model where latent growth curves are fitted to two constructs over five time points, we apply three kinds of planned missingness. The first is item-level planned missingness using a three-form design at each wave such…
A qualitative examination of the content validity of the EQ-5D-5L in patients with type 2 diabetes.
Matza, Louis S; Boye, Kristina S; Stewart, Katie D; Curtis, Bradley H; Reaney, Matthew; Landrian, Amanda S
2015-12-01
The EQ-5D is frequently used to derive utilities for patients with type 2 diabetes (T2D). Despite widely available quantitative psychometric data on the EQ-5D, little is known about content validity in this population. Thus, the purpose of this qualitative study was to examine content validity of the EQ-5D in patients with T2D. Patients with T2D in the UK completed concept elicitation interviews, followed by administration of the EQ-5D-5L and cognitive interviewing focused on the instrument's relevance, clarity, and comprehensiveness. A total of 25 participants completed interviews (52.0 % male; mean age = 53.5 years). Approximately half (52 %) reported that the EQ-5D-5L was relevant to their experience with T2D. When asked if each individual item was relevant to their experience with T2D, responses varied widely (24.0 % said the self-care item was relevant; 68.0 % said the anxiety/depression item was relevant). Participants frequently said items were not relevant to themselves, but could be relevant to patients with more severe diabetes. Most participants (92.0 %) reported that T2D and/or its treatment/monitoring requirements had an impact on their quality of life that was not captured by the EQ-5D-5L. Common missing concepts included food awareness/restriction (n = 13, 52.0 %); activities (n = 11, 44.0 %); emotional functioning other than depression/anxiety (n = 8, 32.0 %); and social/relationship functioning (n = 8, 32.0 %). The results highlight strengths and potential limitations of the EQ-5D-5L, including missing content that could be important for some patients with T2D. Suggestions for addressing limitations are provided.
Work conditions and the food choice coping strategies of employed parents.
Devine, Carol M; Farrell, Tracy J; Blake, Christine E; Jastran, Margaret; Wethington, Elaine; Bisogni, Carole A
2009-01-01
How work conditions relate to parents' food choice coping strategies. Pilot telephone survey. City in the northeastern United States (US). Black, white, and Hispanic employed mothers (25) and fathers (25) randomly recruited from low-/moderate-income zip codes; 78% of those reached and eligible participated. Sociodemographic characteristics; work conditions (hours, shift, job schedule, security, satisfaction, food access); food choice coping strategies (22 behavioral items for managing food in response to work and family demands, i.e., food prepared at/away from home, missing meals, individualizing meals, speeding up, planning). Two-tailed chi-square and Fisher exact tests (P ≤ .05, unless noted). Half or more of respondents often/sometimes used 12 of 22 food choice coping strategies. Long hours and nonstandard hours and schedules were positively associated among fathers with take-out meals, missed family meals, prepared entrees, and eating while working; and among mothers with restaurant meals, missed breakfast, and prepared entrees. Job security, satisfaction, and food access were also associated with gender-specific strategies. Structural work conditions among parents such as job hours, schedule, satisfaction, and food access are associated with food choice coping strategies with importance for dietary quality. Findings have implications for worksite interventions but need examination in a larger sample.
Zhang, Zhiyong; Yuan, Ke-Hai
2016-06-01
Cronbach's coefficient alpha is a widely used reliability measure in social, behavioral, and education sciences. It is reported in nearly every study that involves measuring a construct through multiple items. With non-tau-equivalent items, McDonald's omega has been used as a popular alternative to alpha in the literature. Traditional estimation methods for alpha and omega often implicitly assume that data are complete and normally distributed. This study proposes robust procedures to estimate both alpha and omega as well as corresponding standard errors and confidence intervals from samples that may contain potential outlying observations and missing values. The influence of outlying observations and missing data on the estimates of alpha and omega is investigated through two simulation studies. Results show that the newly developed robust method yields substantially improved alpha and omega estimates as well as better coverage rates of confidence intervals than the conventional nonrobust method. An R package coefficientalpha is developed and demonstrated to obtain robust estimates of alpha and omega.
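For orientation, the conventional (nonrobust) estimator that this study improves on can be sketched as follows. The sketch assumes listwise deletion of respondents with any missing item, which is one common default and not the robust procedure of the coefficientalpha package. The formula is alpha = k/(k-1) * (1 - sum of item variances / variance of the total score).

```python
from statistics import variance


def cronbach_alpha(rows):
    """Conventional coefficient alpha with listwise deletion.

    rows: one list of item scores per respondent; None marks a missing value.
    Assumption: respondents with any missing item are dropped, which is
    NOT the robust method proposed in the study.
    """
    complete = [r for r in rows if all(v is not None for v in r)]
    k = len(complete[0])
    item_vars = [variance([r[j] for r in complete]) for j in range(k)]
    total_var = variance([sum(r) for r in complete])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    data = [[1, 2, 3], [2, 3, 4], [None, 1, 1], [3, 4, 5], [4, 5, 6]]
    print(round(cronbach_alpha(data), 3))
```

Listwise deletion discards every respondent with even one missing item, and sample variances are sensitive to outlying observations, which are the two weaknesses the robust procedures target.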
Zhang, Zhiyong; Yuan, Ke-Hai
2015-01-01
Cronbach’s coefficient alpha is a widely used reliability measure in social, behavioral, and education sciences. It is reported in nearly every study that involves measuring a construct through multiple items. With non-tau-equivalent items, McDonald’s omega has been used as a popular alternative to alpha in the literature. Traditional estimation methods for alpha and omega often implicitly assume that data are complete and normally distributed. This study proposes robust procedures to estimate both alpha and omega as well as corresponding standard errors and confidence intervals from samples that may contain potential outlying observations and missing values. The influence of outlying observations and missing data on the estimates of alpha and omega is investigated through two simulation studies. Results show that the newly developed robust method yields substantially improved alpha and omega estimates as well as better coverage rates of confidence intervals than the conventional nonrobust method. An R package coefficientalpha is developed and demonstrated to obtain robust estimates of alpha and omega. PMID:29795870
ERIC Educational Resources Information Center
Köhler, Carmen; Pohl, Steffi; Carstensen, Claus H.
2017-01-01
Competence data from low-stakes educational large-scale assessment studies allow for evaluating relationships between competencies and other variables. The impact of item-level nonresponse has not been investigated with regard to statistics that determine the size of these relationships (e.g., correlations, regression coefficients). Classical…
Normal Theory Two-Stage ML Estimator When Data Are Missing at the Item Level
ERIC Educational Resources Information Center
Savalei, Victoria; Rhemtulla, Mijke
2017-01-01
In many modeling contexts, the variables in the model are linear composites of the raw items measured for each participant; for instance, regression and path analysis models rely on scale scores, and structural equation models often use parcels as indicators of latent constructs. Currently, no analytic estimation method exists to appropriately…
Pompilus, Farrah; Burgess, Somali; Hudgens, Stacie; Banderas, Benjamin; Daniels, Selena
2015-12-01
Facial lines or wrinkles are among the most visible signs of aging, and minimally invasive cosmetic procedures are becoming increasingly popular. The aim of this study was to develop and validate the Facial Line Satisfaction Questionnaire (FLSQ) for use in adults with upper facial lines (UFL). A literature review, concept elicitation interviews (n = 33), and cognitive debriefing interviews (n = 23) of adults with UFL were conducted to develop the FLSQ. The FLSQ comprises Baseline and Follow-up versions and was field-tested with 150 subjects in a US observational study designed to assess its psychometric performance. Analyses included acceptability (item and scale distribution [i.e. missingness, floor, and ceiling effects]), reliability, and validity (including concurrent validity). In total, 69 concepts were elicited during patient interviews. Following cognitive debriefing interviews, the FLSQ-Baseline version included 11 items and the Follow-up version included 13 items. Response rates for the FLSQ were 100% and 73% at baseline and follow-up, respectively; no items had excessive missing data. Questionnaire scale scores were normally distributed. Most domain scores demonstrated good internal consistency reliability (Cronbach's α ≥ 0.70). Most items within their respective domains exhibited good convergent (item-scale correlations > 0.40) and discriminant (items had higher correlation with their hypothesized scales than other scales) validity. Concurrent validity correlation coefficients of the FLSQ domain scores with the associated concurrent measures were acceptable (range: r = 0.40-0.70). Six FLSQ items demonstrated reliability and validity as stand-alone items outside their domains. The FLSQ is a valid questionnaire for assessing treatment expectations, satisfaction, impact, and preference in adults with UFL. © 2015 The Authors. Journal of Cosmetic Dermatology Published by Wiley Periodicals, Inc.
PACIC Instrument: disentangling dimensions using published validation models.
Iglesias, K; Burnand, B; Peytremann-Bridevaux, I
2014-06-01
To better understand the structure of the Patient Assessment of Chronic Illness Care (PACIC) instrument. More specifically, to test all published validation models, using one single data set and appropriate statistical tools. Validation study using data from cross-sectional survey. A population-based sample of non-institutionalized adults with diabetes residing in Switzerland (canton of Vaud). French version of the 20-item PACIC instrument (5-point response scale). We conducted validation analyses using confirmatory factor analysis (CFA). The original five-dimension model and other published models were tested with three types of CFA: based on (i) a Pearson estimator of variance-covariance matrix, (ii) a polychoric correlation matrix and (iii) a likelihood estimation with a multinomial distribution for the manifest variables. All models were assessed using loadings and goodness-of-fit measures. The analytical sample included 406 patients. Mean age was 64.4 years and 59% were men. Median of item responses varied between 1 and 4 (range 1-5), and range of missing values was between 5.7 and 12.3%. Strong floor and ceiling effects were present. Even though loadings of the tested models were relatively high, the only model showing acceptable fit was the 11-item single-dimension model. PACIC was associated with the expected variables of the field. Our results showed that the model considering 11 items in a single dimension exhibited the best fit for our data. A single score, in complement to the consideration of single-item results, might be used instead of the five dimensions usually described. © The Author 2014. Published by Oxford University Press in association with the International Society for Quality in Health Care; all rights reserved.
Dykema, Jennifer; Stevenson, John; Kniss, Chad; Kvale, Katherine; González, Kim; Cautley, Eleanor
2012-05-01
From 2009 to 2010, an experiment was conducted to increase response rates among African American mothers in the Wisconsin Pregnancy Risk Assessment Monitoring System (PRAMS). Sample members were randomly assigned to groups that received a prepaid, cash incentive of $5 (n = 219); a coupon for diapers valued at $6 (n = 210); or no incentive (n = 209). Incentives were included with the questionnaire, which was mailed to respondents. We examined the effects of the incentives on several outcomes, including response rates, cost effectiveness, survey response distributions, and item nonresponse. Response rates were significantly higher for the cash group than for the coupon (42.5 vs. 32.4%, P < .05) or no incentive group (42.5 vs. 30.1%, P < .01); the coupon and no incentive groups performed similarly. While absolute costs were the highest for the cash group, the cost per completed survey was the lowest. The incentives had limited effects on response distributions for specific survey questions. Although respondents completing the survey by mail in the cash and coupon groups exhibited a trend toward being less likely to have missing data, the effect was not significant. Compared to a coupon or no incentive, a small cash incentive significantly improved response rates and was cost effective among African American respondents in Wisconsin PRAMS. Incentives had only limited effects, however, on survey response distributions, and no significant effects on item nonresponse.
Solari, A; Mattarozzi, K; Vignatelli, L; Giordano, A; Russo, P M; Uccelli, M Messmer; D'Alessandro, R
2010-10-01
We describe the development and clinical validation of a patient self-administered tool assessing the quality of multiple sclerosis diagnosis disclosure. A multiple sclerosis expert panel generated questionnaire items from the Doctor's Interpersonal Skills Questionnaire, literature review, and interviews with neurology inpatients. The resulting 19-item Comunicazione medico-paziente nella Sclerosi Multipla (COSM) was pilot tested/debriefed on seven patients with multiple sclerosis and administered to 80 patients newly diagnosed with multiple sclerosis. The resulting revised 20-item version (COSM-R) was debriefed on five patients with multiple sclerosis, field tested/debriefed on multiple sclerosis patients, and field tested on 105 patients newly diagnosed with multiple sclerosis participating in a clinical trial on an information aid. The hypothesized monofactorial structure of COSM-R section 2 was tested on the latter two groups. The questionnaire was well accepted. Scaling assumptions were satisfactory in terms of score distributions, item-total correlations and internal consistency. Factor analysis confirmed section 2's monofactorial structure, which was also test-retest reliable (intraclass correlation coefficient [ICC] 0.73; 95% CI 0.54-0.85). Section 1 had only fair test-retest reliability (ICC 0.45; 95% CI 0.12-0.69), and three items had 8-21% missed responses. COSM-R is a brief, easy-to-interpret MS-specific questionnaire for use as a health care indicator.
Meikle, Mary B; Henry, James A; Griest, Susan E; Stewart, Barbara J; Abrams, Harvey B; McArdle, Rachel; Myers, Paula J; Newman, Craig W; Sandridge, Sharon; Turk, Dennis C; Folmer, Robert L; Frederick, Eric J; House, John W; Jacobson, Gary P; Kinney, Sam E; Martin, William H; Nagler, Stephen M; Reich, Gloria E; Searchfield, Grant; Sweetow, Robert; Vernon, Jack A
2012-01-01
Chronic subjective tinnitus is a prevalent condition that causes significant distress to millions of Americans. Effective tinnitus treatments are urgently needed, but evaluating them is hampered by the lack of standardized measures that are validated for both intake assessment and evaluation of treatment outcomes. This work was designed to develop a new self-report questionnaire, the Tinnitus Functional Index (TFI), that would have documented validity both for scaling the severity and negative impact of tinnitus for use in intake assessment and for measuring treatment-related changes in tinnitus (responsiveness) and that would provide comprehensive coverage of multiple tinnitus severity domains. To use preexisting knowledge concerning tinnitus-related problems, an Item Selection Panel (17 expert judges) surveyed the content (175 items) of nine widely used tinnitus questionnaires. From those items, the Panel identified 13 separate domains of tinnitus distress and selected 70 items most likely to be responsive to treatment effects. Eliminating redundant items while retaining good content validity and adding new items to achieve the recommended minimum of 3 to 4 items per domain yielded 43 items, which were then used for constructing TFI Prototype 1. Prototype 1 was tested at five clinics. The 326 participants included consecutive patients receiving tinnitus treatment who provided informed consent, constituting a convenience sample. Construct validity of Prototype 1 as an outcome measure was evaluated by measuring responsiveness of the overall scale and its individual items at 3 and 6 mo follow-up with 65 and 42 participants, respectively. Using a predetermined list of criteria, the 30 best-functioning items were selected for constructing TFI Prototype 2. Prototype 2 was tested at four clinics with 347 participants, including 155 and 86 who provided 3 and 6 mo follow-up data, respectively. Analyses were the same as for Prototype 1.
Results were used to select the 25 best-functioning items for the final TFI. Both prototypes and the final TFI displayed strong measurement properties, with few missing data, high validity for scaling of tinnitus severity, and good reliability. All TFI versions exhibited the same eight factors characterizing tinnitus severity and negative impact. Responsiveness, evaluated by computing effect sizes for responses at follow-up, was satisfactory in all TFI versions. In the final TFI, Cronbach's alpha was 0.97 and test-retest reliability 0.78. Convergent validity (r = 0.86 with Tinnitus Handicap Inventory [THI]; r = 0.75 with Visual Analog Scale [VAS]) and discriminant validity (r = 0.56 with Beck Depression Inventory-Primary Care [BDI-PC]) were good. The final TFI was successful at detecting improvement from the initial clinic visit to 3 mo with moderate to large effect sizes and from initial to 6 mo with large effect sizes. Effect sizes for the TFI were generally larger than those obtained for the VAS and THI. After careful evaluation, a 13-point reduction was considered a preliminary criterion for meaningful reduction in TFI outcome scores. The TFI should be useful in both clinical and research settings because of its responsiveness to treatment-related change, validity for scaling the overall severity of tinnitus, and comprehensive coverage of multiple domains of tinnitus severity.
Braekman, Elise; Berete, Finaba; Charafeddine, Rana; Demarest, Stefaan; Drieskens, Sabine; Gisle, Lydia; Molenberghs, Geert; Tafforeau, Jean; Van der Heyden, Johan; Van Hal, Guido
2018-01-01
Before organizing mixed-mode data collection for the self-administered questionnaire of the Belgian Health Interview Survey, measurement effects between the paper-and-pencil and the web-based questionnaire were evaluated. A two-period cross-over study was organized with a sample of 149 employees of two Belgian research institutes (age range 22-62 years, 72% female). Measurement agreement was assessed for a diverse range of health indicators related to general health, mental and psychosocial health, health behaviors and prevention with kappa coefficients and intraclass correlation (ICC). The quality of the data collected by both modes was evaluated by quantifying the missing, 'don't know' and inconsistent values and data entry mistakes. Good to very good agreement was found for all categorical indicators, with kappa coefficients above 0.60, except for two mental and psychosocial health indicators, namely the presence of a sleeping disorder and of a depressive disorder (kappa ≥ 0.50). For the continuous indicators, high to acceptable agreement was observed, with ICC above 0.70. Inconsistent answers and data-entry mistakes occurred only in the paper-and-pencil mode. There were no fewer missing values in the web-based mode than in the paper-and-pencil mode. The study supports the idea that web-based modes provide, in general, equal responses to paper-and-pencil modes. However, health indicators based upon factual and objective items tend to have higher measurement agreement than indicators requiring an assessment of personal subjective feelings. A web-based mode greatly facilitates the data-entry process and guides the completion of a questionnaire. However, item non-response was not positively affected.
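The kappa coefficients used above to quantify agreement between modes can be illustrated with a short sketch of unweighted Cohen's kappa. The abstract does not state which kappa variant was computed, so the unweighted form is an assumption; the answers below are made up, and `cohens_kappa` is a hypothetical helper name.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two paired lists of categorical answers."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(ratings_a)
    # Observed proportion of identical answers across the two modes.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from the marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

# Illustrative (made-up) answers from the same 10 people in each mode.
paper = ["yes", "yes", "no", "no", "yes", "no", "no", "no", "yes", "no"]
web = ["yes", "yes", "no", "no", "no", "no", "no", "no", "yes", "yes"]
kappa = cohens_kappa(paper, web)
print(f"kappa = {kappa:.2f}")
```

Here 8 of 10 answers match, but because chance agreement is already 0.52 given these marginals, kappa is noticeably lower than the raw 80% agreement.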
Implementing AORN recommended practices for prevention of retained surgical items.
Goldberg, Judith L; Feldman, David L
2012-02-01
Retention of a surgical item is a preventable event that can result in patient injury. AORN's "Recommended practices for prevention of retained surgical items" emphasizes the importance of using a multidisciplinary approach for prevention. Procedures should include counts of soft goods, needles, miscellaneous items, and instruments, and efforts should be made to prevent retention of fragments of broken devices. If a count discrepancy occurs, the perioperative team should follow procedures to locate the missing item. Perioperative leaders may consider the use of adjunct technologies such as bar-code scanning, radio-frequency detection, and radio-frequency identification. Ambulatory and hospital patient scenarios are included to exemplify appropriate strategies for preventing retained surgical items. Copyright © 2012 AORN, Inc. Published by Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Zhang, Zhiyong; Yuan, Ke-Hai
2016-01-01
Cronbach's coefficient alpha is a widely used reliability measure in social, behavioral, and education sciences. It is reported in nearly every study that involves measuring a construct through multiple items. With non-tau-equivalent items, McDonald's omega has been used as a popular alternative to alpha in the literature. Traditional estimation…
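For orientation, the textbook complete-data formula for coefficient alpha is alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The sketch below implements only that classical formula on made-up scores with no missing values; it is not the estimation approach of the paper, whose abstract is truncated here, and `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: list of respondents, each a list of k item scores (no missing values)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the total (sum) score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Illustrative (made-up) scores: 5 respondents, 3 items.
scores = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]
alpha = cronbach_alpha(scores)
print(f"alpha = {alpha:.3f}")
```

Because this version requires complete rows, any respondent with a missing item would have to be dropped or imputed first, which is exactly where the handling of missing data becomes a modeling decision.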
Validation of the Kohnen Restless Legs Syndrome-Quality of Life instrument.
Kohnen, Ralf; Martinez-Martin, Pablo; Benes, Heike; Trenkwalder, Claudia; Högl, Birgit; Dunkl, Elmar; Walters, Arthur S
2016-08-01
Due to the symptoms and the sleep disturbances it causes, Restless Legs Syndrome (RLS) has a negative impact on quality of life. Measurement of such impact can be performed by means of questionnaires, such as the Kohnen Restless Legs Syndrome-Quality of Life questionnaire (KRLS-QoL), a specific 12-item instrument that is self-applied by patients. The present study is aimed at performing a first formal validation study of this instrument. Eight hundred ninety-one patients were included for analysis. RLS severity was assessed by the International Restless Legs Scale (IRLS), Restless Legs Syndrome-6 scales (RLS-6), and Clinical Global Impression of Severity. In addition, the Epworth Sleepiness Scale (ESS) was assessed. Acceptability, dimensionality, scaling assumptions, reliability, precision, hypotheses-related validity, and responsiveness were tested. There were missing data in 3.58% of patients. Floor and ceiling effects were low for the subscales, global evaluation, and summary index derived from items 1 to 11 after checking that scaling assumptions were met. Exploratory parallel factor analysis showed that the KRLS-QoL may be deemed unidimensional, i.e., that all components of the scale are part of one overall general quality of life factor. Indexes of internal consistency (alpha = 0.88), item-total correlation (r_S = 0.32-0.71), item homogeneity coefficient (0.41), and scale stability (ICC = 0.73) demonstrated a satisfactory reliability of the KRLS-QoL. Moderate or high correlations were obtained between KRLS-QoL scores and the IRLS, some components of the RLS-6, inter-KRLS-QoL domains, and global evaluations. Known-groups validity for severity levels grouping and responsiveness analysis results were satisfactory, the latter showing higher magnitudes of response for treated than for placebo arms. The KRLS-QoL was proven an acceptable, reliable, valid, and responsive measure to assess the impact of the RLS on quality of life. Copyright © 2016 The Authors.
Published by Elsevier B.V. All rights reserved.
Some Activities of MISSE 6 Mission
NASA Technical Reports Server (NTRS)
Prasad, Narasimha S.
2009-01-01
The objective of the Materials International Space Station Experiment (MISSE) is to study the performance of novel materials when subjected to the synergistic effects of the harsh space environment for several months. In this paper, a few laser and optical elements from NASA Langley Research Center (LaRC) that were flown on the MISSE 6 mission are discussed. These items were characterized and packed inside a ruggedized Passive Experiment Container (PEC) that resembles a suitcase. The PEC was tested for survivability under launch conditions. Subsequently, the MISSE 6 PEC was transported by the STS-123 mission to the International Space Station (ISS) on March 11, 2008. The astronauts successfully attached the PEC to external handrails and opened the PEC for long-term exposure to the space environment. The plan is to retrieve the MISSE 6 PEC by the STS-128 mission in August 2009.
Rübsamen, Nicole; Akmatov, Manas K; Castell, Stefanie; Karch, André; Mikolajczyk, Rafael T
2017-01-01
Increasing availability of the Internet allows using only online data collection for more epidemiological studies. We compare response patterns in a population-based health survey using two survey designs: mixed-mode (choice between paper-and-pencil and online questionnaires) and online-only design (without choice). We used data from a longitudinal panel, the Hygiene and Behaviour Infectious Diseases Study (HaBIDS), conducted in 2014/2015 in four regions in Lower Saxony, Germany. Individuals were recruited using address-based probability sampling. In two regions, individuals could choose between paper-and-pencil and online questionnaires. In the other two regions, individuals were offered online-only participation. We compared sociodemographic characteristics of respondents who filled in all panel questionnaires between the mixed-mode group (n = 1110) and the online-only group (n = 482). Using 134 items, we performed multinomial logistic regression to compare responses between survey designs in terms of type (missing, "do not know" or valid response) and ordinal regression to compare responses in terms of content. We applied the false discovery rate (FDR) to control for multiple testing and investigated the effects of adjusting for sociodemographic characteristics. For validation of the differential response patterns between mixed-mode and online-only, we compared the response patterns between paper and online mode among the respondents in the mixed-mode group in one region (n = 786). Respondents in the online-only group were older than those in the mixed-mode group, but the groups did not differ regarding sex or education. Type of response did not differ between the online-only and the mixed-mode group. Survey design was associated with different content of response in 18 of the 134 investigated items, which decreased to 11 after adjusting for sociodemographic variables.
In the validation within the mixed-mode, only two of those were among the 11 significantly different items. The probability of observing by chance the same two or more significant differences in this setting was 22%. We found similar response patterns in both survey designs with only few items being answered differently, likely attributable to chance. Our study supports the equivalence of the compared survey designs and suggests that, in the studied setting, using online-only design does not cause strong distortion of the results.
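The FDR control mentioned in this abstract is commonly done with the Benjamini-Hochberg step-up procedure. The abstract does not name the exact procedure, so the choice of Benjamini-Hochberg is an assumption; the p-values below are made up for illustration, and `benjamini_hochberg` is a hypothetical helper name.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k (1-based) with p_(k) <= k/m * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# Illustrative (made-up) p-values for 8 items.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
flags = benjamini_hochberg(p_vals, q=0.05)
print(flags)
```

In this toy example five p-values are below 0.05, but only the two smallest survive the correction, which shows how the number of flagged items can shrink once multiple testing across many items is taken into account.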
Missing voices: polling and health care.
Berinsky, Adam J; Margolis, Michele
2011-12-01
Examining data on the recent health care legislation, we demonstrate that public opinion polls on health care should be treated with caution because of item nonresponse--or "don't know" answers--on survey questions. Far from being the great equalizer, opinion polls can actually misrepresent the attitudes of the population. First, we show that respondents with lower levels of socioeconomic resources are systematically more likely to give a "don't know" response when asked their opinion about health care legislation. Second, these same individuals are more likely to back health care reform. The result is an incomplete portrait of public opinion on the issue of health care in the United States.
Quality and accuracy of electronic pre-anesthesia evaluation forms.
Almeshari, Meshari; Khalifa, Mohamed; El-Metwally, Ashraf; Househ, Mowafa; Alanazi, Abdullah
2018-07-01
Paper-based forms have been widely used to document patient health information for anesthesia; however, hospitals are now switching to electronic patient file documentation for anesthesia. The aim of this study is to compare the quality of paper-based and electronic pre-anesthesia assessment forms. The research conducted in this study was quasi-experimental using a pretest-posttest design without a control group. The study was conducted at King Abdulaziz Medical City, Riyadh (KAMC-RD) during November 2015. Paper-based forms were converted into electronic forms, and the paper-based pre-anesthesia forms were used during the first two weeks of the data collection period while electronic forms were completed in the last two weeks. The quality of each (electronic vs. paper) was evaluated with respect to missing items, errors, and unreadable items. The sample included all 15 anesthetists working in the pre-anesthesia clinic at KAMC-RD. The anesthetists completed 25 pre-anesthesia forms daily during a five-day week schedule. A total of 500 patient forms were completed during the study (250 paper-based and 250 electronic forms). Anesthetists' satisfaction with the electronic pre-anesthesia form was also measured using a questionnaire. The electronic form shows significantly higher quality in all assessment categories (missing items, errors, and unreadable items; χ² (2, N = 500) = 171.64, p < 0.001). The satisfaction survey found 81.65% of the anesthetists were satisfied with the electronic pre-anesthesia form for all questions. Our study demonstrates that the electronic pre-anesthesia form has better data quality, meets the expectations of anesthetists and helps reduce missing key preoperative information. This type of approach is imperative for the safety of perioperative patients. Copyright © 2018. Published by Elsevier B.V.
Mapping integration of midwives across the United States: Impact on access, equity, and outcomes
Stoll, Kathrin; MacDorman, Marian; Declercq, Eugene; Cramer, Renee; Cheyney, Melissa; Fisher, Timothy; Butt, Emma; Yang, Y. Tony; Powell Kennedy, Holly
2018-01-01
Poor coordination of care across providers and birth settings has been associated with adverse maternal-newborn outcomes. Research suggests that integration of midwives into regional health systems is a key determinant of optimal maternal-newborn outcomes, yet, to date, the characteristics of an integrated system have not been described, nor linked to health disparities. Methods Our multidisciplinary team examined published regulatory data to inform a 50-state database describing the environment for midwifery practice and interprofessional collaboration. Items (110) detailed differences across jurisdictions in scope of practice, autonomy, governance, and prescriptive authority; as well as restrictions that can affect patient safety, quality, and access to maternity providers across birth settings. A nationwide survey of state regulatory experts (n = 92) verified the ‘on the ground’ relevance, importance, and realities of local interpretation of these state laws. Using a modified Delphi process, we selected 50/110 key items to include in a weighted, composite Midwifery Integration Scoring (MISS) system. Higher scores indicate greater integration of midwives across all settings. We ranked states by MISS scores; and, using reliable indicators in the CDC-Vital Statistics Database, we calculated correlation coefficients between MISS scores and maternal-newborn outcomes by state, as well as state density of midwives and place of birth. We conducted hierarchical linear regression analysis to control for confounding effects of race. Results MISS scores ranged from lowest at 17 (North Carolina) to highest at 61 (Washington), out of 100 points. Higher MISS scores were associated with significantly higher rates of spontaneous vaginal delivery, vaginal birth after cesarean, and breastfeeding, and significantly lower rates of cesarean, preterm birth, low birth weight infants, and neonatal death. 
MISS scores also correlated with density of midwives and access to care across birth settings. Significant differences in newborn outcomes accounted for by MISS scores persisted after controlling for proportion of African American births in each state. Conclusion The MISS scoring system assesses the level of integration of midwives and evaluates regional access to high quality maternity care. In the United States, higher MISS Scores were associated with significantly higher rates of physiologic birth, fewer obstetric interventions, and fewer adverse neonatal outcomes. PMID:29466389
On analyzing ordinal data when responses and covariates are both missing at random.
Rana, Subrata; Roy, Surupa; Das, Kalyan
2016-08-01
In many occasions, particularly in biomedical studies, data are unavailable for some responses and covariates. This leads to biased inference in the analysis when a substantial proportion of responses or a covariate or both are missing. Except a few situations, methods for missing data have earlier been considered either for missing response or for missing covariates, but comparatively little attention has been directed to account for both missing responses and missing covariates, which is partly attributable to complexity in modeling and computation. This seems to be important as the precise impact of substantial missing data depends on the association between two missing data processes as well. The real difficulty arises when the responses are ordinal by nature. We develop a joint model to take into account simultaneously the association between the ordinal response variable and covariates and also that between the missing data indicators. Such a complex model has been analyzed here by using the Markov chain Monte Carlo approach and also by the Monte Carlo relative likelihood approach. Their performance on estimating the model parameters in finite samples have been looked into. We illustrate the application of these two methods using data from an orthodontic study. Analysis of such data provides some interesting information on human habit. © The Author(s) 2013.
Testing of NASA LaRC Materials under MISSE 6 and MISSE 7 Missions
NASA Technical Reports Server (NTRS)
Prasad, Narasimha S.
2009-01-01
The objective of the Materials International Space Station Experiment (MISSE) is to study the performance of novel materials when subjected to the synergistic effects of the harsh space environment for several months. MISSE missions provide an opportunity for developing space-qualifiable materials. Two lasers and a few optical components from NASA Langley Research Center (LaRC) were included in the MISSE 6 mission for long-term exposure. MISSE 6 items were characterized and packed inside a ruggedized Passive Experiment Container (PEC) that resembles a suitcase. The PEC was tested for survivability under launch conditions. MISSE 6 was transported to the International Space Station (ISS) via STS 123 on March 11, 2008. The astronauts successfully attached the PEC to external handrails of the ISS and opened the PEC for long-term exposure to the space environment. The current plan is to bring the MISSE 6 PEC back to Earth via the STS 128 mission scheduled for launch in August 2009. Currently, preparations for launching the MISSE 7 mission are progressing. Laser and lidar components assembled on a flight-worthy platform are included from NASA LaRC. MISSE 7 is scheduled to launch on the STS 129 mission. This paper will briefly review recent efforts on the MISSE 6 and MISSE 7 missions at NASA Langley Research Center (LaRC).
Rolls, Edmund T
2017-05-01
The art of memory (ars memoriae) used since classical times includes using a well-known scene to associate each view or part of the scene with a different item in a speech. This memory technique is also known as the "method of loci." The new theory is proposed that this type of memory is implemented in the CA3 region of the hippocampus where there are spatial view cells in primates that allow a particular view to be associated with a particular object in an event or episodic memory. Given that the CA3 cells with their extensive recurrent collateral system connecting different CA3 cells, and associative synaptic modifiability, form an autoassociation or attractor network, the spatial view cells with their approximately Gaussian view fields become linked in a continuous attractor network. As the view space is traversed continuously (e.g., by self-motion or imagined self-motion across the scene), the views are therefore successively recalled in the correct order, with no view missing, and with low interference between the items to be recalled. Given that each spatial view has been associated with a different discrete item, the items are recalled in the correct order, with none missing. This is the first neuroscience theory of ars memoriae. The theory provides a foundation for understanding how a key feature of ars memoriae, the ability to use a spatial scene to encode a sequence of items to be remembered, is implemented. © 2017 Wiley Periodicals, Inc.
The Use of an Audience Response System in an Elementary School–Based Health Education Program
DeSorbo, Alexandra L.; Noble, James M.; Shaffer, Michele; Gerin, William; Williams, Olajide A.
2016-01-01
Background The audience response system (ARS) allows students to respond and interact anonymously with teachers via small handheld wireless keypads. Despite increasing popularity in classroom settings, the application of these devices to health education programming has not been studied. We assessed feasibility, engagement, and learning among children using an ARS compared with traditional pencil–paper formats for a stroke health education program. Method We compared outcome data generated via an ARS-based intervention to pencil–paper controls, including test scores and missing data rates among 265 schoolchildren 9 to 11 years old participating in stroke education. Among 119 children, we evaluated the feasibility of ARS use and explored student motivation with a 10-item questionnaire. We assessed facilitator experience with both methods. Results ARS use is feasible. Students reported having more fun (p < .001), increased attention (p < .001), participation (p < .001), and perceived learning outcomes (p < .001) compared with pencil–paper controls. Test scores showed highly positive improvement for both ARS and paper without additional benefits of ARS on learning. There was no difference in missing data rates (p < .001). Educators preferred the ARS. Conclusion The use of an ARS among children is feasible and improves student and facilitator engagement without additional benefits on stroke learning. PMID:23086554
Empirical likelihood method for non-ignorable missing data problems.
Guan, Zhong; Qin, Jing
2017-01-01
Missing response problem is ubiquitous in survey sampling, medical, social science and epidemiology studies. It is well known that non-ignorable missing is the most difficult missing data problem where the missing of a response depends on its own value. In statistical literature, unlike the ignorable missing data problem, not many papers on non-ignorable missing data are available except for the full parametric model based approach. In this paper we study a semiparametric model for non-ignorable missing data in which the missing probability is known up to some parameters, but the underlying distributions are not specified. By employing Owen (1988)'s empirical likelihood method we can obtain the constrained maximum empirical likelihood estimators of the parameters in the missing probability and the mean response which are shown to be asymptotically normal. Moreover the likelihood ratio statistic can be used to test whether the missing of the responses is non-ignorable or completely at random. The theoretical results are confirmed by a simulation study. As an illustration, the analysis of a real AIDS trial data shows that the missing of CD4 counts around two years are non-ignorable and the sample mean based on observed data only is biased.
The EORTC module for quality of life in patients with thyroid cancer: phase III.
Singer, Susanne; Jordan, Susan; Locati, Laura D; Pinto, Monica; Tomaszewska, Iwona M; Araújo, Cláudia; Hammerlid, Eva; Vidhubala, E; Husson, Olga; Kiyota, Naomi; Brannan, Christine; Salem, Dina; Gamper, Eva M; Arraras, Juan Ignacio; Ioannidis, Georgios; Andry, Guy; Inhestern, Johanna; Grégoire, Vincent; Licitra, Lisa
2017-04-01
The purpose of the study was to pilot-test a questionnaire measuring health-related quality of life (QoL) in thyroid cancer patients to be used with the European Organisation for Research and Treatment of Cancer (EORTC) core questionnaire EORTC QLQ-C30. A provisional questionnaire with 47 items was administered to patients treated for thyroid cancer within the last 2 years. Patients were interviewed about time and help needed to complete the questionnaire, and whether they found the items understandable, confusing or annoying. Items were kept in the questionnaire if they fulfilled pre-defined criteria: relevant to the patients, easy to understand, not confusing, few missing values, neither floor nor ceiling effects, and high variance. A total of 182 thyroid cancer patients in 15 countries participated (n = 115 with papillary, n = 31 with follicular, n = 22 with medullary, n = 6 with anaplastic, and n = 8 with other types of thyroid cancer). Sixty-six percent of the patients needed 15 min or less to complete the questionnaire. Of the 47 items, 31 fulfilled the predefined criteria and were kept unchanged, 14 were removed, and 2 were changed. Shoulder dysfunction was mentioned by 5 patients as missing and an item covering this issue was added. To conclude, the EORTC quality of life module for thyroid cancer (EORTC QLQ-THY34) is ready for the final validation phase IV. © 2017 Society for Endocrinology.
Amplified Striatal Responses to Near-Miss Outcomes in Pathological Gamblers
Sescousse, Guillaume; Janssen, Lieneke K; Hashemi, Mahur M; Timmer, Monique H M; Geurts, Dirk E M; ter Huurne, Niels P; Clark, Luke; Cools, Roshan
2016-01-01
Near-misses in gambling games are losing events that come close to a win. Near-misses were previously shown to recruit reward-related brain regions including the ventral striatum, and to invigorate gambling behavior, supposedly by fostering an illusion of control. Given that pathological gamblers are particularly vulnerable to such cognitive illusions, their persistent gambling behavior might result from an amplified striatal sensitivity to near-misses. In addition, animal studies have shown that behavioral responses to near-miss-like events are sensitive to dopamine, but this dopaminergic influence has not been tested in humans. To investigate these hypotheses, we recruited 22 pathological gamblers and 22 healthy controls who played a slot machine task delivering wins, near-misses and full-misses, inside an fMRI scanner. Each participant played the task twice, once under placebo and once under a dopamine D2 receptor antagonist (sulpiride 400 mg), in a double-blind, counter-balanced design. Participants were asked about their motivation to continue gambling throughout the task. Across all participants, near-misses elicited higher motivation to continue gambling and increased striatal responses compared with full-misses. Crucially, pathological gamblers showed amplified striatal responses to near-misses compared with controls. These group differences were not observed following win outcomes. In contrast to our hypothesis, sulpiride did not induce any reliable modulation of brain responses to near-misses. Together, our results demonstrate that pathological gamblers have amplified brain responses to near-misses, which likely contribute to their persistent gambling behavior. However, there is no evidence that these responses are influenced by dopamine. These results have implications for treatment and gambling regulation. PMID:27006113
RFID Meets GWOT: Considering a New Technology for a New Kind of War
2006-06-01
creating a tag that will enable commuters to travel conveniently throughout the city’s subway system. While these items have yet to come, they are... food items that will let your refrigerator know what is missing inside. In turn, your refrigerator will communicate to your cell phone that you need...regular routines. Additionally, businesses reopened and outside investors sought new opportunities as they brought the island’s first franchise
Code Description for Generation of Meteorological Height and Pressure Level and Layer Profiles
2016-06-01
defined by user input height or pressure levels. It can process input profiles from sensing systems such as radiosonde, lidar, or wind profiling radar...nearly the same way, but the split between wind and temperature/humidity (TH) special levels leads to some changes to one other routine. If changes are...top of the sounding, sometimes the moisture, the thermal, both thermal and moisture, and/or the wind data are missing. Missing data items in the
What is the value of the SAGES/AORN MIS checklist? A multi-institutional practical assessment.
Benham, Emily; Richardson, William; Dort, Jonathan; Lin, Henry; Tummers, A Michael; Walker, Travelyan M; Stefanidis, Dimitrios
2017-04-01
Surgical safety checklists reduce perioperative complications and mortality. Given that minimally invasive surgery (MIS) is dependent on technology and vulnerable to equipment failure, SAGES and AORN partnered to create a MIS checklist to optimize case flow and minimize errors. The aim of this project was to evaluate the effectiveness of the SAGES/AORN checklist in preventing disruptions and determine its ease of use. The checklist was implemented across four institutions and completed by the operating team. To assess its effectiveness, we recorded how often the checklist identified problems and how frequently each of the 45 checklist items was not completed. The perceived usefulness, ease of use, and frustration associated with checklist use were rated on a 5-point Likert scale by the surgeon. We assessed any differences dependent on timing of checklist completion and among institutions. The checklist was performed during MIS procedures (n = 114). When used before the procedure (n = 36), the checklist identified missing items in 13 cases (36.11 %). When used after the procedure (n = 61), the checklist identified missing items in 18 cases (29.51 %) that caused a delay of 4.1 ± 11.1 min. The most frequently missed items included preference card review (14.0 %), readiness of the carbon dioxide insufflator (8.7 %), and availability of the Veress needle (3.6 %). The checklist took an average of 3.6 ± 2.7 min to complete with its usefulness rated 2.6 ± 1.5, ease of use 2.0 ± 1.2, and frustration 1.3 ± 1.1. The checklist identified problems in 24 % of cases that led to preventable delays. The checklist was easy to complete and not frustrating, indicating it could improve operative flow. This study also identified the most useful items which may help abbreviate the checklist, minimizing the frustration and time taken to complete it while maximizing its utility. These attributes of the SAGES/AORN MIS checklist should be explored in future larger-scale studies.
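The detection rates quoted in this abstract can be re-derived from the reported counts. The sketch below is only an arithmetic check; the helper name `percent` and the pooled figure are illustrative additions, and the abstract does not state the denominator behind its 24 % figure, so the pooled value is not a reconstruction of it.

```python
# Counts as reported in the abstract: (cases with missing items, cases checked).
reported = {
    "before procedure": (13, 36),
    "after procedure": (18, 61),
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


for timing, (hits, n) in reported.items():
    print(f"{timing}: {hits}/{n} = {percent(hits, n):.2f} %")

# Pooled rate over the cases whose timing is reported (36 + 61 = 97 of 114).
pooled = percent(13 + 18, 36 + 61)
print(f"pooled over 97 cases: {pooled:.2f} %")
```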
Howard, Michelle; Day, Andrew G; Bernard, Carrie; Tan, Amy; You, John; Klein, Doug; Heyland, Daren K
2018-01-01
Valid and reliable measurement of barriers to advance care planning (ACP) in health care settings can inform the design of robust interventions. This article describes the development and psychometric evaluation of an instrument to measure the presence and magnitude of perceived barriers to ACP discussion with patients from the perspective of family physicians. A questionnaire was designed through literature review and expert input, asking family physicians to rate the importance of barriers (0 = not at all a barrier and 6 = an extreme amount) to ACP discussions with patients, and was administered to 117 physicians. Floor effects and missing data patterns were examined. Item-by-item correlations were examined using Pearson correlation. Exploratory factor analysis was conducted (iterated principal factor analysis with oblique rotation), internal consistency (Cronbach's alpha) overall and within factors was calculated, and construct validity was evaluated by calculating three correlations with related questions that were specified a priori. The questionnaire included 31 questions in three domains relating to the clinician, patient/family and system or external factors. No items were removed due to missing data, floor effects, or high correlation with another item. A solution of three factors accounted for 71% of variance. One item was removed because it did not load strongly on any factor. All other items except one remained in the original domain in the questionnaire. Cronbach's alpha for the three factors ranged from 0.84 to 0.90. Two of three a priori correlations with related questions were statistically significant. This questionnaire to assess barriers to ACP discussion from the perspective of family physicians demonstrates preliminary evidence of reliability and validity. Copyright © 2017 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
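Internal consistency in this record is summarised with Cronbach's alpha. As a minimal sketch of how that coefficient is usually computed from complete item ratings, the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores) can be written as follows. The function name and the ratings are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(rows: list[list[float]]) -> float:
    """rows[r][i] is respondent r's rating on item i (complete data only)."""
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical ratings on a 0-6 barrier scale: 5 respondents x 3 items.
ratings = [
    [0, 1, 1],
    [2, 2, 3],
    [3, 4, 4],
    [5, 5, 6],
    [6, 6, 6],
]
print(round(cronbach_alpha(ratings), 3))
```

The sketch uses sample variances and assumes no missing ratings; respondents with missing items would need to be handled before calling it.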
van der Kooy, Jacoba; Valentine, Nicole B; Birnie, Erwin; Vujkovic, Marijana; de Graaf, Johanna P; Denktaş, Semiha; Steegers, Eric A P; Bonsel, Gouke J
2014-12-03
The concept of responsiveness, introduced by the World Health Organization (WHO), addresses non-clinical aspects of health service quality that are relevant regardless of provider, country, health system or health condition. Responsiveness refers to "aspects related to the way individuals are treated and the environment in which they are treated" during health system interactions. This paper assesses the psychometric properties of a newly developed responsiveness questionnaire dedicated to evaluating maternal experiences of perinatal care services, called the Responsiveness in Perinatal and Obstetric Health Care Questionnaire (ReproQ), using the eight-domain WHO concept. The ReproQ was developed between October 2009 and February 2010 by adapting the WHO Responsiveness Questionnaire items to the perinatal care context. The psychometric properties of feasibility, construct validity, and discriminative validity were empirically assessed in a sample of Dutch women two weeks post partum. A total of 171 women consented to participation. Feasibility: the interviews lasted between 20 and 40 minutes and the overall missing rate was 8%. Construct validity: mean Cronbach's alphas for the antenatal, birth and postpartum phase were: 0.73 (range 0.57-0.82), 0.84 (range 0.66-0.92), and 0.87 (range 0.62-0.95) respectively. The item-own scale correlations within all phases were considerably higher than most of the item-other scale correlations. Within the antenatal care, birth care and post partum phases, the eight factors explained 69%, 69%, and 76% of variance respectively. Discriminative validity: overall responsiveness mean sum scores were higher for women whose children were not admitted. This confirmed the hypothesis that dissatisfaction with health outcomes is transferred to their judgement on responsiveness of the perinatal services. 
The ReproQ interview-based questionnaire demonstrated satisfactory psychometric properties to describe the quality of perinatal care in the Netherlands, with the potential to discriminate between different levels of quality of care. In view of the relatively small sample, further testing and research is recommended.
McGrane, J A; Butow, P N; Sze, M; Eisenbruch, M; Goldstein, D; King, M T
2014-12-01
The purpose of this study was to assess the invariance of a culturally competent multi-lingual unmet needs survey. A cross-sectional study was conducted among immigrants of Arabic-, Chinese- and Greek-speaking backgrounds, and Anglo-Australian-born controls, recruited through Cancer Registries (n = 591) and oncology clinics (n = 900). The survey included four subscales, with newly developed items addressing unmet need in culturally competent health information and patient support (CCHIPS), and items adapted from existing questionnaires addressing physical and daily living (PDL), sexuality (SEX) and survivorship (SURV) unmet need. The survey was translated into Arabic, Chinese and Greek. Rasch analysis was carried out on the four domains. Whilst many items were mistargeted to less prevalent areas of unmet need, causing substantial floor effects in person estimates, reliability indices were acceptable. The CCHIPS domain showed differential item functioning (DIF) for cultural background and language, and the PDL domain showed DIF for treatment phase and gender. The results for SEX and SURV domains were limited by floor effects and missing responses. All domains showed adequate fit to the model after DIF was resolved and a small number of items were deleted. The study highlights the intricacies in designing a culturally competent survey that can be applied to culturally and linguistically diverse groups across different treatment contexts. Overall, the results demonstrate that this survey is somewhat invariant with respect to these factors. Future refinements are suggested to enhance the survey's cultural competence and general validity.
1987-01-01
information missing from the record, such as signatures and dictations. These items are tracked to determine deficiencies and delinquencies in the clini...MONTH PERIOD BEGINNING PROCEDURES PERFORMED PATIENTS DISCHARGED MALPRACTICE CLAIMS FILED --_ OED RECORD DEFICIENCIES -- WID RECORD DELINQUENCIES...number of medical records considered deficient because this provider had not supplied all chart items within the time limit set by the MIF (e.g
Pohl, Steffi; Südkamp, Anna; Hardt, Katinka; Carstensen, Claus H.; Weinert, Sabine
2016-01-01
Assessing competencies of students with special educational needs in learning (SEN-L) poses a challenge for large-scale assessments (LSAs). For students with SEN-L, the available competence tests may fail to yield test scores of high psychometric quality, which are—at the same time—measurement invariant to test scores of general education students. We investigated whether we can identify a subgroup of students with SEN-L, for which measurement invariant competence measures of adequate psychometric quality may be obtained with tests available in LSAs. We furthermore investigated whether differences in test-taking behavior may explain dissatisfying psychometric properties and measurement non-invariance of test scores within LSAs. We relied on person fit indices and mixture distribution models to identify students with SEN-L for whom test scores with satisfactory psychometric properties and measurement invariance may be obtained. We also captured differences in test-taking behavior related to guessing and missing responses. As a result we identified a subgroup of students with SEN-L for whom competence scores of adequate psychometric quality that are measurement invariant to those of general education students were obtained. Concerning test taking behavior, there was a small number of students who unsystematically picked response options. Removing these students from the sample slightly improved item fit. Furthermore, two different patterns of missing responses were identified that explain to some extent problems in the assessments of students with SEN-L. PMID:26941665
49 CFR 213.119 - Continuous welded rail (CWR); plan contents.
Code of Federal Regulations, 2010 CFR
2010-10-01
... following items: (i) Loose, bent, or missing joint bolts; (ii) Rail end batter or mismatch that contributes... amount and length of rail end batter or ramp on each rail end; the amount of tread mismatch; the vertical...
49 CFR 213.119 - Continuous welded rail (CWR); plan contents.
Code of Federal Regulations, 2011 CFR
2011-10-01
... following items: (i) Loose, bent, or missing joint bolts; (ii) Rail end batter or mismatch that contributes... amount and length of rail end batter or ramp on each rail end; the amount of tread mismatch; the vertical...
Doidge, James C
2018-02-01
Population-based cohort studies are invaluable to health research because of the breadth of data collection over time, and the representativeness of their samples. However, they are especially prone to missing data, which can compromise the validity of analyses when data are not missing at random. Having many waves of data collection presents opportunity for participants' responsiveness to be observed over time, which may be informative about missing data mechanisms and thus useful as an auxiliary variable. Modern approaches to handling missing data such as multiple imputation and maximum likelihood can be difficult to implement with the large numbers of auxiliary variables and large amounts of non-monotone missing data that occur in cohort studies. Inverse probability-weighting can be easier to implement but conventional wisdom has stated that it cannot be applied to non-monotone missing data. This paper describes two methods of applying inverse probability-weighting to non-monotone missing data, and explores the potential value of including measures of responsiveness in either inverse probability-weighting or multiple imputation. Simulation studies are used to compare methods and demonstrate that responsiveness in longitudinal studies can be used to mitigate bias induced by missing data, even when data are not missing at random.
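The basic idea of inverse probability-weighting can be illustrated with a deliberately simple case: respondents are grouped by an always-observed stratum (for example a coarse responsiveness category), the response probability is estimated within each stratum, and observed outcomes are weighted by its inverse. This is a sketch under the assumption that data are missing at random given the stratum; it is not either of the two methods for non-monotone missingness described in the paper, and all names and values are hypothetical.

```python
from collections import defaultdict


def complete_case_mean(records):
    """Unweighted mean of the observed outcomes."""
    observed = [y for _, y in records if y is not None]
    return sum(observed) / len(observed)


def ipw_mean(records):
    """records: (stratum, outcome or None). Returns the weighted mean outcome.

    Assumes every stratum has at least one observed outcome.
    """
    n_all = defaultdict(int)
    n_obs = defaultdict(int)
    for stratum, y in records:
        n_all[stratum] += 1
        if y is not None:
            n_obs[stratum] += 1
    # Estimated probability of responding, per stratum.
    p_resp = {s: n_obs[s] / n_all[s] for s in n_all}
    num = den = 0.0
    for stratum, y in records:
        if y is not None:
            w = 1.0 / p_resp[stratum]      # inverse probability weight
            num += w * y
            den += w
    return num / den


# Hypothetical cohort: highly responsive participants are always observed,
# poorly responsive participants are observed only one time in four.
cohort = [("high", 10.0)] * 4 + [("low", 20.0)] + [("low", None)] * 3
print(complete_case_mean(cohort), ipw_mean(cohort))
```

In this toy cohort the weighted mean gives the single observed low-responsiveness participant the weight of four people, which is what pulls the estimate away from the complete-case mean.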
Wenborn, Jennifer; Challis, David; Pool, Jackie; Burgess, Jane; Elliott, Nicola; Orrell, Martin
2008-03-01
Activity is key to maintaining physical and mental health and well-being. However, as dementia affects the ability to engage in activity, care-givers can find it difficult to provide appropriate activities. The Pool Activity Level (PAL) Checklist guides the selection of appropriate, personally meaningful activities. The aim of this study was to assess the reliability and validity of the PAL Checklist when used with older people with dementia. A postal questionnaire sent to activity providers assessed content validity. Validity and reliability were measured in a sample of 60 older people with dementia. The questionnaire response rate was 83% (102/122). Most respondents felt no important items were missing. Seven of the nine activities were ranked as 'very important' or 'essential' by at least 77% of the sample, indicating very good content validity. Correlation with measures of cognition, severity of dementia and activity performance demonstrated strong concurrent validity. Inter-item correlation indicated strong construct validity. Cronbach's alpha coefficient measured internal consistency as excellent (0.95). All items achieved acceptable test-retest reliability, and the majority demonstrated acceptable inter-rater reliability. We conclude that the PAL Checklist demonstrates adequate validity and reliability when used with older people with dementia and appears a useful tool for a variety of care settings.
Modeling Achievement Trajectories when Attrition Is Informative
ERIC Educational Resources Information Center
Feldman, Betsy J.; Rabe-Hesketh, Sophia
2012-01-01
In longitudinal education studies, assuming that dropout and missing data occur completely at random is often unrealistic. When the probability of dropout depends on covariates and observed responses (called "missing at random" [MAR]), or on values of responses that are missing (called "informative" or "not missing at random" [NMAR]),…
NASA Technical Reports Server (NTRS)
Johnston, James C.; Hochhaus, Larry; Ruthruff, Eric
2002-01-01
Four experiments tested whether repetition blindness (RB; reduced accuracy reporting repetitions of briefly displayed items) is a perceptual or a memory-recall phenomenon. RB was measured in rapid serial visual presentation (RSVP) streams, with the task altered to reduce memory demands. In Experiment 1 only the number of targets (1 vs. 2) was reported, eliminating the need to remember target identities. Experiment 2 segregated repeated and nonrepeated targets into separate blocks to reduce bias against repeated targets. Experiments 3 and 4 required immediate "online" buttonpress responses to targets as they occurred. All 4 experiments showed very strong RB. Furthermore, the online response data showed clearly that the 2nd of the repeated targets is the one missed. The present results show that in the RSVP paradigm, RB occurs online during initial stimulus encoding and decision making. The authors argue that RB is indeed a perceptual phenomenon.
A Mixed Effects Randomized Item Response Model
ERIC Educational Resources Information Center
Fox, J.-P.; Wyrick, Cheryl
2008-01-01
The randomized response technique ensures that individual item responses, denoted as true item responses, are randomized before observing them and so-called randomized item responses are observed. A relationship is specified between randomized item response data and true item response data. True item response data are modeled with a (non)linear…
Controlling hospital library theft
Cuddy, Theresa M.; Marchok, Catherine
2003-01-01
At Capital Health System/Fuld Campus (formerly Helene Fuld Medical Center), the Health Sciences Library lost many books and videocassettes. These materials were listed in the catalog but were missing when staff went to the shelves. The hospital had experienced a downsizing of staff, a reorganization, and a merger. When the library staff did an inventory, $10,000 worth of materials were found to be missing. We corrected the situation through a series of steps that we believe will help other libraries control their theft. Through regularly scheduling inventories, monitoring items, advertising, and using specific security measures, we have successfully controlled the library theft. The January 2002 inventory resulted in meeting our goal of zero missing books and videocassettes. We work to maintain that goal. PMID:12883573
Controlling hospital library theft.
Cuddy, Theresa M; Marchok, Catherine
2003-04-01
At Capital Health System/Fuld Campus (formerly Helene Fuld Medical Center), the Health Sciences Library lost many books and videocassettes. These materials were listed in the catalog but were missing when staff went to the shelves. The hospital had experienced a downsizing of staff, a reorganization, and a merger. When the library staff did an inventory, $10,000 worth of materials were found to be missing. We corrected the situation through a series of steps that we believe will help other libraries control their theft. Through regularly scheduling inventories, monitoring items, advertising, and using specific security measures, we have successfully controlled the library theft. The January 2002 inventory resulted in meeting our goal of zero missing books and videocassettes. We work to maintain that goal.
Falkenström, Fredrik; Hatcher, Robert L; Skjulsvik, Tommy; Larsson, Mattias Holmqvist; Holmqvist, Rolf
2015-03-01
Recently, researchers have started to measure the working alliance repeatedly across sessions of psychotherapy, relating the working alliance to symptom change session by session. Responding to questionnaires after each session can become tedious, leading to careless responses and/or increasing levels of missing data. Therefore, assessment with the briefest possible instrument is desirable. Because previous research on the Working Alliance Inventory has found the separation of the Goal and Task factors problematic, the present study examined the psychometric properties of a 2-factor, 6-item working alliance measure, adapted from the Working Alliance Inventory, in 3 patient samples (ns = 1,095, 235, and 234). Results showed that a bifactor model fit the data well across the 3 samples, and the factor structure was stable across 10 sessions of primary care counseling/psychotherapy. Although the bifactor model with 1 general and 2 specific factors outperformed the 1-factor model in terms of model fit, dimensionality analyses based on the bifactor model results indicated that in practice the instrument is best treated as unidimensional. Results support the use of composite scores of all 6 items. The instrument was validated by replicating previous findings of session-by-session prediction of symptom reduction using the Autoregressive Latent Trajectory model. The 6-item working alliance scale, called the Session Alliance Inventory, is a promising alternative for researchers in search for a brief alliance measure to administer after every session. 2015 APA, all rights reserved
Investigation of Missing Responses in Implementation of Cognitive Diagnostic Models
ERIC Educational Resources Information Center
Dai, Shenghai
2017-01-01
This dissertation is aimed at investigating the impact of missing data and evaluating the performance of five selected methods for handling missing responses in the implementation of Cognitive Diagnostic Models (CDMs). The five methods are: a) treating missing data as incorrect (IN), b) person mean imputation (PM), c) two-way imputation (TW), d)…
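Three of the methods named in this truncated abstract have simple textbook forms, sketched below for scored 0/1 responses with `None` marking a missing response. The two-way rule used here is the common person mean + item mean - overall mean version; whether the dissertation uses exactly this variant is not visible in the excerpt, so treat it as an assumption. The function name and the data are hypothetical.

```python
def impute(data, method):
    """data: rows of 0/1 scores with None for missing responses.

    Assumes every person and every item has at least one observed response.
    """
    def mean(values):
        vals = [v for v in values if v is not None]
        return sum(vals) / len(vals)

    overall_mean = mean([x for row in data for x in row])
    item_means = [mean([row[j] for row in data]) for j in range(len(data[0]))]

    out = []
    for row in data:
        person_mean = mean(row)
        new_row = []
        for j, x in enumerate(row):
            if x is not None:
                new_row.append(x)
            elif method == "IN":           # missing scored as incorrect
                new_row.append(0)
            elif method == "PM":           # person mean imputation
                new_row.append(person_mean)
            elif method == "TW":           # two-way: person + item - overall
                new_row.append(person_mean + item_means[j] - overall_mean)
            else:
                raise ValueError(f"unknown method: {method}")
        out.append(new_row)
    return out


# Hypothetical responses: 3 persons x 3 items.
responses = [
    [1, 0, None],
    [1, 1, 1],
    [0, None, 0],
]
for m in ("IN", "PM", "TW"):
    print(m, impute(responses, m))
```

Person mean and two-way imputation return fractional values, and the two-way value can fall outside the 0 to 1 range, so a rounding or bounding step would be needed before fitting a model that expects binary responses.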
A content validated questionnaire for assessment of self reported venous blood sampling practices
2012-01-01
Background Venous blood sampling is a common procedure in health care. It is strictly regulated by national and international guidelines. Deviations from guidelines due to human mistakes can cause patient harm. Validated questionnaires for health care personnel can be used to assess preventable "near misses"--i.e. potential errors and nonconformities during venous blood sampling practices that could transform into adverse events. However, no validated questionnaire that assesses nonconformities in venous blood sampling has previously been presented. The aim was to test a recently developed questionnaire in self reported venous blood sampling practices for validity and reliability. Findings We developed a questionnaire to assess deviations from best practices during venous blood sampling. The questionnaire contained questions about patient identification, test request management, test tube labeling, test tube handling, information search procedures and frequencies of error reporting. For content validity, the questionnaire was confirmed by experts on questionnaires and venous blood sampling. For reliability, test-retest statistics were used on the questionnaire answered twice. The final venous blood sampling questionnaire included 19 questions out of which 9 had in total 34 underlying items. It was found to have content validity. The test-retest analysis demonstrated that the items were generally stable. In total, 82% of the items fulfilled the reliability acceptance criteria. Conclusions The questionnaire could be used for assessment of "near miss" practices that could jeopardize patient safety and gives several benefits instead of assessing rare adverse events only. The higher frequencies of "near miss" practices allows for quantitative analysis of the effect of corrective interventions and to benchmark preanalytical quality not only at the laboratory/hospital level but also at the health care unit/hospital ward. PMID:22260505
A content validated questionnaire for assessment of self reported venous blood sampling practices.
Bölenius, Karin; Brulin, Christine; Grankvist, Kjell; Lindkvist, Marie; Söderberg, Johan
2012-01-19
Venous blood sampling is a common procedure in health care. It is strictly regulated by national and international guidelines. Deviations from guidelines due to human mistakes can cause patient harm. Validated questionnaires for health care personnel can be used to assess preventable "near misses"--i.e. potential errors and nonconformities during venous blood sampling practices that could transform into adverse events. However, no validated questionnaire that assesses nonconformities in venous blood sampling has previously been presented. The aim was to test a recently developed questionnaire in self reported venous blood sampling practices for validity and reliability. We developed a questionnaire to assess deviations from best practices during venous blood sampling. The questionnaire contained questions about patient identification, test request management, test tube labeling, test tube handling, information search procedures and frequencies of error reporting. For content validity, the questionnaire was confirmed by experts on questionnaires and venous blood sampling. For reliability, test-retest statistics were used on the questionnaire answered twice. The final venous blood sampling questionnaire included 19 questions out of which 9 had in total 34 underlying items. It was found to have content validity. The test-retest analysis demonstrated that the items were generally stable. In total, 82% of the items fulfilled the reliability acceptance criteria. The questionnaire could be used for assessment of "near miss" practices that could jeopardize patient safety and gives several benefits instead of assessing rare adverse events only. The higher frequencies of "near miss" practices allows for quantitative analysis of the effect of corrective interventions and to benchmark preanalytical quality not only at the laboratory/hospital level but also at the health care unit/hospital ward.
Ernstmann, Nicole; Halbach, Sarah; Kowalski, Christoph; Pfaff, Holger; Ansmann, Lena
2017-04-01
Studies addressing the organizational contexts of care that may help increase the patients' ability to cope with a disease and to navigate through the health care system are still rare. In particular, instruments allowing the assessment of such organizational efforts from the patients' perspective are missing. The aim of our study was to develop a survey instrument assessing organizational health literacy (HL) from the patients' perspective, i.e., health care organizations' responsiveness to patients' individual needs. A pool of 30 items was developed by a group of experts based on a literature review. The items were developed, tested and prioritized according to their importance in 11 semi-structured interviews and cognitive think-aloud interviews with cancer patients. The resulting 16 items were rated in a standardized postal survey involving a total of N=453 colon and breast cancer patients treated in cancer centers in Germany. An exploratory factor analysis, a confirmatory factor analysis and structural equation modelling were conducted. Item properties were analyzed. 83.2 % of the patients were diagnosed with breast cancer, 16.8 % had a diagnosis of colon cancer. The patients' mean age was 61 (26-88), 89.4 % were female. The most common comorbidities were hypertension (34.0 %) and cardiovascular disease (11.0 %). The final prediction model included nine items measuring the degree of health literacy-sensitivity of communication. The model showed an acceptable model fit. The nine items showed corrected item-total correlations between .622 and .762 and item difficulties between 0.77 and 0.87. Cronbach's α was .912. In a comprehensive development process, the original item pool comprising several aspects of organizational HL was reduced to a one-dimensional scale. The instrument measures an important aspect of organizational HL; i.e., the degree of health literacy-sensitivity of communication (HL-COM).
HL-COM was found to impact patient enablement, mediated through the support by physicians. Future research will have to test these associations in the context of other diseases or institutions. Copyright © 2017. Published by Elsevier GmbH.
Penfold, Suzanne; Shamba, Donat; Hanson, Claudia; Jaribu, Jennie; Manzi, Fatuma; Marchant, Tanya; Tanner, Marcel; Ramsey, Kate; Schellenberg, David; Schellenberg, Joanna Armstrong
2013-02-14
The poor maintenance of equipment and inadequate supplies of drugs and other items contribute to the low quality of maternity services often found in rural settings in low- and middle-income countries, and raise the risk of adverse patient outcomes through delaying care provision. We aim to describe staff experiences of providing maternal and neonatal care in rural health facilities in Southern Tanzania, focusing on issues related to equipment, drugs and supplies. Focus group discussions and in-depth interviews were conducted with different staff cadres from all facility levels in order to explore experiences and views of providing maternity care in the context of poorly maintained equipment, and insufficient drugs and other supplies. A facility survey quantified the availability of relevant items. The facility survey, which found many missing or broken items and frequent stock outs, corroborated staff reports of providing care in the context of missing or broken care items. Staff reported increased workloads, reduced morale, difficulties in providing optimal maternity care, and carrying out procedures with potential health risks to themselves as a result. Inadequately stocked and equipped facilities compromise the health system's ability to reduce maternal and neonatal mortality and morbidity by affecting staff personally and professionally, which hinders the provision of timely and appropriate interventions. Improving stock control and maintaining equipment could benefit mothers and babies, not only through removing restrictions to the availability of care, but also through improving staff working conditions.
2013-01-01
Background The poor maintenance of equipment and inadequate supplies of drugs and other items contribute to the low quality of maternity services often found in rural settings in low- and middle-income countries, and raise the risk of adverse patient outcomes through delaying care provision. We aim to describe staff experiences of providing maternal and neonatal care in rural health facilities in Southern Tanzania, focusing on issues related to equipment, drugs and supplies. Methods Focus group discussions and in-depth interviews were conducted with different staff cadres from all facility levels in order to explore experiences and views of providing maternity care in the context of poorly maintained equipment, and insufficient drugs and other supplies. A facility survey quantified the availability of relevant items. Results The facility survey, which found many missing or broken items and frequent stock outs, corroborated staff reports of providing care in the context of missing or broken care items. Staff reported increased workloads, reduced morale, difficulties in providing optimal maternity care, and carrying out procedures with potential health risks to themselves as a result. Conclusions Inadequately stocked and equipped facilities compromise the health system’s ability to reduce maternal and neonatal mortality and morbidity by affecting staff personally and professionally, which hinders the provision of timely and appropriate interventions. Improving stock control and maintaining equipment could benefit mothers and babies, not only through removing restrictions to the availability of care, but also through improving staff working conditions. PMID:23410228
Röttger, Julia; Blümel, Miriam; Linder, Roland; Busse, Reinhard
2017-07-01
Health system responsiveness is an important aspect of health systems performance. The concept of responsiveness relates to the interpersonal and contextual aspects of health care. While disease management programs (DMPs) aim to improve the quality of health care (e.g. by improving the coordination of care), it has not yet been analyzed whether these programs improve the perceived health system responsiveness. Our study aims to close this gap by analyzing the differences in the perceived health system responsiveness between DMP-participants and non-participants. We used linked survey- and administrative claims data from 7037 patients with coronary heart disease in Germany. Of those, 5082 were enrolled and 1955 were not enrolled in the DMP. Responsiveness was assessed with an adapted version of the WHO responsiveness questionnaire in a postal survey in 2013. The survey covered 9 dimensions of responsiveness and included 17 items each for GP and specialist care. Each item had five answer categories (very good - very bad). We handled missing values in the covariates by multiple imputation and applied propensity score matching (PSM) to control for differences between the two groups (DMP/non-DMP). We used the Wilcoxon signed-rank and McNemar tests to analyze differences regarding the reported responsiveness. The PSM led to a matched and well balanced sample of 1921 pairs. Overall, DMP-participants rated the responsiveness of care more positively. The main difference was found for the coordination of care at the GP, with 62.0% of 1703 non-participants reporting a "good" or "very good" experience, compared to 69.1% of 1703 participants (p < 0.001). The results of our study indicate an overall high responsiveness for CHD-care, for DMP-participants as well as for non-participants. Yet, the results also clearly indicate that there is still a need to improve the coordination of care. Copyright © 2017 Elsevier Ltd. All rights reserved.
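For matched pairs with a binary rating (for example "good or very good" versus anything else), McNemar's test uses only the discordant pairs. The sketch below shows the uncorrected chi-square form with one degree of freedom. The abstract does not report the discordant counts or whether a continuity correction was applied, so the counts and the function name here are hypothetical.

```python
import math


def mcnemar(b: int, c: int):
    """b, c: counts of the two kinds of discordant pairs (b + c > 0).

    Returns (chi-square statistic, p-value), without continuity correction.
    """
    stat = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p


# Hypothetical discordant pairs: participant positive / non-participant not (b),
# and the reverse (c).
stat, p = mcnemar(30, 10)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```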
Biases and power for groups comparison on subjective health measurements.
Hamel, Jean-François; Hardouin, Jean-Benoit; Le Neel, Tanguy; Kubis, Gildas; Roquelaure, Yves; Sébille, Véronique
2012-01-01
Subjective health measurements are increasingly used in clinical research, particularly for patient group comparisons. Two main types of analytical strategies can be used for such data: so-called classical test theory (CTT), relying on observed scores, and models coming from Item Response Theory (IRT), relying on a response model relating the item responses to a latent parameter, often called latent trait. Whether IRT or CTT would be the most appropriate method to compare two independent groups of patients on a patient reported outcomes measurement remains unknown and was investigated using simulations. For CTT-based analyses, group comparison was performed using a t-test on the scores. For IRT-based analyses, several methods were compared, according to whether the Rasch model was considered with random effects or with fixed effects, and whether the group effect was included as a covariate or not. Individual latent trait values were estimated using either a deterministic method or stochastic approaches. Latent traits were then compared with a t-test. Finally, a two-step method was performed to compare the latent trait distributions, and a Wald test was performed to test the group effect in the Rasch model including group covariates. The only unbiased IRT-based method was the group covariate Wald's test, performed on the random effects Rasch model. This model displayed the highest observed power, which was similar to the power using the score t-test. These results need to be extended to the case frequently encountered in practice where data are missing and possibly informative.
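For orientation, the random-effects Rasch model with a group covariate mentioned above can be written in the standard textbook form below. The notation (person i, item j, difficulty \delta_j, group indicator g_i, group effect \gamma) is assumed here and is not taken from the paper itself.

```latex
P(X_{ij} = 1 \mid \theta_i) = \frac{\exp(\theta_i - \delta_j)}{1 + \exp(\theta_i - \delta_j)},
\qquad
\theta_i \sim \mathcal{N}(\gamma\, g_i,\ \sigma^2),
\qquad
W = \frac{\hat{\gamma}}{\widehat{\mathrm{SE}}(\hat{\gamma})}
```

Under the null hypothesis of no group effect (\gamma = 0), the Wald statistic W is compared with a standard normal distribution.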
Feasibility of Community Food Item Collection for the National Children's Study
The National Children’s Study proposes to investigate the role of contaminants on health outcomes in pregnant women and children. A specific area of concern is contaminant exposure through the ingestion of solid foods. National food contaminant databases may miss environmental ex...
Berry, Sean L; Tierney, Kevin P; Elguindi, Sharif; Mechalakos, James G
2017-12-24
An electronic checklist has been designed with the intention of reducing errors while minimizing user effort in completing the checklist. We analyze the clinical use and evolution of the checklist over the past 5 years and review data in an incident learning system (ILS) to investigate whether it has contributed to an improvement in patient safety. The checklist is written as a standalone HTML application using VBScript. User selection of pertinent demographic details limits the display of checklist items only to those necessary for the particular clinical scenario. Ten common clinical scenarios were used to illustrate the difference between the maximum possible number of checklist items available in the code versus the number displayed to the user at any one time. An ILS database of errors and near misses was reviewed to evaluate whether the checklist influenced the occurrence of reported events. Over 5 years, the number of checklist items available in the code nearly doubled, whereas the number displayed to the user at any one time stayed constant. Events reported in our ILS related to the beam energy used with pacemakers, projection of anatomy on digitally reconstructed radiographs, orthogonality of setup fields, and field extension beyond match lines, did not recur after the items were added to the checklist. Other events related to bolus documentation and breakpoints continued to be reported. Our checklist is adaptable to the introduction of new technologies, transitions between planning systems, and to errors and near misses recorded in the ILS. The electronic format allows us to restrict user display to a small, relevant, subset of possible checklist items, limiting the planner effort needed to review and complete the checklist. Copyright © 2018. Published by Elsevier Inc.
Money Attitude, Self-esteem, and Compulsive Buying in a Population of Medical Students
Lejoyeux, Michel; Richoux-Benhaim, Charlotte; Betizeau, Annabelle; Lequen, Valérie; Lohnhardt, Hannah
2011-01-01
This study tried to determine the prevalence of compulsive buying (CB) and to identify among compulsive buyers a specific relation to money, a different buying style, and a lowered level of self-esteem. We included 203 medical students and diagnosed CB with the McElroy criteria and a specific questionnaire. Money attitude was characterized with the Yamauchi and Templer scale and self-esteem with the Rosenberg scale. 11% of the medical students presented compulsive buying (CB+). Sex ratio and mean ages were comparable in the CB+ and control groups. CB+ students drank less alcohol and smoked an equivalent number of cigarettes. Compulsive buyers had higher scores of distress (tendency to be hesitant, suspicious, and doubtful attitude toward situations involving money) and bargain missing (fear of missing a good opportunity to buy an item). They more often bought gifts for themselves, bought items that they used less than expected, and chose goods that increased their self-esteem. Their self-esteem score did not differ from that of controls. PMID:21556283
Lemos, Raquel; Afonso, Ana; Martins, Cristina; Waters, James H; Blanco, Filipe Sobral; Simões, Mário R; Santana, Isabel
2016-01-01
The Selective Reminding Test (SRT) and the Free and Cued Selective Reminding Test (FCSRT) are multitrial memory tests that use a common "selective reminding" paradigm that aims to facilitate learning by presenting only the missing words from the previous recall trial. While in the FCSRT semantic cues are provided to elicit recall, in the SRT, participants are merely reminded of the missing items by repeating them. These tests have been used to assess age-related memory changes and to predict dementia. The performance of healthy elders on these tests has been compared before, and results have shown that twice as many words were retrieved from long-term memory in the FCSRT compared with the SRT. In this study, we compared the tests' properties and their accuracy in discriminating amnestic mild cognitive impairment (aMCI; n = 20) from Alzheimer disease (AD; n = 18). Patients with AD performed significantly worse than patients with aMCI on both tests. The percentage of items recalled during the learning trials was significantly higher for the FCSRT in both groups, and a higher number of items were later retrieved, showing the benefit of category cueing. Our key finding was that the FCSRT showed higher accuracy in discriminating patients with aMCI from those with AD.
Baron, Gabriel; Tubach, Florence; Ravaud, Philippe; Logeart, Isabelle; Dougados, Maxime
2007-05-15
A short version of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) function scale has recently been developed to enhance the applicability of the scale in routine practice and clinical research for patients with hip and knee osteoarthritis. The goal of the present study was to validate this short form. We conducted a prospective 4-week cohort study of 1,036 outpatients. Performance on the WOMAC function long form (LF) and short form (SF) was compared. Agreement between responses on the 2 forms was examined according to a Bland-Altman plot. Responsiveness to change (by standardized response mean [SRM]), reproducibility (intraclass correlation coefficient [ICC]), and internal consistency (Cronbach's alpha) were computed for both forms. Construct validity was assessed based on functional impairment as measured on a numerical rating scale. At baseline, 24% of patients who completed the WOMAC LF had missing data for at least 1 item as compared with only 6% of patients who completed the WOMAC SF. The mean WOMAC SF score was greater than the mean WOMAC LF score (mean +/- SD difference -4.3 +/- 4.8 on a 0-100 scale). SRMs were 0.61 and 0.73, ICCs were 0.76 and 0.68, and Cronbach's alphas were 0.93 and 0.85 for the WOMAC LF and SF, respectively. The 2 forms had comparable correlation with functional impairment. The WOMAC function short form has a low rate of missing data and is a responsive, reproducible, and valid measure. The mean SF score was 4 points higher than the mean LF score.
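Two of the statistics reported above, Cronbach's alpha and the standardized response mean (SRM), follow directly from their usual definitions. The sketch below is a minimal version that assumes complete data (no missing item responses); the function names and inputs are illustrative and do not reproduce the study's computations.

```python
from statistics import mean, stdev, variance


def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent, with no missing values."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def standardized_response_mean(baseline, followup):
    """Mean change divided by the standard deviation of the change scores."""
    changes = [f - b for b, f in zip(baseline, followup)]
    return mean(changes) / stdev(changes)
```

Respondents with missing items would have to be dropped or imputed before calling either function, which is where the difference in missing-data rates between the long and short forms becomes relevant.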
Impact of emotionality on memory and meta-memory in schizophrenia using video sequences.
Peters, Maarten J V; Hauschildt, Marit; Moritz, Steffen; Jelinek, Lena
2013-03-01
A vast amount of memory and meta-memory research in schizophrenia shows that these patients perform worse on memory accuracy and hold false information with strong conviction compared to healthy controls. So far, studies investigating these effects mainly used traditional static stimulus material like word lists or pictures. The question remains whether these memory and meta-memory effects are also present in (1) more near-life dynamic situations (i.e., using standardized videos) and (2) whether emotionality has an influence on memory and meta-memory deficits (i.e., response confidence) in schizophrenia compared to healthy controls. Twenty-seven schizophrenia patients and 24 healthy controls were administered a newly developed emotional video paradigm with five videos differing in emotionality (positive, two negative, neutral, and delusional related). After each video, a recognition task required participants to make old-new discriminations along with confidence ratings, investigating memory accuracy and meta-memory deficits in more dynamic settings. For all but the positively valenced video, patients recognized fewer correct items compared to healthy controls, and did not differ with regard to the number of false memories for related items. In line with prior findings, schizophrenia patients showed more high-confident responses for misses and false memories for related items but displayed underconfidence for hits when compared to healthy controls, independent of emotionality. Limited sample size and control group; combined valence and arousal indicator for emotionality; general psychopathology indicator. Emotionality differentially moderated memory accuracy, biases in schizophrenia patients compared to controls. Moreover, the meta-memory deficits identified in static paradigms also manifest in more dynamic, near-life settings and seem to be independent of emotionality. Copyright © 2012 Elsevier Ltd. All rights reserved.
Graffigna, Guendalina; Barello, Serena; Bonanomi, Andrea; Lozza, Edoardo; Hibbard, Judith
2015-12-23
The Patient Activation Measure (PAM13) is an instrument that assesses patient knowledge, skills, and confidence for disease self-management. This cross-sectional study aimed to validate a culturally-adapted Italian Patient Activation Measure (PAM13-I) for patients with chronic conditions. 519 chronic patients were involved in the Italian validation study and responded to PAM13-I. The PAM13 was translated into Italian by a standardized forward-backward translation. Data quality was assessed by mean, median, item response, missing values, floor and ceiling effects, internal consistency (Cronbach's alpha and average inter-item correlation), and item-rest correlations. Rasch Model and differential item functioning assessed scale properties. Mean PAM13-I score was 66.2. Rasch analysis showed that the PAM13-I is a good measure of patient activation. The level of internal consistency was good (α = 0.88). For all items, the distribution of answers was left-skewed, with a small floor effect (range 1.7-4.5 %) and a moderate ceiling effect (range 27.6-55.0 %). The Italian version formed a unidimensional, probabilistic Guttman-like scale explaining 41 % of the variance. The PAM13-I has been demonstrated to be a valid and reliable measure of patient activation and the present study suggests its applicability to the Italian-speaking chronic patient population. The measure has good psychometric properties and appears to be consistent with the developmental nature of the patient activation phenomenon, although it presents a different ranking order of the items compared with the American version. PAM13-I can be a useful assessment tool to evaluate interventions aimed at improving patient engagement in healthcare and to train doctors in attuning their communication to the level of patients' activation. Future research could be conducted to further confirm the validity of the PAM13-I.
A comparative study: classification vs. user-based collaborative filtering for clinical prediction.
Hao, Fang; Blair, Rachael Hageman
2016-12-08
Recommender systems have shown tremendous value for the prediction of personalized item recommendations for individuals in a variety of settings (e.g., marketing, e-commerce, etc.). User-based collaborative filtering is a popular recommender system, which leverages an individual's prior satisfaction with items, as well as the satisfaction of individuals that are "similar". Recently, there have been applications of collaborative filtering based recommender systems for clinical risk prediction. In these applications, individuals represent patients, and items represent clinical data, which includes an outcome. Application of recommender systems to a problem of this type requires recasting a supervised learning problem as unsupervised. The rationale is that patients with similar clinical features carry a similar disease risk. As the "Big Data" era progresses, it is likely that approaches of this type will be reached for as biomedical data continues to grow in both size and complexity (e.g., electronic health records). In the present study, we set out to understand and assess the performance of recommender systems in a controlled yet realistic setting. User-based collaborative filtering recommender systems are compared to logistic regression and random forests with different types of imputation and varying amounts of missingness on four different publicly available medical data sets: National Health and Nutrition Examination Survey (NHANES, 2011-2012 on Obesity), Study to Understand Prognoses Preferences Outcomes and Risks of Treatment (SUPPORT), chronic kidney disease, and dermatology data. We also examined performance using simulated data with observations that are Missing At Random (MAR) or Missing Completely At Random (MCAR) under various degrees of missingness and levels of class imbalance in the response variable.
Our results demonstrate that user-based collaborative filtering is consistently inferior to logistic regression and random forests with different imputations on real and simulated data. The results warrant caution in using collaborative filtering (CF) for clinical risk prediction when traditional classification is feasible and practical. CF may not be desirable in datasets where classification is an acceptable alternative. We describe some natural applications related to "Big Data" where CF would be preferred and conclude with some insights as to why caution may be warranted in this context.
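The idea of treating a clinical outcome as one more "item" to be predicted from similar patients can be sketched as follows. This is a minimal illustration, not the authors' implementation: the cosine similarity restricted to jointly observed features, the neighbourhood size k, and all names are assumptions made here.

```python
from math import sqrt


def cosine_on_shared(u, v):
    """Cosine similarity over features observed (not None) in both patients."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    norm_u = sqrt(sum(a * a for a, _ in pairs))
    norm_v = sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_outcome(target_features, neighbours, k=2):
    """Similarity-weighted mean outcome of the k most similar patients.

    neighbours: list of (features, outcome) tuples with known outcomes.
    Returns None when no neighbour has positive similarity.
    """
    scored = [(cosine_on_shared(target_features, f), y) for f, y in neighbours]
    top = sorted(scored, key=lambda t: t[0], reverse=True)[:k]
    top = [(s, y) for s, y in top if s > 0]
    weight = sum(s for s, _ in top)
    if weight == 0:
        return None
    return sum(s * y for s, y in top) / weight
```

Missing feature values are skipped rather than imputed, so the prediction needs no separate imputation step. The comparison methods in the abstract (logistic regression, random forests) do need one.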
76 FR 72082 - Miscellaneous Administrative Changes
Federal Register 2010, 2011, 2012, 2013, 2014
2011-11-22
... the 2008 administrative rule. Revise Table Formatting Error in 10 CFR Part 171 The table in paragraph (c) of Sec. 171.16 is missing a colon and a hard return that would separate the heading... subsequent list item, ``35 to 500 employees.'' The formatting errors are corrected, adding a colon after the...
Goldstein, Ayelet; Shahar, Yuval; Orenbuch, Efrat; Cohen, Matan J
2017-10-01
To examine the feasibility of the automated creation of meaningful free-text summaries of longitudinal clinical records, using a new general methodology that we had recently developed; and to assess the potential benefits to the clinical decision-making process of using such a method to generate draft letters that can be further manually enhanced by clinicians. We had previously developed a system, CliniText (CTXT), for automated summarization in free text of longitudinal medical records, using a clinical knowledge base. In the current study, we created an Intensive Care Unit (ICU) clinical knowledge base, assisted by two ICU clinical experts in an academic tertiary hospital. The CTXT system generated free-text summary letters from the data of 31 different patients, which were compared to the respective original physician-composed discharge letters. The main evaluation measures were (1) relative completeness, quantifying the data items missed by one of the letters but included by the other, and their importance; (2) quality parameters, such as readability; (3) functional performance, assessed by the time needed, by three clinicians reading each of the summaries, to answer five key questions, based on the discharge letter (e.g., "What are the patient's current respiratory requirements?"), and by the correctness of the clinicians' answers. Completeness: In 13/31 (42%) of the letters the number of important items missed in the CTXT-generated letter was actually less than or equal to the number of important items missed by the MD-composed letter. In each of the MD-composed letters, at least two important items that were mentioned by the CTXT system were missed (a mean of 7.2±5.74). In addition, the standard deviation in the number of missed items in the MD letters (STD=15.4) was much higher than the standard deviation in the CTXT-generated letters (STD=5.3). Quality: The MD-composed letters obtained a significantly better grade in three out of four measured parameters. 
However, the standard variation in the quality of the MD-composed letters was much greater than the standard variation in the quality of the CTXT-generated letters (STD=6.25 vs. STD=2.57, respectively). Functional evaluation: The clinicians answered the five questions on average 40% faster (p<0.001) when using the CTXT-generated letters than when using the MD-composed letters. In four out of the five questions the clinicians' correctness was equal to or significantly better (p<0.005) when using the CTXT-generated letters than when using the MD-composed letters. An automatic knowledge-based summarization system, such as the CTXT system, has the capability to model complex clinical domains, such as the ICU, and to support interpretation and summarization tasks such as the creation of a discharge summary letter. Based on the results, we suggest that the use of such systems could potentially enhance the standardization of the letters, significantly increase their completeness, and reduce the time to write the discharge summary. The results also suggest that using the resultant structured letters might reduce the decision time, and enhance the decision quality, of decisions made by other clinicians. Copyright © 2017 Elsevier B.V. All rights reserved.
Konias, Sokratis; Chouvarda, Ioanna; Vlahavas, Ioannis; Maglaveras, Nicos
2005-09-01
Current approaches for mining association rules usually assume that the mining is performed in a static database, where the problem of missing attribute values does not practically exist. However, these assumptions are not preserved in some medical databases, like in a home care system. In this paper, a novel uncertainty rule algorithm is illustrated, namely URG-2 (Uncertainty Rule Generator), which addresses the problem of mining dynamic databases containing missing values. This algorithm requires only one pass from the initial dataset in order to generate the item set, while new metrics corresponding to the notion of Support and Confidence are used. URG-2 was evaluated over two medical databases, introducing randomly multiple missing values for each record's attribute (rate: 5-20% by 5% increments) in the initial dataset. Compared with the classical approach (records with missing values are ignored), the proposed algorithm was more robust in mining rules from datasets containing missing values. In all cases, the difference in preserving the initial rules ranged between 30% and 60% in favour of URG-2. Moreover, due to its incremental nature, URG-2 saved over 90% of the time required for thorough re-mining. Thus, the proposed algorithm can offer a preferable solution for mining in dynamic relational databases.
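To make the baseline concrete, the sketch below computes the ordinary support and confidence of a rule under two ways of discarding incomplete records. The first is the classical approach described above, which ignores any record with a missing value. The second drops a record only when an attribute used by the rule is missing. This second option is an illustrative assumption and is not URG-2's metric, which the abstract does not specify. All names are hypothetical.

```python
def matches(record, itemset):
    """True if the record has every attribute=value pair in the itemset."""
    return all(record.get(attr) == val for attr, val in itemset.items())


def support_confidence(records, antecedent, consequent, drop="any_missing"):
    """Return (support, confidence) for the rule antecedent -> consequent.

    records: list of dicts mapping attribute -> value, None meaning missing.
    drop="any_missing": classical approach, discard records with any None.
    drop="rule_missing": discard only records missing a rule attribute.
    """
    rule_attrs = set(antecedent) | set(consequent)
    if drop == "any_missing":
        usable = [r for r in records if all(v is not None for v in r.values())]
    else:
        usable = [r for r in records
                  if all(r.get(a) is not None for a in rule_attrs)]
    if not usable:
        return 0.0, 0.0
    n_ante = sum(matches(r, antecedent) for r in usable)
    n_both = sum(matches(r, antecedent) and matches(r, consequent)
                 for r in usable)
    support = n_both / len(usable)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence
```

The second option keeps more records, so the estimates rest on more evidence when missingness is scattered across attributes unrelated to the rule.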
Lucidi, Valerio; Hendlisz, Alain; Van Laethem, Jean-Luc; Donckier, Vincent
2016-04-21
In the oncosurgical approach to colorectal liver metastases, surgery is still considered the only potentially curative option, while chemotherapy alone represents a strictly palliative treatment. However, missing metastases, defined as metastases disappearing after chemotherapy, represent a unique model to evaluate the curative potential of chemotherapy and to challenge current therapeutic algorithms. We reviewed recent series on missing colorectal liver metastases to evaluate incidence of this phenomenon, predictive factors and rates of cure defined by complete pathologic response in resected missing metastases and sustained clinical response when they were left unresected. With progress in the efficacy of chemotherapeutic regimens, the incidence of missing liver metastases has risen steadily in recent years. Main predictive factors are small tumor size, low marker level, duration of chemotherapy, and use of intra-arterial chemotherapy. Initial series showed low rates of complete pathologic response in resected missing metastases and high recurrence rates when unresected. However, recent reports describe complete pathologic responses and sustained clinical responses reaching 50%, suggesting that chemotherapy could be curative in some cases. Accordingly, in cases of missing colorectal liver metastases, the classical recommendation to resect initial tumor sites might have become partially obsolete. Furthermore, the curative effect of chemotherapy in selected cases could lead to a change of paradigm in patients with unresectable liver-only metastases, using intensive first-line chemotherapy to intentionally induce missing metastases, followed by adjuvant surgery on remnant chemoresistant tumors and close surveillance of initial sites that have been left unresected.
Wiklander, Maria; Rydström, Lise-Lott; Ygge, Britt-Marie; Navér, Lars; Wettergren, Lena; Eriksson, Lars E
2013-11-14
HIV is a stigmatizing medical condition. The concept of HIV stigma is multifaceted, with personalized stigma (perceived stigmatizing consequences of others knowing of their HIV status), disclosure concerns, negative self-image, and concerns with public attitudes described as core aspects of stigma for individuals with HIV infection. There is limited research on HIV stigma in children. The aim of this study was to test a short version of the 40-item HIV Stigma Scale (HSS-40), adapted for 8-18-year-old children with HIV infection living in Sweden. A Swedish version of the HSS-40 was adapted for children by an expert panel and evaluated by think aloud interviews. A preliminary short version with twelve items covering the four dimensions of stigma in the HSS-40 was tested. The psychometric evaluation included inspection of missing values, principal component analysis (PCA), internal consistency, and correlations with measures of health-related quality of life (HRQoL). Fifty-eight children, representing 71% of all children with HIV infection in Sweden meeting the inclusion criteria, completed the 12-item questionnaire. Four items concerning participants' experiences of others' reactions to their HIV had unacceptable rates of missing values and were therefore excluded. The remaining items constituted an 8-item scale, the HIV Stigma Scale for Children (HSSC-8), measuring HIV-related disclosure concerns, negative self-image, and concerns with public attitudes. Evidence for internal validity was supported by a PCA, suggesting a three factor solution with all items loading on the same subscales as in the original HSS-40. The scale demonstrated acceptable internal consistency, with the exception of the disclosure concerns subscale. Evidence for external validity was supported in correlational analyses with measures of HRQoL, where higher levels of stigma correlated with poorer HRQoL. 
The results suggest feasibility, reliability, as well as internal and external validity of the HSSC-8, an HIV stigma scale for children with HIV infection, measuring disclosure concerns, negative self-image, and concerns with public attitudes. The present study shows that different aspects of HIV stigma can be assessed among children with HIV in the age group 8-18.
NASA Astrophysics Data System (ADS)
Ding, Lin
2014-02-01
Discipline-based science concept assessments are powerful tools to measure learners' disciplinary core ideas. Among many such assessments, the Brief Electricity and Magnetism Assessment (BEMA) has been broadly used to gauge student conceptions of key electricity and magnetism (E&M) topics in college-level introductory physics courses. Differing from typical concept inventories that focus only on one topic of a subject area, BEMA covers a broad range of topics in the electromagnetism domain. In spite of this fact, prior studies exclusively used a single aggregate score to represent individual students' overall understanding of E&M without explicating the construct of this assessment. Additionally, BEMA has been used to compare traditional physics courses with a reformed course entitled Matter and Interactions (M&I). While prior findings were in favor of M&I, no empirical evidence was sought to rule out possible differential functioning of BEMA that may have inadvertently advantaged M&I students. In this study, we used Rasch analysis to seek two missing pieces regarding the construct and differential functioning of BEMA. Results suggest that although BEMA items generally can function together to measure the same construct of application and analysis of E&M concepts, several items may need further revision. Additionally, items that demonstrate differential functioning for the two courses are detected. Issues such as item contextual features and student familiarity with question settings may underlie these findings. This study highlights often overlooked threats in science concept assessments and provides an exemplar for using evidence-based reasoning to make valid inferences and arguments.
Revising the Lubben Social Network Scale for use in residential long-term care settings.
Munn, Jean; Radey, Melissa; Brown, Kristin; Kim, Hyejin
2018-04-19
We revised the Lubben Social Network Scale (LSNS) to develop a measure of social support specific to residential long-term care (LTC) settings, the LSNS-LTC with five domains (i.e., family, friends, residents, volunteers, and staff). The authors modified the LSNS-18 to capture sources of social support specific to LTC, specifically relationships with residents, volunteers, and staff. We piloted the resultant 28-item measure with 64 LTC residents. Fifty-four respondents provided adequate information for analyses that included descriptive statistics and reliability coefficients. Twenty of the items performed well (had correlations >0.3, overall α = 0.85) and were retained. Three items required modification. The five items related to volunteers were eliminated due to extensive (>15%) missing data resulting in a proposed 23-item measure. We identified, and to some degree quantified, supportive relationships within the LTC environment, while developing a self-report tool to measure social support in these settings.
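The item-screening rule described above (dropping items with more than 15% missing data) amounts to a per-item missingness rate compared against a threshold. The sketch below assumes responses are stored as dictionaries with None for a missing answer; the item names in the usage note are hypothetical.

```python
def item_missing_rates(responses):
    """responses: list of dicts mapping item name -> answer (None if missing)."""
    n = len(responses)
    items = responses[0].keys()
    return {i: sum(r.get(i) is None for r in responses) / n for i in items}


def items_to_drop(responses, threshold=0.15):
    """Items whose share of missing answers exceeds the threshold."""
    rates = item_missing_rates(responses)
    return sorted(i for i, rate in rates.items() if rate > threshold)
```

For example, with ten respondents, an item such as "vol1" left unanswered by two of them has a missing rate of 0.20 and would be flagged, while an item unanswered by one respondent (0.10) would be retained.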
Sales, Célia Md; Neves, Inês Td; Alves, Paula G; Ashworth, Mark
2017-11-22
There is increasing interest in individualized patient-reported outcome measures (I-PROMS), where patients themselves indicate the specific problems they want to address in therapy and these problems are used as items within the outcome measurement tool. This paper examined the extent to which 279 items reported in an I-PROM (PSYCHLOPS) added qualitative information which was not captured by two well-established outcome measures (CORE-OM and PHQ-9). Comparison of items was only conducted for patients scoring above the "caseness" threshold on the standardized measures. 107 patients were participating in therapy within addiction and general psychiatric clinical settings. Almost every patient (95%) reported at least one item whose content was not covered by PHQ-9, and 71% reported at least one item not covered by CORE-OM. Results demonstrate the relevance of individualized outcome assessment for capturing data describing the issues of greatest concern to patients, as nomothetic measures do not always seem to capture the whole story. © 2017 The Authors Health Expectations Published by John Wiley & Sons Ltd.
Trends in Sexual Orientation Missing Data Over a Decade of the California Health Interview Survey
Viana, Joseph; Grant, David; Cochran, Susan D.; Lee, Annie C.; Ponce, Ninez A.
2015-01-01
Objectives. We explored changes in sexual orientation question item completion in a large statewide health survey. Methods. We used 2003 to 2011 California Health Interview Survey data to investigate sexual orientation item nonresponse and sexual minority self-identification trends in a cross-sectional sample representing the noninstitutionalized California household population aged 18 to 70 years (n = 182 812 adults). Results. Asians, Hispanics, limited-English-proficient respondents, and those interviewed in non-English languages showed the greatest declines in sexual orientation item nonresponse. Asian women, regardless of English-proficiency status, had the highest odds of item nonresponse. Spanish interviews produced more nonresponse than English interviews and Asian-language interviews produced less nonresponse when we controlled for demographic factors and survey cycle. Sexual minority self-identification increased in concert with the item nonresponse decline. Conclusions. Sexual orientation nonresponse declines and the increase in sexual minority identification suggest greater acceptability of sexual orientation assessment in surveys. Item nonresponse rate convergence among races/ethnicities, language proficiency groups, and interview languages shows that sexual orientation can be measured in surveys of diverse populations. PMID:25790399
2000-12-01
A SKIP FLAG INDICATING THE RESULT OF CHECKING THE RESPONSE ON THE PARENT (SCREENING) ITEM AGAINST THE RESPONSE(S) ON THE ITEMS WITHIN THE SKIP...RESPONSE ON THE PARENT (SCREENING) ITEM AGAINST THE RESPONSE(S) ON THE ITEMS WITHIN THE SKIP PATTERN. SEE TABLE D-5, NOTE 2, IN APPENDIX D. G-52...RESULT OF CHECKING THE RESPONSE ON THE PARENT (SCREENING) ITEM AGAINST THE RESPONSE(S) ON THE ITEMS WITHIN THE SKIP PATTERN. SEE TABLE D-5
Blome, Christine; von Usslar, Kathrin; Augustin, Matthias
2016-06-01
Qualitative interviews are used to assess understandability and content validity of patient-reported outcomes. However, the common approach of asking patients to paraphrase items may not be sufficient to completely reveal item content as understood by patients. We used qualitative interviews to elicit more detailed information about patients' understanding of treatment goal items for the Patient Benefit Index 2.0 (PBI 2.0). This questionnaire measures patient-relevant benefit from treatments for skin diseases by assessing goal importance prior to and goal attainment after treatment. We interviewed 16 patients with psoriasis, atopic dermatitis, leg ulcers, and vitiligo. Patients were asked to elaborate in detail on their understanding of 15 treatment goal items. Subsequently, they were asked to suggest changes in item wording and to name missing treatment goals. Interview transcripts were analyzed according to an adapted approach of content analysis. The task was easy for the patients to understand, and they shared detailed information on what each goal meant to them. Results of the content analysis induced a range of revisions of the PBI 2.0 items, including changes in wording (four items) and item order (two items). Four items were deleted because they were found to be redundant or irrelevant, and one item was added to the list of treatment goals. Asking patients to elaborate on their item understanding in qualitative interviews provided detailed insight into item content and understandability. This method has helped considerably to improve feasibility and content validity of the PBI 2.0.
NASA Astrophysics Data System (ADS)
Karataş, F. Ö.; Bodner, G. M.; Unal, Suat
2016-01-01
A study was conducted on the views of the nature of engineering held by 114 first-year engineering majors; the study built on prior work on views of the nature of science held by students, their instructors, and the general public. Open-coding analysis of responses to a 12-item questionnaire suggested that the participants held tacit beliefs that engineering (1) involves problem solving; (2) is a form of applied science; (3) involves the design of artefacts or systems; (4) is subject to various constraints; and (5) requires teamwork. These beliefs, however, were often unsophisticated, and significant aspects of the field of engineering as described in the literature on engineering practices were missing from the student responses. The results of this study are important because students' beliefs have a strong influence on what they value in a classroom situation, what they attend to in class, and how they choose to study for a course.
Molenaar, Dylan; de Boeck, Paul
2018-06-01
In item response theory modeling of responses and response times, it is commonly assumed that the item responses have the same characteristics across the response times. However, heterogeneity might arise in the data if subjects resort to different response processes when solving the test items. These differences may be within-subject effects, that is, a subject might use a certain process on some of the items and a different process with different item characteristics on the other items. If the probability of using one process over the other process depends on the subject's response time, within-subject heterogeneity of the item characteristics across the response times arises. In this paper, the method of response mixture modeling is presented to account for such heterogeneity. Contrary to traditional mixture modeling where the full response vectors are classified, response mixture modeling involves classification of the individual elements in the response vector. In a simulation study, the response mixture model is shown to be viable in terms of parameter recovery. In addition, the response mixture model is applied to a real dataset to illustrate its use in investigating within-subject heterogeneity in the item characteristics across response times.
Feraco, Angela M.; Dussel, Veronica; Orellana, Liliana; Kang, Tammy I.; Geyer, J. Russell; Rosenberg, Abby R.; Feudtner, Chris; Wolfe, Joanne
2017-01-01
Context: Little is known about how parents of children with advanced cancer classify news they receive about their child’s medical condition. Objective: To develop concepts of “good news” and “bad news” in discussions of advanced childhood cancer from parent perspectives. Methods: Parents of children with advanced cancer cared for at three children’s hospitals were asked to share details of conversations in the preceding 3 months that contained “good news” or “bad news” related to their child’s medical condition. We used mixed methods to evaluate parent responses to both open-ended and fixed response items. Results: Of 104 enrolled parents, 86 (83%) completed the survey. Six (7%) parents reported discussing neither good nor bad news, 18 (21%) reported only bad news, 15 (17%) reported only good news, and 46 (54%) reported both good and bad news (1 missing response). Seventy-six parents (88%) answered free response items. Descriptions of both good and bad news discussions consisted predominantly of “tumor talk” or cancer control. Additional treatment options featured prominently, particularly in discussions of bad news (42%). Child well-being, an important good news theme, encompassed treatment tolerance, symptom reduction, and quality of life. Conclusion: A majority of parents of children with advanced cancer report discussing both good and bad news in the preceding 3 months. While news related primarily to cancer control, parents also describe good news discussions related to their child’s well-being. Understanding how parents of children with advanced cancer classify and describe the news they receive may enhance efforts to promote family-centered communication. PMID:28062345
Feraco, Angela M; Dussel, Veronica; Orellana, Liliana; Kang, Tammy I; Geyer, J Russell; Rosenberg, Abby R; Feudtner, Chris; Wolfe, Joanne
2017-05-01
Little is known about how parents of children with advanced cancer classify news they receive about their child's medical condition. To develop concepts of "good news" and "bad news" in discussions of advanced childhood cancer from parent perspectives. Parents of children with advanced cancer cared for at three children's hospitals were asked to share details of conversations in the preceding three months that contained "good news" or "bad news" related to their child's medical condition. We used mixed methods to evaluate parent responses to both open-ended and fixed-response items. Of 104 enrolled parents, 86 (83%) completed the survey. Six (7%) parents reported discussing neither good nor bad news, 18 (21%) reported only bad news, 15 (17%) reported only good news, and 46 (54%) reported both good and bad news (one missing response). Seventy-six parents (88%) answered free-response items. Descriptions of both good and bad news discussions consisted predominantly of "tumor talk" or cancer control. Additional treatment options featured prominently, particularly in discussions of bad news (42%). Child well-being, an important good news theme, encompassed treatment tolerance, symptom reduction, and quality of life. A majority of parents of children with advanced cancer report discussing both good and bad news in the preceding three months. Although news related primarily to cancer control, parents also describe good news discussions related to their child's well-being. Understanding how parents of children with advanced cancer classify and describe the news they receive may enhance efforts to promote family-centered communication. Copyright © 2017 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
Massof, Robert W
2014-10-01
A simple theoretical framework explains patient responses to items in rating scale questionnaires. Fixed latent variables position each patient and each item on the same linear scale. Item responses are governed by a set of fixed category thresholds, one for each ordinal response category. A patient's item responses are magnitude estimates of the difference between the patient variable and the patient's estimate of the item variable, relative to his/her personally defined response category thresholds. Differences between patients in their personal estimates of the item variable and in their personal choices of category thresholds are represented by random variables added to the corresponding fixed variables. Effects of intervention correspond to changes in the patient variable, the patient's response bias, and/or latent item variables for a subset of items. Intervention effects on patients' item responses were simulated by assuming the random variables are normally distributed with a constant scalar covariance matrix. Rasch analysis was used to estimate latent variables from the simulated responses. The simulations demonstrate that changes in the patient variable and changes in response bias produce indistinguishable effects on item responses and manifest as changes only in the estimated patient variable. Changes in a subset of item variables manifest as intervention-specific differential item functioning and as changes in the estimated person variable that equals the average of changes in the item variables. Simulations demonstrate that intervention-specific differential item functioning produces inefficiencies and inaccuracies in computer adaptive testing. © The Author(s) 2013 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
Biases and Power for Groups Comparison on Subjective Health Measurements
Hamel, Jean-François; Hardouin, Jean-Benoit; Le Neel, Tanguy; Kubis, Gildas; Roquelaure, Yves; Sébille, Véronique
2012-01-01
Subjective health measurements are increasingly used in clinical research, particularly for patient group comparisons. Two main types of analytical strategies can be used for such data: so-called classical test theory (CTT), relying on observed scores, and models coming from Item Response Theory (IRT), relying on a response model relating the item responses to a latent parameter, often called the latent trait. Whether IRT or CTT would be the most appropriate method to compare two independent groups of patients on a patient-reported outcomes measurement remains unknown and was investigated using simulations. For CTT-based analyses, group comparison was performed using a t-test on the scores. For IRT-based analyses, several methods were compared, according to whether the Rasch model was considered with random effects or with fixed effects, and whether the group effect was included as a covariate or not. Individual latent trait values were estimated using either a deterministic method or stochastic approaches. Latent traits were then compared with a t-test. Finally, a two-step method was performed to compare the latent trait distributions, and a Wald test was performed to test the group effect in the Rasch model including group covariates. The only unbiased IRT-based method was the group covariate Wald’s test, performed on the random effects Rasch model. This model displayed the highest observed power, which was similar to the power using the score t-test. These results need to be extended to the case frequently encountered in practice where data are missing and possibly informative. PMID:23115620
The Costs and Benefits of Testing and Guessing on Recognition Memory
ERIC Educational Resources Information Center
Huff, Mark J.; Balota, David A.; Hutchison, Keith A.
2016-01-01
We examined whether 2 types of interpolated tasks (i.e., retrieval-practice via free recall or guessing a missing critical item) improved final recognition for related and unrelated word lists relative to restudying or completing a filler task. Both retrieval-practice and guessing tasks improved correct recognition relative to restudy and filler…
Multiple Imputation of Multilevel Missing Data-Rigor versus Simplicity
ERIC Educational Resources Information Center
Drechsler, Jörg
2015-01-01
Multiple imputation is widely accepted as the method of choice to address item-nonresponse in surveys. However, research on imputation strategies for the hierarchical structures that are typically found in the data in educational contexts is still limited. While a multilevel imputation model should be preferred from a theoretical point of view if…
Response-related fMRI of veridical and false recognition of words.
Heun, Reinhard; Jessen, Frank; Klose, Uwe; Erb, Michael; Granath, Dirk-Oliver; Grodd, Wolfgang
2004-02-01
Studies on the relation between local cerebral activation and retrieval success usually compared high and low performance conditions, and thus showed performance-related activation of different brain areas. Only a few studies directly compared signal intensities of different response categories during retrieval. During verbal recognition, we recently observed increased parieto-occipital activation related to false alarms. The present study intends to replicate and extend this observation by investigating common and differential activation by veridical and false recognition. Fifteen healthy volunteers performed a verbal recognition paradigm using 160 learned target and 160 new distractor words. The subjects had to indicate whether they had learned the word before or not. Echo-planar MRI of blood-oxygen-level-dependent signal changes was performed during this recognition task. Words were classified post hoc according to the subjects' responses, i.e. hits, false alarms, correct rejections and misses. Response-related fMRI-analysis was used to compare activation associated with the subjects' recognition success, i.e. signal intensities related to the presentation of words were compared by the above-mentioned four response types. During recognition, all word categories showed increased bilateral activation of the inferior frontal gyrus, the inferior temporal gyrus, the occipital lobe and the brainstem in comparison with the control condition. Hits and false alarms activated several areas including the left medial and lateral parieto-occipital cortex in comparison with subjectively unknown items, i.e. correct rejections and misses. Hits showed more pronounced activation in the medial, false alarms in the lateral parts of the left parieto-occipital cortex. 
Veridical and false recognition show common as well as different areas of cerebral activation in the left parieto-occipital lobe: increased activation of the medial parietal cortex by hits may correspond to true recognition, increased activation of the parieto-occipital cortex by false alarms may correspond to familiarity decisions. Further studies are needed to investigate the reasons for false decisions in healthy subjects and patients with memory problems.
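The post hoc classification of words into hits, false alarms, correct rejections and misses described in this abstract can be sketched as follows. This is only an illustrative sketch: the function name, the boolean encoding of trials, and the example trial list are assumptions made here, not part of the study.

```python
from collections import Counter


def classify_response(is_target: bool, said_learned: bool) -> str:
    """Label one recognition trial from the word type and the subject's answer."""
    if is_target:
        # Learned (target) word: answering "learned" is a hit, otherwise a miss.
        return "hit" if said_learned else "miss"
    # New (distractor) word: answering "learned" is a false alarm.
    return "false alarm" if said_learned else "correct rejection"


# Hypothetical trials: (word was learned, subject said "learned").
trials = [(True, True), (True, False), (False, True), (False, False), (True, True)]
counts = Counter(classify_response(t, r) for t, r in trials)
print(counts)
```

Hits and false alarms share the answer "learned", and correct rejections and misses share the answer "not learned", which is the grouping the abstract calls subjectively known versus subjectively unknown items.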
Assessing the implementation of a bedside service handoff on an academic hospitalist service.
Wray, Charlie M; Arora, Vineet M; Hedeker, Donald; Meltzer, David O
2018-06-01
Inpatient service handoffs are a vulnerable transition during a patient's hospitalization. We hypothesized that performing the service handoff at the patient's bedside may be one mechanism to more efficiently transfer patient information between physicians, while further integrating the patient into their hospital care. We performed a 6-month prospective study of performing a bedside handoff (BHO) at the service transition on a non-teaching hospitalist service. On a weekly basis, transitioning hospitalists co-rounded at patients' bedsides. Post-handoff surveys assessed for completeness of handoff, communication, missed information, and adverse events. A control group who performed the handoff via email, phone or face-to-face was also surveyed. Chi-square and item-response theory (IRT) analysis assessed for differences between BHO and control groups. Narrative responses were elicited to qualitatively describe the BHO. In total, 21/31 (67%) scheduled BHOs were performed. On average, 4 out of 6 eligible patients experienced a BHO, with a total of 90 patients experiencing a BHO. Of those asked to perform the BHO, 52% stated the service transition took 31-60 min compared to 24% in the control group. Controlling for the nesting of observations within physicians, IRT analysis found that BHO respondents had statistically significantly greater odds of: reporting increased patient awareness of the service handoff, more certainty in the plan for each patient, less discovery of missed information, and less time needed to learn about the patient on the first day compared to control methods. Narrative responses described a more patient-centered handoff with improved communication that was time-consuming and often logistically difficult to implement. Despite its time-intensive nature, performing the service handoff at the patient's bedside may lead to a more complete and efficient service transition. Published by Elsevier Inc.
Cara Status and Upcoming Enhancements
NASA Technical Reports Server (NTRS)
Newman, Lauri
2015-01-01
RIC Miss Values in Summary Table: tabular presentation of miss vector in Summary Section. RIC Uncertainty Values in Details Section: numerical presentation of miss component uncertainty values in Details Section. Green Events with Potentially Maneuverable Secondary Objects: all potentially maneuverable secondary objects will be reported out to 7-days prior to TCA for LEO events and 10-days for NONLEO events, regardless of risk (relates to MOWG Action Item 1309-11); all green events with potentially active secondary objects included in Summary Reports; allows more time for contacting other OO. Black Box Fix: sometimes a black square appeared in the summary report where the ASW RIC time history plot should be. Appendix Orbit Regime. Mission Name Mismatch. Pc 0 Plotting Bug: all Pc points less than 1e-10 (zero) are now plotted as 1e-10 (instead of not at all). Maneuver Indication Fix: maneuver indicator now present even if maneuver was in the past.
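The Pc plotting behaviour described in this record, where values below 1e-10 are drawn at 1e-10 rather than dropped, amounts to clamping at a floor. A minimal sketch follows; the function and constant names are hypothetical and are not taken from the CARA software.

```python
PC_FLOOR = 1e-10  # threshold stated in the record; values below it are treated as zero


def floor_pc(values, floor=PC_FLOOR):
    """Return Pc values with everything below the floor raised to the floor."""
    return [max(v, floor) for v in values]


# Hypothetical Pc time history containing a true zero and a sub-threshold value.
history = [0.0, 1e-12, 3e-5]
print(floor_pc(history))
```

Clamping keeps every point on the plot; if the plot uses a logarithmic axis (an assumption here), it also avoids taking the logarithm of zero.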
Hoffmann, Tammy C; Walker, Marion F; Langhorne, Peter; Eames, Sally; Thomas, Emma; Glasziou, Paul
2015-01-01
Objective: To assess, in a sample of systematic reviews of non-pharmacological interventions, the completeness of intervention reporting, identify the most frequently missing elements, and assess review authors’ use of and beliefs about providing intervention information. Design: Analysis of a random sample of systematic reviews of non-pharmacological stroke interventions; online survey of review authors. Data sources and study selection: The Cochrane Library and PubMed were searched for potentially eligible systematic reviews and a random sample of these assessed for eligibility until 60 (30 Cochrane, 30 non-Cochrane) eligible reviews were identified. Data collection: In each review, the completeness of the intervention description in each eligible trial (n=568) was assessed by 2 independent raters using the Template for Intervention Description and Replication (TIDieR) checklist. All review authors (n=46) were invited to complete a survey. Results: Most reviews were missing intervention information for the majority of items. The most incompletely described items were: modifications, fidelity, materials, procedure and tailoring (missing from all interventions in 97%, 90%, 88%, 83% and 83% of reviews, respectively). Items that scored better, but were still incomplete for the majority of reviews, were: ‘when and how much’ (in 31% of reviews, adequate for all trials; in 57% of reviews, adequate for some trials); intervention mode (in 22% of reviews, adequate for all trials; in 38%, adequate for some trials); and location (in 19% of reviews, adequate for all trials). Of the 33 (71%) authors who responded, 58% reported having further intervention information but not including it, and 70% tried to obtain information. Conclusions: Most focus on intervention reporting has been directed at trials. Poor intervention reporting in stroke systematic reviews is prevalent, compounded by poor trial reporting. 
Without adequate intervention descriptions, the conduct, usability and interpretation of reviews are restricted and therefore, require action by trialists, systematic reviewers, peer reviewers and editors. PMID:26576811
Tew, Garry A.; Brabyn, Sally; Cook, Liz; Peckham, Emily
2016-01-01
Research supports the use of supervised exercise training as a primary therapy for improving the functional status of people with peripheral arterial disease (PAD). Several reviews have focused on reporting the outcomes of exercise interventions, but none have critically examined the quality of intervention reporting. Adequate reporting of the exercise protocols used in randomised controlled trials (RCTs) is central to interpreting study findings and translating effective interventions into practice. The purpose of this review was to evaluate the completeness of intervention descriptions in RCTs of supervised exercise training in people with PAD. A systematic search strategy was used to identify relevant trials published until June 2015. Intervention description completeness in the main trial publication was assessed using the Template for Intervention Description and Replication checklist. Missing intervention details were then sought from additional published material and by emailing authors. Fifty-eight trials were included, reporting on 76 interventions. Within publications, none of the interventions were sufficiently described for all of the items required for replication; this increased to 24 (32%) after contacting authors. Although programme duration, and session frequency and duration were well-reported in publications, complete descriptions of the equipment used, intervention provider, and number of participants per session were missing for three quarters or more of interventions (missing for 75%, 93% and 80% of interventions, respectively). Furthermore, 20%, 24% and 26% of interventions were not sufficiently described for the mode of exercise, intensity of exercise, and tailoring/progression, respectively. 
Information on intervention adherence/fidelity was also frequently missing: attendance rates were adequately described for 29 (38%) interventions, whereas sufficient detail about the intensity of exercise performed was presented for only 8 (11%) interventions. Important intervention details are commonly missing for supervised exercise programmes in the PAD trial literature. This has implications for the interpretation of outcome data, the investigation of dose-response effects, and the replication of protocols in future studies and clinical practice. Researchers should be mindful of intervention reporting guidelines when attempting to publish information about supervised exercise programmes, regardless of the population being studied. PMID:26938879
[Imputation methods for missing data in educational diagnostic evaluation].
Fernández-Alonso, Rubén; Suárez-Álvarez, Javier; Muñiz, José
2012-02-01
In the diagnostic evaluation of educational systems, self-reports are commonly used to collect data, both cognitive and orectic. For various reasons, in these self-reports, some of the students' data are frequently missing. The main goal of this research is to compare the performance of different imputation methods for missing data in the context of the evaluation of educational systems. On an empirical database of 5,000 subjects, 72 conditions were simulated: three levels of missing data, three types of loss mechanisms, and eight methods of imputation. The levels of missing data were 5%, 10%, and 20%. The loss mechanisms were set at: Missing completely at random, moderately conditioned, and strongly conditioned. The eight imputation methods used were: listwise deletion, replacement by the mean of the scale, by the item mean, the subject mean, the corrected subject mean, multiple regression, and Expectation-Maximization (EM) algorithm, with and without auxiliary variables. The results indicate that the recovery of the data is more accurate when using an appropriate combination of different methods of recovering lost data. When a case is incomplete, the mean of the subject works very well, whereas for completely lost data, multiple imputation with the EM algorithm is recommended. The use of this combination is especially recommended when data loss is greater and its loss mechanism is more conditioned. Lastly, the results are discussed, and some future lines of research are analyzed.
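Two of the simpler methods named in this abstract, replacement by the subject mean and by the item mean, can be sketched as below. This is a minimal illustration under stated assumptions: missing responses are encoded as `None`, the data layout (one list per subject) and the function names are invented here, and the corrected subject mean, regression, and EM variants from the study are not shown.

```python
from statistics import mean


def impute_subject_mean(row):
    """Fill None entries in one subject's responses with that subject's observed mean."""
    observed = [x for x in row if x is not None]
    if not observed:  # completely missing case: nothing to average, leave as is
        return list(row)
    m = mean(observed)
    return [m if x is None else x for x in row]


def impute_item_mean(data):
    """Fill None entries in each item (column) with that item's observed mean."""
    n_items = len(data[0])
    col_means = []
    for j in range(n_items):
        observed = [row[j] for row in data if row[j] is not None]
        col_means.append(mean(observed) if observed else None)
    return [[col_means[j] if x is None else x for j, x in enumerate(row)]
            for row in data]


# Hypothetical responses: 3 subjects by 3 items, None marks a missing answer.
data = [
    [4, None, 2],
    [3, 5, None],
    [5, 3, 4],
]
by_subject = [impute_subject_mean(r) for r in data]
by_item = impute_item_mean(data)

# Design size stated in the abstract: 3 missing-data levels x 3 mechanisms x 8 methods.
n_conditions = 3 * 3 * 8
print(by_subject, by_item, n_conditions)
```

The subject mean borrows information from the same respondent's other answers, while the item mean borrows from other respondents, which is why the two can give different fills for the same cell.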
Ozturk, Erhan Arif; Kocer, Bilge Gonenli; Umay, Ebru; Cakci, Aytul
2018-06-07
The objectives of the present study were to translate and cross-culturally adapt the English version of the Parkinson Fatigue Scale into Turkish, to evaluate its psychometric properties, and to compare them with those of other language versions. A total of 144 patients with idiopathic Parkinson disease were included in the study. The Turkish version of the Parkinson Fatigue Scale was evaluated for data quality, scaling assumptions, acceptability, reliability, and validity. The questionnaire response rate was 100% for both test and retest. The percentage of missing data was zero for items, and the percentage of computable scores was full. Floor and ceiling effects were absent. The Parkinson Fatigue Scale provides acceptable internal consistency (Cronbach's alpha was 0.974 for the first test and 0.964 for the retest, and corrected item-to-total correlations ranged from 0.715 to 0.906) and test-retest reliability (Cohen's kappa coefficients ranged from 0.632 to 0.786 for individual items, and the intraclass correlation coefficient was 0.887 for the overall Parkinson Fatigue Scale Score). An exploratory factor analysis of the items revealed a single factor explaining 71.7% of variance. The goodness-of-fit statistics for the one-factorial confirmatory factor analysis were Tucker Lewis index = 0.961, comparative fit index = 0.971 and root mean square error of approximation = 0.077 for a single factor. The average Parkinson Fatigue Scale Score was correlated significantly with sociodemographic data, clinical characteristics and scores of rating scales. The Turkish version of the Parkinson Fatigue Scale seems to be culturally well adapted and to have good psychometric properties. The scale can be used in further studies to assess fatigue in patients with Parkinson's disease.
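Cronbach's alpha, the internal-consistency statistic reported in this abstract, is conventionally computed as k/(k-1) times one minus the ratio of the summed item variances to the variance of the total scores. The sketch below assumes complete data (no missing responses) and uses sample variances throughout; the function name and the example scores are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for complete data.

    scores: list of subjects, each a list of k item scores.
    Assumes k >= 2 and that total scores are not all identical.
    """
    k = len(scores[0])
    # Sample variance of each item across subjects.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each subject's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: 4 subjects by 3 items that rise together perfectly.
scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

In this constructed example the items are perfectly correlated, so alpha reaches its upper bound of 1; real questionnaire data give lower values.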
2013-01-01
Background: The Scale to Assess Unawareness in Mental Disorder (SUMD) is widely used in clinical trials and epidemiological studies but more rarely in clinical practice because of its length (74 items). In clinical practice, it is necessary to provide shorter instruments. The aim of this study was to investigate the validity and reliability of the abbreviated version of the SUMD. Methods: Design: We used data from four cross-sectional studies conducted in several psychiatric hospitals in France. Inclusion criteria: a diagnosis of schizophrenia based on DSM-IV criteria. Data collection: socio-demographic and clinical data (including duration of illness, Positive and Negative Syndrome Scale, and the Calgary Depression Scale); quality of life; SUMD. Statistical analysis: confirmatory factor analyses, item-dimension correlations, Cronbach’s alpha coefficients, Rasch statistics, relationships between the SUMD and other parameters. We tested two different scoring models and considered the response ‘not applicable’ as ‘0’ or as missing data. Results: Five hundred and thirty-one patients participated in this study. The 3-factor structure of the SUMD (awareness of the disease, consequences and need for treatment; awareness of positive symptoms; and awareness of negative symptoms) was confirmed using LISREL confirmatory factor analysis for the two models. Internal item consistency and reliability were satisfactory for all dimensions. External validity testing revealed that dimension scores correlated significantly with all PANSS scores, especially with the G12 item (lack of judgement and awareness). Significant associations with age, disease duration, education level, and living arrangements showed good discriminant validity. Conclusion: The abbreviated version of the SUMD appears to be a valid and reliable instrument for measuring insight in patients with schizophrenia and may be used by clinicians to accurately assess insight in clinical settings. PMID:24053640
Haberman, Shelby J; Sinharay, Sandip; Chon, Kyong Hee
2013-07-01
Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.
Psychometric assessment of the IBS-D Daily Symptom Diary and Symptom Event Log.
Rosa, Kathleen; Delgado-Herrera, Leticia; Zeiher, Bernie; Banderas, Benjamin; Arbuckle, Rob; Spears, Glen; Hudgens, Stacie
2016-12-01
Diarrhea-predominant irritable bowel syndrome (IBS-D) can considerably impact patients' lives. Patient-reported symptoms are crucial in understanding the diagnosis and progression of IBS-D. This study psychometrically evaluates the newly developed IBS-D Daily Symptom Diary and Symptom Event Log (hereafter, "Event Log") according to US regulatory recommendations. A US-based observational field study was conducted to understand cross-sectional psychometric properties of the IBS-D Daily Symptom Diary and Event Log. Analyses included item descriptive statistics, item-to-item correlations, reliability, and construct validity. The IBS-D Daily Symptom Diary and Event Log had no items with excessive missing data. With the exception of two items ("frequency of gas" and "accidents"), moderate to high inter-item correlations were observed among all items of the IBS-D Daily Symptom Diary and Event Log (day 1 range 0.67-0.90). Item scores demonstrated reliability, with the exception of the "frequency of gas" and "accidents" items of the Diary and "incomplete evacuation" item of the Event Log. The pattern of correlations of the IBS-D Daily Symptom Diary and Event Log item scores with generic and disease-specific measures was as expected, moderate for similar constructs and low for dissimilar constructs, supporting construct validity. Known-groups methods showed statistically significant differences and monotonic trends in each of the IBS-D Daily Symptom Diary item scores among groups defined by patients' IBS-D severity ratings ("none"/"mild," "moderate," or "severe"/"very severe"), supporting construct validity. Initial psychometric results support the reliability and validity of the items of the IBS-D Daily Symptom Diary and Event Log.
An NCME Instructional Module on Polytomous Item Response Theory Models
ERIC Educational Resources Information Center
Penfield, Randall David
2014-01-01
A polytomous item is one for which the responses are scored according to three or more categories. Given the increasing use of polytomous items in assessment practices, item response theory (IRT) models specialized for polytomous items are becoming increasingly common. The purpose of this ITEMS module is to provide an accessible overview of…
Malaty, Hoda M; Abudayyeh, Suhaib; O'Malley, Kimberly J; Wilsey, Michael J; Fraley, Ken; Gilger, Mark A; Hollier, David; Graham, David Y; Rabeneck, Linda
2005-02-01
Recurrent abdominal pain (RAP) is a common problem in children and adolescents. Evaluation and treatment of children with RAP continue to challenge physicians because of the lack of a psychometrically sound measure for RAP. A major obstacle to progress in research on RAP has been the lack of a biological marker for RAP and the lack of a reliable and valid clinical measure for RAP. The objectives of this study were (1) to develop and test a multidimensional measure for RAP (MM-RAP) in children to serve as a primary outcome measure for clinical trials, (2) to evaluate the reliability of the measure and compare its responses across different populations, and (3) to examine the reliabilities of the measure scales in relation to the demographic variables of the studied population. We conducted 3 cross-sectional studies. Two studies were clinic-based studies that enrolled children with RAP from 1 pediatric gastroenterology clinic and 6 primary care clinics. The third study was a community-based study in which children from 1 elementary and 2 middle schools were screened for frequent episodes of abdominal pain. The 3 studies were conducted in Houston, Texas. Inclusion criteria for the clinic-based studies were (1) age of 4 to 18 years; (2) abdominal pain that had persisted for 3 or more months; (3) abdominal pain that was moderate to severe and interfered with some or all regular activities; (4) abdominal pain that may or may not be accompanied by upper-gastrointestinal symptoms; and (5) children were accompanied by a parent or guardian who was capable of giving informed consent, and children over the age of 10 years were capable of giving informed assent. The community-based study used standardized questionnaires that were offered to 1080 children/parents from the 3 participating schools; 700 completed and returned the questionnaires (65% response rate). The questionnaire was designed to elicit data concerning the history of abdominal pain or discomfort. 
A total of 160 children met Apley's criteria and were classified as having RAP. Inclusion criteria were identical to those criteria for the clinic-based studies. Participating children in the 3 studies received a standardized questionnaire that asked about socioeconomic variables, abdominal pain (intensity; frequency; duration; nature of abdominal pain, if present, and possible relationships with school activities; and other upper gastrointestinal symptoms). We used 4 scales for the MM-RAP: pain intensity scale (3 items), nonpain symptoms scale (12 items), disability scale (3 items), and satisfaction scale (2 items). Age 7 was used as a cutoff point for the analysis because 7-year-olds have been shown to exhibit more sophisticated knowledge of illness than younger children. A total of 295 children who were aged 4 to 18 years participated in the study: 155 children from the pediatric gastroenterology clinics, 82 from the primary care clinics, and 58 from the schools. The interitem consistency values (Cronbach's coefficient alpha) for the pain intensity items, nonpain symptoms items, disability items, and satisfaction items were 0.75, 0.81, 0.80, and 0.78, respectively, demonstrating good reliability of the measure. The internal consistencies of the 4 scales did not significantly differ between younger (≤7 years) and older (>7 years) children. There was also no significant variation in the coefficient alpha of each of the 4 scales in relation to gender or the level of the parent's education. Reliability was identical for the pain-intensity items (0.74) among children who sought medical attention from primary care or pediatric gastroenterology clinics. The intercorrelations of factor scores among the 4 scales showed a strong relationship among the factors, but not so strong that the scales would be expected to be measuring the same items. The results of the factor analysis identified 5 components instead of 4 components representing the 4 scales. 
The 12 items of the nonpain symptoms scale were classified into 2 components; 1 component included heartburn, burping, passing gas, bloating, problem with ingestion of milk, bad breath, and sour taste (nonpain symptoms I), and the other included nausea/vomiting, diarrhea, and constipation (nonpain symptoms II). The program ordered the 5 components on the basis of the percentage of the total variance explained by each component and consequently by the strength of each component in the following order: nonpain symptoms I, pain intensity, pain disability, satisfaction, and nonpain symptoms II. Of the 20 items that composed the MM-RAP, 17 met the inclusion criteria of having a correlation of ≥0.40 on the primary factor analyses. The 3 items that assessed pain intensity met the inclusion criteria, as did the 2 items that assessed satisfaction. Two of the 3 items that assessed disability met the inclusion criteria; however, the missed school item did not. The sleep problem and the loss of appetite items in the nonpain items also did not meet the inclusion criteria in both components of the nonpain symptoms scale. However, the loss of appetite item met the inclusion criteria in the disability scale with a correlation of 0.6. The 2 items that did not meet the inclusion criteria (missed school days and sour taste) will be eliminated in the revised measure for RAP. The MM-RAP demonstrated good reliability evidence in population samples. Children who have RAP and are seen at pediatric gastroenterology or primary care pediatric clinics have similar responses, showing that the measure performed well across several populations. Age did not affect the reliability of responses. The MM-RAP included 4 dimensions, each with several items that may identify disease-specific dimensions. In addition, dividing the nonpain symptoms scale into 2 components instead of 1 component could assist in creating a disease-specific measure. 
The present study focused exclusively on developing the multidimensional measure for RAP in children that could assist physicians in evaluating the efficacy of RAP treatment independent of psychological evaluations. In addition, the measure was designed for use in clinical trials that evaluate the efficacy of RAP treatment and to allow comparison between intervention studies. In conclusion, we were able to identify 4 dimensions of RAP in children (pain intensity, nonpain symptoms, pain disability, and satisfaction with health). We demonstrated that these dimensions can be measured in a reliable manner that is applicable to children who experience RAP in various settings.
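The scale reliabilities reported in the abstract above are Cronbach's coefficient alpha values. As a minimal sketch of how such a coefficient is computed from a respondents-by-items score table, the following uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy scores are illustrative assumptions, not data or code from the study, and complete responses (no missing items) are assumed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete item-response data.

    rows: list of respondents, each a list of k numeric item scores.
    Hypothetical helper for illustration; assumes no missing responses.
    """
    n_respondents = len(rows)
    k = len(rows[0])
    if k < 2 or n_respondents < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer all k items")

    # Sample variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of the per-respondent total (sum) score.
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy 3-item scale answered by five hypothetical respondents.
toy_scores = [
    [1, 2, 2],
    [2, 2, 3],
    [3, 4, 4],
    [4, 4, 5],
    [5, 5, 5],
]
print(round(cronbach_alpha(toy_scores), 3))
```

Items that move together across respondents push alpha toward 1, while unrelated items push it toward 0, which is why the coefficient is read as internal consistency.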
Ramsay-Curve Item Response Theory for the Three-Parameter Logistic Item Response Model
ERIC Educational Resources Information Center
Woods, Carol M.
2008-01-01
In Ramsay-curve item response theory (RC-IRT), the latent variable distribution is estimated simultaneously with the item parameters of a unidimensional item response model using marginal maximum likelihood estimation. This study evaluates RC-IRT for the three-parameter logistic (3PL) model with comparisons to the normal model and to the empirical…
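For orientation, the 3PL model named in this abstract is usually written as follows. The notation is the conventional one and is an assumption here, since the abstract itself gives no formula: $\theta$ is the latent trait, and $a_i$, $b_i$, and $c_i$ are the discrimination, difficulty, and lower-asymptote (guessing) parameters of item $i$.

```latex
P(X_i = 1 \mid \theta) = c_i + (1 - c_i)\,\frac{1}{1 + \exp\{-a_i(\theta - b_i)\}}
```

Setting $c_i = 0$ recovers the two-parameter logistic model.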
ERIC Educational Resources Information Center
Preston, Kathleen; Reise, Steven; Cai, Li; Hays, Ron D.
2011-01-01
The authors used a nominal response item response theory model to estimate category boundary discrimination (CBD) parameters for items drawn from the Emotional Distress item pools (Depression, Anxiety, and Anger) developed in the Patient-Reported Outcomes Measurement Information Systems (PROMIS) project. For polytomous items with ordered response…
Okura, Mika; Ogita, Mihoko; Yamamoto, Miki; Nakai, Toshimi; Numata, Tomoko; Arai, Hidenori
This study aimed to examine the relationship of participating in community activities (CA) with cognitive impairment and depressive mood independent of mobility disorder (MD) among older Japanese people. Elderly residents in institutions or those requiring long-term care insurance services were excluded; questionnaires were mailed to 5401 older adults in 2013. The response rate was 94.3% (n=5094). We used multiple imputation to manage missing data. The questionnaire addressed physical fitness, memory, mood, and CA. Participants were divided into two groups (good and bad) based on the median scores for physical fitness, memory, and mood. We identified items related to periodically performed CA, cognitive impairment, and depressive mood, and examined correlations between scores on these sets of items. The mean age was 75.9 years; 58.4% of participants were women. The following CA significantly predicted reduced cognitive impairment and depressive mood independent of MD: volunteer activity, community activity, visiting friends at home, pursuing hobbies, paid work, farm work, and daily shopping. These results were corrected for age, sex, and response method (mail or home-visit). Higher CA scores were associated with lower cognitive impairment and lower depressive mood independent of MD. CA is negatively associated with cognitive impairment and depressive mood among community-dwelling elderly independent of MD; promoting CA may protect against cognitive impairment and depressive mood in this population. However, MD, cognitive impairment, and depressive mood may lead to reduced CA. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
Found and Missed: Failing to Recognize a Search Target despite Moving It
ERIC Educational Resources Information Center
Solman, Grayden J. F.; Cheyne, J. Allan; Smilek, Daniel
2012-01-01
We present results from five search experiments using a novel "unpacking" paradigm in which participants use a mouse to sort through random heaps of distractors to locate the target. We report that during this task participants often fail to recognize the target despite moving it, and despite having looked at the item. Additionally, the missed…
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-15
.... Three additional unassociated funerary objects (one baked clay artifact and two beads) are missing. From... bags of baked clay, 1 bead, 2 bags of carbonized material, 13 bags of faunal material, 1 piece of... County, CA, by the university. The 510 unassociated funerary objects are 11 bags of baked clay, 420 beads...
8 CFR 299.4 - Reproduction of Public Use Forms by public and private entities.
Code of Federal Regulations, 2010 CFR
2010-01-01
... read, or displays added or missing data elements, will be rejected by the Service. Any problems... official form. The wording and punctuation of all data elements and identifying information must match exactly. No data elements may be added or deleted. The sequence and format for each item on the form must...
Hospital saves $1 million by outsourcing laundry.
1999-04-01
Thirty-five percent of hospitals nationwide are outsourcing laundry services, according to the Textile Rental Services Association. Pennsylvania Hospital cut its cost per pound of laundry from 61.5 cents to 46 cents, saving $1 million in its first year of outsourcing. Outsourcing also brought the hospital better inventory control, more efficient delivery, and fewer complaints about missing items.
How and How Not to Prepare Students for the New Tests
ERIC Educational Resources Information Center
Shanahan, Timothy
2014-01-01
"Data-driven school reform" emphasizes the idea that if teachers analyze the kinds of questions students miss on standardized reading comprehension tests, and then give students lots of practice with such items, they will end up with higher test scores. This approach is likely to be popular with the new Partnership for Assessment of…
ERIC Educational Resources Information Center
Fukuhara, Hirotaka; Kamata, Akihito
2011-01-01
A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into…
Item Response Models for Examinee-Selected Items
ERIC Educational Resources Information Center
Wang, Wen-Chung; Jin, Kuan-Yu; Qiu, Xue-Lan; Wang, Lei
2012-01-01
In some tests, examinees are required to choose a fixed number of items from a set of given items to answer. This practice creates a challenge to standard item response models, because more capable examinees may have an advantage by making wiser choices. In this study, we developed a new class of item response models to account for the choice…
ERIC Educational Resources Information Center
Lee, Woo-yeol; Cho, Sun-Joo
2017-01-01
Cross-level invariance in a multilevel item response model can be investigated by testing whether the within-level item discriminations are equal to the between-level item discriminations. Testing the cross-level invariance assumption is important to understand constructs in multilevel data. However, in most multilevel item response model…
An NCME Instructional Module on Latent DIF Analysis Using Mixture Item Response Models
ERIC Educational Resources Information Center
Cho, Sun-Joo; Suh, Youngsuk; Lee, Woo-yeol
2016-01-01
The purpose of this ITEMS module is to provide an introduction to differential item functioning (DIF) analysis using mixture item response models. The mixture item response models for DIF analysis involve comparing item profiles across latent groups, instead of manifest groups. First, an overview of DIF analysis based on latent groups, called…
Guenole, Nigel; Brown, Anna A; Cooper, Andrew J
2018-06-01
This article describes an investigation of whether Thurstonian item response modeling is a viable method for assessment of maladaptive traits. Forced-choice responses from 420 working adults to a broad-range personality inventory assessing six maladaptive traits were considered. The Thurstonian item response model's fit to the forced-choice data was adequate, while the fit of a counterpart item response model to responses to the same items but arranged in a single-stimulus design was poor. Monotrait heteromethod correlations indicated corresponding traits in the two formats overlapped substantially, although they did not measure equivalent constructs. A better goodness of fit and higher factor loadings for the Thurstonian item response model, coupled with a clearer conceptual alignment to the theoretical trait definitions, suggested that the single-stimulus item responses were influenced by biases that the independent clusters measurement model did not account for. Researchers may wish to consider forced-choice designs and appropriate item response modeling techniques such as Thurstonian item response modeling for personality questionnaire applications in industrial psychology, especially when assessing maladaptive traits. We recommend further investigation of this approach in actual selection situations and with different assessment instruments.
A Quasi-Parametric Method for Fitting Flexible Item Response Functions
ERIC Educational Resources Information Center
Liang, Longjuan; Browne, Michael W.
2015-01-01
If standard two-parameter item response functions are employed in the analysis of a test with some newly constructed items, it can be expected that, for some items, the item response function (IRF) will not fit the data well. This lack of fit can also occur when standard IRFs are fitted to personality or psychopathology items. When investigating…
Qualitative Development of the PROMIS® Pediatric Stress Response Item Banks
Gardner, William; Pajer, Kathleen; Riley, Anne W.; Forrest, Christopher B.
2013-01-01
Objective: To describe the qualitative development of the Patient-Reported Outcome Measurement Information System (PROMIS®) Pediatric Stress Response item banks. Methods: Stress response concepts were specified through a literature review and interviews with content experts, children, and parents. A library comprising 2,677 items derived from 71 instruments was developed. Items were classified into conceptual categories; new items were written and redundant items were removed. Items were then revised based on cognitive interviews (n = 39 children), readability analyses, and translatability reviews. Results: Two pediatric Stress Response sub-domains were identified: somatic experiences (43 items) and psychological experiences (64 items). Final item pools cover the full range of children’s stress experiences. Items are comprehensible among children aged ≥8 years and ready for translation. Conclusions: Child- and parent-report versions of the item banks assess children’s somatic and psychological states when demands tax their adaptive capabilities. PMID:23124904
Development of short and very short forms of the Children's Behavior Questionnaire.
Putnam, Samuel P; Rothbart, Mary K
2006-08-01
Using data from 468 parents and taking into account internal consistency, breadth of item content, within-scale factor analysis, and patterns of missing data, we developed short (94 items, 15 scales) and very short (36 items, 3 broad scales) forms of the Children's Behavior Questionnaire (CBQ; Rothbart, Ahadi, & Hershey, 1994; Rothbart, Ahadi, Hershey, & Fisher, 2001), a well-established parent-report measure of temperament for children aged 3 to 8 years. We subsequently evaluated the forms with data from 1,189 participants. In mid/high-income and White samples, the CBQ short and very short forms demonstrated both satisfactory internal consistency and criterion validity, and exhibited longitudinal stability and cross-informant agreement comparable to that of the standard CBQ. Internal consistency was somewhat lower among African American and low-income samples for some scales. Very short form scales demonstrated acceptable internal consistency for all samples, and confirmatory factor analyses indicated marginal fit of the very short form items to a three-factor model.
NASA Astrophysics Data System (ADS)
Reynolds, A. M.
2008-07-01
The results of numerical simulations indicate that deterministic walks with inverse-square power-law scaling are a robust emergent property of predators that use chemotaxis to locate randomly and sparsely distributed stationary prey items. It is suggested that chemotactic destructive foraging accounts for the apparent Lévy flight movement patterns of Oxyrrhis marina microzooplankton in still water containing prey items. This challenges the view that these organisms are executing an innate optimal Lévy flight searching strategy. Crucial for the emergence of inverse-square power-law scaling is the tendency of chemotaxis to occasionally cause predators to miss the nearest prey item, an occurrence which would not arise if prey were located through the employment of a reliable cognitive map or if prey location were visually cued and perfect.
Chiba, Rie; Umeda, Maki; Goto, Kyohei; Miyamoto, Yuki; Yamaguchi, Sosei; Kawakami, Norito
2017-01-01
The Recovery Knowledge Inventory (RKI) is one of the influential scales for assessing knowledge of and attitudes toward recovery-oriented practices among mental health service providers. In the present study, we aimed to develop a Japanese version of the RKI and examine its validity and reliability. We translated the RKI into Japanese with reference to the guidelines for translating and adapting psychometric scales. A cross-sectional questionnaire survey was conducted with mental health service providers. Of a total of 475 eligible professionals, we used data from the 299 participants without missing values for the analyses (valid response rate = 62.9%). The questionnaire included the Japanese RKI, the Recovery Attitudes Questionnaire, the positive attitudes scale, and the Japanese-language version of the Social Distance Scale. To examine the factorial validity of the RKI, exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were employed. Convergent validity was assessed by calculating Pearson's correlation coefficients between the total RKI score and the scores for the other three scales. We also calculated Cronbach's α coefficients for the total score and for each domain of the RKI to assess internal consistency reliability. The participants' mean age was 40.4 years and 30.4% were men. In EFAs, the 20-item RKI did not provide an adequate or interpretable factor solution at any number of factors. Four items (#1, 4, 5, and 13) were therefore eliminated in stages, and the resulting 16-item RKI was used for further analyses. EFA with a four-factor structure yielded a marginally interpretable solution. The factors represented, respectively, knowledge regarding psychiatric symptoms and recovery; knowledge about the recovery process; understanding of what is important for recovery; and understanding of the challenges and responsibility in recovery. Subsequent CFA suggested good fit to the data. 
Good convergent validity and understandable internal consistency reliability were also observed. The Japanese 16-item RKI showed reasonable factorial validity, good convergent validity, and understandable internal consistency reliability among mental health professionals. Japanese cultural settings seemed to influence the four-factor structure in the present study. The scale can be used for future studies in Japan, although future large-scale research is required to ensure robust verification.
A Comparison of Web and Telephone Responses From a National HIV and AIDS Survey
Calzavara, Liviana; Allman, Dan; Worthington, Catherine A; Tyndall, Mark; Iveniuk, James
2016-01-01
Background: Response differences to survey questions are known to exist for different modes of questionnaire completion. Previous research has shown that response differences by mode are larger for sensitive and complicated questions. However, it is unknown what effect completion mode may have on HIV and AIDS survey research, which addresses particularly sensitive and stigmatized health issues. Objectives: We seek to compare responses between self-selected Web and telephone respondents in terms of social desirability and item nonresponse in a national HIV and AIDS survey. Methods: A survey of 2085 people in Canada aged 18 years and older was conducted to explore public knowledge, attitudes, and behaviors around HIV and AIDS in May 2011. Participants were recruited using random-digit dialing and could select to be interviewed on the telephone or self-complete through the Internet. For this paper, 15 questions considered to be either sensitive, stigma-related, or less-sensitive in nature were assessed to estimate associations between responses and mode of completion. Multivariate regression analyses were conducted for questions with significant (P≤.05) bivariate differences in responses to adjust for sociodemographic factors. As survey mode was not randomly assigned, we created a propensity score variable and included it in our multivariate models to control for mode selection bias. Results: A total of 81% of participants completed the questionnaire through the Internet, and 19% completed by telephone. Telephone respondents were older, reported less education, had lower incomes, and were more likely from the province of Quebec. Overall, 2 of 13 questions assessed for social desirability and 3 of 15 questions assessed for item nonresponse were significantly associated with choice of mode in the multivariate analysis. 
For social desirability, Web respondents were more likely than telephone respondents to report more than 1 sexual partner in the past year (fully adjusted odds ratio (OR)=3.65, 95% CI 1.80-7.42) and more likely to have donated to charity in the past year (OR=1.63, 95% CI 1.15-2.29). For item nonresponse, Web respondents were more likely than telephone respondents to have a missing or “don’t know” response when asked about: the disease they were most concerned about (OR=3.02, 95% CI 1.67-5.47); if they had ever been tested for HIV (OR=8.04, 95% CI 2.46-26.31); and when rating their level of comfort with shopping at a grocery store if the owner was known to have HIV or AIDS (OR=3.11, 95% CI 1.47-6.63). Conclusion: Sociodemographic differences existed between Web and telephone respondents, but for 23 of 28 questions considered in our analysis, there were no significant differences in responses by mode. For surveys with very sensitive health content, such as HIV and AIDS, Web administration may be subject to less social desirability bias but may also have greater item nonresponse for certain questions. PMID:27473597
Kawasaki, Yohei; Ide, Kazuki; Akutagawa, Maiko; Yamada, Hiroshi; Furukawa, Toshiaki A.; Ono, Yutaka
2016-01-01
Background: Several studies have shown that total depressive symptom scores in the general population approximate an exponential pattern, except for the lower end of the distribution. The Center for Epidemiologic Studies Depression Scale (CES-D) consists of 20 items, each of which may take one of four scores: “rarely,” “some,” “occasionally,” and “most of the time.” Recently, we reported that the item responses for 16 negative affect items commonly exhibit exponential patterns, except for the level of “rarely,” leading us to hypothesize that the item responses at the level of “rarely” may be related to the non-exponential pattern typical of the lower end of the distribution. To verify this hypothesis, we investigated how the item responses contribute to the distribution of the sum of the item scores. Methods: Data collected from 21,040 subjects who had completed the CES-D questionnaire as part of a Japanese national survey were analyzed. To assess the item responses of negative affect items, we used a parameter r, which denotes the ratio of “rarely” to “some” in each item response. The distributions of the sum of negative affect items in various combinations were analyzed using log-normal scales and curve fitting. Results: The sum of the item scores approximated an exponential pattern regardless of the combination of items, whereas, at the lower end of the distributions, there was a clear divergence between the actual data and the predicted exponential pattern. At the lower end of the distributions, the sum of the item scores with high values of r exhibited higher scores compared to those predicted from the exponential pattern, whereas the sum of the item scores with low values of r exhibited lower scores compared to those predicted. Conclusions: The distributional pattern of the sum of the item scores could be predicted from the item responses of such items. PMID:27806132
Stochastic Approximation Methods for Latent Regression Item Response Models
ERIC Educational Resources Information Center
von Davier, Matthias; Sinharay, Sandip
2010-01-01
This article presents an application of a stochastic approximation expectation maximization (EM) algorithm using a Metropolis-Hastings (MH) sampler to estimate the parameters of an item response latent regression model. Latent regression item response models are extensions of item response theory (IRT) to a latent variable model with covariates…
Silver Alerts and the Problem of Missing Adults with Dementia
ERIC Educational Resources Information Center
Carr, Dawn; Muschert, Glenn W.; Kinney, Jennifer; Robbins, Emily; Petonito, Gina; Manning, Lydia; Brown, J. Scott
2010-01-01
In the months following the introduction of the National AMBER (America's Missing: Broadcast Emergency Response) Alert plan used to locate missing and abducted children, Silver Alert programs began to emerge. These programs use the same infrastructure and approach to find a different missing population, cognitively impaired older adults. By late…
Computerized Adaptive Test (CAT) Applications and Item Response Theory Models for Polytomous Items
ERIC Educational Resources Information Center
Aybek, Eren Can; Demirtasli, R. Nukhet
2017-01-01
This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. Besides that, it aims to introduce the simulation and live CAT software to the related researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…
ERIC Educational Resources Information Center
Ito, Kyoko; Sykes, Robert C.
This study investigated the practice of weighting a type of test item, such as constructed response, more than other types of items, such as selected response, to compute student scores for a mixed-item type of test. The study used data from statewide writing field tests in grades 3, 5, and 8 and considered two contexts, that in which a single…
Asaoka, Shoichi; Aritake, Sayaka; Komada, Yoko; Ozaki, Akiko; Odagiri, Yuko; Inoue, Shigeru; Shimomitsu, Teruichi; Inoue, Yuichi
2013-05-01
Workers who meet the criteria for shift work disorder (SWD) have elevated levels of risk for various health and behavioral problems. However, the impact of having SWD on shiftworkers engaged in rapid-rotation schedules is unknown. Moreover, the risk factors for the occurrence of SWD remain unclear. To clarify these issues, we conducted a questionnaire-based, cross-sectional survey on a sample of shiftworking nurses. Responses were obtained from 1202 nurses working at university hospitals in Tokyo, Japan, including 727 two-shift workers and 315 three-shift workers. The questionnaire included items relevant to age, gender, family structure, work environment, health-related quality of life (QOL), diurnal type, depressive symptoms, and SWD. Participants who reported insomnia and/or excessive sleepiness for at least 1 mo that was subjectively relevant to their shiftwork schedules were categorized as having SWD. The prevalence of SWD in the sampled shiftworking nurses was 24.4%; shiftworking nurses with SWD showed lower health-related QOL and more severe depressive symptoms, with greater rates of both actual accidents/errors and near misses, than those without SWD. The results of logistic regression analyses showed that more time spent working at night, frequent missing of nap opportunities during night work, and having an eveningness-oriented chronotype were significantly associated with SWD. The present study indicated that SWD might be associated with reduced health-related QOL and decreased work performance in shiftworking nurses on rapid-rotation schedules. The results also suggested that missing napping opportunities during night work, long nighttime working hours, and the delay of circadian rhythms are associated with the occurrence of SWD among shiftworking nurses on rapid-rotation schedules.
ERIC Educational Resources Information Center
Howlett, Melissa A.; Sidener, Tina M.; Progar, Patrick R.; Sidener, David W.
2011-01-01
The effects of contriving motivating operations (MOs) and script fading on the acquisition of the mand "Where's [object]?" were evaluated in 2 boys with language delays. During each session, trials were alternated in which high-preference items were present (abolishing operation [AO] trials) or missing (establishing operation [EO] trials) from…
Linking Research and Practice: Effective Strategies for Teaching Vocabulary in the ESL Classroom
ERIC Educational Resources Information Center
Nam, Jihyun
2010-01-01
Vocabulary plays a pivotal role in the ESL classroom. Whereas a considerable amount of research has examined effective ESL vocabulary teaching and learning, missing are studies that provide examples of how to put various research findings into practice: that is, apply them to real texts including target vocabulary items. In order to close the gap…
49 CFR 393.134 - What are the rules for securing roll-on/roll-off or hook lift containers?
Code of Federal Regulations, 2014 CFR
2014-10-01
... which is not equipped with an integral securement system must be: (1) Blocked against forward movement... least as effectively as the tiedowns in the two previous items. (4) The mechanisms used to secure the... secure the container to the vehicle, providing the same level of securement as the missing, damaged or...
49 CFR 393.134 - What are the rules for securing roll-on/roll-off or hook lift containers?
Code of Federal Regulations, 2012 CFR
2012-10-01
... which is not equipped with an integral securement system must be: (1) Blocked against forward movement... least as effectively as the tiedowns in the two previous items. (4) The mechanisms used to secure the... secure the container to the vehicle, providing the same level of securement as the missing, damaged or...
A Comparison of Linking and Concurrent Calibration under the Graded Response Model.
ERIC Educational Resources Information Center
Kim, Seock-Ho; Cohen, Allan S.
Applications of item response theory to practical testing problems including equating, differential item functioning, and computerized adaptive testing, require that item parameter estimates be placed onto a common metric. In this study, two methods for developing a common metric for the graded response model under item response theory were…
Writing, Evaluating and Assessing Data Response Items in Economics.
ERIC Educational Resources Information Center
Trotman-Dickenson, D. I.
1989-01-01
Describes some of the problems in writing data response items in economics for use by A Level and General Certificate of Secondary Education (GCSE) students. Examines the experience of two series of workshops on writing items, evaluating them and assessing responses from schools. Offers suggestions for producing packages of data response items as…
Item Response Modeling with Sum Scores
ERIC Educational Resources Information Center
Johnson, Timothy R.
2013-01-01
One of the distinctions between classical test theory and item response theory is that the former focuses on sum scores and their relationship to true scores, whereas the latter concerns item responses and their relationship to latent scores. Although item response theory is often viewed as the richer of the two theories, sum scores are still…
A Model-Free Diagnostic for Single-Peakedness of Item Responses Using Ordered Conditional Means
ERIC Educational Resources Information Center
Polak, Marike; De Rooij, Mark; Heiser, Willem J.
2012-01-01
In this article we propose a model-free diagnostic for single-peakedness (unimodality) of item responses. Presuming a unidimensional unfolding scale and a given item ordering, we approximate item response functions of all items based on ordered conditional means (OCM). The proposed OCM methodology is based on Thurstone & Chave's (1929) "criterion…
Sync and swim: the impact of medication consolidation on adherence in Medicaid patients.
Ross, Alexander; Jami, Humaira; Young, Heather A; Katz, Richard
2013-10-01
Medication nonadherence is associated with higher cost of care and poor outcomes. Medication refill consolidation (synchronization of refill dates for patients on multiple drugs) is an important component of regimen complexity. We presumed that Medicaid patients with a 30-day medication supply limit would have significant difficulty with refill consolidation. We evaluated regimen complexity and refill consolidation in relation to medication adherence in the Medicaid population. A survey was administered to 50 Medicaid patients taking 2 or more daily medications in the outpatient setting. The survey included demographics, 13 items related to medication and pharmacy history, and 10 items related to medication regimen complexity and refill consolidation. Chi-square analysis was used to assess the relationship between adherence and missed medication doses due to regimen complexity. Wilcoxon rank sum test was used to determine association between total number of prescribing providers and number of daily medications with various aspects of regimen complexity. 52% were required to go to the pharmacy more than once per month to keep all of their medications filled and 46% missed a day or more of medication because their medications must be refilled on different dates. Those who missed a day or more of medication because of need to refill prescriptions on different days had higher number of prescriptions (P = .03) and higher number of prescribers (P = .03). Medicaid patients had low medication adherence in the context of high regimen complexity and poor refill consolidation. This population would benefit from interventions focused on improving synchronization of medication refills.
45 CFR 1355.40 - Foster care and adoption data collection.
Code of Federal Regulations, 2011 CFR
2011-10-01
.... These are specified in Appendix E to this part. (c) Missing data standards. (1) The term “missing data... missing data. All data which are “out of range” (i.e., the response is beyond the parameters allowed for that particular data element) will also be converted to missing data. Details of the circumstances...
45 CFR 1355.40 - Foster care and adoption data collection.
Code of Federal Regulations, 2010 CFR
2010-10-01
.... These are specified in Appendix E to this part. (c) Missing data standards. (1) The term “missing data... missing data. All data which are “out of range” (i.e., the response is beyond the parameters allowed for that particular data element) will also be converted to missing data. Details of the circumstances...
Sweller, Naomi; Hayes, Brett K
2010-08-01
Three studies examined how task demands that impact on attention to typical or atypical category features shape the category representations formed through classification learning and inference learning. During training categories were learned via exemplar classification or by inferring missing exemplar features. In the latter condition inferences were made about missing typical features alone (typical feature inference) or about both missing typical and atypical features (mixed feature inference). Classification and mixed feature inference led to the incorporation of typical and atypical features into category representations, with both kinds of features influencing inferences about familiar (Experiments 1 and 2) and novel (Experiment 3) test items. Those in the typical inference condition focused primarily on typical features. Together with formal modelling, these results challenge previous accounts that have characterized inference learning as producing a focus on typical category features. The results show that two different kinds of inference learning are possible and that these are subserved by different kinds of category representations.
Daly, Justine B; Campbell, Elizabeth M; Wiggers, John H; Considine, Robyn J
2002-06-01
This study aimed to determine the prevalence of responsible hospitality policies in a group of licensed premises associated with alcohol-related harm. During March 1999, 108 licensed premises with one or more police-identified alcohol-related incidents in the previous 3 months received a visit from a police officer. A 30-item audit checklist was used to determine the responsible hospitality policies being undertaken by each premises within eight policy domains: display required signage (three items); responsible host practices to prevent intoxication and under-age drinking (five items); written policies and guidelines for responsible service (three items); discouraging inappropriate promotions (three items); safe transport (two items); responsible management issues (seven items); physical environment (three items) and entry conditions (four items). No premises were undertaking all 30 items. Eighty per cent of the premises were undertaking 20 of the 30 items. All premises were undertaking at least 17 of the items. The proportion of premises undertaking individual items ranged from 16% to 100%. Premises were less likely to report having and providing written responsible hospitality documentation to staff, using door charges and having entry/re-entry rules. Significant differences between rural and urban premises were evident for four policies. Clubs were significantly more likely than hotels to have a written responsible service of alcohol policy and to clearly display codes of dress and conditions of entry. This study provides an indication of the extent and nature of responsible hospitality policies in a sample of licensed premises that are associated with a broad range of alcohol related harms. The finding that a large majority of such premises appear to adopt responsible hospitality policies suggests a need to assess the validity and reliability of tools used in the routine assessment of such policies, and of the potential for harm from licensed premises.
Caregiving for ill dependents and its association with employee health risks and productivity.
Burton, Wayne N; Chen, Chin-Yu; Conti, Daniel J; Pransky, Glenn; Edington, Dee W
2004-10-01
This study examined the loss of productivity and health risk status associated with employees who provide care for an ill dependent. A total of 16,651 employees (23% response rate) of a major financial services company completed a confidential Health Risk Appraisal (HRA) that included an eight-item version of the Work Limitations Questionnaire and a self-report of time missed from work during the previous 2 weeks to care for an ill dependent. A total of 10.6% of the respondents reported an average of 7.7 hours absent from work during the previous 2-week period to provide care for an ill dependent. Caregiving also was associated with a significant increase in the number of health risks for the employee. As the demand for caregiving time increased, caregivers reported a significant increase in work limitations. Caregiving for an ill dependent is associated with increased absenteeism and significant work limitations while on the job. Programs and work organization that help employees balance their caregiving responsibilities for ill dependents may have a positive effect on health and productivity.
Introduction to biological complexity as a missing link in drug discovery.
Gintant, Gary A; George, Christopher H
2018-06-06
Despite a burgeoning knowledge of the intricacies and mechanisms responsible for human disease, technological advances in medicinal chemistry, and more efficient assays used for drug screening, it remains difficult to discover novel and effective pharmacologic therapies. Areas covered: By reference to the primary literature and concepts emerging from academic and industrial drug screening landscapes, the authors propose that this disconnect arises from the inability to scale and integrate responses from simpler model systems to outcomes from more complex and human-based biological systems. Expert opinion: Further collaborative efforts combining target-based and phenotypic-based screening along with systems-based pharmacology and informatics will be necessary to harness the technological breakthroughs of today to derive the novel drug candidates of tomorrow. New questions must be asked of enabling technologies-while recognizing inherent limitations-in a way that moves drug development forward. Attempts to integrate mechanistic and observational information acquired across multiple scales frequently expose the gap between our knowledge and our understanding as the level of complexity increases. We hope that the thoughts and actionable items highlighted will help to inform the directed evolution of the drug discovery process.
Those who hesitate lose: the relationship between assertiveness and response latency.
Collins, L H; Powell, J L; Oliver, P V
2000-06-01
Individuals who are low in assertiveness may take longer to sort out, process, and state their own perceptions, attitudes and priorities, which puts them at a disadvantage in getting their needs met. The reason for this may not be inhibition in social situations or cognitive ability, but a lack of clarity regarding their own attitudes, opinions, preferences, goals, and priorities. 101 undergraduate students (57% women and 43% men) completed a demographics questionnaire, the Wonderlic Personnel Test, a self-monitoring scale, the Marlowe-Crowne Social Desirability Scale, the Rosenberg Self-esteem Scale, the College Self-expression Scale, and a test of the false-consensus effect. Response latencies to questions were measured. Individuals with higher scores on the Wonderlic Personnel Test answered items more quickly but, even when cognitive ability was controlled, individuals low in assertiveness still took significantly longer to respond to questions about themselves, their opinions, and their preferences. If individuals fall behind at this early step in the process of asserting themselves, then they may be more likely to miss opportunities to be assertive.
Item Response Data Analysis Using Stata Item Response Theory Package
ERIC Educational Resources Information Center
Yang, Ji Seung; Zheng, Xiaying
2018-01-01
The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from Programme of International Student Assessment, we review the IRT package from…
Item Response Models for Local Dependence among Multiple Ratings
ERIC Educational Resources Information Center
Wang, Wen-Chung; Su, Chi-Ming; Qiu, Xue-Lan
2014-01-01
Ratings given to the same item response may have a stronger correlation than those given to different item responses, especially when raters interact with one another before giving ratings. The rater bundle model was developed to account for such local dependence by forming multiple ratings given to an item response as a bundle and assigning…
Haggerty, Jeannie L; Beaulieu, Marie-Dominique; Pineault, Raynald; Burge, Frederick; Lévesque, Jean-Frédéric; Santor, Darcy A; Bouharaoui, Fatima; Beaulieu, Christine
2011-12-01
Comprehensiveness relates both to scope of services offered and to a whole-person clinical approach. Comprehensive services are defined as "the provision, either directly or indirectly, of a full range of services to meet most patients' healthcare needs"; whole-person care is "the extent to which a provider elicits and considers the physical, emotional and social aspects of a patient's health and considers the community context in their care." Among instruments that evaluate primary healthcare, two had subscales that mapped to comprehensive services and to the community component of whole-person care: the Primary Care Assessment Tool - Short Form (PCAT-S) and the Components of Primary Care Index (CPCI, a limited measure of whole-person care). To examine how well comprehensiveness is captured in validated instruments that evaluate primary healthcare from the patient's perspective. 645 adults with at least one healthcare contact in the previous 12 months responded to six instruments that evaluate primary healthcare. Scores were normalized for descriptive comparison. Exploratory and confirmatory (structural equation modelling) factor analysis examined fit to operational definition, and item response theory analysis examined item performance on common constructs. Over one-quarter of respondents had missing responses on services offered or doctor's knowledge of the community. The subscales did not load on a single factor; comprehensive services and community orientation were examined separately. The community orientation subscales did not perform satisfactorily. The three comprehensive services subscales fit very modestly onto two factors: (1) most healthcare needs (from one provider) (CPCI Comprehensive Care, PCAT-S First-Contact Utilization) and (2) range of services (PCAT-S Comprehensive Services Available). Individual item performance revealed several problems. Measurement of comprehensiveness is problematic, making this attribute a priority for measure development. 
Range of services offered is best obtained from providers. Whole-person care is not addressed as a separate construct, but some dimensions are covered by attributes such as interpersonal communication and relational continuity.
Item response theory - A first approach
NASA Astrophysics Data System (ADS)
Nunes, Sandra; Oliveira, Teresa; Oliveira, Amílcar
2017-07-01
Item Response Theory (IRT) has become one of the most popular scoring frameworks for measurement data, frequently used in computerized adaptive testing, cognitively diagnostic assessment and test equating. According to Andrade et al. (2000), IRT can be defined as a set of mathematical models (Item Response Models - IRM) constructed to represent the probability of an individual giving the right answer to an item of a particular test. The number of Item Response Models available for measurement analysis has increased considerably in the last fifteen years due to increasing computer power and a demand for accuracy and more meaningful inferences grounded in complex data. The developments in modeling with Item Response Theory were related to developments in estimation theory, most remarkably Bayesian estimation with Markov chain Monte Carlo algorithms (Patz & Junker, 1999). The popularity of Item Response Theory has also prompted numerous overviews in books and journals, and many connections between IRT and other statistical estimation procedures, such as factor analysis and structural equation modeling, have been made repeatedly (Van der Linden & Hambleton, 1997). As stated before, Item Response Theory covers a variety of measurement models, ranging from basic one-dimensional models for dichotomously and polytomously scored items and their multidimensional analogues to models that incorporate information about cognitive sub-processes which influence the overall item response process. The aim of this work is to introduce the main concepts associated with one-dimensional models of Item Response Theory, to specify the logistic models with one, two and three parameters, to discuss some properties of these models and to present the main estimation procedures.
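The one-, two- and three-parameter logistic models named in the abstract above can be illustrated with a minimal sketch. The abstract gives no formulas, so the parameter names used here (a for discrimination, b for difficulty, c for the lower asymptote or pseudo-guessing level, theta for the latent trait) follow common textbook notation and are assumptions, as are the function names. The optional scaling constant D = 1.7 that some texts include is left out.

```python
import math


def irf_3pl(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the three-parameter logistic model.

    theta: latent trait of the respondent
    a: item discrimination (assumed notation)
    b: item difficulty (assumed notation)
    c: lower asymptote, i.e. pseudo-guessing level (assumed notation)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def irf_2pl(theta, a, b):
    """Two-parameter model: the 3PL with the lower asymptote fixed at zero."""
    return irf_3pl(theta, a=a, b=b, c=0.0)


def irf_1pl(theta, b):
    """One-parameter model: the 2PL with discrimination fixed at one."""
    return irf_3pl(theta, a=1.0, b=b, c=0.0)


if __name__ == "__main__":
    # A respondent whose trait equals the item difficulty.
    for name, p in [
        ("1PL", irf_1pl(0.0, b=0.0)),
        ("2PL", irf_2pl(0.0, a=1.5, b=0.0)),
        ("3PL", irf_3pl(0.0, a=1.5, b=0.0, c=0.2)),
    ]:
        print(f"{name}: P(correct) = {p:.3f}")
```

Written this way, each simpler model is a constrained case of the three-parameter one, which is how the nesting of the models is usually described. When theta equals b, the one- and two-parameter models give a probability of one half, while a nonzero c raises that value toward the guessing level.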
Issues of medication administration and control in Iowa schools.
Farris, Karen B; McCarthy, Ann Marie; Kelly, Michael W; Clay, Daniel; Gross, Jami N
2003-11-01
Who is responsible for medication administration at school? To answer this question, a descriptive, self-administered survey was mailed to a random sample of 850 school principals in Iowa. The eight-page, 57-item, anonymous survey was mailed first class, and a follow-up reminder post card was mailed two weeks later. Descriptive analyses were conducted, with type of respondent (principal versus school nurse), grade level, and size of school examined to explore differences. A 46.6% response rate was obtained; 97% of respondents indicated their schools had written guidelines for medication administration. Principals (41%) and school nurses (34%) reported that they have the ultimate legal responsibility for medication administration. Policies for medication administration on field trips were available in schools of 73.6% of respondents. High schools were more likely to allow self-medication than other grade levels. "Missed dose" was the most common medication error. The main reasons contributing to medication administration errors included poor communication among school, family, and healthcare providers, and the increased number of students on medication. It remains unclear who holds ultimate responsibility for medication administration in schools. Written policies typically exist for medication administration at school, but not field trips. Communicating medication changes to schools, and ensuring medications are available at school, likely can reduce medication administration errors.
Minaya, Patricia; Baumstarck, Karine; Berbis, Julie; Goncalves, Anthony; Barlesi, Fabrice; Michel, Gérard; Salas, Sébastien; Chinot, Olivier; Grob, Jean-Jacques; Seitz, Jean François; Bladou, Franck; Clement, Audrey; Mancini, Julien; Simeoni, Marie-Claude; Auquier, Pascal
2012-04-01
The study objective was to validate a specific quality of life (QoL) questionnaire for caregivers of cancer patients, the CareGiver Oncology Quality of Life questionnaire (CarGOQoL), based on the exclusive points of view of the caregivers. A 75-item questionnaire generated from content analysis of interviews with caregivers was self-completed by 837 caregivers of cancer patients. In addition to sociodemographic data and patient characteristics, self-reported questionnaires assessing QoL, burden, coping and social support were collected. Psychometric properties combined methods relying on both classical test theory and item response theory. The final 29 items selected assessed 10 dimensions: psychological well-being, burden, relationship with health care, administration and finances, coping, physical well-being, self-esteem, leisure time, social support and private life; they were isolated from principal component analysis explaining 73% of the total variance. The missing data and the floor effects were low. Some ceiling effects were found for B (34%). Cronbach's alpha coefficients ranged from 0.72 to 0.89, except private life (PL) (0.55). Unidimensionality of the scales was confirmed by Rasch analyses. Correlations with other instruments confirmed the isolated content and significant links were found with respect to patient's characteristics. Reproducibility and sensitivity to change were found satisfactory. The CarGOQoL could provide a reliable and valid measure of caregivers of cancer patients' QoL which are key-actors in the provision of health care. Copyright © 2011 Elsevier Ltd. All rights reserved.
Performance of the Swedish version of the Revised Piper Fatigue Scale.
Jakobsson, Sofie; Taft, Charles; Östlund, Ulrika; Ahlberg, Karin
2013-12-01
The Revised Piper Fatigue Scale (RPFS) is one of the most widely used instruments internationally to assess cancer-related fatigue. The aim of the present study was to evaluate selected psychometric properties of a Swedish version of the RPFS (SPFS). An earlier translation of the SPFS was further evaluated and developed. The new version was mailed to 300 patients undergoing curative radiotherapy. The internal validity was assessed using Principal Axis Factor Analysis with oblimin rotation and multitrait analysis. External validity was examined in relation to the Multidimensional Fatigue Inventory-20 (MFI-20) and in known-groups analyses. In total, 196 patients (response rate = 65%) returned evaluable questionnaires. Principal axis factoring analysis yielded three factors (74% of the variance) rather than four as in the original RPFS. Multitrait analyses confirmed the adequacy of scaling assumptions. Known-groups analyses failed to support the discriminative validity. Concurrent validity was satisfactory. The new Swedish version of the RPFS showed good acceptability, reliability, and convergent and discriminant item-scale validity. Our results converge with other international versions of the RPFS in failing to support the four-dimension conceptual model of the instrument. Hence, the suitability of the RPFS for use in international comparisons may be limited, which may also have implications for the cross-cultural validity of the newly released 12-item version of the RPFS. Further research on the Swedish version should address reasons for high missing rates for certain items in the subscale of affective meaning, further evaluation of the discriminative validity and assessment of its sensitivity in detecting changes over time. Copyright © 2013 Elsevier Ltd. All rights reserved.
A new look at patient satisfaction: learning from self-organizing maps.
Voutilainen, Ari; Kvist, Tarja; Sherwood, Paula R; Vehviläinen-Julkunen, Katri
2014-01-01
To some extent, results always depend on the methods used, and the complete picture of the phenomenon of interest can be drawn only by combining results of different data processing techniques. This emphasizes the use of a wide arsenal of methods for processing and analyzing patient satisfaction surveys. The purpose of this study was to introduce the self-organizing map (SOM) to nursing science and to illustrate the use of the SOM with patient satisfaction data. The SOM is a widely used artificial neural network suitable for clustering and exploring all kind of data sets. The study was partly a secondary analysis of data collected for the Attractive and Safe Hospital Study from four Finnish hospitals in 2008 and 2010 using the Revised Humane Caring Scale. The sample consisted of 5,283 adult patients. The SOM was used to cluster the data set according to (a) respondents and (b) questionnaire items. The SOM was also used as a preprocessor for multinomial logistic regression. An analysis of missing data was carried out to improve the data interpretation. Combining results of the two SOMs and the logistic regression revealed associations between the level of satisfaction, different components of satisfaction, and item nonresponse. The common conception that the relationship between patient satisfaction and age is positive may partly be due to positive association between the tendency of item nonresponse and age. The SOM proved to be a useful method for clustering a questionnaire data set even when the data set was low dimensional per se. Inclusion of empty responses in analyses may help to detect possible misleading noncausative relationships.
A Multidimensional Ideal Point Item Response Theory Model for Binary Data
ERIC Educational Resources Information Center
Maydeu-Olivares, Albert; Hernandez, Adolfo; McDonald, Roderick P.
2006-01-01
We introduce a multidimensional item response theory (IRT) model for binary data based on a proximity response mechanism. Under the model, a respondent at the mode of the item response function (IRF) endorses the item with probability one. The mode of the IRF is the ideal point, or in the multidimensional case, an ideal hyperplane. The model…
Code of Federal Regulations, 2012 CFR
2012-07-01
... resulting from the conversion of a bearer corpus missing callable coupons? 358.19 Section 358.19 Money and... corpus missing callable coupons? The submitting depository institution shall indemnify the United States against any loss resulting from the conversion of a bearer corpus that is missing one or more associated...
Code of Federal Regulations, 2011 CFR
2011-07-01
... resulting from the conversion of a bearer corpus missing callable coupons? 358.19 Section 358.19 Money and... corpus missing callable coupons? The submitting depository institution shall indemnify the United States against any loss resulting from the conversion of a bearer corpus that is missing one or more associated...
Code of Federal Regulations, 2010 CFR
2010-07-01
... resulting from the conversion of a bearer corpus missing callable coupons? 358.19 Section 358.19 Money and... corpus missing callable coupons? The submitting depository institution shall indemnify the United States against any loss resulting from the conversion of a bearer corpus that is missing one or more associated...
Code of Federal Regulations, 2014 CFR
2014-07-01
... resulting from the conversion of a bearer corpus missing callable coupons? 358.19 Section 358.19 Money and... corpus missing callable coupons? The submitting depository institution shall indemnify the United States against any loss resulting from the conversion of a bearer corpus that is missing one or more associated...
Code of Federal Regulations, 2013 CFR
2013-07-01
... resulting from the conversion of a bearer corpus missing callable coupons? 358.19 Section 358.19 Money and... corpus missing callable coupons? The submitting depository institution shall indemnify the United States against any loss resulting from the conversion of a bearer corpus that is missing one or more associated...
VARIABLE SELECTION FOR REGRESSION MODELS WITH MISSING DATA
Garcia, Ramon I.; Ibrahim, Joseph G.; Zhu, Hongtu
2009-01-01
We consider the variable selection problem for a class of statistical models with missing data, including missing covariate and/or response data. We investigate the smoothly clipped absolute deviation penalty (SCAD) and adaptive LASSO and propose a unified model selection and estimation procedure for use in the presence of missing data. We develop a computationally attractive algorithm for simultaneously optimizing the penalized likelihood function and estimating the penalty parameters. Particularly, we propose to use a model selection criterion, called the ICQ statistic, for selecting the penalty parameters. We show that the variable selection procedure based on ICQ automatically and consistently selects the important covariates and leads to efficient estimates with oracle properties. The methodology is very general and can be applied to numerous situations involving missing data, from covariates missing at random in arbitrary regression models to nonignorably missing longitudinal responses and/or covariates. Simulations are given to demonstrate the methodology and examine the finite sample performance of the variable selection procedures. Melanoma data from a cancer clinical trial is presented to illustrate the proposed methodology. PMID:20336190
Wu, Jia-Rong; DeWalt, Darren A; Baker, David W; Schillinger, Dean; Ruo, Bernice; Bibbins-Domingo, Kristen; Macabasco-O'Connell, Aurelia; Holmes, George M; Broucksou, Kimberly A; Erman, Brian; Hawk, Victoria; Cene, Crystal W; Jones, Christine DeLong; Pignone, Michael
2014-09-01
To determine whether a single-item self-report medication adherence question predicts hospitalisation and death in patients with heart failure. Poor medication adherence is associated with increased morbidity and mortality. Having a simple means of identifying suboptimal medication adherence could help identify at-risk patients for interventions. We performed a prospective cohort study in 592 participants with heart failure within a four-site randomised trial. Self-report medication adherence was assessed at baseline using a single-item question: 'Over the past seven days, how many times did you miss a dose of any of your heart medication?' Participants who reported no missing doses were defined as fully adherent, and those missing more than one dose were considered less than fully adherent. The primary outcome was combined all-cause hospitalisation or death over one year and the secondary endpoint was heart failure hospitalisation. Outcomes were assessed with blinded chart reviews, and heart failure outcomes were determined by a blinded adjudication committee. We used negative binomial regression to examine the relationship between medication adherence and outcomes. Fifty-two percent of participants were male, mean age was 61 years, and 31% were of New York Heart Association class III/IV at enrolment; 72% of participants reported full adherence to their heart medicine at baseline. Participants with full medication adherence had a lower rate of all-cause hospitalisation and death (0·71 events/year) compared with those with any nonadherence (0·86 events/year): adjusted-for-site incidence rate ratio was 0·83, fully adjusted incidence rate ratio 0·68. Incidence rate ratios were similar for heart failure hospitalisations. A single medication adherence question at baseline predicts hospitalisation and death over one year in heart failure patients. Medication adherence is associated with all-cause and heart failure-related hospitalisation and death in heart failure. 
It is important for clinicians to assess patients' medication adherence on a regular basis at their clinical follow-ups. © 2013 John Wiley & Sons Ltd.
A Two-Decision Model for Responses to Likert-Type Items
ERIC Educational Resources Information Center
Thissen-Roe, Anne; Thissen, David
2013-01-01
Extreme response set, the tendency to prefer the lowest or highest response option when confronted with a Likert-type response scale, can lead to misfit of item response models such as the generalized partial credit model. Recently, a series of intrinsically multidimensional item response models have been hypothesized, wherein tendency toward…
Pilot statewide study of pediatric emergency department alignment with national guidelines.
Costich, Julia F; Fallat, Mary E; Scaggs, C Morgan; Bartlett, Richard
2013-07-01
The American Academy of Pediatrics, American College of Emergency Physicians, and Emergency Nursing Association have developed consensus guidelines for pediatric emergency department policies, procedures, supplies, and equipment. Kentucky received funding from the Health Resources and Services Administration through the Emergency Medical Services for Children program to pilot test the guidelines with the state's hospitals. In addition to providing baseline data regarding institutional alignment with the guidelines, the survey supported development of grant funding to procure missing items. Survey administration was undertaken by staff and members of the Kentucky Board of Emergency Medical Services Emergency Medical Services for Children work group and faculty and staff of the University of Kentucky College of Public Health and the University of Louisville School of Medicine. Responses were solicited primarily online with repeated reminders and offers of assistance. Seventy respondents completed the survey section on supplies and equipment either online or by fax. Results identified items unavailable at 20% or more of responding facilities, primarily the smallest sizes of equipment. The survey section addressing policy and procedure received only 16 responses. Kentucky facilities were reasonably well equipped by national standards, but rural facilities and small hospitals did not stock the smallest equipment sizes because of low reported volume of pediatric emergency department cases. Thus, a centralized procurement process that gives them access to an adequate range of pediatric supplies and equipment would support capacity building for the care of children across the entire state. Grant proposals were received from 28 facilities in the first 3 months of funding availability.
Transcultural adaptation and validation of the “Hip and Knee” questionnaire into Spanish
2014-01-01
Background The purpose of the present study is to translate and validate the “Hip and Knee Outcomes Questionnaire”, developed in English, into Spanish. The ‘Hip and Knee Outcomes Questionnaire’ is a questionnaire designed to evaluate the impact on quality of life of any problem related to the human musculoskeletal system. It was developed by 10 scientific associations. Methods The questionnaire underwent a validated translation/retro-translation process. Patients undergoing primary knee arthroplasty tested the final Spanish version before surgery and six months postoperatively. Psychometric properties of feasibility, reliability, validity and sensitivity to change were assessed. Convergent validity with SF-36 and WOMAC questionnaires was evaluated. Results 316 patients were included. Feasibility: a high number of missing items in questions 3, 4 and 5 was observed. The number of patients with a missing item was 171 (51.35%) in the preoperative visit and 139 (44.0%) at the postoperative visit. Internal validity: revision of coefficients in the item-rest correlation recommended removing question 6 during the preoperative visit (coefficient <0.20). Convergent validity: coefficients of correlation with WOMAC and SF-36 scales confirm the questionnaire’s validity. Sensitivity to change: statistically significant differences were found between the mean scores of the first visit compared to the postoperative visit. Conclusion The proposed translation to Spanish of the ‘Hip and Knee Questionnaire’ is found to be reliable, valid and sensitive to changes produced in the clinical practice of patients undergoing primary knee arthroplasty. However, some changes to the completion instructions are recommended. Level of evidence: Level I. Prognostic study. PMID:24885248
Extensive validation of the pain disability index in 3 groups of patients with musculoskeletal pain.
Soer, Remko; Köke, Albère J A; Vroomen, Patrick C A J; Stegeman, Patrick; Smeets, Rob J E M; Coppes, Maarten H; Reneman, Michiel F
2013-04-20
A cross-sectional study design was used. To validate the pain disability index (PDI) extensively in 3 groups of patients with musculoskeletal pain. The PDI is a widely used and studied instrument for disability related to various pain syndromes, although there is conflicting evidence concerning factor structure, test-retest reliability, and missing items. Additionally, an official translation of the Dutch language version has never been performed. For reliability, internal consistency, factor structure, test-retest reliability and measurement error were calculated. Validity was tested with hypothesized correlations with pain intensity, kinesiophobia, Rand-36 subscales, Depression, Roland-Morris Disability Questionnaire, Quality of Life, and Work Status. Structural validity was tested with independent backward translation and approval from the original authors. One hundred seventy-eight patients with acute back pain, 425 patients with chronic low back pain and 365 with widespread pain were included. Internal consistency of the PDI was good. One factor was identified with factor analyses. Test-retest reliability was good for the PDI (intraclass correlation coefficient, 0.76). Standard error of measurement was 6.5 points and smallest detectable change was 17.9 points. The PDI showed little correlation with kinesiophobia and depression, fair correlations with pain intensity, work status, and vitality, and moderate correlations with the Rand-36 subscales and the Roland-Morris Disability Questionnaire. The PDI-Dutch language version is internally consistent as a 1-factor structure, and test-retest reliable. Missing items seem high in sexual and professional items. Using the PDI as a 2-factor questionnaire has no additional value and is unreliable.
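As a rough illustration of how the two measurement-error figures in the abstract above relate, the sketch below uses a commonly cited convention, SDC = z × √2 × SEM with z = 1.96. This formula is an assumption here; the abstract does not say how the authors computed their value, and the function name is hypothetical. Because the abstract reports a rounded SEM, the result is close to, but not identical with, the published 17.9 points.

```python
import math


def smallest_detectable_change(sem, z=1.96):
    """Smallest detectable change from a standard error of measurement.

    Assumes the common convention SDC = z * sqrt(2) * SEM, where the
    sqrt(2) accounts for measurement error in both of two assessments.
    """
    if sem < 0:
        raise ValueError("SEM must be non-negative")
    return z * math.sqrt(2.0) * sem


# Rounded SEM taken from the abstract (6.5 points).
sdc = smallest_detectable_change(6.5)
print(round(sdc, 1))
```

A change smaller than this value cannot be distinguished from measurement error at the chosen confidence level under the stated convention.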
Smith, Allan Ben; King, Madeleine; Butow, Phyllis; Olver, Ian
2013-01-01
We aimed to compare data quality from online and postal questionnaires and to evaluate the practicality of these different questionnaire modes in a cancer sample. Participants in a study investigating the psychosocial sequelae of testicular cancer could choose to complete a postal or online version of the study questionnaire. Data quality was evaluated by assessing sources of nonobservational errors such as participant nonresponse, item nonresponse and sampling bias. Time taken and number of reminders required for questionnaire return were used as indicators of practicality. Participant nonresponse was significantly higher among participants who chose the postal questionnaire. The proportion of questionnaires with missing items and the mean number of missing items did not differ significantly by mode. A significantly larger proportion of tertiary-educated participants and managers/professionals completed the online questionnaire. There were no significant differences in age, relationship status, employment status, country of birth or language spoken by completion mode. Compared with postal questionnaires, online questionnaires were returned significantly more quickly and required significantly fewer reminders. These results demonstrate that online questionnaire completion can be offered in a cancer sample without compromising data quality. In fact, data quality from online questionnaires may be superior due to lower rates of participant nonresponse. Investigators should be aware of potential sampling bias created by more highly educated participants and managers/professionals choosing to complete online questionnaires. Besides this issue, online questionnaires offer an efficient method for collecting high-quality data, with faster return and fewer reminders. Copyright © 2011 John Wiley & Sons, Ltd.
Boeschen Hospers, J Mirjam; Smits, Niels; Smits, Cas; Stam, Mariska; Terwee, Caroline B; Kramer, Sophia E
2016-04-01
We reevaluated the psychometric properties of the Amsterdam Inventory for Auditory Disability and Handicap (AIADH; Kramer, Kapteyn, Festen, & Tobi, 1995) using item response theory. Item response theory describes item functioning along an ability continuum. Cross-sectional data from 2,352 adults with and without hearing impairment, ages 18-70 years, were analyzed. They completed the AIADH in the web-based prospective cohort study "Netherlands Longitudinal Study on Hearing." A graded response model was fitted to the AIADH data. Category response curves, item information curves, and the standard error as a function of self-reported hearing ability were plotted. The graded response model showed a good fit. Item information curves were most reliable for adults who reported having hearing disability and less reliable for adults with normal hearing. The standard error plot showed that self-reported hearing ability is most reliably measured for adults reporting mild up to moderate hearing disability. This is one of the few item response theory studies on audiological self-reports. All AIADH items could be hierarchically placed on the self-reported hearing ability continuum, meaning they measure the same construct. This provides a promising basis for developing a clinically useful computerized adaptive test, where item selection adapts to the hearing ability of individuals, resulting in efficient assessment of hearing disability.
Raykov, Tenko; Marcoulides, George A
2016-04-01
The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete nature of the observed items. Two distinct observational equivalence approaches are outlined that render the item response models from corresponding classical test theory-based models, and can each be used to obtain the former from the latter models. Similarly, classical test theory models can be furnished using the reverse application of either of those approaches from corresponding item response models.
[Instrument to measure adherence in hypertensive patients: contribution of Item Response Theory].
Rodrigues, Malvina Thaís Pacheco; Moreira, Thereza Maria Magalhaes; Vasconcelos, Alexandre Meira de; Andrade, Dalton Francisco de; Silva, Daniele Braz da; Barbetta, Pedro Alberto
2013-06-01
To analyze, by means of "Item Response Theory", an instrument to measure adherence to treatment for hypertension. Analytical study with 406 hypertensive patients with associated complications seen in primary care in Fortaleza, CE, Northeastern Brazil, 2011 using "Item Response Theory". The stages were: dimensionality test, calibrating the items, processing data and creating a scale, analyzed using the graded response model. A study of the dimensionality of the instrument was conducted by analyzing the polychoric correlation matrix and full-information factor analysis. Multilog software was used to calibrate items and estimate the scores. Items relating to drug therapy are the most directly related to adherence while those relating to drug-free therapy need to be reworked because they have less psychometric information and low discrimination. The independence of items, the small number of levels in the scale and low explained variance in the adjustment of the models show the main weaknesses of the instrument analyzed. The "Item Response Theory" proved to be a relevant analysis technique because it evaluated respondents for adherence to treatment for hypertension, the level of difficulty of the items and their ability to discriminate between individuals with different levels of adherence, which generates a greater amount of information. According to the "Item Response Theory" analysis of its items, the instrument analyzed is limited in measuring adherence to hypertension treatment and needs adjustment. The proper formulation of the items is important in order to accurately measure the desired latent trait.
NASA Astrophysics Data System (ADS)
Yang, Yongchao; Nagarajaiah, Satish
2016-06-01
Randomly missing data in structural vibration response time histories often occur in structural dynamics and health monitoring. For example, structural vibration responses are often corrupted by outliers or erroneous measurements due to sensor malfunction; in wireless sensing platforms, data loss during wireless communication is a common issue. Besides, to alleviate the wireless data sampling or communication burden, certain amounts of data are often discarded during sampling or before transmission. In these and other applications, recovery of the randomly missing structural vibration responses from the available, incomplete data is essential for system identification and structural health monitoring; it is an ill-posed inverse problem, however. This paper explicitly harnesses the data structure itself (that of the structural vibration responses) to address this inverse problem. The relevant observation, empirical but often true in practice, is that typically only a few modes are active in the structural vibration responses; hence there is a sparse representation (in the frequency domain) of the single-channel data vector, or a low-rank structure (by singular value decomposition) of the multi-channel data matrix. Exploiting such prior knowledge of data structure (intra-channel sparse or inter-channel low-rank), the new theories of ℓ1-minimization sparse recovery and nuclear-norm-minimization low-rank matrix completion enable recovery of the randomly missing or corrupted structural vibration response data. The performance of these two alternatives, in terms of recovery accuracy and computational time under different data missing rates, is investigated on a few structural vibration response data sets: the seismic responses of the super high-rise Canton Tower and the structural health monitoring accelerations of a real large-scale cable-stayed bridge. Encouraging results are obtained and the applicability and limitation of the presented methods are discussed.
The Consequences of Ignoring Item Parameter Drift in Longitudinal Item Response Models
ERIC Educational Resources Information Center
Lee, Wooyeol; Cho, Sun-Joo
2017-01-01
Utilizing a longitudinal item response model, this study investigated the effect of item parameter drift (IPD) on item parameters and person scores via a Monte Carlo study. Item parameter recovery was investigated for various IPD patterns in terms of bias and root mean-square error (RMSE), and percentage of time the 95% confidence interval covered…
ERIC Educational Resources Information Center
Tay, Louis; Vermunt, Jeroen K.; Wang, Chun
2013-01-01
We evaluate the item response theory with covariates (IRT-C) procedure for assessing differential item functioning (DIF) without preknowledge of anchor items (Tay, Newman, & Vermunt, 2011). This procedure begins with a fully constrained baseline model, and candidate items are tested for uniform and/or nonuniform DIF using the Wald statistic.…
On Multidimensional Item Response Theory: A Coordinate-Free Approach. Research Report. ETS RR-07-30
ERIC Educational Resources Information Center
Antal, Tamás
2007-01-01
A coordinate-free definition of complex-structure multidimensional item response theory (MIRT) for dichotomously scored items is presented. The point of view taken emphasizes the possibilities and subtleties of understanding MIRT as a multidimensional extension of the classical unidimensional item response theory models. The main theorem of the…
ERIC Educational Resources Information Center
Missouri State Dept. of Elementary and Secondary Education, Jefferson City.
This document presents 10 released items from the Health/Physical Education Missouri Assessment Program (MAP) test given in the spring of 2000 to fifth graders. Items from the test sessions include: selected-response (multiple choice), constructed-response, and a performance event. The selected-response items consist of individual questions…
ERIC Educational Resources Information Center
Hospers, J. Mirjam Boeschen; Smits, Niels; Smits, Cas; Stam, Mariska; Terwee, Caroline B.; Kramer, Sophia E.
2016-01-01
Purpose: We reevaluated the psychometric properties of the Amsterdam Inventory for Auditory Disability and Handicap (AIADH; Kramer, Kapteyn, Festen, & Tobi, 1995) using item response theory. Item response theory describes item functioning along an ability continuum. Method: Cross-sectional data from 2,352 adults with and without hearing…
ERIC Educational Resources Information Center
Bennett, Randy Elliot; And Others
1990-01-01
The relationship of an expert-system-scored constrained free-response item type to multiple-choice and free-response items was studied using data for 614 students on the College Board's Advanced Placement Computer Science (APCS) Examination. Implications for testing and the APCS test are discussed. (SLD)
Kaambwa, Billingsley; Bryan, Stirling; Billingham, Lucinda
2012-06-27
Missing data is a common statistical problem in healthcare datasets from populations of older people. Some argue that arbitrarily assuming the mechanism responsible for the missingness and therefore the method for dealing with this missingness is not the best option, but is this always true? This paper explores what happens when extra information that suggests that a particular mechanism is responsible for missing data is disregarded and methods for dealing with the missing data are chosen arbitrarily. Regression models based on 2,533 intermediate care (IC) patients from the largest evaluation of IC done and published in the UK to date were used to explain variation in costs, EQ-5D and Barthel index. Three methods for dealing with missingness were utilised, each assuming a different mechanism as being responsible for the missing data: complete case analysis (assuming missing completely at random, MCAR), multiple imputation (assuming missing at random, MAR) and Heckman selection model (assuming missing not at random, MNAR). Differences in results were gauged by examining the signs of coefficients as well as the sizes of both coefficients and associated standard errors. Extra information strongly suggested that missing cost data were MCAR. The results show that MCAR and MAR-based methods yielded similar results with sizes of most coefficients and standard errors differing by less than 3.4% while those based on MNAR-methods were statistically different (up to 730% bigger). Significant variables in all regression models also had the same direction of influence on costs. All three mechanisms of missingness were shown to be potential causes of the missing EQ-5D and Barthel data. The method chosen to deal with missing data did not seem to have any significant effect on the results for these data as they led to broadly similar conclusions with sizes of coefficients and standard errors differing by less than 54% and 322%, respectively.
Arbitrary selection of methods to deal with missing data should be avoided. Using extra information gathered during the data collection exercise about the cause of missingness to guide this selection would be more appropriate.
Perceived freedom-responsibility covariation among Cypriot adolescents.
Frangou, Georgia; Wilkerson, Keith; McGahan, Joseph R
2008-04-01
Participants were 67 Cypriot adolescents who responded to propositions regarding positive, negative, and noncontingent relations between freedom and responsibility. The authors framed items so that half dealt with freedom given responsibility, and the other half dealt with responsibility given freedom. Results indicated participants were more likely to endorse positive-contingency items than they were negative and noncontingency items when items were framed around freedom given responsibility. However, when items were framed around responsibility given freedom, no such differences emerged. The authors discuss results relative to cultural and sociopolitical differences and similarities between children in Cyprus and participants in the United States and implications concerning the present study and previous studies regarding these constructs.
Wahls, Terry; Haugen, Thomas; Cram, Peter
2007-08-01
Missed results can cause needless treatment delays. However, there is little data about the magnitude of this problem and the systems that clinics use to manage test results. Surveys about potential problems related to test results management were developed and administered to clinical staff in a regional Veterans Administration (VA) health care network. The provider survey, conducted four times between May 2005 and October 2006, sampling VA staff physicians, physician assistants, nurse practitioners, and internal medicine trainees, asked questions about the frequency of missed results and diagnosis or treatment delays seen in the antecedent two weeks in their clinics, or if a trainee, the antecedent month. Clinical staff survey response rate was 39% (143 of 370), with 40% using standard operating procedures to manage test results. Forty-four percent routinely reported all results to patients. The provider survey response rate was 50% (441 of 884) overall, with responses often (37% overall; range 29% to 46%) indicating they had seen patients with diagnosis or treatment delays attributed to a missed result; 15% reported two or more such encounters. Even in an integrated health system with an advanced electronic medical record, missed test results and associated diagnosis or treatment delays are common. Additional study and measures of missed results and associated treatment delays are needed.
Wærsted, Morten; Børnick, Taran Svenssen; Twisk, Jos W R; Veiersted, Kaj Bo
2018-02-13
Missing data in longitudinal studies may constitute a source of bias. We suggest three simple missing data indicators for the initial phase of getting an overview of the missingness pattern in a dataset with a high number of follow-ups. Possible use of the indicators is exemplified in two datasets allowing wave nonresponse: a Norwegian dataset of 420 subjects examined at 21 occasions during 6.5 years and a Dutch dataset of 350 subjects with ten repeated measurements over a period of 35 years. The indicators Last response (the timing of last response), Retention (the number of responded follow-ups), and Dispersion (the evenness of the distribution of responses) are introduced. The proposed indicators reveal different aspects of the missing data pattern, and may give the researcher a better insight into the pattern of missingness in a study with several follow-ups, as a starting point for analyzing possible bias. Although the indicators are positively correlated to each other, potential predictors of missingness can have a different relationship with different indicators, leading to a better understanding of the missing data mechanism in longitudinal studies. These indicators may be useful descriptive tools when starting to look into a longitudinal dataset with many follow-ups.
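A minimal sketch of two of the indicators named in the abstract above, assuming each subject's follow-up history is stored as a list of booleans (True where the subject responded at that wave). The function names and the data layout are hypothetical. Dispersion is left out because the abstract describes it only as "the evenness of the distribution of responses" and gives no formula.

```python
def last_response(responded):
    """Timing of the last response, as a 1-based wave number (0 if none)."""
    for wave in range(len(responded), 0, -1):
        if responded[wave - 1]:
            return wave
    return 0


def retention(responded):
    """Number of follow-up waves at which the subject responded."""
    return sum(1 for r in responded if r)


# Hypothetical subject followed over six waves, with wave nonresponse.
history = [True, True, False, True, False, False]
print(last_response(history), retention(history))
```

Two subjects can share the same Retention yet differ in Last response (early dropout versus intermittent nonresponse), which is why the indicators can relate differently to predictors of missingness.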
ERIC Educational Resources Information Center
DeMars, Christine E.
2012-01-01
In structural equation modeling software, either limited-information (bivariate proportions) or full-information item parameter estimation routines could be used for the 2-parameter item response theory (IRT) model. Limited-information methods assume the continuous variable underlying an item response is normally distributed. For skewed and…
Examination of Different Item Response Theory Models on Tests Composed of Testlets
ERIC Educational Resources Information Center
Kogar, Esin Yilmaz; Kelecioglu, Hülya
2017-01-01
The purpose of this research is to first estimate the item and ability parameters and the standard error values related to those parameters obtained from Unidimensional Item Response Theory (UIRT), bifactor (BIF) and Testlet Response Theory models (TRT) in the tests including testlets, when the number of testlets, number of independent items, and…
A Semiparametric Model for Jointly Analyzing Response Times and Accuracy in Computerized Testing
ERIC Educational Resources Information Center
Wang, Chun; Fan, Zhewen; Chang, Hua-Hua; Douglas, Jeffrey A.
2013-01-01
The item response times (RTs) collected from computerized testing represent an underutilized type of information about items and examinees. In addition to knowing the examinees' responses to each item, we can investigate the amount of time examinees spend on each item. Current models for RTs mainly focus on parametric models, which have the…
ERIC Educational Resources Information Center
Missouri State Dept. of Elementary and Secondary Education, Jefferson City.
This document presents 10 released items from the Health/Physical Education Missouri Assessment Program (MAP) test given in the spring of 2000 to ninth graders. Items from the test sessions include: selected-response (multiple choice), constructed-response, and a performance event. The selected-response items consist of individual questions…
Bi-dimensional acculturation and cultural response set in CES-D among Korean immigrants
Kim, Eunjung; Seo, Kumin; Cain, Kevin C.
2017-01-01
This study examined a cultural response set to positive affect items and depressive symptom items in CES-D among 172 Korean immigrants. Bi-dimensional acculturation approach, which considers maintenance of Korean Orientation and adoption of American Orientation, was utilized. As Korean immigrants increased American Orientation, they tended to score higher on positive affect items, while no changes occurred in depressive symptom items. Korean Orientation was not related to either positive affect items or depressive symptom items. Korean immigrants have response bias toward positive affect items in CES-D, which decreases as they adopt more American Orientation. CES-D lacks cultural equivalence for Korean immigrants. PMID:20701420
Vegetable parenting practices scale. Item response modeling analyses
Chen, Tzu-An; O’Connor, Teresia; Hughes, Sheryl; Beltran, Alicia; Baranowski, Janice; Diep, Cassandra; Baranowski, Tom
2015-01-01
Objective To evaluate the psychometric properties of a vegetable parenting practices scale using multidimensional polytomous item response modeling which enables assessing item fit to latent variables and the distributional characteristics of the items in comparison to the respondents. We also tested for differences in the ways items function (called differential item functioning) across child’s gender, ethnicity, age, and household income groups. Method Parents of 3–5 year old children completed a self-reported vegetable parenting practices scale online. Vegetable parenting practices consisted of 14 effective vegetable parenting practices and 12 ineffective vegetable parenting practices items, each with three subscales (responsiveness, structure, and control). Multidimensional polytomous item response modeling was conducted separately on effective vegetable parenting practices and ineffective vegetable parenting practices. Results One effective vegetable parenting practice item did not fit the model well in the full sample or across demographic groups, and another was a misfit in differential item functioning analyses across child’s gender. Significant differential item functioning was detected across children’s age and ethnicity groups, and more among effective vegetable parenting practices than ineffective vegetable parenting practices items. Wright maps showed items only covered parts of the latent trait distribution. The harder- and easier-to-respond ends of the construct were not covered by items for effective vegetable parenting practices and ineffective vegetable parenting practices, respectively. Conclusions Several effective vegetable parenting practices and ineffective vegetable parenting practices scale items functioned differently on the basis of child’s demographic characteristics; therefore, researchers should use these vegetable parenting practices scales with caution.
Item response modeling should be incorporated in analyses of parenting practice questionnaires to better assess differences across demographic characteristics. PMID:25895694
Validation of an instrument to assess visual ability in children with visual impairment in China.
Huang, Jinhai; Khadka, Jyoti; Gao, Rongrong; Zhang, Sifang; Dong, Wenpeng; Bao, Fangjun; Chen, Haisi; Wang, Qinmei; Chen, Hao; Pesudovs, Konrad
2017-04-01
To validate a visual ability instrument for school-aged children with visual impairment in China by translating, culturally adapting and Rasch scaling the Cardiff Visual Ability Questionnaire for Children (CVAQC). The 25-item CVAQC was translated into Mandarin using a standard protocol. The translated version (CVAQC-CN) was subjected to cognitive testing to ensure a proper cultural adaptation of its content. Then, the CVAQC-CN was interviewer-administered to 114 school-aged children and young people with visual impairment. Rasch analysis was carried out to assess its psychometric properties. The correlations between the CVAQC-CN visual ability scores and clinical measures of vision (visual acuity, VA, and contrast sensitivity, CS) were assessed using Spearman's r. Based on the cultural adaptation exercise, cognitive testing, missing data and Rasch metrics-based iterative item removal, three items were removed from the original 25. The 22-item CVAQC-CN demonstrated excellent measurement precision (person separation index, 3.08), content validity (item separation, 10.09) and item reliability (0.99). Moreover, the CVAQC-CN was unidimensional and had no item bias. The person-item map indicated good targeting of item difficulty to person ability. The CVAQC-CN had moderate correlations with CS (-0.53, p<0.00001) and VA (0.726, p<0.00001), indicating its validity. The 22-item CVAQC-CN is a psychometrically robust and valid instrument to measure visual ability in children with visual impairment in China. The instrument can be used as a clinical and research outcome measure to assess the change in visual ability after low vision rehabilitation intervention. Published by the BMJ Publishing Group Limited.
A Model-Free Diagnostic for Single-Peakedness of Item Responses Using Ordered Conditional Means.
Polak, Marike; de Rooij, Mark; Heiser, Willem J
2012-09-01
In this article we propose a model-free diagnostic for single-peakedness (unimodality) of item responses. Presuming a unidimensional unfolding scale and a given item ordering, we approximate item response functions of all items based on ordered conditional means (OCM). The proposed OCM methodology is based on Thurstone & Chave's (1929) criterion of irrelevance, which is a graphical, exploratory method for evaluating the "relevance" of dichotomous attitude items. We generalized this criterion to graded response items and quantified the relevance by fitting a unimodal smoother. The resulting goodness-of-fit was used to determine item fit and aggregated scale fit. Based on a simulation procedure, cutoff values were proposed for the measures of item fit. These cutoff values showed high power rates and acceptable Type I error rates. We present 2 applications of the OCM method. First, we apply the OCM method to personality data from the Developmental Profile; second, we analyze attitude data collected by Roberts and Laughlin (1996) concerning opinions of capital punishment.
Item response theory analysis of the Pain Self-Efficacy Questionnaire.
Costa, Daniel S J; Asghari, Ali; Nicholas, Michael K
2017-01-01
The Pain Self-Efficacy Questionnaire (PSEQ) is a 10-item instrument designed to assess the extent to which a person in pain believes s/he is able to accomplish various activities despite their pain. There is strong evidence for the validity and reliability of both the full-length PSEQ and a 2-item version. The purpose of this study is to further examine the properties of the PSEQ using an item response theory (IRT) approach. We used the two-parameter graded response model to examine the category probability curves, and location and discrimination parameters of the 10 PSEQ items. In item response theory, responses to a set of items are assumed to be probabilistically determined by a latent (unobserved) variable. In the graded-response model specifically, item response threshold (the value of the latent variable for which adjacent response categories are equally likely) and discrimination parameters are estimated for each item. Participants were 1511 mixed, chronic pain patients attending for initial assessment at a tertiary pain management centre. All items except item 7 ('I can cope with my pain without medication') performed well in IRT analysis, and the category probability curves suggested that participants used the 7-point response scale consistently. Items 6 ('I can still do many of the things I enjoy doing, such as hobbies or leisure activity, despite pain'), 8 ('I can still accomplish most of my goals in life, despite the pain') and 9 ('I can live a normal lifestyle, despite the pain') captured higher levels of the latent variable with greater precision. The results from this IRT analysis add to the body of evidence based on classical test theory illustrating the strong psychometric properties of the PSEQ. Despite the relatively poor performance of Item 7, its clinical utility warrants its retention in the questionnaire. The strong psychometric properties of the PSEQ support its use as an effective tool for assessing self-efficacy in people with pain. 
Copyright © 2016 Scandinavian Association for the Study of Pain. Published by Elsevier B.V. All rights reserved.
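To make the threshold and discrimination parameters mentioned in the abstract above concrete, the following is a minimal sketch of category probabilities for a single item under a logistic graded response model. It assumes the usual cumulative formulation, in which threshold b_k is the latent value at which responding in category k or higher has probability 0.5. The function name and all parameter values are hypothetical and are not estimates from the study.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    theta:      latent trait value of the respondent
    a:          item discrimination
    thresholds: strictly increasing thresholds b_1 < ... < b_{K-1}
    Returns a list of K probabilities, for categories 0 .. K-1.
    """
    if any(lo >= hi for lo, hi in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")

    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Cumulative probabilities P(X >= k), padded with 1 (k = 0) and 0 (k = K).
    cumulative = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    # Each category probability is the difference of adjacent cumulatives.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical item with a 7-point response scale, hence six thresholds.
probs = grm_category_probs(theta=0.0, a=1.5,
                           thresholds=[-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
print([round(p, 3) for p in probs])
```

A larger discrimination value makes the cumulative curves steeper, so the item separates respondents near its thresholds more sharply; this is the sense in which some items capture particular regions of the latent variable with greater precision.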
Rice, Stephen; McCarley, Jason S
2011-12-01
Automated diagnostic aids prone to false alarms often produce poorer human performance in signal detection tasks than equally reliable miss-prone aids. However, it is not yet clear whether this is attributable to differences in the perceptual salience of the automated aids' misses and false alarms or is the result of inherent differences in operators' cognitive responses to different forms of automation error. The present experiments therefore examined the effects of automation false alarms and misses on human performance under conditions in which the different forms of error were matched in their perceptual characteristics. Young adult participants performed a simulated baggage x-ray screening task while assisted by an automated diagnostic aid. Judgments from the aid were rendered as text messages presented at the onset of each trial, and every trial was followed by a second text message providing response feedback. Thus, misses and false alarms from the aid were matched for their perceptual salience. Experiment 1 found that even under these conditions, false alarms from the aid produced poorer human performance and engendered lower automation use than misses from the aid. Experiment 2, however, found that the asymmetry between misses and false alarms was reduced when the aid's false alarms were framed as neutral messages rather than explicit misjudgments. Results suggest that automation false alarms and misses differ in their inherent cognitive salience and imply that changes in diagnosis framing may allow designers to encourage better use of imperfectly reliable automated aids.
Cappelleri, Joseph C.; Lundy, J. Jason; Hays, Ron D.
2014-01-01
Introduction: The U.S. Food and Drug Administration’s patient-reported outcome (PRO) guidance document defines content validity as “the extent to which the instrument measures the concept of interest” (FDA, 2009, p. 12). “Construct validity is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity” (Strauss & Smith, 2009, p. 7). Hence both qualitative and quantitative information are essential in evaluating the validity of measures. Methods: We review classical test theory and item response theory approaches to evaluating PRO measures, including frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which hypothesized “difficulty” (severity) order of items is represented by observed responses. Conclusion: Classical test theory and item response theory can be useful in providing a quantitative assessment of items and scales during the content validity phase of patient-reported outcome measures. Depending on the particular type of measure and the specific circumstances, either one or both approaches should be considered to help maximize the content validity of PRO measures. PMID:24811753
Item Response Theory Using Hierarchical Generalized Linear Models
ERIC Educational Resources Information Center
Ravand, Hamdollah
2015-01-01
Multilevel models (MLMs) are flexible in that they can be employed to obtain item and person parameters, test for differential item functioning (DIF) and capture both local item and person dependence. Papers on the MLM analysis of item response data have focused mostly on theoretical issues where applications have been add-ons to simulation…
Item Response Theory Equating Using Bayesian Informative Priors.
ERIC Educational Resources Information Center
de la Torre, Jimmy; Patz, Richard J.
This paper seeks to extend the application of Markov chain Monte Carlo (MCMC) methods in item response theory (IRT) to include the estimation of equating relationships along with the estimation of test item parameters. A method is proposed that incorporates estimation of the equating relationship in the item calibration phase. Item parameters from…
Instrument Formatting with Computer Data Entry in Mind.
ERIC Educational Resources Information Center
Boser, Judith A.; And Others
Different formats for four types of research items were studied for ease of computer data entry. The types were: (1) numeric response items; (2) individual multiple choice items; (3) multiple choice items with the same response items; and (4) card column indicator placement. Each of the 13 experienced staff members of a major university's Data…
Jeon, Yun-Hee; Liu, Zhixin; Li, Zhicheng; Low, Lee-Fay; Chenoweth, Lynn; O'Connor, Daniel; Beattie, Elizabeth; Davison, Tanya E; Brodaty, Henry
2016-11-01
To develop and validate a short version of the Cornell Scale for Depression in Dementia (CSDD-19) for routine detection of depression in nursing homes. Australian nursing homes. A series of cross-sectional studies were conducted involving: 1) descriptive analysis of pooled data from five nursing home studies that used the CSDD-19 (N = 671) to identify patterns of responses and missing data on individual CSDD items; 2) analysis of four of the five studies (N = 556) to assess CSDD-19 for unidimensionality, item fit, and differential item functioning using Rasch modeling to develop a shorter version, the CSDD-4; 3) validation of the CSDD-4 against the DSM-IV using the fifth study of 115 residents and through expert consultations; and 4) evaluation of the clinical utility of CSDD-4 using an independent cohort of 92 nursing home residents. Four items from the original CSDD-19 were found to be most suitable for depression screening: anxiety, sadness, lack of reactivity to pleasant events, and irritability. The CSDD-4 highly correlated with the original scale (N = 474, r = 0.831, p < 0.001), with acceptable internal consistency (Cronbach's alpha = 0.70). At the cutoff score of less than 2, sensitivity and specificity of CSDD-4 were 81% and 51%, respectively, for the independent cohort (N = 92), of whom 50% had dementia. The CSDD-4 had an area under the curve (AUC) of 0.73 (z = 3.47, p < 0.001), which was compatible with the CSDD-19 (AUC = 0.69, z = 2.89, p < 0.01). The CSDD-4 is valid for routine screening of depression in nursing homes. Its adoption is feasible and practical for nursing home staff, and may facilitate more comprehensive assessment and management of depression in nursing home residents. Copyright © 2016 American Association for Geriatric Psychiatry. Published by Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Cramer, Jennifer Suzanne
2010-01-01
A great deal of scholarly research has addressed the issue of dialect mapping in the United States. These studies, usually based on phonetic or lexical items, aim to present an overall picture of the dialect landscape. But what is often missing in these types of projects is an attention to the borders of a dialect region and to what kinds of…
Referential processing: reciprocity and correlates of naming and imaging.
Paivio, A; Clark, J M; Digdon, N; Bons, T
1989-03-01
To shed light on the referential processes that underlie mental translation between representations of objects and words, we studied the reciprocity and determinants of naming and imaging reaction times (RT). Ninety-six subjects pressed a key when they had covertly named 248 pictures or imaged to their names. Mean naming and imagery RTs for each item were correlated with one another, and with properties of names, images, and their interconnections suggested by prior research and dual coding theory. Imagery RTs correlated .56 (df = 246) with manual naming RTs and .58 with voicekey naming RTs from prior studies. A factor analysis of the RTs and of 31 item characteristics revealed 7 dimensions. Imagery and naming RTs loaded on a common referential factor that included variables related to both directions of processing (e.g., missing names and missing images). Naming RTs also loaded on a nonverbal-to-verbal factor that included such variables as number of different names, whereas imagery RTs loaded on a verbal-to-nonverbal factor that included such variables as rated consistency of imagery. The other factors were verbal familiarity, verbal complexity, nonverbal familiarity, and nonverbal complexity. The findings confirm the reciprocity of imaging and naming, and their relation to constructs associated with distinct phases of referential processing.
Consequences of Ignoring Guessing when Estimating the Latent Density in Item Response Theory
ERIC Educational Resources Information Center
Woods, Carol M.
2008-01-01
In Ramsay-curve item response theory (RC-IRT), the latent variable distribution is estimated simultaneously with the item parameters. In extant Monte Carlo evaluations of RC-IRT, the item response function (IRF) used to fit the data is the same one used to generate the data. The present simulation study examines RC-IRT when the IRF is imperfectly…
ERIC Educational Resources Information Center
Jones, Douglas H.
The progress of modern mental test theory depends very much on the techniques of maximum likelihood estimation, and many popular applications make use of likelihoods induced by logistic item response models. While, in reality, item responses are nonreplicate within a single examinee and the logistic models are only ideal, practitioners make…
Limits on Log Cross-Product Ratios for Item Response Models. Research Report. ETS RR-06-10
ERIC Educational Resources Information Center
Haberman, Shelby J.; Holland, Paul W.; Sinharay, Sandip
2006-01-01
Bounds are established for log cross-product ratios (log odds ratios) involving pairs of items for item response models. First, expressions for bounds on log cross-product ratios are provided for unidimensional item response models in general. Then, explicit bounds are obtained for the Rasch model and the two-parameter logistic (2PL) model.…
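For readers unfamiliar with the quantity being bounded in the report above, this is a minimal sketch of a sample log cross-product ratio (log odds ratio) for a pair of dichotomous items. The function name and the 2x2 counts are hypothetical. The sketch shows only the statistic itself, not the bounds derived in the report.

```python
import math

def log_cross_product_ratio(n11, n10, n01, n00):
    """Log odds ratio for the 2x2 table of joint responses to two items.

    n11: both correct, n10: only first correct,
    n01: only second correct, n00: both incorrect.
    All four cells must be positive for the ratio to be defined.
    """
    if min(n11, n10, n01, n00) <= 0:
        raise ValueError("all four cell counts must be positive")
    return math.log((n11 * n00) / (n10 * n01))

# Hypothetical counts for one item pair.
lor = log_cross_product_ratio(n11=40, n10=10, n01=20, n00=30)
```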
Petscher, Yaacov; Mitchell, Alison M; Foorman, Barbara R
2015-01-01
A growing body of literature suggests that response latency, the amount of time it takes an individual to respond to an item, may be an important factor to consider when using assessment data to estimate the ability of an individual. Considering that tests of passage and list fluency are being adapted to a computer administration format, it is possible that accounting for individual differences in response times may be an increasingly feasible option to strengthen the precision of individual scores. The present research evaluated the differential reliability of scores when using classical test theory and item response theory as compared to a conditional item response model which includes response time as an item parameter. Results indicated that the precision of student ability scores increased by an average of 5 % when using the conditional item response model, with greater improvements for those who were average or high ability. Implications for measurement models of speeded assessments are discussed.
Petscher, Yaacov; Mitchell, Alison M.; Foorman, Barbara R.
2016-01-01
A growing body of literature suggests that response latency, the amount of time it takes an individual to respond to an item, may be an important factor to consider when using assessment data to estimate the ability of an individual. Considering that tests of passage and list fluency are being adapted to a computer administration format, it is possible that accounting for individual differences in response times may be an increasingly feasible option to strengthen the precision of individual scores. The present research evaluated the differential reliability of scores when using classical test theory and item response theory as compared to a conditional item response model which includes response time as an item parameter. Results indicated that the precision of student ability scores increased by an average of 5 % when using the conditional item response model, with greater improvements for those who were average or high ability. Implications for measurement models of speeded assessments are discussed. PMID:27721568
Modeling missing data in knowledge space theory.
de Chiusole, Debora; Stefanutti, Luca; Anselmi, Pasquale; Robusto, Egidio
2015-12-01
Missing data are a well-known issue in statistical inference, because some responses may be missing, even when data are collected carefully. The problem that arises in these cases is how to deal with missing data. In this article, the missingness is analyzed in knowledge space theory, and in particular when the basic local independence model (BLIM) is applied to the data. Two extensions of the BLIM to missing data are proposed: The former, called ignorable missing BLIM (IMBLIM), assumes that missing data are missing completely at random; the latter, called missing BLIM (MissBLIM), introduces specific dependencies of the missing data on the knowledge states, thus assuming that the missing data are missing not at random. The IMBLIM and the MissBLIM modeled the missingness in a satisfactory way, in both a simulation study and an empirical application, depending on the process that generates the missingness: If the missing data-generating process is of type missing completely at random, then either IMBLIM or MissBLIM provides adequate fit to the data. However, if the pattern of missingness is functionally dependent upon unobservable features of the data (e.g., missing answers are more likely to be wrong), then only a correctly specified model of the missingness distribution provides an adequate fit to the data. (c) 2015 APA, all rights reserved.
Orique, Sabrina B; Patty, Christopher M; Sandidge, Alisha; Camarena, Emma; Newsom, Rose
2017-12-01
The aim of this article is to describe the use of Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) data to measure missed nursing care and construct a missed nursing care metric. Missed nursing care varies widely within and between US hospitals. Missed nursing care can be measured utilizing the HCAHPS data. This cross-sectional study used HCAHPS data to measure missed care. This analysis includes HCAHPS data from 1125 acute care patients discharged between January 2014 and December 2014. A missed care index was computed by dividing the total number of missed care occurrences as reported by the patient into the total number of survey responses that did not indicate missed care. The computed missed care index for the organization was 0.6 with individual unit indices ranging from 0.2 to 1.4. Our methods utilize existing data to quantify missed nursing care. Based on the assessment, nursing leaders can develop interventions to decrease the incidence of missed care. Further data should be gathered to validate the incidence of missed care from HCAHPS reports.
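The wording of the index definition in the abstract above is ambiguous about which count is the numerator. The sketch below assumes the index is the number of responses indicating missed care divided by the number of responses that did not indicate missed care. That reading, the function name, and the counts are all assumptions and are not taken from the study.

```python
def missed_care_index(missed, not_missed):
    """Ratio of missed-care reports to responses without missed care.

    Assumed reading of the abstract: index = missed / not_missed.
    """
    if not_missed <= 0:
        raise ValueError("not_missed must be positive")
    return missed / not_missed

# Hypothetical counts chosen only to illustrate the arithmetic.
index = missed_care_index(missed=300, not_missed=500)
```

Under this reading, an index above 1 means that reports of missed care outnumber responses without it.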
Children's missed healthcare appointments: professional and organisational responses.
Appleton, Jane; Powell, Catherine; Coombes, Lindsey
2016-09-01
This National Society for the Prevention of Cruelty to Children (NSPCC) funded UK study sought to examine organisational and professional responses to children's missed healthcare appointments. The study comprised two parts: phase I was a web-based scoping and systematic analysis of UK National Health Service healthcare organisations' internal policies on missed appointments. Phase II involved a case study of how missed appointments were managed within one hospital trust, including interviews with hospital-based staff, review of organisational data and examination of policies and 'systems' in place. Policies accessed were of variable quality when benchmarked against a predetermined set of evidence-based standards. Additional material (eg, board minutes) gleaned through the searches found an apparent disconnect between nationally determined safeguarding requirements and strategies to reduce the cost pressures arising from missed appointments. Findings from the case study included the continuing use of the adult-centric term 'did not attend' (DNA), the challenges that may be inherent in attending appointments (with concomitant sympathy for parents) and a need to further explore general practitioner responses to DNA notifications, particularly given the acknowledged association between missed appointments and child maltreatment. The web-based scoping exercise yielded a small number of organisational policies. These were of variable quality when rated against predetermined standards. Other material gathered through the search strategy found evidence that 'missed appointment' strategies aimed at reducing costs did not always acknowledge the discrete needs of children. The case study findings contribute to an understanding of the complexities and challenges of responding to a missed appointment and the importance of taking a child-centred approach. Published by the BMJ Publishing Group Limited. 
Evidence-based practice: extending the search to find material for the systematic review
Helmer, Diane; Savoie, Isabelle; Green, Carolyn; Kazanjian, Arminée
2001-01-01
Background: Cochrane-style systematic reviews increasingly require the participation of librarians. Guidelines on the appropriate search strategy to use for systematic reviews have been proposed. However, research evidence supporting these recommendations is limited. Objective: This study investigates the effectiveness of various systematic search methods used to uncover randomized controlled trials (RCTs) for systematic reviews. Effectiveness is defined as the proportion of relevant material uncovered for the systematic review using extended systematic review search methods. The following extended systematic search methods are evaluated: searching subject-specific or specialized databases (including trial registries), hand searching, scanning reference lists, and communicating personally. Methods: Two systematic review projects were prospectively monitored regarding the method used to identify items as well as the type of items retrieved. The proportion of RCTs identified by each systematic search method was calculated. Results: The extended systematic search methods uncovered 29.2% of all items retrieved for the systematic reviews. The search of specialized databases was the most effective method, followed by scanning of reference lists, communicating personally, and hand searching. Although the number of items identified through hand searching was small, these unique items would otherwise have been missed. Conclusions: Extended systematic search methods are effective tools for uncovering material for the systematic review. The quality of the items uncovered has yet to be assessed and will be key in evaluating the value of the systematic search methods. PMID:11837256
Danieli, Yael; Norris, Fran H; Lindert, Jutta; Paisner, Vera; Engdahl, Brian; Richter, Julia
2015-09-01
A comprehensive valid behavioral measure for assessing multidimensional multigenerational impacts of massive trauma has been missing thus far. We describe the development of the Posttrauma Adaptational Styles questionnaire (Part I of the three-part Danieli Inventory of Multigenerational Legacies of Trauma), a self-report questionnaire of Holocaust survivors' children's perceptions of each parent and their own upbringing (60 items per parent). The items were based on literature and cognitive interviewing of 18 survivors' offspring. A web-based convenience sample survey was designed in English and Hebrew and completed by 482 adult children (M age = 59; 67% women) of Holocaust survivors. Exploratory factor analyses were conducted by using maximum likelihood extraction with Geomin rotation to examine the factor structure of the original 70 items for each parent. Conducted hierarchically, the analysis yielded three higher-order factors reflecting intensities of victim, numb, and fighter styles. The 30-item Victim Style Scale (α = .92-.93) and 18-item Numb Style Scale (α = .89) had excellent internal consistency; the consistency of the 12-item Fighter Style Scale (α = .69-.70) was more modest. English-Hebrew analyses suggested good-to-excellent congruence in factor structure (φ = .87-.99). Further research is needed to evaluate the validity of the measure in other samples and populations. Copyright © 2015 Elsevier Ltd. All rights reserved.
Ackerman, Robert A; Donnellan, M Brent; Roberts, Brent W; Fraley, R Chris
2016-04-01
The Narcissistic Personality Inventory (NPI) is currently the most widely used measure of narcissism in social/personality psychology. It is also relatively unique because it uses a forced-choice response format. We investigate the consequences of changing the NPI's response format for item meaning and factor structure. Participants were randomly assigned to one of three conditions: 40 forced-choice items (n = 2,754), 80 single-stimulus dichotomous items (i.e., separate true/false responses for each item; n = 2,275), or 80 single-stimulus rating scale items (i.e., 5-point Likert-type response scales for each item; n = 2,156). Analyses suggested that the "narcissistic" and "nonnarcissistic" response options from the Entitlement and Superiority subscales refer to independent personality dimensions rather than high and low levels of the same attribute. In addition, factor analyses revealed that although the Leadership dimension was evident across formats, dimensions with entitlement and superiority were not as robust. Implications for continued use of the NPI are discussed. © The Author(s) 2015.
Asymptotic Standard Errors for Item Response Theory True Score Equating of Polytomous Items
ERIC Educational Resources Information Center
Cher Wong, Cheow
2015-01-01
Building on previous works by Lord and Ogasawara for dichotomous items, this article proposes an approach to derive the asymptotic standard errors of item response theory true score equating involving polytomous items, for equivalent and nonequivalent groups of examinees. This analytical approach could be used in place of empirical methods like…
Evaluation of Northwest University, Kano Post-UTME Test Items Using Item Response Theory
ERIC Educational Resources Information Center
Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi
2016-01-01
High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items was…
ERIC Educational Resources Information Center
Sengul Avsar, Asiye; Tavsancil, Ezel
2017-01-01
This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three samples sizes (100, 250 and 500)--were generated by conducting 20…
Rasch Measurement and Item Banking: Theory and Practice.
ERIC Educational Resources Information Center
Nakamura, Yuji
The Rasch Model is a one-parameter item response theory model which states that the probability of a correct response on a test is a function of the difficulty of the item and the ability of the candidate. Item banking is useful for language testing. The Rasch Model provides estimates of item difficulties that are meaningful,…
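The relationship described in the abstract above can be sketched with the usual logistic form of the Rasch model, in which the probability of a correct response depends only on the difference between ability and difficulty (both on the logit scale). The function name and the example values are illustrative.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# A candidate whose ability equals the item difficulty has a 50% chance.
p_equal = rasch_probability(ability=0.0, difficulty=0.0)
# An easier item (lower difficulty) raises the probability.
p_easy = rasch_probability(ability=0.0, difficulty=-1.0)
```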
Item Response Theory Models for Wording Effects in Mixed-Format Scales
ERIC Educational Resources Information Center
Wang, Wen-Chung; Chen, Hui-Fang; Jin, Kuan-Yu
2015-01-01
Many scales contain both positively and negatively worded items. Reverse recoding of negatively worded items might not be enough for them to function as positively worded items do. In this study, we commented on the drawbacks of existing approaches to wording effect in mixed-format scales and used bi-factor item response theory (IRT) models to…
Vegetable parenting practices scale: Item response modeling analyses
USDA-ARS's Scientific Manuscript database
Our objective was to evaluate the psychometric properties of a vegetable parenting practices scale using multidimensional polytomous item response modeling which enables assessing item fit to latent variables and the distributional characteristics of the items in comparison to the respondents. We al...
A HO-IRT Based Diagnostic Assessment System with Constructed Response Items
ERIC Educational Resources Information Center
Yang, Chih-Wei; Kuo, Bor-Chen; Liao, Chen-Huei
2011-01-01
The aim of the present study was to develop an on-line assessment system with constructed response items in the context of elementary mathematics curriculum. The system recorded the problem solving process of constructed response items and transferred the process to response codes for further analyses. An inference mechanism based on artificial…
ERIC Educational Resources Information Center
Sen, Rohini
2012-01-01
In the last five decades, research on the uses of response time has extended into the field of psychometrics (Schnipke & Scrams, 1999; van der Linden, 2006; van der Linden, 2007), where interest has centered around the usefulness of response time information in item calibration and person measurement within an item response theory framework.…
A Primer on the 2- and 3-Parameter Item Response Theory Models.
ERIC Educational Resources Information Center
Thornton, Artist
Item response theory (IRT) is a useful and effective tool for item response measurement if used in the proper context. This paper discusses the sets of assumptions under which responses can be modeled while exploring the framework of the IRT models relative to response testing. The one parameter model, or one parameter logistic model, is perhaps…
ERIC Educational Resources Information Center
Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan
2016-01-01
This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming…
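The three-parameter logistic model named in the abstract above has a simple closed form, sketched here without the optional scaling constant that some programs apply to the discrimination. The function name and parameter values are hypothetical and are not TUV estimates.

```python
import math

def three_pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    a: discrimination, b: difficulty, c: guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# At theta == b the probability is halfway between c and 1.
p_mid = three_pl(theta=0.5, a=1.2, b=0.5, c=0.2)
# For very low ability the probability approaches the guessing parameter.
p_low = three_pl(theta=-50.0, a=1.2, b=0.5, c=0.2)
```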
Power Analysis for Anticipated Non-Response in Randomized Block Designs
ERIC Educational Resources Information Center
Pustejovsky, James E.
2011-01-01
Recent guidance on the treatment of missing data in experiments advocates the use of sensitivity analysis and worst-case bounds analysis for addressing non-ignorable missing data mechanisms; moreover, plans for the analysis of missing data should be specified prior to data collection (Puma et al., 2009). While these authors recommend only that…
Oral health in the Japan self-defense forces - a representative survey.
Kudo, Yuka; John, Mike T; Saito, Yoko; Sur, Shachi; Furuyama, Chisako; Tsukasaki, Hiroaki; Baba, Kazuyoshi
2011-04-19
The oral health of military populations is usually not very well characterized compared to civilian populations. The aim of this study was to investigate two physical oral health characteristics and one perceived oral health measure and their correlation in the Japan self-defense forces (JSDF). Number of missing teeth, denture status, and OHRQoL as evaluated by the Japanese 14-item version of the Oral Health Impact Profile (OHIP-J14) as well as the correlation between these oral health measures was investigated in 911 personnel in the JSDF. Subjects did not have a substantial number of missing teeth and only 4% used removable dentures. The mean OHIP-J14 score was 4.6 ± 6.7 units. The magnitude of the correlation between the number of missing teeth with OHIP-J14 scores was small (r = 0.22, p < 0.001). Mean OHIP-J14 scores differed between subjects with and without dentures (8.6 and 4.4, p < 0.001). Compared to Japanese civilian populations, personnel of the JSDF demonstrated good oral health. Two physical oral health characteristics were associated with perceived oral health.
Ssewanyana, Derrick; van Baar, Anneloes; Newton, Charles R; Abubakar, Amina
2018-06-20
Health risk behavior (HRB) is of concern during adolescence. In sub-Saharan Africa, reliable, valid and culturally appropriate measures of HRB are urgently needed. This study aims at assembling and psychometrically evaluating a comprehensive questionnaire on HRB of adolescents in Kilifi County at the coast of Kenya. The Kilifi Health Risk Behavior Questionnaire (KRIBE-Q) was assembled using items on HRB identified from a systematic review and by consulting 85 young people through 11 focus group discussions and in-depth interviews with 10 key informants like teachers and employees of organizations providing various services to young people in Kilifi County. The assembled list of HRB items were back and forward translated from English to Swahili and harmonized by a panel of experts. A total of 164 adolescents completed the assembled Swahili questionnaire at baseline and two weeks later 85 of them completed the questionnaire again. A classical test theory approach was utilized for psychometric evaluation. We computed the amount of missing data at item-level to verify data quality. Scaling evaluation was assessed by spread of responses across options at an item-level. Using Gwet's AC1 coefficient, test-retest reliability was assessed using data from the 85 adolescents who answered the questionnaire twice. Observations and completion of a brief questionnaire were done for non-psychometric evaluation of the KRIBE-Q administered via audio-computer assisted self-interview (ACASI) in Swahili language to 40 adolescents. The KRIBE-Q showed high data quality, good spread of responses across options and a very good test-retest reliability (Gwet's AC1 = 0.82). It comprised 8 components with acceptable test-retest reliability: behavior resulting in unintentional injury and violence (0.85); tobacco use (0.85); alcohol and drug use (0.96); sexual behaviors (0.94); dietary behaviors (0.60); physical activity (0.74); gambling (0.73); and hygiene behavior (0.89). 
About 96% of the adolescents found the ACASI private and easy to use. Prevalence of bullying (32%), physical fights (40%) and engagement in gambling (26%) was high. The KRIBE-Q assembled in this study is a psychometrically sound instrument for adolescents in rural coastal Kenya and feasible to administer via ACASI. This measure may be useful for surveys and planning interventions in similar settings.
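The test-retest statistic used in the KRIBE-Q abstract above, Gwet's AC1, can be sketched for two administrations of one categorical item as follows. The function name and the responses are hypothetical, and the number of categories is taken from the observed responses, which is an assumption of this sketch.

```python
from collections import Counter

def gwet_ac1(first, second):
    """Gwet's AC1 agreement coefficient for two sets of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(first)
    categories = sorted(set(first) | set(second))
    q = len(categories)
    if q < 2:
        raise ValueError("at least two categories must be observed")
    # Observed agreement: share of respondents giving the same answer twice.
    pa = sum(x == y for x, y in zip(first, second)) / n
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement uses the average prevalence of each category.
    prevalence = [(c1[k] + c2[k]) / (2 * n) for k in categories]
    pe = sum(p * (1 - p) for p in prevalence) / (q - 1)
    return (pa - pe) / (1 - pe)

# Hypothetical baseline and retest answers from ten respondents.
baseline = ["yes", "yes", "no", "no", "yes", "no", "no", "no", "no", "no"]
retest = ["yes", "yes", "no", "no", "no", "no", "no", "no", "no", "no"]
ac1 = gwet_ac1(baseline, retest)
```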
Cao, Rui; Nosofsky, Robert M; Shiffrin, Richard M
2017-05-01
In short-term-memory (STM)-search tasks, observers judge whether a test probe was present in a short list of study items. Here we investigated the long-term learning mechanisms that lead to the highly efficient STM-search performance observed under conditions of consistent-mapping (CM) training, in which targets and foils never switch roles across trials. In item-response learning, subjects learn long-term mappings between individual items and target versus foil responses. In category learning, subjects learn high-level codes corresponding to separate sets of items and learn to attach old versus new responses to these category codes. To distinguish between these 2 forms of learning, we tested subjects in categorized varied mapping (CV) conditions: There were 2 distinct categories of items, but the assignment of categories to target versus foil responses varied across trials. In cases involving arbitrary categories, CV performance closely resembled standard varied-mapping performance without categories and departed dramatically from CM performance, supporting the item-response-learning hypothesis. In cases involving prelearned categories, CV performance resembled CM performance, as long as there was sufficient practice or steps taken to reduce trial-to-trial category-switching costs. This pattern of results supports the category-coding hypothesis for sufficiently well-learned categories. Thus, item-response learning occurs rapidly and is used early in CM training; category learning is much slower but is eventually adopted and is used to increase the efficiency of search beyond that available from item-response learning. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Validation of a Persian version of the Fibromyalgia Impact Questionnaire (FIQ-P).
Bidari, Ali; Hassanzadeh, Morteza; Mohabat, Mohamad-Farzam; Talachian, Elham; Khoei, Effat Merghati
2014-02-01
The aim of this study is to translate, adapt, and validate a Persian version of the Fibromyalgia (FM) Impact Questionnaire (FIQ-P). The FIQ-P was adapted following the translation and back-translation approach; then, it was administered to thirty females with FM. Participants also completed two other validated questionnaires, the Medical Outcome Survey Short Form-36 (SF-36) and the Beck Depression Inventory (BDI). Internal consistency within the FIQ-P items and its test-retest reliability were assessed with Cronbach's alpha and Spearman's correlation coefficient, respectively. Construct validity was analyzed by Spearman's r when correlating the FIQ-P to other questionnaires. The translated version was concordant. Adaptation affected two sub-items of physical function. Participants' mean age ± standard deviation was 40.4 ± 9.0 years. Internal consistency proved good with α = 0.80. Test-retest coefficient ranged from 0.50 for the item "work days missed" to 0.79 for all FIQ-P items. Fair and statistically significant (P < 0.01) correlations were found between the FIQ-P items and two other questionnaires, SF-36 (r = -0.57) and BDI (r = 0.53). We concluded that the FIQ-P is a valid and reliable instrument for measuring health status of Persian-speaking FM patients.
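Internal consistency in the FIQ-P abstract above is reported as Cronbach's alpha. A minimal sketch of the standard formula follows, using sample variances of each item and of the total score. The function name and the small score matrix are hypothetical and are not FIQ-P data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the total (sum) score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical scores: four respondents, three items.
alpha = cronbach_alpha([[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 4, 4]])
```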
Sekely, Angela; Taylor, Graeme J; Bagby, R Michael
2018-03-17
The Toronto Structured Interview for Alexithymia (TSIA) was developed to provide a structured interview method for assessing alexithymia. One drawback of this instrument is the amount of time it takes to administer and score. The current study used item response theory (IRT) methods to analyze data from a large heterogeneous multi-language sample (N = 842) to investigate whether a subset of items could be selected to create a short version of the instrument. Samejima's (1969) graded response model was used to fit the item responses. Items providing maximum information were retained in the short model, resulting in the elimination of 12 of the original 24 items. Despite the 50% reduction in the number of items, 65.22% of the information was retained. Further studies are needed to validate the short version. A short version of the TSIA is potentially of practical value to clinicians and researchers with time constraints. Copyright © 2018. Published by Elsevier B.V.
Preserving the Integrity of Citations and References by All Stakeholders of Science Communication.
Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A; Gerasimov, Alexey N; Kostyukova, Elena I; Kitas, George D
2015-11-01
Citations to scholarly items are building bricks for multidisciplinary science communication. Citation analyses are currently influencing individual career advancement and ranking of academic and research institutions worldwide. This article overviews the involvement of scientific authors, reviewers, editors, publishers, indexers, and learned associations in the citing and referencing to preserve the integrity of science communication. Authors are responsible for thorough bibliographic searches to select relevant references for their articles, comprehend main points, and cite them in an ethical way. Reviewers and editors may perform additional searches and recommend missing essential references. Publishers, in turn, are in a position to instruct their authors over the citations and references, provide tools for validation of references, and open access to bibliographies. Publicly available reference lists bear important information about the novelty and relatedness of the scholarly items with the published literature. Few editorial associations have dealt with the issue of citations and properly managed references. As a prime example, the International Committee of Medical Journal Editors (ICMJE) issued in December 2014 an updated set of recommendations on the need for citing primary literature and avoiding unethical references, which are applicable to the global scientific community. With the exponential growth of literature and related references, it is critically important to define functions of all stakeholders of science communication in curbing the issue of irrational and unethical citations and thereby improve the quality and indexability of scholarly journals.
Preserving the Integrity of Citations and References by All Stakeholders of Science Communication
Yessirkepov, Marlen; Voronov, Alexander A.; Gerasimov, Alexey N.; Kostyukova, Elena I.; Kitas, George D.
2015-01-01
Citations to scholarly items are building bricks for multidisciplinary science communication. Citation analyses are currently influencing individual career advancement and ranking of academic and research institutions worldwide. This article overviews the involvement of scientific authors, reviewers, editors, publishers, indexers, and learned associations in the citing and referencing to preserve the integrity of science communication. Authors are responsible for thorough bibliographic searches to select relevant references for their articles, comprehend main points, and cite them in an ethical way. Reviewers and editors may perform additional searches and recommend missing essential references. Publishers, in turn, are in a position to instruct their authors over the citations and references, provide tools for validation of references, and open access to bibliographies. Publicly available reference lists bear important information about the novelty and relatedness of the scholarly items with the published literature. Few editorial associations have dealt with the issue of citations and properly managed references. As a prime example, the International Committee of Medical Journal Editors (ICMJE) issued in December 2014 an updated set of recommendations on the need for citing primary literature and avoiding unethical references, which are applicable to the global scientific community. With the exponential growth of literature and related references, it is critically important to define functions of all stakeholders of science communication in curbing the issue of irrational and unethical citations and thereby improve the quality and indexability of scholarly journals. PMID:26538996
Contextual behavior and neural circuits
Lee, Inah; Lee, Choong-Hee
2013-01-01
Animals including humans engage in goal-directed behavior flexibly in response to items and their background, which is called contextual behavior in this review. Although the concept of context has long been studied, there are differences among researchers in defining and experimenting with the concept. The current review aims to provide a categorical framework within which not only the neural mechanisms of contextual information processing but also the contextual behavior can be studied in more concrete ways. For this purpose, we categorize contextual behavior into three subcategories as follows by considering the types of interactions among context, item, and response: contextual response selection, contextual item selection, and contextual item–response selection. Contextual response selection refers to the animal emitting different types of responses to the same item depending on the context in the background. Contextual item selection occurs when there are multiple items that need to be chosen in a contextual manner. Finally, when multiple items and multiple contexts are involved, contextual item–response selection takes place whereby the animal either chooses an item or inhibits such a response depending on item–context paired association. The literature suggests that the rhinal cortical regions and the hippocampal formation play key roles in mnemonically categorizing and recognizing contextual representations and the associated items. In addition, it appears that the fronto-striatal cortical loops in connection with the contextual information-processing areas critically control the flexible deployment of adaptive action sets and motor responses for maximizing goals. We suggest that contextual information processing should be investigated in experimental settings where contextual stimuli and resulting behaviors are clearly defined and measurable, considering the dynamic top-down and bottom-up interactions among the neural systems for contextual behavior. 
PMID:23675321
Reliability and validity of the Chinese mandarin version of PedsQL™ 3.0 transplant module.
Chang, Ying; Luo, Yanhui; Zhou, Yuchen; Wang, Ruixin; Song, Na; Zhu, Guanghua; Wang, Bin; Qin, Maoquan; Yang, Jun; Sun, Yuan; Li, Chunfu; Zhou, Xuan
2016-10-05
Long-term health-related quality of life (HRQoL) of pediatric patients after hematopoietic stem cell transplantation (HSCT) is increasingly studied worldwide. However, few studies have been performed in China, where no uniform scale is available; the PedsQL™ Cancer Module 3.0 Chinese Mandarin version has been used to evaluate HRQoL of patients after HSCT in China. This study aimed to assess the reliability and validity of the Chinese Mandarin version of PedsQL™ 3.0 Transplant Module. Patients between 2 and 18 years old, who underwent HSCT from January 2006 to June 2014, were recruited in Beijing Children's Hospital affiliated to Capital Medical University, the First Affiliated Hospital of Southern Medical University and Beijing Daopei Hospital. 207 parent reports and 182 child self-reports of the PedsQL™ 3.0 Transplant Module Chinese Mandarin version were assigned, of which 362 were returned. No missing item response was observed in the returned reports. Cronbach's alpha coefficient exceeded 0.7 in total scale and every dimension. The intraclass correlation coefficient exceeded 0.8 in all dimensions of child self-reports and parent reports. Spearman's rank correlation coefficients of items and their respective dimensions were 0.6-0.94 in parent reports, and 0.62-0.93 in child self-reports, while a weak association was found between the items and other dimensions. Exploratory factor analysis indicated a good extraction effect, and construct validity of the scale was >60 %. The Chinese Mandarin version of PedsQL™ 3.0 Transplant Module has good feasibility, reliability and validity. Its use may help improve the HRQoL of children after HSCT in China.
Spanish translation and validation of four short pelvic floor disorders questionnaires.
Treszezamsky, Alejandro D; Karp, Deborah; Dick-Biascoechea, Madeline; Ehsani, Nazanin; Dancz, Christina; Montoya, T Ignacio; Olivera, Cedric K; Smith, Aimee L; Cardenas, Rosa; Fashokun, Tola; Bradley, Catherine S
2013-04-01
Globally, Spanish is the primary language for 329 million people; however, most urogynecologic questionnaires are available in English. We set out to develop valid Spanish translations of the Questionnaire for Urinary Incontinence Diagnosis (QUID), the Three Incontinence Questions (3IQ), and the short Pelvic Floor Distress Inventory (PFDI-20) and Pelvic Floor Impact Questionnaire (PFIQ-7). The TRAPD method (translation, review, adjudication, pretesting, and documentation) was used for translation. Eight native Spanish-speaking translators developed Spanish versions collaboratively. These were pretested with cognitive interviews and revised until optimal. For validation, bilingual patients at seven clinics completed Spanish and English questionnaire versions in randomized order. Participants completed a second set of questionnaires later. The Spanish versions' internal consistency and reliability and Spanish-English agreement were measured using Cronbach's alpha, weighted kappa, and intraclass correlation coefficients. A total of 78 subjects were included; 94.9 % self-identified as Hispanic and 73.1 % spoke Spanish as their primary language. The proportion of per-item missing responses was similar in both languages (median 1.3 %). Internal consistency for Spanish PFDI-20 subscales was acceptable to good and for PFIQ-7 and QUID excellent. Test-retest reliability per item was moderate to near perfect for PFDI-20, substantial to near perfect for PFIQ-7 and 3IQ, and substantial for QUID. Spanish-English agreement for individual items was substantial to near perfect for all questionnaires (kappa range 0.64-0.95) and agreement for PFDI-20, PFIQ-7, and QUID subscales scores was high [intraclass correlation coefficient (ICC) range 0.92-0.99]. We obtained valid Spanish translations of the PFDI-20, PFIQ-7, QUID, and 3IQ. These results support their use as clinical and research assessment tools in Spanish-speaking populations.
Item Response Theory Analysis of the Psychopathic Personality Inventory-Revised.
Eichenbaum, Alexander E; Marcus, David K; French, Brian F
2017-06-01
This study examined item and scale functioning in the Psychopathic Personality Inventory-Revised (PPI-R) using an item response theory analysis. PPI-R protocols from 1,052 college student participants (348 male, 704 female) were analyzed. Analyses were conducted on the 131 self-report items comprising the PPI-R's eight content scales, using a graded response model. Scales collected a majority of their information about respondents possessing higher than average levels of the traits being measured. Each scale contained at least some items that evidenced limited ability to differentiate between respondents with differing levels of the trait being measured. Moreover, 80 items (61.1%) yielded significantly different responses between men and women presumably possessing similar levels of the trait being measured. Item performance was also influenced by the scoring format (directly scored vs. reverse-scored) of the items. Overall, the results suggest that the PPI-R, despite identifying psychopathic personality traits in individuals possessing high levels of those traits, may not identify these traits equally well for men and women, and scores are likely influenced by the scoring format of the individual item and scale.
The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory
ERIC Educational Resources Information Center
Sahin, Alper; Anil, Duygu
2017-01-01
This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of…
ERIC Educational Resources Information Center
Arce-Ferrer, Alvaro J.; Bulut, Okan
2017-01-01
This study examines separate and concurrent approaches to combine the detection of item parameter drift (IPD) and the estimation of scale transformation coefficients in the context of the common item nonequivalent groups design with the three-parameter item response theory equating. The study uses real and synthetic data sets to compare the two…
ERIC Educational Resources Information Center
Tay, Louis; Huang, Qiming; Vermunt, Jeroen K.
2016-01-01
In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…
ERIC Educational Resources Information Center
Tian, Wei; Cai, Li; Thissen, David; Xin, Tao
2013-01-01
In item response theory (IRT) modeling, the item parameter error covariance matrix plays a critical role in statistical inference procedures. When item parameters are estimated using the EM algorithm, the parameter error covariance matrix is not an automatic by-product of item calibration. Cai proposed the use of Supplemented EM algorithm for…
Cohn, Amy M.; Hagman, Brett T.; Graff, Fiona S.; Noel, Nora E.
2011-01-01
Objective: The present study examined the latent continuum of alcohol-related negative consequences among first-year college women using methods from item response theory and classical test theory. Method: Participants (N = 315) were college women in their freshman year who reported consuming any alcohol in the past 90 days and who completed assessments of alcohol consumption and alcohol-related negative consequences using the Rutgers Alcohol Problem Index. Results: Item response theory analyses showed poor model fit for five items identified in the Rutgers Alcohol Problem Index. Two-parameter item response theory logistic models were applied to the remaining 18 items to examine estimates of item difficulty (i.e., severity) and discrimination parameters. The item difficulty parameters ranged from 0.591 to 2.031, and the discrimination parameters ranged from 0.321 to 2.371. Classical test theory analyses indicated that the omission of the five misfit items did not significantly alter the psychometric properties of the construct. Conclusions: Findings suggest that those consequences that had greater severity and discrimination parameters may be used as screening items to identify female problem drinkers at risk for an alcohol use disorder. PMID:22051212
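To make the roles of the difficulty (severity) and discrimination parameters in this abstract concrete, the two-parameter logistic item response function can be sketched as below. The function name `irf_2pl` is an assumption, and pairing the extreme parameter values from the reported ranges into single items is purely hypothetical; the abstract does not say which item carries which values.

```python
import math


def irf_2pl(theta, a, b):
    """Two-parameter logistic model.

    Probability that a respondent with latent trait level `theta` endorses
    an item with discrimination `a` and difficulty (severity) `b`.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical items built from the ends of the reported parameter ranges
# (the pairing of a and b values is an assumption for illustration).
steep_item = {"a": 2.371, "b": 2.031}
flat_item = {"a": 0.321, "b": 2.031}

# At theta == b the endorsement probability is 0.5 regardless of a.
p_at_difficulty = irf_2pl(2.031, **steep_item)

# One unit above the difficulty, the steeper item separates respondents more.
p_steep = irf_2pl(3.031, **steep_item)
p_flat = irf_2pl(3.031, **flat_item)
```

Under this model, an item with high severity and high discrimination is endorsed almost only by respondents far along the latent continuum, which is the rationale the abstract gives for using such items for screening.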
Generalizability in Item Response Modeling
ERIC Educational Resources Information Center
Briggs, Derek C.; Wilson, Mark
2007-01-01
An approach called generalizability in item response modeling (GIRM) is introduced in this article. The GIRM approach essentially incorporates the sampling model of generalizability theory (GT) into the scaling model of item response theory (IRT) by making distributional assumptions about the relevant measurement facets. By specifying a random…
Quantifying Local, Response Dependence between Two Polytomous Items Using the Rasch Model
ERIC Educational Resources Information Center
Andrich, David; Humphry, Stephen M.; Marais, Ida
2012-01-01
Models of modern test theory imply statistical independence among responses, generally referred to as "local independence." One violation of local independence occurs when the response to one item governs the response to a subsequent item. Expanding on a formulation of this kind of violation as a process in the dichotomous Rasch model,…
Using Response Times for Item Selection in Adaptive Testing
ERIC Educational Resources Information Center
van der Linden, Wim J.
2008-01-01
Response times on items can be used to improve item selection in adaptive testing provided that a probabilistic model for their distribution is available. In this research, the author used a hierarchical modeling framework with separate first-level models for the responses and response times and a second-level model for the distribution of the…
The Influence of Item Response Indecision on the Self-Directed Search
ERIC Educational Resources Information Center
Sampson, James P., Jr.; Shy, Jonathan D.; Hartley, Sarah Lucas; Reardon, Robert C.; Peterson, Gary W.
2009-01-01
Students (N = 247) responded to Self-Directed Search (SDS) per the standard response format and were also instructed to record a question mark (?) for items about which they were uncertain (item response indecision [IRI]). The initial responses of the 114 participants with a (?) were then reversed and a second SDS summary code was obtained and…
“Which Box Should I Check?”: Examining Standard Check Box Approaches to Measuring Race and Ethnicity
Eisenhower, Abbey; Suyemoto, Karen; Lucchese, Fernanda; Canenguez, Katia
2014-01-01
Objective: This study examined methodological concerns with standard approaches to measuring race and ethnicity using the federally defined race and ethnicity categories, as utilized in National Institutes of Health (NIH) funded research. Data Sources/Study Setting: Surveys were administered to 219 economically disadvantaged, racially and ethnically diverse participants at Boston Women Infants and Children (WIC) clinics during 2010. Study Design: We examined missingness and misclassification in responses to the closed-ended NIH measure of race and ethnicity compared with open-ended measures of self-identified race and ethnicity. Principal Findings: Rates of missingness were 26 and 43 percent for NIH race and ethnicity items, respectively, compared with 11 and 18 percent for open-ended responses. NIH race responses matched racial self-identification in only 44 percent of cases. Missingness and misclassification were disproportionately higher for self-identified Latina(o)s, African-Americans, and Cape Verdeans. Race, but not ethnicity, was more often missing for immigrant versus mainland U.S.-born respondents. Results also indicated that ethnicity for Hispanic/Latina(o)s is more complex than captured in this measure. Conclusions: The NIH's current race and ethnicity measure demonstrated poor differentiation of race and ethnicity, restricted response options, and lack of an inclusive ethnicity question. Separating race and ethnicity and providing respondents with adequate flexibility to identify themselves both racially and ethnically may improve valid operationalization. PMID:24298894
Semiparametric Estimation of Treatment Effect in a Pretest–Posttest Study with Missing Data
Davidian, Marie; Tsiatis, Anastasios A.; Leon, Selene
2008-01-01
The pretest–posttest study is commonplace in numerous applications. Typically, subjects are randomized to two treatments, and response is measured at baseline, prior to intervention with the randomized treatment (pretest), and at prespecified follow-up time (posttest). Interest focuses on the effect of treatments on the change between mean baseline and follow-up response. Missing posttest response for some subjects is routine, and disregarding missing cases can lead to invalid inference. Despite the popularity of this design, a consensus on an appropriate analysis when no data are missing, let alone for taking into account missing follow-up, does not exist. Under a semiparametric perspective on the pretest–posttest model, in which limited distributional assumptions on pretest or posttest response are made, we show how the theory of Robins, Rotnitzky and Zhao may be used to characterize a class of consistent treatment effect estimators and to identify the efficient estimator in the class. We then describe how the theoretical results translate into practice. The development not only shows how a unified framework for inference in this setting emerges from the Robins, Rotnitzky and Zhao theory, but also provides a review and demonstration of the key aspects of this theory in a familiar context. The results are also relevant to the problem of comparing two treatment means with adjustment for baseline covariates. PMID:19081743
Semiparametric Estimation of Treatment Effect in a Pretest-Posttest Study with Missing Data.
Davidian, Marie; Tsiatis, Anastasios A; Leon, Selene
2005-08-01
The pretest-posttest study is commonplace in numerous applications. Typically, subjects are randomized to two treatments, and response is measured at baseline, prior to intervention with the randomized treatment (pretest), and at prespecified follow-up time (posttest). Interest focuses on the effect of treatments on the change between mean baseline and follow-up response. Missing posttest response for some subjects is routine, and disregarding missing cases can lead to invalid inference. Despite the popularity of this design, a consensus on an appropriate analysis when no data are missing, let alone for taking into account missing follow-up, does not exist. Under a semiparametric perspective on the pretest-posttest model, in which limited distributional assumptions on pretest or posttest response are made, we show how the theory of Robins, Rotnitzky and Zhao may be used to characterize a class of consistent treatment effect estimators and to identify the efficient estimator in the class. We then describe how the theoretical results translate into practice. The development not only shows how a unified framework for inference in this setting emerges from the Robins, Rotnitzky and Zhao theory, but also provides a review and demonstration of the key aspects of this theory in a familiar context. The results are also relevant to the problem of comparing two treatment means with adjustment for baseline covariates.
Improving measurement of injection drug risk behavior using item response theory.
Janulis, Patrick
2014-03-01
Recent research highlights the multiple steps to preparing and injecting drugs and the resultant viral threats faced by drug users. This research suggests that more sensitive measurement of injection drug HIV risk behavior is required. In addition, growing evidence suggests there are gender differences in injection risk behavior. However, the potential for differential item functioning between genders has not been explored. To explore item response theory as an improved measurement modeling technique that provides empirically justified scaling of injection risk behavior and to examine for potential gender-based differential item functioning. Data is used from three studies in the National Institute on Drug Abuse's Criminal Justice Drug Abuse Treatment Studies. A two-parameter item response theory model was used to scale injection risk behavior and logistic regression was used to examine for differential item functioning. Item fit statistics suggest that item response theory can be used to scale injection risk behavior and these models can provide more sensitive estimates of risk behavior. Additionally, gender-based differential item functioning is present in the current data. Improved measurement of injection risk behavior using item response theory should be encouraged as these models provide increased congruence between construct measurement and the complexity of injection-related HIV risk. Suggestions are made to further improve injection risk behavior measurement. Furthermore, results suggest direct comparisons of composite scores between males and females may be misleading and future work should account for differential item functioning before comparing levels of injection risk behavior.
Knowledge of the ordinal position of list items in pigeons.
Scarf, Damian; Colombo, Michael
2011-10-01
Ordinal knowledge is a fundamental aspect of advanced cognition. It is self-evident that humans represent ordinal knowledge, and over the past 20 years it has become clear that nonhuman primates share this ability. In contrast, evidence that nonprimate species represent ordinal knowledge is missing from the comparative literature. To address this issue, in the present experiment we trained pigeons on three 4-item lists and then tested them with derived lists in which, relative to the training lists, the ordinal position of the items was either maintained or changed. Similar to the findings with human and nonhuman primates, our pigeons performed markedly better on the maintained lists compared to the changed lists, and displayed errors consistent with the view that they used their knowledge of ordinal position to guide responding on the derived lists. These findings demonstrate that the ability to acquire ordinal knowledge is not unique to the primate lineage. (PsycINFO Database Record (c) 2011 APA, all rights reserved).
Measuring sexual orientation in adolescent health surveys: evaluation of eight school-based surveys.
Saewyc, Elizabeth M; Bauer, Greta R; Skay, Carol L; Bearinger, Linda H; Resnick, Michael D; Reis, Elizabeth; Murphy, Aileen
2004-10-01
To examine the performance of various items measuring sexual orientation within 8 school-based adolescent health surveys in the United States and Canada from 1986 through 1999. Analyses examined nonresponse and unsure responses to sexual orientation items compared with other survey items, demographic differences in responses, tests for response set bias, and congruence of responses to multiple orientation items; analytical methods included frequencies, contingency tables with Chi-square, and ANOVA with least significant differences (LSD) post hoc tests; all analyses were conducted separately by gender. In all surveys, nonresponse rates for orientation questions were similar to other sexual questions, but not higher; younger students, immigrants, and students with learning disabilities were more likely to skip items or select "unsure." Sexual behavior items had the lowest nonresponse, but fewer than half of all students reported sexual behavior, limiting its usefulness for indicating orientation. Item placement in the survey, wording, and response set bias all appeared to influence nonresponse and unsure rates. Specific recommendations include standardizing wording across future surveys, and pilot testing items with diverse ages and ethnic groups of teens before use. All three dimensions of orientation should be assessed where possible; when limited to single items, sexual attraction may be the best choice. Specific wording suggestions are offered for future surveys.
Worldwide Report, Nuclear Development and Proliferation
1984-07-02
delegate welcomed the inclusion of a new item, disarmament and development, on the commission's agenda. Miss Kunadi said the catalytic effects of arms...249199 JPRS-TND-84-016 2 July 1984 Worldwide Report NUCLEAR DEVELOPMENT AND PROLIFERATION FBIS...Arlington, Virginia 22201. JPRS-TND-84-016 2 July 1984 WORLDWIDE REPORT NUCLEAR DEVELOPMENT AND PROLIFERATION CONTENTS ASIA PEOPLE'S REPUBLIC OF
NASA Technical Reports Server (NTRS)
Gunawardena, J. A.
1992-01-01
This cache mechanism is transparent but does not contain associative circuits. It does not rely on locality of reference of instructions or data. No redundant instructions or data are encached. Items in the cache are accessed without address arithmetic. A cache miss is detected by the simplest test: comparing two bits. These features would result in faster access, higher hit rate, reduced chip area, and less power dissipation in comparison with associative systems of similar size.
45 CFR 1355.40 - Foster care and adoption data collection.
Code of Federal Regulations, 2014 CFR
2014-10-01
...-annual detailed data submission. These are specified in Appendix E to this part. (c) Missing data standards. (1) The term “missing data” refers to instances where no data have been entered, if applicable... particular case will be converted to missing data. All data which are “out of range” (i.e., the response is...
45 CFR 1355.40 - Foster care and adoption data collection.
Code of Federal Regulations, 2012 CFR
2012-10-01
...-annual detailed data submission. These are specified in Appendix E to this part. (c) Missing data standards. (1) The term “missing data” refers to instances where no data have been entered, if applicable... particular case will be converted to missing data. All data which are “out of range” (i.e., the response is...
45 CFR 1355.40 - Foster care and adoption data collection.
Code of Federal Regulations, 2013 CFR
2013-10-01
...-annual detailed data submission. These are specified in Appendix E to this part. (c) Missing data standards. (1) The term “missing data” refers to instances where no data have been entered, if applicable... particular case will be converted to missing data. All data which are “out of range” (i.e., the response is...
ERIC Educational Resources Information Center
Li, Yanmei; Li, Shuhong; Wang, Lin
2010-01-01
Many standardized educational tests include groups of items based on a common stimulus, known as "testlets". Standard unidimensional item response theory (IRT) models are commonly used to model examinees' responses to testlet items. However, it is known that local dependence among testlet items can lead to biased item parameter estimates…
Assessing the Utility of Item Response Theory Models: Differential Item Functioning.
ERIC Educational Resources Information Center
Scheuneman, Janice Dowd
The current status of item response theory (IRT) is discussed. Several IRT methods exist for assessing whether an item is biased. Focus is on methods proposed by L. M. Rudner (1975), F. M. Lord (1977), D. Thissen et al. (1988) and R. L. Linn and D. Harnisch (1981). Rudner suggested a measure of the area lying between the two item characteristic…
ERIC Educational Resources Information Center
Eignor, Daniel R.; Douglass, James B.
This paper attempts to provide some initial information about the use of a variety of item response theory (IRT) models in the item selection process; its purpose is to compare the information curves derived from the selection of items characterized by several different IRT models and their associated parameter estimation programs. These…
Al-Rubaish, Abdullah M; Abdel Rahim, Sheikh Idris; Hassan, Ammar; Ali, Amein Al; Mokabel, Fatma; Hegazy, Mohammed; Wosornu, Ladé
2010-05-01
The National Commission for Academic Accreditation and Assessment is responsible for the academic accreditation of universities in the Kingdom of Saudi Arabia (KSA). Requirements for this include evaluation of teaching effectiveness, evidence-based conclusions, and external benchmarks. To develop a questionnaire for students' evaluation of the teaching skills of individual instructors and provide a tool for benchmarking. College of Nursing, University of Dammam [UoD], May-June 2009. The original questionnaire was the "Monash Questionnaire Series on Teaching (MonQueST) - Clinical Nursing". The UoD modification retained four areas and seven responses, but reduced items from 26 to 20. Outcome measures were factor analysis and Cronbach's alpha coefficient. Seven Nursing courses were studied, viz.: Fundamentals, Medical, Surgical, Psychiatric and Mental Health, Obstetrics and Gynecology, Pediatrics, and Family and Community Health. Total number of students was 74; missing data ranged from 5 to 27%. The explained variance ranged from 66.9% to 78.7%. The observed Cronbach's α coefficients ranged from 0.78 to 0.93, indicating an exceptionally high reliability. The students in the study were found to be fair and frank in their evaluation.
Fischer, H Felix; Rose, Matthias
2016-10-19
Recently, a growing number of Item-Response Theory (IRT) models has been published, which allow estimation of a common latent variable from data derived by different Patient Reported Outcomes (PROs). When using data from different PROs, direct estimation of the latent variable has some advantages over the use of sum score conversion tables. It requires substantial proficiency in the field of psychometrics to fit such models using contemporary IRT software. We developed a web application ( http://www.common-metrics.org ), which allows estimation of latent variable scores more easily using IRT models calibrating different measures on instrument independent scales. Currently, the application allows estimation using six different IRT models for Depression, Anxiety, and Physical Function. Based on published item parameters, users of the application can directly estimate latent trait estimates using expected a posteriori (EAP) for sum scores as well as for specific response patterns, Bayes modal (MAP), Weighted likelihood estimation (WLE) and Maximum likelihood (ML) methods and under three different prior distributions. The obtained estimates can be downloaded and analyzed using standard statistical software. This application enhances the usability of IRT modeling for researchers by allowing comparison of the latent trait estimates over different PROs, such as the Patient Health Questionnaire Depression (PHQ-9) and Anxiety (GAD-7) scales, the Center of Epidemiologic Studies Depression Scale (CES-D), the Beck Depression Inventory (BDI), PROMIS Anxiety and Depression Short Forms and others. Advantages of this approach include comparability of data derived with different measures and tolerance against missing values. The validity of the underlying models needs to be investigated in the future.
ERIC Educational Resources Information Center
Magnus, Brooke E.; Thissen, David
2017-01-01
Questionnaires that include items eliciting count responses are becoming increasingly common in psychology. This study proposes methodological techniques to overcome some of the challenges associated with analyzing multivariate item response data that exhibit zero inflation, maximum inflation, and heaping at preferred digits. The modeling…
Nested Logit Models for Multiple-Choice Item Response Data
ERIC Educational Resources Information Center
Suh, Youngsuk; Bolt, Daniel M.
2010-01-01
Nested logit item response models for multiple-choice data are presented. Relative to previous models, the new models are suggested to provide a better approximation to multiple-choice items where the application of a solution strategy precedes consideration of response options. In practice, the models also accommodate collapsibility across all…
The Dutch Identity: A New Tool for the Study of Item Response Models.
ERIC Educational Resources Information Center
Holland, Paul W.
1990-01-01
The Dutch Identity is presented as a useful tool for expressing the basic equations of item response models that relate the manifest probabilities to the item response functions and the latent trait distribution. Ways in which the identity may be exploited are suggested and illustrated. (SLD)
Item response theory analysis of the mechanics baseline test
NASA Astrophysics Data System (ADS)
Cardamone, Caroline N.; Abbott, Jonathan E.; Rayyan, Saif; Seaton, Daniel T.; Pawl, Andrew; Pritchard, David E.
2012-02-01
Item response theory is useful in both the development and evaluation of assessments and in computing standardized measures of student performance. In item response theory, individual parameters (difficulty, discrimination) for each item or question are fit by item response models. These parameters provide a means for evaluating a test and offer a better measure of student skill than a raw test score, because each skill calculation considers not only the number of questions answered correctly, but the individual properties of all questions answered. Here, we present the results from an analysis of the Mechanics Baseline Test given at MIT during 2005-2010. Using the item parameters, we identify questions on the Mechanics Baseline Test that are not effective in discriminating between MIT students of different abilities. We show that a limited subset of the highest quality questions on the Mechanics Baseline Test returns accurate measures of student skill. We compare student skills as determined by item response theory to the more traditional measurement of the raw score and show that a comparable measure of learning gain can be computed.
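The point that an IRT skill estimate depends on which questions were answered correctly, not only on how many, can be illustrated with a minimal sketch. It assumes a two-parameter logistic (2PL) model and a simple grid-search maximum-likelihood estimate; the item parameters in `ITEMS` and the function names are hypothetical and are not taken from the Mechanics Baseline Test analysis.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct answer (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def ml_ability(responses, items, grid=None):
    """Grid-search maximum-likelihood ability for one 0/1 response pattern."""
    if grid is None:
        grid = [x / 100.0 for x in range(-400, 401)]  # theta from -4 to 4

    def loglik(theta):
        total = 0.0
        for u, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            total += math.log(p) if u == 1 else math.log(1.0 - p)
        return total

    return max(grid, key=loglik)

# Hypothetical (a, b) pairs, chosen only for illustration.
ITEMS = [(0.5, -1.0), (1.0, 0.0), (2.0, 1.0)]

# Both students have a raw score of 2 out of 3.
theta_easy_miss = ml_ability([0, 1, 1], ITEMS)  # misses the weakly discriminating item
theta_hard_miss = ml_ability([1, 1, 0], ITEMS)  # misses the highly discriminating item
```

Under these assumptions the two students receive different ability estimates despite identical raw scores, because the 2PL likelihood weights each answer by the item's discrimination.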
Sample Invariance of the Structural Equation Model and the Item Response Model: A Case Study.
ERIC Educational Resources Information Center
Breithaupt, Krista; Zumbo, Bruno D.
2002-01-01
Evaluated the sample invariance of item discrimination statistics in a case study using real data, responses of 10 random samples of 500 people to a depression scale. Results lend some support to the hypothesized superiority of a two-parameter item response model over the common form of structural equation modeling, at least when responses are…
Heyland, Daren K; Jiang, Xuran; Day, Andrew G; Cohen, S Robin
2013-08-01
The recently developed Canadian Health Care Evaluation Project (CANHELP) questionnaire, which can be used to assess both patient and family satisfaction with end-of-life care, takes 40-60 minutes to complete. The length of the interview may limit its uptake and clinical utility; a shorter version would make its use more feasible. The purpose of this study was to develop and validate a shorter version of the CANHELP questionnaire. Data were collected using a cross-sectional survey of patients with advanced medical diseases and their family members. Participants completed the long version of CANHELP, a global rating of satisfaction with care (GRS), the FAMCARE scale (family members only), and a quality-of-life (QOL) questionnaire. We reduced the items on the long version based on their relationship to the GRS, the frequency of missing data, the distribution of responses, the redundancy of the items, and focus groups with frontline users. With the remaining items, we assessed internal consistency using Cronbach's alpha, and evaluated construct validity by describing the correlation of the new CANHELP Lite with the full version of CANHELP, GRS, FAMCARE, and the QOL questionnaire scores. A total of 363 patients and 193 family members participated in this study. The patient version was reduced from 37 items to 20 items and the caregiver version was reduced from 38 items to 21 items. Cronbach's alphas ranged from 0.68 to 0.93 for all domains of both the patient and caregiver questionnaires. We observed a high degree of correlation between CANHELP Lite domains and overall scores and the same domains and overall scores for the full version of CANHELP. In addition, we observed moderate to strong correlation between the CANHELP Lite overall satisfaction scores and the GRS questions. 
There was moderate correlation between the overall family member CANHELP Lite score and overall FAMCARE score (r = 0.45) and this was similar to the correlation between the full version of CANHELP and FAMCARE scores (r = 0.41). CANHELP Lite correlated more strongly with the QOL subscale on health care than the other QOL subscales. The CANHELP Lite questionnaire is a valid and internally consistent instrument to measure satisfaction with end-of-life care. Copyright © 2013 U.S. Cancer Pain Relief Committee. Published by Elsevier Inc. All rights reserved.
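Cronbach's alpha, used above to assess internal consistency, can be computed from a respondents-by-items score table. The following is a minimal sketch of the standard formula, not the authors' code; the function name and the toy data are assumptions for illustration, and rows with missing responses are assumed to have been handled beforehand.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: list of respondents, each a list of k numeric item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)

# Toy data: three items that rank four respondents identically.
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
alpha = cronbach_alpha(toy)
```

With perfectly consistent toy items like these, alpha reaches its upper bound of 1; real questionnaire data will give lower values.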
NASA Technical Reports Server (NTRS)
Prinzel, III, Lawrence J. (Inventor); Pope, Alan T. (Inventor); Williams, Steven P. (Inventor); Bailey, Randall E. (Inventor); Arthur, Jarvis J. (Inventor); Kramer, Lynda J. (Inventor); Schutte, Paul C. (Inventor)
2012-01-01
Embodiments of the invention permit flight paths (current and planned) to be viewed from various orientations to provide improved path and terrain awareness via graphical two-dimensional or three-dimensional perspective display formats. By coupling the flight path information with a terrain database, uncompromising terrain awareness relative to the path and ownship is provided. In addition, missed approaches, path deviations, and any navigational path can be reviewed and rehearsed before performing the actual task. By rehearsing a particular mission, check list items can be reviewed, terrain awareness can be highlighted, and missed approach procedures can be discussed by the flight crew. Further, the use of Controller Pilot Datalink Communications enables data-linked path, flight plan changes, and Air Traffic Control requests to be integrated into the flight display of the present invention.
NASA Astrophysics Data System (ADS)
Linn, Marcia C.; de Benedictis, Tina; Delucchi, Kevin; Harris, Abigail; Stage, Elizabeth
The National Assessment of Educational Progress Science Assessment has consistently revealed small gender differences on science content items but not on science inquiry items. This assessment differs from others in that respondents can choose I don't know rather than guessing. This paper examines explanations for the gender differences including (a) differential prior instruction, (b) differential response to uncertainty and use of the I don't know response, (c) differential response to figurally presented items, and (d) different attitudes towards science. Of these possible explanations, the first two received support. Females are more likely to use the I don't know response, especially for items with physical science content or masculine themes such as football. To ameliorate this situation we need more effective science instruction and more gender-neutral assessment items.
Marienfeld, Carla Beth; Tek, Ece; Diaz, Esperanza; Schottenfeld, Richard; Chawarski, Marek
2012-12-01
Psychiatrists' decision making about prescribing benzodiazepines (BZD) was evaluated in a community mental health center. An anonymous survey of outpatient psychiatrists in an academic-affiliated public mental health center was conducted using a 45-item questionnaire developed based on the results of a previous study. Sixty-six percent of responses indicate that, at times, psychiatrists experienced requests for behaviors suspicious for abuse, including 'lost/missing prescriptions' and 'use of BZD by others'. Patient characteristics such as 'history of abuse', 'unknown patient', and 'patient use of illicit substances' were occasional or common reasons for NOT prescribing BZDs (75%). The most common contexts in which the majority of our sample was uncomfortable prescribing BZDs involved a patient history of substance abuse, fear of initiation of dependence, diversion, and feeling manipulated by the patient. Time limitations were a dilemma for 20%. Psychiatrist self-reported dilemma and behavior in prescribing BZDs largely reflected concerns with substance abuse and less frequently workload or time issues.
Using the Mixed Rasch Model to analyze data from the beliefs and attitudes about memory survey.
Smith, Everett V; Ying, Yuping; Brown, Scott W
2012-01-01
In this study, we used the Mixed Rasch Model (MRM) to analyze data from the Beliefs and Attitudes About Memory Survey (BAMS; Brown, Garry, Silver, and Loftus, 1997). We used the original 5-point BAMS data to investigate the functioning of the "Neutral" category via threshold analysis under a 2-class MRM solution. The "Neutral" category was identified as not eliciting the model expected responses and observations in the "Neutral" category were subsequently treated as missing data. For the BAMS data without the "Neutral" category, exploratory MRM analyses specifying up to 5 latent classes were conducted to evaluate data-model fit using the consistent Akaike information criterion (CAIC). For each of three BAMS subscales, a two latent class solution was identified as fitting the mixed Rasch rating scale model the best. Results regarding threshold analysis, person parameters, and item fit based on the final models are presented and discussed as well as the implications of this study.
Park, Jong Cook; Kim, Kwang Sig
2012-03-01
The reliability of a test is determined by the characteristics of its items. Item analysis can be carried out with classical test theory and item response theory. The purpose of the study was to compare the discrimination indices with item response theory using the Rasch model. Thirty-one 4th-year medical school students participated in the clinical course written examination, which included 22 A-type items and 3 R-type items. Point biserial correlation coefficient (C(pbs)) was compared to method of extreme group (D), biserial correlation coefficient (C(bs)), item-total correlation coefficient (C(it)), and corrected item-total correlation coefficient (C(cit)). The Rasch model was applied to estimate item difficulty and examinee's ability and to calculate item fit statistics using joint maximum likelihood. Explanatory power (r²) of C(pbs) decreased in the following order: C(cit) (1.00), C(it) (0.99), C(bs) (0.94), and D (0.45). The ranges of difficulty logit and standard error and ability logit and standard error were -0.82 to 0.80 and 0.37 to 0.76, -3.69 to 3.19 and 0.45 to 1.03, respectively. Items 9 and 23 have outfit ≥ 1.3. Students 1, 5, 7, 18, 26, 30, and 32 have fit ≥ 1.3. C(pbs), C(cit), and C(it) are good discrimination parameters. The Rasch model can estimate the item difficulty parameter and the examinee's ability parameter with standard error. The fit statistics can identify bad items and unpredictable examinee's responses.
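Two of the classical discrimination indices compared above can be sketched briefly. The sketch assumes the point-biserial coefficient is computed as the Pearson correlation between a 0/1 item vector and the total score, and that the corrected item-total coefficient removes the item's own score from the total first; the function names and the toy data are hypothetical.

```python
import math

def point_biserial(item, total):
    """Pearson correlation between a 0/1 item vector and a score vector."""
    n = len(item)
    mean_i = sum(item) / n
    mean_t = sum(total) / n
    cov = sum((u - mean_i) * (t - mean_t) for u, t in zip(item, total))
    var_i = sum((u - mean_i) ** 2 for u in item)
    var_t = sum((t - mean_t) ** 2 for t in total)
    return cov / math.sqrt(var_i * var_t)

def corrected_item_total(item, total):
    """Same correlation after subtracting the item's score from each total."""
    rest = [t - u for u, t in zip(item, total)]
    return point_biserial(item, rest)

# Toy data: four examinees, one item, and their total test scores.
item = [0, 0, 1, 1]
total = [1, 2, 3, 4]
r_pbs = point_biserial(item, total)
r_cit = corrected_item_total(item, total)
```

In this toy case the corrected coefficient is smaller than the uncorrected one, because the uncorrected total partly contains the item itself.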
Shakeri, Mohammad-Taghi; Taghipour, Ali; Sadeghi, Masoumeh; Nezami, Hossein; Amirabadizadeh, Ali-Reza; Bonakchi, Hossein
2017-01-01
Background: Writing, designing, and conducting a clinical trial research proposal has an important role in achieving valid and reliable findings. Thus, this study aimed at critically appraising fundamental information in approved clinical trial research proposals in Mashhad University of Medical Sciences (MUMS) from 2008 to 2014. Methods: This cross-sectional study was conducted on all 935 approved clinical trial research proposals in MUMS from 2008 to 2014. A valid, reliable, comprehensive, simple, and usable checklist consisting of 11 main items, drawn up in sessions with biostatisticians and methodologists, was used as the research tool. Agreement rate between the reviewers of the proposals, who were responsible for data collection, was assessed during 3 sessions, and the Kappa statistic was calculated at the last session as 97%. Results: More than 60% of the research proposals had a methodologist consultant; moreover, type of study or study design had been specified in almost all of them (98%). Appropriateness of study aims with hypotheses was not observed in a significant number of research proposals (585 proposals, 62.6%). The required sample size for 66.8% of the approved proposals was based on a sample size formula; however, in 25% of the proposals, the sample size formula was not in accordance with the study design. The data collection tool was not selected appropriately in 55.2% of the approved research proposals. Type and method of randomization were unknown in 21% of the proposals, and dealing with missing data had not been described in most of them (98%). Inclusion and exclusion criteria were fully and adequately explained in 92% of the proposals. Moreover, 44% and 31% of the research proposals were moderate and weak in rank, respectively, with respect to the correctness of the statistical analysis methods. 
Conclusion: Findings of the present study revealed that a large portion of the approved proposals were highly biased or ambiguous with respect to randomization, blinding, dealing with missing data, data collection tool, sampling methods, and statistical analysis. Thus, it is essential to consult and collaborate with a methodologist in all parts of a proposal to control the possible and specific biases in clinical trials. PMID:29445703
Using Item Response Theory to Describe the Nonverbal Literacy Assessment (NVLA)
ERIC Educational Resources Information Center
Fleming, Danielle; Wilson, Mark; Ahlgrim-Delzell, Lynn
2018-01-01
The Nonverbal Literacy Assessment (NVLA) is a literacy assessment designed for students with significant intellectual disabilities. The 218-item test was initially examined using confirmatory factor analysis. This method showed that the test worked as expected, but the items loaded onto a single factor. This article uses item response theory to…
Measuring Student Learning with Item Response Theory
ERIC Educational Resources Information Center
Lee, Young-Jin; Palazzo, David J.; Warnakulasooriya, Rasil; Pritchard, David E.
2008-01-01
We investigate short-term learning from hints and feedback in a Web-based physics tutoring system. Both the skill of students and the difficulty and discrimination of items were determined by applying item response theory (IRT) to the first answers of students who are working on for-credit homework items in an introductory Newtonian physics…
Higher-Order Item Response Models for Hierarchical Latent Traits
ERIC Educational Resources Information Center
Huang, Hung-Yu; Wang, Wen-Chung; Chen, Po-Hsi; Su, Chi-Ming
2013-01-01
Many latent traits in the human sciences have a hierarchical structure. This study aimed to develop a new class of higher order item response theory models for hierarchical latent traits that are flexible in accommodating both dichotomous and polytomous items, to estimate both item and person parameters jointly, to allow users to specify…
Evaluating Item Fit for Multidimensional Item Response Models
ERIC Educational Resources Information Center
Zhang, Bo; Stone, Clement A.
2008-01-01
This research examines the utility of the S-X² statistic proposed by Orlando and Thissen (2000) in evaluating item fit for multidimensional item response models. Monte Carlo simulation was conducted to investigate both the Type I error and statistical power of this fit statistic in analyzing two kinds of multidimensional test…
An Item Response Theory Model for Test Bias.
ERIC Educational Resources Information Center
Shealy, Robin; Stout, William
This paper presents a conceptualization of test bias for standardized ability tests which is based on multidimensional, non-parametric, item response theory. An explanation of how individually-biased items can combine through a test score to produce test bias is provided. It is contended that bias, although expressed at the item level, should be…
NASA Astrophysics Data System (ADS)
Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan
2016-12-01
This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discrimination, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test's distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.
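For reference, the three-parameter logistic model named above is conventionally written as follows. The notation is an assumption, since the abstract gives none: \theta is the ability, and a_i, b_i, and c_i are the discrimination, difficulty, and guessing parameters of item i. Some programs also include a scaling constant D of about 1.7 inside the exponent, which is omitted here.

```latex
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + \exp\!\left[-a_i(\theta - b_i)\right]}
```

Setting c_i = 0 recovers the two-parameter logistic model, and as \theta decreases the probability approaches c_i, the chance of answering correctly by guessing.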
Near-Miss Effects on Response Latencies and Win Estimations of Slot Machine Players
ERIC Educational Resources Information Center
Dixon, Mark R.; Schreiber, James E.
2004-01-01
The present study examined the degree to which slot machine near-miss trials, or trials that displayed 2 of 3 winning symbols on the payoff line, affected response times and win estimations of 12 recreational slot machine players. Participants played a commercial slot machine in a casino-like laboratory for course extra-credit points. Videotaped…
Wilkerson, Keith; McGahan, Joseph R; Stevens, Rick; Williamson, David; Low, Jean
2009-12-01
The goal of this study was to determine whether differential response formats to covariation problems influence corresponding response latencies. The authors provided participants with 3 trials of 16 statements addressing positive and negative relations between freedom and responsibility. The authors framed half of the items around responsibility given freedom and the other half around freedom given responsibility. Response formats comprised true-false, agree-disagree, and yes-no answers as a between-participants factor. Results indicated that the manipulation of response format did not affect latencies. However, latencies differed according to the framing of the items. For items framed around freedom given responsibility, latencies were shorter. In addition, participants were more likely to report a positive relation between freedom and responsibility when items were framed around freedom given responsibility. The authors discuss implications relative to previous research in this area and give recommendations for future research.
Ye, Zeng Jie; Liang, Mu Zi; Zhang, Hao Wei; Li, Peng Fei; Ouyang, Xue Ren; Yu, Yuan Liang; Liu, Mei Ling; Qiu, Hong Zhong
2018-06-01
Classical test theory has been used to develop and validate the 25-item Resilience Scale Specific to Cancer (RS-SC) in Chinese patients with cancer. This study was designed to provide additional information about the discriminative value of the individual items tested with an item response theory analysis. A two-parameter graded response model was fitted to examine whether any of the items of the RS-SC exhibited problems with the ordering and steps of thresholds, as well as the ability of items to discriminate patients with different resilience levels using item characteristic curves. A sample of 214 Chinese patients with a cancer diagnosis was analyzed. The established three-dimension structure of the RS-SC was confirmed. Several items showed problematic thresholds or discrimination ability and require further revision. Some problematic items should be refined, and a short form of the RS-SC may be feasible in clinical settings in order to reduce the burden on patients. However, the generalizability of these findings warrants further investigation.
NASA Astrophysics Data System (ADS)
Cui, Yiqian; Shi, Junyou; Wang, Zili
2017-11-01
Built-in tests (BITs) are widely used in mechanical systems to perform state identification, but BIT false and missed alarms make it difficult for operators or beneficiaries to make correct judgments. Artificial neural networks (ANN), which have features such as self-organization and self-learning, have previously been used to identify false and missed alarms. However, these ANN models generally do not incorporate the temporal effect of the bottom-level threshold comparison outputs, and the historical temporal features are not fully considered. To improve the situation, this paper proposes a new integrated BIT design methodology by incorporating a novel type of dynamic neural network (DNN) model. The new DNN model is termed the Forward IIR & Recurrent FIR DNN (FIRF-DNN), and its component neurons, network structures, and input/output relationships are discussed. The implementation scheme for reducing condition monitoring false and missed alarms based on the FIRF-DNN model is also illustrated; it is composed of three stages: model training, false and missed alarm detection, and false and missed alarm suppression. Finally, the proposed methodology is demonstrated in an application study and the experimental results are analyzed.
Automatic Scoring of Paper-and-Pencil Figural Responses. Research Report.
ERIC Educational Resources Information Center
Martinez, Michael E.; And Others
Large-scale testing is dominated by the multiple-choice question format. Widespread use of the format is due, in part, to the ease with which multiple-choice items can be scored automatically. This paper examines automatic scoring procedures for an alternative item type: figural response. Figural response items call for the completion or…
Introduction to Multilevel Item Response Theory Analysis: Descriptive and Explanatory Models
ERIC Educational Resources Information Center
Sulis, Isabella; Toland, Michael D.
2017-01-01
Item response theory (IRT) models are the main psychometric approach for the development, evaluation, and refinement of multi-item instruments and scaling of latent traits, whereas multilevel models are the primary statistical method when considering the dependence between person responses when primary units (e.g., students) are nested within…
An Extension of IRT-Based Equating to the Dichotomous Testlet Response Theory Model
ERIC Educational Resources Information Center
Tao, Wei; Cao, Yi
2016-01-01
Current procedures for equating number-correct scores using traditional item response theory (IRT) methods assume local independence. However, when tests are constructed using testlets, one concern is the violation of the local item independence assumption. The testlet response theory (TRT) model is one way to accommodate local item dependence.…
ERIC Educational Resources Information Center
Gadermann, Anne M.; Guhn, Martin; Zumbo, Bruno D.
2012-01-01
This paper provides a conceptual, empirical, and practical guide for estimating ordinal reliability coefficients for ordinal item response data (also referred to as Likert, Likert-type, ordered categorical, or rating scale item responses). Conventionally, reliability coefficients, such as Cronbach's alpha, are calculated using a Pearson…
IRTPRO 2.1 for Windows (Item Response Theory for Patient-Reported Outcomes)
ERIC Educational Resources Information Center
Paek, Insu; Han, Kyung T.
2013-01-01
This article reviews a new item response theory (IRT) model estimation program, IRTPRO 2.1, for Windows that is capable of unidimensional and multidimensional IRT model estimation for existing and user-specified constrained IRT models for dichotomously and polytomously scored item response data. (Contains 1 figure and 2 notes.)
The Robustness of LOGIST and BILOG IRT Estimation Programs to Violations of Local Independence.
ERIC Educational Resources Information Center
Ackerman, Terry A.
One of the important underlying assumptions of all item response theory (IRT) models is that of local independence. This assumption requires that the response to an item on a test not be influenced by the response to any other items. This assumption is often taken for granted, with little or no scrutiny of the response process required to answer…
Unexpected Neighborhood Sources of Food and Drink: Implications for Research and Community Health.
Lucan, Sean C; Maroko, Andrew R; Seitchik, Jason L; Yoon, Dong Hum; Sperry, Luisa E; Schechter, Clyde B
2018-06-11
Studies of neighborhood food environments typically focus on select stores (especially supermarkets) and/or restaurants (especially fast-food outlets), make presumptions about healthfulness without assessing actual items for sale, and ignore other kinds of businesses offering foods/drinks. The current study assessed availability of select healthful and less-healthful foods/drinks from all storefront businesses in an urban environment and considered implications for food-environment research and community health. Cross-sectional assessment in 2013 of all storefront businesses (n=852) on all street segments (n=1,253) in 32 census tracts of the Bronx, New York. Investigators assessed for healthful items (produce, whole grains, nuts, water, milk) and less-healthful items (refined sweets, salty/fatty fare, sugar-added drinks, and alcohol), noting whether items were from food businesses (e.g., supermarkets and restaurants) or other storefront businesses (OSB, e.g., barber shops, gyms, hardware stores, laundromats). Data were analyzed in 2017. Half of all businesses offered food/drink items. More than one seventh of all street segments (more than one third in higher-poverty census tracts) had businesses selling food/drink. OSB accounted for almost one third of all businesses offering food/drink items (about one quarter of businesses offering any healthful items and more than two thirds of businesses offering only less-healthful options). Food environments include many businesses not primarily focused on selling foods/drinks. Studies that do not consider OSB may miss important food/drink sources, be incomplete and inaccurate, and potentially misguide interventions. OSB hold promise for improving food environments and community health by offering healthful items; some already do. Copyright © 2018 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.
Holding on to hope: A review of the literature exploring missing persons, hope and ambiguous loss.
Wayland, Sarah; Maple, Myfanwy; McKay, Kathy; Glassock, Geoffrey
2016-01-01
When a person goes missing, those left behind mourn an ambiguous loss where grief can be disenfranchised. Different to bereavement following death, hope figures into this experience as a missing person has the potential to return. This review explores hope for families of missing people. Lived experience of ambiguous loss was deconstructed to reveal responses punctuated by hope, which had practical and psychological implications for those learning to live with an unresolved absence. Future lines of enquiry must address the dearth of research exploring the role of hope, unresolved grief, and its clinical implications when a person is missing.
Item response theory scoring and the detection of curvilinear relationships.
Carter, Nathan T; Dalal, Dev K; Guan, Li; LoPilato, Alexander C; Withrow, Scott A
2017-03-01
Psychologists are increasingly positing theories of behavior that suggest psychological constructs are curvilinearly related to outcomes. However, results from empirical tests for such curvilinear relations have been mixed. We propose that correctly identifying the response process underlying responses to measures is important for the accuracy of these tests. Indeed, past research has indicated that item responses to many self-report measures follow an ideal point response process, wherein respondents agree only to items that reflect their own standing on the measured variable, as opposed to a dominance process, wherein stronger agreement, regardless of item content, is always indicative of higher standing on the construct. We test whether item response theory (IRT) scoring appropriate for the underlying response process to self-report measures results in more accurate tests for curvilinearity. In 2 simulation studies, we show that, regardless of the underlying response process used to generate the data, using the traditional sum-score generally results in high Type 1 error rates or low power for detecting curvilinearity, depending on the distribution of item locations. With few exceptions, appropriate power and Type 1 error rates are achieved when dominance-based and ideal point-based IRT scoring are correctly used to score dominance and ideal point response data, respectively. We conclude that (a) researchers should be theory-guided when hypothesizing and testing for curvilinear relations; (b) correctly identifying whether responses follow an ideal point versus dominance process, particularly when items are not extreme, is critical; and (c) IRT model-based scoring is crucial for accurate tests of curvilinearity. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Assessing Construct Validity Using Multidimensional Item Response Theory.
ERIC Educational Resources Information Center
Ackerman, Terry A.
The concept of a user-specified validity sector is discussed. The idea of the validity sector combines the work of M. D. Reckase (1986) and R. Shealy and W. Stout (1991). Reckase developed a methodology to represent an item in a multidimensional latent space as a vector. Item vectors are computed using multidimensional item response theory item…
ERIC Educational Resources Information Center
Dimitrov, Dimiter M.
2007-01-01
The validation of cognitive attributes required for correct answers on binary test items or tasks has been addressed in previous research through the integration of cognitive psychology and psychometric models using parametric or nonparametric item response theory, latent class modeling, and Bayesian modeling. All previous models, each with their…
ERIC Educational Resources Information Center
Bilir, Mustafa Kuzey
2009-01-01
This study uses a new psychometric model (mixture item response theory-MIMIC model) that simultaneously estimates differential item functioning (DIF) across manifest groups and latent classes. Current DIF detection methods investigate DIF from only one side, either across manifest groups (e.g., gender, ethnicity, etc.), or across latent classes…
Item Response Theory and Health Outcomes Measurement in the 21st Century
Hays, Ron D.; Morales, Leo S.; Reise, Steve P.
2006-01-01
Item response theory (IRT) has a number of potential advantages over classical test theory in assessing self-reported health outcomes. IRT models yield invariant item and latent trait estimates (within a linear transformation), standard errors conditional on trait level, and trait estimates anchored to item content. IRT also facilitates evaluation of differential item functioning, inclusion of items with different response formats in the same scale, and assessment of person fit and is ideally suited for implementing computer adaptive testing. Finally, IRT methods can be helpful in developing better health outcome measures and in assessing change over time. These issues are reviewed, along with a discussion of some of the methodological and practical challenges in applying IRT methods. PMID:10982088
Science and the public welfare
Press, F.
1991-01-01
Earthquakes & Volcanoes has achieved much in its 20-year history. It serves as a link with policy makers and the public. It offers a variety of information attractive to professionals: historical, cultural, and current events, and news items not found or missed elsewhere. And the journal can anticipate an even more important role in the future because, with increasing concentrations of population as well as building costs, people and their constructs have become more vulnerable to hazards.
NONPARAMETRIC MANOVA APPROACHES FOR NON-NORMAL MULTIVARIATE OUTCOMES WITH MISSING VALUES
He, Fanyin; Mazumdar, Sati; Tang, Gong; Bhatia, Triptish; Anderson, Stewart J.; Dew, Mary Amanda; Krafty, Robert; Nimgaonkar, Vishwajit; Deshpande, Smita; Hall, Martica; Reynolds, Charles F.
2017-01-01
Between-group comparisons often entail many correlated response variables. The multivariate linear model, with its assumption of multivariate normality, is the accepted standard tool for these tests. When this assumption is violated, the nonparametric multivariate Kruskal-Wallis (MKW) test is frequently used. However, this test requires complete cases with no missing values in response variables. Deletion of cases with missing values likely leads to inefficient statistical inference. Here we extend the MKW test to retain information from partially-observed cases. Results of simulated studies and analysis of real data show that the proposed method provides adequate coverage and superior power to complete-case analyses. PMID:29416225
The effect of response modality on immediate serial recall in dementia of the Alzheimer type.
Macé, Anne-Laure; Ergis, Anne-Marie; Caza, Nicole
2012-09-01
Contrary to traditional models of verbal short-term memory (STM), psycholinguistic accounts assume that temporary retention of verbal materials is an intrinsic property of word processing. Therefore, memory performance will depend on the nature of the STM tasks, which vary according to the linguistic representations they engage. The aim of this study was to explore the effect of response modality on verbal STM performance in individuals with dementia of the Alzheimer Type (DAT), and its relationship with the patients' word-processing deficits. Twenty individuals with mild DAT and 20 controls were tested on an immediate serial recall (ISR) task using the same items across two response modalities (oral and picture pointing) and completed a detailed language assessment. When scoring of ISR performance was based on item memory regardless of item order, a response modality effect was found for all participants, indicating that they recalled more items with picture pointing than with oral response. However, this effect was less marked in patients than in controls, resulting in an interaction. Interestingly, when recall of both item and order was considered, results indicated similar performance between response modalities in controls, whereas performance was worse for pointing than for oral response in patients. Picture-naming performance was also reduced in patients relative to controls. However, in the word-to-picture matching task, a similar pattern of responses was found between groups for incorrectly named pictures of the same items. The finding of a response modality effect in item memory for all participants is compatible with the assumption that semantic influences are greater in picture pointing than in oral response, as predicted by psycholinguistic models. Furthermore, patients' performance was modulated by their word-processing deficits, showing a reduced advantage relative to controls. 
Overall, the response modality effect observed in this study for item memory suggests that verbal STM performance is intrinsically linked with word processing capacities in both healthy controls and individuals with mild DAT, supporting psycholinguistic models of STM.
ERIC Educational Resources Information Center
Wallace, Colin S.; Prather, Edward E.; Duncan, Douglas K.
2012-01-01
This is the third of five papers detailing our national study of general education astronomy students' conceptual and reasoning difficulties with cosmology. In this paper, we use item response theory to analyze students' responses to three out of the four conceptual cosmology surveys we developed. The specific item response theory model we use is…
ERIC Educational Resources Information Center
Flowers, Claudia P.; Raju, Nambury S.; Oshima, T. C.
Current interest in the assessment of measurement equivalence emphasizes two methods of analysis, linear, and nonlinear procedures. This study simulated data using the graded response model to examine the performance of linear (confirmatory factor analysis or CFA) and nonlinear (item-response-theory-based differential item function or IRT-Based…
A Polytomous Item Response Theory Analysis of Social Physique Anxiety Scale
ERIC Educational Resources Information Center
Fletcher, Richard B.; Crocker, Peter
2014-01-01
The present study investigated the social physique anxiety scale's factor structure and item properties using confirmatory factor analysis and item response theory. An additional aim was to identify differences in response patterns between groups (gender). A large sample of high school students aged 11-15 years (N = 1,529) consisting of n =…
Item Response Theory at Subject- and Group-Level. Research Report 90-1.
ERIC Educational Resources Information Center
Tobi, Hilde
This paper reviews the literature about item response models for the subject level and aggregated level (group level). Group-level item response models (IRMs) are used in the United States in large-scale assessment programs such as the National Assessment of Educational Progress and the California Assessment Program. In the Netherlands, these…
ERIC Educational Resources Information Center
Schilling, Stephen G.
2007-01-01
In this paper the author examines the role of item response theory (IRT), particularly multidimensional item response theory (MIRT) in test validation from a validity argument perspective. The author provides justification for several structural assumptions and interpretations, taking care to describe the role he believes they should play in any…
ERIC Educational Resources Information Center
von Davier, Matthias; Sinharay, Sandip
2009-01-01
This paper presents an application of a stochastic approximation EM-algorithm using a Metropolis-Hastings sampler to estimate the parameters of an item response latent regression model. Latent regression models are extensions of item response theory (IRT) to a 2-level latent variable model in which covariates serve as predictors of the…
ERIC Educational Resources Information Center
Anderson, Daniel; Kahn, Joshua D.; Tindal, Gerald
2017-01-01
Unidimensionality and local independence are two common assumptions of item response theory. The former implies that all items measure a common latent trait, while the latter implies that responses are independent, conditional on respondents' location on the latent trait. Yet, few tests are truly unidimensional. Unmodeled dimensions may result in…
ERIC Educational Resources Information Center
Crino, Michael D.; And Others
1985-01-01
The random response technique was compared to a direct questionnaire, administered to college students, to investigate whether or not the responses predicted the social desirability of the item. Results suggest support for the hypothesis. A 33-item version of the Marlowe-Crowne Social Desirability Scale which was used is included. (GDC)
The Act of Answering Questions Elicited Differentiated Responses in a Concealed Information Test.
Otsuka, Takuro; Mizutani, Mitsuyoshi; Yagi, Akihiro; Katayama, Jun'ichi
2018-04-17
The concealed information test (CIT), a psychophysiological detection of deception test, compares physiological responses between crime-related and crime-unrelated items. In previous studies, whether the act of answering questions affected physiological responses was unclear. This study examined effects of both question-related and answer-related processes on physiological responses. Twenty participants received a modified CIT, in which the interval between presentation of questions and answering them was 27 s. Differentiated respiratory movements and cardiovascular responses between items were observed for both questions (items) and answers, while differentiated skin conductance response was observed only for questions. These results suggest that physiological responses to questions reflected orientation to a crime-related item, while physiological responses during answering reflected inhibition of psychological arousal caused by orienting. Regarding the CIT's accuracy, participants' perception of the questions themselves more strongly influenced physiological responses than answering them. © 2018 American Academy of Forensic Sciences.
Ambulant Measurements of Physiological Status and Cognitive Performance during Sustained Operations
2009-10-01
the target, reaction time, illegal responses, and missed responses were recorded. 2.4 Physiological measurements 2.4.1 Anthropometry ...system (SPi-Elite, GPsports Australia ) was mounted on the soldiers’ backpack. The system measured continuously during the training weeks. Walking or...the percentage missed stimuli were even more alike. 3.3 Physiological measurements 3.3.1 Anthropometry The soldiers who completed the training
Development and validation of an item response theory-based Social Responsiveness Scale short form.
Sturm, Alexandra; Kuhfeld, Megan; Kasari, Connie; McCracken, James T
2017-09-01
Research and practice in autism spectrum disorder (ASD) rely on quantitative measures, such as the Social Responsiveness Scale (SRS), for characterization and diagnosis. Like many ASD diagnostic measures, SRS scores are influenced by factors unrelated to ASD core features. This study further interrogates the psychometric properties of the SRS using item response theory (IRT), and demonstrates a strategy to create a psychometrically sound short form by applying IRT results. Social Responsiveness Scale analyses were conducted on a large sample (N = 21,426) of youth from four ASD databases. Items were subjected to item factor analyses and evaluation of item bias by gender, age, expressive language level, behavior problems, and nonverbal IQ. Item selection based on item psychometric properties, DIF analyses, and substantive validity produced a reduced item SRS short form that was unidimensional in structure, highly reliable (α = .96), and free of gender, age, expressive language, behavior problems, and nonverbal IQ influence. The short form also showed strong relationships with established measures of autism symptom severity (ADOS, ADI-R, Vineland). Degree of association between all measures varied as a function of expressive language. Results identified specific SRS items that are more vulnerable to non-ASD-related traits. The resultant 16-item SRS short form may possess superior psychometric properties compared to the original scale and emerge as a more precise measure of ASD core symptom severity, facilitating research and practice. Future research using IRT is needed to further refine existing measures of autism symptomatology. © 2017 Association for Child and Adolescent Mental Health.
Gifford, Katherine A; Liu, Dandan; Romano, Raymond; Jones, Richard N; Jefferson, Angela L
2015-12-01
Subjective cognitive decline (SCD) may indicate unhealthy cognitive changes, but no standardized SCD measurement exists. This pilot study aims to identify reliable SCD questions. 112 cognitively normal (NC, 76±8 years, 63% female), 43 mild cognitive impairment (MCI; 77±7 years, 51% female), and 33 diagnostically ambiguous participants (79±9 years, 58% female) were recruited from a research registry and completed 57 self-report SCD questions. Psychometric methods were used for item-reduction. Factor analytic models assessed unidimensionality of the latent trait (SCD); 19 items were removed with extreme response distribution or trait-fit. Item response theory (IRT) provided information about question utility; 17 items with low information were dropped. Post-hoc simulation using computerized adaptive test (CAT) modeling selected the most commonly used items (n=9 of 21 items) that represented the latent trait well (r=0.94) and differentiated NC from MCI participants (F(1,146)=8.9, p=0.003). Item response theory and computerized adaptive test modeling identified nine reliable SCD items. This pilot study is a first step toward refining SCD assessment in older adults. Replication of these findings and validation with Alzheimer's disease biomarkers will be an important next step for the creation of a SCD screener.
Cross-Cultural Validation of the Quality of Life in Hand Eczema Questionnaire (QOLHEQ).
Ofenloch, Robert F; Oosterhaven, Jart A F; Susitaival, Päivikki; Svensson, Åke; Weisshaar, Elke; Minamoto, Keiko; Onder, Meltem; Schuttelaar, Marie Louise A; Bulbul Baskan, Emel; Diepgen, Thomas L; Apfelbacher, Christian
2017-07-01
The Quality of Life in Hand Eczema Questionnaire (QOLHEQ) is the only instrument assessing disease-specific health-related quality of life in patients with hand eczema. It is available in eight language versions. In this study we assessed if the items of different language versions of the QOLHEQ yield comparable values across countries. An international multicenter study was conducted with participating centers in Finland, Germany, Japan, The Netherlands, Sweden, and Turkey. Methods of item response theory were applied to each subscale to assess differential item functioning for items among countries. Overall, 662 hand eczema patients were recruited into the study. Single items were removed or split according to the item response theory model by country to resolve differential item functioning. After this adjustment, none of the four subscales of the QOLHEQ showed significant misfit to the item response theory model (P < 0.01), and a Person Separation Index of greater than 0.7 showed good internal consistency for each subscale. By adapting the scoring of the QOLHEQ using the methods of item response theory, it was possible to obtain QOLHEQ values that are comparable across countries. Cross-cultural variations in the interpretation of single items were resolved. The QOLHEQ is now ready to be used in international studies assessing the health-related quality of life impact of hand eczema. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Analyzing force concept inventory with item response theory
NASA Astrophysics Data System (ADS)
Wang, Jing; Bao, Lei
2010-10-01
Item response theory is a popular assessment method used in education. It rests on the assumption of a probability framework that relates students' innate ability and their performance on test questions. Item response theory transforms students' raw test scores into a scaled proficiency score, which can be used to compare results obtained with different test questions. The scaled score also addresses the issues of ceiling effects and guessing, which commonly exist in quantitative assessment. We used item response theory to analyze the force concept inventory (FCI). Our results show that item response theory can be useful for analyzing physics concept surveys such as the FCI and produces results about the individual questions and student performance that are beyond the capability of classical statistics. The theory yields detailed measurement parameters regarding the difficulty, discrimination features, and probability of correct guess for each of the FCI questions.
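The three per-item parameters named in the abstract above (difficulty, discrimination, and probability of a correct guess) match the usual three-parameter logistic (3PL) form. The abstract does not name the model or give any estimates, so the following is only a minimal sketch under that assumption; the function name and the parameter values are hypothetical.

```python
import math


def three_pl_probability(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under an assumed 3PL model.

    theta: student proficiency
    a: item discrimination
    b: item difficulty
    c: probability of a correct guess (lower asymptote)
    """
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (1.0 - c) * logistic


# Hypothetical item: when proficiency equals difficulty, the probability
# sits halfway between the guessing floor c and 1.
p_at_difficulty = three_pl_probability(theta=0.5, a=1.2, b=0.5, c=0.2)

# A very low proficiency approaches the guessing floor c.
p_low_ability = three_pl_probability(theta=-10.0, a=1.2, b=0.5, c=0.2)
```

The guessing parameter is what lets a scaled score account for guessing: even the weakest student is predicted to succeed at rate c, not zero.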
Item Response Theory Models for Performance Decline during Testing
ERIC Educational Resources Information Center
Jin, Kuan-Yu; Wang, Wen-Chung
2014-01-01
Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…
The Effect of Error in Item Parameter Estimates on the Test Response Function Method of Linking.
ERIC Educational Resources Information Center
Kaskowitz, Gary S.; De Ayala, R. J.
2001-01-01
Studied the effect of item parameter estimation for computation of linking coefficients for the test response function (TRF) linking/equating method. Simulation results showed that linking was more accurate when there was less error in the parameter estimates, and that 15 or 25 common items provided better results than 5 common items under both…
ERIC Educational Resources Information Center
Gu, Fei; Skorupski, William P.; Hoyle, Larry; Kingston, Neal M.
2011-01-01
Ramsay-curve item response theory (RC-IRT) is a nonparametric procedure that estimates the latent trait using splines, and no distributional assumption about the latent trait is required. For item parameters of the two-parameter logistic (2-PL), three-parameter logistic (3-PL), and polytomous IRT models, RC-IRT can provide more accurate estimates…
Jordan, Pascal; Shedden-Mora, Meike C; Löwe, Bernd
2017-01-01
The Generalized Anxiety Disorder scale (GAD-7) is one of the most frequently used diagnostic self-report scales for screening, diagnosis and severity assessment of anxiety disorder. Its psychometric properties from the view of the Item Response Theory paradigm have rarely been investigated. We aimed to close this gap by analyzing the GAD-7 within a large sample of primary care patients with respect to its psychometric properties and its implications for scoring using Item Response Theory. Robust, nonparametric statistics were used to check unidimensionality of the GAD-7. A graded response model was fitted using a Bayesian approach. The model fit was evaluated using posterior predictive p-values, item information functions were derived and optimal predictions of anxiety were calculated. The sample included N = 3404 primary care patients (60% female; mean age, 52.2; standard deviation, 19.2). The analysis indicated no deviations of the GAD-7 scale from unidimensionality and a decent fit of a graded response model. The commonly suggested ultra-brief measure consisting of the first two items, the GAD-2, was supported by item information analysis. The first four items discriminated better than the last three items with respect to latent anxiety. The information provided by the first four items should be weighted more heavily. Moreover, estimates corresponding to low to moderate levels of anxiety show greater variability. The psychometric validity of the GAD-2 was supported by our analysis.
Shedden-Mora, Meike C.; Löwe, Bernd
2017-01-01
Objective: The Generalized Anxiety Disorder scale (GAD-7) is one of the most frequently used diagnostic self-report scales for screening, diagnosis and severity assessment of anxiety disorder. Its psychometric properties from the view of the Item Response Theory paradigm have rarely been investigated. We aimed to close this gap by analyzing the GAD-7 within a large sample of primary care patients with respect to its psychometric properties and its implications for scoring using Item Response Theory. Methods: Robust, nonparametric statistics were used to check unidimensionality of the GAD-7. A graded response model was fitted using a Bayesian approach. The model fit was evaluated using posterior predictive p-values, item information functions were derived and optimal predictions of anxiety were calculated. Results: The sample included N = 3404 primary care patients (60% female; mean age, 52.2; standard deviation, 19.2). The analysis indicated no deviations of the GAD-7 scale from unidimensionality and a decent fit of a graded response model. The commonly suggested ultra-brief measure consisting of the first two items, the GAD-2, was supported by item information analysis. The first four items discriminated better than the last three items with respect to latent anxiety. Conclusion: The information provided by the first four items should be weighted more heavily. Moreover, estimates corresponding to low to moderate levels of anxiety show greater variability. The psychometric validity of the GAD-2 was supported by our analysis. PMID:28771530
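For readers unfamiliar with the graded response model mentioned in the abstract above, the sketch below computes category probabilities for one ordered item in the standard Samejima form. It does not reproduce the authors' Bayesian fitting procedure; the function name, the discrimination value, and the thresholds are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def graded_response_probabilities(
    theta: float, a: float, thresholds: Sequence[float]
) -> List[float]:
    """Category probabilities for one item under a graded response model.

    theta: latent trait level (e.g., latent anxiety)
    a: item discrimination
    thresholds: increasing boundary locations b_1 < ... < b_{K-1}
    Returns K probabilities, one per ordered response category.
    """
    # Cumulative probabilities of responding in category k or higher.
    at_or_above = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    # Pad with P(X >= 0) = 1 and P(X >= K) = 0, then take differences.
    bounds = [1.0] + at_or_above + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]


# Hypothetical four-category item with illustrative parameter values.
probs = graded_response_probabilities(theta=0.0, a=1.0, thresholds=[-1.0, 0.0, 1.0])
```

A larger discrimination value a makes the category probabilities change more sharply with theta, which is the sense in which some items "discriminate better" than others.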
Do large-scale assessments measure students' ability to integrate scientific knowledge?
NASA Astrophysics Data System (ADS)
Lee, Hee-Sun
2010-03-01
Large-scale assessments are used as a means to diagnose the current status of student achievement in science and compare students across schools, states, and countries. For efficiency, multiple-choice items and dichotomously-scored open-ended items are pervasively used in large-scale assessments such as Trends in International Math and Science Study (TIMSS). This study investigated how well these items measure secondary school students' ability to integrate scientific knowledge. This study collected responses of 8400 students to 116 multiple-choice and 84 open-ended items and applied an Item Response Theory analysis based on the Rasch Partial Credit Model. Results indicate that most multiple-choice items and dichotomously-scored open-ended items can be used to determine whether students have normative ideas about science topics, but cannot measure whether students integrate multiple pieces of relevant science ideas. Only when the scoring rubric is redesigned to capture subtle nuances of student open-ended responses do open-ended items become a valid and reliable tool to assess students' knowledge integration ability.
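The Rasch Partial Credit Model named in the abstract above assigns a probability to each score level of a polytomously scored item. The sketch below shows the standard form of those category probabilities; it is not the study's estimation code, and the function name and step difficulties are hypothetical.

```python
import math
from typing import List, Sequence


def partial_credit_probabilities(
    theta: float, step_difficulties: Sequence[float]
) -> List[float]:
    """Score-level probabilities for one item under the Partial Credit Model.

    theta: student ability
    step_difficulties: one difficulty per step between adjacent score levels
    Returns probabilities for scores 0 .. len(step_difficulties).
    """
    # The exponent for score k is the sum of (theta - delta_j) over the
    # first k steps; the exponent for score 0 is defined as 0.
    exponents = [0.0]
    for delta in step_difficulties:
        exponents.append(exponents[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(exponents)
    weights = [math.exp(e - largest) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# With a single step the model reduces to the dichotomous Rasch model.
dichotomous = partial_credit_probabilities(theta=0.0, step_difficulties=[0.0])

# Hypothetical open-ended item scored 0 to 3.
rubric_item = partial_credit_probabilities(theta=5.0, step_difficulties=[-1.0, 0.5, 1.2])
```

A rubric with more score levels simply adds step difficulties, which is how a redesigned rubric can separate partially integrated from fully integrated responses.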
Accounting for Local Dependence with the Rasch Model: The Paradox of Information Increase.
Andrich, David
Test theories imply statistical, local independence. Where local independence is violated, models of modern test theory that account for it have been proposed. One violation of local independence occurs when the response to one item governs the response to a subsequent item. Expanding on a formulation of this kind of violation between two items in the dichotomous Rasch model, this paper derives three related implications. First, it formalises how the polytomous Rasch model for an item constituted by summing the scores of the dependent items absorbs the dependence in its threshold structure. Second, it shows that as a consequence the unit when the dependence is accounted for is not the same as if the items had no response dependence. Third, it explains the paradox, known, but not explained in the literature, that the greater the dependence of the constituent items the greater the apparent information in the constituted polytomous item when it should provide less information.
Zhu, Weiguo; Sun, Weixiang; Xu, Leilei; Sun, Xu; Liu, Zhen; Qiu, Yong; Zhu, Zezhang
2017-04-01
OBJECTIVE Recently, minimally invasive scoliosis surgery (MISS) was introduced for the correction of adult scoliosis. Multiple benefits including a good deformity correction rate and fewer complications have been demonstrated. However, few studies have reported on the use of MISS for the management of adolescent idiopathic scoliosis (AIS). The purpose of this study was to investigate the outcome of posterior MISS assisted by O-arm navigation for the correction of Lenke Type 5C AIS. METHODS The authors searched a database for all patients with AIS who had been treated with either MISS or PSF between November 2012 and January 2014. Levels of fusion, density of implants, operation time, and estimated blood loss (EBL) were recorded. Coronal and sagittal parameters were evaluated before surgery, immediately after surgery, and at the last follow-up. The accuracy of pedicle screw placement was assessed according to postoperative axial CT images in both groups. The 22-item Scoliosis Research Society questionnaire (SRS-22) results and complications were collected during follow-up. RESULTS The authors retrospectively reviewed the records of 45 patients with Lenke Type 5C AIS, 15 who underwent posterior MISS under O-arm navigation and 30 who underwent posterior spinal fusion (PSF). The 2 treatment groups were matched in terms of baseline characteristics. Comparison of radiographic parameters revealed no obvious difference between the 2 groups immediately after surgery or at the final follow-up; however, the MISS patients had significantly less EBL (p < 0.001) and longer operation times (p = 0.002). The evaluation of pain and self-image using the SRS-22 showed significantly higher scores in the MISS group (p = 0.013 and 0.046, respectively) than in the PSF group. Postoperative CT showed high accuracy in pedicle placement in both groups. No deep wound infection, pseudarthrosis, additional surgery, implant failure, or neurological complications were recorded in either group. 
CONCLUSIONS Minimally invasive scoliosis surgery is an effective and safe alternative to open surgery for patients with Lenke Type 5C AIS. Compared with results of the open approach, the outcomes of MISS are promising, with reduced morbidity. Before the routine use of MISS, however, long-term data are needed.
Oral health in the Japan self-defense forces - a representative survey
2011-01-01
Background: The oral health of military populations is usually not very well characterized compared to civilian populations. The aim of this study was to investigate two physical oral health characteristics and one perceived oral health measure and their correlation in the Japan self-defense forces (JSDF). Methods: Number of missing teeth, denture status, and OHRQoL as evaluated by the Japanese 14-item version of the Oral Health Impact Profile (OHIP-J14), as well as the correlation between these oral health measures, were investigated in 911 personnel in the JSDF. Results: Subjects did not have a substantial number of missing teeth and only 4% used removable dentures. The mean OHIP-J14 score was 4.6 ± 6.7 units. The magnitude of the correlation between the number of missing teeth and OHIP-J14 scores was small (r = 0.22, p < 0.001). Mean OHIP-J14 scores differed between subjects with and without dentures (8.6 and 4.4, p < 0.001). Conclusions: Compared to Japanese civilian populations, personnel of the JSDF demonstrated good oral health. Two physical oral health characteristics were associated with perceived oral health. PMID:21501526
Do people treat missing information adaptively when making inferences?
Garcia-Retamero, Rocio; Rieskamp, Jörg
2009-10-01
When making inferences, people are often confronted with situations with incomplete information. Previous research has led to a mixed picture about how people react to missing information. Options include ignoring missing information, treating it as either positive or negative, using the average of past observations for replacement, or using the most frequent observation of the available information as a placeholder. The accuracy of these inference mechanisms depends on characteristics of the environment. When missing information is uniformly distributed, it is most accurate to treat it as the average, whereas when it is negatively correlated with the criterion to be judged, treating missing information as if it were negative is most accurate. Whether people treat missing information adaptively according to the environment was tested in two studies. The results show that participants were sensitive to how missing information was distributed in an environment and most frequently selected the mechanism that was most adaptive. From these results the authors conclude that reacting to missing information in different ways is an adaptive response to environmental characteristics.
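The options for handling missing information listed in the abstract above can be written as interchangeable replacement rules. The sketch below is only an illustration of those rules, not the authors' experimental procedure; the function name, the strategy labels, and the 1.0/0.0 coding of positive and negative cue values are assumptions.

```python
from collections import Counter
from statistics import mean
from typing import List, Optional, Sequence


def resolve_missing(
    cues: Sequence[Optional[float]],
    observed_history: Sequence[float],
    strategy: str,
) -> List[float]:
    """Apply one strategy for missing cue values (None marks a missing cue).

    Assumed coding: 1.0 is a positive cue value, 0.0 a negative one.
    observed_history holds past observations of the cue.
    """
    if strategy == "ignore":
        # Drop missing cues and decide on the remaining ones only.
        return [c for c in cues if c is not None]
    if strategy == "positive":
        placeholder = 1.0
    elif strategy == "negative":
        placeholder = 0.0
    elif strategy == "average":
        placeholder = mean(observed_history)
    elif strategy == "most_frequent":
        placeholder = Counter(observed_history).most_common(1)[0][0]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [placeholder if c is None else c for c in cues]


# Hypothetical cue profile with one missing value and four past observations.
cues = [1.0, None, 0.0]
history = [1.0, 1.0, 0.0, 1.0]
as_average = resolve_missing(cues, history, "average")
as_negative = resolve_missing(cues, history, "negative")
```

Which rule is most accurate depends on how missingness relates to the criterion in the environment, which is the point the abstract makes.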
ERIC Educational Resources Information Center
Kleinke, David J.
Four forms of a 36-item adaptation of the Stanford Achievement Test were administered to 484 fourth graders. External factors potentially influencing test performance were examined, namely: (1) item order (easy-to-difficult vs. uniform); (2) response location (left column vs. right column); (3) handedness which may interact with response location;…
Person Response Functions and the Definition of Units in the Social Sciences
ERIC Educational Resources Information Center
Engelhard, George, Jr.; Perkins, Aminah F.
2011-01-01
Humphry (this issue) has written a thought-provoking piece on the interpretation of item discrimination parameters as scale units in item response theory. One of the key features of his work is the description of an item response theory (IRT) model that he calls the logistic measurement function that combines aspects of two traditions in IRT that…
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.
2016-01-01
The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete…
ERIC Educational Resources Information Center
Fu, Jianbin
2016-01-01
The multidimensional item response theory (MIRT) models with covariates proposed by Haberman and implemented in the "mirt" program provide a flexible way to analyze data based on item response theory. In this report, we discuss applications of the MIRT models with covariates to longitudinal test data to measure skill differences at the…
ERIC Educational Resources Information Center
Tsutakawa, Robert K.; Lin, Hsin Ying
Item response curves for a set of binary responses are studied from a Bayesian viewpoint of estimating the item parameters. For the two-parameter logistic model with normally distributed ability, restricted bivariate beta priors are used to illustrate the computation of the posterior mode via the EM algorithm. The procedure is illustrated by data…
Modeling Answer Change Behavior: An Application of a Generalized Item Response Tree Model
ERIC Educational Resources Information Center
Jeon, Minjeong; De Boeck, Paul; van der Linden, Wim
2017-01-01
We present a novel application of a generalized item response tree model to investigate test takers' answer change behavior. The model allows us to simultaneously model the observed patterns of the initial and final responses after an answer change as a function of a set of latent traits and item parameters. The proposed application is illustrated…
Mallinckrodt, Brent; Tekie, Yacob T
2016-11-01
The Working Alliance Inventory (WAI) has made great contributions to psychotherapy research. However, studies suggest the 7-point response format and 3-factor structure of the client version may have psychometric problems. This study used Rasch item response theory (IRT) to (a) improve WAI response format, (b) compare two brief 12-item versions (WAI-sr; WAI-s), and (c) develop a new 16-item Brief Alliance Inventory (BAI). Archival data from 1786 counseling center and community clients were analyzed. IRT findings suggested problems with crossed category thresholds. A rescoring scheme that combines neighboring responses to create 5- and 4-point scales sharply reduced these problems. Although subscale variance was reduced by 11-26%, rescoring yielded improved reliability and generally higher correlations with therapy process (session depth and smoothness) and outcome measures (residual gain symptom improvement). The 16-item BAI was designed to maximize "bandwidth" of item difficulty and preserve a broader range of WAI sensitivity than WAI-s or WAI-sr. Comparisons suggest the BAI performed better in several respects than the WAI-s or WAI-sr and equivalent to the full WAI on several performance indicators.
An approximate generalized linear model with random effects for informative missing data.
Follmann, D; Wu, M
1995-03-01
This paper develops a class of models to deal with missing data from longitudinal studies. We assume that separate models for the primary response and missingness (e.g., number of missed visits) are linked by a common random parameter. Such models have been developed in the econometrics (Heckman, 1979, Econometrica 47, 153-161) and biostatistics (Wu and Carroll, 1988, Biometrics 44, 175-188) literature for a Gaussian primary response. We allow the primary response, conditional on the random parameter, to follow a generalized linear model and approximate the generalized linear model by conditioning on the data that describes missingness. The resultant approximation is a mixed generalized linear model with possibly heterogeneous random effects. An example is given to illustrate the approximate approach, and simulations are performed to critique the adequacy of the approximation for repeated binary data.
Khorramdel, Lale; von Davier, Matthias
2014-01-01
This study shows how to address the problem of trait-unrelated response styles (RS) in rating scales using multidimensional item response theory. The aim is to test and correct data for RS in order to provide fair assessments of personality. Expanding on an approach presented by Böckenholt (2012), observed rating data are decomposed into multiple response processes based on a multinomial processing tree. The data come from a questionnaire consisting of 50 items of the International Personality Item Pool measuring the Big Five dimensions administered to 2,026 U.S. students with a 5-point rating scale. It is shown that this approach can be used to test if RS exist in the data and that RS can be differentiated from trait-related responses. Although the extreme RS appear to be unidimensional after exclusion of only 1 item, a unidimensional measure for the midpoint RS is obtained only after exclusion of 10 items. Both RS measurements show high cross-scale correlations and item response theory-based (marginal) reliabilities. Cultural differences could be found in giving extreme responses. Moreover, it is shown how to score rating data to correct for RS after being proved to exist in the data.
Prisciandaro, James J; Tolliver, Bryan K
2016-11-15
The Young Mania Rating Scale (YMRS) and Montgomery-Asberg Depression Rating Scale (MADRS) are among the most widely used outcome measures for clinical trials of medications for Bipolar Disorder (BD). Nonetheless, very few studies have examined the measurement characteristics of the YMRS and MADRS in individuals with BD using modern psychometric methods. The present study evaluated the YMRS and MADRS in the Systematic Treatment Enhancement Program for BD (STEP-BD) study using Item Response Theory (IRT). Baseline data from 3716 STEP-BD participants were available for the present analysis. The Graded Response Model (GRM) was fit separately to YMRS and MADRS item responses. Differential item functioning (DIF) was examined by regressing a variety of clinically relevant covariates (e.g., sex, substance dependence) on all test items and on the latent symptom severity dimension, within each scale. Both scales: 1) contained several items that provided little or no psychometric information, 2) were inefficient, in that the majority of item response categories did not provide incremental psychometric information, 3) poorly measured participants outside of a narrow band of severity, 4) evidenced DIF for nearly all items, suggesting that item responses were, in part, determined by factors other than symptom severity. Limited to outpatients; DIF analysis only sensitive to certain forms of DIF. The present study provides evidence for significant measurement problems involving the YMRS and MADRS. More work is needed to refine these measures and/or develop suitable alternative measures of BD symptomatology for clinical trials research. Copyright © 2016 Elsevier B.V. All rights reserved.
Better assessment of physical function: item improvement is neglected but essential
2009-01-01
Introduction Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. Methods The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. Results We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. 
IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models having comparable model fits. Correlations between factors in the test data sets were > 0.90. Conclusions Item improvement must underlie attempts to improve outcome assessment. The clear, personally important and relevant, ability-framed items in the PROMIS Physical Function item bank perform well in PRO assessment. They will benefit from further study and application in a wider variety of rheumatic diseases in diverse clinical groups, including those at the extremes of physical functioning, and in different administration modes. PMID:20015354
Better assessment of physical function: item improvement is neglected but essential.
Bruce, Bonnie; Fries, James F; Ambrosini, Debbie; Lingala, Bharathi; Gandek, Barbara; Rose, Matthias; Ware, John E
2009-01-01
Physical function is a key component of patient-reported outcome (PRO) assessment in rheumatology. Modern psychometric methods, such as Item Response Theory (IRT) and Computerized Adaptive Testing, can materially improve measurement precision at the item level. We present the qualitative and quantitative item-evaluation process for developing the Patient Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank. The process was stepwise: we searched extensively to identify extant Physical Function items and then classified and selectively reduced the item pool. We evaluated retained items for content, clarity, relevance and comprehension, reading level, and translation ease by experts and patient surveys, focus groups, and cognitive interviews. We then assessed items by using classic test theory and IRT, used confirmatory factor analyses to estimate item parameters, and graded response modeling for parameter estimation. We retained the 20 Legacy (original) Health Assessment Questionnaire Disability Index (HAQ-DI) and the 10 SF-36's PF-10 items for comparison. Subjects were from rheumatoid arthritis, osteoarthritis, and healthy aging cohorts (n = 1,100) and a national Internet sample of 21,133 subjects. We identified 1,860 items. After qualitative and quantitative evaluation, 124 newly developed PROMIS items composed the PROMIS item bank, which included revised Legacy items with good fit that met IRT model assumptions. Results showed that the clearest and best-understood items were simple, in the present tense, and straightforward. Basic tasks (like dressing) were more relevant and important versus complex ones (like dancing). Revised HAQ-DI and PF-10 items with five response options had higher item-information content than did comparable original Legacy items with fewer response options. 
IRT analyses showed that the Physical Function domain satisfied general criteria for unidimensionality with one-, two-, three-, and four-factor models having comparable model fits. Correlations between factors in the test data sets were > 0.90. Item improvement must underlie attempts to improve outcome assessment. The clear, personally important and relevant, ability-framed items in the PROMIS Physical Function item bank perform well in PRO assessment. They will benefit from further study and application in a wider variety of rheumatic diseases in diverse clinical groups, including those at the extremes of physical functioning, and in different administration modes.
A Graphical Approach to Item Analysis. Research Report. ETS RR-04-10
ERIC Educational Resources Information Center
Livingston, Samuel A.; Dorans, Neil J.
2004-01-01
This paper describes an approach to item analysis that is based on the estimation of a set of response curves for each item. The response curves show, at a glance, the difficulty and the discriminating power of the item and the popularity of each distractor, at any level of the criterion variable (e.g., total score). The curves are estimated by…
ERIC Educational Resources Information Center
Tassé, Marc J.; Schalock, Robert L.; Thissen, David; Balboni, Giulia; Bersani, Henry, Jr.; Borthwick-Duffy, Sharon A.; Spreat, Scott; Widaman, Keith F.; Zhang, Dalun; Navas, Patricia
2016-01-01
The Diagnostic Adaptive Behavior Scale (DABS) was developed using item response theory (IRT) methods and was constructed to provide the most precise and valid adaptive behavior information at or near the cutoff point of making a decision regarding a diagnosis of intellectual disability. The DABS initial item pool consisted of 260 items. Using IRT…
ERIC Educational Resources Information Center
Stevenson, Claire E.; Heiser, Willem J.; Resing, Wilma C. M.
2016-01-01
Multiple-choice (MC) analogy items are often used in cognitive assessment. However, in dynamic testing, where the aim is to provide insight into potential for learning and the learning process, constructed-response (CR) items may be of benefit. This study investigated whether training with CR or MC items leads to differences in the strategy…
ERIC Educational Resources Information Center
Swygert, Kimberly A.
In this study, data from an operational computerized adaptive test (CAT) were examined in order to gather information concerning item response times in a CAT environment. The CAT under study included multiple-choice items measuring verbal, quantitative, and analytical reasoning. The analyses included the fitting of regression models describing the…
Item response theory in personality assessment: a demonstration using the MMPI-2 depression scale.
Childs, R A; Dahlstrom, W G; Kemp, S M; Panter, A T
2000-03-01
Item response theory (IRT) analyses have, over the past 3 decades, added much to our understanding of the relationships among and characteristics of test items, as revealed in examinees' response patterns. Assessment instruments used outside the educational context have only infrequently been analyzed using IRT, however. This study demonstrates the relevance of IRT to personality data through analyses of Scale 2 (the Depression Scale) on the revised Minnesota Multiphasic Personality Inventory (MMPI-2). A rich set of hypotheses regarding the items on this scale, including contrasts among the Harris-Lingoes and Wiener-Harmon subscales and differences in the items' measurement characteristics for men and women, are investigated through the IRT analyses.
Cohen, Matthew L; Kisala, Pamela A; Dyson-Hudson, Trevor A; Tulsky, David S
2018-05-01
Objective To develop modern patient-reported outcome measures that assess pain interference and pain behavior after spinal cord injury (SCI). Design Grounded-theory based qualitative item development; large-scale item calibration field-testing; confirmatory factor analyses; graded response model item response theory analyses; statistical linking techniques to transform scores to the Patient Reported Outcome Measurement Information System (PROMIS) metric. Setting Five SCI Model Systems centers and one Department of Veterans Affairs medical center in the United States. Participants Adults with traumatic SCI. Interventions N/A. Outcome Measures Spinal Cord Injury - Quality of Life (SCI-QOL) Pain Interference item bank, SCI-QOL Pain Interference short form, and SCI-QOL Pain Behavior scale. Results Seven hundred fifty-seven individuals with traumatic SCI completed 58 items addressing various aspects of pain. Items were then separated by whether they assessed pain interference or pain behavior, and poorly functioning items were removed. Confirmatory factor analyses confirmed that each set of items was unidimensional, and item response theory analyses were used to estimate slopes and thresholds for the items. Ultimately, 7 items (4 from PROMIS) comprised the Pain Behavior scale and 25 items (18 from PROMIS) comprised the Pain Interference item bank. Ten of these 25 items were selected to form the Pain Interference short form. Conclusions The SCI-QOL Pain Interference item bank and the SCI-QOL Pain Behavior scale demonstrated robust psychometric properties. The Pain Interference item bank is available as a computer adaptive test or short form for research and clinical applications, and scores are transformed to the PROMIS metric.
Why do we miss rare targets? Exploring the boundaries of the low prevalence effect
Rich, Anina N.; Kunar, Melina A.; Van Wert, Michael J.; Hidalgo-Sotelo, Barbara; Horowitz, Todd S.; Wolfe, Jeremy M.
2011-01-01
Observers tend to miss a disproportionate number of targets in visual search tasks with rare targets. This ‘prevalence effect’ may have practical significance since many screening tasks (e.g., airport security, medical screening) are low prevalence searches. It may also shed light on the rules used to terminate search when a target is not found. Here, we use perceptually simple stimuli to explore the sources of this effect. Experiment 1 shows a prevalence effect in inefficient spatial configuration search. Experiment 2 demonstrates this effect occurs even in a highly efficient feature search. However, the two prevalence effects differ. In spatial configuration search, misses seem to result from ending the search prematurely, while in feature search, they seem due to response errors. In Experiment 3, a minimum delay before response eliminated the prevalence effect for feature but not spatial configuration search. In Experiment 4, a target was present on each trial in either two (2AFC) or four (4AFC) orientations. With only two response alternatives, low prevalence produced elevated errors. Providing four response alternatives eliminated this effect. Low target prevalence puts searchers under pressure that tends to increase miss errors. We conclude that the specific source of those errors depends on the nature of the search. PMID:19146299
Reliability and validity of a short form household food security scale in a Caribbean community.
Gulliford, Martin C; Mahabir, Deepak; Rocke, Brian
2004-06-16
We evaluated the reliability and validity of the short form household food security scale in a different setting from the one in which it was developed. The scale was interview administered to 531 subjects from 286 households in north central Trinidad in Trinidad and Tobago, West Indies. We evaluated the six items by fitting item response theory models to estimate item thresholds, estimating agreement among respondents in the same households and estimating the slope index of income-related inequality (SII) after adjusting for age, sex and ethnicity. Item-score correlations ranged from 0.52 to 0.79 and Cronbach's alpha was 0.87. Item responses gave within-household correlation coefficients ranging from 0.70 to 0.78. Estimated item thresholds (standard errors) from the Rasch model ranged from -2.027 (0.063) for the 'balanced meal' item to 2.251 (0.116) for the 'hungry' item. The 'balanced meal' item had the lowest threshold in each ethnic group even though there was evidence of differential functioning for this item by ethnicity. Relative thresholds of other items were generally consistent with US data. Estimation of the SII, comparing those at the bottom with those at the top of the income scale, gave relative odds for an affirmative response of 3.77 (95% confidence interval 1.40 to 10.2) for the lowest severity item, and 20.8 (2.67 to 162.5) for highest severity item. Food insecurity was associated with reduced consumption of green vegetables after additionally adjusting for income and education (0.52, 0.28 to 0.96). The household food security scale gives reliable and valid responses in this setting. Differing relative item thresholds compared with US data do not require alteration to the cut-points for classification of 'food insecurity without hunger' or 'food insecurity with hunger'. The data provide further evidence that re-evaluation of the 'balanced meal' item is required.
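To make the reported Rasch thresholds concrete, the sketch below evaluates the standard Rasch response function, P = 1 / (1 + exp(-(θ - b))), at the two extreme thresholds quoted in the abstract above (-2.027 for the 'balanced meal' item and 2.251 for the 'hungry' item). The function name and the choice of a household at θ = 0 are assumptions for illustration; the abstract does not state the scaling of the latent severity dimension, so the resulting probabilities are indicative only.

```python
import math


def rasch_probability(theta, threshold):
    """Probability of an affirmative response under the Rasch model.

    theta:     latent food-insecurity severity of the respondent (logits).
    threshold: item severity threshold (logits).
    """
    return 1.0 / (1.0 + math.exp(-(theta - threshold)))


# Thresholds taken from the abstract; theta = 0 is an assumed reference household.
p_balanced_meal = rasch_probability(0.0, -2.027)
p_hungry = rasch_probability(0.0, 2.251)
print(f"balanced meal: {p_balanced_meal:.3f}, hungry: {p_hungry:.3f}")
```

Under these assumptions the low-threshold 'balanced meal' item is affirmed far more readily than the high-threshold 'hungry' item, which is the ordering of item severity the abstract describes.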
Computerized Adaptive Testing with Item Clones. Research Report.
ERIC Educational Resources Information Center
Glas, Cees A. W.; van der Linden, Wim J.
To reduce the cost of item writing and to enhance the flexibility of item presentation, items can be generated by item-cloning techniques. An important consequence of cloning is that it may cause variability on the item parameters. Therefore, a multilevel item response model is presented in which it is assumed that the item parameters of a…
Depression symptoms and lost productivity in chronic rhinosinusitis.
Campbell, Adam P; Phillips, Katie M; Hoehle, Lloyd P; Feng, Allen L; Bergmark, Regan W; Caradonna, David S; Gray, Stacey T; Sedaghat, Ahmad R
2017-03-01
Chronic rhinosinusitis (CRS) is associated with significant losses of patient productivity that cost billions of dollars every year. The causative factors for decreases in productivity in patients with CRS have yet to be determined. To determine which patterns of CRS symptoms drive lost productivity. Prospective, cross-sectional cohort study of 107 patients with CRS. Sinonasal symptom severity was measured using the 22-item Sinonasal Outcomes Test, from which sleep, nasal, otologic or facial pain, and emotional function subdomain scores were calculated using principal component analysis. Depression risk was assessed with the 2-item Patient Health Questionnaire (PHQ-2), whereas nasal obstruction was assessed with the Nasal Obstruction Symptom Evaluation (NOSE) instrument. Lost productivity was assessed by asking participants how many days of work and/or school they missed in the last 3 months because of CRS. Associations were sought between lost productivity and CRS symptoms. A total of 107 patients were recruited. Patients missed a mean (SD) of 3.1 (12.9) days of work or school because of CRS. Lost productivity was most strongly associated with the emotional function subdomain (β = 7.48; 95% confidence interval [CI], 5.71-9.25; P < .001). Reinforcing this finding, lost productivity was associated with PHQ-2 score (β = 4.72; 95% CI, 2.62-6.83; P < .001). Lost productivity was less strongly associated with the nasal symptom subdomain score (β = 2.65; 95% CI, 0.77-4.52; P = .007), and there was no association between lost productivity and NOSE score (β = 0.01; 95% CI, -0.12 to 0.13; P = .91). Symptoms associated with depression are most strongly associated with missed days of work or school because of CRS. Further treatment focusing on depression-associated symptoms in patients with CRS may reduce losses in productivity. Copyright © 2016 American College of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Pitchford, Melanie; Ball, Linden J.; Hunt, Thomas E.; Steel, Richard
2017-01-01
We report a study examining the role of ‘cognitive miserliness’ as a determinant of poor performance on the standard three-item Cognitive Reflection Test (CRT). The cognitive miserliness hypothesis proposes that people often respond incorrectly on CRT items because of an unwillingness to go beyond default, heuristic processing and invest time and effort in analytic, reflective processing. Our analysis (N = 391) focused on people’s response times to CRT items to determine whether predicted associations are evident between miserly thinking and the generation of incorrect, intuitive answers. Evidence indicated only a weak correlation between CRT response times and accuracy. Item-level analyses also failed to demonstrate predicted response-time differences between correct analytic and incorrect intuitive answers for two of the three CRT items. We question whether participants who give incorrect intuitive answers on the CRT can legitimately be termed cognitive misers and whether the three CRT items measure the same general construct. PMID:29099840
Development of the Contact Lens User Experience: CLUE Scales
Wirth, R. J.; Edwards, Michael C.; Henderson, Michael; Henderson, Terri; Olivares, Giovanna; Houts, Carrie R.
2016-01-01
ABSTRACT Purpose The field of optometry has become increasingly interested in patient-reported outcomes, reflecting a common trend occurring across the spectrum of healthcare. This article reviews the development of the Contact Lens User Experience: CLUE system designed to assess patient evaluations of contact lenses. CLUE was built using modern psychometric methods such as factor analysis and item response theory. Methods The qualitative process through which relevant domains were identified is outlined as well as the process of creating initial item banks. Psychometric analyses were conducted on the initial item banks and refinements were made to the domains and items. Following this data-driven refinement phase, a second round of data was collected to further refine the items and obtain final item response theory item parameter estimates. Results Extensive qualitative work identified three key areas patients consider important when describing their experience with contact lenses. Based on item content and psychometric dimensionality assessments, the developing CLUE instruments were ultimately focused around four domains: comfort, vision, handling, and packaging. Item response theory parameters were estimated for the CLUE item banks (377 items), and the resulting scales were found to provide precise and reliable assignment of scores detailing users’ subjective experiences with contact lenses. Conclusions The CLUE family of instruments, as it currently exists, exhibits excellent psychometric properties. PMID:27383257
Khorramdel, Lale; Kubinger, Klaus D; Uitz, Alexander
2014-04-01
An experiment was conducted to investigate the effects of item order and questionnaire content on faking good or intentional response distortion. It was hypothesized that intentional response distortion would either increase towards the end of a long questionnaire, as learning effects might make it easier to adjust responses to a faking good schema, or decrease because applicants' will to distort responses is reduced if the questionnaire lasts long enough. Furthermore, it was hypothesized that certain types of questionnaire content are especially vulnerable to response distortion. Eighty-four pre-selected pilot applicants filled out a questionnaire consisting of 516 items including items from the NEO five factor inventory (NEO FFI), NEO personality inventory revised (NEO PI-R) and business-focused inventory of personality (BIP). The positions of the items were varied within the applicant sample to test if responses are affected by item order, and applicants' response behaviour was additionally compared to that of volunteers. Applicants reported significantly higher mean scores than volunteers, and results provide some evidence of decreased faking tendencies towards the end of the questionnaire. Furthermore, it could be demonstrated that lower variances or standard deviations in combination with appropriate (often higher) mean scores can serve as an indicator for faking tendencies in group comparisons, even if effects are not significant. © 2013 International Union of Psychological Science.
Kawasaki, Yohei; Ide, Kazuki; Akutagawa, Maiko; Yamada, Hiroshi; Yutaka, Ono; Furukawa, Toshiaki A.
2017-01-01
Background Several recent studies have shown that total scores on depressive symptom measures in a general population approximate an exponential pattern except for the lower end of the distribution. Furthermore, we confirmed that the exponential pattern is present for the individual item responses on the Center for Epidemiologic Studies Depression Scale (CES-D). To confirm the reproducibility of such findings, we investigated the total score distribution and item responses of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative study. Methods Data were drawn from the National Survey of Midlife Development in the United States (MIDUS), which comprises four subsamples: (1) a national random digit dialing (RDD) sample, (2) oversamples from five metropolitan areas, (3) siblings of individuals from the RDD sample, and (4) a national RDD sample of twin pairs. K6 items are scored using a 5-point scale: “none of the time,” “a little of the time,” “some of the time,” “most of the time,” and “all of the time.” The pattern of total score distribution and item responses were analyzed using graphical analysis and exponential regression model. Results The total score distributions of the four subsamples exhibited an exponential pattern with similar rate parameters. The item responses of the K6 approximated a linear pattern from “a little of the time” to “all of the time” on log-normal scales, while “none of the time” response was not related to this exponential pattern. Discussion The total score distribution and item responses of the K6 showed exponential patterns, consistent with other depressive symptom scales. PMID:28289560
The e-MSWS-12: improving the multiple sclerosis walking scale using item response theory.
Engelhard, Matthew M; Schmidt, Karen M; Engel, Casey E; Brenton, J Nicholas; Patek, Stephen D; Goldman, Myla D
2016-12-01
The Multiple Sclerosis Walking Scale (MSWS-12) is the predominant patient-reported measure of multiple sclerosis (MS)-related walking ability, yet it had not been analyzed using item response theory (IRT), the emerging standard for patient-reported outcome (PRO) validation. This study aims to reduce MSWS-12 measurement error and facilitate computerized adaptive testing by creating an IRT model of the MSWS-12 and distributing it online. MSWS-12 responses from 284 subjects with MS were collected by mail and used to fit and compare several IRT models. Following model selection and assessment, subpopulations based on age and sex were tested for differential item functioning (DIF). Model comparison favored a one-dimensional graded response model (GRM). This model met fit criteria and explained 87% of response variance. The performance of each MSWS-12 item was characterized using category response curves (CRCs) and item information. IRT-based MSWS-12 scores correlated with traditional MSWS-12 scores (r = 0.99) and timed 25-foot walk (T25FW) speed (r = -0.70). Item 2 showed DIF based on age (χ² = 19.02, df = 5, p < 0.01), and Item 11 showed DIF based on sex (χ² = 13.76, df = 5, p = 0.02). MSWS-12 measurement error depends on walking ability, but could be lowered by improving or replacing items with low information or DIF. The e-MSWS-12 includes IRT-based scoring, error checking, and an estimated T25FW derived from MSWS-12 responses. It is available at https://ms-irt.shinyapps.io/e-MSWS-12 .
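The category response curves mentioned in the abstract above come from the graded response model, in which the probability of each ordered response category is the difference between adjacent cumulative logistic curves. The following is a minimal sketch of that calculation for a single item. The function name, the slope, and the threshold values are hypothetical and are not the fitted e-MSWS-12 parameters, which the abstract does not report.

```python
import math


def grm_category_probs(theta, slope, thresholds):
    """Category probabilities for one item under Samejima's graded response model.

    theta:      latent trait level of the respondent.
    slope:      item discrimination (hypothetical value in the example below).
    thresholds: increasing category thresholds b_1 < ... < b_{K-1}.
    Returns a list of K probabilities, one per ordered response category.
    """
    # Cumulative probabilities P(X >= k), padded with 1 and 0 at the ends.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the drop between adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical five-category item evaluated at an average trait level.
probs = grm_category_probs(theta=0.0, slope=2.0, thresholds=[-1.5, -0.5, 0.5, 1.5])
print([round(p, 3) for p in probs])
```

Evaluating this function over a grid of `theta` values traces out one curve per category, which is how category response curves for an item are typically drawn.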
2013-01-01
Background Despite the widespread use of multiple-choice assessments in medical education assessment, current practice and published advice concerning the number of response options remains equivocal. This article describes an empirical study contrasting the quality of three 60 item multiple-choice test forms within the Royal Australian and New Zealand College of Obstetricians and Gynaecologists (RANZCOG) Fetal Surveillance Education Program (FSEP). The three forms are described below. Methods The first form featured four response options per item. The second form featured three response options, having removed the least functioning option from each item in the four-option counterpart. The third test form was constructed by retaining the best performing version of each item from the first two test forms. It contained both three and four option items. Results Psychometric and educational factors were taken into account in formulating an approach to test construction for the FSEP. The four-option test performed better than the three-option test overall, but some items were improved by the removal of options. The mixed-option test demonstrated better measurement properties than the fixed-option tests, and has become the preferred test format in the FSEP program. The criteria used were reliability, errors of measurement and fit to the item response model. Conclusions The position taken is that decisions about the number of response options be made at the item level, with plausible options being added to complete each item on both psychometric and educational grounds rather than complying with a uniform policy. The point is to construct the better performing item in providing the best psychometric and educational information. PMID:23453056
Zoanetti, Nathan; Beaves, Mark; Griffin, Patrick; Wallace, Euan M
2013-03-04
Despite the widespread use of multiple-choice assessments in medical education assessment, current practice and published advice concerning the number of response options remains equivocal. This article describes an empirical study contrasting the quality of three 60 item multiple-choice test forms within the Royal Australian and New Zealand College of Obstetricians and Gynaecologists (RANZCOG) Fetal Surveillance Education Program (FSEP). The three forms are described below. The first form featured four response options per item. The second form featured three response options, having removed the least functioning option from each item in the four-option counterpart. The third test form was constructed by retaining the best performing version of each item from the first two test forms. It contained both three and four option items. Psychometric and educational factors were taken into account in formulating an approach to test construction for the FSEP. The four-option test performed better than the three-option test overall, but some items were improved by the removal of options. The mixed-option test demonstrated better measurement properties than the fixed-option tests, and has become the preferred test format in the FSEP program. The criteria used were reliability, errors of measurement and fit to the item response model. The position taken is that decisions about the number of response options be made at the item level, with plausible options being added to complete each item on both psychometric and educational grounds rather than complying with a uniform policy. The point is to construct the better performing item in providing the best psychometric and educational information.
Measuring the quality of life in hypertension according to Item Response Theory
Borges, José Wicto Pereira; Moreira, Thereza Maria Magalhães; Schmitt, Jeovani; de Andrade, Dalton Francisco; Barbetta, Pedro Alberto; de Souza, Ana Célia Caetano; Lima, Daniele Braz da Silva; Carvalho, Irialda Saboia
2017-01-01
ABSTRACT OBJECTIVE To analyze the Miniquestionário de Qualidade de Vida em Hipertensão Arterial (MINICHAL – Mini-questionnaire of Quality of Life in Hypertension) using the Item Response Theory. METHODS This is an analytical study conducted with 712 persons with hypertension treated in thirteen primary health care units of Fortaleza, State of Ceará, Brazil, in 2015. The steps of the analysis by the Item Response Theory were: evaluation of dimensionality, estimation of parameters of items, and construction of scale. The study of dimensionality was carried out on the polychoric correlation matrix and confirmatory factor analysis. To estimate the item parameters, we used Samejima's Graded Response Model. The analyses were conducted using the free software R with the aid of the psych and mirt packages. RESULTS The analysis has allowed the visualization of item parameters and their individual contributions in the measurement of the latent trait, generating more information and allowing the construction of a scale with an interpretative model that demonstrates the evolution of the worsening of the quality of life in five levels. Regarding the item parameters, the items related to the somatic state have had a good performance, as they have presented better power to discriminate individuals with worse quality of life. The items related to mental state have been those which contributed with less psychometric data in the MINICHAL. CONCLUSIONS We conclude that the instrument is suitable for the identification of the worsening of the quality of life in hypertension. The analysis of the MINICHAL using the Item Response Theory has allowed us to identify new sides of this instrument that have not yet been addressed in previous studies. PMID:28492764
Self-reported weight and predictors of missing responses in youth.
Aceves-Martins, Magaly; Whitehead, Ross; Inchley, Jo; Giralt, Montse; Currie, Candace; Solà, Rosa
2018-02-12
The aims of the present manuscript are to analyse self-reported data on weight, including the missing data, from the 2014 Scottish Health Behaviour in School-Aged Children (HBSC) Study, and to investigate whether behavioural factors related to overweight and obesity, namely dietary habits, physical activity and sedentary behaviour, are associated with weight non-response. 10839 11-, 13- and 15-year-olds participated in the cross-national 2014 Scottish HBSC Study. Missing weight data were evaluated using Little's Missing Completely at Random (MCAR) test. Afterwards, a fitted multivariate logistic regression model was used to determine all possible multivariate associations between weight response and each of the behavioural factors related to obesity. 58.9% of self-reported weight was missing, not at random (MCAR p < 0.001). Weight was self-reported less frequently by girls (19.2%) than by boys (21.9%). Participants who reported low physical activity (OR 1.2, p < 0.001), low vegetable consumption (OR 1.24, p < 0.001) and high computer gaming on weekdays (OR 1.18, p = 0.003) were more likely to not report their weight. There are groups of young people in Scotland who are less likely to report their weight. Their weight status may be of the greatest concern because of their poorer health profile, based on key behaviours associated with their non-response. Furthermore, knowing the value of a healthy weight and reinforcing healthy lifestyle messages may help raise youth awareness of how diet, physical activity and sedentary behaviours can influence weight. Copyright © 2018 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Samejima, Fumiko
In latent trait theory the latent space, or space of the hypothetical construct, is usually represented by some unidimensional or multi-dimensional continuum of real numbers. Like the latent space, the item response can either be treated as a discrete variable or as a continuous variable. Latent trait theory relates the item response to the latent…
ERIC Educational Resources Information Center
Reise, Steven P.; Meijer, Rob R.; Ainsworth, Andrew T.; Morales, Leo S.; Hays, Ron D.
2006-01-01
Group-level parametric and non-parametric item response theory models were applied to the Consumer Assessment of Healthcare Providers and Systems (CAHPS[R]) 2.0 core items in a sample of 35,572 Medicaid recipients nested within 131 health plans. Results indicated that CAHPS responses are dominated by within health plan variation, and only weakly…
Missed nursing care and its relationship with confidence in delegation among hospital nurses.
Saqer, Tahani J; AbuAlRub, Raeda F
2018-04-06
To (i) identify the types and reasons for missed nursing care among Jordanian hospital nurses; (ii) identify predictors of missed nursing care based on study variables; and (iii) examine the relationship between nurses' confidence in delegation and missed nursing care. Missed nursing care is a global concern for nurses and nurse administrators. Investigating the relation between the confidence in delegation and missed nursing care might help in designing strategies that enable nurses to minimise missed care and enhance quality of services. A correlational research design was used for this study. A convenience sample of 362 hospital nurses completed the missed nursing care survey, and confidence and intent to delegate scale. The results of the study revealed that ambulating and feeding patients on time, doing mouth care and attending interdisciplinary care conferences were the most frequent types of missed care. The mean score for missed nursing care was 2.78 on a scale from 1 to 5. The most prevalent reasons for missed care were "labour resources, followed by material resources, and then communication". Around 45% of the variation in the perceived level of "missed nursing care" was explained by background variables and perceived reasons for missed nursing. However, the relationship between confidence in delegation and missed care was insignificant. The results of this study add to the body of international literature on the most prevalent types and reasons for missed nursing care in a different cultural context. Highlighting the most prevalent reasons for missed nursing care could help nurse administrators in designing responsive strategies to eliminate or reduce such reasons. © 2018 John Wiley & Sons Ltd.
Reise, Steven P.; Ventura, Joseph; Keefe, Richard S. E.; Baade, Lyle E.; Gold, James M.; Green, Michael F.; Kern, Robert S.; Mesholam-Gately, Raquelle; Nuechterlein, Keith H.; Seidman, Larry J.; Bilder, Robert
2011-01-01
We conducted psychometric analyses of two interview-based measures of cognitive deficits: the 21-item Clinical Global Impression of Cognition in Schizophrenia (CGI-CogS; Ventura et al., 2008), and the 20-item Schizophrenia Cognition Rating Scale (SCoRS; Keefe et al., 2006), which were administered on two occasions to a sample of people with schizophrenia. Traditional psychometrics, bifactor analysis, and item response theory (IRT) methods were used to explore item functioning, dimensionality, and to compare instruments. Despite containing similar item content, responses to the CGI-CogS demonstrated superior psychometric properties (e.g., higher item-intercorrelations, better spread of ratings across response categories), relative to the SCoRS. We argue that these differences arise mainly from the differential use of prompts and how the items are phrased and scored. Bifactor analysis demonstrated that although both measures capture a broad range of cognitive functioning (e.g., working memory, social cognition), the common variance on each is overwhelmingly explained by a single general factor. IRT analyses of the combined pool of 41 items showed that measurement precision is peaked in the mild to moderate range of cognitive impairment. Finally, simulated adaptive testing revealed that only about 10 to 12 items are necessary to achieve latent trait level estimates with reasonably small standard errors for most individuals. This suggests that these interview-based measures of cognitive deficits could be shortened without loss of measurement precision. PMID:21381848
Validation of a clinical critical thinking skills test in nursing.
Shin, Sujin; Jung, Dukyoo; Kim, Sungeun
2015-01-27
The purpose of this study was to develop a revised version of the clinical critical thinking skills test (CCTS) and to subsequently validate its performance. This study is a secondary analysis of the CCTS. Data were obtained from a convenience sample of 284 college students in June 2011. Thirty items were analyzed using item response theory and test reliability was assessed. Test-retest reliability was measured using the results of 20 nursing college and graduate school students in July 2013. The content validity of the revised items was analyzed by calculating the degree of agreement between instrument developer intention in item development and the judgments of six experts. To analyze response process validity, qualitative data related to the response processes of nine nursing college students obtained through cognitive interviews were analyzed. Out of the initial 30 items, 11 items were excluded after the analysis of difficulty and discrimination parameters. When the 19 items of the revised version of the CCTS were analyzed, levels of item difficulty were found to be relatively low and levels of discrimination were found to be appropriate or high. The degree of agreement between item developer intention and expert judgments equaled or exceeded 50%. From the above results, evidence of the response process validity was demonstrated, indicating that subjects responded as intended by the test developer. The revised 19-item CCTS was found to have sufficient reliability and validity and will therefore represent a more convenient measurement of critical thinking ability.
Validation of a clinical critical thinking skills test in nursing
2015-01-01
Purpose: The purpose of this study was to develop a revised version of the clinical critical thinking skills test (CCTS) and to subsequently validate its performance. Methods: This study is a secondary analysis of the CCTS. Data were obtained from a convenience sample of 284 college students in June 2011. Thirty items were analyzed using item response theory and test reliability was assessed. Test-retest reliability was measured using the results of 20 nursing college and graduate school students in July 2013. The content validity of the revised items was analyzed by calculating the degree of agreement between instrument developer intention in item development and the judgments of six experts. To analyze response process validity, qualitative data related to the response processes of nine nursing college students obtained through cognitive interviews were analyzed. Results: Out of the initial 30 items, 11 items were excluded after the analysis of difficulty and discrimination parameters. When the 19 items of the revised version of the CCTS were analyzed, levels of item difficulty were found to be relatively low and levels of discrimination were found to be appropriate or high. The degree of agreement between item developer intention and expert judgments equaled or exceeded 50%. Conclusion: From the above results, evidence of the response process validity was demonstrated, indicating that subjects responded as intended by the test developer. The revised 19-item CCTS was found to have sufficient reliability and validity and will therefore represent a more convenient measurement of critical thinking ability. PMID:25622716
Lambert, Michael Canute; Ferguson, Gail M; Rowan, George T
2016-03-01
Cross-national study of adolescents' psychological adjustment requires measures that permit reliable and valid assessment across informants and nations, but such measures are virtually nonexistent. Item-response-theory-based linking is a promising yet underutilized methodological procedure that permits more accurate assessment across informants and nations. To demonstrate this procedure, the Resilience Scale of the Behavioral Assessment for Children of African Heritage (Lambert et al., 2005) was administered to 250 African American and 294 Jamaican nonreferred adolescents and their caregivers. Multiple items without significant differential item functioning emerged, allowing scale linking across informants and nations. Calibrating item parameters via item response theory linking can permit cross-informant cross-national assessment of youth. (c) 2016 APA, all rights reserved.
Rorschach missing responses--is this more than nothing?
King, M G
2014-01-01
The Rorschach has been demonstrated as a suitable tool for investigating otherwise hidden psychological aspects of sex offenders: sex-related responses are more common. The present paper looks at the established tendency of some clients to minimise their overall Rorschach responding, the linking of this response restraint to particular Rorschach profiles, and the sparse but consistent literature which casts doubt on the proposition that Examiner enthusiasm will cause the minimising client to provide more responses which divulge additional information. In the case of sex offenders, with so much to hide, it is proposed that there may be extensive filtering of responses even among those giving more than "normal" sex-related responses. "What the client did not say", and the corresponding "missing" Rorschach responses in the case of sex offenders is discussed in the light of an individual case: (a sex offender with undue interest in young boys' penii) where "sex-like" images were specifically targeted, but never named as such. The exciting prospect of inferring what the client could have said and thus generating the content of missing responses, whether or not response filtering produced numerical minimisation, must be balanced against the risk of naked men and women (and their genitalia) representing nothing more than an artefact of the clinician's own making--"ce qui n' est pas le cas".
Detection of Differential Item Functioning Using the Lasso Approach
ERIC Educational Resources Information Center
Magis, David; Tuerlinckx, Francis; De Boeck, Paul
2015-01-01
This article proposes a novel approach to detect differential item functioning (DIF) among dichotomously scored items. Unlike standard DIF methods that perform an item-by-item analysis, we propose the "LR lasso DIF method": a logistic regression (LR) model is formulated for all item responses. The model contains item-specific intercepts,…
Mielenz, Thelma J; Callahan, Leigh F; Edwards, Michael C
2016-03-12
Examine the feasibility of performing an item response theory (IRT) analysis on two of the Centers for Disease Control and Prevention health-related quality of life (CDC HRQOL) modules - the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy days Symptoms Module (HDSM). Previous principal components analyses confirm that the two scales both assess a mix of mental (CDC-MH) and physical health (CDC-PH). The purpose is to conduct item response theory (IRT) analysis on the CDC-MH and CDC-PH scales separately. 2182 patients with self-reported or physician-diagnosed arthritis completed a cross-sectional survey including HDCM and HDSM items. Besides global health, the other 8 items ask the number of days that some statement was true; we chose to recode the data into 8 categories based on observed clustering. The IRT assumptions were assessed using confirmatory factor analysis and the data could be modeled using a unidimensional IRT model. The graded response model was used for IRT analyses and CDC-MH and CDC-PH scales were analyzed separately in flexMIRT. The IRT parameter estimates for the five-item CDC-PH all appeared reasonable. The three-item CDC-MH did not have reasonable parameter estimates. The CDC-PH scale is amenable to IRT analysis but the existing CDC-MH scale is not. We suggest either using the 4-item Healthy Days Core Module (HDCM) and the 5-item Healthy days Symptoms Module (HDSM) as they currently stand or the CDC-PH scale alone if the primary goal is to measure physical health related HRQOL.
Decayed and missing teeth and oral-health-related factors: predicting depression in homeless people.
Coles, Emma; Chan, Karen; Collins, Jennifer; Humphris, Gerry M; Richards, Derek; Williams, Brian; Freeman, Ruth
2011-08-01
The objective of the study was to determine the effect of dental health status, dental anxiety and oral-health-related quality of life (OHRQoL) upon homeless people's experience of depression. A cross-sectional survey was conducted on a sample of homeless people in seven National Health Service Boards in Scotland. All participants completed a questionnaire to assess their depression, dental anxiety and OHRQoL using reliable and valid measures. Participants had an oral examination to assess their experience of tooth decay (decayed and missing teeth). Latent variable path analysis was conducted to determine the effects of dental health status on depression via dental anxiety and OHRQoL using intensive resampling methods. A total of 853 homeless people participated, of which 70% yielded complete data sets. Three latent variables, decayed and missing teeth, dental anxiety (Modified Dental Anxiety Scale: five items) and depression (Center for Epidemiological Studies Depression Scale: two factors), and a single variable for OHRQoL (Oral Health Impact Profile total scale) were used in a hybrid structural equation model. The variable decayed and missing teeth was associated with depression through indirect pathways (total standardised indirect effects=0.44, P<.001), via OHRQoL and dental anxiety (χ²=75.90, df=40, comparative fit index=0.985, Tucker-Lewis index=0.977, root mean square error of approximation=0.051 [90% confidence interval: 0.037-0.065]). Depression in Scottish homeless people is related to dental health status and oral-health-related factors. Decayed and missing teeth may influence depression primarily through the psychological constructs of OHRQoL and, to a lesser extent, dental anxiety. Copyright © 2010 Elsevier Inc. All rights reserved.
Sequential Computerized Mastery Tests--Three Simulation Studies
ERIC Educational Resources Information Center
Wiberg, Marie
2006-01-01
A simulation study of a sequential computerized mastery test is carried out with items modeled with the 3 parameter logistic item response theory model. The examinees' responses are either identically distributed, not identically distributed, or not identically distributed together with estimation errors in the item characteristics. The…
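The three-parameter logistic (3PL) model mentioned in the abstract above gives the probability of a correct response as a guessing floor plus a scaled logistic curve in the latent trait. The sketch below is a generic textbook form with no scaling constant, not code from the cited study; the function name `p_correct_3pl` and the example parameter values are hypothetical.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: lower asymptote (pseudo-guessing parameter).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# An examinee whose ability equals the item difficulty, with guessing floor 0.2:
# the logistic part is 0.5, so the probability is 0.2 + 0.8 * 0.5.
p = p_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)
```

Setting `c` to zero reduces this to the two-parameter logistic form.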
A two-question method for assessing gender categories in the social and medical sciences.
Tate, Charlotte Chuck; Ledbetter, Jay N; Youssef, Cris P
2013-01-01
Three studies (N = 990) assessed the statistical reliability of two methods of determining gender identity that can capture transgender spectrum identities (i.e., current gender identities different from birth-assigned gender categories). Study 1 evaluated a single question with four response options (female, male, transgender, other) on university students. The missing data rate was higher than the valid response rates for transgender and other options using this method. Study 2 evaluated a method of asking two separate questions (i.e., one for current identity and another for birth-assigned category), with response options specific to each. Results showed no missing data and two times the transgender spectrum response rate compared to Study 1. Study 3 showed that the two-question method also worked in community samples, producing near-zero missing data. The two-question method also identified cisgender identities (same birth-assigned and current gender identity), making it a dynamic and desirable measurement tool for the social and medical sciences.
Distinguishing Fast and Slow Processes in Accuracy - Response Time Data.
Coomans, Frederik; Hofman, Abe; Brinkhuis, Matthieu; van der Maas, Han L J; Maris, Gunter
2016-01-01
We investigate the relation between speed and accuracy within problem solving in its simplest non-trivial form. We consider tests with only two items and code the item responses in two binary variables: one indicating the response accuracy, and one indicating the response speed. Despite being a very basic setup, it enables us to study item pairs stemming from a broad range of domains such as basic arithmetic, first language learning, intelligence-related problems, and chess, with large numbers of observations for every pair of problems under consideration. We carry out a survey over a large number of such item pairs and compare three types of psychometric accuracy-response time models present in the literature: two 'one-process' models, the first of which models accuracy and response time as conditionally independent and the second of which models accuracy and response time as conditionally dependent, and a 'two-process' model which models accuracy contingent on response time. We find that the data clearly violates the restrictions imposed by both one-process models and requires additional complexity which is parsimoniously provided by the two-process model. We supplement our survey with an analysis of the erroneous responses for an example item pair and demonstrate that there are very significant differences between the types of errors in fast and slow responses.
What can we learn from PISA?: Investigating PISA's approach to scientific literacy
NASA Astrophysics Data System (ADS)
Schwab, Cheryl Jean
This dissertation is an investigation of the relationship between the multidimensional conception of scientific literacy and its assessment. The Programme for International Student Assessment (PISA), developed under the auspices of the Organization for Economic Cooperation and Development (OECD), offers a unique opportunity to evaluate the assessment of scientific literacy. PISA developed a continuum of performance for scientific literacy across three competencies (i.e., process, content, and situation). Foundational to the interpretation of PISA science assessment is PISA's definition of scientific literacy, which I argue incorporates three themes drawn from history: (a) scientific way of thinking, (b) everyday relevance of science, and (c) scientific literacy for all students. Three coordinated studies were conducted to investigate the validity of PISA science assessment and offer insight into the development of items to assess scientific literacy. Multidimensional models of the internal structure of the PISA 2003 science items were found not to reflect the complex character of PISA's definition of scientific literacy. Although the multidimensional models across the three competencies significantly decreased the G2 statistic from the unidimensional model, high correlations between the dimensions suggest that the dimensions are similar. A cognitive analysis of student verbal responses to PISA science items revealed that students were using competencies of scientific literacy, but the competencies were not elicited by the PISA science items at the depth required by PISA's definition of scientific literacy. Although student responses contained only knowledge of scientific facts and simple scientific concepts, students were using more complex skills to interpret and communicate their responses. Finally the investigation of different scoring approaches and item response models illustrated different ways to interpret student responses to assessment items. These analyses highlighted the complexities of students' responses to the PISA science items and the use of the ordered partition model to accommodate different but equal item responses. The results of the three investigations are used to discuss ways to improve the development and interpretation of PISA's science items.
Austvoll-Dahlgren, Astrid; Guttersrud, Øystein; Nsangi, Allen; Semakula, Daniel; Oxman, Andrew D
2017-01-01
Background The Claim Evaluation Tools database contains multiple-choice items for measuring people’s ability to apply the key concepts they need to know to be able to assess treatment claims. We assessed items from the database using Rasch analysis to develop an outcome measure to be used in two randomised trials in Uganda. Rasch analysis is a form of psychometric testing relying on Item Response Theory. It is a dynamic way of developing outcome measures that are valid and reliable. Objectives To assess the validity, reliability and responsiveness of 88 items addressing 22 key concepts using Rasch analysis. Participants We administered four sets of multiple-choice items in English to 1114 people in Uganda and Norway, of which 685 were children and 429 were adults (including 171 health professionals). We scored all items dichotomously. We explored summary and individual fit statistics using the RUMM2030 analysis package. We used SPSS to perform distractor analysis. Results Most items conformed well to the Rasch model, but some items needed revision. Overall, the four item sets had satisfactory reliability. We did not identify significant response dependence between any pairs of items and, overall, the magnitude of multidimensionality in the data was acceptable. The items had a high level of difficulty. Conclusion Most of the items conformed well to the Rasch model’s expectations. Following revision of some items, we concluded that most of the items were suitable for use in an outcome measure for evaluating the ability of children or adults to assess treatment claims. PMID:28550019
Cordier, Reinie; Speyer, Renée; Schindler, Antonio; Michou, Emilia; Heijnen, Bas Joris; Baijens, Laura; Karaduman, Ayşe; Swan, Katina; Clavé, Pere; Joosten, Annette Veronica
2018-02-01
The Swallowing Quality of Life questionnaire (SWAL-QOL) is widely used clinically and in research to evaluate quality of life related to swallowing difficulties. It has been described as a valid and reliable tool, but was developed and tested using classic test theory. This study describes the reliability and validity of the SWAL-QOL using item response theory (IRT; Rasch analysis). SWAL-QOL data were gathered from 507 participants at risk of oropharyngeal dysphagia (OD) across four European countries. OD was confirmed in 75.7% of participants via videofluoroscopy and/or fiberoptic endoscopic evaluation, or a clinical diagnosis based on meeting selected criteria. Patients with esophageal dysphagia were excluded. Data were analysed using Rasch analysis. Item and person reliability was good for all the items combined. However, person reliability was poor for 8 subscales and item reliability was poor for one subscale. Eight subscales exhibited poor person separation and two exhibited poor item separation. Overall item and person fit statistics were acceptable. However, at an individual item fit level results indicated unpredictable item responses for 28 items, and item redundancy for 10 items. The item-person dimensionality map confirmed these findings. Results from the overall Rasch model fit and Principal Component Analysis were suggestive of a second dimension. For all the items combined, none of the item categories were 'category', 'threshold' or 'step' disordered; however, all subscales demonstrated category disordered functioning. Findings suggest an urgent need to further investigate the underlying structure of the SWAL-QOL and its psychometric characteristics using IRT.
75 FR 61784 - Proposed Collection; Comment Request for Review of a Revised Information Collection
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-06
... response time of ten minutes per form reporting a missing check is estimated; the same amount of time is needed to report the missing checks or electronic funds transfer (EFT) payments using the telephone. The...
ERIC Educational Resources Information Center
Arffman, Inga
2016-01-01
Open-ended (OE) items are widely used to gather data on student performance in international achievement studies. However, several factors may threaten validity when using such items. This study examined Finnish coders' opinions about threats to validity when coding responses to OE items in the PISA 2012 problem-solving test. A total of 6…
ERIC Educational Resources Information Center
Cao, Yi; Lu, Ru; Tao, Wei
2014-01-01
The local item independence assumption underlying traditional item response theory (IRT) models is often not met for tests composed of testlets. There are 3 major approaches to addressing this issue: (a) ignore the violation and use a dichotomous IRT model (e.g., the 2-parameter logistic [2PL] model), (b) combine the interdependent items to form a…
ERIC Educational Resources Information Center
Ferrando, Pere J.
2004-01-01
This study used kernel-smoothing procedures to estimate the item characteristic functions (ICFs) of a set of continuous personality items. The nonparametric ICFs were compared with the ICFs estimated (a) by the linear model and (b) by Samejima's continuous-response model. The study was based on a conditioned approach and used an error-in-variables…
ERIC Educational Resources Information Center
Watson, Kathy; Baranowski, Tom; Thompson, Debbe; Jago, Russell; Baranowski, Janice; Klesges, Lisa M.
2006-01-01
This study examined multidimensional item response theory (MIRT) modeling to assess social desirability (SocD) influences on self-reported physical activity self-efficacy (PASE) and fruit and vegetable self-efficacy (FVSE). The observed sample included 473 Houston-area adolescent males (10-14 years). SocD (nine items), PASE (19 items) and FVSE (21…
The Structure of the Narcissistic Personality Inventory With Binary and Rating Scale Items.
Boldero, Jennifer M; Bell, Richard C; Davies, Richard C
2015-01-01
Narcissistic Personality Inventory (NPI) items typically have a forced-choice format, comprising a narcissistic and a nonnarcissistic statement. Recently, some have presented the narcissistic statements and asked individuals to either indicate whether they agree or disagree that the statements are self-descriptive (i.e., a binary response format) or to rate the extent to which they agree or disagree that these statements are self-descriptive on a Likert scale (i.e., a rating response format). The current research demonstrates that when NPI items have a binary or a rating response format, the scale has a bifactor structure (i.e., the items load on a general factor and on 6 specific group factors). Indexes of factor strength suggest that the data are unidimensional enough for the NPI's general factor to be considered a measure of a narcissism latent trait. However, the rating item general factor assessed more narcissism components than the binary item one. The positive correlations of the NPI's general factor, assessed when items have a rating response format, were moderate with self-esteem, strong with a measure of narcissistic grandiosity, and weak with 2 measures of narcissistic vulnerability. Together, the results suggest that using a rating format for items enhances the information provided by the NPI.
Vermersch, Patrick; Hobart, Jeremy; Dive-Pouletty, Catherine; Bozzi, Sylvie; Hass, Steven; Coyle, Patricia K
2017-04-01
The Treatment Satisfaction Questionnaire for Medication (TSQM) was designed to assess patient treatment satisfaction in chronic diseases. Its performance has not been examined in multiple sclerosis (MS). The 14 items of the TSQM cover four domains: Effectiveness, Side Effects, Convenience, and Global Satisfaction. To evaluate performance of the TSQM in patients with relapsing MS, using data collected from the TENERE study (NCT00883337), in which 324 patients received oral teriflunomide or subcutaneous interferon beta-1a for ⩾48 weeks. Five measurement properties were examined using traditional psychometric methods: data completeness, scale-to-sample targeting, scaling assumptions, reliability (including test-retest), and construct validity (internal: item-level scaling success, confirmatory factor analysis, and exploratory factor analysis; external: convergence, discrimination, and group differences). There were few (<2%) missing item data; domain scores could be computed for all patients. Score distributions were skewed toward higher satisfaction; two domains had marked ceiling effects. Scaling assumptions were supported. Internal consistency reliability was high (Cronbach's α > 0.90). Internal validity tests supported item groupings. Correlations supported convergent and discriminant construct validity; hypothesis testing supported group differences validity. This investigation found the TSQM to be a useful tool, exhibiting good psychometric measurement properties in patients with relapsing MS in the TENERE study.
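The internal consistency figure quoted above (Cronbach's α > 0.90) rests on a standard formula: α = k/(k-1) × (1 - sum of item variances / variance of the total score), where k is the number of items. The following is a minimal sketch of that calculation only; the function name and the scores are hypothetical and are not TENERE data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: one list of item scores per respondent (no missing values).
    """
    k = len(rows[0])                          # number of items
    items = list(zip(*rows))                  # transpose to per-item columns
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical 3-item scale answered by 4 respondents (illustrative only).
scores = [[1, 2, 1], [2, 3, 3], [4, 4, 5], [5, 5, 4]]
alpha = cronbach_alpha(scores)
```

Sample variances (n-1 denominator) are used for both the items and the total score; using population variances throughout gives the same α because the denominators cancel.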
Epilepsy-related ambiguity in rating the child behavior checklist and the teacher's report form.
Oostrom, K J; Schouten, A; Kruitwagen, C L; Peters, A C; Jennekens-Schinkel, A
2001-01-01
Although the child behavior checklist (CBCL) and the teacher's report form (TRF) were not designed for diagnosing psychopathology in children with chronic illnesses, they have become extensively used research tools to assess behavioural problems in paediatric populations, including children with epilepsy. When applied to children with epilepsy, items like "staring blankly" or "twitching" can be rated on the basis of seizure features rather than behaviour and, hence, render behavioural scores ambiguous. The aims were to detect CBCL and TRF items that elicit ambiguity when applied to children with "epilepsy only" (idiopathic or cryptogenic epilepsy, attending normal schools) and to evaluate their impact. Experts identified items that give rise to interpretational ambiguity of the ratings in epilepsy. By treating ratings on these items as missing values, their effect was evaluated in CBCL and TRF scores of 59 schoolchildren with "epilepsy only" and age- and gender-matched healthy classmates. Seven CBCL items gave rise to ambiguity, 5 of which also occur on the TRF. Rescoring reduced psychopathology scores in children with "epilepsy only" but not in healthy children: the percentage of patients exceeding the clinical cut-off score on at least one of the subscales fell from 46% to 23% on the CBCL and from 18% to 15% on the TRF. Parents and teachers run the risk of confusing behaviour and seizure features when filling out the CBCL and TRF. In "epilepsy only", prevalence estimates of psychopathology based on the CBCL and TRF should be interpreted with some reserve.
Carroll, Beverley; Freeman, Becky
2015-04-01
Around one in 10 Australian women report that they smoke while pregnant, and this may be a significant underestimation. In 2013, Australian celebrity Chrissie Swan announced publicly that she had been smoking during her pregnancy, generating substantial media coverage. This study sought to identify the main themes in the reporting of the 'Swan pregnant and admitting smoking' story by online news media. Between 6 February 2013 and 18 February 2013 inclusively, a content analysis was conducted of Australian online news items using the keywords: 'Chrissie Swan smoking', and 'Chrissie Swan pregnant and smoking'. News items were coded for nine themes. A total of 124 items were identified. The most frequent themes were: 'celebrity story' (90.32%) and 'societal judgement of pregnant smokers' (69.35%). Less than one-half (45.97%) of the news items included 'quitting is hard' content and only 29.03% of the news items included 'smoking and health' content. Specific quit-referral content was found in only 13.71% of the news items. There was a missed opportunity to promote positive, non-judgemental smoking and pregnancy messages and health information that support pregnant women to quit smoking. SO WHAT?: Health promotion strategies are needed to build capacity in advocacy to promote positive health messages and counter societal judgement of pregnant smokers. Formative research into the use of celebrities and other influential women to promote positive empowering messages should be carried out and incorporated in future health promotion campaigns to improve pregnant women's ability to quit smoking.
Pilkonis, Paul A.; Yu, Lan; Dodds, Nathan E.; Johnston, Kelly L.; Lawrence, Suzanne; Hilton, Thomas F.; Daley, Dennis C.; Patkar, Ashwin A.; McCarty, Dennis
2015-01-01
Background Two item banks for substance use were developed as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®): severity of substance use and positive appeal of substance use. Methods Qualitative item analysis (including focus groups, cognitive interviewing, expert review, and item revision) reduced an initial pool of more than 5,300 items for substance use to 119 items included in field testing. Items were written in a first-person, past-tense format, with 5 response options reflecting frequency or severity. Both 30-day and 3-month time frames were tested. The calibration sample of 1,336 respondents included 875 individuals from the general population (ascertained through an internet panel) and 461 patients from addiction treatment centers participating in the National Drug Abuse Treatment Clinical Trials Network. Results Final banks of 37 and 18 items were calibrated for severity of substance use and positive appeal of substance use, respectively, using the two-parameter graded response model from item response theory (IRT). Initial calibrations were similar for the 30-day and 3-month time frames, and final calibrations used data combined across the time frames, making the items applicable with either interval. Seven-item static short forms were also developed from each item bank. Conclusions Test information curves showed that the PROMIS item banks provided substantial information in a broad range of severity, making them suitable for treatment, observational, and epidemiological research in both clinical and community settings. PMID:26423364
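For readers unfamiliar with the two-parameter graded response model named above, the sketch below shows how it turns one discrimination parameter and a set of ordered thresholds into probabilities for 5 response options. It is an illustration under assumptions: the function name and all parameter values are hypothetical and are not PROMIS calibrations.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Samejima's graded response model for one item.

    theta: latent trait level; a: discrimination;
    thresholds: increasing list of b_k, giving len(thresholds) + 1 categories.
    """
    # Cumulative probabilities P(X >= k), padded with 1.0 for the lowest
    # category and 0.0 beyond the highest one.
    cum = [1.0]
    cum += [1 / (1 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Category probability = difference of adjacent cumulative probabilities.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


# Hypothetical item with 5 response options (4 thresholds).
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.5, -0.5, 0.5, 1.5])
```

Because the category probabilities are differences of adjacent cumulative curves, they are positive whenever the thresholds are strictly increasing, and they always sum to 1.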
The PedsQL 4.0 as a pediatric population health measure: feasibility, reliability, and validity.
Varni, James W; Burwinkle, Tasha M; Seid, Michael; Skarr, Douglas
2003-01-01
The application of health-related quality of life (HRQOL) as a pediatric population health measure may facilitate risk assessment and resource allocation, the tracking of community health, the identification of health disparities, and the determination of health outcomes from interventions and policy decisions. To determine the feasibility, reliability, and validity of the 23-item PedsQL 4.0 (Pediatric Quality of Life Inventory) Generic Core Scales as a measure of pediatric population health for children and adolescents. Mail survey in February and March 2001 to 20 031 families with children ages 2-16 years throughout the State of California encompassing all new enrollees in the State's Children's Health Insurance Program (SCHIP) for those months and targeted language groups. The PedsQL 4.0 Generic Core Scales (Physical, Emotional, Social, School Functioning) were completed by 10 241 families through a statewide mail survey to evaluate the HRQOL of new enrollees in SCHIP. The PedsQL 4.0 evidenced minimal missing responses, achieved excellent reliability for the Total Scale Score (alpha = .89 child; .92 parent report), and distinguished between healthy children and children with chronic health conditions. The PedsQL 4.0 was also related to indicators of health care access, days missed from school, days sick in bed or too ill to play, and days needing care. The results demonstrate the feasibility, reliability, and validity of the PedsQL 4.0 as a pediatric population health outcome. Measuring pediatric HRQOL may be a way to evaluate the health outcomes of SCHIP.
Practical Guide to Conducting an Item Response Theory Analysis
ERIC Educational Resources Information Center
Toland, Michael D.
2014-01-01
Item response theory (IRT) is a psychometric technique used in the development, evaluation, improvement, and scoring of multi-item scales. This pedagogical article provides the necessary information needed to understand how to conduct, interpret, and report results from two commonly used ordered polytomous IRT models (Samejima's graded…
Analyzing Longitudinal Item Response Data via the Pairwise Fitting Method
ERIC Educational Resources Information Center
Fu, Zhi-Hui; Tao, Jian; Shi, Ning-Zhong; Zhang, Ming; Lin, Nan
2011-01-01
Multidimensional item response theory (MIRT) models can be applied to longitudinal educational surveys where a group of individuals are administered different tests over time with some common items. However, computational problems typically arise as the dimension of the latent variables increases. This is especially true when the latent variable…
Item Construction and Psychometric Models Appropriate for Constructed Responses
1991-08-01
which involve only one attribute per item. This is especially true when we are dealing with constructed-response items, we have to measure much more...
Different Approaches to Covariate Inclusion in the Mixture Rasch Model
ERIC Educational Resources Information Center
Li, Tongyun; Jiao, Hong; Macready, George B.
2016-01-01
The present study investigates different approaches to adding covariates and the impact in fitting mixture item response theory models. Mixture item response theory models serve as an important methodology for tackling several psychometric issues in test development, including the detection of latent differential item functioning. A Monte Carlo…
Classification Consistency and Accuracy for Complex Assessments Using Item Response Theory
ERIC Educational Resources Information Center
Lee, Won-Chan
2010-01-01
In this article, procedures are described for estimating single-administration classification consistency and accuracy indices for complex assessments using item response theory (IRT). This IRT approach was applied to real test data comprising dichotomous and polytomous items. Several different IRT model combinations were considered. Comparisons…
Robust Estimation of Latent Ability in Item Response Models
ERIC Educational Resources Information Center
Schuster, Christof; Yuan, Ke-Hai
2011-01-01
Because of response disturbances such as guessing, cheating, or carelessness, item response models often can only approximate the "true" individual response probabilities. As a consequence, maximum-likelihood estimates of ability will be biased. Typically, the nature and extent to which response disturbances are present is unknown, and, therefore,…
Theoretical and Empirical Comparisons between Two Models for Continuous Item Responses.
ERIC Educational Resources Information Center
Ferrando, Pere J.
2002-01-01
Analyzed the relations between two continuous response models intended for typical response items: the linear congeneric model and Samejima's continuous response model (CRM). Illustrated the relations described using an empirical example and assessed the relations through a simulation study. (SLD)
Peytremann-Bridevaux, Isabelle; Scherer, Frédy; Peer, Laurence; Cathieni, Federico; Bonsack, Charles; Cléopas, Agatta; Kolly, Véronique; Perneger, Thomas V; Burnand, Bernard
2006-01-01
Background While there is interest in measuring the satisfaction of patients discharged from psychiatric hospitals, it might be important to determine whether surveys of psychiatric patients should employ generic or psychiatry-specific instruments. The aim of this study was to compare two psychiatry-specific questionnaires and one generic questionnaire assessing patients' satisfaction after hospitalisation in a psychiatric hospital. Methods We randomised adult patients discharged from two Swiss psychiatric university hospitals between April and September 2004, to receive one of three instruments: the Saphora-Psy questionnaire, the Perceptions of Care survey questionnaire or the Picker Institute questionnaire for acute care hospitals. In addition to the comparison of response rates, completion time, mean number of missing items and mean ceiling effect, we targeted our comparison on patients and asked them to answer ten evaluation questions about the questionnaire they had just completed. Results 728 out of 1550 eligible patients (47%) participated in the study. Across questionnaires, response rates were similar (Saphora-Psy: 48.5%, Perceptions of Care: 49.9%, Picker: 43.4%; P = 0.08), average completion time was lowest for the Perceptions of Care questionnaire (minutes: Saphora-Psy: 17.7, Perceptions of Care: 13.7, Picker: 17.5; P = 0.005), the Saphora-Psy questionnaire had the largest mean proportion of missing responses (Saphora-Psy: 7.1%, Perceptions of Care: 2.8%, Picker: 4.0%; P < 0.001) and the Perceptions of Care questionnaire showed the highest ceiling effect (Saphora-Psy: 17.1%, Perceptions of Care: 41.9%, Picker: 36.3%; P < 0.001). There were no differences in the patients' evaluation of the questionnaires. Conclusion Despite differences in the intended target population, content, layout and length of questionnaires, none appeared to be obviously better based on our comparison.
All three presented advantages and drawbacks and could be used for the satisfaction evaluation of psychiatric inpatients. However, if comparison across medical services or hospitals is desired, using a generic questionnaire might be advantageous. PMID:16938136
Rhodes, Matthew G; Jacoby, Larry L
2007-03-01
The authors examined whether participants can shift their criterion for recognition decisions in response to the probability that an item was previously studied. Participants in 3 experiments were given recognition tests in which the probability that an item was studied was correlated with its location during the test. Results from all 3 experiments indicated that participants' response criteria were sensitive to the probability that an item was previously studied and that shifts in criterion were robust. In addition, awareness of the bases for criterion shifts and feedback on performance were key factors contributing to the observed shifts in decision criteria. These data suggest that decision processes can operate in a dynamic fashion, shifting from item to item.
ERIC Educational Resources Information Center
Rudner, Lawrence
This digest discusses the advantages and disadvantages of using item banks, and it provides useful information for those who are considering implementing an item banking project in their school districts. The primary advantage of item banking is in test development. Using an item response theory method, such as the Rasch model, items from multiple…
Huang, Yueng-Hsiang; Lee, Jin; Chen, Zhuo; Perry, MacKenna; Cheung, Janelle H; Wang, Mo
2017-06-01
Zohar and Luria's (2005) safety climate (SC) scale, measuring organization- and group-level SC each with 16 items, is widely used in research and practice. To improve the utility of the SC scale, we shortened the original full-length SC scales. Item response theory (IRT) analysis was conducted using a sample of 29,179 frontline workers from various industries. Based on graded response models, we shortened the original scales in two ways: (1) selecting items with above-average discriminating ability (i.e. offering more than 6.25% of the original total scale information), resulting in 8-item organization-level and 11-item group-level SC scales; and (2) selecting the most informative items that together retain at least 30% of original scale information, resulting in 4-item organization-level and 4-item group-level SC scales. All four shortened scales had acceptable reliability (≥0.89) and high correlations (≥0.95) with the original scale scores. The shortened scales will be valuable for academic research and practical survey implementation in improving occupational safety. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
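The 6.25% cut-off in rule (1) is simply the average share of information in a 16-item scale (1/16 = 0.0625). The two selection rules can be sketched as follows; this is an assumption-laden illustration, not the authors' procedure: the function names are hypothetical, each item is summarised by a single made-up information value, and rule (2) is implemented as a simple greedy pick of the most informative items.

```python
def above_average_items(info):
    """Rule 1: indices of items whose share of total information exceeds 1/n."""
    total = sum(info)
    threshold = 1 / len(info)        # 1/16 = 0.0625 for a 16-item scale
    return [i for i, v in enumerate(info) if v / total > threshold]


def smallest_set_retaining(info, share=0.30):
    """Rule 2 (greedy sketch): add the most informative items until the
    retained share of total information reaches `share`."""
    total = sum(info)
    order = sorted(range(len(info)), key=lambda i: info[i], reverse=True)
    chosen, kept = [], 0.0
    for i in order:
        chosen.append(i)
        kept += info[i]
        if kept / total >= share:
            break
    return chosen


# Made-up information values for a 4-item toy scale (illustrative only).
toy_info = [5.0, 3.0, 1.0, 1.0]
rule1 = above_average_items(toy_info)            # shares 0.5 and 0.3 exceed 0.25
rule2 = smallest_set_retaining(toy_info, 0.60)   # 0.5, then 0.8 of the total
```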
Unsworth, Nash; Brewer, Gene A; Spillers, Gregory J
2011-09-01
In three experiments search termination decisions were examined as a function of response type (correct vs. incorrect) and confidence. It was found that the time between the last retrieved item and the decision to terminate search (exit latency) was related to the type of response and confidence in the last item retrieved. Participants were willing to search longer when the last retrieved item was a correct item vs. an incorrect item and when the confidence was high in the last retrieved item. It was also found that the number of errors retrieved during the recall period was related to search termination decisions such that the more errors retrieved, the more likely participants were to terminate the search. Finally, it was found that knowledge of overall search set size influenced the time needed to search for items, but did not influence search termination decisions. Copyright © 2011 Elsevier B.V. All rights reserved.
Steca, Patrizia; Monzani, Dario; Greco, Andrea; Chiesi, Francesca; Primi, Caterina
2015-06-01
This study is aimed at testing the measurement properties of the Life Orientation Test-Revised (LOT-R) for the assessment of dispositional optimism by employing item response theory (IRT) analyses. The LOT-R was administered to a large sample of 2,862 Italian adults. First, confirmatory factor analyses demonstrated the theoretical conceptualization of the construct measured by the LOT-R as a single bipolar dimension. Subsequently, IRT analyses for polytomous, ordered response category data were applied to investigate the items' properties. The equivalence of the items across gender and age was assessed by analyzing differential item functioning. Discrimination and severity parameters indicated that all items were able to distinguish people with different levels of optimism and adequately covered the spectrum of the latent trait. Additionally, the LOT-R appears to be gender invariant and, with minor exceptions, age invariant. Results provided evidence that the LOT-R is a reliable and valid measure of dispositional optimism. © The Author(s) 2014.
ERIC Educational Resources Information Center
Ding, Kele; Olds, R. Scott; Thombs, Dennis L.
2009-01-01
This retrospective case study assessed the influence of item non-response error on subsequent response to questionnaire items assessing adolescent alcohol and marijuana use. Post-hoc analyses were conducted on survey results obtained from 4,371 7th to 12th grade students in Ohio in 2005. A skip pattern design in a conventional questionnaire…
ERIC Educational Resources Information Center
Hsieh, Chueh-An; von Eye, Alexander A.; Maier, Kimberly S.
2010-01-01
The application of multidimensional item response theory models to repeated observations has demonstrated great promise in developmental research. It allows researchers to take into consideration both the characteristics of item response and measurement error in longitudinal trajectory analysis, which improves the reliability and validity of the…
Applying mixed methods to pretest the Pressure Ulcer Quality of Life (PU-QOL) instrument.
Gorecki, C; Lamping, D L; Nixon, J; Brown, J M; Cano, S
2012-04-01
Pretesting is key in the development of patient-reported outcome (PRO) instruments. We describe a mixed-methods approach based on interviews and Rasch measurement methods in the pretesting of the Pressure Ulcer Quality of Life (PU-QOL) instrument. We used cognitive interviews to pretest the PU-QOL in 35 patients with pressure ulcers with the view to identifying problematic items, followed by Rasch analysis to examine response options, appropriateness of the item series and biases due to question ordering (item fit). We then compared findings in an interactive and iterative process to identify potential strengths and weaknesses of PU-QOL items, and guide decision-making about further revisions to items and design/layout. Although cognitive interviews largely supported items, they highlighted problems with layout, response options and comprehension. Findings from the Rasch analysis identified problems with response options through reversed thresholds. The use of a mixed-methods approach in pretesting the PU-QOL instrument proved beneficial for identifying problems with scale layout, response options and framing/wording of items. Rasch measurement methods are a useful addition to standard qualitative pretesting for evaluating strengths and weaknesses of early stage PRO instruments.
HIV/AIDS knowledge among men who have sex with men: applying the item response theory.
Gomes, Raquel Regina de Freitas Magalhães; Batista, José Rodrigues; Ceccato, Maria das Graças Braga; Kerr, Lígia Regina Franco Sansigolo; Guimarães, Mark Drew Crosland
2014-04-01
To evaluate the level of HIV/AIDS knowledge among men who have sex with men in Brazil using the latent trait model estimated by Item Response Theory. Multicenter, cross-sectional study, carried out in ten Brazilian cities between 2008 and 2009. Adult men who have sex with men were recruited (n = 3,746) through Respondent Driven Sampling. HIV/AIDS knowledge was ascertained through ten statements by face-to-face interview and latent scores were obtained through two-parameter logistic modeling (difficulty and discrimination) using Item Response Theory. Differential item functioning was used to examine each item characteristic curve by age and schooling. Overall, the HIV/AIDS knowledge scores using Item Response Theory did not exceed 6.0 (scale 0-10), with mean and median values of 5.0 (SD = 0.9) and 5.3, respectively, and 40.7% of the sample had knowledge levels below the average. Some beliefs still exist in this population regarding the transmission of the virus by insect bites, by using public restrooms, and by sharing utensils during meals. With regard to the difficulty and discrimination parameters, eight items were located below the mean of the scale and were considered very easy, and four items had very low discrimination parameters (< 0.34). The absence of difficult items contributed to the inaccuracy of the measurement of knowledge among those with median level and above. Item Response Theory analysis, which focuses on the individual properties of each item, allows measures to be obtained that do not vary or depend on the questionnaire, which provides better ascertainment and accuracy of knowledge scores. Valid and reliable scales are essential for monitoring HIV/AIDS knowledge among the men who have sex with men population over time and in different geographic regions, and this psychometric model brings this advantage.
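The two-parameter logistic model mentioned above describes the probability of a correct answer as a function of the respondent's latent level, the item's difficulty and its discrimination. The sketch below illustrates why an item located below the scale mean counts as easy and why a discrimination near 0.3 separates respondents poorly. The function name and parameter values are hypothetical, not estimates from the study.

```python
import math


def p_correct(theta, a, b):
    """2PL item characteristic curve: P(correct | theta) for an item with
    discrimination a and difficulty b (same scale as theta)."""
    return 1 / (1 + math.exp(-a * (theta - b)))


# An "easy" item (difficulty below the mean of 0): an average respondent
# answers correctly with probability above 0.5.
easy_at_mean = p_correct(theta=0.0, a=1.2, b=-1.0)

# Spread in success probability between a low (-1) and a high (+1) respondent.
flat_spread = p_correct(1.0, 0.3, 0.0) - p_correct(-1.0, 0.3, 0.0)   # a = 0.3
steep_spread = p_correct(1.0, 1.5, 0.0) - p_correct(-1.0, 1.5, 0.0)  # a = 1.5
```

With a low discrimination the curve is nearly flat, so respondents one unit below and one unit above the mean have similar chances of a correct answer, and the item contributes little to telling them apart.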
Calibrating Item Families and Summarizing the Results Using Family Expected Response Functions
ERIC Educational Resources Information Center
Sinharay, Sandip; Johnson, Matthew S.; Williamson, David M.
2003-01-01
Item families, which are groups of related items, are becoming increasingly popular in complex educational assessments. For example, in automatic item generation (AIG) systems, a test may consist of multiple items generated from each of a number of item models. Item calibration or scoring for such an assessment requires fitting models that can…
Jafari, Peyman; Bagheri, Zahra; Ayatollahi, Seyyed Mohamad Taghi; Soltani, Zahra
2012-03-13
Item response theory (IRT) is extensively used to develop adaptive instruments of health-related quality of life (HRQoL). However, each IRT model has its own function to estimate item and category parameters, and hence different results may be found using the same response categories with different IRT models. The present study used the Rasch rating scale model (RSM) to examine and reassess the psychometric properties of the Persian version of the PedsQL™ 4.0 Generic Core Scales. The PedsQL™ 4.0 Generic Core Scales was completed by 938 Iranian school children and their parents. Convergent, discriminant and construct validity of the instrument were assessed by classical test theory (CTT). The RSM was applied to investigate person and item reliability, item statistics and ordering of response categories. The CTT method showed that the scaling success rate for convergent and discriminant validity were 100% in all domains with the exception of physical health in the child self-report. Moreover, confirmatory factor analysis supported a four-factor model similar to its original version. The RSM showed that 22 out of 23 items had acceptable infit and outfit statistics (<1.4, >0.6), person reliabilities were low, item reliabilities were high, and item difficulty ranged from -1.01 to 0.71 and -0.68 to 0.43 for child self-report and parent proxy-report, respectively. Also the RSM showed that successive response categories for all items were not located in the expected order. This study revealed that, in all domains, the five response categories did not perform adequately. It is not known whether this problem is a function of the meaning of the response choices in the Persian language or an artifact of a mostly healthy population that did not use the full range of the response categories. The response categories should be evaluated in further validation studies, especially in large samples of chronically ill patients.
Reynolds, Arthur J; Richardson, Brandt A; Hayakawa, Momoko; Lease, Erin M; Warner-Richter, Mallory; Englund, Michelle M; Ou, Suh-Ruu; Sullivan, Molly
2014-11-26
Early childhood interventions have demonstrated positive effects on well-being. Whether full-day vs part-day attendance improves outcomes is unknown. To evaluate the association between a full- vs part-day early childhood program and school readiness, attendance, and parent involvement. End-of-preschool follow-up of a nonrandomized, matched-group cohort of predominantly low-income, ethnic minority children enrolled in the Child-Parent Centers (CPC) for the full day (7 hours; n = 409) or part day (3 hours on average; n = 573) in the 2012-2013 school year in 11 schools in Chicago, Illinois. The Midwest CPC Education Program provides comprehensive instruction, family-support, and health services from preschool to third grade. School readiness skills at the end of preschool, attendance and chronic absences, and parental involvement. The readiness domains in the Teaching Strategies GOLD Assessment System include a total of 49 items with a score range of 105-418. The specific domains are socioemotional with 9 items (score range, 20-81), language with 6 items (score range, 15-54), literacy with 12 items (score range, 9-104), math with 7 items (score range, 8-60), physical health with 5 items (score range, 14-45), and cognitive development with 10 items (score range, 18-90). Full-day preschool participants had higher scores than part-day peers on socioemotional development (58.6 vs 54.5; difference, 4.1; 95% CI, 0.5-7.6; P = .03), language (39.9 vs 37.3; difference, 2.6; 95% CI, 0.6-4.6; P = .01), math (40.0 vs 36.4; difference, 3.6; 95% CI, 0.5-6.7; P = .02), physical health (35.5 vs 33.6; difference, 1.9; 95% CI, 0.5-3.2; P = .006), and the total score (298.1 vs 278.2; difference, 19.9; 95% CI, 1.2-38.4; P = .04). Differences in literacy (64.5 vs 58.6; difference, 5.9; 95% CI, -0.07 to 12.4; P = .08) and cognitive development (59.7 vs 57.7; difference, 2.0; 95% CI, -2.4 to 6.3; P = .38) were not significant.
Full-day preschool graduates also had higher rates of attendance (85.9% vs 80.4%; difference, 5.5; 95% CI, 2.6-8.4; P = .001) and lower rates of chronic absences (≥10% days missed; 53.0% vs 71.6%; difference, -18.6; 95% CI, -28.5 to -8.7; P = .001; ≥20% days missed; 21.2% vs 38.8%; difference -17.6%; 95% CI, -25.6 to -9.7; P < .001) but no differences in parental involvement. In an expansion of the CPCs in Chicago, a full-day preschool intervention was associated with increased school readiness skills in 4 of 6 domains, attendance, and reduced chronic absences compared with a part-day program. These findings should be replicated in other programs and contexts.
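The figures in this abstract can be cross-checked with simple arithmetic: the six domain item counts should add up to the reported 49 items, and each reported difference should equal the full-day value minus the part-day value. A minimal sketch of that check, using only numbers quoted in the abstract (the variable names are illustrative):

```python
# Item counts per readiness domain, as listed in the abstract.
domain_items = {
    "socioemotional": 9, "language": 6, "literacy": 12,
    "math": 7, "physical health": 5, "cognitive development": 10,
}
total_items = sum(domain_items.values())   # the abstract reports 49 items

# (full-day value, part-day value, reported difference) from the abstract.
reported = {
    "socioemotional": (58.6, 54.5, 4.1),
    "language": (39.9, 37.3, 2.6),
    "math": (40.0, 36.4, 3.6),
    "physical health": (35.5, 33.6, 1.9),
    "total score": (298.1, 278.2, 19.9),
    "literacy": (64.5, 58.6, 5.9),
    "cognitive development": (59.7, 57.7, 2.0),
    "attendance rate (%)": (85.9, 80.4, 5.5),
    "chronic absence >=10% (%)": (53.0, 71.6, -18.6),
    "chronic absence >=20% (%)": (21.2, 38.8, -17.6),
}

# Recompute each difference to one decimal place and list any disagreement.
mismatches = [
    name for name, (full, part, diff) in reported.items()
    if round(full - part, 1) != diff
]
```

Rounding to one decimal place matches the precision reported in the abstract and avoids spurious mismatches from floating-point subtraction.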