Nickel and cobalt release from jewellery and metal clothing items in Korea.
Cheong, Seung Hyun; Choi, You Won; Choi, Hae Young; Byun, Ji Yeon
2014-01-01
In Korea, the prevalence of nickel allergy has shown a sharply increasing trend. Cobalt contact allergy is often associated with concomitant reactions to nickel, and is more common in Korea than in western countries. The aim of the present study was to investigate the prevalence of items that release nickel and cobalt on the Korean market. A total of 471 items that included 193 branded jewellery, 202 non-branded jewellery and 76 metal clothing items were sampled and studied with a dimethylglyoxime (DMG) test and a cobalt spot test to detect nickel and cobalt release, respectively. Nickel release was detected in 47.8% of the tested items. The positive rates in the DMG test were 12.4% for the branded jewellery, 70.8% for the non-branded jewellery, and 76.3% for the metal clothing items. Cobalt release was found in 6.2% of items. Among the types of jewellery, belts and hair pins showed higher positive rates in both the DMG test and the cobalt spot test. Our study shows that the prevalence of items that release nickel or cobalt among jewellery and metal clothing items is high in Korea. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
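As a consistency check on the figures above, the overall DMG-positive rate can be recomputed as a sample-size-weighted average of the three category rates. The sketch below is illustrative only: the names `categories` and `pooled_rate` are hypothetical, and the inputs are the rounded percentages quoted in the abstract, not raw counts from the paper.

```python
# Category sizes and DMG-positive rates (%) as quoted in the abstract.
categories = {
    "branded jewellery": (193, 12.4),
    "non-branded jewellery": (202, 70.8),
    "metal clothing items": (76, 76.3),
}

def pooled_rate(groups):
    """Sample-size-weighted average of per-group percentages."""
    total = sum(n for n, _ in groups.values())
    weighted = sum(n * pct for n, pct in groups.values())
    return weighted / total

total_items = sum(n for n, _ in categories.values())
overall = pooled_rate(categories)
print(total_items, round(overall, 1))
```

Under these assumptions the pooled figure agrees with the 47.8% reported for all 471 items.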
Garcia-Martinez, Irma; Weiss, Theresa R; Yousaf, Muhammad N; Ali, Ather; Mehal, Wajahat Z
2018-01-01
Leukocyte activation (LA) testing identifies food items that induce a patient specific cellular response in the immune system, and has recently been shown in a randomized double blinded prospective study to reduce symptoms in patients with irritable bowel syndrome (IBS). We hypothesized that test reactivity to particular food items, and the systemic immune response initiated by these food items, is due to the release of cellular DNA from blood immune cells. We tested this by quantifying total DNA concentration in the cellular supernatant of immune cells exposed to positive and negative foods from 20 healthy volunteers. To establish if the DNA release by positive samples is a specific phenomenon, we quantified myeloperoxidase (MPO) in cellular supernatants. We further assessed if a particular immune cell population (neutrophils, eosinophils, and basophils) was activated by the positive food items by flow cytometry analysis. To identify the signaling pathways that are required for DNA release we tested if specific inhibitors of key signaling pathways could block DNA release. Foods with a positive LA test result gave a higher supernatant DNA content when compared to foods with a negative result. This was specific as MPO levels were not increased by foods with a positive LA test. Protein kinase C (PKC) inhibitors resulted in inhibition of positive food stimulated DNA release. Positive foods resulted in CD63 levels greater than negative foods in eosinophils in 76.5% of tests. LA test identifies food items that result in release of DNA and activation of peripheral blood innate immune cells in a PKC dependent manner, suggesting that this LA test identifies food items that result in release of inflammatory markers and activation of innate immune cells. This may be the basis for the improvement in symptoms in IBS patients who followed an LA test guided diet.
ERIC Educational Resources Information Center
Missouri State Dept. of Elementary and Secondary Education, Jefferson City.
This document presents 10 released items from the Health/Physical Education Missouri Assessment Program (MAP) test given in the spring of 2000 to fifth graders. Items from the test sessions include: selected-response (multiple choice), constructed-response, and a performance event. The selected-response items consist of individual questions…
ERIC Educational Resources Information Center
Missouri State Dept. of Elementary and Secondary Education, Jefferson City.
This document presents 10 released items from the Health/Physical Education Missouri Assessment Program (MAP) test given in the spring of 2000 to ninth graders. Items from the test sessions include: selected-response (multiple choice), constructed-response, and a performance event. The selected-response items consist of individual questions…
Thyssen, Jacob Pontoppidan; Jellesen, Morten S; Menné, Torkil; Lidén, Carola; Julander, Anneli; Møller, Per; Johansen, Jeanne Duus
2010-08-01
Before the introduction of the EU Nickel Directive, concern was raised that manufacturers of jewellery might turn from the use of nickel to cobalt following the regulatory intervention on nickel exposure. The aim was to study 354 consumer items using the cobalt spot test. Cobalt release was assessed to obtain a risk estimate of cobalt allergy and dermatitis in consumers who would wear the jewellery. The cobalt spot test was used to assess cobalt release from all items. Microstructural characterization was made using scanning electron microscope (SEM) and energy-dispersive spectroscopy (EDS). Cobalt release was found in 4 (1.1%) of 354 items. All these had a dark appearance. SEM/EDS was performed on the four dark appearing items which showed tin-cobalt plating on these. This study showed that only a minority of inexpensive jewellery purchased in Denmark released cobalt when analysed with the cobalt spot test. As fashion trends fluctuate and we found cobalt release from dark appearing jewellery, cobalt release from consumer items should be monitored in the future. Industries may not be fully aware of the potential cobalt allergy problem.
Release from output interference in recognition memory: A test of the attention hypothesis.
Criss, Amy H; Salomão, Cristina; Malmberg, Kenneth J; Aue, William; Kılıç, Aslı; Claridge, MarkAvery
2018-05-01
Retrieval results in both costs and benefits to episodic memory. Output interference (OI) refers to the finding that episodic memory accuracy decreases with increasing test trials. Release from OI is the restoration of original accuracy at some point during the test. For example, a release from OI in recognition memory testing occurs when the semantic similarity between stimuli decreases midway through testing, suggesting that item representations stored on early trials cause interference on tests occurring on later trials to the extent that the earlier items share features with the latter items. In two recognition memory experiments, we demonstrate release from OI for words and faces. We also test whether release from OI is the result of interference or is due to a boost in attention caused by reorienting to a novel stimulus type. A test for the foils presented during the initial test list supports the interference account of OI. Implications for models of memory are discussed.
Identification of metallic items that caused nickel dermatitis in Danish patients.
Thyssen, Jacob P; Menné, Torkil; Johansen, Jeanne D
2010-09-01
Nickel allergy is prevalent as assessed by epidemiological studies. In an attempt to further identify and characterize sources that may result in nickel allergy and dermatitis, we analysed items identified by nickel-allergic dermatitis patients as causative of nickel dermatitis by using the dimethylglyoxime (DMG) test. Dermatitis patients with nickel allergy of current relevance were identified over a 2-year period in a tertiary referral patch test centre. When possible, their work tools and personal items were examined with the DMG test. Among 95 nickel-allergic dermatitis patients, 70 (73.7%) had metallic items investigated for nickel release. A total of 151 items were investigated, and 66 (43.7%) gave positive DMG test reactions. Objects were nearly all purchased or acquired after the introduction of the EU Nickel Directive. Only one object had been inherited, and only two objects had been purchased outside of Denmark. DMG testing is valuable as a screening test for nickel release and should be used to identify relevant exposures in nickel-allergic patients. Mainly consumer items, but also work tools used in an occupational setting, released nickel in dermatitis patients. This study confirmed 'risk items' from previous studies, including mobile phones.
Nickel on the market: a baseline survey of articles in 'prolonged contact' with skin.
Ringborg, Evelina; Lidén, Carola; Julander, Anneli
2016-08-01
In April 2014, the European Chemicals Agency defined the concept of 'prolonged contact with skin' as used in the EU nickel restriction. The objective was to establish a baseline of nickel-releasing items on the Swedish market conforming with the EU nickel restriction according to the definition of 'prolonged contact' with the skin. We performed a limited market survey in Stockholm, Sweden. Items with metallic parts that come into contact with the skin, except those explicitly mentioned in the legal text, were chosen. The dimethylglyoxime (DMG) test was used to evaluate nickel release. One hundred and forty-one items belonging to one of three categories - accessories, utensils for needlework, painting and writing (called utensils), and electronic devices - were tested in the study. Forty-four percent of all items were DMG test-positive (releasing nickel), and 9% gave a doubtful DMG test result. The large proportion of nickel-releasing items in the present study shows clearly that broader parts of industry need to take action to prevent nickel allergy. The high proportion of DMG test-positive items indicates that there is still much work to be done to reduce the nickel exposure of the population. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Nickel on the Swedish market. Follow-up after implementation of the Nickel Directive.
Lidén, Carola; Norberg, Kristina
2005-01-01
The Nickel Directive aims at the prevention of sensitization and elicitation of nickel dermatitis. It limits nickel release from, and nickel content in, certain items. The Directive came into full force by July 2001. The aim of this study was to investigate the frequency on the market of items that release nickel and of nickel content in piercing posts, 2 years after coming into force of the Directive. Of special interest was to study changes compared to the situation in 1999, when a baseline study had been carried out. Nickel release from 786 items covered by the Nickel Directive was tested with the dimethylglyoxime (DMG) test, and nickel content in 18 piercing posts was analysed. Nickel release was shown from 8% of items intended for direct and prolonged contact with the skin, and 17% of the piercing posts contained too much nickel, a decrease compared to 1999. There has been significant adaptation to the requirements of the Nickel Directive. The DMG test is useful for screening for nickel release and for monitoring the market. Provided there is further adaptation to the requirements, the risk of sensitization and elicitation of nickel dermatitis will be significantly reduced.
Nickel on the Swedish market: follow-up 10 years after entry into force of the EU Nickel Directive.
Biesterbos, Jacqueline; Yazar, Kerem; Lidén, Carola
2010-12-01
The EU Nickel Directive, aimed at primary and secondary prevention of nickel allergy by limitation of nickel release from certain items, came fully into force in July 2001. The objective was to assess the prevalence on the market of items with nickel release and to compare the outcome with previous studies performed in Sweden in 1999 and 2002-2003. Nickel release from 659 items covered by the EU Nickel Directive was assessed with the dimethylglyoxime (DMG) test. Special attention, as compared with the previous surveys, was given to cheap jewellery in street markets and sewing materials in haberdashery shops. Nickel release was shown for 9% of the tested items, all of which were intended for direct and prolonged contact with the skin. A high proportion of items bought at haberdashery shops and street markets, 34% and 61%, respectively, showed nickel release. The Swedish market for products intended for direct and prolonged contact with the skin has largely adapted to the Nickel Directive. It is suggested that authorities should monitor the market regularly and give attention to areas where compliance with the requirements is poor, for protection of public health. © 2010 John Wiley & Sons A/S.
ERIC Educational Resources Information Center
Brese, Falk, Ed.
2012-01-01
The goal for selecting the released set of test items was to have approximately 25% of each of the full item sets for mathematics content knowledge (MCK) and mathematics pedagogical content knowledge (MPCK) that would represent the full range of difficulty, content, and item format used in the TEDS-M study. The initial step in the selection was to…
ERIC Educational Resources Information Center
Missouri State Dept. of Elementary and Secondary Education, Jefferson City.
This document deals with testing in intermediate communication arts for seventh graders in Missouri public schools. The document contains the following items from the Session 1 Test Booklet: "Swimming in Snow" (Diana C. Conway) (Items 1, 2, and 5); "Discovery" (Marion Dane Bauer) (Item 13); writing prompt; and a writer's…
Science or Reading: What Is Being Measured by Standardized Tests?
ERIC Educational Resources Information Center
Visone, Jeremy D.
2010-01-01
This study examined reading issues associated with a standardized science test. Grade 11 students in Connecticut were shown released science test items and asked about the reading issues associated with the items. Findings suggested that students varied in their understanding of the nature of the items and in their ability to read for detail. The…
ERIC Educational Resources Information Center
O'Keeffe, Lisa
2016-01-01
Language is frequently discussed as a barrier to mathematics word problems. Hence this paper presents the initial findings of a linguistic analysis of numeracy skills test sample items. The theoretical perspective of multi-modal text analysis underpinned this study, in which data was extracted from the ten sample numeracy test items released by the…
Real Time Cockpit Resource Management (CRM) Training
2010-10-01
to post-test. [Table 4: Learning Scores for the Five Spiral 1 Classes (pretest, posttest, and difference scores for pilots and sensors)] ...results from the five Spiral 1 classes. [Table 6: Pretest/Posttest Gain Scores Associated with Each Learning Test Item] ...SMALL BUSINESS INNOVATION RESEARCH (SBIR) PHASE II REPORT. Distribution A: Approved for public release; distribution unlimited. (Approval given
PSSA Released Reading Items, 2000-2001. The Pennsylvania System of School Assessment.
ERIC Educational Resources Information Center
Pennsylvania State Dept. of Education, Harrisburg. Bureau of Curriculum and Academic Services.
This document contains materials directly related to the actual reading test of the Pennsylvania System of School Assessment (PSSA), including the reading rubric, released passages, selected-response questions with answer keys, performance tasks, and scored samples of students' responses to the tasks. All of these items may be duplicated to…
Metal exposures from aluminum cookware: An unrecognized public health risk in developing countries.
Weidenhamer, Jeffrey D; Fitzpatrick, Meghann P; Biro, Alison M; Kobunski, Peter A; Hudson, Michael R; Corbin, Rebecca W; Gottesfeld, Perry
2017-02-01
Removing lead from gasoline has resulted in decreases in blood lead levels in most of the world, but blood lead levels remain elevated in low and middle-income countries compared to more developed countries. Several reasons for this difference have been investigated, but few studies have examined the potential contribution from locally-made aluminum cookware. In a previous study of cookware from a single African country, Cameroon, artisanal aluminum cookware that is made from scrap metal released significant quantities of lead. In this study, 42 intact aluminum cookware items from ten developing countries were tested for their potential to release lead and other metals during cooking. Fifteen items released ≥1 microgram of lead per serving (250mL) when tested by boiling with dilute acetic acid for 2h. One pot, from Viet Nam, released 33, 1126 and 1426 micrograms per serving in successive tests. Ten samples released >1 microgram of cadmium per serving, and fifteen items released >1 microgram of arsenic per serving. The mean exposure estimate for aluminum was 125mg per serving, more than six times the World Health Organization's Provisional Tolerable Weekly Intake of 20mg/day for a 70kg adult, and 40 of 42 items tested exceeded this level. We conducted preliminary assessments of three potential methods to reduce metal leaching from this cookware. Coating the cookware reduced aluminum exposure per serving by >98%, and similar reductions were seen for other metals as well. Potential exposure to metals by corrosion during cooking may pose a significant and largely unrecognized public health risk which deserves urgent attention. Copyright © 2016 Elsevier B.V. All rights reserved.
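The "more than six times" comparison above follows from dividing the two figures quoted in the abstract. A minimal sketch with hypothetical variable names, taking the 20 mg/day value exactly as the abstract states it (the daily equivalent of the WHO guideline for a 70 kg adult) without independent verification:

```python
# Figures quoted in the abstract (not independently verified here).
mean_al_per_serving_mg = 125.0  # mean aluminum exposure estimate per 250 mL serving
guideline_mg_per_day = 20.0     # daily guideline value for a 70 kg adult, as quoted

# How many times the daily guideline value a single serving represents.
ratio = mean_al_per_serving_mg / guideline_mg_per_day
print(f"One serving is about {ratio:.2f} times the daily guideline value")
```

The comparison is per serving, so more than one serving a day would scale the ratio proportionally.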
ERIC Educational Resources Information Center
Colorado State Dept. of Education, Denver.
This document contains released reading comprehension passages, test items, and writing prompts from the Colorado Student Assessment Program for 2001. The sample questions and prompts are included without answers or examples of student responses. Test materials are included for: (1) Grade 4 Reading and Writing; (2) Grade 4 Lectura y Escritura…
ERIC Educational Resources Information Center
Lorié, William A.
2013-01-01
A reverse engineering approach to automatic item generation (AIG) was applied to a figure-based publicly released test item from the Organisation for Economic Cooperation and Development (OECD) Programme for International Student Assessment (PISA) mathematical literacy cognitive instrument as part of a proof of concept. The author created an item…
ERIC Educational Resources Information Center
Ryan, Joseph; Brockmann, Frank
2009-01-01
Equating is an essential tool in educational assessment due to the critical role it plays in several key areas: establishing validity across forms and years; fairness; test security; and, increasingly, continuity in programs that release items or require ongoing development. Although the practice of equating is rooted in long standing practices that…
Nickel release from surgical instruments and operating room equipment.
Boyd, Anne H; Hylwa, Sara A
2018-04-15
There has been no systematic study assessing nickel release from surgical instruments and equipment used within the operating suite. This equipment represents an important potential source of exposure for nickel-sensitive patients and hospital staff. The objective was to investigate nickel release from commonly used surgical instruments and operating room equipment. Using the dimethylglyoxime nickel spot test, a variety of surgical instruments and operating room equipment were tested for nickel release at our institution. Of the 128 surgical instruments tested, only 1 was positive for nickel release. Of the 43 operating room items tested, 19 were positive for nickel release, 7 of which have the potential for direct contact with patients and/or hospital staff. Hospital systems should be aware of surgical instruments and operating room equipment as potential sources of nickel exposure.
Three controversies over item disclosure in medical licensure examinations
Park, Yoon Soo; Yang, Eunbae B.
2015-01-01
In response to views on the public's right to know, there is growing attention to item disclosure – release of items, answer keys, and performance data to the public – in medical licensure examinations and its potential impact on the test's ability to measure competence and select qualified candidates. Recent debates on this issue have sparked legislative action internationally, including in South Korea, with prior discussions among North American countries dating back over three decades. The purpose of this study is to identify and analyze three issues associated with item disclosure in medical licensure examinations – 1) fairness and validity, 2) impact on passing levels, and 3) utility of item disclosure – by synthesizing existing literature in relation to standards in testing. Historically, the controversy over item disclosure has centered on fairness and validity. Proponents of item disclosure stress test takers’ right to know, while opponents argue from a validity perspective. Item disclosure may bias item characteristics, such as difficulty and discrimination, and has consequences for setting passing levels. To date, there has been limited research on the utility of item disclosure for large scale testing. These issues require ongoing and careful consideration. PMID:26374693
Examination of the PROMIS upper extremity item bank.
Hung, Man; Voss, Maren W; Bounsanga, Jerry; Crum, Anthony B; Tyser, Andrew R
Study design: clinical measurement. The psychometric properties of the PROMIS v1.2 UE item bank were tested on various samples prior to its release, but have not been fully evaluated among the orthopaedic population. This study assesses the performance of the UE item bank within the UE orthopaedic patient population. The UE item bank was administered to 1197 adult patients presenting to a tertiary orthopaedic clinic specializing in hand and UE conditions and was examined using traditional statistics and Rasch analysis. The UE item bank fits a unidimensional model (outfit MNSQ range from 0.64 to 1.70) and has adequate reliabilities (person = 0.84; item = 0.82) and local independence (item residual correlations range from -0.37 to 0.34). Only one item exhibits gender differential item functioning. Most items target low levels of function. The UE item bank is a useful clinical assessment tool. Additional items covering higher functions are needed to enhance validity. Supplemental testing is recommended for patients at higher levels of function until more high function UE items are developed. Level of evidence: 2c. Copyright © 2016 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Muratti, Jose E.; And Others
A parallel Spanish edition was developed of released objectives and objective-referenced items used in the National Assessment of Educational Progress (NAEP) in the field of Career and Occupational Development (COD). The Spanish edition was designed to assess the identical skills, attitudes, concepts, and knowledge of Spanish-dominant students…
Less we forget: retrieval cues and release from retrieval-induced forgetting.
Jonker, Tanya R; Seli, Paul; Macleod, Colin M
2012-11-01
Retrieving some items from memory can impair the subsequent recall of other related but not retrieved items, a phenomenon called retrieval-induced forgetting (RIF). The dominant explanation of RIF, the inhibition account, asserts that forgetting occurs because related items are suppressed during retrieval practice to reduce retrieval competition. This item inhibition persists, making it more difficult to recall the related items on a later test. In our set of experiments, each category was designed such that each exemplar belonged to one of two subcategories (e.g., each BIRD exemplar was either a bird of prey or a pet bird), but this subcategory information was not made explicit during study or retrieval practice. Practicing retrieval of items from only one subcategory led to RIF for items from the other subcategory when cued only with the overall category label (BIRD) at test. However, adapting the technique of Gardiner, Craik, and Birtwistle (Journal of Verbal Learning and Verbal Behavior 11:778-783, 1972), providing subcategory cues during the final test eliminated RIF. The results challenge the inhibition account's fundamental assumption of cue independence but are consistent with a cue-based interference account.
A Comparative Study on the Lot Release Systems for Vaccines as of 2016.
Fujita, Kentaro; Naito, Seishiro; Ochiai, Masaki; Konda, Toshifumi; Kato, Atsushi
2017-09-25
Many countries have already established their own vaccine lot release system designed for each country's situation, while the World Health Organization promotes the convergence of these regulatory systems so that vaccines of assured quality are provided globally. We conducted a questionnaire-based investigation of the lot release systems for vaccines in 7 countries and 2 regions. We found that a review of the summary protocol by the National Regulatory Authorities was commonly applied for the independent lot release of vaccines; however, we also noted some diversity between countries, especially in regard to the testing policy. Some countries and regions, including Japan, regularly tested every lot of vaccines, whereas the frequency of these tests was reduced in other countries and regions as determined based on the risk assessment of these products. Test items selected for the lot release varied among the countries or regions investigated, although there was a tendency to prioritize the potency tests. An understanding of the lot release policy may contribute to improving and harmonizing the lot release system globally in the future.
ERIC Educational Resources Information Center
Missouri State Dept. of Elementary and Secondary Education, Jefferson City.
This booklet contains sample items from the Missouri social studies test for eighth graders. The first sample is based on a speech delivered by Elizabeth Cady Stanton in the mid-1880s, which proposed a new approach to raising girls. Students are directed to use their own knowledge and the speech excerpt to do three activities. The second sample…
Evaluation of asbestos-containing products and released fibers in home appliances.
Hwang, Sung Ho; Park, Wha Me
2016-09-01
The purpose of this study was to detect asbestos-containing products and released asbestos fibers from home appliances. The authors investigated a total of 414 appliances manufactured between 1986 and 2007. Appliances were divided into three categories: large-sized electric appliances, small-sized electric appliances, and household items. Analysis for asbestos-containing material (ACM) was performed using polarized light microscopy (PLM) and stereoscopic microscopy. Air sampling was performed to measure airborne concentration of asbestos using a phase-contrast microscope (PCM). The results of the analysis for ACM in appliances show that large-sized electric appliances (refrigerators, washing machines, kimchi-refrigerators) and household items (bicycles, motorcycles, gas boilers) contain asbestos material and small-sized electric appliances do not contain asbestos material. All appliances with detected asbestos material showed typical characteristics of chrysotile (7-50%) and tremolite (7-10%). No released fibers of ACM were detected from the tested appliances when the appliances were operating. This study gives the basic information on asbestos risk to people who use home appliances.
The impact of common metal allergens in daily devices.
Hamann, Dathan; Hamann, Carsten R; Thyssen, Jacob P
2013-10-01
We are widely exposed to metal allergens in our daily doings. As exposures constantly change because of fashion trends and technological developments, there is a need for a continuous update of patch testers. An overview of consumer metal exposure studies that have been published in 2012 and 2013 is provided as well as lists of common metal exposures. Nickel release in concentrations that cause nickel allergy and contact dermatitis is seen from laptop computers. Cobalt is found in leather as a dye and may cause chronic dermatitis. Chromium is used as a dye and for tanning in leather items and is found in nearly all shoes and released from a high proportion. New consumer items should continuously be considered and investigated for metal release when patients with positive patch test results to metal allergens are evaluated.
Durand-Rivera, Alfredo; Alatorre-Miguel, Efren; Zambrano-Sánchez, Elizabeth; Reyes-Legorreta, Celia
2015-01-01
Attention deficit hyperactivity disorder (ADHD) affects 5-6% of school-aged children worldwide. Pharmacological therapy is considered the first-line treatment and methylphenidate (MPH) is considered the first-choice medication. There are two formulations: immediate release (IR) MPH and long-acting (or extended release) formulation (MPH-ER). In this work, we measure the efficacy of treatment with both formulations over one month using Conners' scales and electroencephalography (EEG). Results: For the IR group, all items on the parents' and teachers' Conners tests showed significant differences towards improvement, except perfectionism and emotional instability on the teachers' test. For the ER group, the items on the parents' Conners test without significant differences were psychosomatic and emotional instability; on the teachers' test, they were hyperactivity and perfectionism. Comparing the Conners questionnaires (parents versus teachers), we find significant differences before and after treatment in hyperactivity, perfectionism, psychosomatics, DSM-IV hyperactive-impulsive, and DSM-IV total. In the EEG, the Wilcoxon test showed a significant difference (P < 0.0001). Both formulations are suitable for managing ADHD and have the same effect on the symptomatology and on the EEG. PMID:25838946
Encoding Processes and Sex-Role Preferences
ERIC Educational Resources Information Center
Kail, Robert V., Jr.; Levine, Laura E.
1976-01-01
Seven and 10-year-olds were tested on memory and sex-role preference tasks. The memory task was the Wickens release from proactive inhibition paradigm in which short-term recall of words is tested on successive trials. Children selected favorite pictures from an array including masculine and feminine items. (JH)
ERIC Educational Resources Information Center
Shanmugam, S. Kanageswari Suppiah; Lan, Ong Saw
2013-01-01
Purpose: This study aims to investigate the validity of using a bilingual test to measure the mathematics achievement of students who have limited English proficiency (LEP). The bilingual test and the English-only test consist of 20 computation and 20 word problem multiple-choice questions (from TIMSS 2003 and 2007 released items). The bilingual test…
1983-06-01
...Approved for public release... GPETE Initial Outfitting (GINO)... GPETE End Item Replacement (GEIR)... GINO Requirements Determination... interval of a sample of 305 GPETE items increased from 8.8 to 13.6 months. The estimated annual savings resulting from this increase was 18.000
Virginia Standards of Learning Assessments. Grade 8 Released Test Items, 1998.
ERIC Educational Resources Information Center
Virginia State Dept. of Education, Richmond. Div. of Assessment and Reporting.
Beginning in Spring 1998, Virginia students participated in the Standards of Learning (SOL) assessments designed to test student knowledge of the content and skills specified in the state's standards. This document contains questions that approximately 79,000 students in grade 8 were required to answer as part of the SOL assessments. These…
Virginia Standards of Learning Assessments. Grade 5 Released Test Items, 1998.
ERIC Educational Resources Information Center
Virginia State Dept. of Education, Richmond. Div. of Assessment and Reporting.
Beginning in Spring 1998, Virginia students participated in the Standards of Learning (SOL) assessments designed to test student knowledge of the content and skills specified in the state's standards. This document contains questions that approximately 80,000 students in grade 5 were required to answer as part of the SOL assessments. These…
Virginia Standards of Learning Assessments. Grade 3 Released Test Items, 1998.
ERIC Educational Resources Information Center
Virginia State Dept. of Education, Richmond. Div. of Assessment and Reporting.
Beginning in Spring 1998, Virginia students participated in the Standards of Learning (SOL) assessments designed to test student knowledge of the content and skills specified in the state's standards. This document contains questions that approximately 83,000 students in grade 3 were required to answer as part of the SOL assessments. These…
Chromium(VI) release from leather and metals can be detected with a diphenylcarbazide spot test.
Bregnbak, David; Johansen, Jeanne D; Jellesen, Morten S; Zachariae, Claus; Thyssen, Jacob P
2015-11-01
Along with chromium, nickel and cobalt are the clinically most important metal allergens. However, unlike for nickel and cobalt, there is no validated colorimetric spot test that detects chromium. Such a test could help both clinicians and their patients with chromium dermatitis to identify culprit exposures. To evaluate the use of diphenylcarbazide (DPC) as a spot test reagent for the identification of chromium(VI) release. A colorimetric chromium(VI) spot test based on DPC was prepared and used on different items from small market surveys. The DPC spot test was able to identify chromium(VI) release at 0.5 ppm without interference from other pure metals, alloys, or leather. A market survey using the test showed no chromium(VI) release from work tools (0/100). However, chromium(VI) release from metal screws (7/60), one earring (1/50), leather shoes (4/100) and leather gloves (6/11) was observed. We found no false-positive test reactions. Confirmatory testing was performed with X-ray fluorescence (XRF) and spectrophotometrically on extraction fluids. The use of DPC as a colorimetric spot test reagent appears to be a good and valid test method for detecting the release of chromium(VI) ions from leather and metal articles. The spot test has the potential to become a valuable screening tool. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Small-Item Vapor Test Method, FY11 Release
2012-07-01
to this test procedure is provided alphabetically in the following list: absorption: The uptake of a contaminant INTO the volume of a material. The... powders, wipes), or gas-phase (fumigants, including aerosols). decontamination process: The process of making any person, object, or area safe by...with another contaminant. Generally, bare metals and glass are nonsorptive materials for some agents. operational decontamination: Decontamination
Spartan Release Engagement Mechanism (REM) stress and fracture analysis
NASA Technical Reports Server (NTRS)
Marlowe, D. S.; West, E. J.
1984-01-01
The revised stress and fracture analysis of the Spartan REM hardware for current load conditions and mass properties is presented. The stress analysis was performed using a NASTRAN math model of the Spartan REM adapter, base, and payload. Appendix A contains the material properties, loads, and stress analysis of the hardware. The computer output and model description are in Appendix B. Factors of safety used in the stress analysis were 1.4 on tested items and 2.0 on all other items. Fracture analysis of the items considered fracture critical was accomplished using the MSFC Crack Growth Analysis code. Loads and stresses were obtained from the stress analysis. The fracture analysis notes are located in Appendix A and the computer output in Appendix B. All items analyzed met design and fracture criteria.
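As a rough illustration of how the two factors of safety quoted above might enter a stress check, the sketch below computes a margin of safety as allowable / (FS × applied) − 1. This formula is a common textbook convention and is an assumption here; the abstract does not state which margin definition the report used. The function name and the sample stress values are hypothetical.

```python
# Hypothetical sketch: margin of safety with the factors of safety quoted in the abstract.
# The margin formula is an assumed convention, not taken from the report itself.

FS_TESTED = 1.4    # factor of safety for tested items (from the abstract)
FS_UNTESTED = 2.0  # factor of safety for all other items (from the abstract)


def margin_of_safety(allowable_stress, applied_stress, tested):
    """Return allowable / (FS * applied) - 1; a non-negative value passes the check."""
    if applied_stress <= 0:
        raise ValueError("applied_stress must be positive")
    factor = FS_TESTED if tested else FS_UNTESTED
    return allowable_stress / (factor * applied_stress) - 1.0


if __name__ == "__main__":
    # Illustrative numbers only (same units for both stresses).
    print(margin_of_safety(280.0, 100.0, tested=True))
    print(margin_of_safety(280.0, 100.0, tested=False))
```

Under this convention the same part shows a smaller margin when it is treated as untested, because the larger factor of safety inflates the design load.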
Proactive interference from items previously stored in visual working memory.
Makovski, Tal; Jiang, Yuhong V
2008-01-01
This study investigates the fate of information that was previously stored in visual working memory but that is no longer needed. Previous research has found inconsistent results, with some showing effective release of irrelevant information and others showing proactive interference. Using change detection tasks of colors or shapes, we show that participants tend to falsely classify a changed item as "no change" if it matches one of the memory items on the preceding trial. The interference is spatially specific: Memory for the preceding trial interferes more if it matches the feature value and the location of a test item than if it does not. Interference results from retaining information in visual working memory, since it is absent when items on the preceding trials are passively viewed, or are attended but not memorized. We conclude that people cannot fully eliminate unwanted visual information from current working memory tasks.
78 FR 25723 - National Assessment Governing Board; Meeting
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-02
..., assistive listening devices, materials in alternative format) should notify Munira Mwalimu at 202-357-6938.... to review secure NAEP test materials for Science Interactive Computer Tasks (ICTs) at grades 4, 8... provided with secure items and materials which are not yet available for release to the general public...
NASA Technical Reports Server (NTRS)
Bledsoe, Kristin
2013-01-01
The Crew Exploration Vehicle Parachute Assembly System (CPAS) is the parachute system for NASA's Orion spacecraft. The test program consists of numerous drop tests, wherein a test article rigged with parachutes is extracted or released from an aircraft. During such tests, range safety is paramount, as is the recoverability of the parachutes and test article. It is crucial to establish an aircraft release point that will ensure that the article and all items released from it will land in safe locations. A new footprint predictor tool, called Sasquatch, was created in MATLAB. This tool takes in a simulated trajectory for the test article, information about all released objects, and atmospheric wind data (simulated or actual) to calculate the trajectories of the released objects. Dispersions are applied to the landing locations of those objects, taking into account the variability of winds, aircraft release point, and object descent rate. Sasquatch establishes a payload release point (e.g., where the payload will be extracted from the carrier aircraft) that will ensure that the payload and all objects released from it will land in a specified cleared area. The landing locations (the final points in the trajectories) are plotted on a map of the test range. Sasquatch was originally designed for CPAS drop tests and includes extensive information about both the CPAS hardware and the primary test range used for CPAS testing. However, it can easily be adapted for more complex CPAS drop tests, other NASA projects, and commercial partners. CPAS has developed the Sasquatch footprint tool to ensure range safety during parachute drop tests. Sasquatch is well correlated to test data and continues to ensure the safety of test personnel as well as the safe recovery of all equipment. The tool will continue to be modified based on new test data, improving predictions and providing added capability to meet the requirements of more complex testing.
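The idea of combining wind data with an object's descent rate to estimate where a released object lands can be sketched in a few lines. The example below is not the Sasquatch algorithm (which was written in MATLAB and is not described in detail here); it is a minimal layered wind-drift model under the assumptions of a constant descent rate and a constant wind within each altitude layer. The function name, the layer format, and the numbers are all hypothetical.

```python
# Hypothetical sketch of a layered wind-drift estimate; not the actual Sasquatch method.
# Assumes a constant descent rate and a constant wind vector within each altitude layer.

def landing_offset(release_alt_m, descent_rate_mps, wind_layers):
    """Return (east_m, north_m) drift from the release point to the landing point.

    wind_layers: list of (top_alt_m, bottom_alt_m, wind_east_mps, wind_north_mps).
    """
    if descent_rate_mps <= 0:
        raise ValueError("descent_rate_mps must be positive")
    east = 0.0
    north = 0.0
    for top, bottom, wind_east, wind_north in wind_layers:
        upper = min(top, release_alt_m)  # object enters the layer no higher than release
        if upper <= bottom:
            continue                     # released below this layer, so no time spent in it
        seconds_in_layer = (upper - bottom) / descent_rate_mps
        east += wind_east * seconds_in_layer
        north += wind_north * seconds_in_layer
    return east, north


if __name__ == "__main__":
    layers = [(1000.0, 500.0, 5.0, 0.0), (500.0, 0.0, 0.0, -2.0)]
    print(landing_offset(1000.0, 10.0, layers))
```

Varying the wind, release point, and descent rate inputs over plausible ranges and collecting the resulting offsets is one simple way to approximate the kind of dispersion the abstract mentions.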
Development and Overview of CPAS Sasquatch Airdrop Landing Location Predictor Software
NASA Technical Reports Server (NTRS)
Bledsoe, Kristin J.; Bernatovich, Michael A.
2015-01-01
The Capsule Parachute Assembly System (CPAS) is the parachute system for NASA's Orion spacecraft. CPAS is currently in the Engineering Development Unit (EDU) phase of testing. The test program consists of numerous drop tests, wherein a test article rigged with parachutes is extracted from an aircraft. During such tests, range safety is paramount, as is the recoverability of the parachutes and test article. It is crucial to establish a release point from the aircraft that will ensure that the article and all items released from it during flight will land in a designated safe area. The Sasquatch footprint tool was developed to determine this safe release point and to predict the probable landing locations (footprints) of the payload and all released objects. In 2012, a new version of Sasquatch, called Sasquatch Polygons, was developed that significantly upgraded the capabilities of the footprint tool. Key improvements were an increase in the accuracy of the predictions, and the addition of an interface with the Debris Tool (DT), an in-flight debris avoidance tool for use on the test observation helicopter. Additional enhancements include improved data presentation for communication with test personnel and a streamlined code structure. This paper discusses the development, validation, and performance of Sasquatch Polygons, as well as its differences from the original Sasquatch footprint tool.
Small-Item Contact Test Method, FY11 Release
2012-07-01
the exposure mass of the agent. APPENDIX 8 Comparison of data using different contact swabs should include consideration for the material-uptake ...Terminology specific to this test procedure is provided alphabetically in the following list. • absorption: The uptake of a contaminant INTO the...substance with the ability to remove and/or neutralize chemical agents on/in surfaces of interest. The decontaminant can be liquid, solid (powders, wipes
48 CFR 245.7101-3 - DD Form 1348-1, DoD Single Line Item Release/Receipt Document.
Code of Federal Regulations, 2010 CFR
2010-10-01
... PROPERTY Plant Clearance Forms 245.7101-3 DD Form 1348-1, DoD Single Line Item Release/Receipt Document. Use for shipments of excess industrial plant equipment and contractor inventory redistribution system...
Guide to Mathematics Released Items: Understanding Scoring. 2015
ERIC Educational Resources Information Center
Partnership for Assessment of Readiness for College and Careers, 2015
2015-01-01
The 2014-2015 administrations of the PARCC assessment included two separate test administration windows: the Performance-Based Assessment (PBA) and the End-of-Year (EOY), both of which were administered in paper-based and computer-based formats. The first window was for administration of the PBA, and the second window was for the administration of…
Establishing Reliability and Validity of the Criterion Referenced Exam of GeoloGy Standards EGGS
NASA Astrophysics Data System (ADS)
Guffey, S. K.; Slater, S. J.; Slater, T. F.; Schleigh, S.; Burrows, A. C.
2016-12-01
Discipline-based geoscience education researchers have considerable need for a criterion-referenced, easy-to-administer and -score conceptual diagnostic survey for undergraduates taking introductory science survey courses in order for faculty to better be able to monitor the learning impacts of various interactive teaching approaches. To support ongoing education research across the geosciences, we are continuing to rigorously and systematically work to firmly establish the reliability and validity of the recently released Exam of GeoloGy Standards, EGGS. In educational testing, reliability refers to the consistency or stability of test scores whereas validity refers to the accuracy of the inferences or interpretations one makes from test scores. There are several types of reliability measures being applied to the iterative refinement of the EGGS survey, including test-retest, alternate form, split-half, internal consistency, and interrater reliability measures. EGGS rates strongly on most measures of reliability. For one, Cronbach's alpha provides a quantitative index of the extent to which students answer items consistently throughout the test, and it measures inter-item correlations. Traditional item analysis methods, including item difficulty and item discrimination, further quantify the degree to which a particular item reliably assesses students. Validity, on the other hand, is perhaps best described by the word accuracy. For example, content validity is the extent to which a measurement reflects the specific intended domain of the content, stemming from judgments of people who are either experts in the testing of that particular content area or are content experts.
Perhaps more importantly, face validity is a judgement of how well an instrument reflects the science "at face value" and refers to the extent to which a test appears to measure the targeted scientific domain as viewed by laypersons, examinees, test users, the public, and other invested stakeholders.
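The reliability and item-analysis measures named in this abstract can be made concrete with a small sketch. The code below computes Cronbach's alpha from the standard formula, plus a simple item difficulty (proportion correct) and an upper-lower discrimination index. It is an illustration only: the abstract does not say how the EGGS authors computed these statistics, and the function names, the half-split used for discrimination, and the toy score matrix are assumptions.

```python
# Hypothetical sketch of the statistics named in the abstract; not the EGGS authors' code.
from statistics import pvariance


def cronbach_alpha(scores):
    """scores: one list per student, one numeric score per item (e.g. 0 or 1)."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def item_difficulty(scores, i):
    """Proportion of students answering item i correctly (for 0/1 scoring)."""
    return sum(row[i] for row in scores) / len(scores)


def item_discrimination(scores, i):
    """Upper-lower index: proportion correct in the top half minus the bottom half."""
    ranked = sorted(scores, key=sum)  # students ordered by total score
    half = len(ranked) // 2
    if half == 0:
        raise ValueError("need at least two students")
    low = sum(row[i] for row in ranked[:half]) / half
    high = sum(row[i] for row in ranked[-half:]) / half
    return high - low


if __name__ == "__main__":
    toy = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]  # perfectly consistent toy data
    print(cronbach_alpha(toy), item_difficulty(toy, 0), item_discrimination(toy, 0))
```

Population variance is used for both the item and total-score variances; using sample variance for both would give the same alpha, since the correction factor cancels in the ratio.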
The VCOP Scale: a measure of overprotection in parents of physically vulnerable children.
Wright, L; Mullen, T; West, K; Wyatt, P
1993-11-01
A scale is developed for measuring the overprotecting vs. optimal developmental stimulation tendencies for parents of physically "vulnerable" children. A series of items were administered to parents whose parenting techniques had been rated as either highly overprotective or as optimal by a group of MDs and other professionals. Correlations were estimated between each of the items and parental tendencies as rated by professionals. Twenty-eight items were selected that provided maximum prediction of over-protection. The resulting R2 was extraordinarily high (.94). Coefficient alpha and test-retest coefficients were acceptable. It is hoped that release of the new instrument (VCOPS) at this time will allow others to join in determining the clinical and experimental validity of this scale.
USDA-ARS's Scientific Manuscript database
We determined the feasibility of using unmanned aerial vehicle (UAV) video monitoring to predict intake of discrete food items of rangeland-raised Raramuri Criollo non-nursing beef cows. Thirty-five cows were released into a 405-m2 rectangular dry lot, either in pairs (pilot tests) or individually (...
Guide to English Language Arts/Literacy Released Items: Understanding Scoring. 2015
ERIC Educational Resources Information Center
Partnership for Assessment of Readiness for College and Careers, 2015
2015-01-01
The Partnership for Assessment of Readiness for College and Careers (PARCC) is a group of states working together to develop a modern assessment that replaces previous state standardized tests. It provides better information for teachers and parents to identify where a student needs help, or is excelling, so they are able to enhance instruction to…
Assessing Conceptual and Algorithmic Knowledge in General Chemistry with ACS Exams
ERIC Educational Resources Information Center
Holme, Thomas; Murphy, Kristen
2011-01-01
In 2005, the ACS Examinations Institute released an exam for first-term general chemistry in which items are intentionally paired with one conceptual and one traditional item. A second-term, paired-questions exam was released in 2007. This paper presents an empirical study of student performances on these two exams based on national samples of…
Plastic debris retention and exportation by a mangrove forest patch.
Ivar do Sul, Juliana A; Costa, Monica F; Silva-Cavalcanti, Jacqueline S; Araújo, Maria Christina B
2014-01-15
An experiment observed the behavior of selected tagged plastic items deliberately released in different habitats of a tropical mangrove forest in NE Brazil in late rainy (September) and late dry (March) seasons. Significant differences were not reported among seasons. However, marine debris retention varied among habitats, according to characteristics such as hydrodynamics (i.e., flow rates and volume transported) and relative vegetation (Rhizophora mangle) height and density. The highest grounds retained significantly more items when compared to the borders of the river and the tidal creek. Among the tagged items used, PET bottles were observed most often and margarine tubs least often, the latter being easily transported to adjacent habitats. Plastic bags were the items most retained near the releasing site. The balance between items retained and items lost was positive, demonstrating that mangrove forests tend to retain plastic marine debris for long periods (months-years). Copyright © 2013 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Reed, Deborah K.
2015-01-01
This study explored the data-based decision making of 12 teachers in grades 6-8 who were asked about their perceptions and use of three required interim measures of reading performance: oral reading fluency (ORF), retell, and a benchmark comprised of released state test items. Focus group participants reported they did not believe the benchmark or…
Items Supporting the Hanford Internal Dosimetry Program Implementation of the IMBA Computer Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carbaugh, Eugene H.; Bihl, Donald E.
2008-01-07
The Hanford Internal Dosimetry Program has adopted the computer code IMBA (Integrated Modules for Bioassay Analysis) as its primary code for bioassay data evaluation and dose assessment using methodologies of ICRP Publications 60, 66, 67, 68, and 78. The adoption of this code was part of the implementation plan for the June 8, 2007 amendments to 10 CFR 835. This information release includes action items unique to IMBA that were required by PNNL quality assurance standards for implementation of safety software. Copies of the IMBA software verification test plan and the outline of the briefing given to new users are also included.
Evidence for proactive interference in the focus of attention of working memory.
Carroll, Lauren M; Jalbert, Annie; Penney, Alexander M; Neath, Ian; Surprenant, Aimée M; Tehan, Gerald
2010-09-01
Proactive interference (PI) occurs when an earlier item interferes with memory for a newer item. Whereas some researchers (e.g., Surprenant & Neath, 2009a) argue that PI can be observed in all memory systems, some multiple systems theorists (e.g., Cowan, 1999) propose that items in the focus of attention of working memory are immune to PI. Two experiments tested whether PI occurs when the to-be-remembered items are assumed, by multiple-systems theorists, to be held in the focus of attention. In each experiment, subjects saw four trials in a row with the same type of to-be-remembered items, followed by four trials in a row with a different type of material. On each trial, only 3 stimuli were shown, which is below the capacity limit of the focus of attention, and subjects were asked if a probe item was one of those 3 items seen. In both experiments, response time increased from Trial 1 to Trial 4, suggesting that items from the earlier trials interfered with memory on the later trials. In addition, release from PI was shown in that response times decreased with a change of materials. The results replicate those first reported by Hanley and Scheirer (1975), and pose a problem for theorists who argue that parts of short-term memory are immune to PI. Copyright 2010 APA, all rights reserved.
The New Jettison Policy for the International Space Station
NASA Technical Reports Server (NTRS)
Johnson, Nicholas L.
2006-01-01
During more than seven years of operations by the International Space Station (ISS), approximately three dozen pieces of debris were released and subsequently cataloged by the U.S. Space Surveillance Network (SSN). The individual mass of these objects ranged from less than 1 kg to 70 kg. Although some of these debris were separated from the ISS accidentally, some were intentionally cast-off, especially the larger items. In addition, small operational satellites are candidates for launch from the ISS, such as the TNS-O satellite deployed from ISS in March 2005. Recently an official ISS Jettison Policy was developed to ensure that decisions to deliberately release objects in the future were based upon a complete evaluation of the benefits and risks to the ISS, other resident space objects, and people on the Earth. The policy identifies four categories of items which might be considered for release: (1) items that pose a safety issue for return on-board a visiting vehicle, (2) items that negatively impact ISS utilization, return, or on-orbit stowage manifests, (3) items that represent an EVA timeline savings, and (4) items that are designed for jettison. Some of the principal issues to be addressed during this evaluation process are the potential for the object to recontact the ISS within the first two days after jettison, the potential of the object to breakup prior to reentry, the ability of the SSN to track the object, and the risk to people on Earth from components which might survive reentry. This paper summarizes the history of objects released from ISS, examines the specifics of the ISS jettison policy, and addresses the overall impact of ISS debris on the space environment.
Antal, Borbála; Kuki, Ákos; Nagy, Lajos; Nagy, Tibor; Zsuga, Miklós; Kéki, Sándor
2016-07-01
Residues of chemicals on clothing products were examined by direct analysis in real-time (DART) mass spectrometry. Our experiments have revealed the presence of more than 40 chemicals in 15 different clothing items. The identification was confirmed by DART tandem mass spectrometry (MS/MS) experiments for 14 compounds. The most commonly detected hazardous substances were nonylphenol ethoxylates (NPEs), phthalic acid esters (phthalates), amines released by azo dyes, and quinoline derivatives. DART-MS was able to detect NPEs on the skin of the person wearing the clothing item contaminated by NPE residuals. An automated data acquisition and processing method was developed and tested for the recognition of NPE residues, thereby reducing the analysis time.
NASA Technical Reports Server (NTRS)
1967-01-01
Immediately following the Apollo 204 accident of January 27, 1967, all associated equipment and material were impounded. Release of this equipment and material for normal use was under the close control of the Apollo 204 Review Board. Apollo Review Board Administrative Procedure No. 11, February 11, 1967, established the Apollo 204 Review Board Material Release Record (MRR). This MRR was the official form used to release material from full impoundment and was valid only after being approved by the Board and signed by a Member. The form was used as the authority to place any impounded item into one of the three categories defined in Administrative Procedure No. 11. This appendix contains all of the authorized MRRs. Each item submitted on an MRR was given a control number; a description, including the part number and serial number; the relevance and location to the accident; any constraints before release; and the control category. The categories placed on the equipment were as follows: Category A - Items which may have a significant influence or bearing on the results or findings of the Apollo 204 Review Board; Category B - All material other than Category A which is considered relevant to the Apollo 204 Review Board investigation; Category C - Material released from Board jurisdiction. Several classes of equipment were released by special Board action prior to the establishment of the MRR system. The operating procedure for release of these classes is Enclosure F-1 to this appendix.
Testing therapeutic potency of anticancer drugs in animal studies: a commentary.
Den Otter, Willem; Steerenberg, Peter A; Van der Laan, Jan Willem
2002-04-01
Regulatory authorities for medicines in European countries deal with many applications for admission to the market of anticancer drugs. Each application must be supported by preclinical and clinical data, among which testing of the therapeutic activity of drugs in animals is important. Recently, the Committee for Proprietary Medicinal Products (CPMP) has released a note for guidance on the preclinical evaluation of anticancer medicinal products. This note provides only general statements regarding tests of anticancer drugs in rodents. This stimulates considerations on how to organize and how to evaluate these tests. In this article we describe our considerations regarding these items based on our experience with applications in The Netherlands since 1993. (c) 2002 Elsevier Science (USA).
USDA National Nutrient Database for Standard Reference, Release 24
USDA-ARS's Scientific Manuscript database
The USDA Nutrient Database for Standard Reference, Release 24 contains data for over 7,900 food items for up to 146 food components. It replaces the previous release, SR23, issued in September 2010. Data in SR24 supersede values in the printed Handbooks and previous electronic releases of the databa...
Consumer leather exposure: an unrecognized cause of cobalt sensitization.
Thyssen, Jacob P; Johansen, Jeanne D; Jellesen, Morten S; Møller, Per; Sloth, Jens J; Zachariae, Claus; Menné, Torkil
2013-11-01
A patient who had suffered from persistent generalized dermatitis for 7 years was diagnosed with cobalt sensitization, and his leather couch was suspected as the culprit, owing to the clinical presentation mimicking allergic chromium dermatitis resulting from leather furniture exposure. The cobalt spot test, X-ray fluorescence, inductively coupled plasma mass spectrometry and scanning electron microscopy were used to determine cobalt content and release from the leather couch that caused the dermatitis and from 14 randomly collected samples of furniture leather. The sample from the patient's leather couch, but none of the 14 random leather samples, released cobalt in high concentrations. Dermatitis cleared when the patient stopped using his couch. Cobalt is used in the so-called pre-metallized dyeing of leather products. Repeated studies have found high levels of cobalt sensitization, but not nickel sensitization, in patients with foot dermatitis. We raise the possibility that cobalt may be widely released from leather items, and advise dermatologists to consider this in patients with positive cobalt patch test reactions. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Crosby, Richard; Shrier, Lydia A
2013-04-01
The purpose of this study was to develop and test a sexual-partner-related risk behavior index to identify high-risk individuals most likely to have a sexually transmitted infection (STI). Patients from five STI and adolescent medical clinics in three US cities were recruited (N = 928; M age = 29.2 years). Data were collected using audio-computer-assisted self-interviewing. Of seven sexual-partner-related variables, those that were significantly associated with the outcomes were combined into a partner-related risk behavior index. The dependent variables were laboratory-confirmed infection with Chlamydia trachomatis, Neisseria gonorrhoeae, and/or Trichomonas vaginalis. Nearly one-fifth of the sample (169/928; 18.4%) tested positive for an STI. Three of the seven items were significantly associated with having one or more STIs: sex with a newly released prisoner, sex with a person known or suspected of having an STI, and sexual concurrency. In combined form, this three-item index was significantly associated with STI prevalence (p < .001). In the presence of three covariates (gender, race, and age), those classified as being at-risk by the index were 1.8 times more likely than those not classified as such to test positive for an STI (p < .001). Among individuals at risk for STIs, a three-item index predicted testing positive for one or more of three STIs. This index could be used to prioritize and guide intensified clinic-based counseling for high-risk patients of STI and other clinics.
7 CFR 2902.36 - Concrete and asphalt release fluids.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 7 Agriculture 15 2010-01-01 2010-01-01 false Concrete and asphalt release fluids. 2902.36 Section... PROCUREMENT Designated Items § 2902.36 Concrete and asphalt release fluids. (a) Definition. Products that are designed to provide a lubricating barrier between the composite surface materials (e.g., concrete or...
USDA National Nutrient Database for Standard Reference, release 28
USDA-ARS's Scientific Manuscript database
The USDA National Nutrient Database for Standard Reference, Release 28 contains data for nearly 8,800 food items for up to 150 food components. SR28 replaces the previous release, SR27, originally issued in August 2014. Data in SR28 supersede values in the printed handbooks and previous electronic...
The General Mission Analysis Tool (GMAT) System Test Plan
NASA Technical Reports Server (NTRS)
Conway, Darrel J.; Hughes, Steven P.
2007-01-01
This document serves as the System Test Approach for the GMAT Project. Preparation for system testing consists of three major stages: 1) The Test Approach sets the scope of system testing, the overall strategy to be adopted, the activities to be completed, the general resources required and the methods and processes to be used to test the release. It also details the activities, dependencies and effort required to conduct the System Test. 2) Test Planning details the activities, dependencies and effort required to conduct the System Test. 3) Test Cases documents the tests to be applied, the data to be processed, the automated testing coverage and the expected results. This document covers the first two of these items, and establishes the framework used for the GMAT test case development. The test cases themselves exist as separate components, and are managed outside of and concurrently with this System Test Plan.
7 CFR 3201.36 - Concrete and asphalt release fluids.
Code of Federal Regulations, 2013 CFR
2013-01-01
... 7 Agriculture 15 2013-01-01 2013-01-01 false Concrete and asphalt release fluids. 3201.36 Section... PROCUREMENT Designated Items § 3201.36 Concrete and asphalt release fluids. (a) Definition. Products that are... asphalt) and the container (e.g., wood or metal forms, truck beds, roller surfaces). (b) Minimum biobased...
7 CFR 3201.36 - Concrete and asphalt release fluids.
Code of Federal Regulations, 2012 CFR
2012-01-01
... 7 Agriculture 15 2012-01-01 2012-01-01 false Concrete and asphalt release fluids. 3201.36 Section... PROCUREMENT Designated Items § 3201.36 Concrete and asphalt release fluids. (a) Definition. Products that are... asphalt) and the container (e.g., wood or metal forms, truck beds, roller surfaces). (b) Minimum biobased...
7 CFR 2902.36 - Concrete and asphalt release fluids.
Code of Federal Regulations, 2011 CFR
2011-01-01
... 7 Agriculture 15 2011-01-01 2011-01-01 false Concrete and asphalt release fluids. 2902.36 Section... PROCUREMENT Designated Items § 2902.36 Concrete and asphalt release fluids. (a) Definition. Products that are... asphalt) and the container (e.g., wood or metal forms, truck beds, roller surfaces). (b) Minimum biobased...
7 CFR 3201.36 - Concrete and asphalt release fluids.
Code of Federal Regulations, 2014 CFR
2014-01-01
... 7 Agriculture 15 2014-01-01 2014-01-01 false Concrete and asphalt release fluids. 3201.36 Section... PROCUREMENT Designated Items § 3201.36 Concrete and asphalt release fluids. (a) Definition. Products that are... asphalt) and the container (e.g., wood or metal forms, truck beds, roller surfaces). (b) Minimum biobased...
USDA National Nutrient Database for Standard Reference, Release 25
USDA-ARS's Scientific Manuscript database
The USDA National Nutrient Database for Standard Reference, Release 25 (SR25) contains data for over 8,100 food items for up to 146 food components. It replaces the previous release, SR24, issued in September 2011. Data in SR25 supersede values in the printed handbooks and previous electronic releas...
GPS Spectrum Management (Briefing Charts)
2015-04-29
15) agenda item (AI) 1.1 draft conference preparatory meeting (CPM); proactively keeping possible mobile broadband allocations away from GPS in-band...UNCLASSIFIED/APPROVED FOR PUBLIC RELEASE ITU Watch Items SPACE AND MISSILE SYSTEMS CENTER • WRC-15 AI 1.1 - mobile broadband; finalization of CPM in Mar
33 CFR 208.22 - Twin Buttes Dam and Reservoir, Middle and South Concho Rivers, Tex.
Code of Federal Regulations, 2013 CFR
2013-07-01
... and releases; uncontrolled spillway releases; storage; reservoir inflow; available evaporation data.... Normally, one reading at 8 a.m. shall be shown for each day. Readings of all items except evaporation shall...
33 CFR 208.22 - Twin Buttes Dam and Reservoir, Middle and South Concho Rivers, Tex.
Code of Federal Regulations, 2014 CFR
2014-07-01
... and releases; uncontrolled spillway releases; storage; reservoir inflow; available evaporation data.... Normally, one reading at 8 a.m. shall be shown for each day. Readings of all items except evaporation shall...
33 CFR 208.22 - Twin Buttes Dam and Reservoir, Middle and South Concho Rivers, Tex.
Code of Federal Regulations, 2012 CFR
2012-07-01
... and releases; uncontrolled spillway releases; storage; reservoir inflow; available evaporation data.... Normally, one reading at 8 a.m. shall be shown for each day. Readings of all items except evaporation shall...
Selecting Items for Criterion-Referenced Tests.
ERIC Educational Resources Information Center
Mellenbergh, Gideon J.; van der Linden, Wim J.
1982-01-01
Three item selection methods for criterion-referenced tests are examined: the classical theory of item difficulty and item-test correlation; the latent trait theory of item characteristic curves; and a decision-theoretic approach for optimal item selection. Item contribution to the standardized expected utility of mastery testing is discussed. (CM)
The Millennium Cohort: A 21-Year Contribution to the Understanding of Military and Veterans’ Health
2009-12-10
syndrome (15 items) • Other anxiety syndrome (6 items) • Eating disorders (4 items; binge and bulimia nervosa) Has your doctor or other health...The Millennium Cohort: a 21-Year Contribution to the Understanding of Military and Veterans’ Health Second Annual Trauma Stress Disorders ...AVAILABILITY STATEMENT Approved for public release; distribution unlimited 13. SUPPLEMENTARY NOTES Presented at The Second Annual Trauma Spectrum Disorders
NASA Technical Reports Server (NTRS)
2004-01-01
KENNEDY SPACE CENTER, FLA. United Space Alliance workers begin packing pieces of Columbia debris for shipment to The Aerospace Corporation in El Segundo, Calif. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew's families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia's debris is stored in the VAB.
75 FR 33355 - Records Schedules; Availability and Request for Comments
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-11
... of agency accomplishments, press releases, and files relating to educational campaigns. 6. Department... of Justice, Civil Division (N1-60-10-16, 1 item, 1 temporary item). Documents that are attorney... capacity based on actions they took in connection with their official position. 8. Department of Justice...
Guide to Mathematics Released Items: Understanding Scoring
ERIC Educational Resources Information Center
Partnership for Assessment of Readiness for College and Careers, 2017
2017-01-01
The Partnership for Assessment of Readiness for College and Careers (PARCC) mathematics items measure critical thinking, mathematical reasoning, and the ability to apply skills and knowledge to real-world problems. Students are asked to solve problems involving the key knowledge and skills for their grade level as identified by the Common Core…
ERIC Educational Resources Information Center
Burns, Daniel J.; Martens, Nicholas J.; Bertoni, Alicia A.; Sweeney, Emily J.; Lividini, Michelle D.
2006-01-01
In a repeated testing paradigm, list items receiving item-specific processing are more likely to be recovered across successive tests (item gains), whereas items receiving relational processing are likely to be forgotten progressively less on successive tests. Moreover, analysis of cumulative-recall curves has shown that item-specific processing…
Release from PI in Running Memory: What Does This Tell Us about Developmental STM?
ERIC Educational Resources Information Center
Cohen, Ronald L.; Griffiths, Karen
1987-01-01
To study age-related improvements in information processing, a release from proactive interference (PI) procedure was used with 144 children in conjunction with a running memory task. For class of item and acoustic similarity, evidence was found for PI release with age, but there was no evidence of a relationship between short-term memory (STM)…
ERIC Educational Resources Information Center
Matlock, Ki Lynn; Turner, Ronna
2016-01-01
When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…
ERIC Educational Resources Information Center
Spaan, Mary
2007-01-01
This article follows the development of test items (see "Language Assessment Quarterly", Volume 3 Issue 1, pp. 71-79 for the article "Test and Item Specifications Development"), beginning with a review of test and item specifications, then proceeding to writing and editing of items, pretesting and analysis, and finally selection of an item for a…
ERIC Educational Resources Information Center
Hewitt, Margaret A.; Homan, Susan P.
2004-01-01
Test validity issues considered by test developers and school districts rarely include individual item readability levels. In this study, items from a major standardized test were examined for individual item readability level and item difficulty. The Homan-Hewitt Readability Formula was applied to items across three grade levels. Results of…
Modern Space Craft - Antique Specifications
NASA Technical Reports Server (NTRS)
Brewer, Ron; Trout, Dawn
2006-01-01
Spacecraft now and of the future are being controlled by EMC requirements of the past. Little has been done by the launch vehicle/spacecraft manufacturers to abandon MIL-STD-461C which was released in 1986 because most of the electronics equipment being used aboard current launch vehicles is approved by similarity and heritage to MIL-STD-461C and its predecessors. Twenty years later these electronic equipment items are still not tested to today's MIL-STD-461E requirements because there is a risk that the items will fail to meet the requirements and thus the cost will increase if it becomes necessary to redesign the equipment. That cost is insignificant compared with the cost of losing an entire mission! In the 20 years that have elapsed since MIL-STD-461C was released, the EMC environment has undergone major changes. High speed digital devices have been created that have fundamental clock and bus frequencies that span the entire LV/SC frequency range from the Flight Termination Systems through C and S-Band telemetry. Personnel involved in ground operations routinely carry and use hand held transceivers and cellular telephones close by sensitive electronics equipment. There are now many more orbiting receivers and emitters, plus range assets have increased dramatically since 2001. It's way past time to bring requirements up-to-date!
Peipert, John D; Bentler, Peter; Klicko, Kristi; Hays, Ron D
2018-05-14
Black dialysis patients report better health-related quality of life (HRQOL) than White patients, which may be explained if Black and White patients respond systematically differently to HRQOL survey items. We examined differential item functioning (DIF) of the Kidney Disease Quality of Life 36-item (KDQOL™-36) Burden of Kidney Disease, Symptoms and Problems with Kidney Disease, and Effects of Kidney Disease scales between Black (n = 18,404) and White (n = 21,439) dialysis patients. We fit multiple group confirmatory factor analysis models with increasing invariance: a Configural model (invariant factor structure), a Metric model (invariant factor loadings), and a Scalar model (invariant intercepts). Criteria for invariance included non-significant χ² tests, > 0.002 difference in the models' CFI, and > 0.015 difference in RMSEA and SRMR. Next, starting with a fully invariant model, we freed loadings and intercepts item-by-item to determine if DIF impacted estimated KDQOL™-36 scale means. ΔCFI was 0.006 between the metric and scalar models but was reduced to 0.001 when we freed intercepts for the burdens and symptoms and problems of kidney disease scales. In comparison to standardized means of 0 in the White group, those for the Black group on the Burdens, Symptoms and Problems, and Effects of Kidney Disease scales were 0.218, 0.061, and 0.161, respectively. When loadings and thresholds were released sequentially, differences in means between models ranged between 0.001 and 0.048. Despite some DIF, impacts on KDQOL™-36 responses appear to be minimal. We conclude that the KDQOL™-36 is appropriate to make substantive comparisons of HRQOL between Black and White dialysis patients.
The Effect of the Position of an Item within a Test on the Item Difficulty Value.
ERIC Educational Resources Information Center
Rubin, Lois S.; Mott, David E. W.
An investigation of the effect on the difficulty value of an item due to position placement within a test was made. Using a 60-item operational test comprised of 5 subtests, 60 items were placed as experimental items on a number of spiralled test forms in three different positions (first, middle, last) within the subtest composed of like items.…
ERIC Educational Resources Information Center
Marie, S. Maria Josephine Arokia; Edannur, Sreekala
2015-01-01
This paper focused on the analysis of test items constructed in the paper of teaching Physical Science for B.Ed. class. It involved the analysis of difficulty level and discrimination power of each test item. Item analysis allows selecting or omitting items from the test, but more importantly item analysis is a tool to help the item writer improve…
ERIC Educational Resources Information Center
Wang, Wei
2013-01-01
Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…
10 CFR 850.31 - Release criteria.
Code of Federal Regulations, 2010 CFR
2010-01-01
... lowest contamination level practicable, but not to exceed the levels established in paragraphs (b) and (c... contamination level of equipment or item surfaces does not exceed the higher of 0.2 µg/100 cm² or the... the equipment or item and its future use and the nature of the beryllium contamination. (c) Before...
Test item linguistic complexity and assessments for deaf students.
Cawthon, Stephanie
2011-01-01
Linguistic complexity of test items is one test format element that has been studied in the context of struggling readers and their participation in paper-and-pencil tests. The present article presents findings from an exploratory study on the potential relationship between linguistic complexity and test performance for deaf readers. A total of 64 students completed 52 multiple-choice items, 32 in mathematics and 20 in reading. These items were coded for linguistic complexity components of vocabulary, syntax, and discourse. Mathematics items had higher linguistic complexity ratings than reading items, but there were no significant relationships between item linguistic complexity scores and student performance on the test items. The discussion addresses issues related to the subject area, student proficiency levels in the test content, factors to look for in determining a "linguistic complexity effect," and areas for further research in test item development and deaf students.
Serong, Julia; Anhäuser, Marcus; Wormer, Holger
2015-01-01
A current research project deals with the question of how the quality of medical health information changes on its way from the academic journal via press releases to the news media. In an exploratory study a sample of 30 news items has been selected stage-by-stage from an adjusted total sample of 1,695 journalistic news items on medical research in 2013. Using a multidimensional set of criteria the news items as well as the corresponding academic articles, abstracts and press releases are examined by science journalists and medical experts. Together with a content analysis of the expert assessments, it will be verified to what extent established quality standards for medical journalism can be applied to medical health communication and public relations or even to studies and abstracts as well. Copyright © 2015. Published by Elsevier GmbH.
NASA Astrophysics Data System (ADS)
Gross, Jürgen H.
2015-03-01
Direct analysis in real time-mass spectrometry (DART-MS) enables screening of articles of daily use made of polydimethylsiloxanes (PDMS), commonly known as silicone rubber, to assess their tendency to release low molecular weight silicone oligomers. DART-MS analyses were performed on a Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometer. Flexible silicone baking molds, a watch band, and a dough scraper, as baby articles different brands of pacifiers, nipples, and a teething ring have been examined. While somewhat arbitrarily chosen, the set can be regarded as representative of household items, baby articles, and other objects made of silicone rubber. For comparison, two brands of silicone septa and as blanks a glass slide and a latex pacifier were included. Differences between the objects were mainly observed in terms of molecular weight distribution and occasional release of other compounds in addition to PDMS. Other than that, all objects made of silicone rubber released significant amounts of PDMS during DART analysis. To provide a coarse quantification, a calibration based on silicone oil was established, which delivered PDMS losses from 20 μg to >100 μg during the 16-s period per measurement. Also, the extraction of baking molds in rapeseed oil demonstrated a PDMS release at the level of 1 μg mg⁻¹. These findings indicate a potential health hazard from frequent or long-term use of such items. This work does not intend to blame certain brands of such articles. Nonetheless, a higher level of awareness of this source of daily silicone intake is suggested.
Gross, Jürgen H
2015-03-01
Direct analysis in real time-mass spectrometry (DART-MS) enables screening of articles of daily use made of polydimethylsiloxanes (PDMS), commonly known as silicone rubber, to assess their tendency to release low molecular weight silicone oligomers. DART-MS analyses were performed on a Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometer. Flexible silicone baking molds, a watch band, and a dough scraper, as baby articles different brands of pacifiers, nipples, and a teething ring have been examined. While somewhat arbitrarily chosen, the set can be regarded as representative of household items, baby articles, and other objects made of silicone rubber. For comparison, two brands of silicone septa and as blanks a glass slide and a latex pacifier were included. Differences between the objects were mainly observed in terms of molecular weight distribution and occasional release of other compounds in addition to PDMS. Other than that, all objects made of silicone rubber released significant amounts of PDMS during DART analysis. To provide a coarse quantification, a calibration based on silicone oil was established, which delivered PDMS losses from 20 μg to >100 μg during the 16-s period per measurement. Also, the extraction of baking molds in rapeseed oil demonstrated a PDMS release at the level of 1 μg mg⁻¹. These findings indicate a potential health hazard from frequent or long-term use of such items. This work does not intend to blame certain brands of such articles. Nonetheless, a higher level of awareness of this source of daily silicone intake is suggested.
The Selection of Test Items for Decision Making with a Computer Adaptive Test.
ERIC Educational Resources Information Center
Spray, Judith A.; Reckase, Mark D.
The issue of test-item selection in support of decision making in adaptive testing is considered. The number of items needed to make a decision is compared for two approaches: selecting items from an item pool that are most informative at the decision point or selecting items that are most informative at the examinee's ability level. The first…
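The two selection strategies contrasted in this abstract can be illustrated with a minimal sketch. It assumes a two-parameter logistic (2PL) IRT model, which the abstract does not name; the item pool, parameter values, cut score, and ability estimate below are hypothetical and chosen only for illustration.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model, not from the abstract)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def most_informative(pool, theta):
    """Return the name of the pool item with the highest information at theta."""
    return max(pool, key=lambda name: item_information(theta, *pool[name]))

# Hypothetical pool: name -> (discrimination a, difficulty b)
pool = {"easy": (1.0, -1.5), "medium": (1.2, 0.0), "hard": (1.0, 1.5)}

cut_score = 0.0         # hypothetical decision point
ability_estimate = 1.4  # hypothetical current estimate for one examinee

# Strategy 1: select the item most informative at the decision point.
at_cut = most_informative(pool, cut_score)
# Strategy 2: select the item most informative at the examinee's ability level.
at_ability = most_informative(pool, ability_estimate)
print(at_cut, at_ability)
```

With this pool the two strategies pick different items whenever the ability estimate lies far from the cut score, which is the situation in which the comparison of test lengths becomes interesting.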
Tepe, Rodger; Tepe, Chabha
2015-03-01
To develop and psychometrically evaluate an information literacy (IL) self-efficacy survey and an IL knowledge test. In this test-retest reliability study, a 25-item IL self-efficacy survey and a 50-item IL knowledge test were developed and administered to a convenience sample of 53 chiropractic students. Item analyses were performed on all questions. The IL self-efficacy survey demonstrated good reliability (test-retest correlation = 0.81) and good/very good internal consistency (mean κ = .56 and Cronbach's α = .92). A total of 25 questions with the best item analysis characteristics were chosen from the 50-item IL knowledge test, resulting in a 25-item IL knowledge test that demonstrated good reliability (test-retest correlation = 0.87), very good internal consistency (mean κ = .69, KR20 = 0.85), and good item discrimination (mean point-biserial = 0.48). This study resulted in the development of three instruments: a 25-item IL self-efficacy survey, a 50-item IL knowledge test, and a 25-item IL knowledge test. The information literacy self-efficacy survey and the 25-item version of the information literacy knowledge test have shown preliminary evidence of adequate reliability and validity to justify continuing study with these instruments.
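The internal-consistency statistics reported above (KR20 and Cronbach's α) can be sketched as follows. This is an illustrative calculation only: the response matrix is made up and is not the study's data, and population variances are assumed throughout, a convention under which KR-20 and α coincide for 0/1 items.

```python
from statistics import pvariance

def kr20(scores):
    """Kuder-Richardson 20 for a matrix of 0/1 item scores (rows = examinees)."""
    n, k = len(scores), len(scores[0])
    totals = [sum(row) for row in scores]
    # Sum of p*q over items, where p is the proportion answering correctly.
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in scores) / n
        pq += p * (1 - p)
    return k / (k - 1) * (1 - pq / pvariance(totals))

def cronbach_alpha(scores):
    """Cronbach's alpha using population variances of items and total scores."""
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    totals = [sum(row) for row in scores]
    return k / (k - 1) * (1 - sum(item_vars) / pvariance(totals))

# Hypothetical responses: 5 examinees x 4 dichotomous items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(kr20(scores), cronbach_alpha(scores))
```

For dichotomous items the population variance of an item equals p(1 - p), so both functions return the same value here; α generalizes the same formula to items scored on more than two levels, such as the self-efficacy survey.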
A New Item Selection Procedure for Mixed Item Type in Computerized Classification Testing.
ERIC Educational Resources Information Center
Lau, C. Allen; Wang, Tianyou
This paper proposes a new Information-Time index as the basis for item selection in computerized classification testing (CCT) and investigates how this new item selection algorithm can help improve test efficiency for item pools with mixed item types. It also investigates how practical constraints such as item exposure rate control, test…
7 CFR 1962.17 - Disposal of chattel security, use of proceeds and release of lien.
Code of Federal Regulations, 2013 CFR
2013-01-01
... third party requests a release of specific items which must be recorded under the UCC or chattel... products. When Form FmHA 441-18 is in effect under the UCC, the notice to the purchaser will be made on...
7 CFR 1962.17 - Disposal of chattel security, use of proceeds and release of lien.
Code of Federal Regulations, 2011 CFR
2011-01-01
... third party requests a release of specific items which must be recorded under the UCC or chattel... products. When Form FmHA 441-18 is in effect under the UCC, the notice to the purchaser will be made on...
7 CFR 1962.17 - Disposal of chattel security, use of proceeds and release of lien.
Code of Federal Regulations, 2014 CFR
2014-01-01
... third party requests a release of specific items which must be recorded under the UCC or chattel... products. When Form FmHA 441-18 is in effect under the UCC, the notice to the purchaser will be made on...
7 CFR 1962.17 - Disposal of chattel security, use of proceeds and release of lien.
Code of Federal Regulations, 2012 CFR
2012-01-01
... third party requests a release of specific items which must be recorded under the UCC or chattel... products. When Form FmHA 441-18 is in effect under the UCC, the notice to the purchaser will be made on...
CUNY's Voter Registration System.
ERIC Educational Resources Information Center
Hershenson, Jay; And Others
This collection of items including public testimony by the Vice Chancellor, Jay Hershenson, a formal resolution, a press release, and brochures, documents the City University of New York's (CUNY) unique voter registration system, "CUNY Project Vote". As the press release describes it, Project Vote is the nation's largest student voter…
A Process for Reviewing and Evaluating Generated Test Items
ERIC Educational Resources Information Center
Gierl, Mark J.; Lai, Hollis
2016-01-01
Testing organizations need large numbers of high-quality items due to the proliferation of alternative test administration methods and modern test designs. But the current demand for items far exceeds the supply. Test items, as they are currently written, evoke a process that is both time-consuming and expensive because each item is written,…
ERIC Educational Resources Information Center
Banerjee, Jayanti; Papageorgiou, Spiros
2016-01-01
The research reported in this article investigates differential item functioning (DIF) in a listening comprehension test. The study explores the relationship between test-taker age and the items' language domains across multiple test forms. The data comprise test-taker responses (N = 2,861) to a total of 133 unique items, 46 items of which were…
Item validity vs. item discrimination index: a redundancy?
NASA Astrophysics Data System (ADS)
Panjaitan, R. L.; Irawati, R.; Sujana, A.; Hanifah, N.; Djuanda, D.
2018-03-01
In the literature on evaluation and test analysis, it is common to find item validity and the item discrimination index (D) calculated with a different formula for each. Other sources, however, state that the item discrimination index can be obtained by calculating the correlation between the testee's score on a particular item and the testee's score on the overall test, which is the same concept as item validity. Some research reports, especially undergraduate theses, tend to include both item validity and the item discrimination index in the instrument analysis. These concepts may overlap, since both reflect how well the test measures the examinees' ability. In this paper, examples of data-processing results for item validity and the item discrimination index are compared. The paper discusses whether one of the two can represent both, or whether it is better to present both calculations in simple test analysis, especially in undergraduate theses that include test analyses.
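The two quantities compared in this abstract can be placed side by side in a short sketch. It assumes the common upper-lower group form of the discrimination index D (with a 27% grouping fraction, a conventional choice not stated in the abstract) and an uncorrected item-total Pearson correlation as the item validity coefficient; the response matrix is hypothetical.

```python
from statistics import mean, pstdev

def discrimination_index(scores, item, fraction=0.27):
    """Upper-lower D: proportion correct in the top group minus the bottom group."""
    ranked = sorted(scores, key=sum, reverse=True)  # highest total scores first
    g = max(1, round(len(ranked) * fraction))       # size of each extreme group
    upper = [row[item] for row in ranked[:g]]
    lower = [row[item] for row in ranked[-g:]]
    return mean(upper) - mean(lower)

def item_total_correlation(scores, item):
    """Pearson correlation between item score and total score (item included)."""
    x = [row[item] for row in scores]
    y = [sum(row) for row in scores]
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (pstdev(x) * pstdev(y))

# Hypothetical responses: 6 examinees x 3 dichotomous items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
for item in range(3):
    print(item, discrimination_index(scores, item), item_total_correlation(scores, item))
```

Both statistics rise when high scorers answer an item correctly more often than low scorers, which is the overlap the paper examines; D uses only the extreme groups, while the correlation uses every examinee.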
A Comparison of Three Types of Test Development Procedures Using Classical and Latent Trait Methods.
ERIC Educational Resources Information Center
Benson, Jeri; Wilson, Michael
Three methods of item selection were used to select sets of 38 items from a 50-item verbal analogies test and the resulting item sets were compared for internal consistency, standard errors of measurement, item difficulty, biserial item-test correlations, and relative efficiency. Three groups of 1,500 cases each were used for item selection. First…
ERIC Educational Resources Information Center
Çokluk, Ömay; Gül, Emrah; Dogan-Gül, Çilem
2016-01-01
The study aims to examine whether differential item function is displayed in three different test forms that have item orders of random and sequential versions (easy-to-hard and hard-to-easy), based on Classical Test Theory (CTT) and Item Response Theory (IRT) methods and bearing item difficulty levels in mind. In the correlational research, the…
The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory
ERIC Educational Resources Information Center
Sahin, Alper; Anil, Duygu
2017-01-01
This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of…
[Perceptions on item disclosure for the Korean medical licensing examination].
Yang, Eunbae B
2015-09-01
This study analyzed the perceptions of medical students and faculty regarding disclosure of test items on the Korean medical licensing examination. I conducted a survey of medical students from medical colleges and professional medical schools nationwide. Responses were analyzed from 718 participants as well as 69 faculty members who participated in creating the medical licensing examination item sets. Data were analyzed using descriptive statistics and the chi-square test. It is important to maintain test quality and to keep the test items unavailable to the public. There are also concerns among students that disclosure of test items would prompt increasing difficulty of test items (48.3%). Further, few students found it desirable to disclose test items regardless of any considerations (28.5%). The professors, who had experience in designing the test items, also expressed their opposition to test item disclosure (60.9%). It is desirable not to disclose the test items of the Korean medical licensing examination to the public on the condition that students are provided with a sufficient amount of information regarding the examination. This is so that the exam can appropriately identify candidates with the required qualifications.
A Review of Classical Methods of Item Analysis.
ERIC Educational Resources Information Center
French, Christine L.
Item analysis is a very important consideration in the test development process. It is a statistical procedure to analyze test items that combines methods used to evaluate the important characteristics of test items, such as difficulty, discrimination, and distractibility of the items in a test. This paper reviews some of the classical methods for…
Modeling Item-Position Effects within an IRT Framework
ERIC Educational Resources Information Center
Debeer, Dries; Janssen, Rianne
2013-01-01
Changing the order of items between alternate test forms to prevent copying and to enhance test security is a common practice in achievement testing. However, these changes in item order may affect item and test characteristics. Several procedures have been proposed for studying these item-order effects. The present study explores the use of…
ACER Chemistry Test Item Collection. ACER Chemtic Year 12.
ERIC Educational Resources Information Center
Australian Council for Educational Research, Hawthorn.
The chemistry test item bank contains 225 multiple-choice questions suitable for diagnostic and achievement testing; a three-page teacher's guide; answer key with item facilities; an answer sheet; and a 45-item sample achievement test. Although written for the new grade 12 chemistry course in Victoria, Australia, the items are widely applicable.…
Proactive Interference Slows Recognition by Eliminating Fast Assessments of Familiarity
ERIC Educational Resources Information Center
Oztekin, Ilke; McElree, Brian
2007-01-01
The response-signal speed-accuracy tradeoff (SAT) procedure was used to investigate how proactive interference (PI) affects retrieval from working memory. Participants were presented with 6-item study lists, followed immediately by a recognition probe. A variant of a release from PI design was used: All items in a list were from the same semantic…
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-04
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-63971; File No. SR-NYSEARCA-2011-05] Self... 201 of Regulation SHO Under the Securities Exchange Act of 1934 February 25, 2011. Pursuant to Section... described in Items I and II below, which Items have been substantially prepared by the self-regulatory...
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-21
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-63116; File No. SR-NYSEArca-2010-89] Self... Arca To Bypass Non-Regulation NMS Protected Market Centers When Routing Away October 15, 2010. Pursuant... as described in Items I and II below, which Items have been prepared by the self-regulatory...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-04
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-63977; File No. SR-NYSE-2011-05] Self... of Regulation SHO Under the Securities Exchange Act of 1934 February 25, 2011. Pursuant to Section 19... change as described in Items I and II below, which Items have been substantially prepared by the self...
Federal Register 2010, 2011, 2012, 2013, 2014
2010-12-08
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-63416; File No. SR-BX-2010-083] Self-Regulatory Organizations; NASDAQ OMX BX, Inc.; Notice of Filing of Proposed Rule Change Relating to The Price Improvement... Items I and II below, which Items have been prepared by the self-regulatory organization. The Commission...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-03-11
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-69039; File No. SR-NASDAQ-2013-031] Self...'' Orders Submitted to the Retail Price Improvement Program Will Qualify as ``Retail Orders'' March 5, 2013... described in Items I, II, and III below, which Items have been prepared by the self-regulatory organization...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-19
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-67202; File No. SR-ISE-2012-54] Self-Regulatory... Rule Change Relating to the Extension of the Price Improvement Mechanism Pilot Program June 14, 2012... described in Items I and II below, which items have been prepared by the self-regulatory organization. The...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-07-01
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-69853; File No. SR-ISE-2013-41] Self-Regulatory... Rule Change To Extend the Price Improvement Mechanism Pilot Program June 25, 2013. Pursuant to Section... described in Items I and II below, which items have been prepared by the self-regulatory organization. The...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-29
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-68709; File No. SR-NYSE-2013-04] Self... Improvement Orders in a Non-RLP Capacity for Securities to Which the RLP Is Not Assigned January 23, 2013... in Items I and II below, which Items have been prepared by the self-regulatory organization. The...
ERIC Educational Resources Information Center
New South Wales Dept. of Education, Sydney (Australia).
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
Assembling a Computerized Adaptive Testing Item Pool as a Set of Linear Tests
ERIC Educational Resources Information Center
van der Linden, Wim J.; Ariel, Adelaide; Veldkamp, Bernard P.
2006-01-01
Test-item writing efforts typically result in item pools with an undesirable correlational structure between the content attributes of the items and their statistical information. If such pools are used in computerized adaptive testing (CAT), the algorithm may be forced to select items with less than optimal information, that violate the content…
Evaluation of Northwest University, Kano Post-UTME Test Items Using Item Response Theory
ERIC Educational Resources Information Center
Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi
2016-01-01
High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items were…
Item Specifications, Science Grade 8. Blue Prints for Testing Minimum Performance Test.
ERIC Educational Resources Information Center
Arkansas State Dept. of Education, Little Rock.
These item specifications were developed as a part of the Arkansas "Minimum Performance Testing Program" (MPT). There is one item specification for each instructional objective included in the MPT. The purpose of an item specification is to provide an overview of the general content and format of test items used to measure an…
Item Specifications, Science Grade 6. Blue Prints for Testing Minimum Performance Test.
ERIC Educational Resources Information Center
Arkansas State Dept. of Education, Little Rock.
These item specifications were developed as a part of the Arkansas "Minimum Performance Testing Program" (MPT). There is one item specification for each instructional objective included in the MPT. The purpose of an item specification is to provide an overview of the general content and format of test items used to measure an…
Criterion-Referenced Test Items for Welding.
ERIC Educational Resources Information Center
Davis, Diane, Ed.
This test item bank on welding contains test questions based upon competencies found in the Missouri Welding Competency Profile. Some test items are keyed for multiple competencies. These criterion-referenced test items are designed to work with the Vocational Instructional Management System. Questions have been statistically sampled and validated…
Jang, Yoonhee; Wixted, John T.; Pecher, Diane; Zeelenberg, René; Huber, David E.
2012-01-01
Even without feedback, test practice enhances delayed performance compared to study practice, but the size of the effect is variable across studies. We investigated the benefit of testing, separating initially retrievable items from initially non-retrievable items. In two experiments, an initial test determined item retrievability. Retrievable or non-retrievable items were subsequently presented for repeated study or test practice. Collapsing across items, in Experiment 1, we obtained the typical crossover interaction between retention interval and practice type. For retrievable items, however, the crossover interaction was quantitatively different, with a small study benefit for an immediate test and a larger testing benefit after a delay. For non-retrievable items, there was a large study benefit for an immediate test, but one week later there was no difference between the study and test practice conditions. In Experiment 2, initially non-retrievable items were given additional study followed by either an immediate test or even more additional study, and one week later performance did not differ between the two conditions. These results indicate that the effect size of study/test practice is due to the relative contribution of retrievable and non-retrievable items. PMID:22304454
Optimal Test Design with Rule-Based Item Generation
ERIC Educational Resources Information Center
Geerlings, Hanneke; van der Linden, Wim J.; Glas, Cees A. W.
2013-01-01
Optimal test-design methods are applied to rule-based item generation. Three different cases of automated test design are presented: (a) test assembly from a pool of pregenerated, calibrated items; (b) test generation on the fly from a pool of calibrated item families; and (c) test generation on the fly directly from calibrated features defining…
Criterion-Referenced Test Items for Small Engines.
ERIC Educational Resources Information Center
Herd, Amon
This notebook contains criterion-referenced test items for testing students' knowledge of small engines. The test items are based upon competencies found in the Missouri Small Engine Competency Profile. The test item bank is organized in 18 sections that cover the following duties: shop procedures; tools and equipment; fasteners; servicing fuel…
An Investigation of the Impact of Guessing on Coefficient α and Reliability
2014-01-01
Guessing is known to influence the test reliability of multiple-choice tests. Although there are many studies that have examined the impact of guessing, they used rather restrictive assumptions (e.g., parallel test assumptions, homogeneous inter-item correlations, homogeneous item difficulty, and homogeneous guessing levels across items) to evaluate the relation between guessing and test reliability. Based on the item response theory (IRT) framework, this study investigated the extent of the impact of guessing on reliability under more realistic conditions where item difficulty, item discrimination, and guessing levels actually vary across items with three different test lengths (TL). By accommodating multiple item characteristics simultaneously, this study also focused on examining interaction effects between guessing and other variables entered in the simulation to be more realistic. The simulation of the more realistic conditions and calculations of reliability and classical test theory (CTT) item statistics were facilitated by expressing CTT item statistics, coefficient α, and reliability in terms of IRT model parameters. In addition to the general negative impact of guessing on reliability, results showed interaction effects between TL and guessing and between guessing and test difficulty.
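The abstract above does not say which IRT model was used. As a minimal sketch, assuming the common three-parameter logistic (3PL) model, the guessing level can be written as a lower asymptote on the probability of a correct response. The function and parameter names below are illustrative, not taken from the study.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under an assumed 3PL model.

    theta: examinee ability
    a:     item discrimination
    b:     item difficulty
    c:     guessing level (lower asymptote), e.g. 0.25 for four options
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # Same item with and without guessing, for an examinee at theta == b.
    print(p_correct_3pl(0.0, 1.0, 0.0, 0.0))
    print(p_correct_3pl(0.0, 1.0, 0.0, 0.25))
```

With c > 0, low-ability examinees still answer correctly at a rate of about c, which adds noise unrelated to ability. That is one way to see why guessing tends to lower reliability.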
Evaluating the Psychometric Characteristics of Generated Multiple-Choice Test Items
ERIC Educational Resources Information Center
Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André
2016-01-01
Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…
Tepe, Rodger; Tepe, Chabha
2015-01-01
Objective: To develop and psychometrically evaluate an information literacy (IL) self-efficacy survey and an IL knowledge test. Methods: In this test–retest reliability study, a 25-item IL self-efficacy survey and a 50-item IL knowledge test were developed and administered to a convenience sample of 53 chiropractic students. Item analyses were performed on all questions. Results: The IL self-efficacy survey demonstrated good reliability (test–retest correlation = 0.81) and good/very good internal consistency (mean κ = .56 and Cronbach's α = .92). A total of 25 questions with the best item analysis characteristics were chosen from the 50-item IL knowledge test, resulting in a 25-item IL knowledge test that demonstrated good reliability (test–retest correlation = 0.87), very good internal consistency (mean κ = .69, KR20 = 0.85), and good item discrimination (mean point-biserial = 0.48). Conclusions: This study resulted in the development of three instruments: a 25-item IL self-efficacy survey, a 50-item IL knowledge test, and a 25-item IL knowledge test. The information literacy self-efficacy survey and the 25-item version of the information literacy knowledge test have shown preliminary evidence of adequate reliability and validity to justify continuing study with these instruments. PMID:25517736
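The KR20 figure reported above is the Kuder-Richardson Formula 20, an internal-consistency coefficient for right/wrong items. The sketch below shows the standard formula on made-up data. The response matrix and function name are hypothetical, and the study's own computation is not described in the abstract.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for a list of examinee rows of 0/1 item scores.

    KR-20 = k / (k - 1) * (1 - sum(p_i * q_i) / var_total), using the
    population variance of total scores.
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n  # proportion correct on item j
        sum_pq += p * (1.0 - p)
    return (k / (k - 1)) * (1.0 - sum_pq / var_total)


if __name__ == "__main__":
    # Hypothetical data: 4 examinees, 3 items.
    data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    print(kr20(data))
```

For dichotomous items, Cronbach's α computed with population variances gives the same value, since each item's variance is p(1 - p).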
Integrating Test-Form Formatting into Automated Test Assembly
ERIC Educational Resources Information Center
Diao, Qi; van der Linden, Wim J.
2013-01-01
Automated test assembly uses the methodology of mixed integer programming to select an optimal set of items from an item bank. Automated test-form generation uses the same methodology to optimally order the items and format the test form. From an optimization point of view, production of fully formatted test forms directly from the item pool using…
ERIC Educational Resources Information Center
Gierl, Mark J.; Lai, Hollis
2013-01-01
Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…
2004-05-25
KENNEDY SPACE CENTER, FLA. - United Space Alliance technician J.C. Harrison steers while NASA’s Scott Thurston guides a piece of Columbia debris through a gate in the Vehicle Assembly Building, where the debris is stored. This piece is one of eight being released to The Aerospace Corporation in El Segundo, Calif., for testing and research. Thurston is the Columbia debris coordinator. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite.
2004-05-25
KENNEDY SPACE CENTER, FLA. - United Space Alliance workers J.C. Harrison (left) and Amy Mangiacapra (right) pack up pieces of Columbia debris for shipment to The Aerospace Corporation in El Segundo, Calif. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia’s debris is stored in the VAB.
2004-05-25
KENNEDY SPACE CENTER, FLA. - United Space Alliance workers begin packing pieces of Columbia debris for shipment to The Aerospace Corporation in El Segundo, Calif. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia’s debris is stored in the VAB.
2004-05-25
KENNEDY SPACE CENTER, FLA. - United Space Alliance workers J.C. Harrison (left) and Amy Mangiacapra pack pieces of Columbia debris for transfer to the shipping facility for travel to The Aerospace Corporation in El Segundo, Calif. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia’s debris is stored in the VAB.
2004-05-25
KENNEDY SPACE CENTER, FLA. - After being wrapped and secured on pallets, pieces of Columbia debris are loaded onto a truck to transport them to the shipping facility for travel to The Aerospace Corporation in El Segundo, Calif. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia’s debris is stored in the VAB.
NASA Technical Reports Server (NTRS)
2004-01-01
KENNEDY SPACE CENTER, FLA. - United Space Alliance workers J.C. Harrison (left) and Amy Mangiacapra (right) pack up pieces of Columbia debris for shipment to The Aerospace Corporation in El Segundo, Calif. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia’s debris is stored in the VAB.
NASA Technical Reports Server (NTRS)
2004-01-01
KENNEDY SPACE CENTER, FLA. - United Space Alliance workers J.C. Harrison (far left) and Amy Mangiacapra guide a wrapped piece of Columbia debris through the Vehicle Assembly Building, where it is stored. Alongside is NASA’s Scott Thurston, who is the Columbia debris coordinator. This piece is one of eight being released to The Aerospace Corporation in El Segundo, Calif., for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite.
NASA Technical Reports Server (NTRS)
2004-01-01
KENNEDY SPACE CENTER, FLA. - United Space Alliance workers J.C. Harrison (left) and Amy Mangiacapra pack pieces of Columbia debris for transfer to the shipping facility for travel to The Aerospace Corporation in El Segundo, Calif. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia’s debris is stored in the VAB.
NASA Technical Reports Server (NTRS)
2004-01-01
KENNEDY SPACE CENTER, FLA. - United Space Alliance technician J.C. Harrison steers while NASA’s Scott Thurston guides a piece of Columbia debris through a gate in the Vehicle Assembly Building, where the debris is stored. This piece is one of eight being released to The Aerospace Corporation in El Segundo, Calif., for testing and research. Thurston is the Columbia debris coordinator. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite.
A Procedure To Detect Test Bias Present Simultaneously in Several Items.
ERIC Educational Resources Information Center
Shealy, Robin; Stout, William
A statistical procedure is presented that is designed to test for unidirectional test bias existing simultaneously in several items of an ability test, based on the assumption that test bias is incipient within the two groups' ability differences. The proposed procedure--Simultaneous Item Bias (SIB)--is based on a multidimensional item response…
An Item Response Theory Model for Test Bias.
ERIC Educational Resources Information Center
Shealy, Robin; Stout, William
This paper presents a conceptualization of test bias for standardized ability tests which is based on multidimensional, non-parametric, item response theory. An explanation of how individually-biased items can combine through a test score to produce test bias is provided. It is contended that bias, although expressed at the item level, should be…
27 CFR 41.85a - Release from customs custody of returned articles.
Code of Federal Regulations, 2012 CFR
2012-04-01
... PRODUCTS, CIGARETTE PAPERS AND TUBES, AND PROCESSED TOBACCO Tobacco Products and Cigarette Papers and Tubes... Cigarette Papers and Tubes Without Payment of Tax Or Certain Duty § 41.85a Release from customs custody of... warehouse proprietor. (b) Domestically produced cigarette papers and tubes (classifiable under item 9801.00...
Safety evaluation -- Spent water treatment system components inventory release
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dodd, E.N. Jr.
1995-01-24
Over the past few years various impediments to shipment of generated spent basin water treatment system components have resulted in the accumulation of quantities of these waste items at 100K. Specifically, there are (as of 01/01/95) 13 grout/culvert packaged cartridge filters (CF), four unpackaged cartridge filters, 60 spent ion exchange columns (IXC) and seven ion exchange modules (IXM) at 100K awaiting shipment for final waste disposal. As a result of the accumulation of this waste, the question has arisen regarding the consequences of potential releases of the inventory of radionuclides in these waste items relative to the K Area safety envelope. The purpose of this paper is to address this question. The initial step evaluating the consequences of potential release of material from the spent water treatment system components was to determine the individual and total radionuclide inventories of concern. Generally the radioisotopes of concern to the dose consequences were Sr/Y-90, Cs-137, and the transuranic (TRU) isotopes. The loading of these radioisotopes needed to be determined for each of the components of the total number of accumulated IXCs, IXMs and CFs. This evaluation examines four potential releases of material from the spent water treatment system components. These releases are: the release of material from all 39 IXCs stored in 183-KW; the release of material from the IXCs, IXMs and CFs at 105-KE and 105-KW; the release of material from the 13 CFs stored behind 105-KE; and the non-mechanistic release of the total stored waste inventory.
ERIC Educational Resources Information Center
Arikan, Serkan; van de Vijver, Fons J. R.; Yagmur, Kutlay
2018-01-01
We examined Differential Item Functioning (DIF) and the size of cross-cultural performance differences in the Programme for International Student Assessment (PISA) 2012 mathematics data before and after application of propensity score matching. The mathematics performance of Indonesian, Turkish, Australian, and Dutch students on released items was…
Federal Register 2010, 2011, 2012, 2013, 2014
2010-12-17
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-63531; File No. SR-ISE-2010-109] Self... change, as described in Items I and II below, which items have been prepared by the self-regulatory... change from interested persons. \\1\\ 15 U.S.C. 78s(b)(1). \\2\\ 17 CFR 240.19b-4. I. Self-Regulatory...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-04-24
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-66827; File No. SR-ISE-2012-26] Self-Regulatory... Items I and II below, which Items have been prepared by the self-regulatory organization. The.... \\1\\ 15 U.S.C. 78s(b)(1). \\2\\ 17 CFR 240.19b-4. I. Self-Regulatory Organization's Statement of the...
ERIC Educational Resources Information Center
Quaigrain, Kennedy; Arhin, Ato Kwamina
2017-01-01
Item analysis is essential in improving items which will be used again in later tests; it can also be used to eliminate misleading items in a test. The study focused on item and test quality and explored the relationship between difficulty index (p-value) and discrimination index (DI) with distractor efficiency (DE). The study was conducted among…
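As a rough illustration of the two indices named above, the sketch below uses textbook definitions that the study may or may not have followed: difficulty as the proportion of examinees answering correctly, and discrimination as the difference in proportion correct between upper and lower scoring groups. The 27% group size is a common convention assumed here, and the data and function name are hypothetical.

```python
def item_indices(responses, item, group_fraction=0.27):
    """Return (difficulty index p, discrimination index D) for one item.

    responses:      list of examinee rows of 0/1 item scores
    item:           column index of the item to analyse
    group_fraction: share of examinees in each of the upper and lower groups
    """
    n = len(responses)
    # Rank examinees by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    g = max(1, round(n * group_fraction))
    upper, lower = ranked[:g], ranked[-g:]
    p = sum(row[item] for row in responses) / n
    d = (sum(row[item] for row in upper) - sum(row[item] for row in lower)) / g
    return p, d


if __name__ == "__main__":
    # Hypothetical data: 4 examinees, 3 items; halves used as groups.
    data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    print(item_indices(data, item=1, group_fraction=0.5))
```

An item that everyone answers correctly has p = 1 and D = 0, so it separates no one. Distractor efficiency, the third quantity in the abstract, would need option-level response data and is not covered by this sketch.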
ERIC Educational Resources Information Center
Snyder, James
2010-01-01
This dissertation research examined the changes in item RIT calibration that occurred when adding audio to a set of currently calibrated RIT items and then placing these new items as field test items in the modified assessments on the NWEA MAP test platform. The researcher used test results from over 600 students in the Poway School District in…
Student science achievement and the integration of Indigenous knowledge on standardized tests
NASA Astrophysics Data System (ADS)
Dupuis, Juliann; Abrams, Eleanor
2017-09-01
In this article, we examine how American Indian students in Montana performed on standardized state science assessments when a small number of test items based upon traditional science knowledge from a cultural curriculum, "Indian Education for All", were included. Montana is the first state in the US to mandate the use of a culturally relevant curriculum in all schools and to incorporate this curriculum into a portion of the standardized assessment items. This study compares White and American Indian student test scores on these particular test items to determine how White and American Indian students perform on culturally relevant test items compared to traditional standard science test items. The connections between student achievement on adapted culturally relevant science test items versus traditional items bring valuable insights to the fields of science education, research on student assessments, and Indigenous studies.
Computerized Adaptive Test (CAT) Applications and Item Response Theory Models for Polytomous Items
ERIC Educational Resources Information Center
Aybek, Eren Can; Demirtasli, R. Nukhet
2017-01-01
This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. It also aims to introduce simulation and live CAT software to interested researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…
An Effect Size Measure for Raju's Differential Functioning for Items and Tests
ERIC Educational Resources Information Center
Wright, Keith D.; Oshima, T. C.
2015-01-01
This study established an effect size measure for differential functioning for items and tests' noncompensatory differential item functioning (NCDIF). The Mantel-Haenszel parameter served as the benchmark for developing NCDIF's effect size measure for reporting moderate and large differential item functioning in test items. The effect size of…
Detecting a Gender-Related DIF Using Logistic Regression and Transformed Item Difficulty
ERIC Educational Resources Information Center
Abedlaziz, Nabeel; Ismail, Wail; Hussin, Zaharah
2011-01-01
Test items are designed to provide information about the examinees. Difficult items are designed to be more demanding and easy items are less so. However, sometimes, test items carry with them demands other than those intended by the test developer (Scheuneman & Gerritz, 1990). When personal attributes such as gender systematically affect…
Influence of Fallible Item Parameters on Test Information During Adaptive Testing.
ERIC Educational Resources Information Center
Wetzel, C. Douglas; McBride, James R.
Computer simulation was used to assess the effects of item parameter estimation errors on different item selection strategies used in adaptive and conventional testing. To determine whether these effects reduced the advantages of certain optimal item selection strategies, simulations were repeated in the presence and absence of item parameter…
A Guide to Item Banking in Education. (Third Edition).
ERIC Educational Resources Information Center
Naccarato, Richard W.
The current status of banks of test items existing across the United States was determined through a survey conducted between September and December 1987. Item "bank" in this context does not imply that the test items are available in computerized form, but simply that "deposited" test items can be withdrawn for use. Emphasis…
Development and validation of an energy-balance knowledge test for fourth- and fifth-grade students.
Chen, Senlin; Zhu, Xihe; Kang, Minsoo
2017-05-01
A valid test measuring children's energy-balance (EB) knowledge is lacking in research. This study developed and validated the energy-balance knowledge test (EBKT) for fourth and fifth grade students. The original EBKT contained 25 items but was reduced to 23 items based on pilot results and intensive expert panel discussion. De-identified data were collected from 468 fourth and fifth grade students enrolled in four schools to examine the psychometric properties of the EBKT items. The Rasch model analysis was conducted using the Winsteps 3.65.0 software. Differential item functioning (DIF) analysis flagged 1 item (item #4) functioning differently between boys and girls, which was deleted. The final 22-item EBKT showed desirable model-data fit indices. The items had large variability ranging from -3.58 logit (item #10, the easiest) to 1.70 logit (item #3, the hardest). The average person ability on the test was 0.28 logit (SD = .78). Additional analyses supported known-group difference validity of the EBKT scores in capturing gender- and grade-based ability differences. The test was overall valid but could be further improved by expanding test items to discern various ability levels. For lack of a better test, researchers and practitioners may use the EBKT to assess fourth- and fifth-grade students' EB knowledge.
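To give a feel for the logit values reported in this abstract, the sketch below applies the standard dichotomous Rasch model, P(correct) = 1 / (1 + exp(-(θ - b))), to the reported average ability and the easiest and hardest item difficulties. The function name is hypothetical, and the calculation is only an interpretation aid, not a reproduction of the authors' analysis.

```python
import math


def rasch_probability(theta, b):
    """P(correct) under the Rasch model; ability and difficulty in logits."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


mean_ability = 0.28  # average person ability reported in the abstract
easiest = -3.58      # item #10
hardest = 1.70       # item #3

print(round(rasch_probability(mean_ability, easiest), 2))  # about 0.98
print(round(rasch_probability(mean_ability, hardest), 2))  # about 0.19
```

Under these assumptions, a student of average ability would be expected to answer the easiest item correctly almost always and the hardest item roughly one time in five.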
NASA Astrophysics Data System (ADS)
Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan
2016-12-01
This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discrimination, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test's distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.
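The three-parameter logistic model mentioned in this abstract can be sketched as follows. The form used here is the common one, P(θ) = c + (1 - c) / (1 + exp(-a(θ - b))), without the optional 1.7 scaling constant; whether the authors' software applied that constant is not stated, and the parameter values below are hypothetical, not taken from the study.

```python
import math


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    a: discrimination, b: difficulty, c: guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters, chosen only for illustration.
a, b, c = 1.2, 0.5, 0.2
for theta in (-3.0, 0.5, 3.0):
    print(theta, round(p_3pl(theta, a, b, c), 3))
```

Two properties are easy to check with this sketch: at θ = b the probability equals c + (1 - c)/2, and for very low ability it approaches the guessing parameter c rather than zero.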
ERIC Educational Resources Information Center
Baghaei, Purya; Ravand, Hamdollah
2016-01-01
In this study the magnitudes of local dependence generated by cloze test items and reading comprehension items were compared and their impact on parameter estimates and test precision was investigated. An advanced English as a foreign language reading comprehension test containing three reading passages and a cloze test was analyzed with a…
Machine Shop. Criterion-Referenced Test (CRT) Item Bank.
ERIC Educational Resources Information Center
Davis, Diane, Ed.
This drafting criterion-referenced test item bank is keyed to the machine shop competency profile developed by industry and education professionals in Missouri. The 16 references used for drafting the test items are listed. Test items are arranged under these categories: orientation to machine shop; performing mathematical calculations; performing…
Rescuing Computerized Testing by Breaking Zipf's Law.
ERIC Educational Resources Information Center
Wainer, Howard
2000-01-01
Suggests that because of the nonlinear relationship between item usage and item security, the problems of test security posed by continuous administration of standardized tests cannot be resolved merely by increasing the size of the item pool. Offers alternative strategies to overcome these problems, distributing test items so as to avoid the…
ERIC Educational Resources Information Center
Ito, Kyoko; Sykes, Robert C.
This study investigated the practice of weighting a type of test item, such as constructed response, more than other types of items, such as selected response, to compute student scores for a mixed-item type of test. The study used data from statewide writing field tests in grades 3, 5, and 8 and considered two contexts, that in which a single…
ERIC Educational Resources Information Center
Atalmis, Erkan Hasan
2016-01-01
Multiple-choice (MC) items are commonly used in high-stake tests. Thus, each item of such tests should be meticulously constructed to increase the accuracy of decisions based on test results. Haladyna and his colleagues (2002) addressed the valid item-writing guidelines to construct high quality MC items in order to increase test reliability and…
The development of a clinical outcomes survey research application: Assessment Center.
Gershon, Richard; Rothrock, Nan E; Hanrahan, Rachel T; Jansky, Liz J; Harniss, Mark; Riley, William
2010-06-01
The National Institutes of Health sponsored Patient-Reported Outcome Measurement Information System (PROMIS) aimed to create item banks and computerized adaptive tests (CATs) across multiple domains for individuals with a range of chronic diseases. Web-based software was created to enable a researcher to create study-specific Websites that could administer PROMIS CATs and other instruments to research participants or clinical samples. This paper outlines the process used to develop a user-friendly, free, Web-based resource (Assessment Center) for storage, retrieval, organization, sharing, and administration of patient-reported outcomes (PRO) instruments. Joint Application Design (JAD) sessions were conducted with representatives from numerous institutions in order to supply a general wish list of features. Use Cases were then written to ensure that end user expectations matched programmer specifications. Program development included daily programmer "scrum" sessions, weekly Usability Acceptability Testing (UAT) and continuous Quality Assurance (QA) activities pre- and post-release. Assessment Center includes features that promote instrument development including item histories, data management, and storage of statistical analysis results. This case study of software development highlights the collection and incorporation of user input throughout the development process. Potential future applications of Assessment Center in clinical research are discussed.
Item difficulty and item validity for the Children's Group Embedded Figures Test.
Rusch, R R; Trigg, C L; Brogan, R; Petriquin, S
1994-02-01
The validity and reliability of the Children's Group Embedded Figures Test was reported for students in Grade 2 by Cromack and Stone in 1980; however, a search of the literature indicates no evidence for internal consistency or item analysis. Hence the purpose of this study was to examine the item difficulty and item validity of the test with children in Grades 1 and 2. Confusion in the literature over development and use of this test was seemingly resolved through analysis of these descriptions and through an interview with the test developer. One early-appearing item was unreasonably difficult. Two or three other items were quite difficult and made little contribution to the total score. Caution is recommended, however, in any reordering or elimination of items based on these findings, given the limited number of subjects (n = 84).
1976-01-01
items. The items tested were the MODI-PAC, a proprietary item of Remington Arms Company, a standard 12-gauge round of No. 4 lead shot, and an...to refrain from testing this item. Therefore, the final selection of items for testing were (1) the MODI-PAC, (2) a standard 12-gauge shotgun round of...The first item evaluated was the MODI-PAC5. The MODI-PAC, which stands for “modified impact,” is a 12-gauge shotgun shell loaded with approximately 320
Australian Chemistry Test Item Bank: Years 11 & 12. Volume 1.
ERIC Educational Resources Information Center
Commons, C., Ed.; Martin, P., Ed.
Volume 1 of the Australian Chemistry Test Item Bank, consisting of two volumes, contains nearly 2000 multiple-choice items related to the chemistry taught in Year 11 and Year 12 courses in Australia. Items which were written during 1979 and 1980 were initially published in the "ACER Chemistry Test Item Collection" and in the "ACER…
Australian Chemistry Test Item Bank: Years 11 and 12. Volume 2.
ERIC Educational Resources Information Center
Commons, C., Ed.; Martin, P., Ed.
The second volume of the Australian Chemistry Test Item Bank, consisting of two volumes, contains nearly 2000 multiple-choice items related to the chemistry taught in Year 11 and Year 12 courses in Australia. Items which were written during 1979 and 1980 were initially published in the "ACER Chemistry Test Item Collection" and in the…
Interactions Between Item Content And Group Membership on Achievement Test Items.
ERIC Educational Resources Information Center
Linn, Robert L.; Harnisch, Delwyn L.
The purpose of this investigation was to examine the interaction of item content and group membership on achievement test items. Estimates of the parameters of the three parameter logistic model were obtained on the 46 item math test for the sample of eighth grade students (N = 2055) participating in the Illinois Inventory of Educational Progress,…
Effects of Item Exposure for Conventional Examinations in a Continuous Testing Environment.
ERIC Educational Resources Information Center
Hertz, Norman R.; Chinn, Roberta N.
This study explored the effect of item exposure on two conventional examinations administered as computer-based tests. A principal hypothesis was that item exposure would have little or no effect on average difficulty of the items over the course of an administrative cycle. This hypothesis was tested by exploring conventional item statistics and…
McInnes, Matthew D F; Moher, David; Thombs, Brett D; McGrath, Trevor A; Bossuyt, Patrick M; Clifford, Tammy; Cohen, Jérémie F; Deeks, Jonathan J; Gatsonis, Constantine; Hooft, Lotty; Hunt, Harriet A; Hyde, Christopher J; Korevaar, Daniël A; Leeflang, Mariska M G; Macaskill, Petra; Reitsma, Johannes B; Rodin, Rachel; Rutjes, Anne W S; Salameh, Jean-Paul; Stevens, Adrienne; Takwoingi, Yemisi; Tonelli, Marcello; Weeks, Laura; Whiting, Penny; Willis, Brian H
2018-01-23
Systematic reviews of diagnostic test accuracy synthesize data from primary diagnostic studies that have evaluated the accuracy of 1 or more index tests against a reference standard, provide estimates of test performance, allow comparisons of the accuracy of different tests, and facilitate the identification of sources of variability in test accuracy. To develop the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) diagnostic test accuracy guideline as a stand-alone extension of the PRISMA statement. Modifications to the PRISMA statement reflect the specific requirements for reporting of systematic reviews and meta-analyses of diagnostic test accuracy studies and the abstracts for these reviews. Established standards from the Enhancing the Quality and Transparency of Health Research (EQUATOR) Network were followed for the development of the guideline. The original PRISMA statement was used as a framework on which to modify and add items. A group of 24 multidisciplinary experts used a systematic review of articles on existing reporting guidelines and methods, a 3-round Delphi process, a consensus meeting, pilot testing, and iterative refinement to develop the PRISMA diagnostic test accuracy guideline. The final version of the PRISMA diagnostic test accuracy guideline checklist was approved by the group. The systematic review (produced 64 items) and the Delphi process (provided feedback on 7 proposed items; 1 item was later split into 2 items) identified 71 potentially relevant items for consideration. The Delphi process reduced these to 60 items that were discussed at the consensus meeting. Following the meeting, pilot testing and iterative feedback were used to generate the 27-item PRISMA diagnostic test accuracy checklist. To reflect specific or optimal contemporary systematic review methods for diagnostic test accuracy, 8 of the 27 original PRISMA items were left unchanged, 17 were modified, 2 were added, and 2 were omitted. 
The 27-item PRISMA diagnostic test accuracy checklist provides specific guidance for reporting of systematic reviews. The PRISMA diagnostic test accuracy guideline can facilitate the transparent reporting of reviews, and may assist in the evaluation of validity and applicability, enhance replicability of reviews, and make the results from systematic reviews of diagnostic test accuracy studies more useful.
An Efficiency Balanced Information Criterion for Item Selection in Computerized Adaptive Testing
ERIC Educational Resources Information Center
Han, Kyung T.
2012-01-01
Successful administration of computerized adaptive testing (CAT) programs in educational settings requires that test security and item exposure control issues be taken seriously. Developing an item selection algorithm that strikes the right balance between test precision and level of item pool utilization is the key to successful implementation…
ERIC Educational Resources Information Center
Arendasy, Martin E.; Sommer, Markus
2012-01-01
The use of new test administration technologies such as computerized adaptive testing in high-stakes educational and occupational assessments demands large item pools. Classic item construction processes and previous approaches to automatic item generation faced the problems of a considerable loss of items after the item calibration phase. In this…
Item Purification Does Not Always Improve DIF Detection: A Counterexample with Angoff's Delta Plot
ERIC Educational Resources Information Center
Magis, David; Facon, Bruno
2013-01-01
Item purification is an iterative process that is often advocated as improving the identification of items affected by differential item functioning (DIF). With test-score-based DIF detection methods, item purification iteratively removes the items currently flagged as DIF from the test scores to get purified sets of items, unaffected by DIF. The…
NASA Astrophysics Data System (ADS)
Mackevica, Aiga; Olsson, Mikael Emil; Hansen, Steffen Foss
2018-01-01
TiO2 is ubiquitously present in a wide range of everyday items, both as an intentionally incorporated additive and naturally occurring constituent. It can be found in a wide range of consumer products, including personal care products, food contact materials, and textiles. Normal use of these products may lead to consumer and/or environmental exposure to TiO2, possibly in form of nanoparticles. The aim of this study is to perform a leaching test and apply state-of-the-art methods to investigate nano-TiO2 and total Ti release from five types of commercially available conventional textiles: table placemats, wet wipes, microfiber cloths, and two types of baby bodysuits, with Ti contents ranging from 2.63 to 1448 μg/g. Released particle analysis was performed using conventional and single particle inductively coupled plasma mass spectrometry (ICP-MS and spICP-MS), in conjunction with transmission electron microscopy (TEM), to measure total and particulate TiO2 release by mass and particle number, as well as size distribution. Less than 1% of the initial Ti content was released over 24 h of leaching, with the highest releases reaching 3.13 μg/g. The fraction of nano-TiO2 released varied among fabric types and represented 0-80% of total TiO2 release. Particle mode sizes were 50-75 nm, and TEM imaging revealed particles in sizes of 80-200 nm. This study highlights the importance of using a multi-method approach to obtain quantitative release data that is able to provide an indication regarding particle number, size distribution, and mass concentration, all of which can help in understanding the fate and exposure of nanoparticles.
Comparability of scores on the MMPI-2-RF scales generated with the MMPI-2 and MMPI-2-RF booklets.
Van der Heijden, P T; Egger, J I M; Derksen, J J L
2010-05-01
In most validity studies on the recently released 338-item MMPI-2 (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2008; Tellegen & Ben-Porath, 2008), scale scores were derived from the 567-item MMPI-2 booklet. In this study, we evaluated the comparability of the MMPI-2-RF scale scores derived from the original 567-item MMPI-2 booklet with MMPI-2-RF scale scores derived from the 338-item MMPI-2-RF booklet in a Dutch student sample (N = 107). We used a counterbalanced (ABBA) design. We compared results with those previously reported by Tellegen and Ben-Porath (2008). Our findings support the comparability of the scores of the 338-item version and the 567-item version of the 50 MMPI-2-RF scales. We discuss clinical implications and directions for further research.
Jia, Lin-Zhi; Ya-Jun, Ma; Cao, Yi; Qian, Fen; Li, Xiang-Yu
2012-04-30
The quality index among "Medical Parasitology" exam papers and measured data for students in three majors from the university in 2010 were compared and analyzed. The exam papers were formed from the test item bank. The alpha reliability coefficients of the three exam papers were above 0.70. The knowledge structure and capacity structure of the exam papers were basically balanced. But the alpha reliability coefficient for the second major was the lowest, mainly due to the quality of test items in the exam paper and the failure to revise the index of the test item bank in time. This observation demonstrated that revising the test items and their index in the item bank according to the measured data can improve the quality of test item bank proposition and reduce the difference among exam papers.
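The alpha reliability coefficient referred to in this abstract is presumably Cronbach's alpha, although the abstract does not say so explicitly. A minimal sketch under that assumption, using the usual formula alpha = k/(k - 1) × (1 - Σ item variances / variance of total scores) and entirely hypothetical data, might look like this:

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix of item scores (one row per examinee)."""
    k = len(scores[0])
    # Sample variance of each item (column) across examinees.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of examinees' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 0/1 scores: 5 examinees, 4 items.
exam = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(exam), 2))
```

The function fails with a division by zero if every examinee has the same total score, which a production version would need to guard against.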
The Role of Item Models in Automatic Item Generation
ERIC Educational Resources Information Center
Gierl, Mark J.; Lai, Hollis
2012-01-01
Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…
ERIC Educational Resources Information Center
Kee, D.W.; Gregory-Domingue, A.; Rice, K.; Tone, K.
2005-01-01
The present study examined gender schema encoding (activation of gender stereotypes) in the absence of deliberate purposes relating to people (stereotype application). A release from proactive interference short-term-memory task was used. College (Experiment 1) and sixth-grade (Experiment 2) participants were asked to retain item identity and…
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-30
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-62368; File No. SR-NYSEARCA-2010-60] Self... as described in Items I and II below, which Items have been prepared by the self-regulatory... interested persons. \\1\\ 15 U.S.C. 78s(b)(1). \\2\\ 15 U.S.C. 78a. \\3\\ 17 CFR 240.19b-4. I. Self-Regulatory...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-03-20
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-69142; File No. SR-NASDAQ-2013-048] Self... rule change as described in Items I, II and III below, which Items have been prepared by the self... change from interested persons. \\1\\ 15 U.S.C. 78s(b)(1). \\2\\ 17 CFR 240.19b-4. I. Self-Regulatory...
Gasperini, Claudio; Hupperts, Raymond; Lycke, Jan; Short, Christine; McNeill, Manjit; Zhong, John; Mehta, Lahar R
2016-11-15
Prolonged-release (PR) fampridine is approved to treat walking impairment in persons with multiple sclerosis (MS); however, treatment benefits may extend beyond walking. MOBILE was a phase 2, 24-week, double-blind, placebo-controlled exploratory study to assess the impact of 10 mg PR-fampridine twice daily versus placebo on several subject-assessed measures. This analysis evaluated the physical and psychological health outcomes of subjects with progressing or relapsing MS from individual items of the Multiple Sclerosis Impact Scale (MSIS-29). PR-fampridine treatment (n=68) resulted in greater improvements from baseline in the MSIS-29 physical (PHYS) and psychological (PSYCH) impact subscales, with differences of 89% and 148% in mean score reduction from baseline (n=64) at week 24 versus placebo, respectively. MSIS-29 item analysis showed that a higher percentage of PR-fampridine subjects had mean improvements in 16/20 PHYS and 6/9 PSYCH items versus placebo after 24 weeks. Post hoc analysis of the 12-item Multiple Sclerosis Walking Scale (MSWS-12) improver population (≥8-point mean improvement) demonstrated differences in mean reductions from baseline of 97% and 111% in PR-fampridine MSIS-29 PHYS and PSYCH subscales versus the overall placebo group over 24 weeks. A higher percentage of MSWS-12 improvers treated with PR-fampridine showed mean improvements in 20/20 PHYS and 8/9 PSYCH items versus placebo at 24 weeks. In conclusion, PR-fampridine resulted in physical and psychological benefits versus placebo, sustained over 24 weeks. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Item Review and the Rearrangement Procedure: Its Process and Its Results
ERIC Educational Resources Information Center
Papanastasiou, Elena C.
2005-01-01
Permitting item review benefits examinees, who typically increase their test scores when review is allowed. However, testing companies prefer not to permit item review because it does not follow the logic on which adaptive tests are based and because it is prone to cheating strategies. Consequently, item review is not permitted in many adaptive…
A Model-Based Method for Content Validation of Automatically Generated Test Items
ERIC Educational Resources Information Center
Zhang, Xinxin; Gierl, Mark
2016-01-01
The purpose of this study is to describe a methodology to recover the item model used to generate multiple-choice test items with a novel graph theory approach. Beginning with the generated test items and working backward to recover the original item model provides a model-based method for validating the content used to automatically generate test…
Optimal Bayesian Adaptive Design for Test-Item Calibration.
van der Linden, Wim J; Ren, Hao
2015-06-01
An optimal adaptive design for test-item calibration based on Bayesian optimality criteria is presented. The design adapts the choice of field-test items to the examinees taking an operational adaptive test using both the information in the posterior distributions of their ability parameters and the current posterior distributions of the field-test parameters. Different criteria of optimality based on the two types of posterior distributions are possible. The design can be implemented using an MCMC scheme with alternating stages of sampling from the posterior distributions of the test takers' ability parameters and the parameters of the field-test items while reusing samples from earlier posterior distributions of the other parameters. Results from a simulation study demonstrated the feasibility of the proposed MCMC implementation for operational item calibration. A comparison of performances for different optimality criteria showed faster calibration of substantial numbers of items for the criterion of D-optimality relative to A-optimality, a special case of c-optimality, and random assignment of items to the test takers.
State Assessment Program Item Banks: Model Language for Request for Proposals (RFP) and Contracts
ERIC Educational Resources Information Center
Swanson, Leonard C.
2010-01-01
This document provides recommendations for request for proposal (RFP) and contract language that state education agencies can use to specify their requirements for access to test item banks. An item bank is a repository for test items and data about those items. Item banks are used by state agency staff to view items and associated data; to…
The Impact of Receiving the Same Items on Consecutive Computer Adaptive Test Administrations.
ERIC Educational Resources Information Center
O'Neill, Thomas; Lunz, Mary E.; Thiede, Keith
2000-01-01
Studied item exposure in a computerized adaptive test when the item selection algorithm presents examinees with questions they were asked in a previous test administration. Results with 178 repeat examinees on a medical technologists' test indicate that the combined use of an adaptive algorithm to select items and latent trait theory to estimate…
ERIC Educational Resources Information Center
Saß, Steffani; Schütte, Kerstin
2016-01-01
Solving test items might require abilities in test-takers other than the construct the test was designed to assess. Item and student characteristics such as item format or reading comprehension can impact the test result. This experiment is based on cognitive theories of text and picture comprehension. It examines whether integration aids, which…
Uncertainties in the Item Parameter Estimates and Robust Automated Test Assembly
ERIC Educational Resources Information Center
Veldkamp, Bernard P.; Matteucci, Mariagiulia; de Jong, Martijn G.
2013-01-01
Item response theory parameters have to be estimated, and because of the estimation process, they do have uncertainty in them. In most large-scale testing programs, the parameters are stored in item banks, and automated test assembly algorithms are applied to assemble operational test forms. These algorithms treat item parameters as fixed values,…
Identifying Differential Item Functioning in Multi-Stage Computer Adaptive Testing
ERIC Educational Resources Information Center
Gierl, Mark J.; Lai, Hollis; Li, Johnson
2013-01-01
The purpose of this study is to evaluate the performance of CATSIB (Computer Adaptive Testing-Simultaneous Item Bias Test) for detecting differential item functioning (DIF) when items in the matching and studied subtest are administered adaptively in the context of a realistic multi-stage adaptive test (MST). MST was simulated using a 4-item…
A Stepwise Test Characteristic Curve Method to Detect Item Parameter Drift
ERIC Educational Resources Information Center
Guo, Rui; Zheng, Yi; Chang, Hua-Hua
2015-01-01
An important assumption of item response theory is item parameter invariance. Sometimes, however, item parameters are not invariant across different test administrations due to factors other than sampling error; this phenomenon is termed item parameter drift. Several methods have been developed to detect drifted items. However, most of the…
Shen, Linjun; Li, Feiming; Wattleworth, Roberta; Filipetto, Frank
2010-10-01
The Comprehensive Osteopathic Medical Licensing Examination conducted a trial of multimedia items in the 2008-2009 Level 3 testing cycle to determine (1) if multimedia items were able to test additional elements of medical knowledge and skills and (2) how to develop effective multimedia items. Forty-four content-matched multimedia and text multiple-choice items were randomly delivered to Level 3 candidates. Logistic regression and paired-samples t tests were used for pairwise and group-level comparisons, respectively. Nine pairs showed significant differences in difficulty and/or discrimination. Content analysis found that, if text narrations were less direct, multimedia materials could make items easier. When textbook terminologies were replaced by multimedia presentations, multimedia items could become more difficult. Moreover, a multimedia item was found not uniformly difficult for candidates at different ability levels, possibly because multimedia and text items tested different elements of the same concept. Multimedia items may be capable of measuring some constructs different from what text items can measure. Effective multimedia items with reasonable psychometric properties can be intentionally developed.
Koh, Bongyeun; Hong, Sunggi; Kim, Soon-Sim; Hyun, Jin-Sook; Baek, Milye; Moon, Jundong; Kwon, Hayran; Kim, Gyoungyong; Min, Seonggi; Kang, Gu-Hyun
2016-01-01
The goal of this study was to characterize the difficulty index of the items in the skills test components of the class I and II Korean emergency medical technician licensing examination (KEMTLE), which requires examinees to select items randomly. The results of 1,309 class I KEMTLE examinations and 1,801 class II KEMTLE examinations in 2013 were subjected to analysis. Items from the basic and advanced skills test sections of the KEMTLE were compared to determine whether some were significantly more difficult than others. In the class I KEMTLE, all 4 of the items on the basic skills test showed significant variation in difficulty index (P<0.01), as well as 4 of the 5 items on the advanced skills test (P<0.05). In the class II KEMTLE, 4 of the 5 items on the basic skills test showed significantly different difficulty indices (P<0.01), as well as all 3 of the advanced skills test items (P<0.01). In the skills test components of the class I and II KEMTLE, the procedure in which examinees randomly select questions should be revised to require examinees to respond to a set of fixed items in order to improve the reliability of the national licensing examination.
Item Analysis in Introductory Economics Testing.
ERIC Educational Resources Information Center
Tinari, Frank D.
1979-01-01
Computerized analysis of multiple choice test items is explained. Examples of item analysis applications in the introductory economics course are discussed with respect to three objectives: to evaluate learning; to improve test items; and to help improve classroom instruction. Problems, costs and benefits of the procedures are identified. (JMD)
NASA Astrophysics Data System (ADS)
Ilich, Maria O.
Psychometricians and test developers evaluate standardized tests for potential bias against groups of test-takers by using differential item functioning (DIF). English language learners (ELLs) are a diverse group of students whose native language is not English. While they are still learning the English language, they must take their standardized tests for their school subjects, including science, in English. In this study, linguistic complexity was examined as a possible source of DIF that may result in test scores that confound science knowledge with a lack of English proficiency among ELLs. Two years of fifth-grade state science tests were analyzed for evidence of DIF using two DIF methods, Simultaneous Item Bias Test (SIBTest) and logistic regression. The tests presented a unique challenge in that the test items were grouped together into testlets: groups of items referring to a scientific scenario to measure knowledge of different science content or skills. Very large samples of 10,256 students in 2006 and 13,571 students in 2007 were examined. Half of each sample was composed of Spanish-speaking ELLs; the balance comprised native English speakers. The two DIF methods were in agreement about the items that favored non-ELLs and the items that favored ELLs. Logistic regression effect sizes were all negligible, while SIBTest flagged items with low to high DIF. A decrease in socioeconomic status and Spanish-speaking ELL diversity may have led to inconsistent SIBTest effect sizes for items used in both testing years. The DIF results for the testlets suggested that ELLs lacked sufficient opportunity to learn science content. The DIF results further suggest that those constructed response test items requiring the student to draw a conclusion about a scientific investigation or to plan a new investigation tended to favor ELLs.
NASA Astrophysics Data System (ADS)
Wren, David A.
The research presented in this dissertation culminated in a 10-item Thermochemistry Concept Inventory (TCI). The development of the TCI can be divided into two main phases: qualitative studies and quantitative studies. Both phases focused on the primary stakeholders of the TCI, college-level general chemistry instructors and students. Each phase was designed to collect evidence for the validity of the interpretations and uses of TCI testing data. A central use of TCI testing data is to identify student conceptual misunderstandings, which are represented as incorrect options of multiple-choice TCI items. Therefore, quantitative and qualitative studies focused heavily on collecting evidence at the item-level, where important interpretations may be made by TCI users. Qualitative studies included student interviews (N = 28) and online expert surveys (N = 30). Think-aloud student interviews (N = 12) were used to identify conceptual misunderstandings used by students. Novice response process validity interviews (N = 16) helped provide information on how students interpreted and answered TCI items and were the basis of item revisions. Practicing general chemistry instructors (N = 18), or experts, defined boundaries of thermochemistry content included on the TCI. Once TCI items were in the later stages of development, an online version of the TCI was used in an expert response process validity survey (N = 12), to provide expert feedback on item content, format and consensus of the correct answer for each item. Quantitative studies included three phases: beta testing of TCI items (N = 280), pilot testing of a 12-item TCI (N = 485), and a large data collection using a 10-item TCI (N = 1331). In addition to traditional classical test theory analysis, Rasch model analysis was also used for evaluation of testing data at the test and item level.
The TCI was administered in both formative assessment (beta and pilot testing) and summative assessment (large data collection), with items performing well in both. One item, item K, did not have acceptable psychometric properties when the TCI was used as a quiz (summative assessment), but was retained in the final version of the TCI based on the acceptable psychometric properties displayed in pilot testing (formative assessment).
ERIC Educational Resources Information Center
Li, Yanmei
2012-01-01
In a common-item (anchor) equating design, the common items should be evaluated for item parameter drift. Drifted items are often removed. For a test that contains mostly dichotomous items and only a small number of polytomous items, removing some drifted polytomous anchor items may result in anchor sets that no longer resemble mini-versions of…
Sinharay, Sandip
2017-09-01
Benefiting from item preknowledge is a major type of fraudulent behavior during educational assessments. Belov suggested the posterior shift statistic for detection of item preknowledge and showed its performance to be better on average than that of seven other statistics for detection of item preknowledge for a known set of compromised items. Sinharay suggested a statistic based on the likelihood ratio test for detection of item preknowledge; the advantage of the statistic is that its null distribution is known. Results from simulated and real data and adaptive and nonadaptive tests are used to demonstrate that the Type I error rate and power of the statistic based on the likelihood ratio test are very similar to those of the posterior shift statistic. Thus, the statistic based on the likelihood ratio test appears promising in detecting item preknowledge when the set of compromised items is known.
ERIC Educational Resources Information Center
McLeod, Lori D.; Lewis, Charles; Thissen, David.
With the increased use of computerized adaptive testing, which allows for continuous testing, new concerns about test security have evolved, one being the assurance that items in an item pool are safeguarded from theft. In this paper, the risk of score inflation and procedures to detect test takers using item preknowledge are explored. When test…
Effect of Multiple Testing Adjustment in Differential Item Functioning Detection
ERIC Educational Resources Information Center
Kim, Jihye; Oshima, T. C.
2013-01-01
In a typical differential item functioning (DIF) analysis, a significance test is conducted for each item. As a test consists of multiple items, such multiple testing may increase the possibility of making a Type I error at least once. The goal of this study was to investigate how to control a Type I error rate and power using adjustment…
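The multiple-testing problem described above can be illustrated with a minimal sketch. The abstract is truncated and does not name the adjustment it studied, so the Bonferroni and Benjamini-Hochberg procedures below are assumed here as two common choices, and the per-item p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag item i if p_i <= alpha / m (controls the familywise Type I error rate)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Flag items with the BH step-up rule (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    flagged = [False] * m
    for idx in order[:cutoff]:
        flagged[idx] = True
    return flagged


# Hypothetical per-item DIF p-values, for illustration only.
p_values = [0.001, 0.012, 0.03, 0.04, 0.20, 0.65]
unadjusted = [p <= 0.05 for p in p_values]  # one unadjusted test per item
```

With these made-up values, the unadjusted rule flags four items, Benjamini-Hochberg flags two, and Bonferroni flags one, which shows the trade-off between Type I error control and power that the study investigates.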
Item Response Theory Models for Performance Decline during Testing
ERIC Educational Resources Information Center
Jin, Kuan-Yu; Wang, Wen-Chung
2014-01-01
Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…
Differential item functioning analysis of the Vanderbilt Expertise Test for cars.
Lee, Woo-Yeol; Cho, Sun-Joo; McGugin, Rankin W; Van Gulick, Ana Beth; Gauthier, Isabel
2015-01-01
The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge.
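For readers unfamiliar with the three-parameter logistic (3PL) model named above, the following is a minimal sketch of its item response function in the standard textbook form. The parameter values are hypothetical and are not taken from the VETcar analysis; the optional scaling constant D = 1.7 used in some formulations is omitted.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: subject ability
    a: item discrimination
    b: item difficulty
    c: lower asymptote (pseudo-guessing parameter)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: a = 1.2, b = 0.0, c = 0.2.
# At theta == b the probability is halfway between c and 1.
p_at_difficulty = p_correct_3pl(0.0, a=1.2, b=0.0, c=0.2)
```

In a DIF analysis, an item is suspect when subjects of equal ability in two groups (for example, lab versus online) have different values of this probability, that is, when the item parameters differ across groups.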
Samejima Items in Multiple-Choice Tests: Identification and Implications
ERIC Educational Resources Information Center
Rahman, Nazia
2013-01-01
Samejima hypothesized that non-monotonically increasing item response functions (IRFs) of ability might occur for multiple-choice items (referred to here as "Samejima items") if low ability test takers with some, though incomplete, knowledge or skill are drawn to a particularly attractive distractor, while very low ability test takers…
Computerized Numerical Control Test Item Bank.
ERIC Educational Resources Information Center
Reneau, Fred; And Others
This guide contains 285 test items for use in teaching a course in computerized numerical control. All test items were reviewed, revised, and validated by incumbent workers and subject matter instructors. Items are provided for assessing student achievement in such aspects of programming and planning, setting up, and operating machines with…
Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating
ERIC Educational Resources Information Center
He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei
2013-01-01
Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…
ERIC Educational Resources Information Center
He, Yong
2013-01-01
Common test items play an important role in equating multiple test forms under the common-item nonequivalent groups design. Inconsistent item parameter estimates among common items can lead to large bias in equated scores for IRT true score equating. Current methods extensively focus on detection and elimination of outlying common items, which…
ERIC Educational Resources Information Center
Scheuneman, Janice Dowd; Gerritz, Kalle
1990-01-01
Differential item functioning (DIF) methodology for revealing sources of item difficulty and performance characteristics of different groups was explored. A total of 150 Scholastic Aptitude Test items and 132 Graduate Record Examination general test items were analyzed. DIF was evaluated for males and females and Blacks and Whites. (SLD)
Item Structural Properties as Predictors of Item Difficulty and Item Association.
ERIC Educational Resources Information Center
Solano-Flores, Guillermo
1993-01-01
Studied the ability of logical test design (LTD) to predict student performance in reading Roman numerals for 211 sixth graders in Mexico City tested on Roman numeral items varying on LTD-related and non-LTD-related variables. The LTD-related variable item iterativity was found to be the best predictor of item difficulty. (SLD)
Investigating Item Exposure Control Methods in Computerized Adaptive Testing
ERIC Educational Resources Information Center
Ozturk, Nagihan Boztunc; Dogan, Nuri
2015-01-01
This study aims to investigate the effects of item exposure control methods on measurement precision and on test security under various item selection methods and item pool characteristics. In this study, the Randomesque (with item group sizes of 5 and 10), Sympson-Hetter, and Fade-Away methods were used as item exposure control methods. Moreover,…
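As an illustration of the Randomesque method mentioned above, the sketch below picks the next item at random from the group of most informative unused items rather than always taking the single best one. It assumes a two-parameter logistic model with Fisher information as the selection criterion and uses a hypothetical item pool; the study's actual selection methods and pools are not given in the truncated abstract.

```python
import math
import random


def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def randomesque_select(theta, pool, administered, group_size=5, rng=random):
    """Return the index of the next item.

    The unused items are ranked by information at the current ability
    estimate, and one of the top `group_size` items is drawn at random.
    """
    candidates = [i for i in range(len(pool)) if i not in administered]
    candidates.sort(key=lambda i: info_2pl(theta, *pool[i]), reverse=True)
    return rng.choice(candidates[:group_size])


# Hypothetical pool of (a, b) pairs, for illustration only.
pool = [(1.0 + 0.1 * (i % 5), -2.0 + 0.2 * i) for i in range(21)]
rng = random.Random(0)  # seeded so the sketch is reproducible
picked = randomesque_select(0.0, pool, administered={10}, group_size=5, rng=rng)
```

A larger `group_size` (such as the 10 used in the study) spreads exposure across more items at some cost in measurement precision, since less informative items are administered more often.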
ERIC Educational Resources Information Center
Lee, Woo-yeol; Cho, Sun-Joo
2017-01-01
Cross-level invariance in a multilevel item response model can be investigated by testing whether the within-level item discriminations are equal to the between-level item discriminations. Testing the cross-level invariance assumption is important to understand constructs in multilevel data. However, in most multilevel item response model…
Item Pool Design for an Operational Variable-Length Computerized Adaptive Test
ERIC Educational Resources Information Center
He, Wei; Reckase, Mark D.
2014-01-01
For computerized adaptive tests (CATs) to work well, they must have an item pool with sufficient numbers of good quality items. Many researchers have pointed out that, in developing item pools for CATs, not only is the item pool size important but also the distribution of item parameters and practical considerations such as content distribution…
Fire Tests on E-vehicle Battery Cells and Packs.
Sturk, David; Hoffmann, Lars; Ahlberg Tidblad, Annika
2015-01-01
The purpose of this study was to investigate the effects of abuse conditions, including realistic crash scenarios, on Li ion battery systems in E-vehicles in order to develop safe practices and priorities when responding to accidents involving E-vehicles. External fire tests using a single burning item equipment were performed on commercial Li ion battery cells and battery packs for electric vehicle (E-vehicle) application. The 2 most common battery cell technologies were tested: Lithium iron phosphate (LFP) and mixed transition metal oxide (lithium nickel manganese cobalt oxide, NMC) cathodes against graphite anodes, respectively. The cell types investigated were "pouch" cells, with similar physical dimensions, but the NMC cells have double the electric capacity of the LFP cells due to the higher energy density of the NMC chemistry, 7 and 14 Ah, respectively. Heat release rate (HRR) data and concentrations of toxic gases were acquired by oxygen consumption calorimetry and Fourier transform infrared spectroscopy (FTIR), respectively. The test results indicate that the state of charge (SOC) affects the HRR as well as the amount of toxic hydrogen fluoride (HF) gas formed during combustion. A larger number of cells increases the amount of HF formed per cell. There are significant differences in response to the fire exposure between the NMC and LFP cells in this study. The LFP cells generate a lot more HF per cell, but the overall reactivity of the NMC cells is higher. However, the total energy released by both batteries during combustion was independent of SOC, which indicates that the electric energy content of the test object contributes to the activation energy of the thermal and heat release process, whereas the chemical energy stored in the materials is the main source of thermal energy in the batteries. 
The results imply that it is difficult to draw conclusions about higher order system behavior with respect to HF emissions based on data from tests on single cells or small assemblies of cells. This applies to energy release rates as well. The present data show that mass and shielding effects between cells in multicell assemblies affect the propagation of a thermal event.
Deichmann Nielsen, Lea; Bech, Per; Hounsgaard, Lise; Alkier Gildberg, Frederik
2017-08-01
Unstructured risk assessment, as well as confounders (underlying reasons for the patient's risk behaviour and alliance), risk behaviour, and parameters of alliance, have been identified as factors that prolong the duration of mechanical restraint among forensic mental health inpatients. The aim was to clinically validate a new, structured short-term risk assessment instrument called the Mechanical Restraint-Confounders, Risk, Alliance Score (MR-CRAS), with the intended purpose of supporting the clinicians' observation and assessment of the patient's readiness to be released from mechanical restraint. The content and layout of MR-CRAS and its user manual were evaluated using face validation by forensic mental health clinicians, content validation by an expert panel, and pilot testing within two closed forensic mental health inpatient units. The three sub-scales (Confounders, Risk, and a parameter of Alliance) showed excellent content validity. The clinical validations also showed that MR-CRAS was perceived and experienced as a comprehensible, relevant, comprehensive, and useable risk assessment instrument. MR-CRAS contains 18 clinically valid items, and the instrument can be used to support the clinical decision-making regarding the possibility of releasing the patient from mechanical restraint. The present three studies have clinically validated a short MR-CRAS scale that is currently being psychometrically tested in a larger study.
ERIC Educational Resources Information Center
Yoon, Su-Youn; Lee, Chong Min; Houghton, Patrick; Lopez, Melissa; Sakano, Jennifer; Loukina, Anastasia; Krovetz, Bob; Lu, Chi; Madani, Nitin
2017-01-01
In this study, we developed assistive tools and resources to support TOEIC® Listening test item generation. There has recently been an increased need for a large pool of items for these tests. This need has, in turn, inspired efforts to increase the efficiency of item generation while maintaining the quality of the created items. We aimed to…
ERIC Educational Resources Information Center
Nissan, Susan; And Others
One of the item types in the Listening Comprehension section of the Test of English as a Foreign Language (TOEFL) test is the dialogue. Because the dialogue item pool needs to have an appropriate balance of items at a range of difficulty levels, test developers have examined items at various difficulty levels in an attempt to identify their…
2014-12-01
chemical etching EDM electrical discharge machine EID enterprise identifier EOSS Engineering Operational Sequencing System F Fahrenheit...Center in Corona, California, released a DoN IUID Marking Guide, which made recommendations on how to mark legacy items. It provides technical...uploaded into the IUID registry managed by the Naval Surface Warfare Center (NSWC) in Corona, California. There is no set amount of information
2006-10-01
NCAPS) Christina M. Underhill, Ph.D. Approved for public release; distribution is unlimited. NPRST-TN-06-9 October 2006...Investigation of Item-Pair Presentation and Construct Validity of the Navy Computer Adaptive Personality Scales (NCAPS) Christina M. Underhill, Ph.D...documents one of the steps in our development of the Navy Computer Adaptive Personality Scales (NCAPS). NCAPS is a computer adaptive personality measure
Jensen, Theo Walther; Møller, Thea Palsgaard; Viereck, Søren; Roland, Jens; Pedersen, Thomas Egesborg; Lippert, Freddy K
2018-01-10
The European Resuscitation Council (ERC) released new guidelines on resuscitation in 2015. For the first time, the guidelines included a separate chapter on first aid for laypersons. We analysed the current major Danish national first aid books to identify potential inconsistencies between the current books and the new evidence-based first aid guidelines. We identified first aid books from all the first aid courses offered by major Danish suppliers. Based on the new ERC first aid guidelines, we developed a checklist of 26 items within 16 different categories to assess the content; this checklist was adapted following the principle of mutually exclusive and collectively exhaustive questioning. To assess the agreement between four raters, Fleiss' kappa test was used. Items that did not reach an acceptable kappa score were excluded. We evaluated 10 first aid books used for first aid courses and published between 2009 and 2015. The content of the books complied with the new guidelines in 38% of the answers. In 12 of the 26 items, there was less than 50% consistency. These items include proximal pressure points and elevation of extremities for the control of bleeding, use of cervical collars, treatment for an open chest wound, burn dressing, dental avulsion, passive leg raising, administration of bronchodilators, adrenaline, and aspirin. Danish course material showed significant inconsistencies with the new evidence-based first aid guidelines. The new knowledge from the evidence-based guidelines should be incorporated into revised and updated first aid course material.
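The Fleiss' kappa statistic used above to assess agreement among the four raters can be computed as in the following minimal sketch. The rating counts are hypothetical (four raters, two categories such as "complies" and "does not comply") and are not data from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to
    category j. Every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Proportion of all assignments that fall into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each subject.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_subj) / n_subjects
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical counts for 4 checklist items rated by 4 raters.
counts = [[4, 0], [3, 1], [2, 2], [0, 4]]
kappa = fleiss_kappa(counts)
```

For these made-up counts the statistic equals 11/27 (about 0.41). What counts as an "acceptable" kappa score is a threshold chosen by the analyst; the abstract does not state the value the authors used.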
Park, In Sook; Suh, Yeon Ok; Park, Hae Sook; Kang, So Young; Kim, Kwang Sung; Kim, Gyung Hee; Choi, Yeon-Hee; Kim, Hyun-Ju
2017-01-01
The purpose of this study was to improve the quality of items on the Korean Nursing Licensing Examination by developing and evaluating case-based items that reflect integrated nursing knowledge. We conducted a cross-sectional observational study to develop new case-based items. The methods for developing test items included expert workshops, brainstorming, and verification of content validity. After a mock examination of undergraduate nursing students using the newly developed case-based items, we evaluated the appropriateness of the items through classical test theory and item response theory. A total of 50 case-based items were developed for the mock examination, and content validity was evaluated. The question items integrated 34 discrete elements of integrated nursing knowledge. The mock examination was taken by 741 baccalaureate students in their fourth year of study at 13 universities. Their average score on the mock examination was 57.4, and the examination showed a reliability of 0.40. According to classical test theory, the average item difficulty was 57.4% (80%-100% for 12 items; 60%-80% for 13 items; and less than 60% for 25 items). The mean discrimination index was 0.19, and was above 0.30 for 11 items and 0.20 to 0.29 for 15 items. According to item response theory, the item discrimination parameter (in the logistic model) was none for 10 items (0.00), very low for 20 items (0.01 to 0.34), low for 12 items (0.35 to 0.64), moderate for 6 items (0.65 to 1.34), high for 1 item (1.35 to 1.69), and very high for 1 item (above 1.70). The item difficulty was very easy for 24 items (below -2.0), easy for 8 items (-2.0 to -0.5), medium for 6 items (-0.5 to 0.5), hard for 3 items (0.5 to 2.0), and very hard for 9 items (2.0 or above). The goodness-of-fit test in terms of the 2-parameter item response model between the range of 2.0 to 0.5 revealed that 12 items had an ideal correct answer rate.
We surmised that the low reliability of the mock examination was influenced by the timing of the test for the examinees and the inappropriate difficulty of the items. Our study suggested a methodology for the development of future case-based items for the Korean Nursing Licensing Examination.
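The classical test theory quantities reported above (item difficulty as the percentage answering correctly, and a discrimination index) can be computed as in this minimal sketch. The abstract does not say how the discrimination index was obtained, so the upper-lower group method with 27% groups is assumed here, and the response matrix is hypothetical.

```python
def item_difficulty(responses, item):
    """Proportion of examinees answering the item correctly (higher = easier)."""
    return sum(r[item] for r in responses) / len(responses)


def discrimination_index(responses, item, fraction=0.27):
    """Upper-lower discrimination index.

    Examinees are ranked by total score; the index is the proportion
    correct in the top group minus the proportion correct in the bottom
    group. The 27% group size is a conventional choice, assumed here.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    p_upper = sum(r[item] for r in upper) / n
    p_lower = sum(r[item] for r in lower) / n
    return p_upper - p_lower


# Hypothetical scored responses: 10 examinees (rows) by 3 items (columns),
# where 1 means correct and 0 means incorrect.
responses = [
    [1, 1, 1], [1, 1, 1], [1, 1, 0], [1, 1, 0], [1, 0, 1],
    [1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0], [0, 0, 0],
]
difficulty_item0 = item_difficulty(responses, 0)
discrimination_item0 = discrimination_index(responses, 0)
```

Under this convention, a value near 0 means that high and low scorers answer the item correctly about equally often, which is consistent with the abstract's concern about a mean discrimination index of 0.19.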
The beneficial effect of testing: an event-related potential study
Bai, Cheng-Hua; Bridger, Emma K.; Zimmer, Hubert D.; Mecklinger, Axel
2015-01-01
The enhanced memory performance for items that are tested as compared to being restudied (the testing effect) is a frequently reported memory phenomenon. According to the episodic context account of the testing effect, this beneficial effect of testing is related to a process which reinstates the previously learnt episodic information. Few studies have explored the neural correlates of this effect at the time point when testing takes place, however. In this study, we utilized the ERP correlates of successful memory encoding to address this issue, hypothesizing that if the benefit of testing is due to retrieval-related processes at test then subsequent memory effects (SMEs) should resemble the ERP correlates of retrieval-based processing in their temporal and spatial characteristics. Participants were asked to learn Swahili-German word pairs before items were presented in either a testing or a restudy condition. Memory performance was assessed immediately and 1 day later with a cued recall task. Successfully recalling items at test increased the likelihood that items were remembered over time compared to items which were only restudied. An ERP subsequent memory contrast (later remembered vs. later forgotten tested items), which reflects the engagement of processes that ensure items are recallable the next day, was topographically comparable with the ERP correlate of immediate recollection (immediately remembered vs. immediately forgotten tested items). This result shows that the processes which allow items to be more memorable over time share qualitatively similar neural correlates with the processes that relate to successful retrieval at test. This finding supports the notion that testing is more beneficial than restudying on memory performance over time because of its engagement of retrieval processes, such as the re-encoding of actively retrieved memory representations. PMID:26441577
Chemical agent simulant release from clothing following vapor exposure.
Feldman, Robert J
2010-02-01
Most ambulatory victims of a terrorist chemical attack will have exposure to vapor only. The study objective was to measure the duration of chemical vapor release from various types of clothing. A chemical agent was simulated using methyl salicylate (MeS), which has similar physical properties to sulfur mustard and was the agent used in the U.S. Army's Man-In-Simulant Test (MIST). Vapor concentration was measured with a Smiths Detection Advanced Portable Detector (APD)-2000 unit. The clothing items were exposed to vapor for 1 hour in a sealed cabinet; vapor concentration was measured at the start and end of each exposure. Clothing was then removed and assessed every 5 minutes with the APD-2000, using a uniform sweep pattern, until readings remained 0. Concentration and duration of vapor release from clothing varied with clothing composition and construction. Lightweight cotton shirts and jeans had the least trapped vapor; down outerwear, the most. Vapor concentration near the clothing often increased for several minutes after the clothing was removed from the contaminated environment. Compression of thick outerwear released additional vapor. Mean times to reach 0 ranged from 7 minutes for jeans to 42 minutes for down jackets. This simulation model of chemical vapor release demonstrates persistent presence of simulant vapor over time. This implies that chemical vapor may be released from the victims' clothing after they are evacuated from the site of exposure, resulting in additional exposure of victims and emergency responders. Insulated outerwear can release additional vapor when handled. If a patient has just moved to a vapor screening point, immediate assessment before additional vapor can be released from the clothing can lead to a false-negative assessment of contamination.
The development of a science process assessment for fourth-grade students
NASA Astrophysics Data System (ADS)
Smith, Kathleen A.; Welliver, Paul W.
In this study, a multiple-choice test entitled the Science Process Assessment was developed to measure the science process skills of students in grade four. Based on the Recommended Science Competency Continuum for Grades K to 6 for Pennsylvania Schools, this instrument measured the skills of (1) observing, (2) classifying, (3) inferring, (4) predicting, (5) measuring, (6) communicating, (7) using space/time relations, (8) defining operationally, (9) formulating hypotheses, (10) experimenting, (11) recognizing variables, (12) interpreting data, and (13) formulating models. To prepare the instrument, classroom teachers and science educators were invited to participate in two science education workshops designed to develop an item bank of test questions applicable to measuring process skill learning. Participants formed writing teams and generated 65 test items representing the 13 process skills. After a comprehensive group critique of each item, 61 items were identified for inclusion into the Science Process Assessment item bank. To establish content validity, the item bank was submitted to a select panel of science educators for the purpose of judging item acceptability. This analysis yielded 55 acceptable test items and produced the Science Process Assessment, Pilot 1. Pilot 1 was administered to 184 fourth-grade students. Students were given a copy of the test booklet; teachers read each test aloud to the students. Upon completion of this first administration, data from the item analysis yielded a reliability coefficient of 0.73. Subsequently, 40 test items were identified for the Science Process Assessment, Pilot 2. Using the test-retest method, the Science Process Assessment, Pilot 2 (Test 1 and Test 2) was administered to 113 fourth-grade students. Reliability coefficients of 0.80 and 0.82, respectively, were ascertained. The correlation between Test 1 and Test 2 was 0.77. 
The results of this study indicate that (1) the Science Process Assessment, Pilot 2, is a valid and reliable instrument applicable to measuring the science process skills of students in grade four, (2) using educational workshops as a means of developing item banks of test questions is viable and productive in the test development process, and (3) involving classroom teachers and science educators in the test development process is educationally efficient and effective.
Michaelides, Michalis P.
2010-01-01
Many studies have investigated the topic of change or drift in item parameter estimates in the context of item response theory (IRT). Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items. PMID:21833230
Raykov, Tenko; Marcoulides, George A
2016-04-01
The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete nature of the observed items. Two distinct observational equivalence approaches are outlined that render the item response models from corresponding classical test theory-based models, and can each be used to obtain the former from the latter models. Similarly, classical test theory models can be furnished using the reverse application of either of those approaches from corresponding item response models.
Locally Dependent Linear Logistic Test Model with Person Covariates
ERIC Educational Resources Information Center
Ip, Edward H.; Smits, Dirk J. M.; De Boeck, Paul
2009-01-01
The article proposes a family of item-response models that allow the separate and independent specification of three orthogonal components: item attribute, person covariate, and local item dependence. Special interest lies in extending the linear logistic test model, which is commonly used to measure item attributes, to tests with embedded item…
Applying Bayesian Item Selection Approaches to Adaptive Tests Using Polytomous Items
ERIC Educational Resources Information Center
Penfield, Randall D.
2006-01-01
This study applied the maximum expected information (MEI) and the maximum posterior-weighted information (MPI) approaches of computer adaptive testing item selection to the case of a test using polytomous items following the partial credit model. The MEI and MPI approaches are described. A simulation study compared the efficiency of ability…
Do Reading Experts Agree with MCAT Verbal Reasoning Item Classifications?
ERIC Educational Resources Information Center
Jackson, Evelyn W.; And Others
1994-01-01
Examined whether expert raters (n=5) could agree about classification of Medical College Admission Test (MCAT) items and whether they agreed with MCAT student manual in labeling skill being measured by each test item. Results revealed difficulties in replicating authors' labeling of skills for reading items on practice test provided with 1991 MCAT…
ACER Chemistry Test Item Collection (ACER CHEMTIC Year 12 Supplement).
ERIC Educational Resources Information Center
Australian Council for Educational Research, Hawthorn.
This publication contains 317 multiple-choice chemistry test items related to topics covered in the Victorian (Australia) Year 12 chemistry course. It allows teachers access to a range of items suitable for diagnostic and achievement purposes, supplementing the ACER Chemistry Test Item Collection--Year 12 (CHEMTIC). The topics covered are: organic…
Differential Item Functioning: Its Consequences. Research Report. ETS RR-10-01
ERIC Educational Resources Information Center
Lee, Yi-Hsuan; Zhang, Jinming
2010-01-01
This report examines the consequences of differential item functioning (DIF) using simulated data. Its impact on total score, item response theory (IRT) ability estimate, and test reliability was evaluated in various testing scenarios created by manipulating the following four factors: test length, percentage of DIF items per form, sample sizes of…
Electronics. Criterion-Referenced Test (CRT) Item Bank.
ERIC Educational Resources Information Center
Davis, Diane, Ed.
This document contains 519 criterion-referenced multiple choice and true or false test items for a course in electronics. The test item bank is designed to work with both the Vocational Instructional Management System (VIMS) and the Vocational Administrative Management System (VAMS) in Missouri. The items are grouped into 15 units covering the…
Auto Mechanics. Criterion-Referenced Test (CRT) Item Bank.
ERIC Educational Resources Information Center
Tannehill, Dana, Ed.
This document contains 546 criterion-referenced multiple choice and true or false test items for a course in auto mechanics. The test item bank is designed to work with both the Vocational Instructional Management System (VIMS) and Vocational Administrative Management System (VAMS) in Missouri. The items are grouped into 35 units covering the…
Developing a Strategy for Using Technology-Enhanced Items in Large-Scale Standardized Tests
ERIC Educational Resources Information Center
Bryant, William
2017-01-01
As large-scale standardized tests move from paper-based to computer-based delivery, opportunities arise for test developers to make use of items beyond traditional selected and constructed response types. Technology-enhanced items (TEIs) have the potential to provide advantages over conventional items, including broadening construct measurement,…
2016-01-01
Purpose: The goal of this study was to characterize the difficulty index of the items in the skills test components of the class I and II Korean emergency medical technician licensing examination (KEMTLE), which requires examinees to select items randomly. Methods: The results of 1,309 class I KEMTLE examinations and 1,801 class II KEMTLE examinations in 2013 were subjected to analysis. Items from the basic and advanced skills test sections of the KEMTLE were compared to determine whether some were significantly more difficult than others. Results: In the class I KEMTLE, all 4 of the items on the basic skills test showed significant variation in difficulty index (P<0.01), as well as 4 of the 5 items on the advanced skills test (P<0.05). In the class II KEMTLE, 4 of the 5 items on the basic skills test showed significantly different difficulty index (P<0.01), as well as all 3 of the advanced skills test items (P<0.01). Conclusion: In the skills test components of the class I and II KEMTLE, the procedure in which examinees randomly select questions should be revised to require examinees to respond to a set of fixed items in order to improve the reliability of the national licensing examination. PMID:26883810
Doig, Emmah; Prescott, Sarah; Fleming, Jennifer; Cornwell, Petrea; Kuipers, Pim
2016-01-01
To examine the internal reliability and test-retest reliability of the Client-Centeredness of Goal Setting (C-COGS) scale. The C-COGS scale was administered to 42 participants with acquired brain injury after completion of multidisciplinary goal planning. Internal reliability of scale items was examined using item-partial total correlations and Cronbach's α coefficient. The scale was readministered within a 1-mo period to a subsample of 12 participants to examine test-retest reliability by calculating exact and close percentage agreement for each item. After examination of item-partial total correlations, test items were revised. The revised items demonstrated stronger internal consistency than the original items. Preliminary evaluation of test-retest reliability was fair, with an average exact percent agreement across all test items of 67%. Findings support the preliminary reliability of the C-COGS scale as a tool to evaluate and promote client-centered goal planning in brain injury rehabilitation. Copyright © 2016 by the American Occupational Therapy Association, Inc.
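As an illustration of the two reliability statistics named in this abstract, the following is a minimal sketch of Cronbach's α and exact percentage agreement. It is not the authors' analysis code; the function names and the input layout (one list of item scores per respondent) are assumptions made for this example.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items table.

    scores: list of respondents, each a list of k item scores (assumed layout).
    """
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the total (summed) score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def exact_agreement(first, second):
    """Percentage of paired responses that are identical at test and retest."""
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)
```

For a single item, `exact_agreement` would be called with that item's responses from the first and second administrations; averaging the result over items gives a figure comparable in kind to the 67% reported above.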
Item-Writing Guidelines for Physics
ERIC Educational Resources Information Center
Regan, Tom
2015-01-01
A teacher learning how to write test questions (test items) will almost certainly encounter item-writing guidelines--lists of item-writing do's and don'ts. Item-writing guidelines usually are presented as applicable across all assessment settings. Table I shows some guidelines that I believe to be generally applicable and two will be briefly…
Unidimensional Interpretations for Multidimensional Test Items
ERIC Educational Resources Information Center
Kahraman, Nilufer
2013-01-01
This article considers potential problems that can arise in estimating a unidimensional item response theory (IRT) model when some test items are multidimensional (i.e., show a complex factorial structure). More specifically, this study examines (1) the consequences of model misfit on IRT item parameter estimates due to unintended minor item-level…
Kisala, Pamela A.; Victorson, David; Pace, Natalie; Heinemann, Allen W.; Choi, Seung W.; Tulsky, David S.
2015-01-01
Objective: To describe the development and psychometric properties of the SCI-QOL Psychological Trauma item bank and short form. Design: Using a mixed-methods design, we developed and tested a Psychological Trauma item bank with patient and provider focus groups, cognitive interviews, and item response theory based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. Setting: We tested a 31-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Veterans Administration hospital. Participants: A total of 716 individuals with SCI completed the trauma items. Results: The 31 items fit a unidimensional model (CFI=0.952; RMSEA=0.061) and demonstrated good precision (theta range between 0.6 and 2.5). Nine items demonstrated negligible DIF with little impact on score estimates. The final calibrated item bank contains 19 items. Conclusion: The SCI-QOL Psychological Trauma item bank is a psychometrically robust measurement tool from which a short form and a computer adaptive test (CAT) version are available. PMID:26010967
Vaughn, Kalif E; Rawson, Katherine A; Pyc, Mary A
2013-12-01
A wealth of previous research has established that retrieval practice promotes memory, particularly when retrieval is successful. Although successful retrieval promotes memory, it remains unclear whether successful retrieval promotes memory equally well for items of varying difficulty. Will easy items still outperform difficult items on a final test if all items have been correctly recalled equal numbers of times during practice? In two experiments, normatively difficult and easy Lithuanian-English word pairs were learned via test-restudy practice until each item had been correctly recalled a preassigned number of times (from 1 to 11 correct recalls). Despite equating the numbers of successful recalls during practice, performance on a delayed final cued-recall test was lower for difficult than for easy items. Experiment 2 was designed to diagnose whether the disadvantage for difficult items was due to deficits in cue memory, target memory, and/or associative memory. The results revealed a disadvantage for the difficult versus the easy items only on the associative recognition test, with no differences on cue recognition, and even an advantage on target recognition. Although successful retrieval enhanced memory for both difficult and easy items, equating retrieval success during practice did not eliminate normative item difficulty differences.
Test Bias: An Objective Definition for Test Items.
ERIC Educational Resources Information Center
Durovic, Jerry J.
A test bias definition, applicable at the item-level of a test is presented. The definition conceptually equates test bias with measuring different things in different groups, and operationally equates test bias with a difference in item fit to the Rasch Model, greater than one, between groups. It is suggested that the proposed definition avoids…
2013-01-01
Background: Despite the widespread use of multiple-choice assessments in medical education assessment, current practice and published advice concerning the number of response options remains equivocal. This article describes an empirical study contrasting the quality of three 60 item multiple-choice test forms within the Royal Australian and New Zealand College of Obstetricians and Gynaecologists (RANZCOG) Fetal Surveillance Education Program (FSEP). The three forms are described below. Methods: The first form featured four response options per item. The second form featured three response options, having removed the least functioning option from each item in the four-option counterpart. The third test form was constructed by retaining the best performing version of each item from the first two test forms. It contained both three and four option items. Results: Psychometric and educational factors were taken into account in formulating an approach to test construction for the FSEP. The four-option test performed better than the three-option test overall, but some items were improved by the removal of options. The mixed-option test demonstrated better measurement properties than the fixed-option tests, and has become the preferred test format in the FSEP program. The criteria used were reliability, errors of measurement and fit to the item response model. Conclusions: The position taken is that decisions about the number of response options be made at the item level, with plausible options being added to complete each item on both psychometric and educational grounds rather than complying with a uniform policy. The point is to construct the better performing item in providing the best psychometric and educational information. PMID:23453056
Zoanetti, Nathan; Beaves, Mark; Griffin, Patrick; Wallace, Euan M
2013-03-04
Despite the widespread use of multiple-choice assessments in medical education assessment, current practice and published advice concerning the number of response options remains equivocal. This article describes an empirical study contrasting the quality of three 60 item multiple-choice test forms within the Royal Australian and New Zealand College of Obstetricians and Gynaecologists (RANZCOG) Fetal Surveillance Education Program (FSEP). The three forms are described below. The first form featured four response options per item. The second form featured three response options, having removed the least functioning option from each item in the four-option counterpart. The third test form was constructed by retaining the best performing version of each item from the first two test forms. It contained both three and four option items. Psychometric and educational factors were taken into account in formulating an approach to test construction for the FSEP. The four-option test performed better than the three-option test overall, but some items were improved by the removal of options. The mixed-option test demonstrated better measurement properties than the fixed-option tests, and has become the preferred test format in the FSEP program. The criteria used were reliability, errors of measurement and fit to the item response model. The position taken is that decisions about the number of response options be made at the item level, with plausible options being added to complete each item on both psychometric and educational grounds rather than complying with a uniform policy. The point is to construct the better performing item in providing the best psychometric and educational information.
Detecting Gender Bias Through Test Item Analysis
NASA Astrophysics Data System (ADS)
González-Espada, Wilson J.
2009-03-01
Many physical science and physics instructors might not be trained in pedagogically appropriate test construction methods. This could lead to test items that do not measure what they are intended to measure. A subgroup of these items might show bias against some groups of students. This paper describes how the author became aware of potentially biased items against females in his examinations, which led to the exploration of fundamental issues related to item validity, gender bias, and differential item functioning, or DIF. A brief discussion of DIF in the context of university courses, as well as practical suggestions to detect possible gender-biased items, follows.
Estimating Total-test Scores from Partial Scores in a Matrix Sampling Design.
ERIC Educational Resources Information Center
Sachar, Jane; Suppes, Patrick
It is sometimes desirable to obtain an estimated total-test score for an individual who was administered only a subset of the items in a total test. The present study compared six methods, two of which utilize the content structure of items, to estimate total-test scores using 450 students in grades 3-5 and 60 items of the 110-item Stanford Mental…
Differential item functioning analysis of the Vanderbilt Expertise Test for cars
Lee, Woo-Yeol; Cho, Sun-Joo; McGugin, Rankin W.; Van Gulick, Ana Beth; Gauthier, Isabel
2015-01-01
The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge. PMID:26418499
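For readers unfamiliar with the three-parameter logistic (3PL) model mentioned in this abstract, the item response function can be sketched as follows. This is a generic textbook form, not the authors' code; the parameter names `a`, `b`, and `c` (discrimination, difficulty, lower asymptote) are assumptions for illustration, and some software additionally multiplies the exponent by a scaling constant of 1.7, which is omitted here.

```python
import math


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination, b: item difficulty, c: lower asymptote (guessing)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
```

When ability equals difficulty (`theta == b`), the probability is halfway between `c` and 1, which is why `c` raises the floor of the curve rather than shifting it.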
ERIC Educational Resources Information Center
Gattamorta, Karina A.; Penfield, Randall D.; Myers, Nicholas D.
2012-01-01
Measurement invariance is a common consideration in the evaluation of the validity and fairness of test scores when the tested population contains distinct groups of examinees, such as examinees receiving different forms of a translated test. Measurement invariance in polytomous items has traditionally been evaluated at the item-level,…
Science Library of Test Items. Volume Two.
ERIC Educational Resources Information Center
New South Wales Dept. of Education, Sydney (Australia).
The second volume of test items in the Science Library of Test Items is intended as a resource to assist teachers in implementing and evaluating science courses in the first 4 years of Australian secondary school. The items were selected from questions submitted to the School Certificate Development Unit by teachers in New South Wales. Only the…
Measuring the Instructional Sensitivity of ESL Reading Comprehension Items.
ERIC Educational Resources Information Center
Brutten, Sheila R.; And Others
A study attempted to estimate the instructional sensitivity of items in three reading comprehension tests in English as a second language (ESL). Instructional sensitivity is a test-item construct defined as the tendency for a test item to vary in difficulty as a function of instruction. Similar tasks were given to readers at different proficiency…
Reducing the Impact of Inappropriate Items on Reviewable Computerized Adaptive Testing
ERIC Educational Resources Information Center
Yen, Yung-Chin; Ho, Rong-Guey; Liao, Wen-Wei; Chen, Li-Ju
2012-01-01
In a test, the score would be closer to the examinee's actual ability if careless mistakes were corrected. In CAT, however, changing the answer to one item might leave the following items no longer appropriate for estimating the examinee's ability. These inappropriate items in a reviewable CAT might in turn introduce bias in ability…
ERIC Educational Resources Information Center
Lau, C. Allen; Wang, Tianyou
The purposes of this study were to: (1) extend the sequential probability ratio testing (SPRT) procedure to polytomous item response theory (IRT) models in computerized classification testing (CCT); (2) compare polytomous items with dichotomous items using the SPRT procedure for their accuracy and efficiency; (3) study a direct approach in…
A Conditional Exposure Control Method for Multidimensional Adaptive Testing
ERIC Educational Resources Information Center
Finkelman, Matthew; Nering, Michael L.; Roussos, Louis A.
2009-01-01
In computerized adaptive testing (CAT), ensuring the security of test items is a crucial practical consideration. A common approach to reducing item theft is to define maximum item exposure rates, i.e., to limit the proportion of examinees to whom a given item can be administered. Numerous methods for controlling exposure rates have been proposed…
ERIC Educational Resources Information Center
Downing, Steven M.; Maatsch, Jack L.
To test the effect of clinically relevant multiple-choice item content on the validity of statistical discriminations of physicians' clinical competence, data were collected from a field test of the Emergency Medicine Examination, test items for the certification of specialists in emergency medicine. Two 91-item multiple-choice subscales were…
The Effect of Including or Excluding Students with Testing Accommodations on IRT Calibrations.
ERIC Educational Resources Information Center
Karkee, Thakur; Lewis, Dan M.; Barton, Karen; Haug, Carolyn
This study aimed to determine the degree to which the inclusion of accommodated students with disabilities in the calibration sample affects the characteristics of item parameters and the test results. Investigated were effects on test reliability, item fit to the applicable item response theory (IRT) model, item parameter estimates, and students'…
Online Calibration of Polytomous Items Under the Generalized Partial Credit Model
Zheng, Yi
2016-01-01
Online calibration is a technology-enhanced architecture for item calibration in computerized adaptive tests (CATs). Many CATs are administered continuously over a long term and rely on large item banks. To ensure test validity, these item banks need to be frequently replenished with new items, and these new items need to be pretested before being used operationally. Online calibration dynamically embeds pretest items in operational tests and calibrates their parameters as response data are gradually obtained through the continuous test administration. This study extends existing formulas, procedures, and algorithms for dichotomous item response theory models to the generalized partial credit model, a popular model for items scored in more than two categories. A simulation study was conducted to investigate the developed algorithms and procedures under a variety of conditions, including two estimation algorithms, three pretest item selection methods, three seeding locations, two numbers of score categories, and three calibration sample sizes. Results demonstrated acceptable estimation accuracy of the two estimation algorithms in some of the simulated conditions. A variety of findings were also revealed for the interaction effects of the included factors, and recommendations were made accordingly. PMID:29881063
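The generalized partial credit model referred to in this abstract assigns a probability to each score category of a polytomous item. The sketch below shows the standard category-probability formula only; it does not reproduce the study's calibration algorithms, and the function name and the `steps` argument (a list of step parameters) are assumptions for illustration.

```python
import math


def gpcm_probs(theta, a, steps):
    """Category probabilities under the generalized partial credit model.

    theta: examinee ability
    a: item discrimination
    steps: step parameters b_1..b_m for an item scored 0..m
    Returns a list of m + 1 probabilities, one per score category.
    """
    # Cumulative sums of a * (theta - b_v); category 0 has an empty sum of 0.
    cumulative = [0.0]
    for b in steps:
        cumulative.append(cumulative[-1] + a * (theta - b))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(cumulative)
    weights = [math.exp(z - top) for z in cumulative]
    total = sum(weights)
    return [w / total for w in weights]
```

With a single step parameter the model reduces to the two-parameter logistic model, which is the sense in which procedures for dichotomous models can be extended to it.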
2004-05-25
KENNEDY SPACE CENTER, FLA. - United Space Alliance workers J.C. Harrison (far left) and Amy Mangiacapra guide a wrapped piece of Columbia debris through the Vehicle Assembly Building, where it is stored. Alongside is NASA’s Scott Thurston, who is the Columbia debris coordinator. This piece is one of eight being released to The Aerospace Corporation in El Segundo, Calif., for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite.
2004-05-25
KENNEDY SPACE CENTER, FLA. - In the Vehicle Assembly Building (VAB), Scott Thurston looks at pieces of Columbia debris being prepared for transfer to the shipping facility before their delivery to The Aerospace Corporation in El Segundo, Calif. Thurston is the Columbia debris coordinator. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia’s debris is stored in the VAB.
2004-05-25
KENNEDY SPACE CENTER, FLA. - With NASA’s Scott Thurston (left) alongside, United Space Alliance workers J.C. Harrison (in cap) and Amy Mangiacapra (right) begin moving a piece of Columbia debris being shipped to The Aerospace Corporation in El Segundo, Calif. Thurston is the Columbia debris coordinator. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia’s debris is stored in the VAB.
NASA Technical Reports Server (NTRS)
2004-01-01
KENNEDY SPACE CENTER, FLA. - In the Vehicle Assembly Building (VAB), Scott Thurston looks at pieces of Columbia debris being prepared for transfer to the shipping facility before their delivery to The Aerospace Corporation in El Segundo, Calif. Thurston is the Columbia debris coordinator. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia’s debris is stored in the VAB.
NASA Technical Reports Server (NTRS)
2004-01-01
KENNEDY SPACE CENTER, FLA. - With NASA’s Scott Thurston (left) alongside, United Space Alliance workers J.C. Harrison (in cap) and Amy Mangiacapra (right) begin moving a piece of Columbia debris being shipped to The Aerospace Corporation in El Segundo, Calif. Thurston is the Columbia debris coordinator. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia’s debris is stored in the VAB.
Evaluating Statistical Targets for Assembling Parallel Mixed-Format Test Forms
ERIC Educational Resources Information Center
Debeer, Dries; Ali, Usama S.; van Rijn, Peter W.
2017-01-01
Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…
Turkoz, Ibrahim; Fu, Dong-Jing; Bossie, Cynthia A; Alphs, Larry
2015-01-01
This analysis evaluates improvement in symptoms of depression in patients with schizoaffective disorder administered oral paliperidone extended-release by accounting for the magnitude of direct and indirect (changes in negative and positive symptoms and worsening of extrapyramidal symptoms) treatment effects on depressive symptoms. Data for this post hoc analysis were drawn from two six-week, randomized, placebo-controlled studies of paliperidone extended-release versus placebo in adult subjects with schizoaffective disorder (N=614; NCT00412373, NCT00397033). Subjects with baseline 17-item Hamilton Rating Scale for Depression scores of 16 or greater were included. Structural equation models (path analyses) were used to separate total effects into direct and indirect effects on depressive symptoms. Change from baseline in 17-item Hamilton Rating Scale for Depression score at the Week 6 end point was the dependent variable; changes in Positive and Negative Syndrome Scale positive and negative factors and Simpson-Angus Scale (to evaluate extrapyramidal symptoms) scores were independent variables. At baseline, 332 of 614 (54.1%) subjects had a 17-item Hamilton Rating Scale for Depression score of 16 or greater. Path analysis determined that up to 26.4 percent of the paliperidone extended-release versus placebo effect on depressive symptoms may be attributed to a direct treatment effect, and 45.8 percent and 28.4 percent were mediated indirectly through improvements on positive and negative symptoms, respectively. No effects were identified as mediated through extrapyramidal symptoms changes (-0.7%). Results of this analysis suggest that paliperidone's effect on depressive symptoms in subjects with schizoaffective disorder participating in two six-week, randomized, placebo-controlled studies is mediated through indirect effects (e.g., positive and negative symptom changes) and a direct treatment effect.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-09-04
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-67737; File No. SR-NYSEArca-2012-93] Self... proposed rule change as described in Items I, II, and III below, which Items have been prepared by the self... change from interested persons. \\1\\ 15 U.S.C.78s(b)(1). \\2\\ 15 U.S.C. 78a. \\3\\ 17 CFR 240.19b-4. I. Self...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-03-20
... SECURITIES AND EXCHANGE COMMISSION [Release No. 34-69141; File No. SR-Phlx-2013-29] Self... rule change as described in Items I, II and III below, which Items have been prepared by the self... change from interested persons. \\1\\ 15 U.S.C. 78s(b)(1). \\2\\ 15 U.S.C. 78a. \\3\\ 17 CFR 240.19b-4. I. Self...
The Role of Item Feedback in Self-Adapted Testing.
ERIC Educational Resources Information Center
Roos, Linda L.; And Others
1997-01-01
The importance of item feedback in self-adapted testing was studied by comparing feedback and no feedback conditions for computerized adaptive tests and self-adapted tests taken by 363 college students. Results indicate that item feedback is not necessary to realize score differences between self-adapted and computerized adaptive testing. (SLD)
Criterion-Referenced Test Items for Auto Body.
ERIC Educational Resources Information Center
Tannehill, Dana, Ed.
This test item bank on auto body repair contains criterion-referenced test questions based upon competencies found in the Missouri Auto Body Competency Profile. Some test items are keyed for multiple competencies. The tests cover the following 26 competency areas in the auto body curriculum: auto body careers; measuring and mixing; tools and…
Automated Test-Form Generation
ERIC Educational Resources Information Center
van der Linden, Wim J.; Diao, Qi
2011-01-01
In automated test assembly (ATA), the methodology of mixed-integer programming is used to select test items from an item bank to meet the specifications for a desired test form and optimize its measurement accuracy. The same methodology can be used to automate the formatting of the set of selected items into the actual test form. Three different…
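As a rough illustration of the assembly problem described in this abstract, the sketch below selects a fixed-length form that maximizes total item information subject to content-area minimums. It is not the authors' method: exhaustive enumeration stands in for a mixed-integer programming solver, and the bank, the information values, and the names BANK and assemble_form are hypothetical.

```python
from itertools import combinations

# Hypothetical item bank: (item id, content area, information at the target ability).
BANK = [
    ("i1", "algebra", 0.9),
    ("i2", "algebra", 0.7),
    ("i3", "algebra", 0.6),
    ("i4", "geometry", 0.5),
    ("i5", "geometry", 0.4),
    ("i6", "geometry", 0.3),
]


def assemble_form(bank, length, min_per_area):
    """Return the feasible form of the given length with maximal total information.

    Brute-force enumeration is an assumption made for readability; a real
    ATA system would pass the same objective and constraints to a MIP solver.
    """
    best_form, best_info = None, float("-inf")
    for form in combinations(bank, length):
        # Count how many selected items fall in each content area.
        counts = {}
        for _, area, _ in form:
            counts[area] = counts.get(area, 0) + 1
        # Skip forms that violate a content specification.
        if any(counts.get(area, 0) < need for area, need in min_per_area.items()):
            continue
        info = sum(item_info for _, _, item_info in form)
        if info > best_info:
            best_form, best_info = form, info
    return best_form, best_info


form, info = assemble_form(BANK, 4, {"algebra": 2, "geometry": 2})
selected_ids = [item_id for item_id, _, _ in form]
print(selected_ids, round(info, 2))
```

Enumeration is only workable for toy banks like this one; the number of candidate forms grows combinatorially with bank size, which is why solver-based formulations are used in practice.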
Generating news media interest in tobacco control; challenges in an advanced policy environment.
MacKenzie, Ross; Chapman, Simon
2012-08-01
To determine the efficacy of using media releases for tobacco control advocacy in Australia's advanced policy environment. Between February and August 2010, news releases that summarised either newly published but unpublicized research findings, or local developments in tobacco control, were sent to NSW media outlets. Reports arising from the releases were tracked using commercial services Media Monitors and Factiva, as well as Google and Google News. Other tobacco control related news items during the same period were also tracked and recorded. Twenty-one news releases generated 93 news items across all news media, with a quarter of these related to a story of porcine haemoglobin in cigarette filters. By comparison, 'live' policy issues (especially plain packaging and a significant tobacco tax increase) covered in this period attracted 1,033 news stories in the Australian media. Press releases describing recently published, but underpublicized research were issued in weeks where no major competing tobacco control news occurred. Results of this project indicate that in environments with advanced tobacco policy, media opportunities related to tobacco control advocacy are limited, as many objectives have been achieved. The media can still play a key advocacy role in such environments, and advocates need to be particularly vigilant for opportunities that do arise. The paper also highlights the increasingly important role of internet-based media, including opportunities presented by social media for tobacco control.
ERIC Educational Resources Information Center
Kouimanos, John, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…
Solving the measurement invariance anchor item problem in item response theory.
Meade, Adam W; Wright, Natalie A
2012-09-01
The efficacy of tests of differential item functioning (measurement invariance) has been well established. It is clear that when properly implemented, these tests can successfully identify differentially functioning (DF) items when they exist. However, an assumption of these analyses is that the metric for different groups is linked using anchor items that are invariant. In practice, however, it is impossible to be certain which items are DF and which are invariant. This problem of anchor items, or referent indicators, has long plagued invariance research, and a multitude of suggested approaches have been put forth. Unfortunately, the relative efficacy of these approaches has not been tested. This study compares 11 variations on 5 qualitatively different approaches from recent literature for selecting optimal anchor items. A large-scale simulation study indicates that for nearly all conditions, an easily implemented 2-stage procedure recently put forth by Lopez Rivas, Stark, and Chernyshenko (2009) provided optimal power while maintaining nominal Type I error. With this approach, appropriate anchor items can be easily and quickly located, resulting in more efficacious invariance tests. Recommendations for invariance testing are illustrated using a pedagogical example of employee responses to an organizational culture measure.
When Listening Is Better Than Reading: Performance Gains on Cardiac Auscultation Test Questions.
Short, Kathleen; Bucak, S Deniz; Rosenthal, Francine; Raymond, Mark R
2018-05-01
In 2007, the United States Medical Licensing Examination embedded multimedia simulations of heart sounds into multiple-choice questions. This study investigated changes in item difficulty as determined by examinee performance over time. The data reflect outcomes obtained following initial use of multimedia items from 2007 through 2012, after which an interface change occurred. A total of 233,157 examinees responded to 1,306 cardiology test items over the six-year period; 138 items included multimedia simulations of heart sounds, while 1,168 text-based items without multimedia served as controls. The authors compared changes in difficulty of multimedia items over time with changes in difficulty of text-based cardiology items over time. Further, they compared changes in item difficulty for both groups of items between graduates of Liaison Committee on Medical Education (LCME)-accredited and non-LCME-accredited (i.e., international) medical schools. Examinee performance on cardiology test items with multimedia heart sounds improved by 12.4% over the six-year period, while performance on text-based cardiology items improved by approximately 1.4%. These results were similar for graduates of LCME-accredited and non-LCME-accredited medical schools. Examinees' ability to interpret auscultation findings in test items that include multimedia presentations increased from 2007 to 2012.
Revisiting the role of recollection in item versus forced-choice recognition memory.
Cook, Gabriel I; Marsh, Richard L; Hicks, Jason L
2005-08-01
Many memory theorists have assumed that forced-choice recognition tests can rely more on familiarity, whereas item (yes-no) tests must rely more on recollection. In actuality, several studies have found no differences in the contributions of recollection and familiarity underlying the two different test formats. Using word frequency to manipulate stimulus characteristics, the present study demonstrated that the contributions of recollection to item versus forced-choice tests are variable. Low word frequency resulted in significantly more recollection in an item test than did a forced-choice procedure, but high word frequency produced the opposite result. These results clearly constrain any uniform claim about the degree to which recollection supports responding in item versus forced-choice tests.
A Comparison of Methods of Vertical Equating.
ERIC Educational Resources Information Center
Loyd, Brenda H.; Hoover, H. D.
Rasch model vertical equating procedures were applied to three mathematics computation tests for grades six, seven, and eight. Each level of the test was composed of 45 items in three sets of 15 items, arranged in such a way that tests for adjacent grades had two sets (30 items) in common, and the sixth and eighth grades had 15 items in common. In…
ERIC Educational Resources Information Center
Zebehazy, Kim T.; Zigmond, Naomi; Zimmerman, George J.
2012-01-01
Introduction: This study investigated differential item functioning (DIF) of test items on Pennsylvania's Alternate System of Assessment (PASA) for students with visual impairments and severe cognitive disabilities and what the reasons for the differences may be. Methods: The Wilcoxon signed ranks test was used to analyze differences in the scores…
Objective and Item Banking Computer Software and Its Use in Comprehensive Achievement Monitoring.
ERIC Educational Resources Information Center
Schriber, Peter E.; Gorth, William P.
The current emphasis on objectives and test item banks for constructing more effective tests is being augmented by increasingly sophisticated computer software. Items can be catalogued in numerous ways for retrieval. The items as well as instructional objectives can be stored and test forms can be selected and printed by the computer. It is also…
An Item-Driven Adaptive Design for Calibrating Pretest Items. Research Report. ETS RR-14-38
ERIC Educational Resources Information Center
Ali, Usama S.; Chang, Hua-Hua
2014-01-01
Adaptive testing is advantageous in that it provides more efficient ability estimates with fewer items than linear testing does. Item-driven adaptive pretesting may also offer similar advantages, and verification of such a hypothesis about item calibration was the main objective of this study. A suitability index (SI) was introduced to adaptively…
Fitting the Rasch Model to Account for Variation in Item Discrimination
ERIC Educational Resources Information Center
Weitzman, R. A.
2009-01-01
Building on the Kelley and Gulliksen versions of classical test theory, this article shows that a logistic model having only a single item parameter can account for varying item discrimination, as well as difficulty, by using item-test correlations to adjust incorrect-correct (0-1) item responses prior to an initial model fit. The fit occurs…
Weighted Maximum-a-Posteriori Estimation in Tests Composed of Dichotomous and Polytomous Items
ERIC Educational Resources Information Center
Sun, Shan-Shan; Tao, Jian; Chang, Hua-Hua; Shi, Ning-Zhong
2012-01-01
For mixed-type tests composed of dichotomous and polytomous items, polytomous items often yield more information than dichotomous items. To reflect the difference between the two types of items and to improve the precision of ability estimation, an adaptive weighted maximum-a-posteriori (WMAP) estimation is proposed. To evaluate the performance of…
ERIC Educational Resources Information Center
Sengul Avsar, Asiye; Tavsancil, Ezel
2017-01-01
This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three samples sizes (100, 250 and 500)--were generated by conducting 20…
Rasch Measurement and Item Banking: Theory and Practice.
ERIC Educational Resources Information Center
Nakamura, Yuji
The Rasch Model is a one-parameter item response theory model which states that the probability of a correct response on a test is a function of the difficulty of the item and the ability of the candidate. Item banking is useful for language testing. The Rasch Model provides estimates of item difficulties that are meaningful,…
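The relationship stated in this abstract can be sketched in a few lines. The function below uses the standard logistic form of the Rasch model, in which the probability of a correct response depends only on the difference between ability and difficulty; the function name and the example values are illustrative assumptions, not taken from the record.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch (one-parameter logistic) model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Ability equal to difficulty gives an even chance of success.
p_equal = rasch_probability(0.0, 0.0)
# An able candidate on an easy item, and a weaker candidate on a hard item.
p_easy = rasch_probability(1.0, -1.0)
p_hard = rasch_probability(-1.0, 1.0)
print(p_equal, p_easy, p_hard)
```

Because both ability and difficulty sit on the same logit scale, item difficulties estimated this way can be compared directly across items in a bank.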
Test Design Project: Studies in Test Bias. Annual Report.
ERIC Educational Resources Information Center
McArthur, David
Item bias in a multiple-choice test can be detected by appropriate analyses of the persons x items scoring matrix. This permits comparison of groups of examinees tested with the same instrument. The test may be biased if it is not measuring the same thing in comparable groups, if groups are responding to different aspects of the test items, or if…
ERIC Educational Resources Information Center
Truell, Allen D.; Zhao, Jensen J.; Alexander, Melody W.
2005-01-01
The purposes of this study were to determine if there is a significant difference in postsecondary business student scores and test completion time based on settable test item exposure control interface format, and to determine if there is a significant difference in student scores and test completion time based on settable test item exposure…
The development of a clinical outcomes survey research application: Assessment Center(SM)
Rothrock, Nan E.; Hanrahan, Rachel T.; Jansky, Liz J.; Harniss, Mark; Riley, William
2013-01-01
Introduction: The National Institutes of Health sponsored Patient-Reported Outcome Measurement Information System (PROMIS) aimed to create item banks and computerized adaptive tests (CATs) across multiple domains for individuals with a range of chronic diseases. Purpose: Web-based software was created to enable a researcher to create study-specific Websites that could administer PROMIS CATs and other instruments to research participants or clinical samples. This paper outlines the process used to develop a user-friendly, free, Web-based resource (Assessment Center(SM)) for storage, retrieval, organization, sharing, and administration of patient-reported outcomes (PRO) instruments. Methods: Joint Application Design (JAD) sessions were conducted with representatives from numerous institutions in order to supply a general wish list of features. Use Cases were then written to ensure that end user expectations matched programmer specifications. Program development included daily programmer “scrum” sessions, weekly Usability Acceptability Testing (UAT) and continuous Quality Assurance (QA) activities pre- and post-release. Results: Assessment Center includes features that promote instrument development including item histories, data management, and storage of statistical analysis results. Conclusions: This case study of software development highlights the collection and incorporation of user input throughout the development process. Potential future applications of Assessment Center in clinical research are discussed. PMID:20306332
Estimating Total-Test Scores from Partial Scores in a Matrix Sampling Design.
ERIC Educational Resources Information Center
Sachar, Jane; Suppes, Patrick
1980-01-01
The present study compared six methods, two of which utilize the content structure of items, to estimate total-test scores using 450 students and 60 items of the 110-item Stanford Mental Arithmetic Test. Three methods yielded fairly good estimates of the total-test score. (Author/RL)
ERIC Educational Resources Information Center
Penfield, Randall D.; Algina, James
2006-01-01
One approach to measuring unsigned differential test functioning is to estimate the variance of the differential item functioning (DIF) effect across the items of the test. This article proposes two estimators of the DIF effect variance for tests containing dichotomous and polytomous items. The proposed estimators are direct extensions of the…
Smolen, Tomasz; Chuderski, Adam
2015-01-01
Fluid intelligence (Gf) is a crucial cognitive ability that involves abstract reasoning in order to solve novel problems. Recent research demonstrated that Gf strongly depends on the individual effectiveness of working memory (WM). We investigated a popular claim that if the storage capacity underlay the WM-Gf correlation, then such a correlation should increase with an increasing number of items or rules (load) in a Gf-test. As often no such link is observed, on that basis the storage-capacity account is rejected, and alternative accounts of Gf (e.g., related to executive control or processing speed) are proposed. Using both analytical inference and numerical simulations, we demonstrated that the load-dependent change in correlation is primarily a function of the amount of floor/ceiling effect for particular items. Thus, the item-wise WM correlation of a Gf-test depends on its overall difficulty, and the difficulty distribution across its items. When the early test items yield huge ceiling, but the late items do not approach floor, that correlation will increase throughout the test. If the early items locate themselves between ceiling and floor, but the late items approach floor, the respective correlation will decrease. For a hallmark Gf-test, the Raven-test, whose items span from ceiling to floor, the quadratic relationship is expected, and it was shown empirically using a large sample and two types of WMC tasks. In consequence, no changes in correlation due to varying WM/Gf load, or lack of them, can yield an argument for or against any theory of WM/Gf. Moreover, as the mathematical properties of the correlation formula make it relatively immune to ceiling/floor effects for overall moderate correlations, only minor changes (if any) in the WM-Gf correlation should be expected for many psychological tests.
Item response theory analysis of the mechanics baseline test
NASA Astrophysics Data System (ADS)
Cardamone, Caroline N.; Abbott, Jonathan E.; Rayyan, Saif; Seaton, Daniel T.; Pawl, Andrew; Pritchard, David E.
2012-02-01
Item response theory is useful in both the development and evaluation of assessments and in computing standardized measures of student performance. In item response theory, individual parameters (difficulty, discrimination) for each item or question are fit by item response models. These parameters provide a means for evaluating a test and offer a better measure of student skill than a raw test score, because each skill calculation considers not only the number of questions answered correctly, but the individual properties of all questions answered. Here, we present the results from an analysis of the Mechanics Baseline Test given at MIT during 2005-2010. Using the item parameters, we identify questions on the Mechanics Baseline Test that are not effective in discriminating between MIT students of different abilities. We show that a limited subset of the highest quality questions on the Mechanics Baseline Test returns accurate measures of student skill. We compare student skills as determined by item response theory to the more traditional measurement of the raw score and show that a comparable measure of learning gain can be computed.
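To illustrate why an item-response-theory skill estimate can differ from a raw score, the sketch below fits ability by maximum likelihood under a two-parameter logistic model. The three items, their (discrimination, difficulty) values, and the grid-search estimator are assumptions chosen for illustration and are not taken from the Mechanics Baseline Test analysis.

```python
import math


def p_correct(theta, a, b):
    """Two-parameter logistic model: a is discrimination, b is difficulty."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, items, responses):
    """Log-likelihood of a 0/1 response pattern at ability theta."""
    total = 0.0
    for (a, b), u in zip(items, responses):
        p = p_correct(theta, a, b)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total


def estimate_ability(items, responses, lo=-4.0, hi=4.0, steps=801):
    """Grid-search maximum-likelihood ability estimate (a simple stand-in for Newton-type solvers)."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))


# Hypothetical (discrimination, difficulty) pairs for three items.
ITEMS = [(0.5, -1.0), (1.0, 0.0), (2.0, 1.0)]

# Both students answer two of three items correctly, but different ones.
theta_missed_best_item = estimate_ability(ITEMS, [1, 1, 0])
theta_missed_weak_item = estimate_ability(ITEMS, [0, 1, 1])
print(theta_missed_best_item, theta_missed_weak_item)
```

The two response patterns share a raw score of 2, yet the student who answered the highly discriminating item correctly receives the higher ability estimate, which is the sense in which the estimate reflects the properties of the questions answered.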
Computerized adaptive testing: the capitalization on chance problem.
Olea, Julio; Barrada, Juan Ramón; Abad, Francisco J; Ponsoda, Vicente; Cuevas, Lara
2012-03-01
This paper describes several simulation studies that examine the effects of capitalization on chance in the selection of items and the ability estimation in CAT, employing the 3-parameter logistic model. In order to generate different estimation errors for the item parameters, the calibration sample size was manipulated (N = 500, 1000 and 2000 subjects) as was the ratio of item bank size to test length (banks of 197 and 788 items, test lengths of 20 and 40 items), both in a CAT and in a random test. Results show that capitalization on chance is particularly serious in CAT, as revealed by the large positive bias found in the small sample calibration conditions. For broad ranges of theta, the overestimation of the precision (asymptotic Se) reaches levels of 40%, something that does not occur with the RMSE (theta). The problem is greater as the item bank size to test length ratio increases. Potential solutions were tested in a second study, where two exposure control methods were incorporated into the item selection algorithm. Some alternative solutions are discussed.
ERIC Educational Resources Information Center
Öztürk-Gübes, Nese; Kelecioglu, Hülya
2016-01-01
The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…
ERIC Educational Resources Information Center
Ali, Usama S.; Chang, Hua-Hua; Anderson, Carolyn J.
2015-01-01
Polytomous items are typically described by multiple category-related parameters; situations, however, arise in which a single index is needed to describe an item's location along a latent trait continuum. Situations in which a single index would be needed include item selection in computerized adaptive testing or test assembly. Therefore single…
Designing a Virtual Item Bank Based on the Techniques of Image Processing
ERIC Educational Resources Information Center
Liao, Wen-Wei; Ho, Rong-Guey
2011-01-01
One of the major weaknesses of the item exposure rates of figural items in Intelligence Quotient (IQ) tests lies in its inaccuracies. In this study, a new approach is proposed and a useful test tool known as the Virtual Item Bank (VIB) is introduced. The VIB combines Automatic Item Generation theory and image processing theory with the concepts of…
The Rasch Model and Missing Data, with an Emphasis on Tailoring Test Items.
ERIC Educational Resources Information Center
de Gruijter, Dato N. M.
Many applications of educational testing have a missing data aspect (MDA). This MDA is perhaps most pronounced in item banking, where each examinee responds to a different subtest of items from a large item pool and where both person and item parameter estimates are needed. The Rasch model is emphasized, and its non-parametric counterpart (the…
Bayesian Item Selection in Constrained Adaptive Testing Using Shadow Tests
ERIC Educational Resources Information Center
Veldkamp, Bernard P.
2010-01-01
Application of Bayesian item selection criteria in computerized adaptive testing might result in improvement of bias and MSE of the ability estimates. The question remains how to apply Bayesian item selection criteria in the context of constrained adaptive testing, where large numbers of specifications have to be taken into account in the item…
Mathematics Library of Test Items. Volume One.
ERIC Educational Resources Information Center
Fraser, Graham, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from previous tests are made available to teachers for the construction of pretests or posttests, reference tests for inter-class comparisons and general assignments. The collection was reviewed for content…
Are Learning Disabled Students "Test-Wise?": An Inquiry into Reading Comprehension Test Items.
ERIC Educational Resources Information Center
Scruggs, Thomas E.; Lifson, Steve
The ability to correctly answer reading comprehension test items, without having read the accompanying reading passage, was compared for third grade learning disabled students and their peers from a regular classroom. In the first experiment, fourteen multiple choice items were selected from the Stanford Achievement Test. No reading passages were…
Agriculture Library of Test Items.
ERIC Educational Resources Information Center
Sutherland, Duncan, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection is reviewed for content validity and reliability. The test…
ERIC Educational Resources Information Center
Bermundo, Cesar B.; Bermundo, Alex B.; Ballester, Rex C.
2012-01-01
iBank is a project that uses software to create an item bank that stores quality questions, generates tests, and prints exams. The items come from analyzed teacher-constructed test questions, which provide the basis for discussing test results, by determining why a test item is or is not discriminating between the better and poorer students, and by…
Effects of Test Item Disclosure on Medical Licensing Examination
ERIC Educational Resources Information Center
Yang, Eunbae B.; Lee, Myung Ae; Park, Yoon Soo
2018-01-01
In 2012, the National Health Personnel Licensing Examination Board of Korea decided to publicly disclose all test items and answers to satisfy the test takers' right to know and enhance the transparency of tests administered by the government. This study investigated the effects of item disclosure on the medical licensing examination (MLE),…
Controlling Item Exposure Conditional on Ability in Computerized Adaptive Testing.
ERIC Educational Resources Information Center
Stocking, Martha L.; Lewis, Charles
1998-01-01
Ensuring item and pool security in a continuous testing environment is explored through a new method of controlling exposure rate of items conditional on ability level in computerized testing. Properties of this conditional control on exposure rate, when used in conjunction with a particular adaptive testing algorithm, are explored using simulated…
Battalion Combat Operations Center (COC) Test. Volume II. Test Report,
1982-02-08
reveal, perhaps, that item X can perform a task faster than item Y. A utility assessment from an experienced, knowledgeable test participant, however...can ascertain whether or not item X can better enable him to accomplish his mission than item Y. 2.4 GENERALIZED TEST FACILITY. The capabilities of... [Garbled table omitted: Mixes A, C, and D rated from easy to very difficult on ability to control data and ability to exploit data.]
V-TECS Criterion-Referenced Test Item Bank for Radiologic Technology Occupations.
ERIC Educational Resources Information Center
Reneau, Fred; And Others
This Vocational-Technical Education Consortium of States (V-TECS) criterion-referenced test item bank provides 696 multiple-choice items and 33 matching items for radiologic technology occupations. These job titles are included: radiologic technologist, chief; radiologic technologist; nuclear medicine technologist; radiation therapy technologist;…
22 CFR 121.5 - Apparatus and devices under Category IV(c).
Code of Federal Regulations, 2010 CFR
2010-04-01
..., modified or configured for items listed in that category, bomb racks and shackles, bomb shackle release units, bomb ejectors, torpedo tubes, torpedo and guided missile boosters, guidance systems equipment and...
ERIC Educational Resources Information Center
Magno, Carlo
2009-01-01
The present report demonstrates the difference between the classical test theory (CTT) and item response theory (IRT) approaches using actual test data from junior high school chemistry students. The CTT and IRT were compared across two samples and two forms of test on their item difficulty, internal consistency, and measurement errors. The specific…
Modeling Local Item Dependence Due to Common Test Format with a Multidimensional Rasch Model
ERIC Educational Resources Information Center
Baghaei, Purya; Aryadoust, Vahid
2015-01-01
Research shows that test method can exert a significant impact on test takers' performance and thereby contaminate test scores. We argue that common test method can exert the same effect as common stimuli and violate the conditional independence assumption of item response theory models because, in general, subsets of items which have a shared…
Garcia, Sofia F.; Hahn, Elizabeth A.; Magasi, Susan; Lai, Jin-Shei; Semik, Patrick; Hammel, Joy; Heinemann, Allen W.
2014-01-01
Objective: To describe the development of new self-report measures of social attitudes that act as environmental facilitators or barriers to the participation of people with disabilities in society. Design: A mixed methods approach included a literature review; item classification, selection and writing; cognitive interviews and field testing with participants with spinal cord injury (SCI), traumatic brain injury (TBI) or stroke; and rating scale analysis to evaluate initial psychometric properties. Setting: General community. Participants: Nine individuals with SCI, TBI or stroke participated in cognitive interviews; 305 community residents with those same conditions participated in field testing. Interventions: None. Main Outcome Measure(s): Self-report item pool of social attitudes that act as facilitators or barriers to people with disabilities participating in society. Results: An interdisciplinary team of experts classified 710 existing social environment items into content areas and wrote 32 new items. Additional qualitative item review included item refinement and winnowing of the pool prior to cognitive interviews and field testing of 82 items. Field test data indicated that the pool satisfies a one-parameter item response theory measurement model and would be appropriate for development into a calibrated item bank. Conclusions: Our qualitative item review process supported a social environment conceptual framework that includes both social support and social attitudes. We developed a new social attitudes self-report item pool. Calibration testing of that pool is underway with a larger sample in order to develop a social attitudes item bank for persons with disabilities. PMID:25045803
Garcia, Sofia F; Hahn, Elizabeth A; Magasi, Susan; Lai, Jin-Shei; Semik, Patrick; Hammel, Joy; Heinemann, Allen W
2015-04-01
To describe the development of new self-report measures of social attitudes that act as environmental facilitators or barriers to the participation of people with disabilities in society. A mixed-methods approach included a literature review; item classification, selection, and writing; cognitive interviews and field testing of participants with spinal cord injury (SCI), traumatic brain injury (TBI), or stroke; and rating scale analysis to evaluate initial psychometric properties. General community. Individuals with SCI, TBI, or stroke participated in cognitive interviews (n=9); community residents with those same conditions participated in field testing (n=305). None. Self-report item pool of social attitudes that act as facilitators or barriers to people with disabilities participating in society. An interdisciplinary team of experts classified 710 existing social environment items into content areas and wrote 32 new items. Additional qualitative item review included item refinement and winnowing of the pool prior to cognitive interviews and field testing of 82 items. Field test data indicated that the pool satisfies a 1-parameter item response theory measurement model and would be appropriate for development into a calibrated item bank. Our qualitative item review process supported a social environment conceptual framework that includes both social support and social attitudes. We developed a new social attitudes self-report item pool. Calibration testing of that pool is underway with a larger sample to develop a social attitudes item bank for persons with disabilities. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Mitchell, Alex J; Smith, Adam B; Al-salihy, Zerak; Rahim, Twana A; Mahmud, Mahmud Q; Muhyaldin, Asma S
2011-10-01
We aimed to redefine the optimal self-report symptoms of depression suitable for creation of an item bank that could be used in computer adaptive testing or to develop a simplified screening tool for DSM-V. Four hundred subjects (200 patients with primary depression and 200 non-depressed subjects) living in Iraqi Kurdistan were interviewed. The Mini International Neuropsychiatric Interview (MINI) was used to define the presence of major depression (DSM-IV criteria). We examined symptoms of depression using four well-known scales delivered in Kurdish. The Partial Credit Model was applied to each instrument. Common-item equating was subsequently used to create an item bank, and differential item functioning (DIF) was explored for known subgroups. A symptom-level Rasch analysis reduced the original 45 items to 24 after the exclusion of 21 misfitting items. A further six items (CESD13 and CESD17, HADS-D4, HADS-D5 and HADS-D7, and CDSS3 and CDSS4) were removed due to misfit as the items were added together to form the item bank, and two items were subsequently removed following the DIF analysis by diagnosis (CESD20 and CDSS9, both of which were harder to endorse for women). Therefore the remaining optimal item bank consisted of 17 items and produced an area under the curve (AUC) of 0.987. Using a bank restricted to the optimal nine items revealed only minor loss of accuracy (AUC = 0.989, sensitivity 96%, specificity 95%). Finally, when restricted to only four items, accuracy was still high (AUC = 0.976; sensitivity 93%, specificity 96%). An item bank of 17 items may be useful in computer adaptive testing, and nine or even four items may be used to develop a simplified screening tool for DSM-V major depressive disorder (MDD). Further examination of this item bank should be conducted in different cultural settings.
International Space Station alpha remote manipulator system workstation controls test report
NASA Astrophysics Data System (ADS)
Ehrenstrom, William A.; Swaney, Colin; Forrester, Patrick
1994-05-01
Previous development testing for the space station remote manipulator system workstation controls determined the need for hardware controls for the emergency stop, brakes on/off, and some camera functions. This report documents the results of an evaluation to further determine control implementation requirements, requested by the Canadian Space Agency (CSA), to close outstanding review item discrepancies. This test was conducted at the Johnson Space Center's Space Station Mockup and Trainer Facility in Houston, Texas, with nine NASA astronauts and one CSA astronaut as operators. This test evaluated camera iris and focus, back-up drive, latching end effector release, and autosequence controls using several types of hardware and software implementations. Recommendations resulting from the testing included providing guarded hardware buttons to prevent accidental actuation, providing autosequence controls and back-up drive controls on a dedicated hardware control panel, and that 'latch on/latch off', or on-screen software, controls not be considered. Generally, the operators preferred hardware controls although other control implementations were acceptable. The results of this evaluation will be used along with further testing to define specific requirements for the workstation design.
International Space Station alpha remote manipulator system workstation controls test report
NASA Technical Reports Server (NTRS)
Ehrenstrom, William A.; Swaney, Colin; Forrester, Patrick
1994-01-01
Previous development testing for the space station remote manipulator system workstation controls determined the need for hardware controls for the emergency stop, brakes on/off, and some camera functions. This report documents the results of an evaluation to further determine control implementation requirements, requested by the Canadian Space Agency (CSA), to close outstanding review item discrepancies. This test was conducted at the Johnson Space Center's Space Station Mockup and Trainer Facility in Houston, Texas, with nine NASA astronauts and one CSA astronaut as operators. This test evaluated camera iris and focus, back-up drive, latching end effector release, and autosequence controls using several types of hardware and software implementations. Recommendations resulting from the testing included providing guarded hardware buttons to prevent accidental actuation, providing autosequence controls and back-up drive controls on a dedicated hardware control panel, and that 'latch on/latch off', or on-screen software, controls not be considered. Generally, the operators preferred hardware controls although other control implementations were acceptable. The results of this evaluation will be used along with further testing to define specific requirements for the workstation design.
Kalpakjian, Claire Z.; Tate, Denise G.; Kisala, Pamela A.; Tulsky, David S.
2015-01-01
Objective: To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Design: Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item response theory (IRT)-based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. Setting: We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. Participants: A total of 717 individuals with SCI completed the self-esteem items. Results: A unidimensional model was observed (CFI = 0.946; RMSEA = 0.087) and measurement precision was good (theta range between −2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. Conclusion: This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available. PMID:26010972
Kalpakjian, Claire Z; Tate, Denise G; Kisala, Pamela A; Tulsky, David S
2015-05-01
To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item response theory (IRT)-based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. A total of 717 individuals with SCI completed the self-esteem items. A unidimensional model was observed (CFI=0.946; RMSEA=0.087) and measurement precision was good (theta range between -2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.
Victorson, David; Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Weiland, Brian; Choi, Seung W
2015-05-01
To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Resilience item bank and short form. Using a mixed-methods design, we developed and tested a resilience item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item response theory-based analytic approaches, including tests of model fit and differential item functioning (DIF). We tested a 32-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Department of Veterans Affairs medical center. A total of 717 individuals with SCI completed the Resilience items. A unidimensional model was observed (CFI=0.968; RMSEA=0.074) and measurement precision was good (theta range between -3.1 and 0.9). Ten items were flagged for DIF; however, after examination of effect sizes we found this to be negligible with little practical impact on score estimates. The final calibrated item bank resulted in 21 retained items. This study indicates that the SCI-QOL Resilience item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.
Victorson, David; Tulsky, David S.; Kisala, Pamela A.; Kalpakjian, Claire Z.; Weiland, Brian; Choi, Seung W.
2015-01-01
Objective: To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Resilience item bank and short form. Design: Using a mixed-methods design, we developed and tested a resilience item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item response theory-based analytic approaches, including tests of model fit and differential item functioning (DIF). Setting: We tested a 32-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Department of Veterans Affairs medical center. Participants: A total of 717 individuals with SCI completed the Resilience items. Results: A unidimensional model was observed (CFI = 0.968; RMSEA = 0.074) and measurement precision was good (theta range between −3.1 and 0.9). Ten items were flagged for DIF; however, after examination of effect sizes we found this to be negligible with little practical impact on score estimates. The final calibrated item bank resulted in 21 retained items. Conclusion: This study indicates that the SCI-QOL Resilience item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available. PMID:26010971
Grundgeiger, Tobias
2014-04-01
Retrieving a subset of learned items can lead to the forgetting of related items. Such retrieval-induced forgetting (RIF) can be explained by the inhibition of irrelevant items in order to overcome retrieval competition when the target item is retrieved. According to the retrieval inhibition account, such retrieval competition is a necessary condition for RIF. However, research has indicated that noncompetitive retrieval practice can also cause RIF by strengthening cue-item associations. According to the strength-dependent competition account, the strengthened items interfere with the retrieval of weaker items, resulting in impaired recall of weaker items in the final memory test. The aim of this study was to replicate RIF caused by noncompetitive retrieval practice and to determine whether this forgetting is also observed in recognition tests. In the context of RIF, it has been assumed that recognition tests circumvent interference and, therefore, should not be sensitive to forgetting due to strength-dependent competition. However, this has not been empirically tested, and it has been suggested that participants may reinstate learned cues as retrieval aids during the final test. In the present experiments, competitive practice or noncompetitive practice was followed by either final cued-recall tests or recognition tests. In cued-recall tests, RIF was observed in both competitive and noncompetitive conditions. However, in recognition tests, RIF was observed only in the competitive condition and was absent in the noncompetitive condition. The result underscores the contribution of strength-dependent competition to RIF. However, recognition tests seem to be a reliable way of distinguishing between RIF due to retrieval inhibition or strength-dependent competition.
Adaptive Mental Testing: The State of the Art
1979-11-01
typically vary in their psychometric properties, particularly in their difficulty, the test designer must decide what configuration of these item...psychometric properties best suits the test’s purpose. There are two extreme rationales to guide that decision. One rationale is to choose items that are...development of item response theory (Rasch, 1960; Lord, 1952, 1970, 1974a; Birnbaum, 1968) that provided the needed invariance properties for item
ERIC Educational Resources Information Center
Pohl, Steffi; Gräfe, Linda; Rose, Norman
2014-01-01
Data from competence tests usually show a number of missing responses on test items due to both omitted and not-reached items. Different approaches for dealing with missing responses exist, and there are no clear guidelines on which of those to use. While classical approaches rely on an ignorable missing data mechanism, the most recently developed…
Procedures for Selecting Items for Computerized Adaptive Tests.
ERIC Educational Resources Information Center
Kingsbury, G. Gage; Zara, Anthony R.
1989-01-01
Several classical approaches and alternative approaches to item selection for computerized adaptive testing (CAT) are reviewed and compared. The study also describes procedures for constrained CAT that may be added to classical item selection approaches to allow them to be used for applied testing. (TJH)
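The abstract above does not say which selection rules were compared. As an illustration only, the following minimal Python sketch shows one widely used rule, maximum Fisher information, under the Rasch (one-parameter) model. The function names and the five-item pool are hypothetical and are not taken from the study.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta: P * (1 - P)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)

def select_next_item(theta, difficulties, administered):
    """Return the index of the unused item with maximum information at theta."""
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    if not candidates:
        raise ValueError("item pool exhausted")
    return max(candidates, key=lambda i: item_information(theta, difficulties[i]))

# Hypothetical five-item pool (difficulties in logits); item 3 was already given.
pool = [-2.0, -1.0, 0.0, 1.0, 2.0]
next_item = select_next_item(0.8, pool, administered={3})
```

Under the Rasch model, information peaks where item difficulty equals the current ability estimate, so this rule picks the unused item whose difficulty is closest to theta. Constrained CAT procedures of the kind the abstract mentions would add content-balancing or exposure limits on top of such a rule.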
Efforts Toward the Development of Unbiased Selection and Assessment Instruments.
ERIC Educational Resources Information Center
Rudner, Lawrence M.
Investigations into item bias provide an empirical basis for the identification and elimination of test items which appear to measure different traits across populations or cultural groups. The Psychometric rationales for six approaches to the identification of biased test items are reviewed: (1) Transformed item difficulties: within-group…
Effect of Differential Item Functioning on Test Equating
ERIC Educational Resources Information Center
Kabasakal, Kübra Atalay; Kelecioglu, Hülya
2015-01-01
This study examines the effect of differential item functioning (DIF) items on test equating through multilevel item response models (MIRMs) and traditional IRMs. The performances of three different equating models were investigated under 24 different simulation conditions, and the variables whose effects were examined included sample size, test…
Ramsay-Curve Differential Item Functioning
ERIC Educational Resources Information Center
Woods, Carol M.
2011-01-01
Differential item functioning (DIF) occurs when an item on a test, questionnaire, or interview has different measurement properties for one group of people versus another, irrespective of true group-mean differences on the constructs being measured. This article is focused on item response theory based likelihood ratio testing for DIF (IRT-LR or…
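For readers unfamiliar with likelihood ratio testing for DIF, the general form of the statistic can be written as below. This is the standard nested-model likelihood ratio test and not necessarily the exact formulation used in the article; the subscripts are labels chosen here for illustration.

```latex
G^2 = -2\left[\ln L_{\text{constrained}} - \ln L_{\text{free}}\right]
\;\overset{\text{approx.}}{\sim}\; \chi^2_{df}
```

Here the constrained model holds the studied item's parameters equal across groups, the free model lets them differ, and df is the number of parameters freed. A large value of G^2 relative to the chi-square reference distribution suggests that the item functions differently across groups.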
ERIC Educational Resources Information Center
Çikirikçi Demirtasli, Nükhet; Ulutas, Seher
2015-01-01
Problem Statement: Item bias occurs when individuals from different groups (different gender, cultural background, etc.) have different probabilities of responding correctly to a test item despite having the same skill levels. It is important that tests or items do not have bias in order to ensure the accuracy of decisions taken according to test…
ERIC Educational Resources Information Center
Egberink, Iris J. L.; Meijer, Rob R.; Tendeiro, Jorge N.
2015-01-01
A popular method to assess measurement invariance of a particular item is based on likelihood ratio tests with all other items as anchor items. The results of this method are often only reported in terms of statistical significance, and researchers proposed different methods to empirically select anchor items. It is unclear, however, how many…
ERIC Educational Resources Information Center
Masters, James S.
2010-01-01
With the need for larger and larger banks of items to support adaptive testing and to meet security concerns, large-scale item generation is a requirement for many certification and licensure programs. As part of the mass production of items, it is critical that the difficulty and the discrimination of the items be known without the need for…
Unilateral neglect: further validation of the baking tray task.
Appelros, Peter; Karlsson, Gunnel M; Thorwalls, Annika; Tham, Kerstin; Nydevik, Ingegerd
2004-11-01
The Baking Tray Task is a comprehensible, simple-to-perform test for use in assessing unilateral neglect. The aim of this study was to validate further its use with stroke patients. The Baking Tray Task was compared with 2 versions of the Behaviour Inattention Test and a test for personal neglect. A total of 270 patients were subjected to a 3-item version of the Behaviour Inattention Test and 40 patients were subjected to an 8-item version of the Behaviour Inattention Test, besides the Baking Tray Task and the personal neglect test. The Baking Tray Task was more sensitive than the 3-item Behaviour Inattention Test, but the 8-item Behaviour Inattention Test was more sensitive than the Baking Tray Task. The best combination of any 3 tests was Baking Tray Task, Reading an article, and Figure copying; the 2 last-mentioned being a part of the 8-item Behaviour Inattention Test. Multi-item tests detect more cases of neglect than do single tests. However, it is tiresome for the patient to undergo a larger test battery than necessary. It is also time-consuming for the staff. Behavioural tests seem more appropriate when assessing neglect. The Baking Tray Task seems to be one of the most sensitive single tests, but its sensitivity can be further enhanced when it is used in combination with other tests.
Adjusting for cross-cultural differences in computer-adaptive tests of quality of life.
Gibbons, C J; Skevington, S M
2018-04-01
Previous studies using the WHOQOL measures have demonstrated that the relationship between individual items and the underlying quality of life (QoL) construct may differ between cultures. If unaccounted for, these differing relationships can lead to measurement bias which, in turn, can undermine the reliability of results. We used item response theory (IRT) to assess differential item functioning (DIF) in WHOQOL data from diverse language versions collected in UK, Zimbabwe, Russia, and India (total N = 1332). Data were fitted to the partial credit 'Rasch' model. We used four item banks previously derived from the WHOQOL-100 measure, which provided excellent measurement for physical, psychological, social, and environmental quality of life domains (40 items overall). Cross-cultural differential item functioning was assessed using analysis of variance for item residuals and post hoc Tukey tests. Simulated computer-adaptive tests (CATs) were conducted to assess the efficiency and precision of the four item banks. Splitting item parameters by DIF resulted in four linked item banks without DIF or other breaches of IRT model assumptions. Simulated CATs were more precise and efficient than longer paper-based alternatives. Assessing differential item functioning using item response theory can identify a lack of measurement invariance between cultures which, if uncontrolled, may undermine accurate comparisons in computer-adaptive testing assessments of QoL. We demonstrate how compensating for DIF using item anchoring allowed data from all four countries to be compared on a common metric, thus facilitating assessments which were both sensitive to cultural nuance and comparable between countries.
Item analysis of three Spanish naming tests: a cross-cultural investigation.
Marquez de la Plata, Carlos; Arango-Lasprilla, Juan Carlos; Alegret, Montse; Moreno, Alexander; Tárraga, Luis; Lara, Mar; Hewlitt, Margaret; Hynan, Linda; Cullum, C Munro
2009-01-01
Neuropsychological evaluations conducted in the United States and abroad commonly include the use of tests translated from English to Spanish. The use of translated naming tests for evaluating predominately Spanish-speakers has recently been challenged on the grounds that translating test items may compromise a test's construct validity. The Texas Spanish Naming Test (TNT) has been developed in Spanish specifically for use with Spanish-speakers; however, it is unlikely patients from diverse Spanish-speaking geographical regions will perform uniformly on a naming test. The present study evaluated and compared the internal consistency and patterns of item-difficulty and -discrimination for the TNT and two commonly used translated naming tests in three countries (i.e., United States, Colombia, Spain). Two hundred fifty two subjects (136 demented, 116 nondemented) across three countries were administered the TNT, Modified Boston Naming Test-Spanish, and the naming subtest from the CERAD. The TNT demonstrated superior internal consistency to its counterparts, a superior item difficulty pattern than the CERAD naming test, and a superior item discrimination pattern than the MBNT-S across countries. Overall, all three Spanish naming tests differentiated nondemented and moderately demented individuals, but the results suggest the items of the TNT are most appropriate to use with Spanish-speakers. Preliminary normative data for the three tests examined in each country are provided.
Testing enhances both encoding and retrieval for both tested and untested items.
Cho, Kit W; Neely, James H; Crocco, Stephanie; Vitrano, Deana
2017-07-01
In forward testing effects, taking a test enhances memory for subsequently studied material. These effects have been observed for previously studied and tested items, a potentially item-specific testing effect, and newly studied untested items, a purely generalized testing effect. We directly compared item-specific and generalized forward testing effects using procedures to separate testing benefits due to encoding versus retrieval. Participants studied two lists of Swahili-English word pairs, with the second study list containing "new" pairs intermixed with the previously studied "old" pairs. Participants completed a review phase in which they took a cued-recall test on only the "old" pairs or restudied them. In Experiments 1a, 1b, and 2, the review phase was given either before or after the second study list. Testing benefited memory to the same degree for both "new" and "old" pairs, suggesting that there were no pair-specific benefits of testing. The larger benefit from testing when review was given before rather than after the second study list suggests that the memory enhancement was due to both testing-enhanced encoding and testing-enhanced retrieval. To better equate generalized testing effects for "new" and "old" pairs, Experiment 3 intermixed them in the review phase. A statistically significant pair-specific testing effect for "old" items was now observed. Overall, these results show that forward testing effects are due to both testing-enhanced encoding and retrieval effects and that direct, pair-specific forward testing benefits are considerably smaller than indirect, generalized forward testing benefits.
Grossberg, George T; Manes, Facundo; Allegri, Ricardo F; Gutiérrez-Robledo, Luis Miguel; Gloger, Sergio; Xie, Lei; Jia, X Daniel; Pejović, Vojislav; Miller, Michael L; Perhach, James L; Graham, Stephen M
2013-06-01
Immediate-release memantine (10 mg, twice daily) is approved in the USA for moderate-to-severe Alzheimer's disease (AD). This study evaluated the efficacy, safety, and tolerability of a higher-dose, once-daily, extended-release formulation in patients with moderate-to-severe AD concurrently taking cholinesterase inhibitors. In this 24-week, double-blind, multinational study (NCT00322153), outpatients with AD (Mini-Mental State Examination scores of 3-14) were randomized to receive once-daily, 28-mg, extended-release memantine or placebo. Co-primary efficacy parameters were the baseline-to-endpoint score change on the Severe Impairment Battery (SIB) and the endpoint score on the Clinician's Interview-Based Impression of Change Plus Caregiver Input (CIBIC-Plus). The secondary efficacy parameter was the baseline-to-endpoint score change on the 19-item Alzheimer's Disease Cooperative Study-Activities of Daily Living (ADCS-ADL19); additional parameters included the baseline-to-endpoint score changes on the Neuropsychiatric Inventory (NPI) and verbal fluency test. Data were analyzed using a two-way analysis of covariance model, except for CIBIC-Plus (Cochran-Mantel-Haenszel test). Safety and tolerability were assessed through adverse events and physical and laboratory examinations. A total of 677 patients were randomized to receive extended-release memantine (n = 342) or placebo (n = 335); completion rates were 79.8 and 81.2 %, respectively. At endpoint (week 24, last observation carried forward), memantine-treated patients significantly outperformed placebo-treated patients on the SIB (least squares mean difference [95 % CI] 2.6 [1.0, 4.2]; p = 0.001), CIBIC-Plus (p = 0.008), NPI (p = 0.005), and verbal fluency test (p = 0.004); the effect did not achieve significance on ADCS-ADL19 (p = 0.177). Adverse events with a frequency of ≥5.0 % that were more prevalent in the memantine group were headache (5.6 vs. 5.1 %) and diarrhea (5.0 vs. 3.9 %). 
Extended-release memantine was efficacious, safe, and well tolerated in this population.
The Influence of Item Calibration Error on Variable-Length Computerized Adaptive Testing
ERIC Educational Resources Information Center
Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi
2013-01-01
Variable-length computerized adaptive testing (VL-CAT) allows both items and test length to be "tailored" to examinees, thereby achieving the measurement goal (e.g., scoring precision or classification) with as few items as possible. Several popular test termination rules depend on the standard error of the ability estimate, which in turn depends…
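A standard-error-based termination rule of the kind mentioned above can be sketched as follows. This is a minimal Python illustration under the Rasch model that treats item difficulties as known constants; the function names, the SE target, and the item cap are all hypothetical and do not come from the study.

```python
import math

def rasch_information(theta, b):
    """Fisher information of a Rasch item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

def standard_error(theta, difficulties):
    """Asymptotic SE of the ability estimate: 1 / sqrt(total test information)."""
    info = sum(rasch_information(theta, b) for b in difficulties)
    return math.inf if info == 0 else 1.0 / math.sqrt(info)

def should_stop(theta, difficulties, se_target=0.51, max_items=30):
    """Stop once the SE target is reached or the item cap is hit."""
    if len(difficulties) >= max_items:
        return True
    return standard_error(theta, difficulties) <= se_target

# Four perfectly targeted items give total information 4 * 0.25 = 1, so SE = 1.
se_after_four = standard_error(0.0, [0.0] * 4)
```

Because the SE is computed from the item parameters, any error in those parameters feeds into the stopping decision, and that dependence is what makes calibration error relevant to variable-length tests.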
A Paradox in the Study of the Benefits of Test-Item Review
ERIC Educational Resources Information Center
van der Linden, Wim J.; Jeon, Minjeong; Ferrara, Steve
2011-01-01
According to a popular belief, test takers should trust their initial instinct and retain their initial responses when they have the opportunity to review test items. More than 80 years of empirical research on item review, however, has contradicted this belief and shown minor but consistently positive score gains for test takers who changed…
Geography Library of Test Items. Volume Four.
ERIC Educational Resources Information Center
Kouimanos, John, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…
Home Science Library of Test Items. Volume One.
ERIC Educational Resources Information Center
Smith, Jan, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection is reviewed for content validity and reliability. The test…
Languages Library of Test Items. Volume Two: German, Latin.
ERIC Educational Resources Information Center
Campbell, Thomas; And Others
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…
Languages Library of Test Items. Volume One: French, Indonesian.
ERIC Educational Resources Information Center
Campbell, Thomas; And Others
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…
Geography Library of Test Items. Volume Three.
ERIC Educational Resources Information Center
Kouimanos, John, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…
Commerce Library of Test Items. Volume One.
ERIC Educational Resources Information Center
Meeve, Brian, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…
Geography Library of Test Items. Volume Five.
ERIC Educational Resources Information Center
Kouimanos, John, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…
Textiles and Design Library of Test Items. Volume I.
ERIC Educational Resources Information Center
Smith, Jan, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection is reviewed for content validity and reliability. The test…
Commerce Library of Test Items. Volume Two.
ERIC Educational Resources Information Center
Meeve, Brian, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…
Geography Library of Test Items. Volume Six.
ERIC Educational Resources Information Center
Kouimanos, John, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…
Geography: Library of Test Items. Volume II.
ERIC Educational Resources Information Center
Kouimanos, John, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…
Sex Differences in the Tendency to Omit Items on Multiple-Choice Tests: 1980-2000
ERIC Educational Resources Information Center
von Schrader, Sarah; Ansley, Timothy
2006-01-01
Much has been written concerning the potential group differences in responding to multiple-choice achievement test items. This discussion has included references to possible disparities in tendency to omit such test items. When test scores are used for high-stakes decision making, even small differences in scores and rankings that arise from male…
A Person Fit Test for IRT Models for Polytomous Items
ERIC Educational Resources Information Center
Glas, C. A. W.; Dagohoy, Anna Villa T.
2007-01-01
A person fit test based on the Lagrange multiplier test is presented for three item response theory models for polytomous items: the generalized partial credit model, the sequential model, and the graded response model. The test can also be used in the framework of multidimensional ability parameters. It is shown that the Lagrange multiplier…
How Big Is Big Enough? Sample Size Requirements for CAST Item Parameter Estimation
ERIC Educational Resources Information Center
Chuah, Siang Chee; Drasgow, Fritz; Luecht, Richard
2006-01-01
Adaptive tests offer the advantages of reduced test length and increased accuracy in ability estimation. However, adaptive tests require large pools of precalibrated items. This study looks at the development of an item pool for 1 type of adaptive administration: the computer-adaptive sequential test. An important issue is the sample size required…
An Explanatory Item Response Theory Approach for a Computer-Based Case Simulation Test
ERIC Educational Resources Information Center
Kahraman, Nilüfer
2014-01-01
Problem: Practitioners working with multiple-choice tests have long utilized Item Response Theory (IRT) models to evaluate the performance of test items for quality assurance. The use of similar applications for performance tests, however, is often encumbered due to the challenges encountered in working with complicated data sets in which local…
Geography Library of Test Items. Volume One.
ERIC Educational Resources Information Center
Kouimanos, John, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…
Electronic Quality of Life Assessment Using Computer-Adaptive Testing
2016-01-01
Background: Quality of life (QoL) questionnaires are desirable for clinical practice but can be time-consuming to administer and interpret, making their widespread adoption difficult. Objective: Our aim was to assess the performance of the World Health Organization Quality of Life (WHOQOL)-100 questionnaire as four item banks to facilitate adaptive testing using simulated computer adaptive tests (CATs) for physical, psychological, social, and environmental QoL. Methods: We used data from the UK WHOQOL-100 questionnaire (N=320) to calibrate item banks using item response theory, which included psychometric assessments of differential item functioning, local dependency, unidimensionality, and reliability. We simulated CATs to assess the number of items administered before prespecified levels of reliability were met. Results: The item banks (40 items) all displayed good model fit (P>.01) and were unidimensional (fewer than 5% of t tests significant), reliable (Person Separation Index>.70), and free from differential item functioning (no significant analysis of variance interaction) or local dependency (residual correlations < +.20). When matched for reliability, the item banks were between 45% and 75% shorter than paper-based WHOQOL measures. Across the four domains, a high standard of reliability (alpha>.90) could be gained with a median of 9 items. Conclusions: Using CAT, simulated assessments were as reliable as paper-based forms of the WHOQOL with a fraction of the number of items. These properties suggest that these item banks are suitable for computerized adaptive assessment. These item banks have the potential for international development using existing alternative language versions of the WHOQOL items. PMID:27694100
Bernhardt, Jay M; Stellefson, Michael; Weiler, Robert M; Anderson-Lewis, Charkarra; Miller, M David; MacInnes, Jann
2015-01-01
Background: Social media can promote healthy behaviors by facilitating engagement and collaboration among health professionals and the public. Thus, social media is quickly becoming a vital tool for health promotion. While guidelines and trainings exist for public health professionals, there are currently no standardized measures to assess individual social media competency among Certified Health Education Specialists (CHES) and Master Certified Health Education Specialists (MCHES). Objective: The aim of this study was to design, develop, and test the Social Media Competency Inventory (SMCI) for CHES and MCHES. Methods: The SMCI was designed in three sequential phases: (1) Conceptualization and Domain Specifications, (2) Item Development, and (3) Inventory Testing and Finalization. Phase 1 consisted of a literature review, concept operationalization, and expert reviews. Phase 2 involved an expert panel (n=4) review, think-aloud sessions with a small representative sample of CHES/MCHES (n=10), a pilot test (n=36), and classical test theory analyses to develop the initial version of the SMCI. Phase 3 included a field test of the SMCI with a random sample of CHES and MCHES (n=353), factor and Rasch analyses, and development of SMCI administration and interpretation guidelines. Results: Six constructs adapted from the unified theory of acceptance and use of technology and the integrated behavioral model were identified for assessing social media competency: (1) Social Media Self-Efficacy, (2) Social Media Experience, (3) Effort Expectancy, (4) Performance Expectancy, (5) Facilitating Conditions, and (6) Social Influence. The initial item pool included 148 items. After the pilot test, 16 items were removed or revised because of low item discrimination (r<.30), high interitem correlations (ρ>.90), or based on feedback received from pilot participants. During the psychometric analysis of the field test data, 52 items were removed due to low discrimination, evidence of content redundancy, low R-squared value, or poor item infit or outfit. Psychometric analyses of the data revealed acceptable reliability evidence for the following scales: Social Media Self-Efficacy (alpha=.98, item reliability=.98, item separation=6.76), Social Media Experience (alpha=.98, item reliability=.98, item separation=6.24), Effort Expectancy (alpha=.74, item reliability=.95, item separation=4.15), Performance Expectancy (alpha=.81, item reliability=.99, item separation=10.09), Facilitating Conditions (alpha=.66, item reliability=.99, item separation=16.04), and Social Influence (alpha=.66, item reliability=.93, item separation=3.77). There was some evidence of local dependence among the scales, with several observed residual correlations above |.20|. Conclusions: Through the multistage instrument-development process, sufficient reliability and validity evidence was collected in support of the purpose and intended use of the SMCI. The SMCI can be used to assess the readiness of health education specialists to effectively use social media for health promotion research and practice. Future research should explore associations across constructs within the SMCI and evaluate the ability of SMCI scores to predict social media use and performance among CHES and MCHES. PMID:26399428
Alber, Julia M; Bernhardt, Jay M; Stellefson, Michael; Weiler, Robert M; Anderson-Lewis, Charkarra; Miller, M David; MacInnes, Jann
2015-09-23
Social media can promote healthy behaviors by facilitating engagement and collaboration among health professionals and the public. Thus, social media is quickly becoming a vital tool for health promotion. While guidelines and trainings exist for public health professionals, there are currently no standardized measures to assess individual social media competency among Certified Health Education Specialists (CHES) and Master Certified Health Education Specialists (MCHES). The aim of this study was to design, develop, and test the Social Media Competency Inventory (SMCI) for CHES and MCHES. The SMCI was designed in three sequential phases: (1) Conceptualization and Domain Specifications, (2) Item Development, and (3) Inventory Testing and Finalization. Phase 1 consisted of a literature review, concept operationalization, and expert reviews. Phase 2 involved an expert panel (n=4) review, think-aloud sessions with a small representative sample of CHES/MCHES (n=10), a pilot test (n=36), and classical test theory analyses to develop the initial version of the SMCI. Phase 3 included a field test of the SMCI with a random sample of CHES and MCHES (n=353), factor and Rasch analyses, and development of SMCI administration and interpretation guidelines. Six constructs adapted from the unified theory of acceptance and use of technology and the integrated behavioral model were identified for assessing social media competency: (1) Social Media Self-Efficacy, (2) Social Media Experience, (3) Effort Expectancy, (4) Performance Expectancy, (5) Facilitating Conditions, and (6) Social Influence. The initial item pool included 148 items. After the pilot test, 16 items were removed or revised because of low item discrimination (r<.30), high interitem correlations (ρ>.90), or based on feedback received from pilot participants. During the psychometric analysis of the field test data, 52 items were removed due to low discrimination, evidence of content redundancy, low R-squared value, or poor item infit or outfit. Psychometric analyses of the data revealed acceptable reliability evidence for the following scales: Social Media Self-Efficacy (alpha=.98, item reliability=.98, item separation=6.76), Social Media Experience (alpha=.98, item reliability=.98, item separation=6.24), Effort Expectancy (alpha=.74, item reliability=.95, item separation=4.15), Performance Expectancy (alpha=.81, item reliability=.99, item separation=10.09), Facilitating Conditions (alpha=.66, item reliability=.99, item separation=16.04), and Social Influence (alpha=.66, item reliability=.93, item separation=3.77). There was some evidence of local dependence among the scales, with several observed residual correlations above |.20|. Through the multistage instrument-development process, sufficient reliability and validity evidence was collected in support of the purpose and intended use of the SMCI. The SMCI can be used to assess the readiness of health education specialists to effectively use social media for health promotion research and practice. Future research should explore associations across constructs within the SMCI and evaluate the ability of SMCI scores to predict social media use and performance among CHES and MCHES.
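The alpha values reported for each scale above are Cronbach's alpha coefficients. A minimal sketch of how such a coefficient is commonly computed from an item-score matrix is given below; the function name `cronbach_alpha` and the toy scores are illustrative assumptions and are not data from the SMCI study.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance of total scores)
    """
    k = len(scores[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)

# Hypothetical responses: 4 respondents, 3 items (illustration only).
toy_scores = [[1, 2, 1], [2, 3, 2], [3, 3, 4], [4, 5, 4]]
alpha = cronbach_alpha(toy_scores)
```

Population variances are used for both numerator and denominator; using sample variances throughout would give the same ratio.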
ERIC Educational Resources Information Center
Lee, Guemin; Park, In-Yong
2012-01-01
Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…
Harrison, Peter M C; Collins, Tom; Müllensiefen, Daniel
2017-06-15
Modern psychometric theory provides many useful tools for ability testing, such as item response theory, computerised adaptive testing, and automatic item generation. However, these techniques have yet to be integrated into mainstream psychological practice. This is unfortunate, because modern psychometric techniques can bring many benefits, including sophisticated reliability measures, improved construct validity, avoidance of exposure effects, and improved efficiency. In the present research we therefore use these techniques to develop a new test of a well-studied psychological capacity: melodic discrimination, the ability to detect differences between melodies. We calibrate and validate this test in a series of studies. Studies 1 and 2 respectively calibrate and validate an initial test version, while Studies 3 and 4 calibrate and validate an updated test version incorporating additional easy items. The results support the new test's viability, with evidence for strong reliability and construct validity. We discuss how these modern psychometric techniques may also be profitably applied to other areas of music psychology and psychological science in general.
Application of Item Response Theory to Tests of Substance-related Associative Memory
Shono, Yusuke; Grenard, Jerry L.; Ames, Susan L.; Stacy, Alan W.
2015-01-01
A substance-related word association test (WAT) is one of the commonly used indirect tests of substance-related implicit associative memory and has been shown to predict substance use. This study applied an item response theory (IRT) modeling approach to evaluate psychometric properties of the alcohol- and marijuana-related WATs and their items among 775 ethnically diverse at-risk adolescents. After examining the IRT assumptions, item fit, and differential item functioning (DIF) across gender and age groups, the original 18 WAT items were reduced to 14 and 15 items in the alcohol- and marijuana-related WATs, respectively. Thereafter, unidimensional one- and two-parameter logistic models (1PL and 2PL models) were fitted to the revised WAT items. The results demonstrated that both alcohol- and marijuana-related WATs have good psychometric properties. These results were discussed in light of the framework of a unified concept of construct validity (Messick, 1975, 1989, 1995). PMID:25134051
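As a rough illustration of the 1PL and 2PL models named in the abstract above, the sketch below computes the probability of a keyed response under a two-parameter logistic model. The function name `irf_2pl` and the parameter values are illustrative assumptions, not estimates from the study; the 1PL form is obtained by fixing the discrimination at 1.

```python
import math

def irf_2pl(theta, a, b):
    """Probability of a keyed response under the 2PL model.

    theta: latent trait level; a: item discrimination; b: item difficulty.
    Setting a = 1.0 gives the 1PL (Rasch-type) form.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item parameters, chosen only for illustration.
p_1pl = irf_2pl(theta=0.5, a=1.0, b=0.5)  # trait level equals difficulty
p_2pl = irf_2pl(theta=1.0, a=1.7, b=0.0)  # more discriminating item
```

When the trait level equals the item difficulty, the probability is one half regardless of the discrimination value.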
Bäuml, Karl-Heinz T; Holterman, Christoph; Abel, Magdalena
2014-11-01
The testing effect refers to the finding that retrieval practice in comparison to restudy of previously encoded contents can improve memory performance and reduce time-dependent forgetting. Naturally, long retention intervals include both wake and sleep delay, which can influence memory contents differently. In fact, sleep immediately after encoding can induce a mnemonic benefit, stabilizing and strengthening the encoded contents. We investigated in a series of 5 experiments whether sleep influences the testing effect. After initial study of categorized item material (Experiments 1, 2, and 4A), paired associates (Experiment 3), or educational text material (Experiment 4B), subjects were asked to restudy encoded contents or engage in active retrieval practice. A final recall test was conducted after a 12-hr delay that included diurnal wakefulness or nocturnal sleep. The results consistently showed typical testing effects after the wake delay. However, these testing effects were reduced or even eliminated after sleep, because sleep benefited recall of restudied items but left recall of retrieved items unaffected. The findings are consistent with the bifurcation model of the testing effect (Kornell, Bjork, & Garcia, 2011), according to which the distribution of memory strengths across items is shifted differentially by retrieving and restudying, with retrieval strengthening items to a much higher degree than restudy does. On the basis of this model, most of the retrieved items already fall above recall threshold in the absence of sleep, so additional sleep-induced strengthening may not improve recall of retrieved items any further. PsycINFO Database Record (c) 2014 APA, all rights reserved.
ERIC Educational Resources Information Center
van der Linden, Wim J.; Scrams, David J.; Schnipke, Deborah L.
This paper proposes an item selection algorithm that can be used to neutralize the effect of time limits in computer adaptive testing. The method is based on a statistical model for the response-time distributions of the test takers on the items in the pool that is updated each time a new item has been administered. Predictions from the model are…
Do Your Students Measure Up Metrically?
ERIC Educational Resources Information Center
Taylor, P. Mark; Simms, Ken; Kim, Ok-Kyeong; Reys, Robert E.
2001-01-01
Examines released metric items from the Third International Mathematics and Science Study (TIMSS) and the 3rd and 4th grade results. Recommends refocusing instruction on the metric system to improve student performance in measurement. (KHR)
A Comparison of the One- and Three-Parameter Logistic Models on Measures of Test Efficiency.
ERIC Educational Resources Information Center
Benson, Jeri
Two methods of item selection were used to select sets of 40 items from a 50-item verbal analogies test, and the resulting item sets were compared for relative efficiency. The BICAL program was used to select the 40 items having the best mean square fit to the one-parameter logistic (Rasch) model. The LOGIST program was used to select the 40 items…
ERIC Educational Resources Information Center
Liu, Jinghua; Zu, Jiyun; Curley, Edward; Carey, Jill
2014-01-01
The purpose of this study is to investigate the impact of discrete anchor items versus passage-based anchor items on observed score equating using empirical data. This study compares an "SAT"® critical reading anchor that contains more discrete items proportionally, compared to the total tests to be equated, to another anchor that…
Computerized Adaptive Testing: Overview and Introduction.
ERIC Educational Resources Information Center
Meijer, Rob R.; Nering, Michael L.
1999-01-01
Provides an overview of computerized adaptive testing (CAT) and introduces contributions to this special issue. CAT elements discussed include item selection, estimation of the latent trait, item exposure, measurement precision, and item-bank development. (SLD)
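Two of the CAT elements listed above, item selection and measurement precision, can be sketched as follows. This is a minimal sketch assuming a 2PL item bank and maximum-information selection, which is one common rule among several; the bank values and all function names are hypothetical.

```python
import math

def prob_2pl(theta, a, b):
    """2PL probability of a keyed response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = prob_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta, bank, administered):
    """Index of the unadministered item with maximum information at theta."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))

def standard_error(theta, bank, administered):
    """Approximate standard error of theta from the administered items.

    Requires at least one administered item.
    """
    total = sum(item_information(theta, *bank[i]) for i in administered)
    return 1.0 / math.sqrt(total)

# Hypothetical bank of (discrimination, difficulty) pairs.
bank = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0), (1.2, 0.5)]
first = select_next_item(0.0, bank, set())
second = select_next_item(0.0, bank, {first})
```

A full CAT would re-estimate the trait level after each response and stop once the standard error falls below a chosen threshold; those steps are omitted here.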
Flens, Gerard; Smits, Niels; Terwee, Caroline B; Dekker, Joost; Huijbrechts, Irma; de Beurs, Edwin
2017-03-01
We developed a Dutch-Flemish version of the patient-reported outcomes measurement information system (PROMIS) adult V1.0 item bank for depression as input for computerized adaptive testing (CAT). As the item bank, we used the Dutch-Flemish translation of the original PROMIS item bank (28 items) and additionally translated 28 U.S. depression items that failed to make the final U.S. item bank. Through psychometric analysis of a combined clinical and general population sample (N = 2,010), 8 added items were removed. With the final item bank, we performed several CAT simulations to assess the efficiency of the extended (48 items) and the original item bank (28 items), using various stopping rules. Both item banks resulted in highly efficient and precise measurement of depression and showed high similarity between the CAT simulation scores and the full item bank scores. We discuss the implications of using each item bank and stopping rule for further CAT development.
ERIC Educational Resources Information Center
Swiggett, Wanda D.; Kotloff, Laurie; Ezzo, Chelsea; Adler, Rachel; Oliveri, Maria Elena
2014-01-01
The computer-based "Graduate Record Examinations"® ("GRE"®) revised General Test includes interactive item types and testing environment tools (e.g., test navigation, on-screen calculator, and help). How well do test takers understand these innovations? If test takers do not understand the new item types, these innovations may…
Automated Activation and Deactivation of a System Under Test
NASA Technical Reports Server (NTRS)
Poff, Mark A.
2006-01-01
The MPLM Automated Activation/Deactivation application (MPLM means Multi-Purpose Logistic Module) was created with a three-fold purpose in mind: 1. To reduce the possibility of human error in issuing commands to, or interpreting telemetry from, the MPLM power, computer, and environmental control systems; 2. To reduce the amount of test time required for the repetitive activation/deactivation processes; and 3. To reduce the number of on-console personnel required for activation/deactivation. All of these have been demonstrated with the release of the software. While some degree of automated end-item commanding had previously been performed for space-station hardware in the test environment, none approached the functionality and flexibility of this application. For MPLM activation, it provides mouse-click selection of the hardware complement to be activated, activates the desired hardware and verifies proper feedbacks, and alerts the user when telemetry indicates an error condition or manual intervention is required. For MPLM deactivation, the product senses which end items are active and deactivates them in the proper sequence. For historical purposes, an on-line log is maintained of commands issued and telemetry points monitored. The benefits of the MPLM Automated Activation/Deactivation application were demonstrated with its first use in December 2002, when it flawlessly performed MPLM activation in 8 minutes (versus as much as 2.4 hours for previous manual activations), and performed MPLM deactivation in 3 minutes (versus 66 minutes for previous manual deactivations). The number of test team members required has dropped from eight to four, and in actuality the software can be operated by a sole (knowledgeable) system engineer.
Severity of Organized Item Theft in Computerized Adaptive Testing: A Simulation Study
ERIC Educational Resources Information Center
Yi, Qing; Zhang, Jinming; Chang, Hua-Hua
2008-01-01
Criteria had been proposed for assessing the severity of possible test security violations for computerized tests with high-stakes outcomes. However, these criteria resulted from theoretical derivations that assumed uniformly randomized item selection. This study investigated potential damage caused by organized item theft in computerized adaptive…
Detecting Item Drift in Large-Scale Testing
ERIC Educational Resources Information Center
Guo, Hongwen; Robin, Frederic; Dorans, Neil
2017-01-01
The early detection of item drift is an important issue for frequently administered testing programs because items are reused over time. Unfortunately, operational data tend to be very sparse and do not lend themselves to frequent monitoring analyses, particularly for on-demand testing. Building on existing residual analyses, the authors propose…
Tree versus Geometric Representation of Tests and Items.
ERIC Educational Resources Information Center
Beller, Michael
1990-01-01
Geometric approaches to representing interrelations among tests and items are compared with an additive tree model (ATM), using 2,644 examinees and 2 other data sets. The ATM's close fit to the data and its coherence of presentation indicate that it is the best means of representing tests and items. (TJH)
Superficial Priming in Episodic Recognition
ERIC Educational Resources Information Center
Dopkins, Stephen; Sargent, Jesse; Ngo, Catherine T.
2010-01-01
We explored the effect of superficial priming in episodic recognition and found it to be different from the effect of semantic priming in episodic recognition. Participants made recognition judgments to pairs of items, with each pair consisting of a prime item and a test item. Correct positive responses to the test item were impeded if the prime…
Statistical Indexes for Monitoring Item Behavior under Computer Adaptive Testing Environment.
ERIC Educational Resources Information Center
Zhu, Renbang; Yu, Feng; Liu, Su
A computerized adaptive test (CAT) administration usually requires a large supply of items with accurately estimated psychometric properties, such as item response theory (IRT) parameter estimates, to ensure the precision of examinee ability estimation. However, an estimated IRT model of a given item in any given pool does not always correctly…
Using Item Response Theory to Describe the Nonverbal Literacy Assessment (NVLA)
ERIC Educational Resources Information Center
Fleming, Danielle; Wilson, Mark; Ahlgrim-Delzell, Lynn
2018-01-01
The Nonverbal Literacy Assessment (NVLA) is a literacy assessment designed for students with significant intellectual disabilities. The 218-item test was initially examined using confirmatory factor analysis. This method showed that the test worked as expected, but the items loaded onto a single factor. This article uses item response theory to…
Aggregating Polytomous DIF Results over Multiple Test Administrations
ERIC Educational Resources Information Center
Zwick, Rebecca; Ye, Lei; Isham, Steven
2018-01-01
In typical differential item functioning (DIF) assessments, an item's DIF status is not influenced by its status in previous test administrations. An item that has shown DIF at multiple administrations may be treated the same way as an item that has shown DIF in only the most recent administration. Therefore, much useful information about the…
A Comparison of Linking and Concurrent Calibration under the Graded Response Model.
ERIC Educational Resources Information Center
Kim, Seock-Ho; Cohen, Allan S.
Applications of item response theory to practical testing problems including equating, differential item functioning, and computerized adaptive testing, require that item parameter estimates be placed onto a common metric. In this study, two methods for developing a common metric for the graded response model under item response theory were…
ERIC Educational Resources Information Center
Nitko, Anthony J.; Hsu, Tse-chi
Item analysis procedures appropriate for domain-referenced classroom testing are described. A conceptual framework within which item statistics can be considered and promising statistics in light of this framework are presented. The sampling fluctuations of the more promising item statistics for sample sizes comparable to the typical classroom…
ERIC Educational Resources Information Center
Bennett, Randy Elliot; And Others
1990-01-01
The relationship of an expert-system-scored constrained free-response item type to multiple-choice and free-response items was studied using data for 614 students on the College Board's Advanced Placement Computer Science (APCS) Examination. Implications for testing and the APCS test are discussed. (SLD)
Fissile interrogation using gamma rays from oxygen
Smith, Donald; Micklich, Bradley J.; Fessler, Andreas
2004-04-20
The subject apparatus provides a means to identify the presence of fissionable material or other nuclear material contained within an item to be tested. The system employs a portable accelerator to accelerate and direct protons to a fluorine-compound target. The interaction of the protons with the fluorine-compound target produces gamma rays which are directed at the item to be tested. If the item to be tested contains either a fissionable material or other nuclear material, the interaction of the gamma rays with the material contained within the test item will result in the production of neutrons. A system of neutron detectors is positioned to intercept any neutrons generated by the test item. The results from the neutron detectors are analyzed to determine the presence of a fissionable material or other nuclear material.
Armed Services Vocational Aptitude Battery: Differential Item Functioning on the High School Form.
1988-04-01
Armed Services Vocational Aptitude Battery: Differential Item Functioning on the High School Form. Robert L. Linn, C. Nicholas Hastings, Pei-Hua Gillian Hu, Katherine E. Ryan. Universal Energy Systems, Inc., Dayton, OH... Period October 1985 - ... 1987. Approved for public release; distribution is unlimited. Air Force Systems Command.
Laser Safety Summary of the Large Aircraft Infrared Countermeasure (LAIRCM) Viper Laser, Phase 1
2003-03-06
Checklist excerpt (columns: Item, Requirement, Yes/No, Comment). Item 2a: Does such label contain the following statement? (4.2.2) CAUTION This electronic product has been exempted from... Item 8: Is the laser... Item 14: Is system designed per MIL-STD-454, MIL-STD-882, and MIL-STD-2036? (4.2.10) YES. Personnel hazard control is specified and implemented. Distribution A: Approved for public release; distribution unlimited. PA Case No: TSRL-PA-2016-0214
Validation of a clinical critical thinking skills test in nursing.
Shin, Sujin; Jung, Dukyoo; Kim, Sungeun
2015-01-27
The purpose of this study was to develop a revised version of the clinical critical thinking skills test (CCTS) and to subsequently validate its performance. This study is a secondary analysis of the CCTS. Data were obtained from a convenience sample of 284 college students in June 2011. Thirty items were analyzed using item response theory and test reliability was assessed. Test-retest reliability was measured using the results of 20 nursing college and graduate school students in July 2013. The content validity of the revised items was analyzed by calculating the degree of agreement between instrument developer intention in item development and the judgments of six experts. To analyze response process validity, qualitative data related to the response processes of nine nursing college students obtained through cognitive interviews were analyzed. Of the initial 30 items, 11 were excluded after analysis of the difficulty and discrimination parameters. When the 19 items of the revised version of the CCTS were analyzed, levels of item difficulty were found to be relatively low and levels of discrimination were found to be appropriate or high. The degree of agreement between item developer intention and expert judgments equaled or exceeded 50%. From the above results, evidence of response process validity was demonstrated, indicating that subjects responded as intended by the test developer. The revised 19-item CCTS was found to have sufficient reliability and validity and therefore represents a more convenient measurement of critical thinking ability.
Validation of a clinical critical thinking skills test in nursing
2015-01-01
Purpose: The purpose of this study was to develop a revised version of the clinical critical thinking skills test (CCTS) and to subsequently validate its performance. Methods: This study is a secondary analysis of the CCTS. Data were obtained from a convenience sample of 284 college students in June 2011. Thirty items were analyzed using item response theory and test reliability was assessed. Test-retest reliability was measured using the results of 20 nursing college and graduate school students in July 2013. The content validity of the revised items was analyzed by calculating the degree of agreement between instrument developer intention in item development and the judgments of six experts. To analyze response process validity, qualitative data related to the response processes of nine nursing college students obtained through cognitive interviews were analyzed. Results: Of the initial 30 items, 11 were excluded after analysis of the difficulty and discrimination parameters. When the 19 items of the revised version of the CCTS were analyzed, levels of item difficulty were found to be relatively low and levels of discrimination were found to be appropriate or high. The degree of agreement between item developer intention and expert judgments equaled or exceeded 50%. Conclusion: From the above results, evidence of response process validity was demonstrated, indicating that subjects responded as intended by the test developer. The revised 19-item CCTS was found to have sufficient reliability and validity and therefore represents a more convenient measurement of critical thinking ability. PMID:25622716
ERIC Educational Resources Information Center
Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan
2014-01-01
C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…
ERIC Educational Resources Information Center
Lynch, Mervin D.; Chaves, John
Items from the Piers-Harris and Coopersmith self-concept tests were evaluated against independent measures on three self-constructs: idealized, empathic, and worth. Construct measurements were obtained with the semantic differential and D statistic. Ratings were obtained from 381 children, grades 4-6. For each test, item ratings and construct measures…
ERIC Educational Resources Information Center
Browning, Robert; And Others
1979-01-01
Effects that item order and basal and ceiling rules have on test means, variances, and internal consistency estimates for the Peabody Individual Achievement Test mathematics and reading recognition subtests were examined. Items on the math and reading recognition subtests were significantly easier or harder than test placements indicated. (Author)
Current State of Test Development, Administration, and Analysis: A Study of Faculty Practices.
Bristol, Timothy J; Nelson, John W; Sherrill, Karin J; Wangerin, Virginia S
Developing valid and reliable test items is a critical skill for nursing faculty. This research analyzed the test item writing practice of 674 nursing faculty. Relationships between faculty characteristics and their test item writing practices were analyzed. Findings reveal variability in practice and a gap in implementation of evidence-based standards when developing and evaluating teacher-made examinations.
A Review of Guidelines on Home Drug Testing Websites for Parents
Washio, Yukiko; Fairfax-Columbo, Jaymes; Ball, Emily; Cassey, Heather; Arria, Amelia M.; Bresani, Elena; Curtis, Brenda L.; Kirby, Kimberly C.
2014-01-01
Purpose: To update and extend prior work reviewing websites that discuss home drug testing for parents and assess the quality of information that the websites provide to assist them to decide when and how to use home drug testing. Methods: We conducted a world-wide web search that identified eight websites providing information for parents on home drug testing. We assessed the information on the sites using a checklist developed with field experts in adolescent substance abuse and psychosocial interventions that focus on urine testing. Results: None of the websites covered all of the items on the 24-item checklist, and only three covered at least half of the items (12, 14, and 21 items, respectively). The five remaining websites covered less than half the checklist items. The mean number of items covered by the websites was 11. Conclusions: Among the websites that we reviewed, few provided thorough information to parents regarding empirically-supported strategies to effectively use drug testing to intervene on adolescent substance use. Furthermore, most websites did not provide thorough information regarding the risks and benefits to inform parents’ decision to use home drug testing. Empirical evidence regarding efficacy, benefits, risks, and limitations of home drug testing is needed. PMID:25026103
ERIC Educational Resources Information Center
New South Wales Dept. of Education, Sydney (Australia).
Continuing a series of short tests aimed at measuring student mastery of specific skills in the natural sciences, this supplementary volume includes teachers' notes, a users' guide and inspection copies of test items 27 to 50. Answer keys and test scoring statistics are provided. The items are designed for grades 7 through 10, and a list of the…
ERIC Educational Resources Information Center
Weiss, David J., Ed.
This symposium consists of five papers and presents some recent developments in adaptive testing which have applications to several military testing problems. The overview, by James R. McBride, defines adaptive testing and discusses some of its item selection and scoring strategies. Item response theory, or item characteristic curve theory, is…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Friday, G.P.; Cummins, C.L.; Schwartzman, A.L.
Since the early 1950s, the Savannah River Site (SRS) released over 50 radionuclides into the environment while producing nuclear defense materials. These releases directly exposed aquatic and terrestrial biota to ionizing radiation from surface water, soil, and sediment, and also indirectly by the ingestion of items in the food chain. As part of new missions to develop waste management strategies and identify cost-effective environmental restoration options, knowledge concerning the uptake and distribution of these radionuclides is essential. This report compiles and summarizes site-specific bioconcentration factors for selected radionuclides released at SRS.
DeGeest, David Scott; Schmidt, Frank
2015-01-01
Our objective was to apply the rigorous test developed by Browne (1992) to determine whether the circumplex model fits Big Five personality data. This test has yet to be applied to personality data. Another objective was to determine whether blended items explained correlations among the Big Five traits. We used two working adult samples, the Eugene-Springfield Community Sample and the Professional Worker Career Experience Survey. Fit to the circumplex was tested via Browne's (1992) procedure. Circumplexes were graphed to identify items with loadings on multiple traits (blended items), and to determine whether removing these items changed five-factor model (FFM) trait intercorrelations. In both samples, the circumplex structure fit the FFM traits well. Each sample had items with dual-factor loadings (8 items in the first sample, 21 in the second). Removing blended items had little effect on construct-level intercorrelations among FFM traits. We conclude that rigorous tests show that the fit of personality data to the circumplex model is good. This finding means the circumplex model is competitive with the factor model in understanding the organization of personality traits. The circumplex structure also provides a theoretically and empirically sound rationale for evaluating intercorrelations among FFM traits. Even after eliminating blended items, FFM personality traits remained correlated.
[Mokken scaling of the Cognitive Screening Test].
Diesfeldt, H F A
2009-10-01
The Cognitive Screening Test (CST) is a twenty-item orientation questionnaire in Dutch, that is commonly used to evaluate cognitive impairment. This study applied Mokken Scale Analysis, a non-parametric set of techniques derived from item response theory (IRT), to CST-data of 466 consecutive participants in psychogeriatric day care. The full item set and the standard short version of fourteen items both met the assumptions of the monotone homogeneity model, with scalability coefficient H = 0.39, which is considered weak. In order to select items that would fulfil the assumption of invariant item ordering or the double monotonicity model, the subjects were randomly partitioned into a training set (50% of the sample) and a test set (the remaining half). By means of an automated item selection eleven items were found to measure one latent trait, with H = 0.67 and item H coefficients larger than 0.51. Cross-validation of the item analysis in the remaining half of the subjects gave comparable values (H = 0.66; item H coefficients larger than 0.56). The selected items involve year, place of residence, birth date, the monarch's and prime minister's names, and their predecessors. Applying optimal discriminant analysis (ODA) it was found that the full set of twenty CST items performed best in distinguishing two predefined groups of patients of lower or higher cognitive ability, as established by an independent criterion derived from the Amsterdam Dementia Screening Test. The chance corrected predictive value or prognostic utility was 47.5% for the full item set, 45.2% for the fourteen items of the standard short version of the CST, and 46.1% for the homogeneous, unidimensional set of selected eleven items. The results of the item analysis support the application of the CST in cognitive assessment, and revealed a more reliable 'short' version of the CST than the standard short version (CST14).
Osth, Adam F; Jansson, Anna; Dennis, Simon; Heathcote, Andrew
2018-08-01
A robust finding in recognition memory is that performance declines monotonically across test trials. Despite the prevalence of this decline, there is a lack of consensus on the mechanism responsible. Three hypotheses have been put forward: (1) interference is caused by learning of test items (2) the test items cause a shift in the context representation used to cue memory and (3) participants change their speed-accuracy thresholds through the course of testing. We implemented all three possibilities in a combined model of recognition memory and decision making, which inherits the memory retrieval elements of the Osth and Dennis (2015) model and uses the diffusion decision model (DDM: Ratcliff, 1978) to generate choice and response times. We applied the model to four datasets that represent three challenges, the findings that: (1) the number of test items plays a larger role in determining performance than the number of studied items, (2) performance decreases less for strong items than weak items in pure lists but not in mixed lists, and (3) lexical decision trials interspersed between recognition test trials do not increase the rate at which performance declines. Analysis of the model's parameter estimates suggests that item interference plays a weak role in explaining the effects of recognition testing, while context drift plays a very large role. These results are consistent with prior work showing a weak role for item noise in recognition memory and that retrieval is a strong cause of context change in episodic memory. Copyright © 2018 Elsevier Inc. All rights reserved.
Multistage Computerized Adaptive Testing with Uniform Item Exposure
ERIC Educational Resources Information Center
Edwards, Michael C.; Flora, David B.; Thissen, David
2012-01-01
This article describes a computerized adaptive test (CAT) based on the uniform item exposure multi-form structure (uMFS). The uMFS is a specialization of the multi-form structure (MFS) idea described by Armstrong, Jones, Berliner, and Pashley (1998). In an MFS CAT, the examinee first responds to a small fixed block of items. The items comprising…
Primary Science Assessment Item Setters' Misconceptions Concerning the State Changes of Water
ERIC Educational Resources Information Center
Boo, Hong Kwen
2006-01-01
Assessment is an integral and vital part of teaching and learning, providing feedback on progress through the assessment period to both learners and teachers. However, if test items are flawed because of misconceptions held by the questions setter, then such test items are invalid as assessment tools. Moreover, such flawed items are also likely to…
Stratified and Maximum Information Item Selection Procedures in Computer Adaptive Testing
ERIC Educational Resources Information Center
Deng, Hui; Ansley, Timothy; Chang, Hua-Hua
2010-01-01
In this study we evaluated and compared three item selection procedures: the maximum Fisher information procedure (F), the a-stratified multistage computer adaptive testing (CAT) (STR), and a refined stratification procedure that allows more items to be selected from the high a strata and fewer items from the low a strata (USTR), along with…
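As a rough illustration of the maximum Fisher information procedure (F) named above, the sketch below picks the unused item with the largest information at the current ability estimate. The two-parameter logistic (2PL) model, the function names, and the toy item pool are assumptions made for illustration; the abstract does not state which IRT model or pool the authors used.

```python
import math

def p_correct(theta, a, b):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_information(theta, a, b):
    """Item information for the 2PL model: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def select_max_information(theta, pool, administered):
    """Return the index of the not-yet-administered item with maximum information."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: fisher_information(theta, *pool[i]))

# Hypothetical pool of (a, b) pairs: discrimination and difficulty.
pool = [(0.8, -1.0), (1.5, 0.0), (2.0, 1.5), (1.0, 0.2)]
first = select_max_information(0.0, pool, administered=set())
second = select_max_information(0.0, pool, administered={first})
```

Because information grows with the square of the discrimination parameter a, this rule tends to draw on high-a items early, which is the exposure pattern that a-stratified designs are meant to counteract.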
Assessment of Differential Item Functioning in Testlet-Based Items Using the Rasch Testlet Model
ERIC Educational Resources Information Center
Wang, Wen-Chung; Wilson, Mark
2005-01-01
This study presents a procedure for detecting differential item functioning (DIF) for dichotomous and polytomous items in testlet-based tests, whereby DIF is taken into account by adding DIF parameters into the Rasch testlet model. Simulations were conducted to assess recovery of the DIF and other parameters. Two independent variables, test type…
Ethnic Group Bias in Intelligence Test Items.
ERIC Educational Resources Information Center
Scheuneman, Janice
In previous studies of ethnic group bias in intelligence test items, the question of bias has been confounded with ability differences between the ethnic group samples compared. The present study is based on a conditional probability model in which an unbiased item is defined as one where the probability of a correct response to an item is the…
Primary Science Assessment Item Setters' Misconceptions Concerning Biological Science Concepts
ERIC Educational Resources Information Center
Boo, Hong Kwen
2007-01-01
Assessment is an integral and vital part of teaching and learning, providing feedback on progress through the assessment period to both learners and teachers. However, if test items are flawed because of misconceptions held by the question setter, then such test items are invalid as assessment tools. Moreover, such flawed items are also likely to…
Examination of Different Item Response Theory Models on Tests Composed of Testlets
ERIC Educational Resources Information Center
Kogar, Esin Yilmaz; Kelecioglu, Hülya
2017-01-01
The purpose of this research is to first estimate the item and ability parameters and the standard error values related to those parameters obtained from Unidimensional Item Response Theory (UIRT), bifactor (BIF) and Testlet Response Theory models (TRT) in the tests including testlets, when the number of testlets, number of independent items, and…
A Monte Carlo Study of an Iterative Wald Test Procedure for DIF Analysis
ERIC Educational Resources Information Center
Cao, Mengyang; Tay, Louis; Liu, Yaowu
2017-01-01
This study examined the performance of a proposed iterative Wald approach for detecting differential item functioning (DIF) between two groups when preknowledge of anchor items is absent. The iterative approach utilizes the Wald-2 approach to identify anchor items and then iteratively tests for DIF items with the Wald-1 approach. Monte Carlo…
A Semiparametric Model for Jointly Analyzing Response Times and Accuracy in Computerized Testing
ERIC Educational Resources Information Center
Wang, Chun; Fan, Zhewen; Chang, Hua-Hua; Douglas, Jeffrey A.
2013-01-01
The item response times (RTs) collected from computerized testing represent an underutilized type of information about items and examinees. In addition to knowing the examinees' responses to each item, we can investigate the amount of time examinees spend on each item. Current models for RTs mainly focus on parametric models, which have the…
An Empirical Investigation of Methods for Assessing Item Fit for Mixed Format Tests
ERIC Educational Resources Information Center
Chon, Kyong Hee; Lee, Won-Chan; Ansley, Timothy N.
2013-01-01
Empirical information regarding performance of model-fit procedures has been a persistent need in measurement practice. Statistical procedures for evaluating item fit were applied to real test examples that consist of both dichotomously and polytomously scored items. The item fit statistics used in this study included the PARSCALE's G[squared],…
Automated Item Generation with Recurrent Neural Networks.
von Davier, Matthias
2018-03-12
Utilizing technology for automated item generation is not a new idea. However, test items used in commercial testing programs or in research are still predominantly written by humans, in most cases by content experts or professional item writers. Human experts are a limited resource and testing agencies incur high costs in the process of continuous renewal of item banks to sustain testing programs. Using algorithms instead holds the promise of providing unlimited resources for this crucial part of assessment development. The approach presented here deviates in several ways from previous attempts to solve this problem. In the past, automatic item generation relied either on generating clones of narrowly defined item types such as those found in language free intelligence tests (e.g., Raven's progressive matrices) or on an extensive analysis of task components and derivation of schemata to produce items with pre-specified variability that are hoped to have predictable levels of difficulty. It is somewhat unlikely that researchers utilizing these previous approaches would look at the proposed approach with favor; however, recent applications of machine learning show success in solving tasks that seemed impossible for machines not too long ago. The proposed approach uses deep learning to implement probabilistic language models, not unlike what Google Brain and Amazon Alexa use for language processing and generation.
Assessing the Conceptual Understanding about Heat and Thermodynamics at Undergraduate Level
ERIC Educational Resources Information Center
Kulkarni, Vasudeo Digambar; Tambade, Popat Savaleram
2013-01-01
In this study, a Thermodynamic Concept Test (TCT) was designed to assess students' conceptual understanding of heat and thermodynamics at undergraduate level. Different statistical tests, such as the item difficulty index, item discrimination index, and point biserial coefficient, were used for assessing the TCT. For each item of the test these indices were…
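The three indices named in this abstract can be sketched for dichotomously scored (0/1) items as follows. This is a minimal illustration under common textbook definitions, not the authors' procedure: the 27% upper and lower groups, the function names, and the toy scores are all assumptions.

```python
from statistics import mean, pstdev

def item_difficulty(item):
    """Difficulty index: proportion of examinees answering the item correctly."""
    return mean(item)

def discrimination_index(item, totals, fraction=0.27):
    """Proportion correct in the top-scoring group minus that in the bottom group."""
    n = len(totals)
    k = max(1, round(n * fraction))
    order = sorted(range(n), key=lambda i: totals[i])  # ascending total score
    lower, upper = order[:k], order[-k:]
    return mean(item[i] for i in upper) - mean(item[i] for i in lower)

def point_biserial(item, totals):
    """Point-biserial correlation between a 0/1 item and the total score."""
    p = mean(item)
    m1 = mean(t for x, t in zip(item, totals) if x == 1)
    m0 = mean(t for x, t in zip(item, totals) if x == 0)
    return (m1 - m0) / pstdev(totals) * (p * (1 - p)) ** 0.5

# Hypothetical data: one item's scores and total test scores for 8 students.
item = [1, 1, 1, 0, 1, 0, 0, 0]
totals = [9, 8, 7, 6, 5, 4, 3, 2]
difficulty = item_difficulty(item)
discrimination = discrimination_index(item, totals)
r_pb = point_biserial(item, totals)
```

The point-biserial function assumes the item has at least one correct and one incorrect response; an item answered identically by everyone has no defined correlation with the total score.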
A Study of Inference in Standardized Reading Test Items and Its Relationship to Difficulty.
ERIC Educational Resources Information Center
Marzano, Robert J.
To study the relationship between inferences made on standardized reading tests and item difficulty, 50 items on the reading comprehension section of the Metropolitan Achievement Test were analyzed independently in this study by two raters using four general categories of inferences: (1) reference inferences, (2) between proposition inferences,…
Questions and Problems in Science.
ERIC Educational Resources Information Center
Dressel, Paul L.; Nelson, Clarence H.
This folio of test items, contributed by a number of colleges and universities from their course, placement, entrance, or other institutional examinations, was compiled to aid teachers in constructing tests. Only those science courses offered in the first two years of college are represented by the scope of the items. The test items may also serve…
Effects of Using Modified Items to Test Students with Persistent Academic Difficulties
ERIC Educational Resources Information Center
Elliott, Stephen N.; Kettler, Ryan J.; Beddow, Peter A.; Kurz, Alexander; Compton, Elizabeth; McGrath, Dawn; Bruen, Charles; Hinton, Kent; Palmer, Porter; Rodriguez, Michael C.; Bolt, Daniel; Roach, Andrew T.
2010-01-01
This study investigated the effects of using modified items in achievement tests to enhance accessibility. An experiment determined whether tests composed of modified items would reduce the performance gap between students eligible for an alternate assessment based on modified achievement standards (AA-MAS) and students not eligible, and the…
Optimal Stratification of Item Pools in a-Stratified Computerized Adaptive Testing.
ERIC Educational Resources Information Center
Chang, Hua-Hua; van der Linden, Wim J.
2003-01-01
Developed a method based on 0-1 linear programming to stratify an item pool optimally for use in alpha-stratified adaptive testing. Applied the method to a previous item pool from the computerized adaptive test of the Graduate Record Examinations. Results show the new method performs well in practical situations. (SLD)
The Development and Validation of a Formula for Measuring Single-Sentence Test Item Readability.
ERIC Educational Resources Information Center
Homan, Susan; And Others
1994-01-01
A study was conducted with 782 elementary school students to determine whether the Homan-Hewitt Readability Formula could identify the readability of a single-sentence test item. Results indicate that a relationship exists between students' reading grade levels and responses to test items written at higher readability levels. (SLD)
Development and Validation of a Computer Adaptive EFL Test
ERIC Educational Resources Information Center
He, Lianzhen; Min, Shangchao
2017-01-01
The first aim of this study was to develop a computer adaptive EFL test (CALT) that assesses test takers' listening and reading proficiency in English with dichotomous items and polytomous testlets. We reported in detail on the development of the CALT, including item banking, determination of suitable item response theory (IRT) models for item…
The Development and Management of Banks of Performance Based Test Items.
ERIC Educational Resources Information Center
Curtis, H. A., Ed.
Symposium papers presented at an Annual Meeting of the National Council on Measurement in Education (Chicago, 1972), all of which concern banks of test items for use in constructing criterion referenced tests, comprise this document. The first paper, "Locally Produced Item Banks" by Thomas J. Slocum, presents information on the…
Test-retest stability of the Task and Ego Orientation Questionnaire.
Lane, Andrew M; Nevill, Alan M; Bowes, Neal; Fox, Kenneth R
2005-09-01
Establishing stability, defined as observing minimal measurement error in a test-retest assessment, is vital to validating psychometric tools. Correlational methods, such as Pearson product-moment, intraclass, and kappa are tests of association or consistency, whereas stability or reproducibility (regarded here as synonymous) assesses the agreement between test-retest scores. Indexes of reproducibility using the Task and Ego Orientation in Sport Questionnaire (TEOSQ; Duda & Nicholls, 1992) were investigated using correlational (Pearson product-moment, intraclass, and kappa) methods, repeated measures multivariate analysis of variance, and calculating the proportion of agreement within a referent value of +/-1 as suggested by Nevill, Lane, Kilgour, Bowes, and Whyte (2001). Two hundred thirteen soccer players completed the TEOSQ on two occasions, 1 week apart. Correlation analyses indicated a stronger test-retest correlation for the Ego subscale than the Task subscale. Multivariate analysis of variance indicated stability for ego items but with significant increases in four task items. The proportion of test-retest agreement scores indicated that all ego items reported relatively poor stability statistics with test-retest scores within a range of +/-1, ranging from 82.7-86.9%. By contrast, all task items showed test-retest difference scores ranging from 92.5-99%, although further analysis indicated that four task subscale items increased significantly. Findings illustrated that correlational methods (Pearson product-moment, intraclass, and kappa) are influenced by the range in scores, and calculating the proportion of agreement of test-retest differences with a referent value of +/-1 could provide additional insight into the stability of the questionnaire. It is suggested that the item-by-item proportion of agreement method proposed by Nevill et al. (2001) should be used to supplement existing methods and could be especially helpful in identifying rogue items in the initial stages of psychometric questionnaire validation.
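The proportion-of-agreement statistic described above can be sketched for a single item as the share of respondents whose test and retest scores differ by no more than the referent value of 1. The function name and the toy scores are hypothetical; only the +/-1 referent value comes from the abstract.

```python
def proportion_within(test, retest, tolerance=1):
    """Share of respondents whose test-retest difference is within +/- tolerance."""
    if len(test) != len(retest) or not test:
        raise ValueError("test and retest must be non-empty and of equal length")
    hits = sum(1 for a, b in zip(test, retest) if abs(a - b) <= tolerance)
    return hits / len(test)

# Hypothetical 5-point ratings for one item on two occasions.
test_scores = [4, 5, 3, 2, 5]
retest_scores = [5, 5, 1, 2, 4]
agreement = proportion_within(test_scores, retest_scores)
```

Unlike a correlation coefficient, this statistic does not depend on the spread of scores across respondents, which is the property the abstract highlights.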
ERIC Educational Resources Information Center
Samejima, Fumiko; Changas, Paul S.
The methods and approaches for estimating the operating characteristics of the discrete item responses without assuming any mathematical form have been developed and expanded. It has been made possible that, even if the test information function of a given test is not constant for the interval of ability of interest, it is used as the Old Test.…
Automatic Generation of Rasch-Calibrated Items: Figural Matrices Test GEOM and Endless-Loops Test EC
ERIC Educational Resources Information Center
Arendasy, Martin
2005-01-01
The future of test construction for certain psychological ability domains that can be analyzed well in a structured manner may lie--at the very least for reasons of test security--in the field of automatic item generation. In this context, a question that has not been explicitly addressed is whether it is possible to embed an item response theory…
Evaluation of Floors and Item Gradients for Reading and Math Tests for Young Children
ERIC Educational Resources Information Center
Bradley-Johnson, Sharon; Durmusoglu, Gokce
2005-01-01
Ignoring the adequacy of floors and item gradients for tests used with young children can have serious consequences. Thus, because of the importance of early intervention for reading and math problems, we used the criteria suggested by Bracken for adequate floors and item gradients, and reviewed 15 reading tests and 12 math tests for ages 4-0…
ERIC Educational Resources Information Center
Khaksefidi, Saman
2017-01-01
This study investigates the psychological effect of a wrong question with wrong items on answering the next question in a test of structure. Forty students selected through stratified random sampling are given 15 questions of a standardized test, namely a TOEFL structure test, in which questions number 7 and number 11 are wrong and their answers…
ITEM ANALYSIS OF THREE SPANISH NAMING TESTS: A CROSS-CULTURAL INVESTIGATION
de la Plata, Carlos Marquez; Arango-Lasprilla, Juan Carlos; Alegret, Montse; Moreno, Alexander; Tárraga, Luis; Lara, Mar; Hewlitt, Margaret; Hynan, Linda; Cullum, C. Munro
2009-01-01
Neuropsychological evaluations conducted in the United States and abroad commonly include the use of tests translated from English to Spanish. The use of translated naming tests for evaluating predominately Spanish-speakers has recently been challenged on the grounds that translating test items may compromise a test’s construct validity. The Texas Spanish Naming Test (TNT) has been developed in Spanish specifically for use with Spanish-speakers; however, it is unlikely patients from diverse Spanish-speaking geographical regions will perform uniformly on a naming test. The present study evaluated and compared the internal consistency and patterns of item-difficulty and -discrimination for the TNT and two commonly used translated naming tests in three countries (i.e., United States, Colombia, Spain). Two hundred fifty two subjects (126 demented, 116 nondemented) across three countries were administered the TNT, Modified Boston Naming Test-Spanish, and the naming subtest from the CERAD. The TNT demonstrated superior internal consistency to its counterparts, a superior item difficulty pattern than the CERAD naming test, and a superior item discrimination pattern than the MBNT-S across countries. Overall, all three Spanish naming tests differentiated nondemented and moderately demented individuals, but the results suggest the items of the TNT are most appropriate to use with Spanish-speakers. Preliminary normative data for the three tests examined in each country are provided. PMID:19208960
Identifying predictors of physics item difficulty: A linear regression approach
NASA Astrophysics Data System (ADS)
Mesic, Vanes; Muratovic, Hasnija
2011-06-01
Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal physics knowledge structures. Identified predictors point out the fundamental cognitive dimensions of student physics achievement at the end of compulsory education in Bosnia and Herzegovina, whose level of development influenced the test results within the conducted assessments.
Stochl, Jan; Böhnke, Jan R; Pickett, Kate E; Croudace, Tim J
2016-05-20
Recent developments in psychometric modeling and technology allow pooling well-validated items from existing instruments into larger item banks and their deployment through methods of computerized adaptive testing (CAT). Use of item response theory-based bifactor methods and integrative data analysis overcomes barriers in cross-instrument comparison. This paper presents the joint calibration of an item bank for researchers keen to investigate population variations in general psychological distress (GPD). Multidimensional item response theory was used on existing health survey data from the Scottish Health Education Population Survey (n = 766) to calibrate an item bank consisting of pooled items from the short common mental disorder screen (GHQ-12) and the Affectometer-2 (a measure of "general happiness"). Computer simulation was used to evaluate usefulness and efficacy of its adaptive administration. A bifactor model capturing variation across a continuum of population distress (while controlling for artefacts due to item wording) was supported. The numbers of items for different required reliabilities in adaptive administration demonstrated promising efficacy of the proposed item bank. Psychometric modeling of the common dimension captured by more than one instrument offers the potential of adaptive testing for GPD using individually sequenced combinations of existing survey items. The potential for linking other item sets with alternative candidate measures of positive mental health is discussed since an optimal item bank may require even more items than these.
Expertise sensitive item selection.
Chow, P; Russell, H; Traub, R E
2000-12-01
In this paper we describe and illustrate a procedure for selecting items from a large pool for a certification test. The proposed procedure, which is intended to improve the alignment of the certification test with on-the-job performance, is based on an expertise sensitive index. This index for an item is the difference between the item's p values for experts and novices. An example is provided of the application of the index for selecting items to be used in certifying bakers.
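The expertise sensitive index defined above is the item's p value (proportion correct) for experts minus its p value for novices. The sketch below computes it and, as an assumption not stated in the abstract, keeps the n items with the largest index; the item names and response data are hypothetical.

```python
def p_value(responses):
    """Classical item p value: proportion of correct (1) responses."""
    return sum(responses) / len(responses)

def expertise_index(expert_responses, novice_responses):
    """Expert p value minus novice p value for one item."""
    return p_value(expert_responses) - p_value(novice_responses)

def select_items(pool, n):
    """Rank items by the index (descending) and keep the top n (assumed rule)."""
    ranked = sorted(pool, key=lambda name: expertise_index(*pool[name]), reverse=True)
    return ranked[:n]

# Hypothetical pool: item name -> (expert responses, novice responses).
pool = {
    "knead": ([1, 1, 1, 1], [1, 0, 0, 0]),
    "weigh": ([1, 1, 1, 0], [1, 1, 1, 0]),
    "proof": ([1, 1, 0, 1], [0, 0, 1, 0]),
}
chosen = select_items(pool, 2)
```

An index near zero, as for the hypothetical "weigh" item, marks an item that experts and novices answer equally well and that therefore says little about on-the-job expertise.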
Chen, Cheng-Te; Chen, Yu-Lan; Lin, Yu-Ching; Hsieh, Ching-Lin; Tzeng, Jeng-Yi
2018-01-01
Objective: The purpose of this study was to construct a computerized adaptive test (CAT) for measuring self-care performance (the CAT-SC) in children with developmental disabilities (DD) aged from 6 months to 12 years in a content-inclusive, precise, and efficient fashion. Methods: The study was divided into 3 phases: (1) item bank development, (2) item testing, and (3) a simulation study to determine the stopping rules for the administration of the CAT-SC. A total of 215 caregivers of children with DD were interviewed with the 73-item CAT-SC item bank. An item response theory model was adopted for examining the construct validity to estimate item parameters after investigation of the unidimensionality, equality of slope parameters, item fitness, and differential item functioning (DIF). In the last phase, the reliability and concurrent validity of the CAT-SC were evaluated. Results: The final CAT-SC item bank contained 56 items. The stopping rules suggested were (a) reliability coefficient greater than 0.9 or (b) 14 items administered. The results of simulation also showed that 85% of the estimated self-care performance scores would reach a reliability higher than 0.9 with a mean test length of 8.5 items, and the mean reliability for the rest was 0.86. Administering the CAT-SC could reduce the number of items administered by 75% to 84%. In addition, self-care performances estimated by the CAT-SC and the full item bank were very similar to each other (Pearson r = 0.98). Conclusion: The newly developed CAT-SC can efficiently measure self-care performance in children with DD whose performances are comparable to those of TD children aged from 6 months to 12 years as precisely as the whole item bank. The item bank of the CAT-SC has good reliability and a unidimensional self-care construct, and the CAT can estimate self-care performance with less than 25% of the items in the item bank. Therefore, the CAT-SC could be useful for measuring self-care performance in children with DD in clinical and research settings. PMID:29561879
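The two stopping rules reported above (reliability greater than 0.9, or 14 items administered) can be sketched as a single check applied after each item. The conversion from a standard error to reliability as 1 - SE^2 assumes an ability scale with unit variance; that conversion and the function names are assumptions, since the abstract does not say how reliability was computed.

```python
def reliability_from_se(se, trait_variance=1.0):
    """Assumed conversion: reliability = 1 - SE^2 / trait variance."""
    return 1.0 - (se * se) / trait_variance

def should_stop(n_administered, se, min_reliability=0.9, max_items=14):
    """Stop when reliability exceeds the threshold or the item cap is reached."""
    return reliability_from_se(se) > min_reliability or n_administered >= max_items

# Share of the final 56-item bank saved when the 14-item cap is hit.
reduction_at_cap = 1 - 14 / 56
```

With the final bank of 56 items, stopping at the 14-item cap administers exactly a quarter of the bank, which matches the lower bound of the reported 75% to 84% reduction.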
Chen, Cheng-Te; Chen, Yu-Lan; Lin, Yu-Ching; Hsieh, Ching-Lin; Tzeng, Jeng-Yi; Chen, Kuan-Lin
2018-01-01
The purpose of this study was to construct a computerized adaptive test (CAT) for measuring self-care performance (the CAT-SC) in children with developmental disabilities (DD) aged from 6 months to 12 years in a content-inclusive, precise, and efficient fashion. The study was divided into 3 phases: (1) item bank development, (2) item testing, and (3) a simulation study to determine the stopping rules for the administration of the CAT-SC. A total of 215 caregivers of children with DD were interviewed with the 73-item CAT-SC item bank. An item response theory model was adopted for examining the construct validity to estimate item parameters after investigation of the unidimensionality, equality of slope parameters, item fitness, and differential item functioning (DIF). In the last phase, the reliability and concurrent validity of the CAT-SC were evaluated. The final CAT-SC item bank contained 56 items. The stopping rules suggested were (a) reliability coefficient greater than 0.9 or (b) 14 items administered. The results of simulation also showed that 85% of the estimated self-care performance scores would reach a reliability higher than 0.9 with a mean test length of 8.5 items, and the mean reliability for the rest was 0.86. Administering the CAT-SC could reduce the number of items administered by 75% to 84%. In addition, self-care performances estimated by the CAT-SC and the full item bank were very similar to each other (Pearson r = 0.98). The newly developed CAT-SC can efficiently measure self-care performance in children with DD whose performances are comparable to those of TD children aged from 6 months to 12 years as precisely as the whole item bank. The item bank of the CAT-SC has good reliability and a unidimensional self-care construct, and the CAT can estimate self-care performance with less than 25% of the items in the item bank. Therefore, the CAT-SC could be useful for measuring self-care performance in children with DD in clinical and research settings.
2004-05-25
KENNEDY SPACE CENTER, FLA. - In the Vehicle Assembly Building (VAB), Scott Thurston (red shirt) stands by while a United Space Alliance worker (blue shirt) gets ready to start moving pieces of Columbia debris, such as the PRSD tank in front, for transfer to a shipping facility and delivery to The Aerospace Corporation in El Segundo, Calif. Thurston is the Columbia debris coordinator. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia’s debris is stored in the VAB.
Williams, Katherine; Valencia, Luis; Gokulan, Kuppan; Trbojevich, Raul; Khare, Sangeeta
2017-02-01
Food contact materials containing antibacterial properties are progressively appearing in the market. Items intended to provide antimicrobial effects such as increased shelf life and food safety are incorporating silver materials during the manufacture of such products. This study examined the total silver content, release capacity, and antibacterial activity of three different silver-containing food contact materials: plastic food storage containers, a plastic cutting board, and food wrapping paper. Silver content and release were determined by Inductively Coupled Plasma Mass Spectrometry, and the results showed that, although the amount of silver in each product was similar, migration varied considerably with kind of material and simulant choice. Antimicrobial effect was tested by measuring the growth of Salmonella Typhimurium during or after exposure to the different food contact materials. The results showed that the food storage containers and wrapping paper delayed the growth of S. Typhimurium under certain conditions, but that these effects were short-lived. The strain of S. Typhimurium used in this study was found to be negative for the presence of tested silver resistance genes. The results of this study suggest that a thorough investigation should be required to show/claim the efficacy of silver-containing food contact materials for food safety purposes. Published by Elsevier Ltd.
NASA Technical Reports Server (NTRS)
2004-01-01
KENNEDY SPACE CENTER, FLA. In the Vehicle Assembly Building (VAB), Scott Thurston (red shirt) stands by while a United Space Alliance worker (blue shirt) gets ready to start moving pieces of Columbia debris, such as the PRSD tank in front, for transfer to a shipping facility and delivery to The Aerospace Corporation in El Segundo, Calif. Thurston is the Columbia debris coordinator. The pieces have been released for loan to the non-governmental agency for testing and research. The Aerospace Corporation requested and will receive graphite/epoxy honeycomb skins from an Orbital Maneuvering System pod, Main Propulsion System Helium tanks, a Reaction Control System Helium tank and a Power Reactant Storage Distribution system tank. The company will use the parts to study re-entry effects on composite materials. NASA notified the Columbia crew’s families about the loan before releasing the items for study. Researchers believe the testing will show how materials are expected to respond to various heating and loads' environments. The findings will help calibrate tools and models used to predict hazards to people and property from reentering hardware. The Aerospace Corporation will have the debris for one year to perform analyses to estimate maximum temperatures during reentry based upon the geometry and mass of the recovered composite. Columbia’s debris is stored in the VAB.
Procedures to develop a computerized adaptive test to assess patient-reported physical functioning.
McCabe, Erin; Gross, Douglas P; Bulut, Okan
2018-06-07
The purpose of this paper is to demonstrate the procedures to develop and implement a computerized adaptive patient-reported outcome (PRO) measure using secondary analysis of a dataset and items from fixed-format legacy measures. We conducted secondary analysis of a dataset of responses from 1429 persons with work-related lower extremity impairment. We calibrated three measures of physical functioning on the same metric, based on item response theory (IRT). We evaluated efficiency and measurement precision of various computerized adaptive test (CAT) designs using computer simulations. IRT and confirmatory factor analyses support combining the items from the three scales for a CAT item bank of 31 items. The item parameters for IRT were calculated using the generalized partial credit model. CAT simulations show that reducing the test length from the full 31 items to a maximum of 8 or 20 items is possible without a significant loss of information (95% and 99% correlation with legacy measure scores, respectively). We demonstrated feasibility and efficiency of using CAT for PRO measurement of physical functioning. The procedures we outlined are straightforward, and can be applied to other PRO measures. Additionally, we have included all the information necessary to implement the CAT of physical functioning in the electronic supplementary material of this paper.
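As a rough illustration of the generalized partial credit model named in this abstract, the sketch below computes the category probabilities for a single item. The function name, argument names, and example values are hypothetical; the paper's own item parameters are in its supplementary material and are not reproduced here.

```python
import math

def gpcm_probabilities(theta, discrimination, thresholds):
    """Category probabilities for one item under the generalized partial credit model.

    thresholds holds the step parameters b_1..b_m; the item has m + 1 categories.
    """
    # Cumulative sums of a * (theta - b_v); category 0 has an empty sum (0.0).
    logits = [0.0]
    for b in thresholds:
        logits.append(logits[-1] + discrimination * (theta - b))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logits)
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]
```

For example, `gpcm_probabilities(0.0, 1.0, [-1.0, 1.0])` returns three probabilities that sum to 1, with the first and last categories equally likely because the step parameters are symmetric around the ability value.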
Bioavailability of Cadmium in Inexpensive Jewelry
Miller, Jennifer; Guinn, Daphne; Pearson, Janna
2011-01-01
Objectives: We evaluated the bioavailability of Cd in 86 components of 57 jewelry items found to contain high levels of Cd (> 10,000 ppm) by X-ray fluorescence (XRF), using extractions that simulate mouthing or swallowing of jewelry items. Methods: We screened jewelry for Cd content by XRF. Bioavailability was measured in two ways. Items were placed in saline solution at 37°C for 6 hr to simulate exposures from mouthing of jewelry items. Items were placed in dilute hydrochloric acid (HCl) at 37°C for 6–96 hr, simulating the worst-case scenario of a child swallowing a jewelry item. Damaged pieces of selected samples were also extracted by both methods to determine the effect of breaching the outer plating on bioavailability. Total Cd content of all items was determined by atomic absorption. Results: The 6-hr saline extraction yielded as much as 2,200 µg Cd, and 24-hr dilute HCl extraction yielded a maximum of > 20,000 µg Cd. Leaching of Cd in dilute HCl increased linearly over 6–96 hr, indicating potential for increasing harm the longer an item remains in the stomach. Damage to jewelry by breaching the outer plating generally, but not always, increased Cd release. Bioavailability did not correlate directly with Cd content. Conclusions: These results indicate the potential for dangerous Cd exposures to children who wear, mouth, or accidentally swallow high-Cd jewelry items. PMID:21377949
Survey Development to Assess College Students' Perceptions of the Campus Environment.
Sowers, Morgan F; Colby, Sarah; Greene, Geoffrey W; Pickett, Mackenzie; Franzen-Castle, Lisa; Olfert, Melissa D; Shelnutt, Karla; Brown, Onikia; Horacek, Tanya M; Kidd, Tandalayo; Kattelmann, Kendra K; White, Adrienne A; Zhou, Wenjun; Riggsbee, Kristin; Yan, Wangcheng; Byrd-Bredbenner, Carol
2017-11-01
We developed and tested a College Environmental Perceptions Survey (CEPS) to assess college students' perceptions of the healthfulness of their campus. CEPS was developed in 3 stages: questionnaire development, validity testing, and reliability testing. Questionnaire development was based on an extensive literature review and input from an expert panel to establish content validity. Face validity was established with the target population using cognitive interviews with 100 college students. Concurrent-criterion validity was established with in-depth interviews (N = 30) of college students compared to surveys completed by the same 30 students. Surveys completed by college students from 8 universities (N = 1147) were used to test internal structure (factor analysis) and internal consistency (Cronbach's alpha). After development and testing, 15 items remained from the original 48 items. A 5-factor solution emerged: physical activity (4 items, α = .635), water (3 items, α = .773), vending (2 items, α = .680), healthy food (2 items, α = .631), and policy (2 items, α = .573). The mean total score for all universities was 62.71 (±11.16) on a 100-point scale. CEPS appears to be a valid and reliable tool for assessing college students' perceptions of their health-related campus environment.
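The internal-consistency figures in this abstract are Cronbach's alpha values. A minimal sketch of the standard formula follows; the function name and input layout are illustrative choices, and the study's data are not reproduced.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: one list of scores per item, all of equal length
    (one entry per respondent, in the same respondent order).
    """
    k = len(item_scores)
    # Total scale score for each respondent.
    totals = [sum(row) for row in zip(*item_scores)]
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))
```

The function needs at least two items and a non-zero variance of total scores. With only 2 to 4 items per factor, as in the CEPS subscales, alpha tends to be lower than for longer scales with similar inter-item correlations.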
Implicit and explicit forgetting: when is gist remembered?
Dorfman, J; Mandler, G
1994-08-01
Recognition (YES/NO) and stem completion (cued: complete with a word from the list; and uncued: complete with the first word that comes to mind) were tested following either semantic or non-semantic processing of a categorized input list. Item/instance information was tested by contrasting target items from the input list with new items that were categorically related to them; gist/categorical information was tested by comparing target items semantically related to the input items with unrelated new items. For both recognition and stem completion, regardless of initial processing condition, item information decayed rapidly over a period of one week. Gist information was maintained over the same period when initial processing was semantic but only in the cued condition for completion. These results are discussed in terms of dual process theory, which postulates activation/integration of a representation as primarily relevant to implicit item information and elaboration of a representation as mainly relevant to semantic (i.e. categorical) information.
Incidental retrieval-induced forgetting of location information.
Gómez-Ariza, Carlos J; Fernandez, Angel; Bajo, M Teresa
2012-06-01
Retrieval-induced forgetting (RIF) has been studied with different types of tests and materials. However, RIF has always been tested on the items' central features, and there is no information on whether inhibition also extends to peripheral features of the events in which the items are embedded. In two experiments, we specifically tested the presence of RIF in a task in which recall of peripheral information was required. After a standard retrieval practice task oriented to item identity, participants were cued with colors (Exp. 1) or with the items themselves (Exp. 2) and asked to recall the screen locations where the items had been displayed during the study phase. RIF for locations was observed after retrieval practice, an effect that was not present when participants were asked to read instead of retrieving the items. Our findings provide evidence that peripheral location information associated with an item during study can be also inhibited when the retrieval conditions promote the inhibition of more central, item identity information.
Computerized Adaptive Testing with Item Clones. Research Report.
ERIC Educational Resources Information Center
Glas, Cees A. W.; van der Linden, Wim J.
To reduce the cost of item writing and to enhance the flexibility of item presentation, items can be generated by item-cloning techniques. An important consequence of cloning is that it may cause variability on the item parameters. Therefore, a multilevel item response model is presented in which it is assumed that the item parameters of a…
Hjermstad, Marianne J; Bergenmar, Mia; Bjordal, Kristin; Fisher, Sheila E; Hofmeister, Dirk; Montel, Sébastien; Nicolatou-Galitis, Ourania; Pinto, Monica; Raber-Durlacher, Judith; Singer, Susanne; Tomaszewska, Iwona M; Tomaszewski, Krzysztof A; Verdonck-de Leeuw, Irma; Yarom, Noam; Winstanley, Julie B; Herlofson, Bente B
2016-09-01
This international EORTC validation study (phase IV) is aimed at testing the psychometric properties of a quality of life (QoL) module related to oral health problems in cancer patients. The phase III module comprised 17 items with four hypothesized multi-item scales and three single items. In phase IV, patients with mixed cancers, in different treatment phases from 10 countries completed the EORTC QLQ-C30, the QLQ-OH module, and a debriefing interview. The hypothesized structure was tested using combinations of classical test theory and item response theory, following EORTC guidelines. Test-retest assessments and responsiveness to change analysis (RCA) were performed after 2 weeks. Five hundred seventy-two patients (median age 60.3, 54 % females) were analyzed. Completion took <10 min for 84 %, and 40 % expressed satisfaction that these issues were addressed. Analyses suggested a revision of the phase III hypothesized scale structure. Two items were deleted based on a high degree of item misfit, together with negative patient feedback. The remaining 15 items formed one eight-item scale named OH-QoL score, a two-item information scale, a two-item scale regarding dentures, and three single items (sticky saliva/mouth soreness/sensitivity to food/drink). Face and convergent validity and internal consistency were confirmed. Test-retest reliability (n = 60) was demonstrated as was RCA for patients undergoing chemotherapy (n = 117; p = 0.06). The resulting QLQ-OH15 discriminated between clinically distinct patient groups, e.g., low performance status vs. higher (p < 0.001), and head-and-neck cancer versus other cancers (p < 0.03). The EORTC module QLQ-OH15 is a short, well-accepted assessment tool focusing on oral problems and QoL to improve clinical management. ClinicalTrials.gov Identifier: NCT01724333.
Item Selection and Pre-equating with Empirical Item Characteristic Curves.
ERIC Educational Resources Information Center
Livingston, Samuel A.
An empirical item characteristic curve shows the probability of a correct response as a function of the student's total test score. These curves can be estimated from large-scale pretest data. They enable test developers to select items that discriminate well in the score region where decisions are made. A similar set of curves can be used to…
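An empirical item characteristic curve of the kind described here can be tabulated directly from scored responses. The sketch below is a simplified illustration: the function name and data layout are assumptions, and the total score includes the studied item, which an operational procedure might handle differently.

```python
from collections import defaultdict

def empirical_icc(responses, item_index):
    """Proportion answering one item correctly at each total test score.

    responses: list of 0/1 lists, one list per student.
    Returns {total_score: proportion_correct_on_item}.
    """
    counts = defaultdict(lambda: [0, 0])  # total score -> [n_correct, n_students]
    for answers in responses:
        total = sum(answers)
        counts[total][0] += answers[item_index]
        counts[total][1] += 1
    return {score: c / n for score, (c, n) in sorted(counts.items())}
```

An item that discriminates well in a given score region shows a steep rise in these proportions across that region.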
ERIC Educational Resources Information Center
Hol, A. Michiel; Vorst, Harrie C. M.; Mellenbergh, Gideon J.
2007-01-01
In a randomized experiment (n = 515), a computerized and a computerized adaptive test (CAT) are compared. The item pool consists of 24 polytomous motivation items. Although items are carefully selected, calibration data show that Samejima's graded response model did not fit the data optimally. A simulation study is done to assess possible…
The Effect of Error in Item Parameter Estimates on the Test Response Function Method of Linking.
ERIC Educational Resources Information Center
Kaskowitz, Gary S.; De Ayala, R. J.
2001-01-01
Studied the effect of item parameter estimation for computation of linking coefficients for the test response function (TRF) linking/equating method. Simulation results showed that linking was more accurate when there was less error in the parameter estimates, and that 15 or 25 common items provided better results than 5 common items under both…
ERIC Educational Resources Information Center
Dutke, Stephan; Barenberg, Jonathan
2015-01-01
We introduce a specific type of item for knowledge tests, confidence-weighted true-false (CTF) items, and review experiences of its application in psychology courses. A CTF item is a statement about the learning content to which students respond whether the statement is true or false, and they rate their confidence level. Previous studies using…
ERIC Educational Resources Information Center
Brown, Frank N.; And Others
The successful Wisconsin Title 1 project item bank offers a valid, flexible, and efficient means of providing migrant student tests in reading and mathematics tailored to instructor curricula. The item bank system consists of nine PASCAL computer programs which maintain, search, and select from approximately 1,000 test items stored on floppy disks…
Cupani, Marcos; Zamparella, Tatiana Castro; Piumatti, Gisella; Vinculado, Grupo
The calibration of item banks provides the basis for computerized adaptive testing that ensures high diagnostic precision and minimizes participants' test burden. This study aims to develop a bank of items to measure the level of Knowledge on Biology using the Rasch model. The sample consisted of 1219 participants that studied in different faculties of the National University of Cordoba (mean age = 21.85 years, SD = 4.66; 66.9% are women). The items were organized in different forms and into separate subtests, with some common items across subtests. The students were told they had to answer 60 questions of knowledge on biology. Evaluation of Rasch model fit (Zstd >|2.0|), differential item functioning, dimensionality, local independence, item and person separation (>2.0), and reliability (>.80) resulted in a bank of 180 items with good psychometric properties. The bank provides items with a wide range of content coverage and may serve as a sound basis for computerized adaptive testing applications. The contribution of this work is significant in the field of educational assessment in Argentina.
ERIC Educational Resources Information Center
Davis, Laurie Laughlin
2004-01-01
Choosing a strategy for controlling item exposure has become an integral part of test development for computerized adaptive testing (CAT). This study investigated the performance of six procedures for controlling item exposure in a series of simulated CATs under the generalized partial credit model. In addition to a no-exposure control baseline…
Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test
ERIC Educational Resources Information Center
Lee, Yi-Hsuan; Zhang, Jinming
2017-01-01
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
Application of Computerized Adaptive Testing to Entrance Examination for Graduate Studies in Turkey
ERIC Educational Resources Information Center
Bulut, Okan; Kan, Adnan
2012-01-01
Problem Statement: Computerized adaptive testing (CAT) is a sophisticated and efficient way of delivering examinations. In CAT, items for each examinee are selected from an item bank based on the examinee's responses to the items. In this way, the difficulty level of the test is adjusted based on the examinee's ability level. Instead of…
ERIC Educational Resources Information Center
Veldkamp, Bernard P.; van der Linden, Wim J.
2008-01-01
In most operational computerized adaptive testing (CAT) programs, the Sympson-Hetter (SH) method is used to control the exposure of the items. Several modifications and improvements of the original method have been proposed. The Stocking and Lewis (1998) version of the method uses a multinomial experiment to select items. For severely constrained…
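The basic Sympson-Hetter idea can be sketched as a probabilistic filter applied after item selection: a selected item is administered only with its exposure-control probability, and otherwise the next candidate is tried. This is a simplified sketch of the original method only, not of the Stocking and Lewis multinomial version or the modifications studied in this article; the function and parameter names are hypothetical.

```python
import random

def sympson_hetter_select(ranked_items, exposure_params, rng=random):
    """Pick an item from candidates ordered from most to least informative.

    exposure_params maps item id -> probability k_i of administering the
    item once it has been selected. Returns None if every candidate is rejected.
    """
    for item in ranked_items:
        # rng.random() is in [0, 1), so k_i = 1.0 always passes and k_i = 0.0 never does.
        if rng.random() < exposure_params[item]:
            return item  # administer this item
        # Otherwise skip the item for this examinee and try the next best one.
    return None
```

In practice the exposure parameters are set through iterative simulation so that no item's exposure rate exceeds a target; that calibration step is not shown here.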
Rasch Based Analysis of Oral Proficiency Test Data.
ERIC Educational Resources Information Center
Nakamura, Yuji
2001-01-01
This paper examines the rating scale data of oral proficiency tests analyzed by a Rasch Analysis focusing on an item map and factor analysis. In discussing the item map, the difficulty order of six items and students' answering patterns are analyzed using descriptive statistics and measures of central tendency of test scores. The data ranks the…
An Approach to Scoring and Equating Tests with Binary Items: Piloting With Large-Scale Assessments
ERIC Educational Resources Information Center
Dimitrov, Dimiter M.
2016-01-01
This article describes an approach to test scoring, referred to as "delta scoring" (D-scoring), for tests with dichotomously scored items. The D-scoring uses information from item response theory (IRT) calibration to facilitate computations and interpretations in the context of large-scale assessments. The D-score is computed from the…
ERIC Educational Resources Information Center
Kim, Seonghoon
2013-01-01
With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
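For reference, the original Lord and Wingersky recursion for dichotomous items can be sketched as follows. It covers only the number-correct case mentioned in the first sentence of this abstract, not the generalized algorithm for real-number item scores that the article presents; the function name is an illustrative choice.

```python
def number_correct_distribution(probabilities):
    """Distribution of the number-correct score for dichotomous items.

    probabilities: P(correct) for each item at one fixed proficiency.
    Returns a list where entry s is P(score == s).
    """
    dist = [1.0]  # before any item is added, the score is 0 with certainty
    for p in probabilities:
        updated = [0.0] * (len(dist) + 1)
        for score, mass in enumerate(dist):
            updated[score] += mass * (1.0 - p)  # item answered incorrectly
            updated[score + 1] += mass * p      # item answered correctly
        dist = updated
    return dist
```

Each item is added in turn, so the cost grows with the square of the number of items rather than with the number of possible response patterns.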
Optimizing the Use of Response Times for Item Selection in Computerized Adaptive Testing
ERIC Educational Resources Information Center
Choe, Edison M.; Kern, Justin L.; Chang, Hua-Hua
2018-01-01
Despite common operationalization, measurement efficiency of computerized adaptive testing should not only be assessed in terms of the number of items administered but also the time it takes to complete the test. To this end, a recent study introduced a novel item selection criterion that maximizes Fisher information per unit of expected response…
ERIC Educational Resources Information Center
Feldt, Leonard S.
2004-01-01
In some settings, the validity of a battery composite or a test score is enhanced by weighting some parts or items more heavily than others in the total score. This article describes methods of estimating the total score reliability coefficient when differential weights are used with items or parts.
Applications of NLP Techniques to Computer-Assisted Authoring of Test Items for Elementary Chinese
ERIC Educational Resources Information Center
Liu, Chao-Lin; Lin, Jen-Hsiang; Wang, Yu-Chun
2010-01-01
The authors report an implemented environment for computer-assisted authoring of test items and provide a brief discussion about the applications of NLP techniques for computer assisted language learning. Test items can serve as a tool for language learners to examine their competence in the target language. The authors apply techniques for…
Construction and Analysis of Educational Tests Using Abductive Machine Learning
ERIC Educational Resources Information Center
El-Alfy, El-Sayed M.; Abdel-Aal, Radwan E.
2008-01-01
Recent advances in educational technologies and the wide-spread use of computers in schools have fueled innovations in test construction and analysis. As the measurement accuracy of a test depends on the quality of the items it includes, item selection procedures play a central role in this process. Mathematical programming and the item response…
Role of Cognitive Testing in the Development of the CAHPS® Hospital Survey
Levine, Roger E; Fowler, Floyd J; Brown, Julie A
2005-01-01
Objective: To describe how cognitive testing results were used to inform the modification and selection of items for the Consumer Assessment of Health Providers and Systems (CAHPS®) Hospital Survey pilot test instrument. Data Sources: Cognitive interviews were conducted on 31 subjects in two rounds of testing: in December 2002–January 2003 and in February 2003. In both rounds, interviews were conducted in northern California, southern California, Massachusetts, and North Carolina. Study Design: A common protocol served as the basis for cognitive testing activities in each round. This protocol was modified to enable testing of the items as interviewer-administered and self-administered items and to allow members of each of three research teams to use their preferred cognitive research tools. Data Collection/Extraction Methods: Each research team independently summarized, documented, and reported their findings. Item-specific and general issues were noted. The results were reviewed and discussed by senior staff from each research team after each round of testing, to inform the acceptance, modification, or elimination of candidate items. Principal Findings: Many candidate items required modification because respondents lacked the information required to answer them, respondents failed to understand them consistently, the items were not measuring the constructs they were intended to measure, the items were based on erroneous assumptions about what respondents wanted or experienced during their hospitalization, or the items were asking respondents to make distinctions that were too fine for them to make. Cognitive interviewing enabled the detection of these problems; an understanding of the etiology of the problem informed item revisions. However, for some constructs, the revisions proved to be inadequate. Accordingly, items could not be developed to provide acceptable measures of certain constructs such as shared decision making, coordination of care, and delays in the admissions process. Conclusions: Cognitive testing is the most direct way of finding out whether respondents understand questions consistently, have the information needed to answer the questions, and can use the response alternatives provided to describe their experiences or their opinions accurately. Many of the candidate questions failed to meet these standards. Cognitive testing only evaluates the way in which respondents understand and answer questions. Although it does not directly assess the validity of the answers, it is a reasonable premise that cognitive problems will seriously compromise validity and reliability. PMID:16316437
Bartoli, Francesco; Crocamo, Cristina; Biagi, Enrico; Di Carlo, Francesco; Parma, Francesca; Madeddu, Fabio; Capuzzi, Enrico; Colmegna, Fabrizia; Clerici, Massimo; Carrà, Giuseppe
2016-08-01
There is a lack of studies testing accuracy of fast screening methods for alcohol use disorder in mental health settings. We aimed at estimating clinical utility of a standard single-item test for case finding and screening of DSM-5 alcohol use disorder among individuals suffering from anxiety and mood disorders. We recruited adults consecutively referred, in a 12-month period, to an outpatient clinic for anxiety and depressive disorders. We assessed the National Institute on Alcohol Abuse and Alcoholism (NIAAA) single-item test, using the Mini-International Neuropsychiatric Interview (MINI), plus an additional item of Composite International Diagnostic Interview (CIDI) for craving, as reference standard to diagnose a current DSM-5 alcohol use disorder. We estimated sensitivity and specificity of the single-item test, as well as positive and negative Clinical Utility Indexes (CUIs). 242 subjects with anxiety and mood disorders were included. The NIAAA single-item test showed high sensitivity (91.9%) and specificity (91.2%) for DSM-5 alcohol use disorder. The positive CUI was 0.601, whereas the negative one was 0.898, with excellent values also accounting for main individual characteristics (age, gender, diagnosis, psychological distress levels, smoking status). Testing for relevant indexes, we found an excellent clinical utility of the NIAAA single-item test for screening true negative cases. Our findings support a routine use of reliable methods for rapid screening in similar mental health settings. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
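The indexes reported in this abstract can be computed from a 2x2 screening table. The sketch below assumes the commonly used definitions CUI+ = sensitivity x positive predictive value and CUI- = specificity x negative predictive value. The cell counts are hypothetical: they were chosen so that the results round to the reported figures for 242 subjects, and they are not taken from the paper.

```python
def screening_indexes(tp, fp, fn, tn):
    """Sensitivity, specificity and clinical utility indexes from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)  # positive predictive value
    npv = tn / (tn + fn)  # negative predictive value
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "cui_positive": sensitivity * ppv,  # case-finding utility
        "cui_negative": specificity * npv,  # screening (rule-out) utility
    }

# Hypothetical counts (not reported in the abstract) that sum to 242 subjects.
result = screening_indexes(tp=34, fp=18, fn=3, tn=187)
```

With these illustrative counts the negative index is much higher than the positive one, which matches the abstract's conclusion that the test is most useful for screening out true negative cases.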
Hong, Quan Nha; Coutu, Marie-France; Berbiche, Djamal
2017-01-01
The Work Role Functioning Questionnaire (WRFQ) was developed to assess workers' perceived ability to perform job demands and is used to monitor presenteeism. Few studies on its validity can be found in the literature. The purpose of this study was to assess the items and factorial composition of the Canadian French version of the WRFQ (WRFQ-CF). Two measurement approaches were used to test the WRFQ-CF: Classical Test Theory (CTT) and non-parametric Item Response Theory (IRT). A total of 352 completed questionnaires were analyzed. Four-factor and three-factor models were tested and showed good fit with 14 items (Root Mean Square Error of Approximation (RMSEA) = 0.06, Standardized Root Mean Square Residual (SRMR) = 0.04, Bentler Comparative Fit Index (CFI) = 0.98) and with 17 items (RMSEA = 0.059, SRMR = 0.048, CFI = 0.98), respectively. Using IRT, 13 problematic items were identified, of which 9 were common with CTT. This study tested different models, and fewer problematic items were found in the three-factor model. Using non-parametric IRT and CTT for item purification gave complementary results. IRT is still scarcely used and can be an interesting alternative method to enhance the quality of a measurement instrument. More studies are needed on the WRFQ-CF to refine its items and factorial composition.
ERIC Educational Resources Information Center
Geiser, Christian; Lehmann, Wolfgang; Eid, Michael
2006-01-01
Items of mental rotation tests can not only be solved by mental rotation but also by other solution strategies. A multigroup latent class analysis of 24 items of the Mental Rotations Test (MRT) was conducted in a sample of 1,695 German pupils and students to find out how many solution strategies can be identified for the items of this test. The…
A review of guidelines on home drug testing web sites for parents.
Washio, Yukiko; Fairfax-Columbo, Jaymes; Ball, Emily; Cassey, Heather; Arria, Amelia M; Bresani, Elena; Curtis, Brenda L; Kirby, Kimberly C
2014-01-01
To update and extend prior work reviewing Web sites that discuss home drug testing for parents, and assess the quality of information that the Web sites provide, to assist them in deciding when and how to use home drug testing. We conducted a worldwide Web search that identified 8 Web sites providing information for parents on home drug testing. We assessed the information on the sites using a checklist developed with field experts in adolescent substance abuse and psychosocial interventions that focus on urine testing. None of the Web sites covered all the items on the 24-item checklist, and only 3 covered at least half of the items (12, 14, and 21 items, respectively). The remaining 5 Web sites covered less than half of the checklist items. The mean number of items covered by the Web sites was 11. Among the Web sites that we reviewed, few provided thorough information to parents regarding empirically supported strategies to effectively use drug testing to intervene on adolescent substance use. Furthermore, most Web sites did not provide thorough information regarding the risks and benefits to inform parents' decision to use home drug testing. Empirical evidence regarding efficacy, benefits, risks, and limitations of home drug testing is needed.
76 FR 57941 - Retrospective Review Under E.O. 13563: Cargo Preference
Federal Register 2010, 2011, 2012, 2013, 2014
2011-09-19
... attendees are encouraged to limit bags and other items (e.g. mobile phones, laptops, cameras, etc.) they... phone [See also Registration]. Agenda released on regs.dot.gov September 28, 2011. and MarAd Web site...
Validity and Reliability of the 8-Item Work Limitations Questionnaire.
Walker, Timothy J; Tullar, Jessica M; Diamond, Pamela M; Kohl, Harold W; Amick, Benjamin C
2017-12-01
Purpose To evaluate factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system. Methods A secondary analysis using de-identified data from employees who completed an annual Health Assessment between the years 2009-2015 tested research aims. Confirmatory factor analysis (CFA) (n = 10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations between the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and demographic variables for discriminant validity (gender and institution type). Results A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68-0.70) and the test-retest reliability was very good (ICC = 0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables while weak associations were observed between the demographic variables. Conclusions The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.
A Procedure to Detect Item Bias Present Simultaneously in Several Items
1991-04-25
exhibit a coherent and major biasing influence at the test level. In particular, this can be true even if each individual item displays only a minor...response functions (IRFs) without the use of item parameter estimation algorithms when the sample size is too small for their use. Thissen, Steinberg...convention). A random sample of examinees is drawn from each group, and a test of N items is administered to them. Typically it is suspected that a
Evaluating innovative items for the NCLEX, part I: usability and pilot testing.
Wendt, Anne; Harmes, J Christine
2009-01-01
National Council of State Boards of Nursing (NCSBN) has recently conducted preliminary research on the feasibility of including various types of innovative test questions (items) on the NCLEX. This article focuses on the participants' reactions to and their strategies for interacting with various types of innovative items. Part 2 in the May/June issue will focus on the innovative item templates and evaluation of the statistical characteristics and the level of cognitive processing required to answer the examination items.
Validity of Computer Adaptive Tests of Daily Routines for Youth with Spinal Cord Injury
Haley, Stephen M.
2013-01-01
Objective: To evaluate the accuracy of computer adaptive tests (CATs) of daily routines for child- and parent-reported outcomes following pediatric spinal cord injury (SCI) and to evaluate the validity of the scales. Methods: One hundred ninety-six daily routine items were administered to 381 youths and 322 parents. Pearson correlations, intraclass correlation coefficients (ICC), and 95% confidence intervals (CI) were calculated to evaluate the accuracy of simulated 5-item, 10-item, and 15-item CATs against the full-item banks and to evaluate concurrent validity. Independent samples t tests and analysis of variance were used to evaluate the ability of the daily routine scales to discriminate between children with tetraplegia and paraplegia and among 5 motor groups. Results: ICC and 95% CI demonstrated that simulated 5-, 10-, and 15-item CATs accurately represented the full-item banks for both child- and parent-report scales. The daily routine scales demonstrated discriminative validity, except between 2 motor groups of children with paraplegia. Concurrent validity of the daily routine scales was demonstrated through significant relationships with the FIM scores. Conclusion: Child- and parent-reported outcomes of daily routines can be obtained using CATs with the same relative precision of a full-item bank. Five-item, 10-item, and 15-item CATs have discriminative and concurrent validity. PMID:23671380
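As an illustration of how a simulated CAT of the kind evaluated above might pick its next item, here is a minimal Python sketch. It is not the authors' procedure: it assumes a two-parameter logistic (2PL) model and a maximum-information selection rule, neither of which is stated in the abstract, and the function names and item parameters are hypothetical.

```python
import math

def prob_2pl(theta, a, b):
    """Probability of endorsing an item under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = prob_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta, bank, administered):
    """Return the index of the unused item with the most information at theta."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))

# Hypothetical bank of (discrimination a, difficulty b) pairs.
bank = [(1.0, -1.0), (2.0, 0.0), (1.5, 2.0)]
first = select_next_item(0.0, bank, set())
second = select_next_item(0.0, bank, {first})
```

A fixed-length CAT (for example 5, 10, or 15 items) would repeat this selection step, re-estimating theta after each response, until the item budget is spent.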
Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias
2018-04-10
To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading <.5, 4 residual correlations >.3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.
Buck, Harleah G; Harkness, Karen; Ali, Muhammad Usman; Carroll, Sandra L; Kryworuchko, Jennifer; McGillion, Michael
2017-04-01
Caregivers (CGs) contribute important assistance with heart failure (HF) self-care, including daily maintenance, symptom monitoring, and management. Until CGs' contributions to self-care can be quantified, it is impossible to characterize it, account for its impact on patient outcomes, or perform meaningful cost analyses. The purpose of this study was to conduct psychometric testing and item reduction on the recently developed 34-item Caregiver Contribution to Heart Failure Self-care (CACHS) instrument using classical and item response theory methods. Fifty CGs (mean age 63 years ±12.84; 70% female) recruited from a HF clinic completed the CACHS in 2014, and results were evaluated using classical test theory and item response theory. Items would be deleted for low (<.05) or high (>.95) endorsement, low (<.3) or high (>.7) corrected item-total correlations, significant pairwise correlation coefficients, floor or ceiling effects, relatively low latent trait and item information function levels (<1.5 and p > .5), and differential item functioning. After analysis, 14 items were excluded, resulting in a 20-item instrument (self-care maintenance eight items; monitoring seven items; and management five items). Most items demonstrated moderate to high discrimination (median 2.13, minimum .77, maximum 5.05), and appropriate item difficulty (-2.7 to 1.4). Internal consistency reliability was excellent (Cronbach α = .94, average inter-item correlation = .41) with no ceiling effects. The newly developed 20-item version of the CACHS is supported by rigorous instrument development and represents a novel instrument to measure CGs' contribution to HF self-care. © 2016 Wiley Periodicals, Inc.
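Two of the classical test theory quantities named above, Cronbach's α and the corrected item-total correlation, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the study's analysis code: the helper names and the toy data are hypothetical, and only the .3 and .7 cut-offs come from the abstract.

```python
from statistics import mean, pvariance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def corrected_item_total(rows):
    """Correlate each item with the total of the remaining items."""
    k = len(rows[0])
    return [
        pearson([r[j] for r in rows], [sum(r) - r[j] for r in rows])
        for j in range(k)
    ]

def flag_items(rows, low=0.3, high=0.7):
    """Indices of items whose corrected item-total correlation is outside [low, high]."""
    return [j for j, r in enumerate(corrected_item_total(rows)) if r < low or r > high]

# Hypothetical toy data: two perfectly redundant items.
toy = [[1, 1], [2, 2], [3, 3]]
alpha = cronbach_alpha(toy)
flagged = flag_items(toy)
```

In the toy data both items are flagged as redundant (correlation above .7), which mirrors the abstract's rule of dropping items with either too low or too high a corrected item-total correlation.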
FIM-Minimum Data Set Motor Item Bank: Short Forms Development and Precision Comparison in Veterans.
Li, Chih-Ying; Romero, Sergio; Simpson, Annie N; Bonilha, Heather S; Simpson, Kit N; Hong, Ickpyo; Velozo, Craig A
2018-03-01
To improve the practical use of the short forms (SFs) developed from the item bank, we compared the measurement precision of the 4- and 8-item SFs generated from a motor item bank composed of the FIM and the Minimum Data Set (MDS). The FIM-MDS motor item bank allowed scores generated from different instruments to be co-calibrated. The 4- and 8-item SFs were developed based on Rasch analysis procedures. This article compared person strata, ceiling/floor effects, and test SE plots for each administration form and examined 95% confidence interval error bands of anchored person measures with the corresponding SFs. We used 0.3 SE as a criterion to reflect a reliability level of .90. Veterans' inpatient rehabilitation facilities and community living centers. Veterans (N=2500) who had both FIM and the MDS data within 6 days during 2008 through 2010. Not applicable. Four- and 8-item SFs of FIM, MDS, and FIM-MDS motor item bank. Six SFs were generated with 4 and 8 items across a range of difficulty levels from the FIM-MDS motor item bank. The three 8-item SFs all had higher correlations with the item bank (r=.82-.95), higher person strata, and less test error than the corresponding 4-item SFs (r=.80-.90). The three 4-item SFs did not meet the criteria of SE <0.3 for any theta values. Eight-item SFs could improve clinical use of the item bank composed of existing instruments across the continuum of care in veterans. We also found that the number of items, not test specificity, determines the precision of the instrument. Copyright © 2017 American Congress of Rehabilitation Medicine. All rights reserved.
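The link between the 0.3 SE criterion and a reliability of about .90 can be checked with a short Python sketch. It assumes the common conversion reliability = 1 − SE²/σ², with the person measures scaled to a standard deviation σ of 1; the abstract does not state which conversion was used, so the formula and function names here are assumptions.

```python
import math

def reliability_from_se(se, theta_sd=1.0):
    """Reliability implied by a standard error, assuming rel = 1 - (SE / SD)^2."""
    return 1.0 - (se / theta_sd) ** 2

def se_from_reliability(reliability, theta_sd=1.0):
    """Standard error implied by a target reliability under the same assumption."""
    return theta_sd * math.sqrt(1.0 - reliability)

print(round(reliability_from_se(0.3), 2))
print(round(se_from_reliability(0.90), 3))
```

Under this assumption an SE of 0.3 corresponds to a reliability of roughly .91, and a reliability of exactly .90 corresponds to an SE of roughly 0.316, which is consistent with using 0.3 as a slightly conservative cut-off.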
ERIC Educational Resources Information Center
Hansen, James D.; Dexter, Lee
1997-01-01
Analysis of test item banks in 10 auditing textbooks found that 75% of questions violated one or more guidelines for multiple-choice items. In comparison, 70% of a certified public accounting exam bank had no violations. (SK)
The Multidimensional Structure of Verbal Comprehension Test Items.
ERIC Educational Resources Information Center
Peled, Zimra
1984-01-01
The multidimensional structure of verbal comprehension test items was investigated. Empirical evidence was provided to support the theory that item tasks are multivariate-multiordered composites of faceted components: language, contextual knowledge, and cognitive operation. Linear and circular properties of cylindrical manifestation were…
ERIC Educational Resources Information Center
Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan
2016-01-01
This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discriminatory, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the parscale program. The TUV ability is an ability parameter, here estimated assuming…
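For readers unfamiliar with the three-parameter logistic model mentioned above, a minimal Python sketch of its item response function follows. The parameter values are hypothetical, and the sketch omits the optional scaling constant (often 1.7) that some programs apply to the discrimination parameter; whether the study used it is not stated in the abstract.

```python
import math

def prob_3pl(theta, a, b, c):
    """P(correct) under the 3PL model.

    a: discrimination, b: difficulty, c: guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: when ability equals difficulty, P = c + (1 - c) / 2.
p_at_difficulty = prob_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)
```

With a guessing parameter of 0.2, a test taker whose ability equals the item difficulty answers correctly with probability 0.6, and a very low-ability test taker still succeeds about 20% of the time.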
ERIC Educational Resources Information Center
Stevenson, Claire E.; Heiser, Willem J.; Resing, Wilma C. M.
2016-01-01
Multiple-choice (MC) analogy items are often used in cognitive assessment. However, in dynamic testing, where the aim is to provide insight into potential for learning and the learning process, constructed-response (CR) items may be of benefit. This study investigated whether training with CR or MC items leads to differences in the strategy…
1981-02-01
...Prestwood & Weiss, 1978), which were designed to assess the effects of KR, the provision of KR was confounded with pacing of item presentation...each item. The present study was designed to separately examine the effects of KR and of computer- versus self-pacing of item presentation in order
ERIC Educational Resources Information Center
Swygert, Kimberly A.
In this study, data from an operational computerized adaptive test (CAT) were examined in order to gather information concerning item response times in a CAT environment. The CAT under study included multiple-choice items measuring verbal, quantitative, and analytical reasoning. The analyses included the fitting of regression models describing the…
ERIC Educational Resources Information Center
Ye, Meng; Xin, Tao
2014-01-01
The authors explored the effects of drifting common items on vertical scaling within the higher order framework of item parameter drift (IPD). The results showed that if IPD occurred between a pair of test levels, the scaling performance started to deviate from the ideal state, as indicated by bias of scaling. When there were two items drifting…
ERIC Educational Resources Information Center
Karkee, Thakur B.; Wright, Karen R.
2004-01-01
Different item response theory (IRT) models may be employed for item calibration. Change of testing vendors, for example, may result in the adoption of a different model than that previously used with a testing program. To provide scale continuity and preserve cut score integrity, item parameter estimates from the new model must be linked to the…
The NTID speech recognition test: NSRT(®).
Bochner, Joseph H; Garrison, Wayne M; Doherty, Karen A
2015-07-01
The purpose of this study was to collect and analyse data necessary for expansion of the NSRT item pool and to evaluate the NSRT adaptive testing software. Participants were administered pure-tone and speech recognition tests including W-22 and QuickSIN, as well as a set of 323 new NSRT items and NSRT adaptive tests in quiet and background noise. Performance on the adaptive tests was compared to pure-tone thresholds and performance on other speech recognition measures. The 323 new items were subjected to Rasch scaling analysis. Seventy adults with mild to moderately severe hearing loss participated in this study. Their mean age was 62.4 years (sd = 20.8). The 323 new NSRT items fit very well with the original item bank, enabling the item pool to be more than doubled in size. Data indicate high reliability coefficients for the NSRT and moderate correlations with pure-tone thresholds (PTA and HFPTA) and other speech recognition measures (W-22, QuickSIN, and SRT). The adaptive NSRT is an efficient and effective measure of speech recognition, providing valid and reliable information concerning respondents' speech perception abilities.
Cho, Sun-Joo; Athay, Michele; Preacher, Kristopher J
2013-05-01
Even though many educational and psychological tests are known to be multidimensional, little research has been done to address how to measure individual differences in change within an item response theory framework. In this paper, we suggest a generalized explanatory longitudinal item response model to measure individual differences in change. New longitudinal models for multidimensional tests and existing models for unidimensional tests are presented within this framework and implemented with software developed for generalized linear models. In addition to the measurement of change, the longitudinal models we present can also be used to explain individual differences in change scores for person groups (e.g., learning disabled students versus non-learning disabled students) and to model differences in item difficulties across item groups (e.g., number operation, measurement, and representation item groups in a mathematics test). An empirical example illustrates the use of the various models for measuring individual differences in change when there are person groups and multiple skill domains which lead to multidimensionality at a time point. © 2012 The British Psychological Society.
Development of an item bank for computerized adaptive test (CAT) measurement of pain.
Petersen, Morten Aa; Aaronson, Neil K; Chie, Wei-Chu; Conroy, Thierry; Costantini, Anna; Hammerlid, Eva; Hjermstad, Marianne J; Kaasa, Stein; Loge, Jon H; Velikova, Galina; Young, Teresa; Groenvold, Mogens
2016-01-01
Patient-reported outcomes should ideally be adapted to the individual patient while maintaining comparability of scores across patients. This is achievable using computerized adaptive testing (CAT). The aim here was to develop an item bank for CAT measurement of the pain domain as measured by the EORTC QLQ-C30 questionnaire. The development process consisted of four steps: (1) literature search, (2) formulation of new items and expert evaluations, (3) pretesting and (4) field-testing and psychometric analyses for the final selection of items. In step 1, we identified 337 pain items from the literature. Twenty-nine new items fitting the QLQ-C30 item style were formulated in step 2 and were reduced to 26 items by expert evaluations. Based on interviews with 31 patients from Denmark, France and the UK, the list was further reduced to 21 items in step 3. In step 4, responses were obtained from 1103 cancer patients from five countries. Psychometric evaluations showed that 16 items could be retained in a unidimensional item bank. Evaluations indicated that use of the CAT measure may reduce sample size requirements by 15-25% compared to using the QLQ-C30 pain scale. We have established an item bank of 16 items suitable for CAT measurement of pain. While being backward compatible with the QLQ-C30, the new item bank will significantly improve measurement precision of pain. We recommend initiating CAT measurement by screening for pain using the two original QLQ-C30 pain items. The EORTC pain CAT is currently available for "experimental" purposes.
Assessment of the psychometrics of a PROMIS item bank: self-efficacy for managing daily activities
Hong, Ickpyo; Li, Chih-Ying; Romero, Sergio; Gruber-Baldini, Ann L.; Shulman, Lisa M.
2017-01-01
Purpose The aim of this study is to investigate the psychometrics of the Patient-Reported Outcomes Measurement Information System self-efficacy for managing daily activities item bank. Methods The item pool was field tested on a sample of 1087 participants via internet (n = 250) and in-clinic (n = 837) surveys. All participants reported having at least one chronic health condition. The 35 item pool was investigated for dimensionality (confirmatory factor analyses, CFA and exploratory factor analysis, EFA), item-total correlations, local independence, precision, and differential item functioning (DIF) across gender, race, ethnicity, age groups, data collection modes, and neurological chronic conditions (McFadden Pseudo R2 less than 10 %). Results The item pool met two of the four CFA fit criteria (CFI = 0.952 and SRMR = 0.07). EFA analysis found a dominant first factor (eigenvalue = 24.34) and the ratio of first to second eigenvalue was 12.4. The item pool demonstrated good item-total correlations (0.59–0.85) and acceptable internal consistency (Cronbach’s alpha = 0.97). The item pool maintained its precision (reliability over 0.90) across a wide range of theta (3.70), and there was no significant DIF. Conclusion The findings indicated the item pool has sound psychometric properties and the test items are eligible for development of computerized adaptive testing and short forms. PMID:27048495
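The two EFA figures reported above imply the size of the second eigenvalue, which the abstract does not give directly. A short Python sketch of that arithmetic follows; because the reported ratio is rounded, the derived value is only approximate, and the variable names are illustrative.

```python
# Values reported in the abstract.
first_eigenvalue = 24.34
ratio_first_to_second = 12.4

# Implied second eigenvalue (approximate, since the ratio is rounded).
second_eigenvalue = first_eigenvalue / ratio_first_to_second
print(round(second_eigenvalue, 2))
```

The implied second eigenvalue is a little under 2, far smaller than the first, which is the pattern the authors describe as a dominant first factor.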
Assessment of the psychometrics of a PROMIS item bank: self-efficacy for managing daily activities.
Hong, Ickpyo; Velozo, Craig A; Li, Chih-Ying; Romero, Sergio; Gruber-Baldini, Ann L; Shulman, Lisa M
2016-09-01
The aim of this study is to investigate the psychometrics of the Patient-Reported Outcomes Measurement Information System self-efficacy for managing daily activities item bank. The item pool was field tested on a sample of 1087 participants via internet (n = 250) and in-clinic (n = 837) surveys. All participants reported having at least one chronic health condition. The 35-item pool was investigated for dimensionality (confirmatory factor analyses, CFA and exploratory factor analysis, EFA), item-total correlations, local independence, precision, and differential item functioning (DIF) across gender, race, ethnicity, age groups, data collection modes, and neurological chronic conditions (McFadden Pseudo R² less than 10 %). The item pool met two of the four CFA fit criteria (CFI = 0.952 and SRMR = 0.07). EFA analysis found a dominant first factor (eigenvalue = 24.34) and the ratio of first to second eigenvalue was 12.4. The item pool demonstrated good item-total correlations (0.59-0.85) and acceptable internal consistency (Cronbach's alpha = 0.97). The item pool maintained its precision (reliability over 0.90) across a wide range of theta (3.70), and there was no significant DIF. The findings indicated the item pool has sound psychometric properties and the test items are eligible for development of computerized adaptive testing and short forms.
ERIC Educational Resources Information Center
Kleinke, David J.
Four forms of a 36-item adaptation of the Stanford Achievement Test were administered to 484 fourth graders. External factors potentially influencing test performance were examined, namely: (1) item order (easy-to-difficult vs. uniform); (2) response location (left column vs. right column); (3) handedness which may interact with response location;…
Considering the Use of General and Modified Assessment Items in Computerized Adaptive Testing
ERIC Educational Resources Information Center
Wyse, Adam E.; Albano, Anthony D.
2015-01-01
This article used several data sets from a large-scale state testing program to examine the feasibility of combining general and modified assessment items in computerized adaptive testing (CAT) for different groups of students. Results suggested that several of the assumptions made when employing this type of mixed-item CAT may not be met for…
ERIC Educational Resources Information Center
Li, Ying; Jiao, Hong; Lissitz, Robert W.
2012-01-01
This study investigated the application of multidimensional item response theory (IRT) models to validate test structure and dimensionality. Multiple content areas or domains within a single subject often exist in large-scale achievement tests. Such areas or domains may cause multidimensionality or local item dependence, which both violate the…
ERIC Educational Resources Information Center
Kohli, Nidhi; Koran, Jennifer; Henn, Lisa
2015-01-01
There are well-defined theoretical differences between the classical test theory (CTT) and item response theory (IRT) frameworks. It is understood that in the CTT framework, person and item statistics are test- and sample-dependent. This is not the perception with IRT. For this reason, the IRT framework is considered to be theoretically superior…
ERIC Educational Resources Information Center
Anagnostopoulou, Kyriaki; Hatzinikita, Vassilia; Christidou, Vasilia; Dimopoulos, Kostas
2013-01-01
The paper explores the relationship of the global and the local assessment discourses as expressed by Programme for International Student Assessment (PISA) test items and school-based examinations, respectively. To this end, the paper compares PISA test items related to living systems and the context of life, health, and environment, with Greek…
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.
2016-01-01
The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete…
Predicting Item Difficulty in a Reading Comprehension Test with an Artificial Neural Network.
ERIC Educational Resources Information Center
Perkins, Kyle; And Others
This paper reports the results of using a three-layer backpropagation artificial neural network to predict item difficulty in a reading comprehension test. Two network structures were developed, one with and one without a sigmoid function in the output processing unit. The data set, which consisted of a table of coded test items and corresponding…
Precision-Based Item Selection for Exposure Control in Computerized Adaptive Testing
ERIC Educational Resources Information Center
Carroll, Ian A.
2017-01-01
Item exposure control is, relative to adaptive testing, a nascent concept that has emerged only in the last two to three decades on an academic basis as a practical issue in high-stakes computerized adaptive tests. This study aims to implement a new strategy in item exposure control by incorporating the standard error of the ability estimate into…
ERIC Educational Resources Information Center
Gutl, Christian; Lankmayr, Klaus; Weinhofer, Joachim; Hofler, Margit
2011-01-01
Research in automated creation of test items for assessment purposes has become increasingly important in recent years. Automatic question creation makes it possible to support personalized and self-directed learning activities by preparing appropriate and individualized test items quite easily with relatively little effort or even fully…
ERIC Educational Resources Information Center
Immekus, Jason C.; Maller, Susan J.
2009-01-01
The Kaufman Adolescent and Adult Intelligence Test (KAIT[TM]) is an individually administered test of intelligence for individuals ranging in age from 11 to 85+ years. The item response theory-likelihood ratio procedure, based on the two-parameter logistic model, was used to detect differential item functioning (DIF) in the KAIT across males and…
Higgins, Johanne; Finch, Lois E; Kopec, Jacek; Mayo, Nancy E
2010-02-01
To create and illustrate the development of a method to parsimoniously and hierarchically assess upper extremity function in persons after stroke. Data were analyzed using Rasch analysis. Re-analysis of data from 8 studies involving persons after stroke. Over 4000 patients with stroke who participated in various studies in Montreal and elsewhere in Canada. Data comprised 17 tests or indices of upper extremity function and health-related quality of life, for a total of 99 items related to upper extremity function. Tests and indices included, among others, the Box and Block Test, the Nine-Hole Peg Test and the Stroke Impact Scale. Data were collected at various times post-stroke from 3 days to 1 year. Once the data fit the model, a bank of items measuring upper extremity function with persons and items organized hierarchically by difficulty and ability in log units was produced. This bank forms the basis for eventual computer adaptive testing. The calibration of the items should be tested further psychometrically, as should the interpretation of the metric arising from using the item calibration to measure the upper extremity of individuals.
26 CFR 1.907(c)-2 - Section 907(c)(3) items (for taxable years beginning after December 31, 1982).
Code of Federal Regulations, 2014 CFR
2014-04-01
... stock test. Items described in section 907(c)(3) (A) or (C) are FORI only if a deemed-paid-tax test is met under the criteria of section 902 or 960. The purpose of this test is to require minimum direct or... for the item to qualify as FORI in the hands of the domestic corporation. The test is whether a...
26 CFR 1.907(c)-2 - Section 907(c)(3) items (for taxable years beginning after December 31, 1982).
Code of Federal Regulations, 2011 CFR
2011-04-01
... stock test. Items described in section 907(c)(3) (A) or (C) are FORI only if a deemed-paid-tax test is met under the criteria of section 902 or 960. The purpose of this test is to require minimum direct or... for the item to qualify as FORI in the hands of the domestic corporation. The test is whether a...
26 CFR 1.907(c)-2 - Section 907(c)(3) items (for taxable years beginning after December 31, 1982).
Code of Federal Regulations, 2012 CFR
2012-04-01
... stock test. Items described in section 907(c)(3) (A) or (C) are FORI only if a deemed-paid-tax test is met under the criteria of section 902 or 960. The purpose of this test is to require minimum direct or... for the item to qualify as FORI in the hands of the domestic corporation. The test is whether a...
Relationship of the Basic Attributes Test to Tactical Reconnaissance Pilot Performance
1987-01-01
...agreement between 12 TRS and 91 TRS supervisors. This indicated that those most likely to be faced with the task of determining the performance capabilities...those UPT check flights requiring quick, consistent, and accurate responses. Item Recognition Test: The item recognition test reduced to seven scores.
26 CFR 1.907(c)-2 - Section 907(c)(3) items (for taxable years beginning after December 31, 1982).
Code of Federal Regulations, 2013 CFR
2013-04-01
... stock test. Items described in section 907(c)(3) (A) or (C) are FORI only if a deemed-paid-tax test is met under the criteria of section 902 or 960. The purpose of this test is to require minimum direct or... for the item to qualify as FORI in the hands of the domestic corporation. The test is whether a...
Item Response Theory analysis of Fagerström Test for Cigarette Dependence.
Svicher, Andrea; Cosci, Fiammetta; Giannini, Marco; Pistelli, Francesco; Fagerström, Karl
2018-02-01
The Fagerström Test for Cigarette Dependence (FTCD) and the Heaviness of Smoking Index (HSI) are the gold standard measures to assess cigarette dependence. However, FTCD reliability and factor structure have been questioned and HSI psychometric properties are in need of further investigation. The present study examined the psychometric properties of the FTCD and the HSI via Item Response Theory. The study was a secondary analysis of data collected in 862 Italian daily smokers. Confirmatory factor analysis was run to evaluate the dimensionality of FTCD. A Graded Response Model was applied to FTCD and HSI to verify the fit to the data. Both item and test functioning were analyzed and item statistics, Test Information Function, and scale reliabilities were calculated. Mokken Scale Analysis was applied to estimate homogeneity and Loevinger's coefficients were calculated. The FTCD showed unidimensionality and homogeneity for most of the items and for the total score. It also showed high sensitivity and good reliability from medium to high levels of cigarette dependence, although problems related to some items (i.e., items 3 and 5) were evident. HSI had good homogeneity, adequate item functioning, and high reliability from medium to high levels of cigarette dependence. Significant Differential Item Functioning was found for items 1, 4, 5 of the FTCD and for both items of HSI. HSI seems highly recommended in clinical settings addressed to heavy smokers while FTCD would be better used in smokers with a level of cigarette dependence ranging between low and high. Copyright © 2017 Elsevier Ltd. All rights reserved.
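To make the graded response model concrete, the following Python sketch computes category probabilities for one polytomous item in Samejima's logistic form. The discrimination and threshold values are hypothetical and are not estimates from the study.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one graded-response item.

    thresholds must be in increasing order; an item with m thresholds
    has m + 1 ordered response categories.
    """
    # Cumulative probabilities of responding in category k or higher,
    # padded with 1.0 (lowest category or higher) and 0.0 (above the top).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Hypothetical four-category item evaluated at theta = 0.
probs = grm_category_probs(theta=0.0, a=1.0, thresholds=[-1.0, 0.0, 1.0])
```

The probabilities always sum to 1, and with thresholds placed symmetrically around the respondent's trait level the lowest and highest categories are equally likely.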
Evaluation of Item Candidates: The PROMIS Qualitative Item Review
DeWalt, Darren A.; Rothrock, Nan; Yount, Susan; Stone, Arthur A.
2009-01-01
One of the PROMIS (Patient-Reported Outcome Measurement Information System) network's primary goals is the development of a comprehensive item bank for patient-reported outcomes of chronic diseases. For its first set of item banks, PROMIS chose to focus on pain, fatigue, emotional distress, physical function, and social function. An essential step for the development of an item pool is the identification, evaluation, and revision of extant questionnaire items for the core item pool. In this work, we also describe the systematic process wherein items are classified for subsequent statistical processing by the PROMIS investigators. Six phases of item development are documented: identification of extant items, item classification and selection, item review and revision, focus group input on domain coverage, cognitive interviews with individual items, and final revision before field testing. Identification of items refers to the systematic search for existing items in currently available scales. Expert item review and revision was conducted by trained professionals who reviewed the wording of each item and revised as appropriate for conventions adopted by the PROMIS network. Focus groups were used to confirm domain definitions and to identify new areas of item development for future PROMIS item banks. Cognitive interviews were used to examine individual items. Items successfully screened through this process were sent to field testing and will be subjected to innovative scale construction procedures. PMID:17443114
Validity and Reliability of General Nutrition Knowledge Questionnaire for Adults in Uganda
Bukenya, Richard; Ahmed, Abhiya; Andrade, Jeanette M.; Grigsby-Toussaint, Diana S.; Muyonga, John; Andrade, Juan E.
2017-01-01
This study sought to develop and validate a general nutrition knowledge questionnaire (GNKQ) for Ugandan adults. The initial draft consisted of 133 items on five constructs associated with nutrition knowledge: expert recommendations (16 items), food groups (70 items), selecting food (10 items), nutrition and disease relationship (23 items), and food fortification in Uganda (14 items). The questionnaire validity was evaluated in three studies. For the content validity (study 1), a panel of five nutrition content experts reviewed the GNKQ draft before and after face validity. For the face validity (study 2), head teachers and health workers (n = 27) completed the questionnaire before attending one of three focus groups to review the clarity of the items. For the construct validity and test-retest reliability (study 3), head teachers (n = 40) from private and public primary schools and nutrition (n = 52) and engineering (n = 49) students from Makerere University took the questionnaire twice (two weeks apart). Experts agreed (content validity index, CVI > 0.9; reliability, Gwet’s AC1 > 0.85) that all constructs were relevant to evaluate nutrition knowledge. After the focus groups, 29 items were identified as unclear, requiring major (n = 5) and minor (n = 24) reviews. The final questionnaire had acceptable internal consistency (Cronbach α > 0.95), test-retest reliability (r = 0.89), and differentiated (p < 0.001) nutrition knowledge scores between nutrition (67 ± 5) and engineering (39 ± 11) students. Only the construct on nutrition recommendations was unreliable (Cronbach α = 0.51, test-retest r = 0.55), which requires further optimization. The final questionnaire included topics on food groups (41 items), selecting food (2 items), nutrition and disease relationship (14 items), and food fortification in Uganda (22 items) and had good content, construct, and test-retest reliability to evaluate nutrition knowledge among Ugandan adults. PMID:28230779
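The content validity index reported above can be illustrated with a minimal sketch. It assumes the common convention in which each expert rates an item's relevance on a 4-point scale and ratings of 3 or 4 count as "relevant"; the abstract does not state which scale the authors used, and the ratings below are invented for illustration.

```python
def item_cvi(ratings, relevant_min=3):
    """Item-level CVI: share of experts rating the item as relevant.

    Assumes a 4-point relevance scale where 3 and 4 mean "relevant".
    """
    return sum(r >= relevant_min for r in ratings) / len(ratings)


def scale_cvi_average(rating_matrix):
    """Scale-level CVI (averaging method): mean of the item-level CVIs."""
    item_values = [item_cvi(ratings) for ratings in rating_matrix]
    return sum(item_values) / len(item_values)


# Hypothetical ratings: three items, each rated by a panel of five experts
ratings = [
    [4, 4, 3, 4, 4],
    [3, 4, 4, 2, 4],
    [4, 3, 4, 4, 3],
]
i_cvis = [item_cvi(r) for r in ratings]   # [1.0, 0.8, 1.0]
s_cvi = scale_cvi_average(ratings)
print(i_cvis, round(s_cvi, 3))
```

With five raters, a single "not relevant" rating lowers an item's CVI to 0.8, so a threshold such as CVI > 0.9 effectively requires unanimous agreement on most items.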
Development of knowledge tests for multi-disciplinary emergency training: a review and an example.
Sørensen, J L; Thellesen, L; Strandbygaard, J; Svendsen, K D; Christensen, K B; Johansen, M; Langhoff-Roos, P; Ekelund, K; Ottesen, B; Van Der Vleuten, C
2015-01-01
The literature is sparse on written test development in a post-graduate multi-disciplinary setting. Developing and evaluating knowledge tests for use in multi-disciplinary post-graduate training is challenging. The objective of this study was to describe the process of developing and evaluating a multiple-choice question (MCQ) test for use in a multi-disciplinary training program in obstetric-anesthesia emergencies. A multi-disciplinary working committee with 12 members representing six professional healthcare groups and another 28 participants were involved. Recurrent revisions of the MCQ items were undertaken, followed by a statistical analysis. The MCQ items were developed stepwise, including decisions on aims and content, followed by testing for face and content validity, construct validity, item-total correlation, and reliability. To obtain acceptable content validity, 40 of the original 50 items were included in the final MCQ test. The MCQ test was able to distinguish between levels of competence, and good construct validity was indicated by a significant difference in the mean score between consultants and first-year trainees, as well as between first-year trainees and medical and midwifery students. Evaluation of the item-total correlation analysis in the 40-item set revealed that 11 items needed re-evaluation, four of which addressed content issues in local clinical guidelines. A Cronbach's alpha of 0.83 for reliability was found, which is acceptable. Content and construct validity and reliability were acceptable. The presented template for the development of this MCQ test could be useful to others when developing knowledge tests and may enhance the overall quality of test development. © 2014 The Acta Anaesthesiologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.
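The two item statistics named in the abstract above, Cronbach's alpha and the item-total correlation, can be sketched as follows. This is a minimal illustration using the textbook formulas and the "corrected" item-total variant (each item is correlated with the total of the remaining items); the abstract does not say which variant the authors used, and the score matrix is hypothetical.

```python
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix of examinees (rows) by items (columns)."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(scores, i):
    """Correlation of item i with the total score of all other items."""
    item = [row[i] for row in scores]
    rest = [sum(row) - row[i] for row in scores]
    return pearson(item, rest)


# Hypothetical data: six examinees, four MCQ items scored 1 (correct) or 0
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
]
alpha = cronbach_alpha(scores)
item_total = [corrected_item_total(scores, i) for i in range(4)]
print(round(alpha, 3), [round(r, 3) for r in item_total])
```

Items with a low or negative corrected item-total correlation are the usual candidates for re-evaluation, since they do not track the rest of the test.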