Sample records for chemistry test item

  1. Australian Chemistry Test Item Bank: Years 11 & 12. Volume 1.

    ERIC Educational Resources Information Center

    Commons, C., Ed.; Martin, P., Ed.

    Volume 1 of the Australian Chemistry Test Item Bank, consisting of two volumes, contains nearly 2000 multiple-choice items related to the chemistry taught in Year 11 and Year 12 courses in Australia. Items which were written during 1979 and 1980 were initially published in the "ACER Chemistry Test Item Collection" and in the "ACER…

  2. Australian Chemistry Test Item Bank: Years 11 and 12. Volume 2.

    ERIC Educational Resources Information Center

    Commons, C., Ed.; Martin, P., Ed.

    The second volume of the Australian Chemistry Test Item Bank, consisting of two volumes, contains nearly 2000 multiple-choice items related to the chemistry taught in Year 11 and Year 12 courses in Australia. Items which were written during 1979 and 1980 were initially published in the "ACER Chemistry Test Item Collection" and in the…

  3. ACER Chemistry Test Item Collection. ACER Chemtic Year 12.

    ERIC Educational Resources Information Center

    Australian Council for Educational Research, Hawthorn.

    The chemistry test item bank contains 225 multiple-choice questions suitable for diagnostic and achievement testing; a three-page teacher's guide; answer key with item facilities; an answer sheet; and a 45-item sample achievement test. Although written for the new grade 12 chemistry course in Victoria, Australia, the items are widely applicable.…

  4. ACER Chemistry Test Item Collection (ACER CHEMTIC Year 12 Supplement).

    ERIC Educational Resources Information Center

    Australian Council for Educational Research, Hawthorn.

    This publication contains 317 multiple-choice chemistry test items related to topics covered in the Victorian (Australia) Year 12 chemistry course. It allows teachers access to a range of items suitable for diagnostic and achievement purposes, supplementing the ACER Chemistry Test Item Collection--Year 12 (CHEMTIC). The topics covered are: organic…

  5. The Development of Multiple-Choice Items Consistent with the AP Chemistry Curriculum Framework to More Accurately Assess Deeper Understanding

    ERIC Educational Resources Information Center

    Domyancich, John M.

    2014-01-01

    Multiple-choice questions are an important part of large-scale summative assessments, such as the advanced placement (AP) chemistry exam. However, past AP chemistry exam items often lacked the ability to test conceptual understanding and higher-order cognitive skills. The redesigned AP chemistry exam shows a distinctive shift in item types toward…

  6. Science Library of Test Items. Volume Eighteen. A Collection of Multiple Choice Test Items Relating Mainly to Chemistry.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  7. Development and evaluation of a thermochemistry concept inventory for college-level general chemistry

    NASA Astrophysics Data System (ADS)

    Wren, David A.

    The research presented in this dissertation culminated in a 10-item Thermochemistry Concept Inventory (TCI). The development of the TCI can be divided into two main phases: qualitative studies and quantitative studies. Both phases focused on the primary stakeholders of the TCI, college-level general chemistry instructors and students. Each phase was designed to collect evidence for the validity of the interpretations and uses of TCI testing data. A central use of TCI testing data is to identify student conceptual misunderstandings, which are represented as incorrect options of multiple-choice TCI items. Therefore, quantitative and qualitative studies focused heavily on collecting evidence at the item-level, where important interpretations may be made by TCI users. Qualitative studies included student interviews (N = 28) and online expert surveys (N = 30). Think-aloud student interviews (N = 12) were used to identify conceptual misunderstandings used by students. Novice response process validity interviews (N = 16) helped provide information on how students interpreted and answered TCI items and were the basis of item revisions. Practicing general chemistry instructors (N = 18), or experts, defined boundaries of thermochemistry content included on the TCI. Once TCI items were in the later stages of development, an online version of the TCI was used in an expert response process validity survey (N = 12) to provide expert feedback on item content, format and consensus of the correct answer for each item. Quantitative studies included three phases: beta testing of TCI items (N = 280), pilot testing of a 12-item TCI (N = 485), and a large data collection using a 10-item TCI (N = 1331). In addition to traditional classical test theory analysis, Rasch model analysis was also used for evaluation of testing data at the test and item level.
The TCI was administered in both formative assessment (beta and pilot testing) and summative assessment (large data collection), with items performing well in both. One item, item K, did not have acceptable psychometric properties when the TCI was used as a quiz (summative assessment), but was retained in the final version of the TCI based on the acceptable psychometric properties displayed in pilot testing (formative assessment).

  8. Demonstrating the Difference between Classical Test Theory and Item Response Theory Using Derived Test Data

    ERIC Educational Resources Information Center

    Magno, Carlo

    2009-01-01

    The present report demonstrates the difference between the classical test theory (CTT) and item response theory (IRT) approaches using actual test data for junior high school chemistry students. The CTT and IRT were compared across two samples and two forms of test on their item difficulty, internal consistency, and measurement errors. The specific…

  9. Omani Twelfth Grade Students' Most Common Misconceptions in Chemistry

    ERIC Educational Resources Information Center

    Al-Balushi, Sulaiman M.; Ambusaidi, Abdullah K.; Al-Shuaili, Ali H.; Taylor, Neil

    2012-01-01

    The current study, undertaken in the Sultanate of Oman, explored twelfth grade students' common misconceptions in seven chemistry conceptual areas. The sample included 786 twelfth grade students in Oman while the instrument was a two-tier test called Chemistry Misconceptions Diagnostic Test (CMDT), consisting of 25 items with 12 items…

  10. Chemistry, Grades 7-12. Annotated Bibliography of Tests.

    ERIC Educational Resources Information Center

    Educational Testing Service, Princeton, NJ. Test Collection.

    Among the 22 tests cited in this bibliography are end-of-course tests, tests of the American Chemical Society, and chemistry item banks. This document is one in a series of topical bibliographies from the Test Collection (TC) at the Educational Testing Service (ETS) containing descriptions of more than 18,000 tests and other measurement devices…

  11. A teaching intervention for reading laboratory experiments in college-level introductory chemistry

    NASA Astrophysics Data System (ADS)

    Kirk, Maria Kristine

    The purpose of this study was to determine the effects that a pre-laboratory guide, conceptualized as a "scientific story grammar," has on college chemistry students' learning when they read an introductory chemistry laboratory manual and perform the experiments in the chemistry laboratory. The participants (N = 56) were students enrolled in four existing general chemistry laboratory sections taught by two instructors at a women's liberal arts college. The pre-laboratory guide consisted of eight questions about the experiment, including the purpose, chemical species, variables, chemical method, procedure, and hypothesis. The effects of the intervention were compared with those of the traditional pre-laboratory assignment for the eight chemistry experiments. Measures included quizzes, tests, chemistry achievement test, science process skills test, laboratory reports, laboratory average, and semester grade. The covariates were mathematical aptitude and prior knowledge of chemistry and science processes, on which the groups differed significantly. The study captured students' perceptions of their experience in general chemistry through a survey and interviews with eight students. The only significant differences in the treatment group's performance were in some subscores on lecture items and laboratory items on the quizzes. An apparent induction period was noted, in that significant measures occurred in mid-semester. Voluntary study with the pre-laboratory guide by control students precluded significant differences on measures given later in the semester. The groups' responses to the survey were similar. Significant instructor effects on three survey items were corroborated by the interviews. The researcher's students were more positive about their pre-laboratory tasks, enjoyed the laboratory sessions more, and were more confident about doing chemistry experiments than the laboratory instructor's groups due to differences in scaffolding by the instructors.

  12. Factor analysis for instruments of science learning motivation and its implementation for the chemistry and biology teacher candidates

    NASA Astrophysics Data System (ADS)

    Prasetya, A. T.; Ridlo, S.

    2018-03-01

    The purpose of this study is to test an instrument for measuring science learning motivation and to compare the science learning motivation of chemistry and biology teacher candidates. The Kuesioner Motivasi Sains (KMS) is an Indonesian adaptation of the Science Motivation Questionnaire II (SMQ II), consisting of 25 items with a 5-point Likert scale. The number of respondents for the Exploratory Factor Analysis (EFA) test was 312. The Kaiser-Meyer-Olkin (KMO), determinant, Bartlett’s Sphericity, and Measures of Sampling Adequacy (MSA) tests of the KMS, run with SPSS 20.0 and Lisrel 8.51 software, gave results indicating eligibility. However, testing of communalities showed that 4 items did not qualify, so those items were discarded. In the second test, all eligibility parameters were met, and the Root Mean Square Error of Approximation (RMSEA), the P-Value for the Test of Close Fit (RMSEA <0.05), and the Goodness of Fit Index (GFI) were good. The new KMS, with 21 valid items and a composite reliability of 0.9329, can be used to test the level of science learning motivation, which includes Intrinsic Motivation, Self-Efficacy, Self-Determination, Grade Motivation and Career Motivation, for students who have mastered the Indonesian language. KMS trials with chemistry and biology teacher candidates found no significant difference in learning motivation between the two groups.

  13. What We Don't Test: What an Analysis of Unreleased ACS Exam Items Reveals about Content Coverage in General Chemistry Assessments

    ERIC Educational Resources Information Center

    Reed, Jessica J.; Villafañe, Sachel M.; Raker, Jeffrey R.; Holme, Thomas A.; Murphy, Kristen L.

    2017-01-01

    General chemistry courses are often the foundation for the study of other science disciplines and upper-level chemistry concepts. Students who take introductory chemistry courses are more often from health and science-related fields than chemistry. As such, the content taught and assessed in general chemistry courses is envisioned as building…

  14. Some Effects of Changes in Question Structure and Sequence on Performance in a Multiple Choice Chemistry Test.

    ERIC Educational Resources Information Center

    Hodson, D.

    1984-01-01

    Investigated the effect on student performance of changes in question structure and sequence on a GCE O-level multiple-choice chemistry test. One finding noted is that there was virtually no change in test reliability on reducing the number of options (from five to per test item). (JN)

  15. New decision criteria for selecting delta check methods based on the ratio of the delta difference to the width of the reference range can be generally applicable for each clinical chemistry test item.

    PubMed

    Park, Sang Hyuk; Kim, So-Young; Lee, Woochang; Chun, Sail; Min, Won-Ki

    2012-09-01

    Many laboratories use 4 delta check methods: delta difference, delta percent change, rate difference, and rate percent change. However, guidelines regarding decision criteria for selecting delta check methods have not yet been provided. We present new decision criteria for selecting delta check methods for each clinical chemistry test item. We collected 811,920 and 669,750 paired (present and previous) test results for 27 clinical chemistry test items from inpatients and outpatients, respectively. We devised new decision criteria for the selection of delta check methods based on the ratio of the delta difference to the width of the reference range (DD/RR). Delta check methods based on these criteria were compared with those based on the CV% of the absolute delta difference (ADD) as well as those reported in 2 previous studies. The delta check methods suggested by new decision criteria based on the DD/RR ratio corresponded well with those based on the CV% of the ADD except for only 2 items each in inpatients and outpatients. Delta check methods based on the DD/RR ratio also corresponded with those suggested in the 2 previous studies, except for 1 and 7 items in inpatients and outpatients, respectively. The DD/RR method appears to yield more feasible and intuitive selection criteria and can easily explain changes in the results by reflecting both the biological variation of the test item and the clinical characteristics of patients in each laboratory. We suggest this as a measure to determine delta check methods.
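
    The four delta check quantities and the DD/RR ratio named in this abstract can be illustrated with a minimal Python sketch. The abstract does not give formulas, so the definitions below are the commonly used ones and should be read as assumptions, as should the per-day time unit, the signed (not absolute) ratio, and every name in the code (`PairedResult`, `dd_rr_ratio`, and so on). The numeric values in the usage lines are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class PairedResult:
    """One paired (previous, present) result for a single test item."""
    previous: float      # earlier result
    current: float       # present result
    days_between: float  # assumed time unit: days between the two results


def delta_difference(p: PairedResult) -> float:
    # Assumed definition: present result minus previous result.
    return p.current - p.previous


def delta_percent_change(p: PairedResult) -> float:
    # Assumed definition: delta difference as a percentage of the previous result.
    return 100.0 * (p.current - p.previous) / p.previous


def rate_difference(p: PairedResult) -> float:
    # Assumed definition: delta difference per unit of elapsed time.
    return delta_difference(p) / p.days_between


def rate_percent_change(p: PairedResult) -> float:
    # Assumed definition: delta percent change per unit of elapsed time.
    return delta_percent_change(p) / p.days_between


def dd_rr_ratio(p: PairedResult, ref_low: float, ref_high: float) -> float:
    # Ratio of the delta difference to the width of the reference range.
    return delta_difference(p) / (ref_high - ref_low)


# Hypothetical values for illustration only.
pair = PairedResult(previous=4.0, current=5.0, days_between=2.0)
print(delta_difference(pair), delta_percent_change(pair))
print(rate_difference(pair), rate_percent_change(pair))
print(dd_rr_ratio(pair, ref_low=3.5, ref_high=5.0))
```

    The sketch only computes the quantities; how the study turns DD/RR values into a choice among the four methods is not described in the abstract and is not reproduced here.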

  16. Science Library of Test Items. Volume Twenty-Three. Geology (Part One). Free Response Testing Program.

    ERIC Educational Resources Information Center

    Hopley, Ken; And Others

    The first of several planned volumes of Free Response Test Items contains geology questions developed by the Assessment and Evaluation Unit of the New South Wales Department of Education. Two additional geology volumes and biology and chemistry volumes are in preparation. The questions in this volume were written and reviewed by practicing…

  17. Development and analysis of an instrument to assess student understanding of GOB chemistry knowledge relevant to clinical nursing practice.

    PubMed

    Brown, Corina E; Hyslop, Richard M; Barbera, Jack

    2015-01-01

    The General, Organic, and Biological Chemistry Knowledge Assessment (GOB-CKA) is a multiple-choice instrument designed to assess students' understanding of the chemistry topics deemed important to clinical nursing practice. This manuscript describes the development process of the individual items along with a psychometric evaluation of the final version of the items and instrument. In developing items for the GOB-CKA, essential topics were identified through a series of expert interviews (with practicing nurses, nurse educators, and GOB chemistry instructors) and confirmed through a national survey. Individual items were tested in qualitative studies with students from the target population for clarity and wording. Data from pilot and beta studies were used to evaluate each item and narrow the total item count to 45. A psychometric analysis performed on data from the 45-item final version was used to provide evidence of validity and reliability. The final version of the instrument has a Cronbach's alpha value of 0.76. Feedback from an expert panel provided evidence of face and content validity. Convergent validity was estimated by comparing the results from the GOB-CKA with the General-Organic-Biochemistry Exam (Form 2007) of the American Chemical Society. Instructors who wish to use the GOB-CKA for teaching and research may contact the corresponding author for a copy of the instrument. © 2014 Wiley Periodicals, Inc.

  18. Assessing Student Preparation through Placement Tests

    NASA Astrophysics Data System (ADS)

    McFate, Craig; Olmsted, John, III

    1999-04-01

    The chemistry department at California State University, Fullerton, uses a placement test of its own design to assess student readiness to enroll in General Chemistry. This test contains items designed to test cognitive skills more than factual knowledge. We have analyzed the ability of this test to predict student success (defined as passing the first-semester course with a C or better) using data for 845 students from four consecutive semesters. In common with other placement tests, we find a weak but statistically significant correlation between test performance and course grades. More meaningfully, there is a strong correlation (R² = 0.82) between test score and course success, sufficient to use for counseling purposes. An item analysis was conducted to determine what types of questions provide the best predictability. Six questions from the full set of 25 were identified as strong predictors, on the basis of discrimination indices and coefficients of determination that were more than one standard deviation above the mean values for test items. These questions had little in common except for requiring multistep mathematical operations and formal reasoning.
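
    The selection rule described in this abstract (keep items whose statistic lies more than one standard deviation above the mean for all items) can be sketched as follows. The abstract does not say whether a sample or population standard deviation was used, so the sample form is an assumption; the function name `strong_items` and the index values are hypothetical.

```python
from statistics import mean, stdev


def strong_items(index_by_item):
    """Return the items whose index exceeds mean + one standard deviation.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    values = list(index_by_item.values())
    cutoff = mean(values) + stdev(values)
    return [item for item, value in index_by_item.items() if value > cutoff]


# Hypothetical discrimination indices for five items.
indices = {"q1": 0.1, "q2": 0.2, "q3": 0.2, "q4": 0.3, "q5": 0.7}
print(strong_items(indices))
```

    The study applied the rule to both discrimination indices and coefficients of determination; the same function could be run once per statistic and the results compared.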

  19. The development of an instrument to assess chemistry perceptions

    NASA Astrophysics Data System (ADS)

    Wells, Raymond R.

    The instrument, developed in this study, attempted to correct the deficiencies of previous instruments. Statements of belief and opinion can be validly included under the construct of chemistry perceptions. Further, statements that might be better characterized as science attitudes, math attitudes, or attitudes toward a specific course or program were not included. Eliminating statements of math anxiety and test anxiety ensured that responses to statements of anxiety were perceptions of anxiety solely related to chemistry. The results of the expert judges' responses to the Validation of Proposed Perception Statements forms were detailed to establish construct and content validity. The nature of Likert scale construction and calculation of internal consistency also supported the validity of the instrument. A pilot Chemistry Perception Questionnaire (CPQ) was then constructed based on agreement of the appropriate subscale and mean importance of the perception statements. The pilot CPQ results were subjected to an item analysis based on three sets of statistics: the frequency of each response and the percentage of respondents making each response for each perception statement, the mean and standard deviations for each item, and the item discrimination index which correlated the item scores with the subscale scores. With no zero or negative correlations to the subscale scores, it was not necessary to replace any of the perception statements contained in the pilot instrument. Therefore, the piloted Chemistry Perception Questionnaire became the final instrument. Factor analysis confirmed the multidimensionality of the instrument. The instrument was administered twice with a separation interval of approximately one month in order to perform a test-retest reliability analysis. One hundred and forty-one pairs were matched and results detailed. The correlation between forms, for the total instrument, was 0.9342.
The mean coefficient alpha, for the total instrument, was 0.9495. With test-retest correlations and alphas exceeding 0.70 for all seven subscales and the total instrument, it was determined that the Chemistry Perception Questionnaire instrument achieved reasonably high reliability estimations.

  20. Validity and Reliability Testing of an e-learning Questionnaire for Chemistry Instruction

    NASA Astrophysics Data System (ADS)

    Guspatni, G.; Kurniawati, Y.

    2018-04-01

    The aim of this paper is to examine the validity and reliability of a questionnaire used to evaluate e-learning implementation in chemistry instruction. Forty-eight questionnaires were filled in by students who had studied chemistry through an e-learning system. The questionnaire consisted of 20 indicators evaluating students’ perceptions of using e-learning. Parametric testing was done as data were assumed to follow a normal distribution. Item validity of the questionnaire was examined through item-total correlation using Pearson’s formula while its reliability was assessed with Cronbach’s alpha formula. Moreover, convergent validity was assessed to see whether indicators building a factor had theoretically the same underlying construct. The result of validity testing revealed 19 valid indicators while the result of reliability testing revealed a Cronbach’s alpha value of .886. The result of factor analysis showed that the questionnaire consisted of five factors, and each of them had indicators building the same construct. This article shows the importance of factor analysis to get a construct-valid questionnaire before it is used as a research instrument.
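
    The item-total correlation with Pearson's formula mentioned in this abstract can be sketched as below. This is a generic illustration, not the authors' procedure: it assumes the uncorrected form, in which each item is correlated with the total score that still includes that item, and the names `pearson_r` and `item_total_correlations` and the score matrix are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_total_correlations(scores):
    """Correlate each item's scores with the respondents' total scores.

    Assumption: uncorrected item-total correlation (the total includes the item).
    """
    totals = [sum(row) for row in scores]
    k = len(scores[0])
    return [pearson_r([row[i] for row in scores], totals) for i in range(k)]


# Hypothetical Likert responses: three respondents, two indicators.
responses = [[1, 2], [2, 4], [3, 6]]
print(item_total_correlations(responses))
```

    An indicator would then be judged valid or not by comparing its correlation against a critical value; the abstract does not state the threshold used, so none is applied here.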

  1. Reliability of a science admission test (HAM-Nat) at Hamburg medical school.

    PubMed

    Hissbach, Johanna; Klusmann, Dietrich; Hampe, Wolfgang

    2011-01-01

    The University Hospital in Hamburg (UKE) started to develop a test of knowledge in natural sciences for admission to medical school in 2005 (Hamburger Auswahlverfahren für Medizinische Studiengänge, Naturwissenschaftsteil, HAM-Nat). This study is a step towards establishing the HAM-Nat. We are investigating parallel forms reliability, the effect of a crash course in chemistry on test results, and correlations of HAM-Nat test results with a test of scientific reasoning (similar to a subtest of the "Test for Medical Studies", TMS). 316 first-year students participated in the study in 2007. They completed different versions of the HAM-Nat test which consisted of items that had already been used (HN2006) and new items (HN2007). Four weeks later half of the participants were tested on the HN2007 version of the HAM-Nat again, while the other half completed the test of scientific reasoning. Within this four-week interval students were offered a five-day chemistry course. Parallel forms reliability for four different test versions ranged from r(tt)=.53 to r(tt)=.67. The retest reliabilities of the HN2007 halves were r(tt)=.54 and r(tt)=.61. Correlations of the two HAM-Nat versions with the test of scientific reasoning were r=.34 and r=.21. The crash course in chemistry had no effect on HAM-Nat scores. The results suggest that further versions of the test of natural sciences will not easily conform to the standards of internal consistency, parallel-forms reliability and retest reliability. Much care has to be taken in order to assemble items which could be used interchangeably for the construction of new test versions. The test of scientific reasoning and the HAM-Nat are tapping different constructs. Participation in a chemistry course did not improve students' achievement, probably because the content of the course was not coordinated with the test and many students lacked motivation to do well in the second test.

  2. Reliability of a science admission test (HAM-Nat) at Hamburg medical school

    PubMed Central

    Hissbach, Johanna; Klusmann, Dietrich; Hampe, Wolfgang

    2011-01-01

    Objective: The University Hospital in Hamburg (UKE) started to develop a test of knowledge in natural sciences for admission to medical school in 2005 (Hamburger Auswahlverfahren für Medizinische Studiengänge, Naturwissenschaftsteil, HAM-Nat). This study is a step towards establishing the HAM-Nat. We are investigating parallel forms reliability, the effect of a crash course in chemistry on test results, and correlations of HAM-Nat test results with a test of scientific reasoning (similar to a subtest of the "Test for Medical Studies", TMS). Methods: 316 first-year students participated in the study in 2007. They completed different versions of the HAM-Nat test which consisted of items that had already been used (HN2006) and new items (HN2007). Four weeks later half of the participants were tested on the HN2007 version of the HAM-Nat again, while the other half completed the test of scientific reasoning. Within this four-week interval students were offered a five-day chemistry course. Results: Parallel forms reliability for four different test versions ranged from rtt=.53 to rtt=.67. The retest reliabilities of the HN2007 halves were rtt=.54 and rtt=.61. Correlations of the two HAM-Nat versions with the test of scientific reasoning were r=.34 and r=.21. The crash course in chemistry had no effect on HAM-Nat scores. Conclusions: The results suggest that further versions of the test of natural sciences will not easily conform to the standards of internal consistency, parallel-forms reliability and retest reliability. Much care has to be taken in order to assemble items which could be used interchangeably for the construction of new test versions. The test of scientific reasoning and the HAM-Nat are tapping different constructs. Participation in a chemistry course did not improve students’ achievement, probably because the content of the course was not coordinated with the test and many students lacked motivation to do well in the second test. PMID:21866246

  3. Analysing task design and students' responses to context-based problems through different analytical frameworks

    NASA Astrophysics Data System (ADS)

    Broman, Karolina; Bernholt, Sascha; Parchmann, Ilka

    2015-05-01

    Background: Context-based learning approaches are used to enhance students' interest in, and knowledge about, science. According to different empirical studies, students' interest is improved by applying these more non-conventional approaches, while effects on learning outcomes are less coherent. Hence, further insights are needed into the structure of context-based problems in comparison to traditional problems, and into students' problem-solving strategies. Therefore, a suitable framework is necessary, both for the analysis of tasks and strategies. Purpose: The aim of this paper is to explore traditional and context-based tasks as well as students' responses to exemplary tasks to identify a suitable framework for future design and analyses of context-based problems. The paper discusses different established frameworks and applies the Higher-Order Cognitive Skills/Lower-Order Cognitive Skills (HOCS/LOCS) taxonomy and the Model of Hierarchical Complexity in Chemistry (MHC-C) to analyse traditional tasks and students' responses. Sample: Upper secondary students (n=236) at the Natural Science Programme, i.e. possible future scientists, are investigated to explore learning outcomes when they solve chemistry tasks, both more conventional as well as context-based chemistry problems. Design and methods: A typical chemistry examination test has been analysed, first the test items in themselves (n=36), and thereafter 236 students' responses to one representative context-based problem. Content analysis using HOCS/LOCS and MHC-C frameworks has been applied to analyse both quantitative and qualitative data, allowing us to describe different problem-solving strategies. Results: The empirical results show that both frameworks are suitable to identify students' strategies, mainly focusing on recall of memorized facts when solving chemistry test items. Almost all test items were also assessing lower order thinking.
The combination of frameworks with the chemistry syllabus has been found successful to analyse both the test items as well as students' responses in a systematic way. The framework can therefore be applied in the design of new tasks, the analysis and assessment of students' responses, and as a tool for teachers to scaffold students in their problem-solving process. Conclusions: This paper gives implications for practice and for future research to both develop new context-based problems in a structured way, as well as providing analytical tools for investigating students' higher order thinking in their responses to these tasks.

  4. Thai Grade 11 Students' Alternative Conceptions for Acid-Base Chemistry

    ERIC Educational Resources Information Center

    Artdej, Romklao; Ratanaroutai, Thasaneeya; Coll, Richard Kevin; Thongpanchang, Tienthong

    2010-01-01

    This study involved the development of a two-tier diagnostic instrument to assess Thai high school students' understanding of acid-base chemistry. The acid-base diagnostic test (ABDT) comprising 18 items was administered to 55 Grade 11 students in a science and mathematics programme during the second semester of the 2008 academic year. Analysis of…

  5. A Valid and Reliable Instrument for Cognitive Complexity Rating Assignment of Chemistry Exam Items

    ERIC Educational Resources Information Center

    Knaus, Karen; Murphy, Kristen; Blecking, Anja; Holme, Thomas

    2011-01-01

    The design and use of a valid and reliable instrument for the assignment of cognitive complexity ratings to chemistry exam items is described in this paper. Use of such an instrument provides a simple method to quantify the cognitive demands of chemistry exam items. Instrument validity was established in two different ways: statistically…

  6. Development of the Flame Test Concept Inventory: Measuring Student Thinking about Atomic Emission

    ERIC Educational Resources Information Center

    Bretz, Stacey Lowery; Murata Mayo, Ana Vasquez

    2018-01-01

    This study reports the development of a 19-item Flame Test Concept Inventory, an assessment tool to measure students' understanding of atomic emission. Fifty-two students enrolled in secondary and postsecondary chemistry courses were interviewed about atomic emission and explicitly asked to explain flame test demonstrations and energy level…

  7. The Development of a Visual-Perceptual Chemistry Specific (VPCS) Assessment Tool

    ERIC Educational Resources Information Center

    Oliver-Hoyo, Maria; Sloan, Caroline

    2014-01-01

    The development of the Visual-Perceptual Chemistry Specific (VPCS) assessment tool is based on items that align to eight visual-perceptual skills considered as needed by chemistry students. This tool includes a comprehensive range of visual operations and presents items within a chemistry context without requiring content knowledge to solve…

  8. A Quantum Chemistry Concept Inventory for Physical Chemistry Classes

    ERIC Educational Resources Information Center

    Dick-Perez, Marilu; Luxford, Cynthia J.; Windus, Theresa L.; Holme, Thomas

    2016-01-01

    A 14-item, multiple-choice diagnostic assessment tool, the quantum chemistry concept inventory or QCCI, is presented. Items were developed based on published student misconceptions and content coverage and then piloted and used in advanced physical chemistry undergraduate courses. In addition to the instrument itself, data from both a pretest,…

  9. Student Perceptions of Chemistry Laboratory Learning Environments, Student-Teacher Interactions and Attitudes in Secondary School Gifted Education Classes in Singapore

    NASA Astrophysics Data System (ADS)

    Lang, Quek Choon; Wong, Angela F. L.; Fraser, Barry J.

    2005-09-01

    This study investigated the chemistry laboratory classroom environment, teacher-student interactions and student attitudes towards chemistry among 497 gifted and non-gifted secondary-school students in Singapore. The data were collected using the 35-item Chemistry Laboratory Environment Inventory (CLEI), the 48-item Questionnaire on Teacher Interaction (QTI) and the 30-item Questionnaire on Chemistry-Related Attitudes (QOCRA). Results supported the validity and reliability of the CLEI and QTI for this sample. Stream (gifted versus non-gifted) and gender differences were found in actual and preferred chemistry laboratory classroom environments and teacher-student interactions. Some statistically significant associations of modest magnitude were found between students' attitudes towards chemistry and both the laboratory classroom environment and the interpersonal behaviour of chemistry teachers. Suggestions for improving chemistry laboratory classroom environments and the teacher-student interactions for gifted students are provided.

  10. "JCE" Classroom Activity #105. A Sticky Situation: Chewing Gum and Solubility

    ERIC Educational Resources Information Center

    Montes-Gonzalez, Ingrid; Cintron-Maldonado, Jose A.; Perez-Medina, Ilia E.; Montes-Berrios, Veronica; Roman-Lopez, Saurie N.

    2010-01-01

    In this Activity, students perform several solubility tests using common food items such as chocolate, chewing gum, water, sugar, and oil. From their observations during the Activity, students will initially classify the substances tested as soluble or insoluble. They will then use their understanding of the chemistry of solubility to classify the…

  11. Using Distractor-Driven Standards-Based Multiple-Choice Assessments and Rasch Modeling to Investigate Hierarchies of Chemistry Misconceptions and Detect Structural Problems with Individual Items

    ERIC Educational Resources Information Center

    Herrmann-Abell, Cari F.; DeBoer, George E.

    2011-01-01

    Distractor-driven multiple-choice assessment items and Rasch modeling were used as diagnostic tools to investigate students' understanding of middle school chemistry ideas. Ninety-one items were developed according to a procedure that ensured content alignment to the targeted standards and construct validity. The items were administered to 13360…

  12. Study on a novel core module based on optical fiber bundles for urine dry-chemistry analysis

    NASA Astrophysics Data System (ADS)

    Liu, Gaiqin; Ma, Zengwei; Li, Rui; Hu, Nan; Chen, Ping; Wang, Fei; Zhang, Ruiying; Chen, Longcong

    2017-09-01

    This paper presents a core module with a novel optical structure for analyzing urine by the dry-chemistry method. It consists of a 32-bit microprocessor, optical fiber bundles, a high-precision color sensor and a temperature sensor. The optical fiber bundles control the propagation path of the light and effectively reduce the influence of ambient light and of the distance between the strip and the sensor. The temperature sensor detects the environmental temperature so that the measurement results can be calibrated. Together, these features improve the module's test accuracy, reduce its volume and cost, and simplify its assembly. Additionally, some parameters, including the reflectivity calculation coefficient of each item, the semi-quantitative intervals, and the number of test items, may be modified by corresponding instructions in order to enhance its applicability. Its outputs can likewise be chosen, by available instructions, among the original data, normalized color values, reflectivity, and the semi-quantitative level of each test item. Our results show that the module has a measurement accuracy of more than 95%, good stability, reliability, and consistency, and can be easily used in various types of urine analyzers.

  13. A study of the factors affecting the attitudes of young female students toward chemistry at the high school level

    NASA Astrophysics Data System (ADS)

    Banya, Santonino K.

    Chemistry is a human endeavor that relies on basic human qualities like creativity, insights, reasoning, and skills. It depends on habits of the mind: skepticism, tolerance of ambiguity, openness to new ideas, intellectual honesty, curiosity, and communication. Young female students begin studying chemistry with curiosity; however, when unconvinced, they become skeptical. Research on gender has indicated that attitudes toward science education differ between males and females. A declining interest in chemistry and the underrepresentation of females in the chemical sciences were found (Jacobs, 2000). This study investigated whether self-confidence toward chemistry, the influence of role models, and knowledge about the usefulness of chemistry affected the attitudes toward chemistry of 183 young female high school students across the United States. The young female students surveyed had studied chemistry for at least one year prior to participating in the study during the fall semester of 2003. The randomly selected schools represented diverse economic backgrounds and geographical locations. Data were obtained using the Chemistry Attitude Influencing Factors (CAIF) instrument and from interviews with a focus group of three young female students about the effect of self-confidence toward chemistry, the influence of role models, and knowledge about the usefulness of chemistry on their decision to study chemistry. The CAIF instrument consisted of a 12-item self-confidence questionnaire (ConfiS) and 12-item questionnaires on the influence of role models (RoMoS) and on knowledge about the usefulness of chemistry (US). ConfiS was adopted (with permission) from CAEQ (Coll & Dalgety, 2001), and both RoMoS and US were modified from TOSRA (Fraser, 1978), a public domain document.
The three young female students interviewed gave detailed responses about their opinions regarding self-confidence toward chemistry, the influence of role models, and knowledge about the usefulness of chemistry on their attitudes toward the study of chemistry. Both quantitative (a Likert-type scale questionnaire) and qualitative (open-ended questions) items were used to investigate the views of young female students. Results of the survey were analysed using a correlation test. Significant differences were found in the Likert-type scale scores, providing evidence supporting literature that suggests that self-confidence toward chemistry, the influence of role models, and knowledge about the usefulness of chemistry affect the decision of young female students about the study of chemistry. Interview responses corroborated the results from the survey. Strategies for addressing the problems and recommendations for further studies have been suggested.

  14. Refinement of a Chemistry Attitude Measure for College Students

    ERIC Educational Resources Information Center

    Xu, Xiaoying; Lewis, Jennifer E.

    2011-01-01

    This work presents the evaluation and refinement of a chemistry attitude measure, Attitude toward the Subject of Chemistry Inventory (ASCI), for college students. The original 20-item and revised 8-item versions of ASCI (V1 and V2) were administered to different samples. The evaluation for ASCI had two main foci: reliability and validity. This…

  15. The influence of classroom experiences on community college students self-efficacy, attitude, and future intentions

    NASA Astrophysics Data System (ADS)

    Dawkins, Linda Mulderig

    Science and technology are an integral part of everyday life. Therefore it is necessary that the general population have some understanding and appreciation for science. Participating in activities that are science-related is one way a person could enhance their understanding and appreciation for science. According to the Theory of Planned Behavior (TPB), the attitude and self-efficacy beliefs a person holds regarding an object or activity will influence behavioral intentions (Ajzen, 1991). Therefore, if science educators can have a positive influence on their students' attitude and sense of efficacy toward science, perhaps the result will be a populace who willingly participates in science-related activities, ultimately gaining a better understanding and appreciation for science. The present study examined the relationships between the classroom environment students experienced during a ten week period of introductory chemistry and their attitudes toward chemistry (and general science), chemistry self-efficacy, and intentions to participate in chemistry-related activities in the future. The participants of this study (N = 189) were Midwestern community college students enrolled in an introductory chemistry course. The efficacy scale of the Chemistry Attitude and Experiences Questionnaire (CAEQ) developed by Dalgety, Coll, and Jones (2003) was used to measure student chemistry self-efficacy. The attitude scale used in this study consisted of the attitude toward chemistry items of CAEQ and five additional items pertaining to general science attitude. The classroom environment scale was defined by two measures: (1) instructional pedagogies and (2) teacher immediacy behaviors. 
The items within the instructional pedagogies and teacher immediacy measures were based on previous research that focused on identifying teaching techniques and teacher attributes that were conducive to promoting an engaging, supportive classroom environment that would promote better attitude toward science and stronger science self-efficacy beliefs. Exploratory factor analysis of the attitude items revealed that students did not differentiate between general science attitude and chemistry attitude. Therefore, all twenty-six attitude items were combined into one attitude measure. Additionally, factor analysis revealed that the items designed to measure the separate dimensions of instructional pedagogies and teacher immediacy behavior both loaded highly on the same factor, resulting in the combining of these two sets of items into one measure of classroom environment. Structural equation modeling (SEM) analyses of the relationships between student perceptions of the classroom environment and their attitude, efficacy and intentions to participate in chemistry-related activities revealed that a positive classroom environment was associated with positive changes in both attitude toward chemistry/science and chemistry self-efficacy, as hypothesized. These analyses also supported the hypothesis that a positive change in chemistry self-efficacy beliefs mediated student intentions to participate in chemistry-related activities. However, the findings did not support the hypothesis that positive changes in attitude toward chemistry/science would mediate participation in chemistry-related activities.

  16. Undergraduate chemistry students' conceptions of atomic structure, molecular structure and chemical bonding

    NASA Astrophysics Data System (ADS)

    Campbell, Erin Roberts

    The process of chemical education should facilitate students' construction of meaningful conceptual structures about the concepts and processes of chemistry. It is evident, however, that students at all levels possess concepts that are inconsistent with currently accepted scientific views. The purpose of this study was to examine undergraduate chemistry students' conceptions of atomic structure, chemical bonding and molecular structure. A diagnostic instrument to evaluate students' conceptions of atomic and molecular structure was developed by the researcher. The instrument incorporated multiple-choice items and reasoned explanations based upon relevant literature and a categorical summarization of student responses (Treagust, 1988, 1995). A covalent bonding and molecular structure diagnostic instrument developed by Peterson and Treagust (1989) was also employed. The ex post facto portion of the study examined the conceptual understanding of undergraduate chemistry students using descriptive statistics to summarize the results obtained from the diagnostic instruments. In addition to the descriptive portion of the study, a total score for each student was calculated based on the combination of correct and incorrect choices made for each item. A comparison of scores obtained on the diagnostic instruments by the upper and lower classes of undergraduate students was made using a t-test. This study also examined an axiomatic assumption that an understanding of atomic structure is important in understanding bonding and molecular structure. A Pearson correlation coefficient, r, was calculated to provide a measure of the strength of this association. Additionally, this study gathered information regarding expectations of undergraduate chemistry students' understanding held by the chemical community. Two questionnaires were developed with items based upon the propositional knowledge statements used in the development of the diagnostic instruments.
Subgroups of items from the questionnaires were formed from the combination of items found to measure different aspects of a specific topic area using a reliability analysis. Average scores for the subgroups were compared to results obtained by students on the diagnostic instrument targeting the same topic area. There were no significant differences of the scores on both of the diagnostic instruments between the levels of undergraduate chemistry students. There were, however, significant differences on certain items of the diagnostic instruments between upper and lower class students. Additionally, misconceptions were identified within all levels of these undergraduate students that corresponded to previous results reported in the literature. A significant relationship was found to exist between the scores obtained on the two diagnostic instruments, as well as strong correlations between specific items and the total scores of the instruments. Response to the expectations questionnaires revealed no differences between the chemical industry and chemical academia, but did provide information concerning the chemical community's expectations of undergraduate chemistry students. Results indicate that undergraduate students majoring in chemistry have conceptions that are inconsistent with currently accepted scientific views. The findings also support the hypothesis that an understanding of the general structure of the atom and the roles played by electrons in molecular bonding and structure is important to an understanding of chemical properties and behavior.

  17. Formative Assessment in High School Chemistry Teaching: Investigating the Alignment of Teachers' Goals with Their Items

    ERIC Educational Resources Information Center

    Sandlin, Benjamin; Harshman, Jordan; Yezierski, Ellen

    2015-01-01

    A 2011 report by the Department of Education states that understanding how teachers use results from formative assessments to guide their practice is necessary to improve instruction. Chemistry teachers have goals for items in their formative assessments, but the degree of alignment between what is assessed by these items and the teachers' goals…

  18. Evaluation of diagnostic tools that tertiary teachers can apply to profile their students' conceptions

    NASA Astrophysics Data System (ADS)

    Schultz, Madeleine; Lawrie, Gwendolyn A.; Bailey, Chantal H.; Bedford, Simon B.; Dargaville, Tim R.; O'Brien, Glennys; Tasker, Roy; Thompson, Christopher D.; Williams, Mark; Wright, Anthony H.

    2017-03-01

    A multi-institution collaborative team of Australian chemistry education researchers, teaching a total of over 3000 first year chemistry students annually, has explored a tool for diagnosing students' prior conceptions as they enter tertiary chemistry courses. Five core topics were selected and clusters of diagnostic items were assembled linking related concepts in each topic together. An ordered multiple choice assessment strategy was adopted to enable provision of formative feedback to students through combination of the specific distractors that they chose. Concept items were either sourced from existing research instruments or developed by the project team. The outcome is a diagnostic tool consisting of five topic clusters of five concept items that has been delivered in large introductory chemistry classes at five Australian institutions. Statistical analysis of data has enabled exploration of the composition and validity of the instrument including a comparison between delivery of the complete 25 item instrument with subsets of five items, clustered by topic. This analysis revealed that most items retained their validity when delivered in small clusters. Tensions between the assembly, validation and delivery of diagnostic instruments for the purposes of acquiring robust psychometric research data versus their pragmatic use are considered in this study.

  19. Answering Fixed Response Items in Chemistry: A Pilot Study.

    ERIC Educational Resources Information Center

    Hateley, R. J.

    1979-01-01

    Presents a pilot study on student thinking in chemistry. Verbal comments of a group of six college students were recorded and analyzed to identify how each student arrives at the correct answer in fixed response items in chemistry. (HM)

  20. The effect of participation in an extended inquiry project on general chemistry student laboratory interactions, confidence, and process skills

    NASA Astrophysics Data System (ADS)

    Krystyniak, Rebecca A.

    2001-12-01

    This study explored the effect of participation by second-semester general chemistry students in an extended open-inquiry laboratory investigation on their use of science process skills and confidence in performing specific aspects of laboratory investigations. In addition, verbal interactions of a student lab team among team members and with their instructor over three open-inquiry laboratory sessions and two non-inquiry sessions were investigated. Instruments included the Test of Integrated Process Skills (TIPS), a 36-item multiple-choice instrument, and the Chemistry Laboratory Survey (CLS), a researcher co-designed 20-item 8-point instrument. Instruments were administered at the beginning and close of the semester to 157 second-semester general chemistry students at two universities; students at only one university participated in open-inquiry activity. A MANCOVA was performed to investigate relationships among control and experimental students, TIPS, and CLS post-test scores. Covariates were TIPS and CLS pre-test scores and prior high school and college science experience. No significant relationships were found. Wilcoxon analyses indicated both groups showed an increase in confidence; experimental-group students with below-average TIPS pre-test scores showed a significant increase in science process skills. Transcribed audio tapes of all laboratory-based verbal interactions were analyzed. Coding categories, developed using the constant comparison method, led to an inter-rater reliability of .96. During open-inquiry activities, the lab team interacted less often, sought less guidance from their instructor, and talked less about chemistry concepts than during non-inquiry activities. Evidence confirmed that students used science process skills and engaged in higher-order thinking during both types of activities.
A four-student focus group shared their experiences with open-inquiry activities, indicating that they enjoyed the experience, viewed it as worthwhile, and believed it helped them gain understanding of the nature of chemistry research. Research results indicate that participation in an open-inquiry laboratory increases student confidence and, for some students, the ability to use science process skills. Evidence documents differences in student laboratory interactions and behavior that are attributable to the type of laboratory experience. Further research into aspects of open-inquiry laboratory experiences is recommended.

  1. High School Students' Concepts of Acids and Bases.

    ERIC Educational Resources Information Center

    Ross, Bertram H. B.

    An investigation of Ontario high school students' understanding of acids and bases with quantitative and qualitative methods revealed misconceptions. A concept map, based on the objectives of the Chemistry Curriculum Guideline, generated multiple-choice items and interview questions. The multiple-choice test was administered to 34 grade 12…

  2. Analysis of Student Performance in Peer Led Undergraduate Supplements

    NASA Astrophysics Data System (ADS)

    Gardner, Linda M.

    Foundations of Chemistry courses at the University of Kansas have traditionally accommodated nearly 1,000 individual students every year with a single course in a large lecture hall. To develop a more student-centered learning atmosphere, Peer Led Undergraduate Supplements (PLUS) were introduced to assist students, starting in the spring of 2010. PLUS was derived from the more well-known Peer-Led Team Learning with modifications to meet the specific needs of the university and the students. The yearlong investigation of PLUS Chemistry began in the fall of 2012 to allow for adequate development of materials and training of peer leaders. We examined the impact on academic achievement for students who attended PLUS sessions while controlling for high school GPA, math ACT scores, credit hours earned in high school, completion of calculus, gender, and those aspiring to be pharmacists (i.e., pre-pharmacy students). In a linear least-squares multiple regression, PLUS participants performed on average one percent higher on exam scores for Chemistry 184 and four tenths of a percent higher for Chemistry 188 for each PLUS session attended. Pre-pharmacy status moderated the effect of PLUS attendance on chemistry achievement, ultimately negating any relative gain associated with attending PLUS sessions. Evidence of a gender difference was demonstrated in the Chemistry 188 model, indicating females experience a greater benefit from PLUS sessions. Additionally, an item analysis studied the relationship between PLUS material and individual items on exams. The research found that students who attended PLUS sessions answered items correctly 10 to 20 percent more often than their comparison group for PLUS-related items, and from no difference to 10 percent more often for non-PLUS-related items. In summary, PLUS has a positive effect on exam performance in introductory chemistry courses at the University of Kansas.

  3. The Binary System Laboratory Activities Based on Students Mental Model

    NASA Astrophysics Data System (ADS)

    Albaiti, A.; Liliasari, S.; Sumarna, O.; Martoprawiro, M. A.

    2017-09-01

    Generic science skills (GSS) are required to develop students' conceptions when learning about binary systems. The aim of this research was to determine the improvement in students' GSS through binary system laboratory activities based on their mental models, using a hypothetical-deductive learning cycle. The study used a mixed-methods embedded experimental design and involved 15 students at a university in Papua, Indonesia. An essay test of 7 items was used to analyze the improvement in students' GSS. Each item was designed to interconnect the macroscopic, sub-microscopic and symbolic levels. Student worksheets were used to explore students' mental models during laboratory investigation. The increase in students' GSS could be seen in the N-Gain for each GSS indicator. The results were then analyzed descriptively. Students' mental models and GSS improved in this study. They interconnected the macroscopic and symbolic levels to explain binary system phenomena. Furthermore, they reconstructed their mental models by interconnecting the three levels of representation in Physical Chemistry. It is necessary to integrate the Physical Chemistry Laboratory into a Physical Chemistry course for effectiveness and efficiency.
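
The abstract above reports improvement as an N-Gain per indicator without giving the formula. A minimal sketch, assuming the commonly used normalized gain (the fraction of the possible improvement that was actually achieved) is what is meant; the function name and the scores are hypothetical and are not taken from the study:

```python
def normalized_gain(pre: float, post: float, max_score: float) -> float:
    """Normalized gain: achieved improvement divided by possible improvement.

    Assumes the usual definition g = (post - pre) / (max_score - pre).
    """
    if pre >= max_score:
        # No room to improve, so the gain is undefined.
        raise ValueError("pre-test score must be below the maximum score")
    return (post - pre) / (max_score - pre)


# Hypothetical indicator scores on a 0-100 scale (illustration only).
gain = normalized_gain(pre=40.0, post=70.0, max_score=100.0)
print(gain)  # 0.5
```

A gain of 0.5 here means the student closed half of the gap between the pre-test score and the maximum.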

  4. Relationship of students' conceptual representations and problem-solving abilities in acid-base chemistry

    NASA Astrophysics Data System (ADS)

    Powers, Angela R.

    2000-10-01

    This study explored the relationship between secondary chemistry students' conceptual representations of acid-base chemistry, as shown in student-constructed concept maps, and their ability to solve acid-base problems, represented by their score on an 18-item paper and pencil test, the Acid-Base Concept Assessment (ABCA). The ABCA, consisting of both multiple-choice and short-answer items, was originally designed using a question-type by subtopic matrix, validated by a panel of experts, and refined through pilot studies and factor analysis to create the final instrument. The concept map task included a short introduction to concept mapping, a prototype concept map, a practice concept-mapping activity, and the instructions for the acid-base concept map task. The instruments were administered to chemistry students at two high schools; 108 subjects completed both instruments for this study. Factor analysis of ABCA results indicated that the test was unifactorial for these students, despite the intention to create an instrument with multiple "question-type" scales. Concept maps were scored both holistically and by counting valid concepts. The two approaches were highly correlated (r = 0.75). The correlation between ABCA score and concept-map score was 0.29 for holistically-scored concept maps and 0.33 for counted-concept maps. Although both correlations were significant, they accounted for only 8.8 and 10.2% of variance in ABCA scores, respectively. However, when the reliability of the instruments used is considered, more than 20% of the variance in ABCA scores may be explained by concept map scores. MANOVAs for ABCA and concept map scores by instructor, student gender, and year in school showed significant differences for both holistic and counted concept-map scores. Discriminant analysis revealed that the source of these differences was the instruction variable. 
Significant differences between classes receiving different instruction were found in the frequency of concepts listed by students for 9 of 10 concepts evaluated. Mean ABCA scores did not differ significantly between the two instruction groups. The results of this study failed to provide evidence of conceptual distinctions among different "types" of problem-solving items. The results suggested that several factors influence success in chemistry problem solving, including concept knowledge and organization. Further research into the nature of chemistry problems and problem solving is recommended.
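
The variance figures in the abstract above follow from squaring a correlation, and the remark about reliability can be illustrated with the classical correction for attenuation. The sketch below is an illustration under stated assumptions, not the author's procedure: the abstract does not name the correction used or report the reliabilities, so the two reliability values of 0.70 are hypothetical.

```python
import math


def variance_explained(r: float) -> float:
    """Proportion of variance shared by two measures: r squared."""
    return r * r


def disattenuated_r(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Classical correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)


# Observed correlations reported in the abstract.
for r in (0.29, 0.33):
    print(f"r = {r}: about {variance_explained(r):.1%} of variance")

# Hypothetical reliabilities (assumed, not reported in the abstract).
r_corrected = disattenuated_r(0.33, rel_x=0.70, rel_y=0.70)
print(f"corrected r = {r_corrected:.3f}, "
      f"variance = {variance_explained(r_corrected):.1%}")
```

Squaring the rounded correlations gives values close to the reported 8.8% and 10.2%, and with reliabilities around 0.70 the corrected figure exceeds 20%, in line with the abstract's statement.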

  5. A Computer-Based Instrument That Identifies Common Science Misconceptions

    ERIC Educational Resources Information Center

    Larrabee, Timothy G.; Stein, Mary; Barman, Charles

    2006-01-01

    This article describes the rationale for and development of a computer-based instrument that helps identify commonly held science misconceptions. The instrument, known as the Science Beliefs Test, is a 47-item instrument that targets topics in chemistry, physics, biology, earth science, and astronomy. The use of an online data collection system…

  6. Assessing Conceptual and Algorithmic Knowledge in General Chemistry with ACS Exams

    ERIC Educational Resources Information Center

    Holme, Thomas; Murphy, Kristen

    2011-01-01

    In 2005, the ACS Examinations Institute released an exam for first-term general chemistry in which items are intentionally paired with one conceptual and one traditional item. A second-term, paired-questions exam was released in 2007. This paper presents an empirical study of student performances on these two exams based on national samples of…

  7. Investigating the Effectiveness of Teaching Methods Based on a Four-Step Constructivist Strategy

    NASA Astrophysics Data System (ADS)

    Çalik, Muammer; Ayas, Alipaşa; Coll, Richard K.

    2010-02-01

    This paper reports on an investigation of the effectiveness of an intervention using several different methods for teaching solution chemistry. The teaching strategy comprised a four-step approach derived from a constructivist view of learning. A sample consisting of 44 students (18 boys and 26 girls) was selected purposively from two different Grade 9 classes in the city of Trabzon, Turkey. Data collection employed a purpose-designed 'solution chemistry concept test', consisting of 17 items, with the quantitative data from the survey supported by qualitative interview data. The findings suggest that using different methods embedded within the four-step constructivist-based teaching strategy enables students to refute some alternative conceptions, but does not completely eliminate student alternative conceptions for solution chemistry.

  8. Gender and Minority Achievement Gaps in Science in Eighth Grade: Item Analyses of Nationally Representative Data. Research Report. ETS RR-17-36

    ERIC Educational Resources Information Center

    Qian, Xiaoyu; Nandakumar, Ratna; Glutting, Joseph; Ford, Danielle; Fifield, Steve

    2017-01-01

    In this study, we investigated gender and minority achievement gaps on 8th-grade science items employing a multilevel item response methodology. Both gaps were wider on physics and earth science items than on biology and chemistry items. Larger gender gaps were found on items with specific topics favoring male students than other items, for…

  9. Dimensionality and predictive validity of the HAM-Nat, a test of natural sciences for medical school admission

    PubMed Central

    2011-01-01

    Background: Knowledge in natural sciences generally predicts study performance in the first two years of the medical curriculum. In order to reduce delay and dropout in the preclinical years, Hamburg Medical School decided to develop a natural science test (HAM-Nat) for student selection. In the present study, two different approaches to scale construction are presented: a unidimensional scale and a scale composed of three subject specific dimensions. Their psychometric properties and relations to academic success are compared. Methods: 334 first year medical students of the 2006 cohort responded to 52 multiple choice items from biology, physics, and chemistry. For the construction of scales we generated two random subsamples, one for development and one for validation. In the development sample, unidimensional item sets were extracted from the item pool by means of weighted least squares (WLS) factor analysis, and subsequently fitted to the Rasch model. In the validation sample, the scales were subjected to confirmatory factor analysis and, again, Rasch modelling. The outcome measure was academic success after two years. Results: Although the correlational structure within the item set is weak, a unidimensional scale could be fitted to the Rasch model. However, psychometric properties of this scale deteriorated in the validation sample. A model with three highly correlated subject specific factors performed better. All summary scales predicted academic success with an odds ratio of about 2.0. Prediction was independent of high school grades and there was a slight tendency for prediction to be better in females than in males. Conclusions: A model separating biology, physics, and chemistry into different Rasch scales seems to be more suitable for item bank development than a unidimensional model, even when these scales are highly correlated and enter into a global score.
When such a combination scale is used to select the upper quartile of applicants, the proportion of successful completion of the curriculum after two years is expected to rise substantially. PMID:21999767

  10. Dimensionality and predictive validity of the HAM-Nat, a test of natural sciences for medical school admission.

    PubMed

    Hissbach, Johanna C; Klusmann, Dietrich; Hampe, Wolfgang

    2011-10-14

    Knowledge in natural sciences generally predicts study performance in the first two years of the medical curriculum. In order to reduce delay and dropout in the preclinical years, Hamburg Medical School decided to develop a natural science test (HAM-Nat) for student selection. In the present study, two different approaches to scale construction are presented: a unidimensional scale and a scale composed of three subject specific dimensions. Their psychometric properties and relations to academic success are compared. 334 first year medical students of the 2006 cohort responded to 52 multiple choice items from biology, physics, and chemistry. For the construction of scales we generated two random subsamples, one for development and one for validation. In the development sample, unidimensional item sets were extracted from the item pool by means of weighted least squares (WLS) factor analysis, and subsequently fitted to the Rasch model. In the validation sample, the scales were subjected to confirmatory factor analysis and, again, Rasch modelling. The outcome measure was academic success after two years. Although the correlational structure within the item set is weak, a unidimensional scale could be fitted to the Rasch model. However, psychometric properties of this scale deteriorated in the validation sample. A model with three highly correlated subject specific factors performed better. All summary scales predicted academic success with an odds ratio of about 2.0. Prediction was independent of high school grades and there was a slight tendency for prediction to be better in females than in males. A model separating biology, physics, and chemistry into different Rasch scales seems to be more suitable for item bank development than a unidimensional model, even when these scales are highly correlated and enter into a global score. 
    When such a combination scale is used to select the upper quartile of applicants, the proportion of successful completion of the curriculum after two years is expected to rise substantially.

  11. How Do Undergraduate Students Conceptualize Acid-Base Chemistry? Measurement of a Concept Progression

    ERIC Educational Resources Information Center

    Romine, William L.; Todd, Amber N.; Clark, Travis B.

    2016-01-01

    We developed and validated a new instrument, called "Measuring Concept progressions in Acid-Base chemistry" (MCAB) and used it to better understand the progression of undergraduate students' understandings about acid-base chemistry. Items were developed based on an existing learning progression for acid-base chemistry. We used the Rasch…

  12. The development of an integrated assessment instrument for measuring analytical thinking and science process skills

    NASA Astrophysics Data System (ADS)

    Irwanto, Rohaeti, Eli; LFX, Endang Widjajanti; Suyanta

    2017-05-01

    This research aims to develop an integrated assessment instrument and determine its characteristics. The research uses the 4-D model, which comprises four stages: define, design, develop, and disseminate. The primary product was validated by expert judgment, tested for readability by students, and assessed for feasibility by chemistry teachers. This research involved 246 grade XI students from four senior high schools in Yogyakarta, Indonesia. Data collection techniques include interviews, questionnaires, and tests. Data collection instruments include an interview guideline, an item validation sheet, a users' response questionnaire, an instrument readability questionnaire, and an essay test. The results show that the integrated assessment instrument has an Aiken validity value of 0.95. Item reliability was 0.99 and person reliability was 0.69. Teachers' response to the integrated assessment instrument was very good. Therefore, the integrated assessment instrument is feasible for measuring students' analytical thinking and science process skills.

  13. Developing and validating a chemical bonding instrument for Korean high school students

    NASA Astrophysics Data System (ADS)

    Jang, Nak Han

    The major purpose of this study was to develop a reliable and valid instrument for investigating Korean high school students' understanding of concepts in chemical bonding. The Chemical Bonding Diagnostic Test (CBDT) was developed following the procedures of previous relevant research (Treagust, 1985; Peterson, 1986; Tan, 1994). The final instrument consisted of 15 two-tier items. The reliability coefficient (Cronbach alpha) for the whole test was 0.74. The discrimination index ranged from 0.38 to 0.90, and the overall average difficulty index was 0.38. The test was administered to 716 science-declared students in Korean high schools. Thirty-seven common misconceptions about chemical bonding were identified through analysis of the CBDT items. The grade 11 students had slightly more misconceptions than the grade 12 students for ionic bonding, covalent bonding, and hydrogen bonding, while the grade 12 students had more misconceptions about the octet rule and hydrogen bonding than the grade 11 students. The ANCOVA showed no significant difference by grade, or by grade level and gender together, in the mean CBDT score. However, there was a significant difference by gender and a significant interaction between grade level and chemistry preference. In conclusion, the most common misconceptions among Korean high school students concerned the electron configuration in ionic bonding and the density of water in hydrogen bonding. Korean students' understanding of chemical bonding depended on the interaction between grade level and chemistry preference. Consequently, grade 12 students who preferred chemistry had the highest mean scores among the student groups in this study.

  14. Measuring the development of conceptual understanding in chemistry

    NASA Astrophysics Data System (ADS)

    Claesgens, Jennifer Marie

    The purpose of this dissertation research is to investigate and characterize how students learn chemistry from pre-instruction to deeper understanding of the subject matter in their general chemistry coursework. Based on preliminary work, I believe that students have a general pathway of learning across the "big ideas," or concepts, in chemistry that can be characterized over the course of instruction. My hypothesis is that as students learn chemistry they build from experience and logical reasoning, then relate chemistry-specific ideas in a pair-wise fashion before making more complete multi-relational links for deeper understanding of the subject matter. This proposed progression of student learning, which starts at Notions, moves to Recognition, and then to Formulation, is described in the ChemQuery Perspectives framework. My research continues the development of ChemQuery, an NSF-funded assessment system that uses a framework of the key ideas in the discipline and criterion-referenced analysis using item response theory (IRT) to map student progress. Specifically, this research investigates the potential for using criterion-referenced analysis to describe and measure how students learn chemistry, followed by more detailed task analysis of patterns in student responses found in the data. My research question asks: does IRT work to describe and measure how students learn chemistry and, if so, what is discovered about how students learn? Although my findings seem to neither entirely support nor entirely refute the pathway of student understanding proposed in the ChemQuery Perspectives framework, my research does provide an indication of trouble spots. For example, it seems that the pathway from Notions to Recognition is holding, but there are difficulties around the transition from Recognition to Formulation that cannot be resolved with this data. Nevertheless, this research has produced the following contributions to the development of the ChemQuery assessment system: (a) 13 new change items with good fits and 3 new change items that need further study, (b) a refined scoring guide, and (c) a set of item exemplars that can then be developed further into a computer-adapted model so that more data can be captured.

  15. Guide to Developing High-Quality, Reliable, and Valid Multiple-Choice Assessments

    ERIC Educational Resources Information Center

    Towns, Marcy H.

    2014-01-01

    Chemistry faculty members are highly skilled in obtaining, analyzing, and interpreting physical measurements, but often they are less skilled in measuring student learning. This work provides guidance for chemistry faculty from the research literature on multiple-choice item development in chemistry. Areas covered include content, stem, and…

  16. The Combined Effects of Classroom Teaching and Learning Strategy Use on Students' Chemistry Self-Efficacy

    NASA Astrophysics Data System (ADS)

    Cheung, Derek

    2015-02-01

    For students to be successful in school chemistry, a strong sense of self-efficacy is essential. Chemistry self-efficacy can be defined as students' beliefs about the extent to which they are capable of performing specific chemistry tasks. According to Bandura (Psychol. Rev. 84:191-215, 1977), students acquire information about their level of self-efficacy from four sources: performance accomplishments, vicarious experiences, verbal persuasion, and physiological states. No published studies have investigated how instructional strategies in chemistry lessons can provide students with positive experiences with these four sources of self-efficacy information and how the instructional strategies promote students' chemistry self-efficacy. In this study, questionnaire items were constructed to measure student perceptions about instructional strategies, termed efficacy-enhancing teaching, which can provide positive experiences with the four sources of self-efficacy information. Structural equation modeling was then applied to test a hypothesized mediation model, positing that efficacy-enhancing teaching positively affects students' chemistry self-efficacy through their use of deep learning strategies such as metacognitive control strategies. A total of 590 chemistry students at nine secondary schools in Hong Kong participated in the survey. The mediation model provided a good fit to the student data. Efficacy-enhancing teaching had a direct effect on students' chemistry self-efficacy. Efficacy-enhancing teaching also directly affected students' use of deep learning strategies, which in turn affected students' chemistry self-efficacy. The implications of these findings for developing secondary school students' chemistry self-efficacy are discussed.

  17. The Impact of Non-attempted and Dually-Attempted Items on Person Abilities Using Item Response Theory

    PubMed Central

    Sideridis, Georgios D.; Tsaousis, Ioannis; Al Harbi, Khaleel

    2016-01-01

    The purpose of the present study was to relate response strategy with person ability estimates. Two behavioral strategies were examined: (a) the strategy to skip items in order to save time on timed tests, and (b) the strategy to select two responses on an item, with the hope that one of them may be considered correct. Participants were 4,422 individuals who were administered a standardized achievement measure related to math, biology, chemistry, and physics. In the present evaluation, only the physics subscale was employed. Two analyses were conducted: (a) a person-based one to identify differences between groups and potential correlates of those differences, and (b) a measure-based analysis in order to identify the parts of the measure that were responsible for potential group differentiation. For (a) person abilities the 2-PL model was employed and later the 3-PL and 4-PL models in order to estimate upper and lower asymptotes of person abilities. For (b) differential item functioning, differential test functioning, and differential distractor functioning were investigated. Results indicated that there were significant differences between groups with completers having the highest ability compared to both non-attempters and dual responders. There were no significant differences between non-attempters and dual responders. The present findings have implications for response strategy efficacy and measure evaluation, revision, and construction. PMID:27790174
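
    As an illustration of the 2-PL, 3-PL, and 4-PL models named in this abstract, the minimal sketch below gives the standard four-parameter logistic item response function. It is not the authors' code; the function name and the parameter values are assumptions chosen only for illustration. Setting the lower asymptote c to 0 and the upper asymptote d to 1 recovers the 2-PL model.

```python
import math


def irt_4pl(theta, a, b, c=0.0, d=1.0):
    """Probability of a correct response under the 4-PL model.

    theta: person ability; a: discrimination; b: difficulty;
    c: lower asymptote; d: upper asymptote.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters, chosen only for illustration.
p_2pl = irt_4pl(theta=0.0, a=1.2, b=0.0)                  # c=0, d=1 gives the 2-PL
p_3pl = irt_4pl(theta=0.0, a=1.2, b=0.0, c=0.25)          # adds a lower asymptote
p_4pl = irt_4pl(theta=0.0, a=1.2, b=0.0, c=0.25, d=0.95)  # adds an upper asymptote
```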

  18. The Impact of Non-attempted and Dually-Attempted Items on Person Abilities Using Item Response Theory.

    PubMed

    Sideridis, Georgios D; Tsaousis, Ioannis; Al Harbi, Khaleel

    2016-01-01

    The purpose of the present study was to relate response strategy with person ability estimates. Two behavioral strategies were examined: (a) the strategy to skip items in order to save time on timed tests, and (b) the strategy to select two responses on an item, with the hope that one of them may be considered correct. Participants were 4,422 individuals who were administered a standardized achievement measure related to math, biology, chemistry, and physics. In the present evaluation, only the physics subscale was employed. Two analyses were conducted: (a) a person-based one to identify differences between groups and potential correlates of those differences, and (b) a measure-based analysis in order to identify the parts of the measure that were responsible for potential group differentiation. For (a) person abilities the 2-PL model was employed and later the 3-PL and 4-PL models in order to estimate upper and lower asymptotes of person abilities. For (b) differential item functioning, differential test functioning, and differential distractor functioning were investigated. Results indicated that there were significant differences between groups with completers having the highest ability compared to both non-attempters and dual responders. There were no significant differences between non-attempters and dual responders. The present findings have implications for response strategy efficacy and measure evaluation, revision, and construction.

  19. Atmospheric Pressure Chemical Ionization Sources Used in The Detection of Explosives by Ion Mobility Spectrometry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Waltman, Melanie J.

    2010-05-01

    Explosives detection is a necessary and widespread field of research. From large shipping containers to airline luggage, numerous items are tested for explosives every day. In the area of trace explosives detection, ion mobility spectrometry (IMS) is the technique employed most often because it is a quick, simple, and accurate way to test many items in a short amount of time. Detection by IMS is based on the difference in drift times of product ions through the drift region of an IMS instrument. The product ions are created when the explosive compounds, introduced to the instrument, are chemically ionized through interactions with the reactant ions. The identity of the reactant ions determines the outcomes of the ionization process. This research investigated the reactant ions created by various ionization sources and looked into ways to manipulate the chemistry occurring in the sources.

  20. Development and Analysis of an Instrument to Assess Student Understanding of GOB Chemistry Knowledge Relevant to Clinical Nursing Practice

    ERIC Educational Resources Information Center

    Brown, Corina E.; Hyslop, Richard M.; Barbera, Jack

    2015-01-01

    The General, Organic, and Biological Chemistry Knowledge Assessment (GOB-CKA) is a multiple-choice instrument designed to assess students' understanding of the chemistry topics deemed important to clinical nursing practice. This manuscript describes the development process of the individual items along with a psychometric evaluation of the…

  1. Impact of STS (Context-Based Type of Teaching) in Comparison with a Textbook Approach on Attitudes and Achievement in Community College Chemistry Classrooms

    ERIC Educational Resources Information Center

    Perkins, Gita

    2011-01-01

    The purpose of this study was to analyze the impact of a context-based teaching approach (STS) versus a more traditional textbook approach on the attitudes and achievement of community college chemistry students. In studying attitudes toward chemistry within this study, I used a 30-item Likert scale in order to study the importance of chemistry in…

  2. Consumer Chemistry in the Classroom. Science from the Supermarket.

    ERIC Educational Resources Information Center

    Sumrall, William J.; Brown, Fred W.

    1991-01-01

    Activities that show students a practical use for chemistry using common items such as food products, pharmaceuticals, and household products as sources of chemical compounds are presented. The importance of having adequate resource materials available for students is emphasized. (KR)

  3. How Gender and College Chemistry Experience Influence Mental Rotation Ability.

    ERIC Educational Resources Information Center

    Brownlow, Sheila; Miderski, Carol Ann

    Deficits in spatial abilities, particularly Mental Rotation (MR), may contribute to women's avoidance of areas of study (such as chemistry) that rely on MR. Those women who do succeed in chemistry may do so because they have MR skills that are on par with their male peers. We examined MR ability on 12 items from the Vandenberg and Kuse MR test…

  4. A Spoonful of C[subscript 12]H[subscript 22]O[subscript 11] Makes the Chemistry Go Down: Candy Motivations in the High School Chemistry Classroom

    ERIC Educational Resources Information Center

    Ennever, Fanny K.

    2007-01-01

    A food motivation activity using a candy bar for high school chemistry classes is described. The use of everyday items like candy makes lab sessions interesting for students and may also help connect chemical concepts to their observable world and encourage them to ask questions.

  5. Bibliography of AV Materials.

    ERIC Educational Resources Information Center

    Journal of Chemical Education, 1981

    1981-01-01

    Presented is the second part of a bibliographic listing of commercially available audiovisual materials for chemistry. Information includes producer (with addresses), catalog number, format (slides, cassettes, filmstrips, films), and price for items in these categories: matter and energy, nuclear chemistry, periodic table, solids and crystals,…

  6. Survey Exploring Views of Scientists on Current Trends in Chemistry Education

    NASA Astrophysics Data System (ADS)

    Vamvakeros, Xenofon; Pavlatou, Evangelia A.; Spyrellis, Nicolas

    2010-02-01

    A survey exploring the views of scientists, chemists and chemical engineers, on current trends in Chemistry Education was conducted in Greece. Their opinions were investigated using a questionnaire focusing on curricula (the content and process of chemistry teaching and learning), as well as on the respondents’ general educational beliefs and their underlying epistemological views. The aim of this work was to investigate the respondents’ opinions and, if possible, to identify the areas where convergence or even consensus occurred. The results showed that some of the items on the research questionnaire produced a high degree of agreement with the respondents’ views, while a few others were exactly the opposite. These items are considered to be representative of more widespread views. In order to explore the diverging opinions, the items on the research questionnaire that showed great variance were analyzed to determine whether there were significant inter-item correlations among subgroups of participants with different demographic characteristics. Postgraduate studies, professional occupation, age/experience, and career within or outside the wide educational sector were among the main factors that significantly influenced the research results. The study did not reveal any single belief framework underlying the opinions of the respondents. Nevertheless, three specific approach frameworks (ACADEMIC, CONSTRUCTIVIST and SCIENTIFIC REALISM) were analyzed to determine which had the highest degree of agreement. It was found that the SCIENTIFIC REALISM framework and the curriculum emphasis characteristic of the context-based CTSE (Chemistry, Technology, Society and Environment) prevailed, as they produced a significantly higher mean score. The ACADEMIC framework followed with a moderate mean score and the CONSTRUCTIVIST framework had a lower mean score.

  7. Teacher-Student Interaction and Gifted Students' Attitudes toward Chemistry in Laboratory Classrooms in Singapore

    ERIC Educational Resources Information Center

    Lang, Quek Choon; Wong, Angela F. L.; Fraser, Barry J.

    2005-01-01

    This study investigated associations between teacher-student interaction and students' attitudes towards chemistry among 497 tenth grade students from three independent schools in Singapore. Analyses supported the reliability and validity of a 48-item version of the Questionnaire on Teacher Interaction (QTI). Statistically significant gender…

  8. Mapping Student Understanding in Chemistry: The Perspectives of Chemists

    ERIC Educational Resources Information Center

    Claesgens, Jennifer; Scalise, Kathleen; Wilson, Mark; Stacy, Angelica

    2009-01-01

    Preliminary pilot studies and a field study show how a generalizable conceptual framework calibrated with item response modeling can be used to describe the development of student conceptual understanding in chemistry. ChemQuery is an assessment system that uses a framework of the key ideas in the discipline, called the Perspectives of Chemists,…

  9. Photoelectron Spectroscopy in Advanced Placement Chemistry

    ERIC Educational Resources Information Center

    Benigna, James

    2014-01-01

    Photoelectron spectroscopy (PES) is a new addition to the Advanced Placement (AP) Chemistry curriculum. This article explains the rationale for its inclusion, an overview of how the PES instrument records data, how the data can be analyzed, and how to include PES data in the course. Sample assessment items and analysis are included, as well as…

  10. Identification and analysis of student conceptions used to solve chemical equilibrium problems

    NASA Astrophysics Data System (ADS)

    Voska, Kirk William

    This study identified and quantified chemistry conceptions students use when solving chemical equilibrium problems requiring the application of Le Chatelier's principle, and explored the feasibility of designing a paper and pencil test for this purpose. It also demonstrated the utility of conditional probabilities to assess test quality. A 10-item pencil-and-paper, two-tier diagnostic instrument, the Test to Identify Student Conceptualizations (TISC) was developed and administered to 95 second-semester university general chemistry students after they received regular course instruction concerning equilibrium in homogeneous aqueous, heterogeneous aqueous, and homogeneous gaseous systems. The content validity of TISC was established through a review of TISC by a panel of experts; construct validity was established through semi-structured interviews and conditional probabilities. Nine students were then selected from a stratified random sample for interviews to validate TISC. The probability that TISC correctly identified an answer given by a student in an interview was p = .64, while the probability that TISC correctly identified a reason given by a student in an interview was p = .49. Each TISC item contained two parts. In the first part the student selected the correct answer to a problem from a set of four choices. In the second part students wrote reasons for their answer to the first part. TISC questions were designed to identify students' conceptions concerning the application of Le Chatelier's principle, the constancy of the equilibrium constant, K, and the effect of a catalyst. Eleven prevalent incorrect conceptions were identified. This study found students consistently selected correct answers more frequently (53% of the time) than they provided correct reasons (33% of the time). The association between student answers and respective reasons on each TISC item was quantified using conditional probabilities calculated from logistic regression coefficients. The probability a student provided correct reasoning (B) when the student selected a correct answer (A) ranged from P(B|A) = .32 to P(B|A) = .82. However, the probability a student selected a correct answer when they provided correct reasoning ranged from P(A|B) = .96 to P(A|B) = 1. The K-R 20 reliability for TISC was found to be .79.
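
    To make the two statistics in this abstract concrete, the sketch below computes a K-R 20 reliability and a conditional probability such as P(B|A) from dichotomous scores. It is an illustration only: the abstract derives its conditional probabilities from logistic regression coefficients, whereas this sketch uses simple counts, and the function names and data layout are assumptions rather than anything from the study. The variance of total scores is taken as the population variance; some texts use the sample variance instead.

```python
def kr20(responses):
    """Kuder-Richardson 20 for a list of per-person lists of 0/1 item scores."""
    n_persons = len(responses)
    n_items = len(responses[0])
    totals = [sum(person) for person in responses]
    mean_total = sum(totals) / n_persons
    # Population variance of total scores (an assumed convention).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_persons
    # Sum over items of p * (1 - p), where p is the proportion correct.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(person[j] for person in responses) / n_persons
        sum_pq += p * (1.0 - p)
    return (n_items / (n_items - 1)) * (1.0 - sum_pq / var_total)


def conditional_probability(pairs):
    """P(second | first) from (first, second) 0/1 pairs, using counts."""
    second_given_first = [second for first, second in pairs if first]
    return sum(second_given_first) / len(second_given_first)
```

    For P(B|A), each pair would be (answer correct, reason correct); swapping the order of each pair gives P(A|B).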

  11. Design of Chemical Literacy Assessment by Using Model of Educational Reconstruction (MER) on Solubility Topic

    NASA Astrophysics Data System (ADS)

    Yusmaita, E.; Nasra, Edi

    2018-04-01

    This research aims to produce an instrument for assessing chemical literacy in basic chemistry courses on the topic of solubility. The construction of this measuring instrument is adapted to the characteristics of PISA (Programme for International Student Assessment) problems and to the Basic Chemistry syllabus in the KKNI (Indonesian National Qualification Framework). PISA is a cross-country study conducted periodically to monitor the outcomes of learners' achievement in each participating country. So far, studies conducted by PISA have covered reading literacy, mathematical literacy, and scientific literacy. With reference to the scientific competence defined in the PISA study on science literacy, an assessment was designed to measure the chemical literacy of students in the chemistry department at UNP. The research model used is MER (Model of Educational Reconstruction). The validity and reliability of the discourse questions were measured using the ANATES software. Based on these values, valid and reliable chemical literacy questions were obtained. There are seven limited-response question items on the topic of solubility in the valid category, the test reliability value is 0.86, and the items have a good difficulty index and discriminating power.

  12. Nuclear Forensics and Attribution: A National Laboratory Perspective

    NASA Astrophysics Data System (ADS)

    Hall, Howard L.

    2008-04-01

    Current capabilities in technical nuclear forensics - the extraction of information from nuclear and/or radiological materials to support the attribution of a nuclear incident to material sources, transit routes, and ultimately perpetrator identity - derive largely from three sources: nuclear weapons testing and surveillance programs of the Cold War, advances in analytical chemistry and materials characterization techniques, and abilities to perform "conventional" forensics (e.g., fingerprints) on radiologically contaminated items. Leveraging that scientific infrastructure has provided a baseline capability to the nation, but we are only beginning to explore the scientific challenges that stand between today's capabilities and tomorrow's requirements. These scientific challenges include radically rethinking radioanalytical chemistry approaches, developing rapidly deployable sampling and analysis systems for field applications, and improving analytical instrumentation. Coupled with the ability to measure a signature faster or more exquisitely, we must also develop the ability to interpret those signatures for meaning. This requires understanding of the physics and chemistry of nuclear materials processes well beyond our current level - especially since we are unlikely to ever have direct access to all potential sources of nuclear threat materials.

  13. University Chemistry Students' Learning Approaches and Willingness to Change Major

    ERIC Educational Resources Information Center

    Lastusaari, Mika; Murtonen, Mari

    2013-01-01

    A questionnaire with 22 Likert type items was developed to collect cross-sectional data from university chemistry students of different study years (N = 118). The aim was to obtain information on their learning approaches as well as their study preferences. Students willing to change from their major subject to medical education represented a…

  14. Picture Chem: Playing a Game to Identify Laboratory Equipment Items and Describe Their Use

    ERIC Educational Resources Information Center

    Kavak, Nusret; Yamak, Havva

    2016-01-01

    Laboratory activities are an important means of instruction in science; as such, they have been used in chemistry education since the 1880s. Many learning objectives can be achieved through the use of laboratory activities undertaken by chemistry students. In student-centered laboratory activities, students should know how to use an apparatus in…

  15. A Psychometric Evaluation of the Colorado Learning Attitudes about Science Survey for Use in Chemistry

    ERIC Educational Resources Information Center

    Heredia, Keily; Lewis, Jennifer E.

    2012-01-01

    The purpose of this paper is to evaluate the psychometric properties of The Colorado Learning Attitudes about Science Survey (CLASS). The 50-item instrument was administered to 311 college students from a public institution in the United States enrolled in General Chemistry I Laboratory. Confirmatory factor analysis and Cronbach's [alpha]…

  16. Adaptation of the Attitude toward the Subject of Chemistry Inventory (ASCI) into Turkish

    ERIC Educational Resources Information Center

    Sen, Senol; Yilmaz, Ayhan; Temel, Senar

    2016-01-01

    Developing an attitude influential in individuals' behaviours and related with academic achievement is a concept whose development science educators consider important. This research aims to adapt the 8-item Attitude toward the Subject of Chemistry Inventory (ASCI)--which was developed by Bauer (2008) and revised by Xu and Lewis (2011)--into…

  17. Simple Methods for Production of Nanoscale Metal Oxide Films from Household Sources

    ERIC Educational Resources Information Center

    Campbell, Dean J.; Baliss, Michelle S.; Hinman, Jordan J.; Ziegenhorn, John W.; Andrews, Mark J.; Stevenson, Keith J.

    2013-01-01

    Production of thin metal oxide films was recently explored as part of an outreach program with a goal of producing nanoscale structures with household items. Household items coated with various metals or titanium compounds can be heated to produce colorful films with nanoscale thicknesses. As part of a materials chemistry laboratory experiment…

  18. Exploring Different Types of Assessment Items to Measure Linguistically Diverse Students' Understanding of Energy and Matter in Chemistry

    ERIC Educational Resources Information Center

    Ryoo, Kihyun; Toutkoushian, Emily; Bedell, Kristin

    2018-01-01

    Energy and matter are fundamental, yet challenging concepts in middle school chemistry due to their abstract, unobservable nature. Although it is important for science teachers to elicit a range of students' ideas to design and revise their instruction, capturing such varied ideas using traditional assessments consisting of multiple-choice items…

  19. Equipment

    ERIC Educational Resources Information Center

    Education in Chemistry, 1976

    1976-01-01

    Described are a number of publications and items for use in chemistry. Included are brief descriptions of actuators, thermometer strips, current balances, and a variety of catalogues and other publications. (RH)

  20. Friendship chemistry: An examination of underlying factors.

    PubMed

    Campbell, Kelly; Holderness, Nicole; Riggs, Matt

    2015-06-01

    Interpersonal chemistry refers to a connection between two individuals that exists upon first meeting. The goal of the current study is to identify beliefs about the underlying components of friendship chemistry. Individuals respond to an online Friendship Chemistry Questionnaire containing items that are derived from interdependence theory and the friendship formation literature. Participants are randomly divided into two subsamples. A principal axis factor analysis with promax rotation is performed on subsample 1 and produces 5 factors: Reciprocal candor, mutual interest, personableness, similarity, and physical attraction. A confirmatory factor analysis is conducted using subsample 2 and provides support for the 5-factor model. Participants with agreeable, open, and conscientious personalities more commonly report experiencing friendship chemistry, as do those who are female, young, and European/white. Responses from participants who have never experienced chemistry are qualitatively analyzed. Limitations and directions for future research are discussed.

  1. Thai Grade 11 students' alternative conceptions for acid-base chemistry

    NASA Astrophysics Data System (ADS)

    Artdej, Romklao; Ratanaroutai, Thasaneeya; Coll, Richard Kevin; Thongpanchang, Tienthong

    2010-07-01

    This study involved the development of a two-tier diagnostic instrument to assess Thai high school students' understanding of acid-base chemistry. The acid-base diagnostic test (ABDT) comprising 18 items was administered to 55 Grade 11 students in a science and mathematics programme during the second semester of the 2008 academic year. Analysis of students' responses from this study followed the methodology outlined by Çalik and Ayas. The research findings suggest that the ABDT, the multiple choice diagnostic instrument, enables researchers and teachers to classify students' understanding at different levels. Most students exhibited alternative conceptions for several concepts: acid-base theory, dissociation of strong acids or bases, and dissociation of weak acids/bases. Interestingly, one of the concepts that students appeared to find most difficult, and for which they exhibited the most alternative conceptions, was acid-base theory. Some alternative conceptions revealed in this study differ from earlier reports, such as the concept of electrolyte and non-electrolyte solutions as well as the concentration changes of H3O+ and OH- in water. These research findings present valuable information for facilitating better understanding of acid-base chemistry by providing insight into the preventable and correctable alternative conceptions exhibited by students.

  2. Improving Measures via Examining the Behavior of Distractors in Multiple-Choice Tests

    PubMed Central

    Sideridis, Georgios; Tsaousis, Ioannis; Al Harbi, Khaleel

    2017-01-01

    The purpose of the present article was to illustrate, using an example from a national assessment, the value of analyzing the behavior of distractors in measures that engage the multiple-choice format. A secondary purpose of the present article was to illustrate four remedial actions that can potentially improve the measurement of the construct(s) under study. Participants were 2,248 individuals who took a national examination of chemistry. The behavior of the distractors was analyzed by modeling their behavior within the Rasch model. Potentially informative distractors were (a) further modeled using the partial credit model, (b) split into separate items and retested for model fit and parsimony, (c) combined to form a “super” item or testlet, and (d) reexamined after deleting low-ability individuals who likely guessed on those informative, albeit erroneous, distractors. Results indicated that all but the item split strategies were associated with better model fit compared with the original model. The best-fitting model, however, involved modeling and crediting informative distractors via the partial credit model or eliminating the responses of low-ability individuals who likely guessed on informative distractors. The implications, advantages, and disadvantages of modeling informative distractors for measurement purposes are discussed. PMID:29795904

  3. Bringing out the "Main Characters" in General Chemistry: Can Creating a Sense of Narrative in the Classroom and for the Textbook Aid Long-Term Memory?

    ERIC Educational Resources Information Center

    Chang, Junyoung; Churchill, David

    2011-01-01

    A new approach for teaching general chemistry is presented and discussed. Importantly, a storyline approach is provided in which the same chemical item or concept is reintroduced and embellished from chapter to chapter. The intention is to bring more connectivity between the various seemingly unrelated chapters. This might lead to a more…

  4. A three-year study of the impact of instructor attitude, enthusiasm, and teaching style on student learning in a medicinal chemistry course.

    PubMed

    Alsharif, Naser Z; Qi, Yongyue

    2014-09-15

    To determine the effect of instructor attitude, enthusiasm, and teaching style on learning for distance and campus pharmacy students. Over a 3-year period, distance and campus students enrolled in the spring semester of a medicinal chemistry course were asked to complete a survey instrument with questions related to instructor attitude, enthusiasm, and teaching style, as well as items to measure student intrinsic motivation and vitality. More positive responses were observed among distance students and older students. Gender did not impact student perspectives on 25 of the 26 survey questions. Student-related items were significantly correlated with instructor-related items. Also, student-related items and second-year cumulative grade point average were predictive of students' final course grades. Instructor enthusiasm demonstrated the highest correlation with student intrinsic motivation and vitality. While this study addresses the importance of content mastery and instructional methodologies, it focuses on issues related to instructor attitude, instructor enthusiasm, and teaching style, which all play a critical role in the learning process. Thus, instructors have a responsibility to evaluate, reevaluate, and analyze the above factors to address any related issues that impact the learning process, including their influence on professional students' intrinsic motivation and vitality, and ability to meet educational outcomes.

  5. Friendship chemistry: An examination of underlying factors

    PubMed Central

    Campbell, Kelly; Holderness, Nicole; Riggs, Matt

    2015-01-01

    Interpersonal chemistry refers to a connection between two individuals that exists upon first meeting. The goal of the current study is to identify beliefs about the underlying components of friendship chemistry. Individuals respond to an online Friendship Chemistry Questionnaire containing items that are derived from interdependence theory and the friendship formation literature. Participants are randomly divided into two subsamples. A principal axis factor analysis with promax rotation is performed on subsample 1 and produces 5 factors: Reciprocal candor, mutual interest, personableness, similarity, and physical attraction. A confirmatory factor analysis is conducted using subsample 2 and provides support for the 5-factor model. Participants with agreeable, open, and conscientious personalities more commonly report experiencing friendship chemistry, as do those who are female, young, and European/white. Responses from participants who have never experienced chemistry are qualitatively analyzed. Limitations and directions for future research are discussed. PMID:26097283

  6. CURRICULUM MATERIALS.

    ERIC Educational Resources Information Center

    New Jersey State Dept. of Education, Trenton.

    MATERIALS ARE LISTED BY 36 TOPICS ARRANGED IN ALPHABETICAL ORDER. TOPICS INCLUDE APPRENTICE TRAINING, BAKING, DRAFTING, ENGLISH, GLASSBLOWING, HOME ECONOMICS, INDUSTRIAL CHEMISTRY, MACHINE SHOP, NEEDLE TRADES, REFRIGERATION, AND UPHOLSTERY. PRICES ARE GIVEN FOR EACH ITEM. (EL)

  7. Students' confidence in the ability to transfer basic math skills in introductory physics and chemistry courses at a community college

    NASA Astrophysics Data System (ADS)

    Quinn, Reginald

    2013-01-01

    The purpose of this study was to examine the confidence levels that community college students have in transferring basic math skills to science classes, as well as any factors that influence their confidence levels. This study was conducted with 196 students at a community college in central Mississippi. The study was conducted during the month of November after all of the students had taken their midterm exams and received midterm grades. The instrument used in this survey was developed and validated by the researcher. The instrument asks the students to rate how confident they were in working out specific math problems and how confident they were in working problems using those specific math skills in physics and chemistry. The instrument also provided an example problem for every confidence item. Results revealed that students' demographics were significant predictors in confidence scores. Students in the 18-22 year old range were less confident in solving math problems than others. Students who had retaken a math course were less confident than those who had not. Chemistry students were less confident in solving math problems than those in physics courses. Chemistry II students were less confident than those in Chemistry I and Principles of Chemistry. Students were least confident in solving problems involving logarithms and the most confident in solving algebra problems. In general, students felt that their math courses did not prepare them for the math problems encountered in science courses. There was no significant difference in confidence between students who had completed their math homework online and those who had completed their homework on paper. The researcher recommends that chemistry educators find ways of incorporating more mathematics in their courses, especially logarithms and slope. Furthermore, math educators should incorporate more chemistry-related applications into math classes. Results of hypotheses testing, conclusions, discussions, and recommendations for future research are included.

  8. National Chemistry Week 2000: JCE Resources in Food Chemistry

    NASA Astrophysics Data System (ADS)

    Jacobsen, Erica K.

    2000-10-01

    November brings another National Chemistry Week, and this year's theme is food chemistry. I was asked to collect and evaluate JCE resources for use with this theme, a project that took me deep into past issues of JCE and yielded many treasures. Here we present the results of searches for food chemistry information and activities. While the selected articles are mainly at the high school and college levels, there are some excellent ones for the elementary school level and some that can be adapted for younger students. The focus of all articles is on the chemistry of food itself. Activities that only use food to demonstrate a principle other than food chemistry are not included. Articles that cover household products such as cleansers and pharmaceuticals are also not included. Each article has been characterized as a demonstration, experiment, calculation, activity, or informational item; several fit more than one classification. Also included are keywords and an evaluation as to which levels the article may serve.

  9. 16 CFR 1500.83 - Exemptions for small packages, minor hazards, and special circumstances.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ...) Chemistry sets and other science education sets intended primarily for use by juveniles, and replacement... misused and should be used only under adult supervision. IMPORTANT—Read cautions on individual items...

  10. 16 CFR 1500.83 - Exemptions for small packages, minor hazards, and special circumstances.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ...) Chemistry sets and other science education sets intended primarily for use by juveniles, and replacement... misused and should be used only under adult supervision. IMPORTANT—Read cautions on individual items...

  11. 16 CFR 1500.83 - Exemptions for small packages, minor hazards, and special circumstances.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ...) Chemistry sets and other science education sets intended primarily for use by juveniles, and replacement... misused and should be used only under adult supervision. IMPORTANT—Read cautions on individual items...

  12. Application of Ion Mobility Spectrometry (IMS) in forensic chemistry and toxicology with focus on biological matrices

    NASA Technical Reports Server (NTRS)

    Bernhard, Werner; Keller, Thomas; Regenscheit, Priska

    1995-01-01

    The IMS (Ion Mobility Spectrometry) instrument 'Ionscan' takes advantage of the fact that trace quantities of illicit drugs are adsorbed on dust particles on clothes, in cars and on other items of evidence. The dust particles are collected on a membrane filter by a special attachment on a vacuum cleaner. The sample is then directly inserted into the spectrometer and can be analyzed immediately. We show casework applications of a forensic chemistry and toxicology laboratory. One new application of IMS in forensic chemistry is the detection of psilocybin in dried mushrooms without any further sample preparation.

  13. A Three-Year Study of the Impact of Instructor Attitude, Enthusiasm, and Teaching Style on Student Learning in a Medicinal Chemistry Course

    PubMed Central

    Qi, Yongyue

    2014-01-01

    Objective. To determine the effect of instructor attitude, enthusiasm, and teaching style on learning for distance and campus pharmacy students. Methods. Over a 3-year period, distance and campus students enrolled in the spring semester of a medicinal chemistry course were asked to complete a survey instrument with questions related to instructor attitude, enthusiasm, and teaching style, as well as items to measure student intrinsic motivation and vitality. Results. More positive responses were observed among distance students and older students. Gender did not impact student perspectives on 25 of the 26 survey questions. Student-related items were significantly correlated with instructor-related items. Also, student-related items and second-year cumulative grade point average were predictive of students’ final course grades. Instructor enthusiasm demonstrated the highest correlation with student intrinsic motivation and vitality. Conclusion. While this study addresses the importance of content mastery and instructional methodologies, it focuses on issues related to instructor attitude, instructor enthusiasm, and teaching style, which all play a critical role in the learning process. Thus, instructors have a responsibility to evaluate, reevaluate, and analyze the above factors to address any related issues that impact the learning process, including their influence on professional students’ intrinsic motivation and vitality, and ability to meet educational outcomes. PMID:25258437

  14. The "wonderful properties of glass": Liebig's Kaliapparat and the practice of chemistry in glass.

    PubMed

    Jackson, Catherine M

    2015-03-01

    Everybody knows that glass is and always has been an important presence in chemical laboratories. Yet the very self-evidence of this notion tends to obscure a supremely important change in chemical practice during the early decades of the nineteenth century. This essay uses manuals of specifically chemical glassblowing published between about 1825 and 1835 to show that early nineteenth-century chemists began using glass in distinctly new ways and that their appropriation of glassblowing skill had profoundly important effects on the emerging discipline of chemistry. The new practice of chemistry in glass--exemplified in this essay by Justus Liebig's introduction of a new item of chemical glassware for organic analysis, the Kaliapparat--transformed not merely the material culture of chemistry but also its geography, its pedagogy, and, ultimately, its institutions. Moving chemistry into glass--a change so important that it warrants the term "glassware revolution"--had far-reaching consequences.

  15. A Head for Science: Using Cabbage to Teach Chemistry.

    ERIC Educational Resources Information Center

    Lifting, Inez Fugate

    1988-01-01

    Using ordinary household items--vinegar, ammonia, and cabbage juice--teachers can demonstrate properties of acids, bases, and neutrals. Students are encouraged to discuss results and hypothesize about experiments. A guide to the project is provided. (JL)

  16. 16 CFR § 1500.83 - Exemptions for small packages, minor hazards, and special circumstances.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ...) Chemistry sets and other science education sets intended primarily for use by juveniles, and replacement... misused and should be used only under adult supervision. IMPORTANT—Read cautions on individual items...

  17. Fish bioconcentration studies with column-generated analyte concentrations of highly hydrophobic organic chemicals.

    PubMed

    Schlechtriem, Christian; Böhm, Leonard; Bebon, Rebecca; Bruckert, Hans-Jörg; Düring, Rolf-Alexander

    2017-04-01

    The performance of aqueous exposure bioconcentration fish tests according to Organisation for Economic Co-operation and Development (OECD) guideline 305 requires the possibility of preparing stable aqueous concentrations of the test substances. For highly hydrophobic organic chemicals (HOCs; octanol-water partition coefficient [log K_OW] > 5), testing via aqueous exposure may become increasingly difficult. A solid-phase desorption dosing system was developed to generate stable concentrations of HOCs without using solubilizing agents. The system was tested with hexachlorobenzene (HCB), o-terphenyl (oTP), polychlorinated biphenyl (PCB) 153, and dibenz[a,h]anthracene (DBA) (log K_OW 5.5-7.8) in 2 flow-through fish tests with rainbow trout (Oncorhynchus mykiss). The analysis of the test media applied during the bioconcentration factor (BCF) studies showed that stable analyte concentrations of the 4 HOCs were maintained in the test system over an uptake period of 8 wk. Bioconcentration factors (L kg^-1 wet wt) were estimated for HCB (BCF 35 589), oTP (BCF 12 040), and PCB 153 (BCF 18 539) based on total water concentrations. No bioconcentration could be determined for DBA, probably because of the rapid metabolism of the test item. The solid-phase desorption dosing system is suitable to provide stable aqueous concentrations of HOCs required to determine the bioconcentration in fish and represents a viable alternative to the use of solubilizing agents for the preparation of test solutions. Environ Toxicol Chem 2017;36:906-916. © 2016 The Authors. Environmental Toxicology and Chemistry Published by Wiley Periodicals, Inc. on behalf of SETAC.

  18. Selecting Items for Criterion-Referenced Tests.

    ERIC Educational Resources Information Center

    Mellenbergh, Gideon J.; van der Linden, Wim J.

    1982-01-01

    Three item selection methods for criterion-referenced tests are examined: the classical theory of item difficulty and item-test correlation; the latent trait theory of item characteristic curves; and a decision-theoretic approach for optimal item selection. Item contribution to the standardized expected utility of mastery testing is discussed. (CM)

  19. Examining Chemistry Students Visual-Perceptual Skills Using the VSCS tool and Interview Data

    NASA Astrophysics Data System (ADS)

    Christian, Caroline

    The Visual-Spatial Chemistry Specific (VSCS) assessment tool was developed to test students' visual-perceptual skills, which are required to form a mental image of an object. The VSCS was designed around the theoretical framework of Rochford and Archer that provides eight distinct and well-defined visual-perceptual skills with identified problems students might have with each skill set. Factor analysis was used to analyze the results during the validation process of the VSCS. Results showed that the eight factors could not be separated from each other, but instead two factors emerged as significant to the data. These two factors have been defined and described as a general visual-perceptual skill (factor 1) and a skill that adds on a second level of complexity by involving multiple viewpoints such as changing frames of reference. The questions included in the factor analysis were bolstered by the addition of an item response theory (IRT) analysis. Interviews were also conducted with twenty novice students to test face validity of the tool, and to document student approaches at solving visualization problems of this type. Students used five main physical resources or processes to solve the questions, but the resource that was the most successful was handling or building a physical representation of an object.

  20. An Item Gains and Losses Analysis of False Memories Suggests Critical Items Receive More Item-Specific Processing than List Items

    ERIC Educational Resources Information Center

    Burns, Daniel J.; Martens, Nicholas J.; Bertoni, Alicia A.; Sweeney, Emily J.; Lividini, Michelle D.

    2006-01-01

    In a repeated testing paradigm, list items receiving item-specific processing are more likely to be recovered across successive tests (item gains), whereas items receiving relational processing are likely to be forgotten progressively less on successive tests. Moreover, analysis of cumulative-recall curves has shown that item-specific processing…

  1. Primary and Secondary School Science.

    ERIC Educational Resources Information Center

    Educational Documentation and Information, 1984

    1984-01-01

    This 344-item annotated bibliography presents overview of science teaching in following categories: science education; primary school science; integrated science teaching; teaching of biology, chemistry, physics, earth/space science; laboratory work; computer technology; out-of-school science; science and society; science education at…

  2. Light Barrier for Non-Foil Packaging

    DTIC Science & Technology

    2010-12-16

    packaging to reduce light-induced changes continues (Tung et al 2001). Lipid oxidation chemistry PRINTPACK Inc. Atlanta, Ga Item No. 0010 Final...complete protection against UV and visible light (i.e. opacity) is advisable. Photosensitizers particularly (e.g. flavonoids, riboflavin--especially for

  3. Unidimensional IRT Item Parameter Estimates across Equivalent Test Forms with Confounding Specifications within Dimensions

    ERIC Educational Resources Information Center

    Matlock, Ki Lynn; Turner, Ronna

    2016-01-01

    When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…

  4. Evolution of a Test Item

    ERIC Educational Resources Information Center

    Spaan, Mary

    2007-01-01

    This article follows the development of test items (see "Language Assessment Quarterly", Volume 3 Issue 1, pp. 71-79 for the article "Test and Item Specifications Development"), beginning with a review of test and item specifications, then proceeding to writing and editing of items, pretesting and analysis, and finally selection of an item for a…

  5. Readability Level of Standardized Test Items and Student Performance: The Forgotten Validity Variable

    ERIC Educational Resources Information Center

    Hewitt, Margaret A.; Homan, Susan P.

    2004-01-01

    Test validity issues considered by test developers and school districts rarely include individual item readability levels. In this study, items from a major standardized test were examined for individual item readability level and item difficulty. The Homan-Hewitt Readability Formula was applied to items across three grade levels. Results of…

  6. The Effect of the Position of an Item within a Test on the Item Difficulty Value.

    ERIC Educational Resources Information Center

    Rubin, Lois S.; Mott, David E. W.

    An investigation of the effect on the difficulty value of an item due to position placement within a test was made. Using a 60-item operational test comprised of 5 subtests, 60 items were placed as experimental items on a number of spiralled test forms in three different positions (first, middle, last) within the subtest composed of like items.…

  7. Relevance of Item Analysis in Standardizing an Achievement Test in Teaching of Physical Science in B.Ed Syllabus

    ERIC Educational Resources Information Center

    Marie, S. Maria Josephine Arokia; Edannur, Sreekala

    2015-01-01

    This paper focused on the analysis of test items constructed in the paper of teaching Physical Science for B.Ed. class. It involved the analysis of difficulty level and discrimination power of each test item. Item analysis allows selecting or omitting items from the test, but more importantly item analysis is a tool to help the item writer improve…

  8. Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

    ERIC Educational Resources Information Center

    Wang, Wei

    2013-01-01

    Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

  9. Test item linguistic complexity and assessments for deaf students.

    PubMed

    Cawthon, Stephanie

    2011-01-01

    Linguistic complexity of test items is one test format element that has been studied in the context of struggling readers and their participation in paper-and-pencil tests. The present article presents findings from an exploratory study on the potential relationship between linguistic complexity and test performance for deaf readers. A total of 64 students completed 52 multiple-choice items, 32 in mathematics and 20 in reading. These items were coded for linguistic complexity components of vocabulary, syntax, and discourse. Mathematics items had higher linguistic complexity ratings than reading items, but there were no significant relationships between item linguistic complexity scores and student performance on the test items. The discussion addresses issues related to the subject area, student proficiency levels in the test content, factors to look for in determining a "linguistic complexity effect," and areas for further research in test item development and deaf students.

  10. Evaluating the efficacy of a chemistry video game

    NASA Astrophysics Data System (ADS)

    Shapiro, Marina

    A quasi-experimental pre-test/post-test intervention study with a within-group analysis was conducted with 45 undergraduate chemistry students. It investigated the effect of implementing a game-based learning environment in an undergraduate chemistry course, to learn whether serious educational games (SEGs) can be used to achieve knowledge gains in complex chemistry concepts and to improve students' attitudes toward chemistry. To evaluate whether students learn chemistry concepts by participating in a chemistry game-based learning environment, a one-way repeated measures analysis of variance (ANOVA) was conducted across three time points (pre-test, post-test, and delayed post-test, all of which were chemistry content exams). Results showed that exam scores increased over time, and the ANOVA indicated a statistically significant time effect. To evaluate whether students' attitudes toward chemistry improved as a result of participating in the game-based learning environment, a paired-samples t-test was conducted using a chemistry attitudinal survey by Mahdi (2014) as the pre- and post-test. The t-test indicated no significant difference between pre- and post-attitudinal scores.

  11. The Selection of Test Items for Decision Making with a Computer Adaptive Test.

    ERIC Educational Resources Information Center

    Spray, Judith A.; Reckase, Mark D.

    The issue of test-item selection in support of decision making in adaptive testing is considered. The number of items needed to make a decision is compared for two approaches: selecting items from an item pool that are most informative at the decision point or selecting items that are most informative at the examinee's ability level. The first…

  12. Development and psychometric evaluation of an information literacy self-efficacy survey and an information literacy knowledge test.

    PubMed

    Tepe, Rodger; Tepe, Chabha

    2015-03-01

    To develop and psychometrically evaluate an information literacy (IL) self-efficacy survey and an IL knowledge test. In this test-retest reliability study, a 25-item IL self-efficacy survey and a 50-item IL knowledge test were developed and administered to a convenience sample of 53 chiropractic students. Item analyses were performed on all questions. The IL self-efficacy survey demonstrated good reliability (test-retest correlation = 0.81) and good/very good internal consistency (mean κ = .56 and Cronbach's α = .92). A total of 25 questions with the best item analysis characteristics were chosen from the 50-item IL knowledge test, resulting in a 25-item IL knowledge test that demonstrated good reliability (test-retest correlation = 0.87), very good internal consistency (mean κ = .69, KR20 = 0.85), and good item discrimination (mean point-biserial = 0.48). This study resulted in the development of three instruments: a 25-item IL self-efficacy survey, a 50-item IL knowledge test, and a 25-item IL knowledge test. The information literacy self-efficacy survey and the 25-item version of the information literacy knowledge test have shown preliminary evidence of adequate reliability and validity to justify continuing study with these instruments.
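
    The KR20 internal-consistency coefficient reported above can be illustrated with a minimal sketch. It assumes dichotomous (0/1) item scores and population variances; the function name `kr20` and the response matrix are hypothetical and are not taken from the study.

```python
def kr20(responses):
    """Kuder-Richardson 20 for a matrix of 0/1 item scores.

    responses: list of examinee rows, each a list of 0/1 scores (one per item).
    Population variances (divide by N) are used throughout; this is an
    assumption of the sketch, and some texts use the sample variance instead.
    """
    n_people = len(responses)
    n_items = len(responses[0])

    # Variance of the total test scores.
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people

    # Sum of item variances p * (1 - p), where p is the proportion correct.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)


# Hypothetical data: 5 examinees, 4 items.
example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(example), 3))
```

    For dichotomous items, Cronbach's alpha reduces to the same expression, which is why the two coefficients are often reported side by side.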

  13. A New Item Selection Procedure for Mixed Item Type in Computerized Classification Testing.

    ERIC Educational Resources Information Center

    Lau, C. Allen; Wang, Tianyou

    This paper proposes a new Information-Time index as the basis for item selection in computerized classification testing (CCT) and investigates how this new item selection algorithm can help improve test efficiency for item pools with mixed item types. It also investigates how practical constraints such as item exposure rate control, test…

  14. A Process for Reviewing and Evaluating Generated Test Items

    ERIC Educational Resources Information Center

    Gierl, Mark J.; Lai, Hollis

    2016-01-01

    Testing organizations need large numbers of high-quality items due to the proliferation of alternative test administration methods and modern test designs. But the current demand for items far exceeds the supply. Test items, as they are currently written, evoke a process that is both time-consuming and expensive because each item is written,…

  15. What's in a Topic? Exploring the Interaction between Test-Taker Age and Item Content in High-Stakes Testing

    ERIC Educational Resources Information Center

    Banerjee, Jayanti; Papageorgiou, Spiros

    2016-01-01

    The research reported in this article investigates differential item functioning (DIF) in a listening comprehension test. The study explores the relationship between test-taker age and the items' language domains across multiple test forms. The data comprise test-taker responses (N = 2,861) to a total of 133 unique items, 46 items of which were…

  16. Post-consumer use efficacies of preservatives in personal care and topical drug products: relationship to preservative category.

    PubMed

    Ravita, Timothy D; Tanner, Ralph S; Ahearn, Donald G; Arms, Erin L; Crockett, Patrick W

    2009-01-01

    Ninety-six used personal care and topical OTC drug items collected from consumers in the USA were examined for the presence of microbial contaminants. Of the eye and face product type containing global preservative chemistries (i.e., acceptable for use in Japan without major restrictions), 55% yielded numbers of microorganisms in excess of 500 CFU/g (P < 0.1814). For the mascara products with global preservative chemistries, 79% yielded numbers of microorganisms in excess of 500 CFU/g (P < 0.024). Products containing global preservative chemistries accounted for 88% (n = 14) of the products that had microbial contents above 10^4 CFU/g (P < 0.001). Prominent contaminants were species of Staphylococcus, Pseudomonas, Klebsiella, Streptococcus, Lactobacillus, Bacillus, Corynebacterium, and yeast. In general, under the stress of consumer use, products preserved with global preservative chemistries did not maintain as adequate preservation as products with non-global preservatives.

  17. Science News of the Year.

    ERIC Educational Resources Information Center

    Science News, 1987

    1987-01-01

    Provides a review of science news stories reported in "Science News" during 1987. References each item to the volume and page number in which the subject was addressed. Contains references on astronomy, behavior, biology, biomedicine, chemistry, earth sciences, environment, mathematics and computers, paleontology and anthropology, physics, science…

  18. Continuing Education in the Professions. Current Information Sources, No. 24.

    ERIC Educational Resources Information Center

    Syracuse Univ., NY. ERIC Clearinghouse on Adult Education.

    Beginning with bibliographies, surveys, and other general works, this 225-item annotated bibliography on professional continuing education covers the following areas: engineering and technical education; chemistry and clinical psychology; medicine and health (including psychiatry); inservice education and retraining for lawyers, law enforcement…

  19. Item validity vs. item discrimination index: a redundancy?

    NASA Astrophysics Data System (ADS)

    Panjaitan, R. L.; Irawati, R.; Sujana, A.; Hanifah, N.; Djuanda, D.

    2018-03-01

    In the literature on evaluation and test analysis, it is common to find calculations of both item validity and the item discrimination index (D), each with a different formula. Meanwhile, other sources state that the item discrimination index can be obtained by calculating the correlation between the testee's score on a particular item and the testee's score on the overall test, which is the same concept as item validity. Some research reports, especially undergraduate theses, tend to include both item validity and the item discrimination index in the instrument analysis. These concepts may overlap, since both reflect the quality of the test in measuring the examinees' ability. In this paper, examples of data-processing results for item validity and the item discrimination index are compared. The paper discusses whether the two can be represented by only one of them, or whether it is better to present both calculations for simple test analysis, especially in undergraduate theses that include test analyses.
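
    As an illustration of the two statistics being compared, the following minimal Python sketch computes an upper-lower discrimination index D and an item-total (point-biserial) correlation for one dichotomous item. The function names, the 27% group fraction, and the sample data are assumptions made for this example; they are not taken from the paper.

```python
from statistics import mean, pstdev


def discrimination_index(item, total, fraction=0.27):
    """Upper-lower D: proportion correct in the top group minus the bottom group.

    `fraction` (27% here) is a common textbook choice, assumed for illustration.
    """
    n = len(total)
    k = max(1, round(n * fraction))
    order = sorted(range(n), key=lambda i: total[i])  # ascending by total score
    low, high = order[:k], order[-k:]
    return mean(item[i] for i in high) - mean(item[i] for i in low)


def point_biserial(item, total):
    """Pearson correlation between a 0/1 item score and the total score.

    The total here still includes the item itself (uncorrected correlation).
    Undefined if either variable has zero variance.
    """
    mi, mt = mean(item), mean(total)
    cov = mean((x - mi) * (y - mt) for x, y in zip(item, total))
    return cov / (pstdev(item) * pstdev(total))


# Hypothetical data: ten examinees, one item, and their total test scores.
item_scores = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
total_scores = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

d_value = discrimination_index(item_scores, total_scores)
r_value = point_biserial(item_scores, total_scores)
print(d_value, round(r_value, 3))
```

    Both values are high for this toy item, which is the kind of agreement that raises the redundancy question; on real data the two indices can rank items somewhat differently.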

  20. A Comparison of Three Types of Test Development Procedures Using Classical and Latent Trait Methods.

    ERIC Educational Resources Information Center

    Benson, Jeri; Wilson, Michael

    Three methods of item selection were used to select sets of 38 items from a 50-item verbal analogies test and the resulting item sets were compared for internal consistency, standard errors of measurement, item difficulty, biserial item-test correlations, and relative efficiency. Three groups of 1,500 cases each were used for item selection. First…

  1. Examining Differential Item Functions of Different Item Ordered Test Forms According to Item Difficulty Levels

    ERIC Educational Resources Information Center

    Çokluk, Ömay; Gül, Emrah; Dogan-Gül, Çilem

    2016-01-01

    The study aims to examine whether differential item function is displayed in three different test forms that have item orders of random and sequential versions (easy-to-hard and hard-to-easy), based on Classical Test Theory (CTT) and Item Response Theory (IRT) methods and bearing item difficulty levels in mind. In the correlational research, the…

  2. The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory

    ERIC Educational Resources Information Center

    Sahin, Alper; Anil, Duygu

    2017-01-01

    This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of…

  3. [Perceptions on item disclosure for the Korean medical licensing examination].

    PubMed

    Yang, Eunbae B

    2015-09-01

    This study analyzed the perceptions of medical students and faculty regarding disclosure of test items on the Korean medical licensing examination. I conducted a survey of medical students from medical colleges and professional medical schools nationwide. Responses were analyzed from 718 participants as well as 69 faculty members who participated in creating the medical licensing examination item sets. Data were analyzed using descriptive statistics and the chi-square test. It is important to maintain test quality and to keep the test items unavailable to the public. There are also concerns among students that disclosure of test items would prompt increasing difficulty of test items (48.3%). Further, few students found it desirable to disclose test items regardless of any considerations (28.5%). The professors, who had experience in designing the test items, also expressed their opposition to test item disclosure (60.9%). It is desirable not to disclose the test items of the Korean medical licensing examination to the public on the condition that students are provided with a sufficient amount of information regarding the examination. This is so that the exam can appropriately identify candidates with the required qualifications.

  4. A Review of Classical Methods of Item Analysis.

    ERIC Educational Resources Information Center

    French, Christine L.

    Item analysis is a very important consideration in the test development process. It is a statistical procedure to analyze test items that combines methods used to evaluate the important characteristics of test items, such as difficulty, discrimination, and distractibility of the items in a test. This paper reviews some of the classical methods for…

  5. Modeling Item-Position Effects within an IRT Framework

    ERIC Educational Resources Information Center

    Debeer, Dries; Janssen, Rianne

    2013-01-01

    Changing the order of items between alternate test forms to prevent copying and to enhance test security is a common practice in achievement testing. However, these changes in item order may affect item and test characteristics. Several procedures have been proposed for studying these item-order effects. The present study explores the use of…

  6. Chemistry, College Level. Annotated Bibliography of Tests.

    ERIC Educational Resources Information Center

    Educational Testing Service, Princeton, NJ. Test Collection.

    Most of the 30 tests cited in this bibliography are those of the American Chemical Society. Subjects covered include physical chemistry, organic chemistry, inorganic chemistry, analytical chemistry, and other specialized areas. The tests are designed only for advanced high school students and for college students at the bachelor's and graduate degree levels. This…

  7. Science Library of Test Items. Volume Nineteen. A Collection of Multiple Choice Test Items Relating Mainly to Geology.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  8. Science Library of Test Items. Volume Seventeen. A Collection of Multiple Choice Test Items Relating Mainly to Biology.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  9. Delicious Chemicals.

    ERIC Educational Resources Information Center

    Barry, Dana M.

    This paper presents an approach to chemistry and nutrition that focuses on food items that people consider delicious. Information is organized according to three categories of food chemicals that provide energy to the human body: (1) fats and oils; (2) carbohydrates; and (3) proteins. Minerals, vitamins, and additives are also discussed along with…

  10. More Mudpies to Magnets: Science for Young Children.

    ERIC Educational Resources Information Center

    Sherwood, Elizabeth A.; Williams, Robert A.; Rockwell, Robert E.

    This book presents science activities designed for young children. The activities are divided into the following content areas: chemistry, physics, earth explorations, weather watchers, flight and space, plants, animal adventures, and mathworks. Each activity features sections of language with science, required items, procedures, and…

  11. Assembling a Computerized Adaptive Testing Item Pool as a Set of Linear Tests

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Ariel, Adelaide; Veldkamp, Bernard P.

    2006-01-01

    Test-item writing efforts typically result in item pools with an undesirable correlational structure between the content attributes of the items and their statistical information. If such pools are used in computerized adaptive testing (CAT), the algorithm may be forced to select items with less than optimal information, that violate the content…

  12. Evaluation of Northwest University, Kano Post-UTME Test Items Using Item Response Theory

    ERIC Educational Resources Information Center

    Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi

    2016-01-01

    High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items were…

  13. Item Specifications, Science Grade 8. Blue Prints for Testing Minimum Performance Test.

    ERIC Educational Resources Information Center

    Arkansas State Dept. of Education, Little Rock.

    These item specifications were developed as a part of the Arkansas "Minimum Performance Testing Program" (MPT). There is one item specification for each instructional objective included in the MPT. The purpose of an item specification is to provide an overview of the general content and format of test items used to measure an…

  14. Item Specifications, Science Grade 6. Blue Prints for Testing Minimum Performance Test.

    ERIC Educational Resources Information Center

    Arkansas State Dept. of Education, Little Rock.

    These item specifications were developed as a part of the Arkansas "Minimum Performance Testing Program" (MPT). There is one item specification for each instructional objective included in the MPT. The purpose of an item specification is to provide an overview of the general content and format of test items used to measure an…

  15. Criterion-Referenced Test Items for Welding.

    ERIC Educational Resources Information Center

    Davis, Diane, Ed.

    This test item bank on welding contains test questions based upon competencies found in the Missouri Welding Competency Profile. Some test items are keyed for multiple competencies. These criterion-referenced test items are designed to work with the Vocational Instructional Management System. Questions have been statistically sampled and validated…

  16. Decomposing the interaction between retention interval and study/test practice: The role of retrievability

    PubMed Central

    Jang, Yoonhee; Wixted, John T.; Pecher, Diane; Zeelenberg, René; Huber, David E.

    2012-01-01

    Even without feedback, test practice enhances delayed performance compared to study practice, but the size of the effect is variable across studies. We investigated the benefit of testing, separating initially retrievable items from initially non-retrievable items. In two experiments, an initial test determined item retrievability. Retrievable or non-retrievable items were subsequently presented for repeated study or test practice. Collapsing across items, in Experiment 1, we obtained the typical crossover interaction between retention interval and practice type. For retrievable items, however, the crossover interaction was quantitatively different, with a small study benefit for an immediate test and a larger testing benefit after a delay. For non-retrievable items, there was a large study benefit for an immediate test, but one week later there was no difference between the study and test practice conditions. In Experiment 2, initially non-retrievable items were given additional study followed by either an immediate test or even more additional study, and one week later performance did not differ between the two conditions. These results indicate that the effect size of study/test practice is due to the relative contribution of retrievable and non-retrievable items. PMID:22304454

  17. Decomposing the interaction between retention interval and study/test practice: the role of retrievability.

    PubMed

    Jang, Yoonhee; Wixted, John T; Pecher, Diane; Zeelenberg, René; Huber, David E

    2012-01-01

    Even without feedback, test practice enhances delayed performance compared to study practice, but the size of the effect is variable across studies. We investigated the benefit of testing, separating initially retrievable items from initially nonretrievable items. In two experiments, an initial test determined item retrievability. Retrievable or nonretrievable items were subsequently presented for repeated study or test practice. Collapsing across items, in Experiment 1, we obtained the typical cross-over interaction between retention interval and practice type. For retrievable items, however, the cross-over interaction was quantitatively different, with a small study benefit for an immediate test and a larger testing benefit after a delay. For nonretrievable items, there was a large study benefit for an immediate test, but one week later there was no difference between the study and test practice conditions. In Experiment 2, initially nonretrievable items were given additional study followed by either an immediate test or even more additional study, and one week later performance did not differ between the two conditions. These results indicate that the effect size of study/test practice is due to the relative contribution of retrievable and nonretrievable items.

  18. Optimal Test Design with Rule-Based Item Generation

    ERIC Educational Resources Information Center

    Geerlings, Hanneke; van der Linden, Wim J.; Glas, Cees A. W.

    2013-01-01

    Optimal test-design methods are applied to rule-based item generation. Three different cases of automated test design are presented: (a) test assembly from a pool of pregenerated, calibrated items; (b) test generation on the fly from a pool of calibrated item families; and (c) test generation on the fly directly from calibrated features defining…

  19. Science Library of Test Items. Volume Twenty. A Collection of Multiple Choice Test Items Relating Mainly to Physics, 1.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  20. Science Library of Test Items. Volume Twenty-One. A Collection of Multiple Choice Test Items Relating Mainly to Physics, 2.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  1. Science Library of Test Items. Volume Twenty-Two. A Collection of Multiple Choice Test Items Relating Mainly to Skills.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…

  2. Criterion-Referenced Test Items for Small Engines.

    ERIC Educational Resources Information Center

    Herd, Amon

    This notebook contains criterion-referenced test items for testing students' knowledge of small engines. The test items are based upon competencies found in the Missouri Small Engine Competency Profile. The test item bank is organized in 18 sections that cover the following duties: shop procedures; tools and equipment; fasteners; servicing fuel…

  3. An Investigation of the Impact of Guessing on Coefficient α and Reliability

    PubMed Central

    2014-01-01

    Guessing is known to influence the test reliability of multiple-choice tests. Although there are many studies that have examined the impact of guessing, they used rather restrictive assumptions (e.g., parallel test assumptions, homogeneous inter-item correlations, homogeneous item difficulty, and homogeneous guessing levels across items) to evaluate the relation between guessing and test reliability. Based on the item response theory (IRT) framework, this study investigated the extent of the impact of guessing on reliability under more realistic conditions where item difficulty, item discrimination, and guessing levels actually vary across items with three different test lengths (TL). By accommodating multiple item characteristics simultaneously, this study also focused on examining interaction effects between guessing and other variables entered in the simulation to be more realistic. The simulation of the more realistic conditions and calculations of reliability and classical test theory (CTT) item statistics were facilitated by expressing CTT item statistics, coefficient α, and reliability in terms of IRT model parameters. In addition to the general negative impact of guessing on reliability, results showed interaction effects between TL and guessing and between guessing and test difficulty.

  4. Evaluating the Psychometric Characteristics of Generated Multiple-Choice Test Items

    ERIC Educational Resources Information Center

    Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André

    2016-01-01

    Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…

  5. Development and psychometric evaluation of an information literacy self-efficacy survey and an information literacy knowledge test

    PubMed Central

    Tepe, Rodger; Tepe, Chabha

    2015-01-01

    Objective: To develop and psychometrically evaluate an information literacy (IL) self-efficacy survey and an IL knowledge test. Methods: In this test–retest reliability study, a 25-item IL self-efficacy survey and a 50-item IL knowledge test were developed and administered to a convenience sample of 53 chiropractic students. Item analyses were performed on all questions. Results: The IL self-efficacy survey demonstrated good reliability (test–retest correlation = 0.81) and good/very good internal consistency (mean κ = .56 and Cronbach's α = .92). A total of 25 questions with the best item analysis characteristics were chosen from the 50-item IL knowledge test, resulting in a 25-item IL knowledge test that demonstrated good reliability (test–retest correlation = 0.87), very good internal consistency (mean κ = .69, KR20 = 0.85), and good item discrimination (mean point-biserial = 0.48). Conclusions: This study resulted in the development of three instruments: a 25-item IL self-efficacy survey, a 50-item IL knowledge test, and a 25-item IL knowledge test. The information literacy self-efficacy survey and the 25-item version of the information literacy knowledge test have shown preliminary evidence of adequate reliability and validity to justify continuing study with these instruments. PMID:25517736
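
    The KR20 coefficient reported in this abstract is the Kuder-Richardson formula 20, an internal-consistency estimate for items scored 0/1. The sketch below shows one way it might be computed; the function name and the toy response matrix are hypothetical, the study's data are not reproduced here, and population variance is assumed (some texts use the sample variance instead).

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a matrix of 0/1 scores (rows = examinees, columns = items).

    KR20 = k / (k - 1) * (1 - sum(p_j * q_j) / var(total)),
    where p_j is the proportion correct on item j and q_j = 1 - p_j.
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)  # population variance (assumption)
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1.0 - p)
    return k / (k - 1) * (1.0 - sum_pq / var_total)


# Hypothetical responses: five examinees, four items.
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(toy_responses), 3))
```

    For dichotomous items, Cronbach's α computed with the same variance convention gives the same value as KR-20, which is why the two are often reported for different instrument types rather than side by side.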

  6. Integrating Test-Form Formatting into Automated Test Assembly

    ERIC Educational Resources Information Center

    Diao, Qi; van der Linden, Wim J.

    2013-01-01

    Automated test assembly uses the methodology of mixed integer programming to select an optimal set of items from an item bank. Automated test-form generation uses the same methodology to optimally order the items and format the test form. From an optimization point of view, production of fully formatted test forms directly from the item pool using…

  7. Instructional Topics in Educational Measurement (ITEMS) Module: Using Automated Processes to Generate Test Items

    ERIC Educational Resources Information Center

    Gierl, Mark J.; Lai, Hollis

    2013-01-01

    Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…

  8. A Procedure To Detect Test Bias Present Simultaneously in Several Items.

    ERIC Educational Resources Information Center

    Shealy, Robin; Stout, William

    A statistical procedure is presented that is designed to test for unidirectional test bias existing simultaneously in several items of an ability test, based on the assumption that test bias is incipient within the two groups' ability differences. The proposed procedure--Simultaneous Item Bias (SIB)--is based on a multidimensional item response…

  9. An Item Response Theory Model for Test Bias.

    ERIC Educational Resources Information Center

    Shealy, Robin; Stout, William

    This paper presents a conceptualization of test bias for standardized ability tests which is based on multidimensional, non-parametric, item response theory. An explanation of how individually-biased items can combine through a test score to produce test bias is provided. It is contended that bias, although expressed at the item level, should be…

  10. Radiological and Environmental Research Division annual report, October 1978-September 1979. Part I. Fundamental molecular physics and chemistry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    1979-01-01

    Research on the chemical physics of atoms and molecules, especially their interaction with external agents such as photons and electrons is reported. Abstracts of seven individual items from the report were prepared separately for the data base. (GHT)

  11. Content Analysis of Chemistry Curricula in Germany Case Study: Chemical Reactions

    ERIC Educational Resources Information Center

    Timofte, Roxana S.

    2015-01-01

    Curriculum-assessment alignment is a well known foundation for good practice in educational assessment, for items' curricular validity purposes. Nowadays instruments are designed to measure pupils' competencies in one or more areas of competence. Sub-competence areas could be defined theoretically and statistical analysis of empirical data by…

  12. Bibliography on Smoking and Health.

    ERIC Educational Resources Information Center

    Public Health Service (DHEW), Rockville, MD. National Clearinghouse for Smoking and Health.

    This Bibliography includes all of the items added to the Technical Information Center of the National Clearinghouse for Smoking and Health from January through December 1971. The publication is broken down into eleven major categories. These are: (1) chemistry, pharmacology and toxicology; (2) mortality and morbidity; (3) neoplastic diseases; (4)…

  13. Using Reliability and Item Analysis to Evaluate a Teacher-Developed Test in Educational Measurement and Evaluation

    ERIC Educational Resources Information Center

    Quaigrain, Kennedy; Arhin, Ato Kwamina

    2017-01-01

    Item analysis is essential in improving items which will be used again in later tests; it can also be used to eliminate misleading items in a test. The study focused on item and test quality and explored the relationship between difficulty index (p-value) and discrimination index (DI) with distractor efficiency (DE). The study was conducted among…

  14. Audio Adapted Assessment Data: Does the Addition of Audio to Written Items Modify the Item Calibration?

    ERIC Educational Resources Information Center

    Snyder, James

    2010-01-01

    This dissertation research examined the changes in item RIT calibration that occurred when adding audio to a set of currently calibrated RIT items and then placing these new items as field test items in the modified assessments on the NWEA MAP test platform. The researcher used test results from over 600 students in the Poway School District in…

  15. Student science achievement and the integration of Indigenous knowledge on standardized tests

    NASA Astrophysics Data System (ADS)

    Dupuis, Juliann; Abrams, Eleanor

    2017-09-01

    In this article, we examine how American Indian students in Montana performed on standardized state science assessments when a small number of test items based upon traditional science knowledge from a cultural curriculum, "Indian Education for All", were included. Montana is the first state in the US to mandate the use of a culturally relevant curriculum in all schools and to incorporate this curriculum into a portion of the standardized assessment items. This study compares White and American Indian student test scores on these particular test items to determine how White and American Indian students perform on culturally relevant test items compared to traditional standard science test items. The connections between student achievement on adapted culturally relevant science test items versus traditional items bring valuable insights to the fields of science education, research on student assessments, and Indigenous studies.

  16. Computerized Adaptive Test (CAT) Applications and Item Response Theory Models for Polytomous Items

    ERIC Educational Resources Information Center

    Aybek, Eren Can; Demirtasli, R. Nukhet

    2017-01-01

    This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. Besides that, it aims to introduce the simulation and live CAT software to the related researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…

  17. An Effect Size Measure for Raju's Differential Functioning for Items and Tests

    ERIC Educational Resources Information Center

    Wright, Keith D.; Oshima, T. C.

    2015-01-01

    This study established an effect size measure for differential functioning for items and tests' noncompensatory differential item functioning (NCDIF). The Mantel-Haenszel parameter served as the benchmark for developing NCDIF's effect size measure for reporting moderate and large differential item functioning in test items. The effect size of…

  18. Detecting a Gender-Related DIF Using Logistic Regression and Transformed Item Difficulty

    ERIC Educational Resources Information Center

    Abedlaziz, Nabeel; Ismail, Wail; Hussin, Zaharah

    2011-01-01

    Test items are designed to provide information about the examinees. Difficult items are designed to be more demanding and easy items are less so. However, sometimes, test items carry with them demands other than those intended by the test developer (Scheuneman & Gerritz, 1990). When personal attributes such as gender systematically affect…

  19. Influence of Fallible Item Parameters on Test Information During Adaptive Testing.

    ERIC Educational Resources Information Center

    Wetzel, C. Douglas; McBride, James R.

    Computer simulation was used to assess the effects of item parameter estimation errors on different item selection strategies used in adaptive and conventional testing. To determine whether these effects reduced the advantages of certain optimal item selection strategies, simulations were repeated in the presence and absence of item parameter…

  20. A Guide to Item Banking in Education. (Third Edition).

    ERIC Educational Resources Information Center

    Naccarato, Richard W.

    The current status of banks of test items existing across the United States was determined through a survey conducted between September and December 1987. Item "bank" in this context does not imply that the test items are available in computerized form, but simply that "deposited" test items can be withdrawn for use. Emphasis…

  1. Development and validation of an energy-balance knowledge test for fourth- and fifth-grade students.

    PubMed

    Chen, Senlin; Zhu, Xihe; Kang, Minsoo

    2017-05-01

    A valid test measuring children's energy-balance (EB) knowledge is lacking in research. This study developed and validated the energy-balance knowledge test (EBKT) for fourth and fifth grade students. The original EBKT contained 25 items but was reduced to 23 items based on pilot results and intensive expert panel discussion. De-identified data were collected from 468 fourth and fifth grade students enrolled in four schools to examine the psychometric properties of the EBKT items. The Rasch model analysis was conducted using the Winsteps 3.65.0 software. Differential item functioning (DIF) analysis flagged 1 item (item #4) functioning differently between boys and girls, which was deleted. The final 22-item EBKT showed desirable model-data fit indices. The items had large variability ranging from -3.58 logit (item #10, the easiest) to 1.70 logit (item #3, the hardest). The average person ability on the test was 0.28 logit (SD = .78). Additional analyses supported known-group difference validity of the EBKT scores in capturing gender- and grade-based ability differences. The test was overall valid but could be further improved by expanding test items to discern various ability levels. For lack of a better test, researchers and practitioners may use the EBKT to assess fourth- and fifth-grade students' EB knowledge.
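
    The logit values in this abstract can be read through the dichotomous Rasch model, in which the probability of a correct answer depends only on the difference between person ability and item difficulty. The short sketch below applies that formula to the reported mean ability and the easiest and hardest item difficulties; the function and variable names are hypothetical, and the calculation is an illustration rather than a reproduction of the study's analysis.

```python
import math


def rasch_probability(theta, b):
    """Dichotomous Rasch model: P(correct) = 1 / (1 + exp(-(theta - b))).

    theta is person ability and b is item difficulty, both in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


mean_ability = 0.28      # average person ability reported in the abstract
easiest_item = -3.58     # item #10
hardest_item = 1.70      # item #3

p_easiest = rasch_probability(mean_ability, easiest_item)
p_hardest = rasch_probability(mean_ability, hardest_item)
print(round(p_easiest, 2), round(p_hardest, 2))
```

    Under this model, a student of average ability would be expected to answer the easiest item correctly about 98% of the time and the hardest item about 19% of the time.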

  2. Analysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique

    NASA Astrophysics Data System (ADS)

    Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan

    2016-12-01

    This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discrimination, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test's distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.
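
    For readers unfamiliar with the three-parameter logistic (3PL) model named in this abstract, the sketch below evaluates its item response function. The function name and the parameter values are illustrative assumptions, not estimates from the study, and the form shown omits the optional scaling constant (often 1.7) that some software applies to the discrimination parameter.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """3PL item response function.

    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    a: discrimination, b: difficulty, c: pseudo-guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: moderate discrimination, average difficulty,
# and a guessing parameter of 0.2 (as for a five-option item).
a_example, b_example, c_example = 1.2, 0.0, 0.2

for theta in (-3.0, 0.0, 3.0):
    print(theta, round(p_correct_3pl(theta, a_example, b_example, c_example), 3))
```

    At an ability equal to the item difficulty, the probability is halfway between the guessing floor and 1, and for very low abilities it approaches the guessing parameter rather than zero.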

  3. Modeling Local Item Dependence in Cloze and Reading Comprehension Test Items Using Testlet Response Theory

    ERIC Educational Resources Information Center

    Baghaei, Purya; Ravand, Hamdollah

    2016-01-01

    In this study the magnitudes of local dependence generated by cloze test items and reading comprehension items were compared and their impact on parameter estimates and test precision was investigated. An advanced English as a foreign language reading comprehension test containing three reading passages and a cloze test was analyzed with a…

  4. Machine Shop. Criterion-Referenced Test (CRT) Item Bank.

    ERIC Educational Resources Information Center

    Davis, Diane, Ed.

    This drafting criterion-referenced test item bank is keyed to the machine shop competency profile developed by industry and education professionals in Missouri. The 16 references used for drafting the test items are listed. Test items are arranged under these categories: orientation to machine shop; performing mathematical calculations; performing…

  5. Rescuing Computerized Testing by Breaking Zipf's Law.

    ERIC Educational Resources Information Center

    Wainer, Howard

    2000-01-01

    Suggests that because of the nonlinear relationship between item usage and item security, the problems of test security posed by continuous administration of standardized tests cannot be resolved merely by increasing the size of the item pool. Offers alternative strategies to overcome these problems, distributing test items so as to avoid the…

  6. Chemistry for whom? Gender awareness in teaching and learning chemistry

    NASA Astrophysics Data System (ADS)

    Andersson, Kristina

    2017-06-01

    Marie Ståhl and Anita Hussénius have defined what discourses dominate national tests in chemistry for Grade 9 in Sweden by using feminist, critical didactic perspectives. This response seeks to expand the results in Ståhl and Hussénius's article "Chemistry inside an epistemological community box! Discursive exclusions and inclusions in the Swedish national tests in chemistry" by using different facets of gender awareness. The first facet, gender awareness in relation to the test designers' own conceptions, highlighted how the gender order in which women are subordinated to men becomes visible in the national tests as a consequence of the test designers' internalized conceptions. The second facet, gender awareness in relation to chemistry, discussed the hierarchy between discourses within chemistry. The third facet, gender awareness in relation to students, problematized chemistry in relation to the students' identity formation. In summary, I suggest that the different discourses can open up new ways to interpret chemistry and perhaps dismantle the hegemonic chemistry discourse.

  7. An Evaluation of "Intentional" Weighting of Extended-Response or Constructed-Response Items in Tests with Mixed Item Types.

    ERIC Educational Resources Information Center

    Ito, Kyoko; Sykes, Robert C.

    This study investigated the practice of weighting a type of test item, such as constructed response, more than other types of items, such as selected response, to compute student scores for a mixed-item type of test. The study used data from statewide writing field tests in grades 3, 5, and 8 and considered two contexts, that in which a single…

  8. Do the Guideline Violations Influence Test Difficulty of High-Stake Test?: An Investigation on University Entrance Examination in Turkey

    ERIC Educational Resources Information Center

    Atalmis, Erkan Hasan

    2016-01-01

    Multiple-choice (MC) items are commonly used in high-stake tests. Thus, each item of such tests should be meticulously constructed to increase the accuracy of decisions based on test results. Haladyna and his colleagues (2002) addressed the valid item-writing guidelines to construct high quality MC items in order to increase test reliability and…

  9. 42 CFR 493.929 - Chemistry.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 42 Public Health 5 2010-10-01 2010-10-01 false Chemistry. 493.929 Section 493.929 Public Health... Proficiency Testing Programs by Specialty and Subspecialty § 493.929 Chemistry. The subspecialties under the specialty of chemistry for which a proficiency testing program may offer proficiency testing are routine...

  10. 42 CFR 493.929 - Chemistry.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 42 Public Health 5 2012-10-01 2012-10-01 false Chemistry. 493.929 Section 493.929 Public Health... Proficiency Testing Programs by Specialty and Subspecialty § 493.929 Chemistry. The subspecialties under the specialty of chemistry for which a proficiency testing program may offer proficiency testing are routine...

  11. 42 CFR 493.929 - Chemistry.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 42 Public Health 5 2014-10-01 2014-10-01 false Chemistry. 493.929 Section 493.929 Public Health... Proficiency Testing Programs by Specialty and Subspecialty § 493.929 Chemistry. The subspecialties under the specialty of chemistry for which a proficiency testing program may offer proficiency testing are routine...

  12. 42 CFR 493.929 - Chemistry.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 42 Public Health 5 2013-10-01 2013-10-01 false Chemistry. 493.929 Section 493.929 Public Health... Proficiency Testing Programs by Specialty and Subspecialty § 493.929 Chemistry. The subspecialties under the specialty of chemistry for which a proficiency testing program may offer proficiency testing are routine...

  13. 42 CFR 493.929 - Chemistry.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 42 Public Health 5 2011-10-01 2011-10-01 false Chemistry. 493.929 Section 493.929 Public Health... Proficiency Testing Programs by Specialty and Subspecialty § 493.929 Chemistry. The subspecialties under the specialty of chemistry for which a proficiency testing program may offer proficiency testing are routine...

  14. Item difficulty and item validity for the Children's Group Embedded Figures Test.

    PubMed

    Rusch, R R; Trigg, C L; Brogan, R; Petriquin, S

    1994-02-01

    The validity and reliability of the Children's Group Embedded Figures Test was reported for students in Grade 2 by Cromack and Stone in 1980; however, a search of the literature indicates no evidence for internal consistency or item analysis. Hence the purpose of this study was to examine the item difficulty and item validity of the test with children in Grades 1 and 2. Confusion in the literature over development and use of this test was seemingly resolved through analysis of these descriptions and through an interview with the test developer. One early-appearing item was unreasonably difficult. Two or three other items were quite difficult and made little contribution to the total score. Caution is recommended, however, in any reordering or elimination of items based on these findings, given the limited number of subjects (n = 84).

  15. Weapon Performance Testing and Analysis: The MODI-PAC Round, the Number 4 Lead-Shot Round, and the Flying Baton

    DTIC Science & Technology

    1976-01-01

    items. The items tested were the MODI-PAC, a proprietary item of Remington Arms Company, a standard 12-gauge round of No. 4 lead shot, and an...to refrain from testing this item. Therefore, the final selection of items for testing were (1) the MODI-PAC, (2) a standard 12-gauge shotgun round of...The first item evaluated was the MODI-PAC. The MODI-PAC, which stands for "modified impact," is a 12-gauge shotgun shell loaded with approximately 320

  16. Interactions Between Item Content And Group Membership on Achievement Test Items.

    ERIC Educational Resources Information Center

    Linn, Robert L.; Harnisch, Delwyn L.

    The purpose of this investigation was to examine the interaction of item content and group membership on achievement test items. Estimates of the parameters of the three parameter logistic model were obtained on the 46 item math test for the sample of eighth grade students (N = 2055) participating in the Illinois Inventory of Educational Progress,…

  17. Effects of Item Exposure for Conventional Examinations in a Continuous Testing Environment.

    ERIC Educational Resources Information Center

    Hertz, Norman R.; Chinn, Roberta N.

    This study explored the effect of item exposure on two conventional examinations administered as computer-based tests. A principal hypothesis was that item exposure would have little or no effect on average difficulty of the items over the course of an administrative cycle. This hypothesis was tested by exploring conventional item statistics and…

  18. 42 CFR 493.839 - Condition: Chemistry.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 42 Public Health 5 2010-10-01 2010-10-01 false Condition: Chemistry. 493.839 Section 493.839... These Tests § 493.839 Condition: Chemistry. The specialty of chemistry includes for the purposes of proficiency testing the subspecialties of routine chemistry, endocrinology, and toxicology. ...

  19. 42 CFR 493.839 - Condition: Chemistry.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 42 Public Health 5 2014-10-01 2014-10-01 false Condition: Chemistry. 493.839 Section 493.839... These Tests § 493.839 Condition: Chemistry. The specialty of chemistry includes for the purposes of proficiency testing the subspecialties of routine chemistry, endocrinology, and toxicology. ...

  20. 42 CFR 493.839 - Condition: Chemistry.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 42 Public Health 5 2013-10-01 2013-10-01 false Condition: Chemistry. 493.839 Section 493.839... These Tests § 493.839 Condition: Chemistry. The specialty of chemistry includes for the purposes of proficiency testing the subspecialties of routine chemistry, endocrinology, and toxicology. ...

  1. 42 CFR 493.839 - Condition: Chemistry.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 42 Public Health 5 2011-10-01 2011-10-01 false Condition: Chemistry. 493.839 Section 493.839... These Tests § 493.839 Condition: Chemistry. The specialty of chemistry includes for the purposes of proficiency testing the subspecialties of routine chemistry, endocrinology, and toxicology. ...

  2. 42 CFR 493.839 - Condition: Chemistry.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 42 Public Health 5 2012-10-01 2012-10-01 false Condition: Chemistry. 493.839 Section 493.839... These Tests § 493.839 Condition: Chemistry. The specialty of chemistry includes for the purposes of proficiency testing the subspecialties of routine chemistry, endocrinology, and toxicology. ...

  3. Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies: The PRISMA-DTA Statement.

    PubMed

    McInnes, Matthew D F; Moher, David; Thombs, Brett D; McGrath, Trevor A; Bossuyt, Patrick M; Clifford, Tammy; Cohen, Jérémie F; Deeks, Jonathan J; Gatsonis, Constantine; Hooft, Lotty; Hunt, Harriet A; Hyde, Christopher J; Korevaar, Daniël A; Leeflang, Mariska M G; Macaskill, Petra; Reitsma, Johannes B; Rodin, Rachel; Rutjes, Anne W S; Salameh, Jean-Paul; Stevens, Adrienne; Takwoingi, Yemisi; Tonelli, Marcello; Weeks, Laura; Whiting, Penny; Willis, Brian H

    2018-01-23

    Systematic reviews of diagnostic test accuracy synthesize data from primary diagnostic studies that have evaluated the accuracy of 1 or more index tests against a reference standard, provide estimates of test performance, allow comparisons of the accuracy of different tests, and facilitate the identification of sources of variability in test accuracy. To develop the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) diagnostic test accuracy guideline as a stand-alone extension of the PRISMA statement. Modifications to the PRISMA statement reflect the specific requirements for reporting of systematic reviews and meta-analyses of diagnostic test accuracy studies and the abstracts for these reviews. Established standards from the Enhancing the Quality and Transparency of Health Research (EQUATOR) Network were followed for the development of the guideline. The original PRISMA statement was used as a framework on which to modify and add items. A group of 24 multidisciplinary experts used a systematic review of articles on existing reporting guidelines and methods, a 3-round Delphi process, a consensus meeting, pilot testing, and iterative refinement to develop the PRISMA diagnostic test accuracy guideline. The final version of the PRISMA diagnostic test accuracy guideline checklist was approved by the group. The systematic review (produced 64 items) and the Delphi process (provided feedback on 7 proposed items; 1 item was later split into 2 items) identified 71 potentially relevant items for consideration. The Delphi process reduced these to 60 items that were discussed at the consensus meeting. Following the meeting, pilot testing and iterative feedback were used to generate the 27-item PRISMA diagnostic test accuracy checklist. To reflect specific or optimal contemporary systematic review methods for diagnostic test accuracy, 8 of the 27 original PRISMA items were left unchanged, 17 were modified, 2 were added, and 2 were omitted. 
    The 27-item PRISMA diagnostic test accuracy checklist provides specific guidance for reporting of systematic reviews. The PRISMA diagnostic test accuracy guideline can facilitate the transparent reporting of reviews, and may assist in the evaluation of validity and applicability, enhance replicability of reviews, and make the results from systematic reviews of diagnostic test accuracy studies more useful.

  4. An Efficiency Balanced Information Criterion for Item Selection in Computerized Adaptive Testing

    ERIC Educational Resources Information Center

    Han, Kyung T.

    2012-01-01

    Successful administration of computerized adaptive testing (CAT) programs in educational settings requires that test security and item exposure control issues be taken seriously. Developing an item selection algorithm that strikes the right balance between test precision and level of item pool utilization is the key to successful implementation…

  5. Using Automatic Item Generation to Meet the Increasing Item Demands of High-Stakes Educational and Occupational Assessment

    ERIC Educational Resources Information Center

    Arendasy, Martin E.; Sommer, Markus

    2012-01-01

    The use of new test administration technologies such as computerized adaptive testing in high-stakes educational and occupational assessments demands large item pools. Classic item construction processes and previous approaches to automatic item generation faced the problems of a considerable loss of items after the item calibration phase. In this…

  6. Item Purification Does Not Always Improve DIF Detection: A Counterexample with Angoff's Delta Plot

    ERIC Educational Resources Information Center

    Magis, David; Facon, Bruno

    2013-01-01

    Item purification is an iterative process that is often advocated as improving the identification of items affected by differential item functioning (DIF). With test-score-based DIF detection methods, item purification iteratively removes the items currently flagged as DIF from the test scores to get purified sets of items, unaffected by DIF. The…

  7. [Difference analysis among majors in medical parasitology exam papers by test item bank proposition].

    PubMed

    Jia, Lin-Zhi; Ya-Jun, Ma; Cao, Yi; Qian, Fen; Li, Xiang-Yu

    2012-04-30

    The quality indices of the "Medical Parasitology" exam papers and the measured data for students in three majors at the university in 2010 were compared and analyzed. The exam papers were assembled from the test item bank. The alpha reliability coefficients of the three exam papers were above 0.70. The knowledge structure and capacity structure of the exam papers were basically balanced. However, the alpha reliability coefficient for the second major was the lowest, mainly due to the quality of the test items in the exam paper and the failure to revise the indices of the test item bank in time. This observation demonstrated that revising the test items and their indices in the item bank according to the measured data can improve the quality of test item bank proposition and reduce the difference among exam papers.
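
    The "alpha reliability coefficient" above is presumably Cronbach's alpha; the abstract does not say how it was computed. A minimal sketch of the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), is given below. The function name and the small score matrix are made up for illustration and are not data from the study.

```python
import statistics


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: list of examinee rows, each a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    item_vars = [statistics.variance([row[j] for row in scores]) for j in range(k)]
    total_var = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 examinees x 3 dichotomously scored items.
example = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
]
print(round(cronbach_alpha(example), 2))
```

    A coefficient above 0.70, the threshold mentioned in the abstract, is a common rule of thumb for acceptable internal consistency.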

  8. The Role of Item Models in Automatic Item Generation

    ERIC Educational Resources Information Center

    Gierl, Mark J.; Lai, Hollis

    2012-01-01

    Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…

  9. Item Review and the Rearrangement Procedure: Its Process and Its Results

    ERIC Educational Resources Information Center

    Papanastasiou, Elena C.

    2005-01-01

    Permitting item review is to the benefit of the examinees who typically increase their test scores with item review. However, testing companies do not prefer item review since it does not follow the logic on which adaptive tests are based, and since it is prone to cheating strategies. Consequently, item review is not permitted in many adaptive…

  10. A Model-Based Method for Content Validation of Automatically Generated Test Items

    ERIC Educational Resources Information Center

    Zhang, Xinxin; Gierl, Mark

    2016-01-01

    The purpose of this study is to describe a methodology to recover the item model used to generate multiple-choice test items with a novel graph theory approach. Beginning with the generated test items and working backward to recover the original item model provides a model-based method for validating the content used to automatically generate test…

  11. Optimal Bayesian Adaptive Design for Test-Item Calibration.

    PubMed

    van der Linden, Wim J; Ren, Hao

    2015-06-01

    An optimal adaptive design for test-item calibration based on Bayesian optimality criteria is presented. The design adapts the choice of field-test items to the examinees taking an operational adaptive test using both the information in the posterior distributions of their ability parameters and the current posterior distributions of the field-test parameters. Different criteria of optimality based on the two types of posterior distributions are possible. The design can be implemented using an MCMC scheme with alternating stages of sampling from the posterior distributions of the test takers' ability parameters and the parameters of the field-test items while reusing samples from earlier posterior distributions of the other parameters. Results from a simulation study demonstrated the feasibility of the proposed MCMC implementation for operational item calibration. A comparison of performances for different optimality criteria showed faster calibration of substantial numbers of items for the criterion of D-optimality relative to A-optimality, a special case of c-optimality, and random assignment of items to the test takers.

  12. State Assessment Program Item Banks: Model Language for Request for Proposals (RFP) and Contracts

    ERIC Educational Resources Information Center

    Swanson, Leonard C.

    2010-01-01

    This document provides recommendations for request for proposal (RFP) and contract language that state education agencies can use to specify their requirements for access to test item banks. An item bank is a repository for test items and data about those items. Item banks are used by state agency staff to view items and associated data; to…

  13. A Historical Investigation into Item Formats of ACS Exams and Their Relationships to Science Practices

    ERIC Educational Resources Information Center

    Brandriet, Alexandra; Reed, Jessica J.; Holme, Thomas

    2015-01-01

    The release of the "NRC Framework for K-12 Science Education" and the "Next Generation Science Standards" has important implications for classroom teaching and assessment. Of particular interest is the implementation of science practices in the chemistry classroom, and the definitions established by the NRC make these…

  14. NMR Spectra through the Eyes of a Student: Eye Tracking Applied to NMR Items

    ERIC Educational Resources Information Center

    Topczewski, Joseph J.; Topczewski, Anna M.; Tang, Hui; Kendhammer, Lisa K.; Pienta, Norbert J.

    2017-01-01

    Nuclear magnetic resonance spectroscopy (NMR) plays a key role in introductory organic chemistry, spanning theory, concepts, and experimentation. Therefore, it is imperative that the instruction methods for NMR are both efficient and effective. By utilizing eye tracking equipment, the researchers were able to monitor how second-semester organic…

  15. Core Ideas and Topics: Building Up or Drilling Down?

    ERIC Educational Resources Information Center

    Cooper, Melanie M.; Posey, Lynmarie A.; Underwood, Sonia M.

    2017-01-01

    In this paper we discuss how and why core ideas can serve as the framework upon which chemistry curricula and assessment items are developed. While there are a number of projects that have specified "big ideas" or "anchoring concepts", the ways that these ideas are subsequently developed may inadvertently lead to fragmentation…

  16. Rapid Production of a Porous Cellulose Acetate Membrane for Water Filtration Using Readily Available Chemicals

    ERIC Educational Resources Information Center

    Kaiser, Adrian; Stark, Wendelin J.; Grass, Robert N.

    2017-01-01

    A chemistry laboratory experiment using everyday items and readily available chemicals is described to introduce advanced high school students and undergraduate college students to porous polymer membranes. In a three-step manufacturing process, a membrane is produced at room temperature. The filtration principle of the membrane is then…

  17. The Impact of Receiving the Same Items on Consecutive Computer Adaptive Test Administrations.

    ERIC Educational Resources Information Center

    O'Neill, Thomas; Lunz, Mary E.; Thiede, Keith

    2000-01-01

    Studied item exposure in a computerized adaptive test when the item selection algorithm presents examinees with questions they were asked in a previous test administration. Results with 178 repeat examinees on a medical technologists' test indicate that the combined use of an adaptive algorithm to select items and latent trait theory to estimate…

  18. Helping Poor Readers Demonstrate Their Science Competence: Item Characteristics Supporting Text-Picture Integration

    ERIC Educational Resources Information Center

    Saß, Steffani; Schütte, Kerstin

    2016-01-01

    Solving test items might require abilities in test-takers other than the construct the test was designed to assess. Item and student characteristics such as item format or reading comprehension can impact the test result. This experiment is based on cognitive theories of text and picture comprehension. It examines whether integration aids, which…

  19. Uncertainties in the Item Parameter Estimates and Robust Automated Test Assembly

    ERIC Educational Resources Information Center

    Veldkamp, Bernard P.; Matteucci, Mariagiulia; de Jong, Martijn G.

    2013-01-01

    Item response theory parameters have to be estimated, and because of the estimation process, they do have uncertainty in them. In most large-scale testing programs, the parameters are stored in item banks, and automated test assembly algorithms are applied to assemble operational test forms. These algorithms treat item parameters as fixed values,…

  20. Identifying Differential Item Functioning in Multi-Stage Computer Adaptive Testing

    ERIC Educational Resources Information Center

    Gierl, Mark J.; Lai, Hollis; Li, Johnson

    2013-01-01

    The purpose of this study is to evaluate the performance of CATSIB (Computer Adaptive Testing-Simultaneous Item Bias Test) for detecting differential item functioning (DIF) when items in the matching and studied subtest are administered adaptively in the context of a realistic multi-stage adaptive test (MST). MST was simulated using a 4-item…

  1. A Stepwise Test Characteristic Curve Method to Detect Item Parameter Drift

    ERIC Educational Resources Information Center

    Guo, Rui; Zheng, Yi; Chang, Hua-Hua

    2015-01-01

    An important assumption of item response theory is item parameter invariance. Sometimes, however, item parameters are not invariant across different test administrations due to factors other than sampling error; this phenomenon is termed item parameter drift. Several methods have been developed to detect drifted items. However, most of the…

  2. The promise and challenge of including multimedia items in medical licensure examinations: some insights from an empirical trial.

    PubMed

    Shen, Linjun; Li, Feiming; Wattleworth, Roberta; Filipetto, Frank

    2010-10-01

    The Comprehensive Osteopathic Medical Licensing Examination conducted a trial of multimedia items in the 2008-2009 Level 3 testing cycle to determine (1) if multimedia items were able to test additional elements of medical knowledge and skills and (2) how to develop effective multimedia items. Forty-four content-matched multimedia and text multiple-choice items were randomly delivered to Level 3 candidates. Logistic regression and paired-samples t tests were used for pairwise and group-level comparisons, respectively. Nine pairs showed significant differences in either difficulty or/and discrimination. Content analysis found that, if text narrations were less direct, multimedia materials could make items easier. When textbook terminologies were replaced by multimedia presentations, multimedia items could become more difficult. Moreover, a multimedia item was found not uniformly difficult for candidates at different ability levels, possibly because multimedia and text items tested different elements of a same concept. Multimedia items may be capable of measuring some constructs different from what text items can measure. Effective multimedia items with reasonable psychometric properties can be intentionally developed.

  3. Varying levels of difficulty index of skills-test items randomly selected by examinees on the Korean emergency medical technician licensing examination.

    PubMed

    Koh, Bongyeun; Hong, Sunggi; Kim, Soon-Sim; Hyun, Jin-Sook; Baek, Milye; Moon, Jundong; Kwon, Hayran; Kim, Gyoungyong; Min, Seonggi; Kang, Gu-Hyun

    2016-01-01

    The goal of this study was to characterize the difficulty index of the items in the skills test components of the class I and II Korean emergency medical technician licensing examination (KEMTLE), which requires examinees to select items randomly. The results of 1,309 class I KEMTLE examinations and 1,801 class II KEMTLE examinations in 2013 were subjected to analysis. Items from the basic and advanced skills test sections of the KEMTLE were compared to determine whether some were significantly more difficult than others. In the class I KEMTLE, all 4 of the items on the basic skills test showed significant variation in difficulty index (P<0.01), as well as 4 of the 5 items on the advanced skills test (P<0.05). In the class II KEMTLE, 4 of the 5 items on the basic skills test showed significantly different difficulty index (P<0.01), as well as all 3 of the advanced skills test items (P<0.01). In the skills test components of the class I and II KEMTLE, the procedure in which examinees randomly select questions should be revised to require examinees to respond to a set of fixed items in order to improve the reliability of the national licensing examination.

  4. Lunar Science Conference, 4th, Houston, Tex., March 5-8, 1973, Proceedings. Volume 1 - Mineralogy and petrology. Volume 2 - Chemical and isotope analyses. Organic chemistry. Volume 3 - Physical properties

    NASA Technical Reports Server (NTRS)

    Gose, W. A.

    1973-01-01

    The mineralogy, petrology, chemistry, isotopic composition, and physical properties of lunar materials are described in papers detailing methods, results, and implications of research on samples returned from eight lunar landing sites: Apollo 11, 12, 14, 15, 16, 17, and Luna 16 and 20. The results of experiments conducted or set up on the lunar surface by the astronauts are also described along with observations taken from Command Modules and subsatellites. Major topics include general geology, soil and breccia studies, petrologic studies, mineralogic analyses, elemental compositions, radiometric age determinations, rare gas chemistry, radionuclides, organogenic compounds, particle track records, thermal properties, seismic studies, resonance studies, orbital mapping, lunar atmosphere, magnetic studies, electrical studies, optical properties, and microcratering. Individual items are announced in this issue.

  5. Item Analysis in Introductory Economics Testing.

    ERIC Educational Resources Information Center

    Tinari, Frank D.

    1979-01-01

    Computerized analysis of multiple choice test items is explained. Examples of item analysis applications in the introductory economics course are discussed with respect to three objectives: to evaluate learning; to improve test items; and to help improve classroom instruction. Problems, costs and benefits of the procedures are identified. (JMD)

  6. Differential Item Functioning (DIF) among Spanish-Speaking English Language Learners (ELLs) in State Science Tests

    NASA Astrophysics Data System (ADS)

    Ilich, Maria O.

    Psychometricians and test developers evaluate standardized tests for potential bias against groups of test-takers by using differential item functioning (DIF). English language learners (ELLs) are a diverse group of students whose native language is not English. While they are still learning the English language, they must take their standardized tests for their school subjects, including science, in English. In this study, linguistic complexity was examined as a possible source of DIF that may result in test scores that confound science knowledge with a lack of English proficiency among ELLs. Two years of fifth-grade state science tests were analyzed for evidence of DIF using two DIF methods, Simultaneous Item Bias Test (SIBTest) and logistic regression. The tests presented a unique challenge in that the test items were grouped together into testlets (groups of items referring to a scientific scenario to measure knowledge of different science content or skills). Very large samples of 10,256 students in 2006 and 13,571 students in 2007 were examined. Half of each sample was composed of Spanish-speaking ELLs; the balance was composed of native English speakers. The two DIF methods were in agreement about the items that favored non-ELLs and the items that favored ELLs. Logistic regression effect sizes were all negligible, while SIBTest flagged items with low to high DIF. A decrease in socioeconomic status and Spanish-speaking ELL diversity may have led to inconsistent SIBTest effect sizes for items used in both testing years. The DIF results for the testlets suggested that ELLs lacked sufficient opportunity to learn science content. The DIF results further suggest that those constructed response test items requiring the student to draw a conclusion about a scientific investigation or to plan a new investigation tended to favor ELLs.

  7. Examining the Impact of Drifted Polytomous Anchor Items on Test Characteristic Curve (TCC) Linking and IRT True Score Equating. Research Report. ETS RR-12-09

    ERIC Educational Resources Information Center

    Li, Yanmei

    2012-01-01

    In a common-item (anchor) equating design, the common items should be evaluated for item parameter drift. Drifted items are often removed. For a test that contains mostly dichotomous items and only a small number of polytomous items, removing some drifted polytomous anchor items may result in anchor sets that no longer resemble mini-versions of…

  8. Which Statistic Should Be Used to Detect Item Preknowledge When the Set of Compromised Items Is Known?

    PubMed

    Sinharay, Sandip

    2017-09-01

    Benefiting from item preknowledge is a major type of fraudulent behavior during educational assessments. Belov suggested the posterior shift statistic for detection of item preknowledge and showed its performance to be better on average than that of seven other statistics for detection of item preknowledge for a known set of compromised items. Sinharay suggested a statistic based on the likelihood ratio test for detection of item preknowledge; the advantage of the statistic is that its null distribution is known. Results from simulated and real data and adaptive and nonadaptive tests are used to demonstrate that the Type I error rate and power of the statistic based on the likelihood ratio test are very similar to those of the posterior shift statistic. Thus, the statistic based on the likelihood ratio test appears promising in detecting item preknowledge when the set of compromised items is known.

  9. A Bayesian Method for the Detection of Item Preknowledge in CAT. Law School Admission Council Computerized Testing Report. LSAC Research Report Series.

    ERIC Educational Resources Information Center

    McLeod, Lori D.; Lewis, Charles; Thissen, David.

    With the increased use of computerized adaptive testing, which allows for continuous testing, new concerns about test security have evolved, one being the assurance that items in an item pool are safeguarded from theft. In this paper, the risk of score inflation and procedures to detect test takers using item preknowledge are explored. When test…

  10. Arguments, contradictions, resistances, and conceptual change in students' understanding of atomic structure

    NASA Astrophysics Data System (ADS)

    Niaz, Mansoor; Aguilera, Damarys; Maza, Arelys; Liendo, Gustavo

    2002-07-01

    Most general chemistry courses and textbooks emphasize experimental details and lack a history and philosophy of science perspective. The objective of this study is to facilitate freshman general chemistry students' understanding of atomic structure based on the work of Thomson, Rutherford, and Bohr. It is hypothesized that classroom discussions based on arguments/counterarguments of the heuristic principles, on which these scientists based their atomic models, can facilitate students' conceptual understanding. This study is based on 160 freshman students enrolled in six sections of General Chemistry I (three sections formed part of the experimental group). All three models (Thomson, Rutherford, and Bohr) were presented to the experimental and control group students in the traditional manner, as found in most textbooks. After this, the three sections of the experimental group participated in the discussion of six items with alternative responses. Students were first asked to select a response and then participate in classroom discussions leading to arguments in favor or against the selected response and finally select a new response. Three weeks after having discussed the six items, both the experimental and control groups presented a monthly exam (based on the three models) and after another 3 weeks a semester exam. Results obtained show that given the opportunity to argue and discuss, students' understanding can go beyond the simple regurgitation of experimental details. Performance of the experimental group showed contradictions, resistances, and progressive conceptual change with considerable and consistent improvement in the last item. 
    It is concluded that if we want our students to understand scientific progress and practice, then it is important that we include the experimental details not as a rhetoric of conclusions (Schwab, 1962, The teaching of science as enquiry, Cambridge, MA, Harvard University Press; Schwab, 1974, Conflicting conceptions of curriculum, Berkeley, CA, McCutchan) but as heuristic principles (Lakatos, 1970, Criticism and the growth of knowledge, Cambridge, UK, Cambridge University Press, pp. 91-195), which were based on arguments, controversies, and interpretations of the scientists.

  11. Effect of Multiple Testing Adjustment in Differential Item Functioning Detection

    ERIC Educational Resources Information Center

    Kim, Jihye; Oshima, T. C.

    2013-01-01

    In a typical differential item functioning (DIF) analysis, a significance test is conducted for each item. As a test consists of multiple items, such multiple testing may increase the possibility of making a Type I error at least once. The goal of this study was to investigate how to control a Type I error rate and power using adjustment…
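
    The inflation described in this abstract can be illustrated with a short numeric sketch. It assumes the per-item significance tests are independent, which is an assumption made here for illustration and is not stated in the record. The function names are hypothetical, and the Bonferroni correction is shown only as one familiar adjustment, not necessarily one the study evaluated.

```python
def familywise_error_rate(alpha: float, n_items: int) -> float:
    """Probability of at least one Type I error across n_items independent tests,
    each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_items


def bonferroni_alpha(alpha: float, n_items: int) -> float:
    """Per-item significance level under a Bonferroni correction (illustrative only)."""
    return alpha / n_items


if __name__ == "__main__":
    # The familywise rate grows quickly with the number of items tested.
    for n in (1, 10, 40):
        print(n, round(familywise_error_rate(0.05, n), 3))
```

    With 10 items each tested at 0.05, the chance of at least one false flag under this assumption is already about 0.40, which is why some form of adjustment is usually considered.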

  12. Item Response Theory Models for Performance Decline during Testing

    ERIC Educational Resources Information Center

    Jin, Kuan-Yu; Wang, Wen-Chung

    2014-01-01

    Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…

  13. Differential item functioning analysis of the Vanderbilt Expertise Test for cars.

    PubMed

    Lee, Woo-Yeol; Cho, Sun-Joo; McGugin, Rankin W; Van Gulick, Ana Beth; Gauthier, Isabel

    2015-01-01

    The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge.
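
    For readers unfamiliar with the three-parameter logistic model named in this abstract, a minimal sketch of its item response function follows. The parameter names a (discrimination), b (difficulty), and c (lower asymptote, often read as guessing) follow common convention; the function name is hypothetical, the optional 1.7 scaling constant is omitted, and no values from the VETcar study are implied.

```python
import math


def three_pl_probability(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the 3PL model:
    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # When ability equals difficulty, the probability is halfway between c and 1.
    print(three_pl_probability(theta=0.0, a=1.0, b=0.0, c=0.2))
```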

  14. Samejima Items in Multiple-Choice Tests: Identification and Implications

    ERIC Educational Resources Information Center

    Rahman, Nazia

    2013-01-01

    Samejima hypothesized that non-monotonically increasing item response functions (IRFs) of ability might occur for multiple-choice items (referred to here as "Samejima items") if low ability test takers with some, though incomplete, knowledge or skill are drawn to a particularly attractive distractor, while very low ability test takers…

  15. Computerized Numerical Control Test Item Bank.

    ERIC Educational Resources Information Center

    Reneau, Fred; And Others

    This guide contains 285 test items for use in teaching a course in computerized numerical control. All test items were reviewed, revised, and validated by incumbent workers and subject matter instructors. Items are provided for assessing student achievement in such aspects of programming and planning, setting up, and operating machines with…

  16. Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating

    ERIC Educational Resources Information Center

    He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei

    2013-01-01

    Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…

  17. Robust Scale Transformation Methods in IRT True Score Equating under Common-Item Nonequivalent Groups Design

    ERIC Educational Resources Information Center

    He, Yong

    2013-01-01

    Common test items play an important role in equating multiple test forms under the common-item nonequivalent groups design. Inconsistent item parameter estimates among common items can lead to large bias in equated scores for IRT true score equating. Current methods extensively focus on detection and elimination of outlying common items, which…

  18. Using Differential Item Functioning Procedures to Explore Sources of Item Difficulty and Group Performance Characteristics.

    ERIC Educational Resources Information Center

    Scheuneman, Janice Dowd; Gerritz, Kalle

    1990-01-01

    Differential item functioning (DIF) methodology for revealing sources of item difficulty and performance characteristics of different groups was explored. A total of 150 Scholastic Aptitude Test items and 132 Graduate Record Examination general test items were analyzed. DIF was evaluated for males and females and Blacks and Whites. (SLD)

  19. Item Structural Properties as Predictors of Item Difficulty and Item Association.

    ERIC Educational Resources Information Center

    Solano-Flores, Guillermo

    1993-01-01

    Studied the ability of logical test design (LTD) to predict student performance in reading Roman numerals for 211 sixth graders in Mexico City tested on Roman numeral items varying on LTD-related and non-LTD-related variables. The LTD-related variable item iterativity was found to be the best predictor of item difficulty. (SLD)

  20. Investigating Item Exposure Control Methods in Computerized Adaptive Testing

    ERIC Educational Resources Information Center

    Ozturk, Nagihan Boztunc; Dogan, Nuri

    2015-01-01

    This study aims to investigate the effects of item exposure control methods on measurement precision and on test security under various item selection methods and item pool characteristics. In this study, the Randomesque (with item group sizes of 5 and 10), Sympson-Hetter, and Fade-Away methods were used as item exposure control methods. Moreover,…

  1. Detecting Differential Item Discrimination (DID) and the Consequences of Ignoring DID in Multilevel Item Response Models

    ERIC Educational Resources Information Center

    Lee, Woo-yeol; Cho, Sun-Joo

    2017-01-01

    Cross-level invariance in a multilevel item response model can be investigated by testing whether the within-level item discriminations are equal to the between-level item discriminations. Testing the cross-level invariance assumption is important to understand constructs in multilevel data. However, in most multilevel item response model…

  2. Item Pool Design for an Operational Variable-Length Computerized Adaptive Test

    ERIC Educational Resources Information Center

    He, Wei; Reckase, Mark D.

    2014-01-01

    For computerized adaptive tests (CATs) to work well, they must have an item pool with sufficient numbers of good quality items. Many researchers have pointed out that, in developing item pools for CATs, not only is the item pool size important but also the distribution of item parameters and practical considerations such as content distribution…

  3. Analyzing Item Generation with Natural Language Processing Tools for the "TOEIC"® Listening Test. Research Report. ETS RR-17-52

    ERIC Educational Resources Information Center

    Yoon, Su-Youn; Lee, Chong Min; Houghton, Patrick; Lopez, Melissa; Sakano, Jennifer; Loukina, Anastasia; Krovetz, Bob; Lu, Chi; Madani, Nitin

    2017-01-01

    In this study, we developed assistive tools and resources to support TOEIC® Listening test item generation. There has recently been an increased need for a large pool of items for these tests. This need has, in turn, inspired efforts to increase the efficiency of item generation while maintaining the quality of the created items. We aimed to…

  4. An Analysis of Factors Affecting the Difficulty of Dialogue Items in TOEFL Listening Comprehension. TOEFL Research Reports, 51.

    ERIC Educational Resources Information Center

    Nissan, Susan; And Others

    One of the item types in the Listening Comprehension section of the Test of English as a Foreign Language (TOEFL) test is the dialogue. Because the dialogue item pool needs to have an appropriate balance of items at a range of difficulty levels, test developers have examined items at various difficulty levels in an attempt to identify their…

  5. Item development process and analysis of 50 case-based items for implementation on the Korean Nursing Licensing Examination.

    PubMed

    Park, In Sook; Suh, Yeon Ok; Park, Hae Sook; Kang, So Young; Kim, Kwang Sung; Kim, Gyung Hee; Choi, Yeon-Hee; Kim, Hyun-Ju

    2017-01-01

    The purpose of this study was to improve the quality of items on the Korean Nursing Licensing Examination by developing and evaluating case-based items that reflect integrated nursing knowledge. We conducted a cross-sectional observational study to develop new case-based items. The methods for developing test items included expert workshops, brainstorming, and verification of content validity. After a mock examination of undergraduate nursing students using the newly developed case-based items, we evaluated the appropriateness of the items through classical test theory and item response theory. A total of 50 case-based items were developed for the mock examination, and content validity was evaluated. The question items integrated 34 discrete elements of integrated nursing knowledge. The mock examination was taken by 741 baccalaureate students in their fourth year of study at 13 universities. Their average score on the mock examination was 57.4, and the examination showed a reliability of 0.40. According to classical test theory, the average level of item difficulty of the items was 57.4% (80%-100% for 12 items; 60%-80% for 13 items; and less than 60% for 25 items). The mean discrimination index was 0.19, and was above 0.30 for 11 items and 0.20 to 0.29 for 15 items. According to item response theory, the item discrimination parameter (in the logistic model) was none for 10 items (0.00), very low for 20 items (0.01 to 0.34), low for 12 items (0.35 to 0.64), moderate for 6 items (0.65 to 1.34), high for 1 item (1.35 to 1.69), and very high for 1 item (above 1.70). The item difficulty was very easy for 24 items (below -2.0), easy for 8 items (-2.0 to -0.5), medium for 6 items (-0.5 to 0.5), hard for 3 items (0.5 to 2.0), and very hard for 9 items (2.0 or above). The goodness-of-fit test in terms of the 2-parameter item response model between the range of 2.0 to 0.5 revealed that 12 items had an ideal correct answer rate. 
    We surmised that the low reliability of the mock examination was influenced by the timing of the test for the examinees and the inappropriate difficulty of the items. Our study suggested a methodology for the development of future case-based items for the Korean Nursing Licensing Examination.
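
    The classical test theory quantities reported in this abstract can be sketched as follows. Item difficulty is taken as the proportion of examinees answering correctly, matching the percentages quoted above. The abstract does not say which discrimination index was used, so the upper-lower group index below (with a 27% split as the default) is an assumption, and all function names are hypothetical.

```python
from typing import List, Sequence


def item_difficulty(responses: Sequence[Sequence[int]]) -> List[float]:
    """Proportion correct per item; responses[p][i] is 1 if person p got item i right."""
    n_persons = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_persons for i in range(n_items)]


def discrimination_index(responses: Sequence[Sequence[int]],
                         fraction: float = 0.27) -> List[float]:
    """Upper-lower index: proportion correct in the top-scoring group minus the
    proportion correct in the bottom-scoring group, per item."""
    ranked = sorted(responses, key=sum)           # ascending by total score
    k = max(1, round(len(ranked) * fraction))     # size of each extreme group
    lower, upper = ranked[:k], ranked[-k:]
    n_items = len(responses[0])
    return [
        sum(r[i] for r in upper) / k - sum(r[i] for r in lower) / k
        for i in range(n_items)
    ]


if __name__ == "__main__":
    data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    print(item_difficulty(data))
    print(discrimination_index(data, fraction=0.5))
```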

  6. The beneficial effect of testing: an event-related potential study

    PubMed Central

    Bai, Cheng-Hua; Bridger, Emma K.; Zimmer, Hubert D.; Mecklinger, Axel

    2015-01-01

    The enhanced memory performance for items that are tested as compared to being restudied (the testing effect) is a frequently reported memory phenomenon. According to the episodic context account of the testing effect, this beneficial effect of testing is related to a process which reinstates the previously learnt episodic information. Few studies have explored the neural correlates of this effect at the time point when testing takes place, however. In this study, we utilized the ERP correlates of successful memory encoding to address this issue, hypothesizing that if the benefit of testing is due to retrieval-related processes at test then subsequent memory effects (SMEs) should resemble the ERP correlates of retrieval-based processing in their temporal and spatial characteristics. Participants were asked to learn Swahili-German word pairs before items were presented in either a testing or a restudy condition. Memory performance was assessed immediately and 1 day later with a cued recall task. Successfully recalling items at test increased the likelihood that items were remembered over time compared to items which were only restudied. An ERP subsequent memory contrast (later remembered vs. later forgotten tested items), which reflects the engagement of processes that ensure items are recallable the next day, was topographically comparable with the ERP correlate of immediate recollection (immediately remembered vs. immediately forgotten tested items). This result shows that the processes which allow items to be more memorable over time share qualitatively similar neural correlates with the processes that relate to successful retrieval at test. This finding supports the notion that testing is more beneficial than restudying on memory performance over time because of its engagement of retrieval processes, such as the re-encoding of actively retrieved memory representations. PMID:26441577

  7. Opportunity integrated assessment facilitating critical thinking and science process skills measurement on acid base matter

    NASA Astrophysics Data System (ADS)

    Sari, Anggi Ristiyana Puspita; Suyanta, LFX, Endang Widjajanti; Rohaeti, Eli

    2017-05-01

    Recognizing the importance of the development of critical thinking and science process skills, the instrument should give attention to the characteristics of chemistry. Therefore, constructing an accurate instrument for measuring those skills is important. However, integrated assessment instruments are limited in number. The purpose of this study is to validate an integrated assessment instrument for measuring students' critical thinking and science process skills on acid base matter. The development model of the test instrument was adapted from the McIntire model. The sample consisted of 392 second grade high school students in the academic year of 2015/2016 in Yogyakarta. Exploratory Factor Analysis (EFA) was conducted to explore construct validity, whereas content validity was substantiated by Aiken's formula. The result shows that the KMO test is 0.714, which indicates sufficient items for each factor, and the Bartlett test is significant (a significance value of less than 0.05). Furthermore, the content validity coefficient, based on 8 experts, was 0.85. The findings support the integrated assessment instrument to measure critical thinking and science process skills on acid base matter.
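
    Aiken's formula, mentioned in this abstract, can be sketched as below. The coefficient is V = S / (n(c - 1)), where S is the sum of each rating minus the lowest possible rating, n is the number of raters, and c is the number of rating categories. The rating scale and the example ratings are assumptions for illustration only; the abstract reports just the number of experts (8) and the resulting coefficient.

```python
from typing import Sequence


def aikens_v(ratings: Sequence[int], lowest: int, highest: int) -> float:
    """Aiken's V content validity coefficient for one item.

    ratings: one integer rating per expert, each between lowest and highest.
    """
    n = len(ratings)                        # number of raters
    c = highest - lowest + 1                # number of rating categories
    s = sum(r - lowest for r in ratings)    # summed distance above the lowest category
    return s / (n * (c - 1))


if __name__ == "__main__":
    # Hypothetical ratings from 8 experts on an assumed 1-5 scale.
    print(aikens_v([5, 5, 4, 4, 5, 4, 5, 4], lowest=1, highest=5))
```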

  8. The development of a science process assessment for fourth-grade students

    NASA Astrophysics Data System (ADS)

    Smith, Kathleen A.; Welliver, Paul W.

    In this study, a multiple-choice test entitled the Science Process Assessment was developed to measure the science process skills of students in grade four. Based on the Recommended Science Competency Continuum for Grades K to 6 for Pennsylvania Schools, this instrument measured the skills of (1) observing, (2) classifying, (3) inferring, (4) predicting, (5) measuring, (6) communicating, (7) using space/time relations, (8) defining operationally, (9) formulating hypotheses, (10) experimenting, (11) recognizing variables, (12) interpreting data, and (13) formulating models. To prepare the instrument, classroom teachers and science educators were invited to participate in two science education workshops designed to develop an item bank of test questions applicable to measuring process skill learning. Participants formed writing teams and generated 65 test items representing the 13 process skills. After a comprehensive group critique of each item, 61 items were identified for inclusion into the Science Process Assessment item bank. To establish content validity, the item bank was submitted to a select panel of science educators for the purpose of judging item acceptability. This analysis yielded 55 acceptable test items and produced the Science Process Assessment, Pilot 1. Pilot 1 was administered to 184 fourth-grade students. Students were given a copy of the test booklet; teachers read each test aloud to the students. Upon completion of this first administration, data from the item analysis yielded a reliability coefficient of 0.73. Subsequently, 40 test items were identified for the Science Process Assessment, Pilot 2. Using the test-retest method, the Science Process Assessment, Pilot 2 (Test 1 and Test 2) was administered to 113 fourth-grade students. Reliability coefficients of 0.80 and 0.82, respectively, were ascertained. The correlation between Test 1 and Test 2 was 0.77. 
    The results of this study indicate that (1) the Science Process Assessment, Pilot 2, is a valid and reliable instrument applicable to measuring the science process skills of students in grade four, (2) using educational workshops as a means of developing item banks of test questions is viable and productive in the test development process, and (3) involving classroom teachers and science educators in the test development process is educationally efficient and effective.

  9. A Review of the Effects on IRT Item Parameter Estimates with a Focus on Misbehaving Common Items in Test Equating

    PubMed Central

    Michaelides, Michalis P.

    2010-01-01

    Many studies have investigated the topic of change or drift in item parameter estimates in the context of item response theory (IRT). Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items. PMID:21833230

  10. A Review of the Effects on IRT Item Parameter Estimates with a Focus on Misbehaving Common Items in Test Equating.

    PubMed

    Michaelides, Michalis P

    2010-01-01

    Many studies have investigated the topic of change or drift in item parameter estimates in the context of item response theory (IRT). Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items.

  11. On the Relationship Between Classical Test Theory and Item Response Theory: From One to the Other and Back.

    PubMed

    Raykov, Tenko; Marcoulides, George A

    2016-04-01

    The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete nature of the observed items. Two distinct observational equivalence approaches are outlined that render the item response models from corresponding classical test theory-based models, and can each be used to obtain the former from the latter models. Similarly, classical test theory models can be furnished using the reverse application of either of those approaches from corresponding item response models.

  12. Locally Dependent Linear Logistic Test Model with Person Covariates

    ERIC Educational Resources Information Center

    Ip, Edward H.; Smits, Dirk J. M.; De Boeck, Paul

    2009-01-01

    The article proposes a family of item-response models that allow the separate and independent specification of three orthogonal components: item attribute, person covariate, and local item dependence. Special interest lies in extending the linear logistic test model, which is commonly used to measure item attributes, to tests with embedded item…

  13. Applying Bayesian Item Selection Approaches to Adaptive Tests Using Polytomous Items

    ERIC Educational Resources Information Center

    Penfield, Randall D.

    2006-01-01

    This study applied the maximum expected information (MEI) and the maximum posterior-weighted information (MPI) approaches of computer adaptive testing item selection to the case of a test using polytomous items following the partial credit model. The MEI and MPI approaches are described. A simulation study compared the efficiency of ability…

  14. Do Reading Experts Agree with MCAT Verbal Reasoning Item Classifications?

    ERIC Educational Resources Information Center

    Jackson, Evelyn W.; And Others

    1994-01-01

    Examined whether expert raters (n=5) could agree about classification of Medical College Admission Test (MCAT) items and whether they agreed with MCAT student manual in labeling skill being measured by each test item. Results revealed difficulties in replicating authors' labeling of skills for reading items on practice test provided with 1991 MCAT…

  15. Differential Item Functioning: Its Consequences. Research Report. ETS RR-10-01

    ERIC Educational Resources Information Center

    Lee, Yi-Hsuan; Zhang, Jinming

    2010-01-01

    This report examines the consequences of differential item functioning (DIF) using simulated data. Its impact on total score, item response theory (IRT) ability estimate, and test reliability was evaluated in various testing scenarios created by manipulating the following four factors: test length, percentage of DIF items per form, sample sizes of…

  16. Electronics. Criterion-Referenced Test (CRT) Item Bank.

    ERIC Educational Resources Information Center

    Davis, Diane, Ed.

    This document contains 519 criterion-referenced multiple choice and true or false test items for a course in electronics. The test item bank is designed to work with both the Vocational Instructional Management System (VIMS) and the Vocational Administrative Management System (VAMS) in Missouri. The items are grouped into 15 units covering the…

  17. Auto Mechanics. Criterion-Referenced Test (CRT) Item Bank.

    ERIC Educational Resources Information Center

    Tannehill, Dana, Ed.

    This document contains 546 criterion-referenced multiple choice and true or false test items for a course in auto mechanics. The test item bank is designed to work with both the Vocational Instructional Management System (VIMS) and Vocational Administrative Management System (VAMS) in Missouri. The items are grouped into 35 units covering the…

  18. Developing a Strategy for Using Technology-Enhanced Items in Large-Scale Standardized Tests

    ERIC Educational Resources Information Center

    Bryant, William

    2017-01-01

    As large-scale standardized tests move from paper-based to computer-based delivery, opportunities arise for test developers to make use of items beyond traditional selected and constructed response types. Technology-enhanced items (TEIs) have the potential to provide advantages over conventional items, including broadening construct measurement,…

  19. Varying levels of difficulty index of skills-test items randomly selected by examinees on the Korean emergency medical technician licensing examination

    PubMed Central

    2016-01-01

    Purpose: The goal of this study was to characterize the difficulty index of the items in the skills test components of the class I and II Korean emergency medical technician licensing examination (KEMTLE), which requires examinees to select items randomly. Methods: The results of 1,309 class I KEMTLE examinations and 1,801 class II KEMTLE examinations in 2013 were subjected to analysis. Items from the basic and advanced skills test sections of the KEMTLE were compared to determine whether some were significantly more difficult than others. Results: In the class I KEMTLE, all 4 of the items on the basic skills test showed significant variation in difficulty index (P<0.01), as well as 4 of the 5 items on the advanced skills test (P<0.05). In the class II KEMTLE, 4 of the 5 items on the basic skills test showed significantly different difficulty index (P<0.01), as well as all 3 of the advanced skills test items (P<0.01). Conclusion: In the skills test components of the class I and II KEMTLE, the procedure in which examinees randomly select questions should be revised to require examinees to respond to a set of fixed items in order to improve the reliability of the national licensing examination. PMID:26883810

  20. Reliability of the Client-Centeredness of Goal Setting (C-COGS) Scale in Acquired Brain Injury Rehabilitation.

    PubMed

    Doig, Emmah; Prescott, Sarah; Fleming, Jennifer; Cornwell, Petrea; Kuipers, Pim

    2016-01-01

    To examine the internal reliability and test-retest reliability of the Client-Centeredness of Goal Setting (C-COGS) scale. The C-COGS scale was administered to 42 participants with acquired brain injury after completion of multidisciplinary goal planning. Internal reliability of scale items was examined using item-partial total correlations and Cronbach's α coefficient. The scale was readministered within a 1-mo period to a subsample of 12 participants to examine test-retest reliability by calculating exact and close percentage agreement for each item. After examination of item-partial total correlations, test items were revised. The revised items demonstrated stronger internal consistency than the original items. Preliminary evaluation of test-retest reliability was fair, with an average exact percent agreement across all test items of 67%. Findings support the preliminary reliability of the C-COGS scale as a tool to evaluate and promote client-centered goal planning in brain injury rehabilitation. Copyright © 2016 by the American Occupational Therapy Association, Inc.
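
    Two of the statistics named in this abstract, Cronbach's α and exact percentage agreement, can be sketched as follows. The function names and the example data are hypothetical, and nothing here reproduces the C-COGS data or scoring rules.

```python
from statistics import pvariance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; scores[p][i] is the score of person p on item i.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance of total scores).
    Total scores must not all be equal, or the total variance is zero.
    """
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


def exact_percent_agreement(first: Sequence[int], second: Sequence[int]) -> float:
    """Percentage of respondents giving the identical rating on both occasions."""
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return 100.0 * matches / len(first)


if __name__ == "__main__":
    print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
    print(exact_percent_agreement([1, 2, 3, 4], [1, 2, 0, 4]))
```

    Population variances are used for both the items and the totals; using sample variances throughout would give the same α, because the correction factor cancels in the ratio.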

  1. Item-Writing Guidelines for Physics

    ERIC Educational Resources Information Center

    Regan, Tom

    2015-01-01

    A teacher learning how to write test questions (test items) will almost certainly encounter item-writing guidelines--lists of item-writing do's and don'ts. Item-writing guidelines usually are presented as applicable across all assessment settings. Table I shows some guidelines that I believe to be generally applicable and two will be briefly…

  2. Unidimensional Interpretations for Multidimensional Test Items

    ERIC Educational Resources Information Center

    Kahraman, Nilufer

    2013-01-01

    This article considers potential problems that can arise in estimating a unidimensional item response theory (IRT) model when some test items are multidimensional (i.e., show a complex factorial structure). More specifically, this study examines (1) the consequences of model misfit on IRT item parameter estimates due to unintended minor item-level…

  3. Measuring psychological trauma after spinal cord injury: Development and psychometric characteristics of the SCI-QOL Psychological Trauma item bank and short form

    PubMed Central

    Kisala, Pamela A.; Victorson, David; Pace, Natalie; Heinemann, Allen W.; Choi, Seung W.; Tulsky, David S.

    2015-01-01

    Objective: To describe the development and psychometric properties of the SCI-QOL Psychological Trauma item bank and short form. Design: Using a mixed-methods design, we developed and tested a Psychological Trauma item bank with patient and provider focus groups, cognitive interviews, and item response theory based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. Setting: We tested a 31-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Veterans Administration hospital. Participants: A total of 716 individuals with SCI completed the trauma items. Results: The 31 items fit a unidimensional model (CFI=0.952; RMSEA=0.061) and demonstrated good precision (theta range between 0.6 and 2.5). Nine items demonstrated negligible DIF with little impact on score estimates. The final calibrated item bank contains 19 items. Conclusion: The SCI-QOL Psychological Trauma item bank is a psychometrically robust measurement tool from which a short form and a computer adaptive test (CAT) version are available. PMID:26010967

  4. Repeated retrieval practice and item difficulty: does criterion learning eliminate item difficulty effects?

    PubMed

    Vaughn, Kalif E; Rawson, Katherine A; Pyc, Mary A

    2013-12-01

    A wealth of previous research has established that retrieval practice promotes memory, particularly when retrieval is successful. Although successful retrieval promotes memory, it remains unclear whether successful retrieval promotes memory equally well for items of varying difficulty. Will easy items still outperform difficult items on a final test if all items have been correctly recalled equal numbers of times during practice? In two experiments, normatively difficult and easy Lithuanian-English word pairs were learned via test-restudy practice until each item had been correctly recalled a preassigned number of times (from 1 to 11 correct recalls). Despite equating the numbers of successful recalls during practice, performance on a delayed final cued-recall test was lower for difficult than for easy items. Experiment 2 was designed to diagnose whether the disadvantage for difficult items was due to deficits in cue memory, target memory, and/or associative memory. The results revealed a disadvantage for the difficult versus the easy items only on the associative recognition test, with no differences on cue recognition, and even an advantage on target recognition. Although successful retrieval enhanced memory for both difficult and easy items, equating retrieval success during practice did not eliminate normative item difficulty differences.

  5. Test Bias: An Objective Definition for Test Items.

    ERIC Educational Resources Information Center

    Durovic, Jerry J.

    A test bias definition, applicable at the item level of a test, is presented. The definition conceptually equates test bias with measuring different things in different groups, and operationally equates test bias with a between-group difference in item fit to the Rasch Model that is greater than one. It is suggested that the proposed definition avoids…

  6. Fixed or mixed: a comparison of three, four and mixed-option multiple-choice tests in a Fetal Surveillance Education Program

    PubMed Central

    2013-01-01

    Background: Despite the widespread use of multiple-choice assessments in medical education assessment, current practice and published advice concerning the number of response options remains equivocal. This article describes an empirical study contrasting the quality of three 60 item multiple-choice test forms within the Royal Australian and New Zealand College of Obstetricians and Gynaecologists (RANZCOG) Fetal Surveillance Education Program (FSEP). The three forms are described below. Methods: The first form featured four response options per item. The second form featured three response options, having removed the least functioning option from each item in the four-option counterpart. The third test form was constructed by retaining the best performing version of each item from the first two test forms. It contained both three and four option items. Results: Psychometric and educational factors were taken into account in formulating an approach to test construction for the FSEP. The four-option test performed better than the three-option test overall, but some items were improved by the removal of options. The mixed-option test demonstrated better measurement properties than the fixed-option tests, and has become the preferred test format in the FSEP program. The criteria used were reliability, errors of measurement and fit to the item response model. Conclusions: The position taken is that decisions about the number of response options be made at the item level, with plausible options being added to complete each item on both psychometric and educational grounds rather than complying with a uniform policy. The point is to construct the better performing item in providing the best psychometric and educational information. PMID:23453056

  7. Fixed or mixed: a comparison of three, four and mixed-option multiple-choice tests in a Fetal Surveillance Education Program.

    PubMed

    Zoanetti, Nathan; Beaves, Mark; Griffin, Patrick; Wallace, Euan M

    2013-03-04

    Despite the widespread use of multiple-choice assessments in medical education assessment, current practice and published advice concerning the number of response options remains equivocal. This article describes an empirical study contrasting the quality of three 60 item multiple-choice test forms within the Royal Australian and New Zealand College of Obstetricians and Gynaecologists (RANZCOG) Fetal Surveillance Education Program (FSEP). The three forms are described below. The first form featured four response options per item. The second form featured three response options, having removed the least functioning option from each item in the four-option counterpart. The third test form was constructed by retaining the best performing version of each item from the first two test forms. It contained both three and four option items. Psychometric and educational factors were taken into account in formulating an approach to test construction for the FSEP. The four-option test performed better than the three-option test overall, but some items were improved by the removal of options. The mixed-option test demonstrated better measurement properties than the fixed-option tests, and has become the preferred test format in the FSEP program. The criteria used were reliability, errors of measurement and fit to the item response model. The position taken is that decisions about the number of response options be made at the item level, with plausible options being added to complete each item on both psychometric and educational grounds rather than complying with a uniform policy. The point is to construct the better performing item in providing the best psychometric and educational information.

  8. Chemistry inside an Epistemological Community Box! Discursive Exclusions and Inclusions in Swedish National Tests in Chemistry

    ERIC Educational Resources Information Center

    Ståhl, Marie; Hussénius, Anita

    2017-01-01

    This study examined the Swedish national tests in chemistry for implicit and explicit values. The chemistry subject is understudied compared to biology and physics and students view chemistry as their least interesting science subject. The Swedish national science assessments aim to support equitable and fair evaluation of students, to concretize…

  9. Detecting Gender Bias Through Test Item Analysis

    NASA Astrophysics Data System (ADS)

    González-Espada, Wilson J.

    2009-03-01

    Many physical science and physics instructors might not be trained in pedagogically appropriate test construction methods. This could lead to test items that do not measure what they are intended to measure. A subgroup of these items might show bias against some groups of students. This paper describes how the author became aware of potentially biased items against females in his examinations, which led to the exploration of fundamental issues related to item validity, gender bias, and differential item functioning, or DIF. A brief discussion of DIF in the context of university courses, as well as practical suggestions to detect possible gender-biased items, follows.

  10. Approaches and Study Skills Inventory for Students (ASSIST) in an Introductory Course in Chemistry

    ERIC Educational Resources Information Center

    Brown, Stephen; White, Sue; Wakeling, Lara; Naiker, Mani

    2015-01-01

    Approaches to study and learning may enhance or undermine educational outcomes, and thus it is important for educators to be knowledgeable about their students' approaches to study and learning. The Approaches and Study Skills Inventory for Students (ASSIST)--a 52 item inventory which identifies three learning styles (Deep, Strategic, and…

  11. Toward a Tripartite Model of Research Motivation: Development and Initial Validation of the Research Motivation Scale

    ERIC Educational Resources Information Center

    Deemer, Eric D.; Martens, Matthew P.; Buboltz, Walter C.

    2010-01-01

    An instrument designed to measure a 3-factor model of research motivation was developed and psychometrically examined in the present research. Participants were 437 graduate students in biology, chemistry/biochemistry, physics/astronomy, and psychology. A principal components analysis supported the retention of 20 items representing the 3-factor…

  12. Estimating Total-test Scores from Partial Scores in a Matrix Sampling Design.

    ERIC Educational Resources Information Center

    Sachar, Jane; Suppes, Patrick

    It is sometimes desirable to obtain an estimated total-test score for an individual who was administered only a subset of the items in a total test. The present study compared six methods, two of which utilize the content structure of items, to estimate total-test scores using 450 students in grades 3-5 and 60 items of the 110-item Stanford Mental…

  13. Differential item functioning analysis of the Vanderbilt Expertise Test for cars

    PubMed Central

    Lee, Woo-Yeol; Cho, Sun-Joo; McGugin, Rankin W.; Van Gulick, Ana Beth; Gauthier, Isabel

    2015-01-01

    The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge. PMID:26418499
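
    For readers unfamiliar with the three-parameter logistic model mentioned in this abstract, the standard item response function can be sketched as below. The parameter values are illustrative assumptions, not estimates from the VETcar study, and the optional scaling constant sometimes written as D is omitted.

```python
import math


def three_pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: subject ability
    a: item discrimination
    b: item difficulty
    c: lower asymptote (pseudo-guessing parameter)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Illustrative (hypothetical) item: when ability equals difficulty,
# the probability is halfway between the guessing floor c and 1.
p_at_difficulty = three_pl(theta=0.0, a=1.0, b=0.0, c=0.2)
```

    In a DIF analysis of the kind described, the question is whether a, b, or c for the same item differs between groups (for example, lab versus online) after abilities are placed on a common scale.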

  14. Modeling Item-Level and Step-Level Invariance Effects in Polytomous Items Using the Partial Credit Model

    ERIC Educational Resources Information Center

    Gattamorta, Karina A.; Penfield, Randall D.; Myers, Nicholas D.

    2012-01-01

    Measurement invariance is a common consideration in the evaluation of the validity and fairness of test scores when the tested population contains distinct groups of examinees, such as examinees receiving different forms of a translated test. Measurement invariance in polytomous items has traditionally been evaluated at the item-level,…

  15. Science Library of Test Items. Volume Two.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    The second volume of test items in the Science Library of Test Items is intended as a resource to assist teachers in implementing and evaluating science courses in the first 4 years of Australian secondary school. The items were selected from questions submitted to the School Certificate Development Unit by teachers in New South Wales. Only the…

  16. Measuring the Instructional Sensitivity of ESL Reading Comprehension Items.

    ERIC Educational Resources Information Center

    Brutten, Sheila R.; And Others

    A study attempted to estimate the instructional sensitivity of items in three reading comprehension tests in English as a second language (ESL). Instructional sensitivity is a test-item construct defined as the tendency for a test item to vary in difficulty as a function of instruction. Similar tasks were given to readers at different proficiency…

  17. Reducing the Impact of Inappropriate Items on Reviewable Computerized Adaptive Testing

    ERIC Educational Resources Information Center

    Yen, Yung-Chin; Ho, Rong-Guey; Liao, Wen-Wei; Chen, Li-Ju

    2012-01-01

    In a test, the score would be closer to the examinee's actual ability if careless mistakes were corrected. In CAT, however, changing the answer to one item might make the subsequent items no longer appropriate for estimating the examinee's ability. These inappropriate items in a reviewable CAT might in turn introduce bias in ability…

  18. Comparing and Combining Dichotomous and Polytomous Items with SPRT Procedure in Computerized Classification Testing.

    ERIC Educational Resources Information Center

    Lau, C. Allen; Wang, Tianyou

    The purposes of this study were to: (1) extend the sequential probability ratio testing (SPRT) procedure to polytomous item response theory (IRT) models in computerized classification testing (CCT); (2) compare polytomous items with dichotomous items using the SPRT procedure for their accuracy and efficiency; (3) study a direct approach in…

  19. A Conditional Exposure Control Method for Multidimensional Adaptive Testing

    ERIC Educational Resources Information Center

    Finkelman, Matthew; Nering, Michael L.; Roussos, Louis A.

    2009-01-01

    In computerized adaptive testing (CAT), ensuring the security of test items is a crucial practical consideration. A common approach to reducing item theft is to define maximum item exposure rates, i.e., to limit the proportion of examinees to whom a given item can be administered. Numerous methods for controlling exposure rates have been proposed…
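
    The definition of an item exposure rate given above (the proportion of examinees to whom an item is administered) can be illustrated with a small sketch. The function names, item identifiers, and the 0.4 ceiling are hypothetical, and this only audits exposure after the fact; it is not the conditional control method the article proposes.

```python
from collections import Counter


def exposure_rates(administrations):
    """Map each item id to the proportion of examinees who saw it.

    administrations: one collection of item ids per examinee.
    """
    counts = Counter(item for seen in administrations for item in set(seen))
    n_examinees = len(administrations)
    return {item: count / n_examinees for item, count in counts.items()}


def overexposed(administrations, r_max):
    """Items whose observed exposure rate exceeds the ceiling r_max."""
    rates = exposure_rates(administrations)
    return sorted(item for item, rate in rates.items() if rate > r_max)


# Hypothetical log of four adaptive test sessions.
log = [{"i1", "i2"}, {"i1", "i3"}, {"i1", "i2"}, {"i4", "i1"}]
flagged = overexposed(log, r_max=0.4)
```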

  20. The Effects of Clinically Relevant Multiple-Choice Items on the Statistical Discrimination of Physician Clinical Competence.

    ERIC Educational Resources Information Center

    Downing, Steven M.; Maatsch, Jack L.

    To test the effect of clinically relevant multiple-choice item content on the validity of statistical discriminations of physicians' clinical competence, data were collected from a field test of the Emergency Medicine Examination, test items for the certification of specialists in emergency medicine. Two 91-item multiple-choice subscales were…

  1. The Effect of Including or Excluding Students with Testing Accommodations on IRT Calibrations.

    ERIC Educational Resources Information Center

    Karkee, Thakur; Lewis, Dan M.; Barton, Karen; Haug, Carolyn

    This study aimed to determine the degree to which the inclusion of accommodated students with disabilities in the calibration sample affects the characteristics of item parameters and the test results. Investigated were effects on test reliability, item fit to the applicable item response theory (IRT) model, item parameter estimates, and students'…

  2. Zambian pre-service junior high school science teachers' chemical reasoning and ability

    NASA Astrophysics Data System (ADS)

    Banda, Asiana

    The purpose of this study was two-fold: examine junior high school pre-service science teachers' chemical reasoning; and establish the extent to which the pre-service science teachers' chemical abilities explain their chemical reasoning. A sample comprised 165 junior high school pre-service science teachers at Mufulira College of Education in Zambia. There were 82 males and 83 females. Data were collected using a Chemical Concept Reasoning Test (CCRT). Pre-service science teachers' chemical reasoning was established through qualitative analysis of their responses to test items. The Rasch Model was used to determine the pre-service teachers' chemical abilities and item difficulty. Results show that most pre-service science teachers had incorrect chemical reasoning on chemical concepts assessed in this study. There was no significant difference in chemical understanding between the Full-Time and Distance Education pre-service science teachers, and between second and third year pre-service science teachers. However, there was a significant difference in chemical understanding between male and female pre-service science teachers. Male pre-service science teachers showed better chemical understanding than female pre-service science teachers. The Rasch model revealed that the pre-service science teachers had low chemical abilities, and the CCRT was very difficult for this group of pre-service science teachers. As such, their incorrect chemical reasoning was attributed to their low chemical abilities. These results have implications on science teacher education, chemistry teaching and learning, and chemical education research.

  3. Three controversies over item disclosure in medical licensure examinations.

    PubMed

    Park, Yoon Soo; Yang, Eunbae B

    2015-01-01

    In response to views on the public's right to know, there is growing attention to item disclosure - release of items, answer keys, and performance data to the public - in medical licensure examinations and their potential impact on the test's ability to measure competence and select qualified candidates. Recent debates on this issue have sparked legislative action internationally, including South Korea, with prior discussions among North American countries dating back over three decades. The purpose of this study is to identify and analyze three issues associated with item disclosure in medical licensure examinations - 1) fairness and validity, 2) impact on passing levels, and 3) utility of item disclosure - by synthesizing existing literature in relation to standards in testing. Historically, the controversy over item disclosure has centered on fairness and validity. Proponents of item disclosure stress test takers' right to know, while opponents argue from a validity perspective. Item disclosure may bias item characteristics, such as difficulty and discrimination, and has consequences on setting passing levels. To date, there has been limited research on the utility of item disclosure for large scale testing. These issues require ongoing and careful consideration.

  4. Online Calibration of Polytomous Items Under the Generalized Partial Credit Model

    PubMed Central

    Zheng, Yi

    2016-01-01

    Online calibration is a technology-enhanced architecture for item calibration in computerized adaptive tests (CATs). Many CATs are administered continuously over a long term and rely on large item banks. To ensure test validity, these item banks need to be frequently replenished with new items, and these new items need to be pretested before being used operationally. Online calibration dynamically embeds pretest items in operational tests and calibrates their parameters as response data are gradually obtained through the continuous test administration. This study extends existing formulas, procedures, and algorithms for dichotomous item response theory models to the generalized partial credit model, a popular model for items scored in more than two categories. A simulation study was conducted to investigate the developed algorithms and procedures under a variety of conditions, including two estimation algorithms, three pretest item selection methods, three seeding locations, two numbers of score categories, and three calibration sample sizes. Results demonstrated acceptable estimation accuracy of the two estimation algorithms in some of the simulated conditions. A variety of findings were also revealed for the interacted effects of included factors, and recommendations were made respectively. PMID:29881063
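
    As background for the generalized partial credit model named in this abstract, the category probabilities for one polytomous item can be sketched as follows. This shows only the standard response function under assumed parameter values; it does not reproduce the online calibration algorithms developed in the study.

```python
import math


def gpcm_probs(theta, a, thresholds):
    """Category probabilities for one item under the GPCM.

    theta: examinee ability
    a: item discrimination
    thresholds: step parameters b_1..b_m, giving m + 1 score categories
    Returns a list of probabilities for scores 0..m.
    """
    # Cumulative sums of a * (theta - b_v); the score-0 term is defined as 0.
    cumulative = [0.0]
    for b in thresholds:
        cumulative.append(cumulative[-1] + a * (theta - b))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-category item (two step parameters).
probs = gpcm_probs(theta=0.5, a=1.2, thresholds=[-0.5, 1.0])
```

    With a single threshold the function reduces to the two-parameter logistic model, which is one way to see how the dichotomous formulas generalize to items scored in more than two categories.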

  5. Evaluating Statistical Targets for Assembling Parallel Mixed-Format Test Forms

    ERIC Educational Resources Information Center

    Debeer, Dries; Ali, Usama S.; van Rijn, Peter W.

    2017-01-01

    Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…

  6. CFD Code Development for Combustor Flows

    NASA Technical Reports Server (NTRS)

    Norris, Andrew

    2003-01-01

    During the lifetime of this grant, work has been performed in the areas of model development, code development, code validation and code application. For model development, this has included the PDF combustion module, chemical kinetics based on thermodynamics, neural network storage of chemical kinetics, ILDM chemical kinetics and assumed PDF work. Many of these models were then implemented in the code, and in addition many improvements were made to the code, including the addition of new chemistry integrators, property evaluation schemes, new chemistry models and turbulence-chemistry interaction methodology. Validation of all new models and code improvements was also performed, as was application of the code to the ZCET program and the NPSS GEW combustor program. Several important items remain under development, including the NOx post processing, assumed PDF model development and chemical kinetic development. It is expected that this work will continue under the new grant.

  7. Nickel and cobalt release from jewellery and metal clothing items in Korea.

    PubMed

    Cheong, Seung Hyun; Choi, You Won; Choi, Hae Young; Byun, Ji Yeon

    2014-01-01

    In Korea, the prevalence of nickel allergy has shown a sharply increasing trend. Cobalt contact allergy is often associated with concomitant reactions to nickel, and is more common in Korea than in western countries. The aim of the present study was to investigate the prevalence of items that release nickel and cobalt on the Korean market. A total of 471 items that included 193 branded jewellery, 202 non-branded jewellery and 76 metal clothing items were sampled and studied with a dimethylglyoxime (DMG) test and a cobalt spot test to detect nickel and cobalt release, respectively. Nickel release was detected in 47.8% of the tested items. The positive rates in the DMG test were 12.4% for the branded jewellery, 70.8% for the non-branded jewellery, and 76.3% for the metal clothing items. Cobalt release was found in 6.2% of items. Among the types of jewellery, belts and hair pins showed higher positive rates in both the DMG test and the cobalt spot test. Our study shows that the prevalence of items that release nickel or cobalt among jewellery and metal clothing items is high in Korea. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
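
    The overall nickel rate can be checked against the three subgroup rates with simple arithmetic. The item counts below are inferred by rounding the reported percentages to whole items; they are an assumption, since the abstract does not state them directly.

```python
# Reported sample sizes and DMG-positive percentages from the abstract.
groups = {
    "branded jewellery": (193, 12.4),
    "non-branded jewellery": (202, 70.8),
    "metal clothing items": (76, 76.3),
}

# Inferred (not reported) positive counts: nearest whole number of items.
positives = {name: round(n * pct / 100) for name, (n, pct) in groups.items()}

total_items = sum(n for n, _ in groups.values())
total_positives = sum(positives.values())
overall_rate = 100 * total_positives / total_items

print(positives)
print(total_items, total_positives, round(overall_rate, 1))
```

    Under this assumption the subgroup counts sum to 225 of 471 items, which rounds to the reported 47.8%, so the figures in the abstract are internally consistent.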

  8. The Role of Item Feedback in Self-Adapted Testing.

    ERIC Educational Resources Information Center

    Roos, Linda L.; And Others

    1997-01-01

    The importance of item feedback in self-adapted testing was studied by comparing feedback and no feedback conditions for computerized adaptive tests and self-adapted tests taken by 363 college students. Results indicate that item feedback is not necessary to realize score differences between self-adapted and computerized adaptive testing. (SLD)

  9. Criterion-Referenced Test Items for Auto Body.

    ERIC Educational Resources Information Center

    Tannehill, Dana, Ed.

    This test item bank on auto body repair contains criterion-referenced test questions based upon competencies found in the Missouri Auto Body Competency Profile. Some test items are keyed for multiple competencies. The tests cover the following 26 competency areas in the auto body curriculum: auto body careers; measuring and mixing; tools and…

  10. Automated Test-Form Generation

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Diao, Qi

    2011-01-01

    In automated test assembly (ATA), the methodology of mixed-integer programming is used to select test items from an item bank to meet the specifications for a desired test form and optimize its measurement accuracy. The same methodology can be used to automate the formatting of the set of selected items into the actual test form. Three different…

  11. Geography, Years 7-10, Library of Test Items. Volume Eight. Junior Secondary Items To Be Used With 1976 to 1980 H.S.C. Geography Exam. Broadsheets.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  12. 42 CFR 493.931 - Routine chemistry.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 42 Public Health 5 2010-10-01 2010-10-01 false Routine chemistry. 493.931 Section 493.931 Public... Proficiency Testing Programs by Specialty and Subspecialty § 493.931 Routine chemistry. (a) Program content and frequency of challenge. To be approved for proficiency testing for routine chemistry, a program...

  13. 42 CFR 493.931 - Routine chemistry.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 42 Public Health 5 2013-10-01 2013-10-01 false Routine chemistry. 493.931 Section 493.931 Public... Proficiency Testing Programs by Specialty and Subspecialty § 493.931 Routine chemistry. (a) Program content and frequency of challenge. To be approved for proficiency testing for routine chemistry, a program...

  14. 42 CFR 493.931 - Routine chemistry.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 42 Public Health 5 2012-10-01 2012-10-01 false Routine chemistry. 493.931 Section 493.931 Public... Proficiency Testing Programs by Specialty and Subspecialty § 493.931 Routine chemistry. (a) Program content and frequency of challenge. To be approved for proficiency testing for routine chemistry, a program...

  15. 42 CFR 493.931 - Routine chemistry.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 42 Public Health 5 2014-10-01 2014-10-01 false Routine chemistry. 493.931 Section 493.931 Public... Proficiency Testing Programs by Specialty and Subspecialty § 493.931 Routine chemistry. (a) Program content and frequency of challenge. To be approved for proficiency testing for routine chemistry, a program...

  16. 42 CFR 493.931 - Routine chemistry.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 42 Public Health 5 2011-10-01 2011-10-01 false Routine chemistry. 493.931 Section 493.931 Public... Proficiency Testing Programs by Specialty and Subspecialty § 493.931 Routine chemistry. (a) Program content and frequency of challenge. To be approved for proficiency testing for routine chemistry, a program...

  17. Solving the measurement invariance anchor item problem in item response theory.

    PubMed

    Meade, Adam W; Wright, Natalie A

    2012-09-01

    The efficacy of tests of differential item functioning (measurement invariance) has been well established. It is clear that when properly implemented, these tests can successfully identify differentially functioning (DF) items when they exist. However, an assumption of these analyses is that the metric for different groups is linked using anchor items that are invariant. In practice, however, it is impossible to be certain which items are DF and which are invariant. This problem of anchor items, or referent indicators, has long plagued invariance research, and a multitude of suggested approaches have been put forth. Unfortunately, the relative efficacy of these approaches has not been tested. This study compares 11 variations on 5 qualitatively different approaches from recent literature for selecting optimal anchor items. A large-scale simulation study indicates that for nearly all conditions, an easily implemented 2-stage procedure recently put forth by Lopez Rivas, Stark, and Chernyshenko (2009) provided optimal power while maintaining nominal Type I error. With this approach, appropriate anchor items can be easily and quickly located, resulting in more efficacious invariance tests. Recommendations for invariance testing are illustrated using a pedagogical example of employee responses to an organizational culture measure.

  18. When Listening Is Better Than Reading: Performance Gains on Cardiac Auscultation Test Questions.

    PubMed

    Short, Kathleen; Bucak, S Deniz; Rosenthal, Francine; Raymond, Mark R

    2018-05-01

    In 2007, the United States Medical Licensing Examination embedded multimedia simulations of heart sounds into multiple-choice questions. This study investigated changes in item difficulty as determined by examinee performance over time. The data reflect outcomes obtained following initial use of multimedia items from 2007 through 2012, after which an interface change occurred. A total of 233,157 examinees responded to 1,306 cardiology test items over the six-year period; 138 items included multimedia simulations of heart sounds, while 1,168 text-based items without multimedia served as controls. The authors compared changes in difficulty of multimedia items over time with changes in difficulty of text-based cardiology items over time. Further, they compared changes in item difficulty for both groups of items between graduates of Liaison Committee on Medical Education (LCME)-accredited and non-LCME-accredited (i.e., international) medical schools. Examinee performance on cardiology test items with multimedia heart sounds improved by 12.4% over the six-year period, while performance on text-based cardiology items improved by approximately 1.4%. These results were similar for graduates of LCME-accredited and non-LCME-accredited medical schools. Examinees' ability to interpret auscultation findings in test items that include multimedia presentations increased from 2007 to 2012.

  19. Revisiting the role of recollection in item versus forced-choice recognition memory.

    PubMed

    Cook, Gabriel I; Marsh, Richard L; Hicks, Jason L

    2005-08-01

    Many memory theorists have assumed that forced-choice recognition tests can rely more on familiarity, whereas item (yes-no) tests must rely more on recollection. In actuality, several studies have found no differences in the contributions of recollection and familiarity underlying the two different test formats. Using word frequency to manipulate stimulus characteristics, the present study demonstrated that the contribution of recollection to item versus forced-choice tests is variable. Low word frequency resulted in significantly more recollection in an item test than did a forced-choice procedure, but high word frequency produced the opposite result. These results clearly constrain any uniform claim about the degree to which recollection supports responding in item versus forced-choice tests.

  20. A Comparison of Methods of Vertical Equating.

    ERIC Educational Resources Information Center

    Loyd, Brenda H.; Hoover, H. D.

    Rasch model vertical equating procedures were applied to three mathematics computation tests for grades six, seven, and eight. Each level of the test was composed of 45 items in three sets of 15 items, arranged in such a way that tests for adjacent grades had two sets (30 items) in common, and the sixth and eighth grades had 15 items in common. In…

  1. Ability or Access-Ability: Differential Item Functioning of Items on Alternate Performance-Based Assessment Tests for Students with Visual Impairments

    ERIC Educational Resources Information Center

    Zebehazy, Kim T.; Zigmond, Naomi; Zimmerman, George J.

    2012-01-01

    Introduction: This study investigated differential item functioning (DIF) of test items on Pennsylvania's Alternate System of Assessment (PASA) for students with visual impairments and severe cognitive disabilities and what the reasons for the differences may be. Methods: The Wilcoxon signed ranks test was used to analyze differences in the scores…

  2. Objective and Item Banking Computer Software and Its Use in Comprehensive Achievement Monitoring.

    ERIC Educational Resources Information Center

    Schriber, Peter E.; Gorth, William P.

    The current emphasis on objectives and test item banks for constructing more effective tests is being augmented by increasingly sophisticated computer software. Items can be catalogued in numerous ways for retrieval. The items as well as instructional objectives can be stored and test forms can be selected and printed by the computer. It is also…

  3. An Item-Driven Adaptive Design for Calibrating Pretest Items. Research Report. ETS RR-14-38

    ERIC Educational Resources Information Center

    Ali, Usama S.; Chang, Hua-Hua

    2014-01-01

    Adaptive testing is advantageous in that it provides more efficient ability estimates with fewer items than linear testing does. Item-driven adaptive pretesting may also offer similar advantages, and verification of such a hypothesis about item calibration was the main objective of this study. A suitability index (SI) was introduced to adaptively…

  4. Fitting the Rasch Model to Account for Variation in Item Discrimination

    ERIC Educational Resources Information Center

    Weitzman, R. A.

    2009-01-01

    Building on the Kelley and Gulliksen versions of classical test theory, this article shows that a logistic model having only a single item parameter can account for varying item discrimination, as well as difficulty, by using item-test correlations to adjust incorrect-correct (0-1) item responses prior to an initial model fit. The fit occurs…

  5. Weighted Maximum-a-Posteriori Estimation in Tests Composed of Dichotomous and Polytomous Items

    ERIC Educational Resources Information Center

    Sun, Shan-Shan; Tao, Jian; Chang, Hua-Hua; Shi, Ning-Zhong

    2012-01-01

    For mixed-type tests composed of dichotomous and polytomous items, polytomous items often yield more information than dichotomous items. To reflect the difference between the two types of items and to improve the precision of ability estimation, an adaptive weighted maximum-a-posteriori (WMAP) estimation is proposed. To evaluate the performance of…

  6. Examination of Polytomous Items' Psychometric Properties According to Nonparametric Item Response Theory Models in Different Test Conditions

    ERIC Educational Resources Information Center

    Sengul Avsar, Asiye; Tavsancil, Ezel

    2017-01-01

    This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three samples sizes (100, 250 and 500)--were generated by conducting 20…

  7. Rasch Measurement and Item Banking: Theory and Practice.

    ERIC Educational Resources Information Center

    Nakamura, Yuji

    The Rasch Model is a one-parameter item response theory model which states that the probability of a correct response on a test is a function of the difficulty of the item and the ability of the candidate. Item banking is useful for language testing. The Rasch Model provides estimates of item difficulties that are meaningful,…
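
    The relationship stated in this abstract can be sketched in a few lines. This is a minimal illustration of the standard one-parameter logistic form, not code from the cited report; the function name and the sample ability and difficulty values (in logits) are assumptions made for the example.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch (one-parameter logistic) model.

    Both arguments are on the same logit scale; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical values: a candidate whose ability equals the item difficulty,
# and the same candidate facing an easier item.
p_matched = rasch_probability(ability=0.0, difficulty=0.0)
p_easier = rasch_probability(ability=0.0, difficulty=-1.0)
print(p_matched, p_easier)
```

    When ability equals difficulty the probability is one half, and it rises as the item becomes easier relative to the candidate.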

  8. Test Design Project: Studies in Test Bias. Annual Report.

    ERIC Educational Resources Information Center

    McArthur, David

    Item bias in a multiple-choice test can be detected by appropriate analyses of the persons x items scoring matrix. This permits comparison of groups of examinees tested with the same instrument. The test may be biased if it is not measuring the same thing in comparable groups, if groups are responding to different aspects of the test items, or if…

  9. The Impact of Settable Test Item Exposure Control Interface Format on Postsecondary Business Student Test Performance

    ERIC Educational Resources Information Center

    Truell, Allen D.; Zhao, Jensen J.; Alexander, Melody W.

    2005-01-01

    The purposes of this study were to determine if there is a significant difference in postsecondary business student scores and test completion time based on settable test item exposure control interface format, and to determine if there is a significant difference in student scores and test completion time based on settable test item exposure…

  10. Estimating Total-Test Scores from Partial Scores in a Matrix Sampling Design.

    ERIC Educational Resources Information Center

    Sachar, Jane; Suppes, Patrick

    1980-01-01

    The present study compared six methods, two of which utilize the content structure of items, to estimate total-test scores using 450 students and 60 items of the 110-item Stanford Mental Arithmetic Test. Three methods yielded fairly good estimates of the total-test score. (Author/RL)

  11. A Chemistry Concept Reasoning Test

    ERIC Educational Resources Information Center

    Cloonan, Carrie A.; Hutchinson, John S.

    2011-01-01

    A Chemistry Concept Reasoning Test was created and validated providing an easy-to-use tool for measuring conceptual understanding and critical scientific thinking of general chemistry models and theories. The test is designed to measure concept understanding comparable to that found in free-response questions requiring explanations over…

  12. A Generalized DIF Effect Variance Estimator for Measuring Unsigned Differential Test Functioning in Mixed Format Tests

    ERIC Educational Resources Information Center

    Penfield, Randall D.; Algina, James

    2006-01-01

    One approach to measuring unsigned differential test functioning is to estimate the variance of the differential item functioning (DIF) effect across the items of the test. This article proposes two estimators of the DIF effect variance for tests containing dichotomous and polytomous items. The proposed estimators are direct extensions of the…

  13. The quadratic relationship between difficulty of intelligence test items and their correlations with working memory.

    PubMed

    Smolen, Tomasz; Chuderski, Adam

    2015-01-01

    Fluid intelligence (Gf) is a crucial cognitive ability that involves abstract reasoning in order to solve novel problems. Recent research demonstrated that Gf strongly depends on the individual effectiveness of working memory (WM). We investigated a popular claim that if the storage capacity underlay the WM-Gf correlation, then such a correlation should increase with an increasing number of items or rules (load) in a Gf-test. As often no such link is observed, on that basis the storage-capacity account is rejected, and alternative accounts of Gf (e.g., related to executive control or processing speed) are proposed. Using both analytical inference and numerical simulations, we demonstrated that the load-dependent change in correlation is primarily a function of the amount of floor/ceiling effect for particular items. Thus, the item-wise WM correlation of a Gf-test depends on its overall difficulty, and the difficulty distribution across its items. When the early test items yield huge ceiling, but the late items do not approach floor, that correlation will increase throughout the test. If the early items locate themselves between ceiling and floor, but the late items approach floor, the respective correlation will decrease. For a hallmark Gf-test, the Raven-test, whose items span from ceiling to floor, the quadratic relationship is expected, and it was shown empirically using a large sample and two types of WMC tasks. In consequence, no changes in correlation due to varying WM/Gf load, or lack of them, can yield an argument for or against any theory of WM/Gf. Moreover, as the mathematical properties of the correlation formula make it relatively immune to ceiling/floor effects for overall moderate correlations, only minor changes (if any) in the WM-Gf correlation should be expected for many psychological tests.

  14. Item response theory analysis of the mechanics baseline test

    NASA Astrophysics Data System (ADS)

    Cardamone, Caroline N.; Abbott, Jonathan E.; Rayyan, Saif; Seaton, Daniel T.; Pawl, Andrew; Pritchard, David E.

    2012-02-01

    Item response theory is useful in both the development and evaluation of assessments and in computing standardized measures of student performance. In item response theory, individual parameters (difficulty, discrimination) for each item or question are fit by item response models. These parameters provide a means for evaluating a test and offer a better measure of student skill than a raw test score, because each skill calculation considers not only the number of questions answered correctly, but the individual properties of all questions answered. Here, we present the results from an analysis of the Mechanics Baseline Test given at MIT during 2005-2010. Using the item parameters, we identify questions on the Mechanics Baseline Test that are not effective in discriminating between MIT students of different abilities. We show that a limited subset of the highest quality questions on the Mechanics Baseline Test returns accurate measures of student skill. We compare student skills as determined by item response theory to the more traditional measurement of the raw score and show that a comparable measure of learning gain can be computed.
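
    To make the role of the two item parameters concrete, the sketch below uses the common two-parameter logistic (2PL) form and its Fisher information. It is an illustration under that assumption, not the authors' analysis; the function names and parameter values are hypothetical.

```python
import math


def two_pl_probability(theta: float, a: float, b: float) -> float:
    """Probability of a correct answer for skill theta, discrimination a, difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at skill theta: a^2 * p * (1 - p)."""
    p = two_pl_probability(theta, a, b)
    return a * a * p * (1.0 - p)


# Two hypothetical items of equal difficulty but different discrimination.
weak_item = item_information(theta=0.0, a=0.3, b=0.0)
strong_item = item_information(theta=0.0, a=1.5, b=0.0)
print(weak_item, strong_item)
```

    An item with low discrimination contributes little information about skill at any level, which is one way to see why such questions separate students poorly.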

  15. Computerized adaptive testing: the capitalization on chance problem.

    PubMed

    Olea, Julio; Barrada, Juan Ramón; Abad, Francisco J; Ponsoda, Vicente; Cuevas, Lara

    2012-03-01

    This paper describes several simulation studies that examine the effects of capitalization on chance in the selection of items and the ability estimation in CAT, employing the 3-parameter logistic model. In order to generate different estimation errors for the item parameters, the calibration sample size was manipulated (N = 500, 1000 and 2000 subjects) as was the ratio of item bank size to test length (banks of 197 and 788 items, test lengths of 20 and 40 items), both in a CAT and in a random test. Results show that capitalization on chance is particularly serious in CAT, as revealed by the large positive bias found in the small sample calibration conditions. For broad ranges of theta, the overestimation of the precision (asymptotic Se) reaches levels of 40%, something that does not occur with the RMSE (theta). The problem is greater as the item bank size to test length ratio increases. Potential solutions were tested in a second study, where two exposure control methods were incorporated into the item selection algorithm. Some alternative solutions are discussed.

  16. The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating

    ERIC Educational Resources Information Center

    Öztürk-Gübes, Nese; Kelecioglu, Hülya

    2016-01-01

    The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…

  17. Location Indices for Ordinal Polytomous Items Based on Item Response Theory. Research Report. ETS RR-15-20

    ERIC Educational Resources Information Center

    Ali, Usama S.; Chang, Hua-Hua; Anderson, Carolyn J.

    2015-01-01

    Polytomous items are typically described by multiple category-related parameters; situations, however, arise in which a single index is needed to describe an item's location along a latent trait continuum. Situations in which a single index would be needed include item selection in computerized adaptive testing or test assembly. Therefore single…

  18. Designing a Virtual Item Bank Based on the Techniques of Image Processing

    ERIC Educational Resources Information Center

    Liao, Wen-Wei; Ho, Rong-Guey

    2011-01-01

    One of the major weaknesses of the item exposure rates of figural items in Intelligence Quotient (IQ) tests lies in its inaccuracies. In this study, a new approach is proposed and a useful test tool known as the Virtual Item Bank (VIB) is introduced. The VIB combines Automatic Item Generation theory and image processing theory with the concepts of…

  19. The Rasch Model and Missing Data, with an Emphasis on Tailoring Test Items.

    ERIC Educational Resources Information Center

    de Gruijter, Dato N. M.

    Many applications of educational testing have a missing data aspect (MDA). This MDA is perhaps most pronounced in item banking, where each examinee responds to a different subtest of items from a large item pool and where both person and item parameter estimates are needed. The Rasch model is emphasized, and its non-parametric counterpart (the…

  20. Three controversies over item disclosure in medical licensure examinations

    PubMed Central

    Park, Yoon Soo; Yang, Eunbae B.

    2015-01-01

    In response to views on the public's right to know, there is growing attention to item disclosure – release of items, answer keys, and performance data to the public – in medical licensure examinations and their potential impact on the test's ability to measure competence and select qualified candidates. Recent debates on this issue have sparked legislative action internationally, including South Korea, with prior discussions among North American countries dating over three decades. The purpose of this study is to identify and analyze three issues associated with item disclosure in medical licensure examinations – 1) fairness and validity, 2) impact on passing levels, and 3) utility of item disclosure – by synthesizing existing literature in relation to standards in testing. Historically, the controversy over item disclosure has centered on fairness and validity. Proponents of item disclosure stress test takers’ right to know, while opponents argue from a validity perspective. Item disclosure may bias item characteristics, such as difficulty and discrimination, and has consequences on setting passing levels. To date, there has been limited research on the utility of item disclosure for large scale testing. These issues require ongoing and careful consideration. PMID:26374693

  1. What Are They Thinking? The Development and Use of an Instrument that Identifies Common Science Misconceptions

    ERIC Educational Resources Information Center

    Stein, Mary; Barman, Charles R.; Larrabee, Timothy

    2007-01-01

    This article describes the rationale for, and development of, an online instrument that helps identify commonly held science misconceptions. Science Beliefs is a 47-item instrument that targets topics in chemistry, physics, biology, earth science, and astronomy. It utilizes a true or false, along with a written-explanation, format. The true or…

  2. Chemistry of 2,5-dihydroxy-(1,4)-benzoquinone, a key chromophore in aged cellulosics

    USDA-ARS's Scientific Manuscript database

    Cotton or linen fabrics and paper, as well as other items composed chiefly of cellulose, tend to change to a yellow or brown color as they age. The change in color is usually accompanied by increased brittleness and loss of strength, as well. A cause of these phenomena is thought to be the formation...

  3. Bayesian Item Selection in Constrained Adaptive Testing Using Shadow Tests

    ERIC Educational Resources Information Center

    Veldkamp, Bernard P.

    2010-01-01

    Application of Bayesian item selection criteria in computerized adaptive testing might result in improvement of bias and MSE of the ability estimates. The question remains how to apply Bayesian item selection criteria in the context of constrained adaptive testing, where large numbers of specifications have to be taken into account in the item…

  4. Mathematics Library of Test Items. Volume One.

    ERIC Educational Resources Information Center

    Fraser, Graham, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from previous tests are made available to teachers for the construction of pretests or posttests, reference tests for inter-class comparisons and general assignments. The collection was reviewed for content…

  5. Are Learning Disabled Students "Test-Wise?": An Inquiry into Reading Comprehension Test Items.

    ERIC Educational Resources Information Center

    Scruggs, Thomas E.; Lifson, Steve

    The ability to correctly answer reading comprehension test items, without having read the accompanying reading passage, was compared for third grade learning disabled students and their peers from a regular classroom. In the first experiment, fourteen multiple choice items were selected from the Stanford Achievement Test. No reading passages were…

  6. Agriculture Library of Test Items.

    ERIC Educational Resources Information Center

    Sutherland, Duncan, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection is reviewed for content validity and reliability. The test…

  7. iBank

    ERIC Educational Resources Information Center

    Bermundo, Cesar B.; Bermundo, Alex B.; Ballester, Rex C.

    2012-01-01

    iBank is a project that uses software to create an item bank that stores quality questions, generates tests, and prints exams. The items come from analyzed teacher-constructed test questions, which provide the basis for discussing test results, by determining why a test item is or is not discriminating between the better and poorer students, and by…

  8. Effects of Test Item Disclosure on Medical Licensing Examination

    ERIC Educational Resources Information Center

    Yang, Eunbae B.; Lee, Myung Ae; Park, Yoon Soo

    2018-01-01

    In 2012, the National Health Personnel Licensing Examination Board of Korea decided to publicly disclose all test items and answers to satisfy the test takers' right to know and enhance the transparency of tests administered by the government. This study investigated the effects of item disclosure on the medical licensing examination (MLE),…

  9. Controlling Item Exposure Conditional on Ability in Computerized Adaptive Testing.

    ERIC Educational Resources Information Center

    Stocking, Martha L.; Lewis, Charles

    1998-01-01

    Ensuring item and pool security in a continuous testing environment is explored through a new method of controlling exposure rate of items conditional on ability level in computerized testing. Properties of this conditional control on exposure rate, when used in conjunction with a particular adaptive testing algorithm, are explored using simulated…

  10. Battalion Combat Operations Center (COC) Test. Volume II. Test Report,

    DTIC Science & Technology

    1982-02-08

    reveal, perhaps, that item X can perform a task faster than item Y. A utility assessment from an experienced, knowledgeable test participant, however...can ascertain whether or not item X can better enable him to accomplish his mission than item Y. 2.4 GENERALIZED TEST FACILITY. The capabilities of...

  11. V-TECS Criterion-Referenced Test Item Bank for Radiologic Technology Occupations.

    ERIC Educational Resources Information Center

    Reneau, Fred; And Others

    This Vocational-Technical Education Consortium of States (V-TECS) criterion-referenced test item bank provides 696 multiple-choice items and 33 matching items for radiologic technology occupations. These job titles are included: radiologic technologist, chief; radiologic technologist; nuclear medicine technologist; radiation therapy technologist;…

  12. Modeling Local Item Dependence Due to Common Test Format with a Multidimensional Rasch Model

    ERIC Educational Resources Information Center

    Baghaei, Purya; Aryadoust, Vahid

    2015-01-01

    Research shows that test method can exert a significant impact on test takers' performance and thereby contaminate test scores. We argue that common test method can exert the same effect as common stimuli and violate the conditional independence assumption of item response theory models because, in general, subsets of items which have a shared…

  13. Development of Self-Report Measures of Social Attitudes that Act as Environmental Barriers and Facilitators for People with Disabilities

    PubMed Central

    Garcia, Sofia F.; Hahn, Elizabeth A.; Magasi, Susan; Lai, Jin-Shei; Semik, Patrick; Hammel, Joy; Heinemann, Allen W.

    2014-01-01

    Objective: To describe the development of new self-report measures of social attitudes that act as environmental facilitators or barriers to the participation of people with disabilities in society. Design: A mixed methods approach included a literature review; item classification, selection and writing; cognitive interviews and field testing with participants with spinal cord injury (SCI), traumatic brain injury (TBI) or stroke; and rating scale analysis to evaluate initial psychometric properties. Setting: General community. Participants: Nine individuals with SCI, TBI or stroke participated in cognitive interviews; 305 community residents with those same conditions participated in field testing. Interventions: None. Main Outcome Measure(s): Self-report item pool of social attitudes that act as facilitators or barriers to people with disabilities participating in society. Results: An interdisciplinary team of experts classified 710 existing social environment items into content areas and wrote 32 new items. Additional qualitative item review included item refinement and winnowing of the pool prior to cognitive interviews and field testing of 82 items. Field test data indicated that the pool satisfies a one-parameter item response theory measurement model and would be appropriate for development into a calibrated item bank. Conclusions: Our qualitative item review process supported a social environment conceptual framework that includes both social support and social attitudes. We developed a new social attitudes self-report item pool. Calibration testing of that pool is underway with a larger sample in order to develop a social attitudes item bank for persons with disabilities. PMID:25045803

  14. Development of self-report measures of social attitudes that act as environmental barriers and facilitators for people with disabilities.

    PubMed

    Garcia, Sofia F; Hahn, Elizabeth A; Magasi, Susan; Lai, Jin-Shei; Semik, Patrick; Hammel, Joy; Heinemann, Allen W

    2015-04-01

    To describe the development of new self-report measures of social attitudes that act as environmental facilitators or barriers to the participation of people with disabilities in society. A mixed-methods approach included a literature review; item classification, selection, and writing; cognitive interviews and field testing of participants with spinal cord injury (SCI), traumatic brain injury (TBI), or stroke; and rating scale analysis to evaluate initial psychometric properties. General community. Individuals with SCI, TBI, or stroke participated in cognitive interviews (n=9); community residents with those same conditions participated in field testing (n=305). None. Self-report item pool of social attitudes that act as facilitators or barriers to people with disabilities participating in society. An interdisciplinary team of experts classified 710 existing social environment items into content areas and wrote 32 new items. Additional qualitative item review included item refinement and winnowing of the pool prior to cognitive interviews and field testing of 82 items. Field test data indicated that the pool satisfies a 1-parameter item response theory measurement model and would be appropriate for development into a calibrated item bank. Our qualitative item review process supported a social environment conceptual framework that includes both social support and social attitudes. We developed a new social attitudes self-report item pool. Calibration testing of that pool is underway with a larger sample to develop a social attitudes item bank for persons with disabilities. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  15. Redefining diagnostic symptoms of depression using Rasch analysis: testing an item bank suitable for DSM-V and computer adaptive testing.

    PubMed

    Mitchell, Alex J; Smith, Adam B; Al-salihy, Zerak; Rahim, Twana A; Mahmud, Mahmud Q; Muhyaldin, Asma S

    2011-10-01

    We aimed to redefine the optimal self-report symptoms of depression suitable for creation of an item bank that could be used in computer adaptive testing or to develop a simplified screening tool for DSM-V. Four hundred subjects (200 patients with primary depression and 200 non-depressed subjects) living in Iraqi Kurdistan were interviewed. The Mini International Neuropsychiatric Interview (MINI) was used to define the presence of major depression (DSM-IV criteria). We examined symptoms of depression using four well-known scales delivered in Kurdish. The Partial Credit Model was applied to each instrument. Common-item equating was subsequently used to create an item bank and differential item functioning (DIF) explored for known subgroups. A symptom level Rasch analysis reduced the original 45 items to 24 items after the exclusion of 21 misfitting items. A further six items (CESD13 and CESD17, HADS-D4, HADS-D5 and HADS-D7, and CDSS3 and CDSS4) were removed due to misfit as the items were added together to form the item bank, and two items were subsequently removed following the DIF analysis by diagnosis (CESD20 and CDSS9, both of which were harder to endorse for women). Therefore the remaining optimal item bank consisted of 17 items and produced an area under the curve (AUC) of 0.987. Using a bank restricted to the optimal nine items revealed only minor loss of accuracy (AUC = 0.989, sensitivity 96%, specificity 95%). Finally, when restricted to only four items, accuracy was still high (AUC = 0.976; sensitivity 93%, specificity 96%). An item bank of 17 items may be useful in computer adaptive testing and nine or even four items may be used to develop a simplified screening tool for DSM-V major depressive disorder (MDD). Further examination of this item bank should be conducted in different cultural settings.
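
    For readers unfamiliar with the accuracy figures quoted in this abstract, the sketch below shows how sensitivity and specificity are computed from classification counts. The counts are hypothetical: they are chosen only so that groups of 200 depressed and 200 non-depressed subjects reproduce the 96% and 95% reported for the nine-item bank, and they are not taken from the study's data.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = true positives / all actual positives
    specificity = true negatives / all actual negatives
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for 200 depressed and 200 non-depressed subjects.
sens, spec = sensitivity_specificity(tp=192, fn=8, tn=190, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```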

  16. Measuring self-esteem after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Self-esteem item bank and short form

    PubMed Central

    Kalpakjian, Claire Z.; Tate, Denise G.; Kisala, Pamela A.; Tulsky, David S.

    2015-01-01

    Objective: To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Design: Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item-response theory- (IRT) based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. Setting: We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. Participants: A total of 717 individuals with SCI completed the self-esteem items. Results: A unidimensional model was observed (CFI = 0.946; RMSEA = 0.087) and measurement precision was good (theta range between −2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. Conclusion: This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available. PMID:26010972

  17. Measuring self-esteem after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Self-esteem item bank and short form.

    PubMed

    Kalpakjian, Claire Z; Tate, Denise G; Kisala, Pamela A; Tulsky, David S

    2015-05-01

    To describe the development and psychometric properties of the Spinal Cord Injury-Quality of Life (SCI-QOL) Self-esteem item bank. Using a mixed-methods design, we developed and tested a self-esteem item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item-response theory-(IRT) based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. We tested a pool of 30 items at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital, and the James J. Peters/Bronx Department of Veterans Affairs hospital. A total of 717 individuals with SCI completed the self-esteem items. A unidimensional model was observed (CFI=0.946; RMSEA=0.087) and measurement precision was good (theta range between -2.7 and 0.7). Eleven items were flagged for DIF; however, effect sizes were negligible with little practical impact on score estimates. The final calibrated item bank resulted in 23 retained items. This study indicates that the SCI-QOL Self-esteem item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.

  18. Measuring resilience after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Resilience item bank and short form.

    PubMed

    Victorson, David; Tulsky, David S; Kisala, Pamela A; Kalpakjian, Claire Z; Weiland, Brian; Choi, Seung W

    2015-05-01

    To describe the development and psychometric properties of the Spinal Cord Injury--Quality of Life (SCI-QOL) Resilience item bank and short form. Using a mixed-methods design, we developed and tested a resilience item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item-response theory based analytic approaches, including tests of model fit and differential item functioning (DIF). We tested a 32-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Department of Veterans Affairs medical center. A total of 717 individuals with SCI completed the Resilience items. A unidimensional model was observed (CFI=0.968; RMSEA=0.074) and measurement precision was good (theta range between -3.1 and 0.9). Ten items were flagged for DIF; however, after examination of effect sizes we found this to be negligible with little practical impact on score estimates. The final calibrated item bank resulted in 21 retained items. This study indicates that the SCI-QOL Resilience item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available.

  19. Measuring resilience after spinal cord injury: Development, validation and psychometric characteristics of the SCI-QOL Resilience item bank and short form

    PubMed Central

    Victorson, David; Tulsky, David S.; Kisala, Pamela A.; Kalpakjian, Claire Z.; Weiland, Brian; Choi, Seung W.

    2015-01-01

    Objective: To describe the development and psychometric properties of the Spinal Cord Injury - Quality of Life (SCI-QOL) Resilience item bank and short form. Design: Using a mixed-methods design, we developed and tested a resilience item bank through the use of focus groups with individuals with SCI and clinicians with expertise in SCI, cognitive interviews, and item response theory-based analytic approaches, including tests of model fit and differential item functioning (DIF). Setting: We tested a 32-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, the Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Department of Veterans Affairs medical center. Participants: A total of 717 individuals with SCI completed the Resilience items. Results: A unidimensional model was observed (CFI = 0.968; RMSEA = 0.074) and measurement precision was good (theta range between −3.1 and 0.9). Ten items were flagged for DIF; however, after examination of effect sizes, we found the DIF to be negligible, with little practical impact on score estimates. The final calibrated item bank resulted in 21 retained items. Conclusion: This study indicates that the SCI-QOL Resilience item bank represents a psychometrically robust measurement tool. Short form items are also suggested and computer adaptive tests are available. PMID:26010971

  20. An analysis of high school students' perceptions and academic performance in laboratory experiences

    NASA Astrophysics Data System (ADS)

    Mirchin, Robert Douglas

    This research study is an investigation of student-laboratory (i.e., lab) learning based on students' perceptions of experiences using questionnaire data and evidence of their science-laboratory performance based on paper-and-pencil assessments using Maryland-mandated criteria, Montgomery County Public Schools (MCPS) criteria, and published laboratory questions. A 20-item questionnaire consisting of 18 Likert-scale items and 2 open-ended items that addressed what students liked most and least about lab was administered to students before labs were observed. A pre-test and post-test assessing laboratory achievement were administered before and after the laboratory experiences. The three labs observed were: soda distillation, stoichiometry, and separation of a mixture. Five significant results or correlations were found. For soda distillation, there were two positive correlations. Student preference for analyzing data was positively correlated with achievement on the data analysis dimension of the lab rubric. A student preference for using numbers and graphs to analyze data was positively correlated with achievement on the analysis dimension of the lab rubric. For the separating a mixture lab data the following pairs of correlations were significant. Student preference for doing chemistry labs where numbers and graphs were used to analyze data had a positive correlation with writing a correctly worded hypothesis. Student responses that lab experiences help them learn science positively correlated with achievement on the data dimension of the lab rubric. The only negative correlation found related to the first result where students' preference for computers was inversely correlated to their performance on analyzing data on their lab report. Other findings included the following: students like actual experimental work most and the write-up and analysis of a lab the least. 
    It is recommended that lab science instruction be inquiry-based and hands-on, and that students be tested for lab content acquisition. The final conclusion of the study is that students expressed a preference for working in groups and working with materials and equipment as opposed to individual, non-group work and analyzing data.

  1. Noncompetitive retrieval practice causes retrieval-induced forgetting in cued recall but not in recognition.

    PubMed

    Grundgeiger, Tobias

    2014-04-01

    Retrieving a subset of learned items can lead to the forgetting of related items. Such retrieval-induced forgetting (RIF) can be explained by the inhibition of irrelevant items in order to overcome retrieval competition when the target item is retrieved. According to the retrieval inhibition account, such retrieval competition is a necessary condition for RIF. However, research has indicated that noncompetitive retrieval practice can also cause RIF by strengthening cue-item associations. According to the strength-dependent competition account, the strengthened items interfere with the retrieval of weaker items, resulting in impaired recall of weaker items in the final memory test. The aim of this study was to replicate RIF caused by noncompetitive retrieval practice and to determine whether this forgetting is also observed in recognition tests. In the context of RIF, it has been assumed that recognition tests circumvent interference and, therefore, should not be sensitive to forgetting due to strength-dependent competition. However, this has not been empirically tested, and it has been suggested that participants may reinstate learned cues as retrieval aids during the final test. In the present experiments, competitive practice or noncompetitive practice was followed by either final cued-recall tests or recognition tests. In cued-recall tests, RIF was observed in both competitive and noncompetitive conditions. However, in recognition tests, RIF was observed only in the competitive condition and was absent in the noncompetitive condition. The result underscores the contribution of strength-dependent competition to RIF. However, recognition tests seem to be a reliable way of distinguishing between RIF due to retrieval inhibition or strength-dependent competition.

  2. Adaptive Mental Testing: The State of the Art

    DTIC Science & Technology

    1979-11-01

    typically vary in their psychometric properties--particularly in their difficulty--the test designer must decide what configuration of these item...psychometric properties best suits the test’s purpose. There are two extreme rationales to guide that decision. One rationale is to choose items that are...development of item response theory (Rasch, 1960; Lord, 1952, 1970, 1974a; Birnbaum, 1968) that provided the needed invariance properties for item

  3. Dealing with Omitted and Not-Reached Items in Competence Tests: Evaluating Approaches Accounting for Missing Responses in Item Response Theory Models

    ERIC Educational Resources Information Center

    Pohl, Steffi; Gräfe, Linda; Rose, Norman

    2014-01-01

    Data from competence tests usually show a number of missing responses on test items due to both omitted and not-reached items. Different approaches for dealing with missing responses exist, and there are no clear guidelines on which of those to use. While classical approaches rely on an ignorable missing data mechanism, the most recently developed…

  4. Procedures for Selecting Items for Computerized Adaptive Tests.

    ERIC Educational Resources Information Center

    Kingsbury, G. Gage; Zara, Anthony R.

    1989-01-01

    Several classical approaches and alternative approaches to item selection for computerized adaptive testing (CAT) are reviewed and compared. The study also describes procedures for constrained CAT that may be added to classical item selection approaches to allow them to be used for applied testing. (TJH)

  5. Efforts Toward the Development of Unbiased Selection and Assessment Instruments.

    ERIC Educational Resources Information Center

    Rudner, Lawrence M.

    Investigations into item bias provide an empirical basis for the identification and elimination of test items which appear to measure different traits across populations or cultural groups. The psychometric rationales for six approaches to the identification of biased test items are reviewed: (1) Transformed item difficulties: within-group…

  6. Effect of Differential Item Functioning on Test Equating

    ERIC Educational Resources Information Center

    Kabasakal, Kübra Atalay; Kelecioglu, Hülya

    2015-01-01

    This study examines the effect of differential item functioning (DIF) items on test equating through multilevel item response models (MIRMs) and traditional IRMs. The performances of three different equating models were investigated under 24 different simulation conditions, and the variables whose effects were examined included sample size, test…

  7. Ramsay-Curve Differential Item Functioning

    ERIC Educational Resources Information Center

    Woods, Carol M.

    2011-01-01

    Differential item functioning (DIF) occurs when an item on a test, questionnaire, or interview has different measurement properties for one group of people versus another, irrespective of true group-mean differences on the constructs being measured. This article is focused on item response theory based likelihood ratio testing for DIF (IRT-LR or…
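
    The definition of DIF in the abstract above can be illustrated with a small sketch. It assumes a two-parameter logistic (2PL) item response function and uses made-up item parameters for a reference and a focal group; the names p_correct, REFERENCE, FOCAL and dif_gap are hypothetical. It only shows what "different measurement properties at the same trait level" looks like numerically. It is not the IRT-LR procedure or the Ramsay-curve method the article describes.

```python
import math


def p_correct(theta, a, b):
    """2PL item response function: probability of a correct response
    for a person with ability theta on an item with discrimination a
    and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical parameters for ONE item, calibrated separately per group.
# Same discrimination, but the item is harder for the focal group.
REFERENCE = {"a": 1.2, "b": 0.0}
FOCAL = {"a": 1.2, "b": 0.5}


def dif_gap(theta):
    """Difference in success probability between groups for two people
    with the SAME ability theta. A nonzero gap is what DIF means."""
    return p_correct(theta, **REFERENCE) - p_correct(theta, **FOCAL)


if __name__ == "__main__":
    for theta in (-1.0, 0.0, 1.0):
        print(f"theta={theta:+.1f}  gap={dif_gap(theta):.3f}")
```

    Because the comparison is made at a fixed theta, a true difference in group means would not by itself produce a gap here; only the differing item parameters do.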

  8. A Study on Detecting of Differential Item Functioning of PISA 2006 Science Literacy Items in Turkish and American Samples

    ERIC Educational Resources Information Center

    Çikirikçi Demirtasli, Nükhet; Ulutas, Seher

    2015-01-01

    Problem Statement: Item bias occurs when individuals from different groups (different gender, cultural background, etc.) have different probabilities of responding correctly to a test item despite having the same skill levels. It is important that tests or items do not have bias in order to ensure the accuracy of decisions taken according to test…

  9. Investigating Measurement Invariance in Computer-Based Personality Testing: The Impact of Using Anchor Items on Effect Size Indices

    ERIC Educational Resources Information Center

    Egberink, Iris J. L.; Meijer, Rob R.; Tendeiro, Jorge N.

    2015-01-01

    A popular method to assess measurement invariance of a particular item is based on likelihood ratio tests with all other items as anchor items. The results of this method are often only reported in terms of statistical significance, and researchers proposed different methods to empirically select anchor items. It is unclear, however, how many…

  10. A Comparison of Traditional Test Blueprinting and Item Development to Assessment Engineering in a Licensure Context

    ERIC Educational Resources Information Center

    Masters, James S.

    2010-01-01

    With the need for larger and larger banks of items to support adaptive testing and to meet security concerns, large-scale item generation is a requirement for many certification and licensure programs. As part of the mass production of items, it is critical that the difficulty and the discrimination of the items be known without the need for…

  11. Unilateral neglect: further validation of the baking tray task.

    PubMed

    Appelros, Peter; Karlsson, Gunnel M; Thorwalls, Annika; Tham, Kerstin; Nydevik, Ingegerd

    2004-11-01

    The Baking Tray Task is a comprehensible, simple-to-perform test for use in assessing unilateral neglect. The aim of this study was to validate further its use with stroke patients. The Baking Tray Task was compared with 2 versions of the Behaviour Inattention Test and a test for personal neglect. A total of 270 patients were subjected to a 3-item version of the Behaviour Inattention Test and 40 patients were subjected to an 8-item version of the Behaviour Inattention Test, besides the Baking Tray Task and the personal neglect test. The Baking Tray Task was more sensitive than the 3-item Behaviour Inattention Test, but the 8-item Behaviour Inattention Test was more sensitive than the Baking Tray Task. The best combination of any 3 tests was Baking Tray Task, Reading an article, and Figure copying; the 2 last-mentioned being a part of the 8-item Behaviour Inattention Test. Multi-item tests detect more cases of neglect than do single tests. However, it is tiresome for the patient to undergo a larger test battery than necessary. It is also time-consuming for the staff. Behavioural tests seem more appropriate when assessing neglect. The Baking Tray Task seems to be one of the most sensitive single tests, but its sensitivity can be further enhanced when it is used in combination with other tests.

  12. Adjusting for cross-cultural differences in computer-adaptive tests of quality of life.

    PubMed

    Gibbons, C J; Skevington, S M

    2018-04-01

    Previous studies using the WHOQOL measures have demonstrated that the relationship between individual items and the underlying quality of life (QoL) construct may differ between cultures. If unaccounted for, these differing relationships can lead to measurement bias which, in turn, can undermine the reliability of results. We used item response theory (IRT) to assess differential item functioning (DIF) in WHOQOL data from diverse language versions collected in UK, Zimbabwe, Russia, and India (total N = 1332). Data were fitted to the partial credit 'Rasch' model. We used four item banks previously derived from the WHOQOL-100 measure, which provided excellent measurement for physical, psychological, social, and environmental quality of life domains (40 items overall). Cross-cultural differential item functioning was assessed using analysis of variance for item residuals and post hoc Tukey tests. Simulated computer-adaptive tests (CATs) were conducted to assess the efficiency and precision of the four item banks. Splitting item parameters by DIF results in four linked item banks without DIF or other breaches of IRT model assumptions. Simulated CATs were more precise and efficient than longer paper-based alternatives. Assessing differential item functioning using item response theory can identify measurement invariance between cultures which, if uncontrolled, may undermine accurate comparisons in computer-adaptive testing assessments of QoL. We demonstrate how compensating for DIF using item anchoring allowed data from all four countries to be compared on a common metric, thus facilitating assessments which were both sensitive to cultural nuance and comparable between countries.

  13. Item analysis of three Spanish naming tests: a cross-cultural investigation.

    PubMed

    Marquez de la Plata, Carlos; Arango-Lasprilla, Juan Carlos; Alegret, Montse; Moreno, Alexander; Tárraga, Luis; Lara, Mar; Hewlitt, Margaret; Hynan, Linda; Cullum, C Munro

    2009-01-01

    Neuropsychological evaluations conducted in the United States and abroad commonly include the use of tests translated from English to Spanish. The use of translated naming tests for evaluating predominately Spanish-speakers has recently been challenged on the grounds that translating test items may compromise a test's construct validity. The Texas Spanish Naming Test (TNT) has been developed in Spanish specifically for use with Spanish-speakers; however, it is unlikely patients from diverse Spanish-speaking geographical regions will perform uniformly on a naming test. The present study evaluated and compared the internal consistency and patterns of item-difficulty and -discrimination for the TNT and two commonly used translated naming tests in three countries (i.e., United States, Colombia, Spain). Two hundred fifty two subjects (136 demented, 116 nondemented) across three countries were administered the TNT, Modified Boston Naming Test-Spanish, and the naming subtest from the CERAD. The TNT demonstrated superior internal consistency to its counterparts, a superior item difficulty pattern than the CERAD naming test, and a superior item discrimination pattern than the MBNT-S across countries. Overall, all three Spanish naming tests differentiated nondemented and moderately demented individuals, but the results suggest the items of the TNT are most appropriate to use with Spanish-speakers. Preliminary normative data for the three tests examined in each country are provided.

  14. Testing enhances both encoding and retrieval for both tested and untested items.

    PubMed

    Cho, Kit W; Neely, James H; Crocco, Stephanie; Vitrano, Deana

    2017-07-01

    In forward testing effects, taking a test enhances memory for subsequently studied material. These effects have been observed for previously studied and tested items, a potentially item-specific testing effect, and newly studied untested items, a purely generalized testing effect. We directly compared item-specific and generalized forward testing effects using procedures to separate testing benefits due to encoding versus retrieval. Participants studied two lists of Swahili-English word pairs, with the second study list containing "new" pairs intermixed with the previously studied "old" pairs. Participants completed a review phase in which they took a cued-recall test on only the "old" pairs or restudied them. In Experiments 1a, 1b, and 2, the review phase was given either before or after the second study list. Testing benefited memory to the same degree for both "new" and "old" pairs, suggesting that there were no pair-specific benefits of testing. The larger benefit from testing when review was given before rather than after the second study list suggests that the memory enhancement was due to both testing-enhanced encoding and testing-enhanced retrieval. To better equate generalized testing effects for "new" and "old" pairs, Experiment 3 intermixed them in the review phase. A statistically significant pair-specific testing effect for "old" items was now observed. Overall, these results show that forward testing effects are due to both testing-enhanced encoding and retrieval effects and that direct, pair-specific forward testing benefits are considerably smaller than indirect, generalized forward testing benefits.

  15. The Influence of Item Calibration Error on Variable-Length Computerized Adaptive Testing

    ERIC Educational Resources Information Center

    Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi

    2013-01-01

    Variable-length computerized adaptive testing (VL-CAT) allows both items and test length to be "tailored" to examinees, thereby achieving the measurement goal (e.g., scoring precision or classification) with as few items as possible. Several popular test termination rules depend on the standard error of the ability estimate, which in turn depends…
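
    A standard-error termination rule of the kind mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the procedure studied in the article: it assumes a 2PL model, treats the item parameters as known, and uses hypothetical names (item_information, standard_error, should_stop) and hypothetical thresholds (se_target=0.3, max_items=30).

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def standard_error(theta, administered):
    """Asymptotic standard error of the ability estimate, 1 / sqrt(test
    information), where `administered` is a list of (a, b) pairs for the
    items given so far."""
    info = sum(item_information(theta, a, b) for a, b in administered)
    return math.inf if info == 0 else 1.0 / math.sqrt(info)


def should_stop(theta, administered, se_target=0.3, max_items=30):
    """Stop when the SE target is reached or the item budget is used up.
    Both thresholds are assumed values for illustration."""
    if len(administered) >= max_items:
        return True
    return standard_error(theta, administered) <= se_target


if __name__ == "__main__":
    items = [(2.0, 0.0)] * 16  # hypothetical calibrated items
    print(standard_error(0.0, items), should_stop(0.0, items))
```

    Since the standard error is computed from the (a, b) values, any error in those calibrated values carries through to the decision about when to stop.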

  16. A Paradox in the Study of the Benefits of Test-Item Review

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Jeon, Minjeong; Ferrara, Steve

    2011-01-01

    According to a popular belief, test takers should trust their initial instinct and retain their initial responses when they have the opportunity to review test items. More than 80 years of empirical research on item review, however, has contradicted this belief and shown minor but consistently positive score gains for test takers who changed…

  17. Geography Library of Test Items. Volume Four.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  18. Home Science Library of Test Items. Volume One.

    ERIC Educational Resources Information Center

    Smith, Jan, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection is reviewed for content validity and reliability. The test…

  19. Languages Library of Test Items. Volume Two: German, Latin.

    ERIC Educational Resources Information Center

    Campbell, Thomas; And Others

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  20. Languages Library of Test Items. Volume One: French, Indonesian.

    ERIC Educational Resources Information Center

    Campbell, Thomas; And Others

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  1. Geography Library of Test Items. Volume Three.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  2. Commerce Library of Test Items. Volume One.

    ERIC Educational Resources Information Center

    Meeve, Brian, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  3. Geography Library of Test Items. Volume Five.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  4. Textiles and Design Library of Test Items. Volume I.

    ERIC Educational Resources Information Center

    Smith, Jan, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection is reviewed for content validity and reliability. The test…

  5. Commerce Library of Test Items. Volume Two.

    ERIC Educational Resources Information Center

    Meeve, Brian, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  6. Geography Library of Test Items. Volume Six.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  7. Geography: Library of Test Items. Volume II.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  8. Sex Differences in the Tendency to Omit Items on Multiple-Choice Tests: 1980-2000

    ERIC Educational Resources Information Center

    von Schrader, Sarah; Ansley, Timothy

    2006-01-01

    Much has been written concerning the potential group differences in responding to multiple-choice achievement test items. This discussion has included references to possible disparities in tendency to omit such test items. When test scores are used for high-stakes decision making, even small differences in scores and rankings that arise from male…

  9. A Person Fit Test for IRT Models for Polytomous Items

    ERIC Educational Resources Information Center

    Glas, C. A. W.; Dagohoy, Anna Villa T.

    2007-01-01

    A person fit test based on the Lagrange multiplier test is presented for three item response theory models for polytomous items: the generalized partial credit model, the sequential model, and the graded response model. The test can also be used in the framework of multidimensional ability parameters. It is shown that the Lagrange multiplier…

  10. How Big Is Big Enough? Sample Size Requirements for CAST Item Parameter Estimation

    ERIC Educational Resources Information Center

    Chuah, Siang Chee; Drasgow, Fritz; Luecht, Richard

    2006-01-01

    Adaptive tests offer the advantages of reduced test length and increased accuracy in ability estimation. However, adaptive tests require large pools of precalibrated items. This study looks at the development of an item pool for 1 type of adaptive administration: the computer-adaptive sequential test. An important issue is the sample size required…

  11. An Explanatory Item Response Theory Approach for a Computer-Based Case Simulation Test

    ERIC Educational Resources Information Center

    Kahraman, Nilüfer

    2014-01-01

    Problem: Practitioners working with multiple-choice tests have long utilized Item Response Theory (IRT) models to evaluate the performance of test items for quality assurance. The use of similar applications for performance tests, however, is often encumbered due to the challenges encountered in working with complicated data sets in which local…

  12. Geography Library of Test Items. Volume One.

    ERIC Educational Resources Information Center

    Kouimanos, John, Ed.

    As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…

  13. Electronic Quality of Life Assessment Using Computer-Adaptive Testing

    PubMed Central

    2016-01-01

    Background: Quality of life (QoL) questionnaires are desirable for clinical practice but can be time-consuming to administer and interpret, making their widespread adoption difficult. Objective: Our aim was to assess the performance of the World Health Organization Quality of Life (WHOQOL)-100 questionnaire as four item banks to facilitate adaptive testing using simulated computer adaptive tests (CATs) for physical, psychological, social, and environmental QoL. Methods: We used data from the UK WHOQOL-100 questionnaire (N=320) to calibrate item banks using item response theory, which included psychometric assessments of differential item functioning, local dependency, unidimensionality, and reliability. We simulated CATs to assess the number of items administered before prespecified levels of reliability were met. Results: The item banks (40 items) all displayed good model fit (P>.01) and were unidimensional (fewer than 5% of t tests significant), reliable (Person Separation Index>.70), and free from differential item functioning (no significant analysis of variance interaction) or local dependency (residual correlations < +.20). When matched for reliability, the item banks were between 45% and 75% shorter than paper-based WHOQOL measures. Across the four domains, a high standard of reliability (alpha>.90) could be gained with a median of 9 items. Conclusions: Using CAT, simulated assessments were as reliable as paper-based forms of the WHOQOL with a fraction of the number of items. These properties suggest that these item banks are suitable for computerized adaptive assessment. These item banks have the potential for international development using existing alternative language versions of the WHOQOL items. PMID:27694100

  14. Designing and Testing an Inventory for Measuring Social Media Competency of Certified Health Education Specialists

    PubMed Central

    Alber, Julia M; Bernhardt, Jay M; Stellefson, Michael; Weiler, Robert M; Anderson-Lewis, Charkarra; Miller, M David; MacInnes, Jann

    2015-01-01

    Background: Social media can promote healthy behaviors by facilitating engagement and collaboration among health professionals and the public. Thus, social media is quickly becoming a vital tool for health promotion. While guidelines and trainings exist for public health professionals, there are currently no standardized measures to assess individual social media competency among Certified Health Education Specialists (CHES) and Master Certified Health Education Specialists (MCHES). Objective: The aim of this study was to design, develop, and test the Social Media Competency Inventory (SMCI) for CHES and MCHES. Methods: The SMCI was designed in three sequential phases: (1) Conceptualization and Domain Specifications, (2) Item Development, and (3) Inventory Testing and Finalization. Phase 1 consisted of a literature review, concept operationalization, and expert reviews. Phase 2 involved an expert panel (n=4) review, think-aloud sessions with a small representative sample of CHES/MCHES (n=10), a pilot test (n=36), and classical test theory analyses to develop the initial version of the SMCI. Phase 3 included a field test of the SMCI with a random sample of CHES and MCHES (n=353), factor and Rasch analyses, and development of SMCI administration and interpretation guidelines. Results: Six constructs adapted from the unified theory of acceptance and use of technology and the integrated behavioral model were identified for assessing social media competency: (1) Social Media Self-Efficacy, (2) Social Media Experience, (3) Effort Expectancy, (4) Performance Expectancy, (5) Facilitating Conditions, and (6) Social Influence. The initial item pool included 148 items. After the pilot test, 16 items were removed or revised because of low item discrimination (r<.30), high interitem correlations (ρ>.90), or based on feedback received from pilot participants. During the psychometric analysis of the field test data, 52 items were removed due to low discrimination, evidence of content redundancy, low R-squared value, or poor item infit or outfit. Psychometric analyses of the data revealed acceptable reliability evidence for the following scales: Social Media Self-Efficacy (alpha=.98, item reliability=.98, item separation=6.76), Social Media Experience (alpha=.98, item reliability=.98, item separation=6.24), Effort Expectancy (alpha=.74, item reliability=.95, item separation=4.15), Performance Expectancy (alpha=.81, item reliability=.99, item separation=10.09), Facilitating Conditions (alpha=.66, item reliability=.99, item separation=16.04), and Social Influence (alpha=.66, item reliability=.93, item separation=3.77). There was some evidence of local dependence among the scales, with several observed residual correlations above |.20|. Conclusions: Through the multistage instrument-development process, sufficient reliability and validity evidence was collected in support of the purpose and intended use of the SMCI. The SMCI can be used to assess the readiness of health education specialists to effectively use social media for health promotion research and practice. Future research should explore associations across constructs within the SMCI and evaluate the ability of SMCI scores to predict social media use and performance among CHES and MCHES. PMID:26399428
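
    Two of the classical test theory statistics named in the abstract above, coefficient alpha and an item discrimination index screened at r<.30, can be sketched as follows. The abstract does not say how discrimination was computed; this sketch assumes a corrected item-total (item-rest) correlation. The function names and the toy response matrix are hypothetical and unrelated to the SMCI data.

```python
import math
from statistics import mean, pvariance


def cronbach_alpha(rows):
    """Coefficient alpha for a respondents-by-items score matrix."""
    k = len(rows[0])
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def item_rest_correlations(rows):
    """Correlate each item with the sum of the remaining items."""
    k = len(rows[0])
    return [
        pearson([r[j] for r in rows], [sum(r) - r[j] for r in rows])
        for j in range(k)
    ]


def flag_low_discrimination(rows, cutoff=0.30):
    """Indices of items whose item-rest correlation falls below cutoff."""
    return [j for j, r in enumerate(item_rest_correlations(rows)) if r < cutoff]


if __name__ == "__main__":
    toy = [[1, 1, 3], [2, 2, 1], [3, 3, 2], [4, 4, 3], [5, 5, 1]]
    print(cronbach_alpha(toy), flag_low_discrimination(toy))
```

    In the toy matrix the third item runs against the other two, so it is the one flagged for review.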

  15. Designing and Testing an Inventory for Measuring Social Media Competency of Certified Health Education Specialists.

    PubMed

    Alber, Julia M; Bernhardt, Jay M; Stellefson, Michael; Weiler, Robert M; Anderson-Lewis, Charkarra; Miller, M David; MacInnes, Jann

    2015-09-23

    Social media can promote healthy behaviors by facilitating engagement and collaboration among health professionals and the public. Thus, social media is quickly becoming a vital tool for health promotion. While guidelines and trainings exist for public health professionals, there are currently no standardized measures to assess individual social media competency among Certified Health Education Specialists (CHES) and Master Certified Health Education Specialists (MCHES). The aim of this study was to design, develop, and test the Social Media Competency Inventory (SMCI) for CHES and MCHES. The SMCI was designed in three sequential phases: (1) Conceptualization and Domain Specifications, (2) Item Development, and (3) Inventory Testing and Finalization. Phase 1 consisted of a literature review, concept operationalization, and expert reviews. Phase 2 involved an expert panel (n=4) review, think-aloud sessions with a small representative sample of CHES/MCHES (n=10), a pilot test (n=36), and classical test theory analyses to develop the initial version of the SMCI. Phase 3 included a field test of the SMCI with a random sample of CHES and MCHES (n=353), factor and Rasch analyses, and development of SMCI administration and interpretation guidelines. Six constructs adapted from the unified theory of acceptance and use of technology and the integrated behavioral model were identified for assessing social media competency: (1) Social Media Self-Efficacy, (2) Social Media Experience, (3) Effort Expectancy, (4) Performance Expectancy, (5) Facilitating Conditions, and (6) Social Influence. The initial item pool included 148 items. After the pilot test, 16 items were removed or revised because of low item discrimination (r<.30), high interitem correlations (ρ>.90), or feedback received from pilot participants. During the psychometric analysis of the field test data, 52 items were removed due to low discrimination, evidence of content redundancy, low R-squared value, or poor item infit or outfit. Psychometric analyses of the data revealed acceptable reliability evidence for the following scales: Social Media Self-Efficacy (alpha=.98, item reliability=.98, item separation=6.76), Social Media Experience (alpha=.98, item reliability=.98, item separation=6.24), Effort Expectancy (alpha=.74, item reliability=.95, item separation=4.15), Performance Expectancy (alpha=.81, item reliability=.99, item separation=10.09), Facilitating Conditions (alpha=.66, item reliability=.99, item separation=16.04), and Social Influence (alpha=.66, item reliability=.93, item separation=3.77). There was some evidence of local dependence among the scales, with several observed residual correlations above |.20|. Through the multistage instrument-development process, sufficient reliability and validity evidence was collected in support of the purpose and intended use of the SMCI. The SMCI can be used to assess the readiness of health education specialists to effectively use social media for health promotion research and practice. Future research should explore associations across constructs within the SMCI and evaluate the ability of SMCI scores to predict social media use and performance among CHES and MCHES.

  16. A Comparison of the Approaches of Generalizability Theory and Item Response Theory in Estimating the Reliability of Test Scores for Testlet-Composed Tests

    ERIC Educational Resources Information Center

    Lee, Guemin; Park, In-Yong

    2012-01-01

    Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…

  17. Comparing Recent Organizing Templates for Test Content between ACS Exams in General Chemistry and AP Chemistry

    ERIC Educational Resources Information Center

    Holme, Thomas

    2014-01-01

    Two different versions of "big ideas" rooted content maps have recently been published for general chemistry. As embodied in the content outline from the College Board, one of these maps is designed to guide curriculum development and testing for advanced placement (AP) chemistry. The Anchoring Concepts Content Map for general chemistry…

  18. Applying modern psychometric techniques to melodic discrimination testing: Item response theory, computerised adaptive testing, and automatic item generation.

    PubMed

    Harrison, Peter M C; Collins, Tom; Müllensiefen, Daniel

    2017-06-15

    Modern psychometric theory provides many useful tools for ability testing, such as item response theory, computerised adaptive testing, and automatic item generation. However, these techniques have yet to be integrated into mainstream psychological practice. This is unfortunate, because modern psychometric techniques can bring many benefits, including sophisticated reliability measures, improved construct validity, avoidance of exposure effects, and improved efficiency. In the present research we therefore use these techniques to develop a new test of a well-studied psychological capacity: melodic discrimination, the ability to detect differences between melodies. We calibrate and validate this test in a series of studies. Studies 1 and 2 respectively calibrate and validate an initial test version, while Studies 3 and 4 calibrate and validate an updated test version incorporating additional easy items. The results support the new test's viability, with evidence for strong reliability and construct validity. We discuss how these modern psychometric techniques may also be profitably applied to other areas of music psychology and psychological science in general.

  19. Application of Item Response Theory to Tests of Substance-related Associative Memory

    PubMed Central

    Shono, Yusuke; Grenard, Jerry L.; Ames, Susan L.; Stacy, Alan W.

    2015-01-01

    A substance-related word association test (WAT) is one of the commonly used indirect tests of substance-related implicit associative memory and has been shown to predict substance use. This study applied an item response theory (IRT) modeling approach to evaluate psychometric properties of the alcohol- and marijuana-related WATs and their items among 775 ethnically diverse at-risk adolescents. After examining the IRT assumptions, item fit, and differential item functioning (DIF) across gender and age groups, the original 18 WAT items were reduced to 14 and 15 items in the alcohol- and marijuana-related WATs, respectively. Thereafter, unidimensional one- and two-parameter logistic models (1PL and 2PL models) were fitted to the revised WAT items. The results demonstrated that both alcohol- and marijuana-related WATs have good psychometric properties. These results were discussed in light of the framework of a unified concept of construct validity (Messick, 1975, 1989, 1995). PMID:25134051
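
    As a reminder of what separates the two models named in this abstract, the sketch below writes out the standard 1PL and 2PL item response functions. It is not the authors' estimation code; the parameter names (`theta` for the latent trait, `a` for discrimination, `b` for difficulty) follow common IRT notation and are assumptions here.

```python
import math


def p_2pl(theta, a, b):
    """2PL: probability of a keyed response given trait level theta,
    item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def p_1pl(theta, b, a=1.0):
    """1PL: the same curve with one discrimination value shared by all items."""
    return p_2pl(theta, a, b)
```

    Under the 1PL model, items differ only in where the curve sits on the trait scale (`b`); the 2PL model also lets the steepness (`a`) vary by item.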

  20. Sleep can reduce the testing effect: it enhances recall of restudied items but can leave recall of retrieved items unaffected.

    PubMed

    Bäuml, Karl-Heinz T; Holterman, Christoph; Abel, Magdalena

    2014-11-01

    The testing effect refers to the finding that retrieval practice in comparison to restudy of previously encoded contents can improve memory performance and reduce time-dependent forgetting. Naturally, long retention intervals include both wake and sleep delay, which can influence memory contents differently. In fact, sleep immediately after encoding can induce a mnemonic benefit, stabilizing and strengthening the encoded contents. We investigated in a series of 5 experiments whether sleep influences the testing effect. After initial study of categorized item material (Experiments 1, 2, and 4A), paired associates (Experiment 3), or educational text material (Experiment 4B), subjects were asked to restudy encoded contents or engage in active retrieval practice. A final recall test was conducted after a 12-hr delay that included diurnal wakefulness or nocturnal sleep. The results consistently showed typical testing effects after the wake delay. However, these testing effects were reduced or even eliminated after sleep, because sleep benefited recall of restudied items but left recall of retrieved items unaffected. The findings are consistent with the bifurcation model of the testing effect (Kornell, Bjork, & Garcia, 2011), according to which the distribution of memory strengths across items is shifted differentially by retrieving and restudying, with retrieval strengthening items to a much higher degree than restudy does. On the basis of this model, most of the retrieved items already fall above recall threshold in the absence of sleep, so additional sleep-induced strengthening may not improve recall of retrieved items any further. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  1. Using Response-Time Constraints in Item Selection To Control for Differential Speededness in Computerized Adaptive Testing. LSAC Research Report Series.

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Scrams, David J.; Schnipke, Deborah L.

    This paper proposes an item selection algorithm that can be used to neutralize the effect of time limits in computer adaptive testing. The method is based on a statistical model for the response-time distributions of the test takers on the items in the pool that is updated each time a new item has been administered. Predictions from the model are…

  2. Identification of metallic items that caused nickel dermatitis in Danish patients.

    PubMed

    Thyssen, Jacob P; Menné, Torkil; Johansen, Jeanne D

    2010-09-01

    Nickel allergy is prevalent as assessed by epidemiological studies. In an attempt to further identify and characterize sources that may result in nickel allergy and dermatitis, we analysed items identified by nickel-allergic dermatitis patients as causative of nickel dermatitis by using the dimethylglyoxime (DMG) test. Dermatitis patients with nickel allergy of current relevance were identified over a 2-year period in a tertiary referral patch test centre. When possible, their work tools and personal items were examined with the DMG test. Among 95 nickel-allergic dermatitis patients, 70 (73.7%) had metallic items investigated for nickel release. A total of 151 items were investigated, and 66 (43.7%) gave positive DMG test reactions. Objects were nearly all purchased or acquired after the introduction of the EU Nickel Directive. Only one object had been inherited, and only two objects had been purchased outside of Denmark. DMG testing is valuable as a screening test for nickel release and should be used to identify relevant exposures in nickel-allergic patients. Mainly consumer items, but also work tools used in an occupational setting, released nickel in dermatitis patients. This study confirmed 'risk items' from previous studies, including mobile phones.

  3. A Comparison of the One-and Three-Parameter Logistic Models on Measures of Test Efficiency.

    ERIC Educational Resources Information Center

    Benson, Jeri

    Two methods of item selection were used to select sets of 40 items from a 50-item verbal analogies test, and the resulting item sets were compared for relative efficiency. The BICAL program was used to select the 40 items having the best mean square fit to the one parameter logistic (Rasch) model. The LOGIST program was used to select the 40 items…

  4. Test Score Equating Using Discrete Anchor Items versus Passage-Based Anchor Items: A Case Study Using "SAT"® Data. Research Report. ETS RR-14-14

    ERIC Educational Resources Information Center

    Liu, Jinghua; Zu, Jiyun; Curley, Edward; Carey, Jill

    2014-01-01

    The purpose of this study is to investigate the impact of discrete anchor items versus passage-based anchor items on observed score equating using empirical data. This study compares an "SAT"® critical reading anchor that contains more discrete items proportionally, compared to the total tests to be equated, to another anchor that…

  5. Computerized Adaptive Testing: Overview and Introduction.

    ERIC Educational Resources Information Center

    Meijer, Rob R.; Nering, Michael L.

    1999-01-01

    Provides an overview of computerized adaptive testing (CAT) and introduces contributions to this special issue. CAT elements discussed include item selection, estimation of the latent trait, item exposure, measurement precision, and item-bank development. (SLD)
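
    One common way to carry out the item selection step listed in this overview is to administer the unused item with the greatest Fisher information at the current ability estimate. The sketch below assumes a 2PL model and a hypothetical pool of `(a, b)` pairs; neither is specified in the abstract, and other selection rules are also in use.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_next_item(theta, pool, administered):
    """Index of the not-yet-administered item with maximum information at theta.

    pool: list of (a, b) tuples (hypothetical item parameters).
    administered: set of indices already given to this examinee.
    """
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *pool[i]))


# Hypothetical four-item pool: (discrimination, difficulty).
pool = [(1.0, -1.0), (1.0, 0.0), (2.0, 0.0), (1.0, 1.0)]
first = select_next_item(0.0, pool, set())     # most informative item at theta = 0
second = select_next_item(0.0, pool, {first})  # next best once that item is used
```

    In a full CAT loop, the ability estimate would be updated after each response and the selection repeated until a stopping rule is met. Item exposure control, also listed in the abstract, would constrain this greedy choice.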

  6. Development of a Computer Adaptive Test for Depression Based on the Dutch-Flemish Version of the PROMIS Item Bank.

    PubMed

    Flens, Gerard; Smits, Niels; Terwee, Caroline B; Dekker, Joost; Huijbrechts, Irma; de Beurs, Edwin

    2017-03-01

    We developed a Dutch-Flemish version of the patient-reported outcomes measurement information system (PROMIS) adult V1.0 item bank for depression as input for computerized adaptive testing (CAT). As the item bank, we used the Dutch-Flemish translation of the original PROMIS item bank (28 items) and additionally translated 28 U.S. depression items that failed to make the final U.S. item bank. Through psychometric analysis of a combined clinical and general population sample (N = 2,010), 8 added items were removed. With the final item bank, we performed several CAT simulations to assess the efficiency of the extended (48 items) and the original item bank (28 items), using various stopping rules. Both item banks resulted in highly efficient and precise measurement of depression and showed high similarity between the CAT simulation scores and the full item bank scores. We discuss the implications of using each item bank and stopping rule for further CAT development.

  7. Usability of Interactive Item Types and Tools Introduced in the New GRE® Revised General Test. ETS GRE® Board Research Report. ETS GRE®-14-05. ETS Research Report. RR-14-28

    ERIC Educational Resources Information Center

    Swiggett, Wanda D.; Kotloff, Laurie; Ezzo, Chelsea; Adler, Rachel; Oliveri, Maria Elena

    2014-01-01

    The computer-based "Graduate Record Examinations"® ("GRE"®) revised General Test includes interactive item types and testing environment tools (e.g., test navigation, on-screen calculator, and help). How well do test takers understand these innovations? If test takers do not understand the new item types, these innovations may…

  8. Severity of Organized Item Theft in Computerized Adaptive Testing: A Simulation Study

    ERIC Educational Resources Information Center

    Yi, Qing; Zhang, Jinming; Chang, Hua-Hua

    2008-01-01

    Criteria had been proposed for assessing the severity of possible test security violations for computerized tests with high-stakes outcomes. However, these criteria resulted from theoretical derivations that assumed uniformly randomized item selection. This study investigated potential damage caused by organized item theft in computerized adaptive…

  9. Detecting Item Drift in Large-Scale Testing

    ERIC Educational Resources Information Center

    Guo, Hongwen; Robin, Frederic; Dorans, Neil

    2017-01-01

    The early detection of item drift is an important issue for frequently administered testing programs because items are reused over time. Unfortunately, operational data tend to be very sparse and do not lend themselves to frequent monitoring analyses, particularly for on-demand testing. Building on existing residual analyses, the authors propose…

  10. Tree versus Geometric Representation of Tests and Items.

    ERIC Educational Resources Information Center

    Beller, Michael

    1990-01-01

    Geometric approaches to representing interrelations among tests and items are compared with an additive tree model (ATM), using 2,644 examinees and 2 other data sets. The ATM's close fit to the data and its coherence of presentation indicate that it is the best means of representing tests and items. (TJH)

  11. Superficial Priming in Episodic Recognition

    ERIC Educational Resources Information Center

    Dopkins, Stephen; Sargent, Jesse; Ngo, Catherine T.

    2010-01-01

    We explored the effect of superficial priming in episodic recognition and found it to be different from the effect of semantic priming in episodic recognition. Participants made recognition judgments to pairs of items, with each pair consisting of a prime item and a test item. Correct positive responses to the test item were impeded if the prime…

  12. Statistical Indexes for Monitoring Item Behavior under Computer Adaptive Testing Environment.

    ERIC Educational Resources Information Center

    Zhu, Renbang; Yu, Feng; Liu, Su

    A computerized adaptive test (CAT) administration usually requires a large supply of items with accurately estimated psychometric properties, such as item response theory (IRT) parameter estimates, to ensure the precision of examinee ability estimation. However, an estimated IRT model of a given item in any given pool does not always correctly…

  13. Using Item Response Theory to Describe the Nonverbal Literacy Assessment (NVLA)

    ERIC Educational Resources Information Center

    Fleming, Danielle; Wilson, Mark; Ahlgrim-Delzell, Lynn

    2018-01-01

    The Nonverbal Literacy Assessment (NVLA) is a literacy assessment designed for students with significant intellectual disabilities. The 218-item test was initially examined using confirmatory factor analysis. This method showed that the test worked as expected, but the items loaded onto a single factor. This article uses item response theory to…

  14. Aggregating Polytomous DIF Results over Multiple Test Administrations

    ERIC Educational Resources Information Center

    Zwick, Rebecca; Ye, Lei; Isham, Steven

    2018-01-01

    In typical differential item functioning (DIF) assessments, an item's DIF status is not influenced by its status in previous test administrations. An item that has shown DIF at multiple administrations may be treated the same way as an item that has shown DIF in only the most recent administration. Therefore, much useful information about the…

  15. A Comparison of Linking and Concurrent Calibration under the Graded Response Model.

    ERIC Educational Resources Information Center

    Kim, Seock-Ho; Cohen, Allan S.

    Applications of item response theory to practical testing problems including equating, differential item functioning, and computerized adaptive testing, require that item parameter estimates be placed onto a common metric. In this study, two methods for developing a common metric for the graded response model under item response theory were…

  16. Missouri Assessment Program (MAP), Spring 2000: Elementary Health/Physical Education, Released Items, Grade 5.

    ERIC Educational Resources Information Center

    Missouri State Dept. of Elementary and Secondary Education, Jefferson City.

    This document presents 10 released items from the Health/Physical Education Missouri Assessment Program (MAP) test given in the spring of 2000 to fifth graders. Items from the test sessions include: selected-response (multiple choice), constructed-response, and a performance event. The selected-response items consist of individual questions…

  17. Item Analysis Appropriate for Domain-Referenced Classroom Testing. (Project Technical Report Number 1).

    ERIC Educational Resources Information Center

    Nitko, Anthony J.; Hsu, Tse-chi

    Item analysis procedures appropriate for domain-referenced classroom testing are described. A conceptual framework within which item statistics can be considered and promising statistics in light of this framework are presented. The sampling fluctuations of the more promising item statistics for sample sizes comparable to the typical classroom…

  18. The Relationship of Expert-System Scored Constrained Free-Response Items to Multiple-Choice and Open-Ended Items.

    ERIC Educational Resources Information Center

    Bennett, Randy Elliot; And Others

    1990-01-01

    The relationship of an expert-system-scored constrained free-response item type to multiple-choice and free-response items was studied using data for 614 students on the College Board's Advanced Placement Computer Science (APCS) Examination. Implications for testing and the APCS test are discussed. (SLD)

  19. Fissile interrogation using gamma rays from oxygen

    DOEpatents

    Smith, Donald; Micklich, Bradley J.; Fessler, Andreas

    2004-04-20

    The subject apparatus provides a means to identify the presence of fissionable material or other nuclear material contained within an item to be tested. The system employs a portable accelerator to accelerate and direct protons to a fluorine-compound target. The interaction of the protons with the fluorine-compound target produces gamma rays which are directed at the item to be tested. If the item to be tested contains either a fissionable material or other nuclear material, the interaction of the gamma rays with the material contained within the test item will result in the production of neutrons. A system of neutron detectors is positioned to intercept any neutrons generated by the test item. The results from the neutron detectors are analyzed to determine the presence of a fissionable material or other nuclear material.

  20. Validation of a clinical critical thinking skills test in nursing.

    PubMed

    Shin, Sujin; Jung, Dukyoo; Kim, Sungeun

    2015-01-27

    The purpose of this study was to develop a revised version of the clinical critical thinking skills test (CCTS) and to subsequently validate its performance. This study is a secondary analysis of the CCTS. Data were obtained from a convenience sample of 284 college students in June 2011. Thirty items were analyzed using item response theory and test reliability was assessed. Test-retest reliability was measured using the results of 20 nursing college and graduate school students in July 2013. The content validity of the revised items was analyzed by calculating the degree of agreement between instrument developer intention in item development and the judgments of six experts. To analyze response process validity, qualitative data related to the response processes of nine nursing college students obtained through cognitive interviews were analyzed. Of the initial 30 items, 11 were excluded after analysis of the difficulty and discrimination parameters. When the 19 items of the revised version of the CCTS were analyzed, levels of item difficulty were found to be relatively low and levels of discrimination were found to be appropriate or high. The degree of agreement between item developer intention and expert judgments equaled or exceeded 50%. From the above results, evidence of response process validity was demonstrated, indicating that subjects responded as intended by the test developer. The revised 19-item CCTS was found to have sufficient reliability and validity and therefore represents a more convenient measurement of critical thinking ability.

  1. Validation of a clinical critical thinking skills test in nursing

    PubMed Central

    2015-01-01

    Purpose: The purpose of this study was to develop a revised version of the clinical critical thinking skills test (CCTS) and to subsequently validate its performance. Methods: This study is a secondary analysis of the CCTS. Data were obtained from a convenience sample of 284 college students in June 2011. Thirty items were analyzed using item response theory and test reliability was assessed. Test-retest reliability was measured using the results of 20 nursing college and graduate school students in July 2013. The content validity of the revised items was analyzed by calculating the degree of agreement between instrument developer intention in item development and the judgments of six experts. To analyze response process validity, qualitative data related to the response processes of nine nursing college students obtained through cognitive interviews were analyzed. Results: Of the initial 30 items, 11 were excluded after analysis of the difficulty and discrimination parameters. When the 19 items of the revised version of the CCTS were analyzed, levels of item difficulty were found to be relatively low and levels of discrimination were found to be appropriate or high. The degree of agreement between item developer intention and expert judgments equaled or exceeded 50%. Conclusion: From the above results, evidence of response process validity was demonstrated, indicating that subjects responded as intended by the test developer. The revised 19-item CCTS was found to have sufficient reliability and validity and therefore represents a more convenient measurement of critical thinking ability. PMID:25622716

  2. A Comparison of Different Psychometric Approaches to Modeling Testlet Structures: An Example with C-Tests

    ERIC Educational Resources Information Center

    Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan

    2014-01-01

    C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…

  3. Do Self Concept Tests Test Self Concept? An Evaluation of the Validity of Items on the Piers Harris and Coopersmith Measures.

    ERIC Educational Resources Information Center

    Lynch, Mervin D.; Chaves, John

    Items from Piers-Harris and Coopersmith self-concept tests were evaluated against independent measures on three self-constructs, idealized, empathic, and worth. Construct measurements were obtained with the semantic differential and D statistic. Ratings were obtained from 381 children, grades 4-6. For each test, item ratings and construct measures…

  4. Technical Characteristics of the Peabody Individual Achievement Test as a Function of Item Arrangement and Basal and Ceiling Rules.

    ERIC Educational Resources Information Center

    Browning, Robert; And Others

    1979-01-01

    Effects that item order and basal and ceiling rules have on test means, variances, and internal consistency estimates for the Peabody Individual Achievement Test mathematics and reading recognition subtests were examined. Items on the math and reading recognition subtests were significantly easier or harder than test placements indicated. (Author)

  5. Current State of Test Development, Administration, and Analysis: A Study of Faculty Practices.

    PubMed

    Bristol, Timothy J; Nelson, John W; Sherrill, Karin J; Wangerin, Virginia S

    Developing valid and reliable test items is a critical skill for nursing faculty. This research analyzed the test item writing practice of 674 nursing faculty. Relationships between faculty characteristics and their test item writing practices were analyzed. Findings reveal variability in practice and a gap in implementation of evidence-based standards when developing and evaluating teacher-made examinations.

  6. A Review of Guidelines on Home Drug Testing Websites for Parents

    PubMed Central

    Washio, Yukiko; Fairfax-Columbo, Jaymes; Ball, Emily; Cassey, Heather; Arria, Amelia M.; Bresani, Elena; Curtis, Brenda L.; Kirby, Kimberly C.

    2014-01-01

    Purpose: To update and extend prior work reviewing websites that discuss home drug testing for parents, and to assess the quality of information that the websites provide to help them decide when and how to use home drug testing. Methods: We conducted a World Wide Web search that identified eight websites providing information for parents on home drug testing. We assessed the information on the sites using a checklist developed with field experts in adolescent substance abuse and psychosocial interventions that focus on urine testing. Results: None of the websites covered all of the items on the 24-item checklist, and only three covered at least half of the items (12, 14, and 21 items, respectively). The five remaining websites covered fewer than half the checklist items. The mean number of items covered by the websites was 11. Conclusions: Among the websites that we reviewed, few provided thorough information to parents regarding empirically supported strategies to effectively use drug testing to intervene on adolescent substance use. Furthermore, most websites did not provide thorough information regarding the risks and benefits to inform parents’ decision to use home drug testing. Empirical evidence regarding efficacy, benefits, risks, and limitations of home drug testing is needed. PMID:25026103

  7. Science Library of Test Items. Volume Eight. Mastery Testing Program. Series 3 & 4 Supplements to Introduction and Manual.

    ERIC Educational Resources Information Center

    New South Wales Dept. of Education, Sydney (Australia).

    Continuing a series of short tests aimed at measuring student mastery of specific skills in the natural sciences, this supplementary volume includes teachers' notes, a users' guide and inspection copies of test items 27 to 50. Answer keys and test scoring statistics are provided. The items are designed for grades 7 through 10, and a list of the…

  8. Applications of Computerized Adaptive Testing. Proceedings of a Symposium presented at the Annual Convention of the Military Testing Association (18th, October 1976). Research Report 77-1.

    ERIC Educational Resources Information Center

    Weiss, David J., Ed.

    This symposium consists of five papers and presents some recent developments in adaptive testing which have applications to several military testing problems. The overview, by James R. McBride, defines adaptive testing and discusses some of its item selection and scoring strategies. Item response theory, or item characteristic curve theory, is…

  9. Solving Differential Equations Analytically. Elementary Differential Equations. Modules and Monographs in Undergraduate Mathematics and Its Applications Project. UMAP Unit 335.

    ERIC Educational Resources Information Center

    Goldston, J. W.

    This unit introduces analytic solutions of ordinary differential equations. The objective is to enable the student to decide whether a given function solves a given differential equation. Examples of problems from biology and chemistry are covered. Problem sets, quizzes, and a model exam are included, and answers to all items are provided. The…

  10. A Rigorous Test of the Fit of the Circumplex Model to Big Five Personality Data: Theoretical and Methodological Issues and Two Large Sample Empirical Tests.

    PubMed

    DeGeest, David Scott; Schmidt, Frank

    2015-01-01

    Our objective was to apply the rigorous test developed by Browne (1992) to determine whether the circumplex model fits Big Five personality data. This test has yet to be applied to personality data. Another objective was to determine whether blended items explained correlations among the Big Five traits. We used two working adult samples, the Eugene-Springfield Community Sample and the Professional Worker Career Experience Survey. Fit to the circumplex was tested via Browne's (1992) procedure. Circumplexes were graphed to identify items with loadings on multiple traits (blended items), and to determine whether removing these items changed five-factor model (FFM) trait intercorrelations. In both samples, the circumplex structure fit the FFM traits well. Each sample had items with dual-factor loadings (8 items in the first sample, 21 in the second). Removing blended items had little effect on construct-level intercorrelations among FFM traits. We conclude that rigorous tests show that the fit of personality data to the circumplex model is good. This finding means the circumplex model is competitive with the factor model in understanding the organization of personality traits. The circumplex structure also provides a theoretically and empirically sound rationale for evaluating intercorrelations among FFM traits. Even after eliminating blended items, FFM personality traits remained correlated.

  11. [Mokken scaling of the Cognitive Screening Test].

    PubMed

    Diesfeldt, H F A

    2009-10-01

    The Cognitive Screening Test (CST) is a twenty-item orientation questionnaire in Dutch, that is commonly used to evaluate cognitive impairment. This study applied Mokken Scale Analysis, a non-parametric set of techniques derived from item response theory (IRT), to CST-data of 466 consecutive participants in psychogeriatric day care. The full item set and the standard short version of fourteen items both met the assumptions of the monotone homogeneity model, with scalability coefficient H = 0.39, which is considered weak. In order to select items that would fulfil the assumption of invariant item ordering or the double monotonicity model, the subjects were randomly partitioned into a training set (50% of the sample) and a test set (the remaining half). By means of an automated item selection eleven items were found to measure one latent trait, with H = 0.67 and item H coefficients larger than 0.51. Cross-validation of the item analysis in the remaining half of the subjects gave comparable values (H = 0.66; item H coefficients larger than 0.56). The selected items involve year, place of residence, birth date, the monarch's and prime minister's names, and their predecessors. Applying optimal discriminant analysis (ODA) it was found that the full set of twenty CST items performed best in distinguishing two predefined groups of patients of lower or higher cognitive ability, as established by an independent criterion derived from the Amsterdam Dementia Screening Test. The chance corrected predictive value or prognostic utility was 47.5% for the full item set, 45.2% for the fourteen items of the standard short version of the CST, and 46.1% for the homogeneous, unidimensional set of selected eleven items. The results of the item analysis support the application of the CST in cognitive assessment, and revealed a more reliable 'short' version of the CST than the standard short version (CST14).
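    As an illustration of the scalability coefficient H reported above, the following minimal sketch computes Loevinger's H for dichotomous (0/1) items as the ratio of summed observed inter-item covariances to the summed maximum covariances attainable given the item marginals. The function name and the toy data layout are assumptions made for illustration; this is not the software or the item selection procedure used in the study.

```python
from itertools import combinations


def scalability_h(responses):
    """Loevinger's H for a list of respondents' 0/1 item-score rows.

    Assumes every row has the same length and that at least one item pair
    has a non-zero maximum covariance (otherwise H is undefined).
    """
    n = len(responses)
    k = len(responses[0])
    # Item popularities (proportion of respondents scoring 1 on each item).
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    cov_sum = 0.0
    cov_max_sum = 0.0
    for i, j in combinations(range(k), 2):
        p_ij = sum(row[i] * row[j] for row in responses) / n
        cov_sum += p_ij - p[i] * p[j]
        # The joint proportion can be at most min(p_i, p_j).
        cov_max_sum += min(p[i], p[j]) - p[i] * p[j]
    return cov_sum / cov_max_sum


# A perfect Guttman pattern has no scaling errors.
guttman = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
h_perfect = scalability_h(guttman)
# Adding one respondent who violates the item ordering lowers H.
h_with_error = scalability_h(guttman + [[0, 1, 0]])
```

    By the common rule of thumb the abstract alludes to, values of H between roughly 0.3 and 0.4 are read as a weak scale, which is why H = 0.39 is described that way.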

  12. Modeling the dynamics of recognition memory testing with an integrated model of retrieval and decision making.

    PubMed

    Osth, Adam F; Jansson, Anna; Dennis, Simon; Heathcote, Andrew

    2018-08-01

    A robust finding in recognition memory is that performance declines monotonically across test trials. Despite the prevalence of this decline, there is a lack of consensus on the mechanism responsible. Three hypotheses have been put forward: (1) interference is caused by learning of test items, (2) the test items cause a shift in the context representation used to cue memory, and (3) participants change their speed-accuracy thresholds through the course of testing. We implemented all three possibilities in a combined model of recognition memory and decision making, which inherits the memory retrieval elements of the Osth and Dennis (2015) model and uses the diffusion decision model (DDM: Ratcliff, 1978) to generate choice and response times. We applied the model to four datasets that represent three challenges, namely the findings that (1) the number of test items plays a larger role in determining performance than the number of studied items, (2) performance decreases less for strong items than weak items in pure lists but not in mixed lists, and (3) lexical decision trials interspersed between recognition test trials do not increase the rate at which performance declines. Analysis of the model's parameter estimates suggests that item interference plays a weak role in explaining the effects of recognition testing, while context drift plays a very large role. These results are consistent with prior work showing a weak role for item noise in recognition memory and that retrieval is a strong cause of context change in episodic memory. Copyright © 2018 Elsevier Inc. All rights reserved.

  13. Predictors of Nursing Students' Performance in a One-Semester Organic and Biochemistry Course

    NASA Astrophysics Data System (ADS)

    van Lanen, Robert J.; Lockie, Nancy M.; McGannon, Thomas

    2000-06-01

    In an effort to empower nursing students to successfully persist in chemistry, predictors of success for undergraduate nursing students enrolled in a one-semester organic and biochemistry course were identified. The sample consisted of 308 undergraduate nursing students enrolled in Chemistry 108 (Principles of Organic and Biochemistry) during a period of seven semesters. In this study, Supplemental Instruction (SI) is a nonremedial academic support program offered for Chemistry 108 students. Placement tests in Mathematics, Reading, and English are required of all entering students. The English Placement Test assesses proficiency in analytical reading and writing; the Nelson Denny Reading Test (Form E) assesses the student's understanding of written vocabulary and the mastery of reading comprehension, and the Mathematics Placement Test measures the student's mastery of arithmetic and algebraic calculations. Both demographic and academic variables were examined. For the entire sample, five predictor variables were identified: Mathematics Placement Test score, Chemistry 107 grade (a prerequisite), total number of SI sessions attended, Nelson Denny Reading Test (Form E) score, and age. Predictors for various subpopulations of the sample were also identified. Predictors for students of traditional age were Mathematics Placement Test score, total number of SI sessions attended, and Chemistry 107 grade. The best predictors for continuing education students were Chemistry 107 grade and Nelson Denny Test score.

  14. The effect of autoclave resterilisation on polyester vascular grafts.

    PubMed

    Riepe, G; Whiteley, M S; Wente, A; Rogge, A; Schröder, A; Galland, R B; Imig, H

    1999-11-01

    Polyester grafts are expensive, single-use items. Some manufacturers of uncoated, woven grafts include instructions for autoclave resterilisation to be performed at the surgeon's own request. Others warn against such manipulation. Theoretically, the glass transition point of polyester at 70-80 degrees C and the possible acceleration of hydrolysis suggest that autoclave resterilisation at 135 degrees C might be a problem. A DeBakey Soft Woven Dacron Vascular Prosthesis (Bard) and a Woven Double Velour Dacron Graft (Meadox) were autoclave-resterilised 0 to 20 times, having been weighed before and after sterilisation. Tactile testing was performed. Mechanical properties were examined by probe puncture and single-filament testing, the surface was examined by scanning electron microscopy and the degree of hydrolysis by infra-red spectroscopy. Tactile testing revealed a change of feeling with increasing cycles of resterilisation. Investigation of weight, textile strength, single-filament strength, electron microscopy of the surface and infra-red spectroscopy showed no change of the material. Changes felt are presumably a surface phenomenon, not measurably affecting strength or chemistry of material after autoclave resterilisation. We therefore feel that it is safe to use once-autoclave-resterilised surplus uncoated polyester grafts, provided that sterility is guaranteed. Copyright 1999 Harcourt Publishers Ltd.

  15. High School Class for Gifted Pupils in Physics and Sciences and Pupils' Skills Measured by Standard and Pisa Test

    NASA Astrophysics Data System (ADS)

    Djordjevic, G. S.; Pavlovic-Babic, D.

    2010-01-01

    The "High school class for students with special abilities in physics" was founded in Nis, Serbia (www.pmf.ni.ac.yu/f_odeljenje) in 2003. The basic aim of this project has been to introduce a broadened curriculum of physics, mathematics, computer science, as well as chemistry and biology. Now, six years after establishing this specialized class, and 3 years after the previous report, we present analyses of the pupils' skills in solving problem-oriented tests, such as the PISA test, and compare their results with the results of pupils who study under standard curricula. More precisely, results are compared to the progress results of the pupils in a standard Grammar School and the corresponding classes of the Mathematical Gymnasiums in Nis. Analysis of achievement data should clarify the benefits of introducing a track for gifted students into the school system. Additionally, item analysis helps in understanding and improving the efficacy of learning strategies. We make some conclusions and remarks that may be useful for future work that aims to increase pupils' intrinsic and instrumental motivation for physics and sciences, as well as to increase the efficacy of teaching physics and science.

  16. Multistage Computerized Adaptive Testing with Uniform Item Exposure

    ERIC Educational Resources Information Center

    Edwards, Michael C.; Flora, David B.; Thissen, David

    2012-01-01

    This article describes a computerized adaptive test (CAT) based on the uniform item exposure multi-form structure (uMFS). The uMFS is a specialization of the multi-form structure (MFS) idea described by Armstrong, Jones, Berliner, and Pashley (1998). In an MFS CAT, the examinee first responds to a small fixed block of items. The items comprising…

  17. Primary Science Assessment Item Setters' Misconceptions Concerning the State Changes of Water

    ERIC Educational Resources Information Center

    Boo, Hong Kwen

    2006-01-01

    Assessment is an integral and vital part of teaching and learning, providing feedback on progress through the assessment period to both learners and teachers. However, if test items are flawed because of misconceptions held by the questions setter, then such test items are invalid as assessment tools. Moreover, such flawed items are also likely to…

  18. Stratified and Maximum Information Item Selection Procedures in Computer Adaptive Testing

    ERIC Educational Resources Information Center

    Deng, Hui; Ansley, Timothy; Chang, Hua-Hua

    2010-01-01

    In this study we evaluated and compared three item selection procedures: the maximum Fisher information procedure (F), the a-stratified multistage computer adaptive testing (CAT) (STR), and a refined stratification procedure that allows more items to be selected from the high a strata and fewer items from the low a strata (USTR), along with…
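    The maximum Fisher information procedure (F) mentioned above can be sketched as follows. The truncated abstract does not state which IRT model was used, so the two-parameter logistic (2PL) model here is an assumption, as are the function names and the representation of the pool as (a, b) pairs. The stratified procedures (STR, USTR) are not shown.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta (assumed model)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_max_information(theta, pool, administered):
    """Return the index of the unused item with the most information at theta.

    pool is a list of (a, b) tuples; administered is a set of used indices.
    """
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *pool[i]))


# Hypothetical three-item pool: (discrimination a, difficulty b).
pool = [(0.5, 0.0), (2.0, 0.0), (2.0, 3.0)]
first_pick = select_max_information(0.0, pool, set())
second_pick = select_max_information(0.0, pool, {first_pick})
```

    Because information grows with the square of a, this rule tends to use up high-a items early, which is the exposure imbalance that a-stratified designs are meant to reduce.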

  19. Assessment of Differential Item Functioning in Testlet-Based Items Using the Rasch Testlet Model

    ERIC Educational Resources Information Center

    Wang, Wen-Chung; Wilson, Mark

    2005-01-01

    This study presents a procedure for detecting differential item functioning (DIF) for dichotomous and polytomous items in testlet-based tests, whereby DIF is taken into account by adding DIF parameters into the Rasch testlet model. Simulations were conducted to assess recovery of the DIF and other parameters. Two independent variables, test type…

  20. Ethnic Group Bias in Intelligence Test Items.

    ERIC Educational Resources Information Center

    Scheuneman, Janice

    In previous studies of ethnic group bias in intelligence test items, the question of bias has been confounded with ability differences between the ethnic group samples compared. The present study is based on a conditional probability model in which an unbiased item is defined as one where the probability of a correct response to an item is the…

  1. Primary Science Assessment Item Setters' Misconceptions Concerning Biological Science Concepts

    ERIC Educational Resources Information Center

    Boo, Hong Kwen

    2007-01-01

    Assessment is an integral and vital part of teaching and learning, providing feedback on progress through the assessment period to both learners and teachers. However, if test items are flawed because of misconceptions held by the question setter, then such test items are invalid as assessment tools. Moreover, such flawed items are also likely to…

  2. Examination of Different Item Response Theory Models on Tests Composed of Testlets

    ERIC Educational Resources Information Center

    Kogar, Esin Yilmaz; Kelecioglu, Hülya

    2017-01-01

    The purpose of this research is to first estimate the item and ability parameters and the standard error values related to those parameters obtained from Unidimensional Item Response Theory (UIRT), bifactor (BIF) and Testlet Response Theory models (TRT) in the tests including testlets, when the number of testlets, number of independent items, and…

  3. A Monte Carlo Study of an Iterative Wald Test Procedure for DIF Analysis

    ERIC Educational Resources Information Center

    Cao, Mengyang; Tay, Louis; Liu, Yaowu

    2017-01-01

    This study examined the performance of a proposed iterative Wald approach for detecting differential item functioning (DIF) between two groups when preknowledge of anchor items is absent. The iterative approach utilizes the Wald-2 approach to identify anchor items and then iteratively tests for DIF items with the Wald-1 approach. Monte Carlo…

  4. A Semiparametric Model for Jointly Analyzing Response Times and Accuracy in Computerized Testing

    ERIC Educational Resources Information Center

    Wang, Chun; Fan, Zhewen; Chang, Hua-Hua; Douglas, Jeffrey A.

    2013-01-01

    The item response times (RTs) collected from computerized testing represent an underutilized type of information about items and examinees. In addition to knowing the examinees' responses to each item, we can investigate the amount of time examinees spend on each item. Current models for RTs mainly focus on parametric models, which have the…

  5. Missouri Assessment Program (MAP), Spring 2000: High School Health/Physical Education, Released Items, Grade 9.

    ERIC Educational Resources Information Center

    Missouri State Dept. of Elementary and Secondary Education, Jefferson City.

    This document presents 10 released items from the Health/Physical Education Missouri Assessment Program (MAP) test given in the spring of 2000 to ninth graders. Items from the test sessions include: selected-response (multiple choice), constructed-response, and a performance event. The selected-response items consist of individual questions…

  6. An Empirical Investigation of Methods for Assessing Item Fit for Mixed Format Tests

    ERIC Educational Resources Information Center

    Chon, Kyong Hee; Lee, Won-Chan; Ansley, Timothy N.

    2013-01-01

    Empirical information regarding performance of model-fit procedures has been a persistent need in measurement practice. Statistical procedures for evaluating item fit were applied to real test examples that consist of both dichotomously and polytomously scored items. The item fit statistics used in this study included the PARSCALE's G[squared],…

  7. Missouri Assessment Program (MAP), Spring 2000: Intermediate Communication Arts, Released Items, Grade 7.

    ERIC Educational Resources Information Center

    Missouri State Dept. of Elementary and Secondary Education, Jefferson City.

    This document deals with testing in intermediate communication arts for seventh graders in Missouri public schools. The document contains the following items from the Session 1 Test Booklet: "Swimming in Snow" (Diana C. Conway) (Items 1, 2, and 5); "Discovery" (Marion Dane Bauer) (Item 13); writing prompt; and a writer's…

  8. Automated Item Generation with Recurrent Neural Networks.

    PubMed

    von Davier, Matthias

    2018-03-12

    Utilizing technology for automated item generation is not a new idea. However, test items used in commercial testing programs or in research are still predominantly written by humans, in most cases by content experts or professional item writers. Human experts are a limited resource and testing agencies incur high costs in the process of continuous renewal of item banks to sustain testing programs. Using algorithms instead holds the promise of providing unlimited resources for this crucial part of assessment development. The approach presented here deviates in several ways from previous attempts to solve this problem. In the past, automatic item generation relied either on generating clones of narrowly defined item types such as those found in language free intelligence tests (e.g., Raven's progressive matrices) or on an extensive analysis of task components and derivation of schemata to produce items with pre-specified variability that are hoped to have predictable levels of difficulty. It is somewhat unlikely that researchers utilizing these previous approaches would look at the proposed approach with favor; however, recent applications of machine learning show success in solving tasks that seemed impossible for machines not too long ago. The proposed approach uses deep learning to implement probabilistic language models, not unlike what Google brain and Amazon Alexa use for language processing and generation.

  9. Assessing the Conceptual Understanding about Heat and Thermodynamics at Undergraduate Level

    ERIC Educational Resources Information Center

    Kulkarni, Vasudeo Digambar; Tambade, Popat Savaleram

    2013-01-01

    In this study, a Thermodynamic Concept Test (TCT) was designed to assess students' conceptual understanding of heat and thermodynamics at the undergraduate level. Different statistical tests, such as the item difficulty index, item discrimination index, and point biserial coefficient, were used for assessing the TCT. For each item of the test these indices were…
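    The three item statistics named above have standard classical-test-theory definitions, sketched here for 0/1 item scores. The function names, the 27% upper and lower groups, and the toy data are assumptions for illustration, not details taken from the study.

```python
import math


def item_difficulty(item_scores):
    """Proportion of examinees answering the item correctly."""
    return sum(item_scores) / len(item_scores)


def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Proportion correct in the top group minus that in the bottom group."""
    n_group = max(1, round(fraction * len(total_scores)))
    order = sorted(range(len(total_scores)), key=lambda s: total_scores[s])
    lower, upper = order[:n_group], order[-n_group:]
    p_upper = sum(item_scores[s] for s in upper) / n_group
    p_lower = sum(item_scores[s] for s in lower) / n_group
    return p_upper - p_lower


def point_biserial(item_scores, total_scores):
    """Pearson correlation between a 0/1 item score and the total score."""
    n = len(item_scores)
    mean_x = sum(item_scores) / n
    mean_y = sum(total_scores) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, total_scores))
    var_x = sum((x - mean_x) ** 2 for x in item_scores)
    var_y = sum((y - mean_y) ** 2 for y in total_scores)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: four examinees, one item, and their total test scores.
item = [0, 0, 1, 1]
totals = [1, 2, 3, 4]
difficulty = item_difficulty(item)
discrimination = discrimination_index(item, totals)
r_pb = point_biserial(item, totals)
```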

  10. A Study of Inference in Standardized Reading Test Items and Its Relationship to Difficulty.

    ERIC Educational Resources Information Center

    Marzano, Robert J.

    To study the relationship between inferences made on standardized reading tests and item difficulty, 50 items on the reading comprehension section of the Metropolitan Achievement Test were analyzed independently in this study by two raters using four general categories of inferences: (1) reference inferences, (2) between proposition inferences,…

  11. Questions and Problems in Science.

    ERIC Educational Resources Information Center

    Dressel, Paul L.; Nelson, Clarence H.

    This folio of test items, contributed by a number of colleges and universities from their course, placement, entrance, or other institutional examinations, was compiled to aid teachers in constructing tests. Only those science courses offered in the first two years of college are represented by the scope of the items. The test items may also serve…

  12. Effects of Using Modified Items to Test Students with Persistent Academic Difficulties

    ERIC Educational Resources Information Center

    Elliott, Stephen N.; Kettler, Ryan J.; Beddow, Peter A.; Kurz, Alexander; Compton, Elizabeth; McGrath, Dawn; Bruen, Charles; Hinton, Kent; Palmer, Porter; Rodriguez, Michael C.; Bolt, Daniel; Roach, Andrew T.

    2010-01-01

    This study investigated the effects of using modified items in achievement tests to enhance accessibility. An experiment determined whether tests composed of modified items would reduce the performance gap between students eligible for an alternate assessment based on modified achievement standards (AA-MAS) and students not eligible, and the…

  13. Optimal Stratification of Item Pools in a-Stratified Computerized Adaptive Testing.

    ERIC Educational Resources Information Center

    Chang, Hua-Hua; van der Linden, Wim J.

    2003-01-01

    Developed a method based on 0-1 linear programming to stratify an item pool optimally for use in alpha-stratified adaptive testing. Applied the method to a previous item pool from the computerized adaptive test of the Graduate Record Examinations. Results show the new method performs well in practical situations. (SLD)

  14. The Development and Validation of a Formula for Measuring Single-Sentence Test Item Readability.

    ERIC Educational Resources Information Center

    Homan, Susan; And Others

    1994-01-01

    A study was conducted with 782 elementary school students to determine whether the Homan-Hewitt Readability Formula could identify the readability of a single-sentence test item. Results indicate that a relationship exists between students' reading grade levels and responses to test items written at higher readability levels. (SLD)

  15. Development and Validation of a Computer Adaptive EFL Test

    ERIC Educational Resources Information Center

    He, Lianzhen; Min, Shangchao

    2017-01-01

    The first aim of this study was to develop a computer adaptive EFL test (CALT) that assesses test takers' listening and reading proficiency in English with dichotomous items and polytomous testlets. We reported in detail on the development of the CALT, including item banking, determination of suitable item response theory (IRT) models for item…

  16. The Development and Management of Banks of Performance Based Test Items.

    ERIC Educational Resources Information Center

    Curtis, H. A., Ed.

    Symposium papers presented at an Annual Meeting of the National Council on Measurement in Education (Chicago, 1972), all of which concern banks of test items for use in constructing criterion referenced tests, comprise this document. The first paper, "Locally Produced Item Banks" by Thomas J. Slocum, presents information on the…

  17. Test-retest stability of the Task and Ego Orientation Questionnaire.

    PubMed

    Lane, Andrew M; Nevill, Alan M; Bowes, Neal; Fox, Kenneth R

    2005-09-01

    Establishing stability, defined as observing minimal measurement error in a test-retest assessment, is vital to validating psychometric tools. Correlational methods, such as Pearson product-moment, intraclass, and kappa are tests of association or consistency, whereas stability or reproducibility (regarded here as synonymous) assesses the agreement between test-retest scores. Indexes of reproducibility using the Task and Ego Orientation in Sport Questionnaire (TEOSQ; Duda & Nicholls, 1992) were investigated using correlational (Pearson product-moment, intraclass, and kappa) methods, repeated measures multivariate analysis of variance, and calculating the proportion of agreement within a referent value of +/-1 as suggested by Nevill, Lane, Kilgour, Bowes, and Whyte (2001). Two hundred thirteen soccer players completed the TEOSQ on two occasions, 1 week apart. Correlation analyses indicated a stronger test-retest correlation for the Ego subscale than the Task subscale. Multivariate analysis of variance indicated stability for ego items but with significant increases in four task items. The proportion of test-retest agreement scores indicated that all ego items reported relatively poor stability statistics with test-retest scores within a range of +/-1, ranging from 82.7-86.9%. By contrast, all task items showed test-retest difference scores ranging from 92.5-99%, although further analysis indicated that four task subscale items increased significantly. Findings illustrated that correlational methods (Pearson product-moment, intraclass, and kappa) are influenced by the range in scores, and calculating the proportion of agreement of test-retest differences with a referent value of +/-1 could provide additional insight into the stability of the questionnaire. It is suggested that the item-by-item proportion of agreement method proposed by Nevill et al. (2001) should be used to supplement existing methods and could be especially helpful in identifying rogue items in the initial stages of psychometric questionnaire validation.
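    The proportion-of-agreement statistic described above can be sketched for a single item as the share of respondents whose test and retest scores differ by no more than the referent value. The function name and the example scores are hypothetical.

```python
def proportion_of_agreement(test_scores, retest_scores, referent=1):
    """Share of respondents whose test-retest difference is within +/-referent."""
    within = sum(1 for t, r in zip(test_scores, retest_scores)
                 if abs(t - r) <= referent)
    return within / len(test_scores)


# Hypothetical item scores for five respondents on two occasions.
test = [1, 2, 3, 4, 5]
retest = [1, 3, 5, 4, 2]
agreement = proportion_of_agreement(test, retest)  # differences: 0, 1, 2, 0, 3
```

    Unlike a correlation coefficient, this statistic is computed from the differences alone, so it does not depend on how widely scores are spread across respondents.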

  18. How Small the Number of Test Items Can Be for the Basis of Estimating the Operating Characteristics of the Discrete Responses to Unknown Test Items.

    ERIC Educational Resources Information Center

    Samejima, Fumiko; Changas, Paul S.

    The methods and approaches for estimating the operating characteristics of the discrete item responses without assuming any mathematical form have been developed and expanded. It has been made possible that, even if the test information function of a given test is not constant for the interval of ability of interest, it is used as the Old Test.…

  19. Automatic Generation of Rasch-Calibrated Items: Figural Matrices Test GEOM and Endless-Loops Test EC

    ERIC Educational Resources Information Center

    Arendasy, Martin

    2005-01-01

    The future of test construction for certain psychological ability domains that can be analyzed well in a structured manner may lie--at the very least for reasons of test security--in the field of automatic item generation. In this context, a question that has not been explicitly addressed is whether it is possible to embed an item response theory…

  20. Evaluation of Floors and Item Gradients for Reading and Math Tests for Young Children

    ERIC Educational Resources Information Center

    Bradley-Johnson, Sharon; Durmusoglu, Gokce

    2005-01-01

    Ignoring the adequacy of floors and item gradients for tests used with young children can have serious consequences. Thus, because of the importance of early intervention for reading and math problems, we used the criteria suggested by Bracken for adequate floors and item gradients, and reviewed 15 reading tests and 12 math tests for ages 4-0…

  1. The Psychological Effect of Errors in Standardized Language Test Items on EFL Students' Responses to the Following Item

    ERIC Educational Resources Information Center

    Khaksefidi, Saman

    2017-01-01

    This study investigates the psychological effect of a wrong question with wrong items on answering the next question in a test of structure. Forty students selected through stratified random sampling are given 15 questions from a standardized test, namely a TOEFL structure test, in which questions number 7 and number 11 are wrong and their answers…

  2. Glucose in Urine Test: MedlinePlus Lab Test Information

    MedlinePlus

    ... Lab Tests Online [Internet]. American Association for Clinical Chemistry; c2001–2017. Diabetes [updated 2017 Jan 15; cited ... Lab Tests Online [Internet]. American Association for Clinical Chemistry; c2001–2017. Glucose Tests: Common Questions [updated 2017 ...

  3. ITEM ANALYSIS OF THREE SPANISH NAMING TESTS: A CROSS-CULTURAL INVESTIGATION

    PubMed Central

    de la Plata, Carlos Marquez; Arango-Lasprilla, Juan Carlos; Alegret, Montse; Moreno, Alexander; Tárraga, Luis; Lara, Mar; Hewlitt, Margaret; Hynan, Linda; Cullum, C. Munro

    2009-01-01

    Neuropsychological evaluations conducted in the United States and abroad commonly include the use of tests translated from English to Spanish. The use of translated naming tests for evaluating predominately Spanish-speakers has recently been challenged on the grounds that translating test items may compromise a test’s construct validity. The Texas Spanish Naming Test (TNT) has been developed in Spanish specifically for use with Spanish-speakers; however, it is unlikely patients from diverse Spanish-speaking geographical regions will perform uniformly on a naming test. The present study evaluated and compared the internal consistency and patterns of item-difficulty and -discrimination for the TNT and two commonly used translated naming tests in three countries (i.e., United States, Colombia, Spain). Two hundred fifty two subjects (126 demented, 116 nondemented) across three countries were administered the TNT, Modified Boston Naming Test-Spanish, and the naming subtest from the CERAD. The TNT demonstrated superior internal consistency to its counterparts, a superior item difficulty pattern than the CERAD naming test, and a superior item discrimination pattern than the MBNT-S across countries. Overall, all three Spanish naming tests differentiated nondemented and moderately demented individuals, but the results suggest the items of the TNT are most appropriate to use with Spanish-speakers. Preliminary normative data for the three tests examined in each country are provided. PMID:19208960

  4. Identifying predictors of physics item difficulty: A linear regression approach

    NASA Astrophysics Data System (ADS)

    Mesic, Vanes; Muratovic, Hasnija

    2011-06-01

    Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal physics knowledge structures. Identified predictors point out the fundamental cognitive dimensions of student physics achievement at the end of compulsory education in Bosnia and Herzegovina, whose level of development influenced the test results within the conducted assessments.

  5. An evaluation of computerized adaptive testing for general psychological distress: combining GHQ-12 and Affectometer-2 in an item bank for public mental health research.

    PubMed

    Stochl, Jan; Böhnke, Jan R; Pickett, Kate E; Croudace, Tim J

    2016-05-20

    Recent developments in psychometric modeling and technology allow pooling well-validated items from existing instruments into larger item banks and their deployment through methods of computerized adaptive testing (CAT). Use of item response theory-based bifactor methods and integrative data analysis overcomes barriers in cross-instrument comparison. This paper presents the joint calibration of an item bank for researchers keen to investigate population variations in general psychological distress (GPD). Multidimensional item response theory was used on existing health survey data from the Scottish Health Education Population Survey (n = 766) to calibrate an item bank consisting of pooled items from the short common mental disorder screen (GHQ-12) and the Affectometer-2 (a measure of "general happiness"). Computer simulation was used to evaluate usefulness and efficacy of its adaptive administration. A bifactor model capturing variation across a continuum of population distress (while controlling for artefacts due to item wording) was supported. The numbers of items for different required reliabilities in adaptive administration demonstrated promising efficacy of the proposed item bank. Psychometric modeling of the common dimension captured by more than one instrument offers the potential of adaptive testing for GPD using individually sequenced combinations of existing survey items. The potential for linking other item sets with alternative candidate measures of positive mental health is discussed since an optimal item bank may require even more items than these.

  6. Expertise sensitive item selection.

    PubMed

    Chow, P; Russell, H; Traub, R E

    2000-12-01

    In this paper we describe and illustrate a procedure for selecting items from a large pool for a certification test. The proposed procedure, which is intended to improve the alignment of the certification test with on-the-job performance, is based on an expertise sensitive index. This index for an item is the difference between the item's p values for experts and novices. An example is provided of the application of the index for selecting items to be used in certifying bakers.
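
    The index described in this abstract can be illustrated with a short sketch. Only the definition (expert p value minus novice p value) comes from the abstract; the function names, the toy response data, and the rule of keeping the highest-scoring items are assumptions made here for illustration.

```python
def expertise_index(expert_responses, novice_responses):
    """Difference between an item's p value for experts and for novices.

    Each argument is a list of 0/1 scores (1 = correct) for one item.
    """
    p_expert = sum(expert_responses) / len(expert_responses)
    p_novice = sum(novice_responses) / len(novice_responses)
    return p_expert - p_novice


def select_items(pool, n_items):
    """Keep the n_items with the largest index (a hypothetical selection rule).

    pool maps an item id to a pair (expert_responses, novice_responses).
    """
    scored = {item: expertise_index(e, n) for item, (e, n) in pool.items()}
    return sorted(scored, key=scored.get, reverse=True)[:n_items]


# Toy data, invented for this sketch.
pool = {
    "item_a": ([1, 1, 1, 0], [0, 0, 1, 0]),  # 0.75 - 0.25
    "item_b": ([1, 1, 1, 1], [1, 1, 1, 0]),  # 1.00 - 0.75
    "item_c": ([1, 0, 1, 0], [1, 0, 1, 0]),  # 0.50 - 0.50
}
selected = select_items(pool, 2)
```

    An item that experts and novices answer equally well scores zero and would be the first to be dropped under this rule.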

  7. Item-saving assessment of self-care performance in children with developmental disabilities: A prospective caregiver-report computerized adaptive test

    PubMed Central

    Chen, Cheng-Te; Chen, Yu-Lan; Lin, Yu-Ching; Hsieh, Ching-Lin; Tzeng, Jeng-Yi

    2018-01-01

    Objective: The purpose of this study was to construct a computerized adaptive test (CAT) for measuring self-care performance (the CAT-SC) in children with developmental disabilities (DD) aged from 6 months to 12 years in a content-inclusive, precise, and efficient fashion. Methods: The study was divided into 3 phases: (1) item bank development, (2) item testing, and (3) a simulation study to determine the stopping rules for the administration of the CAT-SC. A total of 215 caregivers of children with DD were interviewed with the 73-item CAT-SC item bank. An item response theory model was adopted for examining the construct validity to estimate item parameters after investigation of the unidimensionality, equality of slope parameters, item fitness, and differential item functioning (DIF). In the last phase, the reliability and concurrent validity of the CAT-SC were evaluated. Results: The final CAT-SC item bank contained 56 items. The stopping rules suggested were (a) reliability coefficient greater than 0.9 or (b) 14 items administered. The results of simulation also showed that 85% of the estimated self-care performance scores would reach a reliability higher than 0.9 with a mean test length of 8.5 items, and the mean reliability for the rest was 0.86. Administering the CAT-SC could reduce the number of items administered by 75% to 84%. In addition, self-care performances estimated by the CAT-SC and the full item bank were very similar to each other (Pearson r = 0.98). Conclusion: The newly developed CAT-SC can efficiently measure self-care performance in children with DD whose performances are comparable to those of typically developing (TD) children aged from 6 months to 12 years as precisely as the whole item bank. The item bank of the CAT-SC has good reliability and a unidimensional self-care construct, and the CAT can estimate self-care performance with less than 25% of the items in the item bank. Therefore, the CAT-SC could be useful for measuring self-care performance in children with DD in clinical and research settings. PMID:29561879

  8. Item-saving assessment of self-care performance in children with developmental disabilities: A prospective caregiver-report computerized adaptive test.

    PubMed

    Chen, Cheng-Te; Chen, Yu-Lan; Lin, Yu-Ching; Hsieh, Ching-Lin; Tzeng, Jeng-Yi; Chen, Kuan-Lin

    2018-01-01

    The purpose of this study was to construct a computerized adaptive test (CAT) for measuring self-care performance (the CAT-SC) in children with developmental disabilities (DD) aged from 6 months to 12 years in a content-inclusive, precise, and efficient fashion. The study was divided into 3 phases: (1) item bank development, (2) item testing, and (3) a simulation study to determine the stopping rules for the administration of the CAT-SC. A total of 215 caregivers of children with DD were interviewed with the 73-item CAT-SC item bank. An item response theory model was adopted for examining the construct validity to estimate item parameters after investigation of the unidimensionality, equality of slope parameters, item fitness, and differential item functioning (DIF). In the last phase, the reliability and concurrent validity of the CAT-SC were evaluated. The final CAT-SC item bank contained 56 items. The stopping rules suggested were (a) reliability coefficient greater than 0.9 or (b) 14 items administered. The results of simulation also showed that 85% of the estimated self-care performance scores would reach a reliability higher than 0.9 with a mean test length of 8.5 items, and the mean reliability for the rest was 0.86. Administering the CAT-SC could reduce the number of items administered by 75% to 84%. In addition, self-care performances estimated by the CAT-SC and the full item bank were very similar to each other (Pearson r = 0.98). The newly developed CAT-SC can efficiently measure self-care performance in children with DD whose performances are comparable to those of typically developing (TD) children aged from 6 months to 12 years as precisely as the whole item bank. The item bank of the CAT-SC has good reliability and a unidimensional self-care construct, and the CAT can estimate self-care performance with less than 25% of the items in the item bank. Therefore, the CAT-SC could be useful for measuring self-care performance in children with DD in clinical and research settings.
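
    The two stopping rules reported for the CAT-SC (reliability above 0.9, or 14 items administered) can be written as a single check. This is a minimal sketch, not the authors' implementation: the function name is hypothetical, and the conversion from standard error to reliability (1 minus the squared standard error) assumes a latent trait scaled to unit variance, which the abstract does not state.

```python
def should_stop(standard_error, n_administered,
                min_reliability=0.9, max_items=14):
    """Return True when either CAT-SC stopping rule is met.

    Assumption: reliability = 1 - SE**2, valid only if the latent trait
    is scaled to unit variance.
    """
    reliability = 1.0 - standard_error ** 2
    return reliability > min_reliability or n_administered >= max_items


# Illustrative values, invented for this sketch.
stop_precise = should_stop(standard_error=0.30, n_administered=5)   # reliability 0.91
stop_early = should_stop(standard_error=0.40, n_administered=5)     # reliability 0.84
stop_length = should_stop(standard_error=0.40, n_administered=14)   # length cap reached
```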

  9. Comparison of MX-857 versus MX-641 chemistries for type 2485 film

    NASA Technical Reports Server (NTRS)

    Bourque, P. F.

    1972-01-01

    Tests were conducted to evaluate Kodak MX-857 and MX-641 chemistry systems for use with film Type 2485 to be used in the dim light experiments on Apollo 16. The test program objectives were to: (1) retain a minimum ASA speed of at least 4000; (2) maintain a base-plus-fog level of 0.21 density units or less; and (3) minimize granularity without exceeding the granularity level of the Apollo 15 imagery. Test results on the Versamat processor indicate that the use of MX-857 chemistry is preferred over MX-641 chemistry in satisfying the stated test objectives.

  10. Quality of dry chemistry testing.

    PubMed

    Nakamura, H; Tatsumi, N

    1999-01-01

    Since the development of qualitative test paper for urine in the 1950s, several kinds of dry-state reagents and their automated analyzers have been developed. The term "dry chemistry" came into use after the report, at the end of the 1960s, on the development of a quantitative test paper for serum bilirubin read with a reflectometer, and dry chemistry has been known worldwide since Eastman Kodak Co. presented its multilayer film reagent for serum biochemical analytes at the 10th IFCC Meeting at the end of the 1970s. We have reported the test menu, results in external quality assessment, merits and demerits, and the future possibilities of dry chemistry.

  11. 21 CFR 862.1560 - Urinary phenylketones (nonquantitative) test system.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1560 Urinary phenylketones (nonquantitative) test system. (a) Identification. A...

  12. Procedures to develop a computerized adaptive test to assess patient-reported physical functioning.

    PubMed

    McCabe, Erin; Gross, Douglas P; Bulut, Okan

    2018-06-07

    The purpose of this paper is to demonstrate the procedures to develop and implement a computerized adaptive patient-reported outcome (PRO) measure using secondary analysis of a dataset and items from fixed-format legacy measures. We conducted secondary analysis of a dataset of responses from 1429 persons with work-related lower extremity impairment. We calibrated three measures of physical functioning on the same metric, based on item response theory (IRT). We evaluated efficiency and measurement precision of various computerized adaptive test (CAT) designs using computer simulations. IRT and confirmatory factor analyses support combining the items from the three scales for a CAT item bank of 31 items. The item parameters for IRT were calculated using the generalized partial credit model. CAT simulations show that reducing the test length from the full 31 items to a maximum test length of 8 or 20 items is possible without a significant loss of information (95% and 99% correlation with legacy measure scores, respectively). We demonstrated feasibility and efficiency of using CAT for PRO measurement of physical functioning. The procedures we outlined are straightforward, and can be applied to other PRO measures. Additionally, we have included all the information necessary to implement the CAT of physical functioning in the electronic supplementary material of this paper.

  13. Survey Development to Assess College Students' Perceptions of the Campus Environment.

    PubMed

    Sowers, Morgan F; Colby, Sarah; Greene, Geoffrey W; Pickett, Mackenzie; Franzen-Castle, Lisa; Olfert, Melissa D; Shelnutt, Karla; Brown, Onikia; Horacek, Tanya M; Kidd, Tandalayo; Kattelmann, Kendra K; White, Adrienne A; Zhou, Wenjun; Riggsbee, Kristin; Yan, Wangcheng; Byrd-Bredbenner, Carol

    2017-11-01

    We developed and tested a College Environmental Perceptions Survey (CEPS) to assess college students' perceptions of the healthfulness of their campus. CEPS was developed in 3 stages: questionnaire development, validity testing, and reliability testing. Questionnaire development was based on an extensive literature review and input from an expert panel to establish content validity. Face validity was established with the target population using cognitive interviews with 100 college students. Concurrent-criterion validity was established with in-depth interviews (N = 30) of college students compared to surveys completed by the same 30 students. Surveys completed by college students from 8 universities (N = 1147) were used to test internal structure (factor analysis) and internal consistency (Cronbach's alpha). After development and testing, 15 items remained from the original 48 items. A 5-factor solution emerged: physical activity (4 items, α = .635), water (3 items, α = .773), vending (2 items, α = .680), healthy food (2 items, α = .631), and policy (2 items, α = .573). The mean total score for all universities was 62.71 (±11.16) on a 100-point scale. CEPS appears to be a valid and reliable tool for assessing college students' perceptions of their health-related campus environment.

  14. Implicit and explicit forgetting: when is gist remembered?

    PubMed

    Dorfman, J; Mandler, G

    1994-08-01

    Recognition (YES/NO) and stem completion (cued: complete with a word from the list; and uncued: complete with the first word that comes to mind) were tested following either semantic or non-semantic processing of a categorized input list. Item/instance information was tested by contrasting target items from the input list with new items that were categorically related to them; gist/categorical information was tested by comparing target items semantically related to the input items with unrelated new items. For both recognition and stem completion, regardless of initial processing condition, item information decayed rapidly over a period of one week. Gist information was maintained over the same period when initial processing was semantic but only in the cued condition for completion. These results are discussed in terms of dual process theory, which postulates activation/integration of a representation as primarily relevant to implicit item information and elaboration of a representation as mainly relevant to semantic (i.e. categorical) information.

  15. Incidental retrieval-induced forgetting of location information.

    PubMed

    Gómez-Ariza, Carlos J; Fernandez, Angel; Bajo, M Teresa

    2012-06-01

    Retrieval-induced forgetting (RIF) has been studied with different types of tests and materials. However, RIF has always been tested on the items' central features, and there is no information on whether inhibition also extends to peripheral features of the events in which the items are embedded. In two experiments, we specifically tested the presence of RIF in a task in which recall of peripheral information was required. After a standard retrieval practice task oriented to item identity, participants were cued with colors (Exp. 1) or with the items themselves (Exp. 2) and asked to recall the screen locations where the items had been displayed during the study phase. RIF for locations was observed after retrieval practice, an effect that was not present when participants were asked to read instead of retrieving the items. Our findings provide evidence that peripheral location information associated with an item during study can be also inhibited when the retrieval conditions promote the inhibition of more central, item identity information.

  16. Computerized Adaptive Testing with Item Clones. Research Report.

    ERIC Educational Resources Information Center

    Glas, Cees A. W.; van der Linden, Wim J.

    To reduce the cost of item writing and to enhance the flexibility of item presentation, items can be generated by item-cloning techniques. An important consequence of cloning is that it may cause variability on the item parameters. Therefore, a multilevel item response model is presented in which it is assumed that the item parameters of a…

  17. 21 CFR 862.1660 - Quality control material (assayed and unassayed).

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry... control material (assayed and unassayed) for clinical chemistry is a device intended for medical purposes for use in a test system to estimate test precision and to detect systematic analytical deviations...

  18. 21 CFR 862.1660 - Quality control material (assayed and unassayed).

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry... control material (assayed and unassayed) for clinical chemistry is a device intended for medical purposes for use in a test system to estimate test precision and to detect systematic analytical deviations...

  19. 21 CFR 862.1660 - Quality control material (assayed and unassayed).

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry... control material (assayed and unassayed) for clinical chemistry is a device intended for medical purposes for use in a test system to estimate test precision and to detect systematic analytical deviations...

  20. International field testing of the psychometric properties of an EORTC quality of life module for oral health: the EORTC QLQ-OH15.

    PubMed

    Hjermstad, Marianne J; Bergenmar, Mia; Bjordal, Kristin; Fisher, Sheila E; Hofmeister, Dirk; Montel, Sébastien; Nicolatou-Galitis, Ourania; Pinto, Monica; Raber-Durlacher, Judith; Singer, Susanne; Tomaszewska, Iwona M; Tomaszewski, Krzysztof A; Verdonck-de Leeuw, Irma; Yarom, Noam; Winstanley, Julie B; Herlofson, Bente B

    2016-09-01

    This international EORTC validation study (phase IV) is aimed at testing the psychometric properties of a quality of life (QoL) module related to oral health problems in cancer patients. The phase III module comprised 17 items with four hypothesized multi-item scales and three single items. In phase IV, patients with mixed cancers, in different treatment phases, from 10 countries completed the EORTC QLQ-C30, the QLQ-OH module, and a debriefing interview. The hypothesized structure was tested using combinations of classical test theory and item response theory, following EORTC guidelines. Test-retest assessments and responsiveness to change analysis (RCA) were performed after 2 weeks. Five hundred seventy-two patients (median age 60.3, 54% female) were analyzed. Completion took <10 min for 84%; 40% expressed satisfaction that these issues were addressed. Analyses suggested a revision of the phase III hypothesized scale structure. Two items were deleted based on a high degree of item misfit, together with negative patient feedback. The remaining 15 items formed one eight-item scale named OH-QoL score, a two-item information scale, a two-item scale regarding dentures, and three single items (sticky saliva/mouth soreness/sensitivity to food/drink). Face and convergent validity and internal consistency were confirmed. Test-retest reliability (n = 60) was demonstrated as was RCA for patients undergoing chemotherapy (n = 117; p = 0.06). The resulting QLQ-OH15 discriminated between clinically distinct patient groups, e.g., low performance status vs. higher (p < 0.001), and head-and-neck cancer versus other cancers (p < 0.03). The EORTC module QLQ-OH15 is a short, well-accepted assessment tool focusing on oral problems and QoL to improve clinical management. ClinicalTrials.gov Identifier: NCT01724333.

  1. Item Selection and Pre-equating with Empirical Item Characteristic Curves.

    ERIC Educational Resources Information Center

    Livingston, Samuel A.

    An empirical item characteristic curve shows the probability of a correct response as a function of the student's total test score. These curves can be estimated from large-scale pretest data. They enable test developers to select items that discriminate well in the score region where decisions are made. A similar set of curves can be used to…
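
    The definition in this abstract (probability of a correct response as a function of total test score) can be sketched as a simple tabulation. The function name and the toy response matrix are assumptions made for illustration; real pretest data would be far larger, and any smoothing the author may apply is not shown.

```python
from collections import defaultdict


def empirical_icc(responses, item):
    """Proportion correct on one item at each observed total score.

    responses: list of examinee rows, each a list of 0/1 item scores.
    item: column index of the item of interest.
    Returns a dict {total_score: proportion_correct}.
    """
    counts = defaultdict(lambda: [0, 0])  # total score -> [n correct, n examinees]
    for row in responses:
        total = sum(row)
        counts[total][0] += row[item]
        counts[total][1] += 1
    return {score: c / n for score, (c, n) in sorted(counts.items())}


# Toy 3-item test with six examinees, invented for this sketch.
responses = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
]
curve = empirical_icc(responses, item=0)
```

    A curve that rises steeply around the cut score marks an item that discriminates well in the region where decisions are made.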

  2. Computerized Adaptive Testing for Polytomous Motivation Items: Administration Mode Effects and a Comparison with Short Forms

    ERIC Educational Resources Information Center

    Hol, A. Michiel; Vorst, Harrie C. M.; Mellenbergh, Gideon J.

    2007-01-01

    In a randomized experiment (n = 515), a computerized and a computerized adaptive test (CAT) are compared. The item pool consists of 24 polytomous motivation items. Although items are carefully selected, calibration data show that Samejima's graded response model did not fit the data optimally. A simulation study is done to assess possible…

  3. The Effect of Error in Item Parameter Estimates on the Test Response Function Method of Linking.

    ERIC Educational Resources Information Center

    Kaskowitz, Gary S.; De Ayala, R. J.

    2001-01-01

    Studied the effect of item parameter estimation for computation of linking coefficients for the test response function (TRF) linking/equating method. Simulation results showed that linking was more accurate when there was less error in the parameter estimates, and that 15 or 25 common items provided better results than 5 common items under both…

  4. Easy and Informative: Using Confidence-Weighted True-False Items for Knowledge Tests in Psychology Courses

    ERIC Educational Resources Information Center

    Dutke, Stephan; Barenberg, Jonathan

    2015-01-01

    We introduce a specific type of item for knowledge tests, confidence-weighted true-false (CTF) items, and review experiences of its application in psychology courses. A CTF item is a statement about the learning content to which students respond whether the statement is true or false, and they rate their confidence level. Previous studies using…

  5. Wisconsin Title I Migrant Education. Section 143 Project: Development of an Item Bank. Summary Report.

    ERIC Educational Resources Information Center

    Brown, Frank N.; And Others

    The successful Wisconsin Title I project item bank offers a valid, flexible, and efficient means of providing migrant student tests in reading and mathematics tailored to instructor curricula. The item bank system consists of nine PASCAL computer programs which maintain, search, and select from approximately 1,000 test items stored on floppy disks…

  6. Water chemistry of the secondary circuit at a nuclear power station with a VVER power reactor

    NASA Astrophysics Data System (ADS)

    Tyapkov, V. F.; Erpyleva, S. F.

    2017-05-01

    Results of implementation of the secondary circuit organic amine water chemistry at Russian nuclear power plant (NPP) with VVER-1000 reactors are presented. The requirements for improving the reliability, safety, and efficiency of NPPs and for prolonging the service life of main equipment items necessitate the implementation of new technologies, such as new water chemistries. Data are analyzed on the chemical control of power unit coolant for quality after the changeover to operation with the feed of higher amines, such as morpholine and ethanolamine. Power units having equipment containing copper alloy components were converted from the all-volatile water chemistry to the ethanolamine or morpholine water chemistry with no increase in pH of the steam generator feedwater. This enables the iron content in the steam generator feedwater to be decreased from 6-12 to 2.0-2.5 μg/dm³. It is demonstrated that pH of high-temperature water is among the basic factors controlling erosion and corrosion wear of the piping and the ingress of corrosion products into NPP steam generators. For NPP power units having equipment whose construction material does not include copper alloys, the water chemistries with elevated pH of the secondary coolant are adopted. Stable dosing of correction chemicals at these power units maintains pH25 of 9.5 to 9.7 in the steam generator feedwater with a maximum iron content of 2 μg/dm³.

  7. Development of an Item Bank for the Assessment of Knowledge on Biology in Argentine University Students.

    PubMed

    Cupani, Marcos; Zamparella, Tatiana Castro; Piumatti, Gisella; Vinculado, Grupo

    The calibration of item banks provides the basis for computerized adaptive testing that ensures high diagnostic precision and minimizes participants' test burden. This study aims to develop a bank of items to measure the level of Knowledge on Biology using the Rasch model. The sample consisted of 1219 participants that studied in different faculties of the National University of Cordoba (mean age = 21.85 years, SD = 4.66; 66.9% are women). The items were organized in different forms and into separate subtests, with some common items across subtests. The students were told they had to answer 60 questions of knowledge on biology. Evaluation of Rasch model fit (Zstd >|2.0|), differential item functioning, dimensionality, local independence, item and person separation (>2.0), and reliability (>.80) resulted in a bank of 180 items with good psychometric properties. The bank provides items with a wide range of content coverage and may serve as a sound basis for computerized adaptive testing applications. The contribution of this work is significant in the field of educational assessment in Argentina.

  8. 21 CFR 862.1245 - Dehydroepiandrosterone (free and sulfate) test system.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1245 Dehydroepiandrosterone (free and sulfate) test system. (a) Identification...

  9. 21 CFR 862.1205 - Cortisol (hydrocortisone and hydroxycorticosterone) test system.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1205 Cortisol (hydrocortisone and hydroxycorticosterone) test system...

  10. 21 CFR 862.1245 - Dehydroepiandrosterone (free and sulfate) test system.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1245 Dehydroepiandrosterone (free and sulfate) test system. (a) Identification...

  11. 21 CFR 862.1205 - Cortisol (hydrocortisone and hydroxycorticosterone) test system.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1205 Cortisol (hydrocortisone and hydroxycorticosterone) test system...

  12. 21 CFR 862.1245 - Dehydroepiandrosterone (free and sulfate) test system.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1245 Dehydroepiandrosterone (free and sulfate) test system. (a) Identification...

  13. 21 CFR 862.1205 - Cortisol (hydrocortisone and hydroxycorticosterone) test system.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1205 Cortisol (hydrocortisone and hydroxycorticosterone) test system...

  14. 21 CFR 862.1245 - Dehydroepiandrosterone (free and sulfate) test system.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1245 Dehydroepiandrosterone (free and sulfate) test system. (a) Identification...

  15. 21 CFR 862.1245 - Dehydroepiandrosterone (free and sulfate) test system.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1245 Dehydroepiandrosterone (free and sulfate) test system. (a) Identification...

  16. 21 CFR 862.1205 - Cortisol (hydrocortisone and hydroxycorticosterone) test system.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1205 Cortisol (hydrocortisone and hydroxycorticosterone) test system...

  17. Strategies for Controlling Item Exposure in Computerized Adaptive Testing with the Generalized Partial Credit Model

    ERIC Educational Resources Information Center

    Davis, Laurie Laughlin

    2004-01-01

    Choosing a strategy for controlling item exposure has become an integral part of test development for computerized adaptive testing (CAT). This study investigated the performance of six procedures for controlling item exposure in a series of simulated CATs under the generalized partial credit model. In addition to a no-exposure control baseline…

  18. Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test

    ERIC Educational Resources Information Center

    Lee, Yi-Hsuan; Zhang, Jinming

    2017-01-01

    Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…

  19. Application of Computerized Adaptive Testing to Entrance Examination for Graduate Studies in Turkey

    ERIC Educational Resources Information Center

    Bulut, Okan; Kan, Adnan

    2012-01-01

    Problem Statement: Computerized adaptive testing (CAT) is a sophisticated and efficient way of delivering examinations. In CAT, items for each examinee are selected from an item bank based on the examinee's responses to the items. In this way, the difficulty level of the test is adjusted based on the examinee's ability level. Instead of…

  20. Implementing Sympson-Hetter Item-Exposure Control in a Shadow-Test Approach to Constrained Adaptive Testing

    ERIC Educational Resources Information Center

    Veldkamp, Bernard P.; van der Linden, Wim J.

    2008-01-01

    In most operational computerized adaptive testing (CAT) programs, the Sympson-Hetter (SH) method is used to control the exposure of the items. Several modifications and improvements of the original method have been proposed. The Stocking and Lewis (1998) version of the method uses a multinomial experiment to select items. For severely constrained…

  1. Rasch Based Analysis of Oral Proficiency Test Data.

    ERIC Educational Resources Information Center

    Nakamura, Yuji

    2001-01-01

    This paper examines the rating scale data of oral proficiency tests analyzed by a Rasch Analysis focusing on an item map and factor analysis. In discussing the item map, the difficulty order of six items and students' answering patterns are analyzed using descriptive statistics and measures of central tendency of test scores. The data ranks the…

  2. An Approach to Scoring and Equating Tests with Binary Items: Piloting With Large-Scale Assessments

    ERIC Educational Resources Information Center

    Dimitrov, Dimiter M.

    2016-01-01

    This article describes an approach to test scoring, referred to as "delta scoring" (D-scoring), for tests with dichotomously scored items. The D-scoring uses information from item response theory (IRT) calibration to facilitate computations and interpretations in the context of large-scale assessments. The D-score is computed from the…

  3. Generalization of the Lord-Wingersky Algorithm to Computing the Distribution of Summed Test Scores Based on Real-Number Item Scores

    ERIC Educational Resources Information Center

    Kim, Seonghoon

    2013-01-01

    With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
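
    For reference, the original Lord-Wingersky recursion for dichotomous items, which this article generalizes, can be sketched as follows. The item probabilities are supplied directly here as an assumption; in practice they would come from an IRT model evaluated at a fixed proficiency. The generalization to real-number item scores is not shown.

```python
def lord_wingersky(probs):
    """Conditional distribution of the number-correct score.

    probs: probability of a correct response to each item at a fixed
    proficiency. Returns a list f where f[x] is the probability of
    exactly x correct responses.
    """
    dist = [1.0]  # before any item, score 0 with probability 1
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for x, f in enumerate(dist):
            new[x] += f * (1 - p)   # item answered incorrectly
            new[x + 1] += f * p     # item answered correctly
        dist = new
    return dist


# Two items, each with a 0.5 chance of success (illustrative values).
two_item_dist = lord_wingersky([0.5, 0.5])
```

    Each pass adds one item, so the cost grows with the square of the test length rather than with the number of response patterns.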

  4. Optimizing the Use of Response Times for Item Selection in Computerized Adaptive Testing

    ERIC Educational Resources Information Center

    Choe, Edison M.; Kern, Justin L.; Chang, Hua-Hua

    2018-01-01

    Despite common operationalization, measurement efficiency of computerized adaptive testing should not only be assessed in terms of the number of items administered but also the time it takes to complete the test. To this end, a recent study introduced a novel item selection criterion that maximizes Fisher information per unit of expected response…

  5. Estimating the Reliability of a Test Battery Composite or a Test Score Based on Weighted Item Scoring

    ERIC Educational Resources Information Center

    Feldt, Leonard S.

    2004-01-01

    In some settings, the validity of a battery composite or a test score is enhanced by weighting some parts or items more heavily than others in the total score. This article describes methods of estimating the total score reliability coefficient when differential weights are used with items or parts.

  6. Science or Reading: What Is Being Measured by Standardized Tests?

    ERIC Educational Resources Information Center

    Visone, Jeremy D.

    2010-01-01

    This study examined reading issues associated with a standardized science test. Grade 11 students in Connecticut were shown released science test items and asked about the reading issues associated with the items. Findings suggested that students varied in their understanding of the nature of the items and in their ability to read for detail. The…

  7. Applications of NLP Techniques to Computer-Assisted Authoring of Test Items for Elementary Chinese

    ERIC Educational Resources Information Center

    Liu, Chao-Lin; Lin, Jen-Hsiang; Wang, Yu-Chun

    2010-01-01

    The authors report an implemented environment for computer-assisted authoring of test items and provide a brief discussion about the applications of NLP techniques for computer assisted language learning. Test items can serve as a tool for language learners to examine their competence in the target language. The authors apply techniques for…

  8. Construction and Analysis of Educational Tests Using Abductive Machine Learning

    ERIC Educational Resources Information Center

    El-Alfy, El-Sayed M.; Abdel-Aal, Radwan E.

    2008-01-01

    Recent advances in educational technologies and the wide-spread use of computers in schools have fueled innovations in test construction and analysis. As the measurement accuracy of a test depends on the quality of the items it includes, item selection procedures play a central role in this process. Mathematical programming and the item response…

  9. Role of Cognitive Testing in the Development of the CAHPS® Hospital Survey

    PubMed Central

    Levine, Roger E; Fowler, Floyd J; Brown, Julie A

    2005-01-01

    Objective To describe how cognitive testing results were used to inform the modification and selection of items for the Consumer Assessment of Health Providers and Systems (CAHPS®) Hospital Survey pilot test instrument. Data Sources Cognitive interviews were conducted on 31 subjects in two rounds of testing: in December 2002–January 2003 and in February 2003. In both rounds, interviews were conducted in northern California, southern California, Massachusetts, and North Carolina. Study Design A common protocol served as the basis for cognitive testing activities in each round. This protocol was modified to enable testing of the items as interviewer-administered and self-administered items and to allow members of each of three research teams to use their preferred cognitive research tools. Data Collection/Extraction Methods Each research team independently summarized, documented, and reported their findings. Item-specific and general issues were noted. The results were reviewed and discussed by senior staff from each research team after each round of testing, to inform the acceptance, modification, or elimination of candidate items. Principal Findings Many candidate items required modification because respondents lacked the information required to answer them, respondents failed to understand them consistently, the items were not measuring the constructs they were intended to measure, the items were based on erroneous assumptions about what respondents wanted or experienced during their hospitalization, or the items were asking respondents to make distinctions that were too fine for them to make. Cognitive interviewing enabled the detection of these problems; an understanding of the etiology of the problem informed item revisions. However, for some constructs, the revisions proved to be inadequate. Accordingly, items could not be developed to provide acceptable measures of certain constructs such as shared decision making, coordination of care, and delays in the admissions process. Conclusions Cognitive testing is the most direct way of finding out whether respondents understand questions consistently, have the information needed to answer the questions, and can use the response alternatives provided to describe their experiences or their opinions accurately. Many of the candidate questions failed to meet these standards. Cognitive testing only evaluates the way in which respondents understand and answer questions. Although it does not directly assess the validity of the answers, it is a reasonable premise that cognitive problems will seriously compromise validity and reliability. PMID:16316437

  10. Clinical utility of a single-item test for DSM-5 alcohol use disorder among outpatients with anxiety and depressive disorders.

    PubMed

    Bartoli, Francesco; Crocamo, Cristina; Biagi, Enrico; Di Carlo, Francesco; Parma, Francesca; Madeddu, Fabio; Capuzzi, Enrico; Colmegna, Fabrizia; Clerici, Massimo; Carrà, Giuseppe

    2016-08-01

    There is a lack of studies testing accuracy of fast screening methods for alcohol use disorder in mental health settings. We aimed at estimating clinical utility of a standard single-item test for case finding and screening of DSM-5 alcohol use disorder among individuals suffering from anxiety and mood disorders. We recruited adults consecutively referred, in a 12-month period, to an outpatient clinic for anxiety and depressive disorders. We assessed the National Institute on Alcohol Abuse and Alcoholism (NIAAA) single-item test, using the Mini-International Neuropsychiatric Interview (MINI), plus an additional item of Composite International Diagnostic Interview (CIDI) for craving, as reference standard to diagnose a current DSM-5 alcohol use disorder. We estimated sensitivity and specificity of the single-item test, as well as positive and negative Clinical Utility Indexes (CUIs). 242 subjects with anxiety and mood disorders were included. The NIAAA single-item test showed high sensitivity (91.9%) and specificity (91.2%) for DSM-5 alcohol use disorder. The positive CUI was 0.601, whereas the negative one was 0.898, with excellent values also accounting for main individual characteristics (age, gender, diagnosis, psychological distress levels, smoking status). Testing for relevant indexes, we found an excellent clinical utility of the NIAAA single-item test for screening true negative cases. Our findings support a routine use of reliable methods for rapid screening in similar mental health settings. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
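
The indexes named in this abstract can be illustrated with a short calculation. The sketch assumes the commonly used definitions of the Clinical Utility Index (positive CUI = sensitivity × positive predictive value; negative CUI = specificity × negative predictive value). The 2×2 counts below are hypothetical: the abstract does not report them, and they were chosen only because they sum to 242 and are consistent with the reported percentages.

```python
def screening_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, and Clinical Utility Indexes from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)  # positive predictive value
    npv = tn / (tn + fn)  # negative predictive value
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "cui_positive": sensitivity * ppv,  # case-finding utility
        "cui_negative": specificity * npv,  # screening (rule-out) utility
    }

# Hypothetical counts (not from the abstract), n = 34 + 3 + 18 + 187 = 242.
metrics = screening_metrics(tp=34, fn=3, fp=18, tn=187)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts, the negative index is much higher than the positive one because false positives dilute the positive predictive value at a low prevalence, which matches the abstract's emphasis on screening true negative cases.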

  11. Evaluating the validity of the Work Role Functioning Questionnaire (Canadian French version) using classical test theory and item response theory.

    PubMed

    Hong, Quan Nha; Coutu, Marie-France; Berbiche, Djamal

    2017-01-01

    The Work Role Functioning Questionnaire (WRFQ) was developed to assess workers' perceived ability to perform job demands and is used to monitor presenteeism. Few studies on its validity can be found in the literature. The purpose of this study was to assess the items and factorial composition of the Canadian French version of the WRFQ (WRFQ-CF). Two measurement approaches were used to test the WRFQ-CF: Classical Test Theory (CTT) and non-parametric Item Response Theory (IRT). A total of 352 completed questionnaires were analyzed. Four-factor and three-factor models were tested and showed good fit with 14 items (Root Mean Square Error of Approximation (RMSEA) = 0.06, Standardized Root Mean Square Residual (SRMR) = 0.04, Bentler Comparative Fit Index (CFI) = 0.98) and with 17 items (RMSEA = 0.059, SRMR = 0.048, CFI = 0.98), respectively. Using IRT, 13 problematic items were identified, of which 9 were common with CTT. This study tested different models, with fewer problematic items found in a three-factor model. Using non-parametric IRT and CTT for item purification gave complementary results. IRT is still scarcely used and can be an interesting alternative method to enhance the quality of a measurement instrument. More studies are needed on the WRFQ-CF to refine its items and factorial composition.

  12. Separating "Rotators" from "Nonrotators" in the Mental Rotations Test: A Multigroup Latent Class Analysis

    ERIC Educational Resources Information Center

    Geiser, Christian; Lehmann, Wolfgang; Eid, Michael

    2006-01-01

    Items of mental rotation tests can not only be solved by mental rotation but also by other solution strategies. A multigroup latent class analysis of 24 items of the Mental Rotations Test (MRT) was conducted in a sample of 1,695 German pupils and students to find out how many solution strategies can be identified for the items of this test. The…

  13. A review of guidelines on home drug testing web sites for parents.

    PubMed

    Washio, Yukiko; Fairfax-Columbo, Jaymes; Ball, Emily; Cassey, Heather; Arria, Amelia M; Bresani, Elena; Curtis, Brenda L; Kirby, Kimberly C

    2014-01-01

    To update and extend prior work reviewing Web sites that discuss home drug testing for parents, and assess the quality of information that the Web sites provide, to assist them in deciding when and how to use home drug testing. We conducted a worldwide Web search that identified 8 Web sites providing information for parents on home drug testing. We assessed the information on the sites using a checklist developed with field experts in adolescent substance abuse and psychosocial interventions that focus on urine testing. None of the Web sites covered all the items on the 24-item checklist, and only 3 covered at least half of the items (12, 14, and 21 items, respectively). The remaining 5 Web sites covered less than half of the checklist items. The mean number of items covered by the Web sites was 11. Among the Web sites that we reviewed, few provided thorough information to parents regarding empirically supported strategies to effectively use drug testing to intervene on adolescent substance use. Furthermore, most Web sites did not provide thorough information regarding the risks and benefits to inform parents' decision to use home drug testing. Empirical evidence regarding efficacy, benefits, risks, and limitations of home drug testing is needed.

  14. Validity and Reliability of the 8-Item Work Limitations Questionnaire.

    PubMed

    Walker, Timothy J; Tullar, Jessica M; Diamond, Pamela M; Kohl, Harold W; Amick, Benjamin C

    2017-12-01

    Purpose To evaluate factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system. Methods A secondary analysis using de-identified data from employees who completed an annual Health Assessment between the years 2009-2015 tested research aims. Confirmatory factor analysis (CFA) (n = 10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations between the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and demographic variables for discriminant validity (gender and institution type). Results A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68-0.70) and the test-retest reliability was very good (ICC = 0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables while weak associations were observed between the demographic variables. Conclusions The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.

  15. 21 CFR 862.1163 - Cardiac allograft gene expression profiling test system.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1163 Cardiac allograft gene expression profiling test system. (a...

  16. HIV Antibody Test

    MedlinePlus

    ... Free Fetal DNA Cerebrospinal Fluid (CSF) Analysis Ceruloplasmin Chemistry Panels Chickenpox and Shingles Tests Chlamydia Testing Chloride ... D. R., Editors (2006). Contemporary Practice in Clinical Chemistry, AACC Press, Washington, DC. Pp. 487-490. Bennett, ...

  17. 21 CFR 862.1187 - Conjugated sulfolithocholic acid (SLCG) test system.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1187 Conjugated sulfolithocholic acid (SLCG) test system. (a) Identification. A...

  18. Lipase Test

    MedlinePlus

    ... Free Fetal DNA Cerebrospinal Fluid (CSF) Analysis Ceruloplasmin Chemistry Panels Chickenpox and Shingles Tests Chlamydia Testing Chloride ... D. R., Editors (© 2006). Contemporary Practice in Clinical Chemistry: AACC Press, Washington, DC. Pp 281-287. Wu, ...

  19. CK-MB Test

    MedlinePlus

    ... Free Fetal DNA Cerebrospinal Fluid (CSF) Analysis Ceruloplasmin Chemistry Panels Chickenpox and Shingles Tests Chlamydia Testing Chloride ... Missouri. Pp. 312-315. Tietz Textbook of Clinical Chemistry and Molecular Diagnostics. Burtis CA, Ashwood ER and ...

  20. 21 CFR 862.1645 - Urinary protein or albumin (nonquantitative) test system.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1645 Urinary protein or albumin (nonquantitative) test system. (a...

  1. 21 CFR 862.1187 - Conjugated sulfolithocholic acid (SLCG) test system.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1187 Conjugated sulfolithocholic acid (SLCG) test system. (a) Identification. A...

  2. 21 CFR 862.1187 - Conjugated sulfolithocholic acid (SLCG) test system.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1187 Conjugated sulfolithocholic acid (SLCG) test system. (a) Identification. A...

  3. 21 CFR 862.1163 - Cardiac allograft gene expression profiling test system.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1163 Cardiac allograft gene expression profiling test system. (a...

  4. 21 CFR 862.1360 - Gamma-glutamyl transpeptidase and isoenzymes test system.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1360 Gamma-glutamyl transpeptidase and isoenzymes test system. (a...

  5. 21 CFR 862.1187 - Conjugated sulfolithocholic acid (SLCG) test system.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1187 Conjugated sulfolithocholic acid (SLCG) test system. (a) Identification. A...

  6. 21 CFR 862.1163 - Cardiac allograft gene expression profiling test system.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1163 Cardiac allograft gene expression profiling test system. (a...

  7. 21 CFR 862.1163 - Cardiac allograft gene expression profiling test system.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1163 Cardiac allograft gene expression profiling test system. (a...

  8. 21 CFR 862.1360 - Gamma-glutamyl transpeptidase and isoenzymes test system.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1360 Gamma-glutamyl transpeptidase and isoenzymes test system. (a...

  9. 21 CFR 862.1187 - Conjugated sulfolithocholic acid (SLCG) test system.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1187 Conjugated sulfolithocholic acid (SLCG) test system. (a) Identification. A...

  10. A Procedure to Detect Item Bias Present Simultaneously in Several Items

    DTIC Science & Technology

    1991-04-25

    exhibit a coherent and major biasing influence at the test level. In particular, this can be true even if each individual item displays only a minor ... response functions (IRFs) without the use of item parameter estimation algorithms when the sample size is too small for their use. Thissen, Steinberg ... convention). A random sample of examinees is drawn from each group, and a test of N items is administered to them. Typically it is suspected that a

  11. Evaluating innovative items for the NCLEX, part I: usability and pilot testing.

    PubMed

    Wendt, Anne; Harmes, J Christine

    2009-01-01

    National Council of State Boards of Nursing (NCSBN) has recently conducted preliminary research on the feasibility of including various types of innovative test questions (items) on the NCLEX. This article focuses on the participants' reactions to and their strategies for interacting with various types of innovative items. Part 2 in the May/June issue will focus on the innovative item templates and evaluation of the statistical characteristics and the level of cognitive processing required to answer the examination items.

  12. Chemistry inside an epistemological community box! Discursive exclusions and inclusions in Swedish National tests in Chemistry

    NASA Astrophysics Data System (ADS)

    Ståhl, Marie; Hussénius, Anita

    2017-06-01

    This study examined the Swedish national tests in chemistry for implicit and explicit values. The chemistry subject is understudied compared to biology and physics and students view chemistry as their least interesting science subject. The Swedish national science assessments aim to support equitable and fair evaluation of students, to concretize the goals in the chemistry syllabus and to increase student achievement. Discourse and multimodal analyses, based on feminist and critical didactic theories, were used to examine the test's norms and values. The results revealed that the chemistry discourse presented in the tests showed a traditional view of science from the topics discussed (for example, oil and metal), in the way women, men and youth are portrayed, and how their science interests are highlighted or neglected. An elitist view of science emerges from the test, with distinct gender and age biases. Students could interpret these biases as a message that only "the right type" of person may come into the chemistry epistemological community, that is, into this special sociocultural group that harbours a common view about this knowledge. This perspective may have an impact on students' achievement and thereby prevent support for an equitable and fair evaluation. Understanding the underlying evaluative meanings that come with science teaching is a question of democracy since it may affect students' feelings of inclusion or exclusion. The norms and values harboured in the tests will also affect teaching since the teachers are given examples of how the goals in the syllabus can be concretized.

  13. Impact of STS (Context-Based Type of Teaching) in Comparison With a Textbook Approach on Attitudes and Achievement in Community College Chemistry Classrooms

    NASA Astrophysics Data System (ADS)

    Perkins, Gita

    The purpose of this study was to analyze the impact of a context-based teaching approach (STS) versus a more traditional textbook approach on the attitudes and achievement of community college chemistry students. In studying attitudes toward chemistry within this study, I used a 30-item Likert scale in order to study the importance of chemistry in students' lives, the importance of chemistry, the difficulty of chemistry, interest in chemistry, and the usefulness of chemistry for their future career. Though the STS approach students had higher attitude post scores, there was no significant difference between the STS and textbook students' attitude post scores. It was noted that females had higher postattitude scores in the STS group, while males had higher postattitude scores in the textbook group. With regard to postachievement, I noted that males had higher scores in both groups. A correlation existed between postattitude and postachievement in the STS classroom. In summary, while an association between attitude and achievement was found in the STS classroom, teaching approach or sex was not found to influence attitudes, while sex was also not found to influence achievement. These results, overall, suggest that attitudes are not expected to change on the basis of either teaching approach or gender, and that techniques other than changing the teaching approach would need to be used in order to improve the attitudes of students. Qualitative analysis of an online discussion activity on Energy revealed that STS students were able to apply aspects of chemistry in decision making related to socioscientific issues. Additional analysis of interview and written responses provided insight regarding attitudes toward chemistry, with respect to topics of applicability of chemistry to life, difficulties with chemistry, teaching approach for chemistry, and the intent for enrolling in additional chemistry courses. In addition, the surveys of female students brought out subcategories with regard to emotional and professional characteristics of a good teacher, under the category of characteristics of teaching approach. With respect to the category of course experience, subcategories of useful knowledge to solve real-life problems and knowledge for future career were revealed. The differences between the control group females and STS group females with respect to these characteristics were striking and provided insight into how teacher behavior and teaching approach shape attitudes to chemistry in the case of female students.

  14. Validity of Computer Adaptive Tests of Daily Routines for Youth with Spinal Cord Injury

    PubMed Central

    Haley, Stephen M.

    2013-01-01

    Objective: To evaluate the accuracy of computer adaptive tests (CATs) of daily routines for child- and parent-reported outcomes following pediatric spinal cord injury (SCI) and to evaluate the validity of the scales. Methods: One hundred ninety-six daily routine items were administered to 381 youths and 322 parents. Pearson correlations, intraclass correlation coefficients (ICC), and 95% confidence intervals (CI) were calculated to evaluate the accuracy of simulated 5-item, 10-item, and 15-item CATs against the full-item banks and to evaluate concurrent validity. Independent samples t tests and analysis of variance were used to evaluate the ability of the daily routine scales to discriminate between children with tetraplegia and paraplegia and among 5 motor groups. Results: ICC and 95% CI demonstrated that simulated 5-, 10-, and 15-item CATs accurately represented the full-item banks for both child- and parent-report scales. The daily routine scales demonstrated discriminative validity, except between 2 motor groups of children with paraplegia. Concurrent validity of the daily routine scales was demonstrated through significant relationships with the FIM scores. Conclusion: Child- and parent-reported outcomes of daily routines can be obtained using CATs with the same relative precision as a full-item bank. Five-item, 10-item, and 15-item CATs have discriminative and concurrent validity. PMID:23671380

  15. Silent and Vocal Students in a Large Active Learning Chemistry Classroom: Comparison of Performance and Motivational Factors

    ERIC Educational Resources Information Center

    Obenland, Carrie A.; Munson, Ashlyn H.; Hutchinson, John S.

    2013-01-01

    Active learning is becoming more prevalent in large science classrooms, and this study shows the impact on performance of being vocal during Socratic questioning in a General Chemistry course. A total of 800 college students over a two-year period were given a pre- and post-test using the Chemistry Concept Reasoning Test. The pre-test results showed that…

  16. A 67-Item Stress Resilience item bank showing high content validity was developed in a psychosomatic sample.

    PubMed

    Obbarius, Nina; Fischer, Felix; Obbarius, Alexander; Nolte, Sandra; Liegl, Gregor; Rose, Matthias

    2018-04-10

    To develop the first item bank to measure Stress Resilience (SR) in clinical populations. Qualitative item development resulted in an initial pool of 131 items covering a broad theoretical SR concept. These items were tested in n=521 patients at a psychosomatic outpatient clinic. Exploratory and Confirmatory Factor Analysis (CFA), as well as other state-of-the-art item analyses and IRT were used for item evaluation and calibration of the final item bank. Out of the initial item pool of 131 items, we excluded 64 items (54 factor loading <.5, 4 residual correlations >.3, 2 non-discriminative Item Response Curves, 4 Differential Item Functioning). The final set of 67 items indicated sufficient model fit in CFA and IRT analyses. Additionally, a 10-item short form with high measurement precision (SE≤.32 in a theta range between -1.8 and +1.5) was derived. Both the SR item bank and the SR short form were highly correlated with an existing static legacy tool (Connor-Davidson Resilience Scale). The final SR item bank and 10-item short form showed good psychometric properties. When further validated, they will be ready to be used within a framework of Computer-Adaptive Tests for a comprehensive assessment of the Stress-Construct. Copyright © 2018. Published by Elsevier Inc.

  17. Blood Urea Nitrogen Test

    MedlinePlus

    ... Free Fetal DNA Cerebrospinal Fluid (CSF) Analysis Ceruloplasmin Chemistry Panels Chickenpox and Shingles Tests Chlamydia Testing Chloride ... mmol/L 1 from Tietz Textbook of Clinical Chemistry and Molecular Diagnostics. Burtis CA, Ashwood ER, Bruns ...

  18. 21 CFR 862.1315 - Galactose-1-phosphate uridyl transferase test system.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1315 Galactose-1-phosphate uridyl transferase test system. (a) Identification...

  19. 21 CFR 862.1315 - Galactose-1-phosphate uridyl transferase test system.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1315 Galactose-1-phosphate uridyl transferase test system. (a) Identification...

  20. 21 CFR 862.1385 - 17-Hydroxycorticosteroids (17-ketogenic steroids) test system.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES CLINICAL CHEMISTRY AND CLINICAL TOXICOLOGY DEVICES Clinical Chemistry Test Systems § 862.1385 17-Hydroxycorticosteroids (17-ketogenic steroids) test system...
