A Process for Reviewing and Evaluating Generated Test Items
ERIC Educational Resources Information Center
Gierl, Mark J.; Lai, Hollis
2016-01-01
Testing organizations need large numbers of high-quality items due to the proliferation of alternative test administration methods and modern test designs. But the current demand for items far exceeds the supply. Test items, as they are currently written, evoke a process that is both time-consuming and expensive because each item is written,…
ACER Chemistry Test Item Collection. ACER Chemtic Year 12.
ERIC Educational Resources Information Center
Australian Council for Educational Research, Hawthorn.
The chemistry test item bank contains 225 multiple-choice questions suitable for diagnostic and achievement testing; a three-page teacher's guide; answer key with item facilities; an answer sheet; and a 45-item sample achievement test. Although written for the new grade 12 chemistry course in Victoria, Australia, the items are widely applicable.…
Australian Chemistry Test Item Bank: Years 11 & 12. Volume 1.
ERIC Educational Resources Information Center
Commons, C., Ed.; Martin, P., Ed.
Volume 1 of the Australian Chemistry Test Item Bank, consisting of two volumes, contains nearly 2000 multiple-choice items related to the chemistry taught in Year 11 and Year 12 courses in Australia. Items which were written during 1979 and 1980 were initially published in the "ACER Chemistry Test Item Collection" and in the "ACER…
Australian Chemistry Test Item Bank: Years 11 and 12. Volume 2.
ERIC Educational Resources Information Center
Commons, C., Ed.; Martin, P., Ed.
The second volume of the Australian Chemistry Test Item Bank, consisting of two volumes, contains nearly 2000 multiple-choice items related to the chemistry taught in Year 11 and Year 12 courses in Australia. Items which were written during 1979 and 1980 were initially published in the "ACER Chemistry Test Item Collection" and in the…
Assessment of item-writing flaws in multiple-choice questions.
Nedeau-Cayo, Rosemarie; Laughlin, Deborah; Rus, Linda; Hall, John
2013-01-01
This study evaluated the quality of multiple-choice questions used in a hospital's e-learning system. Constructing well-written questions is fraught with difficulty, and item-writing flaws are common. Study results revealed that most items contained flaws and were written at the knowledge/comprehension level. Few items had linked objectives, and no association was found between the presence of objectives and flaws. Recommendations include education for writing test questions.
ERIC Educational Resources Information Center
Snyder, James
2010-01-01
This dissertation research examined the changes in item RIT calibration that occurred when adding audio to a set of currently calibrated RIT items and then placing these new items as field test items in the modified assessments on the NWEA MAP test platform. The researcher used test results from over 600 students in the Poway School District in…
Sex Differences in the Tendency to Omit Items on Multiple-Choice Tests: 1980-2000
ERIC Educational Resources Information Center
von Schrader, Sarah; Ansley, Timothy
2006-01-01
Much has been written concerning the potential group differences in responding to multiple-choice achievement test items. This discussion has included references to possible disparities in tendency to omit such test items. When test scores are used for high-stakes decision making, even small differences in scores and rankings that arise from male…
The Development and Validation of a Formula for Measuring Single-Sentence Test Item Readability.
ERIC Educational Resources Information Center
Homan, Susan; And Others
1994-01-01
A study was conducted with 782 elementary school students to determine whether the Homan-Hewitt Readability Formula could identify the readability of a single-sentence test item. Results indicate that a relationship exists between students' reading grade levels and responses to test items written at higher readability levels. (SLD)
Tarrant, Marie; Knierim, Aimee; Hayes, Sasha K; Ware, James
2006-12-01
Multiple-choice questions are a common assessment method in nursing examinations. Few nurse educators, however, have formal preparation in constructing multiple-choice questions. Consequently, questions used in baccalaureate nursing assessments often contain item-writing flaws, or violations to accepted item-writing guidelines. In one nursing department, 2770 MCQs were collected from tests and examinations administered over a five-year period from 2001 to 2005. Questions were evaluated for 19 frequently occurring item-writing flaws, for cognitive level, for question source, and for the distribution of correct answers. Results show that almost half (46.2%) of the questions contained violations of item-writing guidelines and over 90% were written at low cognitive levels. Only a small proportion of questions were teacher generated (14.1%), while 36.2% were taken from testbanks and almost half (49.4%) had no source identified. MCQs written at a lower cognitive level were significantly more likely to contain item-writing flaws. While there was no relationship between the source of the question and item-writing flaws, teacher-generated questions were more likely to be written at higher cognitive levels (p<0.001). Correct answers were evenly distributed across all four options and no bias was noted in the placement of correct options. Further training in item-writing is recommended for all faculty members who are responsible for developing tests. Pre-test review and quality assessment is also recommended to reduce the occurrence of item-writing flaws and to improve the quality of test questions.
How Often is "Often"? The Use of Imprecise Terms in Exam Items.
ERIC Educational Resources Information Center
Case, Susan M.
This study was designed to gather data on the meaning of imprecise terms from items written by physicians for their students and by test committees for national licensure and certification examinations. A total of 32 members of test committees who write examination items for various medical specialty examinations participated in the study. Each…
ERIC Educational Resources Information Center
Vorstenbosch, Marc A. T. M.; Klaassen, Tim P. F. M.; Kooloos, Jan G. M.; Bolhuis, Sanneke M.; Laan, Roland F. J. M.
2013-01-01
Anatomists often use images in assessments and examinations. This study aims to investigate the influence of different types of images on item difficulty and item discrimination in written assessments. A total of 210 of 460 students volunteered for an extra assessment in a gross anatomy course. This assessment contained 39 test items grouped in…
ERIC Educational Resources Information Center
Muiznieks, Viktors J.; Cox, John
The Computerized Test-Result Reporting System (CTRS), which consists of three programs written in the BASIC language, was developed to analyze objective tests, test items, test results, and to provide the teacher-user with interpreted data about the performance of tests, test items, and students. This paper documents the three programs from the…
Test blueprints for psychiatry residency in-training written examinations in Riyadh, Saudi Arabia
Gaffas, Eisha M; Sequeira, Reginald P; Namla, Riyadh A Al; Al-Harbi, Khalid S
2012-01-01
Background: The postgraduate training program in psychiatry in Saudi Arabia, which was established in 1997, is a 4-year residency program. Written exams comprising multiple choice questions (MCQs) are used as a summative assessment of residents in order to determine their eligibility for promotion from one year to the next. Test blueprints are not used in preparing examinations. Objective: To develop test blueprints for the written examinations used in the psychiatry residency program. Methods: Based on the guidelines of four professional bodies, documentary analysis was used to develop global and detailed test blueprints for each year of the residency program. An expert panel participated during piloting and final modification of the test blueprints. Their opinions about the content, weightage for each content domain, and proportion of test items to be sampled in each cognitive category as defined by modified Bloom's taxonomy were elicited. Results: Eight global and detailed test blueprints, two for each year of the psychiatry residency program, were developed. The global test blueprints were reviewed by experts and piloted. Six experts participated in the final modification of test blueprints. Based on expert consensus, the content, total weightage for each content domain, and proportion of test items to be included in each cognitive category were determined for each global test blueprint. Experts also suggested progressively decreasing the weightage for recall test items and increasing problem solving test items in examinations, from year 1 to year 4 of the psychiatry residency program. Conclusion: A systematic approach using a documentary and content analysis technique was used to develop test blueprints with additional input from an expert panel as appropriate. Test blueprinting is an important step to ensure the test validity in all residency programs. PMID:23762000
Federal Register 2010, 2011, 2012, 2013, 2014
2010-12-30
... surrounding aging-related issues from the National Institute on Aging (NIA). Type of Information Collection... information technology. Direct Comments to OMB: Written comments and/or suggestions regarding the item(s...; Comment Request; Testing Successful Health Communications Surrounding Aging-Related Issues From the...
ERIC Educational Resources Information Center
Cheek, Jimmy G.; McGhee, Max B.
The central purpose of this study was to develop and field test written criterion-referenced tests for the ornamental horticulture component of applied principles of agribusiness and natural resources occupations programs. The test items were to be used by secondary agricultural education students in Florida. Based upon the objectives identified…
ERIC Educational Resources Information Center
Hopley, Ken; And Others
The first of several planned volumes of Free Response Test Items contains geology questions developed by the Assessment and Evaluation Unit of the New South Wales Department of Education. Two additional geology volumes and biology and chemistry volumes are in preparation. The questions in this volume were written and reviewed by practicing…
NASA Astrophysics Data System (ADS)
Beggrow, Elizabeth P.; Ha, Minsu; Nehm, Ross H.; Pearl, Dennis; Boone, William J.
2014-02-01
The landscape of science education is being transformed by the new Framework for Science Education (National Research Council, A framework for K-12 science education: practices, crosscutting concepts, and core ideas. The National Academies Press, Washington, DC, 2012), which emphasizes the centrality of scientific practices—such as explanation, argumentation, and communication—in science teaching, learning, and assessment. A major challenge facing the field of science education is developing assessment tools that are capable of validly and efficiently evaluating these practices. Our study examined the efficacy of a free, open-source machine-learning tool for evaluating the quality of students' written explanations of the causes of evolutionary change relative to three other approaches: (1) human-scored written explanations, (2) a multiple-choice test, and (3) clinical oral interviews. A large sample of undergraduates (n = 104) exposed to varying amounts of evolution content completed all three assessments: a clinical oral interview, a written open-response assessment, and a multiple-choice test. Rasch analysis was used to compute linear person measures and linear item measures on a single logit scale. We found that the multiple-choice test displayed poor person and item fit (mean square outfit >1.3), while both oral interview measures and computer-generated written response measures exhibited acceptable fit (average mean square outfit for interview: person 0.97, item 0.97; computer: person 1.03, item 1.06). Multiple-choice test measures were more weakly associated with interview measures (r = 0.35) than the computer-scored explanation measures (r = 0.63). 
Overall, Rasch analysis indicated that computer-scored written explanation measures (1) have the strongest correspondence to oral interview measures; (2) are capable of capturing students' normative scientific and naive ideas as accurately as human-scored explanations, and (3) more validly detect understanding than the multiple-choice assessment. These findings demonstrate the great potential of machine-learning tools for assessing key scientific practices highlighted in the new Framework for Science Education.
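The outfit mean-square statistic reported in the abstract above can be illustrated with a minimal Python sketch. It assumes the dichotomous Rasch model and the usual definition of outfit as the mean squared standardized residual. The function names and toy data are hypothetical and are not taken from the study, whose abstract does not say which software produced its fit statistics.

```python
from math import exp

def rasch_probability(theta, b):
    """Probability of a correct response under the dichotomous Rasch model.

    theta: person ability in logits; b: item difficulty in logits.
    """
    return 1.0 / (1.0 + exp(-(theta - b)))

def item_outfit(responses, abilities, difficulty):
    """Outfit mean square for one item: mean of squared standardized residuals.

    responses: 0/1 scores, one per person (hypothetical data).
    abilities: person ability estimates in logits, same order as responses.
    """
    squared_residuals = []
    for x, theta in zip(responses, abilities):
        p = rasch_probability(theta, difficulty)
        # Standardized residual squared: (observed - expected)^2 / variance
        squared_residuals.append((x - p) ** 2 / (p * (1.0 - p)))
    return sum(squared_residuals) / len(squared_residuals)

# When ability equals difficulty, p = 0.5 and every squared residual is 1.
well_fitting = item_outfit([1, 0, 1, 0], [0.0, 0.0, 0.0, 0.0], 0.0)

# A low-ability person answering correctly inflates outfit well past 1.3.
surprising = item_outfit([1], [-3.0], 0.0)
```

Values near 1 indicate responses about as noisy as the model expects. The 1.3 cutoff used in the abstract is a convention for flagging misfit, and this sketch takes ability and difficulty estimates as given rather than estimating them.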
ERIC Educational Resources Information Center
Cannon, Joanna E.; Hubley, Anita M.
2014-01-01
Content validation is a crucial, but often neglected, component of good test development. In the present study, content validity evidence was collected to determine the degree to which elements (e.g., grammatical structures, items, picture responses, administration, and scoring instructions) of the Comprehension of Written Grammar (CWG) test are…
Managing a Test Item Bank on a Microcomputer: Can It Help You and Your Students?
ERIC Educational Resources Information Center
Peterson, Julian A.; Meister, Lynn L.
1983-01-01
Describes a test item bank developed by the Association for Medical School Departments of Biochemistry (Texas). Programs (written in Pascal) allow self-evaluation by interactive student access to questions randomly selected from a chosen category. Potential users of the system (having student, manager, and instructor modes) are invited to contact…
An Investigation of "Cloze" Items in the Measurement of Achievement in Foreign Languages.
ERIC Educational Resources Information Center
Carroll, John B.; And Others
This study investigates the feasibility of using cloze procedure test items (in which a student supplies a word, letter, or phrase to fill a gap in a continuous text) for the written College Board foreign language achievement tests. An introduction which defines the problem, traces its history, and presents the overall design of the study is…
ERIC Educational Resources Information Center
Aghakhani, Anoosha; Chan, Eric K.
2007-01-01
In this article, the authors review the Clinical Assessment of Depression (CAD), a 50-item self-report measure of depressive symptoms designed for children, adolescents, adults, and elderly adults from 8 to 79 years of age. Purporting to be sensitive to depressive symptomatology across the lifespan, the test items were written to reflect the…
Park, Jong Cook; Kim, Kwang Sig
2012-03-01
The reliability of a test is determined by each item's characteristics. Item analysis is achieved by classical test theory and item response theory. The purpose of the study was to compare the discrimination indices with item response theory using the Rasch model. Thirty-one 4th-year medical school students participated in the clinical course written examination, which included 22 A-type items and 3 R-type items. Point biserial correlation coefficient (C(pbs)) was compared to method of extreme group (D), biserial correlation coefficient (C(bs)), item-total correlation coefficient (C(it)), and corrected item-total correlation coefficient (C(cit)). Rasch model was applied to estimate item difficulty and examinee's ability and to calculate item fit statistics using joint maximum likelihood. Explanatory power (r²) with respect to C(pbs) decreased in the following order: C(cit) (1.00), C(it) (0.99), C(bs) (0.94), and D (0.45). The ranges of difficulty logit and standard error and ability logit and standard error were -0.82 to 0.80 and 0.37 to 0.76, -3.69 to 3.19 and 0.45 to 1.03, respectively. Items 9 and 23 have outfit ≥ 1.3. Students 1, 5, 7, 18, 26, 30, and 32 have fit ≥ 1.3. C(pbs), C(cit), and C(it) are good discrimination parameters. Rasch model can estimate item difficulty parameter and examinee's ability parameter with standard error. The fit statistics can identify bad items and unpredictable examinee's responses.
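Three of the classical discrimination indices named in the abstract above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: the abstract does not give the formulas, so the code uses the textbook item-total correlation, the corrected item-total correlation (item versus total with that item removed), and an extreme-group index D using upper and lower 27% groups. The 27% fraction, the function names, and the toy response matrix are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def discrimination_indices(responses, item, group_fraction=0.27):
    """Classical discrimination indices for one dichotomous item.

    responses: list of per-examinee lists of 0/1 item scores.
    item: column index of the item to analyze.
    group_fraction: share of examinees in each extreme group (assumed 27%).
    """
    totals = [sum(row) for row in responses]
    scores = [row[item] for row in responses]
    # Total score with the analyzed item removed, for the corrected index.
    rest = [t - s for t, s in zip(totals, scores)]

    k = max(1, round(group_fraction * len(responses)))
    order = sorted(range(len(responses)), key=lambda i: totals[i])
    low, high = order[:k], order[-k:]
    # D = proportion correct in the top group minus the bottom group.
    d = sum(scores[i] for i in high) / k - sum(scores[i] for i in low) / k

    return {
        "C_it": pearson(scores, totals),
        "C_cit": pearson(scores, rest),
        "D": d,
    }

# Hypothetical 6-examinee, 4-item response matrix.
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
result = discrimination_indices(toy, item=0)
```

The corrected index is lower than the uncorrected one on this toy data because removing the item from the total takes away the part of the correlation that comes from the item correlating with itself.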
Assessing Mathematics 4. Problem Solving: The APU Approach.
ERIC Educational Resources Information Center
Foxman, Derek; And Others
1984-01-01
Presented are examples of problem-solving items from practical and written mathematics tests. These tests are part of an English survey designed to assess the mathematics achievement of students aged 11 and 15. (JN)
An Attempt to Influence Selected Portions of Student Learning.
ERIC Educational Resources Information Center
Anderson, Edwin R.
In an attempt to selectively improve student performance, one-half of a set of difficult test items from a FORTRAN programming class had handouts explaining the concepts underlying the items distributed to the students. Each handout contained a written learning objective, a short prose passage explaining the objective, and one or more practice…
NASA Astrophysics Data System (ADS)
Nehm, Ross H.; Ha, Minsu; Mayfield, Elijah
2012-02-01
This study explored the use of machine learning to automatically evaluate the accuracy of students' written explanations of evolutionary change. Performance of the Summarization Integrated Development Environment (SIDE) program was compared to human expert scoring using a corpus of 2,260 evolutionary explanations written by 565 undergraduate students in response to two different evolution instruments (the EGALT-F and EGALT-P) that contained prompts that differed in various surface features (such as species and traits). We tested human-SIDE scoring correspondence under a series of different training and testing conditions, using Kappa inter-rater agreement values of greater than 0.80 as a performance benchmark. In addition, we examined the effects of response length on scoring success; that is, whether SIDE scoring models functioned with comparable success on short and long responses. We found that SIDE performance was most effective when scoring models were built and tested at the individual item level and that performance degraded when suites of items or entire instruments were used to build and test scoring models. Overall, SIDE was found to be a powerful and cost-effective tool for assessing student knowledge and performance in a complex science domain.
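The agreement benchmark in the abstract above can be made concrete with a short Python sketch. It assumes the statistic is Cohen's kappa for two raters (the abstract says only "Kappa"), and the function name and the ten toy scores are hypothetical, not data from the study.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)

# Hypothetical scores: 1 = key concept present in the explanation, 0 = absent.
human_scores = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
machine_scores = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
kappa = cohen_kappa(human_scores, machine_scores)
```

In this toy case the raters agree on 9 of 10 responses, chance agreement is 0.5, and kappa comes out at about 0.80, which sits at the benchmark rather than above it. Raw percent agreement (90%) would overstate how well the two scorers correspond.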
Gibbons, Laura E; McCurry, Susan; Rhoads, Kristoffer; Masaki, Kamal; White, Lon; Borenstein, Amy R; Larson, Eric B; Crane, Paul K
2009-02-01
The Cognitive Abilities Screening Instrument (CASI) was designed for use in cross-cultural studies of Japanese and Japanese-American elderly in Japan and the U.S.A. The measurement equivalence in Japanese and English had not been confirmed in prior studies. We analyzed the 40 CASI items for differential item functioning (DIF) related to test language, as well as self-reported proficiency with written Japanese, age, and educational attainment in two large epidemiologic studies of Japanese-American elderly: the Kame Project (n=1708) and the Honolulu-Asia Aging Study (HAAS; n = 3148). DIF was present if the demographic groups differed in the probability of success on an item, after controlling for their underlying cognitive functioning ability. While seven CASI items had DIF related to language of testing in Kame (registration of one item; recall of one item; similes; judgment; repeating a phrase; reading and performing a command; and following a three-step instruction), the impact of DIF on participants' scores was minimal. Mean scores for Japanese and English speakers in Kame changed by <0.1 SD after accounting for DIF related to test language. In HAAS, insufficient numbers of participants were tested in Japanese to assess DIF related to test language. In both studies, DIF related to written Japanese proficiency, age, and educational attainment had minimal impact. To the extent that DIF could be assessed, the CASI appeared to meet the goal of measuring cognitive function equivalently in Japanese and English. Stratified data collection would be needed to confirm this conclusion. DIF assessment should be used in other studies with multiple language groups to confirm that measures function equivalently or, if not, form scores that account for DIF.
ERIC Educational Resources Information Center
Hinrichs, Roy S., Comp.
Thirty-one lesson plans on electricity-electronics are presented in this guide designed for industrial arts instructors. Each lesson plan is organized into the following format: (1) lesson objective; (2) supplementary teaching items; (3) presentation; (4) demonstration; (5) laboratory or other activities; and (6) test items (oral, written, or…
Need of Knowledge in Nursing and Demand for Knowledge in Nursing Education.
ERIC Educational Resources Information Center
Johansson, Britt
An English summary of a study on nursing education which was written in Swedish is presented. Standards of medical and surgical knowledge required of student nurses were evaluated based on all written test items in medical and surgical nursing set during one year at Swedish schools of nursing. The views of teaching staff and student nurses on…
Dellinges, Mark A; Curtis, Donald A
2017-08-01
Faculty members are expected to write high-quality multiple-choice questions (MCQs) in order to accurately assess dental students' achievement. However, most dental school faculty members are not trained to write MCQs. Extensive faculty development programs have been used to help educators write better test items. The aim of this pilot study was to determine if a short workshop would result in improved MCQ item-writing by dental school faculty at one U.S. dental school. A total of 24 dental school faculty members who had previously written MCQs were randomized into a no-intervention group and an intervention group in 2015. Six previously written MCQs were randomly selected from each of the faculty members and given an item quality score. The intervention group participated in a training session of one-hour duration that focused on reviewing standard item-writing guidelines to improve in-house MCQs. The no-intervention group did not receive any training but did receive encouragement and an explanation of why good MCQ writing was important. The faculty members were then asked to revise their previously written questions, and these were given an item quality score. The item quality scores for each faculty member were averaged, and the difference from pre-training to post-training scores was evaluated. The results showed a significant difference between pre-training and post-training MCQ difference scores for the intervention group (p=0.04). This pilot study provides evidence that the training session of short duration was effective in improving the quality of in-house MCQs.
Social desirability in personality inventories: Symptoms, diagnosis and prescribed cure
Bäckström, Martin; Björklund, Fredrik
2013-01-01
An analysis of social desirability in personality assessment is presented. Starting with the symptoms, Study 1 showed that mean ratings of graded personality items are moderately to strongly linearly related to social desirability (Self Deception, Impression formation, and the first Principal Component), suggesting that item popularity may be a useful heuristic tool for identifying items which elicit socially desirable responding. We diagnose the cause of socially desirable responding as an interaction between the evaluative content of the item and enhancement motivation in the rater. Study 2 introduced a possible cure: evaluative neutralization of items. To test the feasibility of the method, lay psychometricians (undergraduates) reformulated existing personality test items according to written instructions. The new items were indeed lower in social desirability while essentially retaining the five factor structure and reliability of the inventory. We conclude that although neutralization is no miracle cure, it is simple and has beneficial effects. PMID:23252410
Test Theories, Educational Priorities and Reliability of Public Examinations in England
ERIC Educational Resources Information Center
Baird, Jo-Anne; Black, Paul
2013-01-01
Much has already been written on the controversies surrounding the use of different test theories in educational assessment. Other authors have noted the prevalence of classical test theory over item response theory in practice. This Special Issue draws together articles based upon work conducted on the Reliability Programme for England's…
Creating a recollection-based memory through drawing.
Wammes, Jeffrey D; Meade, Melissa E; Fernandes, Myra A
2018-05-01
Drawing a picture of to-be-remembered information substantially boosts memory performance in free-recall tasks. In the current work, we sought to test the notion that drawing confers its benefit to memory performance by creating a detailed recollection of the encoding context. In Experiments 1 and 2, we demonstrated that for both pictures and words, items that were drawn by the participant at encoding were better recognized in a later test than were words that were written out. Moreover, participants' source memory (in this experiment, correct identification of whether the word was drawn or written) was superior for items drawn relative to written at encoding. In Experiments 3A and 3B, we used a remember-know paradigm to demonstrate again that drawn words were better recognized than written words, and further showed that this effect was driven by a greater proportion of recollection-, rather than familiarity-based responses. Lastly, in Experiment 4 we implemented a response deadline procedure, and showed that when recognition responses were speeded, thereby reducing participants' capacity for recollection, the benefit of drawing was substantially smaller. Taken together, our findings converge on the idea that drawing improves memory as a result of providing vivid contextual information which can be later called upon to aid retrieval.
Grade 9 Pilot Test. Mathematics. June 1988 = 9e Annee Test Pilote. Mathematiques. Juin 1988.
ERIC Educational Resources Information Center
Alberta Dept. of Education, Edmonton.
This pilot test for ninth grade mathematics is written in both French and English. The test consists of 75 multiple-choice items. Students are given 90 minutes to complete the examination and the use of a calculator is highly recommended. The test content covers a wide range of mathematical topics including: decimals; exponents; arithmetic word…
Math: Figure and Object Characteristics. Measurement and Geometry. Grades K-9. Revised Edition.
ERIC Educational Resources Information Center
Instructional Objectives Exchange, Los Angeles, CA.
To help classroom teachers construct mathematics tests, thirty-seven general objectives, corresponding sub-objectives, sample test items, and answers are presented. In general, sub-objectives are arranged in increasing order of difficulty. The objectives were written to comprehensively cover two categories: measurement and geometry. Measurement…
Automated Item Generation with Recurrent Neural Networks.
von Davier, Matthias
2018-03-12
Utilizing technology for automated item generation is not a new idea. However, test items used in commercial testing programs or in research are still predominantly written by humans, in most cases by content experts or professional item writers. Human experts are a limited resource and testing agencies incur high costs in the process of continuous renewal of item banks to sustain testing programs. Using algorithms instead holds the promise of providing unlimited resources for this crucial part of assessment development. The approach presented here deviates in several ways from previous attempts to solve this problem. In the past, automatic item generation relied either on generating clones of narrowly defined item types such as those found in language-free intelligence tests (e.g., Raven's progressive matrices) or on an extensive analysis of task components and derivation of schemata to produce items with pre-specified variability that are hoped to have predictable levels of difficulty. It is somewhat unlikely that researchers utilizing these previous approaches would look at the proposed approach with favor; however, recent applications of machine learning show success in solving tasks that seemed impossible for machines not too long ago. The proposed approach uses deep learning to implement probabilistic language models, not unlike what Google Brain and Amazon Alexa use for language processing and generation.
Glider Pilot Written Test Guide: Private and Commercial.
ERIC Educational Resources Information Center
Federal Aviation Administration (DOT), Washington, DC. Flight Standards Service.
The intent of this guide is to define the scope and narrow the field of study as far as possible to the aeronautical knowledge required for qualifying for the private or commercial pilot (glider) certificate. Briefly summarized are type of test items used, hints for taking the test, and certificate requirements. The study outline is the basic…
ERIC Educational Resources Information Center
McGhee, Max B.; Cheek, Jimmy G.
An activity was undertaken to develop written criterion-referenced tests for each of the instructional areas comprising the Fundamentals of Agribusiness and Natural Resources Occupations Program. Designed to be taught at the ninth grade level, the program consists of six major instructional areas: agribusiness management, animal science, plant…
Sunaric-Mégevand, Gordana; Aclimandos, Wagih
2016-01-01
The comprehensive European Board of Ophthalmology Diploma (EBOD) examination is one of 38 European medical specialty examinations. This review aims at disclosing the specific procedures and content of the EBOD examination. It is a descriptive study summarizing the present organization of the EBOD examination. It is the 3rd largest European postgraduate medical assessment after anaesthesiology and cardiology. The master language is English for the Part 1 written test (knowledge test with 52 modified type X multiple-choice questions); in the past the written test was also available in French and German. A minimum of 4 years of ophthalmology training in a full or associated European Union of Medical Specialists (UEMS) member state is a prerequisite. Problem-solving skills are tested in the Part 2 oral assessment, which is a viva of 4 subjects conducted in English with support for native language whenever feasible. The comprehensive EBOD examination is one of the leading examinations organized by UEMS European Boards or Specialist Sections in terms of number of examinees, item banking, and item contents. PMID:27464640
Analysis instrument test on mathematical power the material geometry of space flat side for grade 8
NASA Astrophysics Data System (ADS)
Kusmaryono, Imam; Suyitno, Hardi; Dwijanto; Karomah, Nur
2017-08-01
The main research problem was to determine the quality of test items on the topic of flat-sided solid geometry for assessing students' mathematical power. The method used was quantitative descriptive. The subjects were 20 grade 8 students. The object of the research was the quality of the test items in terms of mathematical power: validity, reliability, level of difficulty, and discriminating power. The mathematical power instruments tested included written tests and questionnaires about mathematical disposition. Data were obtained from the field, in the form of test data on flat-sided solid geometry and questionnaire responses. The results indicate that the reliability of the test instrument is influenced by many factors: the number of items, the homogeneity of the test questions, the time required, the uniformity of conditions for the test takers, the homogeneity of the group, the variability of the problems, and the motivation of the individual test taker. Overall, the evaluation concluded that the test instrument can be used as a tool to measure students' mathematical power.
ERIC Educational Resources Information Center
Cheek, Jimmy G.; McGhee, Max B.
An activity was undertaken to develop written criterion-referenced tests for the agricultural mechanics component of the Applied Principles of Agribusiness and Natural Resources. Intended for tenth grade students who have completed Fundamentals of Agribusiness and Natural Resources Occupations, applied principles were designed to consist of three…
A Rasch-Based Validation of the Vocabulary Size Test
ERIC Educational Resources Information Center
Beglar, David
2010-01-01
The primary purpose of this study was to provide preliminary validity evidence for a 140-item form of the Vocabulary Size Test, which is designed to measure written receptive knowledge of the first 14,000 words of English. Nineteen native speakers of English and 178 native speakers of Japanese participated in the study. Analyses based on the Rasch…
ERIC Educational Resources Information Center
Instructional Objectives Exchange, Los Angeles, CA.
To help classroom teachers in grades K-9 construct mathematics tests, fifteen general objectives, corresponding sub-objectives, sample test items, and answers are presented. In general, sub-objectives are arranged in increasing order of difficulty. The objectives were written to comprehensively cover three categories. The first, graphs, covers the…
ERIC Educational Resources Information Center
Cheek, Jimmy G.; McGhee, Max B.
An activity was undertaken to develop written criterion-referenced tests for the common core component of Applied Principles of Agribusiness and Natural Resources Occupations. Intended for tenth grade students who have completed Fundamentals of Agribusiness and Natural Resources Occupations, applied principles were designed to consist of three…
ERIC Educational Resources Information Center
Cheek, Jimmy G.; McGhee, Max B.
An activity was undertaken to develop written criterion-referenced tests for the forestry component of Applied Principles of Agribusiness and Natural Resources. Intended for tenth grade students who have completed Fundamentals of Agribusiness and Natural Resources Occupations, applied principles were designed to consist of three components, with…
ERIC Educational Resources Information Center
Cheek, Jimmy G.; McGhee, Max B.
An activity was undertaken to develop written criterion-referenced tests for the agricultural resources component of Applied Principles of Agribusiness and Natural Resources. Intended for tenth grade students who have completed Fundamentals of Agribusiness and Natural Resources Occupations, applied principles were designed to consist of three…
ERIC Educational Resources Information Center
Cheek, Jimmy G.; McGhee, Max B.
An activity was undertaken to develop written criterion-referenced tests for the agricultural production component of Applied Principles of Agribusiness and Natural Resources Occupations. Intended for tenth grade students who have completed Fundamentals of Agribusiness and Natural Resources Occupations, applied principles were designed to consist…
Assessing the reading comprehension of adults with learning disabilities.
Jones, F W; Long, K; Finlay, W M L
2006-06-01
This study's aim was to begin the process of measuring the reading comprehension of adults with mild and borderline learning disabilities, in order to generate information to help clinicians and other professionals to make written material for adults with learning disabilities more comprehensible. The Test for the Reception of Grammar (TROG), with items presented visually rather than orally, and the Reading Comprehension sub-test of the Wechsler Objective Reading Dimensions (WORD) battery were given to 24 service-users of a metropolitan community learning disability team who had an estimated IQ in the range 50-79. These tests were demonstrated to have satisfactory split-half reliability and convergent validity with this population, supporting both their use in this study and in clinical work. Data are presented concerning the distribution across the sample of reading-ages and the comprehension of written grammatical constructions. These data should be useful to those who are preparing written material for adults with learning disabilities.
Pilkonis, Paul A.; Yu, Lan; Dodds, Nathan E.; Johnston, Kelly L.; Lawrence, Suzanne; Hilton, Thomas F.; Daley, Dennis C.; Patkar, Ashwin A.; McCarty, Dennis
2015-01-01
Background: Two item banks for substance use were developed as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®): severity of substance use and positive appeal of substance use. Methods: Qualitative item analysis (including focus groups, cognitive interviewing, expert review, and item revision) reduced an initial pool of more than 5,300 items for substance use to 119 items included in field testing. Items were written in a first-person, past-tense format, with 5 response options reflecting frequency or severity. Both 30-day and 3-month time frames were tested. The calibration sample of 1,336 respondents included 875 individuals from the general population (ascertained through an internet panel) and 461 patients from addiction treatment centers participating in the National Drug Abuse Treatment Clinical Trials Network. Results: Final banks of 37 and 18 items were calibrated for severity of substance use and positive appeal of substance use, respectively, using the two-parameter graded response model from item response theory (IRT). Initial calibrations were similar for the 30-day and 3-month time frames, and final calibrations used data combined across the time frames, making the items applicable with either interval. Seven-item static short forms were also developed from each item bank. Conclusions: Test information curves showed that the PROMIS item banks provided substantial information in a broad range of severity, making them suitable for treatment, observational, and epidemiological research in both clinical and community settings. PMID:26423364
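For readers unfamiliar with the graded response model mentioned in this abstract, the following is a minimal sketch of how category probabilities are obtained for one item with five ordered response options. The discrimination and threshold values are hypothetical, chosen only for illustration; they are not parameters from the PROMIS calibration, and the function name is an assumption.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a two-parameter graded response model.

    theta      : latent trait level of the respondent
    a          : item discrimination (hypothetical value in the demo below)
    thresholds : increasing threshold parameters b_1 < ... < b_{m-1}
                 for an item with m ordered response options
    """
    # Cumulative probability of responding in category k or higher.
    # By convention it is 1 for the lowest category and 0 beyond the highest.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]

# Illustrative item: five response options need four thresholds (made-up values).
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.5, -0.5, 0.5, 1.5])
print(probs)
```

With thresholds in increasing order, the five probabilities are non-negative and sum to one for any trait level.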
Pilkonis, Paul A.; Choi, Seung W.; Reise, Steven P.; Stover, Angela M.; Riley, William T.; Cella, David
2011-01-01
The authors report on the development and calibration of item banks for depression, anxiety, and anger as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®). Comprehensive literature searches yielded an initial bank of 1,404 items from 305 instruments. After qualitative item analysis (including focus groups and cognitive interviewing), 168 items (56 for each construct) were written in a first person, past tense format with a 7-day time frame and five response options reflecting frequency. The calibration sample included nearly 15,000 respondents. Final banks of 28, 29, and 29 items were calibrated for depression, anxiety, and anger, respectively, using item response theory. Test information curves showed that the PROMIS item banks provided more information than conventional measures in a range of severity from approximately −1 to +3 standard deviations (with higher scores indicating greater distress). Short forms consisting of seven to eight items provided information comparable to legacy measures containing more items. PMID:21697139
Pilkonis, Paul A; Choi, Seung W; Reise, Steven P; Stover, Angela M; Riley, William T; Cella, David
2011-09-01
The authors report on the development and calibration of item banks for depression, anxiety, and anger as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®). Comprehensive literature searches yielded an initial bank of 1,404 items from 305 instruments. After qualitative item analysis (including focus groups and cognitive interviewing), 168 items (56 for each construct) were written in a first person, past tense format with a 7-day time frame and five response options reflecting frequency. The calibration sample included nearly 15,000 respondents. Final banks of 28, 29, and 29 items were calibrated for depression, anxiety, and anger, respectively, using item response theory. Test information curves showed that the PROMIS item banks provided more information than conventional measures in a range of severity from approximately -1 to +3 standard deviations (with higher scores indicating greater distress). Short forms consisting of seven to eight items provided information comparable to legacy measures containing more items.
Jacobson, C. Jeffrey; Kashikar-Zuck, Susmita; Farrell, Jennifer; Barnett, Kimberly; Goldschneider, Ken; Dampier, Carlton; Cunningham, Natoshia; Crosby, Lori; DeWitt, Esi Morgan
2015-01-01
As initial steps in a broader effort to develop and test pediatric Pain Behavior and Pain Quality item banks for the Patient Reported Outcomes Measurement Information System (PROMIS®), we employed qualitative interview and item review methods to 1) evaluate the overall conceptual scope and content validity of the PROMIS pain domain framework among children with chronic/recurrent pain conditions, and 2) develop item candidates for further psychometric testing. To elicit the experiential and conceptual scope of pain outcomes across a variety of pediatric recurrent/chronic pain conditions, we conducted semi-structured individual (32) and focus-group interviews (2) with children and adolescents (8–17 years), and parents of children with pain (individual (32) and focus group (2)). Interviews with pain experts (10) explored the operational limits of pain measurement in children. For item bank development, we identified existing items from measures in the literature, grouped them by concept, removed redundancies, and modified remaining items to match PROMIS formatting. New items were written as needed and cognitive debriefing was completed with children and their parents, resulting in 98 Pain Behavior (47 self, 51 proxy), 54 Quality and 4 Intensity items for further testing. Qualitative content analyses suggest that reportable pain outcomes that matter to children with pain are captured within and consistent with the pain domain framework in PROMIS. PMID:26335990
Development of knowledge tests for multi-disciplinary emergency training: a review and an example.
Sørensen, J L; Thellesen, L; Strandbygaard, J; Svendsen, K D; Christensen, K B; Johansen, M; Langhoff-Roos, P; Ekelund, K; Ottesen, B; Van Der Vleuten, C
2015-01-01
The literature is sparse on written test development in a post-graduate multi-disciplinary setting. Developing and evaluating knowledge tests for use in multi-disciplinary post-graduate training is challenging. The objective of this study was to describe the process of developing and evaluating a multiple-choice question (MCQ) test for use in a multi-disciplinary training program in obstetric-anesthesia emergencies. A multi-disciplinary working committee with 12 members representing six professional healthcare groups and another 28 participants were involved. Recurrent revisions of the MCQ items were undertaken followed by a statistical analysis. The MCQ items were developed stepwise, including decisions on aims and content, followed by testing for face and content validity, construct validity, item-total correlation, and reliability. To obtain acceptable content validity, 40 out of originally 50 items were included in the final MCQ test. The MCQ test was able to distinguish between levels of competence, and good construct validity was indicated by a significant difference in the mean score between consultants and first-year trainees, as well as between first-year trainees and medical and midwifery students. Evaluation of the item-total correlation analysis in the 40 items set revealed that 11 items needed re-evaluation, four of which addressed content issues in local clinical guidelines. A Cronbach's alpha of 0.83 for reliability was found, which is acceptable. Content and construct validity and reliability were acceptable. The presented template for the development of this MCQ test could be useful to others when developing knowledge tests and may enhance the overall quality of test development. © 2014 The Acta Anaesthesiologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.
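The two statistics named in this abstract, Cronbach's alpha and the item-total correlation, can be sketched as follows. This is a minimal illustration on made-up dichotomous responses, not the authors' analysis; the function names are assumptions, and the corrected form of the item-total correlation (item versus the total of the remaining items) is one common variant that the abstract does not specify.

```python
from statistics import mean, pvariance

def cronbach_alpha(scores):
    """scores: one row per examinee, each row a list of item scores."""
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def corrected_item_total(scores, j):
    """Correlate item j with the total score of all other items."""
    item = [row[j] for row in scores]
    rest = [sum(row) - row[j] for row in scores]
    return pearson(item, rest)

# Hypothetical data: 5 examinees, 4 items scored 0/1.
demo = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print(cronbach_alpha(demo))
print([corrected_item_total(demo, j) for j in range(4)])
```

Items with a low or negative corrected item-total correlation are the usual candidates for re-evaluation of the kind the study describes.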
NASA Astrophysics Data System (ADS)
Barniol, Pablo; Zavala, Genaro
2014-12-01
In this article we compare students' understanding of vector concepts in problems with no physical context, and with three mechanics contexts: force, velocity, and work. Based on our "Test of Understanding of Vectors," a multiple-choice test presented elsewhere, we designed two isomorphic shorter versions of 12 items each: a test with no physical context, and a test with mechanics contexts. For this study, we administered the items twice to students who were finishing an introductory mechanics course at a large private university in Mexico. The first time, we administered the two 12-item tests to 608 students. In the second, we only tested the items for which we had found differences in students' performances that were difficult to explain, and in this case, we asked them to show their reasoning in written form. In the first administration, we detected no significant difference between the medians obtained in the tests; however, we did identify significant differences in some of the items. For each item we analyze the type of difference found between the tests in the selection of the correct answer, the most common error on each of the tests, and the differences in the selection of incorrect answers. We also investigate the causes of the different context effects. Based on these analyses, we establish specific recommendations for the instruction of vector concepts in an introductory mechanics course. In the Supplemental Material we include both tests for other researchers studying vector learning, and for physics teachers who teach this material.
Aldekhayel, Salah A; Alselaim, Nahar A; Magzoub, Mohi Eldin; Al-Qattan, Mohammad M; Al-Namlah, Abdullah M; Tamim, Hani; Al-Khayal, Abdullah; Al-Habdan, Sultan I; Zamakhshary, Mohammed F
2012-10-24
Script Concordance Test (SCT) is a new assessment tool that reliably assesses clinical reasoning skills. Previous descriptions of developing SCT-question banks were merely subjective. This study addresses two gaps in the literature: 1) conducting the first phase of a multistep validation process of SCT in Plastic Surgery, and 2) providing an objective methodology to construct a question bank based on SCT. After developing a test blueprint, 52 test items were written. Five validation questions were developed and a validation survey was established online. Seven reviewers were asked to answer this survey. They were recruited from two countries, Saudi Arabia and Canada, to improve the test's external validity. Their ratings were transformed into percentages. Analysis was performed to compare reviewers' ratings by looking at correlations, ranges, means, medians, and overall scores. Scores of reviewers' ratings were between 76% and 95% (mean 86% ± 5). We found poor correlations between reviewers (Pearson's: +0.38 to -0.22). Ratings of individual validation questions ranged between 0 and 4 (on a scale 1-5). Means and medians of these ranges were computed for each test item (mean: 0.8 to 2.4; median: 1 to 3). A subset of test items comprising 27 items was generated based on a set of inclusion and exclusion criteria. This study proposes an objective methodology for validation of SCT-question bank. Analysis of validation survey is done from all angles, i.e., reviewers, validation questions, and test items. Finally, a subset of test items is generated based on a set of criteria.
Delogu, Franco; Lilla, Christopher C
2017-11-01
Contrasting results in visual and auditory spatial memory stimulate the debate over the role of sensory modality and attention in identity-to-location binding. We investigated the role of sensory modality in the incidental/deliberate encoding of the location of a sequence of items. In 4 separate blocks, 88 participants memorised sequences of environmental sounds, spoken words, pictures and written words, respectively. After memorisation, participants were asked to recognise old from new items in a new sequence of stimuli. They were also asked to indicate from which side of the screen (visual stimuli) or headphone channel (sounds) the old stimuli were presented in encoding. In the first block, participants were not aware of the spatial requirement while, in blocks 2, 3 and 4 they knew that their memory for item location was going to be tested. Results show significantly lower accuracy of object location memory for the auditory stimuli (environmental sounds and spoken words) than for images (pictures and written words). Awareness of spatial requirement did not influence localisation accuracy. We conclude that: (a) object location memory is more effective for visual objects; (b) object location is implicitly associated with item identity during encoding and (c) visual supremacy in spatial memory does not depend on the automaticity of object location binding.
Validity threats: overcoming interference with proposed interpretations of assessment data.
Downing, Steven M; Haladyna, Thomas M
2004-03-01
Factors that interfere with the ability to interpret assessment scores or ratings in the proposed manner threaten validity. To be interpreted in a meaningful manner, all assessments in medical education require sound, scientific evidence of validity. The purpose of this essay is to discuss 2 major threats to validity: construct under-representation (CU) and construct-irrelevant variance (CIV). Examples of each type of threat for written, performance and clinical performance examinations are provided. The CU threat to validity refers to undersampling the content domain. Using too few items, cases or clinical performance observations to adequately generalise to the domain represents CU. Variables that systematically (rather than randomly) interfere with the ability to meaningfully interpret scores or ratings represent CIV. Issues such as flawed test items written at inappropriate reading levels or statistically biased questions represent CIV in written tests. For performance examinations, such as standardised patient examinations, flawed cases or cases that are too difficult for student ability contribute CIV to the assessment. For clinical performance data, systematic rater error, such as halo or central tendency error, represents CIV. The term face validity is rejected as representative of any type of legitimate validity evidence, although the fact that the appearance of the assessment may be an important characteristic other than validity is acknowledged. There are multiple threats to validity in all types of assessment in medical education. Methods to eliminate or control validity threats are suggested.
Evaluation of the flipped classroom approach in a veterinary professional skills course
Moffett, Jenny; Mill, Aileen C
2014-01-01
Background: The flipped classroom is an educational approach that has had much recent coverage in the literature. Relatively few studies, however, use objective assessment of student performance to measure the impact of the flipped classroom on learning. The purpose of this study was to evaluate the use of a flipped classroom approach within a medical education setting to the first two levels of Kirkpatrick and Kirkpatrick’s effectiveness of training framework. Methods: This study examined the use of a flipped classroom approach within a professional skills course offered to postgraduate veterinary students. A questionnaire was administered to two cohorts of students: those who had completed a traditional, lecture-based version of the course (Introduction to Veterinary Medicine [IVM]) and those who had completed a flipped classroom version (Veterinary Professional Foundations I [VPF I]). The academic performance of students within both cohorts was assessed using a set of multiple-choice items (n=24) nested within a written examination. Data obtained from the questionnaire were analyzed using Cronbach’s alpha, Kruskal–Wallis tests, and factor analysis. Data obtained from student performance in the written examination were analyzed using the nonparametric Wilcoxon rank sum test. Results: A total of 133 IVM students and 64 VPF I students (n=197) agreed to take part in the study. Overall, study participants favored the flipped classroom approach over the traditional classroom approach. With respect to student academic performance, the traditional classroom students outperformed the flipped classroom students on a series of multiple-choice items (IVM mean =21.4±1.48 standard deviation; VPF I mean =20.25±2.20 standard deviation; Wilcoxon test, w=7,578; P<0.001). Conclusion: This study demonstrates that learners seem to prefer a flipped classroom approach. The flipped classroom was rated more positively than the traditional classroom on many different characteristics. This preference, however, did not translate into improved student performance, as assessed by a series of multiple-choice items delivered during a written examination. PMID:25419164
Fostering a student's skill for analyzing test items through an authentic task
NASA Astrophysics Data System (ADS)
Setiawan, Beni; Sabtiawan, Wahyu Budi
2017-08-01
Analyzing test items is a skill that prospective teachers must master in order to determine the quality of the test questions they have written. The main aim of this research was to describe the effectiveness of an authentic task in fostering students' skill in analyzing test items, covering validity, reliability, item discrimination index, level of difficulty, and distractor functioning. The participants were students of the science education study program, science and mathematics faculty, Universitas Negeri Surabaya, enrolled in the assessment course. The research design was a one-group posttest design. As the treatment, the students were given an authentic task in which they developed test items and then analyzed the items like a professional assessor using Microsoft Excel and Anates software. The data were analyzed descriptively: the students' skill data were presented and then related to theories or previous empirical studies. The research showed that the task helped the students acquire the skills. Thirty-one students got a perfect score for the analysis, five students achieved 97% mastery, two students had 92% mastery, and another two students got 89% and 79% mastery. The implication of the finding is that students given authentic tasks that require them to perform like a professional are more likely to achieve professional skills by the end of learning.
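Two of the item statistics listed in this abstract, level of difficulty and item discrimination index, can be computed along the following lines. This is a minimal sketch in Python rather than the Microsoft Excel and Anates workflow the study used; the responses are made up, the function names are assumptions, and the 27% upper/lower grouping is a common convention that the abstract does not state.

```python
def item_difficulty(responses, j):
    """Proportion of examinees answering item j correctly (0/1 scoring)."""
    return sum(row[j] for row in responses) / len(responses)

def discrimination_index(responses, j, fraction=0.27):
    """Upper-lower discrimination index for item j.

    Examinees are ranked by total score; the index is the difference in
    proportion correct between the top and bottom groups. The 27% group
    size is an assumed convention, not taken from the study.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    return (sum(r[j] for r in upper) - sum(r[j] for r in lower)) / n

# Hypothetical responses: 5 examinees, 4 items scored 0/1.
demo = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
for j in range(4):
    print(j, item_difficulty(demo, j), discrimination_index(demo, j))
```

A difficulty near 1 marks an easy item, and a discrimination index near 0 or below suggests the item does not separate stronger from weaker examinees.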
Is case-specificity content-specificity? An analysis of data from extended-matching questions.
Dory, Valerie; Gagnon, Robert; Charlin, Bernard
2010-03-01
Case-specificity, i.e., variability of a subject's performance across cases, has been a consistent finding in medical education. It has important implications for assessment validity and reliability. Its root causes remain a matter of discussion. One hypothesis, content-specificity, links variability of performance to variable levels of relevant knowledge. Extended-matching items (EMIs) are an ideal format to test this hypothesis as items are grouped by topic. If differences pertaining to content knowledge are the main cause of case-specificity, variability across topics should be high and variability across items within the same topic low. We used generalisability analysis on results of a written test composed of 159 EMIs sat by two cohorts of general practice trainees at one university. Two hundred and twenty-seven trainees took part. The variance component attributed to subjects was small. Variance attributed to topics was smaller than variance attributed to items. The main source of error was interaction between subjects and items, accounting for two-thirds of error. The generalisability D study revealed that for the same total number of items, increasing the number of topics results in a higher G coefficient than increasing the number of items per topic. Topical knowledge does not seem to explain case-specificity observed in our data. Structure of knowledge and reasoning strategy may be more important, in particular pattern-recognition which EMIs were designed to elicit. The causal explanations of case-specificity may be dependent on test format. Increasing the number of topics with fewer items each would increase reliability but also testing time.
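The D-study finding in this abstract (more topics beats more items per topic at a fixed test length) can be illustrated with a short sketch. It assumes a design in which subjects are crossed with items nested within topics, which the abstract implies but does not spell out, and it uses invented variance components, not the study's estimates.

```python
def g_coefficient(var_p, var_pt, var_pi_t, n_topics, n_items_per_topic):
    """Relative G coefficient for an assumed p x (i:t) design.

    var_p    : variance component for subjects (persons)
    var_pt   : subject-by-topic interaction variance
    var_pi_t : subject-by-item-within-topic variance (confounded with residual)
    """
    # Relative error variance shrinks as topics and items are added.
    rel_error = var_pt / n_topics + var_pi_t / (n_topics * n_items_per_topic)
    return var_p / (var_p + rel_error)

# Hypothetical variance components, for illustration only.
components = dict(var_p=0.01, var_pt=0.005, var_pi_t=0.15)

# Same total of 160 items, split two ways.
few_topics = g_coefficient(**components, n_topics=20, n_items_per_topic=8)
many_topics = g_coefficient(**components, n_topics=40, n_items_per_topic=4)
print(few_topics, many_topics)
```

Because the subject-by-topic term is divided only by the number of topics, spreading a fixed number of items over more topics lowers the error variance whenever that component is above zero.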
Morales, Leo S; Flowers, Claudia; Gutierrez, Peter; Kleinman, Marjorie; Teresi, Jeanne A
2006-11-01
To illustrate the application of the Differential Item and Test Functioning (DFIT) method using English and Spanish versions of the Mini-Mental State Examination (MMSE). Study participants were 65 years of age or older and lived in North Manhattan, New York. Of the 1578 study participants who were administered the MMSE, 665 completed it in Spanish. The MMSE contains 20 items that measure the degree of cognitive impairment in the areas of orientation, attention and calculation, registration, recall and language, as well as the ability to follow verbal and written commands. After assessing the dimensionality of the MMSE scale, item response theory person and item parameters were estimated separately for the English and Spanish samples using Samejima's 2-parameter graded response model. Then the DFIT framework was used to assess differential item functioning (DIF) and differential test functioning (DTF). Nine items were found to show DIF; these were items that ask the respondent to name the correct season, day of the month, city, state, and 2 nearby streets, recall 3 objects, repeat the phrase "no ifs, no ands, no buts," follow the command, "close your eyes," and the command, "take the paper in your right hand, fold the paper in half with both hands, and put the paper down in your lap." At the scale level, however, the MMSE did not show differential functioning. Respondents to the English and Spanish versions of the MMSE are comparable on the basis of scale scores. However, assessments based on individual MMSE items may be misleading.
Pilkonis, Paul A; Yu, Lan; Dodds, Nathan E; Johnston, Kelly L; Lawrence, Suzanne M; Hilton, Thomas F; Daley, Dennis C; Patkar, Ashwin A; McCarty, Dennis
2017-08-01
There is a need to monitor patients receiving prescription opioids to detect possible signs of abuse. To address this need, we developed and calibrated an item bank for severity of abuse of prescription pain medication as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®). Comprehensive literature searches yielded an initial bank of 5,310 items relevant to substance use and abuse, including abuse of prescription pain medication, from over 80 unique instruments. After qualitative item analysis (i.e., focus groups, cognitive interviewing, expert review, and item revision), 25 items for abuse of prescribed pain medication were included in field testing. Items were written in a first-person, past-tense format, with a three-month time frame and five response options reflecting frequency or severity. The calibration sample included 448 respondents, 367 from the general population (ascertained through an internet panel) and 81 from community treatment programs participating in the National Drug Abuse Treatment Clinical Trials Network. A final bank of 22 items was calibrated using the two-parameter graded response model from item response theory. A seven-item static short form was also developed. The test information curve showed that the PROMIS® item bank for abuse of prescription pain medication provided substantial information in a broad range of severity. The initial psychometric characteristics of the item bank support its use as a computerized adaptive test or short form, with either version providing a brief, precise, and efficient measure relevant to both clinical and community samples. © 2016 American Academy of Pain Medicine. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
Pye, Annie; Charalambous, Anna Pavlina; Leroi, Iracema; Thodi, Chrysoulla; Dawes, Piers
2017-11-01
Cognitive screening tests frequently rely on items being correctly heard or seen. We aimed to identify, describe, and evaluate the adaptation, validity, and availability of cognitive screening and assessment tools for dementia which have been developed or adapted for adults with acquired hearing and/or vision impairment. Electronic databases were searched using subject terms "hearing disorders" OR "vision disorders" AND "cognitive assessment," supplemented by exploring reference lists of included papers and via consultation with health professionals to identify additional literature. 1,551 papers were identified, of which 13 met inclusion criteria. Four papers related to tests adapted for hearing impairment; 11 papers related to tests adapted for vision impairment. Frequently adapted tests were the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MOCA). Adaptations for hearing impairment involved deleting or creating written versions for hearing-dependent items. Adaptations for vision impairment involved deleting vision-dependent items or creating spoken/tactile versions of visual tasks. No study reported validity of the test in relation to detection of dementia in people with hearing/vision impairment. Item deletion had a negative impact on the psychometric properties of the test. While attempts have been made to adapt cognitive tests for people with acquired hearing and/or vision impairment, the primary limitation of these adaptations is that their validity in accurately detecting dementia among those with acquired hearing or vision impairment is yet to be established. It is likely that the sensitivity and specificity of the adapted versions are poorer than the original, especially if the adaptation involved item deletion. One solution would involve item substitution in an alternative sensory modality followed by re-validation of the adapted test.
TRENDS: A flight test relational database user's guide and reference manual
NASA Technical Reports Server (NTRS)
Bondi, M. J.; Bjorkman, W. S.; Cross, J. L.
1994-01-01
This report is designed to be a user's guide and reference manual for users intending to access rotorcraft test data via TRENDS, the relational database system which was developed as a tool for the aeronautical engineer with no programming background. This report has been written to assist novice and experienced TRENDS users. TRENDS is a complete system for retrieving, searching, and analyzing both numerical and narrative data, and for displaying time history and statistical data in graphical and numerical formats. This manual provides a 'guided tour' and a 'user's guide' for the new and intermediate-skilled users. Examples of the use of each menu item within TRENDS are provided in the Menu Reference section of the manual, including full coverage for TIMEHIST, one of the key tools. This manual is written around the XV-15 Tilt Rotor database, but does include an appendix on the UH-60 Blackhawk database. This user's guide and reference manual establishes a referrable source for the research community and augments NASA TM-101025, TRENDS: The Aeronautical Post-Test, Database Management System, Jan. 1990, written by the same authors.
Jacobson, C Jeffrey; Kashikar-Zuck, Susmita; Farrell, Jennifer; Barnett, Kimberly; Goldschneider, Ken; Dampier, Carlton; Cunningham, Natoshia; Crosby, Lori; DeWitt, Esi Morgan
2015-12-01
As initial steps in a broader effort to develop and test pediatric pain behavior and pain quality item banks for the Patient-Reported Outcomes Measurement Information System (PROMIS), we used qualitative interview and item review methods to 1) evaluate the overall conceptual scope and content validity of the PROMIS pain domain framework among children with chronic/recurrent pain conditions, and 2) develop item candidates for further psychometric testing. To elicit the experiential and conceptual scope of pain outcomes across a variety of pediatric recurrent/chronic pain conditions, we conducted 32 semi-structured individual and 2 focus-group interviews with children and adolescents (8-17 years), and 32 individual and 2 focus-group interviews with parents of children with pain. Interviews with pain experts (10) explored the operational limits of pain measurement in children. For item bank development, we identified existing items from measures in the literature, grouped them by concept, removed redundancies, and modified the remaining items to match PROMIS formatting. New items were written as needed and cognitive debriefing was completed with the children and their parents, resulting in 98 pain behavior (47 self, 51 proxy), 54 quality, and 4 intensity items for further testing. Qualitative content analyses suggest that reportable pain outcomes that matter to children with pain are captured within and consistent with the pain domain framework in PROMIS. PROMIS pediatric pain behavior, quality, and intensity items were developed based on a theoretical framework of pain that was evaluated by multiple stakeholders in the measurement of pediatric pain, including researchers, clinicians, and children with pain and their parents, and the appropriateness of the framework was verified. Copyright © 2015 American Pain Society. Published by Elsevier Inc. All rights reserved.
The Development of the Post-Divorce Parental Conflict Scale.
ERIC Educational Resources Information Center
Sonnenblick, Renee; Schwarz, J. Conrad
One difficulty in studying the long-term impact of divorce on children has been the lack of a reliable and valid measure of parental conflict for divorced parents. Items for a post-divorce conflict scale were written and tested using 32 male and 63 female college students from divorced families for Study 1 and 60 male and 75 female students from…
Assessing learning in small sized physics courses
NASA Astrophysics Data System (ADS)
Ene, Emanuela; Ackerson, Bruce J.
2018-01-01
We describe the construction, validation, and testing of a concept inventory for an Introduction to Physics of Semiconductors course offered by the department of physics to undergraduate engineering students. By design, this inventory addresses both content knowledge and the ability to interpret content via different cognitive processes outlined in Bloom's revised taxonomy. The primary challenge comes from the low number of test takers. We describe the Rasch modeling analysis for this concept inventory, and the results of the calibration on a small sample size, with the intention of providing a useful blueprint to other instructors. Our study involved 101 students from Oklahoma State University and fourteen faculty teaching or doing research in the field of semiconductors at seven universities. The items were written in four-option multiple-choice format. It was possible to calibrate a 30-item unidimensional scale precisely enough to characterize the student population enrolled each semester and, therefore, to allow the tailoring of the learning activities of each class. We show that this scale can be employed as an item bank from which instructors could extract short testlets and where we can add new items fitting the existing calibration.
Daly, Justine B; Campbell, Elizabeth M; Wiggers, John H; Considine, Robyn J
2002-06-01
This study aimed to determine the prevalence of responsible hospitality policies in a group of licensed premises associated with alcohol-related harm. During March 1999, 108 licensed premises with one or more police-identified alcohol-related incidents in the previous 3 months received a visit from a police officer. A 30-item audit checklist was used to determine the responsible hospitality policies being undertaken by each premises within eight policy domains: display required signage (three items); responsible host practices to prevent intoxication and under-age drinking (five items); written policies and guidelines for responsible service (three items); discouraging inappropriate promotions (three items); safe transport (two items); responsible management issues (seven items); physical environment (three items) and entry conditions (four items). No premises were undertaking all 30 items. Eighty per cent of the premises were undertaking 20 of the 30 items. All premises were undertaking at least 17 of the items. The proportion of premises undertaking individual items ranged from 16% to 100%. Premises were less likely to report having and providing written responsible hospitality documentation to staff, using door charges and having entry/re-entry rules. Significant differences between rural and urban premises were evident for four policies. Clubs were significantly more likely than hotels to have a written responsible service of alcohol policy and to clearly display codes of dress and conditions of entry. This study provides an indication of the extent and nature of responsible hospitality policies in a sample of licensed premises that are associated with a broad range of alcohol related harms. The finding that a large majority of such premises appear to adopt responsible hospitality policies suggests a need to assess the validity and reliability of tools used in the routine assessment of such policies, and of the potential for harm from licensed premises.
Gains to L2 Listeners from Reading while Listening vs. Listening Only in Comprehending Short Stories
ERIC Educational Resources Information Center
Chang, Anna C.-S.
2009-01-01
This study builds on the concept that aural-written verification helps L2 learners develop auditory discrimination skills, refine word recognition and gain awareness of form-meaning relationships, by comparing two modes of aural input: reading while listening (R/L) vs. listening only (L/O). Two test tasks (sequencing and gap filling) of 95 items,…
Edmond, Mark; Neville, Francesca; Khalil, Hisham S
2016-01-01
This pilot study conducted at the Peninsula Medical School is one of very few studies to compare the use of video podcasts to traditional learning resources for medical students. We developed written handouts and video podcasts for three common ear, nose, and throat conditions: epistaxis, otitis media, and tonsillitis. Forty-one second-year students were recruited via email. Students completed a 60-item true or false statement test written by the senior author (20 questions per subject). Students were subsequently randomized to podcast or handouts. Students were able to access their resource via their unique university login on the university homepage and were given 3 weeks to use their resource. They then completed the same 60-item test. Both podcasts and handouts demonstrated a statistically significant increase in student scores (podcasts: mean increase in scores 4.7, P=0.004, 95% confidence interval =0.07; handouts: mean increase in scores 5.3, P=0.015, 95% confidence interval =0.11). However, there was no significant difference (P=0.07) between the two, with the handout group scoring fractionally higher (podcasts average post-exposure score =37.3 vs handout 37.8) with a larger average improvement. A 5-point Likert scale questionnaire demonstrated that medical students enjoy using reusable learning objects such as podcasts and feel that they should be used more in their curriculum. Podcasts are as good as traditional handouts in teaching second-year medical students three core ear, nose, and throat conditions and enhance their learning experience.
A practical guide to assessing clinical decision-making skills using the key features approach.
Farmer, Elizabeth A; Page, Gordon
2005-12-01
This paper in the series on professional assessment provides a practical guide to writing key features problems (KFPs). Key features problems test clinical decision-making skills in written or computer-based formats. They are based on the concept of critical steps or 'key features' in decision making and represent an advance on the older, less reliable patient management problem (PMP) formats. The practical steps in writing these problems are discussed and illustrated by examples. Steps include assembling problem-writing groups, selecting a suitable clinical scenario or problem and defining its key features, writing the questions, selecting question response formats, preparing scoring keys, reviewing item quality and item banking. The KFP format provides educators with a flexible approach to testing clinical decision-making skills with demonstrated validity and reliability when constructed according to the guidelines provided.
Advances Afoot in Microbiology
Karon, Brad S.
2017-01-01
In 2016, the American Academy of Microbiology convened a colloquium to examine point-of-care (POC) microbiology testing and to evaluate its effects on clinical microbiology. Colloquium participants included representatives from clinical microbiology laboratories, industry, and the government, who together made recommendations regarding the implementation, oversight, and evaluation of POC microbiology testing. The colloquium report is timely and well written (V. Dolen et al., Changing Diagnostic Paradigms for Microbiology, 2017, https://www.asm.org/index.php/colloquium-reports/item/6421-changing-diagnostic-paradigms-for-microbiology?utm_source=Commentary&utm_medium=referral&utm_campaign=diagnostics). Emerging POC microbiology tests, especially nucleic acid amplification tests, have the potential to advance medical care. PMID:28539341
Byram, Jessica N; Seifert, Mark F; Brooks, William S; Fraser-Cotlin, Laura; Thorp, Laura E; Williams, James M; Wilson, Adam B
2017-03-01
With integrated curricula and multidisciplinary assessments becoming more prevalent in medical education, there is a continued need for educational research to explore the advantages, consequences, and challenges of integration practices. This retrospective analysis investigated the number of items needed to reliably assess anatomical knowledge in the context of gross anatomy and histology. A generalizability analysis was conducted on gross anatomy and histology written and practical examination items that were administered in a discipline-based format at Indiana University School of Medicine and in an integrated fashion at the University of Alabama School of Medicine and Rush University Medical College. Examination items were analyzed using a partially nested design s×(i:o) in which items were nested within occasions (i:o) and crossed with students (s). A reliability standard of 0.80 was used to determine the minimum number of items needed across examinations (occasions) to make reliable and informed decisions about students' competence in anatomical knowledge. Decision study plots are presented to demonstrate how the number of items per examination influences the reliability of each administered assessment. Using the example of a curriculum that assesses gross anatomy knowledge over five summative written and practical examinations, the results of the decision study estimated that 30 and 25 items would be needed on each written and practical examination to reach a reliability of 0.80, respectively. This study is particularly relevant to educators who may question whether the amount of anatomy content assessed in multidisciplinary evaluations is sufficient for making judgments about the anatomical aptitude of students. Anat Sci Educ 10: 109-119. © 2016 American Association of Anatomists.
Utecht, Joseph; Brochhausen, Mathias; Judkins, John; Schneider, Jodi; Boyce, Richard D
2017-01-01
In this research we aim to demonstrate that an ontology-based system can categorize potential drug-drug interaction (PDDI) evidence items into complex types based on a small set of simple questions. Such a method could increase the transparency and reliability of PDDI evidence evaluation, while also reducing the variations in content and seriousness ratings present in PDDI knowledge bases. We extended the DIDEO ontology with 44 formal evidence type definitions. We then manually annotated the evidence types of 30 evidence items. We tested an RDF/OWL representation of answers to a small number of simple questions about each of these 30 evidence items and showed that automatic inference can determine the detailed evidence types based on this small number of simpler questions. These results show proof-of-concept for a decision support infrastructure that frees the evidence evaluator from mastering relatively complex written evidence type definitions.
Bode, Rita K; Lai, Jin-shei; Dineen, Kelly; Heinemann, Allen W; Shevrin, Daniel; Von Roenn, Jamie; Cella, David
2006-01-01
We expanded an existing 33-item physical function (PF) item bank with a sufficient number of items to enable computerized adaptive testing (CAT). Ten items were written to expand the bank and the new item pool was administered to 295 people with cancer. For this analysis of the new pool, seven poorly performing items were identified for further examination. This resulted in a bank with items that define an essentially unidimensional PF construct, cover a wide range of that construct, reliably measure the PF of persons with cancer, and distinguish differences in self-reported functional performance levels. We also developed a 5-item (static) assessment form ("BriefPF") that can be used in clinical research to express scores on the same metric as the overall bank. The BriefPF was compared to the PF-10 from the Medical Outcomes Study SF-36. Both short forms significantly differentiated persons across functional performance levels. While the entire bank was more precise across the PF continuum than either short form, there were differences in the area of the continuum in which each short form was more precise: the BriefPF was more precise than the PF-10 at the lower functional levels and the PF-10 was more precise than the BriefPF at the higher levels. Future research on this bank will include the development of a CAT version, the PF-CAT.
7 CFR 1728.50 - Removal of an item from listing or technical acceptance.
Code of Federal Regulations, 2010 CFR
2010-01-01
... equipment will be notified in writing of a proposal to remove such item from the listing or technical... unanimous, the item will be referred to Technical Standards Committee “B.” Written notice of Technical...” decision, a sponsor may appeal in writing to Technical Standards Committee “B” to review Committee “A's...
7 CFR 1728.30 - Inclusion of an item for listing or technical acceptance.
Code of Federal Regulations, 2010 CFR
2010-01-01
... be referred to Technical Standards Committee “B.” Written notice of Technical Standards Committee “A... 7 Agriculture 11 2010-01-01 2010-01-01 false Inclusion of an item for listing or technical... AND CONSTRUCTION § 1728.30 Inclusion of an item for listing or technical acceptance. (a) Scope. RUS...
29 CFR 4.131 - Furnishing services involving more than use of labor.
Code of Federal Regulations, 2013 CFR
2013-07-01
... laundered items on a rental basis. It is plain from the legislative history that such a contract is typical..., computer services, and the like are within the general coverage of the Act even though the contractor may be required to furnish such tangible items as written reports or computer printouts, since items of...
29 CFR 4.131 - Furnishing services involving more than use of labor.
Code of Federal Regulations, 2014 CFR
2014-07-01
... laundered items on a rental basis. It is plain from the legislative history that such a contract is typical..., computer services, and the like are within the general coverage of the Act even though the contractor may be required to furnish such tangible items as written reports or computer printouts, since items of...
29 CFR 4.131 - Furnishing services involving more than use of labor.
Code of Federal Regulations, 2012 CFR
2012-07-01
... laundered items on a rental basis. It is plain from the legislative history that such a contract is typical..., computer services, and the like are within the general coverage of the Act even though the contractor may be required to furnish such tangible items as written reports or computer printouts, since items of...
Information Concerning Preparation of Specifications for Carpeting.
ERIC Educational Resources Information Center
Gilliland, John W.
This paper argues for detailed, written carpeting specifications to assure that schools obtain quality products at competitive prices. The advantages of and specifications for school carpeting are given. A sample written specification contains items on: scope, general features, materials, acoustic characteristics, identification and acoustic…
Li, Degao; Gao, Kejuan; Zhang, Yue; Wu, Xueyun
2012-01-01
Inspired by a previous study of Korean deaf and hard of hearing adolescents, the researchers conducted a priming task of living-nonliving categorization with a sample of Chinese deaf and hard of hearing adolescents. The sample in this study had significantly lower accuracy levels for the thematically related items than for the taxonomically related items and significantly larger differences in reaction times than a group of hearing adolescents when stimuli were changed from pictures to written words. However, they were not significantly different from the hearing adolescents in their performance with the taxonomically related written words. Furthermore, unlike the hearing adolescents, they did not have significantly different reaction times as the result of changes in positions of stimulus presentations.
Writing superiority in cued recall
Fueller, Carina; Loescher, Jens; Indefrey, Peter
2013-01-01
In list learning paradigms with free recall, written recall has been found to be less susceptible to intrusions of related concepts than spoken recall when the list items had been visually presented. This effect has been ascribed to the use of stored orthographic representations from the study phase during written recall (Kellogg, 2001). In other memory retrieval paradigms, by contrast, either better recall for modality-congruent items or an input-independent writing superiority effect have been found (Grabowski, 2005). In a series of four experiments using a paired associate learning paradigm we tested (a) whether output modality effects on verbal recall can be replicated in a paradigm that does not involve the rejection of semantically related intrusion words, (b) whether a possible superior performance for written recall was due to a slower response onset for writing as compared to speaking in immediate recall, and (c) whether the performance in paired associate word recall was correlated with performance in an additional episodic memory recall task. We observed better written recall in the first half of the recall phase, irrespective of the modality in which the material was presented upon encoding. An explanation for this effect based on longer response latencies for writing and hence more time for memory retrieval could be ruled out by showing that the effect persisted in delayed response versions of the task. Although there was some evidence that stored additional episodic information may contribute to the successful retrieval of associate words, this evidence was only found in the immediate response experiments and hence is most likely independent from the observed output modality effect. In sum, our results from a paired associate learning paradigm suggest that superior performance for written vs. spoken recall cannot be (solely) explained in terms of additional access to stored orthographic representations from the encoding phase. Our findings rather suggest a general writing-superiority effect at the time of memory retrieval. PMID:24151483
Measuring genetic knowledge: a brief survey instrument for adolescents and adults.
Fitzgerald-Butt, S M; Bodine, A; Fry, K M; Ash, J; Zaidi, A N; Garg, V; Gerhardt, C A; McBride, K L
2016-02-01
Basic knowledge of genetics is essential for understanding genetic testing and counseling. The lack of a written, English language, validated, published measure has limited our ability to evaluate genetic knowledge of patients and families. Here, we begin the psychometric analysis of a true/false genetic knowledge measure. The 18-item measure was completed by parents of children with congenital heart defects (CHD) (n = 465) and adolescents and young adults with CHD (age: 15-25, n = 196) with a mean total correct score of 12.6 [standard deviation (SD) = 3.5, range: 0-18]. Utilizing exploratory factor analysis, we determined that one to three correlated factors, or abilities, were captured by our measure. Through confirmatory factor analysis, we determined that the two factor model was the best fit. Although it was necessary to remove two items, the remaining items exhibited adequate psychometric properties in a multidimensional item response theory analysis. Scores for each factor were computed, and a sum-score conversion table was derived. We conclude that this genetic knowledge measure discriminates best at low knowledge levels and is therefore well suited to determine a minimum adequate amount of genetic knowledge. However, further reliability testing and validation in diverse research and clinical settings is needed. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Person Response Functions and the Definition of Units in the Social Sciences
ERIC Educational Resources Information Center
Engelhard, George, Jr.; Perkins, Aminah F.
2011-01-01
Humphry (this issue) has written a thought-provoking piece on the interpretation of item discrimination parameters as scale units in item response theory. One of the key features of his work is the description of an item response theory (IRT) model that he calls the logistic measurement function that combines aspects of two traditions in IRT that…
Advances Afoot in Microbiology.
Patel, Robin; Karon, Brad S
2017-07-01
In 2016, the American Academy of Microbiology convened a colloquium to examine point-of-care (POC) microbiology testing and to evaluate its effects on clinical microbiology. Colloquium participants included representatives from clinical microbiology laboratories, industry, and the government, who together made recommendations regarding the implementation, oversight, and evaluation of POC microbiology testing. The colloquium report is timely and well written (V. Dolen et al., Changing Diagnostic Paradigms for Microbiology, 2017, https://www.asm.org/index.php/colloquium-reports/item/6421-changing-diagnostic-paradigms-for-microbiology?utm_source=Commentary&utm_medium=referral&utm_campaign=diagnostics). Emerging POC microbiology tests, especially nucleic acid amplification tests, have the potential to advance medical care. Copyright © 2017 American Society for Microbiology.
Learners' Perspectives on Authenticity.
ERIC Educational Resources Information Center
Chavez, Monika M. Th.
A survey investigated the attitudes of second language learners about authentic texts, written and oral, used for language instruction. Respondents were 186 randomly-selected university students of German. The students were administered a 212-item questionnaire (the items are appended) that requested information concerning student demographic…
Chatterji, Madhabi; Graham, Mark J; Wyer, Peter C
2009-12-01
The complex competency labeled practice-based learning and improvement (PBLI) by the Accreditation Council for Graduate Medical Education (ACGME) incorporates core knowledge in evidence-based medicine (EBM). The purpose of this study was to operationally define a "PBLI-EBM" domain for assessing resident physician competence. The authors used an iterative design process to first content analyze and map correspondences between ACGME and EBM literature sources. The project team, including content and measurement experts and residents/fellows, parsed, classified, and hierarchically organized embedded learning outcomes using a literature-supported cognitive taxonomy. A pool of 141 items was produced from the domain and assessment specifications. The PBLI-EBM domain and resulting items were content validated through formal reviews by a national panel of experts. The final domain represents overlapping PBLI and EBM cognitive dimensions measurable through written, multiple-choice assessments. It is organized as 4 subdomains of clinical action: Therapy, Prognosis, Diagnosis, and Harm. Four broad cognitive skill branches (Ask, Acquire, Appraise, and Apply) are subsumed under each subdomain. Each skill branch is defined by enabling skills that specify the cognitive processes, content, and conditions pertinent to demonstrable competence. Most items passed content validity screening criteria and were prepared for test form assembly and administration. The operational definition of PBLI-EBM competence is based on a rigorously developed and validated domain and item pool, and substantially expands conventional understandings of EBM. The domain, assessment specifications, and procedures outlined may be used to design written assessments to tap important cognitive dimensions of the overall PBLI competency, as given by ACGME. For more comprehensive coverage of the PBLI competency, such instruments need to be complemented with performance assessments.
Chatterji, Madhabi; Graham, Mark J.; Wyer, Peter C.
2009-01-01
Purpose: The complex competency labeled practice-based learning and improvement (PBLI) by the Accreditation Council for Graduate Medical Education (ACGME) incorporates core knowledge in evidence-based medicine (EBM). The purpose of this study was to operationally define a “PBLI-EBM” domain for assessing resident physician competence. Method: The authors used an iterative design process to first content analyze and map correspondences between ACGME and EBM literature sources. The project team, including content and measurement experts and residents/fellows, parsed, classified, and hierarchically organized embedded learning outcomes using a literature-supported cognitive taxonomy. A pool of 141 items was produced from the domain and assessment specifications. The PBLI-EBM domain and resulting items were content validated through formal reviews by a national panel of experts. Results: The final domain represents overlapping PBLI and EBM cognitive dimensions measurable through written, multiple-choice assessments. It is organized as 4 subdomains of clinical action: Therapy, Prognosis, Diagnosis, and Harm. Four broad cognitive skill branches (Ask, Acquire, Appraise, and Apply) are subsumed under each subdomain. Each skill branch is defined by enabling skills that specify the cognitive processes, content, and conditions pertinent to demonstrable competence. Most items passed content validity screening criteria and were prepared for test form assembly and administration. Conclusions: The operational definition of PBLI-EBM competence is based on a rigorously developed and validated domain and item pool, and substantially expands conventional understandings of EBM. The domain, assessment specifications, and procedures outlined may be used to design written assessments to tap important cognitive dimensions of the overall PBLI competency, as given by ACGME. For more comprehensive coverage of the PBLI competency, such instruments need to be complemented with performance assessments. PMID:21975994
Specific test and evaluation plan
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hays, W.H.
1998-03-20
The purpose of this Specific Test and Evaluation Plan (STEP) is to provide a detailed written plan for the systematic testing of modifications made to the 241-AX-B Valve Pit by the W-314 Project. The STEP develops the outline for test procedures that verify the system's performance to the established Project design criteria. The STEP is a lower tier document based on the W-314 Test and Evaluation Plan (TEP). Testing includes Validations and Verifications (e.g., Commercial Grade Item Dedication activities), Factory Acceptance Tests (FATs), installation tests and inspections, Construction Acceptance Tests (CATs), Acceptance Test Procedures (ATPs), Pre-Operational Test Procedures (POTPs), and Operational Test Procedures (OTPs). It should be noted that POTPs are not required for testing of the transfer line addition. The STEP will be utilized in conjunction with the TEP for verification and validation.
Heron, Jon; Crane, Catherine; Gunnell, David; Lewis, Glyn; Evans, Jonathan; Williams, J. Mark G.
2012-01-01
Although the Autobiographical Memory Test (AMT) is widely used, its psychometric properties have rarely been investigated. This paper utilises data gathered from a 10-item written version of the AMT, completed by 5792 adolescents participating in the Avon Longitudinal Study of Parents and Children, to examine the psychometric properties of the measure. The results show that the scale derived from responses to the AMT operates well over a wide range of scores, consistent with the aim of deriving a continuous measure of over-general memory. There was strong evidence of group differences in terms of gender, low negative mood, and IQ, and these were in agreement when comparing an item response theory (IRT) approach with that based on a sum score. One advantage of the IRT model is the ability to assess and consequently allow for differential item functioning. This additional analysis showed evidence of response bias for both gender and mood, resulting in attenuation in the mean differences in AMT across these groups. Implications of the findings for the use of the AMT measure in different samples are discussed. PMID:22348421
ERIC Educational Resources Information Center
Anderson, Daniel; Alonzo, Julie; Tindal, Gerald
2012-01-01
In this technical report, we describe the results of a study of mathematics items written to align with the Common Core State Standards (CCSS) in grades 6-8. In each grade, CCSS items were organized into forms, and the reliability of these forms was evaluated along with an experimental form including items aligned with the National Council of…
Assaad, Michael-Andrew; Janvier, Annie; Lapointe, Anie
2018-02-01
This study determined whether there was a difference in the conclusions reached by neonatologists in morbidity and mortality conferences based on their level of involvement in a case. All neonatal deaths occurring between August 2014 and September 2015 at the neonatal intensive care unit of Sainte-Justine Hospital, Montreal, Quebec, Canada, were reviewed by internal physicians involved in the case and external physicians who were not. The reviewers were asked to identify positive and negative clinical practice items and provide written recommendations. These were classified into eight categories and compared for each case. During the study, 55 patients died leading to 110 reviews and a total of 590 positive and negative items. Most items were in the communication (25.2%), ethical decision-making (16.7%) and clinical management (14.8%) categories. Both the internal and external reviewers were in agreement 48.5% of the time for positive items and 44.8% for negative items. There were 242 written recommendations, which differed significantly among the internal and external reviewers. Reviews of neonatal deaths by two independent reviewers, internal physicians and external physicians, led to different positive and negative practice items and recommendations. This could allow for a richer discussion and improve recommendations for patient care. ©2017 Foundation Acta Paediatrica. Published by John Wiley & Sons Ltd.
77 FR 66599 - Environmental Management Site-Specific Advisory Board, Northern New Mexico
Federal Register 2010, 2011, 2012, 2013, 2014
2012-11-06
... Business Written Reports Report on EM SSAB Chairs' Meeting, Carlos Valdez and Manuel Pacheco, Vice-Chair... Business Consideration and Action on 2012 Self Evaluation (Section X. Bylaws), Carlos Valdez Other Items 2... seven days in advance of the meeting at the telephone number listed above. Written statements may be...
7 CFR 1728.40 - Procedure for submission of a proposal.
Code of Federal Regulations, 2010 CFR
2010-01-01
... § 1728.40 Procedure for submission of a proposal. (a) Written Request. Consideration of an item of material or equipment will be obtained by the sponsor through the submission of a written request in an original and five copies addressed to the Chairman, Technical Standards Committee “A” (Electric). The...
32 CFR 750.27 - Information and supporting documentation.
Code of Federal Regulations, 2010 CFR
2010-07-01
... Federal agency. Upon written request, a copy of the report of the examining physician shall be provided... made for lost wages, a written statement from the employer itemizing actual time and wages lost; (v) If... States for the personal injury or the damages claimed. (3) Property damage. (i) Proof of ownership; (ii...
Modern Written Arabic, Volume II.
ERIC Educational Resources Information Center
Naja, A. Nashat; Snow, James A.
This second volume of Modern Written Arabic builds on the previous volume and is the second step designed to teach members of the Foreign Service to read the modern Arabic press. The student will gain recognitional mastery of an extensive set of vocabulary items and will be more intensively exposed to wider and more complex morphological and…
History Untold: Celebrating Ohio History through ABLE Students.
ERIC Educational Resources Information Center
Kent State Univ., OH. Ohio Literacy Resource Center.
This document is a compilation of 25 pieces of writing presenting Ohio adult basic and literacy education (ABLE) students' perspectives of community and personal history. The items included in the compilation were written by ABLE students across Ohio. The compilation is organized in three sections as follows: (1) people (9 items, including a…
ERIC Educational Resources Information Center
Taylor, J. S. H.; Plunkett, Kim; Nation, Kate
2011-01-01
Two experiments explored learning, generalization, and the influence of semantics on orthographic processing in an artificial language. In Experiment 1, 16 adults learned to read 36 novel words written in novel characters. Posttraining, participants discriminated trained from untrained items and generalized to novel items, demonstrating extraction…
Project W-314 specific test and evaluation plan for AZ tank farm upgrades
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hays, W.H.
1998-08-12
The purpose of this Specific Test and Evaluation Plan (STEP) is to provide a detailed written plan for the systematic testing of modifications made by the addition of the SN-631 transfer line from the AZ-01A pit to the AZ-02A pit by the W-314 Project. The STEP develops the outline for test procedures that verify the system's performance to the established Project design criteria. The STEP is a lower tier document based on the W-314 Test and Evaluation Plan (TEP). Testing includes Validations and Verifications (e.g., Commercial Grade Item Dedication activities, etc.), Factory Tests and Inspections (FTIs), installation tests and inspections, Construction Tests and Inspections (CTIs), Acceptance Test Procedures (ATPs), Pre-Operational Test Procedures (POTPs), and Operational Test Procedures (OTPs). The STEP will be utilized in conjunction with the TEP for verification and validation.
Qualitative Development of the PROMIS® Pediatric Stress Response Item Banks
Gardner, William; Pajer, Kathleen; Riley, Anne W.; Forrest, Christopher B.
2013-01-01
Objective To describe the qualitative development of the Patient-Reported Outcome Measurement Information System (PROMIS®) Pediatric Stress Response item banks. Methods Stress response concepts were specified through a literature review and interviews with content experts, children, and parents. A library comprising 2,677 items derived from 71 instruments was developed. Items were classified into conceptual categories; new items were written and redundant items were removed. Items were then revised based on cognitive interviews (n = 39 children), readability analyses, and translatability reviews. Results 2 pediatric Stress Response sub-domains were identified: somatic experiences (43 items) and psychological experiences (64 items). Final item pools cover the full range of children’s stress experiences. Items are comprehensible among children aged ≥8 years and ready for translation. Conclusions Child- and parent-report versions of the item banks assess children’s somatic and psychological states when demands tax their adaptive capabilities. PMID:23124904
75 FR 82003 - Environmental Management Site-Specific Advisory Board, Northern New Mexico
Federal Register 2010, 2011, 2012, 2013, 2014
2010-12-29
..., 2010 Meeting Minutes. 1:30 p.m. Public Comment Period. 1:45 p.m. Old Business. Written Reports. Other Items. 2 p.m. New Business. Report on Long-Term Surveillance Conference, Robert Gallegos and Robert... telephone number listed above. Written statements may be filed with the Board either before or after the...
76 FR 78909 - Environmental Management Site-Specific Advisory Board, Northern New Mexico
Federal Register 2010, 2011, 2012, 2013, 2014
2011-12-20
... Period. 1:45 p.m. Old Business. Written Reports. Other Items. 2 p.m. New Business, Ralph Phelps. 2:30 p.m... meeting at the telephone number listed above. Written statements may be filed with the Board either before... facilitate the orderly conduct of business. Individuals wishing to make public comments will be provided a...
77 FR 12044 - Environmental Management Site-Specific Advisory Board, Northern New Mexico
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-28
..., Meeting Minutes 1:30 p.m.--Public Comment Period 1:45 p.m.--Old Business Written Reports Report on Waste Management Symposia, Manuel Pacheco and Joe Tiano Other Items 2 p.m.--New Business Approval of NNMCAB Top... Santistevan at least seven days in advance of the meeting at the telephone number listed above. Written...
75 FR 35446 - Environmental Management Site-Specific Advisory Board, Northern New Mexico
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-22
... Period. 1:30 p.m. Old Business. Written reports. Update on Fall EM SSAB Chairs' Meeting (Hosted by NNMCAB). Other items. 1:45 p.m. New Business. EM SSAB Chairs' Recommendation on Baseline Funding Support, Ralph... listed above. Written statements may be filed with the Board either before or after the meeting...
76 FR 51361 - Environmental Management Site-Specific Advisory Board, Northern New Mexico
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-18
... Minutes. 1:30 p.m. Public Comment Period. 1:45 p.m. Old Business; Written Reports; Other Items. 2 p.m. New Business: Report from Nominating Committee (Section V, F. of NNMCAB Bylaws), Deb Shaw; [[Page 51362... telephone number listed above. Written statements may be filed with the Board either before or after the...
75 FR 9886 - Environmental Management Site-Specific Advisory Board, Northern New Mexico
Federal Register 2010, 2011, 2012, 2013, 2014
2010-03-04
.... Old Business Written reports Other items 1:45 p.m. New Business 2 p.m. Overview of Los Alamos National... Santistevan at least seven days in advance of the meeting at the telephone number listed above. Written... conduct the meeting in a fashion that will facilitate the orderly conduct of business. Individuals wishing...
ERIC Educational Resources Information Center
Stodden, Robert A.; And Others
1989-01-01
Educational records of 127 secondary students with disabilities were reviewed to assess impact of vocational assessment information in Individualized Education Plan (IEP) development process. Results focus on the impact of information on the number of IEP vocational goals and objectives written, kinds of IEP items written, and number of IEP goals…
Asante, Isaac; Andoh, Irene; Muijtjens, Arno M M; Donkers, Jeroen
2017-05-01
To assess the stakeholders' perceptions on the competency of entry-level pharmacists and the use of written licensure examination as the primary assessment for licensure decisions on entry-level pharmacists who have completed the Pharmacy Internship Program 1 (PIP) in developing countries. A cross-sectional survey was conducted among stakeholders in which they completed a web-based 21-item pre-tested questionnaire to determine their views regarding the competency outcomes and assessment program for entry-level pharmacists. The stakeholders rated the entry-level pharmacists to possess all competencies except research skills. Stakeholders suggested improvement of the program by defining the competency framework and training preceptors. However, stakeholders disagree on using written examination as the primary assessment for licensure decisions and suggested the incorporation of other performance-based assessments like preceptors' assessment reports. Stakeholders are uncertain whether entry-level pharmacists in developing countries possess adequate research competencies and think their assessment program for licensure needs more than a written examination to assess all required competencies. Copyright © 2017 Elsevier Inc. All rights reserved.
Two for one: a self-management plan coupled with a prescription sheet for children with asthma.
Ducharme, Francine M; Noya, Francisco; McGillivray, David; Resendes, Sandy; Ducharme-Bénard, Stéphanie; Zemek, Roger; Bhogal, Sanjit Kaur; Rouleau, Rachel
2008-10-01
Despite strong recommendations in the asthma guidelines, the use of written self-management plans remains low among asthmatic patients. To develop a written self-management plan, based on scientific evidence and expert opinions, in a format intended to facilitate its dispensing by health care professionals, and to test the perception of its relevance and clarity by asthmatic children, adolescents and adults. Inspired by previously tested self-management plans, surveys of asthma educators, expert opinions and the 2004 Canadian Asthma Guidelines, the authors simultaneously developed French and English versions of a written self-management plan that is coupled with a prescription. The self-management plan was tested in parents and their asthmatic children (aged one to 17 years), and it was revised until 85% clarity and perceived relevance was achieved. Ninety-seven children and their parents were interviewed. Twenty per cent had a self-management plan. On the final revision, nearly all items were clear and perceived as relevant by 85% or more of the interviewees. Two self-management plans were designed for clinics and acute care settings, respectively. The plans are divided into three control zones identified by symptoms with optional peak flow values and symbolized by traffic light colours. They are designed in triplicate format with a prescription slip, a medical chart copy and a patient copy. The written self-management plans, based on available scientific evidence and expert opinions, are clear and perceived to be relevant by children, adolescents and their parents. By incorporating the prescription and chart copies, they were designed to facilitate dispensing by physicians in both clinics and acute care settings.
Smith, Michelle K; Wood, William B; Knight, Jennifer K
2008-01-01
We have designed, developed, and validated a 25-question Genetics Concept Assessment (GCA) to test achievement of nine broad learning goals in majors and nonmajors undergraduate genetics courses. Written in everyday language with minimal jargon, the GCA is intended for use as a pre- and posttest to measure student learning gains. The assessment was reviewed by genetics experts, validated by student interviews, and taken by >600 students at three institutions. Normalized learning gains on the GCA were positively correlated with averaged exam scores, suggesting that the GCA measures understanding of topics relevant to instructors. Statistical analysis of our results shows that differences in the item difficulty and item discrimination index values between different questions on pre- and posttests can be used to distinguish between concepts that are well or poorly learned during a course.
Wood, William B.; Knight, Jennifer K.
2008-01-01
We have designed, developed, and validated a 25-question Genetics Concept Assessment (GCA) to test achievement of nine broad learning goals in majors and nonmajors undergraduate genetics courses. Written in everyday language with minimal jargon, the GCA is intended for use as a pre- and posttest to measure student learning gains. The assessment was reviewed by genetics experts, validated by student interviews, and taken by >600 students at three institutions. Normalized learning gains on the GCA were positively correlated with averaged exam scores, suggesting that the GCA measures understanding of topics relevant to instructors. Statistical analysis of our results shows that differences in the item difficulty and item discrimination index values between different questions on pre- and posttests can be used to distinguish between concepts that are well or poorly learned during a course. PMID:19047428
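The item difficulty and item discrimination index named in the abstract above can be illustrated with a minimal sketch. It assumes dichotomous scoring (1 = correct, 0 = incorrect) and the common upper-lower 27% convention for the discrimination index; the abstract does not state which formula the authors used, and the function names `item_difficulty` and `discrimination_index` are hypothetical.

```python
from typing import List


def item_difficulty(responses: List[int]) -> float:
    """Proportion of students who answered the item correctly (1 = correct)."""
    return sum(responses) / len(responses)


def discrimination_index(responses: List[int], totals: List[float],
                         fraction: float = 0.27) -> float:
    """Upper-lower index: P(correct | top scorers) - P(correct | bottom scorers).

    Assumption: groups are the top and bottom `fraction` of students ranked by
    total test score (the 27% rule is a convention, not taken from the source).
    """
    n = len(responses)
    k = max(1, round(n * fraction))          # size of each extreme group
    order = sorted(range(n), key=lambda i: totals[i])
    low = [responses[i] for i in order[:k]]  # lowest total scores
    high = [responses[i] for i in order[-k:]]  # highest total scores
    return sum(high) / k - sum(low) / k


# Ten students: one item's scores and their total test scores.
item = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
totals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(item_difficulty(item))               # proportion correct
print(discrimination_index(item, totals))  # top group minus bottom group
```

Comparing these two values for the same question on the pretest and the posttest is one way to see whether a concept moved from poorly learned to well learned.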
Health Literacy and Cancer Prevention: Two New Instruments to Assess Comprehension
Mazor, Kathleen M.; Roblin, Douglas W.; Williams, Andrew E.; Greene, Sarah M.; Gaglio, Bridget; Field, Terry S.; Costanza, Mary E.; Han, Paul K. J.; Saccoccio, Laura; Calvi, Josephine; Cove, Erica; Cowan, Rebecca
2012-01-01
Objectives Ability to understand spoken health information is an important facet of health literacy, but to date, no instrument has been available to quantify patients’ ability in this area. We sought to develop a test to assess comprehension of spoken health messages related to cancer prevention and screening to fill this gap, and a complementary test of comprehension of written health messages. Methods We used the Sentence Verification Technique to write items based on realistic health messages about cancer prevention and screening, including media messages, clinical encounters and clinical print materials. Items were reviewed, revised, and pre-tested. Adults aged 40 to 70 participated in a pilot administration in Georgia, Hawaii, and Massachusetts. Results The Cancer Message Literacy Test-Listening is self-administered via touchscreen laptop computer. No reading is required. It takes approximately 1 hour. The Cancer Message Literacy Test-Reading is self-administered on paper. It takes approximately 10 minutes. Conclusions These two new tests will allow researchers to assess comprehension of spoken health messages, to examine the relationship between listening and reading literacy, and to explore the impact of each form of literacy on health-related outcomes. Practice Implications Researchers and clinicians now have a means of measuring comprehension of spoken health information. PMID:22244323
[Student Magazine of the ESL Classes of the International Ladies' Garment Workers' Union (ILGWU).
ERIC Educational Resources Information Center
Alvarez, Manuel, Ed.; Zetino, Alfredo, Ed.
This student magazine created by the English-as-a-Second-Language (ESL) classes of the International Ladies' Garment Workers' Union (ILGWU) is a collection of personal opinions, reports, and creative writing with illustrations. Each item was written as a voluntary collaboration, homework, or classwork. Items include poems, letters, accounts of…
ERIC Educational Resources Information Center
Jones, Anna Marie; Punia, Mandeep; Young, Shannan; Huegli, Carol Chase; Zidenberg-Cherr, Sheri
2013-01-01
Purpose/Objectives: The objective of this study was to determine the perceived training needs of California school nutrition personnel. Methods: A questionnaire was developed using items from previous questionnaires administered to similar populations. New items were written based on feedback from stakeholders. Respondents were asked to rate their…
76 FR 36101 - Environmental Management Site-Specific Advisory Board, Northern New Mexico; Meeting
Federal Register 2010, 2011, 2012, 2013, 2014
2011-06-21
... Approval of Agenda and May 12, 2011 Meeting Minutes 1:30 p.m. Public Comment Period 1:45 p.m. Old Business Written Reports Other Items 2:00 p.m. New Business Report on Semi-Annual EM SSAB Chairs' Meeting Report... seven days in advance of the meeting at the telephone number listed above. Written statements may be...
Investigating the impact of automated feedback on students' scientific argumentation
NASA Astrophysics Data System (ADS)
Zhu, Mengxiao; Lee, Hee-Sun; Wang, Ting; Liu, Ou Lydia; Belur, Vinetha; Pallant, Amy
2017-08-01
This study investigates the role of automated scoring and feedback in supporting students' construction of written scientific arguments while learning about factors that affect climate change in the classroom. The automated scoring and feedback technology was integrated into an online module. Students' written scientific argumentation occurred when they responded to structured argumentation prompts. After submitting the open-ended responses, students received scores generated by a scoring engine and written feedback associated with the scores in real-time. Using the log data that recorded argumentation scores as well as argument submission and revisions activities, we answer three research questions. First, how students behaved after receiving the feedback; second, whether and how students' revisions improved their argumentation scores; and third, did item difficulties shift with the availability of the automated feedback. Results showed that the majority of students (77%) made revisions after receiving the feedback, and students with higher initial scores were more likely to revise their responses. Students who revised had significantly higher final scores than those who did not, and each revision was associated with an average increase of 0.55 on the final scores. Analysis on item difficulty shifts showed that written scientific argumentation became easier after students used the automated feedback.
Looman, Wendy Sue; Farrag, Shewikar
2009-01-01
Social capital, defined as an investment in relationships that facilitates the exchange of resources, has been identified as a possible protective factor for child health in the context of risk factors such as poverty. Reliable and valid measures of social capital are needed for research and practice, particularly in non-English-speaking populations in developing countries. To evaluate the psychometric properties and cross-cultural equivalence of the Arabic translation of the Social Capital Scale (SCS). Descriptive, cross-sectional study for psychometric testing of a translated tool. Two metropolitan health clinics in Alexandria, Egypt. A convenience sample of 117 Egyptian parents of children with chronic conditions. To be eligible to participate, respondents had to be a parent of child with a chronic health condition between the ages of 1 and 18 years. The sample included primarily biological parents between the ages of 20 and 56 years. The 20-item Arabic SCS was administered as part of a written survey that included additional measures on demographic information and parent ratings of the child's overall health. Six items were ultimately removed based on item analysis, and exploratory factor analysis was conducted on the resulting 14-item scale. As a measure of construct validity, hypothesis testing was conducted using an independent samples t-test to determine whether a significant difference exists between mean total social capital scores for two groups of respondents based on the parental rating of the child's overall health. Item and factor analysis yielded preliminary support for a revised, 14-item Arabic SCS with four internally consistent factors. The standardized item alpha reliability coefficient for the total 14-item scale was .75. Respondents who reported that their child was in good health had significantly higher social capital scores than those who rated their child's health as poor. 
The 14-item Arabic SCS was found to be reliable and valid in this sample, with four internally consistent factors. While the tool may not be appropriate for comparing social capital between cultural groups, it will enable clinicians and researchers to address an important gap in knowledge characterized by a paucity of research on childhood chronic illness in low- and middle-income countries such as Egypt.
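The standardized item alpha reported in the abstract above is conventionally computed from the mean inter-item correlation. The following is a minimal sketch of that textbook formula, alpha = k * r / (1 + (k - 1) * r); the helper names `pearson` and `standardized_alpha` are hypothetical, and the data are invented for illustration only.

```python
from itertools import combinations
from math import sqrt
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def standardized_alpha(items: List[List[float]]) -> float:
    """Standardized alpha from the mean inter-item correlation.

    `items` holds one list per item, each with one score per respondent.
    """
    k = len(items)
    pairs = list(combinations(items, 2))
    r_bar = sum(pearson(x, y) for x, y in pairs) / len(pairs)
    return k * r_bar / (1 + (k - 1) * r_bar)


# Two invented items answered by three respondents.
print(standardized_alpha([[1, 2, 3], [1, 3, 2]]))
```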
A Concealed Information Test with multimodal measurement.
Ambach, Wolfgang; Bursch, Stephanie; Stark, Rudolf; Vaitl, Dieter
2010-03-01
A Concealed Information Test (CIT) investigates differential physiological responses to deed-related (probe) vs. irrelevant items. The present study focused on the detection of concealed information using simultaneous recordings of autonomic and brain electrical measures. As a secondary issue, verbal and pictorial presentations were compared with respect to their influence on the recorded measures. Thirty-one participants underwent a mock-crime scenario with a combined verbal and pictorial presentation of nine items. The subsequent CIT, designed with respect to event-related potential (ERP) measurement, used a 3-3.5 s interstimulus interval. The item presentation modality, i.e. pictures or written words, was varied between subjects; no response was required from the participants. In addition to electroencephalogram (EEG), electrodermal activity (EDA), electrocardiogram (ECG), respiratory activity, and finger plethysmogram were recorded. A significant probe-vs.-irrelevant effect was found for each of the measures. Compared to sole ERP measurement, the combination of ERP and EDA yielded incremental information for detecting concealed information. However, EDA per se did not reach the predictive value known from studies primarily designed for peripheral physiological measurement. Presentation modality influenced the detection accuracy of neither autonomic measures nor EEG measures; this underpins the equivalence of verbal and pictorial item presentation in a CIT, regardless of the physiological measures recorded. Future studies should further clarify whether the incremental validity observed in the present study reflects a differential sensitivity of ERP and EDA to different sub-processes in a CIT. Copyright 2009 Elsevier B.V. All rights reserved.
McElhiney, Judith; Lohse, Matthew R; Arora, Amindra S; Peloquin, Joanna M; Geno, Debra M; Kuntz, Melissa M; Enders, Felicity B; Fredericksen, Mary; Abdalla, Adil A; Khan, Yulia; Talley, Nicholas J; Diehl, Nancy N; Beebe, Timothy J; Harris, Ann M; Farrugia, Gianrico; Graner, Darlene E; Murray, Joseph A; Locke, G Richard; Grothe, Rayna M; Crowell, Michael D; Francis, Dawn L; Grudell, April M B; Dabade, Tushar; Ramirez, Angelica; Alkhatib, MhdMaan; Alexander, Jeffrey A; Kimber, Jessica; Prasad, Ganapathy; Zinsmeister, Alan R; Romero, Yvonne
2010-09-01
The aim of this study was to develop the Mayo Dysphagia Questionnaire-30 Day (MDQ-30), a tool to measure esophageal dysphagia, by adapting items from validated instruments for use in clinical trials, and assess its feasibility, reproducibility, and concurrent validity. Outpatients referred to endoscopy for dysphagia or seen in a specialty clinic were recruited. Feasibility testing was done to identify problematic items. Reproducibility was measured by test-retest format. Concurrent validity reflects agreement between information gathered in a structured interview versus the patients' written responses. The MDQ-30, a 28-item instrument, took 10 min (range = 5-30 min) to complete. Four hundred thirty-one outpatients [210 (49%) men; mean age = 61 years] participated. Overall, most concurrent validity kappa values for dysphagia were very good to excellent with a median of 0.78 (min 0.28, max 0.95). The majority of reproducibility kappa values for dysphagia were moderate to excellent with a median kappa value of 0.66 (min 0.07, max 1.0). Overall, concurrent validity and reproducibility kappa values for gastroesophageal reflux disease (GERD) symptoms were 0.81 (95% CI = 0.72, 0.91) and 0.66 (95% CI = 0.55, 0.77), respectively. Individual item percent agreement was generally very good to excellent. Internal consistency was excellent. We conclude that the MDQ-30 is an easy-to-complete tool to reliably evaluate dysphagia symptoms over the last 30 days.
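The kappa values in the abstract above measure chance-corrected agreement between two sets of categorical answers, for example interview versus written responses. A minimal sketch follows, assuming unweighted Cohen's kappa (the abstract says only "kappa"); the function name `cohen_kappa` and the sample answers are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two paired lists of categorical answers."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("inputs must be non-empty and equally long")
    # Observed proportion of exact agreement.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each source's marginal frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Invented example: the same symptom question asked in interview and on paper.
interview = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
written = ["yes", "yes", "no", "yes", "yes", "no", "no", "no"]
print(cohen_kappa(interview, written))
```

Here the two sources agree on 6 of 8 answers (0.75) while chance alone would give 0.5, so kappa is 0.5.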
Beginnings V: A Publication of Adult Student Writing of the 2002 Ohio Writers' Conference.
ERIC Educational Resources Information Center
Kent State Univ., OH. Ohio Literacy Resource Center.
This document is a compilation of 68 items that were written by Ohio adult basic and literacy education students and presented at the Fifth Annual Ohio Writers' Conference, which was devoted to the theme "writing and the arts." The compilation is organized in seven sections as follows: (1) choices (8 items, including a poem expressing…
ERIC Educational Resources Information Center
Farrell, Ann H.; Provenzano, Daniel A.; Spadafora, Natalie; Marini, Zopito A.; Volk, Anthony A.
2016-01-01
The purpose of this study was to develop a scale that measures adolescents' attitudes toward classroom incivility and determine whether items would reveal subscales. A sample of 549 adolescents between ages 11 and 18 (53.1% boys; M[subscript age] = 13.90, SD = 1.41) completed items written to measure attitudes toward classroom incivility. An…
Introduction to Psychology and Leadership. Rank-Biserial Correlation as an Item Discrimination.
ERIC Educational Resources Information Center
Westinghouse Learning Corp., Annapolis, MD.
Written as a technical report for the leadership course of the United States Naval Academy (see the final reports which summarize the course development project, EM 010 418, EM 010 419, and EM 010 484), this paper examines the use and interpretation of the rank-biserial correlation as an index of item discrimination. The advantages and…
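The rank-biserial correlation discussed in the report above relates a right/wrong item score to the rank order of total scores. The sketch below assumes Glass's formula, r = 2 * (mean rank of those who passed - mean rank of those who failed) / n, with tied totals given average ranks; the truncated abstract does not say which variant the report uses, and the names `average_ranks` and `rank_biserial` are hypothetical.

```python
from statistics import mean
from typing import List


def average_ranks(values: List[float]) -> List[float]:
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, made 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def rank_biserial(item: List[int], totals: List[float]) -> float:
    """Glass's rank-biserial correlation between a 0/1 item and total scores."""
    ranks = average_ranks(totals)
    passed = [r for r, x in zip(ranks, item) if x == 1]
    failed = [r for r, x in zip(ranks, item) if x == 0]
    if not passed or not failed:
        raise ValueError("item must have both correct and incorrect answers")
    return 2 * (mean(passed) - mean(failed)) / len(item)


# Invented data: every student who passed the item outranks every one who failed.
print(rank_biserial([0, 0, 1, 0, 1, 1], [10, 12, 15, 11, 18, 20]))
```

A value near 1 indicates that students who answered the item correctly tend to rank higher on the whole test, which is the sense in which the coefficient serves as a discrimination index.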
ERIC Educational Resources Information Center
Haberman, Shelby J.
2013-01-01
A general program for item-response analysis is described that uses the stabilized Newton-Raphson algorithm. This program is written to be compliant with Fortran 2003 standards and is sufficiently general to handle independent variables, multidimensional ability parameters, and matrix sampling. The ability variables may be either polytomous or…
Mental Models of Elementary and Middle School Students in Analyzing Simple Battery and Bulb Circuits
ERIC Educational Resources Information Center
Jabot, Michael; Henry, David
2007-01-01
Written assessment items were developed to probe students' understanding of a variety of direct current (DC) resistive electric circuit concepts. The items were used to explore the mental models that grade 3-8 students use in explaining the direction of electric current and how electric current is affected by different configurations of simple…
Breining, Bonnie; Nozari, Nazbanou; Rapp, Brenda
2016-04-01
Past research has demonstrated interference effects when words are named in the context of multiple items that share a meaning. This interference has been explained within various incremental learning accounts of word production, which propose that each attempt at mapping semantic features to lexical items induces slight but persistent changes that result in cumulative interference. We examined whether similar interference-generating mechanisms operate during the mapping of lexical items to segments by examining the production of words in the context of others that share segments. Previous research has shown that initial-segment overlap amongst a set of target words produces facilitation, not interference. However, this initial-segment facilitation is likely due to strategic preparation, an external factor that may mask underlying interference. In the present study, we applied a novel manipulation in which the segmental overlap across target items was distributed unpredictably across word positions, in order to reduce strategic response preparation. This manipulation led to interference in both spoken (Exp. 1) and written (Exp. 2) production. We suggest that these findings are consistent with a competitive learning mechanism that applies across stages and modalities of word production.
Federer, Meghan Rector; Nehm, Ross H.; Pearl, Dennis K.
2016-01-01
Understanding sources of performance bias in science assessment provides important insights into whether science curricula and/or assessments are valid representations of student abilities. Research investigating assessment bias due to factors such as instrument structure, participant characteristics, and item types is well documented across a variety of disciplines. However, the relationships among these factors are unclear for tasks evaluating understanding through performance on scientific practices, such as explanation. Using item-response theory (Rasch analysis), we evaluated differences in performance by gender on a constructed-response (CR) assessment about natural selection (ACORNS). Three isomorphic item strands of the instrument were administered to a sample of undergraduate biology majors and nonmajors (Group 1: n = 662 [female = 51.6%]; G2: n = 184 [female = 55.9%]; G3: n = 642 [female = 55.1%]). Overall, our results identify relationships between item features and performance by gender; however, the effect is small in the majority of cases, suggesting that males and females tend to incorporate similar concepts into their CR explanations. These results highlight the importance of examining gender effects on performance in written assessment tasks in biology. PMID:26865642
Applying Item Response Theory methods to design a learning progression-based science assessment
NASA Astrophysics Data System (ADS)
Chen, Jing
Learning progressions are used to describe how students' understanding of a topic progresses over time and to classify the progress of students into steps or levels. This study applies Item Response Theory (IRT) based methods to investigate how to design learning progression-based science assessments. The research questions of this study are: (1) how to use items in different formats to classify students into levels on the learning progression, (2) how to design a test to give good information about students' progress through the learning progression of a particular construct and (3) what characteristics of test items support their use for assessing students' levels. Data used for this study were collected from 1500 elementary and secondary school students during 2009-2010. The written assessment was developed in several formats such as the Constructed Response (CR) items, Ordered Multiple Choice (OMC) and Multiple True or False (MTF) items. The following are the main findings from this study. The OMC, MTF and CR items might measure different components of the construct. A single construct explained most of the variance in students' performances. However, additional dimensions in terms of item format can explain a certain amount of the variance in student performance. So additional dimensions need to be considered when we want to capture the differences in students' performances on different types of items targeting the understanding of the same underlying progression. Items in each item format need to be improved in certain ways to classify students more accurately into the learning progression levels. This study establishes some general steps that can be followed to design other learning progression-based tests as well. For example, first, the boundaries between levels on the IRT scale can be defined by using the means of the item thresholds across a set of good items. 
Second, items in multiple formats can be selected to achieve the information criterion at all the defined boundaries. This ensures the accuracy of the classification. Third, when item threshold parameters vary somewhat, the scoring rubrics and the items need to be reviewed to make the threshold parameters similar across items. This is because one important design criterion of the learning progression-based items is that ideally, a student should be at the same level across items, which means that the item threshold parameters (d1, d2 and d3) should be similar across items. To design a learning progression-based science assessment, we need to understand whether the assessment measures a single construct or several constructs and how items are associated with the constructs being measured. Results from dimension analyses indicate that items of different carbon transforming processes measure different aspects of the carbon cycle construct. However, items of different practices assess the same construct. In general, there are high correlations among different processes or practices. It is not clear whether the strong correlations are due to the inherent links among these process/practice dimensions or due to the fact that the student sample does not show much variation in these process/practice dimensions. Future data are needed to examine the dimensionalities in terms of process/practice in detail. Finally, based on item characteristics analysis, recommendations are made to write more discriminative CR items and better OMC and MTF options. Item writers can follow these recommendations to write better learning progression-based items.
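The first design step in the abstract above (level boundaries defined as the means of the item thresholds d1, d2 and d3 across a set of items) can be illustrated with a minimal Python sketch. The threshold values, the function names and the rule that a student sits one level higher for each boundary at or below the ability estimate are illustrative assumptions, not details taken from the study.

```python
from statistics import mean


def level_boundaries(item_thresholds):
    """Average each threshold (d1, d2, d3) across items to get level boundaries."""
    # zip(*rows) groups the k-th threshold of every item together.
    return [mean(column) for column in zip(*item_thresholds)]


def classify(theta, boundaries):
    """Return the 1-based level for an ability estimate theta (assumed rule)."""
    # The level rises by one for each boundary at or below theta.
    return 1 + sum(theta >= b for b in boundaries)


# Hypothetical threshold estimates in logits for three items (not study data).
thresholds = [
    (-1.2, 0.1, 1.4),
    (-0.8, 0.3, 1.6),
    (-1.0, 0.2, 1.5),
]
boundaries = level_boundaries(thresholds)
levels = [classify(theta, boundaries) for theta in (-2.0, 0.0, 0.9, 2.0)]
```

With three thresholds per item the sketch yields three boundaries and therefore four levels; items whose thresholds sit far from these means would be the ones the abstract suggests reviewing.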
Two-Year Follow-up of the Collision Auto Repair Safety Study (CARSS)
Bejan, Anca; Parker, David L.; Brosseau, Lisa M.; Xi, Min; Skan, Maryellen
2015-01-01
This paper presents an evaluation of the sustainability of health and safety improvements in small auto collision shops 1 year after the implementation of a year-long targeted intervention. During the first year (active phase), owners received quarterly phone calls, written reminders, safety newsletters, and access to online services and in-person assistance with creating safety programs and respirator fit testing. During the second year (passive phase), owners received up to three postcard reminders regarding the availability of free health and safety resources. Forty-five shops received an evaluation at baseline and at the end of the first year (Y1). Of these, 33 were evaluated at the end of the second year (Y2), using the same 92-item assessment tool. At Y1, investigators found that between 70 and 81% of the evaluated items were adequate in each business (mean = 73% items, SD = 11%). At Y2, between 63 and 89% of items were deemed adequate (mean = 73% items, SD = 9.5%). Three safety areas demonstrated statistically significant (P < 0.05) changes: compressed gasses (8% improvement), personal protective equipment (7% improvement), and respiratory protection (6% decline). The number of postcard reminders sent to each business did not affect the degree to which shops maintained safety improvements made during the first year of the intervention. However, businesses that received more postcards were more likely to request assistance services than those receiving fewer. PMID:25539646
NASA Astrophysics Data System (ADS)
Chambers, Timothy
This dissertation presents the results of an experiment that measured the learning outcomes associated with three different pedagogical approaches to introductory physics labs. These three pedagogical approaches presented students with the same apparatus and covered the same physics content, but used different lab manuals to guide students through distinct cognitive processes in conducting their laboratory investigations. We administered post-tests containing multiple-choice conceptual questions and free-response quantitative problems one week after students completed these laboratory investigations. In addition, we collected data from the laboratory practical exam taken by students at the end of the semester. Using these data sets, we compared the learning outcomes for the three curricula in three dimensions of ability: conceptual understanding, quantitative problem-solving skill, and laboratory skills. Our three pedagogical approaches are as follows. Guided labs lead students through their investigations via a combination of Socratic-style questioning and direct instruction, while students record their data and answers to written questions in the manual during the experiment. Traditional labs provide detailed written instructions, which students follow to complete the lab objectives. Open labs provide students with a set of apparatus and a question to be answered, and leave students to devise and execute an experiment to answer the question. In general, we find that students performing Guided labs perform better on some conceptual assessment items, and that students performing Open labs perform significantly better on experimental tasks. Combining a classical test theory analysis of post-test results with in-lab classroom observations allows us to identify individual components of the laboratory manuals and investigations that are likely to have influenced the observed differences in learning outcomes associated with the different pedagogical approaches. 
Due to the novel nature of this research and the large number of item-level results we produced, we recommend additional research to determine the reproducibility of our results. Analyzing the data with item response theory yields additional information about the performance of our students on both conceptual questions and quantitative problems. We find that performing lab activities on a topic does lead to better-than-expected performance on some conceptual questions regardless of pedagogical approach, but that this acquired conceptual understanding is strongly context-dependent. The results also suggest that a single "Newtonian reasoning ability" is inadequate to explain student response patterns to items from the Force Concept Inventory. We develop a framework for applying polytomous item response theory to the analysis of quantitative free-response problems and for analyzing how features of student solutions are influenced by problem-solving ability. Patterns in how students at different abilities approach our post-test problems are revealed, and we find hints as to how features of a free-response problem influence its item parameters. The item-response theory framework we develop provides a foundation for future development of quantitative free-response research instruments. Chapter 1 of the dissertation presents a brief history of physics education research and motivates the present study. Chapter 2 describes our experimental methodology and discusses the treatments applied to students and the instruments used to measure their learning. Chapter 3 provides an introduction to the statistical and analytical methods used in our data analysis. Chapter 4 presents the full data set, analyzed using both classical test theory and item response theory. Chapter 5 contains a discussion of the implications of our results and a data-driven analysis of our experimental methods. 
Chapter 6 describes the importance of this work to the field and discusses the relevance of our research to curriculum development and to future work in physics education research.
A nonproprietary, nonsecret program for calculating Stirling cryocoolers
NASA Technical Reports Server (NTRS)
Martini, W. R.
1985-01-01
A design program for an integrated Stirling cycle cryocooler was written on an IBM-PC computer. The program is easy to use and shows the trends and itemizes the losses. The calculated results were compared with some measured performance values. The program predicts somewhat optimistic performance and needs to be calibrated more with experimental measurements. Adding a multiplier to the friction factor can bring the calculated results in line with the limited test results so far available. The program is offered as a good framework on which to build a truly useful design program for all types of cryocoolers.
Developing Validity Evidence for the Written Pediatric History and Physical Exam Evaluation Rubric.
King, Marta A; Phillipi, Carrie A; Buchanan, Paula M; Lewin, Linda O
The written history and physical examination (H&P) is an underutilized source of medical trainee assessment. The authors describe development and validity evidence for the Pediatric History and Physical Exam Evaluation (P-HAPEE) rubric: a novel tool for evaluating written H&Ps. Using an iterative process, the authors drafted, revised, and implemented the 10-item rubric at 3 academic institutions in 2014. Eighteen attending physicians and 5 senior residents each scored 10 third-year medical student H&Ps. Inter-rater reliability (IRR) was determined using intraclass correlation coefficients. Cronbach α was used to report consistency and Spearman rank-order correlations to determine relationships between rubric items. Raters provided a global assessment, recorded time to review and score each H&P, and completed a rubric utility survey. Overall intraclass correlation was 0.85, indicating adequate IRR. Global assessment IRR was 0.89. IRR for low- and high-quality H&Ps was significantly greater than for medium-quality ones but did not differ on the basis of rater category (attending physician vs. senior resident), note format (electronic health record vs. nonelectronic), or student diagnostic accuracy. Cronbach α was 0.93. The highest correlation between an individual item and total score was for assessment (0.84); the highest interitem correlation was between assessment and differential diagnosis (0.78). Mean time to review and score an H&P was 16.3 minutes; residents took significantly longer than attending physicians. All raters described rubric utility as "good" or "very good" and endorsed continued use. The P-HAPEE rubric offers a novel, practical, reliable, and valid method for supervising physicians to assess pediatric written H&Ps. Copyright © 2016 Academic Pediatric Association. Published by Elsevier Inc. All rights reserved.
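For readers unfamiliar with the consistency statistic reported above, the following Python sketch shows the standard Cronbach α formula, α = k/(k-1) × (1 - Σ item variances / variance of total scores). The score matrix and the function name are hypothetical; the abstract does not describe the authors' software or data layout.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of per-item scores (one row per scored note)."""
    k = len(scores[0])  # number of rubric items
    # Sample variance of each item (column) across all scored notes.
    item_vars = [variance(column) for column in zip(*scores)]
    # Sample variance of the total score per note.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: 5 notes rated on 4 rubric items (not study data).
example_scores = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
alpha = cronbach_alpha(example_scores)
```

Values near 1 indicate that the items vary together; α reaches exactly 1 when every item gives the same score within each note.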
Quality of information accompanying on-line marketing of home diagnostic tests.
Datta, Adrija K; Selman, Tara J; Kwok, Tony; Tang, Teresa; Khan, Khalid S
2008-01-01
To assess the quality of information provided to consumers by websites marketing medical home diagnostic tests. A cross-sectional analysis of a database developed from searching targeted websites. Data sources were websites written in English which marketed medical home diagnostic tests. A meta-search engine was used to identify the first 20 citations for each type of home diagnostic medical test. Relevant websites limited to those written in English were reviewed independently and in triplicate, with disputes resolved by two further reviewers. Information on the quality of these sites was extracted using a pre-piloted proforma. 168 websites were suitable for inclusion in the review. The quality of these sites showed marked variation. Only 24 of 168 (14.2%) complied with at least three-quarters of the quality items and just over half (95 of 168, 56.5%) reported official approval or certification of the test. Information on accuracy of the test marketed was reported by 87 of 168 (51.7%) websites, with 15 of 168 (8.9%) providing a scientific reference. Instructions for use of the product were found in 97 of 168 (57.9%). However, the course of action to be taken after obtaining the test result was stated in only 63 of 168 (37.5%) for a positive result and 43 of 168 (25.5%) for a negative result. The quality of information posted on commercial websites marketing home tests online is unsatisfactory and potentially misleading for consumers.
Quality of information accompanying on-line marketing of home diagnostic tests
Datta, Adrija K; Selman, Tara J; Kwok, Tony; Tang, Teresa; Khan, Khalid S
2008-01-01
Objective: To assess the quality of information provided to consumers by websites marketing medical home diagnostic tests. Design: A cross-sectional analysis of a database developed from searching targeted websites. Setting: Data sources were websites written in English which marketed medical home diagnostic tests. Main outcome measures: A meta-search engine was used to identify the first 20 citations for each type of home diagnostic medical test. Relevant websites limited to those written in English were reviewed independently and in triplicate, with disputes resolved by two further reviewers. Information on the quality of these sites was extracted using a pre-piloted proforma. Results: 168 websites were suitable for inclusion in the review. The quality of these sites showed marked variation. Only 24 of 168 (14.2%) complied with at least three-quarters of the quality items and just over half (95 of 168, 56.5%) reported official approval or certification of the test. Information on accuracy of the test marketed was reported by 87 of 168 (51.7%) websites, with 15 of 168 (8.9%) providing a scientific reference. Instructions for use of the product were found in 97 of 168 (57.9%). However, the course of action to be taken after obtaining the test result was stated in only 63 of 168 (37.5%) for a positive result and 43 of 168 (25.5%) for a negative result. Conclusions: The quality of information posted on commercial websites marketing home tests online is unsatisfactory and potentially misleading for consumers. PMID:18263912
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-25
... Change To Amend EDGX Rule 4.3, Record of Written Complaints, To Conform to Financial Industry Regulatory... Commission ("Commission") the proposed rule change as described in Items I and II below, which items have... comments on the proposed rule change from interested persons. [1] 15 U.S.C. 78s(b)(1). [2] 17 CFR 240.19b-4...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-25
... Change To Amend EDGA Rule 4.3, Record of Written Complaints, To Conform to Financial Industry Regulatory... Commission ("Commission") the proposed rule change as described in Items I and II below, which items have... comments on the proposed rule change from interested persons. [1] 15 U.S.C. 78s(b)(1). [2] 17 CFR 240.19b-4...
ERIC Educational Resources Information Center
Zhang, Xijuan; Savalei, Victoria
2016-01-01
Many psychological scales written in the Likert format include reverse worded (RW) items in order to control acquiescence bias. However, studies have shown that RW items often contaminate the factor structure of the scale by creating one or more method factors. The present study examines an alternative scale format, called the Expanded format,…
ERIC Educational Resources Information Center
Yu, Timothy L.M., Comp.
This bibliography lists and describes published and unpublished material relating to mass communications in Hong Kong and Macao, from 1945 to 1973. Most of the items listed are written in Chinese; a limited number are in English. Part one, which deals with Hong Kong, lists 115 items divided into 18 sections: bibliography and reference material;…
Siqueira, Vicente N; Mancuso, Frederico J N; Campos, Orlando; De Paola, Angelo A; Carvalho, Antonio C; Moises, Valdir A
2015-10-01
Training requirements for general cardiologists without echocardiographic expertise to perform focused cardiac ultrasound (FCU) with portable devices have not yet been defined. The objective of this study was to evaluate a training program to instruct cardiology residents to perform FCU with a hand-carried device (HCD) in different clinical settings. Twelve cardiology residents were subjected to a 50-question test, 4 lectures on basic echocardiography and imaging interpretation, the supervised interpretation of 50 echocardiograms and performance of 30 exams using HCD. After this period, they repeated the written test and were administered a practical test comprising 30 exams each (360 patients) in different clinical settings. They reported on 15 parameters and a final diagnosis; their findings were compared to the HCD exam of a specialist in echocardiography. The proportion of correct answers on the theoretical test was higher after training (86%) than before (51%; P = 0.001). The agreement was substantial among the 15 parameters analyzed (kappa ranging from 0.615 to 0.891; P < 0.001). The percentage of correct interpretation was lower for abnormal (75%) than normal (95%) items, for valve abnormalities (85%) compared to other items (92%) and for graded scale (87%) than for dichotomous (95%) items (P < 0.0001, for all). For the final diagnoses, the kappa value was higher than 0.941 (P < 0.001; 95% CI [0.914, 0.955]). The training proposed enabled residents to perform FCU with HCD, and their findings were in good agreement with those of a cardiologist specialized in echocardiography. © 2015, Wiley Periodicals, Inc.
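The agreement statistic reported above can be illustrated with a short Python sketch. It assumes Cohen's kappa for two raters (a resident and the specialist) on a single categorical parameter; the abstract does not state which kappa variant was used, and the ratings and function name below are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(rater_a)
    # Observed proportion of exams on which both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of one parameter on 8 exams (not study data).
resident = ["normal", "abnormal", "normal", "normal",
            "abnormal", "normal", "abnormal", "normal"]
specialist = ["normal", "abnormal", "normal", "abnormal",
              "abnormal", "normal", "abnormal", "normal"]
kappa = cohen_kappa(resident, specialist)
```

Kappa equals 1 for perfect agreement and 0 when agreement is no better than chance, which is why it is preferred over raw percentage agreement when one category (for example "normal") dominates.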
[Understanding of medical information provided during orthognathic surgery consultations].
Poynard, S; Pare, A; Bonin Goga, B; Laure, B; Goga, D
2014-06-01
A prospective study was conducted from November 2012 to May 2013 to assess what patients had understood after their preoperative consultations for orthognathic surgery. We studied the impact of a written document created in the department, containing the information given during the consultation. Fifty patients were asked to complete 2 questionnaires the day before surgery. The first was used to assess what the patients had understood; it included 20 multiple-choice questions on information given during consultation and in the written document. For each item, the patient had to check what he thought to be the right answer. Each correct answer was graded at 1 and each incorrect answer or no answer was graded at 0. The maximum score was 20/20. The second was to assess the written document. Each item was graded from 1 to 10 (Likert-type scale). Thirty-two patients answered both questionnaires. The average score for the first was 15.03/20, significantly higher than the theoretical average set at 10 (P<0.05). The written document was found understandable (score 8.47/10) and information easy to find (score 7.28/10). The document provided answers to the patients' questions (score 7.50/10), using information given during consultation (score 7.56/10). The 2 consultations and the written document helped patients better understand orthognathic care and surgery. Copyright © 2014. Published by Elsevier Masson SAS.
Cuvelier, A; Lamia, B; Molano, L-C; Muir, J-F; Windisch, W
2012-05-01
We performed the French translation and cross-cultural adaptation of the Severe Respiratory Insufficiency (SRI) questionnaire. Written and validated in German, this questionnaire evaluates health-related quality of life in patients treated with domiciliary ventilation for chronic respiratory failure. Four bilingual German-French translators and a linguist were recruited to produce translations and back-translations of the questionnaire, which consists of 49 items in seven domains. Two successive versions were generated and compared to the original questionnaire. The difficulty of the translation and the naturalness were quantified for each item using a 1-10 scale and their equivalence to their original counterpart was graded from A to C. The translated questionnaire was finally tested in a pilot study, which included 15 representative patients. The difficulty of the first translation and the first back-translation was respectively quantified as 2.5 (range 1-5.5) and 1.5 (range 1-6) on the 10-point scale (P=0.0014). The naturalness and the equivalence of 8/49 items were considered insufficient, which led to the production of a second translation and a second back-translation. The meanings of two items needed clarification during the pilot study. The French translation of the SRI questionnaire represents a new instrument for clinical research in patients treated with domiciliary ventilation for chronic respiratory failure. Its validity needs to be tested in a multicenter study. Copyright © 2012 SPLF. Published by Elsevier Masson SAS. All rights reserved.
The effectiveness of integration of virtual patients in a collaborative learning activity.
Marei, Hesham F; Donkers, Jeroen; Van Merrienboer, Jeroen J G
2018-05-07
Virtual patients (VPs) have recently been integrated within different learning activities. To compare the effect of using VPs in a collaborative learning activity with that of using VPs in an independent learning activity on students' knowledge acquisition, retention and transfer. For two different topics, respectively 82 and 76 dental students participated in teaching, learning and assessment sessions with VPs. Students from a female campus and from a male campus were randomly assigned to a condition (collaborative or independent), yielding four experimental groups. Each group received a lecture followed by a learning session using two VPs per topic. Students were administered immediate and delayed written tests as well as transfer tests using two VPs to assess their knowledge in diagnosis and treatment. For the treatment items of the immediate and delayed written tests, females outperformed males in the collaborative VP group but not in the independent VP group. On the female campus, the use of VPs in a collaborative learning activity is more effective than its use as an independent learning activity in enhancing students' knowledge acquisition and retention. However, the collaborative use of VPs by itself is not enough to produce consistent results across different groups of students and attention should be given to all the factors that would affect students' interaction.
Software For Nearly Optimal Packing Of Cargo
NASA Technical Reports Server (NTRS)
Fennel, Theron R.; Daughtrey, Rodney S.; Schwaab, Doug G.
1994-01-01
PACKMAN computer program used to find nearly optimal arrangements of cargo items in storage containers, subject to such multiple packing objectives as utilization of volumes of containers, utilization of containers up to limits on weights, and other considerations. Automatic packing algorithm employed attempts to find best positioning of cargo items in container, such that volume and weight capacity of container both utilized to maximum extent possible. Written in Common LISP.
Braend, Anja Maria; Gran, Sarah Frandsen; Frich, Jan C; Lindbaek, Morten
2010-01-01
Formative assessment of medical students' clinical performance during general practice clerkship is necessary to learn consultation skills. Our aim was to triangulate feedback using patient questionnaires, written self-assessment and teachers' observation-based assessment, and to describe the content of this feedback. We developed StudentPEP, a 15-item version of EUROPEP, a tool for measuring patients' evaluation of quality in general practice. The teacher and student forms consisted of five StudentPEP-items and open-ended questions asking for approval and improvement needed on four aspects. Quantitative scores were analyzed statistically. Free-text comments were analyzed and categorized into 'specific and concrete' versus 'general and unspecific'. One hundred seventy-three students returned data from 2643 consultations. Mean patients' scores for 15 items were 4.3-4.8 on a five-point Likert scale. Mean teacher scores were 4.4 on five items, while students' mean self-assessments were 3.6-3.8. In an analysis of 380 consultations, students were more specific and concrete in their self-evaluation compared with teachers (p < 0.01). Patients scored students' performance high compared with students' self-assessments. Teachers' scores were in accordance with patients' scores. Teachers' written evaluations of students were often general. There is a potential for improving teachers' feedback in terms of more specific and concrete comments.
Johnson, Tim V; Abbasi, Ammara; Kleris, Renee S; Ehrlich, Samantha S; Barthwaite, Echo; DeLong, Jennifer; Master, Viraj A
2013-08-01
Determining a patient's health literacy is important to optimum patient care. Single-item questions exist for screening written health literacy. We sought to assess the predictive potential of three common screening questions, along with patient age and education level, in the prediction of low health numerical literacy (numeracy). After demographic and educational information was obtained, 441 patients were administered three health literacy screening questions. The three-item Schwartz-Woloshin Numeracy Scale was then administered to assess for low health numeracy (score of 0 out of 3). This score served as the reference standard for Receiver Operating Characteristics (ROC) curve analysis. ROC curves were constructed and used to determine the area under the curve (AUC); a higher AUC indicates better discrimination between patients with and without low numeracy. None of the three screening questions were significant predictors of low health numeracy. However, education level was a significant predictor of low health numeracy, with an AUC (95% CI) of 0.811 (0.720-0.902). This measure had a specificity of 95.3% at the cutoff of 12 years of education (<12 versus ≥12 years of education) but was not sensitive. Common single-item questions used to screen for written health literacy are ineffective screening tools for health numeracy. However, low education level is a specific predictor of low health numeracy.
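The ROC quantities in the abstract above can be made concrete with a small Python sketch. It computes the AUC as the probability that a randomly chosen low-numeracy patient has fewer years of education than a randomly chosen patient without low numeracy (ties counted as one half), and then sensitivity and specificity at a cutoff of 12 years. All patient values and function names are hypothetical, and this pairwise formulation is one standard way to obtain an AUC, not necessarily the authors' procedure.

```python
def auc(scores_pos, scores_neg):
    """AUC as P(positive scores higher than negative); ties count one half."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(years_pos, years_neg, cutoff):
    """Flag a patient when years of education fall below the cutoff."""
    sensitivity = sum(y < cutoff for y in years_pos) / len(years_pos)
    specificity = sum(y >= cutoff for y in years_neg) / len(years_neg)
    return sensitivity, specificity


# Hypothetical years of education (not study data).
years_low_numeracy = [8, 10, 12, 11]      # reference standard: low numeracy
years_adequate = [12, 14, 16, 12, 10]     # reference standard: not low

# Fewer years should mean higher risk, so the score is negated years.
area = auc([-y for y in years_low_numeracy], [-y for y in years_adequate])
sens, spec = sensitivity_specificity(years_low_numeracy, years_adequate, cutoff=12)
```

The sketch also shows why a predictor can be specific without being sensitive: the two proportions are computed on different groups of patients, so a cutoff that rarely flags patients without low numeracy may still miss many who have it.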
7 CFR 550.45 - Standards of conduct.
Code of Federal Regulations, 2010 CFR
2010-01-01
... Agreements Procurement Standards § 550.45 Standards of conduct. The Cooperator shall maintain written... situations in which the financial interest is not substantial or the gift is an unsolicited item of nominal...
Raphaelis, Silvia; Mayer, Hanna; Ott, Stefan; Mueller, Michael D; Steiner, Enikö; Joura, Elmar; Senn, Beate
2017-07-01
To determine whether written information and/or counseling based on the WOMAN-PRO II Program decreases symptom prevalence in women with vulvar neoplasia by a clinically relevant degree, and to explore the differences between the 2 interventions in symptom prevalence, symptom distress prevalence, and symptom experience. A multicenter randomized controlled parallel-group phase II trial with 2 interventions provided to patients after the initial diagnosis was performed in Austria and Switzerland. Women randomized to written information received a predefined set of leaflets concerning wound care and available healthcare services. Women allocated to counseling were additionally provided with 5 consultations by an Advanced Practice Nurse (APN) between the initial diagnosis and 6 months post-surgery that focused on symptom management, utilization of healthcare services, and health-related decision-making. Symptom outcomes were measured 5 times, at the same time points as the counseling sessions. A total of 49 women with vulvar neoplasia participated in the study. Symptom prevalence decreased in women with counseling by a clinically relevant degree, but not in women with written information. Sporadically, significant differences between the 2 interventions could be observed in individual items, but not in the total scales or subscales of the symptom outcomes. The results indicate that counseling may reduce symptom prevalence in women with vulvar neoplasia by a clinically relevant extent. The observed group differences between the 2 interventions slightly favor counseling over written information. The results justify testing the benefit of counseling thoroughly in a comparative phase III trial. Copyright © 2017 Elsevier Inc. All rights reserved.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-03
...The U.S. Department of the Interior, Bureau of Land Management (BLM), Alaska State Office, in consultation with the appropriate Indian tribes or Native Hawaiian organizations, has determined that the items listed in this notice meet the definition of unassociated funerary objects. Lineal descendants or representatives of any Indian tribe or Native Hawaiian organization not identified in this notice that wish to claim these items should submit a written request to the BLM Alaska State Office. If no additional claimants come forward, transfer of control of the items to the lineal descendants, Indian tribes, or Native Hawaiian organizations stated in this notice may proceed.
34 CFR 74.42 - Codes of conduct.
Code of Federal Regulations, 2010 CFR
2010-07-01
... Procurement Standards § 74.42 Codes of conduct. The recipient shall maintain written standards of conduct... interest is not substantial or the gift is an unsolicited item of nominal value. The standards of conduct...
22 CFR 518.42 - Codes of conduct.
Code of Federal Regulations, 2010 CFR
2010-04-01
... Procurement Standards § 518.42 Codes of conduct. The recipient shall maintain written standards of conduct... financial interest is not substantial or the gift is an unsolicited item of nominal value. The standards of...
49 CFR 19.42 - Codes of conduct.
Code of Federal Regulations, 2010 CFR
2010-10-01
... Requirements Procurement Standards § 19.42 Codes of conduct. The recipient shall maintain written standards of... situations in which the financial interest is not substantial or the gift is an unsolicited item of nominal...
14 CFR 1274.503 - Codes of conduct.
Code of Federal Regulations, 2010 CFR
2010-01-01
... FIRMS Procurement Standards § 1274.503 Codes of conduct. The recipient shall maintain written standards... situations in which the financial interest is not substantial or the gift is an unsolicited item of nominal...
Two-year follow-up of the Collision Auto Repair Safety Study (CARSS).
Bejan, Anca; Parker, David L; Brosseau, Lisa M; Xi, Min; Skan, Maryellen
2015-06-01
This paper presents an evaluation of the sustainability of health and safety improvements in small auto collision shops 1 year after the implementation of a year-long targeted intervention. During the first year (active phase), owners received quarterly phone calls, written reminders, safety newsletters, and access to online services and in-person assistance with creating safety programs and respirator fit testing. During the second year (passive phase), owners received up to three postcard reminders regarding the availability of free health and safety resources. Forty-five shops received an evaluation at baseline and at the end of the first year (Y1). Of these, 33 were evaluated at the end of the second year (Y2), using the same 92-item assessment tool. At Y1, investigators found that between 70 and 81% of the evaluated items were adequate in each business (mean = 73% items, SD = 11%). At Y2, between 63 and 89% of items were deemed adequate (mean = 73% items, SD = 9.5%). Three safety areas demonstrated statistically significant (P < 0.05) changes: compressed gasses (8% improvement), personal protective equipment (7% improvement), and respiratory protection (6% decline). The number of postcard reminders sent to each business did not affect the degree to which shops maintained safety improvements made during the first year of the intervention. However, businesses that received more postcards were more likely to request assistance services than those receiving fewer. © The Author 2014. Published by Oxford University Press on behalf of the British Occupational Hygiene Society.
The CARE guidelines: consensus-based clinical case reporting guideline development
Gagnier, Joel J; Kienle, Gunver; Altman, Douglas G; Moher, David; Sox, Harold; Riley, David
2013-01-01
A case report is a narrative that describes, for medical, scientific or educational purposes, a medical problem experienced by one or more patients. Case reports written without guidance from reporting standards are insufficiently rigorous to guide clinical practice or to inform clinical study design. Develop, disseminate and implement systematic reporting guidelines for case reports. We used a three-phase consensus process consisting of (1) premeeting literature review and interviews to generate items for the reporting guidelines, (2) a face-to-face consensus meeting to draft the reporting guidelines and (3) postmeeting feedback, review and pilot testing, followed by finalisation of the case report guidelines. This consensus process involved 27 participants and resulted in a 13-item checklist—a reporting guideline for case reports. The primary items of the checklist are title, key words, abstract, introduction, patient information, clinical findings, timeline, diagnostic assessment, therapeutic interventions, follow-up and outcomes, discussion, patient perspective and informed consent. We believe the implementation of the CARE (CAse REport) guidelines by medical journals will improve the completeness and transparency of published case reports and that the systematic aggregation of information from case reports will inform clinical study design, provide early signals of effectiveness and harms, and improve healthcare delivery. PMID:24155002
Performance-based readability testing of participant information for a Phase 3 IVF trial
Knapp, Peter; Raynor, DK; Silcock, Jonathan; Parkinson, Brian
2009-01-01
Background Studies suggest that the process of patient consent to clinical trials is sub-optimal. Participant information sheets are important but can be technical and lengthy documents. Performance-based readability testing is an established means of assessing patient information, and this study aimed to test its application to participant information for a Phase 3 trial. Methods An independent groups design was used to study the User Testing performance of the participant information sheet from the Phase 3 'Poor Responders' trial of In Vitro Fertilisation (IVF). 20 members of the public were asked to read it, then find and demonstrate understanding of 21 key aspects of the trial. The participant information sheet was then re-written and re-designed, and tested on 20 members of the public, using the same 21-item questionnaire. Results The original participant information sheet performed well in some places. Participants could not find some answers and some of the found information was not understood. In total there were 30 instances of information being not found or not understood. Answers to three questions were found but not understood by many of the participants; these related to aspects of the drug timing, Follicle Stimulating Hormone and compensation. Only two of the 20 participants could find and show understanding of all question items when using the original sheet. The revised sheet performed generally better, with 17 instances of information being not found or not understood, although the number of 'not found' items increased. Half of the 20 participants could find and show understanding of all question items when using the revised sheet. When asked to compare the versions of the sheet, almost all participants preferred the revised version. Conclusion The original participant information sheet may not have enabled patients fully to give valid consent. Participants seeing the revised sheet were better able to understand the trial.
Those who write information for trial participants should take account of good practice in information design. Performance-based User Testing may be a useful method to indicate strengths and weaknesses in trial information. PMID:19723335
Technical analysis of the Slosson Written Expression Test.
Erford, Bradley T; Hofler, Donald B
2004-06-01
The Slosson Written Expression Test was designed to assess students ages 8-17 years at risk for difficulties in written expression. Scores from three independent samples were used to evaluate the test's reliability and validity for measuring students' written expression. Test-retest reliability of the SWET subscales ranged from .80 to .94 (n = 151), and .95 for the Written Expression Total Standard Scores. The median alternate-form reliability for students' Written Expression Total Standard Scores was .81 across the three forms. Scores on the Slosson test yielded concurrent validity coefficients (n = 143) of .60 with scores from the Woodcock-Johnson: Tests of Achievement-Third Edition Broad Written Language Domain and .49 with scores on the Test of Written Language-Third Edition Spontaneous Writing Quotient. Exploratory factor analytic procedures suggested the Slosson test is comprised of two dimensions, Writing Mechanics and Writing Maturity (47.1% and 20.1% variance accounted for, respectively). In general, the Slosson Written Expression Test presents with sufficient technical characteristics to be considered a useful written expression screening test.
48 CFR 49.109-7 - Settlement by determination.
Code of Federal Regulations, 2011 CFR
2011-10-01
... certified mail (return receipt requested) to submit written evidence, so as to reach the TCO on or before a... additional information, schedules, and analyses as appropriate. The TCO shall explain each major item of...
Improving the Factor Structure of Psychological Scales
Zhang, Xijuan; Savalei, Victoria
2015-01-01
Many psychological scales written in the Likert format include reverse worded (RW) items in order to control acquiescence bias. However, studies have shown that RW items often contaminate the factor structure of the scale by creating one or more method factors. The present study examines an alternative scale format, called the Expanded format, which replaces each response option in the Likert scale with a full sentence. We hypothesized that this format would result in a cleaner factor structure as compared with the Likert format. We tested this hypothesis on three popular psychological scales: the Rosenberg Self-Esteem scale, the Conscientiousness subscale of the Big Five Inventory, and the Beck Depression Inventory II. Scales in both formats showed comparable reliabilities. However, scales in the Expanded format had better (i.e., lower and more theoretically defensible) dimensionalities than scales in the Likert format, as assessed by both exploratory factor analyses and confirmatory factor analyses. We encourage further study and wider use of the Expanded format, particularly when a scale’s dimensionality is of theoretical interest. PMID:27182074
The development of a clinical outcomes survey research application: Assessment Center.
Gershon, Richard; Rothrock, Nan E; Hanrahan, Rachel T; Jansky, Liz J; Harniss, Mark; Riley, William
2010-06-01
The National Institutes of Health sponsored Patient-Reported Outcome Measurement Information System (PROMIS) aimed to create item banks and computerized adaptive tests (CATs) across multiple domains for individuals with a range of chronic diseases. Web-based software was created to enable a researcher to create study-specific Websites that could administer PROMIS CATs and other instruments to research participants or clinical samples. This paper outlines the process used to develop a user-friendly, free, Web-based resource (Assessment Center) for storage, retrieval, organization, sharing, and administration of patient-reported outcomes (PRO) instruments. Joint Application Design (JAD) sessions were conducted with representatives from numerous institutions in order to supply a general wish list of features. Use Cases were then written to ensure that end user expectations matched programmer specifications. Program development included daily programmer "scrum" sessions, weekly Usability Acceptability Testing (UAT) and continuous Quality Assurance (QA) activities pre- and post-release. Assessment Center includes features that promote instrument development including item histories, data management, and storage of statistical analysis results. This case study of software development highlights the collection and incorporation of user input throughout the development process. Potential future applications of Assessment Center in clinical research are discussed.
Mathysen, Danny G P; Aclimandos, Wagih; Roelant, Ella; Wouters, Kristien; Creuzot-Garcher, Catherine; Ringens, Peter J; Hawlina, Marko; Tassignon, Marie-José
2013-11-01
To investigate whether introduction of item-response theory (IRT) analysis, in parallel to the 'traditional' statistical analysis methods available for performance evaluation of multiple T/F items as used in the European Board of Ophthalmology Diploma (EBOD) examination, has proved beneficial, and secondly, to study whether the overall assessment performance of the current written part of EBOD is sufficiently high (KR-20 ≥ 0.90) to be kept as examination format in future EBOD editions. 'Traditional' analysis methods for individual MCQ item performance comprise P-statistics, Rit-statistics and item discrimination, while overall reliability is evaluated through KR-20 for multiple T/F items. The additional set of statistical analysis methods for the evaluation of EBOD comprises mainly IRT analysis. These analysis techniques are used to monitor whether the introduction of negative marking for incorrect answers (since EBOD 2010) has a positive influence on the statistical performance of EBOD as a whole and its individual test items in particular. Item-response theory analysis demonstrated that item performance parameters should not be evaluated individually, but should be related to one another. Before the introduction of negative marking, the overall EBOD reliability (KR-20) was good though with room for improvement (EBOD 2008: 0.81; EBOD 2009: 0.78). After the introduction of negative marking, the overall reliability of EBOD improved significantly (EBOD 2010: 0.92; EBOD 2011: 0.91; EBOD 2012: 0.91). Although many statistical performance parameters are available to evaluate individual items, our study demonstrates that the overall reliability assessment remains the only crucial parameter to be evaluated allowing comparison. While individual item performance analysis is worthwhile to undertake as secondary analysis, drawing final conclusions seems to be more difficult. Performance parameters need to be related, as shown by IRT analysis.
Therefore, IRT analysis has proved beneficial for the statistical analysis of EBOD. Introduction of negative marking has led to a significant increase in the reliability (KR-20 > 0.90), indicating that the current examination format can be kept for future EBOD examinations. © 2013 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.
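As an illustration of the KR-20 reliability coefficient cited in the abstract above, the following is a minimal sketch of the textbook formula, KR-20 = k/(k-1) * (1 - sum(p_i * q_i) / variance of total scores). It is not the EBOD authors' own code. It assumes strictly dichotomous 0/1 item scores (so it does not model negative marking) and uses the population variance of total scores, which is one common convention; the function name kr20 and the data layout are hypothetical.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for a matrix of 0/1 item scores.

    responses: list of rows, one row per examinee, one 0/1 entry per item.
    Assumption: population variance is used for the total scores.
    """
    n = len(responses)       # number of examinees
    k = len(responses[0])    # number of items
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    # Sum of p*q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / var_total)
```

A value at or above 0.90 would correspond to the threshold the abstract uses for keeping the examination format.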
Workplace safety and health programs, practices, and conditions in auto collision repair businesses.
Brosseau, L M; Bejan, A; Parker, D L; Skan, M; Xi, M
2014-01-01
This article describes the results of a pre-intervention safety assessment conducted in 49 auto collision repair businesses and owners' commitments to specific improvements. A 92-item standardized audit tool employed interviews, record reviews, and observations to assess safety and health programs, training, and workplace conditions. Owners were asked to improve at least one-third of incorrect, deficient, or missing (not in compliance with regulations or not meeting best practice) items, of which a majority were critical or highly important for ensuring workplace safety. Two-thirds of all items were present, with the highest fraction related to electrical safety, machine safety, and lockout/tagout. One-half of shops did not have written safety programs and had not conducted recent training. Many had deficiencies in respiratory protection programs and practices. Thirteen businesses with a current or past relationship with a safety consultant had a significantly higher fraction of correct items, in particular related to safety programs, up-to-date training, paint booth and mixing room conditions, electrical safety, and respiratory protection. Owners selected an average of 58% of recommended improvements; they were most likely to select items related to employee Right-to-Know training, emergency exits, fire extinguishers, and respiratory protection. They were least likely to say they would improve written safety programs, stop routine spraying outside the booth, or provide adequate fire protection for spray areas outside the booth. These baseline results suggest that it may be possible to bring about workplace improvements using targeted assistance from occupational health and safety professionals.
48 CFR 2912.302 - Tailoring of provisions and clauses for the acquisition of commercial items.
Code of Federal Regulations, 2010 CFR
2010-10-01
... tailor terms inconsistent with customary commercial practice must be documented in a written justification by the contracting officer, and may be approved by the HCA on an individual or class basis. ...
The influence of health literacy on comprehension of a colonoscopy preparation information leaflet
Smith, Samuel G.; von Wagner, Christian; McGregor, Lesley M.; Curtis, Laura M.; Wilson, Elizabeth A. H.; Serper, Marina; Wolf, Michael S.
2012-01-01
BACKGROUND Successful bowel preparation is important for safe, efficacious, cost-effective colonoscopy procedures; however, poor preparation is common. OBJECTIVE We sought to determine if there was an association between health literacy and comprehension of typical written instructions on how to prepare for a colonoscopy to enable more targeted interventions in this area. DESIGN Cross-sectional observational study SETTING Primary care clinics and federally qualified health centres in Chicago, Illinois. PATIENTS 764 participants (mean age: 63 years; Standard Deviation: 5.42) were recruited. The sample was from a mixed socio-demographic background and 71.9% of the participants were classified as having adequate health literacy scores. INTERVENTION 764 participants were presented with an information leaflet outlining the bowel preparatory instructions for colonoscopy. MAIN OUTCOME MEASURES Five questions assessing comprehension of the instructions in an ‘open book’ test. RESULTS Comprehension scores on the bowel preparation items were low. The mean number of items correctly answered was 3.2 (Standard Deviation, 1.2) out of a possible 5. Comprehension scores overall and for each individual item differed significantly by health literacy level (all p<0.001). After controlling for gender, age, race, socio-economic status and previous colonoscopy experience in a multivariable model, health literacy was a significant predictor of comprehension (inadequate vs. adequate: β = −0.2; p < 0.001; marginal vs. adequate: β = −0.2; p < 0.001). LIMITATIONS The outcome represents a simulated task and not actual comprehension of preparation instructions for participants’ own recommended behavior. CONCLUSIONS Comprehension of a written colonoscopy preparation leaflet was generally low and significantly more so among people with low health literacy. Poor comprehension has implications for the safety and economic impact of gastroenterological procedures such as colonoscopy.
Therefore future interventions should aim to improve comprehension of complex medical information by reducing literacy-related barriers. PMID:22965407
Crane, Paul K; Gruhl, Jonathan C; Erosheva, Elena A; Gibbons, Laura E; McCurry, Susan M; Rhoads, Kristoffer; Nguyen, Viet; Arani, Keerthi; Masaki, Kamal; White, Lon
2010-11-01
Spoken bilingualism may be associated with cognitive reserve. Mastering a complicated written language may be associated with additional reserve. We sought to determine if midlife use of spoken and written Japanese was associated with lower rates of late life cognitive decline. Participants were second-generation Japanese-American men from the Hawaiian island of Oahu, born 1900-1919, free of dementia in 1991, and categorized based on midlife self-reported use of spoken and written Japanese (total n included in primary analysis = 2,520). Cognitive functioning was measured with the Cognitive Abilities Screening Instrument scored using item response theory. We used mixed effects models, controlling for age, income, education, smoking status, apolipoprotein E e4 alleles, and number of study visits. Rates of cognitive decline were not related to use of spoken or written Japanese. This finding was consistent across numerous sensitivity analyses. We did not find evidence to support the hypothesis that multilingualism is associated with cognitive reserve.
Gruhl, Jonathan C.; Erosheva, Elena A.; Gibbons, Laura E.; McCurry, Susan M.; Rhoads, Kristoffer; Nguyen, Viet; Arani, Keerthi; Masaki, Kamal; White, Lon
2010-01-01
Objectives. Spoken bilingualism may be associated with cognitive reserve. Mastering a complicated written language may be associated with additional reserve. We sought to determine if midlife use of spoken and written Japanese was associated with lower rates of late life cognitive decline. Methods. Participants were second-generation Japanese-American men from the Hawaiian island of Oahu, born 1900–1919, free of dementia in 1991, and categorized based on midlife self-reported use of spoken and written Japanese (total n included in primary analysis = 2,520). Cognitive functioning was measured with the Cognitive Abilities Screening Instrument scored using item response theory. We used mixed effects models, controlling for age, income, education, smoking status, apolipoprotein E e4 alleles, and number of study visits. Results. Rates of cognitive decline were not related to use of spoken or written Japanese. This finding was consistent across numerous sensitivity analyses. Discussion. We did not find evidence to support the hypothesis that multilingualism is associated with cognitive reserve. PMID:20639282
Development of Attitudes Toward Homosexuality Scale for Indians (AHSI).
Ahuja, Kanika K
2017-01-01
Attitudes toward homosexuality vary across cultures, with the legal and societal position being rather complicated in India. This study describes the process of developing and validating a Likert-type scale to assess attitudes toward homosexuality among heterosexuals. Phase 1 describes the development of the scale. Items were written based on thematic analysis of narratives generated from 50 college students and reviewing existing scales. After administering the 70-item scale to 68 participants, item analysis yielded 20 statements with item-total correlations over .70. Cronbach's alpha was .97. In Phase 2, the 20-item Attitudes Toward Homosexuality Scale for Indians (AHSI) was administered to 142 participants. Analysis yielded a corrected split-half correlation of .91. Further, AHSI discriminated between women and men; between liberal arts and STEM/business students; and those who reported interpersonal contact with gay men and lesbian women and those who did not. The scale has satisfactory reliability and shows promising construct validity.
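The reliability statistics named in the abstract above (item-total correlation, Cronbach's alpha, corrected split-half correlation) can be sketched as follows. This is a minimal illustration of the standard definitions, not the author's analysis: the odd/even split, the use of population variance, and all function names are assumptions made here for the example.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(scores):
    """Cronbach's alpha; scores is a list of rows (one per respondent)."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def corrected_item_total(scores, i):
    """Correlation of item i with the total of all other items."""
    item = [row[i] for row in scores]
    rest = [sum(row) - row[i] for row in scores]
    return pearson(item, rest)


def split_half_spearman_brown(scores):
    """Odd/even split-half correlation, stepped up with Spearman-Brown."""
    half_a = [sum(row[0::2]) for row in scores]
    half_b = [sum(row[1::2]) for row in scores]
    r = pearson(half_a, half_b)
    return 2 * r / (1 + r)
```

In an item analysis of the kind described, items whose corrected item-total correlation falls below a chosen cutoff would be dropped before alpha is recomputed on the shortened scale.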
Federal Register 2010, 2011, 2012, 2013, 2014
2012-03-09
... terrorism to the list of items beyond the control of Exchange Related Persons and/or Entities for which... filed with the Commission, and all written communications relating to the proposed rule change between...
Kirschstein, Timo; Wolters, Alexander; Lenz, Jan-Hendrik; Fröhlich, Susanne; Hakenberg, Oliver; Kundt, Günther; Darmüntzel, Martin; Hecker, Michael; Altiner, Attila; Müller-Hilke, Brigitte
2016-01-01
The amendment of the Medical Licensing Act (ÄAppO) in Germany in 2002 led to the introduction of graded assessments in the clinical part of medical studies. This, in turn, lent new weight to the importance of written tests, even though the minimum requirements for exam quality are sometimes difficult to reach. Introducing exam quality as a criterion for the award of performance-based allocation of funds is expected to steer the attention of faculty members towards more quality and perpetuate higher standards. However, at present there is a lack of suitable algorithms for calculating exam quality. In the spring of 2014, the students' dean commissioned the "core group" for curricular improvement at the University Medical Center in Rostock to revise the criteria for the allocation of performance-based funds for teaching. In a first approach, we developed an algorithm that was based on the results of the most common type of exam in medical education, multiple choice tests. It included item difficulty and discrimination, reliability as well as the distribution of grades achieved. This algorithm quantitatively describes exam quality of multiple choice exams. However, it can also be applied to exams involving short essay questions and the OSCE. It thus allows for the quantitation of exam quality in the various subjects and - in analogy to impact factors and third party grants - a ranking among faculty. Our algorithm can be applied to all test formats in which item difficulty, the discriminatory power of the individual items, reliability of the exam and the distribution of grades are measured. Even though the content validity of an exam is not considered here, we believe that our algorithm is suitable as a general basis for performance-based allocation of funds.
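The abstract above does not state how its algorithm combines the ingredients, so the following is only a sketch of two of the inputs it names, item difficulty and item discrimination, using conventional classical-test-theory definitions. It is not the Rostock algorithm. The 27% upper/lower group fraction is a common convention assumed here, and the function names are hypothetical.

```python
def item_difficulty(responses, i):
    """Proportion of examinees answering item i correctly (0/1 scoring)."""
    return sum(row[i] for row in responses) / len(responses)


def discrimination_index(responses, i, fraction=0.27):
    """Upper-lower discrimination index for item i.

    Examinees are ranked by total score; the index is the difference in
    proportion correct between the top and bottom groups, each containing
    the given fraction of examinees (at least one).
    """
    ranked = sorted(responses, key=sum, reverse=True)
    g = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:g], ranked[-g:]
    p_upper = sum(row[i] for row in upper) / g
    p_lower = sum(row[i] for row in lower) / g
    return p_upper - p_lower
```

An index near zero or below would flag an item that does not separate stronger from weaker examinees, which is the kind of signal an exam-quality score could aggregate.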
Carver, Rebecca Bruu; Castéra, Jérémy; Gericke, Niklas; Evangelista, Neima Alice Menezes
2017-01-01
In this paper we present the development and validation of a comprehensive questionnaire to assess college students’ knowledge about modern genetics and genomics, their belief in genetic determinism, and their attitudes towards applications of modern genetics and genomic-based technologies. Written in everyday language with minimal jargon, the Public Understanding and Attitudes towards Genetics and Genomics (PUGGS) questionnaire is intended for use in research on science education and public understanding of science, as a means to investigate relationships between knowledge, determinism and attitudes about modern genetics, which are to date little understood. We developed a set of core ideas and initial items from reviewing the scientific literature on genetics and previous studies on public and student knowledge and attitudes about genetics. Seventeen international experts from different fields (e.g., genetics, education, philosophy of science) reviewed the initial items and their feedback was used to revise the questionnaire. We validated the questionnaire in two pilot tests with samples of university freshmen students. The final questionnaire contains 45 items, including both multiple choice and Likert scale response formats. Cronbach alpha showed good reliability for each section of the questionnaire. In conclusion, the PUGGS questionnaire is a reliable tool for investigating public understanding and attitudes towards modern genetics and genomic-based technologies. PMID:28114357
Carver, Rebecca Bruu; Castéra, Jérémy; Gericke, Niklas; Evangelista, Neima Alice Menezes; El-Hani, Charbel N
2017-01-01
In this paper we present the development and validation of a comprehensive questionnaire to assess college students' knowledge about modern genetics and genomics, their belief in genetic determinism, and their attitudes towards applications of modern genetics and genomic-based technologies. Written in everyday language with minimal jargon, the Public Understanding and Attitudes towards Genetics and Genomics (PUGGS) questionnaire is intended for use in research on science education and public understanding of science, as a means to investigate relationships between knowledge, determinism and attitudes about modern genetics, which are to date little understood. We developed a set of core ideas and initial items from reviewing the scientific literature on genetics and previous studies on public and student knowledge and attitudes about genetics. Seventeen international experts from different fields (e.g., genetics, education, philosophy of science) reviewed the initial items and their feedback was used to revise the questionnaire. We validated the questionnaire in two pilot tests with samples of university freshmen students. The final questionnaire contains 45 items, including both multiple choice and Likert scale response formats. Cronbach alpha showed good reliability for each section of the questionnaire. In conclusion, the PUGGS questionnaire is a reliable tool for investigating public understanding and attitudes towards modern genetics and genomic-based technologies.
Interactive learning research: application of cognitive load theory to nursing education.
Hessler, Karen L; Henderson, Ann M
2013-06-25
The purpose of this research was to investigate the effectiveness of interactive self-paced computerized case study compared to traditional hand-written paper case study on the outcomes of student knowledge, attitude, and retention of the content delivered. Cognitive load theory (CLT) provided the theoretical framework for the study. A quasi-experimental pre-test post-test design with random group assignment was used to measure by self-report survey student cognitive load and interactivity level of the intervention. Student scores on quizzes in semester 1 and post-test follow-up quizzes in semester 3 were assessed for the intervention's effects on knowledge retention. While no significant statistical differences were found between groups, the students in the interactive case study group rated their case study as more fun and interactive. These students also scored consistently higher on the post-test quiz items in their third semester, showing the viability of using CLT to improve student retention of nursing curricula information.
17 CFR 229.407 - (Item 407) Corporate governance.
Code of Federal Regulations, 2014 CFR
2014-04-01
... the role of a nominating committee, including the entire board of directors. (3) Describe any material... audit committee has received the written disclosures and the letter from the independent accountant... independent accountant's communications with the audit committee concerning independence, and has discussed...
17 CFR 229.407 - (Item 407) Corporate governance.
Code of Federal Regulations, 2012 CFR
2012-04-01
... the role of a nominating committee, including the entire board of directors. (3) Describe any material... audit committee has received the written disclosures and the letter from the independent accountant... independent accountant's communications with the audit committee concerning independence, and has discussed...
17 CFR 229.407 - (Item 407) Corporate governance.
Code of Federal Regulations, 2013 CFR
2013-04-01
... the role of a nominating committee, including the entire board of directors. (3) Describe any material... audit committee has received the written disclosures and the letter from the independent accountant... independent accountant's communications with the audit committee concerning independence, and has discussed...
7 CFR 4284.638 - Application processing.
Code of Federal Regulations, 2010 CFR
2010-01-01
... Agriculture Regulations of the Department of Agriculture (Continued) RURAL BUSINESS-COOPERATIVE SERVICE AND RURAL UTILITIES SERVICE, DEPARTMENT OF AGRICULTURE GRANTS Rural Business Opportunity Grants § 4284.638...; (iii) A written narrative which includes, at a minimum, the following items: (A) An explanation of why...
ERIC Educational Resources Information Center
Bernstein, Michael I.
1982-01-01
Steps a school board can take to minimize the risk of age discrimination suits include reviewing all written policies, forms, files, and collective bargaining agreements for age discriminatory items; preparing a detailed statistical analysis of the age of personnel; and reviewing reduction-in-force procedures. (Author/MLF)
Using the Clear Communication Index to Improve Materials for a Behavioral Intervention.
Porter, Kathleen J; Alexander, Ramine; Perzynski, Katelynn M; Kruzliakova, Natalie; Zoellner, Jamie M
2018-02-08
Ensuring that written materials used in behavioral interventions are clear is important to support behavior change. This study used the Clear Communication Index (CCI) to assess the original and revised versions of three types of written participant materials from the SIPsmartER intervention. Materials were revised based on original scoring. Scores for the entire index were significantly higher among revised versions than originals (57% versus 41%, p < 0.001); however, few revised materials (n = 2 of 53) achieved the benchmark of ≥90%. Handouts scored higher than worksheets and slide sets for both versions. The proportion of materials scored as having "a single main message" significantly increased between versions for worksheets (7% to 57%, p = 0.003) and slide sets (33% to 67%, p = 0.004). Across individual items, most significant improvements were in Core, with four items related to the material having a single main message. Findings demonstrate that SIPsmartER's revised materials improved after CCI-informed edits. They advance the evidence and application of the CCI, suggesting it can be effectively used to support improvement in clarity of different types of written materials used in behavioral interventions. Implications for practical considerations of using the tool and suggestions for modifications for specific types of materials are presented.
Project W-314 specific test and evaluation plan for transfer line SN-633 (241-AX-B to 241-AY-02A)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hays, W.H.
1998-03-20
The purpose of this Specific Test and Evaluation Plan (STEP) is to provide a detailed written plan for the systematic testing of modifications made by the addition of the SN-633 transfer line by the W-314 Project. The STEP develops the outline for test procedures that verify the system's performance to the established Project design criteria. The STEP is a lower tier document based on the W-314 Test and Evaluation Plan (TEP). This STEP encompasses all testing activities required to demonstrate compliance to the project design criteria as it relates to the addition of transfer line SN-633. The Project Design Specifications (PDS) identify the specific testing activities required for the Project. Testing includes Validations and Verifications (e.g., Commercial Grade Item Dedication activities), Factory Acceptance Tests (FATs), installation tests and inspections, Construction Acceptance Tests (CATs), Acceptance Test Procedures (ATPs), Pre-Operational Test Procedures (POTPs), and Operational Test Procedures (OTPs). It should be noted that POTPs are not required for testing of the transfer line addition. The STEP will be utilized in conjunction with the TEP for verification and validation.
ERIC Educational Resources Information Center
Goldenberg, Sheila
Research has suggested that the incidence of loneliness peaks at adolescence and decreases with age. Changes in the determinants of loneliness during adolescence were investigated for grade 8, grade 11, and university students. Subjects (N=410) completed a written questionnaire which included ten items from the UCLA Loneliness Scale, the…
The development of a clinical outcomes survey research application: Assessment CenterSM
Rothrock, Nan E.; Hanrahan, Rachel T.; Jansky, Liz J.; Harniss, Mark; Riley, William
2013-01-01
Introduction The National Institutes of Health sponsored Patient-Reported Outcome Measurement Information System (PROMIS) aimed to create item banks and computerized adaptive tests (CATs) across multiple domains for individuals with a range of chronic diseases. Purpose Web-based software was created to enable a researcher to create study-specific Websites that could administer PROMIS CATs and other instruments to research participants or clinical samples. This paper outlines the process used to develop a user-friendly, free, Web-based resource (Assessment CenterSM) for storage, retrieval, organization, sharing, and administration of patient-reported outcomes (PRO) instruments. Methods Joint Application Design (JAD) sessions were conducted with representatives from numerous institutions in order to supply a general wish list of features. Use Cases were then written to ensure that end user expectations matched programmer specifications. Program development included daily programmer “scrum” sessions, weekly Usability Acceptability Testing (UAT) and continuous Quality Assurance (QA) activities pre- and post-release. Results Assessment Center includes features that promote instrument development including item histories, data management, and storage of statistical analysis results. Conclusions This case study of software development highlights the collection and incorporation of user input throughout the development process. Potential future applications of Assessment Center in clinical research are discussed. PMID:20306332
Nielsen, Kathleen; Henderson, Sheila; Barnett, Anna L; Abbott, Robert D; Berninger, Virginia
2018-01-01
Movement, which draws on motor skills and executive functions for managing them, plays an important role in literacy learning (e.g., movement of mouth during oral reading and movement of hand and fingers during writing); but relatively little research has focused on movement skills in students with specific learning disabilities as the current study did. Parents completed normed Movement Assessment Battery for Children Checklist, 2nd edition (ABC-2), ratings and their children in grades 4 to 9 (M = 11 years, 11 months; 94 boys, 61 girls) completed diagnostic assessment used to assign them to diagnostic groups: control typical language learning (N = 42), dysgraphia (impaired handwriting) (N = 29), dyslexia (impaired word decoding/reading and spelling) (N = 65), or oral and written language learning disability (OWL LD) (impaired syntax in oral and written language) (N = 19). The research aims were to (a) correlate the Movement ABC-2 parent ratings for Scale A Static/Predictable Environment (15 items) and Scale B Dynamic/Unpredictable Environment (15 items) with reading and writing achievement in total sample varying within and across different skills; and (b) compare each specific learning disability group with the control group on Movement ABC-2 parent ratings for Scale A, Scale B, and Scale C Movement-Related (Non-Motor Executive Functions, or Self-Efficacy, or Affect) (13 items). At least one Movement ABC-2 parent rating was correlated with each assessed literacy achievement skill. Each of three specific learning disability groups differed from the control group on two Scale A (static/predictable environment) items (fastens buttons and forms letters with pencil or pen) and on three Scale C items (distractibility, overactive, and underestimates own ability); but only OWL LD differed from control on Scale B (dynamic/unpredictable environment) items.
Applications of findings to assessment and instruction for students ascertained for and diagnosed with persisting specific learning disabilities in literacy learning, and future research directions are discussed.
Nardi, Bernardo; Arimatea, Emidio; Giovagnoli, Sara; Blasi, Stefano; Bellantuono, Cesario; Rezzonico, Giorgio
2012-01-01
The Mini Questionnaire of Personal Organization (MQPO) has been constructed in order to comply with the inward/outward Personal Meaning Organization's (PMO) theory. According to Nardi's Adaptive Post-Rationalist approach, predictable and invariable caregivers' behaviours allow inward focus and a physical sight of reciprocity; non-predictable and variable caregivers' behaviours allow outward focus and a semantic sight of reciprocity. The 20 items of MQPO have been selected from 29 intermediate (n = 160) and 40 initial items (n = 204). Psychometric validation has been conducted (n = 296), including Internal Validity (Item-Total Correlation; Factor Analysis), Internal Coherence by Factor Analysis, two analyses in Discriminant Validity (n = 132 and n = 80) and Reliability by Test-Retest Analysis (n = 49). All subjects have been given their written informed consent before beginning the test. The validation of the MQPO shows that the ultimate version is consistent with its post-rationalist paradigm. Four different factors have been found, one for each PMO. Validity of the construct and the internal reliability index are satisfying (Alpha = 0.73). Moreover, the results obtained are constant (from r = 0.80 to r = 0.89). There is an adequate agreement between the MQPO scales and the clinical evaluations (72.5%), as well as an excellent agreement (80.0%) between the scores of the MQPO and those of the Personal Meaning Questionnaire. The MQPO is a tool able to study personality as a process by focusing on the relationships between personality and developmental process axes, which are the bases of the PMO's theory, according to the APR approach. Copyright © 2011 John Wiley & Sons, Ltd.
Strom, Suzanne L; Anderson, Craig L; Yang, Luanna; Canales, Cecilia; Amin, Alpesh; Lotfipour, Shahram; McCoy, C Eric; Osborn, Megan Boysen; Langdorf, Mark I
2015-11-01
Traditional Advanced Cardiac Life Support (ACLS) courses are evaluated using written multiple-choice tests. High-fidelity simulation is a widely used adjunct to didactic content, and has been used in many specialties as a training resource as well as an evaluative tool. There are no data to our knowledge that compare simulation examination scores with written test scores for ACLS courses. To compare and correlate a novel high-fidelity simulation-based evaluation with traditional written testing for senior medical students in an ACLS course. We performed a prospective cohort study to determine the correlation between simulation-based evaluation and traditional written testing in a medical school simulation center. Students were tested on a standard acute coronary syndrome/ventricular fibrillation cardiac arrest scenario. Our primary outcome measure was correlation of exam results for 19 volunteer fourth-year medical students after a 32-hour ACLS-based Resuscitation Boot Camp course. Our secondary outcome was comparison of simulation-based vs. written outcome scores. The composite average score on the written evaluation was substantially higher (93.6%) than the simulation performance score (81.3%, absolute difference 12.3%, 95% CI [10.6-14.0%], p<0.00005). We found a statistically significant moderate correlation between simulation scenario test performance and traditional written testing (Pearson r=0.48, p=0.04), validating the new evaluation method. Simulation-based ACLS evaluation methods correlate with traditional written testing and demonstrate resuscitation knowledge and skills. Simulation may be a more discriminating and challenging testing method, as students scored higher on written evaluation methods compared to simulation.
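The reported correlation can be sanity-checked with the standard t test for a Pearson coefficient. The sketch below is illustrative only: it uses the r and sample size given in the abstract, assumes a two-tailed test against zero correlation (the abstract does not say which test was used), and takes the 5% critical value for 17 degrees of freedom from a standard t table.

```python
import math

def t_statistic_for_r(r: float, n: int) -> float:
    """t statistic for testing H0: rho = 0, given Pearson r from n paired scores."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

r, n = 0.48, 19            # values reported in the abstract
t = t_statistic_for_r(r, n)

# Two-tailed 5% critical value of Student's t with 17 degrees of freedom
# (standard table value; the choice of a two-tailed test is an assumption).
T_CRIT_DF17 = 2.110
significant = abs(t) > T_CRIT_DF17

print(f"t = {t:.2f} on {n - 2} df; significant at the 5% level: {significant}")
```

Under these assumptions the statistic falls just above the critical value, which is in line with the reported p=0.04.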
Construct Definition of Task Design and Related Concepts.
1980-05-19
with others). The final three items were from the Job Characteristics Inventory (Sims, Szilagyi, and Keller, 1976) written to tap friendship...eds.), Research in Organizational Behavior, Vol. 2, J.A.I. Press, Greenwich, Connecticut, 1980. Sims, H.P., Szilagyi, A.D., and Keller, R.T. The
76 FR 44987 - Proposed Agency Information Collection Activities; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-27
... items, which are drawn directly from definitions contained in the FDIC's assessment regulations (as... meeting the FDIC's definitions that will be used for assessment purposes only. The transition guidance... definitions of subprime and leveraged loans in the FDIC's assessment regulations were written in the context...
Bibliografia Especializada: Formacion Docente (Specialized Bibliography: Teacher Education).
ERIC Educational Resources Information Center
Boletin del Centro Nacional de Documentacion e Informacion Educativa, 1971
1971-01-01
This specialized, international bibliography on various issues in teacher education lists approximately 50 articles and books written between 1959 and 1970 in Spanish, French, English, and Portuguese. Many of the items are reports from conferences and seminars on related topics. Several concern teacher education within a given geographical region.…
7 CFR 1956.109 - General requirements for debt settlement.
Code of Federal Regulations, 2012 CFR
2012-01-01
... terms of the note or other instrument, or because of acceleration by written notice prior to the date of... expected, taking into consideration such items as the security, costs of administration, allowances of... security. Proceeds from the sale of security must be applied on the debtor's account, taking into...
7 CFR 1956.109 - General requirements for debt settlement.
Code of Federal Regulations, 2013 CFR
2013-01-01
... terms of the note or other instrument, or because of acceleration by written notice prior to the date of... expected, taking into consideration such items as the security, costs of administration, allowances of... security. Proceeds from the sale of security must be applied on the debtor's account, taking into...
7 CFR 1956.109 - General requirements for debt settlement.
Code of Federal Regulations, 2011 CFR
2011-01-01
... terms of the note or other instrument, or because of acceleration by written notice prior to the date of... expected, taking into consideration such items as the security, costs of administration, allowances of... security. Proceeds from the sale of security must be applied on the debtor's account, taking into...
COLDMON -- Cold File Analysis Package
NASA Astrophysics Data System (ADS)
Rawlinson, D. J.
The COLDMON package has been written to allow system managers to identify those items of software that are not used (or used infrequently) on their systems. It consists of a few command procedures and a Fortran program to analyze the results. It makes use of the AUDIT facility and security ACLs in VMS.
48 CFR 227.7103-11 - Contractor procedures and records.
Code of Federal Regulations, 2010 CFR
2010-10-01
... Rights in Technical Data 227.7103-11 Contractor procedures and records. (a) The clause at 252.227-7013, Rights in Technical Data—Noncommercial Items, requires a contractor, and its subcontractors or suppliers that will deliver technical data with other than unlimited rights, to establish and follow written...
Mass Communication in Malaysia: An Annotated Bibliography.
ERIC Educational Resources Information Center
Tee, Lim Huck, Comp; Sarachandran, V.V., Comp.
This bibliography lists published and unpublished material relating to mass communications in Malaysia, 1945 to 1973. Most of the items listed are written in English and Malay, and a limited number are in Chinese. The bibliography is divided into 21 sections: bibliography and reference material; communication theory, research methods;…
ERIC Educational Resources Information Center
Asian Mass Communication Research and Information Centre, Singapore.
This bibliography lists and describes published and unpublished material written in English that relates to mass communications in India, from 1945 to 1973. The items are grouped into 21 sections: bibliography and reference material; communication theory and research methods; communication (general); media development and characteristics;…
17 CFR 229.1001 - (Item 1001) Summary term sheet.
Code of Federal Regulations, 2010 CFR
2010-04-01
... sheet that is written in plain English. The summary term sheet must briefly describe in bullet point format the most material terms of the proposed transaction. The summary term sheet must provide security... transaction. The bullet points must cross-reference a more detailed discussion contained in the disclosure...
Children's Literature & Disability. Resources You Can Use. NICHCY Bibliography 5. Second Edition.
ERIC Educational Resources Information Center
National Information Center for Children and Youth with Disabilities, Washington, DC.
This bibliography of 95 items is intended to help parents and professionals identify books that are written about or include characters with a disability. The list is grouped according to the following disabilities or issues: attention deficit/hyperactivity disorder, autism, Down syndrome, hearing impairment (including deafness, learning…
76 FR 77552 - Gettysburg National Military Park Advisory Commission
Federal Register 2010, 2011, 2012, 2013, 2014
2011-12-13
.... Any member of the public may file with the Commission a written statement concerning agenda items. The... address, or other personal identifying information in your comment, you should be aware that your entire comment--including your personal identifying information--may be made publicly available at any time...
Reproduction of Inflectional Markers in French-Speaking Children with Reading Impairment
ERIC Educational Resources Information Center
St-Pierre, Marie-Catherine; Beland, Renee
2010-01-01
Purpose: Children with reading impairment (RI) experience difficulties in oral and written production of inflectional markers. The origin of these difficulties is not well documented in French. According to some authors, acquisition of irregular items by typically developing children is predicted by token frequency, whereas acquisition of regular…
77 FR 73414 - Submission for OMB Review; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2012-12-10
... value associated with sales of weapon systems or defense-related items to foreign countries or foreign... in either government-to-government or commercial sales of defense articles and/or defense services as... 20230 (or via the Internet at [email protected] ). Written comments and recommendations for the proposed...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-14
... items are organized into four focus areas: Surveillance, Prevention and Control, Research, and Product... of the Action Plan: Surveillance, Prevention and Control, Research, and Product Development. Written... for Disease Control and Prevention (CDC), Food and Drug Administration (FDA), and National Institutes...
78 FR 50049 - Meeting of the Uniform Formulary Beneficiary Advisory Panel
Federal Register 2010, 2011, 2012, 2013, 2014
2013-08-16
... follow each agenda item) a. Corticosteroids-Immune Modulators b. Self-Monitoring Blood Glucose Systems c... individuals or interested groups to address the Panel. To ensure consideration of their comments, individuals and interested groups should submit written statements as outlined in this notice; but if they still...
Adolescent Racial Identity: Self-Identification of Multiple and "Other" Race/Ethnicities
ERIC Educational Resources Information Center
Harris, Bryn; Ravert, Russell D.; Sullivan, Amanda L.
2017-01-01
This mixed methods study focused on adolescents who rejected conventional singular racial/ethnic categorization by selecting multiple race/ethnicities or writing descriptions of "Other" racial/ethnic identities in response to a survey item asking them to identify their race/ethnicity. Written responses reflected eight distinct categories…
Bilinguals Show Weaker Lexical Access during Spoken Sentence Comprehension
ERIC Educational Resources Information Center
Shook, Anthony; Goldrick, Matthew; Engstler, Caroline; Marian, Viorica
2015-01-01
When bilinguals process written language, they show delays in accessing lexical items relative to monolinguals. The present study investigated whether this effect extended to spoken language comprehension, examining the processing of sentences with either low or high semantic constraint in both first and second languages. English-German…
What do Seniors Remember from Freshman Physics?
NASA Astrophysics Data System (ADS)
Barrantes, Analia; Pawl, Andrew; Pritchard, David E.
2009-11-01
We have given a group of 56 MIT seniors who took mechanics as freshmen a written test similar to the final exam they took in their freshman course, plus the Mechanics Baseline Test (MBT) and Colorado Learning Attitudes about Science Survey (C-LASS) standard instruments. Students in majors unrelated to physics scored 60% lower on the written analytic part of the final than they did as freshmen. The mean score of all students on conceptual multiple choice questions included on the final also declined by about 60% relative to the scores of freshmen. The mean score of all participants on the MBT was insignificantly changed from the posttest taken as freshmen. More specifically, however, the students' performance on 9 of the 26 MBT items (with 6 of the 9 involving graphical kinematics) represents a gain over their freshman pretest score (a normalized gain of about 70%, double the gain achieved in the freshman course alone), while their performance on the remaining 17 questions is best characterized as a loss of approximately 50% of the material learned in the freshman course. Attitudinal survey results indicate that almost half the seniors feel the specific mechanics course content is unlikely to be useful to them, a significant majority (75-85%) feel that physics does teach valuable skills, and an overwhelming majority believe that mechanics should remain a required course at MIT.
What do seniors remember from freshman physics?
NASA Astrophysics Data System (ADS)
Pawl, Andrew; Barrantes, Analia; Pritchard, David E.; Mitchell, Rudolph
2012-12-01
We have given a group of 56 Massachusetts Institute of Technology (MIT) seniors who took mechanics as freshmen a written test similar to the final exam they took in their freshman course as well as the Mechanics Baseline Test (MBT) and the Colorado Learning Attitudes about Science Survey (CLASS). Students in majors unrelated to physics scored 60% lower on the written analytic part of the final than they would have as freshmen. The mean score of all participants on the MBT was insignificantly changed from their average on the posttest they took as freshmen. However, the students’ performance on 9 of the 26 MBT items (with 6 of the 9 involving graphical kinematics) represents a gain over their freshman posttest score (a normalized gain of about 70%), while their performance on the remaining 17 questions is best characterized as a loss of approximately 50% of the material learned in the freshman course. On multiple-choice questions covering advanced physics concepts, the mean score of the participants was about 50% lower than the average performance of freshmen. Although attitudinal survey results indicate that almost half the seniors feel the specific mechanics course content is unlikely to be useful to them, a significant majority (75%-85%) feel that physics does teach valuable problem solving skills, and an overwhelming majority believe that mechanics should remain a required course at MIT.
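The abstract does not define "normalized gain". A minimal sketch, assuming the definition commonly used in physics education research (the fraction of the possible improvement that was actually realized), is given below. The function name and the example scores are hypothetical and are not taken from the study.

```python
def normalized_gain(pre_pct: float, post_pct: float) -> float:
    """Fraction of the available improvement (100 - pre) that was realized.

    Both scores are percentages on a 0-100 scale.
    """
    if pre_pct >= 100:
        raise ValueError("a pretest score of 100% leaves no room for gain")
    return (post_pct - pre_pct) / (100.0 - pre_pct)

# Hypothetical scores chosen for illustration, not data from the study.
example = normalized_gain(pre_pct=40.0, post_pct=82.0)
print(f"normalized gain = {example:.2f}")
```

With these made-up scores, a group that starts at 40% and ends at 82% has closed 42 of the 60 available percentage points, a normalized gain of 0.70.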
Applying the health promotion model to development of a worksite intervention.
Lusk, S L; Kerr, M J; Ronis, D L; Eakin, B L
1999-01-01
Consistent use of hearing protection devices (HPDs) decreases noise-induced hearing loss; however, many workers do not use them consistently. Past research has supported the need to use a conceptual framework to understand behaviors and guide intervention programs; however, few reports have specified a process to translate a conceptual model into an intervention. The strongest predictors from the Health Promotion Model were used to design a training program to increase HPD use among construction workers. Carpenters (n = 118), operating engineers (n = 109), and plumber/pipefitters (n = 129) in the Midwest were recruited to participate in the study. Written questionnaires including scales measuring the components of the Health Promotion Model were completed in classroom settings at worker trade group meetings. All items from scales predicting HPD use were reviewed to determine the basis for the content of a program to promote the use of HPDs. Three selection criteria were developed: (1) correlation with use of hearing protection (at least .20), (2) amenability to change, and (3) room for improvement (mean score not at ceiling). Linear regression and Pearson's correlation were used to assess the components of the model as predictors of HPD use. Five predictors had statistically significant regression coefficients: perceived noise exposure, self-efficacy, value of use, barriers to use, and modeling of use of hearing protection. Using items meeting the selection criteria, a 20-minute videotape with written handouts was developed as the core of an intervention. A clearly defined practice session was also incorporated in the training intervention. Determining salient factors for worker populations and specific protective equipment prior to designing an intervention is essential. These predictors provided the basis for a training program that addressed the specific needs of construction workers. Results of tests of the effectiveness of the program will be available in the near future.
The CARE guidelines: consensus-based clinical case report guideline development.
Gagnier, Joel J; Kienle, Gunver; Altman, Douglas G; Moher, David; Sox, Harold; Riley, David
2014-01-01
A case report is a narrative that describes, for medical, scientific, or educational purposes, a medical problem experienced by one or more patients. Case reports written without guidance from reporting standards are insufficiently rigorous to guide clinical practice or to inform clinical study design. Develop, disseminate, and implement systematic reporting guidelines for case reports. We used a three-phase consensus process consisting of (1) pre-meeting literature review and interviews to generate items for the reporting guidelines, (2) a face-to-face consensus meeting to draft the reporting guidelines, and (3) post-meeting feedback, review, and pilot testing, followed by finalization of the case report guidelines. This consensus process involved 27 participants and resulted in a 13-item checklist, a reporting guideline for case reports. The primary items of the checklist are title, key words, abstract, introduction, patient information, clinical findings, timeline, diagnostic assessment, therapeutic interventions, follow-up and outcomes, discussion, patient perspective, and informed consent. We believe the implementation of the CARE (CAse REport) guidelines by medical journals will improve the completeness and transparency of published case reports and that the systematic aggregation of information from case reports will inform clinical study design, provide early signals of effectiveness and harms, and improve healthcare delivery. Copyright © 2014 Reproduced with permission of Global Advances in Health and Medicine. Published by Elsevier Inc. All rights reserved.
Selecting Items for Criterion-Referenced Tests.
ERIC Educational Resources Information Center
Mellenbergh, Gideon J.; van der Linden, Wim J.
1982-01-01
Three item selection methods for criterion-referenced tests are examined: the classical theory of item difficulty and item-test correlation; the latent trait theory of item characteristic curves; and a decision-theoretic approach for optimal item selection. Item contribution to the standardized expected utility of mastery testing is discussed. (CM)
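The first of the three methods, classical item difficulty and item-test correlation, can be sketched in a few lines. This is an illustration only, not the procedure from the paper: the response matrix is hypothetical, difficulty is taken as the proportion of correct answers, and the item-test correlation is computed as a plain Pearson correlation between the 0/1 item score and the total score (the point-biserial correlation).

```python
from statistics import mean, pstdev

def item_difficulty(item_scores):
    """Classical difficulty index: proportion of examinees answering correctly."""
    return mean(item_scores)

def item_test_correlation(item_scores, total_scores):
    """Pearson correlation between a 0/1 item score and the total test score."""
    mi, mt = mean(item_scores), mean(total_scores)
    cov = mean([(i - mi) * (t - mt) for i, t in zip(item_scores, total_scores)])
    si, st = pstdev(item_scores), pstdev(total_scores)
    if si == 0 or st == 0:
        raise ValueError("correlation is undefined when a score has no variance")
    return cov / (si * st)

# Hypothetical data: rows are examinees, columns are items (1 = correct).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
totals = [sum(row) for row in responses]
first_item = [row[0] for row in responses]

p = item_difficulty(first_item)
r_it = item_test_correlation(first_item, totals)
print(f"difficulty = {p:.2f}, item-test correlation = {r_it:.2f}")
```

A selection rule would then keep items whose two indices fall in a chosen range; any such cutoffs are a design decision that this abstract does not specify.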
Analytic study of the Tadoma method: background and preliminary results.
Norton, S J; Schultz, M C; Reed, C M; Braida, L D; Durlach, N I; Rabinowitz, W M; Chomsky, C
1977-09-01
Certain deaf-blind persons have been taught, through the Tadoma method of speechreading, to use vibrotactile cues from the face and neck to understand speech. This paper reports the results of preliminary tests of the speechreading ability of one adult Tadoma user. The tests were of four major types: (1) discrimination of speech stimuli; (2) recognition of words in isolation and in sentences; (3) interpretation of prosodic and syntactic features in sentences; and (4) comprehension of written (Braille) and oral speech. Words in highly contextual environments were much better perceived than were words in low-context environments. Many of the word errors involved phonemic substitutions which shared articulatory features with the target phonemes, with a higher error rate for vowels than consonants. Relative to performance on word-recognition tests, performance on some of the discrimination tests was worse than expected. Perception of sentences appeared to be mildly sensitive to rate of talking and to speaker differences. Results of the tests on perception of prosodic and syntactic features, while inconclusive, indicate that many of the features tested were not used in interpreting sentences. On an English comprehension test, a higher score was obtained for items administered in Braille than through oral presentation.
Database of Industrial Technological Information in Kanagawa : Networks for Technology Activities
NASA Astrophysics Data System (ADS)
Saito, Akira; Shindo, Tadashi
This system is a database that requires participation by its members and rests on the premise that all of its data are open. Aiming at free technological cooperation and exchange among industries, it was constructed by Kanagawa Prefecture in collaboration with enterprises located there. The input data comprise 36 items, such as major products, special and advantageous technology, technology sought for cooperation, and facilities and equipment, which technologically characterize each enterprise. They are expressed in 2,000 characters and written in natural language, including Kanji, except for some coded items. The 24 search items are accessed in natural language, so that, in addition to interactive search procedures including menu-type ones, extensive searching is possible. The information service started in Oct. 1986, covering data from 2,000 enterprises.
ERIC Educational Resources Information Center
Yang, Shou-Jung, Comp.
This bibliography lists and describes published and unpublished material relating to mass communications in Taiwan, from 1945 to 1973. Almost all the items listed are written in Chinese. The bibliography is divided into 17 sections: bibliography and reference material; communication theory and research methods; communication (general); media…
48 CFR 14.407-4 - Mistakes after award.
Code of Federal Regulations, 2014 CFR
2014-10-01
... CONTRACTING METHODS AND CONTRACT TYPES SEALED BIDDING Opening of Bids and Award of Contract 14.407-4 Mistakes...) To rescind a contract; (2) To reform a contract (i) to delete the items involved in the mistake or.... (vi) A written request by the contractor to reform or rescind the contract, and copies of all other...
48 CFR 14.407-4 - Mistakes after award.
Code of Federal Regulations, 2012 CFR
2012-10-01
... CONTRACTING METHODS AND CONTRACT TYPES SEALED BIDDING Opening of Bids and Award of Contract 14.407-4 Mistakes...) To rescind a contract; (2) To reform a contract (i) to delete the items involved in the mistake or.... (vi) A written request by the contractor to reform or rescind the contract, and copies of all other...
48 CFR 14.407-4 - Mistakes after award.
Code of Federal Regulations, 2011 CFR
2011-10-01
... CONTRACTING METHODS AND CONTRACT TYPES SEALED BIDDING Opening of Bids and Award of Contract 14.407-4 Mistakes...) To rescind a contract; (2) To reform a contract (i) to delete the items involved in the mistake or.... (vi) A written request by the contractor to reform or rescind the contract, and copies of all other...
48 CFR 14.407-4 - Mistakes after award.
Code of Federal Regulations, 2013 CFR
2013-10-01
... CONTRACTING METHODS AND CONTRACT TYPES SEALED BIDDING Opening of Bids and Award of Contract 14.407-4 Mistakes...) To rescind a contract; (2) To reform a contract (i) to delete the items involved in the mistake or.... (vi) A written request by the contractor to reform or rescind the contract, and copies of all other...
48 CFR 14.407-4 - Mistakes after award.
Code of Federal Regulations, 2010 CFR
2010-10-01
... CONTRACTING METHODS AND CONTRACT TYPES SEALED BIDDING Opening of Bids and Award of Contract 14.407-4 Mistakes...) To rescind a contract; (2) To reform a contract (i) to delete the items involved in the mistake or.... (vi) A written request by the contractor to reform or rescind the contract, and copies of all other...
Having an Operation: An ESL Workbook. English as a Second Language Community Survival Skills.
ERIC Educational Resources Information Center
Rabinowitz, Myrna
This workbook, written in simple English with pictographs, helps adult learners of English in British Columbia (Canada) deal with biological and medical needs and hospital patient visits. Lessons include parts of the body, nurses' and doctors' examination and instructions, general health problems, hospital items, hospital admissions, and pre- and…
Outcome Measures of Triple Board Graduates, 1991-2003
ERIC Educational Resources Information Center
Warren, Marla J.; Dunn, David W.; Rushton, Jerry
2006-01-01
Objective: To describe program outcomes for the Combined Training Program in Child and Adolescent Psychiatry, Pediatrics, and Psychiatry (Triple Board Program). Method: All Triple Board Program graduates to date (1991-2003) were asked to participate in a 37-item written survey from February to April 2004. Results: The response rate was 80.7%. Most…
Elephants: Big, Strong and Wise. Young Discovery Library Series.
ERIC Educational Resources Information Center
Pfeffer, Pierre
This book is written for children ages 5 through 10. Part of a series designed to develop their curiosity, fascinate them and educate them, this volume examines the characteristics and natural history of elephants. Topics included are: (1) elephant's ancestors; (2) elephant life; and (3) training elephants for work. Quiz items are included. (YP)
17 CFR 229.911 - (Item 911) Reports, opinions and appraisals.
Code of Federal Regulations, 2010 CFR
2010-04-01
... fairness of the consideration to be offered to investors in connection with the roll-up transaction or the fairness of such transaction to the general partner or investors. (2) With respect to any report, opinion... to the effect that upon written request by an investor or his representative who has been so...
25 CFR 11.704 - Appointment and duties of executor or administrator.
Code of Federal Regulations, 2010 CFR
2010-04-01
... against the decedent's estate and determine their validity; (4) To cause a written inventory of all the decedent's property within the estate to be prepared promptly with each article or item being separately set forth and cause such property to be exhibited to and appraised by an appraiser, and the inventory...
Chinese and American Textbook Business--Totally Different the Finding.
ERIC Educational Resources Information Center
Mehl, Marc
1978-01-01
Members of the second United States booksellers delegation to the People's Republic of China observed that textbooks in China carry political messages; the state and teachers are involved in the publishing process; texts are written by committees; and textbooks are almost always paperbacks and not available as a retail item. (JMD)
Federal Register 2010, 2011, 2012, 2013, 2014
2010-04-14
... persons, to submit written comments or suggestions regarding items for inclusion in a new Work Program for... the performance of the forest concession system in meeting economic, social, and ecological objectives... Governance. We are requesting ideas and suggestions that may be considered for inclusion in the next Work...
Sensitivity to Disgust and Perceptions of Natural Bodies of Water and Watercraft Activities
Robert D. Bixler; Gwynn Powell
2003-01-01
A written 7-item self-report scale on sensitivity to disgust and participation in watercraft activities was administered to 450 seasonal park employees. Correlations indicate that nonparticipation in seven different watercraft sports was weakly related with reactions of disgust to contact with natural bodies of water (rpbis...
Comparing the Lexical Features of EAP Students' Essays by Prompt and Rating
ERIC Educational Resources Information Center
Lavallée, Maxime; McDonough, Kim
2015-01-01
Previous research has shown that high frequency lexical items, such as AWL words and formulaic expressions, may differentiate between texts written by expert and novice writers (Chen & Baker, 2010; Hancioglu, 2009), and that lexical features related to breadth, depth, and accessibility differentiate among texts from L2 writers of different…
24 CFR 904.111 - Nonroutine Maintenance Reserve (NRMR).
Code of Federal Regulations, 2010 CFR
2010-04-01
... painting, etc.), (2) show for each listed item the estimated frequency of maintenance or useful life before... but only pursuant to a prior written agreement with the LHA covering the nature and scope of the work... of the work, be credited to the homebuyer's EHPA and charged as provided in paragraph (c)(2) of this...
22 CFR 123.9 - Country of ultimate destination and approval of reexports or retransfers.
Code of Federal Regulations, 2014 CFR
2014-04-01
... subchapter as Missile Technology Control Regime (MTCR) items; and (3) The person reexporting the defense... ARMS REGULATIONS LICENSES FOR THE EXPORT AND TEMPORARY IMPORT OF DEFENSE ARTICLES § 123.9 Country of... written approval of the Directorate of Defense Trade Controls must be obtained before reselling...
Communication Strategies and Psychological Processes Underlying Lexical Simplification.
ERIC Educational Resources Information Center
Kumaravadivelu, B.
1988-01-01
Analyzes interlanguage written discourse produced by advanced Tamil-speaking learners of English as a second language. Eight communication strategies are discussed, including: 1) extended use of lexical items; 2) lexical paraphrase; 3) word coinage; 4) native language (L1) equivalence; 5) literal translation of L1 idiom; 6) L1 mode of emphasis; 7)…
ERIC Educational Resources Information Center
Federer, Meghan Rector; Nehm, Ross H.; Pearl, Dennis K.
2016-01-01
Understanding sources of performance bias in science assessment provides important insights into whether science curricula and/or assessments are valid representations of student abilities. Research investigating assessment bias due to factors such as instrument structure, participant characteristics, and item types is well documented across a…
Rapp, B; Caramazza, A
1997-02-01
We describe the case of a brain-damaged individual whose speech is characterized by difficulty with practically all words except for elements of the closed class vocabulary. In contrast, his written sentence production exhibits a complementary impairment involving the omission of closed class vocabulary items and the relative sparing of nouns. On the basis of these differences we argue: (1) that grammatical categories constitute an organizing parameter of representation and/or processing for each of the independent, modality-specific lexicons, and (2) that these observations contribute to the growing evidence that access to the orthographic and phonological forms of words can occur independently.
NASA Astrophysics Data System (ADS)
Clark, Sharron Ann
This is possibly the first study of a hybrid online biology course where WebCT internet-enhanced modes of instruction replaced conventional face-to-face (F2F) lecture materials, merging with collaborative inquiry-based on-campus laboratory instructional modes. Although not a true experiment, the design of this study included three independent cohorts, a pretest and three posttests, as described by Gay and Airasian (2000). This study reported differences in age, gender, number of prior online courses and pretest scores. Over time, persistence, achievement and computer self-efficacy differed in one hybrid online section (N = 31) and two F2F cohorts (N = 29 and 30). One F2F cohort used written test materials and the other used intranet-delivered materials to examine possible differences in groups using electronic assessment modes. In this study, community college students self-selecting into online hybrid and traditional versions of the same biology course did not have the same number of prior online courses, achievement or persistence rates as those self-selecting into F2F sections of the same course with the same laboratories and instructor. This study includes twenty pretest items selected from Instructor's Manual and Test Item File to Accompany: Inquiry into Life, 9th Edition (Schrock, 2000). This study produced 63 tables, 13 figures and 173 references.
Vaona, Alberto; Marcon, Alessandro; Rava, Marta; Buzzetti, Roberto; Sartori, Marco; Abbinante, Crescenza; Moser, Andrea; Seddaiu, Antonia; Prontera, Manuela; Quaglio, Alessandro; Pallazzoni, Piera; Sartori, Valentina; Rigon, Giulio
2011-12-01
Many medical journals provide patient information leaflets on the correct use of medicines and/or appropriate lifestyles. Only a few studies have assessed the quality of this patient-specific literature. The purpose of this study was to evaluate the quality of JAMA Patient Pages on diabetes using the Ensuring Quality Information for Patients (EQIP) tool. A multidisciplinary group of 10 medical doctors analyzed all diabetes-related Patient Pages published by JAMA from 1998 to 2010 using the EQIP tool. Inter-rater reliability was assessed using the percentage of observed total agreement (p(o)). A quality score between 0 and 1 (the higher score indicating higher quality) was calculated for each item on every page as a function of raters' answers to the EQIP checklist. A mean score per item and a mean score per page were then calculated. We found 8 Patient Pages on diabetes on the JAMA web site. The overall quality score of the documents ranged between 0.55 (Managing Diabetes and Diabetes) and 0.67 (weight and diabetes). p(o) was at least moderate (>50%) for 15 of the 20 EQIP items. Despite generally favorable quality scores, some items received low scores. The worst scores were for the item assessing provision of an empty space to customize information for individual patients (score=0.01, p(o)=95%) and patients' involvement in document drafting (score=0.11, p(o)=79%). The Patient Pages on diabetes published by JAMA were found to present weak points that limit their overall quality and may jeopardize their efficacy. We therefore recommend that authors and publishers of written patient information comply with published quality criteria. Further research is needed to evaluate the quality and efficacy of existing written health care information. Copyright © 2011 Primary Care Diabetes Europe. Published by Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Burns, Daniel J.; Martens, Nicholas J.; Bertoni, Alicia A.; Sweeney, Emily J.; Lividini, Michelle D.
2006-01-01
In a repeated testing paradigm, list items receiving item-specific processing are more likely to be recovered across successive tests (item gains), whereas items receiving relational processing are likely to be forgotten progressively less on successive tests. Moreover, analysis of cumulative-recall curves has shown that item-specific processing…
The Development of a Nystagmus-Specific Quality-of-Life Questionnaire.
McLean, Rebecca J; Maconachie, Gail D E; Gottlob, Irene; Maltby, John
2016-09-01
To develop a nystagmus-specific quality-of-life (QOL) questionnaire derived from patient concerns based on eudaimonic aspects of well-being. Cross-sectional study. A total of 206 participants with nystagmus for factor analysis phase and an additional 42 participants with nystagmus for construct validity phase. Questionnaire items were written on the basis of the 6 domains of everyday living affected by nystagmus that were elicited by previous semistructured interviews conducted with 21 people with nystagmus. After consultation with 8 nystagmus experts, 37 items were administered to 206 people with nystagmus. Factor analysis was used to identify latent factors among the items and identify items to propose new nystagmus QOL scales. Cronbach's alpha was used to assess the internal reliability of the new scales. To assess for discriminate and concurrent validity between the new nystagmus scales and an existing vision-related QOL tool, the Visual Function Questionnaire-25 (VFQ-25) was administered to 42 additional participants. Questionnaire response scores on nystagmus-specific QOL items. The factor analysis revealed the retention of 29 items to form a measure comprising 2 distinct subscales reflecting "personal and social" and "physical and environmental" functioning as relating to nystagmus-specific QOL. The Cronbach's alpha coefficients for the "personal and social" functioning scale and "physical and environmental" functioning were 0.95 and 0.93, respectively. Tests for validity of the measure, consistent with a priori predictions, when compared with the VFQ-25, revealed the "physical and environmental" subscale showed concurrent validity (0.88), whereas the "personal and social" subscale was demonstrated to have discriminative validity (0.81). We have developed a 29-item, nystagmus-specific QOL questionnaire (NYS-29) based on eudaimonic aspects of well-being with subscales that address not only physical functioning but also psycho-social issues. 
The NYS-29 is grounded in the perspectives and concerns of those who have nystagmus and can be used to determine the impact of nystagmus on daily living in terms of both physical and psychosocial aspects. Copyright © 2016 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
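The internal-reliability figures quoted above (Cronbach's alpha of 0.95 and 0.93) come from a standard formula that can be sketched in a few lines. This is a minimal illustration only: the function name `cronbach_alpha` and the response matrix `demo` are hypothetical and made up for the example, and none of the numbers are data from the NYS-29 study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Made-up responses (5 respondents x 4 items); NOT data from the study.
demo = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [4, 4, 5, 4],
    [5, 5, 4, 5],
]
alpha = cronbach_alpha(demo)
print(f"alpha = {alpha:.3f}")
```

Values near 1 indicate that the items vary together; the sketch uses sample variances throughout, which is one common convention.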
ERIC Educational Resources Information Center
Matlock, Ki Lynn; Turner, Ronna
2016-01-01
When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…
Students’ misconceptions on solubility equilibrium
NASA Astrophysics Data System (ADS)
Setiowati, H.; Utomo, S. B.; Ashadi
2018-05-01
This study investigated students’ misconceptions of solubility equilibrium. The participants were 164 students in the science class of the second year of high school. The instrument used was a two-tier diagnostic test consisting of 15 items. Responses were marked and coded into four categories: understanding, misconception, partial understanding without misconception, and no understanding. Semi-structured interviews were carried out with 45 students, selected according to written responses that reflected different perspectives, to obtain a more elaborate source of data. Data collected from multiple methods were analyzed qualitatively and quantitatively. The data analysis showed that the students held misconceptions in all areas of solubility equilibrium. They held more misconceptions in areas such as the relation between solubility and solubility product, the common-ion effect and pH in solubility, and the concept of precipitation.
Factors associated with middle-school mathematics achievement in Greece: the case of algebra
NASA Astrophysics Data System (ADS)
Skouras, A. S.
2014-01-01
This study presents a subset of factors and their association with students' achievement in school algebra. The participants were students who had enrolled in 2007 in the ninth year of Greek public education (third year of middle school). A total of 735 students (aged 14-15 years) from 37 public secondary schools participated. The sample consisted of 378 girls (51.4%) and 357 boys (48.6%). A written algebra test and a questionnaire including demographic survey items were used to collect data. The results show that attitude towards mathematics (ATM) and the current teacher rating of mathematics performance were the most significant predictors of algebra achievement, contributing 18.1% and 24.7%, respectively, to the total variance of the mean at the end of ninth grade.
Gattrell, William T; Hopewell, Sally; Young, Kate; Farrow, Paul; White, Richard; Winchester, Christopher C
2016-01-01
Objectives: Authors may choose to work with professional medical writers when writing up their research for publication. We examined the relationship between medical writing support and the quality and timeliness of reporting of the results of randomised controlled trials (RCTs). Design: Cross-sectional study. Study sample: Primary reports of RCTs published in BioMed Central journals from 2000 to 16 July 2014, subdivided into those with medical writing support (n=110) and those without medical writing support (n=123). Main outcome measures: Proportion of items that were completely reported from a predefined subset of the Consolidated Standards of Reporting Trials (CONSORT) checklist (12 items known to be commonly poorly reported), overall acceptance time (from manuscript submission to editorial acceptance) and quality of written English as assessed by peer reviewers. The effect of funding source and publication year was examined. Results: The number of articles that completely reported at least 50% of the CONSORT items assessed was higher for those with declared medical writing support (39.1% (43/110 articles); 95% CI 29.9% to 48.9%) than for those without (21.1% (26/123 articles); 95% CI 14.3% to 29.4%). Articles with declared medical writing support were more likely than articles without such support to have acceptable written English (81.1% (43/53 articles); 95% CI 67.6% to 90.1% vs 47.9% (23/48 articles); 95% CI 33.5% to 62.7%). The median time of overall acceptance was longer for articles with declared medical writing support than for those without (167 days (IQR 114.5–231 days) vs 136 days (IQR 77–193 days)). Conclusions: In this sample of open-access journals, declared professional medical writing support was associated with more complete reporting of clinical trial results and higher quality of written English. Medical writing support may play an important role in raising the quality of clinical trial reporting. PMID:26899254
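The proportions and 95% confidence intervals reported above can be illustrated with a short calculation. The abstract does not say which interval method the authors used, so the sketch below uses the Wilson score interval as an assumption; its bounds need not match the published ones exactly. The function name `wilson_interval` is hypothetical, while the counts (43/110 and 26/123) are taken from the abstract.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Counts from the abstract: articles completely reporting >= 50% of the
# assessed CONSORT items, with and without declared medical writing support.
groups = {"with support": (43, 110), "without support": (26, 123)}
for label, (k, n) in groups.items():
    lo, hi = wilson_interval(k, n)
    print(f"{label}: {k / n:.1%} (Wilson 95% CI {lo:.1%} to {hi:.1%})")
```

The Wilson interval is chosen here because it behaves well for moderate sample sizes; an exact (Clopper-Pearson) interval would be an equally reasonable choice and gives slightly wider bounds.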
ERIC Educational Resources Information Center
Spaan, Mary
2007-01-01
This article follows the development of test items (see "Language Assessment Quarterly", Volume 3 Issue 1, pp. 71-79 for the article "Test and Item Specifications Development"), beginning with a review of test and item specifications, then proceeding to writing and editing of items, pretesting and analysis, and finally selection of an item for a…
ERIC Educational Resources Information Center
Hewitt, Margaret A.; Homan, Susan P.
2004-01-01
Test validity issues considered by test developers and school districts rarely include individual item readability levels. In this study, items from a major standardized test were examined for individual item readability level and item difficulty. The Homan-Hewitt Readability Formula was applied to items across three grade levels. Results of…
The Effect of the Position of an Item within a Test on the Item Difficulty Value.
ERIC Educational Resources Information Center
Rubin, Lois S.; Mott, David E. W.
An investigation of the effect on the difficulty value of an item due to position placement within a test was made. Using a 60-item operational test comprised of 5 subtests, 60 items were placed as experimental items on a number of spiralled test forms in three different positions (first, middle, last) within the subtest composed of like items.…
ERIC Educational Resources Information Center
Marie, S. Maria Josephine Arokia; Edannur, Sreekala
2015-01-01
This paper focused on the analysis of test items constructed in the paper of teaching Physical Science for B.Ed. class. It involved the analysis of difficulty level and discrimination power of each test item. Item analysis allows selecting or omitting items from the test, but more importantly item analysis is a tool to help the item writer improve…
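The two quantities named in this abstract, difficulty level and discrimination power, can be sketched for dichotomously scored items. The abstract does not state the formulas used, so this example assumes one common convention: difficulty as the proportion answering correctly, and discrimination as the difference in proportion correct between the top and bottom 27% of examinees by total score. The function name `item_analysis` and the response data are hypothetical.

```python
def item_analysis(responses, group_fraction=0.27):
    """Return (difficulty, discrimination) per item for 0/1 scored responses.

    difficulty     = proportion of all examinees answering the item correctly
    discrimination = proportion correct in the upper group minus the
                     proportion correct in the lower group
    """
    n = len(responses)
    k = max(1, round(n * group_fraction))  # size of upper and lower groups
    ranked = sorted(responses, key=sum, reverse=True)  # best totals first
    upper, lower = ranked[:k], ranked[-k:]
    results = []
    for i in range(len(responses[0])):
        difficulty = sum(r[i] for r in responses) / n
        discrimination = (sum(r[i] for r in upper) - sum(r[i] for r in lower)) / k
        results.append((difficulty, discrimination))
    return results


# Made-up scores (6 examinees x 3 items); NOT data from the paper.
demo = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
results = item_analysis(demo)
for idx, (p, d) in enumerate(results, start=1):
    print(f"item {idx}: difficulty={p:.2f} discrimination={d:.2f}")
```

Items with very high or very low difficulty, or with discrimination near zero or negative, are the usual candidates for revision or removal.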
Patient health record on a smart card.
Naszlady, A; Naszlady, J
1998-02-01
A validated health questionnaire has been used for the documentation of a patient's history (826 items) and of the findings from physical examination (591 items) in our clinical ward for 25 years. This computerized patient record has been completed in EUCLIDES code (CEN TC/251) for laboratory tests and an ATC and EAN code listing for the names of the drugs permanently required by the patient. In addition, emergency data were also included on an EEPROM chipcard with a 24 kb capacity. The program is written in FOX-PRO language. A group of 5000 chronically ill in-patients received these cards, which contain their health data. For security reasons the contents of the smart card are accessible only by a doctor's PIN coded key card. The personalization of each card was carried out in our health center and the depersonalized alphanumeric data were collected for further statistical evaluation. This information served as a basis for a real need assessment of health care and for the calculation of its cost. Code-combined with an optical card, a completely paperless electronic patient record system has been developed containing all three information carriers in medicine: Texts, Curves and Pictures.
Rollo, Megan E; Ash, Susan; Lyons-Wall, Philippa; Russell, Anthony
2011-01-01
We evaluated a mobile phone application (Nutricam) for recording dietary intake. It allowed users to capture a photograph of food items before consumption and store a voice recording to explain the contents of the photograph. This information was then sent to a website where it was analysed by a dietitian. Ten adults with type 2 diabetes (BMI 24.1-47.9 kg/m(2)) recorded their intake over a three-day period using both Nutricam and a written food diary. Compared to the food diary, energy intake was under-recorded by 649 kJ (SD 810) using the mobile phone method. However, there was no trend in the difference between dietary assessment methods at levels of low or high energy intake. All subjects reported that the mobile phone system was easy to use. Six subjects found that the time taken to record using Nutricam was shorter than recording using the written diary, while two reported that it was about the same. The level of detail provided in the voice recording and food items obscured in photographs reduced the quality of the mobile phone records. Although some modifications to the mobile phone method will be necessary to improve the accuracy of self-reported intake, the system was considered an acceptable alternative to written records and has the potential to be used by adults with type 2 diabetes for monitoring dietary intake by a dietitian.
Montoya, A; Llopis, N; Gilaberte, I
2011-12-01
DISCERN is an instrument designed to help patients assess the reliability of written information on treatment choices. Originally created in English, the instrument has no validated Spanish version. This study seeks to validate the Spanish translation of the DISCERN instrument used as a primary measure in a multicenter study aimed at assessing the reliability of web-based information on treatment choices for attention deficit/hyperactivity disorder (ADHD). We used a modified version of a method for validating translated instruments in which the original source-language version is formally compared with the back-translated source-language version. Each item was ranked in terms of comparability of language, similarity of interpretability, and degree of understandability. Responses used Likert scales ranging from 1 to 7, where 1 indicates the best interpretability, language and understandability, and 7 indicates the worst. Assessments were performed by 20 raters fluent in the source language. The Spanish translation of DISCERN, based on ratings of comparability, interpretability and degree of understandability (mean score (SD): 1.8 (1.1), 1.4 (0.9) and 1.6 (1.1), respectively), was considered extremely comparable. All items received a score of less than three; therefore, no further revision of the translation was needed. The validation process showed that the quality of the DISCERN translation was high, confirming that the language of the translated tool for assessing written information on treatment choices for ADHD is comparable to that of the original.
ERIC Educational Resources Information Center
Wang, Wei
2013-01-01
Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…
Test item linguistic complexity and assessments for deaf students.
Cawthon, Stephanie
2011-01-01
Linguistic complexity of test items is one test format element that has been studied in the context of struggling readers and their participation in paper-and-pencil tests. The present article presents findings from an exploratory study on the potential relationship between linguistic complexity and test performance for deaf readers. A total of 64 students completed 52 multiple-choice items, 32 in mathematics and 20 in reading. These items were coded for linguistic complexity components of vocabulary, syntax, and discourse. Mathematics items had higher linguistic complexity ratings than reading items, but there were no significant relationships between item linguistic complexity scores and student performance on the test items. The discussion addresses issues related to the subject area, student proficiency levels in the test content, factors to look for in determining a "linguistic complexity effect," and areas for further research in test item development and deaf students.
The Selection of Test Items for Decision Making with a Computer Adaptive Test.
ERIC Educational Resources Information Center
Spray, Judith A.; Reckase, Mark D.
The issue of test-item selection in support of decision making in adaptive testing is considered. The number of items needed to make a decision is compared for two approaches: selecting items from an item pool that are most informative at the decision point or selecting items that are most informative at the examinee's ability level. The first…
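The two selection approaches compared in this abstract can be illustrated with item information functions. The abstract does not name a response model, so the sketch below assumes a two-parameter logistic (2PL) model, in which an item with discrimination a and difficulty b has information a² · P(θ) · (1 − P(θ)) at ability θ. The item pool, the cut score, and the ability estimate are all hypothetical.

```python
from math import exp


def p_correct(theta, a, b):
    """Probability of a correct response under the assumed 2PL model."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))


def information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def most_informative(pool, theta):
    """Pick the item in the pool with maximum information at theta."""
    return max(pool, key=lambda item: information(theta, item["a"], item["b"]))


# Hypothetical item pool; parameters are made up for illustration.
pool = [
    {"id": "easy", "a": 1.0, "b": -1.0},
    {"id": "medium", "a": 1.0, "b": 0.0},
    {"id": "hard", "a": 1.0, "b": 1.5},
]
cut_score = 0.0          # decision point on the ability scale (assumed)
ability_estimate = 1.4   # current estimate for one examinee (assumed)

by_cut = most_informative(pool, cut_score)
by_ability = most_informative(pool, ability_estimate)
print("selected at decision point:", by_cut["id"])
print("selected at ability estimate:", by_ability["id"])
```

For an examinee whose ability is far from the decision point, the two rules pick different items, which is why the number of items needed to reach a decision can differ between them.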
ERIC Educational Resources Information Center
Espejo, Cristina Y., Ed.; Fontgalland, Guy de, Ed.
This bibliography lists and describes published and unpublished material relating to mass communications in Singapore, from 1945 to 1973. Most of the items listed are written in English; a limited number are in Chinese. The bibliography is divided into 18 sections: bibliography and reference material; communication theory and research methods;…
History Untold: Celebrating Ohio History Through ABLE Students. Ohio History Project.
ERIC Educational Resources Information Center
Kent State Univ., OH. Ohio Literacy Resource Center.
This document is a compilation of 33 pieces of writing presenting Ohio adult basic and literacy education (ABLE) students' perspectives of community and personal history. The items included in the compilation were written by ABLE students across Ohio in celebration of Ohio History Day. The compilation is organized in five sections as follows: (1)…
Information System Life-Cycle And Documentation Standards (SMAP DIDS)
NASA Technical Reports Server (NTRS)
1990-01-01
Although not computer program, SMAP DIDS written to provide systematic, NASA-wide structure for documenting information system development projects. Each DID (data item description) outlines document required for top-quality software development. When combined with management, assurance, and life cycle standards, Standards protect all parties who participate in design and operation of new information system.
ERIC Educational Resources Information Center
Holcombe, Lee
2016-01-01
The Educational Policy Institute released a new report today about the ability of state and national databases to meet the policy needs related to higher education affordability. Written by EPI Senior Research Associate Lee Holcombe, the report finds that although states are establishing ambitious higher education participation and success targets…
Powerful Letter Writing: Students Discover the Power of the Pen(cil).
ERIC Educational Resources Information Center
Neuhauser, Sandra P.
Students can better experience the impact of written words by the response of a real audience. This paper describes a letter-writing activity for sixth graders in which the students first identified and discussed real-life happenings which had caused disappointments with companies that had produced faulty items. They then obtained all the…
ERIC Educational Resources Information Center
Kanwit, Matthew; Geeslin, Kimberly L.
2014-01-01
The present study fills a need for investigations of learner and native speaker (NS) interpretation of the Spanish subjunctive in contexts that allow variation. The analysis compares responses by NSs and three levels of learners on a written interpretation task in which each item contained a temporal indicator ("cuando" "when",…
Verification of Self-Report Temperament Factors.
ERIC Educational Resources Information Center
Dermen, Diran; And Others
In an earlier report, one of the authors describes 28 temperament factors for which there is sufficient consensus in the literature to call them "established". From 1 to 5 distinct bipolar subscales were suggested to mark each factor; 12 or 16 item subscales were written, each balanced in terms of the two defined poles and in terms of numbers of…
15 CFR 748.4 - Basic guidance related to applying for a license.
Code of Federal Regulations, 2010 CFR
2010-01-01
... Foreign Trade (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE EXPORT ADMINISTRATION... sending of items out of the United States, except for Encryption License Arrangements (ELA) (see § 750.7(d... (Additional information) must be marked “748.4(b)(2)” to indicate that the power of attorney or other written...
15 CFR 748.4 - Basic guidance related to applying for a license.
Code of Federal Regulations, 2011 CFR
2011-01-01
... Foreign Trade (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE EXPORT ADMINISTRATION... sending of items out of the United States, except for Encryption License Arrangements (ELA) (see § 750.7(d... (Additional information) must be marked “748.4(b)(2)” to indicate that the power of attorney or other written...
Federal Register 2010, 2011, 2012, 2013, 2014
2010-05-10
... written commitments from the funding source at pre-application. If leverage funds are in the form of tax...-spaced between items and not be in narrative form. (a) Applicant's name. (b) Applicant's Taxpayer... and financial capability to carry out the obligation of the loan. (iii) Standard Form 424...
Preservice and Inservice Science Teachers' Responses and Reasoning about the Nature of Science
ERIC Educational Resources Information Center
Buaraphan, Khajornsak
2009-01-01
An adequate understanding of the nature of science (NOS) is essential for science teachers. The Myths of Science Questionnaire (MOSQ) consisting of 14 items, which comprised both optional and written types of response, was utilized to explore 113 Thai preservice and 101 inservice science teachers' understanding and reasoning about the NOS,…
LEARN JAPANESE--ELEMENTARY SCHOOL TEXT, VOLUME II.
ERIC Educational Resources Information Center
SATO, YAEKO; AND OTHERS
THIS TEXT WAS WRITTEN FOR THE USE OF THE ELEMENTARY SCHOOL TEACHER OF JAPANESE. IT IS TO BE USED IN THE SECOND SEMESTER OF JAPANESE LANGUAGE STUDY AND FOLLOWS THE AUDIO-LINGUAL ORIENTATION OF VOLUME I. THE MAIN GOAL OF BOTH VOLUMES IS "TO ELEVATE THE PUPIL'S MOTIVATION AND TO CULTIVATE PROPER PRONUNCIATION HABITS." THE NEW ITEMS IN VOLUME II…
Validation of a Four-Factor Model of Career Indecision
ERIC Educational Resources Information Center
Brown, Steven D.; Hacker, Jason; Abrams, Matthew; Carr, Andrea; Rector, Christopher; Lamp, Kristen; Telander, Kyle; Siena, Anne
2012-01-01
Two studies were designed to explore whether a meta-analytically derived four-factor model of career indecision (Brown & Rector, 2008) could be replicated at the primary and secondary data levels. In the first study, an initial pool of 167 items was written based on 35 different instruments whose scores had loaded saliently on at least one…
Code of Federal Regulations, 2014 CFR
2014-10-01
... than a tire) that was installed in or on a motor vehicle at the time of its delivery to the first purchaser if the item of equipment was installed on or in the motor vehicle at the time of its delivery to a... readable by machine. If readable by machine, the submitting party must obtain written confirmation from the...
Code of Federal Regulations, 2012 CFR
2012-10-01
... than a tire) that was installed in or on a motor vehicle at the time of its delivery to the first purchaser if the item of equipment was installed on or in the motor vehicle at the time of its delivery to a... readable by machine. If readable by machine, the submitting party must obtain written confirmation from the...
Code of Federal Regulations, 2013 CFR
2013-10-01
... than a tire) that was installed in or on a motor vehicle at the time of its delivery to the first purchaser if the item of equipment was installed on or in the motor vehicle at the time of its delivery to a... readable by machine. If readable by machine, the submitting party must obtain written confirmation from the...
ERIC Educational Resources Information Center
Anderson, Daniel; Irvin, P. Shawn; Patarapichayatham, Chalie; Alonzo, Julie; Tindal, Gerald
2012-01-01
In the following technical report, we describe the development and scaling of the easyCBM CCSS middle school mathematics measures, designed for use within a response to intervention framework. All items were developed in collaboration with experienced middle school mathematics teachers and were written to align with the Common Core State…
ERIC Educational Resources Information Center
Taviss, Irene, Ed.; Silverman, Linda, Ed.
Literature dealing with the relationship between technological change and value change is surveyed in this review. The emphasis is on the effect on contemporary American society. Most of the items covered were written since 1966. Topics include the contemporary situation, changing value orientations, social planning and the role of the social…
Assessing Lexical Proficiency Using Analytic Ratings: A Case for Collocation Accuracy
ERIC Educational Resources Information Center
Crossley, Scott A.; Salsbury, Tom; Mcnamara, Danielle S.
2015-01-01
This study analyzes lexical proficiency in oral and written texts produced by second language (L2) learners of English. The purpose of the study is to examine relationships between analytic scores of depth of lexical knowledge, breadth of lexical knowledge, and access to core lexical items and holistic scores of lexical proficiency. A corpus of…
ERIC Educational Resources Information Center
Montana Univ., Missoula. Div. of Educational Research and Services.
This compilation of essays and resources focuses on acceptance of children with disabilities and cooperation between parents and early intervention specialists. The compilation includes the following items written by Jan Spiegle-Mariska (sometimes cited as Jan Mariska): "Building Effective Parent/Professional Partnerships"; "What Parents Want from…
Learner Strategies for Filling the Knowledge Gap during Collaborative Tasks
ERIC Educational Resources Information Center
Gearon, Margaret
2004-01-01
Recent research by Swain (2000a, 2000b, 1998, 1995), Swain and Lapkin (2001, 1998, 1995) and Kowal and Swain (1997, 1994) has examined the role of collaborative tasks in focusing immersion students' attention on the need for explicit knowledge of grammatical forms and lexical items in the production (especially written) of French texts. This is…
The Effect of the Number of Syllables on Handwriting Production
ERIC Educational Resources Information Center
Lambert, Eric; Kandel, Sonia; Fayol, Michel; Esperet, Eric
2008-01-01
Four experiments examined whether motor programming in handwriting production can be modulated by the syllable structure of the word to be written. This study manipulated the number of syllables. The items, words and pseudo-words, had 2, 3 or 4 syllables. French adults copied them three times. We measured the latencies between the visual…
17 CFR 210.12-23 - Mortgage loans on real estate and interest earned on mortgages.
Code of Federal Regulations, 2010 CFR
2010-04-01
... each of the above classes of mortgage loans the average gross rate of interest on mortgage loans held... mortgages sold Amortization of premium Other (describe) Balance at close of period $ If additions represent... item of mortgage loans on real estate investments has been written down or reserved against pursuant to...
ERIC Educational Resources Information Center
Schultz, Madeleine
2011-01-01
This paper reports on the development of a tool that generates randomised, non-multiple choice assessment within the BlackBoard Learning Management System interface. An accepted weakness of multiple-choice assessment is that it cannot elicit learning outcomes from upper levels of Biggs' SOLO taxonomy. However, written assessment items require…
Salm, Florian; Schneider, Sandra; Schmücker, Katja; Petruschke, Inga; Kramer, Tobias S; Hanke, Regina; Schröder, Christin; Heintze, Christoph; Schwantes, Ulrich; Gastmeier, Petra; Gensichen, Jochen
2018-05-04
This study investigates the barriers and facilitators of the use of antibiotics in acute respiratory tract infections by general practitioners (GPs) in Germany. A multidisciplinary team designed and pre-tested a written questionnaire addressing the topics awareness of antimicrobial resistance (7 items), use of antibiotics (9 items), guidelines/sources of information (9 items) and sociodemographic factors (7 items), using a five-point Likert scale ("never" to "very often"). The questionnaire was mailed to 987 GPs with registered practices in eastern Germany in May 2015. 34% (340/987) of the GPs responded to this survey. Most of the participants assumed a multifactorial origin for the rise of multidrug-resistant organisms. In addition, 70.2% (239/340) believed that their own prescribing behavior influenced the drug-resistance situation in their area. GPs with longer work experience (> 25 years) assumed less individual influence on drug resistance than their colleagues with less than 7 years' experience as practicing physicians (Odds Ratio [OR] 0.32, 95% Confidence Interval [CI] 0.17-0.62; P < 0.001). 99.1% (337/340) of participants were familiar with the "delayed prescription" strategy to reduce antibiotic prescriptions. However, only 29.4% (74/340) answered that they apply it "often" or "very often". GPs working in rural areas were less likely than those working in urban areas to apply delayed prescription. Knowledge of the factors causing antimicrobial resistance in bacteria is good among GPs in eastern Germany. However, measures to improve rational prescription are not yet widely implemented. Further efforts have to be made in order to improve rational prescription of antibiotics among GPs. Nevertheless, there is a strong awareness of antimicrobial resistance among the participating GPs.
Tepe, Rodger; Tepe, Chabha
2015-03-01
To develop and psychometrically evaluate an information literacy (IL) self-efficacy survey and an IL knowledge test. In this test-retest reliability study, a 25-item IL self-efficacy survey and a 50-item IL knowledge test were developed and administered to a convenience sample of 53 chiropractic students. Item analyses were performed on all questions. The IL self-efficacy survey demonstrated good reliability (test-retest correlation = 0.81) and good/very good internal consistency (mean κ = .56 and Cronbach's α = .92). A total of 25 questions with the best item analysis characteristics were chosen from the 50-item IL knowledge test, resulting in a 25-item IL knowledge test that demonstrated good reliability (test-retest correlation = 0.87), very good internal consistency (mean κ = .69, KR20 = 0.85), and good item discrimination (mean point-biserial = 0.48). This study resulted in the development of three instruments: a 25-item IL self-efficacy survey, a 50-item IL knowledge test, and a 25-item IL knowledge test. The information literacy self-efficacy survey and the 25-item version of the information literacy knowledge test have shown preliminary evidence of adequate reliability and validity to justify continuing study with these instruments.
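The internal-consistency statistics named in the abstract above (Cronbach's α and KR20) can be illustrated with a minimal sketch. The response matrix below is invented for illustration and is not data from the study; the function names are hypothetical, and population variances are an assumption made here so that the two coefficients coincide for 0/1 items.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha; scores is a list of respondents, each a list of item scores."""
    k = len(scores[0])
    # Variance of each item across respondents (population variance, by assumption).
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the total score across respondents.
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def kr20(scores):
    """KR-20 for dichotomous (0/1) items; each item's variance is p * (1 - p)."""
    n = len(scores)
    k = len(scores[0])
    pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in scores) / n  # proportion answering item i correctly
        pq += p * (1 - p)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - pq / total_var)


# Hypothetical 0/1 responses: 6 respondents x 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
alpha = cronbach_alpha(responses)
kr = kr20(responses)
print(round(alpha, 4), round(kr, 4))
```

For dichotomous items the two coefficients are algebraically the same quantity, which is why knowledge tests scored right/wrong are usually reported with KR20 and rating scales with α.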
A New Item Selection Procedure for Mixed Item Type in Computerized Classification Testing.
ERIC Educational Resources Information Center
Lau, C. Allen; Wang, Tianyou
This paper proposes a new Information-Time index as the basis for item selection in computerized classification testing (CCT) and investigates how this new item selection algorithm can help improve test efficiency for item pools with mixed item types. It also investigates how practical constraints such as item exposure rate control, test…
ERIC Educational Resources Information Center
Banerjee, Jayanti; Papageorgiou, Spiros
2016-01-01
The research reported in this article investigates differential item functioning (DIF) in a listening comprehension test. The study explores the relationship between test-taker age and the items' language domains across multiple test forms. The data comprise test-taker responses (N = 2,861) to a total of 133 unique items, 46 items of which were…
Item validity vs. item discrimination index: a redundancy?
NASA Astrophysics Data System (ADS)
Panjaitan, R. L.; Irawati, R.; Sujana, A.; Hanifah, N.; Djuanda, D.
2018-03-01
In the literature on evaluation and test analysis, it is common to find calculations of both item validity and the item discrimination index (D), each with a different formula. Meanwhile, other sources state that the item discrimination index can be obtained by calculating the correlation between the testee's score on a particular item and the testee's score on the overall test, which is actually the same concept as item validity. Some research reports, especially undergraduate theses, tend to include both item validity and the item discrimination index in the instrument analysis. These concepts might overlap, since both reflect the quality of the test in measuring the examinees' ability. In this paper, examples of data-processing results for item validity and the item discrimination index are compared. The paper discusses whether item validity and the item discrimination index can be represented by only one of them, or whether it is better to present both calculations for simple test analysis, especially in undergraduate theses where test analyses are included.
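The two quantities compared in the abstract above can be put side by side in a short sketch. It assumes the common upper-lower group definition of D (top and bottom 27% of examinees by total score) and an uncorrected item-total Pearson correlation as "item validity"; the abstract does not specify either formula, and the data and function names are invented for illustration.

```python
from statistics import mean, pstdev


def discrimination_index(item, totals, fraction=0.27):
    """Upper-lower index D: proportion correct in the top group minus the bottom group."""
    n = len(totals)
    g = max(1, round(n * fraction))  # group size; 27% is a convention assumed here
    order = sorted(range(n), key=lambda i: totals[i])  # examinee indices, low to high
    low, high = order[:g], order[-g:]
    return mean(item[i] for i in high) - mean(item[i] for i in low)


def point_biserial(item, totals):
    """Pearson correlation between a 0/1 item score and the total test score."""
    mi, mt = mean(item), mean(totals)
    cov = mean((x - mi) * (y - mt) for x, y in zip(item, totals))
    return cov / (pstdev(item) * pstdev(totals))


# Hypothetical data: one item's 0/1 scores and total scores for 10 examinees.
item = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
totals = [9, 8, 8, 7, 6, 5, 4, 3, 3, 2]

d = discrimination_index(item, totals)
r = point_biserial(item, totals)
print(round(d, 3), round(r, 3))
```

Both statistics rise when high scorers answer the item correctly more often than low scorers, which is the overlap the paper questions; D uses only the extreme groups, while the correlation uses every examinee.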
Nelissen, Ellen; Ersdal, Hege; Mduma, Estomih; Evjen-Olsen, Bjørg; Broerse, Jacqueline; van Roosmalen, Jos; Stekelenburg, Jelle
2015-08-25
It is important to know the decay of knowledge, skills, and confidence over time to provide evidence-based guidance on timing of follow-up training. Studies addressing retention of simulation-based education reveal mixed results. The aim of this study was to measure the level of knowledge, skills, and confidence before, immediately after, and nine months after simulation-based training in obstetric care in order to understand the impact of training on these components. An educational intervention study was carried out in 2012 in a rural referral hospital in Northern Tanzania. Eighty-nine healthcare workers of different cadres were trained in "Helping Mothers Survive Bleeding After Birth", which addresses basic delivery skills including active management of third stage of labour and management of postpartum haemorrhage (PPH). Knowledge, skills, and confidence were tested before, immediately after, and nine months after training amongst 38 healthcare workers. Knowledge was tested by completing a written 26-item multiple-choice questionnaire. Skills were tested in two simulated scenarios "basic delivery" and "management of PPH". Confidence in active management of third stage of labour, management of PPH, determination of completeness of the placenta, bimanual uterine compression, and accessing advanced care was self-assessed using a written 5-item questionnaire. Mean knowledge scores increased immediately after training from 70 % to 77 %, but decreased close to pre-training levels (72 %) at nine-month follow-up (p = 0.386) (all p-levels are compared to pre-training). The mean score in basic delivery skills increased after training from 43 % to 51 %, and was 49 % after nine months (p = 0.165). Mean scores of management of PPH increased from 39 % to 51 % and were sustained at 50 % at nine months (p = 0.003). Bimanual uterine compression skills increased from 19 % before, to 43 % immediately after, to 48 % nine months after training (p = 0.000). 
Confidence increased immediately after training, and was largely retained at nine-month follow-up. Training resulted in an immediate increase in knowledge, skills, and confidence. While knowledge and simulated basic delivery skills decayed after nine months, confidence and simulated obstetric emergency skills were largely retained. These findings indicate a need for continuation of training. Future research should focus on the frequency and dosage of follow-up training.
A Comparison of Three Types of Test Development Procedures Using Classical and Latent Trait Methods.
ERIC Educational Resources Information Center
Benson, Jeri; Wilson, Michael
Three methods of item selection were used to select sets of 38 items from a 50-item verbal analogies test and the resulting item sets were compared for internal consistency, standard errors of measurement, item difficulty, biserial item-test correlations, and relative efficiency. Three groups of 1,500 cases each were used for item selection. First…
ERIC Educational Resources Information Center
Çokluk, Ömay; Gül, Emrah; Dogan-Gül, Çilem
2016-01-01
The study aims to examine whether differential item functioning is displayed in three different test forms that have item orders of random and sequential versions (easy-to-hard and hard-to-easy), based on Classical Test Theory (CTT) and Item Response Theory (IRT) methods and bearing item difficulty levels in mind. In the correlational research, the…
The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory
ERIC Educational Resources Information Center
Sahin, Alper; Anil, Duygu
2017-01-01
This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of…
[Perceptions on item disclosure for the Korean medical licensing examination].
Yang, Eunbae B
2015-09-01
This study analyzed the perceptions of medical students and faculty regarding disclosure of test items on the Korean medical licensing examination. I conducted a survey of medical students from medical colleges and professional medical schools nationwide. Responses were analyzed from 718 participants as well as 69 faculty members who participated in creating the medical licensing examination item sets. Data were analyzed using descriptive statistics and the chi-square test. It is important to maintain test quality and to keep the test items unavailable to the public. There are also concerns among students that disclosure of test items would prompt increasing difficulty of test items (48.3%). Further, few students found it desirable to disclose test items regardless of any considerations (28.5%). The professors, who had experience in designing the test items, also expressed their opposition to test item disclosure (60.9%). It is desirable not to disclose the test items of the Korean medical licensing examination to the public on the condition that students are provided with a sufficient amount of information regarding the examination. This is so that the exam can appropriately identify candidates with the required qualifications.
Wong, Cynthia A; Scott, Shirley; Jones, Robin L; Walzer, Jennifer; Geller, Stacie
2016-03-01
The Illinois Department of Public Health mandated that all clinicians who provide care to obstetric patients participate in the Illinois Obstetric Hemorrhage Project. The aim of the current report is to describe change in knowledge among providers engaged in the project, as assessed by pre- and post-tests. The project, implemented 2008 to 2010, included four components: a written 25-item multiple-choice examination (pre-test), a didactic lecture, skill stations (for teaching blood loss estimation), and a simulation drill and debriefing. Participants completed a post-test 6 months later. Pre- and post-test examination scores were compared. Data from 95 hospitals are included in this analysis (9456 paired test results). The proportion of participants who scored ≥88% correct answers increased from 10.9% on the pre-test to 49.1% on the post-test (p < 0.0001). Registered nurses made greater improvements in test scores than anesthesia and obstetric providers (p < 0.0001). The Illinois Obstetric Hemorrhage Project was successful in improving knowledge of obstetric hemorrhage in a large number of providers with different expertise and experience levels. Further long-term study is essential to determine whether the skills acquired during the Project contribute to improved obstetric hemorrhage outcomes for the women of Illinois.
Psychometric properties of the Florence CyberBullying-CyberVictimization Scales.
Palladino, Benedetta Emanuela; Nocentini, Annalaura; Menesini, Ersilia
2015-02-01
The present study tried to answer the research need for empirically validated and theoretically based instruments to assess cyberbullying and cybervictimization. The psychometric properties of the Florence CyberBullying-CyberVictimization Scales (FCBVSs) were analyzed in a sample of 1,142 adolescents (Mage=15.18 years; SD=1.12 years; 54.5% male). For both cybervictimization and cyberbullying, results support a gender invariant model involving 14 items and four factors covering four types of behaviors (written-verbal, visual, impersonation, and exclusion). The second-order confirmatory factor analysis confirmed that a "global," second-order measure of cyberbullying and cybervictimization fits the data well. Overall, the scales showed good validity (construct, concurrent, and convergent) and reliability (internal consistency and test-retest). In addition, using the global key question measure as a criterion, ROC analyses, determining the ability of a test to discriminate between groups, allowed us to identify cutoff points to classify respondents as involved/not involved starting from the continuum measure derived from the scales.
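The idea of using ROC analysis to pick a cutoff on a continuous scale score, as described in the abstract above, can be sketched as follows. The abstract does not say which criterion was used to choose the cutoff; maximizing Youden's J is an assumption made here, and the scores, labels, and function name are hypothetical.

```python
def best_cutoff(scores, involved):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    J = sensitivity + specificity - 1. A respondent is classified as
    involved when score >= cutoff. `involved` holds the criterion labels (1/0).
    """
    pos = [s for s, y in zip(scores, involved) if y]
    neg = [s for s, y in zip(scores, involved) if not y]
    best = None
    for c in sorted(set(scores)):  # every observed score is a candidate cutoff
        sens = sum(s >= c for s in pos) / len(pos)
        spec = sum(s < c for s in neg) / len(neg)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical scale scores and criterion labels (1 = involved per a key question).
scores = [0, 1, 1, 2, 3, 4, 5, 6]
involved = [0, 0, 0, 0, 1, 0, 1, 1]

cutoff, sens, spec = best_cutoff(scores, involved)
print(cutoff, sens, spec)
```

Other criteria (for example, fixing a minimum sensitivity) would give different cutoffs, so the choice of J here should be read as one option rather than the study's method.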
A Review of Classical Methods of Item Analysis.
ERIC Educational Resources Information Center
French, Christine L.
Item analysis is a very important consideration in the test development process. It is a statistical procedure to analyze test items that combines methods used to evaluate the important characteristics of test items, such as difficulty, discrimination, and distractibility of the items in a test. This paper reviews some of the classical methods for…
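Two of the classical item statistics mentioned in the abstract above, difficulty and distractor behavior, can be sketched for a single multiple-choice item. The responses and answer key are invented for illustration, and the function names are hypothetical; difficulty is taken here as the proportion of examinees answering correctly, which is the usual classical definition.

```python
from collections import Counter


def item_difficulty(responses, key):
    """Classical difficulty (p-value): proportion of examinees choosing the keyed option."""
    return sum(r == key for r in responses) / len(responses)


def distractor_counts(responses, key):
    """How often each incorrect option was chosen; unused distractors do not appear."""
    return Counter(r for r in responses if r != key)


# Hypothetical responses of 10 examinees to one item whose keyed answer is "A".
responses = list("ABACADAABA")

p = item_difficulty(responses, "A")
distractors = distractor_counts(responses, "A")
print(p, dict(distractors))
```

A distractor that nobody chooses adds nothing to the item, and one chosen more often than the key usually signals a flawed item or a miskeyed answer.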
Modeling Item-Position Effects within an IRT Framework
ERIC Educational Resources Information Center
Debeer, Dries; Janssen, Rianne
2013-01-01
Changing the order of items between alternate test forms to prevent copying and to enhance test security is a common practice in achievement testing. However, these changes in item order may affect item and test characteristics. Several procedures have been proposed for studying these item-order effects. The present study explores the use of…
Lillquist, Dean R; McCabe, Mary L; Church, Kurt Haden
2005-01-01
The Centers for Disease Control and Prevention (CDC) have stated that poor personal hygiene is the third most commonly reported food preparation practice contributing to foodborne disease and have further claimed that contaminated hands may be the most important means by which enteric viruses are transmitted. The study reported here compared the effectiveness of traditional (lecture/video) training with that of traditional training that provided an added active (hands-on) component for the retention of handwashing procedures two weeks after the initial training. Sixty-six food handlers attending training courses were included in the study. All participants received the same lecture/video presentation. Twenty-two (33 percent) of the participants received an additional interactive training component. All participants were tested by a 20-item written test on the day of training. Two weeks after the training, 25 to 30 percent of participants from each group were retested. Results revealed that the participants involved in the interactive training had statistically significantly better test performances both on the day of training and on the two-week retest.
[Contribution of medical technologists in team medical care of diabetics].
Sato, Itsuko; Jikimoto, Takumi; Ooyabu, Chinami; Kusuki, Mari; Okano, Yosie; Mukai, Masahiko; Kawano, Seiji; Kumagai, Shunichi
2006-08-01
For the effective treatment of diabetes mellitus (DM), patients are encouraged to self-manage their disease according to the doctor's instructions and advice from certified diabetes educators (CDE) and other comedical staff. Therefore, the cooperation of medical staff consisting of a doctor, CDE, nurse, pharmacist, dietitian, and medical technologist is important for DM education. Medical technologists licensed for CDE (MT-CDE) have been participating in the DM education team in Kobe University Hospital since 2000. MT-CDEs are in charge of classes for medical tests, guidance for self-monitoring of blood glucose and teaching how to read the fluctuation graph of the blood glucose level in the education program for hospitalized DM patients. MT-CDEs teach at the bedside how to read the results of medical tests during the first few days of hospitalization using pamphlets for medical tests. The pamphlets are made comprehensible for patients by using graphics and photographs as much as possible. It is important to create a friendly atmosphere and answer frank questions from patients, since they often feel stress when having medical tests at the early stage of hospitalization. This process of questions and answers promotes their understanding of medical tests, and seems to reduce their anxiety about having tests. We repeatedly evaluate their level of understanding during hospitalization. When shown the fluctuation graph of the glucose level, patients can easily understand the status of their DM. When prescriptions are written on the graph, their therapeutic effects are more comprehensible for the patients. The items written on the graph are chosen to meet the level of understanding of each patient to promote their motivation. In summary, the introduction of MT-CDE has been successful in the education program for DM patients in our hospital. We plan to utilize the skills and knowledge of MT-CDE more in our program so that our DM education program will help patients cope with life with DM.
ERIC Educational Resources Information Center
New South Wales Dept. of Education, Sydney (Australia).
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
ERIC Educational Resources Information Center
New South Wales Dept. of Education, Sydney (Australia).
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
ERIC Educational Resources Information Center
New South Wales Dept. of Education, Sydney (Australia).
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
Daghio, M Monica; Fattori, Giuseppe; Ciardullo, Anna V
2006-01-01
The objective of this study was to evaluate if easy-to-read information material on the prevention of chronic-degenerative diseases through healthy lifestyle co-written by communicators, educators, physicians and citizens -using a networking strategy- could be judged comprehensible. Readability scores were computed. The survey involved 100 individuals attending our centralized booking centre for medical appointments during an "index week". They filled out an anonymous questionnaire, just before and after they had read the material. Readability and comprehensibility frequencies were calculated. The participants had a mean age of 59.1+/-15.1 (SD) years (range 19-81yrs), 62% were females. Twenty-six percent of them had received no education, 30% "primary", 28% "secondary", and 14% had a "degree". According to readability scores, the booklet was "readable" by all persons who had finished primary school. Of the 100 participants, 40 percent found the booklet's language to be "easy" or "very easy", 46% "sufficiently easy", and 14% "difficult" for laypersons to understand. Ninety-four percent of them found no unintelligible words in the text. Education levels showed no differences. Readers' answers were more correct after they had read the booklet. The pre-test showed that 61+/-26% of the readers answered the comprehensibility items correctly. After reading the booklet, 81+/-17% of them gave correct answers. The after-minus-before net increase in knowledge was +20% (95% CIs +8 to +32%). The booklet was designed and written using a networking strategy with the help of the local population. It was found to be easy to read and quite clear.
Assembling a Computerized Adaptive Testing Item Pool as a Set of Linear Tests
ERIC Educational Resources Information Center
van der Linden, Wim J.; Ariel, Adelaide; Veldkamp, Bernard P.
2006-01-01
Test-item writing efforts typically result in item pools with an undesirable correlational structure between the content attributes of the items and their statistical information. If such pools are used in computerized adaptive testing (CAT), the algorithm may be forced to select items with less than optimal information, that violate the content…
Evaluation of Northwest University, Kano Post-UTME Test Items Using Item Response Theory
ERIC Educational Resources Information Center
Bichi, Ado Abdu; Hafiz, Hadiza; Bello, Samira Abdullahi
2016-01-01
High-stakes testing is used for the purposes of providing results that have important consequences. Validity is the cornerstone upon which all measurement systems are built. This study applied the Item Response Theory principles to analyse Northwest University Kano Post-UTME Economics test items. The developed fifty (50) economics test items was…
Item Specifications, Science Grade 8. Blue Prints for Testing Minimum Performance Test.
ERIC Educational Resources Information Center
Arkansas State Dept. of Education, Little Rock.
These item specifications were developed as a part of the Arkansas "Minimum Performance Testing Program" (MPT). There is one item specification for each instructional objective included in the MPT. The purpose of an item specification is to provide an overview of the general content and format of test items used to measure an…
Item Specifications, Science Grade 6. Blue Prints for Testing Minimum Performance Test.
ERIC Educational Resources Information Center
Arkansas State Dept. of Education, Little Rock.
These item specifications were developed as a part of the Arkansas "Minimum Performance Testing Program" (MPT). There is one item specification for each instructional objective included in the MPT. The purpose of an item specification is to provide an overview of the general content and format of test items used to measure an…
Criterion-Referenced Test Items for Welding.
ERIC Educational Resources Information Center
Davis, Diane, Ed.
This test item bank on welding contains test questions based upon competencies found in the Missouri Welding Competency Profile. Some test items are keyed for multiple competencies. These criterion-referenced test items are designed to work with the Vocational Instructional Management System. Questions have been statistically sampled and validated…
Jang, Yoonhee; Wixted, John T.; Pecher, Diane; Zeelenberg, René; Huber, David E.
2012-01-01
Even without feedback, test practice enhances delayed performance compared to study practice, but the size of the effect is variable across studies. We investigated the benefit of testing, separating initially retrievable items from initially non-retrievable items. In two experiments, an initial test determined item retrievability. Retrievable or non-retrievable items were subsequently presented for repeated study or test practice. Collapsing across items, in Experiment 1, we obtained the typical crossover interaction between retention interval and practice type. For retrievable items, however, the crossover interaction was quantitatively different, with a small study benefit for an immediate test and a larger testing benefit after a delay. For non-retrievable items, there was a large study benefit for an immediate test, but one week later there was no difference between the study and test practice conditions. In Experiment 2, initially non-retrievable items were given additional study followed by either an immediate test or even more additional study, and one week later performance did not differ between the two conditions. These results indicate that the effect size of study/test practice is due to the relative contribution of retrievable and non-retrievable items. PMID:22304454
Optimal Test Design with Rule-Based Item Generation
ERIC Educational Resources Information Center
Geerlings, Hanneke; van der Linden, Wim J.; Glas, Cees A. W.
2013-01-01
Optimal test-design methods are applied to rule-based item generation. Three different cases of automated test design are presented: (a) test assembly from a pool of pregenerated, calibrated items; (b) test generation on the fly from a pool of calibrated item families; and (c) test generation on the fly directly from calibrated features defining…
ERIC Educational Resources Information Center
New South Wales Dept. of Education, Sydney (Australia).
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
ERIC Educational Resources Information Center
New South Wales Dept. of Education, Sydney (Australia).
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
ERIC Educational Resources Information Center
New South Wales Dept. of Education, Sydney (Australia).
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items are made available to teachers for the construction of unit tests or term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The test items meet syllabus…
Criterion-Referenced Test Items for Small Engines.
ERIC Educational Resources Information Center
Herd, Amon
This notebook contains criterion-referenced test items for testing students' knowledge of small engines. The test items are based upon competencies found in the Missouri Small Engine Competency Profile. The test item bank is organized in 18 sections that cover the following duties: shop procedures; tools and equipment; fasteners; servicing fuel…
An Investigation of the Impact of Guessing on Coefficient α and Reliability
2014-01-01
Guessing is known to influence the test reliability of multiple-choice tests. Although there are many studies that have examined the impact of guessing, they used rather restrictive assumptions (e.g., parallel test assumptions, homogeneous inter-item correlations, homogeneous item difficulty, and homogeneous guessing levels across items) to evaluate the relation between guessing and test reliability. Based on the item response theory (IRT) framework, this study investigated the extent of the impact of guessing on reliability under more realistic conditions where item difficulty, item discrimination, and guessing levels actually vary across items with three different test lengths (TL). By accommodating multiple item characteristics simultaneously, this study also focused on examining interaction effects between guessing and other variables entered in the simulation to be more realistic. The simulation of the more realistic conditions and calculations of reliability and classical test theory (CTT) item statistics were facilitated by expressing CTT item statistics, coefficient α, and reliability in terms of IRT model parameters. In addition to the general negative impact of guessing on reliability, results showed interaction effects between TL and guessing and between guessing and test difficulty.
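The abstract above works with IRT items that differ in difficulty, discrimination, and guessing level. A minimal sketch of how a guessing parameter changes an item's response curve is given below. It assumes the three-parameter logistic (3PL) form, which the abstract does not name explicitly; the parameter values and function name are hypothetical.

```python
import math


def p_correct(theta, a, b, c):
    """3PL probability of a correct response.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: guessing level (lower asymptote). The 3PL form is an assumption here.
    """
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))


# Gap in success probability between a high-ability and a low-ability examinee,
# for the same item without guessing (c = 0) and with guessing (c = 0.25).
spread_no_guess = p_correct(2.0, 1.0, 0.0, 0.0) - p_correct(-2.0, 1.0, 0.0, 0.0)
spread_guess = p_correct(2.0, 1.0, 0.0, 0.25) - p_correct(-2.0, 1.0, 0.0, 0.25)
print(round(spread_no_guess, 4), round(spread_guess, 4))
```

Guessing compresses the gap between low- and high-ability examinees by the factor (1 - c), which is one intuition for the negative effect on reliability that the abstract reports.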
Entry Descent and Landing Workshop Proceedings. Volume 1; Commercial Sources for EDL Flight Tests
NASA Technical Reports Server (NTRS)
Trombetta, Nick; Horan, Steve
2015-01-01
Commercial Off The Shelf (COTS) is a Federal Acquisition Regulation (FAR) term for commercial items, including services, available in the commercial marketplace that can be bought and used under government contracts. A need for COTS exists to help reduce the avionics cost associated with applicable missions. A 2014 Planetary Science Decadal Survey stated that it is imperative that NASA expand its investment in fundamental technology areas: reduced mass and power requirements for spacecraft and their subsystems; new and improved sensors, instruments, and sampling systems; and mission and trajectory design and optimization. Two goals were written as part of the technology investment: (1) reducing the cost of planetary missions and (2) improving their scientific capability and reliability.... COTS could certainly aid in reducing the cost associated with the instrumentation systems.
ERIC Educational Resources Information Center
Richterich, Rene; And Others
This trilingual handbook (English, French, German) presents exercises for eight areas of language instruction: (1) practice in sentence patterns, (2) presentation and practice of a new item, (3) pronunciation, (4) use of pictures, (5) practicing the transfer from oral to written skills, (6) presentation and practice of conversational patterns, (7)…
Code of Federal Regulations, 2010 CFR
2010-07-01
.... However, an official photocopy of the report of separation or certificate of discharge (DD Form 214... written request of the member. (a) On the DD Forms 214 issued before October 1, 1979, the following items...). (b) For DD Forms 214 issued after October 1, 1979, send one copy with the Special Additional...
ERIC Educational Resources Information Center
Stein, Mary; Barman, Charles R.; Larrabee, Timothy
2007-01-01
This article describes the rationale for, and development of, an online instrument that helps identify commonly held science misconceptions. Science Beliefs is a 47-item instrument that targets topics in chemistry, physics, biology, earth science, and astronomy. It utilizes a true or false, along with a written-explanation, format. The true or…
Starting Out Right: How to Choose Books About Black People for Young Children.
ERIC Educational Resources Information Center
Latimer, Bettye I., Ed.; And Others
This critical and selective annotated bibliography is restricted to books written for preschool through grade three. Each title in this listing of "black inclusive" items is accompanied by a commentary whose length depends on the merits or faults of each book. The editors have recommended the books or have not according to the following rationale…
5 CFR 177.105 - Administrative claim; evidence and information to be submitted.
Code of Federal Regulations, 2010 CFR
2010-01-01
... agency. On written request, OPM will make available to the claimant a copy of the report of the examining... showing actual time lost from employment, whether he or she is a full-or part-time employee, and wages or... ownership of the property. (2) A detailed statement of the amount claimed with respect to each item of...
To Cut or Not to Cut: Cosmetic Surgery Usage and Women's Age-Related Experiences
ERIC Educational Resources Information Center
Eriksen, Shelley J.
2012-01-01
Part of the developmental trajectory of middle and late life presumes the adjustment to physical aging, an adjustment that is complicated for women for whom the prioritization of beauty is central to their social value in Western societies. A 60-item written questionnaire was distributed to a volunteer community sample of 202 women ages 19-86.…
When We like What We Know--A Parametric fMRI Analysis of Beauty and Familiarity
ERIC Educational Resources Information Center
Bohrn, Isabel C.; Altmann, Ulrike; Lubrich, Oliver; Menninghaus, Winfried; Jacobs, Arthur M.
2013-01-01
This paper presents a neuroscientific study of aesthetic judgments on written texts. In an fMRI experiment participants read a number of proverbs without explicitly evaluating them. In a post-scan rating they rated each item for familiarity and beauty. These individual ratings were correlated with the functional data to investigate the neural…
Choe, Kwisoon
2014-06-01
Hope has received attention as a central component of recovery from mental illness; however, most instruments measuring hope were developed outside the mental health field. To measure the effects of mental health programs on hope in people with schizophrenia, a specialized scale is needed. This study examined the psychometric properties of the newly developed 9-item Schizophrenia Hope Scale (SHS-9) designed to measure hope in individuals with schizophrenia. A descriptive survey design was used. Participants were recruited from three psychiatric hospitals and two community mental health centers in South Korea. A total of 347 individuals over age 18 with a DSM-IV diagnosis of schizophrenia, schizoaffective, or schizophrenia spectrum disorders (competent to provide written informed consent) participated in this study; 149 (94 men, 55 women) completed a preliminary scale consisting of 40 revised items, and 198 (110 men, 88 women) completed the second scale of 17 items. Scale items were first selected from extensive literature reviews and a qualitative study on hope in people with schizophrenia; the validity and reliability of a preliminary scale were then evaluated by an expert panel and exploratory factor analysis. The remaining 9 items forming the Schizophrenia Hope Scale (SHS-9) were evaluated through confirmatory factor analysis. The SHS-9 demonstrates promising psychometric integrity. The internal consistency alpha coefficient was 0.92 with a score range of 0-18 and a mean total score of 12.06 (SD=4.96), with higher scores indicating higher levels of hope. Convergent validity was established by correlating the SHS-9 to the State-Trait Hope Inventory, r=0.61 (p<0.01). Divergent validity with the Beck Hopelessness Scale was also established, r=-0.55 (p<0.01). Exploratory and confirmatory factor analysis resulted in a 1-factor solution, with the essential meaning of hope accounting for 61.77% of the total item variance.
As hope has been shown to facilitate recovery from mental illness, the accurate assessment of hope provided by the short, easy-to-use Schizophrenia Hope Scale (SHS-9) may aid clinicians in improving the quality of life of individuals with schizophrenia. Copyright © 2013 Elsevier Ltd. All rights reserved.
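The internal consistency alpha coefficient reported above is presumably Cronbach's alpha. As a minimal sketch of how that statistic is computed from a respondents-by-items score matrix, the Python below uses a small invented data set (5 respondents, 3 items scored 0-2); the function name and the data are illustrative assumptions and are not taken from the SHS-9 study.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of the total (summed) score across respondents.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses (NOT SHS-9 data): 5 respondents x 3 items scored 0-2.
demo = [
    [2, 2, 1],
    [1, 1, 1],
    [0, 1, 0],
    [2, 1, 2],
    [0, 0, 0],
]
alpha = cronbach_alpha(demo)
print(f"alpha = {alpha:.3f}")
```

Alpha rises when items covary strongly relative to their individual variances, which is why it is read as an index of internal consistency.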
Evaluating the Psychometric Characteristics of Generated Multiple-Choice Test Items
ERIC Educational Resources Information Center
Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André
2016-01-01
Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…
Tepe, Rodger; Tepe, Chabha
2015-01-01
Objective: To develop and psychometrically evaluate an information literacy (IL) self-efficacy survey and an IL knowledge test. Methods: In this test–retest reliability study, a 25-item IL self-efficacy survey and a 50-item IL knowledge test were developed and administered to a convenience sample of 53 chiropractic students. Item analyses were performed on all questions. Results: The IL self-efficacy survey demonstrated good reliability (test–retest correlation = 0.81) and good/very good internal consistency (mean κ = .56 and Cronbach's α = .92). A total of 25 questions with the best item analysis characteristics were chosen from the 50-item IL knowledge test, resulting in a 25-item IL knowledge test that demonstrated good reliability (test–retest correlation = 0.87), very good internal consistency (mean κ = .69, KR20 = 0.85), and good item discrimination (mean point-biserial = 0.48). Conclusions: This study resulted in the development of three instruments: a 25-item IL self-efficacy survey, a 50-item IL knowledge test, and a 25-item IL knowledge test. The information literacy self-efficacy survey and the 25-item version of the information literacy knowledge test have shown preliminary evidence of adequate reliability and validity to justify continuing study with these instruments. PMID:25517736
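The KR20 and point-biserial statistics cited in this abstract can be illustrated with a short sketch. The Python below assumes dichotomous (0/1) scoring and computes the uncorrected point-biserial, with the item left in the total score; the abstract does not say whether the authors used a corrected version. The function names and the six-examinee data set are hypothetical.

```python
from statistics import mean, pstdev, pvariance

def kr20(responses):
    """Kuder-Richardson 20 for a list of examinees, each a list of 0/1 item scores."""
    k = len(responses[0])
    p = [mean(row[j] for row in responses) for j in range(k)]  # proportion correct
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(pj * (1 - pj) for pj in p) / total_var)

def point_biserial(item, totals):
    """Uncorrected point-biserial between a 0/1 item and the total score."""
    right = [t for x, t in zip(item, totals) if x == 1]
    wrong = [t for x, t in zip(item, totals) if x == 0]
    p = len(right) / len(item)
    return (mean(right) - mean(wrong)) / pstdev(totals) * (p * (1 - p)) ** 0.5

# Hypothetical data (not from the study): 6 examinees x 4 items.
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]
totals = [sum(row) for row in demo]
first_item = [row[0] for row in demo]
print(round(kr20(demo), 3), round(point_biserial(first_item, totals), 3))
```

KR20 is the special case of Cronbach's alpha for dichotomous items, so the two coefficients reported in the abstract are directly comparable.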
What do Seniors Remember from Freshman Physics?
NASA Astrophysics Data System (ADS)
Barrantes, Analia; Pawl, Andrew; Pritchard, David E.
2009-10-01
We have given a group of 56 MIT seniors who took mechanics as freshmen a written test similar to the final exam they took in their freshman course, plus the MBT and C-LASS standard instruments. Students in majors unrelated to physics scored 60% lower on the written analytic part of the final than they did as freshmen. The mean score of all students on conceptual multiple choice questions included on the final declined by approximately 50% relative to the scores of freshmen. The mean score of all participants on the MBT was insignificantly changed from the posttest taken as freshmen. More specifically, however, the students' performance on 9 of the 26 MBT items (with 6 of the 9 involving graphical kinematics) represents a gain over their freshman pretest score (a normalized gain of about 70%, double the gain achieved in the freshman course alone), while their performance on the remaining 17 questions is best characterized as a loss of approximately 50% of the material learned in the freshman course. Attitudinal survey results indicate that almost half the seniors feel the specific mechanics course content is unlikely to be useful to them, a significant majority (75-85%) feel that physics does teach valuable skills, and an overwhelming majority believe that mechanics should remain a required course at MIT.
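The "normalized gain" mentioned above is commonly defined, following Hake, as the fraction of the possible improvement that was actually achieved. The abstract does not spell out its formula, so the definition below is an assumption, and the numbers in the worked line are invented purely to show how a gain of about 70% could arise.

```latex
\[
  g \;=\; \frac{S_{\text{post}} - S_{\text{pre}}}{100\% - S_{\text{pre}}}
\]
% Hypothetical example (not data from the study): S_pre = 40%, S_post = 82%
\[
  g \;=\; \frac{82 - 40}{100 - 40} \;=\; \frac{42}{60} \;=\; 0.70
\]
```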
Integrating Test-Form Formatting into Automated Test Assembly
ERIC Educational Resources Information Center
Diao, Qi; van der Linden, Wim J.
2013-01-01
Automated test assembly uses the methodology of mixed integer programming to select an optimal set of items from an item bank. Automated test-form generation uses the same methodology to optimally order the items and format the test form. From an optimization point of view, production of fully formatted test forms directly from the item pool using…
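To make the selection problem concrete, here is a minimal sketch that picks the most informative fixed-length form subject to a content constraint. It uses exhaustive search over a tiny invented item bank as a stand-in for a mixed integer programming solver; it is not the authors' method, and the item IDs, content areas, and information values are all hypothetical.

```python
from itertools import combinations

# Hypothetical item bank: (item id, content area, information at the cut score).
BANK = [
    ("i1", "algebra", 0.42), ("i2", "algebra", 0.35), ("i3", "algebra", 0.18),
    ("i4", "geometry", 0.40), ("i5", "geometry", 0.22), ("i6", "geometry", 0.15),
]

def assemble(bank, length, min_per_area):
    """Return the form of `length` items with maximal total information
    that contains at least `min_per_area` items from every content area."""
    areas = {area for _, area, _ in bank}
    best, best_info = None, float("-inf")
    for form in combinations(bank, length):
        counts = {a: sum(1 for _, area, _ in form if area == a) for a in areas}
        if any(counts[a] < min_per_area for a in areas):
            continue  # violates the content constraint
        info = sum(i for _, _, i in form)
        if info > best_info:
            best, best_info = form, info
    return best, best_info

form, info = assemble(BANK, length=4, min_per_area=2)
print([item_id for item_id, _, _ in form], round(info, 2))
```

Exhaustive search is only feasible for toy banks; the number of candidate forms grows combinatorially, which is why operational assembly relies on integer programming solvers.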
ERIC Educational Resources Information Center
Gierl, Mark J.; Lai, Hollis
2013-01-01
Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content-specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer…
Similarity as an organising principle in short-term memory.
LeCompte, D C; Watkins, M J
1993-03-01
The role of stimulus similarity as an organising principle in short-term memory was explored in a series of seven experiments. Each experiment involved the presentation of a short sequence of items that were drawn from two distinct physical classes and arranged such that item class changed after every second item. Following presentation, one item was re-presented as a probe for the 'target' item that had directly followed it in the sequence. Memory for the sequence was considered organised by class if probability of recall was higher when the probe and target were from the same class than when they were from different classes. Such organisation was found when one class was auditory and the other was visual (spoken vs. written words, and sounds vs. pictures). It was also found when both classes were auditory (words spoken in a male voice vs. words spoken in a female voice) and when both classes were visual (digits shown in one location vs. digits shown in another). It is concluded that short-term memory can be organised on the basis of sensory modality and on the basis of certain features within both the auditory and visual modalities.
A Procedure To Detect Test Bias Present Simultaneously in Several Items.
ERIC Educational Resources Information Center
Shealy, Robin; Stout, William
A statistical procedure is presented that is designed to test for unidirectional test bias existing simultaneously in several items of an ability test, based on the assumption that test bias is incipient within the two groups' ability differences. The proposed procedure--Simultaneous Item Bias (SIB)--is based on a multidimensional item response…
An Item Response Theory Model for Test Bias.
ERIC Educational Resources Information Center
Shealy, Robin; Stout, William
This paper presents a conceptualization of test bias for standardized ability tests which is based on multidimensional, non-parametric, item response theory. An explanation of how individually-biased items can combine through a test score to produce test bias is provided. It is contended that bias, although expressed at the item level, should be…
10 CFR 55.40 - Implementation.
Code of Federal Regulations, 2010 CFR
2010-01-01
... REGULATORY COMMISSION (CONTINUED) OPERATORS' LICENSES Written Examinations and Operating Tests § 55.40... Standards for Power Reactors,” in effect six months before the examination date to prepare the written... also use the criteria in NUREG-1021 to evaluate the written examinations and operating tests prepared...
10 CFR 55.40 - Implementation.
Code of Federal Regulations, 2013 CFR
2013-01-01
... REGULATORY COMMISSION (CONTINUED) OPERATORS' LICENSES Written Examinations and Operating Tests § 55.40... Standards for Power Reactors,” in effect six months before the examination date to prepare the written... also use the criteria in NUREG-1021 to evaluate the written examinations and operating tests prepared...
10 CFR 55.40 - Implementation.
Code of Federal Regulations, 2012 CFR
2012-01-01
... REGULATORY COMMISSION (CONTINUED) OPERATORS' LICENSES Written Examinations and Operating Tests § 55.40... Standards for Power Reactors,” in effect six months before the examination date to prepare the written... also use the criteria in NUREG-1021 to evaluate the written examinations and operating tests prepared...
10 CFR 55.40 - Implementation.
Code of Federal Regulations, 2011 CFR
2011-01-01
... REGULATORY COMMISSION (CONTINUED) OPERATORS' LICENSES Written Examinations and Operating Tests § 55.40... Standards for Power Reactors,” in effect six months before the examination date to prepare the written... also use the criteria in NUREG-1021 to evaluate the written examinations and operating tests prepared...
10 CFR 55.40 - Implementation.
Code of Federal Regulations, 2014 CFR
2014-01-01
... REGULATORY COMMISSION (CONTINUED) OPERATORS' LICENSES Written Examinations and Operating Tests § 55.40... Standards for Power Reactors,” in effect six months before the examination date to prepare the written... also use the criteria in NUREG-1021 to evaluate the written examinations and operating tests prepared...
Adults and children with high imagery show more pronounced perceptual priming effect.
Hatakeyama, T
1997-06-01
36 children in Grade 5 and 59 university students, all native speakers of Japanese, studied three types of priming stimuli in a mixed list: words written in hiragana (Japanese syllabary used in writing), words written in kanji (Chinese characters also used in writing), and pictures. They were then given a task involving completion of hiragana-word fragments: the task involved studied and nonstudied items. For both children and university students, words in hiragana produced the largest priming effects, that is, the words that had appeared in hiragana in the preceding study phase were generated more often in the test phase of word completion than the other two types of priming stimuli. This confirms that the perceptual priming effect depends much on data-driven processing. For both age groups, words in kanji produced nearly half the priming effects seen for hiragana-words. On the other hand, pictures had no priming effect for children but they had a similar effect to kanji-words for students. The discrepancy between kanji-words and pictures for children suggests that the former force the subject to read the words, which, possibly, activates the hiragana-words, while the latter do not necessarily force labelling the pictures. Among three kinds of imagery tests, the Verbalizer-Visualizer Questionnaire predicted priming scores for children and the Questionnaire upon Mental Imagery did so for students, but the Test of Visual Imagery Control did not predict the scores for either age group. This shows that children reporting habitual use of imagery and adults reporting vivid imagery have more pronounced perceptual priming effects. We conclude that the imagery ability based on self-judgments reflects real characteristics of the perceptual representation system of Tulving and Schacter (1990).
The First Telescope in the Korean History I. Translation of Jeong's Report
NASA Astrophysics Data System (ADS)
Ahn, Sang-Hyeon
2009-06-01
In 1631 A.D., Jeong Duwon, an ambassador of the Joseon dynasty, was sent to the Ming dynasty. There he met João Rodrigues, a Jesuit missionary, in Dengzhou on the Shandong peninsula. The missionary gave the ambassador a number of products of the latest European innovations. A detailed description of this event was written in "Jeong's official report regarding a message from a European country", which is an important document for understanding the event. Since the document was written in classical Chinese, we provide a comprehensive translation into Korean with detailed notes. According to the report, the items that Rodrigues presented include four books written in Chinese that describe European discoveries about the world, a report on the tribute of new cannons manufactured by the Portuguese in Macao, a telescope, a flintlock, a foliot-type mechanical clock, a world atlas drawn by Matteo Ricci, an astronomical planisphere, and a sundial. We discuss the meaning of each item in the Korean history of science and technology. In particular, Jeong's introduction is an important event in the history of Korean astronomy, because the telescope he brought was the first one to be introduced in Korean history. Even though King Injo and his associates of the Joseon dynasty were well aware of the value of new technologies such as telescopes, cannons, and flintlocks as military armaments, they were not able to quickly adopt such technologies to defend against the military threat of the Jurchen. We revisit the reason in view of the general history of science and technology of East Asian countries in the 17th century.
ERIC Educational Resources Information Center
Quaigrain, Kennedy; Arhin, Ato Kwamina
2017-01-01
Item analysis is essential in improving items which will be used again in later tests; it can also be used to eliminate misleading items in a test. The study focused on item and test quality and explored the relationship between difficulty index (p-value) and discrimination index (DI) with distractor efficiency (DE). The study was conducted among…
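The difficulty index (p-value) and discrimination index (DI) named above can be sketched as follows. The code assumes 0/1 item scoring and the common upper/lower 27% grouping convention; the abstract does not state which grouping the authors used, distractor efficiency is not covered, and the function name and data are hypothetical.

```python
def item_indices(item, totals, fraction=0.27):
    """Return (difficulty p, discrimination DI) for one 0/1-scored item.

    p  = proportion of all examinees answering correctly.
    DI = (correct in upper group - correct in lower group) / group size,
         where groups are the top and bottom `fraction` by total score.
    """
    n = len(item)
    p = sum(item) / n
    order = sorted(range(n), key=lambda i: totals[i], reverse=True)
    g = max(1, round(n * fraction))
    upper, lower = order[:g], order[-g:]
    di = (sum(item[i] for i in upper) - sum(item[i] for i in lower)) / g
    return p, di

# Hypothetical data: total scores and one item's 0/1 responses for 10 examinees.
totals = [9, 8, 8, 7, 6, 5, 4, 3, 2, 1]
item = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
p, di = item_indices(item, totals)
print(f"p = {p:.2f}, DI = {di:.2f}")
```

A DI near zero or below flags an item that high scorers do not answer correctly more often than low scorers, which is the kind of misleading item the abstract describes eliminating.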
Cross-cultural attitudes of flight crew regarding CRM
NASA Technical Reports Server (NTRS)
Merritt, Ashleigh
1993-01-01
This study asks if the Cockpit Management Attitude Questionnaire (CMAQ) can detect differences across countries, and/or across occupations. And if so, can those differences be interpreted? Research has shown that the CMAQ is sensitive to attitude differences between and within organizations, thereby demonstrating its effectiveness with American populations. But the CMAQ was originally designed by American researchers and psychometrically refined for American pilots. The items in the questionnaire, though general in nature, still reflect the ubiquitous Western bias, because the items were written by researchers from and for the one culture. Recognizing this constraint, this study is nonetheless interested in attitudes toward crew behavior, and how those attitudes may vary across country and occupation.
Student science achievement and the integration of Indigenous knowledge on standardized tests
NASA Astrophysics Data System (ADS)
Dupuis, Juliann; Abrams, Eleanor
2017-09-01
In this article, we examine how American Indian students in Montana performed on standardized state science assessments when a small number of test items based upon traditional science knowledge from a cultural curriculum, "Indian Education for All", were included. Montana is the first state in the US to mandate the use of a culturally relevant curriculum in all schools and to incorporate this curriculum into a portion of the standardized assessment items. This study compares White and American Indian student test scores on these particular test items to determine how White and American Indian students perform on culturally relevant test items compared to traditional standard science test items. The connections between student achievement on adapted culturally relevant science test items versus traditional items bring valuable insights to the fields of science education, research on student assessments, and Indigenous studies.
Computerized Adaptive Test (CAT) Applications and Item Response Theory Models for Polytomous Items
ERIC Educational Resources Information Center
Aybek, Eren Can; Demirtasli, R. Nukhet
2017-01-01
This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. Besides that, it aims to introduce the simulation and live CAT software to the related researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…
An Effect Size Measure for Raju's Differential Functioning for Items and Tests
ERIC Educational Resources Information Center
Wright, Keith D.; Oshima, T. C.
2015-01-01
This study established an effect size measure for differential functioning for items and tests' noncompensatory differential item functioning (NCDIF). The Mantel-Haenszel parameter served as the benchmark for developing NCDIF's effect size measure for reporting moderate and large differential item functioning in test items. The effect size of…
Detecting a Gender-Related DIF Using Logistic Regression and Transformed Item Difficulty
ERIC Educational Resources Information Center
Abedlaziz, Nabeel; Ismail, Wail; Hussin, Zaharah
2011-01-01
Test items are designed to provide information about the examinees. Difficult items are designed to be more demanding and easy items are less so. However, sometimes, test items carry with them demands other than those intended by the test developer (Scheuneman & Gerritz, 1990). When personal attributes such as gender systematically affect…
Influence of Fallible Item Parameters on Test Information During Adaptive Testing.
ERIC Educational Resources Information Center
Wetzel, C. Douglas; McBride, James R.
Computer simulation was used to assess the effects of item parameter estimation errors on different item selection strategies used in adaptive and conventional testing. To determine whether these effects reduced the advantages of certain optimal item selection strategies, simulations were repeated in the presence and absence of item parameter…
A Guide to Item Banking in Education. (Third Edition).
ERIC Educational Resources Information Center
Naccarato, Richard W.
The current status of banks of test items existing across the United States was determined through a survey conducted between September and December 1987. Item "bank" in this context does not imply that the test items are available in computerized form, but simply that "deposited" test items can be withdrawn for use. Emphasis…
Development and validation of an energy-balance knowledge test for fourth- and fifth-grade students.
Chen, Senlin; Zhu, Xihe; Kang, Minsoo
2017-05-01
A valid test measuring children's energy-balance (EB) knowledge is lacking in research. This study developed and validated the energy-balance knowledge test (EBKT) for fourth- and fifth-grade students. The original EBKT contained 25 items but was reduced to 23 items based on pilot results and intensive expert panel discussion. De-identified data were collected from 468 fourth- and fifth-grade students enrolled in four schools to examine the psychometric properties of the EBKT items. The Rasch model analysis was conducted using the Winsteps 3.65.0 software. Differential item functioning (DIF) analysis flagged 1 item (item #4) functioning differently between boys and girls, which was deleted. The final 22-item EBKT showed desirable model-data fit indices. The item difficulties varied widely, ranging from -3.58 logits (item #10, the easiest) to 1.70 logits (item #3, the hardest). The average person ability on the test was 0.28 logits (SD = .78). Additional analyses supported known-group difference validity of the EBKT scores in capturing gender- and grade-based ability differences. The test was overall valid but could be further improved by expanding test items to discern various ability levels. For lack of a better test, researchers and practitioners may use the EBKT to assess fourth- and fifth-grade students' EB knowledge.
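The logit values in this abstract can be read through the dichotomous Rasch model, in which the probability of a correct answer depends only on the difference between person ability and item difficulty. The sketch below plugs in the three figures reported above (mean ability 0.28, easiest item -3.58, hardest item 1.70); the function name is an illustrative choice, and the resulting probabilities are a model-based reading, not results reported by the authors.

```python
import math

def rasch_p(theta, b):
    """Rasch model: probability that a person of ability theta (logits)
    answers an item of difficulty b (logits) correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

mean_ability = 0.28            # average person ability from the abstract
easiest, hardest = -3.58, 1.70  # item #10 and item #3 difficulties

p_easy = rasch_p(mean_ability, easiest)
p_hard = rasch_p(mean_ability, hardest)
print(f"easiest item: {p_easy:.3f}, hardest item: {p_hard:.3f}")
```

Under these assumptions an average student would be expected to answer the easiest item correctly almost always and the hardest item about one time in five, which is consistent with the authors' remark that the items span a wide difficulty range.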
NASA Astrophysics Data System (ADS)
Rakkapao, Suttida; Prasitpong, Singha; Arayathanitkul, Kwan
2016-12-01
This study investigated the multiple-choice test of understanding of vectors (TUV), by applying item response theory (IRT). The difficulty, discrimination, and guessing parameters of the TUV items were fit with the three-parameter logistic model of IRT, using the PARSCALE program. The TUV ability is an ability parameter, here estimated assuming unidimensionality and local independence. Moreover, all distractors of the TUV were analyzed from item response curves (IRC) that represent simplified IRT. Data were gathered on 2392 science and engineering freshmen, from three universities in Thailand. The results revealed IRT analysis to be useful in assessing the test since its item parameters are independent of the ability parameters. The IRT framework reveals item-level information, and indicates appropriate ability ranges for the test. Moreover, the IRC analysis can be used to assess the effectiveness of the test's distractors. Both IRT and IRC approaches reveal test characteristics beyond those revealed by the classical analysis methods of tests. Test developers can apply these methods to diagnose and evaluate the features of items at various ability levels of test takers.
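For readers unfamiliar with the three-parameter logistic (3PL) model named above, a minimal sketch of its item response function follows. The parameter values are invented for illustration and are not TUV estimates; the optional scaling constant D = 1.7 used in some parameterizations is omitted here, which is an assumption about convention rather than a statement about the study.

```python
import math

def p_3pl(theta, a, b, c):
    """3PL item response function.

    theta: examinee ability, a: discrimination, b: difficulty,
    c: pseudo-guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item parameters (not from the TUV analysis).
a, b, c = 1.2, 0.5, 0.2
for theta in (-2.0, 0.5, 3.0):
    print(f"theta = {theta:+.1f}  P(correct) = {p_3pl(theta, a, b, c):.3f}")
```

At theta equal to b the probability sits halfway between the guessing floor c and 1, and for very low abilities it approaches c rather than 0, which is what distinguishes the 3PL from the Rasch and 2PL models.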
ERIC Educational Resources Information Center
Baghaei, Purya; Ravand, Hamdollah
2016-01-01
In this study the magnitudes of local dependence generated by cloze test items and reading comprehension items were compared and their impact on parameter estimates and test precision was investigated. An advanced English as a foreign language reading comprehension test containing three reading passages and a cloze test was analyzed with a…
Machine Shop. Criterion-Referenced Test (CRT) Item Bank.
ERIC Educational Resources Information Center
Davis, Diane, Ed.
This drafting criterion-referenced test item bank is keyed to the machine shop competency profile developed by industry and education professionals in Missouri. The 16 references used for drafting the test items are listed. Test items are arranged under these categories: orientation to machine shop; performing mathematical calculations; performing…
Rescuing Computerized Testing by Breaking Zipf's Law.
ERIC Educational Resources Information Center
Wainer, Howard
2000-01-01
Suggests that because of the nonlinear relationship between item usage and item security, the problems of test security posed by continuous administration of standardized tests cannot be resolved merely by increasing the size of the item pool. Offers alternative strategies to overcome these problems, distributing test items so as to avoid the…
ERIC Educational Resources Information Center
Ito, Kyoko; Sykes, Robert C.
This study investigated the practice of weighting a type of test item, such as constructed response, more than other types of items, such as selected response, to compute student scores for a mixed-item type of test. The study used data from statewide writing field tests in grades 3, 5, and 8 and considered two contexts, that in which a single…
ERIC Educational Resources Information Center
Atalmis, Erkan Hasan
2016-01-01
Multiple-choice (MC) items are commonly used in high-stake tests. Thus, each item of such tests should be meticulously constructed to increase the accuracy of decisions based on test results. Haladyna and his colleagues (2002) addressed the valid item-writing guidelines to construct high quality MC items in order to increase test reliability and…
[Development of the role scale for municipal supervising public health nurses].
Hatono, Yoko; Suzuki, Hiroko; Masaki, Naoko
2013-05-01
As public health nurses are becoming increasingly decentralized in municipalities, recommendations for allocating supervising public health nurses are being made. This study aimed to develop a scale for measuring the implementation of role of municipal supervising public health nurses and to test its reliability and validity. Scale items were developed using results of a qualitative inductive analysis of interview data, and the items were then revised following an examination of content validity by experts, resulting in a provisional scale of 17 items. A self-administered, written questionnaire was then completed by supervising public health nurses or public health nurses holding the most senior positions in all municipalities nationwide, with the exception of three prefectures in the Tohoku region (total 1,621 locations). In total, 1,036 responses were received, and 931 were used for analysis (valid response rate = 57.4%). Of these, 406 were completed by supervising public health nurses. After deleting one item as a result of item analysis and conducting principal component analysis, factor analysis was conducted using the major factor method and Promax rotation. One item with high loading on multiple factors was deleted, resulting in a scale comprising 15 items and 3 factors. The cumulative contribution ratio was 56.10%. The three factors were labeled "Promotion of health activities across the whole locality," "Coordination as a PHN role leader," and "Development of the skills of public health nurses". The reliability coefficient of the RMSP (Role Scale for Municipal Supervising Public Health Nurses) as a whole was 0.84 using the split-half method (Spearman-Brown formula) and 0.91 using Cronbach's alpha, confirming internal consistency. 
In terms of validity, an examination was conducted of the correlation of two RMSP scale scores (strength of awareness of role as a supervising public health nurse and confidence as a supervising public health nurse) and scores on existing scales assessing management abilities, and a significant correlation (P < 0.01) was obtained. Additionally, a comparison of the RMSP scores of decentralized local public health nurses according to rank and years of service in areas where there were no supervising public health nurses with the RMSP scores of supervising public health nurses showed that the scores of supervising public health nurses were higher. The developed scale was found to be reliable and valid for measuring the implementation of supervising public health nurses' role.
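The split-half reliability reported for the RMSP relies on the Spearman-Brown formula, which steps the correlation between two half-tests up to an estimate for the full-length scale. The formula is standard; the second line is only an inference, assuming the reported 0.84 is the corrected full-length value, since the abstract does not give the half-test correlation itself.

```latex
\[
  r_{\text{full}} \;=\; \frac{2\, r_{\text{half}}}{1 + r_{\text{half}}}
  \qquad\Longleftrightarrow\qquad
  r_{\text{half}} \;=\; \frac{r_{\text{full}}}{2 - r_{\text{full}}}
\]
% If the reported 0.84 is the corrected full-length value:
\[
  r_{\text{half}} \;\approx\; \frac{0.84}{2 - 0.84} \;=\; \frac{0.84}{1.16} \;\approx\; 0.72
\]
```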
Item difficulty and item validity for the Children's Group Embedded Figures Test.
Rusch, R R; Trigg, C L; Brogan, R; Petriquin, S
1994-02-01
The validity and reliability of the Children's Group Embedded Figures Test was reported for students in Grade 2 by Cromack and Stone in 1980; however, a search of the literature indicates no evidence for internal consistency or item analysis. Hence the purpose of this study was to examine the item difficulty and item validity of the test with children in Grades 1 and 2. Confusion in the literature over development and use of this test was seemingly resolved through analysis of these descriptions and through an interview with the test developer. One early-appearing item was unreasonably difficult. Two or three other items were quite difficult and made little contribution to the total score. Caution is recommended, however, in any reordering or elimination of items based on these findings, given the limited number of subjects (n = 84).
1976-01-01
items. The items tested were the MODI-PAC, a proprietary item of Remington Arms Company, a standard 12-gauge round of No. 4 lead shot, and an...to refrain from testing this item. Therefore, the final selection of items for testing were (1) the MODI-PAC, (2) a standard 12-gauge shotgun round of...The first item evaluated was the MODI-PAC. The MODI-PAC, which stands for “modified impact,” is a 12-gauge shotgun shell loaded with approximately 320
Interactions Between Item Content And Group Membership on Achievement Test Items.
ERIC Educational Resources Information Center
Linn, Robert L.; Harnisch, Delwyn L.
The purpose of this investigation was to examine the interaction of item content and group membership on achievement test items. Estimates of the parameters of the three parameter logistic model were obtained on the 46 item math test for the sample of eighth grade students (N = 2055) participating in the Illinois Inventory of Educational Progress,…
Effects of Item Exposure for Conventional Examinations in a Continuous Testing Environment.
ERIC Educational Resources Information Center
Hertz, Norman R.; Chinn, Roberta N.
This study explored the effect of item exposure on two conventional examinations administered as computer-based tests. A principal hypothesis was that item exposure would have little or no effect on average difficulty of the items over the course of an administrative cycle. This hypothesis was tested by exploring conventional item statistics and…
McInnes, Matthew D F; Moher, David; Thombs, Brett D; McGrath, Trevor A; Bossuyt, Patrick M; Clifford, Tammy; Cohen, Jérémie F; Deeks, Jonathan J; Gatsonis, Constantine; Hooft, Lotty; Hunt, Harriet A; Hyde, Christopher J; Korevaar, Daniël A; Leeflang, Mariska M G; Macaskill, Petra; Reitsma, Johannes B; Rodin, Rachel; Rutjes, Anne W S; Salameh, Jean-Paul; Stevens, Adrienne; Takwoingi, Yemisi; Tonelli, Marcello; Weeks, Laura; Whiting, Penny; Willis, Brian H
2018-01-23
Systematic reviews of diagnostic test accuracy synthesize data from primary diagnostic studies that have evaluated the accuracy of 1 or more index tests against a reference standard, provide estimates of test performance, allow comparisons of the accuracy of different tests, and facilitate the identification of sources of variability in test accuracy. To develop the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) diagnostic test accuracy guideline as a stand-alone extension of the PRISMA statement. Modifications to the PRISMA statement reflect the specific requirements for reporting of systematic reviews and meta-analyses of diagnostic test accuracy studies and the abstracts for these reviews. Established standards from the Enhancing the Quality and Transparency of Health Research (EQUATOR) Network were followed for the development of the guideline. The original PRISMA statement was used as a framework on which to modify and add items. A group of 24 multidisciplinary experts used a systematic review of articles on existing reporting guidelines and methods, a 3-round Delphi process, a consensus meeting, pilot testing, and iterative refinement to develop the PRISMA diagnostic test accuracy guideline. The final version of the PRISMA diagnostic test accuracy guideline checklist was approved by the group. The systematic review (produced 64 items) and the Delphi process (provided feedback on 7 proposed items; 1 item was later split into 2 items) identified 71 potentially relevant items for consideration. The Delphi process reduced these to 60 items that were discussed at the consensus meeting. Following the meeting, pilot testing and iterative feedback were used to generate the 27-item PRISMA diagnostic test accuracy checklist. To reflect specific or optimal contemporary systematic review methods for diagnostic test accuracy, 8 of the 27 original PRISMA items were left unchanged, 17 were modified, 2 were added, and 2 were omitted. 
The 27-item PRISMA diagnostic test accuracy checklist provides specific guidance for reporting of systematic reviews. The PRISMA diagnostic test accuracy guideline can facilitate the transparent reporting of reviews, and may assist in the evaluation of validity and applicability, enhance replicability of reviews, and make the results from systematic reviews of diagnostic test accuracy studies more useful.
Program Helps Standardize Documentation Of Software
NASA Technical Reports Server (NTRS)
Howe, G.
1994-01-01
Intelligent Documentation Management System, IDMS, computer program developed to assist project managers in implementing information system documentation standard known as NASA-STD-2100-91, NASA STD, COS-10300, of NASA's Software Management and Assurance Program. Standard consists of data-item descriptions or templates, each of which governs particular component of software documentation. IDMS helps program manager in tailoring documentation standard to project. Written in C language.
2010-06-01
Online Shopping Tool • Web self-service capability for the DOD • Sells both finished goods and services • Supports contracts written by DLA, GSA...month and over 750K items of content a month • FY08 Total Sales $835M; Green Sales $7.3M 11 DOD EMALL DOD’s Online Shopping Tool 1st Choice Support for
ERIC Educational Resources Information Center
Stephens, Ana C.; Knuth, Eric J.; Blanton, Maria L.; Isler, Isil; Gardiner, Angela Murphy; Marum, Tim
2013-01-01
This paper reports results from a written assessment given to 290 third-, fourth-, and fifth-grade students prior to any instructional intervention. We share and discuss students' responses to items addressing their understanding of equation structure and the meaning of the equal sign. We found that many students held an operational conception of…
ERIC Educational Resources Information Center
Marien, Michael
This guide seeks to sort the flood of information written about environmental issues and sustainable societies in recent years and order it in some way. A total of 450 document abstracts are presented here, most of which are on books and think tank reports. These items first appeared in "Future Survey" (journal) and were published…
ERIC Educational Resources Information Center
Roberts, Laura Weiss; Hammond, Katherine A. Green; Geppert, Cynthia M. A.; Warner, Teddy D.
2004-01-01
Objective: To assess the perspectives and preferences of medical students and residents regarding professionalism and ethics education. Methods: A new written survey with 124 items (scale: "strongly disagree" = 1, "strongly agree" = 9) was sent to all medical students (n = 308) and PGY 1-3 residents (n = 233) at one academic center. Results: Of…
ERIC Educational Resources Information Center
Borge, Javier
2015-01-01
G, G°, Δ_rG, Δ_rG°, ΔG, and ΔG° are essential quantities for mastering chemical equilibrium. Although the number of publications devoted to explaining these items is extremely high, it seems that they do not produce the desired effect because some articles and textbooks are still being written with…
An Efficiency Balanced Information Criterion for Item Selection in Computerized Adaptive Testing
ERIC Educational Resources Information Center
Han, Kyung T.
2012-01-01
Successful administration of computerized adaptive testing (CAT) programs in educational settings requires that test security and item exposure control issues be taken seriously. Developing an item selection algorithm that strikes the right balance between test precision and level of item pool utilization is the key to successful implementation…
ERIC Educational Resources Information Center
Arendasy, Martin E.; Sommer, Markus
2012-01-01
The use of new test administration technologies such as computerized adaptive testing in high-stakes educational and occupational assessments demands large item pools. Classic item construction processes and previous approaches to automatic item generation faced the problems of a considerable loss of items after the item calibration phase. In this…
Item Purification Does Not Always Improve DIF Detection: A Counterexample with Angoff's Delta Plot
ERIC Educational Resources Information Center
Magis, David; Facon, Bruno
2013-01-01
Item purification is an iterative process that is often advocated as improving the identification of items affected by differential item functioning (DIF). With test-score-based DIF detection methods, item purification iteratively removes the items currently flagged as DIF from the test scores to get purified sets of items, unaffected by DIF. The…
Gattrell, William T; Hopewell, Sally; Young, Kate; Farrow, Paul; White, Richard; Wager, Elizabeth; Winchester, Christopher C
2016-02-21
Authors may choose to work with professional medical writers when writing up their research for publication. We examined the relationship between medical writing support and the quality and timeliness of reporting of the results of randomised controlled trials (RCTs). Cross-sectional study. Primary reports of RCTs published in BioMed Central journals from 2000 to 16 July 2014, subdivided into those with medical writing support (n=110) and those without medical writing support (n=123). Proportion of items that were completely reported from a predefined subset of the Consolidated Standards of Reporting Trials (CONSORT) checklist (12 items known to be commonly poorly reported), overall acceptance time (from manuscript submission to editorial acceptance) and quality of written English as assessed by peer reviewers. The effect of funding source and publication year was examined. The number of articles that completely reported at least 50% of the CONSORT items assessed was higher for those with declared medical writing support (39.1% (43/110 articles); 95% CI 29.9% to 48.9%) than for those without (21.1% (26/123 articles); 95% CI 14.3% to 29.4%). Articles with declared medical writing support were more likely than articles without such support to have acceptable written English (81.1% (43/53 articles); 95% CI 67.6% to 90.1% vs 47.9% (23/48 articles); 95% CI 33.5% to 62.7%). The median time of overall acceptance was longer for articles with declared medical writing support than for those without (167 days (IQR 114.5-231 days) vs 136 days (IQR 77-193 days)). In this sample of open-access journals, declared professional medical writing support was associated with more complete reporting of clinical trial results and higher quality of written English. Medical writing support may play an important role in raising the quality of clinical trial reporting. Published by the BMJ Publishing Group Limited. 
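The proportions and intervals reported in the abstract above can be roughly reproduced from the stated counts. The sketch below is only illustrative: the abstract does not say which interval method the authors used, so the Wilson score interval is an assumption here, and its bounds differ slightly from the published ones. The function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Return (low, high) Wilson score interval for a binomial proportion.

    Assumption: the abstract does not name its CI method; Wilson is used
    here only as a common choice for proportions.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Counts taken from the abstract: articles completely reporting
# at least 50% of the assessed CONSORT items.
groups = {
    "with medical writing support": (43, 110),
    "without medical writing support": (26, 123),
}
for label, (k, n) in groups.items():
    low, high = wilson_interval(k, n)
    print(f"{label}: {100 * k / n:.1f}% "
          f"(Wilson 95% CI {100 * low:.1f}% to {100 * high:.1f}%)")
```

The point estimates match the reported 39.1% and 21.1%; the interval bounds land within about one percentage point of the published values, which is consistent with the authors having used a different (possibly exact) method.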
Jia, Lin-Zhi; Ya-Jun, Ma; Cao, Yi; Qian, Fen; Li, Xiang-Yu
2012-04-30
The quality indices of "Medical Parasitology" exam papers and the measured data for students in three majors at the university in 2010 were compared and analyzed. The exam papers were assembled from the test item bank. The alpha reliability coefficients of the three exam papers were above 0.70. The knowledge structure and capacity structure of the exam papers were basically balanced. However, the alpha reliability coefficient for the second major was the lowest, mainly because of the quality of the test items in that exam paper and the failure to revise the indices in the test item bank in time. This observation demonstrated that revising the test items and their indices in the item bank according to the measured data can improve the quality of test-setting from the item bank and reduce the differences among exam papers.
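The alpha reliability coefficient mentioned above is commonly computed as Cronbach's alpha, α = k/(k−1) · (1 − Σ item variances / variance of total scores). The following is a minimal sketch of that formula, assuming a complete examinee-by-item score matrix; the data are made up for illustration and are not from the study, and `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per examinee)."""
    k = len(scores[0])                                   # number of items
    item_vars = [variance([row[i] for row in scores])    # sample variance per item
                 for i in range(k)]
    total_var = variance([sum(row) for row in scores])   # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 0/1 item scores for five examinees on four items.
example = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(example)
print(f"alpha = {alpha:.2f}; at or above 0.70: {alpha >= 0.70}")
```

Comparing the value against 0.70 mirrors the threshold quoted in the abstract; a low alpha for one paper would point toward reviewing the items contributing the most unexplained variance.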
The Role of Item Models in Automatic Item Generation
ERIC Educational Resources Information Center
Gierl, Mark J.; Lai, Hollis
2012-01-01
Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…
ERIC Educational Resources Information Center
McCrimmon, Adam W.; Climie, Emma A.
2011-01-01
This article presents a review of the "Test of Written Language-Fourth Edition" (TOWL-4), a newly updated individual or group-based measure of written language for students aged 9 years, 0 months through 17 years, 11 months. The stated purposes of the measure are to identify students in need of support or intervention in the area of…
Hayes-Harb, Rachel; Cheng, Hui-Wen
2016-01-01
The role of written input in second language (L2) phonological and lexical acquisition has received increased attention in recent years. Here we investigated the influence of two factors that may moderate the influence of orthography on L2 word form learning: (i) whether the writing system is shared by the native language and the L2, and (ii) if the writing system is shared, whether the relevant grapheme-phoneme correspondences are also shared. The acquisition of Mandarin via the Pinyin and Zhuyin writing systems provides an ecologically valid opportunity to explore these factors. We first asked whether there is a difference in native English speakers' ability to learn Pinyin and Zhuyin grapheme-phoneme correspondences. In Experiment 1, native English speakers assigned to either Pinyin or Zhuyin groups were exposed to Mandarin words belonging to one of two conditions: in the “congruent” condition, the Pinyin forms are possible English spellings for the auditory words (e.g., <nai> for [nai]); in the “incongruent” condition, the Pinyin forms involve a familiar grapheme representing a novel phoneme (e.g., <xiu> for [ɕiou]). At test, participants were asked to indicate whether auditory and written forms matched; in the crucial trials, the written forms from training (e.g., <xiu>) were paired with possible English pronunciations of the Pinyin written forms (e.g., [ziou]). Experiment 2 was identical to Experiment 1 except that participants additionally saw pictures depicting word meanings during the exposure phase, and at test were asked to match auditory forms with the pictures. In both experiments the Zhuyin group outperformed the Pinyin group due to the Pinyin group's difficulty with “incongruent” items.
A third experiment confirmed that the groups did not differ in their ability to perceptually distinguish the relevant Mandarin consonants (e.g., [ɕ]) from the foils (e.g., [z]), suggesting that the findings of Experiments 1 and 2 can be attributed to the effects of orthographic input. We thus conclude that despite the familiarity of Pinyin graphemes to native English speakers, the need to suppress native language grapheme-phoneme correspondences in favor of new ones can lead to less target-like knowledge of newly learned words' forms than does learning Zhuyin's entirely novel graphemes. PMID:27375506
Item Review and the Rearrangement Procedure: Its Process and Its Results
ERIC Educational Resources Information Center
Papanastasiou, Elena C.
2005-01-01
Permitting item review is to the benefit of the examinees who typically increase their test scores with item review. However, testing companies do not prefer item review since it does not follow the logic on which adaptive tests are based, and since it is prone to cheating strategies. Consequently, item review is not permitted in many adaptive…
A Model-Based Method for Content Validation of Automatically Generated Test Items
ERIC Educational Resources Information Center
Zhang, Xinxin; Gierl, Mark
2016-01-01
The purpose of this study is to describe a methodology to recover the item model used to generate multiple-choice test items with a novel graph theory approach. Beginning with the generated test items and working backward to recover the original item model provides a model-based method for validating the content used to automatically generate test…
[People in divorce and their ambivalence: initial use of a newly developed couples inventory].
Riehl-Emde, A; Frei, R; Willi, J
1994-02-01
It has been reported that intact internal, external and social accommodation is related to a stable partnership between men and women. Every partnership, however, is also characterized by a certain potential towards separation that may develop despite the named stabilizing factors. This led to the question of how the accommodation is constituted in men and women during separation. A written questionnaire was, therefore, sent to a total of 35 men and women in separation. The Bradburn inventory was used to define the well-being of the volunteers in relation to a representative sample. Using a newly developed inventory, a number of items had to be rated twice: once to reflect the condition at the beginning of the partnership and once to reflect the situation during the last year. In addition, the relative contribution of each item in favor of or against separation was asked. The Bradburn inventory shows that the general well-being of the test sample is more impaired than that of the representative sample. The evaluation of the 80% of questionnaires that were returned revealed worsening of all items. The most important reasons for ending the partnership were i) the lack of verbal communication, ii) extramarital affairs, and iii) the impression that the partnership limits personal development. Reasons to continue the partnership differed between those who initiated the separation and those who were left: the initiators predominantly mentioned structural factors such as financial situation, living conditions and care for children, whereas the non-initiators predominantly mentioned internal qualities of the partnership. (ABSTRACT TRUNCATED AT 250 WORDS)
Brown, Garielle E; Bharwani, Aleem; Patel, Kamala D; Lemaire, Jane B
2016-08-04
To evaluate the format, content, and effectiveness of a newly developed orientation to wellness workshop, and to explore participants' overall perceptions. This was a mixed methods study. Participants consisted of 47 new faculty of medicine members who attended one of the four workshops held between 2011 and 2013. Questionnaires were used to evaluate workshop characteristics (10 survey items; response scale 1=unacceptable to 7=outstanding), intention to change behavior (yes/no), and retrospective pre/post workshop self-efficacy (4 survey items; response scale 1=no confidence to 6=absolute confidence). Mean scores and standard deviations were calculated for the workshop characteristics. Pre/post workshop self-efficacy scores were compared using a Wilcoxon signed-rank test. Participants' written qualitative feedback was coded using an inductive strategy to identify themes. There was strong support for the workshop characteristics with mean scores entirely above 6.00 (N=42). Thirty-one of 34 respondents (91%) expressed intention to change their behavior as a result of participating in the workshop. The post workshop self-efficacy scores (N=38 respondents) increased significantly for all four items (p<0.0001) compared to pre workshop ratings. Participants perceived the key workshop elements as the evidence-based content relevant to academic physicians, incorporation of practical tips and strategies, and an atmosphere conducive to discussion and experience sharing. Participants welcomed wellness as a focus of faculty development. Enhancing instruction around wellness has the potential to contribute positively to the professional competency and overall functioning of faculty of medicine members.
Optimal Bayesian Adaptive Design for Test-Item Calibration.
van der Linden, Wim J; Ren, Hao
2015-06-01
An optimal adaptive design for test-item calibration based on Bayesian optimality criteria is presented. The design adapts the choice of field-test items to the examinees taking an operational adaptive test using both the information in the posterior distributions of their ability parameters and the current posterior distributions of the field-test parameters. Different criteria of optimality based on the two types of posterior distributions are possible. The design can be implemented using an MCMC scheme with alternating stages of sampling from the posterior distributions of the test takers' ability parameters and the parameters of the field-test items while reusing samples from earlier posterior distributions of the other parameters. Results from a simulation study demonstrated the feasibility of the proposed MCMC implementation for operational item calibration. A comparison of performances for different optimality criteria showed faster calibration of substantial numbers of items for the criterion of D-optimality relative to A-optimality, a special case of c-optimality, and random assignment of items to the test takers.
State Assessment Program Item Banks: Model Language for Request for Proposals (RFP) and Contracts
ERIC Educational Resources Information Center
Swanson, Leonard C.
2010-01-01
This document provides recommendations for request for proposal (RFP) and contract language that state education agencies can use to specify their requirements for access to test item banks. An item bank is a repository for test items and data about those items. Item banks are used by state agency staff to view items and associated data; to…
The Impact of Receiving the Same Items on Consecutive Computer Adaptive Test Administrations.
ERIC Educational Resources Information Center
O'Neill, Thomas; Lunz, Mary E.; Thiede, Keith
2000-01-01
Studied item exposure in a computerized adaptive test when the item selection algorithm presents examinees with questions they were asked in a previous test administration. Results with 178 repeat examinees on a medical technologists' test indicate that the combined use of an adaptive algorithm to select items and latent trait theory to estimate…
ERIC Educational Resources Information Center
Saß, Steffani; Schütte, Kerstin
2016-01-01
Solving test items might require abilities in test-takers other than the construct the test was designed to assess. Item and student characteristics such as item format or reading comprehension can impact the test result. This experiment is based on cognitive theories of text and picture comprehension. It examines whether integration aids, which…
Uncertainties in the Item Parameter Estimates and Robust Automated Test Assembly
ERIC Educational Resources Information Center
Veldkamp, Bernard P.; Matteucci, Mariagiulia; de Jong, Martijn G.
2013-01-01
Item response theory parameters have to be estimated, and because of the estimation process, they do have uncertainty in them. In most large-scale testing programs, the parameters are stored in item banks, and automated test assembly algorithms are applied to assemble operational test forms. These algorithms treat item parameters as fixed values,…
Identifying Differential Item Functioning in Multi-Stage Computer Adaptive Testing
ERIC Educational Resources Information Center
Gierl, Mark J.; Lai, Hollis; Li, Johnson
2013-01-01
The purpose of this study is to evaluate the performance of CATSIB (Computer Adaptive Testing-Simultaneous Item Bias Test) for detecting differential item functioning (DIF) when items in the matching and studied subtest are administered adaptively in the context of a realistic multi-stage adaptive test (MST). MST was simulated using a 4-item…
Gerard, James M; Scalzo, Anthony J; Borgman, Matthew A; Watson, Christopher M; Byrnes, Chelsie E; Chang, Todd P; Auerbach, Marc; Kessler, David O; Feldman, Brian L; Payne, Brian S; Nibras, Sohail; Chokshi, Riti K; Lopreiato, Joseph O
2018-06-01
We developed a first-person serious game, PediatricSim, to teach and assess performances on seven critical pediatric scenarios (anaphylaxis, bronchiolitis, diabetic ketoacidosis, respiratory failure, seizure, septic shock, and supraventricular tachycardia). In the game, players are placed in the role of a code leader and direct patient management by selecting from various assessment and treatment options. The objective of this study was to obtain supportive validity evidence for the PediatricSim game scores. Game content was developed by 11 subject matter experts and followed the American Heart Association's 2011 Pediatric Advanced Life Support Provider Manual and other authoritative references. Sixty subjects with three different levels of experience were enrolled to play the game. Before game play, subjects completed a 40-item written pretest of knowledge. Game scores were compared between subject groups using scoring rubrics developed for the scenarios. Validity evidence was established and interpreted according to Messick's framework. Content validity was supported by a game development process that involved expert experience, focused literature review, and pilot testing. Subjects rated the game favorably for engagement, realism, and educational value. Interrater agreement on game scoring was excellent (intraclass correlation coefficient = 0.91, 95% confidence interval = 0.89-0.9). Game scores were higher for attendings followed by residents then medical students (Pc < 0.01) with large effect sizes (1.6-4.4) for each comparison. There was a very strong, positive correlation between game and written test scores (r = 0.84, P < 0.01). These findings contribute validity evidence for PediatricSim game scores to assess knowledge of pediatric emergency medicine resuscitation.
Written Informed-Consent Statutes and HIV Testing
Ehrenkranz, Peter D.; Pagán, José A.; Begier, Elizabeth M.; Linas, Benjamin; Madison, Kristin; Armstrong, Katrina
2009-01-01
Background: Almost 1 million Americans are infected with HIV, yet it is estimated that as many as 250,000 of them do not know their serostatus. This study examined whether people residing in states with statutes requiring written informed consent prior to HIV testing were less likely to report a recent HIV test. Methods: The study is based on survey data from the 2004 Behavioral Risk Factor Surveillance System. Logistic regression was used to assess the association between residence in a state with a pre-test written informed-consent requirement and individual self-report of recent HIV testing. The regression analyses controlled for potential state- and individual-level confounders. Results: Almost 17% of respondents reported that they had been tested for HIV in the prior 12 months. Ten states had statutes requiring written informed consent prior to routine HIV testing; nine of those were analyzed in this study. After adjusting for other state- and individual-level factors, people who resided in these nine states were less likely to report a recent history of HIV testing (OR=0.85; 95% CI=0.80, 0.90). The average marginal effect was −0.02 (p<0.001; 95% CI=−0.03, −0.01); thus, written informed-consent statutes are associated with a 12% reduction in HIV testing from the baseline testing level of 17%. The association between a consent requirement and lack of testing was greatest among respondents who denied HIV risk factors, were non-Hispanic whites, or who had higher levels of education. Conclusions: This study’s findings suggest that the removal of written informed-consent requirements might promote the non–risk-based routine-testing approach that the CDC advocates in its new testing guidelines. PMID:19423271
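The step from a marginal effect of −0.02 to a "12% reduction" in the abstract above is a relative change against the 17% baseline. A small worked sketch, using only the rounded figures quoted in the abstract (the function name `relative_change` is hypothetical):

```python
def relative_change(marginal_effect, baseline_rate):
    """Express an absolute change in probability relative to the baseline rate."""
    return marginal_effect / baseline_rate

baseline_rate = 0.17      # share reporting an HIV test in the prior 12 months
marginal_effect = -0.02   # average marginal effect of a written-consent statute

change = relative_change(marginal_effect, baseline_rate)
print(f"relative change in testing: {100 * change:.1f}%")  # about -11.8%, i.e. roughly 12%
```

Because both inputs are rounded in the abstract, the result is approximate; the authors' unrounded estimates may give a slightly different figure.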
An investigation into pilot and system response to critical in-flight events, volume 2
NASA Technical Reports Server (NTRS)
Rockwell, T. H.; Giffin, W. C.
1981-01-01
Critical in-flight events are studied using mission simulation and written tests of pilot responses. Materials and procedures used in knowledge tests, written tests, and mission simulations are included.
A Stepwise Test Characteristic Curve Method to Detect Item Parameter Drift
ERIC Educational Resources Information Center
Guo, Rui; Zheng, Yi; Chang, Hua-Hua
2015-01-01
An important assumption of item response theory is item parameter invariance. Sometimes, however, item parameters are not invariant across different test administrations due to factors other than sampling error; this phenomenon is termed item parameter drift. Several methods have been developed to detect drifted items. However, most of the…
Shen, Linjun; Li, Feiming; Wattleworth, Roberta; Filipetto, Frank
2010-10-01
The Comprehensive Osteopathic Medical Licensing Examination conducted a trial of multimedia items in the 2008-2009 Level 3 testing cycle to determine (1) if multimedia items were able to test additional elements of medical knowledge and skills and (2) how to develop effective multimedia items. Forty-four content-matched multimedia and text multiple-choice items were randomly delivered to Level 3 candidates. Logistic regression and paired-samples t tests were used for pairwise and group-level comparisons, respectively. Nine pairs showed significant differences in difficulty, discrimination, or both. Content analysis found that, if text narrations were less direct, multimedia materials could make items easier. When textbook terminologies were replaced by multimedia presentations, multimedia items could become more difficult. Moreover, a multimedia item was found not to be uniformly difficult for candidates at different ability levels, possibly because multimedia and text items tested different elements of the same concept. Multimedia items may be capable of measuring some constructs different from what text items can measure. Effective multimedia items with reasonable psychometric properties can be intentionally developed.
Koh, Bongyeun; Hong, Sunggi; Kim, Soon-Sim; Hyun, Jin-Sook; Baek, Milye; Moon, Jundong; Kwon, Hayran; Kim, Gyoungyong; Min, Seonggi; Kang, Gu-Hyun
2016-01-01
The goal of this study was to characterize the difficulty index of the items in the skills test components of the class I and II Korean emergency medical technician licensing examination (KEMTLE), which requires examinees to select items randomly. The results of 1,309 class I KEMTLE examinations and 1,801 class II KEMTLE examinations in 2013 were subjected to analysis. Items from the basic and advanced skills test sections of the KEMTLE were compared to determine whether some were significantly more difficult than others. In the class I KEMTLE, all 4 of the items on the basic skills test showed significant variation in difficulty index (P<0.01), as did 4 of the 5 items on the advanced skills test (P<0.05). In the class II KEMTLE, 4 of the 5 items on the basic skills test showed significantly different difficulty indices (P<0.01), as did all 3 of the advanced skills test items (P<0.01). In the skills test components of the class I and II KEMTLE, the procedure in which examinees randomly select questions should be revised to require examinees to respond to a set of fixed items in order to improve the reliability of the national licensing examination.
Stetson, Barbara; Schlundt, David; Rothschild, Chelsea; Floyd, Jennifer E; Rogers, Whitney; Mokshagundam, Sri Prakash
2011-03-01
To develop and evaluate the validity and reliability of The Personal Diabetes Questionnaire (PDQ), a brief, yet comprehensive measure of diabetes self-care behaviors, perceptions and barriers. To examine individual items to provide descriptive and normative information and provide data on scale reliability and associations between PDQ scales and concurrently assessed HbA1c and BMI. Items were written to address nutritional management, medication utilization, blood glucose monitoring, and physical activity. The initial instrument was reviewed by multidisciplinary diabetes care providers and items subsequently revised until the measure provided complete coverage of the diabetes care domains using as few items as possible. The scoring scheme was generated rationally. Subjects were 790 adults (205 with type 1 and 585 with type 2 diabetes) who completed the PDQ while waiting for clinic appointments. Item completion rates were high, with few items skipped by participants. Subscales demonstrated good internal consistency (Cronbach α=.650-.834) and demonstrated significant associations with BMI (p≤.001) and HbA1c (p≤.001). The PDQ is a useful measure of diabetes self-care behaviors and related perceptions and barriers that is reliable and valid and feasible to administer in a clinic setting. This measure may be used to obtain data for assessing diabetes self-management and barriers and to guide patient care. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Item Analysis in Introductory Economics Testing.
ERIC Educational Resources Information Center
Tinari, Frank D.
1979-01-01
Computerized analysis of multiple choice test items is explained. Examples of item analysis applications in the introductory economics course are discussed with respect to three objectives: to evaluate learning; to improve test items; and to help improve classroom instruction. Problems, costs and benefits of the procedures are identified. (JMD)
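The kind of computerized item analysis described above typically reports, for each multiple-choice item, a difficulty value and a discrimination value. The sketch below is an assumption about what such an analysis might compute, not the article's procedure: difficulty is taken as the proportion of examinees answering correctly, and discrimination as the difference in proportion correct between the upper and lower halves of examinees ranked by total score. The function name and the response data are hypothetical.

```python
def item_analysis(responses):
    """Classical item statistics from rows of 0/1 item scores (one row per examinee).

    Returns a list of (difficulty, discrimination) tuples, one per item.
    """
    n_items = len(responses[0])
    ranked = sorted(responses, key=sum, reverse=True)   # highest total scores first
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    results = []
    for i in range(n_items):
        # Proportion of all examinees answering item i correctly.
        difficulty = sum(row[i] for row in responses) / len(responses)
        # Upper-group minus lower-group proportion correct.
        discrimination = (sum(r[i] for r in upper) - sum(r[i] for r in lower)) / half
        results.append((difficulty, discrimination))
    return results

# Hypothetical scored responses: six examinees, three items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
    [1, 0, 0],
]
for number, (p, d) in enumerate(item_analysis(responses), start=1):
    print(f"item {number}: difficulty={p:.2f}, discrimination={d:.2f}")
```

An item that nearly everyone answers correctly and that barely separates the upper and lower groups is a natural candidate for revision, which connects to the abstract's objective of improving test items.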
Quality of the written radiology report: a review of the literature.
Pool, Felicity; Goergen, Stacy
2010-08-01
A literature review was carried out, guided by the question, What are the important elements of a high-quality radiology written report? Two papers known to the authors were used as a basis for 5 PubMed search strategies. Exclusion criteria were applied to retrieved citations. Reference lists of retrieved citations were scanned for additional relevant papers and exclusion criteria applied to these. Web sites of professional radiology organizations were scanned for guidelines relating to the written radiology report. Retrieved guidelines were appraised using the Appraisal of Guidelines for Research & Evaluation instrument. Methodologies of retrieved papers were not suitable for conventional appraisal, and an evidence table was constructed. The search strategy identified 25 published papers and 4 guidelines. Published study methodologies included 1 randomized controlled trial; 1 before-and-after study of interventions; 10 observational studies, audits, or analyses; 12 surveys; and 1 narrative review of the literature. Existing guidelines have a number of weaknesses with regard to scope and purpose, methods of development, stakeholder consultation, and editorial independence and applicability. There is a major gap in published studies relating to testing of interventions to improve report quality using conventional randomized controlled trial methods. Published studies and guidelines generally support report content, including clinical history, examination quality, description of findings, comparison, and diagnosis. Important report attributes include accuracy, clarity, and certainty. There is wide variation in the language used to describe imaging findings and diagnostic certainty. Survey participants strongly preferred reports with structured or itemized formats, but few studies exist regarding the effect of report structure on quality. Copyright 2010 American College of Radiology. Published by Elsevier Inc. All rights reserved.
Zoltowski, Kathleen S W; Mistry, Rakesh D; Brousseau, David C; Whitfill, Travis; Aronson, Paul L
2016-01-01
Satisfaction is an important measure of care quality. Interventions to improve satisfaction in the pediatric emergency department (ED) are limited, especially for patients with nonurgent conditions. Our objective was to determine if clinician knowledge of written parental expectations improves parental satisfaction for nonurgent ED visits. This randomized controlled trial was conducted in a tertiary-care pediatric ED. Parents of children presenting for nonurgent visits (Emergency Severity Index level 4 or 5) were randomized into 3 groups: 1) the intervention group completed an expectation survey on arrival, which was reviewed by the clinician, 2) the control group completed the expectation survey, which was not reviewed, and 3) the baseline group did not complete an expectation survey. At ED disposition, all groups completed a 3-item satisfaction survey, scored using 5-point Likert scales (1 = very poor, 5 = very good). The primary outcome was rating of "overall care." Secondary outcomes included likelihood of recommending the ED and staff sensitivity to concerns. Proportions were compared by chi-square test. A total of 304 subjects were enrolled. The proportion of parents rating 5 of 5 for overall care did not differ among the baseline, control, and intervention groups (74.8% vs 73.2% vs 69.2%, P = .56). The proportion of parents rating 5 of 5 also did not differ for likelihood of recommending the ED (77.7% vs 72.2% vs 70.2%, P = .45) or staff sensitivity to concerns (78.6% vs 78.4% vs 78.8%, P = .71). For nonurgent pediatric ED visits, clinician knowledge of written parental expectations does not improve parental satisfaction. Copyright © 2016 Academic Pediatric Association. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Ilich, Maria O.
Psychometricians and test developers evaluate standardized tests for potential bias against groups of test-takers by using differential item functioning (DIF). English language learners (ELLs) are a diverse group of students whose native language is not English. While they are still learning the English language, they must take their standardized tests for their school subjects, including science, in English. In this study, linguistic complexity was examined as a possible source of DIF that may result in test scores that confound science knowledge with a lack of English proficiency among ELLs. Two years of fifth-grade state science tests were analyzed for evidence of DIF using two DIF methods, Simultaneous Item Bias Test (SIBTest) and logistic regression. The tests presented a unique challenge in that the test items were grouped together into testlets (groups of items referring to a scientific scenario to measure knowledge of different science content or skills). Very large samples of 10,256 students in 2006 and 13,571 students in 2007 were examined. Half of each sample was composed of Spanish-speaking ELLs; the balance consisted of native English speakers. The two DIF methods were in agreement about the items that favored non-ELLs and the items that favored ELLs. Logistic regression effect sizes were all negligible, while SIBTest flagged items with low to high DIF. A decrease in socioeconomic status and Spanish-speaking ELL diversity may have led to inconsistent SIBTest effect sizes for items used in both testing years. The DIF results for the testlets suggested that ELLs lacked sufficient opportunity to learn science content. The DIF results further suggest that those constructed response test items requiring the student to draw a conclusion about a scientific investigation or to plan a new investigation tended to favor ELLs.
NASA Astrophysics Data System (ADS)
Wren, David A.
The research presented in this dissertation culminated in a 10-item Thermochemistry Concept Inventory (TCI). The development of the TCI can be divided into two main phases: qualitative studies and quantitative studies. Both phases focused on the primary stakeholders of the TCI, college-level general chemistry instructors and students. Each phase was designed to collect evidence for the validity of the interpretations and uses of TCI testing data. A central use of TCI testing data is to identify student conceptual misunderstandings, which are represented as incorrect options of multiple-choice TCI items. Therefore, quantitative and qualitative studies focused heavily on collecting evidence at the item-level, where important interpretations may be made by TCI users. Qualitative studies included student interviews (N = 28) and online expert surveys (N = 30). Think-aloud student interviews (N = 12) were used to identify conceptual misunderstandings used by students. Novice response process validity interviews (N = 16) helped provide information on how students interpreted and answered TCI items and were the basis of item revisions. Practicing general chemistry instructors (N = 18), or experts, defined boundaries of thermochemistry content included on the TCI. Once TCI items were in the later stages of development, an online version of the TCI was used in an expert response process validity survey (N = 12) to provide expert feedback on item content, format, and consensus on the correct answer for each item. Quantitative studies included three phases: beta testing of TCI items (N = 280), pilot testing of a 12-item TCI (N = 485), and a large data collection using a 10-item TCI (N = 1331). In addition to traditional classical test theory analysis, Rasch model analysis was also used for evaluation of testing data at the test and item level.
The TCI was administered in both formative assessment (beta and pilot testing) and summative assessment (large data collection), with items performing well in both. One item, item K, did not have acceptable psychometric properties when the TCI was used as a quiz (summative assessment), but was retained in the final version of the TCI based on the acceptable psychometric properties displayed in pilot testing (formative assessment).
ERIC Educational Resources Information Center
Li, Yanmei
2012-01-01
In a common-item (anchor) equating design, the common items should be evaluated for item parameter drift. Drifted items are often removed. For a test that contains mostly dichotomous items and only a small number of polytomous items, removing some drifted polytomous anchor items may result in anchor sets that no longer resemble mini-versions of…
Sinharay, Sandip
2017-09-01
Benefiting from item preknowledge is a major type of fraudulent behavior during educational assessments. Belov suggested the posterior shift statistic for detection of item preknowledge and showed its performance to be better on average than that of seven other statistics for detection of item preknowledge for a known set of compromised items. Sinharay suggested a statistic based on the likelihood ratio test for detection of item preknowledge; the advantage of the statistic is that its null distribution is known. Results from simulated and real data and adaptive and nonadaptive tests are used to demonstrate that the Type I error rate and power of the statistic based on the likelihood ratio test are very similar to those of the posterior shift statistic. Thus, the statistic based on the likelihood ratio test appears promising in detecting item preknowledge when the set of compromised items is known.
ERIC Educational Resources Information Center
McLeod, Lori D.; Lewis, Charles; Thissen, David.
With the increased use of computerized adaptive testing, which allows for continuous testing, new concerns about test security have evolved, one being the assurance that items in an item pool are safeguarded from theft. In this paper, the risk of score inflation and procedures to detect test takers using item preknowledge are explored. When test…
How to construct and implement script concordance tests: insights from a systematic review.
Dory, Valérie; Gagnon, Robert; Vanpee, Dominique; Charlin, Bernard
2012-06-01
Programmes of assessment should measure the various components of clinical competence. Clinical reasoning has been traditionally assessed using written tests and performance-based tests. The script concordance test (SCT) was developed to assess clinical data interpretation skills. A recent review of the literature examined the validity argument concerning the SCT. Our aim was to provide potential users with evidence-based recommendations on how to construct and implement an SCT. A systematic review of relevant databases (MEDLINE, ERIC [Education Resources Information Centre], PsycINFO, the Research and Development Resource Base [RDRB, University of Toronto]) and Google Scholar, medical education journals and conference proceedings was conducted for references in English or French. It was supplemented by ancestry searching and by additional references provided by experts. The search yielded 848 references, of which 80 were analysed. Studies suggest that tests with around 100 items (25-30 cases), of which 25% are discarded after item analysis, should provide reliable scores. Panels with 10-20 members are needed to reach adequate precision in terms of estimated reliability. Panellists' responses can be analysed by checking for moderate variability among responses. Studies of alternative scoring methods are inconclusive, but the traditional scoring method is satisfactory. There is little evidence on how best to determine a pass/fail threshold for high-stakes examinations. Our literature search was broad and included references from medical education journals not indexed in the usual databases, conference abstracts and dissertations. There is good evidence on how to construct and implement an SCT for formative purposes or medium-stakes course evaluations. 
Further avenues for research include examining the impact of various aspects of SCT construction and implementation on issues such as educational impact, correlations with other assessments, and validity of pass/fail decisions, particularly for high-stakes examinations. © Blackwell Publishing Ltd 2012.
Effect of Multiple Testing Adjustment in Differential Item Functioning Detection
ERIC Educational Resources Information Center
Kim, Jihye; Oshima, T. C.
2013-01-01
In a typical differential item functioning (DIF) analysis, a significance test is conducted for each item. As a test consists of multiple items, such multiple testing may increase the possibility of making a Type I error at least once. The goal of this study was to investigate how to control a Type I error rate and power using adjustment…
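The multiple-testing problem described in this abstract can be illustrated with a minimal sketch. The abstract is truncated before it names the adjustments studied, so the two procedures shown here (Bonferroni and Benjamini-Hochberg) are assumptions chosen as common examples, and the per-item p-values are hypothetical.

```python
from typing import List


def bonferroni(p_values: List[float]) -> List[float]:
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (false discovery rate control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the original ordering of the p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def flag_items(adjusted: List[float], alpha: float = 0.05) -> List[int]:
    """Indices of items whose adjusted p-value falls below alpha."""
    return [i for i, p in enumerate(adjusted) if p < alpha]


# Hypothetical p-values from one DIF significance test per item.
item_p_values = [0.001, 0.012, 0.020, 0.045, 0.200]

print("unadjusted:", flag_items(item_p_values))
print("Bonferroni:", flag_items(bonferroni(item_p_values)))
print("Benjamini-Hochberg:", flag_items(benjamini_hochberg(item_p_values)))
```

With these made-up values, the unadjusted tests flag more items than either adjusted procedure, which is the trade-off between Type I error control and power that the study examines.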
Item Response Theory Models for Performance Decline during Testing
ERIC Educational Resources Information Center
Jin, Kuan-Yu; Wang, Wen-Chung
2014-01-01
Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…
Differential item functioning analysis of the Vanderbilt Expertise Test for cars.
Lee, Woo-Yeol; Cho, Sun-Joo; McGugin, Rankin W; Van Gulick, Ana Beth; Gauthier, Isabel
2015-01-01
The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge.
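For readers unfamiliar with the three-parameter logistic (3PL) model mentioned in this abstract, the standard item response function can be sketched as follows. The parameter values are hypothetical and are not taken from the VETcar analysis.

```python
import math


def irf_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the 3PL model.

    theta: subject ability
    a: item discrimination
    b: item difficulty
    c: lower asymptote (pseudo-guessing parameter)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: moderate discrimination, average difficulty, and a
# guessing floor of 0.2.
for theta in (-3.0, 0.0, 3.0):
    print(theta, round(irf_3pl(theta, a=1.2, b=0.0, c=0.2), 3))
```

When ability equals the item difficulty, the probability sits halfway between the guessing floor and 1, and for very low ability it approaches the floor `c`.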
Samejima Items in Multiple-Choice Tests: Identification and Implications
ERIC Educational Resources Information Center
Rahman, Nazia
2013-01-01
Samejima hypothesized that non-monotonically increasing item response functions (IRFs) of ability might occur for multiple-choice items (referred to here as "Samejima items") if low ability test takers with some, though incomplete, knowledge or skill are drawn to a particularly attractive distractor, while very low ability test takers…
Computerized Numerical Control Test Item Bank.
ERIC Educational Resources Information Center
Reneau, Fred; And Others
This guide contains 285 test items for use in teaching a course in computerized numerical control. All test items were reviewed, revised, and validated by incumbent workers and subject matter instructors. Items are provided for assessing student achievement in such aspects of programming and planning, setting up, and operating machines with…
Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating
ERIC Educational Resources Information Center
He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei
2013-01-01
Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…
ERIC Educational Resources Information Center
He, Yong
2013-01-01
Common test items play an important role in equating multiple test forms under the common-item nonequivalent groups design. Inconsistent item parameter estimates among common items can lead to large bias in equated scores for IRT true score equating. Current methods extensively focus on detection and elimination of outlying common items, which…
ERIC Educational Resources Information Center
Scheuneman, Janice Dowd; Gerritz, Kalle
1990-01-01
Differential item functioning (DIF) methodology for revealing sources of item difficulty and performance characteristics of different groups was explored. A total of 150 Scholastic Aptitude Test items and 132 Graduate Record Examination general test items were analyzed. DIF was evaluated for males and females and Blacks and Whites. (SLD)
Item Structural Properties as Predictors of Item Difficulty and Item Association.
ERIC Educational Resources Information Center
Solano-Flores, Guillermo
1993-01-01
Studied the ability of logical test design (LTD) to predict student performance in reading Roman numerals for 211 sixth graders in Mexico City tested on Roman numeral items varying on LTD-related and non-LTD-related variables. The LTD-related variable item iterativity was found to be the best predictor of item difficulty. (SLD)
Investigating Item Exposure Control Methods in Computerized Adaptive Testing
ERIC Educational Resources Information Center
Ozturk, Nagihan Boztunc; Dogan, Nuri
2015-01-01
This study aims to investigate the effects of item exposure control methods on measurement precision and on test security under various item selection methods and item pool characteristics. In this study, the Randomesque (with item group sizes of 5 and 10), Sympson-Hetter, and Fade-Away methods were used as item exposure control methods. Moreover,…
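The Randomesque method named in this abstract can be sketched roughly as below: rather than always administering the single most informative item, the next item is drawn at random from a small group of the most informative ones. The sketch assumes a two-parameter logistic (2PL) model and maximum Fisher information as the ranking criterion; neither is confirmed by the truncated abstract, and the item pool is hypothetical.

```python
import math
import random
from typing import List, Set, Tuple


def info_2pl(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def randomesque_select(theta: float,
                       pool: List[Tuple[float, float]],
                       administered: Set[int],
                       group_size: int,
                       rng: random.Random) -> int:
    """Pick the next item at random from the group_size most informative
    items that have not been administered yet."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    if not candidates:
        raise ValueError("item pool is exhausted")
    candidates.sort(key=lambda i: info_2pl(theta, *pool[i]), reverse=True)
    return rng.choice(candidates[:group_size])


# Hypothetical pool of (discrimination, difficulty) pairs.
pool = [(1.0, -2.0 + 0.2 * k) for k in range(20)]
rng = random.Random(0)
next_item = randomesque_select(theta=0.0, pool=pool, administered={10},
                               group_size=5, rng=rng)
print(next_item)
```

A larger `group_size` (the abstract mentions groups of 5 and 10) spreads exposure across more items at some cost in measurement precision, since the chosen item is less likely to be the most informative one.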
ERIC Educational Resources Information Center
Lee, Woo-yeol; Cho, Sun-Joo
2017-01-01
Cross-level invariance in a multilevel item response model can be investigated by testing whether the within-level item discriminations are equal to the between-level item discriminations. Testing the cross-level invariance assumption is important to understand constructs in multilevel data. However, in most multilevel item response model…
Item Pool Design for an Operational Variable-Length Computerized Adaptive Test
ERIC Educational Resources Information Center
He, Wei; Reckase, Mark D.
2014-01-01
For computerized adaptive tests (CATs) to work well, they must have an item pool with sufficient numbers of good quality items. Many researchers have pointed out that, in developing item pools for CATs, not only is the item pool size important but also the distribution of item parameters and practical considerations such as content distribution…
NASA Astrophysics Data System (ADS)
Fisher, W. P., Jr.; Petry, P.
2016-11-01
Many published research studies document item calibration invariance across samples using Rasch's probabilistic models for measurement. A new approach to outcomes evaluation for very small samples was employed for two workshop series focused on stress reduction and joyful living conducted for health system employees and caregivers since 2012. Rasch-calibrated self-report instruments measuring depression, anxiety and stress, and the joyful living effects of mindfulness behaviors were identified in peer-reviewed journal articles. Items from one instrument were modified for use with a US population, other items were simplified, and some new items were written. Participants provided ratings of their depression, anxiety and stress, and the effects of their mindfulness behaviors before and after each workshop series. The numbers of participants providing both pre- and post-workshop data were low (16 and 14). Analysis of these small data sets produces results showing that, with some exceptions, the item hierarchies defining the constructs retained the same invariant profiles they had exhibited in the published research (correlations (not disattenuated) range from 0.85 to 0.96). In addition, comparisons of the pre- and post-workshop measures for the three constructs showed substantively and statistically significant changes. Implications for program evaluation comparisons, quality improvement efforts, and the organization of communications concerning outcomes in clinical fields are explored.
The nutrition advisor expert system
NASA Technical Reports Server (NTRS)
Huse, Scott M.; Shyne, Scott S.
1991-01-01
The Nutrition Advisor Expert System (NAES) is an expert system written in the C Language Integrated Production System (CLIPS). NAES provides expert knowledge and guidance into the complex world of nutrition management by capturing the knowledge of an expert and placing it at the user's fingertips. Specifically, NAES enables the user to: (1) obtain precise nutrition information for food items; (2) perform nutritional analysis of meal(s), flagging deficiencies based upon the U.S. Recommended Daily Allowances; (3) predict possible ailments based upon observed nutritional deficiency trends; (4) obtain a top ten listing of food items for a given nutrient; and (5) conveniently upgrade the data base. An explanation facility for the ailment prediction feature is also provided to document the reasoning process.
ERIC Educational Resources Information Center
Yoon, Su-Youn; Lee, Chong Min; Houghton, Patrick; Lopez, Melissa; Sakano, Jennifer; Loukina, Anastasia; Krovetz, Bob; Lu, Chi; Madani, Nitin
2017-01-01
In this study, we developed assistive tools and resources to support TOEIC® Listening test item generation. There has recently been an increased need for a large pool of items for these tests. This need has, in turn, inspired efforts to increase the efficiency of item generation while maintaining the quality of the created items. We aimed to…
ERIC Educational Resources Information Center
Nissan, Susan; And Others
One of the item types in the Listening Comprehension section of the Test of English as a Foreign Language (TOEFL) test is the dialogue. Because the dialogue item pool needs to have an appropriate balance of items at a range of difficulty levels, test developers have examined items at various difficulty levels in an attempt to identify their…
Park, In Sook; Suh, Yeon Ok; Park, Hae Sook; Kang, So Young; Kim, Kwang Sung; Kim, Gyung Hee; Choi, Yeon-Hee; Kim, Hyun-Ju
2017-01-01
The purpose of this study was to improve the quality of items on the Korean Nursing Licensing Examination by developing and evaluating case-based items that reflect integrated nursing knowledge. We conducted a cross-sectional observational study to develop new case-based items. The methods for developing test items included expert workshops, brainstorming, and verification of content validity. After a mock examination of undergraduate nursing students using the newly developed case-based items, we evaluated the appropriateness of the items through classical test theory and item response theory. A total of 50 case-based items were developed for the mock examination, and content validity was evaluated. The question items integrated 34 discrete elements of integrated nursing knowledge. The mock examination was taken by 741 baccalaureate students in their fourth year of study at 13 universities. Their average score on the mock examination was 57.4, and the examination showed a reliability of 0.40. According to classical test theory, the average level of item difficulty of the items was 57.4% (80%-100% for 12 items; 60%-80% for 13 items; and less than 60% for 25 items). The mean discrimination index was 0.19, and was above 0.30 for 11 items and 0.20 to 0.29 for 15 items. According to item response theory, the item discrimination parameter (in the logistic model) was none for 10 items (0.00), very low for 20 items (0.01 to 0.34), low for 12 items (0.35 to 0.64), moderate for 6 items (0.65 to 1.34), high for 1 item (1.35 to 1.69), and very high for 1 item (above 1.70). The item difficulty was very easy for 24 items (below -2.0), easy for 8 items (-2.0 to -0.5), medium for 6 items (-0.5 to 0.5), hard for 3 items (0.5 to 2.0), and very hard for 9 items (2.0 or above). The goodness-of-fit test in terms of the 2-parameter item response model between the range of 2.0 to 0.5 revealed that 12 items had an ideal correct answer rate. 
We surmised that the low reliability of the mock examination was influenced by the timing of the test for the examinees and the inappropriate difficulty of the items. Our study suggested a methodology for the development of future case-based items for the Korean Nursing Licensing Examination.
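The classical test theory quantities reported in this abstract (item difficulty as the proportion of correct answers, and a discrimination index) can be illustrated with a short sketch. The abstract does not say how its discrimination index was computed; the upper-lower group method with 27% groups used here is an assumption, and the response matrix is made up.

```python
from typing import List


def item_difficulty(responses: List[List[int]], item: int) -> float:
    """Proportion of examinees answering the item correctly (0/1 scoring)."""
    return sum(r[item] for r in responses) / len(responses)


def discrimination_index(responses: List[List[int]], item: int,
                         fraction: float = 0.27) -> float:
    """Upper-lower discrimination index: proportion correct in the top
    scoring group minus proportion correct in the bottom scoring group."""
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    p_upper = sum(r[item] for r in upper) / n
    p_lower = sum(r[item] for r in lower) / n
    return p_upper - p_lower


# Hypothetical data: 6 examinees (rows) by 3 items (columns).
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]

for j in range(3):
    print(j, round(item_difficulty(responses, j), 2),
          discrimination_index(responses, j))
```

Examinees with tied total scores are kept in their input order, so group membership at the cut points depends on that order; a real analysis would need an explicit tie-handling rule.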
The beneficial effect of testing: an event-related potential study
Bai, Cheng-Hua; Bridger, Emma K.; Zimmer, Hubert D.; Mecklinger, Axel
2015-01-01
The enhanced memory performance for items that are tested as compared to being restudied (the testing effect) is a frequently reported memory phenomenon. According to the episodic context account of the testing effect, this beneficial effect of testing is related to a process which reinstates the previously learnt episodic information. Few studies have explored the neural correlates of this effect at the time point when testing takes place, however. In this study, we utilized the ERP correlates of successful memory encoding to address this issue, hypothesizing that if the benefit of testing is due to retrieval-related processes at test then subsequent memory effects (SMEs) should resemble the ERP correlates of retrieval-based processing in their temporal and spatial characteristics. Participants were asked to learn Swahili-German word pairs before items were presented in either a testing or a restudy condition. Memory performance was assessed immediately and 1 day later with a cued recall task. Successfully recalling items at test increased the likelihood that items were remembered over time compared to items which were only restudied. An ERP subsequent memory contrast (later remembered vs. later forgotten tested items), which reflects the engagement of processes that ensure items are recallable the next day, was topographically comparable with the ERP correlate of immediate recollection (immediately remembered vs. immediately forgotten tested items). This result shows that the processes which allow items to be more memorable over time share qualitatively similar neural correlates with the processes that relate to successful retrieval at test. This finding supports the notion that testing is more beneficial than restudying on memory performance over time because of its engagement of retrieval processes, such as the re-encoding of actively retrieved memory representations. PMID:26441577
1982-06-01
library packages which support machine dependent physical interfaces, interrupt structures or special devices. Thus, programs and libraries written in...obtains real-time data, makes and implements decisions and receives and originates digital messages. The major equipment items which are appropriate...maintenance. g. Provide digital communications access processing. Each microcomputer can be programmed to perform a specific set of functions using prepared
The Word ("Qara'a") (Read) in the Holy Koran and Pre-Islamic Poetry
ERIC Educational Resources Information Center
Al Deeky, Mahmoud
2016-01-01
This research deals with the verb "qara'a" (read) and with what is derived from or built on it in the Qur'an and pre-Islamic poetry. The research stems from the assumption that this item (read) did not appear in pre-Islamic Arabic in the meaning agreed upon regarding the concept of reading a written text, and what is stated in the Qur'an regarding…
The development of a science process assessment for fourth-grade students
NASA Astrophysics Data System (ADS)
Smith, Kathleen A.; Welliver, Paul W.
In this study, a multiple-choice test entitled the Science Process Assessment was developed to measure the science process skills of students in grade four. Based on the Recommended Science Competency Continuum for Grades K to 6 for Pennsylvania Schools, this instrument measured the skills of (1) observing, (2) classifying, (3) inferring, (4) predicting, (5) measuring, (6) communicating, (7) using space/time relations, (8) defining operationally, (9) formulating hypotheses, (10) experimenting, (11) recognizing variables, (12) interpreting data, and (13) formulating models. To prepare the instrument, classroom teachers and science educators were invited to participate in two science education workshops designed to develop an item bank of test questions applicable to measuring process skill learning. Participants formed writing teams and generated 65 test items representing the 13 process skills. After a comprehensive group critique of each item, 61 items were identified for inclusion into the Science Process Assessment item bank. To establish content validity, the item bank was submitted to a select panel of science educators for the purpose of judging item acceptability. This analysis yielded 55 acceptable test items and produced the Science Process Assessment, Pilot 1. Pilot 1 was administered to 184 fourth-grade students. Students were given a copy of the test booklet; teachers read each test aloud to the students. Upon completion of this first administration, data from the item analysis yielded a reliability coefficient of 0.73. Subsequently, 40 test items were identified for the Science Process Assessment, Pilot 2. Using the test-retest method, the Science Process Assessment, Pilot 2 (Test 1 and Test 2) was administered to 113 fourth-grade students. Reliability coefficients of 0.80 and 0.82, respectively, were ascertained. The correlation between Test 1 and Test 2 was 0.77. 
The results of this study indicate that (1) the Science Process Assessment, Pilot 2, is a valid and reliable instrument applicable to measuring the science process skills of students in grade four, (2) using educational workshops as a means of developing item banks of test questions is viable and productive in the test development process, and (3) involving classroom teachers and science educators in the test development process is educationally efficient and effective.
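The test-retest figure reported above is a correlation between scores from two administrations of the same instrument. A minimal sketch of that computation, assuming the Pearson product-moment correlation (the abstract does not name the coefficient) and using hypothetical scores, might look like this:

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical total scores for the same six students on two administrations.
test_1 = [22, 30, 27, 35, 18, 25]
test_2 = [24, 29, 25, 36, 20, 27]
print(round(pearson_r(test_1, test_2), 2))
```

A value near 1 indicates that students keep roughly the same rank order across the two administrations.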
Michaelides, Michalis P.
2010-01-01
Many studies have investigated the topic of change or drift in item parameter estimates in the context of item response theory (IRT). Content effects, such as instructional variation and curricular emphasis, as well as context effects, such as the wording, position, or exposure of an item, have been found to impact item parameter estimates. The issue becomes more critical when items with estimates exhibiting differential behavior across test administrations are used as common items for deriving equating transformations. This paper reviews the types of effects on IRT item parameter estimates and focuses on the impact of misbehaving or aberrant common items on equating transformations. Implications relating to test validity and the judgmental nature of the decision to keep or discard aberrant common items are discussed, with recommendations for future research into more informed and formal ways of dealing with misbehaving common items. PMID:21833230
Raykov, Tenko; Marcoulides, George A
2016-04-01
The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete nature of the observed items. Two distinct observational equivalence approaches are outlined that render the item response models from corresponding classical test theory-based models, and can each be used to obtain the former from the latter models. Similarly, classical test theory models can be furnished using the reverse application of either of those approaches from corresponding item response models.
Test Anxiety in Written and Oral Examinations
ERIC Educational Resources Information Center
Sparfeldt, Jorn R.; Rost, Detlef H.; Baumeister, Ulrike M.; Christ, Oliver
2013-01-01
The distinction of different test anxiety reactions (e.g., worry, emotionality) is well established. Recently, additional relevance has been given to school-subject-specific test anxiety factors. The present study explored a further aspect concerning the structure of test anxiety experiences, specifically oral versus written examination modes. A…
Locally Dependent Linear Logistic Test Model with Person Covariates
ERIC Educational Resources Information Center
Ip, Edward H.; Smits, Dirk J. M.; De Boeck, Paul
2009-01-01
The article proposes a family of item-response models that allow the separate and independent specification of three orthogonal components: item attribute, person covariate, and local item dependence. Special interest lies in extending the linear logistic test model, which is commonly used to measure item attributes, to tests with embedded item…
Applying Bayesian Item Selection Approaches to Adaptive Tests Using Polytomous Items
ERIC Educational Resources Information Center
Penfield, Randall D.
2006-01-01
This study applied the maximum expected information (MEI) and the maximum posterior-weighted information (MPI) approaches of computer adaptive testing item selection to the case of a test using polytomous items following the partial credit model. The MEI and MPI approaches are described. A simulation study compared the efficiency of ability…
Do Reading Experts Agree with MCAT Verbal Reasoning Item Classifications?
ERIC Educational Resources Information Center
Jackson, Evelyn W.; And Others
1994-01-01
Examined whether expert raters (n=5) could agree about classification of Medical College Admission Test (MCAT) items and whether they agreed with MCAT student manual in labeling skill being measured by each test item. Results revealed difficulties in replicating authors' labeling of skills for reading items on practice test provided with 1991 MCAT…
ACER Chemistry Test Item Collection (ACER CHEMTIC Year 12 Supplement).
ERIC Educational Resources Information Center
Australian Council for Educational Research, Hawthorn.
This publication contains 317 multiple-choice chemistry test items related to topics covered in the Victorian (Australia) Year 12 chemistry course. It allows teachers access to a range of items suitable for diagnostic and achievement purposes, supplementing the ACER Chemistry Test Item Collection--Year 12 (CHEMTIC). The topics covered are: organic…
Differential Item Functioning: Its Consequences. Research Report. ETS RR-10-01
ERIC Educational Resources Information Center
Lee, Yi-Hsuan; Zhang, Jinming
2010-01-01
This report examines the consequences of differential item functioning (DIF) using simulated data. Its impact on total score, item response theory (IRT) ability estimate, and test reliability was evaluated in various testing scenarios created by manipulating the following four factors: test length, percentage of DIF items per form, sample sizes of…
Electronics. Criterion-Referenced Test (CRT) Item Bank.
ERIC Educational Resources Information Center
Davis, Diane, Ed.
This document contains 519 criterion-referenced multiple choice and true or false test items for a course in electronics. The test item bank is designed to work with both the Vocational Instructional Management System (VIMS) and the Vocational Administrative Management System (VAMS) in Missouri. The items are grouped into 15 units covering the…
Auto Mechanics. Criterion-Referenced Test (CRT) Item Bank.
ERIC Educational Resources Information Center
Tannehill, Dana, Ed.
This document contains 546 criterion-referenced multiple choice and true or false test items for a course in auto mechanics. The test item bank is designed to work with both the Vocational Instructional Management System (VIMS) and Vocational Administrative Management System (VAMS) in Missouri. The items are grouped into 35 units covering the…
Developing a Strategy for Using Technology-Enhanced Items in Large-Scale Standardized Tests
ERIC Educational Resources Information Center
Bryant, William
2017-01-01
As large-scale standardized tests move from paper-based to computer-based delivery, opportunities arise for test developers to make use of items beyond traditional selected and constructed response types. Technology-enhanced items (TEIs) have the potential to provide advantages over conventional items, including broadening construct measurement,…
2016-01-01
Purpose: The goal of this study was to characterize the difficulty index of the items in the skills test components of the class I and II Korean emergency medical technician licensing examination (KEMTLE), which requires examinees to select items randomly. Methods: The results of 1,309 class I KEMTLE examinations and 1,801 class II KEMTLE examinations in 2013 were subjected to analysis. Items from the basic and advanced skills test sections of the KEMTLE were compared to determine whether some were significantly more difficult than others. Results: In the class I KEMTLE, all 4 of the items on the basic skills test showed significant variation in difficulty index (P<0.01), as well as 4 of the 5 items on the advanced skills test (P<0.05). In the class II KEMTLE, 4 of the 5 items on the basic skills test showed significantly different difficulty index (P<0.01), as well as all 3 of the advanced skills test items (P<0.01). Conclusion: In the skills test components of the class I and II KEMTLE, the procedure in which examinees randomly select questions should be revised to require examinees to respond to a set of fixed items in order to improve the reliability of the national licensing examination. PMID:26883810
Doig, Emmah; Prescott, Sarah; Fleming, Jennifer; Cornwell, Petrea; Kuipers, Pim
2016-01-01
To examine the internal reliability and test-retest reliability of the Client-Centeredness of Goal Setting (C-COGS) scale. The C-COGS scale was administered to 42 participants with acquired brain injury after completion of multidisciplinary goal planning. Internal reliability of scale items was examined using item-partial total correlations and Cronbach's α coefficient. The scale was readministered within a 1-mo period to a subsample of 12 participants to examine test-retest reliability by calculating exact and close percentage agreement for each item. After examination of item-partial total correlations, test items were revised. The revised items demonstrated stronger internal consistency than the original items. Preliminary evaluation of test-retest reliability was fair, with an average exact percent agreement across all test items of 67%. Findings support the preliminary reliability of the C-COGS scale as a tool to evaluate and promote client-centered goal planning in brain injury rehabilitation. Copyright © 2016 by the American Occupational Therapy Association, Inc.
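The two reliability statistics named in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis: the function names `cronbach_alpha` and `exact_agreement` and the input layout (one list of item scores per respondent) are assumptions made here for the example.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Hypothetical helper: every inner list must have the same length (k items).
    """
    k = len(scores[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the respondents' total scores.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def exact_agreement(first, second):
    """Percentage of paired ratings that are identical on test and retest."""
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)
```

Population variance is used throughout; using sample variance instead would give the same alpha, because the correction factor cancels in the ratio.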
Item-Writing Guidelines for Physics
ERIC Educational Resources Information Center
Regan, Tom
2015-01-01
A teacher learning how to write test questions (test items) will almost certainly encounter item-writing guidelines--lists of item-writing do's and don'ts. Item-writing guidelines usually are presented as applicable across all assessment settings. Table I shows some guidelines that I believe to be generally applicable and two will be briefly…
Unidimensional Interpretations for Multidimensional Test Items
ERIC Educational Resources Information Center
Kahraman, Nilufer
2013-01-01
This article considers potential problems that can arise in estimating a unidimensional item response theory (IRT) model when some test items are multidimensional (i.e., show a complex factorial structure). More specifically, this study examines (1) the consequences of model misfit on IRT item parameter estimates due to unintended minor item-level…
Kisala, Pamela A.; Victorson, David; Pace, Natalie; Heinemann, Allen W.; Choi, Seung W.; Tulsky, David S.
2015-01-01
Objective: To describe the development and psychometric properties of the SCI-QOL Psychological Trauma item bank and short form. Design: Using a mixed-methods design, we developed and tested a Psychological Trauma item bank with patient and provider focus groups, cognitive interviews, and item response theory based analytic approaches, including tests of model fit, differential item functioning (DIF) and precision. Setting: We tested a 31-item pool at several medical institutions across the United States, including the University of Michigan, Kessler Foundation, Rehabilitation Institute of Chicago, the University of Washington, Craig Hospital and the James J. Peters/Bronx Veterans Administration hospital. Participants: A total of 716 individuals with SCI completed the trauma items. Results: The 31 items fit a unidimensional model (CFI=0.952; RMSEA=0.061) and demonstrated good precision (theta range between 0.6 and 2.5). Nine items demonstrated negligible DIF with little impact on score estimates. The final calibrated item bank contains 19 items. Conclusion: The SCI-QOL Psychological Trauma item bank is a psychometrically robust measurement tool from which a short form and a computer adaptive test (CAT) version are available. PMID:26010967
Vaughn, Kalif E; Rawson, Katherine A; Pyc, Mary A
2013-12-01
A wealth of previous research has established that retrieval practice promotes memory, particularly when retrieval is successful. Although successful retrieval promotes memory, it remains unclear whether successful retrieval promotes memory equally well for items of varying difficulty. Will easy items still outperform difficult items on a final test if all items have been correctly recalled equal numbers of times during practice? In two experiments, normatively difficult and easy Lithuanian-English word pairs were learned via test-restudy practice until each item had been correctly recalled a preassigned number of times (from 1 to 11 correct recalls). Despite equating the numbers of successful recalls during practice, performance on a delayed final cued-recall test was lower for difficult than for easy items. Experiment 2 was designed to diagnose whether the disadvantage for difficult items was due to deficits in cue memory, target memory, and/or associative memory. The results revealed a disadvantage for the difficult versus the easy items only on the associative recognition test, with no differences on cue recognition, and even an advantage on target recognition. Although successful retrieval enhanced memory for both difficult and easy items, equating retrieval success during practice did not eliminate normative item difficulty differences.
Test Bias: An Objective Definition for Test Items.
ERIC Educational Resources Information Center
Durovic, Jerry J.
A test bias definition, applicable at the item-level of a test is presented. The definition conceptually equates test bias with measuring different things in different groups, and operationally equates test bias with a difference in item fit to the Rasch Model, greater than one, between groups. It is suggested that the proposed definition avoids…
2013-01-01
Background: Despite the widespread use of multiple-choice assessments in medical education assessment, current practice and published advice concerning the number of response options remains equivocal. This article describes an empirical study contrasting the quality of three 60-item multiple-choice test forms within the Royal Australian and New Zealand College of Obstetricians and Gynaecologists (RANZCOG) Fetal Surveillance Education Program (FSEP). The three forms are described below. Methods: The first form featured four response options per item. The second form featured three response options, having removed the least functioning option from each item in the four-option counterpart. The third test form was constructed by retaining the best performing version of each item from the first two test forms. It contained both three and four option items. Results: Psychometric and educational factors were taken into account in formulating an approach to test construction for the FSEP. The four-option test performed better than the three-option test overall, but some items were improved by the removal of options. The mixed-option test demonstrated better measurement properties than the fixed-option tests, and has become the preferred test format in the FSEP program. The criteria used were reliability, errors of measurement and fit to the item response model. Conclusions: The position taken is that decisions about the number of response options be made at the item level, with plausible options being added to complete each item on both psychometric and educational grounds rather than complying with a uniform policy. The point is to construct the better performing item in providing the best psychometric and educational information. PMID:23453056
Zoanetti, Nathan; Beaves, Mark; Griffin, Patrick; Wallace, Euan M
2013-03-04
Despite the widespread use of multiple-choice assessments in medical education assessment, current practice and published advice concerning the number of response options remains equivocal. This article describes an empirical study contrasting the quality of three 60 item multiple-choice test forms within the Royal Australian and New Zealand College of Obstetricians and Gynaecologists (RANZCOG) Fetal Surveillance Education Program (FSEP). The three forms are described below. The first form featured four response options per item. The second form featured three response options, having removed the least functioning option from each item in the four-option counterpart. The third test form was constructed by retaining the best performing version of each item from the first two test forms. It contained both three and four option items. Psychometric and educational factors were taken into account in formulating an approach to test construction for the FSEP. The four-option test performed better than the three-option test overall, but some items were improved by the removal of options. The mixed-option test demonstrated better measurement properties than the fixed-option tests, and has become the preferred test format in the FSEP program. The criteria used were reliability, errors of measurement and fit to the item response model. The position taken is that decisions about the number of response options be made at the item level, with plausible options being added to complete each item on both psychometric and educational grounds rather than complying with a uniform policy. The point is to construct the better performing item in providing the best psychometric and educational information.
Test anxiety and academic performance in chiropractic students.
Zhang, Niu; Henderson, Charles N R
2014-01-01
Objective: We assessed the level of students' test anxiety, and the relationship between test anxiety and academic performance. Methods: We recruited 166 third-quarter students. The Test Anxiety Inventory (TAI) was administered to all participants. Total scores from written examinations and objective structured clinical examinations (OSCEs) were used as response variables. Results: Multiple regression analysis shows that there was a modest, but statistically significant negative correlation between TAI scores and written exam scores, but not OSCE scores. Worry and emotionality were the best predictive models for written exam scores. Mean total anxiety and emotionality scores for females were significantly higher than those for males, but not worry scores. Conclusion: Moderate-to-high test anxiety was observed in 85% of the chiropractic students examined. However, total test anxiety, as measured by the TAI score, was a very weak predictive model for written exam performance. Multiple regression analysis demonstrated that replacing total anxiety (TAI) with worry and emotionality (TAI subscales) produces a much more effective predictive model of written exam performance. Sex, age, highest current academic degree, and ethnicity contributed little additional predictive power in either regression model. Moreover, TAI scores were not found to be statistically significant predictors of physical exam skill performance, as measured by OSCEs.
Detecting Gender Bias Through Test Item Analysis
NASA Astrophysics Data System (ADS)
González-Espada, Wilson J.
2009-03-01
Many physical science and physics instructors might not be trained in pedagogically appropriate test construction methods. This could lead to test items that do not measure what they are intended to measure. A subgroup of these items might show bias against some groups of students. This paper describes how the author became aware of potentially biased items against females in his examinations, which led to the exploration of fundamental issues related to item validity, gender bias, and differential item functioning, or DIF. A brief discussion of DIF in the context of university courses, as well as practical suggestions to detect possible gender-biased items, follows.
Life without Scan-Tron: Tests as Thinking.
ERIC Educational Resources Information Center
Posner, Richard
1987-01-01
Claims that written tests are superior to objective, scan-tron tests in literature, composition, and vocabulary because they require students to think on paper. Describes the following types of in-class written tests and examines the advantages of each: literary essay, topical composition, imitation, brief answer, timed rewrites, and vocabulary…
Estimating Total-test Scores from Partial Scores in a Matrix Sampling Design.
ERIC Educational Resources Information Center
Sachar, Jane; Suppes, Patrick
It is sometimes desirable to obtain an estimated total-test score for an individual who was administered only a subset of the items in a total test. The present study compared six methods, two of which utilize the content structure of items, to estimate total-test scores using 450 students in grades 3-5 and 60 items of the 110-item Stanford Mental…
Differential item functioning analysis of the Vanderbilt Expertise Test for cars
Lee, Woo-Yeol; Cho, Sun-Joo; McGugin, Rankin W.; Van Gulick, Ana Beth; Gauthier, Isabel
2015-01-01
The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge. PMID:26418499
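For readers unfamiliar with the three-parameter logistic (3PL) model mentioned in the abstract above, the item response function can be written down directly. The sketch below is a generic textbook form, not the authors' code; the function name `p_3pl` and the parameter names `a` (discrimination), `b` (difficulty) and `c` (lower asymptote, or guessing) are assumptions chosen here, and the optional scaling constant D = 1.7 is omitted.

```python
import math


def p_3pl(theta, a, b, c):
    """Probability of a correct response under a 3PL item response model.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: lower asymptote (chance of success for very low ability).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
```

When ability equals the item difficulty, the probability sits halfway between `c` and 1, which is why a nonzero `c` shifts the curve upward relative to a two-parameter model.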
ERIC Educational Resources Information Center
Gattamorta, Karina A.; Penfield, Randall D.; Myers, Nicholas D.
2012-01-01
Measurement invariance is a common consideration in the evaluation of the validity and fairness of test scores when the tested population contains distinct groups of examinees, such as examinees receiving different forms of a translated test. Measurement invariance in polytomous items has traditionally been evaluated at the item-level,…
Science Library of Test Items. Volume Two.
ERIC Educational Resources Information Center
New South Wales Dept. of Education, Sydney (Australia).
The second volume of test items in the Science Library of Test Items is intended as a resource to assist teachers in implementing and evaluating science courses in the first 4 years of Australian secondary school. The items were selected from questions submitted to the School Certificate Development Unit by teachers in New South Wales. Only the…
Measuring the Instructional Sensitivity of ESL Reading Comprehension Items.
ERIC Educational Resources Information Center
Brutten, Sheila R.; And Others
A study attempted to estimate the instructional sensitivity of items in three reading comprehension tests in English as a second language (ESL). Instructional sensitivity is a test-item construct defined as the tendency for a test item to vary in difficulty as a function of instruction. Similar tasks were given to readers at different proficiency…
Reducing the Impact of Inappropriate Items on Reviewable Computerized Adaptive Testing
ERIC Educational Resources Information Center
Yen, Yung-Chin; Ho, Rong-Guey; Liao, Wen-Wei; Chen, Li-Ju
2012-01-01
In a test, the score would be closer to the examinee's actual ability if careless mistakes were corrected. In computerized adaptive testing (CAT), however, changing the answer to one item might make the items that follow no longer appropriate for estimating the examinee's ability. These inappropriate items in a reviewable CAT might in turn introduce bias in ability…
ERIC Educational Resources Information Center
Lau, C. Allen; Wang, Tianyou
The purposes of this study were to: (1) extend the sequential probability ratio testing (SPRT) procedure to polytomous item response theory (IRT) models in computerized classification testing (CCT); (2) compare polytomous items with dichotomous items using the SPRT procedure for their accuracy and efficiency; (3) study a direct approach in…
A Conditional Exposure Control Method for Multidimensional Adaptive Testing
ERIC Educational Resources Information Center
Finkelman, Matthew; Nering, Michael L.; Roussos, Louis A.
2009-01-01
In computerized adaptive testing (CAT), ensuring the security of test items is a crucial practical consideration. A common approach to reducing item theft is to define maximum item exposure rates, i.e., to limit the proportion of examinees to whom a given item can be administered. Numerous methods for controlling exposure rates have been proposed…
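The exposure rate defined in the abstract above (the proportion of examinees to whom a given item is administered) is straightforward to monitor. The sketch below only computes observed rates and flags items above a chosen ceiling; it does not implement the conditional control method the article proposes. The function names and the data layout (one list of item identifiers per examinee) are assumptions for illustration.

```python
from collections import Counter


def exposure_rates(tests):
    """Observed exposure rate per item.

    tests: one list of administered item ids per examinee (hypothetical layout).
    An item counts once per examinee even if it appears twice in a list.
    """
    counts = Counter(item for test in tests for item in set(test))
    n_examinees = len(tests)
    return {item: count / n_examinees for item, count in counts.items()}


def overexposed(tests, r_max):
    """Item ids whose observed exposure rate exceeds the ceiling r_max."""
    rates = exposure_rates(tests)
    return sorted(item for item, rate in rates.items() if rate > r_max)
```

For example, with four examinees and item "a" administered to three of them, `overexposed(tests, 0.5)` would report "a" because its rate of 0.75 exceeds the ceiling.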
ERIC Educational Resources Information Center
Downing, Steven M.; Maatsch, Jack L.
To test the effect of clinically relevant multiple-choice item content on the validity of statistical discriminations of physicians' clinical competence, data were collected from a field test of the Emergency Medicine Examination, test items for the certification of specialists in emergency medicine. Two 91-item multiple-choice subscales were…
The Effect of Including or Excluding Students with Testing Accommodations on IRT Calibrations.
ERIC Educational Resources Information Center
Karkee, Thakur; Lewis, Dan M.; Barton, Karen; Haug, Carolyn
This study aimed to determine the degree to which the inclusion of accommodated students with disabilities in the calibration sample affects the characteristics of item parameters and the test results. Investigated were effects on test reliability, item fit to the applicable item response theory (IRT) model, item parameter estimates, and students'…
Three controversies over item disclosure in medical licensure examinations.
Park, Yoon Soo; Yang, Eunbae B
2015-01-01
In response to views on the public's right to know, there is growing attention to item disclosure - release of items, answer keys, and performance data to the public - in medical licensure examinations and its potential impact on the test's ability to measure competence and select qualified candidates. Recent debates on this issue have sparked legislative action internationally, including in South Korea, with prior discussions among North American countries dating back over three decades. The purpose of this study is to identify and analyze three issues associated with item disclosure in medical licensure examinations - 1) fairness and validity, 2) impact on passing levels, and 3) utility of item disclosure - by synthesizing existing literature in relation to standards in testing. Historically, the controversy over item disclosure has centered on fairness and validity. Proponents of item disclosure stress test takers' right to know, while opponents argue from a validity perspective. Item disclosure may bias item characteristics, such as difficulty and discrimination, and has consequences for setting passing levels. To date, there has been limited research on the utility of item disclosure for large scale testing. These issues require ongoing and careful consideration.
Online Calibration of Polytomous Items Under the Generalized Partial Credit Model
Zheng, Yi
2016-01-01
Online calibration is a technology-enhanced architecture for item calibration in computerized adaptive tests (CATs). Many CATs are administered continuously over a long term and rely on large item banks. To ensure test validity, these item banks need to be frequently replenished with new items, and these new items need to be pretested before being used operationally. Online calibration dynamically embeds pretest items in operational tests and calibrates their parameters as response data are gradually obtained through the continuous test administration. This study extends existing formulas, procedures, and algorithms for dichotomous item response theory models to the generalized partial credit model, a popular model for items scored in more than two categories. A simulation study was conducted to investigate the developed algorithms and procedures under a variety of conditions, including two estimation algorithms, three pretest item selection methods, three seeding locations, two numbers of score categories, and three calibration sample sizes. Results demonstrated acceptable estimation accuracy of the two estimation algorithms in some of the simulated conditions. A variety of findings were also revealed for the interaction effects of the included factors, and corresponding recommendations were made. PMID:29881063
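As background for the generalized partial credit model (GPCM) named in the abstract above, the category probabilities for a single item can be sketched as follows. This is the standard textbook form of the model, not the calibration algorithms developed in the study; the function name `gpcm_probs` and the parameter names `a` (discrimination) and `steps` (step difficulties) are assumptions made here.

```python
import math


def gpcm_probs(theta, a, steps):
    """Category probabilities for one GPCM item.

    theta: examinee ability; a: item discrimination;
    steps: step difficulties b_1..b_m, giving score categories 0..m.
    Returns a list of m + 1 probabilities that sum to 1.
    """
    # Cumulative logits: z_0 = 0 and z_k = sum over v <= k of a * (theta - b_v).
    z = [0.0]
    for b in steps:
        z.append(z[-1] + a * (theta - b))
    # Subtract the maximum before exponentiating for numerical stability.
    z_max = max(z)
    exp_z = [math.exp(v - z_max) for v in z]
    total = sum(exp_z)
    return [e / total for e in exp_z]
```

With a single step difficulty the model reduces to the two-parameter logistic model, which is a convenient sanity check on any implementation.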
Evaluating Statistical Targets for Assembling Parallel Mixed-Format Test Forms
ERIC Educational Resources Information Center
Debeer, Dries; Ali, Usama S.; van Rijn, Peter W.
2017-01-01
Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…
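The two statistical targets named in the abstract above can be illustrated for a simple case. The sketch assumes dichotomous items under a two-parameter logistic (2PL) model, which is a simplification chosen here; the mixed-format forms studied in the article would also need a polytomous model. The function names and the `(a, b)` tuple layout are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def tcc(theta, items):
    """Test characteristic curve: expected number-correct score at theta.

    items: list of (a, b) tuples, one per item.
    """
    return sum(p_2pl(theta, a, b) for a, b in items)


def tif(theta, items):
    """Test information function: sum of the 2PL item information values."""
    total = 0.0
    for a, b in items:
        p = p_2pl(theta, a, b)
        total += a * a * p * (1.0 - p)
    return total
```

Two candidate forms could then be compared by evaluating `tcc` or `tif` for each form over a grid of ability values and inspecting the differences.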
10 CFR 55.43 - Written examination: Senior operators.
Code of Federal Regulations, 2010 CFR
2010-01-01
... Written examination: Senior operators. Section 55.43... Tests § 55.43 Written examination: Senior operators. (a) Content. The written examination for a senior... needed to perform licensed senior operator duties. The knowledge, skills, and abilities will be...
Nickel and cobalt release from jewellery and metal clothing items in Korea.
Cheong, Seung Hyun; Choi, You Won; Choi, Hae Young; Byun, Ji Yeon
2014-01-01
In Korea, the prevalence of nickel allergy has shown a sharply increasing trend. Cobalt contact allergy is often associated with concomitant reactions to nickel, and is more common in Korea than in western countries. The aim of the present study was to investigate the prevalence of items that release nickel and cobalt on the Korean market. A total of 471 items that included 193 branded jewellery, 202 non-branded jewellery and 76 metal clothing items were sampled and studied with a dimethylglyoxime (DMG) test and a cobalt spot test to detect nickel and cobalt release, respectively. Nickel release was detected in 47.8% of the tested items. The positive rates in the DMG test were 12.4% for the branded jewellery, 70.8% for the non-branded jewellery, and 76.3% for the metal clothing items. Cobalt release was found in 6.2% of items. Among the types of jewellery, belts and hair pins showed higher positive rates in both the DMG test and the cobalt spot test. Our study shows that the prevalence of items that release nickel or cobalt among jewellery and metal clothing items is high in Korea. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
The Role of Item Feedback in Self-Adapted Testing.
ERIC Educational Resources Information Center
Roos, Linda L.; And Others
1997-01-01
The importance of item feedback in self-adapted testing was studied by comparing feedback and no feedback conditions for computerized adaptive tests and self-adapted tests taken by 363 college students. Results indicate that item feedback is not necessary to realize score differences between self-adapted and computerized adaptive testing. (SLD)
Criterion-Referenced Test Items for Auto Body.
ERIC Educational Resources Information Center
Tannehill, Dana, Ed.
This test item bank on auto body repair contains criterion-referenced test questions based upon competencies found in the Missouri Auto Body Competency Profile. Some test items are keyed for multiple competencies. The tests cover the following 26 competency areas in the auto body curriculum: auto body careers; measuring and mixing; tools and…
Automated Test-Form Generation
ERIC Educational Resources Information Center
van der Linden, Wim J.; Diao, Qi
2011-01-01
In automated test assembly (ATA), the methodology of mixed-integer programming is used to select test items from an item bank to meet the specifications for a desired test form and optimize its measurement accuracy. The same methodology can be used to automate the formatting of the set of selected items into the actual test form. Three different…
ERIC Educational Resources Information Center
Villarreal, Victor
2015-01-01
The Woodcock-Johnson IV Tests of Achievement (WJ IV ACH; Schrank, Mather, & McGrew, 2014a) is an individually administered measure containing tests of reading, mathematics, written language, and academic knowledge. Areas of reading, mathematics, and written language each include tests of basic skills, fluency, and application. Academic…
ERIC Educational Resources Information Center
Kouimanos, John, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection was reviewed for content validity and reliability. The…
Ow, Yen Ling Mandy; Thumboo, Julian; Cella, David; Cheung, Yin Bun; Yong Fong, Kok; Wee, Hwee Lin
2011-06-01
To identify health-related quality of life (HRQOL) domains of importance to multiethnic Asian systemic lupus erythematosus (SLE) patients, to identify content gaps in existing SLE-specific HRQOL measures, and to determine whether the Patient-Reported Outcomes Measurement Information System (PROMIS) item banks could serve as a core set of questions for HRQOL assessment among SLE patients. English-speaking patients with physician-diagnosed SLE from a specialist clinic in a tertiary care hospital in Singapore and a patient support group were recruited. Thematic analysis was performed to distill themes from transcripts through open coding by 2 independent coders and axial coding for refinement of categories. Items from 3 existing SLE-specific measures and PROMIS Version 1.0 Item Banks were compared with identified subthemes. Twenty-seven female and 2 male participants (21 Chinese, 4 Malay, 3 Indian, 1 other) ages 23-62 years participated in 6 focus groups and 2 individual interviews, respectively. Twenty-one domains and 92 subthemes were identified. Domains of family, relationships, stigma and discrimination, and freedom were unaddressed by existing SLE-specific measures. Forty subthemes from 14 domains were addressed by the PROMIS Version 1.0 Item Banks (Physical Function, Pain, Fatigue, Sleep Disturbance, Sleep-Related Impairment, Anger, Anxiety, and Depression banks). Family and stigma and discrimination (identified as content gaps) may be accentuated in the Asian sociocultural context. PROMIS item banks have tremendous potential to serve as a core set of items for HRQOL assessment in SLE patients. Additional items may be written to fill the gaps in existing PROMIS item banks. Copyright © 2011 by the American College of Rheumatology.
Dalton, Megan; Davidson, Megan; Keating, Jenny
2011-01-01
Is the Assessment of Physiotherapy Practice (APP) a valid instrument for the assessment of entry-level competence in physiotherapy students? Cross-sectional study with Rasch analysis of initial (n=326) and validation samples (n=318). Students were assessed on completion of 4, 5, or 6-week clinical placements across one university semester. 298 clinical educators and 456 physiotherapy students at nine universities in Australia and New Zealand provided 644 completed APP instruments. APP data in both samples showed overall fit to a Rasch model of expected item functioning for interval scale measurement. Item 6 (Written communication) exhibited misfit in both samples, but was retained as an important element of competence. The hierarchy of item difficulty was the same in both samples with items related to professional behaviour and communication the easiest to achieve and items related to clinical reasoning the most difficult. Item difficulty was well targeted to person ability. No Differential Item Functioning was identified, indicating that the scale performed in a comparable way regardless of the student's age, gender or amount of prior clinical experience, and the educator's age, gender, or experience as an educator, or the type of facility, university, or clinical area. The instrument demonstrated unidimensionality confirming the appropriateness of summing the scale scores on each item to provide an overall score of clinical competence and was able to discriminate four levels of professional competence (Person Separation Index=0.96). Person ability and raw APP scores had a linear relationship (r²=0.99). Rasch analysis supports the interpretation that a student's APP score is an indication of their underlying level of professional competence in workplace practice. Copyright © 2011 Australian Physiotherapy Association. Published by .. All rights reserved.
Comparison of Written and Oral Examinations in a Baccalaureate Medical-Surgical Nursing Course.
ERIC Educational Resources Information Center
Rushton, Patricia; Eggett, Dennis
2003-01-01
Of four groups of medical-surgical nurses, 55 took one final and three midterm written exams, 150 took one each (written), 45 took an oral final, 92 took both written and oral, and 47 took a written test with licensure questions and an oral final. Oral exams resulted in higher scores, more effective study habits, and increased application. (SK)
Solving the measurement invariance anchor item problem in item response theory.
Meade, Adam W; Wright, Natalie A
2012-09-01
The efficacy of tests of differential item functioning (measurement invariance) has been well established. It is clear that when properly implemented, these tests can successfully identify differentially functioning (DF) items when they exist. However, an assumption of these analyses is that the metric for different groups is linked using anchor items that are invariant. In practice, however, it is impossible to be certain which items are DF and which are invariant. This problem of anchor items, or referent indicators, has long plagued invariance research, and a multitude of suggested approaches have been put forth. Unfortunately, the relative efficacy of these approaches has not been tested. This study compares 11 variations on 5 qualitatively different approaches from recent literature for selecting optimal anchor items. A large-scale simulation study indicates that for nearly all conditions, an easily implemented 2-stage procedure recently put forth by Lopez Rivas, Stark, and Chernyshenko (2009) provided optimal power while maintaining nominal Type I error. With this approach, appropriate anchor items can be easily and quickly located, resulting in more efficacious invariance tests. Recommendations for invariance testing are illustrated using a pedagogical example of employee responses to an organizational culture measure.
When Listening Is Better Than Reading: Performance Gains on Cardiac Auscultation Test Questions.
Short, Kathleen; Bucak, S Deniz; Rosenthal, Francine; Raymond, Mark R
2018-05-01
In 2007, the United States Medical Licensing Examination embedded multimedia simulations of heart sounds into multiple-choice questions. This study investigated changes in item difficulty as determined by examinee performance over time. The data reflect outcomes obtained following initial use of multimedia items from 2007 through 2012, after which an interface change occurred. A total of 233,157 examinees responded to 1,306 cardiology test items over the six-year period; 138 items included multimedia simulations of heart sounds, while 1,168 text-based items without multimedia served as controls. The authors compared changes in difficulty of multimedia items over time with changes in difficulty of text-based cardiology items over time. Further, they compared changes in item difficulty for both groups of items between graduates of Liaison Committee on Medical Education (LCME)-accredited and non-LCME-accredited (i.e., international) medical schools. Examinee performance on cardiology test items with multimedia heart sounds improved by 12.4% over the six-year period, while performance on text-based cardiology items improved by approximately 1.4%. These results were similar for graduates of LCME-accredited and non-LCME-accredited medical schools. Examinees' ability to interpret auscultation findings in test items that include multimedia presentations increased from 2007 to 2012.
Revisiting the role of recollection in item versus forced-choice recognition memory.
Cook, Gabriel I; Marsh, Richard L; Hicks, Jason L
2005-08-01
Many memory theorists have assumed that forced-choice recognition tests can rely more on familiarity, whereas item (yes-no) tests must rely more on recollection. In actuality, several studies have found no differences in the contributions of recollection and familiarity underlying the two different test formats. Using word frequency to manipulate stimulus characteristics, the present study demonstrated that the contribution of recollection to item versus forced-choice tests is variable. Low word frequency resulted in significantly more recollection in an item test than in a forced-choice procedure, but high word frequency produced the opposite result. These results clearly constrain any uniform claim about the degree to which recollection supports responding in item versus forced-choice tests.
A Comparison of Methods of Vertical Equating.
ERIC Educational Resources Information Center
Loyd, Brenda H.; Hoover, H. D.
Rasch model vertical equating procedures were applied to three mathematics computation tests for grades six, seven, and eight. Each level of the test was composed of 45 items in three sets of 15 items, arranged in such a way that tests for adjacent grades had two sets (30 items) in common, and the sixth and eighth grades had 15 items in common. In…
ERIC Educational Resources Information Center
Zebehazy, Kim T.; Zigmond, Naomi; Zimmerman, George J.
2012-01-01
Introduction: This study investigated differential item functioning (DIF) of test items on Pennsylvania's Alternate System of Assessment (PASA) for students with visual impairments and severe cognitive disabilities and what the reasons for the differences may be. Methods: The Wilcoxon signed ranks test was used to analyze differences in the scores…
Objective and Item Banking Computer Software and Its Use in Comprehensive Achievement Monitoring.
ERIC Educational Resources Information Center
Schriber, Peter E.; Gorth, William P.
The current emphasis on objectives and test item banks for constructing more effective tests is being augmented by increasingly sophisticated computer software. Items can be catalogued in numerous ways for retrieval. The items as well as instructional objectives can be stored and test forms can be selected and printed by the computer. It is also…
Flight Engineer. Question Book. Expires September 1, 1991.
ERIC Educational Resources Information Center
Federal Aviation Administration (DOT), Washington, DC.
This question book was developed by the Federal Aviation Administration (FAA) to be used by FAA testing centers and FAA-designated written test examiners when administering the flight engineer written test. The book can be used to test applicants in the following flight engineer knowledge areas: basic, turbojet powered, turbopropeller powered, and…
An Item-Driven Adaptive Design for Calibrating Pretest Items. Research Report. ETS RR-14-38
ERIC Educational Resources Information Center
Ali, Usama S.; Chang, Hua-Hua
2014-01-01
Adaptive testing is advantageous in that it provides more efficient ability estimates with fewer items than linear testing does. Item-driven adaptive pretesting may also offer similar advantages, and verification of such a hypothesis about item calibration was the main objective of this study. A suitability index (SI) was introduced to adaptively…
Fitting the Rasch Model to Account for Variation in Item Discrimination
ERIC Educational Resources Information Center
Weitzman, R. A.
2009-01-01
Building on the Kelley and Gulliksen versions of classical test theory, this article shows that a logistic model having only a single item parameter can account for varying item discrimination, as well as difficulty, by using item-test correlations to adjust incorrect-correct (0-1) item responses prior to an initial model fit. The fit occurs…
Weighted Maximum-a-Posteriori Estimation in Tests Composed of Dichotomous and Polytomous Items
ERIC Educational Resources Information Center
Sun, Shan-Shan; Tao, Jian; Chang, Hua-Hua; Shi, Ning-Zhong
2012-01-01
For mixed-type tests composed of dichotomous and polytomous items, polytomous items often yield more information than dichotomous items. To reflect the difference between the two types of items and to improve the precision of ability estimation, an adaptive weighted maximum-a-posteriori (WMAP) estimation is proposed. To evaluate the performance of…
ERIC Educational Resources Information Center
Sengul Avsar, Asiye; Tavsancil, Ezel
2017-01-01
This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three samples sizes (100, 250 and 500)--were generated by conducting 20…
Rasch Measurement and Item Banking: Theory and Practice.
ERIC Educational Resources Information Center
Nakamura, Yuji
The Rasch Model is a one-parameter item response theory model which states that the probability of a correct response on a test is a function of the difficulty of the item and the ability of the candidate. Item banking is useful for language testing. The Rasch Model provides estimates of item difficulties that are meaningful,…
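The relationship stated in this abstract can be sketched in a few lines of Python. This is a minimal illustration of the standard Rasch (one-parameter logistic) form, not code from the cited document; the function name `rasch_probability` and the example values are assumptions chosen for illustration.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch model.

    Ability and difficulty are assumed to be on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# A candidate whose ability equals the item difficulty has an even chance.
p_equal = rasch_probability(0.0, 0.0)

# For the same candidate, an easier item (lower difficulty) is more likely
# to be answered correctly than a harder one.
p_easy = rasch_probability(0.0, -1.0)
p_hard = rasch_probability(0.0, 1.0)
```

Because only the difference between ability and difficulty enters the formula, items calibrated on a common scale can be stored in a bank and compared directly, which is the property that item banking relies on.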
Maternity and parental leave policies at COTH hospitals: an update. Council of Teaching Hospitals.
Philibert, I; Bickel, J
1995-11-01
Because residents' demands for parental leave are increasing, updated information about maternity and paternity leave policies was solicited from hospitals that are members of the Council of Teaching Hospitals (COTH) of the AAMC. A 20-item questionnaire, combining forced-choice categories and open-ended questions, was faxed to 405 COTH hospitals in October 1994; 45% responded. A total of 77% of the respondents reported having written policies for maternity and/or parental leave; in 1989, only 52% of COTH hospitals had reported having such policies. Forty-one percent of the 1994 responding hospitals offered dedicated paid maternity leave, with a mean of 42 days allowed. Twenty-five percent of the respondents offered paternity leave, and 15% offered adoption leave. It is encouraging that the majority of the teaching hospitals that responded to the survey had adopted written policies, but the 23% without written policies remain a source of concern. Well-defined policies for maternity, paternity, and adoption leave can reduce stress and foster equity both for trainees requiring leave and for their colleagues.
Test Design Project: Studies in Test Bias. Annual Report.
ERIC Educational Resources Information Center
McArthur, David
Item bias in a multiple-choice test can be detected by appropriate analyses of the persons x items scoring matrix. This permits comparison of groups of examinees tested with the same instrument. The test may be biased if it is not measuring the same thing in comparable groups, if groups are responding to different aspects of the test items, or if…
ERIC Educational Resources Information Center
Truell, Allen D.; Zhao, Jensen J.; Alexander, Melody W.
2005-01-01
The purposes of this study were to determine if there is a significant difference in postsecondary business student scores and test completion time based on settable test item exposure control interface format, and to determine if there is a significant difference in student scores and test completion time based on settable test item exposure…
Estimating Total-Test Scores from Partial Scores in a Matrix Sampling Design.
ERIC Educational Resources Information Center
Sachar, Jane; Suppes, Patrick
1980-01-01
The present study compared six methods, two of which utilize the content structure of items, to estimate total-test scores using 450 students and 60 items of the 110-item Stanford Mental Arithmetic Test. Three methods yielded fairly good estimates of the total-test score. (Author/RL)
Malmström, Marlene; Ivarsson, Bodil; Klefsgård, Rosemarie; Persson, Kerstin; Jakobsson, Ulf; Johansson, Jan
2016-12-01
Following oesophagectomy, a major surgical procedure, it is known that patients suffer from severely reduced quality of life and have an unmet need for postoperative support. Still, there is a lack of research testing interventions aiming to enhance the patients' life situation after this surgical procedure. The aim of the study was to evaluate the effect of a nurse-led telephone supportive care programme on quality of life (QOL), received information and the number of healthcare contacts compared to conventional care following oesophageal resection for cancer. The study was designed as a randomized controlled trial (RCT) aiming to test the effect of a nurse-led telephone supportive care programme compared to conventional care. Patient assessments were conducted at discharge and at 2 weeks and 2, 4 and 6 months after discharge, and comprised evaluation of QOL, received information and the number of healthcare contacts. Statistical testing was conducted with repeated measurements analysis of variance to test whether there were differences between the groups during follow-up. The results show that the intervention group was significantly more satisfied with received information for the items concerning information about things to do to help yourself and written information, and for the global information score. The control group scored significantly higher on the items regarding the wish to receive more information and the wish to receive less information. No effect of the intervention was shown on QOL or number of healthcare contacts. Proactive nurse-led telephone follow-up has a significant positive impact on the patients' experience of received information. This is likely to have a positive effect on their ability to cope with a life that may include remaining side effects and adverse symptoms for a long time after surgery. Copyright © 2016 Elsevier Ltd. All rights reserved.
Hift, Richard J
2014-11-28
Written assessments fall into two classes: constructed-response or open-ended questions, such as the essay and a number of variants of the short-answer question, and selected-response or closed-ended questions; typically in the form of multiple-choice. It is widely believed that constructed response written questions test higher order cognitive processes in a manner that multiple-choice questions cannot, and consequently have higher validity. An extensive review of the literature suggests that in summative assessment neither premise is evidence-based. Well-structured open-ended and multiple-choice questions appear equivalent in their ability to assess higher cognitive functions, and performance in multiple-choice assessments may correlate more highly than the open-ended format with competence demonstrated in clinical practice following graduation. Studies of construct validity suggest that both formats measure essentially the same dimension, at least in mathematics, the physical sciences, biology and medicine. The persistence of the open-ended format in summative assessment may be due to the intuitive appeal of the belief that synthesising an answer to an open-ended question must be both more cognitively taxing and similar to actual experience than is selecting a correct response. I suggest that cognitive-constructivist learning theory would predict that a well-constructed context-rich multiple-choice item represents a complex problem-solving exercise which activates a sequence of cognitive processes which closely parallel those required in clinical practice, hence explaining the high validity of the multiple-choice format. The evidence does not support the proposition that the open-ended assessment format is superior to the multiple-choice format, at least in exit-level summative assessment, in terms of either its ability to test higher-order cognitive functioning or its validity. 
This is explicable using a theory of mental models, which might predict that the multiple-choice format will have higher validity, a statement for which some empiric support exists. Given the superior reliability and cost-effectiveness of the multiple-choice format, consideration should be given to phasing out open-ended format questions in summative assessment. Whether the same applies to non-exit-level assessment and formative assessment is a question which remains to be answered, particularly in terms of the educational effect of testing, an area which deserves intensive study.
ERIC Educational Resources Information Center
Penfield, Randall D.; Algina, James
2006-01-01
One approach to measuring unsigned differential test functioning is to estimate the variance of the differential item functioning (DIF) effect across the items of the test. This article proposes two estimators of the DIF effect variance for tests containing dichotomous and polytomous items. The proposed estimators are direct extensions of the…
Smolen, Tomasz; Chuderski, Adam
2015-01-01
Fluid intelligence (Gf) is a crucial cognitive ability that involves abstract reasoning in order to solve novel problems. Recent research demonstrated that Gf strongly depends on the individual effectiveness of working memory (WM). We investigated a popular claim that if the storage capacity underlay the WM-Gf correlation, then such a correlation should increase with an increasing number of items or rules (load) in a Gf-test. As often no such link is observed, on that basis the storage-capacity account is rejected, and alternative accounts of Gf (e.g., related to executive control or processing speed) are proposed. Using both analytical inference and numerical simulations, we demonstrated that the load-dependent change in correlation is primarily a function of the amount of floor/ceiling effect for particular items. Thus, the item-wise WM correlation of a Gf-test depends on its overall difficulty, and the difficulty distribution across its items. When the early test items yield huge ceiling, but the late items do not approach floor, that correlation will increase throughout the test. If the early items locate themselves between ceiling and floor, but the late items approach floor, the respective correlation will decrease. For a hallmark Gf-test, the Raven-test, whose items span from ceiling to floor, the quadratic relationship is expected, and it was shown empirically using a large sample and two types of WMC tasks. In consequence, no changes in correlation due to varying WM/Gf load, or lack of them, can yield an argument for or against any theory of WM/Gf. Moreover, as the mathematical properties of the correlation formula make it relatively immune to ceiling/floor effects for overall moderate correlations, only minor changes (if any) in the WM-Gf correlation should be expected for many psychological tests.
Item response theory analysis of the mechanics baseline test
NASA Astrophysics Data System (ADS)
Cardamone, Caroline N.; Abbott, Jonathan E.; Rayyan, Saif; Seaton, Daniel T.; Pawl, Andrew; Pritchard, David E.
2012-02-01
Item response theory is useful in both the development and evaluation of assessments and in computing standardized measures of student performance. In item response theory, individual parameters (difficulty, discrimination) for each item or question are fit by item response models. These parameters provide a means for evaluating a test and offer a better measure of student skill than a raw test score, because each skill calculation considers not only the number of questions answered correctly, but the individual properties of all questions answered. Here, we present the results from an analysis of the Mechanics Baseline Test given at MIT during 2005-2010. Using the item parameters, we identify questions on the Mechanics Baseline Test that are not effective in discriminating between MIT students of different abilities. We show that a limited subset of the highest quality questions on the Mechanics Baseline Test returns accurate measures of student skill. We compare student skills as determined by item response theory to the more traditional measurement of the raw score and show that a comparable measure of learning gain can be computed.
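The role of the two item parameters named in this abstract can be illustrated with the common two-parameter logistic (2PL) form. The abstract does not state which item response model the authors fitted, so the 2PL form, the function name `two_pl_probability`, and the parameter values below are assumptions used only as a sketch.

```python
import math


def two_pl_probability(skill: float, discrimination: float, difficulty: float) -> float:
    """Probability of a correct answer under a 2PL item response function."""
    return 1.0 / (1.0 + math.exp(-discrimination * (skill - difficulty)))


# Two hypothetical students, one below and one above the item difficulty.
weaker, stronger = -1.0, 1.0

# A highly discriminating item separates the two students clearly...
sharp_gap = (two_pl_probability(stronger, 2.0, 0.0)
             - two_pl_probability(weaker, 2.0, 0.0))

# ...while a weakly discriminating item gives them nearly the same chance.
flat_gap = (two_pl_probability(stronger, 0.2, 0.0)
            - two_pl_probability(weaker, 0.2, 0.0))
```

An item with a small gap of this kind carries little information about which student is more skilled, which is the sense in which a question is "not effective in discriminating" between students of different abilities.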
Computerized adaptive testing: the capitalization on chance problem.
Olea, Julio; Barrada, Juan Ramón; Abad, Francisco J; Ponsoda, Vicente; Cuevas, Lara
2012-03-01
This paper describes several simulation studies that examine the effects of capitalization on chance in the selection of items and the ability estimation in CAT, employing the 3-parameter logistic model. In order to generate different estimation errors for the item parameters, the calibration sample size was manipulated (N = 500, 1000 and 2000 subjects) as was the ratio of item bank size to test length (banks of 197 and 788 items, test lengths of 20 and 40 items), both in a CAT and in a random test. Results show that capitalization on chance is particularly serious in CAT, as revealed by the large positive bias found in the small sample calibration conditions. For broad ranges of theta, the overestimation of the precision (asymptotic Se) reaches levels of 40%, something that does not occur with the RMSE (theta). The problem is greater as the item bank size to test length ratio increases. Potential solutions were tested in a second study, where two exposure control methods were incorporated into the item selection algorithm. Some alternative solutions are discussed.
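The mechanism behind capitalization on chance can be sketched with the 3-parameter logistic model and an item selection rule. The abstract does not say which selection rule was used; maximum Fisher information, a common choice in CAT, is assumed here, and the function names and the three-item bank are hypothetical.

```python
import math


def three_pl_probability(theta, a, b, c):
    """3PL probability of a correct response (a: discrimination,
    b: difficulty, c: pseudo-guessing lower asymptote)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def three_pl_information(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta."""
    p = three_pl_probability(theta, a, b, c)
    q = 1.0 - p
    return (a ** 2) * (q / p) * ((p - c) / (1.0 - c)) ** 2


def select_item(theta, bank):
    """Index of the item with maximum information at theta
    (assumed selection rule, not taken from the paper)."""
    return max(range(len(bank)),
               key=lambda i: three_pl_information(theta, *bank[i]))


# Hypothetical bank of estimated (a, b, c) parameters.
bank = [(0.8, 0.0, 0.2), (1.6, 0.0, 0.2), (1.6, 2.0, 0.2)]
chosen = select_item(0.0, bank)
```

Because information grows with the square of the discrimination estimate, a rule like this favours items whose discrimination happens to be overestimated in a small calibration sample, so the reported precision can look better than it is.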
ERIC Educational Resources Information Center
Öztürk-Gübes, Nese; Kelecioglu, Hülya
2016-01-01
The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…
Automated information retrieval using CLIPS
NASA Technical Reports Server (NTRS)
Raines, Rodney Doyle, III; Beug, James Lewis
1991-01-01
Expert systems have considerable potential to assist computer users in managing the large volume of information available to them. One possible use of an expert system is to model the information retrieval interests of a human user and then make recommendations to the user as to articles of interest. At Cal Poly, a prototype expert system written in the C Language Integrated Production System (CLIPS) serves as an Automated Information Retrieval System (AIRS). AIRS monitors a user's reading preferences, develops a profile of the user, and then evaluates items returned from the information base. When prompted by the user, AIRS returns a list of items of interest to the user. In order to minimize the impact on system resources, AIRS is designed to run in the background during periods of light system use.
A computer-based maintenance reminder and record-keeping system for clinical laboratories.
Roberts, B I; Mathews, C L; Walton, C J; Frazier, G
1982-09-01
"Maintenance" is all the activity an organization devotes to keeping instruments within performance specifications to assure accurate and precise operation. The increasing use of complex analytical instruments as "workhorses" in clinical laboratories requires more maintenance awareness by laboratory personnel. Record-keeping systems that document maintenance completion and that should prompt the continued performance of maintenance tasks have not kept up with instrumentation development. We report here a computer-based record-keeping and reminder system that lists weekly the maintenance items due for each work station in the laboratory, including the time required to complete each item. Written in BASIC, the system uses a DATABOSS data base management system running on a time-shared Digital Equipment Corporation PDP 11/60 computer with a RSTS V 7.0 operating system.
ERIC Educational Resources Information Center
Ali, Usama S.; Chang, Hua-Hua; Anderson, Carolyn J.
2015-01-01
Polytomous items are typically described by multiple category-related parameters; situations, however, arise in which a single index is needed to describe an item's location along a latent trait continuum. Situations in which a single index would be needed include item selection in computerized adaptive testing or test assembly. Therefore single…
Designing a Virtual Item Bank Based on the Techniques of Image Processing
ERIC Educational Resources Information Center
Liao, Wen-Wei; Ho, Rong-Guey
2011-01-01
One of the major weaknesses of the item exposure rates of figural items in Intelligence Quotient (IQ) tests lies in its inaccuracies. In this study, a new approach is proposed and a useful test tool known as the Virtual Item Bank (VIB) is introduced. The VIB combines Automatic Item Generation theory and image processing theory with the concepts of…
The Rasch Model and Missing Data, with an Emphasis on Tailoring Test Items.
ERIC Educational Resources Information Center
de Gruijter, Dato N. M.
Many applications of educational testing have a missing data aspect (MDA). This MDA is perhaps most pronounced in item banking, where each examinee responds to a different subtest of items from a large item pool and where both person and item parameter estimates are needed. The Rasch model is emphasized, and its non-parametric counterpart (the…
Three controversies over item disclosure in medical licensure examinations
Park, Yoon Soo; Yang, Eunbae B.
2015-01-01
In response to views on the public's right to know, there is growing attention to item disclosure – release of items, answer keys, and performance data to the public – in medical licensure examinations and its potential impact on the test's ability to measure competence and select qualified candidates. Recent debates on this issue have sparked legislative action internationally, including in South Korea, with prior discussions among North American countries dating back over three decades. The purpose of this study is to identify and analyze three issues associated with item disclosure in medical licensure examinations – 1) fairness and validity, 2) impact on passing levels, and 3) utility of item disclosure – by synthesizing existing literature in relation to standards in testing. Historically, the controversy over item disclosure has centered on fairness and validity. Proponents of item disclosure stress test takers' right to know, while opponents argue from a validity perspective. Item disclosure may bias item characteristics, such as difficulty and discrimination, and has consequences for setting passing levels. To date, there has been limited research on the utility of item disclosure for large-scale testing. These issues require ongoing and careful consideration. PMID: 26374693
Bayesian Item Selection in Constrained Adaptive Testing Using Shadow Tests
ERIC Educational Resources Information Center
Veldkamp, Bernard P.
2010-01-01
Application of Bayesian item selection criteria in computerized adaptive testing might result in improvement of bias and MSE of the ability estimates. The question remains how to apply Bayesian item selection criteria in the context of constrained adaptive testing, where large numbers of specifications have to be taken into account in the item…
Mathematics Library of Test Items. Volume One.
ERIC Educational Resources Information Center
Fraser, Graham, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from previous tests are made available to teachers for the construction of pretests or posttests, reference tests for inter-class comparisons and general assignments. The collection was reviewed for content…
Are Learning Disabled Students "Test-Wise?": An Inquiry into Reading Comprehension Test Items.
ERIC Educational Resources Information Center
Scruggs, Thomas E.; Lifson, Steve
The ability to correctly answer reading comprehension test items, without having read the accompanying reading passage, was compared for third grade learning disabled students and their peers from a regular classroom. In the first experiment, fourteen multiple choice items were selected from the Stanford Achievement Test. No reading passages were…
Agriculture Library of Test Items.
ERIC Educational Resources Information Center
Sutherland, Duncan, Ed.
As one in a series of test item collections developed by the Assessment and Evaluation Unit of the Directorate of Studies, items of value from past tests are made available to teachers for the construction of unit tests, term examinations or as a basis for class discussion. Each collection is reviewed for content validity and reliability. The test…
ERIC Educational Resources Information Center
Bermundo, Cesar B.; Bermundo, Alex B.; Ballester, Rex C.
2012-01-01
iBank is a project that uses software to create an item bank that stores quality questions, generates tests, and prints exams. The items come from analyzed teacher-constructed test questions, which provide the basis for discussing test results, by determining why a test item is or is not discriminating between the better and poorer students, and by…
Effects of Test Item Disclosure on Medical Licensing Examination
ERIC Educational Resources Information Center
Yang, Eunbae B.; Lee, Myung Ae; Park, Yoon Soo
2018-01-01
In 2012, the National Health Personnel Licensing Examination Board of Korea decided to publicly disclose all test items and answers to satisfy the test takers' right to know and enhance the transparency of tests administered by the government. This study investigated the effects of item disclosure on the medical licensing examination (MLE),…
Controlling Item Exposure Conditional on Ability in Computerized Adaptive Testing.
ERIC Educational Resources Information Center
Stocking, Martha L.; Lewis, Charles
1998-01-01
Ensuring item and pool security in a continuous testing environment is explored through a new method of controlling exposure rate of items conditional on ability level in computerized testing. Properties of this conditional control on exposure rate, when used in conjunction with a particular adaptive testing algorithm, are explored using simulated…
Battalion Combat Operations Center (COC) Test. Volume II. Test Report,
1982-02-08
reveal, perhaps, that item X can perform a task faster than item Y. A utility assessment from an experienced, knowledgeable test participant, however...can ascertain whether or not item X can better enable him to accomplish his mission than item Y. 2.4 GENERALIZED TEST FACILITY. The capabilities of...
V-TECS Criterion-Referenced Test Item Bank for Radiologic Technology Occupations.
ERIC Educational Resources Information Center
Reneau, Fred; And Others
This Vocational-Technical Education Consortium of States (V-TECS) criterion-referenced test item bank provides 696 multiple-choice items and 33 matching items for radiologic technology occupations. These job titles are included: radiologic technologist, chief; radiologic technologist; nuclear medicine technologist; radiation therapy technologist;…
Modes of Modelling Assessment--A Literature Review
ERIC Educational Resources Information Center
Frejd, Peter
2013-01-01
This paper presents a critical review of literature investigating assessment of mathematical modelling. Written tests, projects, hands-on tests, portfolio and contests are modes of modelling assessment identified in this study. The written tests found in the reviewed papers draw on an atomistic view on modelling competencies, whereas projects are…
Criteria to Evaluate Interpretive Guides for Criterion-Referenced Tests
ERIC Educational Resources Information Center
Trapp, William J.
2007-01-01
This project provides a list of criteria against which the contents of interpretive guides written for customized, criterion-referenced tests can be evaluated. The criteria are based on the "Standards for Educational and Psychological Testing" (1999) and examine the content breadth of interpretive guides. Interpretive guides written for…
Airline Transport Pilot-Airplane (Air Carrier) Written Test Guide.
ERIC Educational Resources Information Center
Federal Aviation Administration (DOT), Washington, DC. Flight Standards Service.
Presented is information useful to applicants who are preparing for the Airline Transport Pilot-Airplane (Air Carrier) Written Test. The guide describes the basic aeronautical knowledge and associated requirements for certification, as well as information on source material, instructions for taking the official test, and questions that are…