Science.gov

Sample records for test scores implications

  1. Does Test Preparation Work? Implications for Score Validity

    ERIC Educational Resources Information Center

    Xie, Qin

    2013-01-01

    This article reports an empirical study that examined the pattern of test preparation for College English Test Band 4 (CET4) and the differential effects of test preparation practices on its scores, thereby drawing implications for CET4 score validity. Data collection involved 1,003 test takers of CET4. A pretest was administered at the beginning…

  2. The Implications of Family Size and Birth Order for Test Scores and Behavioral Development

    ERIC Educational Resources Information Center

    Silles, Mary A.

    2010-01-01

    This article, using longitudinal data from the National Child Development Study, presents new evidence on the effects of family size and birth order on test scores and behavioral development at age 7, 11 and 16. Sibling size is shown to have an adverse causal effect on test scores and behavioral development. For any given family size, first-borns…

  4. Test Scoring [book review].

    ERIC Educational Resources Information Center

    Meijer, Rob R.

    2003-01-01

    This book discusses how to obtain test scores and, in particular, how to obtain test scores from tests that consist of a combination of multiple choice and open-ended questions. The strength of the book is that scoring solutions are presented for a diversity of real world scoring problems. (SLD)

  5. The Impact of the 2004 Hurricanes on Florida Comprehensive Assessment Test Scores: Implications for School Counselors

    ERIC Educational Resources Information Center

    Baggerly, Jennifer; Ferretti, Larissa K.

    2008-01-01

    What is the impact of natural disasters on students' statewide assessment scores? To answer this question, Florida Comprehensive Assessment Test (FCAT) scores of 55,881 students in grades 4 through 10 were analyzed to determine if there were significant decreases after the 2004 hurricanes. Results reveal that there was statistical but no practical…

  7. Demands on Users for Interpretation of Achievement Test Scores: Implications for the Evaluation Profession

    ERIC Educational Resources Information Center

    Della-Piana, Gabriel Mario; Gardner, Michael

    2011-01-01

    Background: Professional standards for validity of achievement tests have long reflected a consensus that validity is the degree to which evidence and theory support interpretations of test scores entailed by the intended uses of tests. Yet there are convincing lines of evidence that the standards are not adequately followed in practice, that…

  9. Relationship of Friends, Physical Education, and State Test Scores: Implications for School Counselors

    ERIC Educational Resources Information Center

    Hollingsworth, Mary Ann

    2010-01-01

    This study examined the relationship between dimensions of wellness and academic performance for 634 third through fifth grade students in Title One schools in rural Mississippi, using composites of the Five Factor Wellness Inventory for Elementary Children and Reading, Language, and Math Scores of the Mississippi Curriculum Test (a state level…

  10. Implications of Deployed and Nondeployed Fathers on Seventh Graders' California Achievement Test Scores during a Military Crisis.

    ERIC Educational Resources Information Center

    Pisano, Mark C.

    The differences in California Achievement Test (CAT) scores from 1990 to 1991 in seventh graders, currently enrolled in Albritton Junior High School in the Fort Bragg Schools, of deployed and nondeployed fathers were analyzed. CAT percentile scores from 1990 and 1991 (1991 being the year of "Desert Storm") were obtained in reading, math and…

  11. Effects of Multidimensionality on IRT Item Characteristics and True Score Estimates: Implications for Computerized Test Assembly. Computerized Testing Report. LSAC Research Report Series.

    ERIC Educational Resources Information Center

    Wang, Xiang-Bo; Harris, Vincent; Roussos, Louis

    Multidimensionality is known to affect the accuracy of item parameter and ability estimations, which subsequently influences the computation of item characteristic curves (ICCs) and true scores. By judiciously combining sections of a Law School Admission Test (LSAT), 11 sections of varying degrees of uni- and multidimensional structures are used…

  12. How Accurate Is a Test Score?

    ERIC Educational Resources Information Center

    Doppelt, Jerome E.

    1956-01-01

    The standard error of measurement as a means for estimating the margin of error that should be allowed for in test scores is discussed. The true score measures the performance that is characteristic of the person tested; the variations, plus and minus, around the true score describe a characteristic of the test. When the standard deviation is used…
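
    As a rough illustration of the margin of error this record describes, the sketch below uses the usual classical test theory formula, SEM = SD * sqrt(1 - reliability). The formula, the function names, and the example numbers are assumptions for illustration and are not taken from the truncated abstract.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEM: sd * sqrt(1 - reliability)."""
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed: float, sd: float, reliability: float, width: float = 1.0):
    """Return (low, high): the observed score plus and minus `width` SEMs."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - width * sem, observed + width * sem


# Hypothetical test: score SD of 15 and reliability of 0.84.
low, high = score_band(100, sd=15, reliability=0.84)
print(f"band around an observed score of 100: {low:.1f} to {high:.1f}")
```

    With these made-up values the SEM is 6 points, so a band of one SEM around an observed score of 100 runs from 94 to 106.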

  13. More than Just Test Scores

    ERIC Educational Resources Information Center

    Levin, Henry M.

    2012-01-01

    Around the world we hear considerable talk about creating world-class schools. Usually the term refers to schools whose students get very high scores on the international comparisons of student achievement such as PISA or TIMSS. The practice of restricting the meaning of exemplary schools to the narrow criterion of achievement scores is usually…

  14. The Absolute Normal Scores Test for Symmetry

    ERIC Educational Resources Information Center

    Penfield, Douglas A.; Sachdeva, Darshan

    1976-01-01

    The absolute normal scores test is described as a test for the symmetry of a distribution of scores about a location parameter. The test is compared to the sign test and the Wilcoxon test as an alternative to the "t"-test. (Editor/RK)

  15. Smoothing Methods for Estimating Test Score Distributions.

    ERIC Educational Resources Information Center

    Kolen, Michael J.

    1991-01-01

    Estimation/smoothing methods that are flexible enough to fit a wide variety of test score distributions are reviewed: kernel method, strong true-score model-based method, and method that uses polynomial log-linear models. Applications of these methods include describing/comparing test score distributions, estimating norms, and estimating…

  16. Statistics Scores and Testing Time.

    ERIC Educational Resources Information Center

    Kennedy, Robert L.; McCallister, Corliss J.

    The purpose of this study was to investigate the relationship between the scores students earned on their statistics final examinations and the number of minutes students required to complete the exams. In a previous study, K. Bridges (1985) extended the range of interest in this relationship from a single study to a course-based series, examining…

  17. 10 Tips for Higher Test Scores.

    ERIC Educational Resources Information Center

    Priestley, Michael

    2000-01-01

    Ten suggestions to help students increase standardized test scores include: read directions carefully; peek at the questions before reading stories or articles; note key words; use parts of questions to help plan answers; look back at the text; think before writing; write clearly and legibly; pay attention to how the test is scored; manage time…

  18. Raising Standardized Achievement Test Scores and the Origins of Test Score Pollution.

    ERIC Educational Resources Information Center

    Haladyna, Thomas M.; And Others

    1991-01-01

    Because of the importance of standardized test scores in current definitions of educational achievement, pressure to raise test scores has affected their accuracy. Examines the causes of two major sources of test score pollution and their impact on education. Discusses the ethical status of documented test-preparation activities. (CJS)

  19. The Absolute Normal Scores Test for Symmetry.

    ERIC Educational Resources Information Center

    Penfield, Douglas A.; Sachdeva, Darshan

    Behavioral scientists often wish to determine if a sample has been taken from a symmetric population. Similarly, classroom teachers are interested in symmetry if they wish to grade on a "curve." Previously, the sign test, the Wilcoxon test and the t-test have been used to test a hypothesis concerning the symmetry of a distribution of scores about…

  20. Do Examinees Understand Score Reports for Alternate Methods of Scoring Computer Based Tests?

    ERIC Educational Resources Information Center

    Whittaker, Tiffany A.; Williams, Natasha J.; Dodd, Barbara G.

    2011-01-01

    This study assessed the interpretability of scaled scores based on either number correct (NC) scoring for a paper-and-pencil test or one of two methods of scoring computer-based tests: an item pattern (IP) scoring method and a method based on equated NC scoring. The equated NC scoring method for computer-based tests was proposed as an alternative…

  2. Teacher Greetings Increase College Students' Test Scores

    ERIC Educational Resources Information Center

    Weinstein, Lawrence; Laverghetta, Antonio; Alexander, Ralph; Stewart, Megan

    2009-01-01

    The current study is an extension of a previous investigation dealing with teacher greetings to students. The present investigation used teacher greetings with college students and academic performance (test scores). We report data using university students and in-class test performance. Students in introductory psychology who received teachers'…

  3. What Do Test Score Really Mean? A Latent Class Analysis of Danish Test Score Performance

    ERIC Educational Resources Information Center

    McIntosh, James; Munk, Martin D.

    2014-01-01

    Latent class Poisson count models are used to analyse a sample of Danish test score results from a cohort of individuals born in 1954-1955, tested in 1968, and followed until 2011. The procedure takes account of unobservable effects as well as excessive zeros in the data. We show that the test scores measure manifest or measured ability as it has…

  4. The Test Score Decline: Meaning and Issues.

    ERIC Educational Resources Information Center

    Lipsitz, Lawrence, Ed.

    This collection of original papers, first published in the June and July, 1976 issues of Educational Technology Magazine, was prompted by the enormous public outcry which greeted the general public realization that achievement and college aptitude test scores were continuing in recent months and years the steady erosion which began in the…

  5. Teacher Use of Achievement Test Score Data

    ERIC Educational Resources Information Center

    Miller, Steven C.

    2012-01-01

    The Wyoming Department of Education (WDE) has invested time and money developing standardized achievement test score reports designed to give teachers data about each of their students' levels of mastery of particular concepts in order to differentiate their instruction. The purpose of this study was to determine the extent to which…

  6. Critical Thinking: More than Test Scores

    ERIC Educational Resources Information Center

    Smith, Vernon G.; Szymanski, Antonia

    2013-01-01

    This article is for practicing or aspiring school administrators. The demand for excellence in public education has led to an emphasis on standardized test scores. This article explores the development of a professional enhancement program designed to prepare teachers to teach higher order thinking skills. Higher order thinking is the primary…

  7. The Black-White Test Score Gap.

    ERIC Educational Resources Information Center

    Jencks, Christopher, Ed.; Phillips, Meredith, Ed.

    The 15 chapters of this book address issues related to the continuing test score gap between black and white students. The editors argue against traditional explanations which emphasize differences in economic resources and demographic factors, and they urge that more emphasis be put on psychological and cultural factors. The book suggests studies…

  8. Correction of developmental and intelligence test scores for premature birth.

    PubMed

    Rickards, A L; Kitchen, W H; Doyle, L W; Kelly, E A

    1989-06-01

    When using tests of infant development and intelligence in children born prematurely, the subject's age is commonly corrected for the degree of prematurity. However, there is disagreement: first, on whether this correction should ever be applied, and second, at what age to discontinue the adjustment. In a theoretical model, the difference between corrected and uncorrected scores in early infancy was massive and the difference remained clinically important until the age of 8.5 years in children who were born extremely prematurely. The clinical implications of using corrected or uncorrected scores were then evaluated in 174 very low birthweight children without severe sensorineural disabilities and with paired Bayley Mental Development Index (MDI) and Wechsler Preschool and Primary Scales of Intelligence (WPPSI) full scale scores. Failure to correct for prematurity reduced the mean MDI by 12.1 points but reduced the mean WPPSI by only 4.1 points. The disparity between individual MDI and WPPSI scores increased significantly with decreasing gestational age if uncorrected scores were used (P = 0.015) but not if scores were corrected. Using corrected scores, the MDI correctly predicted the WPPSI category in 86.1% of children (P < 0.001) but in only 54.6% using uncorrected scores (the difference was not significant). It is suggested that a practical solution to the dilemma is to correct test scores for prematurity in the age range 2-8.5 years recognizing that only in extremely immature infants will uncorrected scores be substantially lower than corrected ones at a later age. PMID:2764833

  9. ITC Guidelines on Quality Control in Scoring, Test Analysis, and Reporting of Test Scores

    ERIC Educational Resources Information Center

    Allalouf, Avi

    2014-01-01

    The Quality Control (QC) Guidelines are intended to increase the efficiency, precision, and accuracy of the scoring, analysis, and reporting process of testing. The QC Guidelines focus on large-scale testing operations where multiple forms of tests are created for use on set dates. However, they may also be used for a wide variety of other testing…

  10. Test-Retest Reliability of Computer Based MCW-APM Test Scoring Methods.

    ERIC Educational Resources Information Center

    Abedi, Jamal; Bruno, James

    1989-01-01

    Reports the results of several test-reliability experiments which compared a modified confidence weighted-admissible probability measurement (MCW-APM) with conventional forced choice or binary type (R-W) test scoring methods. Psychometric properties using G theory and conventional correlational methods are examined, and their implications for…

  12. Validating the Interpretations and Uses of Test Scores

    ERIC Educational Resources Information Center

    Kane, Michael T.

    2013-01-01

    To validate an interpretation or use of test scores is to evaluate the plausibility of the claims based on the scores. An argument-based approach to validation suggests that the claims based on the test scores be outlined as an argument that specifies the inferences and supporting assumptions needed to get from test responses to score-based…

  13. On the Confidentiality of Student Test Scores. Report No. 32.

    ERIC Educational Resources Information Center

    Read, Peter B.

    A discussion of the limited meaning of test scores, testing as an invasion of privacy, the abuse of test scores as confidential information and privileged communication, recording and storing of test results, access to test scores, and the demand for accountability forms the basis for recommendations for the release of individual and group test…

  14. Differences in Reading and Math Achievement Test Scores for Students Experiencing Academic Difficulty.

    ERIC Educational Resources Information Center

    Slate, John R.; Jones, Craig H.

    1996-01-01

    Implications of the relationships among mathematics and reading achievement test scores uncovered by testing 366 elementary school students with academic difficulties are discussed. The tests are: (1) KeyMath-Revised (J. Connolly, 1988); (2) Peabody Individual Achievement Test-Revised (F. Markwardt, 1989); (3) Wechsler Individual Achievement Test…

  15. A table of color distance scores for quantitative scoring of the Lanthony Desaturate color vision test.

    PubMed

    Geller, A M

    2001-01-01

    The Lanthony Desaturate Panel D-15 (D-15d) color vision test is used in neurotoxicological testing to assess acquired color vision deficits. The original test design included a qualitative scoring method. Quantitative scoring requires mapping the colored objects used in the test into a color space describing perceptual distances. A table of these distances has previously been published for the saturated version of this color vision test, but not the desaturate test. This communication includes a table of color distances for the calculation of Bowman's Total Color Distance Score (TCDS) for the D-15d. This table should be useful for non-computerized scoring under field test conditions or for devising one's own computerized scoring methods using the tabulated color distances for a look-up table. Data analysis programs using SAS or Matlab are available from the author. PMID:11418268

  16. Using just noticeable differences to interpret test scores.

    PubMed

    Stricker, L J

    2000-12-01

    This study explored the value of obtaining a just noticeable difference (JND) for a test--the difference in scores needed before observers detect a difference in examinees' behavior--as a means of interpreting the practical meaning of scores. Classical psychophysical methods were adapted and applied to the scores of foreign teaching assistants (TAs) on an achievement test, the Test of Spoken English (TSE), and the ratings for English proficiency that the TAs received from their students. The JND for the TSE scores was substantial, as large as the standard deviation of the scores and much larger than the standard error of measurement and guidelines for the d index of effect size for mean differences, suggesting that both sets of standards may highlight score differences that are not practically significant. This study demonstrates the applicability of JNDs for evaluating scores on educational and psychological tests. PMID:11194205

  17. Objectivity of Scoring for the McCarthy Drawing Tests.

    ERIC Educational Resources Information Center

    Reynolds, Cecil R.

    1979-01-01

    Two doctoral level school psychologists independently scored 50 McCarthy drawing booklets. Children producing the drawings ranged from 5-11. Interscorer reliability for Draw-A-Design was .93 and for Draw-A-Child was .96. No significant differences occurred in the mean score for either test across scorers. (Author)

  18. Reliability of Total Test Scores When Considered as Ordinal Measurements

    ERIC Educational Resources Information Center

    Biswas, Ajoy Kumar

    2006-01-01

    This article studies the ordinal reliability of (total) test scores. This study is based on a classical-type linear model of observed score (X), true score (T), and random error (E). Based on the idea of Kendall's tau-a coefficient, a measure of ordinal reliability for small-examinee populations is developed. This measure is extended to large…

  19. Observed-Score Equating as a Test Assembly Problem.

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Luecht, Richard M.

    A set of linear conditions on the item response functions is derived that guarantees identical observed-score distributions on two test forms. The conditions can be added as constraints to a linear programming model for test assembly that assembles a new test form to have an observed-score distribution optimally equated to the distribution of the…

  20. Computerized Adaptive Testing with Equated Number-Correct Scoring.

    ERIC Educational Resources Information Center

    van der Linden, Wim J.

    2001-01-01

    Presents a constrained computerized adaptive testing (CAT) algorithm that can be used to equate CAT number-correct scores to a reference test. Used an item bank from the Law School Admission Test to compare results of the algorithm with those for equipercentile observed-score equating. Discusses advantages of the approach. (SLD)

  1. Coefficient α as a Measure of Test Score Reliability: Review of 3 Popular Misconceptions.

    PubMed

    Morera, Osvaldo F; Stokes, Sonya M

    2016-03-01

    We discuss 3 popular misconceptions about Cronbach α or coefficient α, traditionally used in public health and the behavioral sciences as an index of test score reliability. We also review several other indices of test score reliability. We encourage researchers to thoughtfully consider the nature of their data and the options when choosing an index of reliability, and to clearly communicate this choice and its implications to their audiences. PMID:26885962
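
    For readers unfamiliar with the index this record discusses, the following is a minimal sketch of the textbook coefficient alpha computation: k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the tiny data set are hypothetical, and the sketch does not reproduce anything from the article itself.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a table of item scores.

    `rows` holds one list per examinee, with one score per item.
    Sample variances (n - 1 denominator) are used throughout.
    """
    if len(rows) < 2:
        raise ValueError("need at least two examinees")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in rows):
        raise ValueError("every examinee needs a score on every item")

    # Variance of each item (column) across examinees.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Variance of the total (summed) score across examinees.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: four examinees, two items.
print(cronbach_alpha([[2, 3], [4, 3], [3, 5], [5, 5]]))
```

    A single alpha value says nothing about whether the items measure one dimension, which is part of why the choice of reliability index deserves the care the abstract recommends.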

  2. Test Score Reporting Referenced to Doubly-Moderated Cut Scores Using Splines

    ERIC Educational Resources Information Center

    Schafer, William D.; Hou, Xiaodong

    2011-01-01

    This study discusses and presents an example of a use of spline functions to establish and report test scores using a moderated system of any number of cut scores. Our main goals include studying the need for and establishing moderated standards and creating a reporting scale that is referenced to all the standards. Our secondary goals are to make…

  3. Standard Score Tables for the McCarthy Drawing Tests.

    ERIC Educational Resources Information Center

    Reynolds, Cecil R.

    1985-01-01

    The Draw-a-Design and Draw-a-Child, two subtests of the McCarthy Scales, are the best-normed drawing tests for children aged two and a half to eight and a half years but have no age-corrected deviation scaled scores available for interpretation. Scaled scores for use in interpretation are presented for these tests. (Author/NRB)

  4. Does weight affect children’s test scores and teacher assessments differently?

    PubMed Central

    Zavodny, Madeline

    2013-01-01

    The prevalence of childhood overweight and obesity increased dramatically in the United States during the past three decades. This increase has adverse public health implications, but its implication for children’s academic outcomes is less clear. This paper uses data from five waves of the Early Childhood Longitudinal Study-Kindergarten to examine how children’s weight is related to their scores on standardized tests and to their teachers’ assessments of their academic ability. The results indicate that children’s weight is more negatively related to teacher assessments of their academic performance than to test scores. PMID:24014932

  5. Norm Referenced Testing and the Standard Scores. Basic Testing Services.

    ERIC Educational Resources Information Center

    Childs, Roy

    The norm-referenced score scale used by the National Foundation for Educational Research (NFER) is described. The usefulness of standardized scores is explained by a simple numerical example, and the formulas and computations are shown for calculating a mean, a standard deviation, and a deviation or z score. The need for a representative sample is…
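
    The computations named in this record (a mean, a standard deviation, and a deviation or z score) can be sketched as below. The population standard deviation, the function names, the sample data, and the rescaling to a mean of 100 and a standard deviation of 15 are assumptions for illustration; the truncated abstract does not give the exact scale or formula it uses.

```python
from statistics import fmean, pstdev


def z_scores(raw_scores):
    """Convert raw scores to z scores: (score - mean) / standard deviation."""
    mean = fmean(raw_scores)
    sd = pstdev(raw_scores)  # population SD; an assumption, see lead-in
    if sd == 0:
        raise ValueError("all scores are identical, so z scores are undefined")
    return [(score - mean) / sd for score in raw_scores]


def standardized_scores(raw_scores, new_mean=100.0, new_sd=15.0):
    """Rescale z scores to a reporting scale with the given mean and SD."""
    return [new_mean + new_sd * z for z in z_scores(raw_scores)]


# Hypothetical raw scores for eight pupils (mean 5, population SD 2).
raw = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_scores(raw))
print(standardized_scores(raw))
```

    The resulting scores are only interpretable against a norm group, which is why the record stresses the need for a representative sample.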

  6. Math/FCS Class Boosts Test Scores

    ERIC Educational Resources Information Center

    Sanden, Jan

    2004-01-01

    Integrating mathematics with family and consumer sciences (FCS) has enabled youth to pass the Minnesota 8th Grade Math Basic Skills test. The test focuses on the eight content areas: (1) problem solving with whole numbers and fractions; (2) problem solving with percentage/ratio; (3) number sense; (4) estimation; (5) measurement; (6) tables and…

  7. Accountability Is More than a Test Score

    ERIC Educational Resources Information Center

    Turnipseed, Stephan; Darling-Hammond, Linda

    2015-01-01

    The number one quality business leaders look for in employees is creativity and yet the U.S. education system undermines the development of the higher-order skills that promote creativity by its dogged focus on multiple-choice tests. Stephan Turnipseed and Linda Darling-Hammond discuss the kind of rich accountability system that will help students…

  8. Improving Scores on the IELTS Speaking Test

    ERIC Educational Resources Information Center

    Issitt, Steve

    2008-01-01

    This article presents three strategies for teaching students who are taking the IELTS speaking test. The first strategy is aimed at improving confidence and uses a variety of self-help materials from the field of popular psychology. The second encourages students to think critically and invokes a range of academic perspectives. The third strategy…

  9. Fuzzy Math: A Meditation on Test Scoring

    ERIC Educational Resources Information Center

    Jacks, Meredith

    2011-01-01

    As a public school English teacher, the author observes standardized testing season each year with a sort of grim fascination. "So this is it," she thinks as she paces around her silent classroom, peering over kids' shoulders at articles about parasailing. Line graphs tracking the rainfall in Tulsa. Parts of speech. Functions of "x." "These are…

  10. Equating Test Scores (without IRT). Second Edition

    ERIC Educational Resources Information Center

    Livingston, Samuel A.

    2014-01-01

    This booklet grew out of a half-day class on equating that author Samuel Livingston teaches for new statistical staff at Educational Testing Service (ETS). The class is a nonmathematical introduction to the topic, emphasizing conceptual understanding and practical applications. The class consists of illustrated lectures, interspersed with…

  11. Missing the Mark: What Test Scores Really Tell Us

    ERIC Educational Resources Information Center

    Tanner, John R.

    2011-01-01

    State test scores administered for accountability purposes are regularly used to adjust instruction in nuanced ways. This is no accident--No Child Left Behind demanded that students' scores be returned quickly to teachers in order that this might be the case, and the idea of data-driven decision making continues as one way the promise of education…

  12. An Investigation of the Ordinal True Score Test Theory.

    ERIC Educational Resources Information Center

    Donoghue, John R.; Cliff, Norman

    1991-01-01

    The validity of the assumptions under which the ordinal true score test theory was derived was examined using (1) simulation based on classical test theory; (2) a long empirical test with data from 321 sixth graders; and (3) an extensive simulation with 480 datasets based on the 3-parameter model. (SLD)

  13. Interpreting Test Scores: More Complicated than You Think

    ERIC Educational Resources Information Center

    Tully, Susannah

    2008-01-01

    As more colleges move to "test optional" admissions policies, the debate over the utility and interpretation of standardized-test scores continues. In this article, the author interviews Daniel Koretz, a professor of education at Harvard University and author of "Measuring Up: What Educational Testing Really Tells Us". Koretz shares his thoughts…

  14. Observed-Score Equating as a Test Assembly Problem.

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Luecht, Richard M.

    1998-01-01

    Derives a set of linear conditions of item-response functions that guarantees identical observed-score distributions on two test forms. The conditions can be added as constraints to a linear programming model for test assembly. An example illustrates the use of the model for an item pool from the Law School Admission Test (LSAT). (SLD)

  15. The Uses and Misuses of Test Scores: Technical Assistance Perspective.

    ERIC Educational Resources Information Center

    Echternacht, Gary

    The uses and misuses of standardized test results used for program evaluation as seen by a staff member of an Elementary Secondary Education Act (ESEA) Title I Technical Assistance Center are described. In ESEA Title I, test scores are used to select students for the program. Although federal requirements do not require using standardized test…

  16. Why Standardized Test Scores Don't Measure Educational Quality.

    ERIC Educational Resources Information Center

    Popham, W. James

    1999-01-01

    Employing standardized achievement tests to ascertain educational quality is like measuring temperature with a tablespoon. Such tests are prone to testing-teaching mismatches, omitted items, and confounded causation problems. Actually, three factors influence students' scores: what's taught in school, native intellectual ability, and out-of-school…

  17. The Relationship between Scores on the Gifted Student Screening Scale and Scores on IQ Tests.

    ERIC Educational Resources Information Center

    Trentham, Landa L.; Hall, Eleanor G.

    1987-01-01

    When teachers' ratings of first- through tenth-graders (N=160) on the Gifted Student Screening Scale (GSSS) were compared with students' scores on various intelligence tests, the GSSS identified 35 of 37 previously-identified cognitively-gifted students, with 59 students (not previously identified as gifted) identified for further screening for…

  18. State Test Score Trends Through 2007-08, Part 2: Is There a Plateau Effect in Test Scores?

    ERIC Educational Resources Information Center

    Chudowsky, Naomi; Chudowsky, Victor

    2009-01-01

    Many in the research and policy worlds have taken for granted the existence of a phenomenon known as the "plateau effect," wherein test scores rise in the early years of a test-based accountability system and then level off. Drawing from our database of reading and math test results from all 50 states going back as far as 1999, the Center on…

  19. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. North Dakota

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles North Dakota's test score trends through 2008-09. Between 2005 and 2009, the percentage of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grades 4 and 8 in both reading and math. Average annual gains were larger on the state test…

  20. RIASEC Interest and Confidence Cutoff Scores: Implications for Career Counseling

    ERIC Educational Resources Information Center

    Bonitz, Verena S.; Armstrong, Patrick Ian; Larson, Lisa M.

    2010-01-01

    One strategy commonly used to simplify the joint interpretation of interest and confidence inventories is the use of cutoff scores to classify individuals dichotomously as having high or low levels of confidence and interest, respectively. The present study examined the adequacy of cutoff scores currently recommended for the joint interpretation…

  1. American College Testing Program Scores as an Index of Intelligence.

    ERIC Educational Resources Information Center

    Wilkins, Elizabeth M.; And Others

    The relationship between American College Testing Program (ACT) and California Test of Mental Maturity (CTMM) scores was explored. Four hundred and thirty-four undergraduate subjects of both sexes were selected from a midwestern university. Pearson product moment correlation (r) and Kendall rank correlation coefficient (tau) were used to measure…
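
    The two statistics named in this record can be sketched in a few lines. The Python below is only an illustration, not the study's analysis: the function names and the toy score lists are hypothetical, and the Kendall coefficient shown is the simple tau-a variant, which makes no correction for ties.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def kendall_tau_a(x, y):
    """Kendall rank correlation (tau-a): (concordant - discordant) / all pairs."""
    pairs = list(combinations(range(len(x)), 2))
    balance = 0
    for i, j in pairs:
        product = (x[i] - x[j]) * (y[i] - y[j])
        # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
        balance += (product > 0) - (product < 0)
    return balance / len(pairs)


# Hypothetical paired scores for four examinees (not data from the study).
test_a = [1, 2, 3, 4]
test_b = [1, 3, 2, 4]
print(round(pearson_r(test_a, test_b), 3), round(kendall_tau_a(test_a, test_b), 3))
```

    Pearson r reflects linear association between the raw scores, while tau depends only on their rank order, which may be why a study would report both.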

  2. Equating Test Scores Using the Linear Method: A Primer.

    ERIC Educational Resources Information Center

    Tanguma, Jesus

    This paper describes four commonly used designs in equating test scores. These designs are: (1) single-group; (2) random-group; (3) equivalent-group; and (4) anchor-test. Each design requires that its data be collected according to specific guidelines. Three of the four methods are illustrated through hypothetical examples. All four methods try to…
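
    As a rough illustration of the linear method, the sketch below maps a score on one form onto the scale of another by matching means and standard deviations. It assumes the textbook linear-equating relation y = mean_Y + (sd_Y / sd_X)(x - mean_X); the function name and the score lists are hypothetical, and the paper's own worked examples are not reproduced here.

```python
from statistics import mean, pstdev


def linear_equate(x, scores_x, scores_y):
    """Map a score x on form X to the form Y scale by matching mean and SD."""
    mean_x, mean_y = mean(scores_x), mean(scores_y)
    sd_x, sd_y = pstdev(scores_x), pstdev(scores_y)
    if sd_x == 0:
        raise ValueError("form X scores have zero variance")
    return mean_y + (sd_y / sd_x) * (x - mean_x)


# Hypothetical score distributions for two forms (illustration only).
form_x = [10, 12, 14, 16, 18]
form_y = [20, 24, 28, 32, 36]
print(linear_equate(16, form_x, form_y))
```

    Which groups supply the two score lists is the part that depends on the data-collection design (single-group, random-group, and so on).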

  3. The Uses and Misuses of Test Scores: Technical Assistance Perspective.

    ERIC Educational Resources Information Center

    Echternacht, Gary

    The uses and misuses of standardized test results used for program evaluation as seen by a staff member of an Elementary Secondary Education Act (ESEA) Title I Technical Assistance Center are described. In ESEA Title I, test scores are used to select students for the program. Although federal requirements do not require using standardized test…

  4. Effort Analysis: Individual Score Validation of Achievement Test Data

    ERIC Educational Resources Information Center

    Wise, Steven L.

    2015-01-01

    Whenever the purpose of measurement is to inform an inference about a student's achievement level, it is important that we be able to trust that the student's test score accurately reflects what that student knows and can do. Such trust requires the assumption that a student's test event is not unduly influenced by construct-irrelevant factors…

  5. Effort Analysis: Individual Score Validation of Achievement Test Data

    ERIC Educational Resources Information Center

    Wise, Steven L.

    2015-01-01

    Whenever the purpose of measurement is to inform an inference about a student's achievement level, it is important that we be able to trust that the student's test score accurately reflects what that student knows and can do. Such trust requires the assumption that a student's test event is not unduly influenced by construct-irrelevant factors…

  6. Motivating High School Students to Score Proficient on State Tests

    ERIC Educational Resources Information Center

    Brown, Sarah Lee

    2015-01-01

    The researcher interviewed two groups of eleventh grade students, in a rural Appalachian setting, who tended to score low on the state mandated high stakes/low stakes test to discover their efforts on the test, specifically in reading, and to obtain their opinions concerning the effects of a specific incentive or consequence. Before the eleventh…

  7. High Test Scores: The Wrong Road to National Economic Success

    ERIC Educational Resources Information Center

    Baker, Keith

    2011-01-01

    A widely held view is that good schools are essential to a nation's international economic success and that high test scores on international tests of academic skills and knowledge indicate how good a nation's schools are. The widespread belief that good schools are an important contributor to a nation's economic success in the world is supported…

  8. A prognostic scoring system for arm exercise stress testing

    PubMed Central

    Xie, Yan; Xian, Hong; Chandiramani, Pooja; Bainter, Emily; Wan, Leping; Martin, Wade H

    2016-01-01

    Objective: Arm exercise stress testing may be an equivalent or better predictor of mortality outcome than pharmacological stress imaging for the ≥50% of patients unable to perform leg exercise. Thus, our objective was to develop an arm exercise ECG stress test scoring system, analogous to the Duke Treadmill Score, for predicting outcome in these individuals. Methods: In this retrospective observational cohort study, arm exercise ECG stress tests were performed in 443 consecutive veterans aged 64.1 (11.1) years (mean (SD)) between 1997 and 2002. From multivariate Cox models, arm exercise scores were developed for prediction of 5-year and 12-year all-cause and cardiovascular mortality and 5-year cardiovascular mortality or myocardial infarction (MI). Results: Arm exercise capacity in resting metabolic equivalents (METs), 1 min heart rate recovery (HRR) and ST segment depression ≥1 mm were the stress test variables independently associated with all-cause and cardiovascular mortality by step-wise Cox analysis (all p<0.01). A score based on the relation HRR (bpm)+7.3×METs−10.5×ST depression (0=no; 1=yes) prognosticated 5-year cardiovascular mortality with a C-statistic of 0.81 before and 0.88 after adjustment for significant demographic and clinical covariates. Arm exercise scores for the other outcome end points yielded C-statistic values of 0.77–0.79 before and 0.82–0.86 after adjustment for significant covariates versus 0.64–0.72 for best fit pharmacological myocardial perfusion imaging models in a cohort of 1730 veterans who were evaluated over the same time period. Conclusions: Arm exercise scores, analogous to the Duke Treadmill Score, have good power for prediction of mortality or MI in patients who cannot perform leg exercise. PMID:26835142
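
    The score relation quoted in this abstract is simple arithmetic, and a minimal sketch of it follows. The function name and the example inputs are hypothetical. The abstract gives no risk cut-offs, so the code only evaluates the relation and makes no clinical interpretation.

```python
def arm_exercise_score(hrr_bpm, mets, st_depression):
    """Evaluate HRR (bpm) + 7.3 x METs - 10.5 x ST depression (0 = no, 1 = yes)."""
    st_flag = 1 if st_depression else 0
    return hrr_bpm + 7.3 * mets - 10.5 * st_flag


# Hypothetical patient values (illustration only, not from the cohort).
without_st = arm_exercise_score(hrr_bpm=20, mets=5, st_depression=False)
with_st = arm_exercise_score(hrr_bpm=20, mets=5, st_depression=True)
print(round(without_st, 1), round(with_st, 1))
```

    With heart rate recovery and exercise capacity held fixed, ST depression of 1 mm or more lowers the score by 10.5 points.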

  9. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Alaska

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Alaska's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grades 4 and 8 in math and grade 8 in reading. In grade 4 reading, the percentage reaching the…

  10. The Probability of Obtaining Two Statistically Different Test Scores as a Test Index

    ERIC Educational Resources Information Center

    Muller, Jorg M.

    2006-01-01

    A new test index is defined as the probability of obtaining two randomly selected test scores (PDTS) as statistically different. After giving a concept definition of the test index, two simulation studies are presented. The first analyzes the influence of the distribution of test scores, test reliability, and sample size on PDTS within classical…

  11. Background Variables, Levels of Aggregation, and Standardized Test Scores

    ERIC Educational Resources Information Center

    Paulson, Sharon E.; Marchant, Gregory J.

    2009-01-01

    This article examines the role of student demographic characteristics in standardized achievement test scores at both the individual level and aggregated at the state, district, school levels. For several data sets, the majority of the variance among states, districts, and schools was related to demographic characteristics. Where these background…

  12. Univariate and Bivariate Loglinear Models for Discrete Test Score Distributions.

    ERIC Educational Resources Information Center

    Holland, Paul W.; Thayer, Dorothy T.

    2000-01-01

    Applied the theory of exponential families of distributions to the problem of fitting the univariate histograms and discrete bivariate frequency distributions that often arise in the analysis of test scores. Considers efficient computation of the maximum likelihood estimates of the parameters using Newton's Method and computationally efficient…

  13. Schooling and the Norming of Intelligence Test Scores.

    ERIC Educational Resources Information Center

    Cahan, Sorel

    2000-01-01

    Discusses the effects of schooling on the development of intelligence in children and how the amount of schooling should be considered when developing norms for turning intelligence test performance into IQ scores. Suggests that because of differences in schooling among same-age children, use of age-based norms results in biased deviation IQs.…

  14. Using Test Scores from Students with Disabilities in Teacher Evaluation

    ERIC Educational Resources Information Center

    Buzick, Heather M.; Jones, Nathan D.

    2015-01-01

    Much of the recent focus of educational policymakers has been on improving the measurement of teacher effectiveness. Linking student growth to teacher effects has been a large part of reform efforts. To date, neither researchers nor practitioners have arrived at a consensus on how to treat test scores from students with disabilities in…

  15. Study Finds Link between Quality Music Programs, Test Scores

    ERIC Educational Resources Information Center

    Teaching Music, 2007

    2007-01-01

    A recent study found that students in high-quality school music education programs score higher on standardized tests compared to students in schools with deficient music education programs. The study, which was published in the Winter 2006 issue of MENC's Journal for Research in Music Education, is the first to examine the quality of school music…

  16. Source Country Differences in Test Score Gaps: Evidence from Denmark

    ERIC Educational Resources Information Center

    Rangvid, Beatrice Schindler

    2010-01-01

    We combine data from three studies for Denmark in the PISA 2000 framework to investigate differences in the native-immigrant test score gap by country of origin. In addition to the controls available from PISA data sources, we use student-level data on home background and individual migration histories linked from administrative registers. We find…

  17. Small Classes Do Reduce the Test-Score Achievement Gap.

    ERIC Educational Resources Information Center

    Achilles, C. M.; Finn, J. D.; Gerber, Susan B.

    Tennessee's Project STAR, a randomized experiment involving almost 12,000 pupils, demonstrated convincingly that small classes in the early elementary (K-3) grades increase pupil performance, reduce the test-score achievement gap between or among different social groups, and can have long-lasting effects. The benefits are greater for minority…

  18. Assessment Test Scores of Incoming Students, Fall 2001.

    ERIC Educational Resources Information Center

    Negron, Maggie; Breindel, Matthew

    This assessment of placement test scores in reading, math, and sentence skills from incoming students at College of the Desert (California) shows that students are overwhelmingly underprepared for study at the college. Only 15% of students were prepared in sentence skills, 27% in reading skills, 7% in math skills; only 3% were prepared in all 3…

  19. What We Lose in Winning the Test Score Race

    ERIC Educational Resources Information Center

    Jorgenson, Olaf

    2012-01-01

    To achieve perpetually better test results each year as mandated by the No Child Left Behind Act (NCLB), teachers in successful schools such as Leroy Anderson Elementary in San Jose, California, will "try anything" to raise scores, as the school's principal stated in an interview with "The San Jose Mercury News." In schools across California for…

  20. Using Test Scores from Students with Disabilities in Teacher Evaluation

    ERIC Educational Resources Information Center

    Buzick, Heather M.; Jones, Nathan D.

    2015-01-01

    Much of the recent focus of educational policymakers has been on improving the measurement of teacher effectiveness. Linking student growth to teacher effects has been a large part of reform efforts. To date, neither researchers nor practitioners have arrived at a consensus on how to treat test scores from students with disabilities in…

  1. School Choice in Suburbia: Test Scores, Race, and Housing Markets

    ERIC Educational Resources Information Center

    Dougherty, Jack; Harelson, Jeffrey; Maloney, Laura; Murphy, Drew; Smith, Russell; Snow, Michael; Zannoni, Diane

    2009-01-01

    Home buyers exercise school choice when shopping for a private residence due to its location in a public school district or attendance area. In this quantitative study of one Connecticut suburban district, we measure the effect of elementary school test scores and racial composition on home buyers' willingness to purchase single-family homes over…

  2. America's Mediocre Test Scores: Education Crisis or Poverty Crisis?

    ERIC Educational Resources Information Center

    Petrilli, Michael J.; Wright, Brandon L.

    2016-01-01

    At a time when the national conversation is focused on lagging upward mobility, it is no surprise that many educators point to poverty as the explanation for mediocre test scores among U.S. students compared to those of students in other countries. If American teachers in struggling U.S. schools taught in Finland, says Finnish educator Pasi…

  3. What We Lose in Winning the Test Score Race

    ERIC Educational Resources Information Center

    Jorgenson, Olaf

    2012-01-01

    To achieve perpetually better test results each year as mandated by the No Child Left Behind Act (NCLB), teachers in successful schools such as Leroy Anderson Elementary in San Jose, California, will "try anything" to raise scores, as the school's principal stated in an interview with "The San Jose Mercury News." In schools across California for…

  4. School Choice in Suburbia: Test Scores, Race, and Housing Markets

    ERIC Educational Resources Information Center

    Dougherty, Jack; Harelson, Jeffrey; Maloney, Laura; Murphy, Drew; Smith, Russell; Snow, Michael; Zannoni, Diane

    2009-01-01

    Home buyers exercise school choice when shopping for a private residence due to its location in a public school district or attendance area. In this quantitative study of one Connecticut suburban district, we measure the effect of elementary school test scores and racial composition on home buyers' willingness to purchase single-family homes over…

  5. Local Observed-Score Equating with Anchor-Test Designs

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Wiberg, Marie

    2010-01-01

    For traditional methods of observed-score equating with anchor-test designs, such as chain and poststratification equating, it is difficult to satisfy the criteria of equity and population invariance. Their equatings are therefore likely to be biased. The bias in these methods was evaluated against a simple local equating method in which the…

  6. America's Mediocre Test Scores: Education Crisis or Poverty Crisis?

    ERIC Educational Resources Information Center

    Petrilli, Michael J.; Wright, Brandon L.

    2016-01-01

    At a time when the national conversation is focused on lagging upward mobility, it is no surprise that many educators point to poverty as the explanation for mediocre test scores among U.S. students compared to those of students in other countries. If American teachers in struggling U.S. schools taught in Finland, says Finnish educator Pasi…

  7. Commentary on "Validating the Interpretations and Uses of Test Scores"

    ERIC Educational Resources Information Center

    Brennan, Robert L.

    2013-01-01

    Kane's paper "Validating the Interpretations and Uses of Test Scores" is the most complete and clearest discussion yet available of the argument-based approach to validation. At its most basic level, validation as formulated by Kane is fundamentally a simply-stated two-step enterprise: (1) specify the claims inherent in a particular interpretation…

  8. A Bad Idea: National Standards Based on Test Scores

    ERIC Educational Resources Information Center

    Baker, Keith

    2010-01-01

    The justification for national standards is that test scores predict a nation's future economic success. There is no evidence that supports this assumption. There is evidence that it is wrong. For more than half a century, reformers have been trying to fix our schools with little success. The obvious conclusion is that something that can't be…

  9. A Latent Class Approach to Estimating Test-Score Reliability

    ERIC Educational Resources Information Center

    van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas

    2011-01-01

    This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
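
    For orientation, the sketch below computes Cronbach's alpha, one of the single-administration methods this record says the framework covers. It uses the standard formula alpha = k/(k-1) x (1 - sum of item variances / variance of total scores). It does not implement the latent class reliability approach the study proposes, and the function name and toy response matrix are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows, one score per item."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_variances = [pvariance([row[i] for row in responses]) for i in range(k)]
    # Variance of the total (sum) score across respondents.
    total_variance = pvariance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: four respondents, three items scored 0 or 1.
responses = [[1, 1, 1], [1, 1, 0], [0, 1, 0], [0, 0, 0]]
print(round(cronbach_alpha(responses), 3))
```

    The same ratio results whether population or sample variances are used, as long as the choice is consistent for the items and the total.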

  10. Source Country Differences in Test Score Gaps: Evidence from Denmark

    ERIC Educational Resources Information Center

    Rangvid, Beatrice Schindler

    2010-01-01

    We combine data from three studies for Denmark in the PISA 2000 framework to investigate differences in the native-immigrant test score gap by country of origin. In addition to the controls available from PISA data sources, we use student-level data on home background and individual migration histories linked from administrative registers. We find…

  11. Score test for detecting linkage to quantitative traits.

    PubMed

    Putter, H; Sandkuijl, L A; van Houwelingen, J C

    2002-04-01

    The two most popular methods to detect linkage of a quantitative trait to a marker are the Haseman-Elston regression method and the variance components likelihood-ratio test. In the literature, these methods are frequently compared and the relative advantages and disadvantages of each method are well known. In this article, we derive a score test for the variance component attributable to a specific quantitative trait locus and show that for sib-pairs it is mathematically equivalent to a recently proposed version of the Haseman-Elston method that optimally combines the sum squared and the difference squared of the centered phenotype values of the sibs. Because score tests and likelihood-ratio tests are equivalent for large sample sizes, the variance components likelihood-ratio test is also asymptotically equivalent to this optimal Haseman-Elston test. This fact gives a theoretical explanation of the empirical observation from simulation studies reporting similar power of the variance components likelihood-ratio test and the optimal Haseman-Elston method. Perhaps more importantly for practical purposes, the score test can also be extended in a natural way to support the simultaneous analysis of more than two subjects and multivariate phenotypes. PMID:11984866

  12. Flow and diffusion of high-stakes test scores

    PubMed Central

    Marder, M.; Bansal, D.

    2009-01-01

    We apply visualization and modeling methods for convective and diffusive flows to public school mathematics test scores from Texas. We obtain plots that show the most likely future and past scores of students, the effects of random processes such as guessing, and the rate at which students appear in and disappear from schools. We show that student outcomes depend strongly upon economic class, and identify the grade levels where flows of different groups diverge most strongly. Changing the effectiveness of instruction in one grade naturally leads to strongly nonlinear effects on student outcomes in subsequent grades. PMID:19805049

  13. Simplifying multivariate survival analysis using global score test methodology

    NASA Astrophysics Data System (ADS)

    Zain, Zakiyah; Aziz, Nazrina; Ahmad, Yuhaniz

    2015-12-01

    In clinical trials, the main purpose is often to compare efficacy between experimental and control treatments. Treatment comparisons often involve multiple endpoints, and this situation further complicates the analysis of survival data. In the case of tumor patients, endpoints concerning survival times include: times from tumor removal until the first, the second and the third tumor recurrences, and time to death. For each patient, these endpoints are correlated, and the estimation of the correlation between two score statistics is fundamental in derivation of overall treatment advantage. In this paper, the bivariate survival analysis method using the global score test methodology is extended to multivariate setting.

  14. Nurse entrance test scores: a predictor of success.

    PubMed

    Ellis, Sherri Orso

    2006-01-01

    A program evaluation was conducted to determine if requiring higher scores on critical thinking components of the Nurse Entrance Test would have a positive effect on the percentage of students that could be retained in a diploma nursing program. The program evaluation revealed that using the Nurse Entrance Test as a tool for admissions screening, specifically portions of the examination that predict critical thinking, was effective in helping to predict success through level I nursing courses. PMID:17108789

  15. Neighborhood Social Context and Individual Polycyclic Aromatic Hydrocarbon Exposures Associated with Child Cognitive Test Scores

    PubMed Central

    Eldred-Skemp, Nicolia; Quinn, James W.; Chang, Hsin-wen; Rauh, Virginia A.; Rundle, Andrew; Orjuela, Manuela A.; Perera, Frederica P.

    2013-01-01

    Childhood cognitive and test-taking abilities have long-term implications for educational achievement and health, and may be influenced by household environmental exposures and neighborhood contexts. This study evaluates whether age 5 scores on the Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R, administered in English) are associated with polycyclic aromatic hydrocarbon (PAH) exposure and neighborhood context variables including poverty, low educational attainment, low English language proficiency, and inadequate plumbing. The Columbia Center for Children’s Environmental Health enrolled African-American and Dominican-American New York City women during pregnancy, and conducted follow-up for subsequent childhood health outcomes including cognitive test scores. Individual outcomes were linked to data characterizing 1-km network buffers around prenatal addresses, home observations, interviews, and prenatal PAH exposure data from personal air monitors. Prenatal PAH exposure above the median predicted 3.5 point lower total WPPSI-R scores and 3.9 point lower verbal scores; the association was similar in magnitude across models with adjustments for neighborhood characteristics. Neighborhood-level low English proficiency was independently associated with 2.3 point lower mean total WPPSI-R score, 1.2 point lower verbal score, and 2.7 point lower performance score per standard deviation. Low neighborhood-level educational attainment was also associated with 2.0 point lower performance scores. In models examining effect modification, neighborhood associations were similar or diminished among the high PAH exposure group, as compared with the low PAH exposure group. Early life exposure to personal PAH exposure or selected neighborhood-level social contexts may predict lower cognitive test scores. However, these results may reflect limited geographic exposure variation and limited generalizability. PMID:24994947

  16. Neighborhood Social Context and Individual Polycyclic Aromatic Hydrocarbon Exposures Associated with Child Cognitive Test Scores.

    PubMed

    Lovasi, Gina S; Eldred-Skemp, Nicolia; Quinn, James W; Chang, Hsin-Wen; Rauh, Virginia A; Rundle, Andrew; Orjuela, Manuela A; Perera, Frederica P

    2014-07-01

    Childhood cognitive and test-taking abilities have long-term implications for educational achievement and health, and may be influenced by household environmental exposures and neighborhood contexts. This study evaluates whether age 5 scores on the Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R, administered in English) are associated with polycyclic aromatic hydrocarbon (PAH) exposure and neighborhood context variables including poverty, low educational attainment, low English language proficiency, and inadequate plumbing. The Columbia Center for Children's Environmental Health enrolled African-American and Dominican-American New York City women during pregnancy, and conducted follow-up for subsequent childhood health outcomes including cognitive test scores. Individual outcomes were linked to data characterizing 1-km network buffers around prenatal addresses, home observations, interviews, and prenatal PAH exposure data from personal air monitors. Prenatal PAH exposure above the median predicted 3.5 point lower total WPPSI-R scores and 3.9 point lower verbal scores; the association was similar in magnitude across models with adjustments for neighborhood characteristics. Neighborhood-level low English proficiency was independently associated with 2.3 point lower mean total WPPSI-R score, 1.2 point lower verbal score, and 2.7 point lower performance score per standard deviation. Low neighborhood-level educational attainment was also associated with 2.0 point lower performance scores. In models examining effect modification, neighborhood associations were similar or diminished among the high PAH exposure group, as compared with the low PAH exposure group. Early life exposure to personal PAH exposure or selected neighborhood-level social contexts may predict lower cognitive test scores. However, these results may reflect limited geographic exposure variation and limited generalizability. PMID:24994947

  17. Genetic analysis of California mastitis test records. II. Score for resistance to elevated tests.

    PubMed

    Alrawi, A A; Laben, R C; Pollak, E J

    1979-07-01

    A lactation score for rating individual cows for their apparent resistance to elevation of California Mastitis test is described. The score uses the first nine monthly California Mastitis Tests weighted by the number and position of elevated coded tests. California mastitis test readings of negative and trace were coded "normal" and 1, 2, or 3 "elevated". A cumulative lactation score of 21 was assigned to lactations without elevation in coded test, and a score of zero was assigned to lactations with all nine tests elevated. The number and position of the elevated coded tests influenced the 305-day milk yield, and the position of elevated coded tests influenced lactation persistency. Differences were significant among sire progeny groups for the cumulative lactation score. Heritabilities for the cumulative lactation score were .48 +/- .07, .36 +/- .08, .46 +/- .15, and .23 +/- .12 for first, second, third, and fourth or later lactation groups. Selection for a high cumulative lactation score should reduce the occurrence of elevated coded test scores. The genetic correlation between 305-day milk yield and cumulative score was -.31 +/- .13 for first lactation records. PMID:512135

  18. Which Test? Whose Scores? Comparing Standardized Critical Thinking Tests

    ERIC Educational Resources Information Center

    Hatcher, Donald L.

    2011-01-01

    In this article, after describing one approach for teaching critical thinking (CT) that was in place at Baker University from 1990 to 2008, the author describes their experience assessing CT using three standardized exams and shows why the choice of a standardized CT test can be problematic and the results misleading. These results can be…

  19. The Use of Confidence Intervals When Interpreting Test Scores. EREAPA Publication Series No. 93-4.

    ERIC Educational Resources Information Center

    Wheeler, Patricia H.

    A person's obtained score on a test provides an estimate of the individual's "true" score on that test. The obtained score is considered to have two parts, the true component and the error component. Classical test theory assumes that obtained scores for an individual over multiple administrations of the same test will lie symmetrically around the…
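
    One common way to build such an interval in classical test theory is to place a band of plus or minus z standard errors of measurement around the obtained score, with SEM = SD x sqrt(1 - reliability). The sketch below assumes that textbook formula and a normal error distribution. It may not be the exact procedure this publication recommends, and the function name and example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def score_confidence_interval(obtained, sd, reliability, level=0.95):
    """Return (lower, upper) bounds around an obtained score using the SEM."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    sem = sd * sqrt(1 - reliability)           # standard error of measurement
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    return obtained - z * sem, obtained + z * sem


# Hypothetical test: SD of 15 and reliability of 0.91, obtained score of 100.
lower, upper = score_confidence_interval(100, sd=15, reliability=0.91)
print(round(lower, 2), round(upper, 2))
```

    A less reliable test gives a larger SEM and therefore a wider interval around the same obtained score.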

  20. Discrepancies between modified Medical Research Council dyspnea score and COPD assessment test score in patients with COPD

    PubMed Central

    Rhee, Chin Kook; Kim, Jin Woo; Hwang, Yong Il; Lee, Jin Hwa; Jung, Ki-Suck; Lee, Myung Goo; Yoo, Kwang Ha; Lee, Sang Haak; Shin, Kyeong-Cheol; Yoon, Hyoung Kyu

    2015-01-01

    Background and objective: According to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines, either a modified Medical Research Council (mMRC) dyspnea score of ≥2 or a chronic obstructive pulmonary disease (COPD) assessment test (CAT) score of ≥10 is considered to represent COPD patients who are more symptomatic. We aimed to identify the ideal CAT score that exhibits minimal discrepancy with the mMRC score. Methods: A receiver operating characteristic curve of the CAT score was generated for mMRC scores of 1 and 2. A concordance analysis was applied to quantify the association between the frequencies of patients categorized into GOLD groups A–D using symptom cutoff points. A κ-coefficient was calculated. Results: For an mMRC score of 2, a CAT score of 15 showed the maximum value of Youden’s index with a sensitivity and specificity of 0.70 and 0.66, respectively (area under the receiver operating characteristic curve [AUC] 0.74; 95% confidence interval [CI], 0.70–0.77). For an mMRC score of 1, a CAT score of 10 showed the maximum value of Youden’s index with a sensitivity and specificity of 0.77 and 0.65, respectively (AUC 0.77; 95% CI, 0.72–0.83). The κ value for concordance was highest between an mMRC score of 1 and a CAT score of 10 (0.66), followed by an mMRC score of 2 and a CAT score of 15 (0.56), an mMRC score of 2 and a CAT score of 10 (0.47), and an mMRC score of 1 and a CAT score of 15 (0.43). Conclusion: A CAT score of 10 was most concordant with an mMRC score of 1 when classifying patients with COPD into GOLD groups A–D. However, a discrepancy remains between the CAT and mMRC scoring systems. PMID:26316736
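
    Youden's index, used above to pick the CAT cutoffs, is sensitivity + specificity - 1. The short sketch below applies that definition to the two sensitivity and specificity pairs reported in the abstract. The function name and the dictionary layout are hypothetical, and the index values are computed here rather than quoted from the paper.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


# (sensitivity, specificity) pairs as reported in the abstract.
reported = {
    "mMRC 2 vs CAT 15": (0.70, 0.66),
    "mMRC 1 vs CAT 10": (0.77, 0.65),
}
for label, (sens, spec) in reported.items():
    print(label, round(youden_index(sens, spec), 2))
```

    The index runs from 0 for a cutoff no better than chance to 1 for a cutoff with perfect sensitivity and specificity.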

  1. The Relationship of Scores on Elizur's Hostility System on the Rorschach to the Acting-Out Score on the Hand Test.

    ERIC Educational Resources Information Center

    Martin, John D.; And Others

    1978-01-01

    The relationship between Elizur's Hostility Scoring on the Rorschach Test and the Acting-Out Score on the Hand Test was examined. Correlations between the two measures (using several scoring procedures) ranged from .40 to .64. (JKS)

  2. School accountability and the black-white test score gap.

    PubMed

    Gaddis, S Michael; Lauen, Douglas Lee

    2014-03-01

    Since at least the 1960s, researchers have closely examined the respective roles of families, neighborhoods, and schools in producing the black-white achievement gap. Although many researchers minimize the ability of schools to eliminate achievement gaps, the No Child Left Behind Act (NCLB) increased pressure on schools to do so by 2014. In this study, we examine the effects of NCLB's subgroup-specific accountability pressure on changes in black-white math and reading test score gaps using a school-level panel dataset on all North Carolina public elementary and middle schools between 2001 and 2009. Using difference-in-difference models with school fixed effects, we find that accountability pressure reduces black-white achievement gaps by raising mean black achievement without harming mean white achievement. We find no differential effects of accountability pressure based on the racial composition of schools, but schools with more affluent populations are the most successful at reducing the black-white math achievement gap. Thus, our findings suggest that school-based interventions have the potential to close test score gaps, but differences in school composition and resources play a significant role in the ability of schools to reduce racial inequality. PMID:24468431

  3. TEST-DAY MILK LOSS ASSOCIATED WITH ELEVATED TEST-DAY SOMATIC CELL SCORE

    Technology Transfer Automated Retrieval System (TEKTRAN)

    To determine usefulness of current and previous test-day somatic cell score (SCS) in predicting test-day milk yield, test-day records from Holstein first and second calvings between 1995 and 2002 were examined. Initial selection required that cows have at least the first four test days with recorde...

  4. Directions for Scoring Typing Tests Taken Either on a Typewriter or a Computer.

    ERIC Educational Resources Information Center

    Kump, Ann

    Directions are given for scoring typing tests taken on a typewriter or on a computer using special software. The speed score (gross words per minute) is obtained by determining the total number of strokes typed, and dividing by 25. The accuracy score is obtained by comparing the examinee's test paper to the appropriate scoring key and counting the…

  5. Teacher Education Students: A Look at Basic Skills Admission Tests and National Teacher Examination Scores.

    ERIC Educational Resources Information Center

    Clawson, Kenneth

    This study examined the relationship between teacher education students' scores on basic skills admission tests and graduating seniors' scores on the National Teacher Examinations (NTE) at Eastern Kentucky University. The 1981-82 basic skills test scores for 262 teacher education students were compared with their NTE scores taken in 1984-85 during…

  6. Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions

    ERIC Educational Resources Information Center

    Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J.

    2010-01-01

    Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…

  7. Student test scores are improved in a virtual learning environment.

    PubMed

    Goldberg, H R; McKhann, G M

    2000-06-01

    This study evaluates the effectiveness of delivering the core curriculum of an introductory neuroscience course using a software application referred to as a virtual learning interface (VLI). The performance of students in a virtual learning environment (VLE) is compared with that of students in a conventional lecture hall in which the same lecturer presented the same material. This study was not designed to determine whether grades are improved by augmenting a lecture with other information. The VLI takes advantage of audio, video, animation, and text in a multimedia computer environment. Our results indicate that raw average scores on weekly examinations were 14 percentage points higher for students in the VLE compared with those for students in a conventional lecture hall setting. Moreover, normalized test scores were over 5 points higher for students in the VLE. This analysis suggests that a core curriculum can be effectively presented to students using the VLE, thereby making it possible for faculty to spend less class time relaying facts and more time engaging students in discussion of scientific theory. PMID:10902528

  8. Money Improves Test Scores--Even State-Level SATs.

    ERIC Educational Resources Information Center

    Bracey, Gerald W.

    1996-01-01

    Three former secretaries of education--William Bennett, Lauro Cavazos, and Terrel Bell--have touted state-level SAT scores as proof that educational financing does not matter. Recently, Brian Powell and Lala Carr Steelman adjusted scores for participation rate and detected a very strong relationship between expenditures and SAT scores. Bigger…

  9. 21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 21 Food and Drugs 8 2014-04-01 2014-04-01 false Ovarian adnexal mass assessment score test system... immunological Test Systems § 866.6050 Ovarian adnexal mass assessment score test system. (a) Identification. An...: Ovarian Adnexal Mass Assessment Score Test System.” For the availability of this guidance document,...

  10. 21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 21 Food and Drugs 8 2013-04-01 2013-04-01 false Ovarian adnexal mass assessment score test system... immunological Test Systems § 866.6050 Ovarian adnexal mass assessment score test system. (a) Identification. An...: Ovarian Adnexal Mass Assessment Score Test System.” For the availability of this guidance document,...

  11. 21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 21 Food and Drugs 8 2012-04-01 2012-04-01 false Ovarian adnexal mass assessment score test system... immunological Test Systems § 866.6050 Ovarian adnexal mass assessment score test system. (a) Identification. An...: Ovarian Adnexal Mass Assessment Score Test System.” For the availability of this guidance document,...

  12. Not Your Parents’ Test Scores: Cohort Reduces Psychometric Aging Effects

    PubMed Central

    Zelinski, Elizabeth M.; Kennison, Robert F.

    2014-01-01

    Increases over birth cohorts in psychometric abilities may impact effects of aging. Data from 2 cohorts of the Long Beach Longitudinal Study, matched on age but tested 16 years apart, were modeled over ages 55–87 to test the hypothesis that the more fluid abilities of reasoning, list and text recall, and space would show larger cohort differences than vocabulary. This hypothesis was confirmed. At age 74, average performance estimates for people from the more recently born cohort were equivalent to those of people from the older cohort when they were up to 15 years younger. This finding suggests that older adults may perform like much younger ones from the previous generation on fluid measures, indicating higher levels of abilities than expected. This result could have major implications for the expected productivity of an aging workforce as well as for the quality of life of future generations. However, cohort improvements did not mitigate age declines. PMID:17874953

  13. Automated Scoring of Short-Answer Reading Items: Implications for Constructs

    ERIC Educational Resources Information Center

    Carr, Nathan T.; Xi, Xiaoming

    2010-01-01

    This article examines how the use of automated scoring procedures for short-answer reading tasks can affect the constructs being assessed. In particular, it highlights ways in which the development of scoring algorithms intended to apply the criteria used by human raters can lead test developers to reexamine and even refine the constructs they…

  14. Using Test-Taking Skills to Improve Students' Standardized Test Scores.

    ERIC Educational Resources Information Center

    Bowker, Mary; Irish, Barbara

    As an action research project, a program was developed to improve test-taking skills to increase standardized test scores. The targeted population was high school juniors in a small Midwestern community in west central Illinois. The problem of low standardized test achievement was documented through data that revealed that students fell below the…

  15. Using Patterns of Summed Scores in Paper-and-Pencil Tests and Computer-Adaptive Tests to Detect Misfitting Item Score Patterns

    ERIC Educational Resources Information Center

    Meijer, Rob R.

    2004-01-01

    Two new methods have been proposed to determine unexpected sum scores on sub-tests (testlets) both for paper-and-pencil tests and computer adaptive tests. A method based on a conservative bound using the hypergeometric distribution, denoted p, was compared with a method where the probability for each score combination was calculated using a…

  16. Developing Test Score Reports that Work: The Process and Best Practices for Effective Communication

    ERIC Educational Resources Information Center

    Zenisky, April L.; Hambleton, Ronald K.

    2012-01-01

    Test scores matter these days. Test-takers want to understand how they performed, and test score reports, particularly those for individual examinees, are the vehicles by which most people get the bulk of this information. Historically, score reports have not always met the examinees' information or usability needs, but this is clearly changing…

  17. The SAT® and SAT Subject Tests™: Discrepant Scores and Incremental Validity. Research Report 2012-2

    ERIC Educational Resources Information Center

    Kobrin, Jennifer L.; Patterson, Brian F.

    2012-01-01

    This study examines student performance on the SAT and SAT Subject Tests in order to identify groups of students who score differently on these two tests, and to determine whether certain demographic groups score higher on one test compared to the other. Discrepancy scores were created to capture individuals' performance differences on the…

  18. Equivalent Grade Equivalent Scores between Metro '78 and the 1973 Stanford Achievement Test. Metropolitan Achievement Tests Special Report Number 21.

    ERIC Educational Resources Information Center

    Psychological Corp., New York, NY.

    Ten tables present equivalent scores between the 1978 Metropolitan Achievement Tests and the 1973 Stanford Achievement Test in terms of grade equivalent (GE) scores. These data were derived empirically by administering the two tests to two groups of students matched in terms of Otis-Lennon Mental Ability Test scores. Equivalent GEs were determined…

  19. Evidence-Based Decision about Test Scoring Rules in Clinical Anatomy Multiple-Choice Examinations

    ERIC Educational Resources Information Center

    Severo, Milton; Gaio, A. Rita; Povo, Ana; Silva-Pereira, Fernanda; Ferreira, Maria Amélia

    2015-01-01

    In theory the formula scoring methods increase the reliability of multiple-choice tests in comparison with number-right scoring. This study aimed to evaluate the impact of the formula scoring method in clinical anatomy multiple-choice examinations, and to compare it with that from the number-right scoring method, hoping to achieve an…

  20. The Formalization of Fairness: Issues in Testing for Measurement Invariance Using Subtest Scores

    ERIC Educational Resources Information Center

    Molenaar, Dylan; Borsboom, Denny

    2013-01-01

    Measurement invariance is an important prerequisite for the adequate comparison of group differences in test scores. In psychology, measurement invariance is typically investigated by means of linear factor analyses of subtest scores. These subtest scores typically result from summing the item scores. In this paper, we discuss 4 possible problems…

  2. Using MCW-APM Test Scoring to Evaluate Economics Curricula.

    ERIC Educational Resources Information Center

    Bruno, James E.

    1989-01-01

    Reports on a study that explored the use of a scoring procedure called Modified Confidence Weighted-Admissible Probability Measurement (MCW-APM) to evaluate curriculum design and to assess students' knowledge of economic concepts. Concluded that the MCW-APM scoring method can help teachers develop curricula to meet specific student needs. (SLM)

  3. Test/score/report: Simulation techniques for automating the test process

    NASA Technical Reports Server (NTRS)

    Hageman, Barbara H.; Sigman, Clayton B.; Koslosky, John T.

    1994-01-01

    A Test/Score/Report capability is currently being developed for the Transportable Payload Operations Control Center (TPOCC) Advanced Spacecraft Simulator (TASS) system which will automate testing of the Goddard Space Flight Center (GSFC) Payload Operations Control Center (POCC) and Mission Operations Center (MOC) software in three areas: telemetry decommutation, spacecraft command processing, and spacecraft memory load and dump processing. Automated computer control of the acceptance test process is one of the primary goals of a test team. With the proper simulation tools and user interface, the task of acceptance testing, regression testing, and repeatability of specific test procedures of a ground data system can be a simpler task. Ideally, the goal for complete automation would be to plug the operational deliverable into the simulator, press the start button, execute the test procedure, accumulate and analyze the data, score the results, and report the results to the test team along with a go/no recommendation to the test team. In practice, this may not be possible because of inadequate test tools, pressures of schedules, limited resources, etc. Most tests are accomplished using a certain degree of automation and test procedures that are labor intensive. This paper discusses some simulation techniques that can improve the automation of the test process. The TASS system tests the POCC/MOC software and provides a score based on the test results. The TASS system displays statistics on the success of the POCC/MOC system processing in each of the three areas as well as event messages pertaining to the Test/Score/Report processing. The TASS system also provides formatted reports documenting each step performed during the tests and the results of each step. A prototype of the Test/Score/Report capability is available and currently being used to test some POCC/MOC software deliveries. When this capability is fully operational it should greatly reduce the time necessary to test a POCC/MOC software delivery, as well as improve the quality of the test process.

  4. Test/score/report: Simulation techniques for automating the test process

    NASA Astrophysics Data System (ADS)

    Hageman, Barbara H.; Sigman, Clayton B.; Koslosky, John T.

    1994-11-01

    A Test/Score/Report capability is currently being developed for the Transportable Payload Operations Control Center (TPOCC) Advanced Spacecraft Simulator (TASS) system which will automate testing of the Goddard Space Flight Center (GSFC) Payload Operations Control Center (POCC) and Mission Operations Center (MOC) software in three areas: telemetry decommutation, spacecraft command processing, and spacecraft memory load and dump processing. Automated computer control of the acceptance test process is one of the primary goals of a test team. With the proper simulation tools and user interface, the task of acceptance testing, regression testing, and repeatability of specific test procedures of a ground data system can be a simpler task. Ideally, the goal for complete automation would be to plug the operational deliverable into the simulator, press the start button, execute the test procedure, accumulate and analyze the data, score the results, and report the results to the test team along with a go/no recommendation to the test team. In practice, this may not be possible because of inadequate test tools, pressures of schedules, limited resources, etc. Most tests are accomplished using a certain degree of automation and test procedures that are labor intensive. This paper discusses some simulation techniques that can improve the automation of the test process. The TASS system tests the POCC/MOC software and provides a score based on the test results. The TASS system displays statistics on the success of the POCC/MOC system processing in each of the three areas as well as event messages pertaining to the Test/Score/Report processing. The TASS system also provides formatted reports documenting each step performed during the tests and the results of each step. A prototype of the Test/Score/Report capability is available and currently being used to test some POCC/MOC software deliveries. When this capability is fully operational it should greatly reduce the time necessary to test a POCC/MOC software delivery, as well as improve the quality of the test process.

  5. Predicting Teacher Certification Success: The Effect of Cumulative Grade Point Average and Preprofessional Academic Skills Test Scores on Testing Performance

    ERIC Educational Resources Information Center

    Hernandez, Barbara L. Michiels; Ward, Susan; Strickland, George

    2006-01-01

    Legislative mandates and reforms hold universities accountable for student certification test performance. The purpose of this investigation was to determine if cumulative grade point average scores and the preprofessional academic skills test scores predict performance on elementary certification test (professional development) scores of…

  6. On the Study of Matching Cut-Scores to Test Characteristics: An Observed Score Approach. Program Statistics Research Technical Report Series.

    ERIC Educational Resources Information Center

    Wainer, Howard

    Techniques derived from item response theory are useful for estimating the reliability of test classification above and below the cutting score. Test developers can construct a test whose information is peaked in the region of the cutting score; users can select a test which provides the most information in this region. The Cut-Score…

  7. Situational Effects May Account for Gain Scores in Cognitive Ability Testing: A Longitudinal SEM Approach

    ERIC Educational Resources Information Center

    Matton, Nadine; Vautier, Stephane; Raufaste, Eric

    2009-01-01

    Mean gain scores for cognitive ability tests between two sessions in a selection setting are now a robust finding, yet not fully understood. Many authors do not attribute such gain scores to an increase in the target abilities. Our approach consists of testing a longitudinal SEM model suitable to this view. We propose to model the scores' changes…

  8. The Predictive Efficiency of Achievement and Aptitude Test Data on Seventh Grade Mathematics Scores.

    ERIC Educational Resources Information Center

    Dossey, John A.; Jones, Marilyn Doran

    1980-01-01

    The computation, concept, and application subtests of the Stanford Achievement Tests (SAT) were administered to a student sample during grades 3, 5, and 7. The efficiency of earlier scores and Otis Lennon Mental Ability Test scores in predicting seventh-grade SAT math scores was examined and found to be weak. (SJL)

  9. Difficulty and Discriminating Indices of Three-Multiple Choice Tests Using the Confidence Scoring Procedure

    ERIC Educational Resources Information Center

    Omirin, M. S.

    2007-01-01

    The study compared the difficulty and discrimination indices of three multiple choice tests using the confidence scoring procedure (CSP). The study also set out to determine whether the difficulty and discrimination indices would be improved if the tests were scored by the confidence scoring procedure. Two null…

  11. Test Scores, Class Rank and College Performance: Lessons for Broadening Access and Promoting Success

    PubMed Central

    Niu, Sunny X.; Tienda, Marta

    2012-01-01

    Using administrative data for five Texas universities that differ in selectivity, this study evaluates the relative influence of two key indicators for college success—high school class rank and standardized tests. Empirical results show that class rank is the superior predictor of college performance and that test score advantages do not insulate lower ranked students from academic underperformance. Using the UT-Austin campus as a test case, we conduct a simulation to evaluate the consequences of capping students admitted automatically using both achievement metrics. We find that using class rank to cap the number of students eligible for automatic admission would have roughly uniform impacts across high schools, but imposing a minimum test score threshold on all students would have highly unequal consequences by greatly reducing the admission eligibility of the highest performing students who attend poor high schools while not jeopardizing admissibility of students who attend affluent high schools. We discuss the implications of the Texas admissions experiment for higher education in Europe. PMID:23788828

  12. Does Weight Affect Children's Test Scores and Teacher Assessments Differently?

    ERIC Educational Resources Information Center

    Zavodny, Madeline

    2013-01-01

    The prevalence of childhood overweight and obesity increased dramatically in the United States during the past three decades. This increase has adverse public health implications, but its implication for children's academic outcomes is less clear. This paper uses data from five waves of the Early Childhood Longitudinal Study-Kindergarten to…

  13. The relationship between selected standardized test scores and performance in advanced placement math and science exams: Analyzing the differential effectiveness of scores for course identification and placement

    NASA Astrophysics Data System (ADS)

    Urbina, Josue N.

    There is a national need to increase the STEM-related workforce. Factors leading towards STEM careers include the number of advanced high school mathematics and science courses students complete. Florida's enrollment patterns in STEM-related Advanced Placement (AP) courses, however, reveal that only a small percentage of students enroll in these classes. Therefore, screening tools are needed to find more students for these courses who are academically ready yet have not been identified. The purpose of this study was to investigate the extent to which scores from a national standardized test, Preliminary Scholastic Assessment Test/National Merit Qualifying Test (PSAT/NMSQT), in conjunction with and compared to a state-mandated standardized test, Florida Comprehensive Assessment Test (FCAT), are related to selected AP exam performance in Seminole County Public Schools. An ex post facto correlational study was conducted using 6,189 student records from the 2010-2012 academic years. Multiple regression analyses using simultaneous Full Model testing showed differential moderate to strong relationships between scores in eight of the nine AP courses (i.e., Biology, Environmental Science, Chemistry, Physics B, Physics C Electrical, Physics C Mechanical, Statistics, Calculus AB and BC) examined. For example, the significant unique contribution to overall variance in AP scores was a linear combination of PSAT Math (M), Critical Reading (CR) and FCAT Reading (R) for Biology and Environmental Science. Moderate relationships for Chemistry included a linear combination of PSAT M, W (Writing) and FCAT M; a combination of FCAT M and PSAT M was most significantly associated with Calculus AB performance. These findings have implications for both research and practice. FCAT scores, in conjunction with PSAT scores, can potentially be used for specific STEM-related AP courses, as part of a systematic approach towards AP course identification and placement. For courses with moderate to strong relationships, validation studies and development of expectancy tables, which estimate the probability of successful performance on these AP exams, are recommended. Also, findings established a need to examine other related research issues including, but not limited to, extensive longitudinal studies and analyses of other available or prospective standardized test scores.

  14. Standardized Testing of Special Education Students: A Comparison of Service Type and Test Scores

    ERIC Educational Resources Information Center

    Hogan-Young, Christine

    2013-01-01

    The purpose of this study was to determine if there was a difference in Tennessee Comprehensive Assessment Program Modified Academic Achievement Standards (TCAP MAAS) achievement test scores for special education students who receive their instruction in the resource classroom or in an inclusion classroom. The study involved third, fourth, and…

  15. Evaluating the Impact of Test Accommodations on Test Scores of LEP Students & Non-LEP Students.

    ERIC Educational Resources Information Center

    Hafner, Anne L.

    Using a quasi-experimental analysis of variance (ANOVA) design, this project examined the effects of the use of accommodations with students of limited English proficiency (LEP) and non-LEP students and whether the use of accommodations affected the validity of test score interpretations. Major accommodations examined were extra time, and extra…

  16. How Parents Can Help Kids Improve Test Scores: Taking the Stakes out of Literacy Testing

    ERIC Educational Resources Information Center

    Schneider, Steven

    2006-01-01

    In order to meet the goals of No Child Left Behind, standardized testing is preeminent as the sole indicator determining whether states all across America demonstrate adequate yearly progress regarding the improvement of student achievement in literacy education. This book will help teachers and parents raise children's scores on standardized…

  17. Principles and Practices of Test Score Equating. Research Report. ETS RR-10-29

    ERIC Educational Resources Information Center

    Dorans, Neil J.; Moses, Tim P.; Eignor, Daniel R.

    2010-01-01

    Score equating is essential for any testing program that continually produces new editions of a test and for which the expectation is that scores from these editions have the same meaning over time. Particularly in testing programs that help make high-stakes decisions, it is extremely important that test equating be done carefully and accurately.…

  18. Unlabeling the Disabled: A Perspective on Flagging Scores from Accommodated Test Administrations

    ERIC Educational Resources Information Center

    Sireci, Stephen G.

    2005-01-01

    Accommodations to standard test administrations are granted on many tests for students who have one or more disabling conditions. In some instances, students' scores from these nonstandard administrations are "flagged" to caution those who interpret the test score that the test was not administered under typical conditions. The practice of…

  19. Why African American College Students Miss the Perfect Test Score

    ERIC Educational Resources Information Center

    Gentry, Ruben; Stokes, Dorothy

    2016-01-01

    Many African Americans were imbued with the cliché that they must work twice as hard as others to be a success in life. Entering college, students with this belief put extensive effort into earning top grades to ensure quality preparation for their chosen career; yet, some fail to earn top scores. Why? This is the million dollar question, but the…

  20. A New Method for Administering and Scoring Multiple-Choice Tests: Theoretical Considerations and Empirical Results.

    ERIC Educational Resources Information Center

    Cross, Lawrence H.; And Others

    A new scoring procedure for multiple choice tests attempts to assess partial knowledge and to restrict guessing. It is a variant of Coombs' elimination scoring method, adapted for use with the carbon-shield answer sheets commonly used with answer-until-correct scoring. Examinees are directed to erase the carbon shields of choices they are certain…

  2. Can Machine Scoring Deal with Broad and Open Writing Tests as Well as Human Readers?

    ERIC Educational Resources Information Center

    McCurry, Doug

    2010-01-01

    This article considers the claim that machine scoring of writing test responses agrees with human readers as much as humans agree with other humans. These claims about the reliability of machine scoring of writing are usually based on specific and constrained writing tasks, and there is reason for asking whether machine scoring of writing requires…

  4. Comparing Graphical and Verbal Representations of Measurement Error in Test Score Reports

    ERIC Educational Resources Information Center

    Zwick, Rebecca; Zapata-Rivera, Diego; Hegarty, Mary

    2014-01-01

    Research has shown that many educators do not understand the terminology or displays used in test score reports and that measurement error is a particularly challenging concept. We investigated graphical and verbal methods of representing measurement error associated with individual student scores. We created four alternative score reports, each…

  6. D.C. Student Test Scores Show Uneven Progress. Data Snapshot

    ERIC Educational Resources Information Center

    DuPre, Mary

    2011-01-01

    Over the past five years, both DC Public Schools (DCPS) and public charter schools (PCS) have seen significant growth in secondary reading and math scores on the state test known as the District of Columbia Comprehensive Assessment System (DC CAS). However, scores have not improved as much at the elementary level. Reading and math scores for DCPS…

  7. The Expanding Racial Scoring Gap between Black and White SAT Test Takers.

    ERIC Educational Resources Information Center

    Journal of Blacks in Higher Education, 2002

    2002-01-01

    Between 1976-88, the black-white scoring gap on the Scholastic Assessment Test closed significantly. The improvement in black scores was so strong that some educators predicted that within a generation, the gap would disappear. However, since 1988, the racial gap in SAT scores has become wider, with no compelling evidence that any improvement is…

  8. Noncognitive Skills and the Gender Disparities in Test Scores and Teacher Assessments: Evidence from Primary School

    ERIC Educational Resources Information Center

    Cornwell, Christopher; Mustard, David B.; Van Parys, Jessica

    2013-01-01

    Using data from the 1998-99 ECLS-K cohort, we show that the grades awarded by teachers are not aligned with test scores. Girls in every racial category outperform boys on reading tests, while boys score at least as well on math and science tests as girls. However, boys in all racial categories across all subject areas are not represented in…

  9. Are Score Comparisons across Language Proficiency Test Batteries Justified?: An IELTS-TOEFL Comparability Study.

    ERIC Educational Resources Information Center

    Geranpayeh, Ardeshir

    1994-01-01

    This paper reports on a study conducted to determine if comparisons between scores on the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) are justifiable. The test scores of 216 Iranian graduate students who took the TOEFL and IELTS, as well as the Iranian Ministry of Culture and Higher…

  10. Score Reporting in Teacher Certification Testing: A Review, Design, and Interview/Focus Group Study

    ERIC Educational Resources Information Center

    Klesch, Heather S.

    2010-01-01

    The reporting of scores on educational tests is at times misunderstood, misinterpreted, and potentially confusing to examinees and other stakeholders who may need to interpret test scores. In reporting test results to examinees, there is a need for clarity in the message communicated. As pressure rises for students to demonstrate performance at a…

  12. School Inputs, Household Substitution, and Test Scores. NBER Working Paper No. 16830

    ERIC Educational Resources Information Center

    Das, Jishnu; Dercon, Stefan; Habyarimana, James; Krishnan, Pramila; Muralidharan, Karthik; Sundararaman, Venkatesh

    2011-01-01

    Empirical studies of the relationship between school inputs and test scores typically do not account for the fact that households will respond to changes in school inputs. We present a dynamic household optimization model relating test scores to school and household inputs, and test its predictions in two very different low-income country…

  15. Social desirability bias in personality testing: Implications for astronaut selection

    NASA Astrophysics Data System (ADS)

    Sandal, Gro M.; Musson, Dave; Helmreich, Robert L.; Gravdal, Lene

    2005-07-01

    The assessment of personality is recognized by space agencies as an approach to identify candidates likely to perform optimally during spaceflights. In the use of personality scales for selection, the impact of social desirability (SD) has been cited as a concern. Study 1 addressed the impact of SD on responses to the Personality Characteristic Inventory (PCI) and NEO-FFI. This was achieved by contrasting scores from active astronauts (N=65) with scores of successful astronaut applicants (N=63), and between pilot applicants (N=1271) and pilot research subjects (N=120). Secondly, personality scores were correlated with scores on the Marlowe-Crowne Social Desirability Scale among applicants to managerial positions (N=120). The results indicated that SD inflated scores on PCI scales assessing negative interpersonal characteristics, and impacted on four of five scales in NEO-FFI. Still, the effect sizes were small or moderate. Study 2 addressed performance implications of SD during an assessment of males applying to work as rescue personnel operations in the North Sea (N=22). The results showed that SD correlated negatively with cognitive test performance, and positively with discrepancy in performance ratings between self and two observers. In conclusion, caution is needed in interpreting personality scores in applicant populations. SD may be a negative predictor for performance under stress.

  16. The increasing impact of socioeconomics and race on standardized academic test scores across elementary, middle, and high school.

    PubMed

    White, Gwyne W; Stepney, Cesalie T; Hatchimonji, Danielle Ryan; Moceri, Dominic C; Linsky, Arielle V; Reyes-Portillo, Jazmin A; Elias, Maurice J

    2016-01-01

    For students and schools, the current policy is to measure success via standardized testing. Yet the immutable factors of socioeconomic status (SES) and race have, consistently, been implicated in fostering an achievement gap. The current study explores, at the school-level, the impact of these factors on test scores. Percentage of students proficient for Language and Math was analyzed from 452 schools across the state of New Jersey. By high school, 52% of the variance in Language and 59% in Math test scores can be accounted for by SES and racial factors. At this level, a 1% increase in school minority population corresponds to a 0.19 decrease in percent Language proficient and 0.33 decrease for Math. These results have significant implications as they suggest that school-level interventions to improve academic achievement scores will be stymied by socioeconomic and racial factors and efforts to improve the achievement gap via testing have largely measured it. PMID:26752444

  17. Peer Effects and the Indigenous/Non-Indigenous Early Test-Score Gap in Peru

    ERIC Educational Resources Information Center

    Sakellariou, Chris

    2008-01-01

    This paper assesses the magnitude of the non-indigenous/indigenous test-score gap for third-year and fourth-year primary school pupils in Peru, in relation to the main family, school and peer inputs contributing to the test-score gap using the estimation method of feasible generalized least squares. The article then decomposes the gap into its…

  18. The Dynamics of the Evolution of the Black-White Test Score Gap

    ERIC Educational Resources Information Center

    Sohn, Kitae

    2012-01-01

    We apply a quantile version of the Oaxaca-Blinder decomposition to estimate the counterfactual distribution of the test scores of Black students. In the Early Childhood Longitudinal Study, Kindergarten Class of 1998-1999 (ECLS-K), we find that the gap initially appears only at the top of the distribution of test scores. As children age, however,…

  19. Scoring Yes-No Vocabulary Tests: Reaction Time vs. Nonword Approaches

    ERIC Educational Resources Information Center

    Pellicer-Sanchez, Ana; Schmitt, Norbert

    2012-01-01

    Despite a number of research studies investigating the Yes-No vocabulary test format, one main question remains unanswered: What is the best scoring procedure to adjust for testee overestimation of vocabulary knowledge? Different scoring methodologies have been proposed based on the inclusion and selection of nonwords in the test. However, there…

  20. Interpreting Standardized Test Scores: Strategies for Data-Driven Instructional Decision Making

    ERIC Educational Resources Information Center

    Mertler, Craig A.

    2007-01-01

    This book is designed to help K-12 teachers and administrators understand the nature of standardized tests and, in particular, the scores that result from them. This useful manual helps teachers develop the skills necessary to incorporate these test scores into various types of instructional decision making--a process known as "data-driven…

  1. Using Raters from India to Score a Large-Scale Speaking Test

    ERIC Educational Resources Information Center

    Xi, Xiaoming; Mollaun, Pam

    2011-01-01

    We investigated the scoring of the Speaking section of the Test of English as a Foreign Language[TM] Internet-based (TOEFL iBT[R]) test by speakers of English and one or more Indian languages. We explored the extent to which raters from India, after being trained and certified, were able to score the TOEFL examinees with mixed first languages…

  2. An Investigation of Methods for Improving Estimation of Test Score Distributions.

    ERIC Educational Resources Information Center

    Hanson, Bradley A.

    Three methods of estimating test score distributions that may improve on using the observed frequencies (OBFs) as estimates of a population test score distribution are considered: the kernel method (KM); the polynomial method (PM); and the four-parameter beta binomial method (FPBBM). The assumption each method makes about the smoothness of the…

  3. Linking Scores from Tests of Similar Content Given in Different Languages: An Illustration Involving Methodological Alternatives

    ERIC Educational Resources Information Center

    Cascallar, Alicia S.; Dorans, Neil J.

    2005-01-01

    This study compares two methods commonly used (concordance and prediction) to establish linkages between scores from tests of similar content given in different languages. Score linkages between the Verbal and Math sections of the SAT I and the corresponding sections of the Spanish-language admissions test, the Prueba de Aptitud Academica (PAA),…

  4. Individual Part Score Profiles of Children with Intellectual Disability: A Descriptive Analysis across Three Intelligence Tests

    ERIC Educational Resources Information Center

    Bergeron, Renee; Floyd, Randy G.

    2013-01-01

    This study examined the group- and individual-level part score profiles of children with intellectual disability (ID) who participated in clinical validity studies supporting three individually administered intelligence tests. Across tests, children with ID produced group-level profiles comprising mean part scores that fell in the Low to Very Low…

  5. Kindergarten Black-White Test Score Gaps: Replicating and Updating Previous Findings with New National Data

    ERIC Educational Resources Information Center

    Quinn, David

    2014-01-01

    A substantial body of evidence has shown large academic test score gaps between black and white students in early childhood. These gaps remain, and probably grow, as students progress through school. Many researchers have sought to explain these persistent test score gaps, and particularly, to understand the role of students' socio-economic status…

  6. Are Mathematics and Science Test Scores Good Indicators of Labor-Force Quality?

    ERIC Educational Resources Information Center

    Chen, Shiu-Sheng; Luoh, Ming-Ching

    2010-01-01

    Using data from the Programme for International Student Assessment (PISA) and the Trends in International Mathematics and Science Study (TIMSS), we investigate the link between test scores (mathematics and science) and cross-country income differences. We would like to know whether test scores are good indicators of labor-force quality. The…

  8. Sex Differences in Cognitive Abilities Test Scores: A UK National Picture

    ERIC Educational Resources Information Center

    Strand, Steve; Deary, Ian J.; Smith, Pauline

    2006-01-01

    Background and aims: There is uncertainty about the extent or even existence of sex differences in the mean and variability of reasoning test scores (Jensen, 1998; Lynn, 1994; Mackintosh, 1996). This paper analyses the Cognitive Abilities Test (CAT) scores of a large and representative sample of UK pupils to determine the extent of any sex…

  9. Stability of Academic Aptitude and Reading Test Scores of Mobile and Non-Mobile Disadvantaged Children.

    ERIC Educational Resources Information Center

    Justman, Joseph

    Changes in academic aptitude and achievement test scores of pupils attending public schools in disadvantaged areas in New York City were investigated. An attempt was made to determine whether varying degrees of mobility were associated with variation in changes in test scores. The cumulative record cards of sixth-grade pupils were examined to…

  10. Beyond Correlations: Usefulness of High School GPA and Test Scores in Making College Admissions Decisions

    ERIC Educational Resources Information Center

    Sawyer, Richard

    2013-01-01

    Correlational evidence suggests that high school GPA is better than admission test scores in predicting first-year college GPA, although test scores have incremental predictive validity. The usefulness of a selection variable in making admission decisions depends in part on its predictive validity, but also on institutions' selectivity and…

  11. The Influence of Foreign Language Learning during Early Childhood on Standardized Test Scores

    ERIC Educational Resources Information Center

    Shaw, Tommetta

    2010-01-01

    Increasing standardized test scores in reading and math is of high importance to the California Department of Education to meet requirements mandated by the No Child Left Behind (NCLB) Act of 2001. More research is needed to understand the best ways to improve test scores to meet concerns of the NCLB Act. The purpose of the study was to evaluate…

  13. Graduate Students' Administration and Scoring Errors on the Woodcock-Johnson III Tests of Cognitive Abilities

    ERIC Educational Resources Information Center

    Ramos, Erica; Alfonso, Vincent C.; Schermerhorn, Susan M.

    2009-01-01

    The interpretation of cognitive test scores often leads to decisions concerning the diagnosis, educational placement, and types of interventions used for children. Therefore, it is important that practitioners administer and score cognitive tests without error. This study assesses the frequency and types of examiner errors that occur during the…

  14. Effects of Scoring by Section and Independent Scorers' Patterns on Scorer Reliability in Biology Essay Tests

    ERIC Educational Resources Information Center

    Ebuoh, Casmir N.; Ezeudu, S. A.

    2015-01-01

    The study investigated the effects of scoring by section, use of independent scorers and conventional patterns on scorer reliability in Biology essay tests. It was revealed from literature review that conventional pattern of scoring all items at a time in essay tests had been criticized for not being reliable. The study was true experimental study…

  15. Effect of Self-Assessment on Test Scores: Student Perceptions

    ERIC Educational Resources Information Center

    Ramirez, Beatriz U.

    2010-01-01

    After a sudden increase in most of the individual grades in a multiple-choice test, students were asked to rank the three most relevant factors responsible for this outcome. Among eight others, the availability of a test for self-assessment before the final test was by far the most frequently mentioned (82.4% of the students). Questions applied…

  16. An Error Score Model for Time-Limit Tests

    ERIC Educational Resources Information Center

    Ven, A. H. G. S. van der

    1976-01-01

    A more generalized error model for time-limit tests is developed. Model estimates are derived for right-attempted and wrong-attempted correlations both within the same test and between different tests. A comparison is made between observed correlations and their model counterparts and a fair agreement is found between observed and expected…

  17. A "Rearrangement Procedure" for Scoring Adaptive Tests with Review Options

    ERIC Educational Resources Information Center

    Papanastasiou, Elena C.; Reckase, Mark D.

    2007-01-01

    Because of the increased popularity of computerized adaptive testing (CAT), many admissions tests, as well as certification and licensure examinations, have been transformed from their paper-and-pencil versions to computerized adaptive versions. A major difference between paper-and-pencil tests and CAT from an examinee's point of view is that in…

  18. Scoring Creativity Tests by Computer Simulation: A Validation of Prediction Equations.

    ERIC Educational Resources Information Center

    Greene, John F.; Zirkel, Perry A.

    The general usefulness of selected prediction equations for computer-simulated scoring of creativity tests was studied. This was carried out by testing previously established prediction equations for samples drawn from similar populations. (CK)

  19. Comparing Test Scores Using Information From Criterion-Related Validity Studies.

    PubMed

    Beaujean, A Alexander; McGlaughlin, Sean M

    2016-01-01

    There is frequently a need to compare a client's test scores from different instruments. If the scores come from instruments that use the same scale, it is tempting to compare the scores directly. Unfortunately, this method can lead clinicians to believe that there is a large difference between scores when the difference is minimal. As an alternative, we outline a method for score comparison that uses information from criterion-related validity studies. Using three examples, we show why this method is more psychometrically sound, produces more accurate comparison scores, and requires little more work from clinicians than the direct comparison approach. To make the score comparison process easy for clinicians to use, we include an appendix that demonstrates how to implement this method in Microsoft Excel and the free R program. PMID:25650888

  20. Conceptual and Empirical Relationships between Temporal Measures of Fluency and Oral English Proficiency with Implications for Automated Scoring

    ERIC Educational Resources Information Center

    Ginther, April; Dimova, Slobodanka; Yang, Rui

    2010-01-01

    Information provided by examination of the skills that underlie holistic scores can be used not only as supporting evidence for the validity of inferences associated with performance tests but also as a way to improve the scoring rubrics, descriptors, and benchmarks associated with scoring scales. As fluency is considered a critical, perhaps…

  1. A Note on Recovering the Ability Distribution from Test Scores.

    ERIC Educational Resources Information Center

    Junker, Brian W.

    A simple scheme is proposed for smoothly approximating the ability distribution for relatively long tests, assuming that the item characteristic curves (ICCs) are known or well estimated. The scheme works for a general class of ICCs and is guaranteed to completely recover the theta distribution as the test length increases. The proposed method of…

  2. Increases in Test Scores as a Function of Material Rewards.

    ERIC Educational Resources Information Center

    Tuinman, J. Jaap; And Others

    From the entire population (N=341) of grades 7 and 8 in a rural Indiana junior high school, 160 subjects were randomly selected and assigned to the experimental and the control groups. Form A of the Nelson Reading Test was administered twice with a 4-week interval. While the control group was told only that the post-test was given to measure how…

  3. The Scoring of Matching Questions Tests: A Closer Look

    ERIC Educational Resources Information Center

    Jancarík, Antonín; Kostelecká, Yvona

    2015-01-01

    Electronic testing has become a regular part of online courses. Most learning management systems offer a wide range of tools that can be used in electronic tests. With respect to time demands, the most efficient tools are those that allow automatic assessment. The presented paper focuses on one of these tools: matching questions in which one…

  4. Cyclone-type separators score high in comparative tests

    SciTech Connect

    Oranje, L.

    1990-01-22

    Full-scale performance tests of four types of gas-liquid separators are reported. They have indicated that a cyclone-type separator can have a catch-efficiency rate which approaches 100%. The tests were prompted by recurring condensate formation in a gas transmission system. Preliminary findings indicate that the condensate troubles resulted on several occasions from separators which failed to meet manufacturers' performance specifications on catch efficiency under operating conditions.

  5. Maintaining Equivalent Cut Scores for Small Sample Test Forms

    ERIC Educational Resources Information Center

    Dwyer, Andrew C.

    2016-01-01

    This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common-item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common-item equating methodology to standard setting ratings to account for…

  6. Allometric Scaling of Wingate Anaerobic Power Test Scores in Women

    ERIC Educational Resources Information Center

    Hetzler, Ronald K.; Stickley, Christopher D.; Kimura, Iris F.

    2011-01-01

    In this study, we developed allometric exponents for scaling Wingate anaerobic test (WAnT) power data that are reflective in controlling for body mass (BM) and lean body mass (LBM) and established a normative WAnT data set for college-age women. One hundred women completed a standard WAnT. Allometric exponents and percentile ranks for peak (PP)…

  7. The Effect of Black Peers on Black Test Scores

    ERIC Educational Resources Information Center

    Armor, David J.; Duck, Stephanie

    2007-01-01

    Recent studies have used increasingly complex methodologies to estimate the effect of peer characteristics--race, poverty, and ability--on student achievement. A paper by Hanushek, Kain, and Rivkin using Texas state testing data has received particularly wide attention because it found a large negative effect of school percent black on black math…

  9. An Analysis of Cross Racial Identity Scale Scores Using Classical Test Theory and Rasch Item Response Models

    ERIC Educational Resources Information Center

    Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie

    2013-01-01

    Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…

  11. The Effects of Diverse Test Score Distribution Characteristics on the Estimation of the Rasch Measurement Model.

    ERIC Educational Resources Information Center

    Cypress, Beulah K.

    The potential of the Rasch model to develop scores, on a ratio scale, suitable for interindividual comparisons, from intact groups with disparate distribution characteristics was investigated. The specific problems studied were: (1) the effects of skewed test score distributions on the ability parameter of the Rasch measurement model; (2) the…

  12. TOEFL iBT Speaking Test Scores as Indicators of Oral Communicative Language Proficiency

    ERIC Educational Resources Information Center

    Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela

    2012-01-01

    Scores assigned by trained raters and by an automated scoring system (SpeechRater[TM]) on the speaking section of the TOEFL iBT[TM] were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…

  13. Psychometric Properties of Raw and Scale Scores on Mixed-Format Tests

    ERIC Educational Resources Information Center

    Kolen, Michael J.; Lee, Won-Chan

    2011-01-01

    This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…

  14. See It, Be It, Write It: Using Performing Arts to Improve Writing Skills and Test Scores

    ERIC Educational Resources Information Center

    Blecher-Sass, Hope Sara; Moffitt, Maryellen

    2010-01-01

    Improve students' writing skills and boost their assessment scores while adding arts education, creativity, and fun to your writing curriculum. With this vibrant resource, improving writing skills goes hand-in-hand with improving test scores. Students learn how to use acting and visualization as prewriting activities to help them connect writing…

  15. Use of Standardized Test Scores to Predict Success in a Computer Applications Course

    ERIC Educational Resources Information Center

    Harris, Robert V.; King, Stephanie B.

    2016-01-01

    The purpose of this study was to see if a relationship existed between American College Testing (ACT) scores (i.e., English, reading, mathematics, science reasoning, and composite) and student success in a computer applications course at a Mississippi community college. The study showed that while the ACT scores were excellent predictors of…

  16. Relationship of Sentence Skills Test Scores and Final Course Grades in Marketing 100.

    ERIC Educational Resources Information Center

    Ryan, Nancy

    1996-01-01

    Describes a study examining the relationship between scores on a Sentence Skills component of an English placement test and final course grades in a community college marketing course. Finds a significant positive correlation between scores and final grades, but one not strong enough to be used for predictive purposes. (13 citations) (BCY)

  17. Test Score or Student Progress? A Value-Added Evaluation of School Effectiveness in Urban China

    ERIC Educational Resources Information Center

    Peng, Pai; Hochweber, Jan; Klieme, Eckhard

    2013-01-01

    Outcome-oriented evaluation of school effectiveness is often based on student test scores in certain critical examinations. This study provides another method of evaluation--value-added--which is based on student achievement progress. This paper introduces the method of estimating the value-added score of schools in multi-level models. Based on…

  18. Score Generalizability of Academic Writing Tasks: Does One Test Method Fit It All?

    ERIC Educational Resources Information Center

    Gebril, Atta

    2009-01-01

    Generalizability of writing scores has always been a longstanding concern in L2 writing assessment. A number of studies have been conducted to investigate this topic during the last two decades. However, with the introduction of new test methods, such as reading-to-write tasks, generalizability studies need to focus on the score accuracy of…

  20. TOEFL iBT Speaking Test Scores as Indicators of Oral Communicative Language Proficiency

    ERIC Educational Resources Information Center

    Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela

    2012-01-01

    Scores assigned by trained raters and by an automated scoring system (SpeechRater™) on the speaking section of the TOEFL iBT™ were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…

  1. Multinomial and Compound Multinomial Error Models for Tests with Complex Item Scoring

    ERIC Educational Resources Information Center

    Lee, Won-Chan

    2007-01-01

    This article introduces a multinomial error model, which models an examinee's test scores obtained over repeated measurements of an assessment that consists of polytomously scored items. A compound multinomial error model is also introduced for situations in which items are stratified according to content categories and/or prespecified numbers of…

  2. Can Percentiles Replace Raw Scores in the Statistical Analysis of Test Data?

    ERIC Educational Resources Information Center

    Zimmerman, Donald W.; Zumbo, Bruno D.

    2005-01-01

    Educational and psychological testing textbooks typically warn of the inappropriateness of performing arithmetic operations and statistical analysis on percentiles instead of raw scores. This seems inconsistent with the well-established finding that transforming scores to ranks and using nonparametric methods often improves the validity and power…

  3. From #2 Pencils to the World Wide Web: A History of Test Scoring

    ERIC Educational Resources Information Center

    Zytowski, Donald G.

    2008-01-01

    The present highly developed status of psychological and educational testing in the United States is in part the result of many efforts over the past 100 years to develop economical and reliable methods of scoring. The present article traces a number of methods, ranging from hand scoring to present-day computer applications, stimulated by the need…

  6. Optimal Scoring Methods of Hand-Strength Tests in Patients with Stroke

    ERIC Educational Resources Information Center

    Huang, Sheau-Ling; Hsieh, Ching-Lin; Lin, Jau-Hong; Chen, Hui-Mei

    2011-01-01

    The purpose of this study was to determine the optimal scoring methods for measuring strength of the more-affected hand in patients with stroke by examining the effect of reducing measurement errors. Three hand-strength tests of grip, palmar pinch, and lateral pinch were administered at two sessions in 56 patients with stroke. Five scoring methods…

  7. Language Variation and Score Variation in the Testing of English Language Learners, Native Spanish Speakers

    ERIC Educational Resources Information Center

    Solano-Flores, Guillermo; Li, Min

    2009-01-01

    We investigated language variation and score variation in the testing of English language learners, native Spanish speakers. We gave students the same set of National Assessment of Educational Progress mathematics items in both their first language and their second language. We examined the amount of score variation due to the main and interaction…

  8. Wisconsin card sorting test: a new global score, with Italian norms, and its relationship with the Weigl sorting test.

    PubMed

    Laiacona, M; Inzaghi, M G; De Tanti, A; Capitani, E

    2000-10-01

    The Wisconsin card sorting test and the Weigl test are two neuropsychological tools widely used in clinical practice to assess frontal lobe functions. In this study we present norms useful for Italian subjects aged from 15 to 85 years, with 5-17 years of education. Concerning the Wisconsin card sorting test, a new measure of global efficiency (global score) is proposed as well as norms for some well known qualitative aspects of the performance, i.e. perseverative responses, failure to maintain the set and non-perseverative errors. In setting normative values, we followed a statistical methodology (equivalent scores) employed in Italy for other neuropsychological tests, in order to favour the possibility of comparison among these tests. A correlation study between the global score of the Wisconsin card sorting test and the score on the Weigl test was carried out and it emerges that some cognitive aspects are not overlapping in these two measures. PMID:11286040

  9. Estimating Achievement Gaps from Test Scores Reported in Ordinal "Proficiency" Categories

    ERIC Educational Resources Information Center

    Ho, Andrew D.; Reardon, Sean F.

    2012-01-01

    Test scores are commonly reported in a small number of ordered categories. Examples of such reporting include state accountability testing, Advanced Placement tests, and English proficiency tests. This article introduces and evaluates methods for estimating achievement gaps on a familiar standard-deviation-unit metric using data from these ordered…

  10. Implications of Drug Testing Cheerleaders

    ERIC Educational Resources Information Center

    Trachsler, Tracy A.; Birren, Genevieve

    2016-01-01

    With the untimely death of a University of Louisville cheerleader due to an accidental drug overdose in the summer of 2014, the athletic department representatives took steps to prevent future incidents by adding cheerleaders to the randomized drug testing protocols conducted at the university for the student-athletes involved in National…

  12. Relationships between spatial activities and scores on the mental rotation test as a function of sex.

    PubMed

    Ginn, Sheryl R; Pickens, Stefanie J

    2005-06-01

    Previous results suggested that female college students' scores on the Mental Rotations Test might be related to their prior experience with spatial tasks. For example, women who played video games scored better on the test than their non-game-playing peers, whereas playing video games was not related to men's scores. The present study examined whether participation in different types of spatial activities would be related to women's performance on the Mental Rotations Test. 31 men and 59 women enrolled at a small, private church-affiliated university and majoring in art or music as well as students who participated in intercollegiate athletics completed the Mental Rotations Test. Women's scores on the Mental Rotations Test benefitted from experience with spatial activities; the more types of experience the women had, the better their scores. Thus women who were athletes, musicians, or artists scored better than those women who had no experience with these activities. The opposite results were found for the men. Efforts are currently underway to assess how length of experience and which types of experience are related to scores. PMID:16060458

  13. Effects of Targeted Test Preparation on Scores of Two Tests of Oral English as a Second Language

    ERIC Educational Resources Information Center

    Farnsworth, Tim

    2013-01-01

    This study investigated the effect of targeted test preparation, or coaching, on oral English as a second language test scores. The tests in question were the Basic English Skills Test Plus (BEST Plus), a scripted oral interview published by the Center for Applied Linguistics, and the Versant English Test (VET), a computer-administered and…

  15. Determining When Single Scoring for Constructed-Response Items Is as Effective as Double Scoring in Mixed-Format Licensure Tests

    ERIC Educational Resources Information Center

    Kim, Sooyeon; Moses, Tim

    2013-01-01

    The major purpose of this study is to assess the conditions under which single scoring for constructed-response (CR) items is as effective as double scoring in the licensure testing context. We used both empirical datasets of five mixed-format licensure tests collected in actual operational settings and simulated datasets that allowed for the…

  16. Generalization of the Lord-Wingersky Algorithm to Computing the Distribution of Summed Test Scores Based on Real-Number Item Scores

    ERIC Educational Resources Information Center

    Kim, Seonghoon

    2013-01-01

    With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
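
    The recursion described in this abstract can be illustrated with a minimal sketch. It covers only the original number-correct case for dichotomous items, not the real-number generalization the article presents. The two-parameter logistic (2PL) response function, the item parameters, and the names p_correct_2pl and lord_wingersky are assumptions chosen for illustration; they are not taken from the article.

```python
from math import exp


def p_correct_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model (assumed here)."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))


def lord_wingersky(probs):
    """Conditional distribution of the number-correct score.

    probs holds P(correct) for each item at one fixed proficiency.
    Returns a list whose entry x is P(score = x | proficiency).
    """
    dist = [1.0]  # with zero items, the score is 0 with probability 1
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for x, mass in enumerate(dist):
            new[x] += mass * (1.0 - p)  # item answered incorrectly
            new[x + 1] += mass * p      # item answered correctly
        dist = new
    return dist


if __name__ == "__main__":
    # Hypothetical (a, b) item parameters, for illustration only.
    items = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.7)]
    theta = 0.0
    probs = [p_correct_2pl(theta, a, b) for a, b in items]
    for score, prob in enumerate(lord_wingersky(probs)):
        print(score, round(prob, 4))
```

    Each pass adds one item and splits the existing probability mass between "score unchanged" and "score plus one", so the work grows with the square of the number of items rather than with the number of response patterns.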

  18. Psychometric Evaluation of the Lower Extremity Computerized Adaptive Test, the Modified Harris Hip Score, and the Hip Outcome Score

    PubMed Central

    Hung, Man; Hon, Shirley D.; Cheng, Christine; Franklin, Jeremy D.; Aoki, Stephen K.; Anderson, Mike B.; Kapron, Ashley L.; Peters, Christopher L.; Pelt, Christopher E.

    2014-01-01

    Background: The applicability and validity of many patient-reported outcome measures in the high-functioning population are not well understood. Purpose: To compare the psychometric properties of the modified Harris Hip Score (mHHS), the Hip Outcome Score activities of daily living subscale (HOS-ADL) and sports (HOS-sports), and the Lower Extremity Computerized Adaptive Test (LE CAT). The hypothesis was that all instruments would perform well but that the LE CAT would show superiority psychometrically because a combination of CAT and a large item bank allows for a high degree of measurement precision. Study Design: Cohort study (diagnosis); Level of evidence, 2. Methods: Data were collected from 472 advanced-age, active participants from the Huntsman World Senior Games in 2012. Validity evidences were examined through item fit, dimensionality, monotonicity, local independence, differential item functioning, person raw score to measure correlation, and instrument coverage (ie, ceiling and floor effects), and reliability evidences were examined through Cronbach alpha and person separation index. Results: All instruments demonstrated good item fit, unidimensionality, monotonicity, local independence, and person raw score to measure correlations. The HOS-ADL had high ceiling effects of 36.02%, and the mHHS had ceiling effects of 27.54%. The LE CAT had ceiling effects of 8.47%, and the HOS-sports had no ceiling effects. None of the instruments had any floor effects. The mHHS had a very low Cronbach alpha of 0.41 and an extremely low person separation index of 0.08. Reliabilities for the LE CAT were excellent and for the HOS-ADL and HOS-sports were good. Conclusion: The LE CAT showed better psychometric properties overall than the HOS-ADL, HOS-sports, and mHHS for the senior population. The mHHS demonstrated pronounced ceiling effects and poor reliabilities that should be of concern. The high ceiling effects for the HOS-ADL were also of concern. The LE CAT was superior in all psychometric aspects examined in this study. Future research should investigate the LE CAT for wider use in different populations. PMID:26535291

  19. Predicting Future PTSD using a Modified New York Risk Score: Implications for Patient Screening and Management

    PubMed Central

    Boscarino, Joseph A.; Kirchner, H. Lester; Hoffman, Stuart N; Sartorius, Jennifer; Adams, Richard E.; Figley, Charles R.

    2012-01-01

    Aim: We previously developed a posttraumatic stress disorder (PTSD) screening instrument – the New York PTSD Risk Score – that was effective in predicting PTSD. In the present study, we assessed a 12-month prospective version of this risk score, which is important for patient management, follow-up, and for emergency medicine. Methods: Using data collected in a study of New York City adults after the World Trade Center Disaster (WTCD), we developed a new PTSD prediction tool. Using diagnostic test methods, including receiver operating curve (ROC) and bootstrap procedures, we examined different prediction variables to assess PTSD status 12 months after initial assessment among 1,681 trauma-exposed adults. Results: While our original PTSD screener worked well in the short term, it was not specifically developed to predict long-term PTSD. In the current study, we found that the Primary Care PTSD Screener (PCPS), when combined with psychosocial predictors from the original NY Risk Score, including depression, trauma exposure, sleep disturbance, and healthcare access, increased the area under the ROC curve (AUC) from 0.707 to 0.774, a significant improvement (p<0.0001). When additional risk-factor variables were added, including negative life events, handedness, self-esteem, and pain status, the AUC increased to 0.819, also a significant improvement (p=0.001). Adding Latino and foreign status to the model further increased the AUC to 0.839 (p=0.007). Conclusion: A prospective version of the New York PTSD Risk Score appears to be effective in predicting PTSD status 12 months after initial assessment among trauma-exposed adults. Further research is advised to further validate and expand these findings. PMID:22408285

  20. Interpreting the g loadings of intelligence test composite scores in light of Spearman's law of diminishing returns.

    PubMed

    Reynolds, Matthew R

    2013-03-01

    The linear loadings of intelligence test composite scores on a general factor (g) have been investigated recently in factor analytic studies. Spearman's law of diminishing returns (SLODR), however, implies that the g loadings of test scores likely decrease in magnitude as g increases, or they are nonlinear. The purpose of this study was to (a) investigate whether the g loadings of composite scores from the Differential Ability Scales (2nd ed.) (DAS-II, C. D. Elliott, 2007a, Differential Ability Scales (2nd ed.). San Antonio, TX: Pearson) were nonlinear and (b) if they were nonlinear, to compare them with linear g loadings to demonstrate how SLODR alters the interpretation of these loadings. Linear and nonlinear confirmatory factor analysis (CFA) models were used to model Nonverbal Reasoning, Verbal Ability, Visual Spatial Ability, Working Memory, and Processing Speed composite scores in four age groups (5-6, 7-8, 9-13, and 14-17) from the DAS-II norming sample. The nonlinear CFA models provided better fit to the data than did the linear models. In support of SLODR, estimates obtained from the nonlinear CFAs indicated that g loadings decreased as g level increased. The nonlinear portion for the nonverbal reasoning loading, however, was not statistically significant across the age groups. Knowledge of general ability level informs composite score interpretation because g is less likely to produce differences, or is measured less, in those scores at higher g levels. One implication is that it may be more important to examine the pattern of specific abilities at higher general ability levels. PMID:23506024

  1. Explaining the black-white gap in cognitive test scores: Toward a theory of adverse impact.

    PubMed

    Cottrell, Jonathan M; Newman, Daniel A; Roisman, Glenn I

    2015-11-01

    In understanding the causes of adverse impact, a key parameter is the Black-White difference in cognitive test scores. To advance theory on why Black-White cognitive ability/knowledge test score gaps exist, and on how these gaps develop over time, the current article proposes an inductive explanatory model derived from past empirical findings. According to this theoretical model, Black-White group mean differences in cognitive test scores arise from the following racially disparate conditions: family income, maternal education, maternal verbal ability/knowledge, learning materials in the home, parenting factors (maternal sensitivity, maternal warmth and acceptance, and safe physical environment), child birth order, and child birth weight. Results from a 5-wave longitudinal growth model estimated on children in the NICHD Study of Early Child Care and Youth Development from ages 4 through 15 years show significant Black-White cognitive test score gaps throughout early development that did not grow significantly over time (i.e., significant intercept differences, but not slope differences). Importantly, the racially disparate conditions listed above can account for the relation between race and cognitive test scores. We propose a parsimonious 3-Step Model that explains how cognitive test score gaps arise, in which race relates to maternal disadvantage, which in turn relates to parenting factors, which in turn relate to cognitive test scores. This model and results offer to fill a need for theory on the etiology of the Black-White ethnic group gap in cognitive test scores, and attempt to address a missing link in the theory of adverse impact. PMID:25867168

  2. Validity of Alternative Cut-Off Scores for the Back-Saver Sit and Reach Test

    ERIC Educational Resources Information Center

    Looney, Marilyn A.; Gilbert, Jennie

    2012-01-01

    The purpose of the study was to determine if currently used FITNESSGRAM® cut-off scores for the Back Saver Sit and Reach Test had the best criterion-referenced validity evidence for 6-12 year old children. Secondary analyses of an existing data set focused on the passive straight leg raise and Back Saver Sit and Reach Test flexibility scores of…

  3. A Maturing Global Testing Regime Meets the World Economy: Test Scores and Economic Growth, 1960-2012

    ERIC Educational Resources Information Center

    Kamens, David H.

    2015-01-01

    This article considers the growth of the international testing regime. It discusses sources of growth and empirically examines two related sets of issues: (1) the stability of countries' achievement scores, and (2) the influence of those national scores on subsequent economic development over different time lags. The article suggests that…

  5. How Does Emergency Department Crowding Affect Medical Student Test Scores and Clerkship Evaluations?

    PubMed Central

    Wei, Grant; Arya, Rajiv; Ritz, Z. Trevor; He, Albert S.; Ohman-Strickland, Pamela A.; McCoy, Jonathan V.

    2015-01-01

    Introduction: The effect of emergency department (ED) crowding has been recognized as a concern for more than 20 years; its effect on productivity, medical errors, and patient satisfaction has been studied extensively. Little research has reviewed the effect of ED crowding on medical education. Prior studies that have considered this effect have shown no correlation between ED crowding and resident perception of quality of medical education. Objective: To determine whether ED crowding, as measured by the National ED Overcrowding Scale (NEDOCS) score, has a quantifiable effect on medical student objective and subjective experiences during emergency medicine (EM) clerkship rotations. Methods: We collected end-of-rotation examinations and medical student evaluations for 21 EM rotation blocks between July 2010 and May 2012, with a total of 211 students. NEDOCS scores were calculated for each corresponding period. Weighted regression analyses examined the correlation between components of the medical student evaluation, student test scores, and the NEDOCS score for each period. Results: When all 21 rotations are included in the analysis, NEDOCS scores showed a negative correlation with medical student test scores (regression coefficient = −0.16, p=0.04) and three elements of the rotation evaluation (attending teaching, communication, and systems-based practice; p<0.05). We excluded an outlying NEDOCS score from the analysis and obtained similar results. When the data were controlled for effect of month of the year, only student test score remained significantly correlated with NEDOCS score (p=0.011). No part of the medical student rotation evaluation attained significant correlation with the NEDOCS score (p≥0.34 in all cases). Conclusion: ED overcrowding does demonstrate a small but negative association with medical student performance on end-of-rotation examinations. Additional studies are recommended to further evaluate this effect. PMID:26594289

  6. Instructions for additional qualitative scoring of the initial-letter Word-association Test.

    PubMed

    Zivković, M

    1994-04-01

    An additional scoring method is based on grouping test-words according to whether the same sign is given by subjects to the test-words. In this way five test-word categories are formed, Eros (test-words with double plus signs), demi-Eros (single plus sign), demi-Thanatos (single minus), Thanatos (double minus), and Deviant (+/- and theta signs). The next step in scoring is to count the number of test-words in a given scoring category whose meanings do not conform. The greater the discrepancy between the test-word category and its meaning, the less well adapted is the subject. Several illustrative protocols are discussed. PMID:8022674

  7. Examining the stability of Automated Neuropsychological Assessment Metric (ANAM) baseline test scores.

    PubMed

    Kaminski, Thomas W; Groff, Rachel M; Glutting, Joseph J

    2009-08-01

    Computerized neuropsychological (NP) testing has evolved into an important tool for clinicians in the assessment of sport-related concussions. The importance of having a reliable baseline test score for comparison post concussion is critical; yet, the stability of these baseline measurements has not been well established. The purpose of this study was to examine the consistency of the measurements derived from the Automated Neuropsychological Assessment Metric (ANAM) test battery over a series of repeated trials, in an attempt to determine at what point the test scores stabilized. A cohort of 25 recreationally active collegiate students, free from mild head injury, volunteered for the study. Throughput score (measures of performance efficiency) stability was assessed for the computerized NP tests using intraclass correlation coefficients (ICCs). Average throughput scores for all five test trials were simple reaction time (SRT) = 235, matching to sample (MSP) = 41, continuous performance test (CPT) = 108, math processing (MTH) = 24, and Sternberg memory (STN) = 89, and these are within the range of those previously reported. Results show that all four of the ICCs were in the excellent range of agreement (i.e., > or = .75), and more importantly, the statistical comparisons of the ICCs show that there was no significant difference between the ICCs. Consequently, results serve to show that two time periods are sufficient to obtain stable NP results, and thus clinicians can feel comfortable relying on a two-score baseline test for follow-up comparison. PMID:19110989

  8. Test Scores in New Castle County, DE.--Before and After Busing.

    ERIC Educational Resources Information Center

    D'Onofrio, William D., Comp.

    This analysis compares student test scores before and after school busing in New Castle County, Delaware, in an attempt to see if busing to achieve racial balance reduces the achievement gap between black and white students. School authorities pre-tested students with the California Achievement Test (CAT) in 1978-79, the first year of busing, and…

  9. Effects of Student Self-Corrective Measures on Learning and Standardized Test Scores

    ERIC Educational Resources Information Center

    Poplin, Beth D.

    2010-01-01

    This study examined whether students who graded and corrected their own test papers improved their learning and standardized test scores on the North Carolina end-of-course test in United States History. Four preexisting, intact classrooms of 11th grade United States History students in two different high schools formed the basis of this…

  10. Deriving Comparable Scores for Computer Adaptive and Conventional Tests: An Example Using the SAT.

    ERIC Educational Resources Information Center

    Eignor, Daniel R.

    Procedures used to establish the comparability of scores derived from the College Board Admissions Testing Program (ATP) computer adaptive Scholastic Aptitude Test (SAT) prototype and the paper-and-pencil SAT are described in this report. Both the prototype, which is made up of Verbal and Mathematics computer adaptive tests (CATs), and a form of…

  11. Commentary: Student Cognition, the Situated Learning Context, and Test Score Interpretation

    ERIC Educational Resources Information Center

    La Marca, Paul M.

    2006-01-01

    Although it is assumed that student cognition contributes to student performance on achievement tests, it may be that current testing models lack the degree of specification necessary to warrant such inferences. With test score interpretations as the referent, the authors in this special issue address the role of student cognition in learning and…

  12. Methods for Evaluating the Validity of Test Scores for English Language Learners

    ERIC Educational Resources Information Center

    Sireci, Stephen G.; Han, Kyung T.; Wells, Craig S.

    2008-01-01

    In the United States, when English language learners (ELLs) are tested, they are usually tested in English and their limited English proficiency is a potential cause of construct-irrelevant variance. When such irrelevancies affect test scores, inaccurate interpretations of ELLs' knowledge, skills, and abilities may occur. In this article, we…

  14. Testing the reliability of Grade, Roughness and Breathiness scores by means of synthetic speech stimuli.

    PubMed

    Schoentgen, Jean; Fraj, Samia; Lucero, Jorge C

    2015-04-01

    This article describes a synthesizer of disordered voices and reports a test of the reliability of Grade, Roughness, and Breathiness scores assigned to synthetic stimuli by eight expert listeners in two sessions. Speech stimuli [a], [i], [u], [ai], and [ia] were synthesized with three values of vocal frequency and four levels of vocal jitter and pulsatile additive noise each. The agreement and correlation of scores assigned by the same rater in different sessions, or by different raters in the same session, accord with published data. Only a small part of the variance of the arithmetic differences between the scores that are assigned to the same stimulus is explained by the stimuli properties. The conclusion is that differences between scores that are assigned to the same stimulus are not attributable to biases of individual raters; such biases would shift all the scores assigned on a scale, and the shift would be interpretable in terms of the properties of the stimuli. PMID:24117123

  15. An electrophysiological correlate of Eating Attitudes Test scores in female college students.

    PubMed

    Wilson, J F; Mercer, J C

    1990-11-01

    Eating Attitudes Test (EAT) scores of forty female college students were compared to their electrodermal activity (EDA) responses when offered a plate of chocolate chip cookies. A significant positive correlation was detected between the EAT scores and the skin conductivity measures associated with the presentation of food. Women with the highest EAT scores also exhibited the greatest sympathetic nervous system responses to a plate of cookies. This finding supports the conclusion that the EAT is capable of identifying individuals who are preoccupied with food or anxious about eating. PMID:2284404

  16. Does It Matter How Data Are Collected? A Comparison of Testing Conditions and the Implications for Validity

    ERIC Educational Resources Information Center

    Barry, Carol L.; Finney, Sara J.

    2009-01-01

    The effects of gathering test scores under low-stakes conditions have been a prominent domain of research in the assessment and testing literature. One important area within this larger domain concerns the implications of a test being low-stakes on test evaluation and development. The current study examined one variable, the testing context, that…

  17. Semi-Quantitative Scoring of an Immunochromatographic Test for Circulating Filarial Antigen

    PubMed Central

    Chesnais, Cédric B.; Missamou, François; Pion, Sébastien D. S.; Bopda, Jean; Louya, Frédéric; Majewski, Andrew C.; Weil, Gary J.; Boussinesq, Michel

    2013-01-01

    The value of a semi-quantitative scoring of the filarial antigen test (Binax Now Filariasis card test, ICT) results was evaluated during a field survey in the Republic of Congo. One hundred and thirty-four (134) of 774 tests (17.3%) were clearly positive and were scored 1, 2, or 3; and 11 (1.4%) had questionable results. Wuchereria bancrofti microfilariae (mf) were detected in 41 of those 133 individuals with an ICT test score ≥ 1 who also had a night blood smear; none of the 11 individuals with questionable ICT results harbored night mf. Cuzick's test showed a significant trend for higher microfilarial densities in groups with higher ICT scores (P < 0.001). The ICT scores were also significantly correlated with blood mf counts. Because filarial antigen levels provide an indication of adult worm infection intensity, our results suggest that semi-quantitative reading of the ICT may be useful for grading the intensity of filarial infections in individuals and populations. PMID:24019435

  18. CSCOPE's Effect on Texas' State Mandated Standardized Test Scores in Mathematics

    ERIC Educational Resources Information Center

    Merritt, Brent Ross

    2011-01-01

    The purpose of the study was to examine standardized test scores of school districts in the state of Texas that have implemented CSCOPE, a popular curriculum management system, in an effort to determine what effect, if any, its implementation has had. The standardized test used in the state of Texas is titled the Texas Assessment of Knowledge…

  19. Pragmatism or Gaming the System? One School District's Solution to Low Test Scores

    ERIC Educational Resources Information Center

    McKenzie, Kathryn Bell

    2009-01-01

    In this era of accountability and high stakes testing, district and school administrators are vigilant in their attention to student test scores and the ramifications these have for district and school performance labels. In other words, no school or district wants to be labeled "low performing." This case, based on a real situation, demonstrates…

  20. Investigation and Treatment of Missing Item Scores in Test and Questionnaire Data

    ERIC Educational Resources Information Center

    Sijtsma, Klaas; van der Ark, L. Andries

    2003-01-01

    This article first discusses a statistical test for investigating whether or not the pattern of missing scores in a respondent-by-item data matrix is random. Since this is an asymptotic test, we investigate whether it is useful in small but realistic sample sizes. Then, we discuss two known simple imputation methods, person mean (PM) and two-way…
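    As a rough illustration of the person mean (PM) idea named in this abstract, the sketch below replaces each missing item score with the mean of the items that respondent did answer. The function name `person_mean_impute`, the use of `None` to mark a missing score, and the toy data are assumptions for illustration only; they are not taken from the article.

```python
from statistics import mean

def person_mean_impute(matrix):
    """Fill None entries in each respondent's row with that row's observed mean.

    `matrix` is a list of rows (one per respondent); None marks a missing score.
    Rows with no observed scores are returned unchanged.
    """
    imputed = []
    for row in matrix:
        observed = [score for score in row if score is not None]
        if not observed:
            # No information for this respondent, so nothing can be imputed.
            imputed.append(list(row))
            continue
        person_mean = mean(observed)
        imputed.append([person_mean if score is None else score for score in row])
    return imputed

# Hypothetical respondent-by-item data (3 respondents, 4 items).
scores = [
    [3, None, 4, 5],
    [1, 2, None, None],
    [None, None, None, None],
]
result = person_mean_impute(scores)
```

    Person-mean imputation ignores item difficulty, which is presumably one reason the abstract considers it alongside other methods.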

  1. Detection of Invalid Test Scores: The Usefulness of Simple Nonparametric Statistics

    ERIC Educational Resources Information Center

    Tendeiro, Jorge N.; Meijer, Rob R.

    2014-01-01

    In recent guidelines for fair educational testing it is advised to check the validity of individual test scores through the use of person-fit statistics. For practitioners it is unclear on the basis of the existing literature which statistic to use. An overview of relatively simple existing nonparametric approaches to identify atypical response…

  2. Score Reliability of a Test Composed of Passage-Based Testlets: A Generalizability Theory Perspective.

    ERIC Educational Resources Information Center

    Lee, Yong-Won

    The purpose of this study was to investigate the impact of local item dependence (LID) in passage-based testlets on the test score reliability of an English as a Foreign Language (EFL) reading comprehension test from the perspective of generalizability (G) theory. Definitions and causes of LID in passage-based testlets are reviewed within the…

  3. Beating the Odds: A Low Equalized Assessed Valuation Elementary School with High Standardized Test Scores

    ERIC Educational Resources Information Center

    Levin, Brian

    2011-01-01

    This mixed methods study examines what makes Bluffview Elementary School a success as measured by the ISAT, the mandated state test of Illinois. Despite national reports of achievement gaps and low test scores, Bluffview Elementary has shown sustained success in educating children. This paper reviews how Bluffview Elementary students are achieving…

  4. Stochastic Processes as True-Score Models for Highly Speeded Mental Tests.

    ERIC Educational Resources Information Center

    Moore, William E.

    The previous theoretical development of the Poisson process as a strong model for the true-score theory of mental tests is discussed, and additional theoretical properties of the model from the standpoint of individual examinees are developed. The paper introduces the Erlang process as a family of test theory models and shows in the context of…

  5. Identifying Local Dependence with a Score Test Statistic Based on the Bifactor Logistic Model

    ERIC Educational Resources Information Center

    Liu, Yang; Thissen, David

    2012-01-01

    Local dependence (LD) refers to the violation of the local independence assumption of most item response models. Statistics that indicate LD between a pair of items on a test or questionnaire that is being fitted with an item response model can play a useful diagnostic role in applications of item response theory. In this article, a new score test…

  6. Zertifikat Deutsch als Fremdsprache and the Oral Proficiency Interview: A Comparison of Test Scores and Examinations.

    ERIC Educational Resources Information Center

    Lalande, John F.; Schweckendiek, Jurgen

    1986-01-01

    Investigates what correlations might exist between an individual's score on the Zertifikat Deutsch als Fremdsprache and on the Oral Proficiency Interview. The tests themselves are briefly described. Results indicate that the two tests appear to correlate well in their evaluation of speaking skills. (SED)

  7. The Relationship of Motivational Values of Math and Reading Teachers to Student Test Score Gains

    ERIC Educational Resources Information Center

    Loewen, David Allen

    2013-01-01

    This exploratory correlational study seeks to answer the question of whether a relationship exists between student average test score gains on state exams and teachers' rating of values on the Schwartz Values Survey. Eighty-seven randomly selected Kansas teachers of math and/or reading, grades four through eight, participated. Student test…

  8. An Investigation of the Effectiveness of Vocabulary Learning Strategies on Iranian EFL Learners' Vocabulary Test Score

    ERIC Educational Resources Information Center

    Rahimy, Ramin; Shams, Kiana

    2012-01-01

    This study aims to investigate the effectiveness of vocabulary learning strategies on Iranian EFL learners' vocabulary test score. To achieve this aim, fifty Intermediate level students from Kish English Institute were randomly selected from among fifteen classes after administering the Oxford Placement Test (OPT). Then, an intermediate level…

  9. Implications of Changing Answers on Objective Test Items

    ERIC Educational Resources Information Center

    Mueller, Daniel J.; Wasser, Virginia

    1977-01-01

    Eighteen studies of the effects of changing initial answers to objective test items are reviewed. While students throughout the total test score range tended to gain more points than they lost, higher scoring students gained more than did lower scoring students. Suggestions for further research are made. (Author/JKS)

  10. A Score Based on Screening Tests to Differentiate Mild Cognitive Impairment from Subjective Memory Complaints

    PubMed Central

    de Gobbi Porto, Fábio Henrique; Spíndola, Lívia; de Oliveira, Maira Okada; Figuerêdo do Vale, Patrícia Helena; Orsini, Marco; Nitrini, Ricardo; Dozzi Brucki, Sonia Maria

    2013-01-01

    It is not easy to differentiate patients with mild cognitive impairment (MCI) from subjective memory complainers (SMC). Assessments with screening cognitive tools are essential, particularly in primary care where most patients are seen. The objective of this study was to evaluate the diagnostic accuracy of screening cognitive tests and to propose a score derived from screening tests. Elderly subjects with memory complaints were evaluated using the Mini Mental State Examination (MMSE) and the Brief Cognitive Battery (BCB). We added two delayed recalls in the MMSE (a delayed recall and a late-delayed recall, LDR), and also a phonemic fluency test of letter P fluency (LPF). A score was created based on these tests. The diagnoses were made on the basis of clinical consensus and neuropsychological testing. Receiver operating characteristic curve analyses were used to determine area under the curve (AUC), the sensitivity and specificity for each test separately and for the final proposed score. MMSE, LDR, LPF and delayed recall of BCB scores reached statistically significant differences between groups (P=0.000, 0.03, 0.001 and 0.01, respectively). Sensitivity, specificity and AUC were MMSE: 64%, 79% and 0.75 (cut off <29); LDR: 56%, 62% and 0.62 (cut off <3); LPF: 71%, 71% and 0.71 (cut off <14); delayed recall of BCB: 56%, 82% and 0.68 (cut off <9). The proposed score reached a sensitivity of 88% and 76% and specificity of 62% and 75% for cut off over 1 and over 2, respectively. The AUC was 0.81. In conclusion, a score created from screening tests is capable of discriminating MCI from SMC with moderate to good accuracy. PMID:24147213
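    To make the cut-off figures in this abstract concrete, here is a minimal sketch of how sensitivity and specificity follow from a "score below cut-off is positive" rule. The function name, the toy scores, and the group labels are hypothetical and do not reproduce the study's data. The sketch assumes both groups are non-empty.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Treat score < cutoff as a positive screen and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for score, condition in zip(scores, has_condition):
        screened_positive = score < cutoff
        if condition and screened_positive:
            tp += 1  # condition present, flagged
        elif condition:
            fn += 1  # condition present, missed
        elif screened_positive:
            fp += 1  # condition absent, flagged
        else:
            tn += 1  # condition absent, not flagged
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical screening scores: the first four subjects have the condition.
toy_scores = [25, 27, 28, 29, 30, 28, 29, 30, 30, 27]
toy_labels = [True, True, True, True, False, False, False, False, False, False]
sens, spec = sensitivity_specificity(toy_scores, toy_labels, cutoff=29)
```

    Raising the cut-off flags more subjects, which generally trades specificity for sensitivity. The abstract reports that same trade-off for the proposed score at its two cut-offs.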

  11. Do We Really Become Smarter When Our Fluid-Intelligence Test Scores Improve?

    PubMed Central

    Hayes, Taylor R.; Petrov, Alexander A.; Sederberg, Per B.

    2014-01-01

    Recent reports of training-induced gains on fluid intelligence tests have fueled an explosion of interest in cognitive training—now a billion-dollar industry. The interpretation of these results is questionable because score gains can be dominated by factors that play marginal roles in the scores themselves, and because intelligence gain is not the only possible explanation for the observed control-adjusted far transfer across tasks. Here we present novel evidence that the test score gains used to measure the efficacy of cognitive training may reflect strategy refinement instead of intelligence gains. A novel scanpath analysis of eye movement data from 35 participants solving Raven’s Advanced Progressive Matrices on two separate sessions indicated that one-third of the variance of score gains could be attributed to test-taking strategy alone, as revealed by characteristic changes in eye-fixation patterns. When the strategic contaminant was partialled out, the residual score gains were no longer significant. These results are compatible with established theories of skill acquisition suggesting that procedural knowledge tacitly acquired during training can later be utilized at posttest. Our novel method and result both underline a reason to be wary of purported intelligence gains, but also provide a way forward for testing for them in the future. PMID:25395695

  12. Expected Test Scores for Preschoolers with a Cochlear Implant Who Use Spoken Language

    PubMed Central

    Nicholas, Johanna G.; Geers, Ann E.

    2008-01-01

    Purpose: The major purpose of this study was to provide information about expected spoken language skills of preschool-aged children who are deaf and who use a cochlear implant. A goal was to provide “benchmarks” against which those skills may be compared, for a given age at implantation. We also examined whether parent-completed checklists of children's language were correlated with results of standardized language tests and whether scores increased linearly with decreasing age-of-implantation and increasing duration of cochlear implant use. Method: Participants were a nation-wide sample of 76 children who were deaf and orally-educated and who received an implant by 38 months of age. Formal language tests were administered at age 4.5 years. The MacArthur-Bates Communicative Development Inventory (MBCDI) was completed by parents when children were ages 3.5 and 4.5 years. Results: Based on regression analyses, expected test scores for each age at implant are provided for two commonly administered language tests at 4.5 years of age and MBCDI subscale scores at 3.5 and 4.5 years. Concurrent test scores were significantly correlated on all measures. A linear relation was found which predicted increasing test scores with younger ages at implantation for all scales administered. Conclusions: While the expected scores reported here should not be considered as normative data, they are benchmarks which may be useful for evaluating spoken language progress of children with cochlear implants enrolled in spoken language-based programs. PMID:18448600

  13. Correlation of Simulation Examination to Written Test Scores for Advanced Cardiac Life Support Testing: Prospective Cohort Study

    PubMed Central

    Strom, Suzanne L.; Anderson, Craig L.; Yang, Luanna; Canales, Cecilia; Amin, Alpesh; Lotfipour, Shahram; McCoy, C. Eric; Langdorf, Mark I.

    2015-01-01

    Introduction: Traditional Advanced Cardiac Life Support (ACLS) courses are evaluated using written multiple-choice tests. High-fidelity simulation is a widely used adjunct to didactic content, and has been used in many specialties as a training resource as well as an evaluative tool. There are no data to our knowledge that compare simulation examination scores with written test scores for ACLS courses. Objective: To compare and correlate a novel high-fidelity simulation-based evaluation with traditional written testing for senior medical students in an ACLS course. Methods: We performed a prospective cohort study to determine the correlation between simulation-based evaluation and traditional written testing in a medical school simulation center. Students were tested on a standard acute coronary syndrome/ventricular fibrillation cardiac arrest scenario. Our primary outcome measure was correlation of exam results for 19 volunteer fourth-year medical students after a 32-hour ACLS-based Resuscitation Boot Camp course. Our secondary outcome was comparison of simulation-based vs. written outcome scores. Results: The composite average score on the written evaluation was substantially higher (93.6%) than the simulation performance score (81.3%, absolute difference 12.3%, 95% CI [10.6–14.0%], p<0.00005). We found a statistically significant moderate correlation between simulation scenario test performance and traditional written testing (Pearson r=0.48, p=0.04), validating the new evaluation method. Conclusion: Simulation-based ACLS evaluation methods correlate with traditional written testing and demonstrate resuscitation knowledge and skills. Simulation may be a more discriminating and challenging testing method, as students scored higher on written evaluation methods compared to simulation. PMID:26594288
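    For readers unfamiliar with the statistic quoted in this abstract, the sketch below computes a Pearson correlation coefficient from paired scores and checks the reported absolute difference (93.6% minus 81.3%). The paired toy scores are hypothetical and chosen to be perfectly linear. They are not the study's data, and the function name `pearson_r` is an assumption.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired scores: each simulation score is the written score minus 12.
written = [90, 92, 95, 97]
simulation = [78, 80, 83, 85]
r_toy = pearson_r(written, simulation)

# Absolute difference between the two composite averages reported in the abstract.
reported_difference = round(93.6 - 81.3, 1)
```

    A constant offset between two sets of scores leaves the correlation at 1, so a large mean difference and a moderate correlation, as reported here, describe separate aspects of the data.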

  14. Impact of a standardized test package on exit examination scores and NCLEX-RN outcomes.

    PubMed

    Homard, Catherine M

    2013-03-01

    The purpose of this ex post facto correlational study was to compare exit examination scores and NCLEX-RN(®) pass rates of baccalaureate nursing students who differed in level of participation in a standardized test package. Three cohort groups emerged as a standardized test package was introduced: (a) students who did not participate in a standardized test package; (b) students with two semesters of a standardized test package; and (c) students with four semesters of a standardized test package. Benner's novice-to-expert theory framed the study in the belief that students best acquire knowledge and skills through practice and reflection. Students participating in four semesters of a standardized test package demonstrated higher exit examination scores and NCLEX-RN pass rates compared with students who did not participate in this package. This study's results could inform nurse educators about strategies to facilitate nursing student success on exit examinations and the NCLEX-RN. PMID:23413805

  15. The Black-White Scoring Gap on SAT II Achievement Tests: Some of the News Is Cheering.

    ERIC Educational Resources Information Center

    Journal of Blacks in Higher Education, 2003

    2003-01-01

    Academically accomplished applicants to the nation's top colleges usually take SAT II Achievement Tests. While scoring gaps between college-bound Blacks and Whites on these tests tend to be smaller than gaps on the basic SAT, a racial scoring gap persists. However, black students appear to be making progress in closing the racial scoring gap on…

  17. The Valid Use of NAEP Achievement Level Scores to Confirm State Test Results in the No Child Left Behind Act

    ERIC Educational Resources Information Center

    Stoneberg, Bert D.

    2007-01-01

    The No Child Left Behind Act sanctions the use of NAEP scores to confirm state testing results. The U.S. Department of Education, as test developer, is responsible to set forth how NAEP scores are to be interpreted and used. Thus far, the Department has not published a clear set of guidelines for using NAEP achievement level scores to conduct a…

  18. Do candidate reactions relate to job performance or affect criterion-related validity? A multistudy investigation of relations among reactions, selection test scores, and job performance.

    PubMed

    McCarthy, Julie M; Van Iddekinge, Chad H; Lievens, Filip; Kung, Mei-Chuan; Sinar, Evan F; Campion, Michael A

    2013-09-01

    Considerable evidence suggests that how candidates react to selection procedures can affect their test performance and their attitudes toward the hiring organization (e.g., recommending the firm to others). However, very few studies of candidate reactions have examined one of the outcomes organizations care most about: job performance. We attempt to address this gap by developing and testing a conceptual framework that delineates whether and how candidate reactions might influence job performance. We accomplish this objective using data from 4 studies (total N = 6,480), 6 selection procedures (personality tests, job knowledge tests, cognitive ability tests, work samples, situational judgment tests, and a selection inventory), 5 key candidate reactions (anxiety, motivation, belief in tests, self-efficacy, and procedural justice), 2 contexts (industry and education), 3 continents (North America, South America, and Europe), 2 study designs (predictive and concurrent), and 4 occupational areas (medical, sales, customer service, and technological). Consistent with previous research, candidate reactions were related to test scores, and test scores were related to job performance. Further, there was some evidence that reactions affected performance indirectly through their influence on test scores. Finally, in no cases did candidate reactions affect the prediction of job performance by increasing or decreasing the criterion-related validity of test scores. Implications of these findings and avenues for future research are discussed. PMID:23937298

  19. Effects of Classroom Ventilation Rate and Temperature on Students’ Test Scores

    PubMed Central

    2015-01-01

    Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a district of 70 elementary schools (140 fifth-grade classrooms) in the Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students’ mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) for each liter per second per person increase in ventilation rate within the range of 0.9–7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12–13 points for each 1°C decrease in temperature within the observed range of 20–25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students. PMID:26317643

  20. 21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 21 Food and Drugs 8 2011-04-01 2011-04-01 false Ovarian adnexal mass assessment score test system. 866.6050 Section 866.6050 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES IMMUNOLOGY AND MICROBIOLOGY DEVICES Tumor Associated...

  1. States Eyeing Expense of Hand-Scored Tests in Light of NCLB Rules

    ERIC Educational Resources Information Center

    Archer, Jeff

    2005-01-01

    When students put down their pencils at the end of Connecticut's testing each year, another intensive process begins. Hundreds of trained evaluators work day and night for about a month to score the written responses. Although expensive, the use of open-ended questions drives the kind of instruction that state leaders say they want in their…

  2. Segregation and the Black-White Test Score Gap. NBER Working Paper No. 12988

    ERIC Educational Resources Information Center

    Vigdor, Jacob; Ludwig, Jens

    2007-01-01

    The mid-1980s witnessed breaks in two important trends related to race and schooling. School segregation, which had been declining, began a period of relative stasis. Black-white test score gaps, which had also been declining, also stagnated. The notion that these two phenomena may be related is also supported by basic cross-sectional evidence. We…

  3. Supplemental Educational Services and Student Test Score Gains: Evidence from a Large, Urban School District

    ERIC Educational Resources Information Center

    Springer, Matthew G.; Pepper, Matthew J.; Ghosh-Dastidar, Bonnie

    2014-01-01

    This study examines the effect of supplemental education services (SES) on student test score gains and whether particular subgroups of students benefit more from NCLB tutoring services. Our sample includes information on students enrolled in third through eighth grades nested in 121 elementary and middle schools over a five-year period comprising…

  4. Test Score Gaps between Private and Government Sector Students at School Entry Age in India

    ERIC Educational Resources Information Center

    Singh, Abhijeet

    2014-01-01

    Various studies have noted that students enrolled in private schools in India perform better on average than students in government schools. In this paper, I show that large gaps in the test scores of children in private and public sector education are evident even at the point of initial enrollment in formal schooling and are associated with…

  5. Defending the Quality of Links between Scores from Different Tests and Exams

    ERIC Educational Resources Information Center

    Cresswell, Mike

    2010-01-01

    Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance." His…

  6. An Evaluation of Three Approximate Item Response Theory Models for Equating Test Scores.

    ERIC Educational Resources Information Center

    Marco, Gary L.; And Others

    Three item response models were evaluated for estimating item parameters and equating test scores. The models, which approximated the traditional three-parameter model, included: (1) the Rasch one-parameter model, operationalized in the BICAL computer program; (2) an approximate three-parameter logistic model based on coarse group data divided…

  7. The Relationship between Computer Use and Standardized Test Scores: Does Gender Play a Role?

    ERIC Educational Resources Information Center

    Kay, Rachel E.

    2010-01-01

    Over the past few decades, and especially in the past ten years, computer use in schools has increased dramatically; however, there has been little research examining the effects of technology use on student achievement, specifically defined by standardized test scores. There is also concern as to how technology use differs by gender and if that…

  8. Development and Validation of Scores from an Instrument Measuring Student Test-Taking Motivation

    ERIC Educational Resources Information Center

    Eklof, Hanna

    2006-01-01

    Using the expectancy-value model of achievement motivation as a basis, this study's purpose is to develop, apply, and validate scores from a self-report instrument measuring student test-taking motivation. Sampled evidence of construct validity for the present sample indicates that a number of the items in the instrument could be used as an…

  9. Factors affecting milk ELISA scores of cows tested for Johne’s disease

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Infection with Mycobacterium avium subsp. paratuberculosis (Johne’s disease) has been estimated to cost dairy producers over $1.5 billion per year. The objective of this study was to examine the influence a number of environmental and genetic factors have on ELISA milk test scores for Johne’s diseas...

  10. Test Score Gaps between Private and Government Sector Students at School Entry Age in India

    ERIC Educational Resources Information Center

    Singh, Abhijeet

    2014-01-01

    Various studies have noted that students enrolled in private schools in India perform better on average than students in government schools. In this paper, I show that large gaps in the test scores of children in private and public sector education are evident even at the point of initial enrollment in formal schooling and are associated with…

  11. Comprehensive School Reform and Standardized Test Scores in Illinois Elementary and Middle Schools

    ERIC Educational Resources Information Center

    McEnroe, James D.

    2010-01-01

    The study examined the effects of the federally funded Comprehensive School Reform (CSR) program on student performance on mandated standardized tests. The study focused on the mathematics and reading scores of Illinois public elementary and middle and junior high school students. The federal CSR program provided Illinois schools with an annual…

  12. Expected Multiple-Choice Test Item Scores Under Ordinal Response Modes.

    ERIC Educational Resources Information Center

    Frary, Robert B.

    Ordinal response modes for multiple choice tests are those under which the examinee marks one or more choices in an effort to identify the correct choice, or include it in a proper subset of the choices. Two ordinal response modes, answer-until-correct and Coombs' elimination of choices which examinees identify as wrong, were analyzed for scoring…

  13. As Test Scores Have Fallen, So Has the Time Schools Give to Teaching

    ERIC Educational Resources Information Center

    Fowler, Charles W.

    1977-01-01

    While test scores have fallen, the amount of time students spend with teachers has fallen and the amount of knowledge and the number of skills students must master have risen. Six suggestions for guarding against loss of instructional time are provided. (Author/IRT)

  14. Florida Defeats the Skeptics: Test Scores Show Genuine Progress in the Sunshine State

    ERIC Educational Resources Information Center

    Winters, Marcus

    2012-01-01

    Among the 50 states, Florida's gains on the National Assessment of Educational Progress (NAEP) between 1992 and 2011 ranked second only to Maryland's. Florida's progress has been particularly impressive in the early grades. In 1998, Florida scored about one grade level below the national average on the 4th-grade NAEP reading test, but it was…

  15. Methods for Improving Test Scores: The Good, the Bad, and the Ugly

    ERIC Educational Resources Information Center

    Wright, Robert J.

    2009-01-01

    The No Child Left Behind Act (NCLB 2001) has the faculties of every public and charter school scrambling to drive test scores of seven identified groups of children (African-American children, Anglo-White children, children with disabilities, Hispanic children, children of poverty, children with English language limitations, and Native-American…

  16. Comparing State and District Test Results to National Norms: Interpretations of Scoring "Above the National Average."

    ERIC Educational Resources Information Center

    Linn, Robert L.; And Others

    Norm-referenced test results reported by states and school districts and factors related to those scores were studied through mail and telephone surveys of 35 states and a nationally representative sample of 153 school districts to determine the degree to which "above average" results were being reported. Part of the stimulus for this study came…

  17. Student Neighborhoods, Schools, and Test Score Growth: Evidence from Milwaukee, Wisconsin

    ERIC Educational Resources Information Center

    Carlson, Deven; Cowen, Joshua M.

    2015-01-01

    Schools and neighborhoods are thought to be two of the most important contextual influences on student academic outcomes. Drawing on a unique data set that permits simultaneous estimation of neighborhood and school contributions to student test score gains, we analyze the distributions of these contributions to consider the relative importance of…

  18. Defending the Quality of Links between Scores from Different Tests and Exams

    ERIC Educational Resources Information Center

    Cresswell, Mike

    2010-01-01

    Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance." His…

  19. Determinants of Academic Attainment in the United States: A Quantile Regression Analysis of Test Scores

    ERIC Educational Resources Information Center

    Haile, Getinet Astatike; Nguyen, Anh Ngoc

    2008-01-01

    We investigate the determinants of high school students' academic attainment in mathematics, reading and science in the United States; focusing particularly on possible differential impacts of ethnicity and family background across the distribution of test scores. Using data from the NELS2000 and employing quantile regression, we find two…

  20. 76 FR 16350 - Medical Devices; Ovarian Adnexal Mass Assessment Score Test System; Labeling; Black Box Restrictions

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-23

    ... HUMAN SERVICES Food and Drug Administration 21 CFR Part 866 Medical Devices; Ovarian Adnexal Mass Assessment Score Test System; Labeling; Black Box Restrictions AGENCY: Food and Drug Administration, HHS. ACTION: Proposed rule. SUMMARY: The Food and Drug Administration (FDA) is proposing to amend...

  1. Multiple Imputation of Item Scores in Test and Questionnaire Data, and Influence on Psychometric Results

    ERIC Educational Resources Information Center

    van Ginkel, Joost R.; van der Ark, L. Andries; Sijtsma, Klaas

    2007-01-01

    The performance of five simple multiple imputation methods for dealing with missing data was compared. In addition, random imputation and multivariate normal imputation were used as lower and upper benchmarks, respectively. Test data were simulated and item scores were deleted such that they were either missing completely at random, missing at…

  2. Effects of Programmed Learning Sequences on the Mathematics Test Scores of Bermudian Middle School Students

    ERIC Educational Resources Information Center

    Tully, Derek; Dunn, Rita; Hlawaty, Heide

    2006-01-01

    This research compared the effects of a Programmed Learning Sequence (PLS) (Dunn & Dunn, 1993) versus Traditional Teaching (TT) on 100 sixth-grade Bermudian students' test scores on a Fractions Unit. Fifty-three males' and forty-seven females' learning styles were identified with the "Learning Style Inventory" (LSI) (Dunn, Dunn, & Price, 2000) to…

  3. The Effect of Four Intervention Programs on Standardized Test Scores by Gender

    ERIC Educational Resources Information Center

    Cryder, Rebecca E.

    2012-01-01

    This quantitative correlational study involved the analysis, by gender, of the effect of four intervention programs at an Arizona middle school as seen on Arizona's Instrument to Measure Standards (AIMS) test scores. These four intervention programs included: Advancement Via Individual Determination (AVID), a planner stamping system, a World…

  4. The Relationship between Computer Use and Standardized Test Scores: Does Gender Play a Role?

    ERIC Educational Resources Information Center

    Kay, Rachel E.

    2010-01-01

    Over the past few decades, and especially in the past ten years, computer use in schools has increased dramatically; however, there has been little research examining the effects of technology use on student achievement, specifically defined by standardized test scores. There is also concern as to how technology use differs by gender and if that…

  5. Effects of Reading Technology Integration on Sixth Grade Test and Reading Scores

    ERIC Educational Resources Information Center

    Thomas, P. Ann

    2012-01-01

    The focus of the investigation is on a sixth grade population not performing reading on grade level and not achieving high-stakes test score proficiency causing the school to fail adequate yearly progress (AYP). The lack of reading skills causes the students to repeat grades in middle school and high school. Reading technology instruction is the…

  7. Intelligence Test Scores and Birth Order among Young Norwegian Men (Conscripts) Analyzed within and between Families

    ERIC Educational Resources Information Center

    Bjerkedal, Tor; Kristensen, Petter; Skjeret, Geir A.; Brevik, John I.

    2007-01-01

    The present paper reports the results of a within and between family analysis of the relation between birth order and intelligence. The material comprises more than a quarter of a million test scores for intellectual performance of Norwegian male conscripts recorded during 1984-2004. Conscripts, mostly 18-19 years of age, were born to women for…

  8. Using Automated Essay Scores as an Anchor When Equating Constructed Response Writing Tests

    ERIC Educational Resources Information Center

    Almond, Russell G.

    2014-01-01

    Assessments consisting of only a few extended constructed response items (essays) are not typically equated using anchor test designs as there are typically too few essay prompts in each form to allow for meaningful equating. This article explores the idea that output from an automated scoring program designed to measure writing fluency (a common…

  9. California Standards Test Scores and Attendance Rates in an Afterschool Program

    ERIC Educational Resources Information Center

    Diamond, Sandra M.

    2013-01-01

    The Problem: The purpose of this study was to investigate whether or not there were any statistically significant differences in the Mathematics California Standard Test scores and attendance rates for African American and Latina high school girls who participated in an afterschool program. Method: A quasi-experimental design was conducted with…

  10. Permanent Income and the Black-White Test Score Gap. NBER Working Paper No. 17610

    ERIC Educational Resources Information Center

    Rothstein, Jesse; Wozny, Nathan

    2011-01-01

    Analysts often examine the black-white test score gap conditional on family income. Typically only a current income measure is available. We argue that the gap conditional on permanent income is of greater interest, and we describe a method for identifying this gap using an auxiliary data set to estimate the relationship between current and…

  11. Changes in Student Populations and Average Test Scores of Dutch Primary Schools

    ERIC Educational Resources Information Center

    Luyten, Hans; de Wolf, Inge

    2011-01-01

    This article focuses on the relation between student population characteristics and average test scores per school in the final grade of primary education from a dynamic perspective. Aggregated data of over 5,000 Dutch primary schools covering a 6-year period were used to study the relation between changes in school populations and shifts in mean…

  12. Estimated Effect of the Teacher Advancement Program on Student Test Score Gains

    ERIC Educational Resources Information Center

    Springer, Matthew G.; Ballou, Dale; Peng, Art

    2014-01-01

    This article presents findings from the first independent, third-party appraisal of the impact of the Teacher Advancement Program (TAP) on student test score gains in mathematics. TAP is a comprehensive school reform model designed to attract highly effective teachers, improve instructional effectiveness, and elevate student achievement. We use a…

  14. The Consequences of Ignorance Can Be More Serious than Low Test Scores.

    ERIC Educational Resources Information Center

    Erb, Tom

    2002-01-01

    Asserts that concern for academic success across the curriculum should extend beyond raising test scores. Contends that in the wake of the terrorist acts of 9/11/01, issues of public security and civil liberties present an opportunity for teachers to teach the Constitution and its amendments in an effort to fight citizen ignorance. (SD)

  15. Integrating GIS in the Middle School Curriculum: Impacts on Diverse Students' Standardized Test Scores

    ERIC Educational Resources Information Center

    Goldstein, Donna; Alibrandi, Marsha

    2013-01-01

    This case study conducted with 1,425 middle school students in Palm Beach County, Florida, included a treatment group receiving GIS instruction (256) and a control group without GIS instruction (1,169). Quantitative analyses on standardized test scores indicated that inclusion of GIS in middle school curriculum had a significant effect on student…

  18. Fitting the Normal-Ogive Factor Analytic Model to Scores on Tests.

    ERIC Educational Resources Information Center

    Ferrando, Pere J.; Lorenzo-Seva, Urbano

    2001-01-01

    Describes how the nonlinear factor analytic approach of R. McDonald to the normal ogive curve can be used to factor analyze test scores. Discusses the conditions in which this model is more appropriate than the linear model and illustrates the applicability of both models using an empirical example based on data from 1,769 adolescents who took the…

  20. Characterization of a Phenotype-Based Genetic Test Prediction Score for Unrelated Patients with Hypertrophic Cardiomyopathy

    PubMed Central

    Bos, J. Martijn; Will, Melissa L.; Gersh, Bernard J.; Kruisselbrink, Teresa M.; Ommen, Steve R.; Ackerman, Michael J.

    2014-01-01

    Objectives: To determine the prevalence and spectrum of mutations and genotype-phenotype relationships in the largest hypertrophic cardiomyopathy (HCM) cohort to date and provide an easy, clinically applicable phenotype-derived score that provides a pretest probability for a positive HCM genetic test. Patients and Methods: Between 1999 and 2007, 1053 unrelated patients with the clinical diagnosis of HCM (60% male, age at diagnosis 44.4 ± 19 years) had HCM genetic testing for the HCM-associated myofilament genes. Phenotyping was performed by review of electronic medical record. Results: Overall, 359 patients (34%) were genotype positive for a putative HCM-associated mutation in ≥ 1 HCM-associated gene. Univariate and multivariate analyses demonstrated echocardiographic reverse curve morphology, age at diagnosis < 45 years, MLVWT ≥ 20 mm, family history of HCM, and family history of SCD to be positive predictors of positive genetic test while hypertension was a negative predictor. A score, based on these 6 predictors of a positive genetic test, predicted a positive genetic test with a probability ranging from 6% when only hypertension was present to 80% when all 5 positive predictor markers were present. Conclusions: In this largest HCM cohort published to date, the overall yield of genetic testing was 34%. Although all patients were diagnosed clinically with HCM, the presence or absence of six simple clinical/echocardiographic markers predicted the likelihood of mutation-positive HCM. Phenotype-guided genetic testing with the use of the Mayo HCM Genotype Predictor score provides an easy tool for an effective genetic counseling session. PMID:24793961

  1. Talent Search Qualifying: Comparisons between Talent Search Students Qualifying via Scores on Standardized Tests and via Parent Nomination

    ERIC Educational Resources Information Center

    Lee, Seon-Young; Olszewski-Kubilius, Paula

    2006-01-01

    This study examined differences between students who qualified for talent search testing via scores on standardized tests and via parent nomination in their performances on the SAT or ACT and some demographic characteristics. Overall, the standardized testing group earned higher scores on the off-level tests than the parent nominated group. Asian…

  2. An Optimization Model for Test Assembly To Match Observed-Score Distributions. Research Report 94-7.

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Luecht, Richard M.

    An optimization model is presented that allows test assemblers to control the shape of the observed-score distribution on a test for a population with a known ability distribution. An obvious application is for item response theory-based test assembly in programs where observed scores are reported and operational test forms are required to produce…

  3. An NCME Instructional Module on Quality Control Procedures in the Scoring, Equating, and Reporting of Test Scores

    ERIC Educational Resources Information Center

    Allalouf, Avi

    2007-01-01

    There is significant potential for error in long production processes that consist of sequential stages, each of which is heavily dependent on the previous stage, such as the SER (Scoring, Equating, and Reporting) process. Quality control procedures are required in order to monitor this process and to reduce the number of mistakes to a minimum. In…

  5. Survival analysis of cancer patients with multiple endpoints using global score test methodology

    NASA Astrophysics Data System (ADS)

    Zain, Zakiyah; Whitehead, John

    2014-06-01

    Progression-free survival (PFS), time-to-progression (TTP) and overall survival (OS) are examples of multiple endpoints commonly used in clinical trials of cancer patients. PFS is increasingly used as a primary endpoint in evaluation of patients with solid tumors, while multiple endpoints are often analysed independently. These endpoints are indeed correlated and it is desirable to evaluate effectiveness of treatments by means of a single parameter. In this paper, a single overall treatment effect is provided by combining the univariate score statistics for comparing treatments with respect to each survival endpoint. This global score test methodology was applied in analysis of 330 patients with an aggressive cancer, each with two endpoints recorded, T1 and T2, relating to disease progression and death respectively. The values of score statistics obtained from the proposed method matched closely those from the logrank test. Meanwhile, the correlations between the two score test statistics were found to be similar to those computed using the established Wei, Lin and Weissfeld method. Simulations further confirmed the consistent performance of this new method in analysis of bivariate survival data.

  6. The comparison question polygraph test: a contrast of methods and scoring.

    PubMed

    Honts, Charles R; Reavy, Racheal

    2015-05-01

    We conducted a mock crime experiment with 250 paid participants (126 females, Mdn age = 30 years) contrasting the validity of the probable-lie and the directed-lie variants of the comparison question test (CQT) for the detection of deception. Subjects were assigned at random to one of eight conditions in a Guilt (Guilty/Innocent) × Test Type (Probable-Lie/Directed-Lie) × Stimulation (Between Repetition Stimulation/No Stimulation) factorial design. The data were scored by an experienced polygraph examiner who was unaware of subject assignment to conditions and with a computer algorithm known as the Objective Scoring System Version 2 (OSS2). There were substantial main effects of guilt in both the OSS2 computer scores, F(1, 241) = 143.82, p < .001, partial η² = 0.371, and in the human scoring, F(1, 242) = 98.92, p < .001, partial η² = .29. There were no differences between the test types in the number of spontaneous countermeasure attempts made against them. Although under the controlled conditions of an experiment the probable-lie and the directed-lie variants of the CQT produced equivocal results in terms of detection accuracy, the directed-lie variant has much to recommend it as it is inherently more standardized in its administration and construction. PMID:25703188

  7. A Comparison Study of the College Board Scholastic Aptitude Test Scores between Students in Indiana, the Midwestern Region, and the Nation. Includes Test Scores, High School Records, Socioeconomic Characteristics, and College Plans. Monograph 80-1.

    ERIC Educational Resources Information Center

    Lisack, J. P.

    Selected data from three of the latest summary reports of the College Board's Admissions Testing Program (ATP) are presented. They are: College Bound Seniors, 1980-National, Midwestern, and Indiana. Data including Scholastic Aptitude Test (SAT) scores, the Test of Standard Written English (TSWE) scores, and information from the Student Descriptive…

  8. Mental health matters in elementary school: first-grade screening predicts fourth grade achievement test scores.

    PubMed

    Guzman, Maria Paz; Jellinek, Michael; George, Myriam; Hartley, Marcela; Squicciarini, Ana Maria; Canenguez, Katia M; Kuhlthau, Karen A; Yucel, Recai; White, Gwyne W; Guzman, Javier; Murphy, J Michael

    2011-08-01

    The objective of the study was to evaluate whether mental health problems identified through screens administered in first grade are related to poorer academic achievement test scores in the fourth grade. The government of Chile uses brief teacher- and parent-completed measures [Teacher Observation of Classroom Adaptation-Revised (TOCA-RR) and Pediatric Symptom Checklist (PSC-Cl)] to screen for mental health problems in about one-fifth of the country's elementary schools. In fourth grade, students take the national achievement tests (SIMCE) of language, mathematics and science. This study examined whether mental health problems identified through either or both screens predicted achievement test scores after controlling for student and family risk factors. A total of 17,252 students had complete first grade teacher forms and these were matched with fourth grade SIMCE data for 11,185 students, 7,903 of whom also had complete parent form data from the first grade. Students at risk on either the TOCA-RR or the PSC-Cl or both performed significantly worse on all SIMCE subtests. Even after controlling for covariates and adjusting for missing data, students with mental health problems on one screen in first grade had fourth grade achievement scores that were 14-18 points (~1/3 SD) lower than students screened as not at risk. Students at risk on both screens had scores that were on average 33 points lower than students at risk on either screen. Mental health problems in first grade were one of the strongest predictors of lower achievement test scores 3 years later, supporting the premise that for children mental health matters in the real world. PMID:21647553

  9. On the Use of Composition Scoring Techniques, Objective Measures, and Objective Tests to Evaluate ESL Writing Ability.

    ERIC Educational Resources Information Center

    Perkins, Kyle

    1983-01-01

    Summarizes and evaluates the major direct methods of assessing writing ability (holistic scoring, analytical scoring, and primary trait scoring) and the most popular indirect methods of predicting students' ability to write (T-unit analysis and a variety of published standardized tests). (EKN)

  10. A Comparison of Three Scoring Methods for Tests with Selected-Response and Constructed-Response Items

    ERIC Educational Resources Information Center

    Schaeffer, Gary A.; Henderson-Montero, Diane; Julian, Marc; Bene, Nancy H.

    2002-01-01

    A number of methods for scoring tests with selected-response (SR) and constructed-response (CR) items are available. The selection of a method depends on the requirements of the program, the particular psychometric model and assumptions employed in the analysis of item and score data, and how scores are to be used. This article compares 3 methods:…

  11. A glance at quality score: implication for de novo transcriptome reconstruction of Illumina reads.

    PubMed

    Mbandi, Stanley Kimbung; Hesse, Uljana; Rees, D Jasper G; Christoffels, Alan

    2014-01-01

    Downstream analyses of short-reads from next-generation sequencing platforms are often preceded by a pre-processing step that removes uncalled and wrongly called bases. Standard approaches rely on their associated base quality scores to retain the read or a portion of it when the score is above a predefined threshold. It is difficult to differentiate sequencing error from biological variation without a reference using quality scores. The effects of quality score based trimming have not been systematically studied in de novo transcriptome assembly. Using RNA-Seq data produced from Illumina, we teased out the effects of quality score based filtering or trimming on de novo transcriptome reconstruction. We showed that assemblies produced from reads subjected to different quality score thresholds contain truncated and missing transfrags when compared to those from untrimmed reads. Our data support the fact that de novo assembling of untrimmed data is challenging for de Bruijn graph assemblers. However, our results indicate that comparing the assemblies from untrimmed and trimmed read subsets can suggest appropriate filtering parameters and enable selection of the optimum de novo transcriptome assembly in non-model organisms. PMID:24575122

  12. A glance at quality score: implication for de novo transcriptome reconstruction of Illumina reads

    PubMed Central

    Mbandi, Stanley Kimbung; Hesse, Uljana; Rees, D. Jasper G.; Christoffels, Alan

    2014-01-01

    Downstream analyses of short-reads from next-generation sequencing platforms are often preceded by a pre-processing step that removes uncalled and wrongly called bases. Standard approaches rely on their associated base quality scores to retain the read or a portion of it when the score is above a predefined threshold. It is difficult to differentiate sequencing error from biological variation without a reference using quality scores. The effects of quality score based trimming have not been systematically studied in de novo transcriptome assembly. Using RNA-Seq data produced from Illumina, we teased out the effects of quality score based filtering or trimming on de novo transcriptome reconstruction. We showed that assemblies produced from reads subjected to different quality score thresholds contain truncated and missing transfrags when compared to those from untrimmed reads. Our data support the fact that de novo assembling of untrimmed data is challenging for de Bruijn graph assemblers. However, our results indicate that comparing the assemblies from untrimmed and trimmed read subsets can suggest appropriate filtering parameters and enable selection of the optimum de novo transcriptome assembly in non-model organisms. PMID:24575122

  13. The Relationship between Academic Averages of Primary School Science and Technology Class and Test Sub-Test Scores of Placement Test of Science

    ERIC Educational Resources Information Center

    Guzeller, Cem Oktay

    2012-01-01

    This research examined the relationship between written exam scores of science and technology class of 6th, 7th, and 8th grades, project, participation in class activities and performance work, year-end academic success point averages, and sub-test raw scores of LDT science of 6th, 7th and 8th grades. Academic success point averages were used as…

  14. The effect of instructional methodology on high school students natural sciences standardized tests scores

    NASA Astrophysics Data System (ADS)

    Powell, P. E.

    Educators have recently come to consider inquiry-based instruction as a more effective method of instruction than didactic instruction. Experience-based learning theory suggests that student performance is linked to teaching method. However, research is limited on inquiry teaching and its effectiveness in preparing students to perform well on standardized tests. The purpose of the study was to investigate whether one of these two teaching methodologies was more effective in increasing student performance on standardized science tests. The quasi-experimental quantitative study comprised two stages. Stage 1 used a survey to identify teaching methods of a convenience sample of 57 teacher participants and determined level of inquiry used in instruction to place participants into instructional groups (the independent variable). Stage 2 used analysis of covariance (ANCOVA) to compare posttest scores on a standardized exam by teaching method. Additional analyses were conducted to examine the differences in science achievement by ethnicity, gender, and socioeconomic status by teaching methodology. Results demonstrated a statistically significant gain in test scores when taught using inquiry-based instruction. Subpopulation analyses indicated all groups showed improved mean standardized test scores except African American students. The findings benefit teachers and students by presenting data supporting a method of content delivery that increases teacher efficacy and produces students with a greater cognition of science content that meets the school's mission and goals.

  15. No differential heritability of intelligence test scores across ability levels in Norway.

    PubMed

    Sundet, J M; Eilertsen, D E; Tambs, K; Magnus, P

    1994-07-01

    The possibility of differential heritability of intelligence test scores across levels of ability has been raised in several recent reports. In the present paper intelligence test data from 862 monozygotic and 1325 dizygotic male twin pairs tested at about 19 years of age were analyzed in search of changes in heritability and shared environmentality as a function of ability level. The analyses were performed by means of multiple regression models (e.g., Cherny et al., 1992). No evidence of differential heritability across different ability levels was detected. PMID:7993311

  16. The effects of calculator-based laboratories on standardized test scores

    NASA Astrophysics Data System (ADS)

    Stevens, Charlotte Bethany Rains

    Nationwide, the goal of providing a productive science and math education in today's educational institutions increasingly centers on the technology used in classrooms. In this age of digital technology, educational software and calculator-based laboratories (CBL) have become significant devices in the teaching of science and math in many states across the United States. The Texas Instruments graphing calculator and the Vernier LabPro interface are among the calculator-based laboratories becoming increasingly popular with middle and high school science and math teachers in many school districts across the country. In Tennessee, however, it is reported that this type of technology is not regularly utilized at the student level in most high school science classrooms, especially in the area of Physical Science (Vernier, 2006). This research explored the effect of calculator-based laboratory instruction on standardized test scores. The purpose of this study was to determine the effect of traditional teaching methods versus graphing calculator teaching methods on the state-mandated End-of-Course (EOC) Physical Science exam based on ability, gender, and ethnicity. The sample included 187 total tenth and eleventh grade physical science students, 101 of whom belonged to a control group and 87 of whom belonged to the experimental group. Physical Science End-of-Course scores obtained from the Tennessee Department of Education during the spring of 2005 and the spring of 2006 were used to examine the hypotheses. The findings of this research study suggested the type of teaching method, traditional or calculator-based, did not have an effect on standardized test scores. However, the students' ability level, as demonstrated on the End-of-Course test, had a significant effect on End-of-Course test scores. This study focused on a limited population of high school physical science students in the middle Tennessee Putnam County area. The study should be reproduced in various school districts in the state of Tennessee to compare the findings.

  17. Adults with poor reading skills: How lexical knowledge interacts with scores on standardized reading comprehension tests.

    PubMed

    McKoon, Gail; Ratcliff, Roger

    2016-01-01

    Millions of adults in the United States lack the necessary literacy skills for most living wage jobs. For students from adult learning classes, we used a lexical decision task to measure their knowledge of words and we used a decision-making model (Ratcliff's, 1978, diffusion model) to abstract the mechanisms underlying their performance from their RTs and accuracy. We also collected scores for each participant on standardized IQ tests and standardized reading tests used commonly in the education literature. We found significant correlations between the model's estimates of the strengths with which words are represented in memory and scores for some of the standardized tests but not others. The findings point to the feasibility and utility of combining a test of word knowledge, lexical decision, that is well-established in psycholinguistic research, a decision-making model that supplies information about underlying mechanisms, and standardized tests. The goal for future research is to use this combination of approaches to understand better how basic processes relate to standardized tests with the eventual aim of understanding what these tests are measuring and what the specific difficulties are for individual, low-literacy adults. PMID:26550803

  18. The Contribution of Test-Takers' Speech Content to Scores on an English Oral Proficiency Test

    ERIC Educational Resources Information Center

    Sato, Takanori

    2012-01-01

    The content that test-takers attempt to convey is not always included in the construct definition of "general" English oral proficiency tests, although some English-for-academic-purposes (EAP) speaking tests and most writing tests tend to place great emphasis on the evaluation of the content or ideas in the performance. This study investigated the…

  19. The Effects of Handwriting, Spelling, and T-Units on Holistic Scoring with Implications for Dysgraphia

    ERIC Educational Resources Information Center

    Hooten, Regina Gay

    2009-01-01

    This study examined the relationship of holistic scoring with handwriting legibility, spelling accuracy and number of T-units within compositions written by children in grades 3 through 6 using path analysis. A sample of 223 compositions was rated for handwriting legibility and composition quality, and coded for number of T-units and percentage of…

  20. Tennessee TCAP Science Scale Scores: Implications for Continuous Improvement and Educational Reform or Is It Possible To Beat the Odds?

    ERIC Educational Resources Information Center

    Miller-Whitehead, Marie

    Evidence provided by analysis of science scale scores on the McGraw-Hill CTB/4 science test for grades 2 through 8 in Tennessee, part of the Tennessee Comprehensive Assessment Program (TCAP), shows that it is possible for high achieving school systems to show continuous improvement from year to year. These results would tend to offset fears that…

  1. Gender Differences in Factor Scores of Anxiety and Depression among Australian University Students: Implications for Counselling Interventions

    ERIC Educational Resources Information Center

    Bitsika, Vicki; Sharpley, Chris F.; Melham, Therese C.

    2010-01-01

    Anxiety and depression inventory scores from 200 male and female university students attending a private university in Australia were examined for their factor structure. Once established, the two sets of factors were tested for gender-based differences, revealing that females were more likely than males to report symptomatology associated with…

  2. Testing Students with Special Educational Needs in Large-Scale Assessments – Psychometric Properties of Test Scores and Associations with Test Taking Behavior

    PubMed Central

    Pohl, Steffi; Südkamp, Anna; Hardt, Katinka; Carstensen, Claus H.; Weinert, Sabine

    2016-01-01

    Assessing competencies of students with special educational needs in learning (SEN-L) poses a challenge for large-scale assessments (LSAs). For students with SEN-L, the available competence tests may fail to yield test scores of high psychometric quality, which are—at the same time—measurement invariant to test scores of general education students. We investigated whether we can identify a subgroup of students with SEN-L, for which measurement invariant competence measures of adequate psychometric quality may be obtained with tests available in LSAs. We furthermore investigated whether differences in test-taking behavior may explain dissatisfying psychometric properties and measurement non-invariance of test scores within LSAs. We relied on person fit indices and mixture distribution models to identify students with SEN-L for whom test scores with satisfactory psychometric properties and measurement invariance may be obtained. We also captured differences in test-taking behavior related to guessing and missing responses. As a result we identified a subgroup of students with SEN-L for whom competence scores of adequate psychometric quality that are measurement invariant to those of general education students were obtained. Concerning test taking behavior, there was a small number of students who unsystematically picked response options. Removing these students from the sample slightly improved item fit. Furthermore, two different patterns of missing responses were identified that explain to some extent problems in the assessments of students with SEN-L. PMID:26941665

  3. Testing Students with Special Educational Needs in Large-Scale Assessments - Psychometric Properties of Test Scores and Associations with Test Taking Behavior.

    PubMed

    Pohl, Steffi; Südkamp, Anna; Hardt, Katinka; Carstensen, Claus H; Weinert, Sabine

    2016-01-01

    Assessing competencies of students with special educational needs in learning (SEN-L) poses a challenge for large-scale assessments (LSAs). For students with SEN-L, the available competence tests may fail to yield test scores of high psychometric quality, which are, at the same time, measurement invariant to test scores of general education students. We investigated whether we can identify a subgroup of students with SEN-L, for which measurement invariant competence measures of adequate psychometric quality may be obtained with tests available in LSAs. We furthermore investigated whether differences in test-taking behavior may explain dissatisfying psychometric properties and measurement non-invariance of test scores within LSAs. We relied on person fit indices and mixture distribution models to identify students with SEN-L for whom test scores with satisfactory psychometric properties and measurement invariance may be obtained. We also captured differences in test-taking behavior related to guessing and missing responses. As a result we identified a subgroup of students with SEN-L for whom competence scores of adequate psychometric quality that are measurement invariant to those of general education students were obtained. Concerning test taking behavior, there was a small number of students who unsystematically picked response options. Removing these students from the sample slightly improved item fit. Furthermore, two different patterns of missing responses were identified that explain to some extent problems in the assessments of students with SEN-L. PMID:26941665

  4. Comparison between Dichotomous and Polytomous Scoring of Innovative Items in a Large-Scale Computerized Adaptive Test

    ERIC Educational Resources Information Center

    Jiao, Hong; Liu, Junhui; Haynie, Kathleen; Woo, Ada; Gorham, Jerry

    2012-01-01

    This study explored the impact of partial credit scoring of one type of innovative items (multiple-response items) in a computerized adaptive version of a large-scale licensure pretest and operational test settings. The impacts of partial credit scoring on the estimation of the ability parameters and classification decisions in operational test…

  5. Correlation of McCarthy Scale Scores with ERB Achievement Tests Over a Three and Four Year Interval.

    ERIC Educational Resources Information Center

    Fuchs, Marilyn; Migdail, Sherry R.

    This longitudinal study investigates the relationship between preschool children's ability test scores and their scores on achievement tests in 1st and 2nd grade. Subjects were 59 academically able children (27 boys and 32 girls, 4 black and 55 white) enrolled in an independent school. Children (mean age, 3 years, 9.8 months) were assessed with…

  6. Interpreting the "g" Loadings of Intelligence Test Composite Scores in Light of Spearman's Law of Diminishing Returns

    ERIC Educational Resources Information Center

    Reynolds, Matthew R.

    2013-01-01

    The linear loadings of intelligence test composite scores on a general factor ("g") have been investigated recently in factor analytic studies. Spearman's law of diminishing returns (SLODR), however, implies that the "g" loadings of test scores likely decrease in magnitude as g increases, or they are nonlinear. The purpose of this study was to (a)…

  7. Relationships of Teacher-Assigned Grades in High School Chemistry to Taxonomy-Type Objective Test Scores.

    ERIC Educational Resources Information Center

    Even, Alexander

    Reported is a study designed (1) to investigate the relationship between teacher-assigned chemistry grades and the scores obtained on a multiple-choice chemistry test built on taxonomic principles, and (2) to compare the contributions of various predictor variables to the explainable variance of the grades and the total test scores. The sample…

  9. Genetic variation of the growth hormone secretagogue receptor gene is associated with alcohol use disorders identification test scores and smoking.

    PubMed

    Suchankova, Petra; Nilsson, Staffan; von der Pahlen, Bettina; Santtila, Pekka; Sandnabba, Kenneth; Johansson, Ada; Jern, Patrick; Engel, Jörgen A; Jerlhag, Elisabet

    2016-03-01

    The multifaceted gut-brain peptide ghrelin and its receptor (GHSR-1a) are implicated in mechanisms regulating not only the energy balance but also the reward circuitry. In our pre-clinical models, we have shown that ghrelin increases whereas GHSR-1a antagonists decrease alcohol consumption and the motivation to consume alcohol in rodents. Moreover, ghrelin signaling is required for the rewarding properties of addictive drugs including alcohol and nicotine in rodents. Given the hereditary component underlying addictive behaviors and disorders, we sought to investigate whether single nucleotide polymorphisms (SNPs) located in the pre-proghrelin gene (GHRL) and GHSR-1a gene (GHSR) are associated with alcohol use, measured by the alcohol use disorders identification test (AUDIT) and smoking. Two SNPs located in GHRL, rs4684677 (Gln90Leu) and rs696217 (Leu72Met), and one in GHSR, rs2948694, were genotyped in a subset (n = 4161) of a Finnish population-based cohort, the Genetics of Sexuality and Aggression project. The effect of these SNPs on AUDIT scores and smoking was investigated using linear and logistic regressions, respectively. We found that the minor allele of the rs2948694 SNP was nominally associated with higher AUDIT scores (P = 0.0204, recessive model) and smoking (P = 0.0002, dominant model). Furthermore, post hoc analyses showed that this risk allele was also associated with increased likelihood of having high level of alcohol problems as determined by AUDIT scores ≥ 16 (P = 0.0043, recessive model). These convergent findings lend further support for the hypothesized involvement of ghrelin signaling in addictive disorders. PMID:26059200

  10. Increased correlation coefficient between the written test score and tutors' performance test scores after training of tutors for assessment of medical students during problem-based learning course in Malaysia.

    PubMed

    Jaiprakash, Heethal; Min, Aung Ko Ko; Ghosh, Sarmishtha

    2016-01-01

    This paper is aimed at finding if there was a change of correlation between the written test score and tutors' performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group's tutors did not receive tutor training; while the second group's tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors' performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors' scores in group 1 was 0.099 (p<0.001) and for group 2 was 0.305 (p<0.001). The higher correlation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course. PMID:26838577

  11. Reliability and validity of useful field of view test scores as administered by personal computer.

    PubMed

    Edwards, Jerri D; Vance, David E; Wadley, Virginia G; Cissell, Gayla M; Roenker, Daniel L; Ball, Karlene K

    2005-07-01

    The Useful Field of View test (UFOV) is a measure of processing speed that predicts driving performance and other functional abilities in older adults. In comparison to a number of other visual and cognitive measures, the UFOV measure has consistently been found to be the strongest predictor of motor vehicle crashes of older adults. This measure has valuable applications in that computerized, performance-based measures that are predictive of crashes in the elderly population can provide an objective criterion for determining the need for driver restriction or rehabilitation. Administration of the UFOV test has evolved from the standard version (administered via touch-screen with the Visual Attention Analyzer) to two briefer versions, which are administered on a personal desktop computer (PC) using either a touch screen or mouse response option. These new versions of the test are briefer and require less specialized equipment, making the test more portable and practical for use in clinical settings. This study examined the reliability and validity of the scores from these two new versions. Results indicate that test-retest reliabilities of the scores from the UFOV PC versions are high (r's = 0.884 for mouse and 0.735 for touch), and performance on both PC versions correlates well with performance on the standard version (r's = 0.658 for mouse and 0.746 for touch). Furthermore, scores were highly correlated (r = 0.916) when participants used either a touch screen or a mouse to input responses. In conclusion, the reliability and validity coefficients are of sufficient magnitude to make the touch and mouse PC versions of the UFOV practical for use in clinical evaluations. PMID:16019630

  12. Providing Subscale Scores for Diagnostic Information: A Case Study when the Test Is Essentially Unidimensional

    ERIC Educational Resources Information Center

    Stone, Clement A.; Ye, Feifei; Zhu, Xiaowen; Lane, Suzanne

    2010-01-01

    Although reliability of subscale scores may be suspect, subscale scores are the most common type of diagnostic information included in student score reports. This research compared methods for augmenting the reliability of subscale scores for an 8th-grade mathematics assessment. Yen's Objective Performance Index, Wainer et al.'s augmented scores,…

  13. Score Decline

    ERIC Educational Resources Information Center

    Cameron, Robert G.; Guralnick, Elissa

    1977-01-01

    The results are now in from the national and state testing programs, and they show that the decline in test scores of high school students is a national trend. Score decline has provided educators with a powerful tool for improving literacy through the reintroduction of composition courses and composition requirements. (Author)

  14. Comparison of physical therapy anatomy performance and anxiety scores in timed and untimed practical tests.

    PubMed

    Schwartz, Sarah M; Evans, Cathy; Agur, Anne M R

    2015-11-12

    Students in health care professional programs face many stressful tests that determine successful completion of their program. Test anxiety during these high stakes examinations can affect working memory and lead to poor outcomes. Methods of decreasing test anxiety include lengthening the time available to complete examinations or evaluating students using untimed examinations. There is currently no consensus in the literature regarding whether untimed examinations provide a benefit to test performance in clinical anatomy. This study aimed to determine the impact of timed versus untimed practical tests on Master of Physical Therapy student anatomy performance and test anxiety. Test anxiety was measured using the State-Trait Anxiety Inventory (STAI). Differences in performance, anxiety scores, and time taken were compared using paired sample Student's t-tests. Eighty-one of the 84 students completed the study and provided feedback. Students performed significantly higher on the untimed test (P = 0.005), with a significant reduction in test anxiety (P [...] test showed the greatest improvement on the untimed test (x̄ = 20.4 ± 10%). Eighty-three percent (n = 69) of students preferred the untimed test, 8.4% (n = 7) the timed test, and 8.4% (n = 7) had no preference. Students took on average eight minutes longer on the untimed test. This study found that physical therapy students perform better on untimed tests, which may be related to a reduction in test anxiety. If the intended goal of evaluating health care professional students is to determine fundamental competencies, these factors should be considered when designing future curricula. Anat Sci Educ 8: 518-524. © 2014 American Association of Anatomists. PMID:25516337

  15. Effects of Knowledge of Cognitive-Moral Development and Request to Fake on Defining Issues Test P-Scores.

    ERIC Educational Resources Information Center

    Napier, John D.

    1979-01-01

    Supports claims that the "Defining Issues Test" of cognitive-moral development cannot be faked higher. Finds that instruction about cognitive-moral development affected the scores of the teacher trainees who were tested. (RL)

  16. Effects of Differentially Time-Consuming Tests on Computer-Adaptive Test Scores

    ERIC Educational Resources Information Center

    Bridgeman, Brent; Cline, Frederick

    2004-01-01

    Time limits on some computer-adaptive tests (CATs) are such that many examinees have difficulty finishing, and some examinees may be administered tests with more time-consuming items than others. Results from over 100,000 examinees suggested that about half of the examinees must guess on the final six questions of the analytical section of the…

  17. Analysis of comorbid factors that increase the COPD assessment test scores

    PubMed Central

    2014-01-01

    Background: The chronic obstructive pulmonary disease (COPD) Assessment Test (CAT) is a concise health status measure for COPD. COPD patients have a variety of comorbidities, but little is known about their impact on quality of life. This study was designed to investigate comorbid factors that may contribute to high CAT scores. Methods: An observational study at Keio University and affiliated hospitals enrolled 336 COPD patients and 67 non-COPD subjects. Health status was assessed by the CAT, the St. George's Respiratory Questionnaire (SGRQ), and all components of the Medical Outcomes Study Short-Form 36-Item (SF-36) version 2, which is a generic measure of health. Comorbidities were identified based on patients’ reports, physicians’ records, and questionnaires, including the Frequency Scale for the Symptoms of Gastro-esophageal reflux disease (GERD) and the Hospital Anxiety and Depression Scale. Dual X-ray absorptiometry measurements of bone mineral density were performed. Results: The CAT showed moderate-good correlations with the SGRQ and all components of the SF-36. The presence of GERD, depression, arrhythmia, and anxiety was significantly associated with a high CAT score in the COPD patients. Conclusions: Symptomatic COPD patients have a high prevalence of comorbidities. A high CAT score should alert the clinician to a higher likelihood of certain comorbidities such as GERD and depression, because these diseases may co-exist unrecognized. Trial registration: Clinical trial registered with UMIN (UMIN000003470). PMID:24502760

  18. A toxicity scoring system for the 10-day whole sediment test with Corophium insidiosum (Crawford).

    PubMed

    Prato, Ermelinda; Biandolino, Francesca; Libralato, Giovanni

    2015-04-01

    This study developed a tool able to evaluate the potential contamination of marine sediments detecting the presence or absence of toxicity supporting environmental decision-making processes. When the sample is toxic, it is important to classify its level of toxicity to understand its subsequent effects and management practices. Corophium insidiosum is a widespread and frequently recorded species along the Mediterranean Sea, North Sea and western Baltic Sea with records also in the Atlantic Ocean and Pacific Ocean. This amphipod is found in high abundance in shallow brackish inshore areas and estuaries also with high turbidity. At Italian level, C. insidiosum is more frequently collectable than Corophium orientale, making routine toxicity tests easier to be performed. Moreover, according to the international scientific literature, C. insidiosum is more sensitive than C. orientale. Whole sediment toxicity data (10 days) with C. insidiosum were organised in a species-specific toxicity score on the basis of the minimum significance difference (MSD) approach. Thresholds to rank samples as non-toxic and toxic were based on sediment samples (n=84) from the Gulf of Taranto (Italy). A five-class toxicity score (absent, low, medium, high and very high toxicity) was developed, considering the distribution of the 90th percentile of the MSD normalised to the effects on the negative controls (samples from reference sites). This toxicity score could be useful for interpreting sediment potential impacts and providing quick responsive management information. PMID:25773894

  19. A Comparison of the Approaches of Generalizability Theory and Item Response Theory in Estimating the Reliability of Test Scores for Testlet-Composed Tests

    ERIC Educational Resources Information Center

    Lee, Guemin; Park, In-Yong

    2012-01-01

    Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…

  20. A Subgroup Analysis of the Impact of Self-testing Frequency on Examination Scores in a Pathophysiology Course

    PubMed Central

    Stewart, David W.; Hagemeier, Nicholas E.; Thigpen, Jim C.; Brooks, Lauren

    2014-01-01

    Objective: To determine if the frequency of self-testing of course material prior to actual examination improves examination scores, regardless of the actual scores on the self-testing. Methods: Practice quizzes were randomly generated from a total of 1342 multiple-choice questions in pathophysiology and made available online for student self-testing. Intercorrelations, 2-way repeated measures ANOVA with post hoc tests, and 2-group comparisons following rank ordering were conducted. Results: During each of 4 testing blocks, more than 85% of students took advantage of the self-testing process for a total of 7042 attempts. A consistent significant correlation (at the 0.05 level) existed between the number of practice quiz attempts and the subsequent examination scores. No difference in the number of quiz attempts was demonstrated compared to the first testing block. Exam scores for the first and second testing blocks were both higher than those for the third and fourth blocks. Conclusion: Although self-testing strategies increase retrieval and retention, they are uncommon in pharmacy education. The results suggested that the number of self-testing attempts alone improved subsequent examination scores, regardless of the score for self-tests. PMID:26056403

  1. A Score Test for Association of a Longitudinal Marker and an Event with Missing Data

    PubMed Central

    Finkelstein, Dianne M.; Wang, Rui; Ficociello, Linda H.; Schoenfeld, David A.

    2010-01-01

    Summary: Often clinical studies periodically record information on disease progression as well as results from laboratory studies that are believed to reflect the progressing stages of the disease. A primary aim of such a study is to determine the relationship between the lab measurements and disease progression. If there were no missing or censored data, these analyses would be straightforward. However, often patients miss visits, and return after their disease has progressed. In this case, not only is their progression time interval-censored, but their lab test series is also incomplete. In this paper, we propose a simple test for the association between a longitudinal marker and an event time from incomplete data. We derive the test using a very intuitive technique of calculating the expected complete data score conditional on the observed incomplete data (CEST). The problem was motivated by data from an observational study of patients with diabetes. PMID:19754923

  2. Participation in a coteaching classroom and students' end-of-course test scores

    NASA Astrophysics Data System (ADS)

    Debro, Ava

    General education students consistently perform poorly on standardized science tests. Coteaching is an instructional strategy that improves the achievement of students with disabilities, but very little research exists that examines the effect of coteaching classrooms on the performance of general education students. The purpose of this study was to examine the effect of coteaching classrooms on the performance of general education students. The constructivist theoretical framework provided the foundation for this research. The research question examined the effect that coteaching classrooms had on the performance of general education biology students. In this experimental design utilizing a posttest-only control group, coteaching instructional strategy was the treatment, and student performance was measured using the scores obtained from the biology end-of-course test. Data for this study was analyzed using an independent t-test. The results of this study revealed that there was not a statistically significant difference in student performance on the biology end-of-course test between treatment and control groups. More than half of the general education biology students enrolled in coteaching classrooms failed the end-of-course test. Researchers may use this study as a catalyst to examine other instructional practices that may improve student performance in science courses. The results of this study may be used to persuade coteachers of the importance of attending frequent professional development opportunities that examine a variety of coteaching instructional strategies. Improving the performance of general education students in science may improve standardized test scores, afford more students the opportunity to attend college, and ensure that students are able to compete on a global level.

  3. Predicting First-Quarter Test Scores from the New Medical College Admission Test.

    ERIC Educational Resources Information Center

    Cullen, Thomas J.; And Others

    1980-01-01

    The predictive validity of the new Medical College Admission Test as it relates to end-of-quarter examinations in anatomy, histology, physiology, biochemistry, and "ages of man" is presented. Results indicate that the Science Knowledge assessment areas of chemistry and physics and the Science Problems subtest were most useful in predicting student…

  5. Relationship of Students' Prior Knowledge and Order of Questions on Tests to Students' Test Scores.

    ERIC Educational Resources Information Center

    Papp, Klara K.; And Others

    1987-01-01

    A study examined whether students beginning a cell biology course with prior knowledge of its three areas (genetics, histology, and biochemistry) would retain that advantage throughout the course and whether achievement was influenced by the order of questions in a test. (MSE)

  6. Assessing Growth in Young Children: A Comparison of Raw, Age-Equivalent, and Standard Scores Using the Peabody Picture Vocabulary Test

    ERIC Educational Resources Information Center

    Sullivan, Jeremy R.; Winter, Suzanne M.; Sass, Daniel A.; Svenkerud, Nicole

    2014-01-01

    Many tests provide users with several different types of scores to facilitate interpretation and description of students' performance. Common examples include raw scores, age- and grade-equivalent scores, and standard scores. However, when used within the context of assessing growth among young children, these scores should not be…

  8. CT densitovolumetry in children with obliterative bronchiolitis: correlation with clinical scores and pulmonary function test results

    PubMed Central

    Mocelin, Helena; Bueno, Gilberto; Irion, Klaus; Marchiori, Edson; Sarria, Edgar; Watte, Guilherme; Hochhegger, Bruno

    2013-01-01

    OBJECTIVE: To determine whether air trapping (expressed as the percentage of air trapping relative to total lung volume [AT%]) correlates with clinical and functional parameters in children with obliterative bronchiolitis (OB). METHODS: CT scans of 19 children with OB were post-processed for AT% quantification with the use of a fixed threshold of −950 HU (AT%950) and of thresholds selected with the aid of density masks (AT%DM). Patients were divided into three groups by AT% severity. We examined AT% correlations with oxygen saturation (SO2) at rest, six-minute walk distance (6MWD), minimum SO2 during the six-minute walk test (6MWT_SO2), FVC, FEV1, FEV1/FVC, and clinical parameters. RESULTS: The 6MWD was longer in the patients with larger normal lung volumes (r = 0.53). We found that AT%950 showed significant correlations (before and after the exclusion of outliers, respectively) with the clinical score (r = 0.72; 0.80), FVC (r = 0.24; 0.59), FEV1 (r = −0.58; −0.67), and FEV1/FVC (r = −0.53; r = −0.62), as did AT%DM with the clinical score (r = 0.58; r = 0.63), SO2 at rest (r = −0.40; r = −0.61), 6MWT_SO2 (r = −0.24; r = −0.55), FVC (r = −0.44; r = −0.80), FEV1 (r = −0.65; r = −0.71), and FEV1/FVC (r = −0.41; r = −0.52). CONCLUSIONS: Our results show that AT% correlates significantly with clinical scores and pulmonary function test results in children with OB. PMID:24473764

  9. Test and Score Data Summary for TOEFL[R] Internet-Based and Paper-Based Tests. January 2008-December 2008 Test Data

    ERIC Educational Resources Information Center

    Educational Testing Service, 2008

    2008-01-01

    The Test of English as a Foreign Language[TM], better known as TOEFL[R], is designed to measure the English-language proficiency of people whose native language is not English. TOEFL scores are accepted by more than 6,000 colleges, universities, and licensing agencies in 130 countries. The test is also used by governments, and scholarship and…

  10. Comparison between Dichotomous and Polytomous Scoring of Innovative Items in a Large-Scale Computerized Adaptive Test

    ERIC Educational Resources Information Center

    Jiao, Hong; Liu, Junhui; Haynie, Kathleen; Woo, Ada; Gorham, Jerry

    2012-01-01

    This study explored the impact of partial credit scoring of one type of innovative item (multiple-response items) in a computerized adaptive version of a large-scale licensure test, in both pretest and operational test settings. The impacts of partial credit scoring on the estimation of the ability parameters and classification decisions in operational test…

  11. Validation of Automated Scores of TOEFL iBT Tasks against Non-Test Indicators of Writing Ability

    ERIC Educational Resources Information Center

    Weigle, Sara Cushing

    2010-01-01

    Automated scoring has the potential to dramatically reduce the time and costs associated with the assessment of complex skills such as writing, but its use must be validated against a variety of criteria for it to be accepted by test users and stakeholders. This study approaches validity by comparing human and automated scores on responses to…

  13. Improving Personality Facet Scores with Multidimensional Computer Adaptive Testing: An Illustration with the NEO PI-R

    ERIC Educational Resources Information Center

    Makransky, Guido; Mortensen, Erik Lykke; Glas, Cees A. W.

    2013-01-01

    Narrowly defined personality facet scores are commonly reported and used for making decisions in clinical and organizational settings. Although these facets are typically related, scoring is usually carried out for a single facet at a time. This method can be ineffective and time consuming when personality tests contain many highly correlated…

  14. Differential Item Functioning for a Test with a Cutoff Score: Use of Limited Closed-Interval Measures.

    ERIC Educational Resources Information Center

    Oshima, T. C.; And Others

    1994-01-01

    A procedure to detect differential item functioning (DIF) is introduced that is suitable for tests with a cutoff score. DIF is assessed on a limited closed interval of thetas in which a cutoff score falls. How this approach affects the identification of DIF items is demonstrated with real data sets. (SLD)

  16. Proposal of a clinical score for the molecular test for Pitt-Hopkins syndrome.

    PubMed

    Marangi, Giuseppe; Ricciardi, Stefania; Orteschi, Daniela; Tenconi, Romano; Monica, Matteo Della; Scarano, Gioacchino; Battaglia, Domenica; Lettori, Donatella; Vasco, Gessica; Zollino, Marcella

    2012-07-01

    Pitt-Hopkins syndrome (PTHS) is an emerging condition characterized by severe intellectual disability (ID), typical facial gestalt, and additional features, such as breathing abnormalities. Because of the overlapping phenotype of severe ID with absent speech, epilepsy, microcephaly, large mouth, and constipation, differential diagnosis of PTHS with respect to Angelman, Rett, and Mowat-Wilson syndromes represents a relevant clinical issue, and many patients are currently undergoing genetic tests for different conditions that are assumed to fall within the PTHS clinical spectrum. During a search for TCF4 mutations in 78 patients with suspected PTHS, haploinsufficiency of TCF4 was identified in 18. By comparing the clinical features of patients with a proven TCF4 mutation with those of patients without, we noticed that, in addition to the typical facial gestalt, the PTHS phenotype results from various combinations of the following characteristics: ID with severe speech impairment, normal growth parameters at birth, postnatal microcephaly, breathing abnormalities, motor incoordination, ocular anomalies, constipation, seizures, typical behavior, and subtle brain abnormalities. On the basis of these observations, here we propose a clinically based score system as a useful tool for guiding the choice of a first molecular test for PTHS. This scoring system is also proposed for a clinically based diagnosis of PTHS in the absence of a proven TCF4 mutation. PMID:22678594

  17. A multivariate spatial mixture model for areal data: examining regional differences in standardized test scores

    PubMed Central

    Neelon, Brian; Gelfand, Alan E.; Miranda, Marie Lynn

    2013-01-01

    Summary: Researchers in the health and social sciences often wish to examine joint spatial patterns for two or more related outcomes. Examples include infant birth weight and gestational length, psychosocial and behavioral indices, and educational test scores from different cognitive domains. We propose a multivariate spatial mixture model for the joint analysis of continuous individual-level outcomes that are referenced to areal units. The responses are modeled as a finite mixture of multivariate normals, which accommodates a wide range of marginal response distributions and allows investigators to examine covariate effects within subpopulations of interest. The model has a hierarchical structure built at the individual level (i.e., individuals are nested within areal units), and thus incorporates both individual- and areal-level predictors as well as spatial random effects for each mixture component. Conditional autoregressive (CAR) priors on the random effects provide spatial smoothing and allow the shape of the multivariate distribution to vary flexibly across geographic regions. We adopt a Bayesian modeling approach and develop an efficient Markov chain Monte Carlo model fitting algorithm that relies primarily on closed-form full conditionals. We use the model to explore geographic patterns in end-of-grade math and reading test scores among school-age children in North Carolina. PMID:26401059

  18. Enhancing accuracy in observational test scoring: the comprehensive system as a case example.

    PubMed

    McGrath, Robert E

    2003-10-01

    Inaccuracies in administration and scoring can potentially compromise the validity of any standardized psychosocial measure. The threat is particularly pertinent to methods involving behavioral observation, a category that includes many intelligence tests, neuropsychological measures, personality assessment instruments, and diagnostic procedures. Despite evidence and conjecture that errors in testing procedure are common for at least some of these measures and that these errors are often severe enough to influence interpretation, the topic has received relatively little attention. In particular, the absence of any safeguard against inaccurate test use in clinical situations can put the respondent at risk and violates ethical standards for the use of tests. In this article, I review some issues surrounding accuracy in testing procedures, including a discussion of what is known about the problem, an evaluation of several approaches to improving testing practices, and a review of recommendations for the statistical evaluation of rater accuracy. In this article, I use the Rorschach Comprehensive System (Exner, 1993) to demonstrate the concepts discussed. PMID:12946917

  19. A Brief Look at: Test Scores and the Standard Error of Measurement. E&R Report No. 10.13

    ERIC Educational Resources Information Center

    Holdzkom, David; Sumner, Brian; McMillen, Brad

    2010-01-01

    In the context of standardized testing, the standard error of measurement (SEM) is a measure of the factors other than the student's actual knowledge of the tested material that may affect the student's test score. Such factors may include distractions in the testing environment, fatigue, hunger, or even luck. This means that a student's observed…
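
    The abstract above is truncated, so the report's own calculation is not shown. As a minimal sketch, the standard classical-test-theory formula for the SEM is the score standard deviation times the square root of one minus the reliability. The function names and the numbers below (a scale SD of 10, a reliability of 0.91, an observed score of 350) are hypothetical and are not taken from the report.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


def score_band(observed, sem, z=1.0):
    """Band of +/- z SEMs around an observed score."""
    return (observed - z * sem, observed + z * sem)


# Hypothetical values, chosen only to illustrate the arithmetic.
sem = standard_error_of_measurement(score_sd=10.0, reliability=0.91)
low, high = score_band(observed=350.0, sem=sem)
print(f"SEM = {sem:.2f}; one-SEM band = {low:.1f} to {high:.1f}")
```

    Under these assumed values the SEM is about 3 scale points, so an observed score of 350 would be reported with a band of roughly 347 to 353.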

  20. Guided-Inquiry Lessons Raise Scores on the Sixth Grade Georgia Science Test

    NASA Astrophysics Data System (ADS)

    Page, Purlie M.

    At the local level, G Middle School has the highest district-wide percentage of 6th grade science students who are not meeting standards. It is imperative that G Middle School take corrective action to reduce the number of students failing to meet state science standards. Dewey's theory of conceptual framework, which involves knowledge constructed on a person's personal experience and mind activity through active forms of learning, guided this study. The goal of the study was to determine whether inquiry-based science modules produce greater 6th grade science achievement, as measured by an instrument equivalent to the science section of the Georgia Criterion-Referenced Competency Test, when compared to traditional instruction among eastern Georgia 6th graders. The sample consisted of 230 students in the nonintervention group and 119 students in the intervention group. All students were from intact classes. At the end of the intervention, an independent t test was conducted to analyze the scores. According to the t test (t = 12.33, df = 304.56, p < 0.05), the difference between the means was statistically significant. This project's potential impact on social change includes increasing student motivation towards, comprehension of, and interest in science concepts. At the local level, these inquiry lessons can be shared with science teachers across grade levels and within the district to improve county-wide science scores. An increase in student interest and comprehension of science concepts could ultimately lead to the United States producing more students in the fields of science, technology, engineering, and mathematics (STEM) education.
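
    The fractional degrees of freedom reported above (df = 304.56) are what an unequal-variance (Welch) t test would produce, so the sketch below assumes that variant; the abstract does not say which test was used. The study's data are not available here, so the two score lists are made up purely to show the calculation.

```python
import math
from statistics import mean, variance


def welch_t_test(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t test."""
    na, nb = len(a), len(b)
    # Squared standard errors of each group mean (sample variance / n).
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1)
    )
    return t, df


# Hypothetical post-test scores, not the study's data.
intervention = [78, 85, 90, 72, 88, 95, 81]
nonintervention = [65, 70, 58, 74, 69, 61, 77, 66, 72]
t_stat, dof = welch_t_test(intervention, nonintervention)
print(f"t = {t_stat:.2f}, df = {dof:.2f}")
```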

  1. Multiple tests for wind turbine fault detection and score fusion using two-level multidimensional scaling (MDS)

    NASA Astrophysics Data System (ADS)

    Ye, Xiang; Gao, Weihua; Yan, Yanjun; Osadciw, Lisa A.

    2010-04-01

    Wind is an important renewable energy source. The energy and economic return from building wind farms justify the expensive investments in doing so. However, without an effective monitoring system, underperforming or faulty turbines will cause a huge loss in revenue. Early detection of such failures helps prevent these undesired working conditions. We develop three tests on the power curve, rotor speed curve, and pitch angle curve of individual turbines. In each test, multiple states are defined to distinguish different working conditions, including complete shut-downs, under-performing states, abnormally frequent default states, as well as normal working states. These three tests are combined to reach a final conclusion, which is more effective than any single test. Through extensive data mining of historical data and verification from farm operators, some state combinations are discovered to be strong indicators of spindle failures, lightning strikes, anemometer faults, etc., for fault detection. In each individual test, and in the score fusion of these tests, we apply multidimensional scaling (MDS) to reduce the high-dimensional feature space into a 3-dimensional visualization, from which it is easier to discover turbine working information. This approach gains a qualitative understanding of turbine performance status to detect faults, and also provides explanations of what has happened for detailed diagnostics. The state-of-the-art SCADA (Supervisory Control And Data Acquisition) system in industry can only answer the question of whether there are abnormal working states, whereas our evaluation of multiple states in multiple tests is also promising for diagnostics. In the future, these tests can be readily incorporated in a Bayesian network for intelligent analysis and decision support.

  2. Survival analysis of colorectal cancer patients with tumor recurrence using global score test methodology

    NASA Astrophysics Data System (ADS)

    Zain, Zakiyah; Aziz, Nazrina; Ahmad, Yuhaniz; Azwan, Zairul; Raduan, Farhana; Sagap, Ismail

    2014-12-01

    Colorectal cancer is the third and the second most common cancer worldwide in men and women respectively, and the second in Malaysia for both genders. Surgery, chemotherapy and radiotherapy are among the options available for treatment of patients with colorectal cancer. In clinical trials, the main purpose is often to compare efficacy between experimental and control treatments. Treatment comparisons often involve several responses or endpoints, and this situation complicates the analysis. In the case of colorectal cancer, sets of responses concerned with survival times include: times from tumor removal until the first, the second and the third tumor recurrences, and time to death. For a patient, the time to recurrence is correlated with the overall survival. In this study, global score test methodology is used to combine the univariate score statistics for comparing treatments with respect to each survival endpoint into a single statistic. The data on tumor recurrence and overall survival of colorectal cancer patients are taken from a Malaysian hospital. The results are found to be similar to those computed using the established Wei, Lin and Weissfeld method. Key factors such as ethnicity, gender, age and stage at diagnosis are also reported.

  3. Survival analysis of colorectal cancer patients with tumor recurrence using global score test methodology

    SciTech Connect

    Zain, Zakiyah; Ahmad, Yuhaniz; Azwan, Zairul; Raduan, Farhana; Sagap, Ismail; Aziz, Nazrina

    2014-12-04

    Colorectal cancer is the third and the second most common cancer worldwide in men and women respectively, and the second in Malaysia for both genders. Surgery, chemotherapy and radiotherapy are among the options available for treatment of patients with colorectal cancer. In clinical trials, the main purpose is often to compare efficacy between experimental and control treatments. Treatment comparisons often involve several responses or endpoints, and this situation complicates the analysis. In the case of colorectal cancer, sets of responses concerned with survival times include: times from tumor removal until the first, the second and the third tumor recurrences, and time to death. For a patient, the time to recurrence is correlated with the overall survival. In this study, global score test methodology is used to combine the univariate score statistics for comparing treatments with respect to each survival endpoint into a single statistic. The data on tumor recurrence and overall survival of colorectal cancer patients are taken from a Malaysian hospital. The results are found to be similar to those computed using the established Wei, Lin and Weissfeld method. Key factors such as ethnicity, gender, age and stage at diagnosis are also reported.

  4. Should We Stop Looking for a Better Scoring Algorithm for Handling Implicit Association Test Data? Test of the Role of Errors, Extreme Latencies Treatment, Scoring Formula, and Practice Trials on Reliability and Validity

    PubMed Central

    Perugini, Marco; Schönbrodt, Felix

    2015-01-01

    Since the development of D scores for the Implicit Association Test, few studies have examined whether there is a better scoring method. In this contribution, we tested the effect of four relevant parameters for IAT data: the treatment of extreme latencies, the error treatment, the method for computing the IAT difference, and the distinction between practice and test critical trials. For some options of these different parameters, we included robust statistical methods that can provide viable alternative metrics to existing scoring algorithms, especially given the specificity of reaction time data. We thus elaborated 420 algorithms that result from the combination of all the different options and tested the main effect of the four parameters with robust statistical analyses as well as their interaction with the type of IAT (i.e., with or without built-in penalty included in the IAT procedure). From the results, we can elaborate some recommendations. A treatment of extreme latencies is preferable, but only if it consists in replacing rather than eliminating them. Errors contain important information and should not be discarded. The D score still seems to be a good way to compute the difference, although the G score could be a good alternative. Finally, it seems better not to compute the IAT difference separately for practice and test critical trials. From these recommendations, we propose to improve the traditional D scores with small yet effective modifications. PMID:26107176
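
    For orientation, the core of a D-type score is the difference in mean response latency between the incompatible and compatible blocks, divided by the standard deviation of all trials from both blocks taken together. The sketch below is deliberately simplified: it omits the latency and error treatments and the practice/test distinction that the study varies, it does not reproduce any of the 420 algorithms, and the function name and latencies are hypothetical.

```python
from statistics import mean, stdev


def simple_d_score(compatible_ms, incompatible_ms):
    """Simplified D-type score: mean latency difference divided by the
    standard deviation of all trials from both blocks combined."""
    pooled_sd = stdev(list(compatible_ms) + list(incompatible_ms))
    return (mean(incompatible_ms) - mean(compatible_ms)) / pooled_sd


# Hypothetical response latencies in milliseconds.
compatible = [612, 655, 701, 580, 640, 690]
incompatible = [802, 760, 845, 910, 775, 830]
print(f"D = {simple_d_score(compatible, incompatible):.2f}")
```

    A positive value here means slower responses in the incompatible block. Dividing by the combined standard deviation, rather than a within-block one, is what puts the difference on each respondent's own scale of latency variability.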

  5. The TSCA interagency testing committee's approaches to screening and scoring chemicals and chemical groups: 1977-1983

    SciTech Connect

    Walker, J.D.

    1990-12-31

    This paper describes the TSCA Interagency Testing Committee's (ITC) approaches to screening and scoring chemicals and chemical groups between 1977 and 1983. During this time the ITC conducted five scoring exercises to select chemicals and chemical groups for detailed review and to determine which of these chemicals and chemical groups should be added to the TSCA Section 4(e) Priority Testing List. 29 refs., 1 fig., 2 tabs.

  6. Development of new risk score for pre-test probability of obstructive coronary artery disease based on coronary CT angiography.

    PubMed

    Fujimoto, Shinichiro; Kondo, Takeshi; Yamamoto, Hideya; Yokoyama, Naoyuki; Tarutani, Yasuhiro; Takamura, Kazuhisa; Urabe, Yoji; Konno, Kumiko; Nishizaki, Yuji; Shinozaki, Tomohiro; Kihara, Yasuki; Daida, Hiroyuki; Isshiki, Takaaki; Takase, Shinichi

    2015-09-01

    Existing methods to calculate pre-test probability of obstructive coronary artery disease (CAD) have been established using selected high-risk patients who were referred to conventional coronary angiography. The purpose of this study is to develop and validate our new method for pre-test probability of obstructive CAD using patients who underwent coronary CT angiography (CTA), which could be applicable to a wider patient population. Using 4137 consecutive patients with suspected CAD who underwent coronary CTA at our institution, a multivariate logistic regression model including clinical factors as covariates calculated the pre-test probability (K-score) of obstructive CAD determined by coronary CTA. The K-score was compared with the Duke clinical score using the area under the curve (AUC) for the receiver-operating characteristic curve. External validation was performed with an independent sample of 319 patients. The final model included eight significant predictors: age, gender, coronary risk factors (hypertension, diabetes mellitus, dyslipidemia, smoking), history of cerebral infarction, and chest symptom. The AUC of the K-score was significantly greater than that of the Duke clinical score for both derivation (0.736 vs. 0.699) and validation (0.714 vs. 0.688) data sets. Among patients who underwent coronary CTA, the newly developed K-score had better pre-test prediction ability for obstructive CAD than the Duke clinical score in a Japanese population. PMID:24770610
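
    The AUC used for the comparison above can be read as the probability that a randomly chosen patient with disease receives a higher predicted score than a randomly chosen patient without it, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and the predicted probabilities are illustrative assumptions, not the K-score model or the study's data.

```python
def auc_pairwise(scores_with_disease, scores_without_disease):
    """AUC as the share of (diseased, non-diseased) pairs in which the
    diseased patient has the higher score; ties count as half a pair."""
    wins = 0.0
    for pos in scores_with_disease:
        for neg in scores_without_disease:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_with_disease) * len(scores_without_disease))


# Hypothetical predicted probabilities from some risk score.
diseased = [0.9, 0.8, 0.4]
healthy = [0.7, 0.3, 0.4]
print(f"AUC = {auc_pairwise(diseased, healthy):.3f}")
```

    The double loop is quadratic in the number of patients, which is fine for illustration; a rank-based calculation gives the same value more efficiently on samples of several thousand.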

  7. Correlations among scores on the Matrix Analogies Test--Short Form and the WISC--R with gifted youth.

    PubMed

    Karnes, F A; McGinnis, J C

    1994-06-01

    A study of the correlations between scores on the Matrix Analogies Test--Short Form and the WISC--R was conducted with 39 students enrolled in a Saturday program for the intellectually gifted. The Matrix Analogies Test was group-administered and WISC--R scores were obtained from records required for entry into the program. A significant Pearson r of .52 was found between the Matrix Analogies scores and the Performance IQs of the WISC--R; other correlations were not significant. PMID:8058884

  8. Validation of Group Domain Score Estimates Using a Test of Domain

    ERIC Educational Resources Information Center

    Pommerich, Mary

    2006-01-01

    Domain scores have been proposed as a user-friendly way of providing instructional feedback about examinees' skills. Domain performance typically cannot be measured directly; instead, scores must be estimated using available information. Simulation studies suggest that IRT-based methods yield accurate group domain score estimates. Because…

  9. An Unsolved Mystery: Interpreting Grade Scores or How Come My Seven Year Old Scored at the Sixth Grade Level and She Can't Do Fourth Grade Work.

    ERIC Educational Resources Information Center

    Coleman, Laurence J.

    1983-01-01

    Questions that parents, students, and teachers have frequently asked about the meaning of grade equivalent scores are answered, and implications for gifted students are considered. Facts concerning achievement test scores, content, and the norm group are indicated. (SEW)

  10. Metric-Free Measures of Test Score Trends and Gaps with Policy-Relevant Examples. CSE Report 665

    ERIC Educational Resources Information Center

    Ho, Andrew D.; Haertel, Edward H.

    2006-01-01

    Problems of scale typically arise when comparing test score trends, gaps, and gap trends across different tests. To overcome some of these difficulties, we can express the difference between the observed test performance of two groups with graphs or statistics that are metric-free (i.e., invariant under positive monotonic transformations of the…

  11. Rugby versus Soccer in South Africa: Content Familiarity Contributes to Cross-Cultural Differences in Cognitive Test Scores

    ERIC Educational Resources Information Center

    Malda, Maike; van de Vijver, Fons J. R.; Temane, Q. Michael

    2010-01-01

    In this study, cross-cultural differences in cognitive test scores are hypothesized to depend on a test's cultural complexity (Cultural Complexity Hypothesis: CCH), here conceptualized as its content familiarity, rather than on its cognitive complexity (Spearman's Hypothesis: SH). The content familiarity of tests assessing short-term memory,…

  12. Setting Local Cut Scores on the SAT Reasoning Test™ Writing Section: For Use in College Placement and Admissions Decisions

    ERIC Educational Resources Information Center

    Morgan, Deanna L.

    2006-01-01

    The introduction of the SAT Reasoning Test™ with a writing section in March 2005 and the concomitant elimination of the SAT® Subject Test in Writing after January 2005 have led many colleges and institutions to ask for guidance in using the new SAT Reasoning Test writing section scores for college placement and admissions. Standard-setting…

  13. Meta-Analyses of the Relationship of Creative Achievement to both IQ and Divergent Thinking Test Scores

    ERIC Educational Resources Information Center

    Kim, Kyung Hee

    2008-01-01

    There is disagreement among researchers about whether IQ tests or divergent thinking (DT) tests are better predictors of creative achievement. Resolving this dispute is complicated by the fact that some research has shown a relationship between IQ and DT test scores (e.g., Runco & Albert, 1986; Wallach, 1970). The present study conducted…

  15. Validation of a new scoring system for the Weigl Color Form Sorting Test in a memory disorders clinic sample.

    PubMed

    Byrne, L M; Bucks, R S; Cuerden, J M

    1998-04-01

    The Bristol Memory Disorders Clinic uses the Weigl Color Form Sorting Test (CFST) to appraise abstraction and the ability to shift set. The original scoring system for the CFST (Grewal & Haward, 1984), developed on the premise that sorting to form is more difficult than sorting to color, had no score for an individual able to sort to form and subsequently unable to shift to color with a cue. Clinical experience suggested that the performance of some individuals required such a score. A new scoring system was developed and validated in a memory-disorders-clinic sample. The validation showed the new score to be necessary and gave support to the original premise that people with organic brain damage show a preference for sorting to color. PMID:9777483

  16. The achievement impact of the inclusion model on the standardized test scores of general education students

    NASA Astrophysics Data System (ADS)

    Garrett-Rainey, Syrena

    The purpose of this study was to compare the achievement of general education students within regular education classes to the achievement of general education students in inclusion/co-teach classes to determine whether there was a significant difference in achievement between the two groups. The school district's inclusion/co-teach model included ongoing professional development support for teachers and administrators. General education teachers, special education teachers, and teacher assistants collaborated to develop instructional strategies to provide additional remediation to help students acquire the skills needed to master course content. This quantitative study reviewed the end-of-course test (EoCT) scores of Grade 10 physical science and math students within an urban school district. It is not known whether general education students in an inclusive/co-teach science or math course will demonstrate higher achievement on the EoCT in math or science than students not in an inclusive/co-teach classroom setting. In addition, this study sought to determine whether students classified as low socioeconomic status benefited from participating in co-teaching classrooms, as evidenced by standardized tests. Inferential statistics were used to determine whether there was a significant difference between the achievement of the treatment group (inclusion/co-teach) and that of the control group (non-inclusion/co-teach). The findings can be used to provide school districts with optional instructional strategies to implement in diverse modern classroom settings to increase academic performance on state standardized tests.

  17. Reliability and Validity of the New Tanaka B Intelligence Scale Scores: A Group Intelligence Test

    PubMed Central

    Uno, Yota; Mizukami, Hitomi; Ando, Masahiko; Yukihiro, Ryoji; Iwasaki, Yoko; Ozaki, Norio

    2014-01-01

    Objective: The present study evaluated the reliability and concurrent validity of the new Tanaka B Intelligence Scale, which is an intelligence test that can be administered to groups within a short period of time. Methods: The new Tanaka B Intelligence Scale and Wechsler Intelligence Scale for Children-Third Edition were administered to 81 subjects (mean age ± SD 15.2±0.7 years) residing in a juvenile detention home; reliability was assessed using Cronbach's alpha coefficient, and concurrent validity was assessed using the one-way analysis of variance intraclass correlation coefficient. Moreover, receiver operating characteristic analysis for screening for individuals who have a deficit in intellectual function (an FIQ<70) was performed. In addition, stratum-specific likelihood ratios for detection of intellectual disability were calculated. Results: The Cronbach's alpha for the new Tanaka B Intelligence Scale IQ (BIQ) was 0.86, and the intraclass correlation coefficient with FIQ was 0.83. Receiver operating characteristic analysis demonstrated an area under the curve of 0.89 (95% CI: 0.85–0.96). In addition, the stratum-specific likelihood ratio for the BIQ≤65 stratum was 13.8 (95% CI: 3.9–48.9), and the stratum-specific likelihood ratio for the BIQ≥76 stratum was 0.1 (95% CI: 0.03–0.4). Thus, intellectual disability could be ruled out or determined. Conclusion: The present results demonstrated that the new Tanaka B Intelligence Scale score had high reliability and concurrent validity with the Wechsler Intelligence Scale for Children-Third Edition score. Moreover, the post-test probability for the BIQ could be calculated when screening for individuals who have a deficit in intellectual function. The new Tanaka B Intelligence Test is convenient and can be administered within a variety of settings. This enables evaluation of intellectual development even in settings where performing intelligence tests has previously been difficult. PMID:24940880
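
    The post-test probability mentioned above follows from a likelihood ratio by converting the pre-test probability to odds, multiplying by the ratio, and converting back. The sketch below applies that standard calculation to the two stratum-specific likelihood ratios reported in the abstract (13.8 and 0.1). The 20% pre-test probability is a hypothetical value chosen for illustration and does not come from the study.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Bayes' rule in odds form: post-test odds = pre-test odds * LR."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must be strictly between 0 and 1")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


PRE_TEST = 0.20  # hypothetical prevalence, not reported in the abstract

# Likelihood ratios as reported for the two BIQ strata.
for label, lr in [("BIQ<=65", 13.8), ("BIQ>=76", 0.1)]:
    p = post_test_probability(PRE_TEST, lr)
    print(f"{label}: LR = {lr}, post-test probability = {p:.3f}")
```

    With the assumed 20% pre-test probability, a score in the lowest stratum raises the probability to roughly 78%, while a score in the highest stratum lowers it to roughly 2%, which is the sense in which the abstract says intellectual disability can be determined or ruled out.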

  18. Alaska, National SAT Scores Increase; Has the Quality of Education Improved? Assessment Report 12: An Update on the Alaska Statewide Testing Program.

    ERIC Educational Resources Information Center

    Alaska State Dept. of Education, Juneau. Office of Evaluation, Assessment and Research.

    Alaskan students' scores on the Scholastic Aptitude Test (SAT) increased nine points between 1984 and 1985, matching the national gain. These scores marked the fourth year of increases following 17 years of consistently declining scores. Thirty-three percent of Alaska's high school seniors took the SAT in 1985. The combined score of 923 was 17…

  19. Childhood Fitness and Academic Performance: An Investigation into the Effect of Aerobic Capacity on Academic Test Scores

    ERIC Educational Resources Information Center

    Hobbs, Mark

    2014-01-01

    The purpose of this quantitative study was to determine whether or not students in fifth grade who met the healthy fitness zone (HFZ) for aerobic capacity on the fall 2013 FITNESSGRAM® Test scored higher on the math portion of the 2013 fall Measures of Academic Progress (MAP) test than students who failed to reach the HFZ for aerobic capacity…

  20. Aptitude Test Scores of Prospective Graduate Students in Science Remained Essentially the Same from 1970 to 1975.

    ERIC Educational Resources Information Center

    National Science Foundation, Washington, DC. Div. of Science Resources Studies.

    Presented is a summary of an Educational Testing Service (ETS) review of mean scores on the Graduate Record Examination (GRE) of candidates for graduate study in science and engineering fields for the period 1970-1975. Test results were found to have remained essentially stable over the period within each particular field. Significant differences…

  1. On the Question of Secular Trends in the Heritability of Intelligence Test Scores: A Study of Norwegian Twins.

    ERIC Educational Resources Information Center

    Sundet, Jon Martin; And Others

    1988-01-01

    Intelligence test data collected in 1931 through 1960 on 757 identical and 1,093 fraternal male twins, from the files of the Norwegian Armed Forces, were examined for secular trends in the heritability of intelligence test scores. Only ambiguous evidence of such trends was found. (SLD)

  2. The Impact of Cooperative Learning on Critical Thinking Test Scores of Associate's Degree Graduates in Southwest Virginia

    ERIC Educational Resources Information Center

    Hodges, James Gregory

    2013-01-01

    This study examined the impact that the teaching technique known as cooperative learning had on the changes between pre- and post-test scores on all sub-categories ("induction, deduction, analysis, evaluation, inference", and "total composite") associated with the "California Critical Thinking Skills Test" (CCTST) for…

  3. Mental Help: Test-Prep Products Promise To Boost Your Students' Scores, but Do They Really Deliver?

    ERIC Educational Resources Information Center

    Hardy, Lawrence

    2001-01-01

    SkillsTutor and its competitors (including dot.com companies) are tapping a potentially lucrative market--kids wanting academic coaching to improve test scores. Some experts worry that test-prep could overemphasize skill-building materials or usurp broader-based, critical-thinking classroom activities. The jury is still out. (MLH)

  4. Assessing Follow Through: Changes in Intelligence Test Scores over Two and Three Years of Experience in the Responsive Program.

    ERIC Educational Resources Information Center

    Rayder, Nicolas; And Others

    1978-01-01

    Four Wechsler subscales were administered in a longitudinal design to children from the Responsive Model Follow Through Program. On the first testing, subjects' average intelligence scores were significantly lower, but on subsequent tests equivalent to or higher than national norms, calling into question Deutsch's cumulative-deficit hypothesis.…

  5. Error Rates in Measuring Teacher and School Performance Based on Student Test Score Gains. NCEE 2010-4004

    ERIC Educational Resources Information Center

    Schochet, Peter Z.; Chiang, Hanley S.

    2010-01-01

    This paper addresses likely error rates for measuring teacher and school performance in the upper elementary grades using value-added models applied to student test score gain data. Using realistic performance measurement system schemes based on hypothesis testing, we develop error rate formulas based on OLS and Empirical Bayes estimators.…

  6. On-Field Testing Environment and Balance Error Scoring System Performance During Preseason Screening of Healthy Collegiate Baseball Players

    PubMed Central

    Onate, James A; Beck, Brian C; Van Lunen, Bonnie L

    2007-01-01

    Context: To determine if testing environment affects Balance Error Scoring System (BESS) scores in healthy collegiate baseball players. Design: Experimental, randomized, repeated-measures design with a sample of convenience. Setting: Uncontrolled sideline and controlled locker room baseball environments. Patients or Other Participants: A total of 21 healthy collegiate baseball players (age = 20.1 ± 1.4 years, height = 185.1 ± 6.8 cm, mass = 86.3 ± 9.5 kg) with no history of head injury within the last 12 months, no lower extremity injuries reported within the past 2 months that caused them to miss 1 or more days of practice or game time, and no history of otitis media, Parkinson disease, or Meniere disease. Main Outcome Measure(s): Participants performed the BESS test in 2 environments, controlled locker room and uncontrolled sideline, in 2 testing sessions 1 week apart during the baseball preseason. The BESS scores were evaluated for each of the 6 conditions and total score across the testing sessions. Separate, paired-samples t tests with Bonferroni adjustment (P < .008) were used to examine differences between testing environments for each BESS subcategory and total score. Cohen d tests were calculated to evaluate effect sizes and relative change. Results: Significant group mean differences were found between testing environments for single-leg foam stance (P = .001), with higher scores reported for the uncontrolled sideline environment (7.33 ± 2.11 errors) compared with the controlled clinical environment (5.19 ± 2.16 errors). Medium to large effect sizes (0.53 to 1.03) were also found for single-leg foam, tandem foam, and total BESS scores, with relative increases (worse scores) of 30% to 44% in the sideline environment compared with the clinical environment. Conclusions: The BESS performance was impaired when participants were tested in a sideline environment compared with a clinical environment. 
    Baseline testing for postural control using the BESS should be conducted in the setting or environment in which testing after injury will most likely be conducted. PMID:18174931
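
    The arithmetic behind two figures in the BESS abstract above can be sketched as follows. The 0.05 family-wise level and the count of six comparisons are assumptions (the abstract gives only the adjusted threshold, P < .008); the two means are the single-leg foam stance values quoted in the abstract.

```python
# Single-leg foam stance means quoted in the abstract (errors).
sideline_mean = 7.33
clinical_mean = 5.19


def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni adjustment."""
    return family_alpha / n_tests


def relative_change(new: float, baseline: float) -> float:
    """Change from baseline expressed as a fraction of the baseline."""
    return (new - baseline) / baseline


# Assumption: 0.05 family-wise level split over 6 comparisons.
alpha_per_test = bonferroni_alpha(0.05, 6)
increase = relative_change(sideline_mean, clinical_mean)

print(f"per-test alpha    = {alpha_per_test:.4f}")
print(f"relative increase = {increase:.1%}")
```

    Under these assumptions the per-test level rounds to .008, and the single-leg foam increase falls inside the 30% to 44% range reported in the abstract.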

  7. Efforts to Produce Relevant Score Reports to School, District, and State Officials on National Tests

    ERIC Educational Resources Information Center

    Patelis, Thanos; Matos-Elefonte, Haifa

    2009-01-01

    Presented at the Annual National Council on Measurement in Education (NCME) in San Diego in April 2009. This presentation explores how the College Board strives to ensure the relevance and utility of score reporting practices and methods for the PSAT/NMSQT and SAT scores. The new reporting methods allow for greater interaction and intervention at…

  8. Categorical Differences in Statewide Standardized Testing Scores of Students with Disabilities

    ERIC Educational Resources Information Center

    Trexler, Ellen L.

    2013-01-01

    The No Child Left Behind Act requires all students be proficient in reading and mathematics by 2014, and students in subgroups to make Adequate Yearly Progress. One of these groups is students with disabilities, who continue to score well below their general education peers. This quantitative study identified scoring differences between disability…

  9. Genetic parameters for test day somatic cell score in Brazilian Holstein cattle.

    PubMed

    Costa, C N; Santos, G G; Cobuci, J A; Thompson, G; Carvalheira, J G V

    2015-01-01

    Selection for lower somatic cell count has been included in the breeding objectives of several countries in order to increase resistance to mastitis. Genetic parameters of somatic cell scores (SCS) were estimated from the first lactation test day records of Brazilian Holstein cows using random-regression models with Legendre polynomials (LP) of the order 3-5. Data consisted of 87,711 TD produced by 10,084 cows, sired by 619 bulls calved from 1993 to 2007. Heritability estimates varied from 0.06 to 0.14 and decreased from the beginning of the lactation up to 60 days in milk (DIM) and increased thereafter to the end of lactation. Genetic correlations between adjacent DIM were very high (>0.83) but decreased to negative values, obtained with LP of order four, between DIM in the extremes of lactation. Despite the favorable trend, genetic changes in SCS were not significant and did not differ among LP. There was little benefit of fitting an LP of an order >3 to model animal genetic and permanent environment effects for SCS. Estimates of variance components found in this study may be used for breeding value estimation for SCS and selection for mastitis resistance in Holstein cattle in Brazil. PMID:26782564
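
    As a rough illustration of the covariables used in a random-regression model with Legendre polynomials, the sketch below maps days in milk (DIM) onto [-1, 1] and evaluates the polynomials there. The 5 to 305 DIM range, the function names, and the use of unnormalized polynomials are assumptions for illustration and are not taken from the abstract; whether "order 3" means degree 3 or three coefficients also depends on the convention in use.

```python
def standardize_dim(dim: float, dim_min: float = 5, dim_max: float = 305) -> float:
    """Map days in milk onto [-1, 1]; the default range is hypothetical."""
    return 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0


def legendre_basis(x: float, degree: int) -> list[float]:
    """Return [P_0(x), ..., P_degree(x)] using Bonnet's recurrence:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x).
    """
    values = [1.0, x]
    for n in range(1, degree):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: degree + 1]


# Covariables for one test-day record at 80 DIM, up to degree 3.
x = standardize_dim(80)
print(x, legendre_basis(x, 3))
```

    Each test-day record contributes one such row of covariables, and the regression coefficients on them are what the model treats as random for each animal.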

  10. Hand Test scores of panic disordered outpatients sexually abused as children.

    PubMed

    Zizolfi, S; Cilli, G; Concari, S; Colombo, G

    1997-12-01

    A history of childhood sexual abuse has been implicated in a variety of adult psychiatric disorders as more frequent in females than in males and in subjects with more prominent dissociative symptoms such as panic disorder. Previous research has varied greatly in terms of methods, measurement instruments, and reported findings. Recent studies, however, suggest that projective techniques may be useful in resolving some of these inconsistencies. The present study utilized the Hand Test to investigate the late effects of childhood sexual trauma in a group of authenticated cases of panic disordered adult outpatients sexually abused as children compared to a matched sample of presumably nonabused patients. No statistically significant differences on quantitative variables were obtained between the two groups, but the group of outpatients (n = 16) sexually abused as children showed a larger latency to the ninth card of the Hand Test (shock reaction). This may be a potentially useful index in investigating cases of suspected abuse and confirms Wagner's (1983) contention that Card IX has a psychosexual "pull" as documented also by Italian studies. PMID:9450295

  11. A hidden Markov model to predict early mastitis from test-day somatic cell scores.

    PubMed

    Detilleux, J C

    2011-02-01

    In many countries, high somatic cell scores (SCS) in milk are used as an indicator for mastitis because they are collected on a routine basis. However, individual test-day SCS are not very accurate in identifying infected cows. Mathematical models may improve the accuracy of the biological marker by making better use of the information contained in the available data. Here, a simple hidden Markov model (HMM) is described mathematically and applied to SCS recorded monthly on cows with or without clinical mastitis to evaluate its accuracy in estimating parameters (mean, variance and transition probabilities) under healthy or diseased states. The SCS means were estimated at 1.96 (s.d. = 0.16) and 4.73 (s.d. = 0.71) for the hidden healthy and infected states, and the common variance at 0.83 (s.d. = 0.11). The probability of remaining uninfected, recovering from infection, getting newly infected and remaining infected between consecutive test days was estimated at 78.84%, 60.49%, 11.70% and 15%, respectively. Three different health-related states were compared: clinical stages observed by farmers, subclinical cases defined for somatic cell counts below or above 250 000 cells/ml and infected stages obtained from the HMM. The results showed that HMM identifies infected cows before the appearance of clinical and subclinical signs, which may critically improve the power of the studies on the genetic determinants of SCS and reduce biases in predicting breeding values for SCS. PMID:22440761
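
    A minimal sketch of how a two-state hidden Markov model turns a sequence of test-day scores into infection probabilities is given below. The state means and common variance are the estimates quoted in the abstract; the transition matrix, the starting distribution, and the example record are placeholders invented for illustration, not the reported estimates.

```python
import math

MEANS = {"healthy": 1.96, "infected": 4.73}  # state means from the abstract
VARIANCE = 0.83                               # common variance from the abstract

# Hypothetical transition probabilities (each row sums to 1); NOT the
# estimates reported in the abstract.
TRANS = {
    "healthy": {"healthy": 0.85, "infected": 0.15},
    "infected": {"healthy": 0.60, "infected": 0.40},
}
START = {"healthy": 0.9, "infected": 0.1}     # hypothetical starting distribution


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def filter_states(scores):
    """Forward algorithm: P(state on day t | scores up to day t)."""
    posteriors = []
    belief = dict(START)
    for t, x in enumerate(scores):
        if t > 0:
            # Predict: push yesterday's belief through the transition matrix.
            belief = {s: sum(belief[r] * TRANS[r][s] for r in belief) for s in MEANS}
        # Update: weight each state by the likelihood of today's score.
        weighted = {s: belief[s] * gaussian_pdf(x, MEANS[s], VARIANCE) for s in MEANS}
        total = sum(weighted.values())
        belief = {s: w / total for s, w in weighted.items()}
        posteriors.append(belief)
    return posteriors


monthly_scs = [1.8, 2.1, 3.9, 5.0, 2.2]  # made-up monthly record for one cow
for day, post in enumerate(filter_states(monthly_scs), start=1):
    print(day, round(post["infected"], 3))
```

    The filtered probability depends on the preceding scores as well as the current one, which is the sense in which such a model uses more of the available data than a single test-day threshold.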

  12. The Relationship among Student Achievement Scores on the Math and Science End-of-Course-Tests and Scores on the High School Graduation Test

    ERIC Educational Resources Information Center

    Turner, Sherry L.

    2011-01-01

    Thirteen percent of the 2008-2009 senior class in one southeastern state did not pass the science portion of the state's high school graduation test. Another 5% failed to pass the math portion of the graduation test, leaving these students unable to obtain a high school diploma. The purpose of this nonexperimental quantitative research study was…

  14. Effect of locomotion score on sows' performances in a feed reward collection test.

    PubMed

    Bos, E-J; Nalon, E; Maes, D; Ampe, B; Buijs, S; van Riet, M M J; Millet, S; Janssens, G P J; Tuyttens, F A M

    2015-10-01

    Sows housed in groups have to move through their pen to fulfil their behavioural and physiological needs such as feeding and resting. In addition to causing pain and discomfort, lameness may restrict the ability of sows to fulfil such needs. The aim of our study was to investigate the extent to which the mobility of sows is affected by different degrees of lameness. Mobility was measured as the sow's willingness or capability to cover distances. Feed-restricted hybrid sows with different gait scores were subjected to a feed reward collection test in which they had to walk distances to obtain subsequent rewards. In all, 29 group-housed sows at similar gestation stage (day 96.6 ± 7 s.d.) were visually recorded for gait and classified as non-lame, mildly lame, moderately lame or severely lame. All sows received 2.6 kg of standard commercial gestation feed per day. The test arena consisted of two feeding locations separated from each other by a Y-shaped middle barrier. Feed rewards were presented at the two feeders in turn, using both light and sound cues to signal the availability of a new feed reward. Sows were individually trained during 5 non-consecutive days for 10 min/day with increasing barrier length (range: 0 to 3.5 m) each day. After training, sows were individually tested once per day on 3 non-consecutive days with the maximum barrier length such that they had to cover 9.3 m to walk from one feeder to the other. The outcome variable was the number of rewards collected in a 15-min time span. Non-lame and mildly lame sows obtained more rewards than moderately lame and severely lame sows (P<0.01). However, no significant difference was found between non-lame and mildly lame sows (P=0.69), nor between moderately lame and severely lame sows (P=1.00). 
    This feed reward collection test indicates that both moderately lame and severely lame sows are limited in their combined ability and willingness to walk, but did not reveal an effect of mild lameness on mobility. These findings suggest that moderately and more severely lame sows, but not mildly lame sows, might suffer from reduced access to valuable resources in group housing systems. PMID:26160227

  15. What No Child Left Behind Leaves Behind: The Roles of IQ and Self-Control in Predicting Standardized Achievement Test Scores and Report Card Grades

    PubMed Central

    Duckworth, Angela L.; Quinn, Patrick D.; Tsukayama, Eli

    2013-01-01

    The increasing prominence of standardized testing to assess student learning motivated the current investigation. We propose that standardized achievement test scores assess competencies determined more by intelligence than by self-control, whereas report card grades assess competencies determined more by self-control than by intelligence. In particular, we suggest that intelligence helps students learn and solve problems independent of formal instruction, whereas self-control helps students study, complete homework, and behave positively in the classroom. Two longitudinal, prospective studies of middle school students support predictions from this model. In both samples, IQ predicted changes in standardized achievement test scores over time better than did self-control, whereas self-control predicted changes in report card grades over time better than did IQ. As expected, the effect of self-control on changes in report card grades was mediated in Study 2 by teacher ratings of homework completion and classroom conduct. In a third study, ratings of middle school teachers about the content and purpose of standardized achievement tests and report card grades were consistent with the proposed model. Implications for pedagogy and public policy are discussed. PMID:24072936

  16. Teacher Competency Testing and Equity: Implications for Teacher Education.

    ERIC Educational Resources Information Center

    Wood, Eric F.

    This paper examines some of the implications of present testing programs for minority groups. The effects on prospective teachers and the institutions that prepare them are discussed. The role of the testing movement in molding the curriculum that is taught in colleges of education is described with particular consideration of state-mandated…

  17. An Investigation of Calculator Use on Employment Tests of Mathematical Ability: Effects on Reliability, Validity, Test Scores, and Speed of Completion

    ERIC Educational Resources Information Center

    Bing, Mark N.; Stewart, Susan M.; Davison, H. Kristl

    2009-01-01

    Handheld calculators have been used on the job for more than 30 years, yet the degree to which these devices can affect performance on employment tests of mathematical ability has not been thoroughly examined. This study used a within-subjects research design (N = 167) to investigate the effects of calculator use on test score reliability, test…

  18. School performance and IQ-test scores at age 13 as related to birth weight and gestational age.

    PubMed

    Lagerström, M; Bremme, K; Eneroth, P; Magnusson, D

    1991-01-01

    The cohort in the present longitudinal research program consisted of 873 children in an entire school grade, in a Swedish community. The present results showed a main effect of birth weight; low birth weight (LBW) children had lower school performance and intelligence-test (IQ) scores at age 13 than did normal birth weight (NBW) children irrespective of parental SES. Second, there was no significant main effect of gestational age (GA) on scholastic performance and IQ-test scores. Third, there was a significant main effect of the combination of birth weight and GA on scholastic performance and IQ-test scores. The LBW children born at term (38-40 pregnancy weeks; pw) had significantly lower scores and school grades as compared to the control group while the LBW children born with short gestational age (34-37 pw) and with very short gestational age (less than 34 pw) had significantly lower scores and marks in fewer areas of academic attainment. PMID:1775949

  19. Exploration of Analysis Methods for Diagnostic Imaging Tests: Problems with ROC AUC and Confidence Scores in CT Colonography

    PubMed Central

    Mallett, Susan; Halligan, Steve; Collins, Gary S.; Altman, Doug G.

    2014-01-01

    Background Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. Methods In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. Results Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. Conclusions The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests. PMID:25353643
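
    The contrast between the two kinds of measure in the abstract above can be illustrated with a toy example. All data below are invented, and the AUC is the simple empirical (rank-based) estimate, not the fitted ROC curves discussed in the abstract. The example includes zero confidence scores for cases where no polyp was reported and one high-confidence false positive.

```python
def sensitivity_specificity(truth, calls):
    """Sensitivity and specificity from binary truth and binary reader calls."""
    tp = sum(1 for t, c in zip(truth, calls) if t and c)
    fn = sum(1 for t, c in zip(truth, calls) if t and not c)
    tn = sum(1 for t, c in zip(truth, calls) if not t and not c)
    fp = sum(1 for t, c in zip(truth, calls) if not t and c)
    return tp / (tp + fn), tn / (tn + fp)


def empirical_auc(truth, scores):
    """Probability that a positive case outscores a negative one (ties count 1/2)."""
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented data: 4 cases with polyps, 6 without.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
calls = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]            # reader's present/absent report
confidence = [90, 80, 70, 0, 0, 0, 0, 0, 0, 95]   # 0 recorded when nothing was seen

sens, spec = sensitivity_specificity(truth, calls)
auc = empirical_auc(truth, confidence)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.2f}")
```

    In this toy data the many tied zero scores and the single confident false positive both pull the AUC down, while sensitivity and specificity are read directly from the present/absent calls.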

  20. Science course sequences: The alignment of written, enacted, and tested curricula and their impact on grade 11 HSPA science scores

    NASA Astrophysics Data System (ADS)

    Lentz, Christine A.

    The purpose of this mixed method study was to examine the alignment of the written, enacted, and tested curricula of the Ocean City High School science course sequencing and its impact on student achievement. This study also examined the school's ability to predict student scores on the science portion of the High School Proficiency Assessment (HSPA). Data collected for science achievement included the science portion of the Grade Eight Proficiency Assessment (GEPA) as a pretest and the scores for the science portion of the HSPA as a posttest. Data collected for curriculum alignment included an examination of teacher-generated course curriculum maps to determine the alignment with the New Jersey Core Curriculum Content Standards and the HSPA Test Specifications Directory. The quantitative data were analyzed with a series of paired-samples t-tests; Pearson product-moment correlation was used to examine relationships between variables; and an ANCOVA and a stepwise regression analysis were also completed. Based on the findings, the following conclusions were drawn: (1) the alignment of the enacted curriculum with the tested and written curricula affected science achievement; (2) GEPA scores are significantly tied to HSPA scores; and (3) GEPA scores, together with enrollment in the science sequence whose curriculum was aligned with the written and tested curricula, met the requirements of a predictor of scores on the HSPA exam. It is expected that educational leadership will use the results of this research to inform practice and drive decision-making with respect to student placement into course sequences. It is hoped that the results will not only increase support for the district's curricula development plan but also add to the overall body of knowledge surrounding science program effectiveness in relation to the No Child Left Behind standards.

  1. Latent ability: grades and test scores systematically underestimate the intellectual ability of negatively stereotyped students.

    PubMed

    Walton, Gregory M; Spencer, Steven J

    2009-09-01

    Past research has assumed that group differences in academic performance entirely reflect genuine differences in ability. In contrast, extending research on stereotype threat, we suggest that standard measures of academic performance are biased against non-Asian ethnic minorities and against women in quantitative fields. This bias results not from the content of performance measures, but from the context in which they are assessed: from psychological threats in common academic environments, which depress the performances of people targeted by negative intellectual stereotypes. Like the time of a track star running into a stiff headwind, such performances underestimate the true ability of stereotyped students. Two meta-analyses, combining data from 18,976 students in five countries, tested this latent-ability hypothesis. Both meta-analyses found that, under conditions that reduce psychological threat, stereotyped students performed better than nonstereotyped students at the same level of past performance. We discuss implications for the interpretation of and remedies for achievement gaps. PMID:19656335

  2. Drug Testing in Schools: Implications for Policy.

    ERIC Educational Resources Information Center

    Bozeman, William C.; And Others

    1987-01-01

    Public concern about substance abuse, fueled by political and media attention, is causing school administrators to consider a variety of approaches beyond traditional drug education. No procedures, methods, or rules regarding drug testing should be established in the absence of clear school board policy, and no policy decisions should be made…

  3. Performance of children on the Turkish Nonword Repetition Test: Effect of word similarity, word length, and scoring.

    PubMed

    Topbaş, Seyhun; Kaçar-Kütükçü, Dilber; Kopkalli-Yavuz, Handan

    2014-01-01

    This study aims to report the preliminary results of the development of the Turkish Nonword Repetition Test and to contribute to the clinical accuracy of the test by comparing the performance of children with specific language impairment with that of language-level matched and age-matched typically developing children on a nonword repetition (NWR) test developed for Turkish. To determine the effect of word similarity and word length, the Turkish Nonword Repetition Test is composed of language-like and language-unlike items. To determine the effect of scoring, the performances of children were scored as correct/incorrect for a whole word, for only the consonants, and for only the vowels. The findings suggest that the test is a reliable tool to differentiate Turkish-speaking children with SLI from typically developing children. PMID:25000381

  4. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Maine

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Maine's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 445 for non-Title I students and 438 for Title I students. In 2009, the mean scale score in 4th grade reading was 477 for non-Title I students and 441 for Title I students. Between 2006 and 2009, the mean scale score…
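
    The gap between the two groups can be computed directly from the figures quoted in the abstract above. This is only arithmetic on the numbers exactly as quoted; the dictionary layout and names are illustrative, and the truncated abstract's own conclusions are not reproduced here.

```python
# Mean scale scores, 4th grade reading, as quoted in the abstract.
scores = {
    2006: {"non_title_i": 445, "title_i": 438},
    2009: {"non_title_i": 477, "title_i": 441},
}


def gap(year: int) -> int:
    """Non-Title I mean minus Title I mean for the given year."""
    return scores[year]["non_title_i"] - scores[year]["title_i"]


gap_2006 = gap(2006)           # 445 - 438 = 7
gap_2009 = gap(2009)           # 477 - 441 = 36
change = gap_2009 - gap_2006   # 36 - 7 = 29

print(f"gap 2006 = {gap_2006}, gap 2009 = {gap_2009}, change = {change:+d}")
```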

  5. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Idaho

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Idaho's test score trends through 2008-09. In 2007, the mean scale score on the state 4th grade reading test was 209 for non-Title I students and 205 for Title I students. In 2009, the mean scale score in 4th grade reading was 211 for non-Title I students and 208 for Title I students. Between 2007 and 2009, the mean scale score…

  6. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Kansas

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Kansas' test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 80 for non-Title I students and 73 for Title I students. In 2009, the mean scale score in 4th grade reading was 84 for non-Title I students and 78 for Title I students. Between 2006 and 2009, the mean scale score…

  7. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Utah

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Utah's test score trends through 2008-09. In 2004, the mean scale score on the state 4th grade reading test was 167 for non-Title I students and 164 for Title I students. In 2009 the mean scale score in 4th grade reading was 168 for non-Title I students and 164 for Title I students. Between 2004 and 2009, the mean scale score…

  8. Science standardized achievement tests: The relationship between publishers, textbook completion, admission standards and science test scores of seventh through ninth grade students in FACCS schools

    NASA Astrophysics Data System (ADS)

    Nix, Sharon J.

    Scaled scores from the Stanford Achievement Test Series, Tenth Edition were examined in this causal-comparative study to determine if science publishers in Florida Association of Christian Colleges and Schools (FACCS), textbook completion rates, and admission standards affect standardized test scores. Administrators from 34 schools in FACCS participated in the study by returning an original eleven-question survey instrument to help ascertain what differences or relationships affect standardized test scores. Nine Mann-Whitney tests, one for each grade level in seventh through ninth, did not reveal a significant difference on hypotheses 1a-3c. Publishers (BJU Press, A.C.E., Glencoe, Prentice Hall), standardized tests, entrance exams, GPA, and ability index factors were reviewed in the study. The results of this study might prompt administrators to consider factors other than publisher usage, textbook completion, and admission standards when attempting to close achievement gaps.

  9. Longitudinal analysis of standardized test scores of students in the Science Writing Heuristic approach

    NASA Astrophysics Data System (ADS)

    Chanlen, Niphon

    The purpose of this study was to examine the longitudinal impacts of the Science Writing Heuristic (SWH) approach on student science achievement measured by the Iowa Test of Basic Skills (ITBS). A number of studies have reported positive impact of an inquiry-based instruction on student achievement, critical thinking skills, reasoning skills, attitude toward science, etc. So far, studies have focused on exploring how an intervention affects student achievement using teacher/researcher-generated measurement. Only a few studies have attempted to explore the long-term impacts of an intervention on student science achievement measured by standardized tests. The students' science and reading ITBS data was collected from 2000 to 2011 from a school district which had adopted the SWH approach as the main approach in science classrooms since 2002. The data consisted of 12,350 data points from 3,039 students. The multilevel model for change with discontinuity in elevation and slope technique was used to analyze changes in student science achievement growth trajectories prior and after adopting the SWH approach. The results showed that the SWH approach positively impacted students by initially raising science achievement scores. The initial impact was maintained and gradually increased when students were continuously exposed to the SWH approach. Disadvantaged students who were at risk of having low science achievement had bigger benefits from experience with the SWH approach. As a result, existing problematic achievement gaps were narrowed down. Moreover, students who started experience with the SWH approach as early as elementary school seemed to have better science achievement growth compared to students who started experiencing with the SWH approach only in high school. 
    The results found in this study not only confirmed the positive impacts of the SWH approach on student achievement, but also demonstrated additive impacts found when students had longitudinal experiences with the approach. By engaging in argument-based classrooms where teachers value students' prior knowledge, encourage students to take control of their learning, and provide a non-threatening environment for students to develop big ideas through negotiation, students' achievement can be enhanced. The results also started to shed some light on the sustainability of the SWH approach within the school district.

  10. Indian Students Outperform Blacks on NAEP: Federal Report Is First In-Depth Analysis of Such Test Scores

    ERIC Educational Resources Information Center

    Klein, Alyson

    2006-01-01

    American Indian students tend to lag behind their white and Asian-American peers on National Assessment of Educational Progress reading and mathematics tests in 4th and 8th grade, but they score higher on average than African-American students, according to a first-of-its-kind federal analysis. The U.S. Department of Education says the May 23…

  11. Improving Secondary Practical Computer Skills: Logo Test Scores through Graphically Designed Computer Programs and Utilization of Multimedia and Technology.

    ERIC Educational Resources Information Center

    Miller, Douglas S.

    The intent of this project was to improve test and programming scores of 9th through 12th grade students enrolled in the Practical Computer Skills: Logo course in a north central Florida high school. An implementation program that demonstrated teacher-designed graphical computer language Logo programs, utilized multimedia techniques, and used…

  12. Differential Predictive Validity of High School GPA and College Entrance Test Scores for University Students in Yemen

    ERIC Educational Resources Information Center

    Al-Hattami, Abdulghani Ali Dawod

    2012-01-01

    High school grade point average and college entrance test scores are two admission criteria that are currently used by most colleges in Yemen to select their prospective students. Given their widespread use, it is important to investigate their predictive validity to ensure the accuracy of the admission decisions in these institutions. This study…

  13. Legal Issues in the Use of Student Test Scores and Value-Added Models (VAM) to Determine Educational Quality

    ERIC Educational Resources Information Center

    Pullin, Diana

    2013-01-01

    A growing number of states and local schools across the country have adopted educator evaluation and accountability programs based on the use of student test scores and value-added models (VAM). A wide array of potential legal issues could arise from the implementation of these programs. This article uses legal analysis and social science evidence…

  14. Comparing Standardized Test Scores among Arts-Integrated and Non-Arts Integrated Schools in Central Mississippi

    ERIC Educational Resources Information Center

    Dean, Darlene

    2014-01-01

    The topic of arts integration creates continuing dialog among educators and arts advocates. This study examined the degree to which student achievement was affected when arts education is limited or eliminated from schools to meet the mandates of NCLB (2001) legislation. Standardized test scores from 12 schools in Central Mississippi were used to…

  15. Examination of the Psychometric Properties of the Test Anxiety Scale for Elementary Students (TAS-E) Scores

    ERIC Educational Resources Information Center

    Lowe, Patricia A.; Grumbein, Matthew J.; Raad, Jennifer M.

    2011-01-01

    The psychometric properties of the Test Anxiety Scale for Elementary Students (TAS-E) scores were examined. In Study 1, an exploratory factor analysis (EFA) was performed on the responses of 997 students in Grades 2 to 6 on the TAS-E. The results of the EFA produced a four-factor solution: Physiological Hyperarousal, Social Concerns, Task…

  16. The Effects of Georgia's Choice Curricular Reform Model on Third Grade Science Scores on the Georgia Criterion Referenced Competency Test

    ERIC Educational Resources Information Center

    Phemister, Art W.

    2010-01-01

    The purpose of this study was to evaluate the effectiveness of the Georgia's Choice reading curriculum on third grade science scores on the Georgia Criterion Referenced Competency Test from 2002 to 2008. In assessing the effectiveness of the Georgia's Choice curriculum model this causal comparative study examined the 105 elementary schools that…

  17. A Study of the Relationship between Student Placement Test Scores and Final Grades in Physics 121 at Pima College.

    ERIC Educational Resources Information Center

    Iadevaia, David G.

    A study was conducted at Pima Community College to determine the relationship between the final grade received by students in an introductory, algebra-based physics course (PHY 121) and their scores on the reading, writing, and mathematics portions of the college's nonmandatory assessment test. Between 1983 and 1988, 639 students obtained a final…

  18. Is the Test Score Decline Responsible for the Productivity Growth Decline? Working Paper No. 87-05.

    ERIC Educational Resources Information Center

    Bishop, John

    This paper presents evidence that recent aptitude test score decline is signaling a significant deterioration in the quality of entering cohorts of workers. The impact of general intellectual achievement (GIA) on productivity; trends in the GIA of the adult populations, students, and working adults; accounting for the labor quality growth when…

  19. Options in Education, Transcript for March 8, 1976: Parent Tutors, Feminization of the Teaching Profession, Test Score Controversy, and Busing.

    ERIC Educational Resources Information Center

    George Washington Univ., Washington, DC. Inst. for Educational Leadership.

    "Options in Education" is a radio news program which focuses on issues and developments in education. This transcript contains discussions of volunteer parent tutors in a junior high school, the feminization of the teaching profession, the test score controversy, busing as an issue in the political primaries, and busing and the role of the social…

  20. School Policies and the Black-White Test Score Gap. Working Paper Series. SAN08-03

    ERIC Educational Resources Information Center

    Ladd, Helen F.

    2008-01-01

    This paper examines school-related policies and strategies that have been proposed or justified, at least in part, on the basis of their potential for reducing black-white test score gaps. These include strategies, one of which is greater integration, to reduce differences in the quality of teachers faced by black and white students; school and…

  1. Will Teacher Value-Added Scores Change When Accountability Tests Change? What We Know Series: Value-Added Methods and Applications. Knowledge Brief 8

    ERIC Educational Resources Information Center

    McCaffrey, Daniel F.

    2013-01-01

    Value-added evaluations use student test scores to assess teacher effectiveness. How student achievement is judged can depend on which test is used to measure it. Thus it is reasonable to ask whether a teacher's value-added score depends on which test is used to calculate it. Would it change if a different test was used? Specifically, might a…

  2. How Close Is Close Enough? Testing Nonexperimental Estimates of Impact against Experimental Estimates of Impact with Education Test Scores as Outcomes. Discussion Paper.

    ERIC Educational Resources Information Center

    Wilde, Elizabeth Ty; Hollister, Robinson

    This study tested the performance of nonexperimental estimators of impacts applied to a class size reduction intervention with achievement test scores as the outcome. Nonexperimental estimates of impacts were compared to "true impact" estimates provided by a random-assignment design that assessed intervention effects. Data came from Project STAR,…

  3. How Close Is Close Enough? Testing Nonexperimental Estimates of Impact against Experimental Estimates of Impact with Education Test Scores as Outcomes. Discussion Paper No. 1242-02

    ERIC Educational Resources Information Center

    Wilde, Elizabeth Ty; Hollister, Robinson

    2002-01-01

    In this study we test the performance of some nonexperimental estimators of impacts applied to an educational intervention--reduction in class size--where achievement test scores were the outcome. We compare the nonexperimental estimates of the impacts to "true impact" estimates provided by a random-assignment design used to assess the…

  4. Classifying and scoring of molecules with the NGN: new datasets, significance tests, and generalization

    PubMed Central

    2010-01-01

    This paper demonstrates how a Neural Grammar Network learns to classify and score molecules for a variety of tasks in chemistry and toxicology. In addition to a more detailed analysis on datasets previously studied, we introduce three new datasets (BBB, FXa, and toxicology) to show the generality of the approach. A new experimental methodology is developed and applied to both the new datasets as well as previously studied datasets. This methodology is rigorous and statistically grounded, and ultimately culminates in a Wilcoxon significance test that proves the effectiveness of the system. We further include a complete generalization of the specific technique to arbitrary grammars and datasets using a mathematical abstraction that allows researchers in different domains to apply the method to their own work. Background Our work can be viewed as an alternative to existing methods to solve the quantitative structure-activity relationship (QSAR) problem. To this end, we review a number of approaches from both a methodological and a performance perspective. In addition to these approaches, we also examined a number of chemical properties that can be used by generic classifier systems, such as feed-forward artificial neural networks. In studying these approaches, we identified a set of interesting benchmark problem sets to which many of the above approaches had been applied. These included: ACE, AChE, AR, BBB, BZR, Cox2, DHFR, ER, FXa, GPB, Therm, and Thr. Finally, we developed our own benchmark set by collecting data on toxicology. Results Our results show that our system performs better than, or comparably to, the existing methods over a broad range of problem types. Our method does not require the expert knowledge that is necessary to apply the other methods to novel problems.
Conclusions We conclude that our success is due to the ability of our system to: 1) encode molecules losslessly before presentation to the learning system, and 2) leverage the design of molecular description languages to facilitate the identification of relevant structural attributes of the molecules over different problem domains. PMID:21034429

  5. Chronic obstructive pulmonary disease (COPD) assessment test scores corresponding to modified Medical Research Council grades among COPD patients

    PubMed Central

    Lee, Chang-Hoon; Lee, Jinwoo; Park, Young Sik; Lee, Sang-Min; Yim, Jae-Joon; Kim, Young Whan; Han, Sung Koo; Yoo, Chul-Gyu

    2015-01-01

    Background/Aims: In assigning patients with chronic obstructive pulmonary disease (COPD) to subgroups according to the updated guidelines of the Global Initiative for Chronic Obstructive Lung Disease, discrepancies have been noted between the COPD assessment test (CAT) criteria and modified Medical Research Council (mMRC) criteria. We investigated the determinants of symptom and risk groups and sought to identify a better CAT criterion. Methods: This retrospective study included COPD patients seen between June 20, 2012, and December 5, 2012. The CAT score that can accurately predict an mMRC grade ≥ 2 versus < 2 was evaluated by comparing the area under the receiver operating curve (AUROC) and by classification and regression tree (CART) analysis. Results: Among 428 COPD patients, the percentages of patients classified into subgroups A, B, C, and D were 24.5%, 47.2%, 4.2%, and 24.1% based on CAT criteria and 49.3%, 22.4%, 8.9%, and 19.4% based on mMRC criteria, respectively. More than 90% of the patients who met the mMRC criteria for the ‘more symptoms group’ also met the CAT criteria. AUROC and CART analyses suggested that a CAT score ≥ 15 predicted an mMRC grade ≥ 2 more accurately than the current CAT score criterion. During follow-up, patients with CAT scores of 10 to 14 did not have a different risk of exacerbation versus those with CAT scores < 10, but they did have a lower exacerbation risk compared to those with CAT scores of 15 to 19. Conclusions: A CAT score ≥ 15 is a better indicator for the ‘more symptoms group’ in the management of COPD patients. PMID:26354057

  6. Prediction of Mental Health Through Computer Scoring of a Sentence Completion Test.

    ERIC Educational Resources Information Center

    Menaker, Shirley L.; And Others

    An investigation was undertaken to determine how successfully an overall adjustment rating made by trained clinical raters could be predicted by means of a computer program for scoring adjustment which used only the data from a one word sentence completion instrument. Subjects for the study were 69 female college seniors at the University of…

  7. The Relationship between Kindergarten Students' Home Block Play and Their Spatial Ability Test Scores

    ERIC Educational Resources Information Center

    Jones, Tracy Anne

    2010-01-01

    Researchers are increasingly aware of the role of spatial skills in preparing children for future mathematics achievement (National Mathematics Advisory Panel, 2008). In addition, sex differences have been consistently documented showing boys score higher than girls in assessments of spatial ability, particularly mental rotation (Linn & Peterson,…

  8. The Relation of Self-Reported Abilities to Aptitude Test Scores: A Replication and Extension.

    ERIC Educational Resources Information Center

    Carson, Andrew D.

    1998-01-01

    A larger canonical correlation of Self-Directed Search scores with those from the Ball Aptitude Battery for 198 high school students was found compared to results from a 1977 study being replicated. Only modest overlap between the Self Directed Search and the Ball Aptitude Battery was found. (SK)

  9. Examining the Achievement Test Score Gap between Urban and Suburban Students

    ERIC Educational Resources Information Center

    Sandy, Jonathan; Duncan, Kevin

    2010-01-01

    Data from the National Longitudinal Survey of Labor Market Experience for Youth (1997 cohort) are used to examine the urban school achievement gap. Specifically, we use the Blinder-Oaxaca technique to decompose differences in Armed Services Vocational Aptitude Battery scores for students who attended urban and suburban schools. We find that…
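
    For readers unfamiliar with the Blinder-Oaxaca technique mentioned above, one common twofold form is sketched below. The abstract does not say which reference coefficients the authors used, so the choice of the suburban coefficients as the reference, and the subscripts S (suburban) and U (urban), are assumptions for illustration.

```latex
% Separate regressions per group: \bar{Y}_g = \bar{X}_g^{\top}\hat{\beta}_g, g \in \{S, U\}
% (holds at the group means when each regression includes an intercept).
\bar{Y}_S - \bar{Y}_U
  = \underbrace{(\bar{X}_S - \bar{X}_U)^{\top}\hat{\beta}_S}_{\text{explained (characteristics)}}
  + \underbrace{\bar{X}_U^{\top}(\hat{\beta}_S - \hat{\beta}_U)}_{\text{unexplained (coefficients)}}
```

    Expanding the right-hand side gives \bar{X}_S^{\top}\hat{\beta}_S - \bar{X}_U^{\top}\hat{\beta}_U, so the two terms sum exactly to the raw score gap.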

  10. Teenage Self Test: cigarette smoking. Discussion Leader's Guide. How do you score?

    ERIC Educational Resources Information Center

    Public Health Service (DHEW), Rockville, MD. National Clearinghouse for Smoking and Health.

    This self-scoring questionnaire on attitudes related to smoking includes norms based upon the responses of 7,000 teenagers and a discussion of the meaning of eight subscores. The subscores are: (1) effect of smoking on health; (2) non-smoker's rights; (3) positive effects of smoking; (4) manufactured reasons for smoking; (5) reasons for starting;…

  11. Out-of-School Time Program Test Score Impact for Black Children of Single-Parents

    ERIC Educational Resources Information Center

    Nagle, Barry T.

    2013-01-01

    Out-of-School Time programs and their impact on standardized college entrance exam scores for black or African-American children of single parents who have applied for a competitive college scholarship program is the study focus. Study importance is supported by the large percentage of black children raised by single parents, the large percentage…

  12. Using Subject Test Scores Efficiently to Predict Teacher Value-Added

    ERIC Educational Resources Information Center

    Lefgren, Lars; Sims, David

    2012-01-01

    This article develops a simple model of teacher value-added to show how efficient use of information across subjects can improve the predictive ability of value-added models. Using matched student-teacher data from North Carolina, we show that the optimal use of math and reading scores improves the fit of prediction models of overall future…

  14. The Fagerström test for nicotine dependence: a comparison of standard scoring and latent class analysis approaches.

    PubMed

    Storr, Carla L; Reboussin, Beth A; Anthony, James C

    2005-11-01

    The classification of being tobacco dependent obtained via the established scoring method of the Fagerström test for nicotine dependence (FTND) is compared to a method that bases classification on the pattern of item responses. Young adults participating in a longitudinal study, who indicated they had ever smoked, were asked six standardized items (n = 962; mean age 21 years). By standard scoring, the mean FTND score was 1.9 (S.E.= 2.3): 66% of the smokers qualified for a very low level of dependence, 17% low, 9% moderate, and 9% a high level of dependence. Response patterns detected by latent class analysis (LCA) indicated class differences based on severity gradations and on qualitative content. Three profiles of tobacco dependence were found: a non-dependent class (50%), a class manifesting a moderate number of dependence features (31%), and a more severely affected class (19%). The vast majority of smokers (three-fourths) were classified congruently by these two methods. Discrepancies involved LCA classifying smokers into a higher level of dependence when compared to the conventional scoring classification. Patterns of dependence features obtained from population samples that include a wide range of smokers may provide insight into possible phenotypic differences among tobacco smokers, particularly when LCA methods are used to complement standard scoring methods. PMID:15908142

  15. Use of e-rater[R] in Scoring of the TOEFL iBT[R] Writing Test. Research Report. ETS RR-11-25

    ERIC Educational Resources Information Center

    Haberman, Shelby J.

    2011-01-01

    Alternative approaches are discussed for use of e-rater[R] to score the TOEFL iBT[R] Writing test. These approaches involve alternate criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…

  16. Lymph Node Status After Resection for Gallbladder Adenocarcinoma: Prognostic Implications of Different Nodal Staging/Scoring Systems

    PubMed Central

    Amini, Neda; Spolverato, Gaya; Kim, Yuhree; Gupta, Rohan; Margonis, Georgios Antonios; Ejaz, Aslam; Pawlik, Timothy M.

    2015-01-01

    Background and Objectives Several lymph node (LN) staging/scoring systems have been proposed to stratify the prognosis of patients with gallbladder adenocarcinoma (GBA). We sought to define the prognostic performance of the most commonly utilized LN staging/scoring systems including AJCC/UICC N stage, lymph node ratio (LNR), log odds (LODDS), and N score, among patients with GBA. Method Between 2004 and 2010, 1,124 patients with GBA were identified from the Surveillance Epidemiology and End Results (SEER) database. The discriminative ability of each LN staging/scoring system was assessed using the Akaike’s Information Criterion (AIC) and the Harrell’s concordance index. Results When assessed using categorical values, LNR had a modest, improved ability to discriminate patients with regard to prognosis (C-index: 0.615; AIC: 2118.2) compared with AJCC/UICC N stage or N score and a prognostic discrimination comparable to LODDS. Among patients who had a total number of LN examined (TNLE) of 1 or 2, all the staging/scoring systems performed comparably. In contrast, among patients who had ≥4 TNLE, LODDS performed the best (C-index: 0.613; AIC: 303.2). Conclusion The performance of the different LN staging/scoring systems varied based on the TNLE. In particular, for patients who had ≥4 TNLE, LODDS out-performed the other staging/scoring systems. PMID:25312786
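
    To make the two continuous scores named above concrete, here is a minimal Python sketch of LNR and LODDS as they are commonly defined in the surgical literature. The abstract does not state the exact formulas the authors used, so the 0.5 continuity correction and the function names are assumptions for illustration.

```python
import math


def lymph_node_ratio(positive: int, examined: int) -> float:
    """LNR: share of examined lymph nodes that are positive."""
    if examined <= 0 or not 0 <= positive <= examined:
        raise ValueError("need 0 <= positive <= examined and examined > 0")
    return positive / examined


def lodds(positive: int, examined: int, correction: float = 0.5) -> float:
    """LODDS: log of (positive + c) / (negative + c).

    The correction term c (assumed 0.5 here) avoids division by zero
    and log(0) when all nodes are positive or all are negative.
    """
    if examined <= 0 or not 0 <= positive <= examined:
        raise ValueError("need 0 <= positive <= examined and examined > 0")
    negative = examined - positive
    return math.log((positive + correction) / (negative + correction))


# Two node-negative patients with different numbers of nodes examined:
# LNR cannot tell them apart, LODDS can.
lnr_few, lnr_many = lymph_node_ratio(0, 1), lymph_node_ratio(0, 4)
lodds_few, lodds_many = lodds(0, 1), lodds(0, 4)
```

    In this sketch both LNR values are 0.0, while LODDS is lower for the patient with four negative nodes than for the patient with one. That sensitivity to the number of nodes examined is one plausible reading of why the abstract reports LODDS doing best when ≥4 nodes were examined.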

  17. Alternative Methods to Curriculum-Based Measurement for Written Expression: Implications for Reliability and Validity of the Scores

    ERIC Educational Resources Information Center

    Merrigan, Teresa E.

    2012-01-01

    The purpose of the current study was to evaluate the psychometric properties of alternative approaches to administering and scoring curriculum-based measurement for written expression. Specifically, three response durations (3, 5, and 7 minutes) and six score types (total words written, words spelled correctly, percent of words spelled correctly,…

  18. Normative Scores for Standard Neuropsychological Tests in the Oldest Old From the French Population-Based PAQUID Study.

    PubMed

    Giulioli, Caroline; Meillon, Céline; Gonzalez-Colaço Harmand, Magali; Dartigues, Jean-François; Amieva, Hélène

    2016-02-01

    There is an obvious lack of validated norms for elderly persons aged 85 and older for the large majority of the neuropsychological tests used in clinical practice. Yet this group of "oldest-old" individuals is increasing drastically worldwide and is the most likely to develop dementia. Providing clinicians with validated and updated norms to accurately evaluate cognitive functioning in this population is an important issue in geriatrics. This study provides normative scores for 7 neuropsychological tests commonly used in clinical practice. Data were collected in a sample of 283 subjects aged 85 and older, included in the PAQUID study, a population-based cohort conducted in France. Normative scores were calculated according to 2 age ranges and 2 educational levels, and are presented in percentiles. The norms provided in the present study involve 7 tests that are widely used in the neuropsychological assessment of geriatric populations and should be of help for clinicians. PMID:26353935

  19. A quantitative examination of school configurations in Tennessee using sixth grade math, reading, science, and social studies standardized test scores

    NASA Astrophysics Data System (ADS)

    Ramsey, Whitney J.

    The purpose of this study was to determine if there were differences in standardized test scores, expressed as percentage passing, in math, reading-language arts, science, and social studies by comparing 6th grade students in K--8 schools with those in 6--8 schools. The data were gathered from an analysis of 6th grade students' scores on the 2006--2007 TCAP standardized assessment test in the state of Tennessee. The relationship between grade configuration (6--8 or K--8) and percent of 6th grade students scoring at the below proficient, proficient, or advanced level in each subject area was examined. The analysis was based on 5 research questions. A t-test for independent samples was used to identify the relationships between the independent variables, configuration of the school (K--8 or 6--8), and the dependent variables, the percent of students scoring below proficient, proficient, or advanced. A chi square analysis was used to identify the relationship between the proportion of K--8 schools meeting AYP versus the proportion of 6--8 schools meeting AYP. The study showed no relationship between grade configuration (6--8 or K--8) and percent of 6th grade students scoring at the below proficient level in math, reading-language arts, and social studies. Similarly, there was not a significant difference between grade configuration (6--8 or K--8) and percent of 6th grade students scoring at the proficient level in math and reading-language arts and the advanced level in math, reading-language arts, and science. However, there was a significant relationship between grade configuration (6--8 or K--8) and percent of 6th grade students scoring at the below proficient level and the proficient level in science and the percent of 6th grade students scoring at the proficient level and advanced level in social studies. In science, a lower percentage of 6th grade students in K--8 schools scored below proficient than did 6th grade students in 6--8 schools. 
In science, a higher percentage of 6th grade students in K--8 schools scored proficient than did 6th grade students in 6--8 schools. In social studies, a higher percentage of 6th grade students in K--8 schools scored proficient than did 6th grade students in 6--8 schools. However, a higher percentage of 6th grade students in 6--8 schools scored advanced than did 6th grade students in K--8 schools. The study showed a significant difference in the proportion of K--8 schools meeting AYP versus the proportion of 6--8 schools meeting AYP.

  20. Cutoff scores in neurocognitive testing and symptom clusters that predict protracted recovery from concussions in high school athletes.

    TOXLINE Toxicology Bibliographic Information

    Lau BC; Collins MW; Lovell MR

    2012-02-01

    BACKGROUND: Many studies address diagnosing concussions, but few look at predicting prognosis. A previous discriminant function analysis showed that symptom clusters derived from the Post-Concussion Symptom Scale and Immediate Postconcussion Assessment and Cognitive Testing composite scores used together improved predictions of protracted recovery after a sports-related concussion. OBJECTIVE: To determine cutoff scores in neurocognitive and Post-Concussion Symptom Scale symptom cluster scores when classifying protracted recovery in concussed athletes. METHODS: 108 male high school football athletes completed a computer-based neurocognitive test battery (Immediate Postconcussion Assessment and Cognitive Testing) within a median of 2 days after injury. Patients completed graded exertional protocols requiring athletes to be symptom free at rest and during increasing levels of activity and had recovery of neurocognitive scores before return to play. After return to play, athletes were classified as protracted recovery (>14 days, n = 58) or short-recovery (≤14 days, n = 50). Receiver-operating characteristic curves analyzed each of the neurocognitive (verbal, visual, processing speed, and reaction time) and symptom cluster (migraine, cognitive, sleep, and neuropsychiatric) scores. RESULTS: Cutoffs for migraine cluster, cognitive cluster, visual memory, and processing speed were statistically significant. Cutoffs at 75%, 80%, and 85% sensitivity to predict protracted recovery for the migraine symptom cluster were 15 or greater, 18, 20; cognitive symptom cluster 18 or greater, 19, 22; visual memory 48 or less, 46, 44.5; and processing speed 24.5 or less, 23.46, 22.5, respectively. Eighty-percent sensitivity indicates that the corresponding cutoff correctly identifies 80% of concussed athletes requiring protracted recovery. CONCLUSION: Specific cutoffs may help to set numerical thresholds for clinicians to predict which concussed athletes will have a protracted recovery.
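
    The idea of reading a cutoff off a sensitivity target can be sketched in a few lines of Python. This is only an illustration of the general technique, not the authors' procedure: the function names are hypothetical, the scores are made up for the example (they are not study data), and the sketch assumes a "higher score is worse" measure such as a symptom cluster.

```python
def sensitivity(positive_scores, cutoff):
    """Fraction of true protracted-recovery cases with score >= cutoff."""
    flagged = sum(1 for score in positive_scores if score >= cutoff)
    return flagged / len(positive_scores)


def cutoff_for_sensitivity(positive_scores, target):
    """Largest observed score usable as a cutoff with sensitivity >= target."""
    for candidate in sorted(set(positive_scores), reverse=True):
        if sensitivity(positive_scores, candidate) >= target:
            return candidate
    raise ValueError("no cutoff reaches the requested sensitivity")


# Made-up symptom-cluster scores for ten hypothetical protracted-recovery cases.
made_up_scores = [10, 12, 15, 16, 18, 19, 20, 22, 25, 30]

cutoff_80 = cutoff_for_sensitivity(made_up_scores, 0.80)
cutoff_85 = cutoff_for_sensitivity(made_up_scores, 0.85)
```

    With these made-up scores, a cutoff of 15 flags 8 of 10 cases (80% sensitivity), and reaching 85% requires dropping the cutoff to 12. For measures where lower scores indicate impairment (the "or less" cutoffs for visual memory and processing speed), the comparison would be reversed.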

  1. What Is the Apgar Score?

    MedlinePLUS

    ... About the Apgar Score. The Apgar score, the very first test given ...

  2. Use of cardiac CT and calcium scoring for detecting coronary plaque: implications on prognosis and patient management.

    PubMed

    Divakaran, S; Cheezum, M K; Hulten, E A; Bittencourt, M S; Silverman, M G; Nasir, K; Blankstein, R

    2015-02-01

    Clinicians often use risk factor-based calculators to estimate an individual's risk of developing cardiovascular disease. Non-invasive cardiovascular imaging, particularly coronary artery calcium (CAC) scoring and coronary CT angiography (CTA), allows for direct visualization of coronary atherosclerosis. Among patients without prior coronary artery disease, studies examining CAC and coronary CTA have consistently shown that the presence, extent and severity of coronary atherosclerosis provide additional prognostic information for patients beyond risk factor-based scores alone. This review will highlight the basics of CAC scoring and coronary CTA and discuss their role in impacting patient prognosis and management. PMID:25494818

  3. Comparability of Examinee Proficiency Scores on Computer Adaptive Tests Using Real and Simulated Data

    ERIC Educational Resources Information Center

    Evans, Josiah Jeremiah

    2010-01-01

    In measurement research, data simulations are a commonly used analytical technique. While simulation designs have many benefits, it is unclear if these artificially generated datasets are able to accurately capture real examinee item response behaviors. This potential lack of comparability may have important implications for administration of…

  5. Computer based testing: implications for testing handicapped/disabled examinees.

    PubMed

    Yocom, C J

    1991-01-01

    The purpose of this study was to obtain information about computer use by nurses with a learning disability or other handicap, to identify the difficulties encountered by these nurses, and to distinguish the adaptive behaviors used to compensate for these difficulties. In addition, information was desired regarding learning disabled/handicapped nurses' perceptions about problems related to computer-administered examinations. Sixty-three percent of 86 respondents indicated using computers at work and/or at home. Difficulty in reading screen text was reported by 21% of the computer users; 34% reported problems with keyboard use. Adaptive mechanisms included using slotted paper, decreasing text scrolling rates, working slowly, and re-reading text for accuracy. When compared to administration variances granted for paper-and-pencil examinations, relatively few additional modifications will be needed for administering computer-based tests. PMID:1832581

  6. The ability of reaction time tests to detect simulation: an investigation of contextual effects and criterion scores.

    PubMed

    Reicker, Lindsay I

    2008-07-01

    Two experiments examined whether experience gained with a series of reaction time tests [Computerized Tests of Information Processing (CTIP); Tombaugh, T. N. & Rees, L. (in press). Computerized Tests of Information Processing (CTIP). Toronto, Canada: Multi-Health Systems Inc.] influenced the performance of individuals instructed to simulate the cognitive effects of a traumatic brain injury. Experience with the tests was manipulated by varying the order and number of tests administered for simulator and control groups. Simulators responded significantly slower and exhibited increased variability compared to controls. Performance was not affected by order or number of tests. The results of a third experiment showed that criterion scores could be established that correctly classified members of control, simulator, mild TBI, and severe TBI groups. Overall, the results suggest that the performance of the simulators was based on a context-free, absolute judgment and that reaction time measures show considerable promise for detecting low effort. PMID:18420373

  7. Treatment for Schistosoma japonicum, Reduction of Intestinal Parasite Load, and Cognitive Test Score Improvements in School-Aged Children

    PubMed Central

    Ezeamama, Amara E.; McGarvey, Stephen T.; Hogan, Joseph; Lapane, Kate L.; Bellinger, David C.; Acosta, Luz P.; Leenstra, Tjalling; Olveda, Remigio M.; Kurtis, Jonathan D.; Friedman, Jennifer F.

    2012-01-01

    Background To determine whether treatment of intestinal parasitic infections improves cognitive function in school-aged children, we examined changes in cognitive test scores over 18 months in relation to: (i) treatment-related Schistosoma japonicum intensity decline, (ii) spontaneous reduction of single soil-transmitted helminth (STH) species, and (iii) ≥2 STH infections among 253 S. japonicum-infected children. Methodology Helminth infections were assessed at baseline and quarterly by the Kato-Katz method. S. japonicum infection was treated at baseline using praziquantel. An intensity-based indicator of lower vs. no change/higher infection was defined separately for each helminth species and joint intensity declines of ≥2 STH species. In addition, S. japonicum infection-free duration was defined in four categories based on time of schistosome re-infection: >18 (i.e. cured), >12 to ≤18, 6 to ≤12 and ≤6 (persistently infected) months. There was no baseline treatment for STHs but their intensity varied possibly due to spontaneous infection clearance/acquisition. Four cognitive tests were administered at baseline, 6, 12, and 18 months following S. japonicum treatment: learning and memory domains of Wide Range Assessment of Memory and Learning (WRAML), verbal fluency (VF), and Philippine nonverbal intelligence test (PNIT). Linear regression models were used to relate changes in respective infections to test performance with adjustment for sociodemographic confounders and coincident helminth infections. Principal Findings Children cured (β = 5.8; P = 0.02) and those schistosome-free for >12 months (β = 1.5; P = 0.03) scored higher in WRAML memory and VF tests compared to persistently infected children independent of STH infections. A decline vs. no change/increase of any individual STH species (β: 11.5–14.5; all P<0.01) and the joint decline of ≥2 STH (β = 13.1; P = 0.01) species were associated with higher scores in WRAML learning test independent of schistosome infection. Hookworm and Trichuris trichiura declines were independently associated with improvements in WRAML memory scores as was the joint decline in ≥2 STH species. Baseline coinfection by ≥2 STH species was associated with low PNIT scores (β = −1.9; P = 0.04). Conclusion/Significance Children cured/S. japonicum-free for >12 months post-treatment and those who experienced declines of ≥2 STH species scored higher in three of four cognitive tests. Our results suggest that sustained deworming and simultaneous control for schistosome and STH infections could improve children's ability to take advantage of educational opportunities in helminth-endemic regions. PMID:22563514

  8. Does It Matter if You "Kill" the Patient or Order Too Many Tests? Scoring Alternatives for a Test of Clinical Reasoning Skill

    ERIC Educational Resources Information Center

    Childs, Ruth A.; Dunn, Jennifer L.; van Barneveld, Christina; Jaciw, Andrew P.

    2007-01-01

    This study compares five scoring approaches for a test of clinical reasoning skills. All of the approaches incorporate information about the correct item responses selected and the errors, such as selecting too many responses or selecting a response that is inappropriate and/or harmful to the patient. The approaches are combinations of theoretical…

  9. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Pennsylvania

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Pennsylvania's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 1390 for non-Title I students and 1220 for Title I students. In 2009, the mean scale score in 4th grade reading was 1420 for non-Title I students and 1270 for Title I students. Between 2006 and 2009, the mean…
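
    The figures quoted in this abstract are enough to work out the Title I gap directly. The sketch below is only an illustration of that arithmetic, not part of the report; the variable and function names are hypothetical.

```python
# Mean scale scores quoted in the abstract above
# (Pennsylvania, state 4th grade reading test).
scores = {
    2006: {"non_title_i": 1390, "title_i": 1220},
    2009: {"non_title_i": 1420, "title_i": 1270},
}


def gap(year_scores):
    """Return the non-Title I minus Title I difference in mean scale score."""
    return year_scores["non_title_i"] - year_scores["title_i"]


gap_2006 = gap(scores[2006])
gap_2009 = gap(scores[2009])

# A negative change means the gap narrowed over the period.
gap_change = gap_2009 - gap_2006

# Gains for each group over the same period.
title_i_gain = scores[2009]["title_i"] - scores[2006]["title_i"]
non_title_i_gain = scores[2009]["non_title_i"] - scores[2006]["non_title_i"]

print(gap_2006, gap_2009, gap_change, title_i_gain, non_title_i_gain)
```

    On these numbers the gap is 170 points in 2006 and 150 points in 2009, because the Title I mean rose by 50 points and the non-Title I mean by 30.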

  10. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Maryland

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Maryland's test score trends through 2008-09. In 2004, 82% of non-Title I 4th graders and 61% of Title I 4th graders scored at the proficient level on the state reading test. In 2009, 90% of non-Title I 4th graders and 78% of Title I 4th graders scored at the proficient level in reading. Between 2004 and 2009, the percentage…

  11. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? North Carolina

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles North Carolina's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade math test was 351 for non-Title I students and 347 for Title I students. In 2009, the mean scale score in 4th grade math was 354 for non-Title I students and 350 for Title I students. Between 2006 and 2009, the mean scale…

  12. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Missouri

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Missouri's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 661 for non-Title I students and 642 for Title I students. In 2009, the mean scale score in 4th grade reading was 661 for non-Title I students and 648 for Title I students. Between 2006 and 2009, there was no…

  13. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Colorado

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Colorado's test score trends through 2008-09. In 2003, the mean scale score on the state 4th grade reading test was 598 for non-Title I students and 558 for Title I students. In 2009, the mean scale score in 4th grade reading was 599 for non-Title I students and 556 for Title I students. Between 2003 and 2009, the mean scale…

  14. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Massachusetts

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Massachusetts's test score trends through 2008-09. In 2006, 59% of non-Title I 4th graders and 29% of Title I 4th graders scored at the proficient level on the state reading test. In 2009, 64% of non-Title I 4th graders and 31% of Title I 4th graders scored at the proficient level in reading. Between 2006 and 2009, the…

  15. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? New Hampshire

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles New Hampshire's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 445 for non-Title I students and 438 for Title I students. In 2009, the mean scale score in 4th grade reading was 448 for non-Title I students and 441 for Title I students. Between 2006 and 2009, the mean…

  16. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Rhode Island

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Rhode Island's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 445 for non-Title I students and 435 for Title I students. In 2009, the mean scale score in 4th grade reading was 448 for non-Title I students and 440 for Title I students. Between 2006 and 2009, the mean…

  17. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Tennessee

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Tennessee's test score trends through 2008-09. In 2004, the mean scale score on the state 4th grade reading test was 501 for non-Title I students and 486 for Title I students. In 2009, the mean scale score in 4th grade reading was 512 for non-Title I students and 495 for Title I students. Between 2004 and 2009, the mean scale…

  18. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Delaware

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Delaware's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 474 for non-Title I students and 464 for Title I students. In 2009, the mean scale score in 4th grade reading was 478 for non-Title I students and 467 for Title I students. Between 2006 and 2009, the mean scale…

  19. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Kentucky

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Kentucky's test score trends through 2008-09. In 2007, the mean scale score on the state 4th grade reading test was 455 for non-Title I students and 451 for Title I students. In 2009, the mean scale score in 4th grade reading was 455 for non-Title I students and 451 for Title I students. Between 2007 and 2009, the mean scale…

  20. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Arizona

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Arizona's test score trends through 2008-09. In 2005, the mean scale score on the state 4th grade reading test was 478 for non-Title I students and 445 for Title I students. In 2008, the mean scale score in 4th grade reading was 477 for non-Title I students and 450 for Title I students. Between 2005 and 2008, the mean scale…

  1. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Washington

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Washington's test score trends through 2008-09. Three years of comparable mean scale score data were not available from the state. In 2004, 77% of non-Title I 4th graders and 60% of Title I 4th graders scored at the proficient level on the state reading test. In 2009, 75% of non-Title I 4th graders and 61% of Title I 4th…

  2. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? California

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles California's test score trends through 2008-09. In 2004, the mean scale score on the state 4th grade reading test was 341 for non-Title I students and 315 for Title I students. In 2008, the mean scale score in 4th grade reading was 379 for non-Title I students and 340 for Title I students. Between 2004 and 2008, the mean scale…

  3. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Texas

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Texas's test score trends through 2008-09. In 2005, the mean scale score on the state 4th grade reading test was 2297 for non-Title I students and 2207 for Title I students. In 2009, the mean scale score in 4th grade reading was 2334 for non-Title I students and 2235 for Title I students. Between 2005 and 2009, the mean scale…

  4. Recommendations, evaluation and validation of a semi-automated, fluorescent-based scoring protocol for micronucleus testing in human cells.

    PubMed

    Seager, Anna L; Shah, Ume-Kulsoom; Brüsehafer, Katja; Wills, John; Manshian, Bella; Chapman, Katherine E; Thomas, Adam D; Scott, Andrew D; Doherty, Ann T; Doak, Shareen H; Johnson, George E; Jenkins, Gareth J S

    2014-05-01

    Micronucleus (MN) induction is an established cytogenetic end point for evaluating structural and numerical chromosomal alterations in genotoxicity testing. A semi-automated scoring protocol for the assessment of MN preparations from human cell lines and a 3D skin cell model has been developed and validated. Following exposure to a range of test agents, slides were stained with 4',6-diamidino-2-phenylindole (DAPI) and scanned by use of the MicroNuc module of metafer 4, after the development of a modified classifier for selecting MN in binucleate cells. A common difficulty observed with automated systems is an artefactual output of high false positives; in the case of the metafer system this is mainly due to the loss of cytoplasmic boundaries during slide preparation. Slide quality is paramount to obtain accurate results. We show here that to avoid elevated artefactual-positive MN outputs, diffuse cell density and low-intensity nuclear staining are critical. Visual (Giemsa stained) and automated (DAPI stained) MN frequencies and dose-response curves were highly correlated (R² = 0.70 for hydrogen peroxide, R² = 0.98 for menadione, R² = 0.99 for mitomycin C, R² = 0.89 for potassium bromate and R² = 0.68 for quantum dots), indicating the system is adequate to produce biologically relevant and reliable results. Metafer offers many advantages over conventional scoring including increased output and statistical power, and reduced scoring subjectivity, labour and costs. Further, the metafer system is easily adaptable for use with a range of different cells, both suspension and adherent human cell lines. Awareness of the points raised here reduces the automatic positive errors flagged and drastically reduces slide scoring time, making metafer an ideal candidate for genotoxic biomonitoring and population studies and regulatory genotoxic testing. PMID:24705543

  5. The Impact of Scholastic Instrumental Music and Scholastic Chess Study on the Standardized Test Scores of Students in Grades Three, Four, and Five

    ERIC Educational Resources Information Center

    Martinez, Edwin E.

    2012-01-01

    This study examines the impact of instrumental music study and group chess lessons on the standardized test scores of suburban elementary public school students (grades three through five) in Levittown, New York. The study divides the students into the following groups and compares the standardized test scores of each: a) instrumental music…

  7. Score Equity Assessment: Development of a Prototype Analysis Using SAT[R] Mathematics Test Data Across Several Administrations. Research Report. ETS RR-09-08

    ERIC Educational Resources Information Center

    Dorans, Neil J.; Liu, Jinghua

    2009-01-01

    The equating process links scores from different editions of the same test. For testing programs that build nearly parallel forms to the same explicit content and statistical specifications and administer forms under the same conditions, the linkings between the forms are expected to be equatings. Score equity assessment (SEA) provides a useful…

  8. A Primer-Test Centered Equating Method for Setting Cut-Off Scores

    ERIC Educational Resources Information Center

    Zhu, Weimo; Plowman, Sharon Ann; Park, Youngsik

    2010-01-01

    This study evaluated the use of a new primary field test method based on test equating to address inconsistent classification among field tests. We analyzed students' information on the Progressive Aerobic Cardiovascular Endurance Run (PACER), mile run (MR), and VO₂max from three data sets (college: n = 94; middle school: n = 39;…

  9. Pass-Fail Reliability for Tests with Cut Scores: A Simplified Method.

    ERIC Educational Resources Information Center

    Breyer, F. Jay; Lewis, Charles

    A single-administration classification reliability index is described that estimates the probability of consistently classifying examinees to mastery or nonmastery states as if those examinees had been tested with two alternate forms. The procedure is applicable to any test used for classification purposes, subdividing that test into two…

  10. The Effects of General Practice, Specific Practice, and Item Familiarization on Change in Aptitude Test Scores

    ERIC Educational Resources Information Center

    Nevo, Barukh

    1976-01-01

    Freshmen (N=202) took two batteries of aptitude tests 10 months apart. Six pairs of tests were studied. Two pairs were identical, two were parallel, and two were completely different. This design made it possible to separate three components of practice: (a) general test sophistication, (b) specific practice effect, and (c) item familiarization.…

  11. Comparison of Physical Therapy Anatomy Performance and Anxiety Scores in Timed and Untimed Practical Tests

    ERIC Educational Resources Information Center

    Schwartz, Sarah M.; Evans, Cathy; Agur, Anne M.R.

    2015-01-01

    Students in health care professional programs face many stressful tests that determine successful completion of their program. Test anxiety during these high stakes examinations can affect working memory and lead to poor outcomes. Methods of decreasing test anxiety include lengthening the time available to complete examinations or evaluating…

  12. The MDT Innovation: Machine-Scoring of Fill-in-the-Blank Tests.

    ERIC Educational Resources Information Center

    Anderson, Paul S.

    The Multi-Digit Technologies (MDT) testing technique is discussed as the first major advance in computer assisted testing in several decades. The MDT testing method uses fill-in-the-blank or completion-type questions, with an alphabetized long list of possible responses. An MDT answer sheet is used to record the code number of the answer. For…

  13. The Search for the Holy Grail: Content-Referenced Score Interpretations from Large-Scale Tests

    ERIC Educational Resources Information Center

    Marion, Scott F.

    2015-01-01

    The measurement industry is in crisis. The public outcry against "over testing" and the opt-out movement are symptoms of a larger sociopolitical battle being fought over Common Core, teacher evaluation, federal intrusion, and a host of other issues, but much of the vitriol is directed at the tests and the testing industry. If we, as…

  17. Transcultural adaptation and testing psychometric properties of the Korean version of the Foot and Ankle Outcome Score (FAOS).

    PubMed

    Lee, Kyoung Min; Chung, Chin Youb; Kwon, Soon Sun; Sung, Ki Hyuk; Lee, Seung Yeol; Won, Sung Hun; Lee, Damian J; Lee, Seoryong C; Park, Moon Seok

    2013-10-01

    This study was performed to translate and transculturally adapt the English version of the Foot and Ankle Outcome Score (FAOS) into a Korean version, and to test psychometric properties of the Korean FAOS in terms of internal consistency, test-retest reliability, convergent validity, and dimensionality. Translation and transcultural adaptation of FAOS into a Korean version was performed according to internationally recommended guidelines. Internal consistency (N = 294) and test-retest reliability (N = 21) were evaluated. Convergent validity was analyzed using correlation with pain visual analogue scale (VAS) score. All subscales, except for the quality of life (Q) subscale (Cronbach's alpha, 0.615), showed satisfactory internal consistency (Cronbach's alpha > 0.7). Cronbach's alpha of function in daily living (ADL) was highest (0.962), which might represent the redundancy of the items. All five subscales showed satisfactory reliability with ADL subscale showing the highest ICC (intraclass correlation coefficient; 0.851) and Q subscale the lowest ICC (0.718). Pain VAS score showed the highest correlation with pain (P) subscale of FAOS (r = 0.675, p < 0.001) and the lowest correlation with Q subscale (r = 0.495, p < 0.001). In the dimensionality test, a factor analysis was performed using the total items to rank their relative significance, which yielded a seven-component solution. A considerable portion of the items showed a similar dimension according to their original subscales, except for ADL items. Translation and transcultural adaptation of FAOS into the Korean language was performed successfully. The items were understandable, and the subscales showed satisfactory test-retest reliability. Some minor revision might be needed to enhance the internal consistency of Q subscale and reduce the redundancy of ADL subscale. PMID:23703359
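
    For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). It is not the authors' analysis code; the function name and the response data are hypothetical and chosen only to make the example runnable.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses the sample variance for both the items and the total score.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*rows))                      # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])  # variance of summed scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses: 5 respondents x 3 items (not data from the study).
responses = [
    [4, 4, 3],
    [2, 2, 2],
    [3, 3, 4],
    [1, 2, 1],
    [4, 3, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

    Values above the conventional 0.7 threshold are usually read as satisfactory, while values very close to 1 can indicate redundant items, which is the interpretation the abstract gives for the ADL subscale.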

  18. Patterns of SAT Scores, Choice of STEM Major, and Gender

    ERIC Educational Resources Information Center

    Davison, Mark L.; Jew, Gilbert B.; Davenport, Ernest C., Jr.

    2014-01-01

    Using Baccalaureate and Beyond 2001 data, we found that STEM major was associated with an SAT pattern less common among females than males, in which the student's quantitative score exceeded the verbal score. Verbal ability was negatively associated with STEM major. Implications for career theory and test interpretation are discussed.

  19. An Assessment of the Predictive Validity of Impact Factor Scores: Implications for Academic Employment Decisions in Social Work

    ERIC Educational Resources Information Center

    Holden, Gary; Rosenberg, Gary; Barker, Kathleen; Onghena, Patrick

    2006-01-01

    Objective: Bibliometrics is a method of examining scholarly communications. Concerns regarding the use of bibliometrics in general, and the impact factor score (IFS) in particular, have been discussed across disciplines including social work. Although there are frequent mentions in the literature of the IFS as an indicator of the impact or quality…

  1. The Relationship between English Language Learners' Language Proficiency and Standardized Test Scores

    ERIC Educational Resources Information Center

    Thakkar, Darshan

    2013-01-01

    It is generally theorized that English Language Learner (ELL) students do not succeed on state standardized tests because ELL students lack the cognitive academic language skills necessary to function on the large scale content assessments. The purpose of this dissertation was to test that theory. Through the use of quantitative methodology, ELL…

  2. Relationship of WPPSI and Subsequent Metropolitan Achievement Test Scores in Head-Start Children

    ERIC Educational Resources Information Center

    Crockett, Bruce K.; And Others

    1976-01-01

    The Metropolitan Achievement Test (MAT) was administered to 35 original Head-Start children three to four years after initial WPPSI testing. WPPSI Verbal IQ did not correlate significantly with any of the subject areas of the MAT, while Performance IQ correlated only moderately with mathematical components of the MAT (r = .42 - .52). (Author)

  3. Identifying Language Impairment in Children: Combining Language Test Scores with Parental Report

    ERIC Educational Resources Information Center

    Bishop, Dorothy V. M.; McDonald, David

    2009-01-01

    Background: Children who meet language test criteria for specific language impairment (SLI) are not necessarily the same as those who are referred to a speech and language therapist. Aims: To consider how far this discrepancy reflects insensitivity of traditional language tests to clinically important features of language impairment. Methods &…

  4. Examining the Relationship between Students' Mathematics Test Scores and Computer Use at Home and at School

    ERIC Educational Resources Information Center

    O'Dwyer, Laura M.; Russell, Michael; Bebell, Damian; Seeley, Kevon

    2008-01-01

    Over the past decade, standardized test results have become the primary tool used to judge the effectiveness of schools and educational programs, and today, standardized testing serves as the keystone for educational policy at the state and federal levels. This paper examines the relationship between fourth grade mathematics achievement and…

  5. An Investigation of Difference Scores for a Grade-Level Testing Program.

    ERIC Educational Resources Information Center

    Yin, Ping; Brennan, Robert L.

    2002-01-01

    Studied longitudinal changes in performance at both the student and school district level in major content areas of a widely used norm-referenced grade-level testing program. Used data from grades 3 to 4 and from 7 to 8 of the Iowa Tests of Basic Skills (in Iowa). Reports descriptive statistics and empirical norms and reliability estimates for…

  6. The Impact of Intensive Reading Interventions on Student Standardized Test Scores

    ERIC Educational Resources Information Center

    Munoz, Carolyn Sue

    2010-01-01

    The purpose of this study was to identify the impact intensive reading instruction had for 28 students with learning disabilities at the middle school level on standardized tests. National Assessment of Education Progress testing indicates that across the United States, learning disabled students' literacy skills are decreasing annually, and these…

  7. Correlation of SPINE Test Scores to Judges' Ratings of Speech Intelligibility in Hearing-Impaired Children.

    ERIC Educational Resources Information Center

    Kelly, Colleen; And Others

    1986-01-01

    The SPINE test (SPeech INtelligibility Evaluation), designed to measure speech intelligibility of severely to profoundly hearing-impaired children, was administered to 30 hearing-impaired children (12-16 years old) to examine its validity. Results suggested that the SPINE test is a valid measure of speech intelligibility with hearing-impaired…

  8. Major Field Achievement Test in Business: Guidelines for Improved Outcome Scores--Part I

    ERIC Educational Resources Information Center

    McLaughlin, J. Patrick; White, Jason T.

    2007-01-01

    Outcomes measurements have always been an important part of proving to outside constituencies how you "measure up" to other schools with your business programs. A common nationally-normed exam that is used is the Major Field Achievement Test in Business from Educational Testing Service. Our paper discusses some guidelines that we are "pilot…

  9. EAP Study Recommendations and Score Gains on the IELTS Academic Writing Test

    ERIC Educational Resources Information Center

    Green, Anthony

    2005-01-01

    The IELTS test is widely accepted by university admissions offices as evidence of English language ability. The test is also used to guide decisions about the amount of language study required for students to satisfy admissions requirements. Guidelines currently published by the British Association of Lecturers in English for Academic Purposes…

  10. Keeping Scores: Audited Self-Monitoring of High-Stakes Testing Environments

    ERIC Educational Resources Information Center

    Padilla, Raymond; Richards, Michael

    2006-01-01

    To address a public relations problem faced by a large urban public school district in Texas, we conducted action research that resulted in an audited self-monitoring system for high-stakes testing environments. The system monitors violations of testing protocols while identifying and disseminating best practices to improve the education of…

  12. The Effects of a Translation Bias on the Scores for the "Basic Economics Test"

    ERIC Educational Resources Information Center

    Hahn, Jinsoo; Jang, Kyungho

    2012-01-01

    International comparisons of economic understanding generally require a translation of a standardized test written in English into another language. Test results can differ based on how researchers translate the English written exam into one in their own language. To confirm this hypothesis, two differently translated versions of the "Basic…

  13. Comparability of Scores on Word-Processed and Handwritten Essays on the Graduate Management Admissions Test.

    ERIC Educational Resources Information Center

    Bridgeman, Brent; Cooper, Peter

    Essays for the Graduate Management Admissions Test must be written with a word processor (except in some foreign countries). The test sponsors, the Graduate Management Admissions Council, believed that this is fair because some word processing skill is a prerequisite for advanced management education. Furthermore, it might also be unfair to…

  14. An Ecological Analysis of Test Score Changes Over Time. No. 64.

    ERIC Educational Resources Information Center

    Harnqvist, Kjell; Stahle, Gun

    A written intelligence test with verbal, reasoning, and spatial subscales was administered to two comparable national samples of 1,000 thirteen-year-olds in Sweden, tested in 1961 and in 1966. The increases were greater for girls than for boys, and the changes occurred simultaneously with several changes in social and educational conditions. To…

  15. Estimating Future Adverse Impact Using Selection Ratios and Group Differences in Test Score Means.

    ERIC Educational Resources Information Center

    Aamodt, Michael G.; And Others

    Estimating the validity of a test is only one concern for the human resources professional developing a personnel selection battery. An equally important concern is whether the test will result in adverse impact against a member of a protected class. It would be useful if the probability of adverse impact could be estimated prior to spending time…
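
    The abstract is truncated before the estimation procedure is described, so the following is only one plausible sketch of how a selection ratio and a standardized group mean difference can be combined, not the authors' method. It assumes both groups' scores are normally distributed with equal standard deviations and that one cut score is applied to everyone; the function name and parameters are hypothetical. The result is compared against the four-fifths (0.8) rule of thumb commonly used to flag adverse impact.

```python
from statistics import NormalDist


def adverse_impact_ratio(majority_selection_ratio, d):
    """Estimate minority/majority selection-rate ratio under a common cut score.

    majority_selection_ratio: proportion of the majority group selected (0-1).
    d: standardized mean difference (majority mean minus minority mean, in SDs).
    Assumes normal score distributions with equal SDs in both groups.
    """
    # Cut score in majority-group z units that selects the requested proportion.
    cutoff = NormalDist().inv_cdf(1 - majority_selection_ratio)
    # Proportion of the minority group (mean shifted down by d) above the cutoff.
    minority_selection_ratio = 1 - NormalDist(mu=-d).cdf(cutoff)
    return minority_selection_ratio / majority_selection_ratio


# Hypothetical inputs: half the majority group selected, 1 SD mean difference.
ratio = adverse_impact_ratio(0.5, 1.0)
flagged = ratio < 0.8  # four-fifths rule of thumb
print(round(ratio, 3), flagged)
```

    Under these assumptions the ratio falls as the mean difference grows or as the selection ratio becomes more stringent, which is why both quantities are needed to estimate the risk before a test is put into use.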

  16. The Impact of Linking Distinct Achievement Test Scores on the Interpretation of Student Growth in Achievement

    ERIC Educational Resources Information Center

    Airola, Denise Tobin

    2011-01-01

    Changes to state tests impact the ability of State Education Agencies (SEAs) to monitor change in performance over time. The purpose of this study was to evaluate the Standardized Performance Growth Index (PGIz), a proposed statistical model for measuring change in student and school performance, across transitions in tests. The PGIz is a…

  17. Apgar Scores

    MedlinePLUS

    ... As soon as your ... heart or respiratory system. What if Your Baby Scores Low? If your baby's Apgar scores are very ...

  18. Apgar score

    MedlinePLUS

    ... scores 1 for respiratory effort. If the infant cries well, the respiratory score is 2. Heart rate ... is grimacing and a cough, sneeze, or vigorous cry, the infant scores 2 for reflex irritability. Skin ...
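
    The scoring described in the excerpt can be sketched as a simple sum. The Apgar score has five components (appearance, pulse, grimace, activity, respiration), each scored 0, 1, or 2, giving a total from 0 to 10. The function name `apgar_total` and the dictionary layout are assumptions made for this illustration.

```python
APGAR_COMPONENTS = ("appearance", "pulse", "grimace", "activity", "respiration")


def apgar_total(scores: dict) -> int:
    """Sum the five Apgar component scores, each of which must be 0, 1, or 2."""
    missing = set(APGAR_COMPONENTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in APGAR_COMPONENTS:
        if scores[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1, or 2, got {scores[name]!r}")
    return sum(scores[name] for name in APGAR_COMPONENTS)


if __name__ == "__main__":
    # Hypothetical infant: cries well (respiration 2) and responds with a
    # vigorous cry (grimace 2); the other values are made up for the example.
    example = {"appearance": 1, "pulse": 2, "grimace": 2, "activity": 1, "respiration": 2}
    print(apgar_total(example))
```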

  19. Clinical score and rapid antigen detection test to guide antibiotic use for sore throats: randomised controlled trial of PRISM (primary care streptococcal management)

    PubMed Central

    2013-01-01

    Objective To determine the effect of clinical scores that predict streptococcal infection or rapid streptococcal antigen detection tests compared with delayed antibiotic prescribing. Design Open adaptive pragmatic parallel group randomised controlled trial. Setting Primary care in United Kingdom. Patients Patients aged ≥3 with acute sore throat. Intervention An internet programme randomised patients to targeted antibiotic use according to: delayed antibiotics (the comparator group for analyses), clinical score, or antigen test used according to clinical score. During the trial a preliminary streptococcal score (score 1, n=1129) was replaced by a more consistent score (score 2, n=631; features: fever during previous 24 hours; purulence; attends rapidly (within three days after onset of symptoms); inflamed tonsils; no cough/coryza (acronym FeverPAIN)). Outcomes Symptom severity reported by patients on a 7 point Likert scale (mean severity of sore throat/difficulty swallowing for days two to four after the consultation (primary outcome)), duration of symptoms, use of antibiotics. Results For score 1 there were no significant differences between groups. For score 2, symptom severity was documented in 80% (168/207 (81%) in delayed antibiotics group; 168/211 (80%) in clinical score group; 166/213 (78%) in antigen test group). Reported severity of symptoms was lower in the clinical score group (−0.33, 95% confidence interval −0.64 to −0.02; P=0.04), equivalent to one in three rating sore throat a slight versus moderate problem, with a similar reduction for the antigen test group (−0.30, −0.61 to −0.00; P=0.05). Symptoms rated moderately bad or worse resolved significantly faster in the clinical score group (hazard ratio 1.30, 95% confidence interval 1.03 to 1.63) but not the antigen test group (1.11, 0.88 to 1.40). In the delayed antibiotics group, 75/164 (46%) used antibiotics. Use of antibiotics in the clinical score group (60/161) was 29% lower (adjusted risk ratio 0.71, 95% confidence interval 0.50 to 0.95; P=0.02) and in the antigen test group (58/164) was 27% lower (0.73, 0.52 to 0.98; P=0.03). There were no significant differences in complications or reconsultations. Conclusion Targeted use of antibiotics for acute sore throat with a clinical score improves reported symptoms and reduces antibiotic use. Antigen tests used according to a clinical score provide similar benefits but with no clear advantages over a clinical score alone. Trial registration ISRCTN32027234 PMID:24114306
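
    As a rough check on the antibiotic-use figures, the crude (unadjusted) risk ratios can be computed directly from the counts quoted in the abstract. This is a minimal sketch: the helper names are assumptions, and because the abstract reports adjusted risk ratios (0.71 and 0.73), the crude values computed here are not expected to match them exactly.

```python
def risk(events: int, total: int) -> float:
    """Proportion of a group in which the event occurred."""
    return events / total


def crude_risk_ratio(events_a: int, total_a: int, events_b: int, total_b: int) -> float:
    """Unadjusted risk ratio of group A relative to comparator group B."""
    return risk(events_a, total_a) / risk(events_b, total_b)


if __name__ == "__main__":
    # Counts of patients who used antibiotics, as quoted in the abstract.
    delayed = (75, 164)         # comparator group
    clinical_score = (60, 161)
    antigen_test = (58, 164)

    print(f"delayed group risk: {risk(*delayed):.2f}")
    print(f"clinical score vs delayed: {crude_risk_ratio(*clinical_score, *delayed):.2f}")
    print(f"antigen test vs delayed:   {crude_risk_ratio(*antigen_test, *delayed):.2f}")
```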

  20. A standardized scoring method for the copy of cube test, developed to be suitable for use in psychiatric populations

    PubMed Central

    2011-01-01

    Background Although the 'copy of cube test', a version of which is included in the Short Test of Mental Status (STMS), has existed for years, little has been done to standardize it in detail. The aim of the current study was to develop a novel and detailed standardized method of administering and scoring this test. Methods The study sample included 93 healthy control subjects (53 women and 40 men) aged 35.87 ± 12.62 years and 127 patients suffering from schizophrenia (54 women and 73 men) aged 34.07 ± 9.83 years. The psychometric assessment included the Positive and Negative Syndrome Scale (PANSS), the Young Mania Rating Scale (YMRS), and the Montgomery-Åsberg Depression Rating Scale (MADRS). Results A scoring method was developed based on the frequencies of responses of healthy controls. Cronbach's α was equal to 0.75 and inter-rater reliability was 0.90. Three indices and five subscales of the Standardized Copy of the Cube Test (SCCT) were eventually developed. They included the Deficit Index (DcI), which includes the Missing Elements (ME) and Mirror Image (M) subscales; the Deformation Index (DfI), which includes the Deformation (D) and Rotation (R) subscales; and the Closing-In Index (CiI). Discussion The SCCT seems to be a reliable, valid, and change-sensitive instrument for testing psychiatric patients. The great advantage of this instrument is that it requires only paper and a pencil and is thus easily administered and brief. Further research is necessary to test its usefulness as a neuropsychological test. PMID:21745404
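
    For readers unfamiliar with the reliability coefficient reported above, Cronbach's alpha for k items is k/(k-1) × (1 − sum of item variances / variance of total scores). The sketch below applies that standard formula to made-up data; the function name and the example scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows: list) -> float:
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("every respondent needs the same number (>= 2) of item scores")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical scores: four respondents, three items each.
    data = [[2, 3, 3], [1, 1, 2], [3, 3, 3], [0, 1, 1]]
    print(f"alpha = {cronbach_alpha(data):.2f}")
```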