Science.gov

Sample records for ability test scores

  1. A Review of Scoring Algorithms for Ability and Aptitude Tests.

    ERIC Educational Resources Information Center

    Chevalier, Shirley A.

    In conventional practice, most educators and educational researchers score cognitive tests using a dichotomous right-wrong scoring system. Although simple and straightforward, this method does not take into consideration other factors, such as partial knowledge or guessing tendencies and abilities. This paper discusses alternative scoring models:…
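
    The abstract above is truncated before it names its alternative scoring models, so the following is only an illustrative sketch of the contrast it describes. It compares dichotomous right-wrong scoring with the classical correction-for-guessing formula score, S = R - W/(k - 1). The function names, the sample answer key, and the choice of this particular correction are assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical illustration: number-right scoring vs. the classical
# correction-for-guessing formula score. Not the paper's own method.

def number_right(responses, key):
    """Dichotomous right-wrong score: count of items answered correctly."""
    return sum(1 for r, k in zip(responses, key) if r == k)


def formula_score(responses, key, n_options):
    """Formula score S = R - W / (k - 1).

    Omitted items (None) count as neither right nor wrong;
    n_options is the number of answer choices per item (k).
    """
    right = number_right(responses, key)
    wrong = sum(1 for r, k in zip(responses, key) if r is not None and r != k)
    return right - wrong / (n_options - 1)


# Made-up five-item, four-option test: two right, two wrong, one omitted.
key = ["A", "B", "C", "D", "A"]
responses = ["A", "B", "D", None, "C"]
print(number_right(responses, key))
print(formula_score(responses, key, n_options=4))
```

    Under the formula score, wrong answers are penalized while omissions are not, so blind guessing has an expected gain of zero. This is one simple way a scoring rule can account for guessing, which number-right scoring ignores.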

  2. Prediction of WAIS Scores from Group Ability Tests

    ERIC Educational Resources Information Center

    Watson, Charles G.; Klett, William G.

    1973-01-01

    In a search for an adequate but efficient substitute, the authors have instituted three evaluations of the relationships between potential WAIS substitutes and the WAIS itself. The present report describes the first of these: a study of the relationships between four group ability tests and the WAIS in a mental hospital setting.…

  3. Situational Effects May Account for Gain Scores in Cognitive Ability Testing: A Longitudinal SEM Approach

    ERIC Educational Resources Information Center

    Matton, Nadine; Vautier, Stephane; Raufaste, Eric

    2009-01-01

    Mean gain scores for cognitive ability tests between two sessions in a selection setting are now a robust finding, yet not fully understood. Many authors do not attribute such gain scores to an increase in the target abilities. Our approach consists of testing a longitudinal SEM model suitable to this view. We propose to model the scores' changes…

  4. The Effect of Schooling and Ability on Achievement Test Scores. NBER Working Paper Series.

    ERIC Educational Resources Information Center

    Hansen, Karsten; Heckman, James J.; Mullen, Kathleen J.

    This study developed two methods for estimating the effect of schooling on achievement test scores that control for the endogeneity of schooling by postulating that both schooling and test scores are generated by a common unobserved latent ability. The methods were applied to data on schooling and test scores. Estimates from the two methods are in…

  5. On Reporting IRT Ability Scores When the Test Is Not Unidimensional.

    ERIC Educational Resources Information Center

    Dirir, Mohamed A.; Sinclair, Norma

    The purpose of this study was to examine the effect of test dimensionality on the stability of examinee ability estimates and item response theory (IRT) based score reports. A simulation procedure based on W. F. Stout's Essential Unidimensionality was used to generate test data with one dominant trait for the whole test and three minor traits…

  6. The Relationship between Deductive Reasoning Ability, Test Anxiety, and Standardized Test Scores in a Latino Sample

    ERIC Educational Resources Information Center

    Rich, John D., Jr.; Fullard, William; Overton, Willis

    2011-01-01

    One hundred twelve Latino students from Philadelphia participated in this study, which examined the development of deductive reasoning across adolescence and the relation of reasoning to test anxiety and standardized test scores. As predicted, eleventh and ninth graders demonstrated significantly more advanced reasoning than seventh graders.…

  7. The Relationship between Kindergarten Students' Home Block Play and Their Spatial Ability Test Scores

    ERIC Educational Resources Information Center

    Jones, Tracy Anne

    2010-01-01

    Researchers are increasingly aware of the role of spatial skills in preparing children for future mathematics achievement (National Mathematics Advisory Panel, 2008). In addition, sex differences have been consistently documented showing boys score higher than girls in assessments of spatial ability, particularly mental rotation (Linn &…

  8. Aptitude Test Score Validity: No Moderating Effect Due to Job Ability Requirement Differences.

    ERIC Educational Resources Information Center

    Jones, Gwen E.; Ree, Malcolm James

    1998-01-01

    This study tested the specificity-generality hypothesis regarding moderation of aptitude test validity by job ability requirement differences using 24,482 Air Force enlistees in 37 jobs. Moderating effects due to job differences were not found, and job ability differences did not moderate the relationship between the amount of "g"…

  9. A Factor Analysis of Learning Data and Selected Ability Test Scores

    ERIC Educational Resources Information Center

    Jones, Dorothy L.

    1976-01-01

    A verbal concept-learning task permitting the externalizing and quantifying of learning behavior and 16 ability tests were administered to female graduate students. Data were analyzed by alpha factor analysis and incomplete image analysis. Six alpha factors and 12 image factors were extracted and orthogonally rotated. Four areas of cognitive…

  10. Cognitive Ability and Personality Variables as Predictors of School Grades and Test Scores in Adolescents

    ERIC Educational Resources Information Center

    Hofer, Manfred; Kuhnle, Claudia; Kilian, Britta; Fries, Stefan

    2012-01-01

    The predictive power of cognitive ability and self-control strength for self-reported grades and an achievement test were studied. It was expected that the variables use of time structure, academic procrastination, and motivational interference during learning further aid in predicting students' achievement because they are operative in situations…

  11. Interpreting Vocabulary Test Scores: What Do Various Item Formats Tell Us about Learners' Ability to Employ Words?

    ERIC Educational Resources Information Center

    Kremmel, Benjamin; Schmitt, Norbert

    2016-01-01

    The scores from vocabulary size tests have typically been interpreted as demonstrating that the target words are "known" or "learned." But "knowing" a word should entail the ability to use it in real language communication in one or more of the four skills. It should also entail deeper knowledge, such as knowing the…

  12. The Role of Fluid, Crystallized, and Creative Abilities in the Prediction of Scores on Essay and Objective Tests.

    ERIC Educational Resources Information Center

    Legg, Sue M.; Ware, William B.

    Student and test characteristics were examined by multiple regression analysis and discriminant function analysis to explain why 171 political science undergraduates scored differently on essay versus objective final examinations. Student characteristics included: (1) patterns of creative, crystallized, and fluid abilities as measured by the…

  13. The Score Reliability of Draw-a-Person Intellectual Ability Test (DAP: IQ) for Rural Malawi Students

    ERIC Educational Resources Information Center

    Khasu, Denis S.; Williams, Thomas O., Jr.

    2016-01-01

    In this brief article, the reliability of scores for the Draw-A-Person Intellectual Ability Test for Children, Adolescents, and Adults (DAP: IQ; Reynolds & Hickman, 2004) was examined through several analyses with a sample of 147 children from rural Malawi, Africa using a Chichewa translation of instructions. Cronbach alpha coefficients for…

  14. A Test of the Relationship between Reading Ability & Standardized Biology Assessment Scores

    ERIC Educational Resources Information Center

    Allen, Denise A.

    2014-01-01

    Little empirical evidence suggested that the independent reading abilities of students enrolled in biology predicted their performance on the Biology I Graduation End-of-Course Assessment (ECA). An archival study was conducted at one urban public high school in Indianapolis, Indiana, by examining existing educational assessment data to test…

  15. An Investigation of Calculator Use on Employment Tests of Mathematical Ability: Effects on Reliability, Validity, Test Scores, and Speed of Completion

    ERIC Educational Resources Information Center

    Bing, Mark N.; Stewart, Susan M.; Davison, H. Kristl

    2009-01-01

    Handheld calculators have been used on the job for more than 30 years, yet the degree to which these devices can affect performance on employment tests of mathematical ability has not been thoroughly examined. This study used a within-subjects research design (N = 167) to investigate the effects of calculator use on test score reliability, test…

  16. Effects of Test Length and Sample Size on the Estimates of Precision of Latent Ability Scores

    DTIC Science & Technology

    1979-03-01

    describing a test item, and methods used to estimate parameters) we will be even more pleased. References: Birnbaum, A. Some latent trait models and...Supplementary notes: A paper presented at an AERA-NCME symposium entitled "Explorations of Latent Trait Models as a Means of Solving Practical...of latent trait models is the possibility of specifying a target information curve, and then selecting items from an item pool to produce a test with

  17. Validation of Automated Scores of TOEFL iBT Tasks against Non-Test Indicators of Writing Ability

    ERIC Educational Resources Information Center

    Weigle, Sara Cushing

    2010-01-01

    Automated scoring has the potential to dramatically reduce the time and costs associated with the assessment of complex skills such as writing, but its use must be validated against a variety of criteria for it to be accepted by test users and stakeholders. This study approaches validity by comparing human and automated scores on responses to…

  18. On the Myth and the Reality of the Temporal Validity Degradation of General Mental Ability Test Scores

    ERIC Educational Resources Information Center

    Reeve, Charlie L.; Bonaccio, Silvia

    2011-01-01

    Claims of changes in the validity coefficients associated with general mental ability (GMA) tests due to the passage of time (i.e., temporal validity degradation) have been the focus of an on-going debate in applied psychology. To evaluate whether and, if so, under what conditions this degradation may occur, we integrate evidence from multiple…

  19. Test facilities for SCORE-D

    NASA Astrophysics Data System (ADS)

    Greuel, Dirk; Deeken, Jan; Suslov, Dmitry; Schäfer, Klaus; Schlechtriem, Stefan

    2013-06-01

    The LOX/LH2 Staged Combustion Rocket Engine Demonstrator (SCORE-D) is part of ESA's Future Launcher Preparatory Program (FLPP). SCORE-D serves as a technology demonstrator in perspective of the development of the High Thrust Engine (HTE), which is designated as a candidate for the main stage engine of the Next Generation Launcher (NGL). To develop and test the SCORE-D engine, ESA investigates configurations of the test benches P3.2 and P5 at DLR test site in Lampoldshausen. For the SCORE-D Hot Combustion Devices (HCD) development, i.e. Pre-burner (PB) and thrust chamber assembly (TCA), the P3.2 test facility has to be modified for further usage. Recently, the first steps in this endeavor have been made with the evaluation of the necessary modifications to the facility. To accommodate the SCORE-D engine, it is foreseen to modify the P5 test facility in the coming years. In the last year, DLR has started the design phase for these modifications. In preparatory test programs at the P8 test facility, Astrium has conducted sub-scale hot combustion devices tests. While Astrium designed and manufactured the sub-scale assembly of the pre-burner and the main combustion chamber (MCC) for SCORE-D, DLR operated the P8 test facility.

  20. Multidimensional Scoring of Abilities: The Ordered Polytomous Response Case

    ERIC Educational Resources Information Center

    de la Torre, Jimmy

    2008-01-01

    Recent work has shown that multidimensionally scoring responses from different tests can provide better ability estimates. For educational assessment data, applications of this approach have been limited to binary scores. Of the different variants, the de la Torre and Patz model is considered more general because implementing the scoring procedure…

  1. The Ability of Standardized Test Instruments to Predict Training Success and Employment Success. Project MINI-SCORE, Final Technical Report.

    ERIC Educational Resources Information Center

    Pucel, David J.; And Others

    Using post-secondary vocational and technical education students as the populations, the objectives of this project were to determine: (1) the ability of standardized instruments to predict the various criteria of success, (2) the relative ability of the different instruments to predict each criterion of success, and (3) which sub-set of all of…

  2. Fluctuation in Spatial Ability Scores during the Menstrual Cycle.

    ERIC Educational Resources Information Center

    Moody, M. Suzanne

    Whether or not fluctuations in spatial ability as measured by S. G. Vandenberg's Mental Rotations Test occur during the menstrual cycle was studied with 133 female students from 9 undergraduate educational psychology and nursing classes. For comparison, 28 male students also took the test. Scores from 55 females fell into the relevant menstrual…

  3. TRACKING Trounces Test Scores

    ERIC Educational Resources Information Center

    Education Digest: Essential Readings Condensed for Quick Review, 2004

    2004-01-01

    This article presents an adaptation of an article from School Board News, January 6, 2004 edition. The article describes the effort of de-tracking students of varying ability levels, made by officials of South Side High School, in Rockville Centre, New York, and Noble High School, in North Berwick, Maine. Officials from both schools say that the…

  4. Polychotomous Responses and the Test Score.

    ERIC Educational Resources Information Center

    Samejima, Fumiko

    Traditionally, the test score represented by the number of items answered correctly was taken as an indicator of the examinee's ability level. Researchers still tend to think that the number-correct score is a way of ordering individuals with respect to the latent trait. The objective of this study is to depict the benefits of using ability…

  5. Beyond the Test Scores.

    ERIC Educational Resources Information Center

    Thibodeau, Janice J.

    1985-01-01

    A diagnostic-prescriptive scheme is illustrated using subtests of the Slingerland Screening Tests for Identifying Children with Specific Language Disability and the Detroit Tests of Learning Aptitude. The scheme is intended to focus on the child's learning style by examining the task and the strategies employed. (CL)

  6. What Do Test Score Really Mean? A Latent Class Analysis of Danish Test Score Performance

    ERIC Educational Resources Information Center

    McIntosh, James; Munk, Martin D.

    2014-01-01

    Latent class Poisson count models are used to analyse a sample of Danish test score results from a cohort of individuals born in 1954-1955, tested in 1968, and followed until 2011. The procedure takes account of unobservable effects as well as excessive zeros in the data. We show that the test scores measure manifest or measured ability as it has…

  7. Hedonism or Higher Test Scores?

    ERIC Educational Resources Information Center

    Wold, Donald C.

    2004-01-01

    In the 20 years since the federal report on education "A Nation at Risk" appeared, much has been written on test scores of students in the United States versus their counterparts elsewhere. One of the issues is whether their scores are in fact inferior, or merely a statistical difference due to their universal schooling philosophy. Since…

  8. From neural oscillations to reasoning ability: Simulating the effect of the theta-to-gamma cycle length ratio on individual scores in a figural analogy test.

    PubMed

    Chuderski, Adam; Andrelczyk, Krzysztof

    2015-02-01

    Several existing computational models of working memory (WM) have predicted a positive relationship (later confirmed empirically) between WM capacity and the individual ratio of theta to gamma oscillatory band lengths. These models assume that each gamma cycle represents one WM object (e.g., a binding of its features), whereas the theta cycle integrates such objects into the maintained list. As WM capacity strongly predicts reasoning, it might be expected that this ratio also predicts performance in reasoning tasks. However, no computational model has yet explained how the differences in the theta-to-gamma ratio found among adult individuals might contribute to their scores on a reasoning test. Here, we propose a novel model of how WM capacity constrains figural analogical reasoning, aimed at explaining inter-individual differences in reasoning scores in terms of the characteristics of oscillatory patterns in the brain. In the model, the gamma cycle encodes the bindings between objects/features and the roles they play in the relations processed. Asynchrony between consecutive gamma cycles results from lateral inhibition between oscillating bindings. Computer simulations showed that achieving the highest WM capacity required reaching the optimal level of inhibition. When too strong, this inhibition eliminated some bindings from WM, whereas, when inhibition was too weak, the bindings became unstable and fell apart or became improperly grouped. The model aptly replicated several empirical effects and the distribution of individual scores, as well as the patterns of correlations found in the 100-person sample attempting the same reasoning task. Most importantly, the model's reasoning performance strongly depended on its theta-to-gamma ratio in the same way as the performance of human participants depended on their WM capacity. The data suggest that proper regulation of oscillations in the theta and gamma bands may be crucial for both high WM capacity and effective complex…

  9. Are Cattell-Horn-Carroll Broad Ability Composite Scores Exchangeable across Batteries?

    ERIC Educational Resources Information Center

    Floyd, Randy G.; Bergeron, Renee; McCormack, Allison C.; Anderson, Janice L.; Hargrove-Owens, Gabrielle L.

    2005-01-01

    Many school psychologists use the Cattell-Horn-Carroll (CHC) theory of cognitive abilities to guide their interpretation of scores from intelligence test batteries. Some may frequently assume that composite scores purported to measure the same CHC broad abilities should be relatively similar for individuals no matter what subtests or batteries…

  10. Aptitude Test Score Trends: 1959-1984.

    ERIC Educational Resources Information Center

    Brown, Dianne C.

    The decline in standardized test scores during the 1960s and 1970s is well documented and is seen in both aptitude and achievement test scores. This paper describes and analyzes the test score trends over the 1960s, 1970s and early 1980s for five aptitude tests: (1) the Scholastic Aptitude Test; (2) the American College Test; (3) the Preliminary…

  11. Tests of Cognitive Ability

    DTIC Science & Technology

    2005-12-01

    Carretta, 1996, p. 113). In fact, tests that do not even appear to measure g do so as illustrated by Rabbitt, Banerji, and Szymanski (1989) who...predictive validity where it is assumed that it measures a certain construct, but in fact measures a different construct. For example, Walters, Miller...grammar, word knowledge, making inferences, finding facts, seeing relationships, and identifying the main idea of the text. Reading and difficulty

  12. Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

    ERIC Educational Resources Information Center

    Wang, Wei

    2013-01-01

    Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

  13. Does Objective Structured Clinical Examinations Score Reflect the Clinical Reasoning Ability of Medical Students?

    PubMed Central

    Park, Wan Beom; Kang, Seok Hoon; Lee, Yoon-Seong

    2015-01-01

    Abstract: Background: Clinical reasoning ability is an important factor in a physician's competence and thus should be taught and tested in medical schools. Medical schools generally use objective structured clinical examinations (OSCE) to measure the clinical competency of medical students. However, it is unknown whether OSCE can also evaluate clinical reasoning ability. In this study, the authors investigated whether OSCE scores reflected students' clinical reasoning abilities. Methods: Sixty-five fourth-year medical students participated in this study. Medical students completed the OSCE with 4 cases using standardized patients. For assessment of clinical reasoning, students were asked to list differential diagnoses and the findings that were compatible or not compatible with each diagnosis. The OSCE score (score of patient encounter), diagnostic accuracy score, clinical reasoning score, clinical knowledge score and grade point average (GPA) were obtained for each student, and correlation analysis was performed. Results: Clinical reasoning score was significantly correlated with diagnostic accuracy and GPA (correlation coefficient = 0.258 and 0.380; P = 0.038 and 0.002, respectively) but not with OSCE score or clinical knowledge score (correlation coefficient = 0.137 and 0.242; P = 0.276 and 0.052, respectively). Total OSCE score was not significantly correlated with clinical knowledge test score, clinical reasoning score, diagnostic accuracy score or GPA. Conclusions: OSCE score from patient encounters did not reflect the clinical reasoning abilities of the medical students in this study. The evaluation of medical students' clinical reasoning abilities through OSCE should be strengthened. PMID:25647834

  14. More than Just Test Scores

    ERIC Educational Resources Information Center

    Levin, Henry M.

    2012-01-01

    Around the world we hear considerable talk about creating world-class schools. Usually the term refers to schools whose students get very high scores on the international comparisons of student achievement such as PISA or TIMSS. The practice of restricting the meaning of exemplary schools to the narrow criterion of achievement scores is usually…

  15. Understanding and Interpreting Pharmacy College Admission Test Scores

    PubMed Central

    2017-01-01

    To fairly and accurately interpret candidates’ Pharmacy College Admission Test (PCAT) scores as listed on their official transcripts, it is important to understand how these scores reflect candidates’ performances on cognitive tasks involving the identification, interpretation, analysis, and evaluation of information assumed to have been covered in pre-pharmacy science, math, and general education coursework. This paper attempts to facilitate this understanding by explaining how candidates’ responses to PCAT test items relate to their scaled scores and percentile ranks and how their writing scores reflect their performance. This paper also suggests how differences between candidates’ PCAT subtest scores may reflect different personal experiences, educational backgrounds, and cognitive abilities. PMID:28289307

  16. Conditional Reliability Coefficients for Test Scores.

    PubMed

    Nicewander, W Alan

    2017-04-06

    The most widely used general index of measurement precision for psychological and educational test scores is the reliability coefficient, a ratio of true variance for a test score to the true-plus-error variance of the score. In item response theory (IRT) models for test scores, the information function is the central, conditional index of measurement precision. In this inquiry, conditional reliability coefficients for a variety of score types are derived as simple transformations of information functions. It is shown, for example, that the conditional reliability coefficient for an ordinary number-correct score, X, is equal to ρ(X, X'|θ) = I(X, θ) / [I(X, θ) + 1], where θ is a latent variable measured by an observed test score, X; ρ(X, X'|θ) is the conditional reliability of X at a fixed value of θ; and I(X, θ) is the score information function. This is a surprisingly simple relationship between the two basic indices of measurement precision from IRT and classical test theory (CTT). This relationship holds for item scores as well as test scores based on sums of item scores, and it holds for dichotomous as well as polytomous items, or a mix of both item types. Also, conditional reliabilities are derived for computerized adaptive test scores, and for θ-estimates used as alternatives to number-correct scores. These conditional reliabilities are all related to information in a manner similar or identical to the one given above for the number-correct (NC) score.
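
    As a rough numerical illustration of the relationship quoted in this abstract, the sketch below computes I(X, θ) for a number-correct score and then applies ρ = I / (I + 1). The two-parameter logistic (2PL) item model, the standard score-information expression [dE(X|θ)/dθ]² / Var(X|θ), the function names, and the item parameters are all assumptions chosen for illustration; the abstract does not say which model or parameterization the paper uses.

```python
import math

# Hypothetical sketch: conditional reliability of a number-correct score,
# assuming 2PL items. Only rho = I / (I + 1) comes from the abstract.


def p_correct(theta, a, b):
    """Assumed 2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def score_information(theta, items):
    """Information of the number-correct score X at theta.

    Uses I(X, theta) = [dE(X|theta)/dtheta]^2 / Var(X|theta), where for
    2PL items the slope is sum(a * P * Q) and the variance is sum(P * Q).
    """
    slope = 0.0
    variance = 0.0
    for a, b in items:
        p = p_correct(theta, a, b)
        slope += a * p * (1.0 - p)
        variance += p * (1.0 - p)
    return slope ** 2 / variance


def conditional_reliability(information):
    """rho(X, X'|theta) = I / (I + 1), as stated in the abstract."""
    return information / (information + 1.0)


# Made-up test: four items, each with discrimination 1 and difficulty 0.
items = [(1.0, 0.0)] * 4
info = score_information(0.0, items)
print(info, conditional_reliability(info))
```

    For these made-up items at θ = 0, each item has P = 0.5, so the slope and the variance both equal 1, the information is 1, and the conditional reliability is 0.5.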

  17. Relationship between Age and the Ability to Break Scored Tablets

    PubMed Central

    Notenboom, Kim; Vromans, Herman; Schipper, Maarten; Leufkens, Hubert G. M.; Bouvy, Marcel L.

    2016-01-01

    Background: Practical problems with the use of medicines, such as difficulties with breaking tablets, are an often overlooked cause of non-adherence. Tablets frequently break into uneven parts, and loss of product can occur due to crumbling and powdering. Health characteristics, such as the presence of peripheral neuropathy, decreased grip strength and manual dexterity, can affect a patient's ability to break tablets. As these impairments are associated with aging and age-related diseases, such as Parkinson's disease and arthritis, difficulties with breaking tablets could be more prevalent among older adults. The objective of this study was to investigate the relationship between age and the ability to break scored tablets. Methods: A comparative study design was chosen. Thirty-six older adults and 36 young adults were systematically observed breaking scored tablets. Twelve different tablets were included. All participants were asked to break each tablet using three techniques: between the fingers with the use of nails, between the fingers without the use of nails, and by pushing the tablet downward with one finger on a solid surface. It was established whether a tablet was broken or not, and if broken, whether the tablet was broken accurately or not. Results: The older adults had more difficulty breaking tablets than the young adults. On average, the older persons broke 38.1% of the tablets, of which 71.0% were broken accurately. The young adults broke 78.2% of the tablets, of which 77.4% were broken accurately. Further analysis by mixed effects logistic regression revealed that age was associated with the ability to break tablets, but not with the accuracy of breaking. Conclusions: Breaking scored tablets by hand is less successful in an elderly population than in a group of young adults. Health care providers should be aware that tablet breaking is not appropriate for all patients and for all drugs. In case tablet breaking is unavoidable, a…

  18. On the Scoring of Cloze Tests.

    ERIC Educational Resources Information Center

    Clausing, Gerhard; Senko, Donna

    1978-01-01

    Cloze testing and language performance are discussed, as are two techniques for awarding partial credit: the quick performance measurement and feedback technique and the three-stage scoring hierarchy for partial credit. A figure and tables are included. (EJS)

  19. Teacher Greetings Increase College Students' Test Scores

    ERIC Educational Resources Information Center

    Weinstein, Lawrence; Laverghetta, Antonio; Alexander, Ralph; Stewart, Megan

    2009-01-01

    The current study is an extension of a previous investigation dealing with teacher greetings to students. The present investigation used teacher greetings with college students and academic performance (test scores). We report data using university students and in-class test performance. Students in introductory psychology who received teachers'…

  20. TIE: an ability test of emotional intelligence.

    PubMed

    Śmieja, Magdalena; Orzechowski, Jarosław; Stolarski, Maciej S

    2014-01-01

    The Test of Emotional Intelligence (TIE) is a new ability scale based on a theoretical model that defines emotional intelligence as a set of skills responsible for the processing of emotion-relevant information. Participants are provided with descriptions of emotional problems, and asked to indicate which emotion is most probable in a given situation, or to suggest the most appropriate action. Scoring is based on the judgments of experts: professional psychotherapists, trainers, and HR specialists. The validation study showed that the TIE is a reliable and valid test, suitable for both scientific research and individual assessment. Its internal consistency measures were as high as .88. In line with the theoretical model of emotional intelligence, the results of the TIE shared about 10% of common variance with a general intelligence test, and were independent of major personality dimensions.

  1. TIE: An Ability Test of Emotional Intelligence

    PubMed Central

    Śmieja, Magdalena; Orzechowski, Jarosław; Stolarski, Maciej S.

    2014-01-01

    The Test of Emotional Intelligence (TIE) is a new ability scale based on a theoretical model that defines emotional intelligence as a set of skills responsible for the processing of emotion-relevant information. Participants are provided with descriptions of emotional problems, and asked to indicate which emotion is most probable in a given situation, or to suggest the most appropriate action. Scoring is based on the judgments of experts: professional psychotherapists, trainers, and HR specialists. The validation study showed that the TIE is a reliable and valid test, suitable for both scientific research and individual assessment. Its internal consistency measures were as high as .88. In line with the theoretical model of emotional intelligence, the results of the TIE shared about 10% of common variance with a general intelligence test, and were independent of major personality dimensions. PMID:25072656

  2. The Black-White Test Score Gap.

    ERIC Educational Resources Information Center

    Jencks, Christopher, Ed.; Phillips, Meredith, Ed.

    The 15 chapters of this book address issues related to the continuing test score gap between black and white students. The editors argue against traditional explanations which emphasize differences in economic resources and demographic factors, and they urge that more emphasis be put on psychological and cultural factors. The book suggests studies…

  3. Teacher Use of Achievement Test Score Data

    ERIC Educational Resources Information Center

    Miller, Steven C.

    2012-01-01

    The Wyoming Department of Education (WDE) has invested time and money developing standardized achievement test score reports designed to give teachers data about each of their students' levels of mastery of particular concepts in order to differentiate their instruction. The purpose of this study was to determine the extent to which eighth-grade…

  4. Indicators of Usefulness of Test Scores

    ERIC Educational Resources Information Center

    Sawyer, Richard

    2007-01-01

    Current thinking on validity suggests that educational institutions and individuals should evaluate their uses of test scores in the context of their fundamental goals. Regression coefficients and other traditional criterion-related validity statistics provide relevant information, but often do not, by themselves, address the fundamental reasons…

  5. Critical Thinking: More than Test Scores

    ERIC Educational Resources Information Center

    Smith, Vernon G.; Szymanski, Antonia

    2013-01-01

    This article is for practicing or aspiring school administrators. The demand for excellence in public education has led to an emphasis on standardized test scores. This article explores the development of a professional enhancement program designed to prepare teachers to teach higher order thinking skills. Higher order thinking is the primary…

  6. Misidentifying Factors Underlying Singapore's High Test Scores

    ERIC Educational Resources Information Center

    Usiskin, Zalman

    2012-01-01

    Singapore students have scored exceedingly well on international tests in mathematics. In response, there has been a desire in the United States--both at the policy level and at the school level--to emulate Singapore. Because what can be identified most easily about Singapore's school mathematics can be gleaned from curriculum documents from the…

  7. Does weight affect children's test scores and teacher assessments differently?

    PubMed

    Zavodny, Madeline

    2013-06-01

    The prevalence of childhood overweight and obesity increased dramatically in the United States during the past three decades. This increase has adverse public health implications, but its implication for children's academic outcomes is less clear. This paper uses data from five waves of the Early Childhood Longitudinal Study-Kindergarten to examine how children's weight is related to their scores on standardized tests and to their teachers' assessments of their academic ability. The results indicate that children's weight is more negatively related to teacher assessments of their academic performance than to test scores.

  8. The Stratified Adaptive Computerized Ability Test.

    ERIC Educational Resources Information Center

    Weiss, David J.

    This report describes the stratified adaptive (stradaptive) test as a strategy for tailoring an ability test to individual differences in testee ability; administration of the test is controlled by a time-shared computer system. The rationale of this method is described as it derives from Binet's strategy of ability test administration and…

  9. The Effects of Anchor Length, Test Difficulty, Population Ability Differences, Mixture of Populations and Sample Size on the Psychometric Properties of Levine Observed Score Linear Equating Method for Different Assumptions

    ERIC Educational Resources Information Center

    Carvajal-Espinoza, Jorge E.

    2011-01-01

    The Non-Equivalent groups with Anchor Test equating (NEAT) design is a widely used equating design in large scale testing that involves two groups that do not have to be of equal ability. One group P gets form X and a group of items A and the other group Q gets form Y and the same group of items A. One of the most commonly used equating methods in…

  10. Validating Test Score Meaning and Defending Test Score Use: Different Aims, Different Methods

    ERIC Educational Resources Information Center

    Cizek, Gregory J.

    2016-01-01

    Advances in validity theory and alacrity in validation practice have suffered because the term "validity" has been used to refer to two incompatible concerns: (1) the degree of support for specified interpretations of test scores (i.e. intended score meaning) and (2) the degree of support for specified applications (i.e. intended test…

  11. Ability Estimation for Conventional Tests.

    ERIC Educational Resources Information Center

    Kim, Jwa K.; Nicewander, W. Alan

    1993-01-01

    Bias, standard error, and reliability of five ability estimators were evaluated using Monte Carlo estimates of the unknown conditional means and variances of the estimators. Results indicate that estimates based on Bayesian modal, expected a posteriori, and weighted likelihood estimators were reasonably unbiased with relatively small standard…

  12. ITC Guidelines on Quality Control in Scoring, Test Analysis, and Reporting of Test Scores

    ERIC Educational Resources Information Center

    Allalouf, Avi

    2014-01-01

    The Quality Control (QC) Guidelines are intended to increase the efficiency, precision, and accuracy of the scoring, analysis, and reporting process of testing. The QC Guidelines focus on large-scale testing operations where multiple forms of tests are created for use on set dates. However, they may also be used for a wide variety of other testing…

  13. Some Properties of a Bayesian Adaptive Ability Testing Strategy.

    ERIC Educational Resources Information Center

    McBride, James R.; Weiss, David J.

    Four Monte Carlo simulation studies of Owen's Bayesian sequential procedure for adaptive mental testing were conducted. Whereas previous simulation studies of this procedure have concentrated on evaluating it in terms of the correlation of its test scores with simulated ability in a normal population, these four studies explored a number of…

  14. Measurement of ability emotional intelligence: results for two new tests.

    PubMed

    Austin, Elizabeth J

    2010-08-01

    Emotional intelligence (EI) has attracted considerable interest amongst both individual differences researchers and those in other areas of psychology who are interested in how EI relates to criteria such as well-being and career success. Both trait (self-report) and ability EI measures have been developed; the focus of this paper is on ability EI. The associations of two new ability EI tests with psychometric intelligence, emotion perception, and the Mayer-Salovey-Caruso EI test (MSCEIT) were examined. The new EI tests were the Situational Test of Emotion Management (STEM) and the Situational Test of Emotional Understanding (STEU). Only the STEU and the MSCEIT Understanding Emotions branch were significantly correlated with psychometric intelligence, suggesting that only understanding emotions can be regarded as a candidate new intelligence component. These understanding emotions tests were also positively correlated with emotion perception tests, and STEM and STEU scores were positively correlated with MSCEIT total score and most branch scores. Neither the STEM nor the STEU were significantly correlated with trait EI tests, confirming the distinctness of trait and ability EI. Taking the present results as a starting-point, approaches to the development of new ability EI tests and models of EI are suggested.

  15. Studies of the Ability of Preservice Social Studies Teachers to Stage Score Moral Thought Statements.

    ERIC Educational Resources Information Center

    Napier, John D.

    The report describes two experiments involving the ability of preservice social studies teachers to stage score moral thought statements. Stage scoring is defined as keeping a record of statements in accordance with the stages of moral development originated by psychologist Lawrence Kohlberg. The two experiments involved the use of three stage…

  16. The Ability of Elementary School Teachers to Stage Score Moral Thought Statements

    ERIC Educational Resources Information Center

    Napier, John D.

    1976-01-01

    The study examined (1) whether 60 elementary school teachers could score moral thought statements into Kohlberg's moral stages by receiving special training and using a rater manual, and (2) what factors were related to their stage-scoring ability. Major conclusion was that the rater manual and training were ineffective. (Author/ND)

  17. Does Test Anxiety Induce Measurement Bias in Cognitive Ability Tests?

    ERIC Educational Resources Information Center

    Reeve, Charlie L.; Bonaccio, Silvia

    2008-01-01

    Although test anxiety is typically negatively related to performance on cognitive ability tests, little research has systematically investigated whether differences in test anxiety result in measurement bias on cognitive ability tests. The current paper uses a structural equation modeling technique to explicitly test for measurement bias due to…

  18. Estimating Total-Test Scores from Partial Scores in a Matrix Sampling Design.

    ERIC Educational Resources Information Center

    Sachar, Jane; Suppes, Patrick

    1980-01-01

    The present study compared six methods, two of which utilize the content structure of items, to estimate total-test scores using 450 students and 60 items of the 110-item Stanford Mental Arithmetic Test. Three methods yielded fairly good estimates of the total-test score. (Author/RL)

  19. School accountability and the black-white test score gap.

    PubMed

    Gaddis, S Michael; Lauen, Douglas Lee

    2014-03-01

    Since at least the 1960s, researchers have closely examined the respective roles of families, neighborhoods, and schools in producing the black-white achievement gap. Although many researchers minimize the ability of schools to eliminate achievement gaps, the No Child Left Behind Act (NCLB) increased pressure on schools to do so by 2014. In this study, we examine the effects of NCLB's subgroup-specific accountability pressure on changes in black-white math and reading test score gaps using a school-level panel dataset on all North Carolina public elementary and middle schools between 2001 and 2009. Using difference-in-difference models with school fixed effects, we find that accountability pressure reduces black-white achievement gaps by raising mean black achievement without harming mean white achievement. We find no differential effects of accountability pressure based on the racial composition of schools, but schools with more affluent populations are the most successful at reducing the black-white math achievement gap. Thus, our findings suggest that school-based interventions have the potential to close test score gaps, but differences in school composition and resources play a significant role in the ability of schools to reduce racial inequality.

  20. Testing Intelligently Includes Double-Checking Wechsler IQ Scores

    ERIC Educational Resources Information Center

    Kuentzel, Jeffrey G.; Hetterscheidt, Lesley A.; Barnett, Douglas

    2011-01-01

    The rigors of standardized testing make for numerous opportunities for examiner error, including simple computational mistakes in scoring. Although experts recommend that test scoring be double-checked, the extent to which independent double-checking would reduce scoring errors is not known. A double-checking procedure was established at a…

  1. Does Test Preparation Work? Implications for Score Validity

    ERIC Educational Resources Information Center

    Xie, Qin

    2013-01-01

    This article reports an empirical study that examined the pattern of test preparation for College English Test Band 4 (CET4) and the differential effects of test preparation practices on its scores, thereby drawing implications for CET4 score validity. Data collection involved 1,003 test takers of CET4. A pretest was administered at the beginning…

  2. Comparison of the Qualitative and Developmental Scoring Systems for the Modified Version of the Bender-Gestalt Test.

    ERIC Educational Resources Information Center

    Brannigan, Gary G.; Brunner, Nancy A.

    1993-01-01

    Examined two scoring systems for Modified Version of the Bender-Gestalt Test. Administered Bender-Gestalt and Otis-Lennon School Ability Test to 75 first-grade and 84 second-grade students. Both systems were significantly correlated with school ability. Results of tests for differences between correlations indicated that Qualitative Scoring System…

  3. The Test Score Decline: A Review and Annotated Bibliography

    DTIC Science & Technology

    1981-08-01

    40. Champagne, D., & Roberts, E., An Exercise in Freedom: A Place Where Test Scores Appear to Be Rising. 3. Acland, H., If Reading Scores Are...of the nation’s young teachers. Scientific, Engineering, Technical Manpower Comments, November 1979. 3. Acland, Henry, If reading scores are

  4. The Generalizability of Motivation Filtering in Improving Test Score Validity

    ERIC Educational Resources Information Center

    Wise, Vicki L.; Wise, Steven L.; Bhola, Dennison S.

    2006-01-01

    Accountability for educational quality is a priority at all levels of education. Low-stakes testing is one way to measure the quality of education that students receive and make inferences about what students know and can do. Aggregate test scores from low-stakes testing programs are suspect, however, to the degree that these scores are influenced…

  5. Test Reliability by Ability Level of Examinees.

    ERIC Educational Resources Information Center

    Green, Kathy; Sax, Gilbert

    Achievement test reliability as a function of ability was determined for multiple sections of a large university French class (n=193). A 5-option multiple-choice examination was constructed, least attractive distractors were eliminated based on the instructor's judgment, and the resulting three forms of the examination (i.e. 3-, 4-, or 5-choice…

  6. Estimating Total-test Scores from Partial Scores in a Matrix Sampling Design.

    ERIC Educational Resources Information Center

    Sachar, Jane; Suppes, Patrick

    It is sometimes desirable to obtain an estimated total-test score for an individual who was administered only a subset of the items in a total test. The present study compared six methods, two of which utilize the content structure of items, to estimate total-test scores using 450 students in grades 3-5 and 60 items of the 110-item Stanford Mental…

  7. Phishing IQ Tests Measure Fear, Not Ability

    NASA Astrophysics Data System (ADS)

    Anandpara, Vivek; Dingman, Andrew; Jakobsson, Markus; Liu, Debin; Roinestad, Heather

    We argue that phishing IQ tests fail to measure susceptibility to phishing attacks. We conducted a study where 40 subjects were asked to answer a selection of questions from existing phishing IQ tests in which we varied the portion (from 25% to 100%) of the questions that corresponded to phishing emails. We did not find any correlation between the actual number of phishing emails and the number of emails that the subjects indicated were phishing. Therefore, the tests did not measure the ability of the subjects. To further confirm this, we exposed all the subjects to existing phishing education after they had taken the test, after which each subject was asked to take a second phishing test, with the same design as the first one, but with different questions. The number of stimuli that were indicated as being phishing in the second test was, again, independent of the actual number of phishing stimuli in the test. However, a substantially larger portion of stimuli was indicated as being phishing in the second test, suggesting that the only measurable effect of the phishing education (from the point of view of the phishing IQ test) was an increased concern—not an increased ability.

  8. Motor impairment influences Farnsworth-Munsell 100 Hue test error scores in Parkinson's disease patients.

    PubMed

    Müller, Thomas; Meisel, Margareta; Russ, Herrmann; Przuntek, Horst

    2003-09-15

    Farnsworth-Munsell 100 Hue test (FMT) error scores and peg insertion abilities significantly differ between Parkinson's disease (PD) patients and controls. Both tasks ask for performance of voluntary movements. The objective of this study was to demonstrate a relation between FMT error scores and peg insertion outcomes. We successively performed both tasks in 28 previously untreated PD patients. The FMT error score was significantly (p=0.016) lower in patients with better peg insertion outcome. A significant (Spearman R=0.47, p=0.012) correlation between peg insertion results and the FMT error scores appeared. Motor impairment influences FMT error scores in PD patients.

  9. The Anatomy Competence Score--A New Marker for Anatomical Ability

    ERIC Educational Resources Information Center

    Schoeman, Scarpa; Chandratilake, Madawa

    2012-01-01

    The assessment of students' ability in gross anatomy is a complex process as it involves the measurement of multiple facets. In this work, the authors developed and introduced the Anatomy Competence Score (ACS), which incorporates the three domains of anatomy teaching and assessment namely: theoretical knowledge, practical 3D application of the…

  10. Test Score Reporting Referenced to Doubly-Moderated Cut Scores Using Splines

    ERIC Educational Resources Information Center

    Schafer, William D.; Hou, Xiaodong

    2011-01-01

    This study discusses and presents an example of a use of spline functions to establish and report test scores using a moderated system of any number of cut scores. Our main goals include studying the need for and establishing moderated standards and creating a reporting scale that is referenced to all the standards. Our secondary goals are to make…

  11. Estimating the Reliability of a Test Battery Composite or a Test Score Based on Weighted Item Scoring

    ERIC Educational Resources Information Center

    Feldt, Leonard S.

    2004-01-01

    In some settings, the validity of a battery composite or a test score is enhanced by weighting some parts or items more heavily than others in the total score. This article describes methods of estimating the total score reliability coefficient when differential weights are used with items or parts.

  12. Pictures Speak Louder than Test Scores.

    ERIC Educational Resources Information Center

    McCabe, Deborah; Hilmo, Joellen

    1985-01-01

    The Goodenough-Harris Draw-a-Person Test, if given at regular intervals during periods of remediation, may show clear evidence of improvement in behavior and attitude of learning disabled students. (CL)

  13. Prediction of Metropolitan Readiness Test Scores

    ERIC Educational Resources Information Center

    Blowers, E. A.

    1977-01-01

    The efficiency of several visual and auditory predictors of the Metropolitan Readiness Test was examined utilizing 106 grade 1 subjects considered by their teachers to show learning difficulties. (Author/JC)

  14. Fuzzy Math: A Meditation on Test Scoring

    ERIC Educational Resources Information Center

    Jacks, Meredith

    2011-01-01

    As a public school English teacher, the author observes standardized testing season each year with a sort of grim fascination. "So this is it," she thinks as she paces around her silent classroom, peering over kids' shoulders at articles about parasailing. Line graphs tracking the rainfall in Tulsa. Parts of speech. Functions of "x." "These are…

  15. Accountability Is More than a Test Score

    ERIC Educational Resources Information Center

    Turnipseed, Stephan; Darling-Hammond, Linda

    2015-01-01

    The number one quality business leaders look for in employees is creativity, and yet the U.S. education system undermines the development of the higher-order skills that promote creativity by its dogged focus on multiple-choice tests. Stephan Turnipseed and Linda Darling-Hammond discuss the kind of rich accountability system that will help students…

  16. Improving Scores on the IELTS Speaking Test

    ERIC Educational Resources Information Center

    Issitt, Steve

    2008-01-01

    This article presents three strategies for teaching students who are taking the IELTS speaking test. The first strategy is aimed at improving confidence and uses a variety of self-help materials from the field of popular psychology. The second encourages students to think critically and invokes a range of academic perspectives. The third strategy…

  17. Methods for Evaluating the Validity of Test Scores for English Language Learners

    ERIC Educational Resources Information Center

    Sireci, Stephen G.; Han, Kyung T.; Wells, Craig S.

    2008-01-01

    In the United States, when English language learners (ELLs) are tested, they are usually tested in English and their limited English proficiency is a potential cause of construct-irrelevant variance. When such irrelevancies affect test scores, inaccurate interpretations of ELLs' knowledge, skills, and abilities may occur. In this article, we…

  18. Effects of Grade Level and Subject on Student Test Score Predictions.

    ERIC Educational Resources Information Center

    Barnett, Jerrold E.; Hixon, Jon E.

    1997-01-01

    Interviews with elementary students before and after tests in three subjects investigated how grade level and subject affected students' ability to predict test scores. Results found a significant grade-subject area interaction for predictions prior to testing. Posttest predictions differed only slightly from pretest. Prediction accuracy was…

  19. Digit symbol substitution test score and hyperhomocysteinemia in older adults

    PubMed Central

    Hsu, Wen-Chuin; Chu, Yi-Chuan; Fung, Hon-Chung; Wai, Yau-Yau; Wang, Jiun-Jie; Lee, Jiann-Der; Chen, Yi-Chun

    2016-01-01

    Abstract Mounting evidence shows that hyperhomocysteinemia is a risk factor for cognitive decline. This study enrolled subjects with normal serum levels of B12 and folate and performed thorough neuropsychological assessments to illuminate the independent role of homocysteine on cognitive functions. Participants between ages 50 and 85 were enrolled with Modified Hachinski ischemic score of <4, adequate visual and auditory acuity to allow neuropsychological testing, and good general health. Subjects with cognitive impairment resulting from secondary causes were excluded. Each of the participants completed evaluations of general intellectual function, including the Mini-Mental State Examination, Cognitive Abilities Screening Instrument, Clinical Dementia Rating, and a battery of neuropsychological assessments. This study enrolled 225 subjects (90 subjects younger than 65 years and 135 subjects aged 65 years or older). The sex proportion was similar between the 2 age groups. Years of education were significantly fewer in the elderly (7.49 ± 5.40 years) than in the young (9.76 ± 4.39 years, P = 0.001). There was no significant difference in body mass index or levels of vitamin B12 and folate between the 2 age groups. Homocysteine levels were significantly higher in the elderly group compared to the younger group (10.8 ± 2.7 vs. 9.5 ± 2.5 μmol/L, respectively, P = 0.0006). After adjusting for age, sex, and education, only the Digit Symbol Substitution (DSS) score was significantly lower in subjects with hyperhomocysteinemia (homocysteine >12 μmol/L) than those with homocysteine ≤12 μmol/L in the elderly group (DSS score: 7.1 ± 2.7 and 9.0 ± 3.0, respectively, beta = −1.6, 95% confidence interval [CI] = −2.8∼−0.5, P = 0.001) and borderline significance was noted in the combined age group (beta = −1.1, 95% CI = −2.1∼−0.1, P = 0.04). We did not find an association between

  20. Development of WAIS-III General Ability Index Minus WMS-III memory discrepancy scores.

    PubMed

    Lange, Rael T; Chelune, Gordon J; Tulsky, David S

    2006-09-01

    Analysis of the discrepancy between intellectual functioning and memory ability has received some support as a useful means for evaluating memory impairment. In recent additions to Wechsler scale interpretation, the WAIS-III General Ability Index (GAI) and the WMS-III Delayed Memory Index (DMI) were developed. The purpose of this investigation is to develop base rate data for GAI-IMI, GAI-GMI, and GAI-DMI discrepancy scores using data from the WAIS-III/WMS-III standardization sample (weighted N = 1250). Base rate tables were developed using the predicted-difference method and two simple-difference methods (i.e., stratified and non-stratified). These tables provide valuable data for clinical reference purposes to determine the frequency of GAI-IMI, GAI-GMI, and GAI-DMI discrepancy scores in the WAIS-III/WMS-III standardization sample.

  1. Partial-Credit Scoring Methods for Multiple-Choice Tests.

    ERIC Educational Resources Information Center

    Frary, Robert B.

    1989-01-01

    Multiple-choice response and scoring methods that attempt to determine an examinee's degree of knowledge about each item in order to produce a total test score are reviewed. There is apparently little advantage to such schemes; however, they may have secondary benefits such as providing feedback to enhance learning. (SLD)

  2. A Review of Scoring Algorithms for Multiple-Choice Tests.

    ERIC Educational Resources Information Center

    Kurz, Terri Barber

    Multiple-choice tests are generally scored using a conventional number right scoring method. While this method is easy to use, it has several weaknesses. These weaknesses include decreased validity due to guessing and failure to credit partial knowledge. In an attempt to address these weaknesses, psychometricians have developed various scoring…

  3. Evaluating the Predictive Validity of Graduate Management Admission Test Scores

    ERIC Educational Resources Information Center

    Sireci, Stephen G.; Talento-Miller, Eileen

    2006-01-01

    Admissions data and first-year grade point average (GPA) data from 11 graduate management schools were analyzed to evaluate the predictive validity of Graduate Management Admission Test[R] (GMAT[R]) scores and the extent to which predictive validity held across sex and race/ethnicity. The results indicated GMAT verbal and quantitative scores had…

  4. Test Scores and What They Mean. Sixth Edition.

    ERIC Educational Resources Information Center

    Lyman, Howard B.

    The first edition of this book was written to give information about testing to people whose work gave them access to test results, but whose training included little or nothing about the use and interpretation of tests. Later editions have been intended for a broader audience as the need for understanding what test scores really mean has…

  5. Counselor Simulation by Film in Test Score Reporting Interviews

    ERIC Educational Resources Information Center

    Collins, Tom

    1972-01-01

    The responsible and innovative utilization of media, not only in test score reporting but also in other guidance functions, may assist the counselor in permitting him more time to function with clients in counseling relationships. (Author)

  6. Demographically Adjusted Groups for Equating Test Scores. Research Report. ETS RR-14-30

    ERIC Educational Resources Information Center

    Livingston, Samuel A.

    2014-01-01

    In this study, I investigated 2 procedures intended to create test-taker groups of equal ability by poststratifying on a composite variable created from demographic information. In one procedure, the stratifying variable was the composite variable that best predicted the test score. In the other procedure, the stratifying variable was the…

  7. Prediction of true test scores from observed item scores and ancillary data.

    PubMed

    Haberman, Shelby J; Yao, Lili; Sinharay, Sandip

    2015-05-01

    In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE(®) General Analytical Writing and until 2009 in the case of TOEFL(®) iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater(®). In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability.

  8. Making Sense of Test Scores. Assessment Brief. Number 10

    ERIC Educational Resources Information Center

    Bergman, Lincoln

    2004-01-01

    It is challenging for parents and the general public to make sense of the reports on test scores that appear in the mass media. This article offers some things for readers to consider as they bring a critical eye to what is read in the papers. Usually reports on test scores in the media are quite short and focus on one or two aspects of test…

  9. Wage and Test Score Dispersion: Some International Evidence.

    ERIC Educational Resources Information Center

    Bedard, Kelly; Ferrall, Christopher

    2003-01-01

    Compares the distribution of test scores at age 13 in 1964 and 1982 and wages later in life across 11 countries. Finds that wage dispersion later in life is never greater than test-score dispersion. For three countries (U.S., UK, and Japan), finds evidence of skill-biased changes in wage dispersion between the early 1970s and the late 1980s.…

  10. Grades and Test Scores: Accounting for Observed Differences.

    ERIC Educational Resources Information Center

    Willingham, Warren W.; Pollack, Judith M.; Lewis, Charles

    2002-01-01

    Proposed a framework of possible differences between grades and test scores and tested the framework with data on 8,454 high school seniors from the National Education Longitudinal Study. Identified differences and correlations among achievement factors. Differences between grades and tests give these measures complementary strengths in…

  11. Machine-Scored Testing, Part II: Creativity and Item Analysis.

    ERIC Educational Resources Information Center

    Leuba, Richard J.

    1986-01-01

    Explains how multiple choice test items can be devised to measure higher-order learning, including engineering problem solving. Discusses the value and information provided in item analysis procedures with machine-scored tests. Suggests elements to consider in test design. (ML)

  12. Observed-Score Equating as a Test Assembly Problem.

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Luecht, Richard M.

    1998-01-01

    Derives a set of linear conditions of item-response functions that guarantees identical observed-score distributions on two test forms. The conditions can be added as constraints to a linear programming model for test assembly. An example illustrates the use of the model for an item pool from the Law School Admissions Test (LSAT). (SLD)

  13. Test Takers and the Validity of Score Interpretations

    ERIC Educational Resources Information Center

    Kopriva, Rebecca J.; Thurlow, Martha L.; Perie, Marianne; Lazarus, Sheryl S.; Clark, Amy

    2016-01-01

    This article argues that test takers are as integral to determining validity of test scores as defining target content and conditioning inferences on test use. A principled sustained attention to how students interact with assessment opportunities is essential, as is a principled sustained evaluation of evidence confirming the validity or calling…

  14. Dearborn 1981-82 Achievement Test Scores (Fifth Annual Report).

    ERIC Educational Resources Information Center

    Dearborn Public Schools, MI.

    The purpose of the fifth annual Dearborn Achievement Test Score report is to summarize and to help interpret the test results so that Dearborn citizens and educators will have a better understanding of the educational achievements of Dearborn students. The District-wide Testing Program assesses reading readiness, scholastic aptitude, academic…

  15. Interpreting Test Scores: More Complicated than You Think

    ERIC Educational Resources Information Center

    Tully, Susannah

    2008-01-01

    As more colleges move to "test optional" admissions policies, the debate over the utility and interpretation of standardized-test scores continues. In this article, the author interviews Daniel Koretz, a professor of education at Harvard University and author of "Measuring Up: What Educational Testing Really Tells Us". Koretz…

  16. Do elderly people score better on cognitive tests at home?

    PubMed Central

    Shievitz, A. L.; Tudiver, F.; Araujo, A.; Sanghe, P.; Boyle, E.

    1998-01-01

    OBJECTIVE: To determine whether Mini-Mental State Examination (MMSE) scores of elderly family medicine patients are different when the test is administered at home rather than at the clinic. DESIGN: Cross-sectional comparison study. SETTING: University family practice unit in an urban area. PARTICIPANTS: A convenience sample of family practice clinic patients 70 years or older were referred to the study in the sequence seen at the clinic. Of 171 patients approached in person or by telephone, 77 agreed to participate. METHOD: The MMSE was administered at home and at the clinic on the same day for all subjects. Testing site order was randomized across patients. MAIN FINDINGS: Of the 77 patients who agreed to be subjects, only 13 (16.9%) had low MMSE scores (≤ 24). Five (41.7%) of these had normal scores (> 24) at home, but low scores in the clinic. Subjects had significantly higher scores on MMSEs administered at home (P < .01) on the same day. CONCLUSIONS: Previous research has shown patients achieve higher MMSE scores at home; this study demonstrated it in a representative family medicine population. Primary care physicians should be cautious about classifying elderly patients as possibly cognitively impaired based on clinic testing alone. Testing at home could avoid many unnecessary referrals to specialist services for further assessment and diagnostic tests that use up precious health care resources. PMID:9721421

  17. The non-credible score of the Rey Auditory Verbal Learning Test: is it better at predicting non-credible neuropsychological test performance than the RAVLT recognition score?

    PubMed

    Whitney, Kriscinda A; Davis, Jeremy J

    2015-03-01

    The ability of both the non-credible score of the Rey Auditory Verbal Learning Test (RAVLT NC) and the recognition score of the RAVLT (RAVLT Recog) to predict credible versus non-credible neuropsychological test performance was examined. Credible versus non-credible group membership was determined according to diagnostic criteria with consideration of performance on two stand-alone performance validity tests. Findings from this retrospective data analysis of outpatients seen for neuropsychological testing within a Veterans Affairs Medical Center (N = 175) showed that RAVLT Recog demonstrated better classification accuracy than RAVLT NC in predicting credible versus non-credible neuropsychological test performance. Specifically, an RAVLT Recog cutoff of ≤9 resulted in reasonable sensitivity (48%) and acceptable specificity (91%) in predicting non-credible neuropsychological test performance. Implications for clinical practice are discussed. Note: The views contained here within are those of the authors and not representative of the institutions with which they are associated.
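
    The cutoff-based classification described in this abstract can be illustrated with a minimal sketch. It assumes that a recognition score at or below the cutoff is flagged as non-credible and that reference labels come from an independent criterion; the function name and the sample data are hypothetical and are not taken from the study.

```python
def classify_at_cutoff(scores, is_noncredible, cutoff=9):
    """Flag a score as non-credible when it is at or below the cutoff,
    then compare the flags with the reference labels."""
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, is_noncredible):
        flagged = score <= cutoff
        if flagged and truth:
            tp += 1          # non-credible case correctly flagged
        elif flagged and not truth:
            fp += 1          # credible case wrongly flagged
        elif not flagged and truth:
            fn += 1          # non-credible case missed
        else:
            tn += 1          # credible case correctly passed
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical recognition scores and reference labels (not study data).
scores = [7, 9, 12, 14, 10, 15, 8, 13, 11, 15]
labels = [True, True, True, True, False, False, False, False, False, False]
sens, spec = classify_at_cutoff(scores, labels)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

    Lowering the cutoff generally trades sensitivity for specificity, which is consistent with the pattern of modest sensitivity and high specificity reported above.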

  18. Correlation Between Students' Dental Admission Test Scores and Performance on a Dental School's Competency Exam.

    PubMed

    Carroll, Alexander M; Schuster, Gregory M

    2015-11-01

    The aim of this study was to investigate whether there was a statistically significant positive correlation between dental students' Dental Admission Test (DAT) scores, particularly on the Perceptual Ability Test (PAT), and their performance on a dental school's competency exam. Scores from the written and clinical competency exam administered in the fall quarter of the fourth year of the curriculum at Midwestern University College of Dental Medicine-Arizona were compared to DAT scores of all 216 members of the graduating classes of 2012 and 2013. It was hypothesized that students who performed highly on one or more sections of the DAT would perform highly on the competency exam. Backward stepwise regression analyses were used to analyze the data. The results showed that the PAT scores were most strongly correlated with the competency exam scores and were a positive predictor for all three clinical sections of the exam (operative dentistry, periodontics, and endodontics). Positive predictors for the written portion of the exam were total DAT score for patient assessment and treatment planning and the DAT reading comprehension score for prosthodontics; there were no predictors for periodontics. The total variance explained by the results ranged from 4% to 15%. While statistically significant relationships were found between the students' PAT scores and clinical performance, DAT scores explained relatively little variance in the competency exam scores. According to these findings, neither the PAT nor any of the DAT components contributed to predicting these students' clinical performance.

  19. Examining alternative scoring rubrics on a statewide test: The impact of different scoring methods on science and social studies performance assessments

    NASA Astrophysics Data System (ADS)

    Creighton, Susan Dabney

    There is no consensus regarding the most reliable and valid scoring methods for the assessment of higher order thinking skills. Most of the research on alternative formats has focused on the scoring of writing ability. This study examined the value of different types of performance assessment scoring guides on state mandated science and social studies tests. A proportional stratified sample of raters was randomly assigned to one of four scoring groups: checklist, analytic rubric, holistic rubric, and generic rubrics. A fifth method, the weighted analytic rubric, was included by applying an algorithmic formula to the scores assigned by raters using the analytic rubric. A comparison of the mean scores for the five scoring groups suggests that there may be a difference in the way raters applied the rubric for each group. Although the literature suggests that it is possible to achieve high levels of inter-rater reliability across forms of scoring, phi coefficients of moderate strength were obtained for three of the four constructed-response items. Results for each scoring group were compared, indicating that item complexity may impact the level of inter-rater reliability and the selection of the most reliable rubric for each discipline. Analytic rubrics appear to achieve more reliable results with less complex items. A multitrait-multimethod approach was utilized to investigate the external validity of the social studies and science tasks. As expected, there tended to be a stronger association between the PACT science constructed-response scores with scores based on science multiple-choice scores than between the science constructed-response scores and the writing ability subtest scores. A similar pattern was seen with social studies items. These results provide some evidence for the validity of the performance assessments. A post study survey completed by raters provided qualitative information regarding their thought processes and their primary focus during the

  20. The Uses and Misuses of Test Scores: Technical Assistance Perspective.

    ERIC Educational Resources Information Center

    Echternacht, Gary

    The uses and misuses of standardized test results used for program evaluation as seen by a staff member of an Elementary Secondary Education Act (ESEA) Title I Technical Assistance Center are described. In ESEA Title I, test scores are used to select students for the program. Although federal requirements do not require using standardized test…

  1. Effort Analysis: Individual Score Validation of Achievement Test Data

    ERIC Educational Resources Information Center

    Wise, Steven L.

    2015-01-01

    Whenever the purpose of measurement is to inform an inference about a student's achievement level, it is important that we be able to trust that the student's test score accurately reflects what that student knows and can do. Such trust requires the assumption that a student's test event is not unduly influenced by construct-irrelevant factors…

  2. High Test Scores: The Wrong Road to National Economic Success

    ERIC Educational Resources Information Center

    Baker, Keith

    2011-01-01

    A widely held view is that good schools are essential to a nation's international economic success and that high test scores on international tests of academic skills and knowledge indicate how good a nation's schools are. The widespread belief that good schools are an important contributor to a nation's economic success in the world is supported…

  3. Cancer of the Prostate Risk Assessment (CAPRA) Preoperative Score Versus Postoperative Score (CAPRA-S): ability to predict cancer progression and decision-making regarding adjuvant therapy after radical prostatectomy.

    PubMed

    Seo, Won Ik; Kang, Pil Moon; Kang, Dong Il; Yoon, Jang Ho; Kim, Wansuk; Chung, Jae Il

    2014-09-01

    The University of California, San Francisco, announced in 2011 the Cancer of the Prostate Risk Assessment Postsurgical (CAPRA-S) score, which included pathologic data, but there were no results comparing preoperative predictors with the CAPRA-S score. We evaluated the validity of the CAPRA-S score in our institution and compared the result with the preoperative progression predictor, the CAPRA score. Data of 130 patients who underwent radical prostatectomy for localized prostate cancer from 2008 to 2013 were reviewed. Performance of the CAPRA-S score in predicting progression free probabilities was assessed through Kaplan Meier analysis and Cox proportional hazards regression test. Additionally, prediction probability was compared with the preoperative CAPRA score by logistic regression analysis. Compared with the CAPRA score, the CAPRA-S score showed improved prediction ability for 5 yr progression free survival (concordance index 0.80, P = 0.04). After risk group stratification, the 3 group model of CAPRA-S was superior to the 3 group model of CAPRA for 3-yr progression free survival and 5-yr progression free survival (concordance index 0.74 vs. 0.70, 0.77 vs. 0.71, P < 0.001). Finally, decision curve analysis showed the CAPRA-S score to be a better predictor than the CAPRA score for decisions regarding adjuvant therapy. The CAPRA-S score is a useful predictor for disease progression after radical prostatectomy.

  4. Neurocognitive abilities in the general population and composite genetic risk scores for attention-deficit hyperactivity disorder

    PubMed Central

    Martin, Joanna; Hamshere, Marian L; Stergiakouli, Evangelia; O'Donovan, Michael C; Thapar, Anita

    2015-01-01

    Background: The genetic architecture of ADHD is complex, with rare and common variants involved. Common genetic variants (as indexed by a composite risk score) associated with clinical ADHD significantly predict ADHD and autistic-like behavioural traits in children from the general population, suggesting that ADHD lies at the extreme of normal trait variation. ADHD and other neurodevelopmental disorders share neurocognitive difficulties in several domains (e.g. impaired cognitive ability and executive functions). We hypothesised that ADHD composite genetic risk scores derived from clinical ADHD cases would also contribute to variation in neurocognitive abilities in the general population. Methods: Children (N = 6,832) from a UK population cohort, the Avon Longitudinal Study of Parents and Children (ALSPAC), underwent neurocognitive testing. Parent-reported measures of their children's ADHD and autistic-like traits were used to construct a behavioural latent variable of ‘neurodevelopmental traits’. Composite genetic risk scores for ADHD were calculated for ALSPAC children based on findings from an independent ADHD case–control genome-wide association study. Structural equation modelling was used to assess associations between ADHD composite genetic risk scores and IQ, working memory, inhibitory control and facial emotion recognition, as well as the latent ‘neurodevelopmental trait’ measure. Results: The results confirmed that neurocognitive and neurodevelopmental traits are correlated in children in the general population. Composite genetic risk scores for ADHD were independently associated with lower IQ (β = −.05, p < .001) and working memory performance (β = −.034, p = .013), even after accounting for the relationship with latent neurodevelopmental behavioural trait scores. No associations were found between composite genetic risk scores and inhibitory control or emotion recognition (p > .05). Conclusions: These findings suggest that common

  5. The Effect of Piano Playing on Preservice Teachers' Ability to Detect Errors in a Choral Score

    ERIC Educational Resources Information Center

    Napoles, Jessica; Babb, Sandra L.; Bowers, Judy; Hankle, Steven; Zrust, Adam

    2017-01-01

    The purpose of this study was to examine and empirically test the pedagogical claim that playing the piano while listening to choral singers impedes error detection ability. In a within-subjects design, participants (N = 55 preservice teachers) either listened to four excerpts of choral hymns or played a single part (soprano/bass) on the piano…

  6. TH-SCORE: A Program for Obtaining Ability Estimates under Different Psychometric Models.

    ERIC Educational Resources Information Center

    Ferrando, Pere J.; Lorenzo, Urbano

    1998-01-01

    A program for obtaining ability estimates and their standard errors under a variety of psychometric models is documented. The general models considered are (1) classical test theory; (2) item factor analysis for continuous censored responses; and (3) unidimensional and multidimensional item response theory graded response models. (SLD)

  7. Computerized Ability Testing, 1972-1975. Final Report.

    ERIC Educational Resources Information Center

    Weiss, David J.

    Three and one-half years of research on computerized ability testing are summarized. The original objectives of the research were: (1) to develop and implement the stratified computer-based ability test; (2) to compare, on psychometric criteria, the various approaches to computer-based ability testing, including the stratified computerized test,…

  8. Admissions Testing at Career College and Trade School Training Programs. Test Score Guidelines, Norms, and Student Demographics.

    ERIC Educational Resources Information Center

    Wonderlic, Charles F.; And Others

    This report provides a method for determining minimum score by vocational program based on the use of the Wonderlic Scholastic Level Exam (SLE). The SLE has been demonstrated to be a highly accurate and reliable measure of adult cognitive ability. It is currently in use as an admissions test at many career colleges and trade schools. The SLE test…

  9. Source Country Differences in Test Score Gaps: Evidence from Denmark

    ERIC Educational Resources Information Center

    Rangvid, Beatrice Schindler

    2010-01-01

    We combine data from three studies for Denmark in the PISA 2000 framework to investigate differences in the native-immigrant test score gap by country of origin. In addition to the controls available from PISA data sources, we use student-level data on home background and individual migration histories linked from administrative registers. We find…

  10. Study Finds Link between Quality Music Programs, Test Scores

    ERIC Educational Resources Information Center

    Teaching Music, 2007

    2007-01-01

    A recent study found that students in high-quality school music education programs score higher on standardized tests compared to students in schools with deficient music education programs. The study, which was published in the Winter 2006 issue of MENC's Journal for Research in Music Education, is the first to examine the quality of school music…

  11. A Latent Class Approach to Estimating Test-Score Reliability

    ERIC Educational Resources Information Center

    van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas

    2011-01-01

    This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
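
    As background for the single-administration methods named in this abstract, the following is a minimal sketch of Cronbach's alpha only, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). It does not implement the latent class approach the authors propose; the function name and the small data set are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from a list of rows, one row per examinee,
    each row holding that examinee's scores on the k items."""
    k = len(item_scores[0])
    # Sample variance of each item across examinees.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the examinees' total scores.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical right/wrong responses of four examinees to three items.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(data)
print(round(alpha, 3))
```

    With real data the same function applies to polytomous item scores as well, since only variances are used.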

  12. America's Mediocre Test Scores: Education Crisis or Poverty Crisis?

    ERIC Educational Resources Information Center

    Petrilli, Michael J.; Wright, Brandon L.

    2016-01-01

    At a time when the national conversation is focused on lagging upward mobility, it is no surprise that many educators point to poverty as the explanation for mediocre test scores among U.S. students compared to those of students in other countries. If American teachers in struggling U.S. schools taught in Finland, says Finnish educator Pasi…

  13. Student Laptop Use and Scores on Standardized Tests

    ERIC Educational Resources Information Center

    Kposowa, Augustine J.; Valdez, Amanda D.

    2013-01-01

    Objectives: The primary objective of the study was to investigate the relationship between ubiquitous laptop use and academic achievement. It was hypothesized that students with ubiquitous laptops would score on average higher on standardized tests than those without such computers. Methods: Data were obtained from two sources. First, demographic…

  14. The Correlational Relationship between Homeschooling Demographics and High Test Scores.

    ERIC Educational Resources Information Center

    Burns, Johnna

    Homeschooling, one of the fastest growing educational alternatives, is enjoying increasing respect from educators and parents alike. This is partly because homeschooling children score as well and often better on standardized tests than their publicly schooled counterparts. However, the vast majority of homeschooled students come from the…

  15. Between-District Test Score Variation, 2009-2012

    ERIC Educational Resources Information Center

    Fahle, Erin; Reardon, Sean

    2016-01-01

    Describing the variation in test scores between and within school districts is critical for: (1) policy-related and descriptive work that investigates the sorting of students among districts and the differential effectiveness of those districts; and (2) methodological work planning future experiments or interventions. Intraclass…

  16. School Choice in Suburbia: Test Scores, Race, and Housing Markets

    ERIC Educational Resources Information Center

    Dougherty, Jack; Harelson, Jeffrey; Maloney, Laura; Murphy, Drew; Smith, Russell; Snow, Michael; Zannoni, Diane

    2009-01-01

    Home buyers exercise school choice when shopping for a private residence due to its location in a public school district or attendance area. In this quantitative study of one Connecticut suburban district, we measure the effect of elementary school test scores and racial composition on home buyers' willingness to purchase single-family homes over…

  17. What We Lose in Winning the Test Score Race

    ERIC Educational Resources Information Center

    Jorgenson, Olaf

    2012-01-01

    To achieve perpetually better test results each year as mandated by the No Child Left Behind Act (NCLB), teachers in successful schools such as Leroy Anderson Elementary in San Jose, California, will "try anything" to raise scores, as the school's principal stated in an interview with "The San Jose Mercury News." In schools…

  18. Assessment Test Scores of Incoming Students, Fall 2001.

    ERIC Educational Resources Information Center

    Negron, Maggie; Breindel, Matthew

    This assessment of placement test scores in reading, math, and sentence skills from incoming students at College of the Desert (California) shows that students are overwhelmingly underprepared for study at the college. Only 15% of students were prepared in sentence skills, 27% in reading skills, 7% in math skills; only 3% were prepared in all 3…

  19. Commentary on "Validating the Interpretations and Uses of Test Scores"

    ERIC Educational Resources Information Center

    Brennan, Robert L.

    2013-01-01

    Kane's paper "Validating the Interpretations and Uses of Test Scores" is the most complete and clearest discussion yet available of the argument-based approach to validation. At its most basic level, validation as formulated by Kane is fundamentally a simply-stated two-step enterprise: (1) specify the claims inherent in a particular interpretation…

  20. Racial Differences in Mathematics Test Scores for Advanced Mathematics Students

    ERIC Educational Resources Information Center

    Minor, Elizabeth Covay

    2016-01-01

    Research on achievement gaps has found that achievement gaps are larger for students who take advanced mathematics courses compared to students who do not. Focusing on the advanced mathematics student achievement gap, this study found that African American advanced mathematics students have significantly lower test scores and are less likely to be…

  1. Utility of a scoring balloon for a severely calcified lesion: bench test and finite element analysis.

    PubMed

    Kawase, Yoshiaki; Saito, Naritatsu; Watanabe, Shin; Bao, Bingyuan; Yamamoto, Erika; Watanabe, Hiroki; Higami, Hirooki; Matsuo, Hitoshi; Ueno, Katsumi; Kimura, Takeshi

    2014-04-01

    We aimed to investigate the effectiveness of a scoring balloon catheter in expanding a circumferentially calcified lesion compared to a conventional balloon catheter using an in vitro experiment setting and elucidate the underlying mechanisms of this ability using a finite element analysis. True efficacy of the scoring device and the underlying mechanisms for heavily calcified coronary lesions are unclear. We employed a Scoreflex scoring balloon catheter (OrbusNeich, Hong Kong, China). The ability of Scoreflex to dilate a calcified lesion was compared with a conventional balloon catheter using 3 differently sized calcium tubes. The thicknesses of the calcium tubes were 2.0, 2.25, and 2.5 mm. The primary endpoints were the successful induction of cracks in the calcium tubes and the inflation pressures required for inducing cracks. The inflation pressures required for cracking the calcium tubes were consistently lower with Scoreflex (p < 0.05, Student t test). The finite element analysis revealed that the first principal stress applied to the calcified plaque was higher by at least threefold when applying the balloon catheter with scoring elements. A scoring balloon catheter can expand a calcified lesion with lower pressure than that of a conventional balloon. The finite element analysis revealed that the concentration of the stress observed in the outside of the calcified plaque just opposite to the scoring element is the underlying mechanism of the increased ability of Scoreflex to dilate the calcified lesion.

  2. Flow and diffusion of high-stakes test scores

    NASA Astrophysics Data System (ADS)

    Marder, M.; Bansal, D.

    2009-10-01

    We apply visualization and modeling methods for convective and diffusive flows to public school mathematics test scores from Texas. We obtain plots that show the most likely future and past scores of students, the effects of random processes such as guessing, and the rate at which students appear in and disappear from schools. We show that student outcomes depend strongly upon economic class, and identify the grade levels where flows of different groups diverge most strongly. Changing the effectiveness of instruction in one grade naturally leads to strongly nonlinear effects on student outcomes in subsequent grades.

  3. Individual differences in left parietal white matter predict math scores on the Preliminary Scholastic Aptitude Test.

    PubMed

    Matejko, Anna A; Price, Gavin R; Mazzocco, Michèle M M; Ansari, Daniel

    2013-02-01

    Mathematical skills are of critical importance, both academically and in everyday life. Neuroimaging research has primarily focused on the relationship between mathematical skills and functional brain activity. Comparatively few studies have examined which white matter regions support mathematical abilities. The current study uses diffusion tensor imaging (DTI) to test whether individual differences in white matter predict performance on the math subtest of the Preliminary Scholastic Aptitude Test (PSAT). Grades 10 and 11 PSAT scores were obtained from 30 young adults (ages 17-18) with wide-ranging math achievement levels. Tract based spatial statistics was used to examine the correlation between PSAT math scores, fractional anisotropy (FA), radial diffusivity (RD) and axial diffusivity (AD). FA in left parietal white matter was positively correlated with math PSAT scores (specifically in the left superior longitudinal fasciculus, left superior corona radiata, and left corticospinal tract) after controlling for chronological age and same grade PSAT critical reading scores. Furthermore, RD, but not AD, was correlated with PSAT math scores in these white matter microstructures. The negative correlation with RD further suggests that participants with higher PSAT math scores have greater white matter integrity in this region. Individual differences in FA and RD may reflect variability in experience dependent plasticity over the course of learning and development. These results are the first to demonstrate that individual differences in white matter are associated with mathematical abilities on a nationally administered scholastic aptitude measure.

  4. An Approach to Scoring and Equating Tests with Binary Items: Piloting With Large-Scale Assessments

    ERIC Educational Resources Information Center

    Dimitrov, Dimiter M.

    2016-01-01

    This article describes an approach to test scoring, referred to as "delta scoring" (D-scoring), for tests with dichotomously scored items. The D-scoring uses information from item response theory (IRT) calibration to facilitate computations and interpretations in the context of large-scale assessments. The D-score is computed from the…

  5. The validity of self-reported physical fitness test scores.

    PubMed

    Jones, Sarah B; Knapik, Joseph J; Sharp, Marilyn A; Darakjy, Salima; Jones, Bruce H

    2007-02-01

    Epidemiological studies often have to rely on a participant's self-reporting of information. The validity of the self-report instrument is an important consideration in any study. The purpose of this investigation was to determine the validity of self-reported Army Physical Fitness Test (APFT) scores. The APFT is administered to all soldiers in the U.S. Army twice a year and consists of the maximum number of push-ups completed in 2 minutes, the maximum number of sit-ups completed in 2 minutes, and a 2-mile run for time. Army mechanics responded to a questionnaire in March and June 2004 asking them to report the exact scores of each event on their most recent APFT. Actual APFT scores were obtained from the soldier's military unit. The mean +/- standard deviation (SD) of actual and self-reported numbers of push-ups was 61 +/- 14 and 65 +/- 13, respectively. The mean +/- SD of actual and self-reported numbers of sit-ups were 66 +/- 10 and 68 +/- 10, respectively. The mean +/- SD of actual and self-reported run times (minutes) were 14.8 +/- 1.4 and 14.6 +/- 1.4, respectively. Correlations between actual and self-reported push-ups, sit-ups, and run were 0.83, 0.71, and 0.85, respectively. On average, soldiers tended to slightly over-report performance on all APFT events and individual self-reported scores could vary widely from actual scores based on Bland-Altman plots. Despite this, the close correlations between the actual and self-reported scores suggest that self-reported values are adequate for most epidemiological military studies involving larger sample sizes.
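
    The abstract reports both correlations and Bland-Altman agreement, which answer different questions: correlation measures how well the two sets of scores track each other, while Bland-Altman analysis measures how far individual self-reports fall from actual values. The sketch below computes both with the conventional 1.96 standard deviation limits of agreement; the function names and the four score pairs are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman(actual, reported):
    """Return the mean difference (bias) and the 95% limits of agreement
    for paired measurements, using reported minus actual."""
    diffs = [r - a for a, r in zip(actual, reported)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical push-up counts: recorded by the unit vs. self-reported.
actual = [50, 60, 70, 80]
reported = [54, 62, 76, 84]

bias, lower, upper = bland_altman(actual, reported)
r = pearson_r(actual, reported)
print(f"bias={bias:.1f}, limits=({lower:.1f}, {upper:.1f}), r={r:.3f}")
```

    In this toy data the correlation is very high even though every self-report overstates the actual count, which mirrors the abstract's point that close correlation and systematic over-reporting can coexist.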

  6. Comparison between Dichotomous and Polytomous Scoring of Innovative Items in a Large-Scale Computerized Adaptive Test

    ERIC Educational Resources Information Center

    Jiao, Hong; Liu, Junhui; Haynie, Kathleen; Woo, Ada; Gorham, Jerry

    2012-01-01

    This study explored the impact of partial credit scoring of one type of innovative items (multiple-response items) in a computerized adaptive version of a large-scale licensure pretest and operational test settings. The impacts of partial credit scoring on the estimation of the ability parameters and classification decisions in operational test…

  7. Neighborhood Social Context and Individual Polycyclic Aromatic Hydrocarbon Exposures Associated with Child Cognitive Test Scores

    PubMed Central

    Eldred-Skemp, Nicolia; Quinn, James W.; Chang, Hsin-wen; Rauh, Virginia A.; Rundle, Andrew; Orjuela, Manuela A.; Perera, Frederica P.

    2013-01-01

    Childhood cognitive and test-taking abilities have long-term implications for educational achievement and health, and may be influenced by household environmental exposures and neighborhood contexts. This study evaluates whether age 5 scores on the Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R, administered in English) are associated with polycyclic aromatic hydrocarbon (PAH) exposure and neighborhood context variables including poverty, low educational attainment, low English language proficiency, and inadequate plumbing. The Columbia Center for Children’s Environmental Health enrolled African-American and Dominican-American New York City women during pregnancy, and conducted follow-up for subsequent childhood health outcomes including cognitive test scores. Individual outcomes were linked to data characterizing 1-km network buffers around prenatal addresses, home observations, interviews, and prenatal PAH exposure data from personal air monitors. Prenatal PAH exposure above the median predicted 3.5 point lower total WPPSI-R scores and 3.9 point lower verbal scores; the association was similar in magnitude across models with adjustments for neighborhood characteristics. Neighborhood-level low English proficiency was independently associated with 2.3 point lower mean total WPPSI-R score, 1.2 point lower verbal score, and 2.7 point lower performance score per standard deviation. Low neighborhood-level educational attainment was also associated with 2.0 point lower performance scores. In models examining effect modification, neighborhood associations were similar or diminished among the high PAH exposure group, as compared with the low PAH exposure group. Early life exposure to personal PAH exposure or selected neighborhood-level social contexts may predict lower cognitive test scores. However, these results may reflect limited geographic exposure variation and limited generalizability. PMID:24994947

  8. Individual differences in social dominance orientation predict support for the use of cognitive ability tests.

    PubMed

    Kim, Anita; Berry, Christopher M

    2015-02-01

    This study investigates the personality processes involved in the debate surrounding the use of cognitive ability tests in college admissions. In Study 1, 108 undergraduates (mean age = 18.88 years, 60 women, 80 Whites) completed measures of social dominance orientation (SDO), testing self-efficacy, and attitudes regarding the use of cognitive ability tests in college admissions; SAT/ACT scores were collected from the registrar. Sixty-seven undergraduates (mean age = 19.06 years, 39 women, 49 Whites) completed the same measures in Study 2, along with measures of endorsement of commonly presented arguments about test use. In Study 3, 321 American adults (mean age = 35.58 years, 180 women, 251 Whites) completed the same measures used in Study 2; half were provided with facts about race and validity issues surrounding cognitive ability tests. Individual differences in SDO significantly predicted support for the use of cognitive ability tests in all samples, after controlling for SAT/ACT scores and test self-efficacy and also among participants who read facts about cognitive ability tests. Moreover, arguments for and against test use mediated this effect. The present study sheds new light on an old debate by demonstrating that individual differences in beliefs about hierarchy play a key role in attitudes toward cognitive ability test use.

  9. Comparison of the Ability to Predict Mortality between the Injury Severity Score and the New Injury Severity Score: A Meta-Analysis

    PubMed Central

    Deng, Qiangyu; Tang, Bihan; Xue, Chen; Liu, Yuan; Liu, Xu; Lv, Yipeng; Zhang, Lulu

    2016-01-01

    Background: Description of the anatomical severity of injuries in trauma patients is important. While the Injury Severity Score has been regarded as the “gold standard” since its creation, several studies have indicated that the New Injury Severity Score is better. Therefore, we aimed to systematically evaluate and compare the accuracy of the Injury Severity Score and the New Injury Severity Score in predicting mortality. Methods: Two researchers independently searched the PubMed, Embase, and Web of Science databases and included studies from which the exact number of true-positive, false-positive, false-negative, and true-negative results could be extracted. Quality was assessed using the Quality Assessment of Diagnostic Accuracy Studies checklist criteria. The meta-analysis was performed using Meta-DiSc. Meta-regression, subgroup analyses, and sensitivity analyses were conducted to determine the source(s) of heterogeneity and factor(s) affecting the accuracy of the New Injury Severity Score and the Injury Severity Score in predicting mortality. Results: The heterogeneity of the 11 relevant studies (total n = 11,866) was high (I² > 80%). The meta-analysis using a random-effects model resulted in sensitivity of 0.64, specificity of 0.93, positive likelihood ratio of 5.11, negative likelihood ratio of 0.27, diagnostic odds ratio of 27.75, and area under the summary receiver operating characteristic curve of 0.9009 for the Injury Severity Score; and sensitivity of 0.71, specificity of 0.87, positive likelihood ratio of 5.22, negative likelihood ratio of 0.20, diagnostic odds ratio of 24.74, and area under the summary receiver operating characteristic curve of 0.9095 for the New Injury Severity Score. Conclusion: The New Injury Severity Score and the Injury Severity Score have similar abilities in predicting mortality. Further research is required to determine the appropriate use of the Injury Severity Score or the New Injury Severity Score based on specific
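
    To illustrate how the accuracy measures named in this abstract are defined, the following is a minimal Python sketch that derives them from a single 2x2 table. The function name and the counts are hypothetical and are not taken from the meta-analysis. Pooled random-effects estimates, such as those reported above, are each pooled separately and so need not satisfy these single-table identities exactly.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures for one 2x2 table (assumes no cell count is zero)."""
    sensitivity = tp / (tp + fn)          # deaths correctly flagged
    specificity = tn / (tn + fp)          # survivors correctly cleared
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    dor = lr_pos / lr_neg                 # equals (tp * tn) / (fp * fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }


# Hypothetical counts chosen only to make the arithmetic easy to follow.
metrics = diagnostic_metrics(tp=64, fp=70, fn=36, tn=930)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    With these made-up counts the sensitivity is 0.64 and the specificity is 0.93; the diagnostic odds ratio is the ratio of the two likelihood ratios.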

  10. Correlation of the Scores on Barron's Ego Strength Scale with the Scores on the Bender-Gestalt Test.

    ERIC Educational Resources Information Center

    Martin, John D.; And Others

    1979-01-01

    The degree of relationship between scores on the Barron Ego Strength Scale and the scores on the Bender-Gestalt Test was investigated on a sample of college students. Correlations were moderate to low. Racial differences were observed on the Bender-Gestalt Test. (Author/JKS)

  11. Evaluation of functional ability of rheumatoid arthritis based on HAQ score and BMD among South Indian patients.

    PubMed

    Snekhalatha, U; Anburajan, M

    2012-07-01

    The aim of this study was to analyze functional ability in South Indian male and female patients with rheumatoid arthritis (RA) based on health assessment questionnaire (HAQ) score and forearm ulna bone mineral density (U-BMD) measured by peripheral DXA, and to investigate the correlation between forearm U-BMD and HAQ score among RA patients. Sixty-four patients with RA and 64 age- and sex-matched healthy controls were included in this study. The HAQ was self-administered by each RA patient. Bone mineral density (BMD) in the forearm ulna region was measured using peripheral dual-energy X-ray absorptiometry (osteometer model-DTX200 Meditech.Inc, Hawthorn, California, USA) for both RA patients and healthy controls. Of the 64 RA patients, 46 (72%) were women and 18 (28%) were men. The mean age was 47.75 ± 11.37 years, and a majority of the patients were in the age group of 30-75 years. The mean age of healthy controls was 46.42 ± 10.67 years. Among men, U-BMD was significantly lower in RA patients than in healthy controls (0.371 ± 0.05 vs. 0.413 ± 0.05 g/cm² [mean ± SD], P = 0.03). Among women, the difference was highly significant (0.300 ± 0.132 vs. 0.376 ± 0.05 g/cm², P = 0.0006). HAQ score increased as U-BMD decreased: Pearson correlation analysis revealed that U-BMD was negatively correlated with HAQ score (r = -0.732, P < 0.0001). Forearm U-BMD in RA patients is significantly lower than in healthy controls for both male and female patients, and HAQ score is negatively correlated with P-DXA forearm U-BMD.

  12. The Performance of the Upper Limb scores correlate with pulmonary function test measures and Egen Klassifikation scores in Duchenne muscular dystrophy.

    PubMed

    Lee, Ha Neul; Sawnani, Hemant; Horn, Paul S; Rybalsky, Irina; Relucio, Lani; Wong, Brenda L

    2016-01-01

    The Performance of the Upper Limb scale was developed as an outcome measure specifically for ambulant and non-ambulant patients with Duchenne muscular dystrophy and is implemented in clinical trials needing longitudinal data. The aim of this study is to determine whether this novel tool correlates with functional ability using pulmonary function test, cardiac function test and Egen Klassifikation scale scores as clinical measures. In this cross-sectional study, 43 non-ambulatory Duchenne males from ages 10 to 30 years and on long-term glucocorticoid treatment were enrolled. Cardiac and pulmonary function test results were analyzed to assess cardiopulmonary function, and Egen Klassifikation scores were analyzed to assess functional ability. The Performance of the Upper Limb scores correlated with pulmonary function measures and had inverse correlation with Egen Klassifikation scores. There was no correlation with left ventricular ejection fraction and left ventricular dysfunction. Body mass index and decreased joint range of motion affected total Performance of the Upper Limb scores and should be considered in clinical trial designs.

  13. Computerized Scoring Algorithms for the Autobiographical Memory Test.

    PubMed

    Takano, Keisuke; Gutenbrunner, Charlotte; Martens, Kris; Salmon, Karen; Raes, Filip

    2017-04-03

    Reduced specificity of autobiographical memories is a hallmark of depressive cognition. Autobiographical memory (AM) specificity is typically measured by the Autobiographical Memory Test (AMT), in which respondents are asked to describe personal memories in response to emotional cue words. Due to this free descriptive responding format, the AMT relies on experts' hand scoring for subsequent statistical analyses. This manual coding potentially impedes research activities in big data analytics such as large epidemiological studies. Here, we propose computerized algorithms to automatically score AM specificity for the Dutch (adult participants) and English (youth participants) versions of the AMT by using natural language processing and machine learning techniques. The algorithms performed reliably in discriminating specific from nonspecific (e.g., overgeneralized) autobiographical memories in independent testing data sets (area under the receiver operating characteristic curve > .90). Furthermore, outcome values of the algorithms (i.e., decision values of support vector machines) showed a gradient across similar (e.g., specific and extended memories) and different (e.g., specific memory and semantic associates) categories of AMT responses, suggesting that, for both adults and youth, the algorithms capture well the extent to which a memory has features of specific memories.
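
    The abstract evaluates its classifiers by the area under the ROC curve computed from decision values. As a rough illustration of that metric only (not the authors' pipeline), the sketch below computes the AUC as the probability that a randomly chosen specific memory receives a higher decision value than a randomly chosen nonspecific one, counting ties as one half. The function name, scores, and labels are invented for the example.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 = specific memory, 0 = nonspecific memory (hypothetical coding).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


# Invented decision values, e.g. signed distances from an SVM hyperplane.
decision_values = [2.1, 0.4, -0.3, 1.2, -1.5, 0.1]
is_specific = [1, 0, 0, 1, 0, 1]
print(round(roc_auc(decision_values, is_specific), 3))
```

    The pairwise form is quadratic in the number of responses; for large data sets a rank-based computation gives the same value more efficiently.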

  14. Assembling Tests for the Measurement of Multiple Abilities.

    ERIC Educational Resources Information Center

    van der Linden, Wim J.

    It is proposed that the assembly of tests for the measurement of multiple abilities be based on targets for the (asymptotic) variance functions of the estimators in each of the abilities. A linear programming model is presented that can be used to computerize the assembly process. Several cases of test assembly dealing with multidimensional…

  15. Basketball ability testing and category for players with mental retardation: 8-month training effect.

    PubMed

    Franciosi, Emanuele; Gallotta, Maria Chiara; Baldari, Carlo; Emerenziani, Gian Pietro; Guidetti, Laura

    2012-06-01

    Although sport for athletes with mental retardation (MR) is achieving an important role, the literature concerning basketball tests and training is still poor. The aims of this study were to verify whether the basketball test battery could be an appropriate modality to classify the players in the Promotion (Pro) category, to assess basketball abilities before (PRE) and after (POST) an 8-month training in players with MR in relation to Competitive (Comp) and Pro categories, to analyze the variation of specific basketball abilities based on subjects' MR diagnosis. Forty-one male basketball players with MR (17 Comp and 24 Pro; age range 18-45 years; MR: 15% mild, 54% moderate, 29% severe, and 2% profound) were assessed PRE and POST training through the basketball test battery, which assessed 4 ability levels of increasing difficulty (from I to IV), each one characterized by the analysis of fundamental areas (ball handling, reception, passing, and shooting). Level I was significantly changed after the intervention period regardless of the Category, whereas shooting was affected by the interaction between Category and Intervention. The results showed significant differences between categories in the scores of individual global, level I, level II, level III, and in all fundamental areas. Individual global score in both categories significantly increased. The players of Comp significantly improved in level III, in ball handling, reception, passing, and shooting scores. The players of Pro improved significantly in level II, in ball handling, reception, and passing scores. Individual global, ability levels I-III, and fundamental area scores were negatively correlated to the MR level indicating that the players with a lower MR obtained higher ability scores. In conclusion, it was found that the basketball test battery could be useful for improving and monitoring training in both Comp and Pro players.

  16. Concurrent and Predictive Validity of the Raven Progressive Matrices and the Naglieri Nonverbal Ability Test

    ERIC Educational Resources Information Center

    Balboni, Giulia; Naglieri, Jack A.; Cubelli, Roberto

    2010-01-01

    The concurrent and predictive validities of the Naglieri Nonverbal Ability Test (NNAT) and Raven's Colored Progressive Matrices (CPM) were investigated in a large group of Italian third-and fifth-grade students with different sociocultural levels evaluated at the beginning and end of the school year. CPM and NNAT scores were related to math and…

  17. Normal test scores in the Farnsworth-Munsell 100 hue test.

    PubMed

    Mantyjarvi, M

    2001-01-01

    One hundred and sixty persons aged from 10 to 69 years (106 women, 54 men) with healthy eyes were studied with the Farnsworth-Munsell 100 hue (FM100) test. The means of the total scores and of the individual box scores were calculated for the right and left eyes. The total score was also calculated separately for women and men. The test was administered under the illumination of a Macbeth Easel lamp, 1000 lux, and the right eye was tested first. The results were calculated in six different age groups, 10-19 years, 20-29 years, etc. The mean of the total scores in the right eye varied from 7.44 ± 2.46 (SD) to 10.07 ± 2.03 in different age groups and in the left eye from 7.56 ± 2.36 to 10.16 ± 2.68. The scores changed significantly with age: linear regression of test score on age gave significant results in the right eye (R = 0.308, P = 0.0001) and in the left eye (R = 0.246, P = 0.0021). By providing normal error scores for the FM100 test and its individual boxes in persons aged 10-69 years, the present study allows clinicians working with colour vision defects to judge whether their patients' results are normal or abnormal.

  18. An Investigation into the Relationships Between Cloze Test Scores and Informal Reading Inventory Scores of Fifth Grade Pupils.

    ERIC Educational Resources Information Center

    Walter, Richard Barry

    This study investigated the relationship between instructional level scores as determined by a cloze test and instructional level scores as determined by an informal reading inventory (IRI). Fifty male and 50 female subjects were randomly selected from the total fifth grade population of five schools chosen from a total of 22 midwestern elementary…

  19. Application of new WAIS-III/WMS-III discrepancy scores for evaluating memory functioning: relationship between intellectual and memory ability.

    PubMed

    Lange, Rael T; Chelune, Gordon J

    2006-05-01

    Analysis of the discrepancy between memory and intellectual ability has received some support as a means for evaluating memory impairment. Recently, comprehensive base rate tables for General Ability Index (GAI) minus memory discrepancy scores (i.e., GAI-memory) were developed using the WAIS-III/WMS-III standardization sample (Lange, Chelune, & Tulsky, in press). The purpose of this study was to evaluate the clinical utility of GAI-memory discrepancy scores to identify memory impairment in 34 patients with Alzheimer's type dementia (DAT) versus a sample of 34 demographically matched healthy participants. On average, patients with DAT obtained significantly lower scores on all WAIS-III and WMS-III indexes and had larger GAI-memory discrepancy scores. Clinical outcome analyses revealed that GAI-memory scores were useful at identifying memory impairment in patients with DAT versus matched healthy participants. However, GAI-memory discrepancy scores failed to provide unique interpretive information beyond that which is gained from the memory indexes alone. Implications and future research directions are discussed.

  20. A method for increasing the scoring efficiency of the Farnsworth-Munsell 100-Hue test.

    PubMed

    Craven, B J

    1997-03-01

    This paper describes a method for scoring the Farnsworth-Munsell 100-Hue test, based on maximum-likelihood estimation, which in theory reduces test-to-test variability in scores and which is therefore better able to discriminate between different levels of overall colour discrimination than is the original Farnsworth scoring system. Error scores produced by the method are directly comparable to error scores produced by the traditional scoring system. It is hoped that this work will provoke further consideration of the efficiency of the scoring system as far as test-to-test variability is concerned, including the efficient detection of polarity in the subject's hue discrimination function.

  1. The Visual Aural Digit Span Test and Bender Gestalt Test as Predictors of Wide Range Achievement Test-Revised Scores.

    ERIC Educational Resources Information Center

    Smith, Teresa C.; Smith, Billy L.

    1988-01-01

    Examined Visual Aural Digit Span Test (VADS) and Bender-Gestalt (BG) scores as predictors of Wide Range Achievement Test-Revised (WRAT-R) scores among 115 elementary school students referred for low academic achievement. Divided children into three age groups. Results suggest BG and VADS Test can be effective screening devices for young children…

  2. Ability of Lower-Extremity Injury Severity Scores to Predict Functional Outcome After Limb Salvage

    PubMed Central

    Ly, Thuan V.; Travison, Thomas G.; Castillo, Renan C.; Bosse, Michael J.; MacKenzie, Ellen J.

    2008-01-01

    Background: Lower-extremity injury severity scoring systems were developed to assist surgeons in decision-making regarding whether to amputate or perform limb salvage after high-energy trauma to the lower extremity. These scoring systems have been shown to not be good predictors of limb amputation or salvage. This study was performed to evaluate the clinical utility of the five commonly used lower-extremity injury severity scoring systems as predictors of final functional outcome. Methods: We analyzed data from a cohort of patients who participated in a multicenter prospective study of clinical and functional outcomes after high-energy lower-extremity trauma. Injury severity was assessed with use of the Mangled Extremity Severity Score; the Limb Salvage Index; the Predictive Salvage Index; the Nerve Injury, Ischemia, Soft-Tissue Injury, Skeletal Injury, Shock, and Age of Patient Score; and the Hannover Fracture Scale-98. Functional outcomes were measured with use of the physical and psychosocial domains of the Sickness Impact Profile at both six months and two years following hospital discharge. Four hundred and seven subjects for whom the reconstruction regimen was considered successful at six months were included in the analysis. We used partial correlation statistics and multiple linear regression models to quantify the association between injury severity scores and Sickness Impact Profile outcomes with the subjects' ages held constant. Results: The mean age of the patients was thirty-six years (interquartile range, twenty-six to forty-four years); 75.2% were male and 24.8% were female. The median Sickness Impact Profile scores were 15.2 and 6.0 points at six and twenty-four months, respectively. The analysis showed that none of the scoring systems were predictive of the Sickness Impact Profile outcomes at six or twenty-four months to any reasonable degree. Likewise, none were predictive of patient recovery between six and twenty-four months postoperatively as

  3. Score Gains on "g"-Loaded Tests: No "g"

    ERIC Educational Resources Information Center

    te Nijenhuis, Jan; van Vianen, Annelies E. M.; van der Flier, Henk

    2007-01-01

    IQ scores provide the best general predictor of success in education, job training, and work. However, there are many ways in which IQ scores can be increased, for instance by means of retesting or participation in learning potential training programs. What is the nature of these score gains? Jensen [Jensen, A. R. (1998a). "The g factor: The…

  4. Empirical Bayes Estimates of Domain Scores under Binomial and Hypergeometric Distributions for Test Scores.

    ERIC Educational Resources Information Center

    Lin, Miao-Hsiang; Hsiung, Chao A.

    1994-01-01

    Two simple empirical approximate Bayes estimators are introduced for estimating domain scores under binomial and hypergeometric distributions respectively. Criteria are established regarding use of these functions over maximum likelihood estimation counterparts. (SLD)

  5. [Internal structure and standardised scores of the Torrance Test of Creative Thinking].

    PubMed

    Ferrando, Mercedes; Ferrándiz, Carmen; Bermejo, María R; Sánchez, Cristina; Parra, Joaquín; Prieto, María D

    2007-08-01

    The present work sets out to study the internal structure of the Torrance Test of Creative Thinking (TTCT) and to establish standardised scores that will enable the test to be used in both a diagnostic and educational context. 649 students (319 girls and 330 boys), aged 5 to 12 years from various schools in Murcia and Alicante (SE Spain), took part in the study. The findings suggest that the psychometric characteristics of TTCT are satisfactory, and its internal structure can be attributed to three factors that are responsible for a high percentage of the variance (73.8%). The standardised score tables, which are provided for first time in this context, will be useful in the evaluation of creativity and the identification of students with high intellectual abilities.

  6. Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions

    ERIC Educational Resources Information Center

    Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J.

    2010-01-01

    Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…

  7. A new scoring system for the Spraings Multiple Choice Bender Gestalt Test.

    PubMed

    Friedman, A F; Wakefield, J A; Sasek, J; Schroeder, D

    1977-01-01

    A new scoring procedure to be used with Spraings' technique for administering the Bender-Gestalt test in a multiple choice format is presented. Scoring weights are used instead of simply scoring each item right or wrong. The evidence presented suggests that this method of scoring would increase the value of Spraings' test in the diagnosis of perceptual deficits.

  8. Estimating the Consistency and Accuracy of Classifications Based on Test Scores.

    ERIC Educational Resources Information Center

    Livingston, Samuel A.; Lewis, Charles

    This paper presents a method for estimating the accuracy and consistency of classifications based on test scores. The scores can be produced by any scoring method, including the formation of a weighted composite. The estimates use data from a single form. The reliability of the score is used to estimate its effective test length in terms of…

  9. The effects of calculator-based laboratories on standardized test scores

    NASA Astrophysics Data System (ADS)

    Stevens, Charlotte Bethany Rains

    Nationwide, the goal of providing a productive science and math education to our youth in today's educational institutions increasingly centers on the technology used in these classrooms. In this age of digital technology, educational software and calculator-based laboratories (CBL) have become significant devices in the teaching of science and math in many states across the United States. The Texas Instruments graphing calculator and the Vernier Labpro interface are among the calculator-based laboratory tools becoming increasingly popular among middle and high school science and math teachers in many school districts across this country. In Tennessee, however, it is reported that this type of technology is not regularly utilized at the student level in most high school science classrooms, especially in the area of Physical Science (Vernier, 2006). This research explored the effect of calculator-based laboratory instruction on standardized test scores. The purpose of this study was to determine the effect of traditional teaching methods versus graphing calculator teaching methods on the state-mandated End-of-Course (EOC) Physical Science exam based on ability, gender, and ethnicity. The sample included 187 total tenth and eleventh grade physical science students, 101 of which belonged to a control group and 87 of which belonged to the experimental group. Physical Science End-of-Course scores obtained from the Tennessee Department of Education during the spring of 2005 and the spring of 2006 were used to examine the hypotheses. The findings of this research study suggested that the type of teaching method, traditional or calculator-based, did not have an effect on standardized test scores. However, the students' ability level, as demonstrated on the End-of-Course test, had a significant effect on End-of-Course test scores. This study focused on a limited population of high school physical science students in the middle Tennessee

  10. The Predictive Ability of IQ and Working Memory Scores in Literacy in an Adult Population

    ERIC Educational Resources Information Center

    Alloway, Tracy Packiam; Gregory, David

    2013-01-01

    Literacy problems are highly prevalent and can persist into adulthood. Yet, the majority of research on the predictive nature of cognitive skills to literacy has primarily focused on development and adolescent populations. The aim of the present study was to extend existing research to investigate the roles of IQ scores and Working Memory…

  11. An examination of psychometric bias due to retesting on cognitive ability tests in selection settings.

    PubMed

    Lievens, Filip; Reeve, Charlie L; Heggestad, Eric D

    2007-11-01

    Using a latent variable approach, the authors examined whether retesting on a cognitive ability measure resulted in measurement and predictive bias. A sample of 941 candidates completed a cognitive ability test in a high-stakes context. Results of both the within-group between-occasions comparison and the between-groups within-occasion comparison indicated that no measurement bias existed during the initial testing but that retesting induced both measurement and predictive bias. Specifically, the results suggest that the factor underlying the retest scores was less saturated with g and more associated with memory than the latent factor underlying initial test scores and that these changes eliminated the test's criterion-related validity. This study's implications for retesting theory, practice, and research are discussed.

  12. The Contributions of Memory and Vocabulary to Non-Verbal Ability Scores in Adolescents with Intellectual Disability

    PubMed Central

    Mungkhetklang, Chantanee; Bavin, Edith L.; Crewther, Sheila G.; Goharpey, Nahal; Parsons, Carl

    2016-01-01

    It is usually assumed that performance on non-verbal intelligence tests reflects visual cognitive processing and that aspects of working memory (WM) will be involved. However, the unique contribution of memory to non-verbal scores is not clear, nor is the unique contribution of vocabulary. Thus, we aimed to investigate these contributions. Non-verbal test scores for 17 individuals with intellectual disability (ID) and 39 children with typical development (TD) of similar mental age were compared to determine the unique contribution of visual and verbal short-term memory (STM) and WM and the additional variance contributed by vocabulary scores. No significant group differences were found in the non-verbal test scores or receptive vocabulary scores, but there was a significant difference in expressive vocabulary. Regression analyses indicate that for the TD group STM and WM (both visual and verbal) contributed similar variance to the non-verbal scores. For the ID group, visual STM and verbal WM contributed most of the variance to the non-verbal test scores. The addition of vocabulary scores to the model contributed greater variance for both groups. More unique variance was contributed by vocabulary than memory for the TD group, whereas for the ID group memory contributed more than vocabulary. Visual and auditory memory and vocabulary contributed significantly to solving visual non-verbal problems for both the TD group and the ID group. However, for each group, there were different weightings of these variables. Our findings indicate that for individuals with TD, vocabulary is the major factor in solving non-verbal problems, not memory, whereas for adolescents with ID, visual STM, and verbal WM are more influential than vocabulary, suggesting different pathways to achieve solutions to non-verbal problems. PMID:28082922

  13. Volatility in School Test Scores: Implications for Test-Based Accountability Systems

    ERIC Educational Resources Information Center

    Kane, Thomas J.; Staiger, Douglas O.

    2002-01-01

    By the spring of 2000, forty states had begun using student test scores to rate school performance. Twenty states have gone a step further and are attaching explicit monetary rewards or sanctions to a school's test performance. In this paper, the authors focus on accountability programs in which states measure the effectiveness of individual…

  14. A Quick Assessment of Visuospatial Abilities in Adolescents Using the Design Organization Test (DOT).

    PubMed

    Burggraaf, Rudolf; Frens, Maarten A; Hooge, Ignace T C; van der Geest, Jos N

    2016-01-01

    Tests measuring visuospatial abilities have shown that these abilities increase during adolescence. Unfortunately, the Block Design test and other such tests are complicated and time-consuming to administer, making them unsuitable for use with large groups of restless adolescents. The results of the Design Organization Test (DOT), a quick pen-and-paper test, have been shown to correlate with those of the Block Design test. A group of 198 healthy adolescents (110 male and 88 female) aged 12 to 19 years old participated in this study. A slightly modified version of the DOT has been used in which we shortened the administration time to avoid a ceiling effect in the score. Scores show a linear increase with age (on average 2.0 points per year, r = .61) independent of sex. Scores did not differ between individual setting and group setting. Thus, the DOT is a simple and effective way to assess visuospatial ability in large groups, such as in schools, and it can be easily administered year after year to follow the development of students.

  15. Variability of Test Scores and the Split-Half Reliability Coefficient

    ERIC Educational Resources Information Center

    Zimmerman, Donald W.

    1970-01-01

    Results of this study indicate that the correlation between half-test scores over repeated splits, over persons, and over repeated testings resulting in different sets of observed scores, is given by Kuder-Richardson Formula 21. (RF)
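
    For readers unfamiliar with Kuder-Richardson Formula 21, the sketch below shows the standard textbook form of the coefficient, which needs only the number of items, the mean total score, and the variance of total scores. It is an illustration of the formula, not a reproduction of the study's derivation; the function name and sample scores are hypothetical, and the population variance is assumed.

```python
from statistics import mean, pvariance


def kr21(total_scores, n_items):
    """Kuder-Richardson Formula 21 reliability estimate.

    total_scores: number-correct scores, one per examinee.
    n_items: number of dichotomously scored items on the test.
    """
    m = mean(total_scores)
    var = pvariance(total_scores)   # population variance (an assumption)
    if n_items < 2 or var == 0:
        raise ValueError("need at least 2 items and nonzero score variance")
    k = n_items
    return (k / (k - 1)) * (1 - (m * (k - m)) / (k * var))


# Hypothetical number-correct scores on a 20-item test.
scores = [4, 6, 8, 10, 12]
print(round(kr21(scores, n_items=20), 3))
```

    KR-21 treats all items as equally difficult, so it is generally a lower bound relative to KR-20 when item difficulties vary.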

  16. Test/score/report: Simulation techniques for automating the test process

    NASA Technical Reports Server (NTRS)

    Hageman, Barbara H.; Sigman, Clayton B.; Koslosky, John T.

    1994-01-01

    A Test/Score/Report capability is currently being developed for the Transportable Payload Operations Control Center (TPOCC) Advanced Spacecraft Simulator (TASS) system which will automate testing of the Goddard Space Flight Center (GSFC) Payload Operations Control Center (POCC) and Mission Operations Center (MOC) software in three areas: telemetry decommutation, spacecraft command processing, and spacecraft memory load and dump processing. Automated computer control of the acceptance test process is one of the primary goals of a test team. With the proper simulation tools and user interface, acceptance testing, regression testing, and repeating specific test procedures for a ground data system can become simpler tasks. Ideally, the goal for complete automation would be to plug the operational deliverable into the simulator, press the start button, execute the test procedure, accumulate and analyze the data, score the results, and report the results to the test team along with a go/no-go recommendation. In practice, this may not be possible because of inadequate test tools, pressures of schedules, limited resources, etc. Most tests are accomplished using a certain degree of automation and test procedures that are labor intensive. This paper discusses some simulation techniques that can improve the automation of the test process. The TASS system tests the POCC/MOC software and provides a score based on the test results. The TASS system displays statistics on the success of the POCC/MOC system processing in each of the three areas as well as event messages pertaining to the Test/Score/Report processing. The TASS system also provides formatted reports documenting each step performed during the tests and the results of each step. A prototype of the Test/Score/Report capability is available and currently being used to test some POCC/MOC software deliveries. When this capability is fully operational it should greatly reduce the time necessary

  17. The ability of video image analysis to predict lean meat yield and EUROP score of lamb carcasses.

    PubMed

    Einarsson, E; Eythórsdóttir, E; Smith, C R; Jónmundsson, J V

    2014-07-01

    A total of 862 lamb carcasses that were evaluated by both the VIAscan® and the current EUROP classification system were deboned and the actual yield was measured. Models were derived for predicting lean meat yield of the legs (Leg%), loin (Loin%) and shoulder (Shldr%) using the best VIAscan® variables selected by stepwise regression analysis of a calibration data set (n=603). The equations were tested on a validation data set (n=259). The results showed that the VIAscan® predicted lean meat yield in the leg, loin and shoulder with an R² of 0.60, 0.31 and 0.47, respectively, whereas the current EUROP system predicted lean yield with an R² of 0.57, 0.32 and 0.37, respectively, for the three carcass parts. The VIAscan® also predicted the EUROP score of the trial carcasses, using a model derived from an earlier trial. The EUROP classification from VIAscan® and the current system were compared for their ability to explain the variation in lean yield of the whole carcass (LMY%) and trimmed fat (FAT%). The predicted EUROP scores from the VIAscan® explained 36% of the variation in LMY% and 60% of the variation in FAT%, compared with the current EUROP system that explained 49% and 72%, respectively. The EUROP classification obtained by the VIAscan® was tested against a panel of three expert classifiers (n=696). The VIAscan® classification agreed with 82% of conformation and 73% of the fat classes assigned by a panel of expert classifiers. It was concluded that VIAscan® provides a technology that can directly predict LMY% of lamb carcasses with more accuracy than the current EUROP classification system. The VIAscan® is also capable of classifying lamb carcasses into EUROP classes with an accuracy that fulfils minimum demands for the Icelandic sheep industry. Although the VIAscan® prediction of the Loin% is low, it is comparable to the current EUROP system, and should not hinder the adoption of the technology to estimate the yield of Icelandic lambs as it delivered

  18. Using Subpopulation Invariance to Assess Test Score Equity

    ERIC Educational Resources Information Center

    Dorans, Neil J.

    2004-01-01

    Score equity assessment (SEA) is introduced, and placed within a fair assessment context that includes differential prediction or fair selection and differential item functioning. The notion of subpopulation invariance of linking functions is central to the assessment of score equity, just as it has been for differential item functioning and…

  19. Teachers See What Ability Scores Cannot: Predicting Student Performance with Challenging Mathematics

    ERIC Educational Resources Information Center

    Foreman, Jennifer L.; Gubbins, E. Jean

    2015-01-01

    Teacher nominations of students are commonly used in gifted and talented identification systems to supplement psychometric measures of reasoning ability. In this study, second grade teachers were requested to nominate approximately one fourth of their students as having high learning potential in the year prior to the students' participation in a…

  20. Interpreting the g loadings of intelligence test composite scores in light of Spearman's law of diminishing returns.

    PubMed

    Reynolds, Matthew R

    2013-03-01

    The linear loadings of intelligence test composite scores on a general factor (g) have been investigated recently in factor analytic studies. Spearman's law of diminishing returns (SLODR), however, implies that the g loadings of test scores likely decrease in magnitude as g increases, or they are nonlinear. The purpose of this study was to (a) investigate whether the g loadings of composite scores from the Differential Ability Scales (2nd ed.) (DAS-II, C. D. Elliott, 2007a, Differential Ability Scales (2nd ed.). San Antonio, TX: Pearson) were nonlinear and (b) if they were nonlinear, to compare them with linear g loadings to demonstrate how SLODR alters the interpretation of these loadings. Linear and nonlinear confirmatory factor analysis (CFA) models were used to model Nonverbal Reasoning, Verbal Ability, Visual Spatial Ability, Working Memory, and Processing Speed composite scores in four age groups (5-6, 7-8, 9-13, and 14-17) from the DAS-II norming sample. The nonlinear CFA models provided better fit to the data than did the linear models. In support of SLODR, estimates obtained from the nonlinear CFAs indicated that g loadings decreased as g level increased. The nonlinear portion for the nonverbal reasoning loading, however, was not statistically significant across the age groups. Knowledge of general ability level informs composite score interpretation because g is less likely to produce differences, or is measured less, in those scores at higher g levels. One implication is that it may be more important to examine the pattern of specific abilities at higher general ability levels.

  1. New Testing Methods to Assess Technical Problem-Solving Ability.

    ERIC Educational Resources Information Center

    Hambleton, Ronald K.; And Others

    Tests to assess problem-solving ability being provided for the Air Force are described, and some details on the development and validation of these computer-administered diagnostic achievement tests are discussed. Three measurement approaches were employed: (1) sequential problem solving; (2) context-free assessment of fundamental skills and…

  2. Measuring Writing Ability with the Cloze Test is not Closed.

    ERIC Educational Resources Information Center

    Esau, Helmut; Yost, Carlson

    This paper describes an experiment that was undertaken to examine the usefulness of the cloze test as an objective measure of a native speaker's writing ability. A modified version of the cloze test used by Oller and others to measure integrative language skills in non-native speakers was given to 100 freshman English students. The test…

  3. New scores for the Category Test: measures of interference for subtests 5 and 6.

    PubMed

    Webster, Jeffrey S; Lopez, Michael N

    2006-12-01

    The Category Test is a well-known neuropsychological instrument used to assess concept formation and higher executive abilities. The present study investigated the utility of additional scores for the Category Test. We used principles developed in cognitive psychology to create several new measures for subtests 5 and 6 of this test. These scores were primarily designed to be sensitive to interference effects of learning decision rules from subtest 2, subtest 3, and subtest 4. The new scores as well as the total error scores from subtests 5 and 6 were used to discriminate subjects with documented brain injury from subjects who were neurologically normal based on neuroimaging and neurologic evaluation. The Category Test was given following Reitan's (1979) instructions, with the exception that no additional prompting was given to participants who struggled early with the test in order to reduce the "executive" guidance of the examiner. Because any "interference" from earlier subtests on performance of subtest 5 and subtest 6 should be related to mastery of these earlier subtests, the normal group was matched to the brain-impaired group on which subtest(s) they learned. This resulted in four learning groups: (a) learned subtests 3 and 4; (b) learned subtest 4 but not 3; (c) learned subtest 3 but not 4; and (d) failed to learn either subtest. ANOVA analyses revealed that the three measures of interference were significantly greater in the brain-damaged group than in the normal controls. Also, specific interference measures were related to specific prior subtest mastery, thus providing support for a proactive interference effect. In addition, we have evidence that our new measures may be selectively sensitive to frontal system dysfunction.

  4. Evidence-Based Decision about Test Scoring Rules in Clinical Anatomy Multiple-Choice Examinations

    ERIC Educational Resources Information Center

    Severo, Milton; Gaio, A. Rita; Povo, Ana; Silva-Pereira, Fernanda; Ferreira, Maria Amélia

    2015-01-01

    In theory the formula scoring methods increase the reliability of multiple-choice tests in comparison with number-right scoring. This study aimed to evaluate the impact of the formula scoring method in clinical anatomy multiple-choice examinations, and to compare it with that from the number-right scoring method, hoping to achieve an…
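
    For readers unfamiliar with the two scoring rules being compared, the following is a minimal sketch. It assumes the classical correction-for-guessing rule, score = R - W/(k - 1), where R is the number of right answers, W the number of wrong answers, and k the number of options per item. The truncated abstract does not say which formula variant the study used, so the function names and the example data are purely illustrative.

```python
def number_right_score(responses, key):
    """Number-right scoring: one point per correct answer.

    Omitted items are represented by None and earn nothing.
    """
    return sum(1 for r, k in zip(responses, key) if r is not None and r == k)


def formula_score(responses, key, n_options):
    """Classical formula scoring (assumed variant): R - W / (k - 1).

    Wrong answers are penalised; omitted items (None) are not.
    """
    right = number_right_score(responses, key)
    wrong = sum(1 for r, k in zip(responses, key) if r is not None and r != k)
    return right - wrong / (n_options - 1)


# Hypothetical five-item test with five options per item:
# three right, one wrong, one omitted.
key = ["A", "B", "C", "D", "A"]
responses = ["A", "B", "D", None, "A"]

print(number_right_score(responses, key))           # 3
print(formula_score(responses, key, n_options=5))   # 2.75
```

    Under this rule a candidate who guesses blindly gains nothing on average, which is why the two methods can rank the same examinees differently.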

  5. A Diet Score Assessing Norwegian Adolescents’ Adherence to Dietary Recommendations—Development and Test-Retest Reproducibility of the Score

    PubMed Central

    Handeland, Katina; Kjellevold, Marian; Wik Markhus, Maria; Eide Graff, Ingvild; Frøyland, Livar; Lie, Øyvind; Skotheim, Siv; Stormark, Kjell Morten; Dahl, Lisbeth; Øyen, Jannike

    2016-01-01

    Assessment of adolescents’ dietary habits is challenging. Reliable instruments to monitor dietary trends are required to promote healthier behaviours in this group. The purpose of this cross-sectional study was to assess adolescents’ adherence to Norwegian dietary recommendations with a diet score and to report results from, and test-retest reliability of, the score. The diet score involved seven food groups and one physical activity indicator, and was applied to answers from a semi-quantitative food frequency questionnaire (FFQ) administered twice. Reproducibility of the score was assessed with Cohen’s Kappa (κ statistics) at an interval of three months. The setting was eight lower-secondary schools in Hordaland County, Norway, and subjects were adolescents (n = 472) aged 14–15 years and their caregivers. Results showed that the proportion of adolescents consistently classified by the diet score was 87.6% (κ = 0.465). For food groups, proportions ranged from 74.0% to 91.6% (κ = 0.249 to κ = 0.573). Less than 40% of the participants were found to adhere to recommendations for frequencies of eating fruits, vegetables, added sugar, and fish. The highest compliance with recommendations was seen for choosing water as a beverage and limiting the intake of red meat. The score was associated with parental socioeconomic status. The diet score was found to be reproducible at an acceptable level. Health-promoting work targeting adolescents should emphasize increasing the intake of recommended foods to approach nutritional guidelines. PMID:27483312
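
    The gap between a high proportion of consistent classification and a moderate κ follows from how Cohen's kappa corrects for chance agreement: κ = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected from the marginal frequencies alone. The sketch below uses made-up classifications, not the study's data, to show the effect when one category dominates.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Return (observed agreement, Cohen's kappa) for two paired ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_1, counts_2 = Counter(first), Counter(second)
    # Chance agreement: product of the marginal proportions, summed over categories.
    expected = sum(counts_1[c] * counts_2[c] for c in set(counts_1) | set(counts_2)) / (n * n)
    # Note: kappa is undefined (division by zero) if expected agreement is exactly 1.
    return observed, (observed - expected) / (1 - expected)


# Hypothetical test and retest classifications for 100 respondents.
test = ["low"] * 85 + ["high"] * 5 + ["low"] * 5 + ["high"] * 5
retest = ["low"] * 85 + ["high"] * 5 + ["high"] * 5 + ["low"] * 5

observed, kappa = cohen_kappa(test, retest)
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.3f}")
```

    In this invented example 90% of respondents are classified the same way twice, yet most of that agreement is expected by chance because 90% of each set of ratings falls in one category, so kappa comes out well below the raw agreement.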

  6. The counterintuitive effect of multiple injuries in severity scoring: a simple variable improves the predictive ability of NISS

    PubMed Central

    2011-01-01

    Background: Injury scoring is important to formulate prognoses for trauma patients. Although scores based on empirical estimation allow for better prediction, those based on expert consensus, e.g. the New Injury Severity Score (NISS), are widely used. We describe how the addition of a variable quantifying the number of injuries improves the ability of NISS to predict mortality. Methods: We analyzed 2488 injury cases included in the trauma registry of the Italian region Emilia-Romagna in 2006-2008 and assessed the ability of NISS alone, NISS plus number of injuries, and the maximum Abbreviated Injury Scale (AIS) to predict in-hospital mortality. Hierarchical logistic regression was used. We measured discrimination through the C statistics, and calibration through Hosmer-Lemeshow statistics, Akaike's information criterion (AIC) and calibration curves. Results: The best discrimination and calibration resulted from the model with NISS plus number of injuries, followed by NISS alone and then by the maximum AIS (C statistics 0.775, 0.755, and 0.729, respectively; AIC 1602, 1635, and 1712, respectively). The predictive ability of all the models improved after inclusion of age, gender, mechanism of injury, and the motor component of Glasgow Coma Scale (C statistics 0.889, 0.898, and 0.901; AIC 1234, 1174, and 1167). The model with NISS plus number of injuries still showed the best performances, this time with borderline statistical significance. Conclusions: In NISS, the same weight is assigned to the three worst injuries, although the contribution of the second and third to the probability of death is smaller than that of the worst one. An improvement of the predictive ability of NISS can be obtained by adjusting for the number of injuries. PMID:21504567
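
    The two summary measures quoted above can be sketched briefly. The C statistic is the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor (ties count one half), and AIC = 2k - 2 ln L for a model with k parameters and maximised likelihood L. The code below is an illustrative pairwise implementation on invented numbers. It is not the authors' analysis, which used hierarchical logistic regression.

```python
def c_statistic(predicted_risk, died):
    """Concordance (C) statistic by direct comparison of all death/survivor pairs."""
    deaths = [p for p, d in zip(predicted_risk, died) if d]
    survivors = [p for p, d in zip(predicted_risk, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for risk_dead in deaths:
        for risk_alive in survivors:
            if risk_dead > risk_alive:
                concordant += 1.0
            elif risk_dead == risk_alive:
                concordant += 0.5  # tied predictions count as half
    return concordant / (len(deaths) * len(survivors))


def aic(log_likelihood, n_params):
    """Akaike's information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical predicted mortality risks and outcomes (1 = died) for five patients.
risks = [0.90, 0.40, 0.35, 0.80, 0.10]
outcomes = [1, 0, 1, 0, 0]

print(round(c_statistic(risks, outcomes), 3))
print(aic(log_likelihood=-100.0, n_params=3))
```

    A C statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. The nested loop is quadratic in sample size, which is acceptable for a sketch but not for large registries.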

  7. How Parents Can Help Kids Improve Test Scores: Taking the Stakes out of Literacy Testing

    ERIC Educational Resources Information Center

    Schneider, Steven

    2006-01-01

    In order to meet the goals of No Child Left Behind, standardized testing is preeminent as the sole indicator determining whether states all across America demonstrate adequate yearly progress regarding the improvement of student achievement in literacy education. This book will help teachers and parents raise children's scores on standardized…

  8. Evaluating the Impact of Test Accommodations on Test Scores of LEP Students & Non-LEP Students.

    ERIC Educational Resources Information Center

    Hafner, Anne L.

    Using a quasi-experimental analysis of variance (ANOVA) design, this project examined the effects of the use of accommodations with students of limited English proficiency (LEP) and non-LEP students and whether the use of accommodations affected the validity of test score interpretations. Major accommodations examined were extra time, and extra…

  9. Effects of white noise on Callsign Acquisition Test and Modified Rhyme Test scores.

    PubMed

    Blue-Terry, Misty; Letowski, Tomasz

    2011-02-01

    The Callsign Acquisition Test (CAT) is a speech intelligibility test developed by the US Army Research Laboratory. The test has been used to evaluate speech transmission through various communication systems but has not yet been sufficiently standardised and validated. The aim of this study was to compare CAT and Modified Rhyme Test (MRT) performance in the presence of white noise across a range of signal-to-noise ratios (SNRs). A group of 16 normal-hearing listeners participated in the study. The speech items were presented at 65 dB(A) in the background of white noise at SNRs of -18, -15, -12, -9 and -6 dB. The results showed a strong positive association (75.14%) between the two tests, but significant differences between the CAT and MRT absolute scores in the range of investigated SNRs. Based on the data, a function to predict CAT scores based on existing MRT scores and vice versa was formulated. STATEMENT OF RELEVANCE: This work compares performance data of a common speech intelligibility test (MRT) with a new test (CAT) in the presence of white noise. The results here can be used as a part of the standardisation procedures and provide insights into the predictive capabilities of the CAT to quantify speech intelligibility communication in high-noise military environments.

  10. Ability evaluation by binary tests: Problems, challenges & recent advances

    NASA Astrophysics Data System (ADS)

    Bashkansky, E.; Turetsky, V.

    2016-11-01

    Binary tests designed to measure abilities of objects under test (OUTs) are widely used in different fields of measurement theory and practice. The number of test items in such tests is usually very limited. The response to each test item provides only one bit of information per OUT. The problem of correct ability assessment is even more complicated when the levels of difficulty of the test items are unknown beforehand. This fact makes the search for effective ways of planning and processing the results of such tests highly relevant. In recent years, there has been some progress in this direction, generated by both the development of computational tools and the emergence of new ideas. The latter are associated with the use of so-called “scale invariant item response models”. Together with the maximum likelihood estimation (MLE) approach, they helped to solve some problems of engineering and proficiency testing. However, several issues related to the assessment of uncertainties, replications scheduling, the use of placebo, as well as evaluation of multidimensional abilities still present a challenge for researchers. The authors attempt to outline the ways to solve the above problems.
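
    As a rough illustration of maximum likelihood ability estimation from binary responses, the sketch below uses the standard Rasch (one-parameter logistic) model with item difficulties assumed to be known. This is a swapped-in textbook model chosen for brevity. It is not necessarily one of the "scale invariant item response models" the authors discuss, and it sidesteps the harder case of unknown difficulties that the abstract highlights. All names and data are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct (1) response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def estimate_ability(responses, difficulties, tol=1e-10, max_iter=100):
    """Newton-Raphson MLE of ability given 0/1 responses and known difficulties."""
    score = sum(responses)
    if score == 0 or score == len(responses):
        # The likelihood has no finite maximum for perfect or zero scores.
        raise ValueError("MLE does not exist for all-correct or all-incorrect patterns")
    theta = 0.0
    for _ in range(max_iter):
        probs = [rasch_probability(theta, b) for b in difficulties]
        gradient = score - sum(probs)                    # d logL / d theta
        information = sum(p * (1.0 - p) for p in probs)  # Fisher information
        step = gradient / information
        theta += step
        if abs(step) < tol:
            break
    return theta


# Hypothetical object under test: three passes out of four equally difficult items.
theta_hat = estimate_ability([1, 1, 1, 0], [0.0, 0.0, 0.0, 0.0])
print(round(theta_hat, 4))
```

    With four items of equal difficulty, the estimate is simply the log-odds of the observed pass rate, which shows how little information so few one-bit responses carry.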

  11. Using Heteroskedastic Ordered Probit Models to Recover Moments of Continuous Test Score Distributions from Coarsened Data

    ERIC Educational Resources Information Center

    Reardon, Sean F.; Shear, Benjamin R.; Castellano, Katherine E.; Ho, Andrew D.

    2017-01-01

    Test score distributions of schools or demographic groups are often summarized by frequencies of students scoring in a small number of ordered proficiency categories. We show that heteroskedastic ordered probit (HETOP) models can be used to estimate means and standard deviations of multiple groups' test score distributions from such data. Because…

  12. Difficulty and Discriminating Indices of Three-Multiple Choice Tests Using the Confidence Scoring Procedure

    ERIC Educational Resources Information Center

    Omirin, M. S.

    2007-01-01

    The study compared the difficulty and discrimination indices of three multiple choice tests using the confidence scoring procedure (CSP). The study was also set to determine whether or not the difficulty and discrimination indices would be improved, if the tests were scored by the confidence scoring procedure. Two null…

  13. Modified Balance Error Scoring System (M-BESS) test scores in athletes wearing protective equipment and cleats

    PubMed Central

    Azad, Aftab Mohammad; Al Juma, Saad; Bhatti, Junaid Ahmad; Delaney, J Scott

    2016-01-01

    Background: Balance testing is an important part of the initial concussion assessment. There is no research on the differences in Modified Balance Error Scoring System (M-BESS) scores when tested in real-world as compared to control conditions. Objective: To assess the difference in M-BESS scores in athletes wearing their protective equipment and cleats on different surfaces as compared to control conditions. Methods: This cross-sectional study examined university North American football and soccer athletes. Three observers independently rated athletes performing the M-BESS test in three different conditions: (1) wearing shorts and T-shirt in bare feet on firm surface (control); (2) wearing athletic equipment with cleats on FieldTurf; and (3) wearing athletic equipment with cleats on firm surface. Mean M-BESS scores were compared between conditions. Results: 60 participants were recruited: 39 from football (all males) and 21 from soccer (11 males and 10 females). Average age was 21.1 years (SD=1.8). Mean M-BESS scores were significantly lower (p<0.001) for cleats on FieldTurf (mean=26.3; SD=2.0) and for cleats on firm surface (mean=26.6; SD=2.1) as compared to the control condition (mean=28.4; SD=1.5). Females had lower scores than males for cleats on FieldTurf condition (24.9 (SD=1.9) vs 27.3 (SD=1.6), p=0.005). Players who had taping or bracing on their ankles/feet had lower scores when tested with cleats on firm surface condition (24.6 (SD=1.7) vs 26.9 (SD=2.0), p=0.002). Conclusions: Total M-BESS scores for athletes wearing protective equipment and cleats standing on FieldTurf or a firm surface are around two points lower than M-BESS scores performed on the same athletes under control conditions. PMID:27900181

  14. Allometric scaling of Wingate anaerobic power test scores in men.

    PubMed

    Stickley, Christopher D; Hetzler, Ronald K; Wages, Jennifer J; Freemyer, Bret G; Kimura, Iris F

    2013-09-01

    This study examined the appropriate magnitude of allometric scaling of the Wingate anaerobic test (WAnT) power data for body mass (BM) and established normative data for the WAnT for adult men. Eighty-three men completed a standard WAnT using 0.1 kg·kg(-1) BM resistance. Allometric exponents and percentile ranks for 1-second peak power (PP), 5-second PP, and mean power (MP) were established. The Predicted Residual Sum of Squares (PRESS) procedure was used to assess external validity while avoiding data splitting. The mean 1-second PP, 5-second PP, and MP were 1,049.1 ± 168.8 W, 1,013.4 ± 158.6 W, and 777.9 ± 105.0 W, respectively. Allometric exponents for 1-second PP, 5-second PP, and MP scaled for BM were b = 0.89, 0.88, and 0.86, respectively. Correlations between allometrically scaled 1-second PP, 5-second PP, and MP, and BM were r = -0.03, -0.03, and -0.02, respectively, suggesting that the allometric exponents derived were effective in partialling out the effect of BM on WAnT values. The PRESS procedure values resulted in small decreases in R² (0.03, 0.04, and 0.02 for 1-second PP, 5-second PP, and MP, respectively) suggesting acceptable levels of external validity when applied to independent samples. The allometric exponents and normative values provide a useful tool for comparing WAnT scores in college-aged men without the confounding effect of BM. It is suggested that exponents of b = 0.89 (1-second PP), b = 0.88 (5-second PP), and b = 0.86 (MP) be used for allometrically scaling WAnT power values in healthy adult men and that the confidence limits for these allometric exponents be considered as 0.66-1.0 for PP and 0.69-1.0 for MP. The use of these exponents in allometric scaling of male WAnT power values provides coaches and practitioners with a valid means for comparing power production between individuals without the confounding influence of BM.
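
    Allometric scaling divides a power value by body mass raised to an exponent b, that is, scaled power = P / BM^b, instead of dividing by body mass itself (the special case b = 1). The sketch below applies the exponents reported in the abstract to two invented athletes. The function name, dictionary keys, and athlete data are assumptions made for illustration only.

```python
# Exponents reported in the abstract for adult men.
EXPONENTS = {"pp_1s": 0.89, "pp_5s": 0.88, "mp": 0.86}


def allometric_scale(power_w, body_mass_kg, exponent):
    """Return power scaled for body mass: P / BM**b (units: W per kg**b)."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return power_w / body_mass_kg ** exponent


# Two hypothetical athletes: (1-second peak power in W, body mass in kg).
athletes = {"lighter": (900.0, 70.0), "heavier": (1100.0, 90.0)}

for name, (power, mass) in athletes.items():
    ratio = allometric_scale(power, mass, 1.0)                  # simple W/kg
    scaled = allometric_scale(power, mass, EXPONENTS["pp_1s"])  # W/kg^0.89
    print(f"{name}: ratio = {ratio:.2f} W/kg, allometric = {scaled:.2f} W/kg^0.89")
```

    Because b is below 1, simple ratio scaling tends to penalise heavier individuals more than allometric scaling does, so the two approaches can narrow or even reverse a comparison between athletes of different mass.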

  15. Development of new risk score for pre-test probability of obstructive coronary artery disease based on coronary CT angiography.

    PubMed

    Fujimoto, Shinichiro; Kondo, Takeshi; Yamamoto, Hideya; Yokoyama, Naoyuki; Tarutani, Yasuhiro; Takamura, Kazuhisa; Urabe, Yoji; Konno, Kumiko; Nishizaki, Yuji; Shinozaki, Tomohiro; Kihara, Yasuki; Daida, Hiroyuki; Isshiki, Takaaki; Takase, Shinichi

    2015-09-01

    Existing methods to calculate pre-test probability of obstructive coronary artery disease (CAD) have been established using selected high-risk patients who were referred to conventional coronary angiography. The purpose of this study is to develop and validate our new method for pre-test probability of obstructive CAD using patients who underwent coronary CT angiography (CTA), which could be applicable to a wider range of patient population. Using 4137 consecutive patients with suspected CAD who underwent coronary CTA at our institution, a multivariate logistic regression model including clinical factors as covariates calculated the pre-test probability (K-score) of obstructive CAD determined by coronary CTA. The K-score was compared with the Duke clinical score using the area under the curve (AUC) for the receiver-operating characteristic curve. External validation was performed by an independent sample of 319 patients. The final model included eight significant predictors: age, gender, coronary risk factor (hypertension, diabetes mellitus, dyslipidemia, smoking), history of cerebral infarction, and chest symptom. The AUC of the K-score was significantly greater than that of the Duke clinical score for both derivation (0.736 vs. 0.699) and validation (0.714 vs. 0.688) data sets. Among patients who underwent coronary CTA, the newly developed K-score had better pre-test prediction ability of obstructive CAD compared to the Duke clinical score in a Japanese population.

  16. Improving Test Score Reporting: Perspectives from the ETS Score Reporting Conference. Research Report. ETS RR-11-45

    ERIC Educational Resources Information Center

    Zapata-Rivera, Diego, Ed.; Zwick, Rebecca, Ed.

    2011-01-01

    This volume includes 3 papers based on presentations at a workshop on communicating assessment information to particular audiences, held at Educational Testing Service (ETS) on November 4th, 2010, to explore some issues that influence score reports and new advances that contribute to the effectiveness of these reports. Jessica Hullman, Rebecca…

  17. Why African American College Students Miss the Perfect Test Score

    ERIC Educational Resources Information Center

    Gentry, Ruben; Stokes, Dorothy

    2016-01-01

    Many African Americans were imbued with the cliché that they must work twice as hard as others to be a success in life. Entering college, students with this belief put extensive effort into earning top grades to ensure quality preparation for their chosen career; yet, some fail to earn top scores. Why? This is the million dollar question, but the…

  18. External validation of the ability of the DRAGON score to predict outcome after thrombolysis treatment.

    PubMed

    Ovesen, C; Christensen, A; Nielsen, J K; Christensen, H

    2013-11-01

    Easy-to-perform and valid assessment scales for the effect of thrombolysis are essential in hyperacute stroke settings. Because of this we performed an external validation of the DRAGON scale proposed by Strbian et al. in a Danish cohort. All patients treated with intravenous recombinant plasminogen activator between 2009 and 2011 were included. Upon admission all patients underwent physical and neurological examination using the National Institutes of Health Stroke Scale along with non-contrast CT scans and CT angiography. Patients were followed up through the Outpatient Clinic and their modified Rankin Scale (mRS) was assessed after 3 months. Three hundred and three patients were included in the analysis. The DRAGON scale proved to have a good discriminative ability for predicting highly unfavourable outcome (mRS 5-6) (area under the curve-receiver operating characteristic [AUC-ROC]: 0.89; 95% confidence interval [CI] 0.81-0.96; p<0.001) and good outcome (mRS 0-2) (AUC-ROC: 0.79; 95% CI 0.73-0.85; p<0.001). When only patients with M1 occlusions were selected the DRAGON scale provided good discriminative capability (AUC-ROC: 0.89; 95% CI 0.78-1.0; p=0.003) for highly unfavourable outcome. We confirmed the validity of the DRAGON scale in predicting outcome after thrombolysis treatment.

  19. Measuring Infant Communicative Abilities: A Guide to Formal Test Selection.

    ERIC Educational Resources Information Center

    Proctor, Adele

    This guide was prepared to facilitate the practitioner's selection of formal tests for evaluating communicative behavior in clinical infant populations during the first year of life. Clinical instruments with particular emphasis on communication and emerging language and speech abilities were identified in terms of publishers' recommended…

  20. Predicting Performance on a Firefighter's Ability Test from Fitness Parameters

    ERIC Educational Resources Information Center

    Michaelides, Marcos A.; Parpa, Koulla M.; Thompson, Jerald; Brown, Barry

    2008-01-01

    The purpose of this project was to identify the relationships between various fitness parameters such as upper body muscular endurance, upper and lower body strength, flexibility, body composition and performance on an ability test (AT) that included simulated firefighting tasks. A second intent was to create a regression model that would predict…

  1. Determining the Relationship of Nursing Test Scores and Test Anxiety Levels before and after a Test-Taking Strategy Seminar.

    ERIC Educational Resources Information Center

    Carraway, Cassandra T.

    A study was conducted to determine whether participation in a test-taking strategy seminar significantly decreased test anxiety in first-year nursing students. The study also sought to compare nursing test scores of first-year nursing students who participated in the seminar with those who did not. The sample consisted of 30 first-year nursing…

  2. End-box scoring artefact evaluation of the Farnsworth-Munsell 100-Hue colour vision test.

    PubMed

    Viliūnas, V; Lukauskiene, R; Svegzda, A; Zukauskas, A

    2006-11-01

    The scoring artefact in the Farnsworth-Munsell 100-Hue test, arising from the grouping of the caps into four boxes, was investigated. The traditional method of scoring, performed with the numbers of the anchor caps disregarded, and the alternative scoring, performed with the numbers of the anchor caps employed, were compared. For the traditional method of scoring, we revealed an increase of the error score of the outside (end-box) caps when the total error score was above 240. In contrast, for scoring performed with the numbers of the anchor caps employed, the difference between the error score of the outside caps and the average error per cap is not significant. To mitigate the end-box artefact and to improve the reliability of the Farnsworth-Munsell 100-Hue test, corrections to the traditional method of scoring are proposed.

  3. 21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 21 Food and Drugs 8 2014-04-01 2014-04-01 false Ovarian adnexal mass assessment score test system... immunological Test Systems § 866.6050 Ovarian adnexal mass assessment score test system. (a) Identification. An ovarian/adnexal mass assessment test system is a device that measures one or more proteins in serum...

  4. 21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 21 Food and Drugs 8 2012-04-01 2012-04-01 false Ovarian adnexal mass assessment score test system... immunological Test Systems § 866.6050 Ovarian adnexal mass assessment score test system. (a) Identification. An ovarian/adnexal mass assessment test system is a device that measures one or more proteins in serum...

  5. 21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 21 Food and Drugs 8 2013-04-01 2013-04-01 false Ovarian adnexal mass assessment score test system... immunological Test Systems § 866.6050 Ovarian adnexal mass assessment score test system. (a) Identification. An ovarian/adnexal mass assessment test system is a device that measures one or more proteins in serum...

  6. The Role of Test Scores in Explaining Race and Gender Differences in Wages

    ERIC Educational Resources Information Center

    Blackburn, McKinley L.

    2004-01-01

    Previous research has suggested that skills reflected in test-score performance on tests such as the Armed Forces Qualification Test (AFQT) can account for some of the racial differences in average wages. I use a more complete set of test scores available with the National Longitudinal Survey of Youth 1979 Cohort to reconsider this evidence, and…

  7. Neuropsychological test scores, academic performance, and developmental disorders in Spanish-speaking children.

    PubMed

    Rosselli, M; Ardila, A; Bateman, J R; Guzmán, M

    2001-01-01

    Limited information is currently available about performance of Spanish-speaking children on different neuropsychological tests. This study was designed to (a) analyze the effects of age and sex on different neuropsychological test scores of a randomly selected sample of Spanish-speaking children, (b) analyze the value of neuropsychological test scores for predicting school performance, and (c) describe the neuropsychological profile of Spanish-speaking children with learning disabilities (LD). Two hundred ninety (141 boys, 149 girls) 6- to 11-year-old children were selected from a school in Bogotá, Colombia. Three age groups were distinguished: 6- to 7-, 8- to 9-, and 10- to 11-year-olds. Performance was measured utilizing the following neuropsychological tests: Seashore Rhythm Test, Finger Tapping Test (FTT), Grooved Pegboard Test, Children's Category Test (CCT), California Verbal Learning Test-Children's Version (CVLT-C), Benton Visual Retention Test (BVRT), and Bateria Woodcock Psicoeducativa en Español (Woodcock, 1982). Normative scores were calculated. Age effect was significant for most of the test scores. A significant sex effect was observed for 3 test scores. Intercorrelations were performed between neuropsychological test scores and academic areas (science, mathematics, Spanish, social studies, and music). In a post hoc analysis, children presenting very low scores on the reading, writing, and arithmetic achievement scales of the Woodcock battery were identified in the sample, and their neuropsychological test scores were compared with a matched normal group. Finally, a comparison was made between Colombian and American norms.

  8. The Woodcock-Johnson Tests of Cognitive Abilities III's Cognitive Performance Model: Empirical Support for Intermediate Factors within CHC Theory

    ERIC Educational Resources Information Center

    Taub, Gordon E.; McGrew, Kevin S.

    2014-01-01

    The Woodcock-Johnson Tests of Cognitive Ability Third Edition is developed using the Cattell-Horn-Carroll (CHC) measurement-theory test design as the instrument's theoretical blueprint. The instrument provides users with cognitive scores based on the Cognitive Performance Model (CPM); however, the CPM is not a part of CHC theory. Within the…

  9. Evaluation of Selected Interview Data in Improving the Predictive Validity of a Verbal Ability Test with Psychiatric Aide Trainees.

    ERIC Educational Resources Information Center

    Distefano, M. K., Jr.; Pryer, Margaret W.

    1987-01-01

    From 13 objective interview items, five with adequate response variability were studied to determine if they would improve the validity of a verbal ability selection test in predicting work performance of 181 psychiatric aide trainees. In a multiple regression analysis, the verbal test correlated .27 with the weighted composite rating score.…

  10. A Seven-Year Follow-Up of Intelligence Test Scores of Foster Grandparents

    ERIC Educational Resources Information Center

    Troll, Lillian E.; And Others

    1976-01-01

    After seven years, a group (N=32) of originally nonemployed poverty-level older people (over 60) now employed as foster grandparents were retested with the WAIS. Three subtest scores showed stability and Digit Span showed a statistically significant drop. Neither age nor initial level of health or WAIS scores was related to test-score changes over…

  11. Comparing Graphical and Verbal Representations of Measurement Error in Test Score Reports

    ERIC Educational Resources Information Center

    Zwick, Rebecca; Zapata-Rivera, Diego; Hegarty, Mary

    2014-01-01

    Research has shown that many educators do not understand the terminology or displays used in test score reports and that measurement error is a particularly challenging concept. We investigated graphical and verbal methods of representing measurement error associated with individual student scores. We created four alternative score reports, each…

  12. D.C. Student Test Scores Show Uneven Progress. Data Snapshot

    ERIC Educational Resources Information Center

    DuPre, Mary

    2011-01-01

    Over the past five years, both DC Public Schools (DCPS) and public charter schools (PCS) have seen significant growth in secondary reading and math scores on the state test known as the District of Columbia Comprehensive Assessment System (DC CAS). However, scores have not improved as much at the elementary level. Reading and math scores for DCPS…

  13. Further Validation of the Qualitative Scoring System for the Modified Bender-Gestalt Test.

    ERIC Educational Resources Information Center

    Brannigan, Gary G.; And Others

    1995-01-01

    Compares the Qualitative Scoring System and the Developmental Scoring Systems, both Bender-Gestalt tests, in predicting achievement on the Metropolitan Achievement Test (MAT). In this study, first through fourth graders (n=409) from regular elementary schools were subjected to both tests; both systems correlated significantly with school…

  14. School Inputs, Household Substitution, and Test Scores. NBER Working Paper No. 16830

    ERIC Educational Resources Information Center

    Das, Jishnu; Dercon, Stefan; Habyarimana, James; Krishnan, Pramila; Muralidharan, Karthik; Sundararaman, Venkatesh

    2011-01-01

    Empirical studies of the relationship between school inputs and test scores typically do not account for the fact that households will respond to changes in school inputs. We present a dynamic household optimization model relating test scores to school and household inputs, and test its predictions in two very different low-income country…

  15. Score Reporting in Teacher Certification Testing: A Review, Design, and Interview/Focus Group Study

    ERIC Educational Resources Information Center

    Klesch, Heather S.

    2010-01-01

    The reporting of scores on educational tests is at times misunderstood, misinterpreted, and potentially confusing to examinees and other stakeholders who may need to interpret test scores. In reporting test results to examinees, there is a need for clarity in the message communicated. As pressure rises for students to demonstrate performance at a…

  16. Note on the Scoring of Foreign Language Speaking and Writing Fluency Tests.

    ERIC Educational Resources Information Center

    Carroll, John B.

    The problem of determining relative weights for quantity and quality in scoring foreign language speaking and writing fluency tests is studied. French speaking and writing fluency tests were administered to students of French in several schools in England. Data from these tests was analyzed to support the suggestion that scoring formulas should…

  17. Relationship of Achievement Test Scores and State Board Performance in a Diploma Nursing Program.

    ERIC Educational Resources Information Center

    Washburn, Gail

    The relationship between the National League for Nursing (NLN) achievement test scores and performance on the State Board Test Pool Examination (SBTPE) was studied with 166 graduates of a diploma degree school of nursing between 1976 and 1978. It was found that NLN achievement test scores had a highly significant correlation with SBTPE results.…

  18. Noncognitive Skills and the Gender Disparities in Test Scores and Teacher Assessments: Evidence from Primary School

    ERIC Educational Resources Information Center

    Cornwell, Christopher; Mustard, David B.; Van Parys, Jessica

    2013-01-01

    Using data from the 1998-99 ECLS-K cohort, we show that the grades awarded by teachers are not aligned with test scores. Girls in every racial category outperform boys on reading tests, while boys score at least as well on math and science tests as girls. However, boys in all racial categories across all subject areas are not represented in…

  19. Using Expected Growth Size Estimates To Summarize Test Score Changes. ERIC/AE Digest.

    ERIC Educational Resources Information Center

    Russell, Michael

    An earlier Digest described the shortcomings of three methods commonly used to summarize changes in test scores. This Digest describes two less commonly used approaches for examining changes in test scores, those of Standardized Growth Estimates and Effect Sizes. Aspects of these two approaches are combined and applied to the Iowa Test of Basic…

  20. A "Rearrangement Procedure" for Scoring Adaptive Tests with Review Options

    ERIC Educational Resources Information Center

    Papanastasiou, Elena C.; Reckase, Mark D.

    2007-01-01

    Because of the increased popularity of computerized adaptive testing (CAT), many admissions tests, as well as certification and licensure examinations, have been transformed from their paper-and-pencil versions to computerized adaptive versions. A major difference between paper-and-pencil tests and CAT from an examinee's point of view is that in…

  1. An Error Score Model for Time-Limit Tests

    ERIC Educational Resources Information Center

    Ven, A. H. G. S. van der

    1976-01-01

    A more generalized error model for time-limit tests is developed. Model estimates are derived for right-attempted and wrong-attempted correlations both within the same test and between different tests. A comparison is made between observed correlations and their model counterparts and a fair agreement is found between observed and expected…

  2. The value of Bayes' theorem for interpreting abnormal test scores in cognitively healthy and clinical samples.

    PubMed

    Gavett, Brandon E

    2015-03-01

    The base rates of abnormal test scores in cognitively normal samples have been a focus of recent research. The goal of the current study is to illustrate how Bayes' theorem uses these base rates--along with the same base rates in cognitively impaired samples and prevalence rates of cognitive impairment--to yield probability values that are more useful for making judgments about the absence or presence of cognitive impairment. Correlation matrices, means, and standard deviations were obtained from the Wechsler Memory Scale--4th Edition (WMS-IV) Technical and Interpretive Manual and used in Monte Carlo simulations to estimate the base rates of abnormal test scores in the standardization and special groups (mixed clinical) samples. Bayes' theorem was applied to these estimates to identify probabilities of normal cognition based on the number of abnormal test scores observed. Abnormal scores were common in the standardization sample (65.4% scoring below a scaled score of 7 on at least one subtest) and more common in the mixed clinical sample (85.6% scoring below a scaled score of 7 on at least one subtest). Probabilities varied according to the number of abnormal test scores, base rates of normal cognition, and cutoff scores. The results suggest that interpretation of base rates obtained from cognitively healthy samples must also account for data from cognitively impaired samples. Bayes' theorem can help neuropsychologists answer questions about the probability that an individual examinee is cognitively healthy based on the number of abnormal test scores observed.
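    As a rough illustration of the Bayes' theorem calculation this abstract describes, the sketch below combines the two reported base rates (65.4% of the standardization sample and 85.6% of the mixed clinical sample scoring below a scaled score of 7 on at least one subtest) with a prevalence of normal cognition. The 80% prevalence and the function name `prob_normal_given_abnormal` are assumptions made for the example and do not come from the study, and treating the mixed clinical sample as the "impaired" group is a simplification.

```python
def prob_normal_given_abnormal(p_abn_given_normal: float,
                               p_abn_given_impaired: float,
                               base_rate_normal: float) -> float:
    """Posterior probability of normal cognition given an abnormal finding.

    Applies Bayes' theorem:
        P(normal | abn) = P(abn | normal) * P(normal)
                          / [P(abn | normal) * P(normal)
                             + P(abn | impaired) * P(impaired)]
    """
    numerator = p_abn_given_normal * base_rate_normal
    denominator = numerator + p_abn_given_impaired * (1.0 - base_rate_normal)
    return numerator / denominator


# Base rates reported in the abstract (at least one subtest below scaled score 7).
P_ABN_NORMAL = 0.654      # standardization sample
P_ABN_CLINICAL = 0.856    # mixed clinical sample

# Hypothetical prevalence of normal cognition in the referral population.
ASSUMED_BASE_RATE_NORMAL = 0.80

posterior = prob_normal_given_abnormal(
    P_ABN_NORMAL, P_ABN_CLINICAL, ASSUMED_BASE_RATE_NORMAL
)
print(round(posterior, 3))  # about 0.753 under the assumed 80% prevalence
```

    Under these assumptions a single low subtest score barely moves the probability of normal cognition away from the assumed prevalence, which is consistent with the abstract's point that abnormal scores are common in healthy samples.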

  3. Stability of scores for the Slosson Full-Range Intelligence Test.

    PubMed

    Williams, Thomas O; Eaves, Ronald C; Woods-Groves, Suzanne; Mariano, Gina

    2007-08-01

    The test-retest stability of the Slosson Full-Range Intelligence Test by Algozzine, Eaves, Mann, and Vance was investigated with test scores from a sample of 103 students. With a mean interval of 13.7 mo. and different examiners for each of the two test administrations, the test-retest reliability coefficients for the Full-Range IQ, Verbal Reasoning, Abstract Reasoning, Quantitative Reasoning, and Memory were .93, .85, .80, .80, and .83, respectively. Mean differences from the test-retest scores were not statistically significantly different for any of the scales. Results suggest that Slosson scores are stable over time even when different examiners administer the test.

  4. Diagnostic Utility of WISC-IV General Abilities Index and Cognitive Proficiency Index Difference Scores among Children with ADHD

    ERIC Educational Resources Information Center

    Devena, Sarah E.; Watkins, Marley W.

    2012-01-01

    The Wechsler Intelligence Scale for Children-Fourth Edition General Abilities Index and Cognitive Proficiency Index have been advanced as possible diagnostic markers of attention deficit hyperactivity disorder. This hypothesis was tested with a hospital sample with attention deficit hyperactivity disorder (n = 78), a referred but nondiagnosed…

  5. The Effect of Examiner Variation in Cartridge Case Acquisition on IBIS® Correlation Scores and the Ability of the System to Return a True Positive

    NASA Astrophysics Data System (ADS)

    Scicchitano, Kristine M.

    When entering cartridge case exhibits into the Integrated Ballistics Identification System (IBIS®), examiners have the ability to manually manipulate three parameters: lighting intensity, ring selection and exhibit orientation. User guidelines for these settings are subjective, and the effect of examiner variation is largely unknown. If examiner variation negatively affects the returned correlation scores, the ability of the system to return true positives will be compromised. By entering cartridge cases into IBIS® 88 separate times, using 88 different combinations of parameter settings, the effect of these variables was determined. Analysis of variance testing revealed that no variable has a statistically significant effect on average true positive combined correlation scores or results list position. This did not change when the parameters were tested individually or in combination. Results indicate that examiner variability of cartridge case image acquisition has no effect on the outcome of IBIS®. The system's matching algorithm is robust enough to handle exhibit entry and data collection without the intervention of human input. For this reason, acquisition could be completely automated, allowing examiners to focus on the decision making stage of cartridge case comparison.

  6. Clinical Importance of the Heel Drop Test and a New Clinical Score for Adult Appendicitis

    PubMed Central

    Ahn, Shin; Lee, Hyeji; Choi, Wookjin; Ahn, Ryeok; Hong, Jung-Suk; Sohn, Chang Hwan; Seo, Dong Woo; Lee, Yoon-Seon; Lim, Kyung Soo; Kim, Won Young

    2016-01-01

    Objective: We tried to evaluate the accuracy of the heel drop test in patients with suspected appendicitis and tried to develop a new clinical score, which incorporates the heel drop test and other parameters, for the diagnosis of this condition. Methods: We performed a prospective observational study on adult patients with suspected appendicitis at two academic urban emergency departments between January and August 2015. The predictive characteristics of each parameter, along with heel drop test results, were calculated. A composite score was generated by logistic regression analysis. The performance of the generated score was compared to that of the Alvarado score. Results: Of the 292 enrolled patients, 165 (56.5%) had acute appendicitis. The heel drop test had a higher predictive value than rebound tenderness. Variables and their points included in the new (MESH) score were pain migration (2), elevated white blood cell (WBC) >10,000/μL (3), shift to left (2), and positive heel drop test (3). The MESH score had a higher AUC than the Alvarado score (0.805 vs. 0.701). Scores of 5 and 11 were chosen as cut-off values; a MESH score ≥5 compared to an Alvarado score ≥5, and a MESH score ≥8 compared to an Alvarado score ≥7 showed better performance in diagnosing appendicitis. Conclusion: MESH (migration, elevated WBC, shift to left, and heel drop test) is a simple clinical scoring system for assessing patients with suspected appendicitis and is more accurate than the Alvarado score. Further validation studies are needed. PMID:27723842
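    The point values listed in this abstract can be tallied with a few lines of code. The following is only a sketch of that arithmetic: the function name `mesh_score` and its parameter names are hypothetical, and because the abstract does not say how "shift to left" is determined, that item is passed in as a yes/no flag. Interpretation of the total against the study's cut-off values is left to the paper itself.

```python
def mesh_score(pain_migration: bool,
               wbc_per_ul: float,
               shift_to_left: bool,
               heel_drop_positive: bool) -> int:
    """Sum the MESH points as listed in the abstract.

    Points: pain migration (2), WBC > 10,000/uL (3),
    shift to left (2), positive heel drop test (3).
    """
    score = 0
    if pain_migration:
        score += 2
    if wbc_per_ul > 10_000:   # elevated white blood cell count
        score += 3
    if shift_to_left:         # criterion not defined in the abstract
        score += 2
    if heel_drop_positive:
        score += 3
    return score


# Illustrative inputs only; these are not patients from the study.
print(mesh_score(True, 12_500, False, True))   # 2 + 3 + 0 + 3 = 8
print(mesh_score(False, 9_000, False, True))   # 0 + 0 + 0 + 3 = 3
```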

  7. The Scoring of Matching Questions Tests: A Closer Look

    ERIC Educational Resources Information Center

    Jancarík, Antonín; Kostelecká, Yvona

    2015-01-01

    Electronic testing has become a regular part of online courses. Most learning management systems offer a wide range of tools that can be used in electronic tests. With respect to time demands, the most efficient tools are those that allow automatic assessment. The presented paper focuses on one of these tools: matching questions in which one…

  8. Printing Performance School Readiness Test: Administration and Scoring Manual.

    ERIC Educational Resources Information Center

    Simner, Marvin L.

    The Printing Performance School Readiness Test is an empirically derived instrument designed to aid in the early identification of preschool children who are at risk for school failure. The test is based on the outcome of a research program dealing with various aspects of children's printing that involved over 400 normal, non-repeating, native…

  9. Effects of Test Media on Different EFL Test-Takers in Writing Scores and in the Cognitive Writing Process

    ERIC Educational Resources Information Center

    Zou, Xiao-Ling; Chen, Yan-Min

    2016-01-01

    The effects of computer and paper test media on EFL test-takers with different computer familiarity in writing scores and in the cognitive writing process have been comprehensively explored from the learners' aspect as well as on the basis of related theories and practice. The results indicate significant differences in test scores among the…

  10. The Relationship between Students' Performance on the Cognitive Abilities Test (CogAT) and the Fourth and Fifth Grade Reading and Math Achievement Tests in Ohio

    ERIC Educational Resources Information Center

    Warnimont, Chad S.

    2010-01-01

    The purpose of this quantitative study was to examine the relationship between students' performance on the Cognitive Abilities Test (CogAT) and the fourth and fifth grade Reading and Math Achievement Tests in Ohio. The sample utilized students from a suburban school district in Northwest Ohio. Third grade CogAT scores (2006-2007 school year), 4th…

  11. Comparing the Effects of Elementary Music and Visual Arts Lessons on Standardized Mathematics Test Scores

    ERIC Educational Resources Information Center

    King, Molly Elizabeth

    2016-01-01

    The purpose of this quantitative, causal-comparative study was to compare the effect elementary music and visual arts lessons had on third through sixth grade standardized mathematics test scores. Inferential statistics were used to compare the differences between test scores of students who took in-school, elementary, music instruction during the…

  12. Effects of Scoring by Section and Independent Scorers' Patterns on Scorer Reliability in Biology Essay Tests

    ERIC Educational Resources Information Center

    Ebuoh, Casmir N.; Ezeudu, S. A.

    2015-01-01

    The study investigated the effects of scoring by section, use of independent scorers, and conventional patterns on scorer reliability in Biology essay tests. The literature review revealed that the conventional pattern of scoring all items at one time in essay tests has been criticized as unreliable. The study was a true experimental study…

  13. Comparison of Two Scoring Systems for the Modified Version of the Bender-Gestalt Test.

    ERIC Educational Resources Information Center

    Schachter, Steven; And Others

    1991-01-01

    Examined relative utility of two scoring systems for Modified Version of Bender-Gestalt Test in predicting performance on Developmental Test of Visual-Motor Integration. Findings from 53 kindergarten and 47 first grade students indicated that Qualitative Scoring System was significantly better predictor of visual-motor integration skills than…

  14. Kindergarten Black-White Test Score Gaps: Replicating and Updating Previous Findings with New National Data

    ERIC Educational Resources Information Center

    Quinn, David

    2014-01-01

    A substantial body of evidence has shown large academic test score gaps between black and white students in early childhood. These gaps remain, and probably grow, as students progress through school. Many researchers have sought to explain these persistent test score gaps, and particularly, to understand the role of students' socio-economic status…

  15. The Dynamics of the Evolution of the Black-White Test Score Gap

    ERIC Educational Resources Information Center

    Sohn, Kitae

    2012-01-01

    We apply a quantile version of the Oaxaca-Blinder decomposition to estimate the counterfactual distribution of the test scores of Black students. In the Early Childhood Longitudinal Study, Kindergarten Class of 1998-1999 (ECLS-K), we find that the gap initially appears only at the top of the distribution of test scores. As children age, however,…

  16. Peer Effects and the Indigenous/Non-Indigenous Early Test-Score Gap in Peru

    ERIC Educational Resources Information Center

    Sakellariou, Chris

    2008-01-01

    This paper assesses the magnitude of the non-indigenous/indigenous test-score gap for third-year and fourth-year primary school pupils in Peru, in relation to the main family, school and peer inputs contributing to the test-score gap using the estimation method of feasible generalized least squares. The article then decomposes the gap into its…

  17. The Effects of Developmental Placement and Early Retention on Children's Later Scores on Standardized Tests.

    ERIC Educational Resources Information Center

    May, Deborah C.; Welch, Edward L.

    1984-01-01

    Examined the relationship between early school retention as a result of preschool and kindergarten developmental testing and children's later academic achievement (N=223). Results showed children who scored as immature on the Gesell Screening Test and who were retained a year had the lowest scores on all measures. (JAC)

  18. Life Stress and Reading Comprehension Test Scores in the Middle School Student.

    ERIC Educational Resources Information Center

    Jones, Maryann Clementi

    A study determined the relationship between life stress and reading comprehension test scores on the Iowa Tests of Basic Skills. Subjects, 41 middle-school students attending Lincoln School in Garwood, New Jersey, were surveyed as to the amount of life stress prevalent in their lives. In addition, the Iowa scores for reading comprehension were…

  19. Using Raters from India to Score a Large-Scale Speaking Test

    ERIC Educational Resources Information Center

    Xi, Xiaoming; Mollaun, Pam

    2011-01-01

    We investigated the scoring of the Speaking section of the Test of English as a Foreign Language™ Internet-based (TOEFL iBT®) test by speakers of English and one or more Indian languages. We explored the extent to which raters from India, after being trained and certified, were able to score the TOEFL examinees with mixed first languages…

  20. The Implications of Family Size and Birth Order for Test Scores and Behavioral Development

    ERIC Educational Resources Information Center

    Silles, Mary A.

    2010-01-01

    This article, using longitudinal data from the National Child Development Study, presents new evidence on the effects of family size and birth order on test scores and behavioral development at age 7, 11 and 16. Sibling size is shown to have an adverse causal effect on test scores and behavioral development. For any given family size, first-borns…

  1. Scoring Yes-No Vocabulary Tests: Reaction Time vs. Nonword Approaches

    ERIC Educational Resources Information Center

    Pellicer-Sanchez, Ana; Schmitt, Norbert

    2012-01-01

    Despite a number of research studies investigating the Yes-No vocabulary test format, one main question remains unanswered: What is the best scoring procedure to adjust for testee overestimation of vocabulary knowledge? Different scoring methodologies have been proposed based on the inclusion and selection of nonwords in the test. However, there…

  2. Increasing Racial Isolation and Test Score Gaps in Mathematics: A 30-Year Perspective

    ERIC Educational Resources Information Center

    Berends, Mark; Penaloza, Roberto V.

    2010-01-01

    Background/Context: Although there has been progress in closing the test score gaps among student groups over past decades, that progress has stalled. Many researchers have speculated why the test score gaps closed between the early 1970s and the early 1990s, but only a few have been able to empirically study how changes in school factors and…

  3. Interpreting Standardized Test Scores: Strategies for Data-Driven Instructional Decision Making

    ERIC Educational Resources Information Center

    Mertler, Craig A.

    2007-01-01

    This book is designed to help K-12 teachers and administrators understand the nature of standardized tests and, in particular, the scores that result from them. This useful manual helps teachers develop the skills necessary to incorporate these test scores into various types of instructional decision making--a process known as "data-driven…

  4. Linking Scores From Tests of Similar Content Given in Different Languages: An Illustration Involving Methodological Alternatives

    ERIC Educational Resources Information Center

    Cascallar, Alicia S.; Dorans, Neil J.

    2005-01-01

    This study compares two methods commonly used (concordance and prediction) to establish linkages between scores from tests of similar content given in different languages. Score linkages between the Verbal and Math sections of the SAT I and the corresponding sections of the Spanish-language admissions test, the Prueba de Aptitud Academica (PAA),…

  5. Linking Scores from Tests of Similar Content Given in Different Languages: An Illustration Involving Methodological Alternatives

    ERIC Educational Resources Information Center

    Cascallar, Alicia S.; Dorans, Neil J.

    2005-01-01

    This study compares two methods commonly used (concordance and prediction) to establish linkages between scores from tests of similar content given in different languages. Score linkages between the Verbal and Math sections of the SAT I and the corresponding sections of the Spanish-language admissions test, the Prueba de Aptitud Academica (PAA),…

  6. The Influence of Foreign Language Learning during Early Childhood on Standardized Test Scores

    ERIC Educational Resources Information Center

    Shaw, Tommetta

    2010-01-01

    Increasing standardized test scores in reading and math is of high importance to the California Department of Education to meet requirements mandated by the No Child Left Behind (NCLB) Act of 2001. More research is needed to understand the best ways to improve test scores to meet concerns of the NCLB Act. The purpose of the study was to evaluate…

  7. Beyond Correlations: Usefulness of High School GPA and Test Scores in Making College Admissions Decisions

    ERIC Educational Resources Information Center

    Sawyer, Richard

    2013-01-01

    Correlational evidence suggests that high school GPA is better than admission test scores in predicting first-year college GPA, although test scores have incremental predictive validity. The usefulness of a selection variable in making admission decisions depends in part on its predictive validity, but also on institutions' selectivity and…

  8. Correcting for Test Score Measurement Error in ANCOVA Models for Estimating Treatment Effects

    ERIC Educational Resources Information Center

    Lockwood, J. R.; McCaffrey, Daniel F.

    2014-01-01

    A common strategy for estimating treatment effects in observational studies using individual student-level data is analysis of covariance (ANCOVA) or hierarchical variants of it, in which outcomes (often standardized test scores) are regressed on pretreatment test scores, other student characteristics, and treatment group indicators. Measurement…

  9. An Item Analysis and Validity Investigation of Bender Visual Motor Gestalt Test Score Items

    ERIC Educational Resources Information Center

    Lambert, Nadine M.

    1971-01-01

    This investigation attempted to demonstrate the utility of standard item analysis procedures for selecting the most reliable and valid items for scoring Bender Visual Motor Gestalt Test records. (Author)

  10. Maintaining Equivalent Cut Scores for Small Sample Test Forms

    ERIC Educational Resources Information Center

    Dwyer, Andrew C.

    2016-01-01

    This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common-item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common-item equating methodology to standard setting ratings to account for…

  11. Are Increasing Test Scores in Texas Really a Myth?

    ERIC Educational Resources Information Center

    Toenjes, Lawrence A.; Dworkin, A. Gary

    2002-01-01

    Used the same methodology and data used by W. Haney in his study of achievement gains on the Texas high school exit examination to demonstrate that his conclusion about the increased pass rate was not correct. None of the 20% improvement in the exit test pass rate was explained by combined increases in dropout rates or special education…

  12. Allometric Scaling of Wingate Anaerobic Power Test Scores in Women

    ERIC Educational Resources Information Center

    Hetzler, Ronald K.; Stickley, Christopher D.; Kimura, Iris F.

    2011-01-01

    In this study, we developed allometric exponents for scaling Wingate anaerobic test (WAnT) power data that are effective in controlling for body mass (BM) and lean body mass (LBM) and established a normative WAnT data set for college-age women. One hundred women completed a standard WAnT. Allometric exponents and percentile ranks for peak (PP)…

  13. Posttraumatic Stress Disorder and Standardized Test-Taking Ability

    DTIC Science & Technology

    2010-01-01

    factors that affect test-taking ability in young adults is vital. Although scholarly attention has often focused on demographic factors (e.g., gender and race), sufficiently...and the U.S. Army Research Institute for Environmental Medicine. The U.S. Army Medical...Factors that are largely determined by birth, such as

  14. Which spatial abilities and strategies predict males' and females' performance in the object perspective test?

    PubMed

    Meneghetti, Chiara; Pazzaglia, Francesca; De Beni, Rossana

    2012-08-01

    The present study aimed to investigate whether different spatial abilities and strategies sustain perspective-taking (PT) performance in males and females. The PT task used was the Object Perspective Test (OPT, Kozhevnikov and Hegarty in Mem Cogn 29:745-756, 2001; Hegarty and Waller in Intelligence 32:175-191, 2004). A sample of 40 males and 40 females completed the OPT and several other visuo-spatial tasks and questionnaires. Multiple regression analysis showed that OPT performance was predicted positively by a spatial imagery preference and negatively by the specific use of mental rotation strategy (i.e. turning the sheet of paper). Gender interacted with the Embedded Figure Test (EFT), a spatial visualization task, since high EFT scores only positively predicted the OPT results in males. Overall, our results show that OPT performance is sustained by specific spatial abilities and strategies modulated, at least in part, by gender.

  15. Estimating Conditional Distributions of Scores on an Alternate Form of a Test. Research Report. ETS RR-15-18

    ERIC Educational Resources Information Center

    Livingston, Samuel A.; Chen, Haiwen H.

    2015-01-01

    Quantitative information about test score reliability can be presented in terms of the distribution of equated scores on an alternate form of the test for test takers with a given score on the form taken. In this paper, we describe a procedure for estimating that distribution, for any specified score on the test form taken, by estimating the joint…

  16. Does IQ = IQ? Comparability of Intelligence Test Scores in Typically Developing Children.

    PubMed

    Hagmann-von Arx, Priska; Lemola, Sakari; Grob, Alexander

    2016-08-05

    Numerous intelligence tests are available to psychological diagnosticians to assess children's intelligence, but whether they yield comparable test results has been little studied. We examined test scores of 206 typically developing children aged 6 to 11 years on five German intelligence tests (Reynolds Intellectual Assessment Scales; Snijders Oomen Nonverbal Intelligence Test; Intelligence and Development Scales; Wechsler Intelligence Scale for Children, 4th edition; Culture Fair Intelligence Test Scale 2), which were individually administered. On a sample level, the test scores showed strong correlation and little or no mean difference. These results indicate that the tests measure a similar underlying construct, which is interpreted as general intelligence. On an individual level, however, test scores significantly differed across tests for 12% to 38% of the children. Differences did not depend on which test was used but rather on unexplained error. Implications for the application of intelligence assessment in psychological practice are discussed.

  17. Priming competence diminishes the link between cognitive test anxiety and test performance. Implications for the interpretation of test scores.

    PubMed

    Lang, Jonas W B; Lang, Jessica

    2010-06-01

    Researchers disagree whether the correlation between cognitive test anxiety and test performance is causal or explainable by skill deficits, which lead to both cognitive test anxiety and lower test performance. Most causal theories of test anxiety assume that individual differences in cognitive test anxiety originate from differences in self-perceived competence. Accordingly, in the present research, we sought to temporarily heighten perceptions of competence using a priming intervention. Two studies with secondary- and vocational-school students (Ns = 219 and 232, respectively) contrasted this intervention with a no-priming control condition. Priming competence diminished the association between cognitive test anxiety and test performance by heightening the performance of cognitively test-anxious students and by lowering the performance of students with low levels of cognitive test anxiety. The findings suggest that cognitively test-anxious persons have greater abilities than they commonly show. Competency priming may offer a way to improve the situation of people with cognitive test anxiety.

  18. Minority Performance on the Naglieri Nonverbal Ability Test, Second Edition, versus the Cognitive Abilities Test, Form 6: One Gifted Program's Experience

    ERIC Educational Resources Information Center

    Giessman, Jacob A.; Gambrell, James L.; Stebbins, Molly S.

    2013-01-01

    The Naglieri Nonverbal Ability Test, Second Edition (NNAT2), is used widely to screen students for possible inclusion in talent development programs. The NNAT2 claims to provide a more culturally neutral evaluation of general ability than tests such as Form 6 of the Cognitive Abilities Test (CogAT6), which has Verbal and Quantitative batteries in…

  19. Test Score Stability and Construct Validity of the Adult Manifest Anxiety Scale-College Version Scores among College Students: A Brief Report

    ERIC Educational Resources Information Center

    Lowe, Patricia A.; Papanastasiou, Elena C.; DeRuyck, Kimberly A.; Reynolds, Cecil R.

    2005-01-01

    In this study, the authors investigated the temporal stability and construct validity of the Adult Manifest Anxiety Scale-College Version (AMAS-C; C. R. Reynolds, B. O. Richmond, & P. A. Lowe, 2003b) scores. Results indicated that the AMAS-C scores had adequate to excellent test score stability, and evidence supported the construct validity of the…

  20. Effects of Targeted Test Preparation on Scores of Two Tests of Oral English as a Second Language

    ERIC Educational Resources Information Center

    Farnsworth, Tim

    2013-01-01

    This study investigated the effect of targeted test preparation, or coaching, on oral English as a second language test scores. The tests in question were the Basic English Skills Test Plus (BEST Plus), a scripted oral interview published by the Center for Applied Linguistics, and the Versant English Test (VET), a computer-administered and…

  1. Predicting Student Success in a Major's Introductory Biology Course via Logistic Regression Analysis of Scientific Reasoning Ability and Mathematics Scores

    NASA Astrophysics Data System (ADS)

    Thompson, E. David; Bowling, Bethany V.; Markle, Ross E.

    2017-02-01

    Studies over the last 30 years have considered various factors related to student success in introductory biology courses. While much of the available literature suggests that the best predictors of success in a college course are prior college grade point average (GPA) and class attendance, faculty often require a valuable predictor of success in those courses wherein the majority of students are in the first semester and have no previous record of college GPA or attendance. In this study, we evaluated the efficacy of the ACT Mathematics subject exam and Lawson's Classroom Test of Scientific Reasoning in predicting success in a major's introductory biology course. A logistic regression was utilized to determine the effectiveness of a combination of scientific reasoning (SR) scores and ACT math (ACT-M) scores to predict student success. In summary, we found that the model—with both SR and ACT-M as significant predictors—could be an effective predictor of student success and thus could potentially be useful in practical decision making for the course, such as directing students to support services at an early point in the semester.

  2. Science course sequences: The alignment of written, enacted, and tested curricula and their impact on grade 11 HSPA science scores

    NASA Astrophysics Data System (ADS)

    Lentz, Christine A.

    The purpose of this mixed-method study was to examine the alignment of the written, enacted, and tested curricula of the Ocean City High School science course sequencing and its impact on student achievement. This study also examined the school's ability to predict student scores on the science portion of the High School Proficiency Assessment (HSPA). Data collected for science achievement included the science portion of the Grade Eight Proficiency Assessment (GEPA) as a pretest and the scores for the science portion of the HSPA as a posttest. Data collected for curriculum alignment included an examination of teacher-generated course curriculum maps to determine the alignment with the New Jersey Core Curriculum Content Standards and the HSPA Test Specifications Directory. The quantitative data were analyzed through a series of paired-samples t-tests; Pearson product-moment correlation was used to examine relationships between variables, and an ANCOVA and a stepwise regression analysis were also completed. Based on the findings of the data analysis, the following conclusions were drawn: (1) the alignment of the enacted curriculum with the tested and written curricula affected science achievement; (2) GEPA scores are significantly tied to HSPA scores; and (3) GEPA scores and enrollment in the science sequence whose curriculum was aligned with the written and tested curricula met the requirements of a predictor of scores on the HSPA exam. It is expected that educational leadership will use the results of this research to inform practice and drive decision-making with respect to student placement into course sequences. It is hoped that the results will not only increase support for the district's curricula development plan but also add to the overall body of knowledge surrounding science program effectiveness in relation to the No Child Left Behind standards.

  3. Computerized scoring and graphing of the Farnsworth-Munsell 100-hue color vision test.

    PubMed

    Lugo, M; Tiedeman, J S

    1986-04-15

    The Farnsworth-Munsell 100-hue test is a sensitive and accurate test of color discrimination. A major disadvantage of the test is the laborious and time-consuming calculation needed to score the results and plot them on a chart for interpretation. We present a computer program, written in Microsoft's BASIC language, that performs the calculation and reports both the individual color cap error scores (from which the graph is plotted) and the total error score. If used with an IBM personal computer (or compatible) capable of graphics, the program plots a graph in a modified polar coordinate format that can be printed on a dot-matrix printer.
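    The abstract above does not give the scoring rule used by the BASIC program, so the following is only a minimal Python sketch of the conventional hand-scoring procedure as commonly described: each cap's error score is the sum of the absolute differences between its number and the numbers of its two neighbours, minus 2 (the value for a perfectly placed cap), and the total error score is the sum over caps. The function names and the short seven-cap trays are hypothetical, and the sketch ignores the wraparound between the last and first caps of the full 85-cap series.

```python
def cap_error_scores(arrangement):
    """Return {cap_number: error score} for the interior caps of one tray.

    `arrangement` lists cap numbers in the order the subject placed them,
    including the two fixed reference caps at the ends of the tray.
    """
    scores = {}
    for i in range(1, len(arrangement) - 1):
        left, cap, right = arrangement[i - 1], arrangement[i], arrangement[i + 1]
        raw = abs(cap - left) + abs(cap - right)
        scores[cap] = raw - 2  # a correctly placed cap has raw score 2
    return scores


def total_error_score(arrangement):
    """Sum of the individual cap error scores for one tray."""
    return sum(cap_error_scores(arrangement).values())


# Hypothetical short trays: one in perfect order, one with caps 2 and 3 swapped.
perfect = [0, 1, 2, 3, 4, 5, 6]
swapped = [0, 1, 3, 2, 4, 5, 6]
print(total_error_score(perfect), total_error_score(swapped))
```

    The per-cap dictionary corresponds to the values that would be plotted on the polar chart, while the total is the single summary figure.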

  4. A comparison of likelihood ratio tests and Rao's score test for three separable covariance matrix structures.

    PubMed

    Filipiak, Katarzyna; Klein, Daniel; Roy, Anuradha

    2017-01-01

    The problem of testing the separability of a covariance matrix against an unstructured variance-covariance matrix is studied in the context of multivariate repeated measures data using Rao's score test (RST). The RST statistic is developed with the first component of the separable structure as a first-order autoregressive (AR(1)) correlation matrix or an unstructured (UN) covariance matrix under the assumption of multivariate normality. It is shown that the distribution of the RST statistic under the null hypothesis of any separability does not depend on the true values of the mean or the unstructured components of the separable structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Monte Carlo simulations are then used to study the comparative behavior of the null distribution of the RST statistic, as well as that of the LRT statistic, in terms of sample size considerations, and for the estimation of the empirical percentiles. Our findings are compared with existing results where the first component of the separable structure is a compound symmetry (CS) correlation matrix. It is also shown by simulations that the empirical null distribution of the RST statistic converges faster than the empirical null distribution of the LRT statistic to the limiting χ² distribution. The tests are implemented on a real dataset from medical studies.

  5. Evaluation of 2 cognitive abilities tests in a dual-task environment

    NASA Technical Reports Server (NTRS)

    Vidulich, M. A.; Tsang, P. S.

    1986-01-01

    Most real-world operators are required to perform multiple tasks simultaneously. In some cases, such as flying a high-performance aircraft or troubleshooting a failing nuclear power plant, the operator's ability to time share or process in parallel can be driven to extremes. This has created interest in selection tests of cognitive abilities. Two tests that have been suggested are the Dichotic Listening Task and the Cognitive Failures Questionnaire. Correlations between these test results and time-sharing performance were obtained and the validity of these tests was examined. The primary task was a tracking task with dynamically varying bandwidth. This was performed either alone or concurrently with either another tracking task or a spatial transformation task. The results were: (1) An unexpected negative correlation was detected between the two tests; (2) The lack of correlation between either test and task performance made the predictive utility of the test scores appear questionable; (3) Pilots made more errors on the Dichotic Listening Task than college students.

  6. Relationship of Sentence Skills Test Scores and Final Course Grades in Marketing 100.

    ERIC Educational Resources Information Center

    Ryan, Nancy

    1996-01-01

    Describes a study examining the relationship between scores on a Sentence Skills component of an English placement test and final course grades in a community college marketing course. Finds a significant positive correlation between scores and final grades, but one not strong enough to be used for predictive purposes. (13 citations) (BCY)

  7. See It, Be It, Write It: Using Performing Arts to Improve Writing Skills and Test Scores

    ERIC Educational Resources Information Center

    Blecher-Sass, Hope Sara; Moffitt, Maryellen

    2010-01-01

    Improve students' writing skills and boost their assessment scores while adding arts education, creativity, and fun to your writing curriculum. With this vibrant resource, improving writing skills goes hand-in-hand with improving test scores. Students learn how to use acting and visualization as prewriting activities to help them connect writing…

  8. Psychometric Properties of Raw and Scale Scores on Mixed-Format Tests

    ERIC Educational Resources Information Center

    Kolen, Michael J.; Lee, Won-Chan

    2011-01-01

    This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…

  9. From #2 Pencils to the World Wide Web: A History of Test Scoring

    ERIC Educational Resources Information Center

    Zytowski, Donald G.

    2008-01-01

    The present highly developed status of psychological and educational testing in the United States is in part the result of many efforts over the past 100 years to develop economical and reliable methods of scoring. The present article traces a number of methods, ranging from hand scoring to present-day computer applications, stimulated by the need…

  10. The Impact of the 2004 Hurricanes on Florida Comprehensive Assessment Test Scores: Implications for School Counselors

    ERIC Educational Resources Information Center

    Baggerly, Jennifer; Ferretti, Larissa K.

    2008-01-01

    What is the impact of natural disasters on students' statewide assessment scores? To answer this question, Florida Comprehensive Assessment Test (FCAT) scores of 55,881 students in grades 4 through 10 were analyzed to determine if there were significant decreases after the 2004 hurricanes. Results reveal that there was statistical but no practical…

  11. Comparison of the Koppitz and Watkins Scoring Systems for the Bender Gestalt Test.

    ERIC Educational Resources Information Center

    Johnston, Cris W.; Lanak, Brenda

    1985-01-01

    The Bender Gestalt Test was administered to 25 children (7-10 years old) referred for neuropsychological assessment and scored using the Koppitz system and the Watkins system. Although the scores obtained using the two different sets of criteria were highly correlated, the Watkins rules produced generally better performance. (Author/CL)

  12. TOEFL iBT Speaking Test Scores as Indicators of Oral Communicative Language Proficiency

    ERIC Educational Resources Information Center

    Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela

    2012-01-01

    Scores assigned by trained raters and by an automated scoring system (SpeechRater[TM]) on the speaking section of the TOEFL iBT[TM] were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…

  13. Optimal Scoring Methods of Hand-Strength Tests in Patients with Stroke

    ERIC Educational Resources Information Center

    Huang, Sheau-Ling; Hsieh, Ching-Lin; Lin, Jau-Hong; Chen, Hui-Mei

    2011-01-01

    The purpose of this study was to determine the optimal scoring methods for measuring strength of the more-affected hand in patients with stroke by examining the effect of reducing measurement errors. Three hand-strength tests of grip, palmar pinch, and lateral pinch were administered at two sessions in 56 patients with stroke. Five scoring methods…

  14. Using Test Scores from Students with Disabilities in Teacher Effectiveness Indicators

    ERIC Educational Resources Information Center

    Buzick, Heather M.; Jones, Nathan D.

    2015-01-01

    The increased emphasis on using student growth measures in teacher evaluation has raised questions about how to treat test scores from students with disabilities. This study explores the consequences of three common approaches for treating scores from students with disabilities in statistical approaches to estimating teacher effectiveness: (1)…

  15. Language Variation and Score Variation in the Testing of English Language Learners, Native Spanish Speakers

    ERIC Educational Resources Information Center

    Solano-Flores, Guillermo; Li, Min

    2009-01-01

    We investigated language variation and score variation in the testing of English language learners, native Spanish speakers. We gave students the same set of National Assessment of Educational Progress mathematics items in both their first language and their second language. We examined the amount of score variation due to the main and interaction…

  16. Use of Standardized Test Scores to Predict Success in a Computer Applications Course

    ERIC Educational Resources Information Center

    Harris, Robert V.; King, Stephanie B.

    2016-01-01

    The purpose of this study was to see if a relationship existed between American College Testing (ACT) scores (i.e., English, reading, mathematics, science reasoning, and composite) and student success in a computer applications course at a Mississippi community college. The study showed that while the ACT scores were excellent predictors of…

  17. The Effect of Mobility on Texas Assessment of Knowledge and Skills Test Scores

    ERIC Educational Resources Information Center

    Alvarez, Ray

    2006-01-01

    This research studies the effects of mobility on the high-stakes test scores of a Title I South Central Texas school district. The study involved 10, 5th-grade elementary feeder school populations graduating to the 6th grade in 3 middle schools. The researcher compared the 1st administration scores of the Texas Assessment of Knowledge and Skills…

  18. Using Scholastic Aptitude Test Scores as Indicators of State Educational Performance.

    ERIC Educational Resources Information Center

    Dynarski, Mark; Gleason, Philip

    1993-01-01

    The Scholastic Aptitude Test (SAT) is often used to measure educational performance at national, state, and local levels. Because participation rates differ considerably, such comparisons are invalid. This article proposes a regression model framework for adjusting SAT scores. Results are validated by comparing adjusted SAT scores with state…

  19. Estimating Achievement Gaps from Test Scores Reported in Ordinal "Proficiency" Categories

    ERIC Educational Resources Information Center

    Ho, Andrew D.; Reardon, Sean F.

    2012-01-01

    Test scores are commonly reported in a small number of ordered categories. Examples of such reporting include state accountability testing, Advanced Placement tests, and English proficiency tests. This paper introduces and evaluates methods for estimating achievement gaps on a familiar standard-deviation-unit metric using data from these ordered…

  20. Estimating Achievement Gaps from Test Scores Reported in Ordinal "Proficiency" Categories

    ERIC Educational Resources Information Center

    Ho, Andrew D.; Reardon, Sean F.

    2012-01-01

    Test scores are commonly reported in a small number of ordered categories. Examples of such reporting include state accountability testing, Advanced Placement tests, and English proficiency tests. This article introduces and evaluates methods for estimating achievement gaps on a familiar standard-deviation-unit metric using data from these ordered…

  1. Loanwords and Vocabulary Size Test Scores: A Case of Different Estimates for Different L1 Learners

    ERIC Educational Resources Information Center

    Laufer, Batia; McLean, Stuart

    2016-01-01

    The article investigated how the inclusion of loanwords in vocabulary size tests affected the test scores of two L1 groups of EFL learners: Hebrew and Japanese. New BNC- and COCA-based vocabulary size tests were constructed in three modalities: word form recall, word form recognition, and word meaning recall. Depending on the test modality, the…

  2. Chromosome painting in biological dosimetry: assessment of the ability to score stable chromosome aberrations using different pairs of paint probes.

    PubMed Central

    García Sagredo, J M; Vallcorba, I; López-Yarto; Sanchez-Hombre, M D; Resino, M; Ferro, M T

    1996-01-01

    We exposed human peripheral lymphocytes in vitro to 0.3 and 1 Gy of 60Co gamma rays to evaluate whether the ability and sensitivity to detect chromosomal aberrations by chromosome painting depend on the specific paint probes used. To detect structural aberrations (translocations), we painted chromosome spreads simultaneously with two whole-chromosome libraries for chromosomes 1, 2, 3, 4, 5, 6, 7, 11, 13, 16, and 18. To compare the rate of chromosome translocations detected by the different pairs of chromosomes, data were normalized according to the fraction of genome painted and evaluated by unconditional logistic regression. Our results show that any combination of paint probes can be used to score induced chromosomal aberrations. We observed that the number of translocations is dose dependent and quite homogeneous within each dose of radiation, regardless of the chromosomes painted. However, the use of small chromosome probes is not recommended because of the high number of cells to be analyzed due to the small amount of genome painted and because it is more difficult to detect translocations in small chromosomes. PMID:8781367

  3. Objective and subjective hardness of a test item used for evaluating food mixing ability.

    PubMed

    Salleh, N M; Fueki, K; Garrett, N R; Ohyama, T

    2007-03-01

    The aim of this study was to compare objective and subjective hardness of selected common foods with a wax cube used as a test item in a mixing ability test. Objective hardness was determined for 11 foods (cream cheese, boiled fish paste, boiled beef, apple, raw carrot, peanut, soft/hard rice cracker, jelly, plain chocolate and chewing gum) and the wax cube. Peak force (N) to compress each item was obtained from force-time curves generated with the Tensipresser. Perceived hardness ratings of each item were made by 30 dentate subjects (mean age 26.9 years) using a visual analogue scale (100 mm). These subjective assessments were given twice with a 1 week interval. High intraclass correlation coefficients (ICCs) for test-retest reliability were seen for all foods (ICC > 0.68; P < 0.001). One-way ANOVA found a significant effect of food type on both the objective hardness score and the subjective hardness rating (P < 0.001). The wax cube showed a significantly lower objective hardness score (32.6 N) and subjective hardness rating (47.7) than peanut (45.3 N, 63.5) and raw carrot (82.5 N, 78.4) [P < 0.05; Ryan-Einot-Gabriel-Welsch (REGW)-F]. A significant semilogarithmic relationship was found between the logarithm of objective hardness scores and subjective hardness ratings across twelve test items (r = 0.90; P < 0.001). These results suggest the wax cube has a softer texture compared with test foods traditionally used for masticatory performance tests, such as peanut and raw carrot. The hardness of the wax cube could be modified to simulate a range of test foods by changing the mixture ratio of soft and hard paraffin wax.

  4. [French version of TASTE (test for the ability and evaluation)].

    PubMed

    Masson, A M; Cadot, M; Pereira, A M; Depreeuw, E; Ansseau, M

    2001-01-01

    Ability to study and evaluation is only one example of performance among many others, but the research and publications devoted to this issue for more than 50 years, especially in the context of test anxiety and need for achievement, have given it a prototypical dimension. Investigations into motivation also stimulate many scientists and constitute another foundation of this study (13). The level of performance depends on knowledge and motivation (33). Time devoted to study is essential to succeed, so motivation and procrastination are in competition. The importance of reinforcement (extrinsic motivation) and the desire for learning and knowing (intrinsic motivation) are determinant. Other elements must be emphasized: guarantee of obtaining rewards, self-efficacy and causal attribution. These considerations point out the multidimensional and interactive aspects of test anxiety (7, 31). The number of components is not described unanimously but experts agree on emotional, cognitive and behavioral dimensions (25). So, anxiety was approached in terms of its motivational properties, and this was the case until the sixties, as a drive corresponding to a need like thirst or hunger (18); then it was conceptualized in a dynamic context broader than that of stress and coping (29, 30). Last, it constitutes the object of theories highlighting cognitive interference (9, 23, 26) or defective skills (8, 32). Many questionnaires were built without addressing these different aspects and, for instance, without linking the theoretical and therapeutic components of this problem. Committed to the traditional fields of research (test anxiety and need for achievement), to Weiner's work on attribution theory (34) and that of Bandura on self-efficacy (4, 5), E. Depreeuw (10) was particularly interested in Heckhausen's model (16, 17), trying to associate experimental conceptions with the clinical reality. On this basis, he elaborated the TASTE (10, 12, 20): test for ability to

  5. A weighted generalized score statistic for comparison of predictive values of diagnostic tests.

    PubMed

    Kosinski, Andrzej S

    2013-03-15

    Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting.
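    As a reminder of the quantities being compared in the abstract above, the following Python sketch computes positive and negative predictive values from the cells of a 2x2 table. It illustrates only the standard definitions, not the weighted generalized score (WGS) statistic itself, whose formula is not given in the abstract; the function name and the example counts are hypothetical.

```python
def predictive_values(tp, fp, fn, tn):
    """Return (PPV, NPV) from the four cells of a 2x2 diagnostic table.

    tp: test positive, disease present    fp: test positive, disease absent
    fn: test negative, disease present    tn: test negative, disease absent
    """
    ppv = tp / (tp + fp)  # share of positive results that are true positives
    npv = tn / (tn + fn)  # share of negative results that are true negatives
    return ppv, npv


# Hypothetical counts for one diagnostic test applied to 300 patients.
ppv, npv = predictive_values(tp=90, fp=10, fn=30, tn=170)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

    In a paired design such as the one described, both tests are applied to the same patients, so the two PPVs (or NPVs) are correlated and cannot be compared with an independent-samples test.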

  6. Scoring Divergent Thinking Tests by Computer With a Semantics-Based Algorithm

    PubMed Central

    Beketayev, Kenes; Runco, Mark A.

    2016-01-01

    Divergent thinking (DT) tests are useful for the assessment of creative potentials. This article reports the semantics-based algorithmic (SBA) method for assessing DT. This algorithm is fully automated: Examinees receive DT questions on a computer or mobile device and their ideas are immediately compared with norms and semantic networks. This investigation compared the scores generated by the SBA method with the traditional methods of scoring DT (i.e., fluency, originality, and flexibility). Data were collected from 250 examinees using the “Many Uses Test” of DT. The most important finding involved the flexibility scores from both scoring methods. This was critical because semantic networks are based on conceptual structures, and thus a high SBA score should be highly correlated with the traditional flexibility score from DT tests. Results confirmed this correlation (r = .74). This supports the use of algorithmic scoring of DT. The nearly-immediate computation time required by SBA method may make it the method of choice, especially when it comes to moderate- and large-scale DT assessment investigations. Correlations between SBA scores and GPA were insignificant, providing evidence of the discriminant and construct validity of SBA scores. Limitations of the present study and directions for future research are offered. PMID:27298632

  7. Relationships between spatial activities and scores on the mental rotation test as a function of sex.

    PubMed

    Ginn, Sheryl R; Pickens, Stefanie J

    2005-06-01

    Previous results suggested that female college students' scores on the Mental Rotations Test might be related to their prior experience with spatial tasks. For example, women who played video games scored better on the test than their non-game-playing peers, whereas playing video games was not related to men's scores. The present study examined whether participation in different types of spatial activities would be related to women's performance on the Mental Rotations Test. 31 men and 59 women enrolled at a small, private church-affiliated university and majoring in art or music as well as students who participated in intercollegiate athletics completed the Mental Rotations Test. Women's scores on the Mental Rotations Test benefitted from experience with spatial activities; the more types of experience the women had, the better their scores. Thus women who were athletes, musicians, or artists scored better than those women who had no experience with these activities. The opposite results were found for the men. Efforts are currently underway to assess how length of experience and which types of experience are related to scores.

  8. Psychometric Evaluation of the Lower Extremity Computerized Adaptive Test, the Modified Harris Hip Score, and the Hip Outcome Score

    PubMed Central

    Hung, Man; Hon, Shirley D.; Cheng, Christine; Franklin, Jeremy D.; Aoki, Stephen K.; Anderson, Mike B.; Kapron, Ashley L.; Peters, Christopher L.; Pelt, Christopher E.

    2014-01-01

    Background: The applicability and validity of many patient-reported outcome measures in the high-functioning population are not well understood. Purpose: To compare the psychometric properties of the modified Harris Hip Score (mHHS), the Hip Outcome Score activities of daily living subscale (HOS-ADL) and sports (HOS-sports), and the Lower Extremity Computerized Adaptive Test (LE CAT). The hypothesis was that all instruments would perform well but that the LE CAT would show superiority psychometrically because a combination of CAT and a large item bank allows for a high degree of measurement precision. Study Design: Cohort study (diagnosis); Level of evidence, 2. Methods: Data were collected from 472 advanced-age, active participants from the Huntsman World Senior Games in 2012. Validity evidence was examined through item fit, dimensionality, monotonicity, local independence, differential item functioning, person raw score to measure correlation, and instrument coverage (ie, ceiling and floor effects), and reliability evidence was examined through Cronbach alpha and person separation index. Results: All instruments demonstrated good item fit, unidimensionality, monotonicity, local independence, and person raw score to measure correlations. The HOS-ADL had high ceiling effects of 36.02%, and the mHHS had ceiling effects of 27.54%. The LE CAT had ceiling effects of 8.47%, and the HOS-sports had no ceiling effects. None of the instruments had any floor effects. The mHHS had a very low Cronbach alpha of 0.41 and an extremely low person separation index of 0.08. Reliabilities for the LE CAT were excellent and for the HOS-ADL and HOS-sports were good. Conclusion: The LE CAT showed better psychometric properties overall than the HOS-ADL, HOS-sports, and mHHS for the senior population. The mHHS demonstrated pronounced ceiling effects and poor reliabilities that should be of concern. The high ceiling effects for the HOS-ADL were also of concern. The LE CAT was superior
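
    The record above reports Cronbach alpha as one of its reliability measures. The Python sketch below shows the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), using only the standard library. The function name and the input layout (one row of item scores per respondent) are assumptions made for illustration; the study's own data and software are not described in the abstract.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach alpha for a list of rows, one row of item scores per respondent.

    Assumes at least two items, at least two respondents, and total scores
    that are not all identical (otherwise the total variance is zero).
    """
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose to per-item columns
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)
```

    Sample variances are used for both the items and the totals, so the ratio is unaffected by the choice between sample and population variance as long as the same one is used throughout.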

  9. New and updated tests of print exposure and reading abilities in college students

    PubMed Central

    Acheson, Daniel J.; Wells, Justine B.; MacDonald, Maryellen C.

    2010-01-01

    The relationship between print exposure and measures of reading skill was examined in college students (N = 99, 58 female; mean age = 20.3 years). Print exposure was measured with several new self-reports of reading and writing habits, as well as updated versions of the Author Recognition Test and the Magazine Recognition Test (Stanovich & West, 1989). Participants completed a sentence comprehension task with syntactically complex sentences, and reading times and comprehension accuracy were measured. An additional measure of reading skill was provided by participants’ scores on the verbal portions of the ACT, a standardized achievement test. Higher levels of print exposure were associated with higher sentence processing abilities and superior verbal ACT performance. The relative merits of different print exposure assessments are discussed. PMID:18411551

  10. An Analysis of Cross Racial Identity Scale Scores Using Classical Test Theory and Rasch Item Response Models

    ERIC Educational Resources Information Center

    Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie

    2013-01-01

    Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…

  11. Accuracy of Test-Score-Difference Decisions and the Jagged-Profile Effect

    ERIC Educational Resources Information Center

    Clawar, Harry J.; Hopkins, Thomas F.

    1975-01-01

    The present paper emphasizes the importance of making interpretations regarding differences among an individual's scores on a test battery based on this same concern for the appropriateness of the reference group. (Author)

  12. Effect of Two Types of Footwear on Physical Fitness Test Scores.

    DTIC Science & Technology

    A total of 53 U.S. Marines took the Physical Fitness Test (less push-ups and sit-ups) in both Marine Corps issue boots and tennis shoes. Scores were significantly better when tennis shoes were worn. (Author)

  13. A Maturing Global Testing Regime Meets the World Economy: Test Scores and Economic Growth, 1960-2012

    ERIC Educational Resources Information Center

    Kamens, David H.

    2015-01-01

    This article considers the growth of the international testing regime. It discusses sources of growth and empirically examines two related sets of issues: (1) the stability of countries' achievement scores, and (2) the influence of those national scores on subsequent economic development over different time lags. The article suggests that…

  14. Estimating the Relationship between Use of Test-Preparation Methods and Scores on the Graduate Management Admission Test.

    ERIC Educational Resources Information Center

    Leary, Linda F.; Wightman, Lawrence E.

    This study sought to examine the relationship between five methods of test preparation and test performance as measured by Graduate Management Admission Test (GMAT) Verbal, Quantitative and Total scores. Data on method of test preparation were obtained through voluntary examinee response to five questions which appeared on the answer sheets. One…

  15. Reliability of the Koppitz scoring system for the Bender Gestalt Test.

    PubMed

    Hustak, T L; Dinning, W D; Andert, J N

    1976-04-01

    This study investigated the test-retest reliability of the Koppitz scoring system with Bender Gestalt protocols of adult retardates. Results of a sample of 74 adult retardates yielded a correlation of .80 over an interval of 8 to 146 months. A directional measure of change between error scores on the first and second administrations was not significant, which suggests that the test-retest reliability coefficient is an accurate estimate of the Koppitz scoring system for adult retardates. Scorer reliability for three independent scorers ranged from .92 to .95, which suggests comparability to other investigations with different populations.

  16. Generalization of the Lord-Wingersky Algorithm to Computing the Distribution of Summed Test Scores Based on Real-Number Item Scores

    ERIC Educational Resources Information Center

    Kim, Seonghoon

    2013-01-01

    With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
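
    The original Lord-Wingersky recursion for number-correct scores, which this article generalizes, can be sketched as follows. The sketch covers only the dichotomous case; the generalization to real-number item scores is not reproduced because the abstract is truncated. The two-parameter logistic function and all names are assumptions made for illustration.

```python
import math

def p_correct_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def lord_wingersky(probs):
    """Conditional distribution of the number-correct score.

    probs: probability of a correct answer to each item at one fixed
    proficiency level. Returns a list where entry x is P(score = x).
    """
    dist = [1.0]  # before any item is added, the score is 0 with certainty
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for x, f in enumerate(dist):
            new[x] += f * (1.0 - p)   # item answered incorrectly: score stays x
            new[x + 1] += f * p       # item answered correctly: score becomes x + 1
        dist = new
    return dist

# Usage: three hypothetical items evaluated at proficiency theta = 0.0
items = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.7)]  # (a, b) pairs, made up
score_dist = lord_wingersky([p_correct_2pl(0.0, a, b) for a, b in items])
```

    Each pass adds one item, so the cost grows with the square of the test length rather than with the number of response patterns.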

  17. Determining When Single Scoring for Constructed-Response Items Is as Effective as Double Scoring in Mixed-Format Licensure Tests

    ERIC Educational Resources Information Center

    Kim, Sooyeon; Moses, Tim

    2013-01-01

    The major purpose of this study is to assess the conditions under which single scoring for constructed-response (CR) items is as effective as double scoring in the licensure testing context. We used both empirical datasets of five mixed-format licensure tests collected in actual operational settings and simulated datasets that allowed for the…

  18. Test Score Stability and the Relationship of Adult Manifest Anxiety Scale-College Version Scores to External Variables among Graduate Students

    ERIC Educational Resources Information Center

    Lowe, Patricia A.; Peyton, Vicki; Reynolds, Cecil R.

    2007-01-01

    A sample of 79 individuals participated in the present study to evaluate the test score stability (8-week test-retest interval) and construct validity of the scores of the Adult Manifest Anxiety Scale-College Version, a new measure used to assess anxiety in college students, for application to graduate-level students. Results of the study…

  19. The Validity of Scores from the "GRE"® revised General Test for Forecasting Performance in Business Schools: Phase One. ETS GRE® Board Research Report. ETS GRE®-14-01. ETS Research Report. RR-14-17

    ERIC Educational Resources Information Center

    Young, John W.; Klieger, David; Bochenek, Jennifer; Li, Chen; Cline, Fred

    2014-01-01

    Scores from the "GRE"® revised General Test provide important information regarding the verbal and quantitative reasoning abilities and analytical writing skills of applicants to graduate programs. The validity and utility of these scores depend upon the degree to which the scores predict success in graduate and business school in…

  20. Vowel Deletion and Cloze Tests Compared with a Reading Ability Test.

    ERIC Educational Resources Information Center

    Lisman, Linda C.

    Fifty-seven seventh and 60 eighth graders were divided into three reading ability groups. All were given the Wechsler Intelligence Scale for Children (WISC) prior to the study and the Gates-MacGinitie Reading Test Survey E for grades 7 to 9 immediately after the study. A practice sample was given before the tests on prepared vowel deletion and…

  1. The 5-Step Way to Raise Test Scores: Using the Data to Drive Your Instruction

    ERIC Educational Resources Information Center

    East, Pam C.

    2005-01-01

    Many teachers look at standardized tests as something to be dreaded. This author and teacher looks at standardized-test scores and sees a tool to bring student learning to new heights. This is a way for teachers to target instruction exactly where it's needed. A way to get students looking forward to end-of-the-year tests (really!) as a way to…

  2. A Study of the Relationship between Scores and Time on Tests.

    ERIC Educational Resources Information Center

    Kennedy, Rob

    The purpose of this study was to investigate the relationship between the scores students earned on multiple choice tests and the number of minutes students required to complete the tests. The 5 tests were made up of 20 randomly drawn questions from a large pool of questions about research methods. Students were allowed an unlimited amount of time…

  3. Self Adapted Testing as Formative Assessment: Effects of Feedback and Scoring on Engagement and Performance

    ERIC Educational Resources Information Center

    Arieli-Attali, Meirav

    2016-01-01

    This dissertation investigated the feasibility of self-adapted testing (SAT) as a formative assessment tool with the focus on learning. Under two different orientation goals--to excel on a test (performance goal) or to learn from the test (learning goal)--I examined the effect of different scoring rules provided as interactive feedback, on test…

  4. Interpretation and Utilization of Scores on the Air Force Officer Qualifying Test.

    ERIC Educational Resources Information Center

    Miller, Robert E.

    The report summarizes a large body of data relevant to the proper interpretation and use of aptitude scores on the Air Force Officer Qualifying Test (AFOQT). Included are descriptions of the AFOQT testing program and the test itself. Technical data include an extensive sampling of validation studies covering predictors of success in pilot…

  5. The Relationship between Career Maturity Test Scores and Appropriateness of Career Choices: A Replication.

    ERIC Educational Resources Information Center

    Westbrook, Bert W.; And Others

    1990-01-01

    Attempted to replicate study determining relationship between appropriateness of career choices and career maturity test scores in rural ninth grade students (N=112) using Goal Selection scale of Career Maturity Inventory Competence Test and American College Testing Program Career Planning Program. Found two career maturity measures correlated…

  6. Commentary: Student Cognition, the Situated Learning Context, and Test Score Interpretation

    ERIC Educational Resources Information Center

    La Marca, Paul M.

    2006-01-01

    Although it is assumed that student cognition contributes to student performance on achievement tests, it may be that current testing models lack the degree of specification necessary to warrant such inferences. With test score interpretations as the referent, the authors in this special issue address the role of student cognition in learning and…

  7. Reading ability and print exposure: item response theory analysis of the author recognition test.

    PubMed

    Moore, Mariah; Gordon, Peter C

    2015-12-01

    In the author recognition test (ART), participants are presented with a series of names and foils and are asked to indicate which ones they recognize as authors. The test is a strong predictor of reading skill, and this predictive ability is generally explained as occurring because author knowledge is likely acquired through reading or other forms of print exposure. In this large-scale study (1,012 college student participants), we used item response theory (IRT) to analyze item (author) characteristics in order to facilitate identification of the determinants of item difficulty, provide a basis for further test development, and optimize scoring of the ART. Factor analysis suggested a potential two-factor structure of the ART, differentiating between literary and popular authors. Effective and ineffective author names were identified so as to facilitate future revisions of the ART. Analyses showed that the ART is a highly significant predictor of the time spent encoding words, as measured using eyetracking during reading. The relationship between the ART and time spent reading provided a basis for implementing a higher penalty for selecting foils, rather than the standard method of ART scoring (names selected minus foils selected). The findings provide novel support for the view that the ART is a valid indicator of reading volume. Furthermore, they show that frequency data can be used to select items of appropriate difficulty, and that frequency data from corpora based on particular time periods and types of texts may allow adaptations of the test for different populations.

  8. Reading Ability and Print Exposure: Item Response Theory Analysis of the Author Recognition Test

    PubMed Central

    Moore, Mariah; Gordon, Peter C.

    2015-01-01

    In the Author Recognition Test (ART) participants are presented with a series of names and foils and are asked to indicate which ones they recognize as authors. The test is a strong predictor of reading skill, with this predictive ability generally explained as occurring because author knowledge is likely acquired through reading or other forms of print exposure. This large-scale study (1012 college student participants) used Item Response Theory (IRT) to analyze item (author) characteristics to facilitate identification of the determinants of item difficulty, provide a basis for further test development, and to optimize scoring of the ART. Factor analysis suggests a potential two factor structure of the ART differentiating between literary vs. popular authors. Effective and ineffective author names were identified so as to facilitate future revisions of the ART. Analyses showed that the ART is a highly significant predictor of time spent encoding words as measured using eye-tracking during reading. The relationship between the ART and time spent reading provided a basis for implementing a higher penalty for selecting foils, rather than the standard method of ART scoring (names selected minus foils selected). The findings provide novel support for the view that the ART is a valid indicator of reading volume. Further, they show that frequency data can be used to select items of appropriate difficulty and that frequency data from corpora based on particular time periods and types of text may allow test adaptation for different populations. PMID:25410405
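
    The standard ART scoring rule described in these two records (names selected minus foils selected) and a heavier foil penalty can be sketched as below. The abstract does not state the penalty the authors adopted, so `foil_penalty` is a hypothetical parameter; the default of 1.0 reproduces the standard rule.

```python
def art_score(selected, authors, foils, foil_penalty=1.0):
    """Score one participant's Author Recognition Test checklist.

    selected: names the participant marked as authors
    authors:  the real author names on the checklist
    foils:    the non-author names on the checklist
    foil_penalty: points deducted per selected foil (hypothetical parameter;
                  1.0 gives the standard 'names minus foils' score)
    """
    selected = set(selected)
    hits = len(selected & set(authors))
    false_alarms = len(selected & set(foils))
    return hits - foil_penalty * false_alarms

# Usage with placeholder names (not actual ART items)
authors = {"Author A", "Author B", "Author C"}
foils = {"Foil X", "Foil Y"}
standard = art_score(["Author A", "Author B", "Foil X"], authors, foils)
penalized = art_score(["Author A", "Author B", "Foil X"], authors, foils, foil_penalty=2.0)
```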

  9. Early Mathematics Skills from Prekindergarten to First Grade: Score Changes and Ability Group Differences in Kentucky, Nebraska, and Shanghai Samples

    ERIC Educational Resources Information Center

    Ryoo, Ji Hoon; Molfese, Victoria J.; Heaton, Ruth; Zhou, Xin; Brown, E. Todd; Prokasky, Amanda; Davis, Erika

    2014-01-01

    The 2011 Trends in International Mathematics and Science Study shows average mathematics scores of U.S. fourth graders are lower than children in many Asian countries. There are questions about differences in mathematics skills at younger ages. This study examines differences in score growth for High-, Average-, and Low-performing children in two…

  10. Self-Esteem Scores among Deaf College Students: An Examination of Gender and Parents' Hearing Status and Signing Ability.

    ERIC Educational Resources Information Center

    Crowe, Teresa V.

    2003-01-01

    A study involving 152 college students with deafness found students who had at least one parent with deafness and signed scored significantly higher on self-esteem measures than those with hearing parents who could or who could not sign. Overall, self-esteem scores for all respondents were high. (Contains references.) (Author/CR)

  11. Study Protocol on Intentional Distortion in Personality Assessment: Relationship with Test Format, Culture, and Cognitive Ability

    PubMed Central

    Van Geert, Eline; Orhon, Altan; Cioca, Iulia A.; Mamede, Rui; Golušin, Slobodan; Hubená, Barbora; Morillo, Daniel

    2016-01-01

    Self-report personality questionnaires, traditionally offered in a graded-scale format, are widely used in high-stakes contexts such as job selection. However, job applicants may intentionally distort their answers when filling in these questionnaires, undermining the validity of the test results. Forced-choice questionnaires are allegedly more resistant to intentional distortion compared to graded-scale questionnaires, but they generate ipsative data. Ipsativity violates the assumptions of classical test theory, distorting the reliability and construct validity of the scales, and producing interdependencies among the scores. This limitation is overcome in the current study by using the recently developed Thurstonian item response theory model. As online testing in job selection contexts is increasing, the focus will be on the impact of intentional distortion on personality questionnaire data collected online. The present study intends to examine the effect of three different variables on intentional distortion: (a) test format (graded-scale versus forced-choice); (b) culture, as data will be collected in three countries differing in their attitudes toward intentional distortion (the United Kingdom, Serbia, and Turkey); and (c) cognitive ability, as a possible predictor of the ability to choose the more desirable responses. Furthermore, we aim to integrate the findings using a comprehensive model of intentional distortion. In the Anticipated Results section, three main aspects are considered: (a) the limitations of the manipulation, theoretical approach, and analyses employed; (b) practical implications for job selection and for personality assessment in a broader sense; and (c) suggestions for further research. PMID:27445902

  12. Study Protocol on Intentional Distortion in Personality Assessment: Relationship with Test Format, Culture, and Cognitive Ability.

    PubMed

    Van Geert, Eline; Orhon, Altan; Cioca, Iulia A; Mamede, Rui; Golušin, Slobodan; Hubená, Barbora; Morillo, Daniel

    2016-01-01

    Self-report personality questionnaires, traditionally offered in a graded-scale format, are widely used in high-stakes contexts such as job selection. However, job applicants may intentionally distort their answers when filling in these questionnaires, undermining the validity of the test results. Forced-choice questionnaires are allegedly more resistant to intentional distortion compared to graded-scale questionnaires, but they generate ipsative data. Ipsativity violates the assumptions of classical test theory, distorting the reliability and construct validity of the scales, and producing interdependencies among the scores. This limitation is overcome in the current study by using the recently developed Thurstonian item response theory model. As online testing in job selection contexts is increasing, the focus will be on the impact of intentional distortion on personality questionnaire data collected online. The present study intends to examine the effect of three different variables on intentional distortion: (a) test format (graded-scale versus forced-choice); (b) culture, as data will be collected in three countries differing in their attitudes toward intentional distortion (the United Kingdom, Serbia, and Turkey); and (c) cognitive ability, as a possible predictor of the ability to choose the more desirable responses. Furthermore, we aim to integrate the findings using a comprehensive model of intentional distortion. In the Anticipated Results section, three main aspects are considered: (a) the limitations of the manipulation, theoretical approach, and analyses employed; (b) practical implications for job selection and for personality assessment in a broader sense; and

  13. Validity of the General Conceptual Ability Score from the Differential Ability Scales as a Function of Significant and Rare Interfactor Variability

    ERIC Educational Resources Information Center

    Kotz, Kasey M.; Watkins, Marley W.; McDermott, Paul A.

    2008-01-01

    Some researchers have argued that discrepant broad index scores invalidate IQs, but others have questioned the fundamental logic of that argument. To resolve this debate, the present study used a nationally representative sample of children (N = 1,200) who were matched individually for IQ. Children with significantly uneven broad index score…

  14. An electrophysiological correlate of Eating Attitudes Test scores in female college students.

    PubMed

    Wilson, J F; Mercer, J C

    1990-11-01

    Eating Attitudes Test (EAT) scores of forty female college students were compared to their electrodermal activity (EDA) responses when offered a plate of chocolate chip cookies. A significant positive correlation was detected between the EAT scores and the skin conductivity measures associated with the presentation of food. Women with the highest EAT scores also exhibited the greatest sympathetic nervous system responses to a plate of cookies. This finding supports the conclusion that the EAT is capable of identifying individuals who are preoccupied with food or anxious about eating.

  15. Piloting a Polychotomous Partial-Credit Scoring Procedure in a Multiple-Choice Test

    ERIC Educational Resources Information Center

    Tsopanoglou, Antonios; Ypsilandis, George S.; Mouti, Anna

    2014-01-01

    Multiple-choice (MC) tests are frequently used to measure language competence because they are quick, economical and straightforward to score. While degrees of correctness have been investigated for partially correct responses in combined-response MC tests, degrees of incorrectness in distractors and the role they play in determining the…

  16. The Relationship of Motivational Values of Math and Reading Teachers to Student Test Score Gains

    ERIC Educational Resources Information Center

    Loewen, David Allen

    2013-01-01

    This exploratory correlational study seeks to answer the question of whether a relationship exists between student average test score gains on state exams and teachers' rating of values on the Schwartz Values Survey. Eighty-seven randomly selected Kansas teachers of math and/or reading, grades four through eight, participated. Student test score…

  17. The Effects of Using Selected Metacognitive Strategies on ACT Mathematics Sub-Test Scores

    ERIC Educational Resources Information Center

    LeMay, Jeffrey W.

    2016-01-01

    This quantitative study, which used a quasi-experimental posttest-only control group design, examined whether members of an experimental group of participants who utilized two metacognitive strategy training regimens experienced a significant increase in their ACT mathematics sub-test scores compared to a group of students who did not utilize either of…

  18. The Disaggregation of Value-Added Test Scores to Assess Learning Outcomes in Economics Courses

    ERIC Educational Resources Information Center

    Walstad, William B.; Wagner, Jamie

    2016-01-01

    This study disaggregates posttest, pretest, and value-added or difference scores in economics into four types of economic learning: positive, retained, negative, and zero. The types are derived from patterns of student responses to individual items on a multiple-choice test. The micro and macro data from the "Test of Understanding in College…

  19. Comparison of Standardized Test Scores from Traditional Classrooms and Those Using Problem-Based Learning

    ERIC Educational Resources Information Center

    Needham, Martha Elaine

    2010-01-01

    This research compares differences between standardized test scores in problem-based learning (PBL) classrooms and a traditional classroom for 6th grade students using a mixed-method, quasi-experimental and qualitative design. The research shows that problem-based learning is as effective as traditional teaching methods on standardized tests. The…

  20. Two for One: Using QAR to Increase Reading Comprehension and Improve Test Scores

    ERIC Educational Resources Information Center

    Green, Susan

    2016-01-01

    This teaching tip describes an intervention used in a third-grade classroom implemented to help students pass an end-of-grade reading comprehension test. Low scores on a practice end-of-grade comprehension test prompted a re-examination of classroom reading instruction and a plan for intervention. This teaching tip describes the phases implemented…

  1. The Hand Test Acting Out Score as a Predictor of Acting Out in Correctional Settings

    ERIC Educational Resources Information Center

    Porecki, Daniel; Vandergoot, David

    1978-01-01

    The Hand Test was administered to 107 maximum-security prison inmates. The Acting Out Score (AOS) was computed, and one year later, actual acting out for the same inmates was recorded. Determined that the Hand Test AOS is useful in identifying inmates who have potential for acting out. (Author)

  2. Bi-Factor MIRT Observed-Score Equating for Mixed-Format Tests

    ERIC Educational Resources Information Center

    Lee, Guemin; Lee, Won-Chan

    2016-01-01

    The main purposes of this study were to develop bi-factor multidimensional item response theory (BF-MIRT) observed-score equating procedures for mixed-format tests and to investigate relative appropriateness of the proposed procedures. Using data from a large-scale testing program, three types of pseudo data sets were formulated: matched samples,…

  3. Investigation and Treatment of Missing Item Scores in Test and Questionnaire Data

    ERIC Educational Resources Information Center

    Sijtsma, Klaas; van der Ark, L. Andries

    2003-01-01

    This article first discusses a statistical test for investigating whether or not the pattern of missing scores in a respondent-by-item data matrix is random. Since this is an asymptotic test, we investigate whether it is useful in small but realistic sample sizes. Then, we discuss two known simple imputation methods, person mean (PM) and two-way…
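
    Person mean (PM) imputation, the first of the two methods named above, can be sketched as follows: each missing item score is replaced by the mean of that respondent's observed item scores. The second method is cut off in this record and is not reconstructed here. Representing missing scores as `None` and leaving the imputed value unrounded are assumptions of this sketch.

```python
def person_mean_impute(matrix):
    """Fill missing item scores with the respondent's own mean.

    matrix: list of respondents, each a list of item scores, with None
    marking a missing score (an assumed encoding). Respondents with no
    observed scores are left unchanged.
    """
    completed = []
    for row in matrix:
        observed = [x for x in row if x is not None]
        if observed:
            mean = sum(observed) / len(observed)
            completed.append([mean if x is None else x for x in row])
        else:
            completed.append(list(row))
    return completed
```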

  4. Stochastic Processes as True-Score Models for Highly Speeded Mental Tests.

    ERIC Educational Resources Information Center

    Moore, William E.

    The previous theoretical development of the Poisson process as a strong model for the true-score theory of mental tests is discussed, and additional theoretical properties of the model from the standpoint of individual examinees are developed. The paper introduces the Erlang process as a family of test theory models and shows in the context of…

  5. An Investigation of the Effectiveness of Vocabulary Learning Strategies on Iranian EFL Learners' Vocabulary Test Score

    ERIC Educational Resources Information Center

    Rahimy, Ramin; Shams, Kiana

    2012-01-01

    This study aims to investigate the effectiveness of vocabulary learning strategies on Iranian EFL learners' vocabulary test score. To achieve this aim, fifty Intermediate level students from Kish English Institute were randomly selected from among fifteen classes after administering the Oxford Placement Test (OPT). Then, an intermediate level…

  6. Pragmatism or Gaming the System? One School District's Solution to Low Test Scores

    ERIC Educational Resources Information Center

    McKenzie, Kathryn Bell

    2009-01-01

    In this era of accountability and high stakes testing, district and school administrators are vigilant in their attention to student test scores and the ramifications these have for district and school performance labels. In other words, no school or district wants to be labeled "low performing." This case, based on a real situation, demonstrates…

  7. Investigating Score Dependability in English/Chinese Interpreter Certification Performance Testing: A Generalizability Theory Approach

    ERIC Educational Resources Information Center

    Han, Chao

    2016-01-01

    As a property of test scores, reliability/dependability constitutes an important psychometric consideration, and it underpins the validity of measurement results. A review of interpreter certification performance tests (ICPTs) reveals that (a) although reliability/dependability checking has been recognized as an important concern, its theoretical…

  8. Score Reliability of a Test Composed of Passage-Based Testlets: A Generalizability Theory Perspective.

    ERIC Educational Resources Information Center

    Lee, Yong-Won

    The purpose of this study was to investigate the impact of local item dependence (LID) in passage-based testlets on the test score reliability of an English as a Foreign Language (EFL) reading comprehension test from the perspective of generalizability (G) theory. Definitions and causes of LID in passage-based testlets are reviewed within the…

  9. Zertifikat Deutsch als Fremdsprache and the Oral Proficiency Interview: A Comparison of Test Scores and Examinations.

    ERIC Educational Resources Information Center

    Lalande, John F.; Schweckendiek, Jurgen

    1986-01-01

    Investigates what correlations might exist between an individual's score on the Zertifikat Deutsch als Fremdsprache and on the Oral Proficiency Interview. The tests themselves are briefly described. Results indicate that the two tests appear to correlate well in their evaluation of speaking skills. (SED)

  10. A new assessment of the normal ranges of the Farnsworth-Munsell 100-hue test scores.

    PubMed

    Verriest, G; Van Laethem, J; Uvijls, A

    1982-05-01

    We gave the Farnsworth-Munsell 100-hue color vision test to 232 normal subjects between 10 and 80 years of age. One half the subjects underwent binocular testing followed by monocular testing. In the other half monocular testing preceded binocular testing. Performance was better with both eyes than with either eye alone. The worst performance occurred on monocular tests in subjects without previous experience with the task (that is, those for whom this was the first test). The well-known age trend was apparent (children and elderly have the worst color vision). New data are provided for judging the point at which the total error score may be considered pathologic.

  11. Predicting Scores on the College-Level Examination Program (CLEP) General Examinations from Scores Earned on the American College Test (ACT) Assessment.

    ERIC Educational Resources Information Center

    Nimmer, Donald N.; Shakiba-Nejad, Hadi

    The study was conducted to provide formulae by which College-Level Examination Program (CLEP) General Examination scores may be predicted from scores earned on the American College Test (ACT) Assessment. Five basic areas of liberal arts achievement are measured by the CLEP General Examinations: English Composition, Humanities, Mathematics, Natural…

  12. The Impact of Individual Ability, Favorable Team Member Scores, and Student Perception of Course Importance on Student Preference of Team-Based Learning and Grading Methods

    ERIC Educational Resources Information Center

    Su, Allan Yen-Lun

    2007-01-01

    This study explores the impact of individual ability and favorable team member scores on student preference of team-based learning and grading methods, and examines the moderating effects of student perception of course importance on student preference of team-based learning and grading methods. The author also investigates the relationship…

  13. Developing a Measure of General Academic Ability: An Application of Maximal Reliability and Optimal Linear Combination to High School Students' Scores

    ERIC Educational Resources Information Center

    Dimitrov, Dimiter M.; Raykov, Tenko; AL-Qataee, Abdullah Ali

    2015-01-01

    This article is concerned with developing a measure of general academic ability (GAA) for high school graduates who apply to colleges, as well as with the identification of optimal weights of the GAA indicators in a linear combination that yields a composite score with maximal reliability and maximal predictive validity, employing the framework of…

  14. The impact of individual ability, favorable team member scores, and student perception of course importance on student preference of team-based learning and grading methods.

    PubMed

    Su, Allan Yen-Lun

    2007-01-01

    This study explores the impact of individual ability and favorable team member scores on student preference of team-based learning and grading methods, and examines the moderating effects of student perception of course importance on student preference of team-based learning and grading methods. The author also investigates the relationship between student perception of course importance and their responses to social loafing. Results indicate that individual ability on the preference of team-based learning was affected by the three levels of favorable team member scores. For students with a low level of individual ability, the preference for team-based learning was significant among students with each of three levels of favorable team member scores (p < .05). However, the team-based learning and grading methods was not significant (p > .05). The findings also reveal a negative correlation between student perception of course importance and their responses to social loafing (p < .05). Findings note the importance of teachers' grading methods, student perceptions of course importance as well as individual ability and favorable team member scores in the team selection process to promote student attitude toward team-based learning.

  15. Discriminative Ability of CHC Factor Scores from the WJ III Tests of Cognitive Abilities in Children with ADHD

    ERIC Educational Resources Information Center

    Rowland, Julie Elizabeth

    2013-01-01

    Students with attention-deficit/hyperactivity disorder (ADHD) make up approximately 5% of the school-aged population and they often experience significant difficulties in school, particularly in the areas of academics, disruptive behavior, and social relationships. A diagnosis of ADHD does not provide guidance for creating interventions to address…

  16. A Score Based on Screening Tests to Differentiate Mild Cognitive Impairment from Subjective Memory Complaints

    PubMed Central

    de Gobbi Porto, Fábio Henrique; Spíndola, Lívia; de Oliveira, Maira Okada; Figuerêdo do Vale, Patrícia Helena; Orsini, Marco; Nitrini, Ricardo; Dozzi Brucki, Sonia Maria

    2013-01-01

    It is not easy to differentiate patients with mild cognitive impairment (MCI) from subjective memory complainers (SMC). Assessments with screening cognitive tools are essential, particularly in primary care where most patients are seen. The objective of this study was to evaluate the diagnostic accuracy of screening cognitive tests and to propose a score derived from screening tests. Elderly subjects with memory complaints were evaluated using the Mini Mental State Examination (MMSE) and the Brief Cognitive Battery (BCB). We added two delayed recalls in the MMSE (a delayed recall and a late-delayed recall, LDR), and also a phonemic fluency test of letter P fluency (LPF). A score was created based on these tests. The diagnoses were made on the basis of clinical consensus and neuropsychological testing. Receiver operating characteristic curve analyses were used to determine the area under the curve (AUC), sensitivity and specificity for each test separately and for the final proposed score. MMSE, LDR, LPF and delayed recall of BCB scores differed significantly between groups (P=0.000, 0.03, 0.001 and 0.01, respectively). Sensitivity, specificity and AUC were MMSE: 64%, 79% and 0.75 (cut off <29); LDR: 56%, 62% and 0.62 (cut off <3); LPF: 71%, 71% and 0.71 (cut off <14); delayed recall of BCB: 56%, 82% and 0.68 (cut off <9). The proposed score reached sensitivities of 88% and 76% and specificities of 62% and 75% for cut offs over 1 and over 2, respectively. The AUC was 0.81. In conclusion, a score created from screening tests is capable of discriminating MCI from SMC with moderate to good accuracy. PMID:24147213
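
    The quantities reported above (sensitivity and specificity at a cut-off, and AUC) can be illustrated with a minimal sketch. It assumes that a score below the cut-off counts as a positive screen, as in "cut off <29". The function names and the ten scores are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Treat score < cutoff as a positive screen; return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_condition):
        positive = score < cutoff
        if diseased:
            if positive:
                tp += 1  # case correctly flagged
            else:
                fn += 1  # case missed
        else:
            if positive:
                fp += 1  # control wrongly flagged
            else:
                tn += 1  # control correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


def auc_low_scores_positive(scores, has_condition):
    """AUC as the chance that a random case scores below a random control (ties count half)."""
    cases = [s for s, d in zip(scores, has_condition) if d]
    controls = [s for s, d in zip(scores, has_condition) if not d]
    wins = sum((c < k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical screening scores (not study data): five cases, then five controls.
scores = [24, 26, 27, 28, 29, 30, 28, 29, 30, 30]
has_mci = [True] * 5 + [False] * 5

sens, spec = sensitivity_specificity(scores, has_mci, cutoff=29)
auc = auc_low_scores_positive(scores, has_mci)
print(sens, spec, auc)
```

    Raising the cut-off flags more subjects, which tends to raise sensitivity and lower specificity; the AUC summarizes that trade-off across all cut-offs.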

  17. Do We Really Become Smarter When Our Fluid-Intelligence Test Scores Improve?

    PubMed

    Hayes, Taylor R; Petrov, Alexander A; Sederberg, Per B

    2015-01-01

    Recent reports of training-induced gains on fluid intelligence tests have fueled an explosion of interest in cognitive training, now a billion-dollar industry. The interpretation of these results is questionable because score gains can be dominated by factors that play marginal roles in the scores themselves, and because intelligence gain is not the only possible explanation for the observed control-adjusted far transfer across tasks. Here we present novel evidence that the test score gains used to measure the efficacy of cognitive training may reflect strategy refinement instead of intelligence gains. A novel scanpath analysis of eye movement data from 35 participants solving Raven's Advanced Progressive Matrices in two separate sessions indicated that one-third of the variance of score gains could be attributed to test-taking strategy alone, as revealed by characteristic changes in eye-fixation patterns. When the strategic contaminant was partialled out, the residual score gains were no longer significant. These results are compatible with established theories of skill acquisition suggesting that procedural knowledge tacitly acquired during training can later be utilized at posttest. Our method and result both give reason to be wary of purported intelligence gains and provide a way forward for testing for them in the future.

  18. The Relationship between Scholastic Aptitude Test Scores and the Academic Success of Industrial Arts/Technology Education Majors.

    ERIC Educational Resources Information Center

    Wescott, Jack W.

    1989-01-01

    A study of 107 graduates of industrial arts teacher education programs found statistically significant relationships between Scholastic Aptitude Test (SAT) scores and grade point average (GPA); between SAT scores and major GPA; especially between combined math and verbal scores and major GPA. It was concluded that SAT scores are important factors…

  19. Improvement in national test reading scores at Key Stage 1; grade inflation or better achievement?

    PubMed

    Meadows, Sara; Herrick, David; Feiler, Anthony

    2007-02-01

    The aim of the UK National Literacy Strategy is to raise standards in literacy. Strong evidence for its success has, however, been lacking: most of the available data comes from performance on tests administered in schools or from Office for Standards in Education reports and is vulnerable to suggestions of bias. An opportunistic analysis of data from a population cohort study extending over three school years compares school-based scores at school entry and at age 7-8 with independently administered scores on similar tests. The results show a small but statistically significant rise between 1998 and 1999 and between 1998 and 2000 in scores on both Key Stage 1 Reading Standard Assessment Tasks taken in schools and the reading component of the WORD test taken independently. This is clear evidence for a real rise in reading attainment over this period, which may be attributable to the children's experience of the National Literacy Strategy.

  20. Test Operations Procedure (TOP) 03-2-827 Test Procedures for Video Target Scoring Using Calibration Lights

    DTIC Science & Technology

    2016-04-04

    Report type: Final. Title: Test Operations Procedure (TOP) 03-2-827 Test Procedures for Video Target Scoring Using... Performing organization: Survivability/Lethality Division (TEDT-AT-SLB), U.S. Army Aberdeen Test Center, 400... Also listed: Range Infrastructure Division (CSTE-TM), U.S. Army Test and Evaluation Command, 2202 Aberdeen Boulevard, Aberdeen Proving Ground, MD 21005

  1. Impact of a standardized test package on exit examination scores and NCLEX-RN outcomes.

    PubMed

    Homard, Catherine M

    2013-03-01

    The purpose of this ex post facto correlational study was to compare exit examination scores and NCLEX-RN® pass rates of baccalaureate nursing students who differed in level of participation in a standardized test package. Three cohort groups emerged as a standardized test package was introduced: (a) students who did not participate in a standardized test package; (b) students with two semesters of a standardized test package; and (c) students with four semesters of a standardized test package. Benner's novice-to-expert theory framed the study in the belief that students best acquire knowledge and skills through practice and reflection. Students participating in four semesters of a standardized test package demonstrated higher exit examination scores and NCLEX-RN pass rates compared with students who did not participate in this package. This study's results could inform nurse educators about strategies to facilitate nursing student success on exit examinations and the NCLEX-RN.

  2. Opportunity to learn: Investigating possible predictors for pre-course Test Of Astronomy STandards TOAST scores

    NASA Astrophysics Data System (ADS)

    Berryhill, Katie J.

    As astronomy education researchers become more interested in experimentally testing innovative teaching strategies to enhance learning in introductory astronomy survey courses ("ASTRO 101"), scholars are paying increased attention to the factors affecting student gain scores on the widely used Test Of Astronomy STandards (TOAST). The test is usually used in a pre-test and post-test research design, and one might naturally assume that the pre-course differences observed between high- and low-scoring college students are due in large part to their pre-existing motivation, interest, experience in science, and attitudes about astronomy. To explore this notion, 11 non-science majoring undergraduates taking ASTRO 101 at west coast community colleges were interviewed in the first few weeks of the course to better understand students' pre-existing affect toward learning astronomy with an eye toward predicting student success. In answering this question, we hope to contribute to our understanding of the incoming knowledge of students taking undergraduate introductory astronomy classes, and also to gain insight into how faculty can best meet those students' needs and assist them in achieving success. Perhaps surprisingly, there was only a weak correlation between students' motivation toward learning astronomy and their pre-test scores. Instead, the most fruitful predictor of TOAST pre-test scores was the quantity of pre-existing, informal, self-directed astronomy learning experiences.

  3. Use of max and min scores for trend tests for association when the genetic model is unknown.

    PubMed

    Zheng, Gang

    2003-08-30

    In case-control studies, the Cochran-Armitage (CA) trend test is powerful for detection of an association between a risk allele and a marker. To apply this test, a score should be assigned to the genotypes based on the genetic model. When the underlying genetic model is unknown, the trend test statistic is a function of the score. In this paper, simple procedures are given to obtain two scores (max and min), which respectively maximize and minimize the CA trend test statistics for genetic associations. These two scores can be used to examine the effect of the choice of scores on the test of no association. When the CA trend test statistic with the max (or min) score is less (or greater) than a prespecified value, the conclusion is clear: we will accept (or reject) the null hypothesis of no association for any scores used. When this value is less than the CA trend test statistic with the max score but greater than the one with the min score, the decision of whether or not to reject the null hypothesis depends on the choice of scores. In this situation, the CA trend test with a prespecified score cannot be used without careful scientific justification of the choice of scores. The use of max and min scoring schemes is applied to a real data set.
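
    A minimal sketch of the idea follows. It uses the standard chi-square form of the CA trend statistic for a 2 x 3 genotype table and then scans the heterozygote score x in scores (0, x, 1). Restricting x to [0, 1] and scanning a grid are assumptions made here for illustration; the paper's own "simple procedures" are not reproduced. The function names and the genotype counts are hypothetical.

```python
def ca_trend_chi2(cases, controls, scores):
    """Cochran-Armitage trend statistic (chi-square form, 1 df).

    cases, controls: counts per genotype; scores: one score per genotype.
    """
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    # Score-weighted difference between the case and control distributions.
    u = sum(x * (S * r - R * s) for x, r, s in zip(scores, cases, controls))
    sx = sum(x * n for x, n in zip(scores, totals))
    sxx = sum(x * x * n for x, n in zip(scores, totals))
    # Variance of u under the null hypothesis of no association.
    var = R * S * (N * sxx - sx * sx) / N
    return u * u / var


def chi2_range_over_scores(cases, controls, steps=100):
    """Smallest and largest statistic over scores (0, x, 1), x on a grid in [0, 1]."""
    values = [
        ca_trend_chi2(cases, controls, (0.0, k / steps, 1.0))
        for k in range(steps + 1)
    ]
    return min(values), max(values)


# Hypothetical genotype counts (not data from the paper).
cases = (10, 20, 30)
controls = (30, 20, 10)

additive = ca_trend_chi2(cases, controls, (0.0, 0.5, 1.0))
lo, hi = chi2_range_over_scores(cases, controls)
print(additive, lo, hi)
```

    If a chosen critical value lies below `lo` or above `hi`, the conclusion is the same for every score on the grid; if it falls between them, the conclusion depends on the score, which is the situation the abstract warns about.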

  4. The Design and Development of the Phillips-Patterson Test of Inference Ability in Reading Comprehension.

    ERIC Educational Resources Information Center

    Phillips, Linda M.

    The design and development of a test of inference ability in reading comprehension for grades 6, 7, and 8 (the Phillips-Patterson Test of Inference Ability in Reading Comprehension) are described. After development of a contemporary theoretical framework for the test of inference ability in reading comprehension, the design, item development, and…

  5. Development of a mathematical ability test: a validity and reliability study

    NASA Astrophysics Data System (ADS)

    Dündar, Sefa; Temel, Hasan; Gündüz, Nazan

    2016-10-01

    The accurate identification of talented students at an early age and the adaptation of their education to their abilities are of great importance for the future of countries. In this regard, this study aims to develop a mathematical ability test for identifying students' mathematical abilities and determining the relationships between the structure of abilities and these structures. Furthermore, this study adopts test development processes. A structure consisting of the factors of quantitative ability, causal ability, inductive/deductive reasoning ability, qualitative ability and spatial ability was obtained. The fit indices of the finalized 24-item version of the mathematical ability test indicate the suitability of the test.

  6. Benton's Visual Retention Test: New Age, Scale Score and Percentile Norms for Children.

    ERIC Educational Resources Information Center

    Rice, James A.

    The Benton Visual Retention Test which is designed to assess visual perceptual, visual motor, and visuoconstructive abilities can give school personnel greater precision and range in testing. The standardization of this instrument was tested on 700 Houston elementary school students. Chronological age differences were maintained and correlation…

  7. Measuring College Students' Reading Comprehension Ability Using Cloze Tests

    ERIC Educational Resources Information Center

    Williams, Rihana Shiri; Ari, Omer; Santamaria, Carmen Nicole

    2011-01-01

    Recent investigations challenge the construct validity of sustained silent reading tests. Performance of two groups of post-secondary students (e.g. struggling and non-struggling) on a sustained silent reading test and two types of cloze test (i.e. maze and open-ended) was compared in order to identify the test format that contributes greater…

  8. Comparisons between a mixing ability test and masticatory performance tests using a brittle or an elastic test food.

    PubMed

    Sugiura, T; Fueki, K; Igarashi, Y

    2009-03-01

    A variety of chewing tests and test items have been utilized to evaluate masticatory function. The purpose of this study was to compare a mixing ability test with masticatory performance tests using peanuts or gummy jelly as test foods. Thirty-two completely dentate subjects (Dentate group, mean age: 25.1 years) and 40 removable partial denture wearers (RPD group, mean age: 65.5 years) participated in this study. The subjects were asked to chew a two-coloured paraffin wax cube as a test item for 10 strokes. Mixing Ability Index (MAI) was determined from the colour mixture and shape of the chewed cube. Subjects were asked to chew 3 g portions of peanuts and a piece of gummy jelly for 20 strokes, respectively. Median particle size of chewed peanuts was determined using a multiple-sieving method. Concentration of dissolved glucose from the surface of the chewed gummy jelly was measured using a blood glucose meter. Pearson's correlation coefficient was used to test the relationships between the MAI, median particle size and the concentration of dissolved glucose. Mixing Ability Index was significantly related to median particle size (Dentate group: r = -0.56, P < 0.001, RPD group: r = -0.70, P < 0.001), but not significantly related to glucose concentration (Dentate group: r = 0.12, RPD group: r = 0.21, P > 0.05). It seems that ability of mixing the bolus is more strongly related to the ability of comminuting brittle food than elastic food.
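
    For readers unfamiliar with the statistic, Pearson's correlation coefficient can be computed as in the short sketch below. The function name and the paired values are hypothetical and are not measurements from the study.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired values: a mixing index and a median particle size (mm).
mixing_index = [0.2, 0.5, 0.9, 1.3, 1.6]
particle_size = [4.1, 3.6, 3.0, 2.2, 1.9]

print(pearson_r(mixing_index, particle_size))
```

    A negative coefficient, as reported for MAI and median particle size, means that higher values of one measure go with lower values of the other.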

  9. Relationships between Ability and Personality: Three Hypotheses Tested.

    ERIC Educational Resources Information Center

    Austin, Elizabeth J.; Gibson, Gavin J.; Deary, Ian J.

    1997-01-01

    The interrelationship between personality and intelligence was investigated in several studies using data from a survey of 210 Scottish farmers. Evidence was found for increased differentiation of neuroticism and openness at higher levels of ability. There was no support for the hypothesis that intelligence affects the correlation between…

  10. Applicability of a change of direction ability field test in soccer assistant referees.

    PubMed

    Castagna, Carlo; Impellizzeri, Franco M; Bizzini, Mario; Weston, Matthew; Manzi, Vincenzo

    2011-03-01

    The aim of this study was to examine the applicability of a test for change of direction ability (10-8-8-10 test, involving line and sideward sprinting, 36 m) in elite-level soccer assistant referees (ARs). One hundred ARs from the first-second and third Italian Championships (ARA-B and ARC, n = 50 each) performed the 10-8-8-10 test on 3 separate occasions. Twenty AR authorities scored test relevance (1-5 scale, from trivial to very large) for logical validity using a questionnaire. Construct validity was examined by comparing ARA-B and ARC on 10-8-8-10 performance. Short-term reliability was assessed by testing a random selection of ARs (n = 64) on 3 separate occasions every other day. Performance in the 10-8-8-10 test was taken as total coverage time, measured using telemetric photocells. Results showed that the 10-8-8-10 test was perceived as possessing from large (n = 4/20) to very large (n = 16/20) relevance to AR physical match performance. No significant performance difference was found between competitive levels (p = 0.57). Area under the curve (= 0.49; p = 0.87) showed no significant sensitivity of the 10-8-8-10 test in detecting competitive-level difference. The intraclass correlation coefficient (n = 64) and typical error of measurement (test 2 vs. 3) values were 0.90 (p < 0.0001) and 0.18 seconds, respectively. This study showed that the 10-8-8-10 test possesses logical validity and good reliability and is independent of the competitive level. As such, this original investigation represents the first step in the identification and assessment of a valid and reliable AR change of direction test. Given the strength of our findings, governing bodies should look to integrate the 10-8-8-10 test into the fitness test protocols devised for ARs, with scores ≥ 9.67 being considered as a starting point for the empirical validation of minimum selection criteria for elite-level ARs.

  11. Relationships between Scores of Gifted Children on the Stanford-Binet IV and Woodcock-Johnson Tests of Achievement.

    ERIC Educational Resources Information Center

    Carvajal, Howard; And Others

    1989-01-01

    Forty-five gifted children, ages 11-17, were tested with the Stanford-Binet Intelligence Scale and the Woodcock-Johnson Tests of Achievement. Results indicated 18 of 20 correlations between the area and composite scores were significant. The Stanford-Binet Short-Term Memory standard age score mean was lower than other scores' means. (Author/JDD)

  12. Tests that Measure Language Ability: A Descriptive Compilation.

    ERIC Educational Resources Information Center

    Bye, Thomas J.

    A collection of tests measuring language proficiency and/or language dominance is described; twenty-eight of the tests are commercially available and twelve are available from non-commercial sources. There are no evaluative judgments made. The descriptive information for each test includes: title; author; where to order (or where to inquire, for…

  13. Score test for familial aggregation in probands studies: application to Alzheimer's disease.

    PubMed

    Commenges, D; Jacqmin, H; Letenneur, L; Van Duijn, C M

    1995-06-01

    When studying familial aggregation of a disease, the following two-stage design is often used: first select index subjects (cases and controls); then record data on their relatives. The likelihood corresponding to this design is derived and a score test of homogeneity is proposed for testing the hypothesis of no aggregation. This test takes into account the selection procedure and allows adjustment to be made for explanatory variables. It appears as the sum of three terms: a pure test of homogeneity, a test of comparison of observed minus expected cases in the two groups, and a term which adjusts for the possible unequal probabilities of disease of the index subjects. Asymptotic efficiency and a simulation study show that the proposed test is superior to either the pure homogeneity test or tests based on the comparison of numbers of affected in the two groups. The test statistic, which has an asymptotically standard normal distribution, is applied to a study of familial aggregation of early-onset Alzheimer's disease for which a highly significant value (9.46) is obtained: this is the highest value among the three tests compared, in agreement with the simulation study. A logistic normal model is fitted to the data, taking account of the selection procedure: it allows estimation of the regression parameters and the variance of the random effect; the likelihood ratio test for familial aggregation seems less powerful than the score test.

  14. Rey's Auditory Verbal Learning Test scores can be predicted from whole brain MRI in Alzheimer's disease.

    PubMed

    Moradi, Elaheh; Hallikainen, Ilona; Hänninen, Tuomo; Tohka, Jussi

    2017-01-01

    Rey's Auditory Verbal Learning Test (RAVLT) is a powerful neuropsychological tool for testing episodic memory, which is widely used for cognitive assessment in dementia and pre-dementia conditions. Several studies have shown that impairment in RAVLT scores reflects well the underlying pathology caused by Alzheimer's disease (AD), thus making RAVLT an effective early marker to detect AD in persons with memory complaints. We investigated the association between RAVLT scores (RAVLT Immediate and RAVLT Percent Forgetting) and the structural brain atrophy caused by AD. The aim was to comprehensively study to what extent the RAVLT scores are predictable based on structural magnetic resonance imaging (MRI) data using machine learning approaches as well as to find the most important brain regions for the estimation of RAVLT scores. For this, we built a predictive model to estimate RAVLT scores from gray matter density via an elastic net penalized linear regression model. The proposed approach provided highly significant cross-validated correlation between the estimated and observed RAVLT Immediate (R = 0.50) and RAVLT Percent Forgetting (R = 0.43) in a dataset consisting of 806 AD, mild cognitive impairment (MCI) or healthy subjects. In addition, the selected machine learning method provided more accurate estimates of RAVLT scores than the relevance vector regression used earlier for the estimation of RAVLT based on MRI data. The top predictors were medial temporal lobe structures and amygdala for the estimation of RAVLT Immediate and angular gyrus, hippocampus and amygdala for the estimation of RAVLT Percent Forgetting. Further, the conversion of MCI subjects to AD within 3 years could be predicted based on either observed or estimated RAVLT scores with an accuracy comparable to MRI-based biomarkers.

  15. An Investigation of Procedures for Computerized Adaptive Testing Using Partial Credit Scoring.

    ERIC Educational Resources Information Center

    Koch, William R.; Dodd, Barbara G.

    1989-01-01

    Various aspects of the computerized adaptive testing (CAT) procedure for partial credit scoring were manipulated, focusing on the effects of the manipulations on operational characteristics of the CAT. The effects of item-pool size, item-pool information, and stepsizes used along the trait continuum were assessed. (TJH)

  16. Intelligence Test Scores and Birth Order among Young Norwegian Men (Conscripts) Analyzed within and between Families

    ERIC Educational Resources Information Center

    Bjerkedal, Tor; Kristensen, Petter; Skjeret, Geir A.; Brevik, John I.

    2007-01-01

    The present paper reports the results of a within and between family analysis of the relation between birth order and intelligence. The material comprises more than a quarter of a million test scores for intellectual performance of Norwegian male conscripts recorded during 1984-2004. Conscripts, mostly 18-19 years of age, were born to women for…

  17. Defending the Quality of Links between Scores from Different Tests and Exams

    ERIC Educational Resources Information Center

    Cresswell, Mike

    2010-01-01

    Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance."…

  18. The Effect of Four Intervention Programs on Standardized Test Scores by Gender

    ERIC Educational Resources Information Center

    Cryder, Rebecca E.

    2012-01-01

    This quantitative correlational study involved the analysis, by gender, of the effect of four intervention programs at an Arizona middle school as seen on Arizona's Instrument to Measure Standards (AIMS) test scores. These four intervention programs included: Advancement Via Individual Determination (AVID), a planner stamping system, a World…

  19. Nomograms for the assessment of Farnsworth-Munsell 100-hue test scores.

    PubMed

    Han, D P; Thompson, H S

    1983-05-01

    Although the Farnsworth-Munsell 100-hue test is a sensitive means of evaluating congenital and acquired color vision deficiencies, using the data it provides involves complex calculations. We have developed two nomograms that permit the clinician to determine quickly and easily whether a given score is normal for the patient's age and whether the difference between fellow eyes is within the normal range.

  20. The Fight's Not Always Fixed: Using Literary Response to Transcend Standardized Test Scores

    ERIC Educational Resources Information Center

    Avila, JuliAnna

    2012-01-01

    In 2004, the National Endowment for the Arts (NEA) concluded that "literature reading is fading as a meaningful activity, especially among younger people." How can educators continue to teach students about the power of literary response when the priority is for them to achieve proficiency on standardized tests, whose scores can only be narrowly…

  1. Using Automated Essay Scores as an Anchor When Equating Constructed Response Writing Tests

    ERIC Educational Resources Information Center

    Almond, Russell G.

    2014-01-01

    Assessments consisting of only a few extended constructed response items (essays) are not typically equated using anchor test designs as there are typically too few essay prompts in each form to allow for meaningful equating. This article explores the idea that output from an automated scoring program designed to measure writing fluency (a common…

  2. The Relationship between Computer Use and Standardized Test Scores: Does Gender Play a Role?

    ERIC Educational Resources Information Center

    Kay, Rachel E.

    2010-01-01

    Over the past few decades, and especially in the past ten years, computer use in schools has increased dramatically; however there has been little research examining the effects of technology use on student achievement, specifically defined by standardized test scores. There is also concern as to how technology use differs by gender and if that…

  3. Effects of Reading Technology Integration on Sixth Grade Test and Reading Scores

    ERIC Educational Resources Information Center

    Thomas, P. Ann

    2012-01-01

    The focus of the investigation is on a sixth grade population not performing reading on grade level and not achieving high-stakes test score proficiency causing the school to fail adequate yearly progress (AYP). The lack of reading skills causes the students to repeat grades in middle school and high school. Reading technology instruction is the…

  4. Comparing State and District Test Results to National Norms: Interpretations of Scoring "Above the National Average."

    ERIC Educational Resources Information Center

    Linn, Robert L.; And Others

    Norm-referenced test results reported by states and school districts and factors related to those scores were studied through mail and telephone surveys of 35 states and a nationally representative sample of 153 school districts to determine the degree to which "above average" results were being reported. Part of the stimulus for this…

  5. The Adaptation of Naval Enlistees Scoring in Mental Group 4 on the Armed Forces Qualification Test.

    ERIC Educational Resources Information Center

    Plag, John A.; And Others

    This report presents findings from a study evaluating differences in the adaptation of "average" and mentally marginal sailors during four years of military service. Sailors with Armed Forces Qualification Test (AFQT) scores of 50 are significantly superior to Category 4 enlistees on military performance measures which stress cognitive…

  6. Florida Defeats the Skeptics: Test Scores Show Genuine Progress in the Sunshine State

    ERIC Educational Resources Information Center

    Winters, Marcus

    2012-01-01

    Among the 50 states, Florida's gains on the National Assessment of Educational Progress (NAEP) between 1992 and 2011 ranked second only to Maryland's. Florida's progress has been particularly impressive in the early grades. In 1998, Florida scored about one grade level below the national average on the 4th-grade NAEP reading test, but it was…

  7. "No Child" Effect on English-Learners Mulled: Teachers Welcome Attention, Fault Focus on Test Scores

    ERIC Educational Resources Information Center

    Zehr, Mary Ann

    2006-01-01

    Educators who specialize in teaching English-language learners agree that the 4-year-old No Child Left Behind Act has brought unprecedented attention to those students by requiring schools to isolate test-score data for them. They disagree, though, on whether changes in instruction spurred by the law have been positive or negative overall. Such…

  8. The Effect of Maturation and Educational Experience on Air Force Officer Qualifying Test Scores.

    ERIC Educational Resources Information Center

    Gregg, George

    Maturation and education improve Air Force Officer Qualifying Test (AFOQT) scores. Since the AFOQT is given at different educational levels for the several commissioning programs, differences, largely spurious, exist between the programs. To assess differences produced by maturation and education, the AFOQT was given to 415 Air Force Reserve…

  9. Permanent Income and the Black-White Test Score Gap. NBER Working Paper No. 17610

    ERIC Educational Resources Information Center

    Rothstein, Jesse; Wozny, Nathan

    2011-01-01

    Analysts often examine the black-white test score gap conditional on family income. Typically only a current income measure is available. We argue that the gap conditional on permanent income is of greater interest, and we describe a method for identifying this gap using an auxiliary data set to estimate the relationship between current and…

  10. Identifying Local Dependence with a Score Test Statistic Based on the Bifactor Logistic Model

    ERIC Educational Resources Information Center

    Liu, Yang; Thissen, David

    2012-01-01

    Local dependence (LD) refers to the violation of the local independence assumption of most item response models. Statistics that indicate LD between a pair of items on a test or questionnaire that is being fitted with an item response model can play a useful diagnostic role in applications of item response theory. In this article, a new score test…

  11. Using College Admission Test Scores to Clarify High School Placement. Leading Indicator Spotlight

    ERIC Educational Resources Information Center

    Flug, Susanna

    2010-01-01

    In "Beyond Test Scores: Leading Indicators for Education," Foley and colleagues (2008) define leading indicators as those that "provide early signals of progress toward academic achievement" (p. 1) and stress that educators "need leading indicators to help them see the direction their efforts are going in and to take…

  12. End of Course Grades and Standardized Test Scores: Are Grades Predictive of Student Achievement?

    ERIC Educational Resources Information Center

    Ricketts, Christine R.

    2010-01-01

    This study examined the extent to which end-of-course grades are predictive of Virginia Standards of Learning test scores in nine high school content areas. It also analyzed the impact of the variables school cluster attended, gender, ethnicity, disability status, Limited English Proficiency status, and socioeconomic status on the relationship…

  13. How to Set Cutoff Scores for Knowledge Tests Used in Promotion, Training, Certification, and Licensing.

    ERIC Educational Resources Information Center

    Biddle, Richard E.

    1993-01-01

    Suggests a process for setting a cutoff score on knowledge tests for promotion, certification, and licensing: a modified Angoff method, in which a competency estimate is determined by subject matter experts, is combined with analysis of potential impact on any groups protected by the Civil Rights Act. (SK)
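
    As a rough illustration of the kind of expert-based competency estimate this record mentions, the sketch below computes a basic Angoff-style cutoff: each subject matter expert estimates the probability that a minimally competent candidate answers each item correctly, and the cutoff is the average of the experts' summed estimates. The function name and the ratings are hypothetical, and the sketch does not reproduce the author's modifications or the adverse-impact analysis.

```python
from statistics import mean


def angoff_cutoff(ratings):
    """Return a raw cutoff score from Angoff-style expert ratings.

    ratings[j][i] is judge j's estimated probability (0..1) that a
    minimally competent candidate answers item i correctly.
    """
    if not ratings or not ratings[0]:
        raise ValueError("need at least one judge and one item")
    n_items = len(ratings[0])
    for row in ratings:
        if len(row) != n_items:
            raise ValueError("every judge must rate every item")
        if any(not 0.0 <= p <= 1.0 for p in row):
            raise ValueError("ratings must be probabilities in [0, 1]")
    # Each judge's implied cutoff is the expected number of items correct.
    per_judge = [sum(row) for row in ratings]
    # The panel cutoff is the mean of the judges' implied cutoffs.
    return mean(per_judge)


# Hypothetical ratings: 3 judges x 4 items (illustrative values only).
example_ratings = [
    [0.8, 0.6, 0.9, 0.5],
    [0.7, 0.7, 0.8, 0.6],
    [0.9, 0.5, 0.9, 0.4],
]
cutoff = angoff_cutoff(example_ratings)
print(f"Cutoff: {cutoff:.2f} of {len(example_ratings[0])} items")
```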

  14. Selected Demographic Variables, School Music Participation, and Achievement Test Scores of Urban Middle School Students

    ERIC Educational Resources Information Center

    Kinney, Daryl W.

    2008-01-01

    Nontransient 6th- and 8th-grade urban middle school students' achievement test scores were examined before (4th grade) and during (6th or 8th grade) enrollment in a performing ensemble. Ensemble participation (band, choir, none) and subject variables of socioeconomic status (SES) and home environment were considered. Fourth- and 6th-grade…

  15. Mathematics Achievement Test Scores of American Indian and Anglo Students: A Comparison.

    ERIC Educational Resources Information Center

    Scott, Patrick B.

    1983-01-01

    Reports a preliminary study of possible differences in the performance in mathematics between American Indian Pueblo and Anglos. Sample evaluated consisted of 65 Pueblos and 59 Anglos that excluded highest and lowest scores on Form B of the College Qualification Tests from Fall 1977 to Spring 1981 at the University of New Mexico. (ERB)

  16. A Model for Incorporating Response-Time Data in Scoring Achievement Tests. Research Report No. 3.

    ERIC Educational Resources Information Center

    Tatsuoka, Kikumi; Tatsuoka, Maurice

    The differences in types of information-processing skills developed by different instructional backgrounds affect, negatively or positively, the learning of further advanced instructional materials. If prior and subsequent instructional methods are different, a proactive inhibition effect produces low achievement scores on a post test. This poses…

  17. Response to "What Do Klein et al. Tell Us about Test Scores in Texas?"

    ERIC Educational Resources Information Center

    Klein, Stephen P.; Hamilton, Laura S.; McCaffrey, Daniel F.; Stecher, Brian M.

    2005-01-01

    The authors reviewed the article "What Do Klein et al. Tell Us About Test Scores in Texas?" by Toenjes. A summary of their responses is presented. First, Toenjes incorrectly describes the focus of the authors' study. Second, Toenjes appears to have misunderstood the purpose of their 20-schools analysis. Third, Toenjes misunderstands the…

  18. Comprehensive School Reform and Standardized Test Scores in Illinois Elementary and Middle Schools

    ERIC Educational Resources Information Center

    McEnroe, James D.

    2010-01-01

    The study examined the effects of the federally funded Comprehensive School Reform (CSR) program on student performance on mandated standardized tests. The study focused on the mathematics and reading scores of Illinois public elementary and middle and junior high school students. The federal CSR program provided Illinois schools with an annual…

  19. California Standards Test Scores and Attendance Rates in an Afterschool Program

    ERIC Educational Resources Information Center

    Diamond, Sandra M.

    2013-01-01

    The Problem: The purpose of this study was to investigate whether or not there were any statistically significant differences in the Mathematics California Standard Test scores and attendance rates for African American and Latina high school girls who participated in an afterschool program. Method: A quasi-experimental design was conducted with…

  20. Supplemental Educational Services and Student Test Score Gains: Evidence from a Large, Urban School District

    ERIC Educational Resources Information Center

    Springer, Matthew G.; Pepper, Matthew J.; Ghosh-Dastidar, Bonnie

    2014-01-01

    This study examines the effect of supplemental education services (SES) on student test score gains and whether particular subgroups of students benefit more from NCLB tutoring services. Our sample includes information on students enrolled in third through eighth grades nested in 121 elementary and middle schools over a five-year period comprising…

  1. Integrating GIS in the Middle School Curriculum: Impacts on Diverse Students' Standardized Test Scores

    ERIC Educational Resources Information Center

    Goldstein, Donna; Alibrandi, Marsha

    2013-01-01

    This case study conducted with 1,425 middle school students in Palm Beach County, Florida, included a treatment group receiving GIS instruction (256) and a control group without GIS instruction (1,169). Quantitative analyses on standardized test scores indicated that inclusion of GIS in middle school curriculum had a significant effect on student…

  2. Detecting Dissimulation in Personality Test Scores: A Comparison between Person-Fit Indices and Detection Scales.

    ERIC Educational Resources Information Center

    Ferrando, Pere J.; Chico, Eliseo

    2001-01-01

    Examined whether a procedure based on item response theory (IRT) for assessing the scalability of response patterns could detect deliberate dissimulation (faking good) on scores from three tests of the Eysenck Personality Questionnaire Revised. Results for 489 and 140 undergraduates show that IRT measures were not powerful enough to detect…

  3. Relationship of Friends, Physical Education, and State Test Scores: Implications for School Counselors

    ERIC Educational Resources Information Center

    Hollingsworth, Mary Ann

    2010-01-01

    This study examined the relationship between dimensions of wellness and academic performance for 634 third through fifth grade students in Title One schools in rural Mississippi, using composites of the Five Factor Wellness Inventory for Elementary Children and Reading, Language, and Math Scores of the Mississippi Curriculum Test (a state level…

  4. Fitting the Normal-Ogive Factor Analytic Model to Scores on Tests.

    ERIC Educational Resources Information Center

    Ferrando, Pere J.; Lorenzo-Seva, Urbano

    2001-01-01

    Describes how the nonlinear factor analytic approach of R. McDonald to the normal ogive curve can be used to factor analyze test scores. Discusses the conditions in which this model is more appropriate than the linear model and illustrates the applicability of both models using an empirical example based on data from 1,769 adolescents who took the…

  5. Raise Test Scores without Selling Your Soul: An Interview with Scott Mandel

    ERIC Educational Resources Information Center

    Curriculum Review, 2006

    2006-01-01

    With his 10th book, Improving Test Scores: A Practical Approach for Teachers and Administrators, Scott Mandel outlines steps educators can take to boost achievement on standardized exams while maintaining the integrity of their day-to-day teaching. Mandel, who holds a Ph.D. in curriculum and instruction from USC, teaches history and English at…

  6. Estimated Effect of the Teacher Advancement Program on Student Test Score Gains

    ERIC Educational Resources Information Center

    Springer, Matthew G.; Ballou, Dale; Peng, Art

    2014-01-01

    This article presents findings from the first independent, third-party appraisal of the impact of the Teacher Advancement Program (TAP) on student test score gains in mathematics. TAP is a comprehensive school reform model designed to attract highly effective teachers, improve instructional effectiveness, and elevate student achievement. We use a…

  7. Effects of Classroom Ventilation Rate and Temperature on Students' Test Scores.

    PubMed

    Haverinen-Shaughnessy, Ulla; Shaughnessy, Richard J

    2015-01-01

    Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students' mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9-7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12-13 points per each 1°C decrease in temperature within the observed range of 20-25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students.

  8. Effects of Classroom Ventilation Rate and Temperature on Students’ Test Scores

    PubMed Central

    2015-01-01

    Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students’ mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9–7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12–13 points per each 1°C decrease in temperature within the observed range of 20–25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students. PMID:26317643

  9. Skewness and transformations of Farnsworth-Munsell 100-hue test scores.

    PubMed

    Dain, S J

    1998-11-01

    In the past, suggested transformations of Farnsworth-Munsell 100-Hue Test (FM 100-Hue) score distributions have been limited to a square root transformation. In this study, the choice of transformation of total error scores (TES) is considered by identifying a possible source of skewness. Several distributions of FM 100-Hue TES were assessed for skewness (third moment). The error score (ES) distributions for the 85 individual caps in each of the populations were also analysed for skewness (Figs. 3 and 4). There is no single transformation which will normalise all TES distributions. The single cap ES distributions with low mean ES (such as those achieved by normals and, for some regions of the test, by anomalous trichromats and dichromats) are symmetrical because most subjects can organise the cap perfectly (and could do even better given smaller colour differences). The distributions of ESs where the mean ES is in the moderate range (such as those achieved by diabetics) are skewed because some ESs at the lower end of the range represent performance which could also be better than the test allows. ES distributions with a high mean (such as random distributions and some regions of the test by congenital dichromats) are symmetrical, being unaffected by the limitations of the test. TES distributions of diabetics are asymmetrical and comprise skewed cap ES distributions. A suggestion for a transformation is made.
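
    The skewness check and the square root transformation mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the scores are invented, the function name is hypothetical, and skewness is taken as the standardized third central moment (population form). It does not implement the transformation the author goes on to suggest.

```python
import math


def skewness(values):
    """Standardized third central moment (population form)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Hypothetical, right-skewed total error scores (illustrative only).
scores = [4, 8, 8, 12, 16, 16, 20, 24, 36, 48, 64, 100]

raw_skew = skewness(scores)
sqrt_skew = skewness([math.sqrt(x) for x in scores])

print(f"skewness of raw scores:         {raw_skew:.2f}")
print(f"skewness after sqrt transform:  {sqrt_skew:.2f}")
```

    For this invented sample the square root reduces the skewness without removing it, which is in line with the abstract's remark that no single transformation normalises every distribution.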

  10. School Readiness and the Draw-a-Man Test: An Empirically Derived Alternative to Harris' Scoring System.

    ERIC Educational Resources Information Center

    Simner, Marvin L.

    1985-01-01

    An abbreviated scoring system for the Goodenough-Harris Draw-A-Man Test found that three items had the same overall potential for correctly identifying at-risk kindergarteners as more time-consuming scoring methods. (CL)

  11. An NCME Instructional Module on Quality Control Procedures in the Scoring, Equating, and Reporting of Test Scores

    ERIC Educational Resources Information Center

    Allalouf, Avi

    2007-01-01

    There is significant potential for error in long production processes that consist of sequential stages, each of which is heavily dependent on the previous stage, such as the SER (Scoring, Equating, and Reporting) process. Quality control procedures are required in order to monitor this process and to reduce the number of mistakes to a minimum. In…

  12. Farnsworth and Kinnear method of plotting the Farnsworth Munsell 100-Hue test scores: a comparison

    NASA Astrophysics Data System (ADS)

    Seshadri, Jayasree; Lakshminarayanan, Vasudevan; Christensen, Jerry

    2006-11-01

    We have compared two different methods of plotting the Farnsworth Munsell FM-100 test, namely the Farnsworth and the Kinnear techniques. Data from 30 individuals having a red-green deficiency were plotted using both techniques. The cap score distributions were analysed using both the Knoblauch sine wave (Fourier) and the Red-Green and Blue-Yellow partial scoring methods. We found that the Farnsworth and Kinnear plots give different results for the measures obtained by the Knoblauch method of analysis. On the other hand, the ‘difference scores’ obtained from the King-Smith et al. method were not significantly different for the two plotting methods.

  13. Test Review: Beal, A. L. (2011). "Insight Test of Cognitive Abilities." Markham, Ontario, Canadian Test Centre

    ERIC Educational Resources Information Center

    Colp, S. Mitchell; Nordstokke, David W.

    2014-01-01

    Published by the Canadian Test Centre (CTC), "Insight" represents a group-administered test of cognitive functioning that has been built entirely upon the Cattell-Horn-Carroll (CHC) theoretical framework. "Insight" is intended to be administered by educators and screen entire classrooms for students who present learning…

  14. The Impact of Linking Distinct Achievement Test Scores on the Interpretation of Student Growth in Achievement

    ERIC Educational Resources Information Center

    Airola, Denise Tobin

    2011-01-01

    Changes to state tests impact the ability of State Education Agencies (SEAs) to monitor change in performance over time. The purpose of this study was to evaluate the Standardized Performance Growth Index (PGIz), a proposed statistical model for measuring change in student and school performance, across transitions in tests. The PGIz is a…

  15. Student Test Scores: How the Sausage Is Made and Why You Should Care. Evidence Speaks Reports, Vol 1, #25

    ERIC Educational Resources Information Center

    Jacob, Brian A.

    2016-01-01

    Contrary to popular belief, modern cognitive assessments--including the new Common Core tests--produce test scores based on sophisticated statistical models rather than the simple percent of items a student answers correctly. While there are good reasons for this, it means that reported test scores depend on many decisions made by test designers,…

  16. Talent Search Qualifying: Comparisons between Talent Search Students Qualifying via Scores on Standardized Tests and via Parent Nomination

    ERIC Educational Resources Information Center

    Lee, Seon-Young; Olszewski-Kubilius, Paula

    2006-01-01

    This study examined differences between students who qualified for talent search testing via scores on standardized tests and via parent nomination in their performances on the SAT or ACT and some demographic characteristics. Overall, the standardized testing group earned higher scores on the off-level tests than the parent nominated group. Asian…

  17. Agreement in the Scoring of Respiratory Events Among International Sleep Centers for Home Sleep Testing

    PubMed Central

    Magalang, Ulysses J.; Arnardottir, Erna S.; Chen, Ning-Hung; Cistulli, Peter A.; Gíslason, Thorarinn; Lim, Diane; Penzel, Thomas; Schwab, Richard; Tufik, Sergio; Pack, Allan I.

    2016-01-01

    Study Objectives: Home sleep testing (HST) is used worldwide to confirm the presence of obstructive sleep apnea (OSA). We sought to determine the agreement of HST scoring among international sleep centers. Methods: Fifteen HSTs, previously recorded using a type 3 monitor, were deidentified and saved in European Data Format. The studies were scored by nine technologists from the sleep centers of the Sleep Apnea Global Interdisciplinary Consortium (SAGIC) using the locally available software. Each study was scored separately using one of three different airflow signals: nasal pressure (NP), transformed (square root) nasal pressure signal (transformed NP), and uncalibrated respiratory inductive plethysmography (RIP) flow. Only one of the three airflow signals was visible to the scorer at each scoring session. The scoring procedure was repeated to determine the intrarater reliability. Results: The intraclass correlation coefficients (ICCs) using the NP were: apnea-hypopnea index (AHI) = 0.96 (95% confidence interval [CI]: 0.93–0.99); apnea index = 0.91 (0.83–0.96); and hypopnea index = 0.75 (0.59–0.89). The ICCs using the transformed NP were: AHI = 0.98 (0.96–0.99); apnea index = 0.95 (0.90–0.98); and hypopnea index = 0.90 (0.82–0.96). The ICCs using the RIP flow were: AHI = 0.98 (0.96–0.99); apnea index = 0.66 (0.48–0.84); and hypopnea index = 0.78 (0.63–0.90). The mean difference of first and second scoring sessions of the same respiratory variables ranged from −1.02 to 0.75/h. Conclusion: There is a strong agreement in the scoring of the respiratory events for HST among international sleep centers. Our results suggest that centralized scoring of HSTs may not be necessary in future research collaboration among international sites.

  18. Effects of Public Preschool Expenditures on the Test Scores of 4th Graders: Evidence from TIMSS

    PubMed Central

    Waldfogel, Jane; Zhai, Fuhua

    2011-01-01

    This study examines the effects of public preschool expenditures on the math and science scores of 4th graders, holding constant child, family, and school characteristics, other relevant social expenditures, and country and year effects, in seven Organization for Economic Co-operation and Development (OECD) countries -- Australia, Japan, Netherlands, New Zealand, Norway, U.K., and U.S. -- using data from the 1995 and 2003 Trends in International Mathematics and Science Study (TIMSS). Our results indicate that there are small but significant positive effects of public preschool expenditures on the math and science scores of 4th graders and preschool expenditures reduce the risk of children scoring at the low level of proficiency. We also find some evidence that children from low-resource homes and homes where the test language is not always spoken may tend to gain more from increased public preschool expenditures than other children. PMID:21442008

  19. Effects of Public Preschool Expenditures on the Test Scores of 4th Graders: Evidence from TIMSS.

    PubMed

    Waldfogel, Jane; Zhai, Fuhua

    2008-02-01

    This study examines the effects of public preschool expenditures on the math and science scores of 4th graders, holding constant child, family, and school characteristics, other relevant social expenditures, and country and year effects, in seven Organization for Economic Co-operation and Development (OECD) countries -- Australia, Japan, Netherlands, New Zealand, Norway, U.K., and U.S. -- using data from the 1995 and 2003 Trends in International Mathematics and Science Study (TIMSS). Our results indicate that there are small but significant positive effects of public preschool expenditures on the math and science scores of 4th graders and preschool expenditures reduce the risk of children scoring at the low level of proficiency. We also find some evidence that children from low-resource homes and homes where the test language is not always spoken may tend to gain more from increased public preschool expenditures than other children.

  20. Ethnic differences in children's intelligence test scores: role of economic deprivation, home environment, and maternal characteristics.

    PubMed

    Brooks-Gunn, J; Klebanov, P K; Duncan, G J

    1996-04-01

    We examine differences in intelligence test scores of black and white 5-year-olds. The Infant Health and Development Program data set includes 483 low birthweight premature children who were assessed with the Wechsler Preschool and Primary Scale of Intelligence. These children had been followed from birth, with data on neighborhood and family poverty, family structure, family resources, maternal characteristics, and home environment collected over the first 5 years of life. Black children's IQ scores were 1 SD lower than those of white children. Adjustments for ethnic differences in poverty reduced the ethnic differential by 52%. Adjustments for maternal education and whether the head of household was female did not reduce the ethnic difference further. However, differences in home environment reduced the ethnic differential by an additional 28%. Adjustments for economic and social differences in the lives of black and white children all but eliminate differences in the IQ scores between these two groups.

  1. Bayesian and Empirical Bayes Approaches to Setting Passing Scores on Mastery Tests. Publication Series in Mastery Testing.

    ERIC Educational Resources Information Center

    Huynh, Huynh; Saunders, Joseph C., III

    The Bayesian approach to setting passing scores, as proposed by Swaminathan, Hambleton, and Algina, is compared with the empirical Bayes approach to the same problem that is derived from Huynh's decision-theoretic framework. Comparisons are based on simulated data which follow an approximate beta-binomial distribution and on real test results from…

  2. Correcting Two-Sample "z" and "t" Tests for Correlation: An Alternative to One-Sample Tests on Difference Scores

    ERIC Educational Resources Information Center

    Zimmerman, Donald W.

    2012-01-01

    In order to circumvent the influence of correlation in paired-samples and repeated measures experimental designs, researchers typically perform a one-sample Student "t" test on difference scores. That procedure entails some loss of power, because it employs N - 1 degrees of freedom instead of the 2N - 2 degrees of freedom of the…
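
    The degrees-of-freedom contrast described in this abstract can be made concrete with a small sketch. It computes the usual one-sample t statistic on difference scores (N - 1 degrees of freedom) next to an uncorrected pooled two-sample t (2N - 2 degrees of freedom) for the same paired data. The function names and the data are hypothetical, and the correction for correlation that the article proposes is not implemented here.

```python
import math
from statistics import mean, stdev, variance


def paired_t(x, y):
    """One-sample t test on difference scores; returns (t, df)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and have length >= 2")
    diffs = [b - a for a, b in zip(x, y)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


def pooled_two_sample_t(x, y):
    """Pooled-variance two-sample t for equal n, ignoring correlation."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("this sketch assumes equal sample sizes >= 2")
    n = len(x)
    pooled_var = (variance(x) + variance(y)) / 2  # equal-n pooling
    if pooled_var == 0:
        raise ValueError("zero variance; t is undefined")
    return (mean(y) - mean(x)) / math.sqrt(2 * pooled_var / n), 2 * n - 2


# Hypothetical pre/post scores for six examinees (illustrative only).
pre = [12, 15, 11, 14, 13, 16]
post = [14, 18, 12, 17, 15, 19]

t_paired, df_paired = paired_t(pre, post)
t_pooled, df_pooled = pooled_two_sample_t(pre, post)
print(f"paired: t = {t_paired:.2f}, df = {df_paired}")
print(f"pooled: t = {t_pooled:.2f}, df = {df_pooled}")
```

    The pooled statistic is shown only for the degrees-of-freedom comparison; applied to correlated samples without a correction, it is not a valid test.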

  3. Reading Abilities Tests: Development and Norming for Air Force Use

    DTIC Science & Technology

    1983-02-01

    consistency reliability (Kuder-Richardson Formula 20), test mean, standard deviation. Means for Army samples were adjusted in order to control for test...92 AFRAT B N 540 540 736 736 Rel .92 .90 .87 .94 Note. Internal consistency reliabilities (Rel) based on formula KR-20. Reliabilities were not as high...administrative and psychometric specifications. AFRAT appears to be a highly reliable instrument and is recommended as a replacement for commercial reading
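
    For readers unfamiliar with the Kuder-Richardson Formula 20 cited in this record, a minimal sketch follows. It assumes dichotomously scored items (1 = correct, 0 = incorrect) and uses the population variance of total scores, which is one common convention; the function name and the response matrix are hypothetical and unrelated to the AFRAT data.

```python
def kr20(responses):
    """Kuder-Richardson Formula 20 for a 0/1 response matrix.

    responses[s][i] is 1 if examinee s answered item i correctly, else 0.
    """
    n_people = len(responses)
    n_items = len(responses[0]) if responses else 0
    if n_people < 2 or n_items < 2:
        raise ValueError("need at least two examinees and two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every examinee must have a score for every item")

    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_people
    # Population variance of total scores (an assumed convention).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    # Sum over items of p * q, where p is the proportion correct.
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_people
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses: 5 examinees x 4 items (illustrative only).
example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(example)
print(f"KR-20 = {reliability:.2f}")
```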

  4. COPD assessment test score and serum C-reactive protein levels in stable COPD patients

    PubMed Central

    Kang, Hyung Koo; Kim, Kang; Lee, Hyun; Jeong, Byeong-Ho; Koh, Won-Jung; Park, Hye Yun

    2016-01-01

    Background An eight-item questionnaire of the COPD assessment test (CAT) is widely used to quantify the impact of COPD on the patient’s health status. C-reactive protein (CRP) is associated with disease severity and adverse health outcomes of patients with COPD. This study aimed to evaluate the relationship between CAT score and serum CRP levels in stable COPD patients. Methods We evaluated the medical records of 226 patients with CAT and serum CRP measured within a week at Samsung Medical Center between October 2013 and October 2015. Results Serum CRP levels had a significantly positive relationship with CAT score (Spearman’s r=0.20, P=0.003). Patients with elevated serum CRP levels (>0.3 mg/dL) were significantly more likely to have CAT scores of ≥14. The adjusted odds ratio for elevated serum CRP levels in total CAT score was 1.06 (95% confidence interval, 1.02–1.09). Among CAT components, cough (adjusted P=0.005), phlegm (adjusted P=0.001), breathlessness going up hills/stairs (adjusted P=0.005), low confidence leaving home (adjusted P=0.002), and feeling low in energy (adjusted P=0.019) were independently associated with elevated serum CRP levels. Conclusion In stable COPD patients, serum CRP levels were independently associated with total CAT score and CAT components related to respiratory symptoms, confidence leaving home, and energy. PMID:27994452

  5. Pose prediction and virtual screening performance of GOLD scoring functions in a standardized test

    NASA Astrophysics Data System (ADS)

    Liebeschuetz, John W.; Cole, Jason C.; Korb, Oliver

    2012-06-01

    The performance of all four GOLD scoring functions has been evaluated for pose prediction and virtual screening under the standardized conditions of the comparative docking and scoring experiment reported in this Edition. Excellent pose prediction and good virtual screening performance was demonstrated using unmodified protein models and default parameter settings. The best performing scoring function for both pose prediction and virtual screening was demonstrated to be the recently introduced scoring function ChemPLP. We conclude that existing docking programs already perform close to optimally in the cognate pose prediction experiments currently carried out and that more stringent pose prediction tests should be used in the future. These should employ cross-docking sets. Evaluation of virtual screening performance remains problematic and much remains to be done to improve the usefulness of publically available active and decoy sets for virtual screening. Finally we suggest that, for certain target/scoring function combinations, good enrichment may sometimes be a consequence of 2D property recognition rather than a modelling of the correct 3D interactions.

  6. Post-Hoc IRT Equating of Previously Administered English Tests for Comparison of Test Scores

    ERIC Educational Resources Information Center

    Saida, Chisato; Hattori, Tamaki

    2008-01-01

    Despite growing concerns about declining scholastic abilities of Japanese students throughout Japan prior to the implementation of the revised Courses of Study in 2002, little empirical evidence was available at that time to support this perceived decline in academic performance. This research describes post-hoc IRT equating of previously…

  7. Alcohol Use Disorders Identification Test (AUDIT) scores are elevated in antipsychotic-induced hyperprolactinaemia.

    PubMed

    Lawford, Bruce R; Barnes, Mark; Connor, Jason P; Heslop, Karen; Nyst, Phillip; Young, Ross McD

    2012-02-01

    Hyperprolactinaemia in antipsychotic treated patients with schizophrenia is a consequence of D2 receptor (DRD2) blockade. Alcohol use disorder is commonly comorbid with schizophrenia and low availability of striatal DRD2 may predispose individuals to alcohol use. In this pilot study we investigated whether hyperprolactinaemia secondary to pharmacological DRD2 blockade was associated with alcohol use disorder in 219 (178 males and 41 females) patients with schizophrenia. Serum prolactin determinations were made in patients diagnosed with schizophrenia and maintained on antipsychotic agents. Clinical assessment included demographics, family history and administration of the AUDIT (Alcohol Use Disorders Identification Test). Higher AUDIT scores were associated with prolactin-raising antipsychotic medication (n=106) compared with prolactin-sparing medication (n=113). Risperidone (n=63) treated patients had higher AUDIT scores and prolactin levels than those on other atypical antipsychotics (n = 113). Across the entire sample, patients with a prolactin greater than 800 mIU/L had higher AUDIT scores and were more likely to exceed the cut-off score for harmful and hazardous alcohol use. These differences were not explained by potential confounds related to clinical features and demographics, comorbidity or medication side-effects. These data suggest that by lowering dosage, or switching to another antipsychotic agent, the risk for alcohol use disorder in those with schizophrenia may be reduced. This hypothesis requires testing using a prospective methodology.

  8. Pediatric Residents' Learning Styles and Temperaments and Their Relationships to Standardized Test Scores

    PubMed Central

    Tuli, Sanjeev Y.; Thompson, Lindsay A.; Saliba, Heidi; Black, Erik W.; Ryan, Kathleen A.; Kelly, Maria N.; Novak, Maureen; Mellott, Jane; Tuli, Sonal S.

    2011-01-01

    Background Board certification is an important professional qualification and a prerequisite for credentialing, and the Accreditation Council for Graduate Medical Education (ACGME) assesses board certification rates as a component of residency program effectiveness. To date, research has shown that preresidency measures, including National Board of Medical Examiners scores, Alpha Omega Alpha Honor Medical Society membership, or medical school grades poorly predict postresidency board examination scores. However, learning styles and temperament have been identified as factors that affect test-taking performance. The purpose of this study is to characterize the learning styles and temperaments of pediatric residents and to evaluate their relationships to yearly in-service and postresidency board examination scores. Methods This cross-sectional study analyzed the learning styles and temperaments of current and past pediatric residents by administration of 3 validated tools: the Kolb Learning Style Inventory, the Keirsey Temperament Sorter, and the Felder-Silverman Learning Style test. These results were compared with known, normative, general and medical population data and evaluated for correlation to in-service examination and postresidency board examination scores. Results The predominant learning style for pediatric residents was converging 44% (33 of 75 residents) and the predominant temperament was guardian 61% (34 of 56 residents). The learning style and temperament distribution of the residents was significantly different from published population data (P = .002 and .04, respectively). Learning styles, with one exception, were found to be unrelated to standardized test scores. Conclusions The predominant learning style and temperament of pediatric residents is significantly different than that of the populations of general and medical trainees. However, learning styles and temperament do not predict outcomes on standardized in-service and board

  9. The Relationship between Academic Averages of Primary School Science and Technology Class and Test Sub-Test Scores of Placement Test of Science

    ERIC Educational Resources Information Center

    Guzeller, Cem Oktay

    2012-01-01

    This research examined the relationship between written exam scores in the science and technology class of the 6th, 7th, and 8th grades, project, participation in class activities and performance work, year-end academic success point averages, and sub-test raw scores of LDT science for the 6th, 7th, and 8th grades. Academic success point averages were used as…

  10. The Second Century of Ability Testing: Some Predictions and Speculations

    ERIC Educational Resources Information Center

    Embretson, Susan E.

    2004-01-01

    The last century was marked by dazzling changes in many areas, such as technology and communications. Predictions into the second century of testing are seemingly difficult in such a context. Yet, looking back to the turn of the last century, Kirkpatrick (1900), in his American Psychological Association presidential address, presented fundamental…

  11. The Relationships between Social Class, Listening Test Anxiety and Test Scores

    ERIC Educational Resources Information Center

    Rezaabadi, Omid Talebi

    2016-01-01

    This study investigated the relationships between the social anxiety, social class and listening-test anxiety of students learning English as a foreign language. The aims of the study were to examine the relationship between listening-test anxiety and listening-test performance. The data were collected using an adapted Foreign Language Listening…

  12. Does von Willebrand factor improve the predictive ability of current risk stratification scores in patients with atrial fibrillation?

    PubMed Central

    García-Fernández, Amaya; Roldán, Vanessa; Rivera-Caravaca, José Miguel; Hernández-Romero, Diana; Valdés, Mariano; Vicente, Vicente; Lip, Gregory Y. H.; Marín, Francisco

    2017-01-01

    Von Willebrand factor (vWF) is a biomarker of endothelial dysfunction. We investigated its role on prognosis in anticoagulated atrial fibrillation (AF) patients and determined whether its addition to clinical risk stratification schemes improved event-risk prediction. Consecutive outpatients with non-valvular AF were recruited and rates of thrombotic/cardiovascular events, major bleeding and mortality were recorded. The effect of vWF on prognosis was calculated using a Cox regression model. Improvements in predictive accuracy over current scores were determined by calculating the integrated discrimination improvement (IDI), net reclassification improvement (NRI), comparison of receiver-operator characteristic (ROC) curves and Decision Curve Analysis (DCA). 1215 patients (49% males, age 76 (71–81) years) were included. Follow-up was almost 7 years. Significant associations were found between vWF and cardiovascular events, stroke, mortality and bleeding. Based on IDI and NRI, addition of vWF to CHA2DS2-VASc statistically improved its predictive value, but c-indexes were not significantly different. For major bleeding, the addition of vWF to HAS-BLED improved the c-index but not IDI or NRI. DCA showed minimal net benefit. vWF acts as a simple prognostic biomarker in AF and, whilst its addition to current scores statistically improves prediction for some endpoints, absolute changes and impact on clinical decision-making are marginal. PMID:28134282
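
    The c-index comparison mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a simple binary outcome with no censoring, which is a simplification of the Cox-based, time-to-event analysis the record describes; the function name `c_index` and the example data are hypothetical and are not taken from the study.

```python
def c_index(scores, events):
    """Concordance index for a binary outcome (equivalent to ROC AUC).

    scores: predicted risk per patient (higher = higher predicted risk)
    events: 1 if the patient had the event, 0 otherwise
    Assumption: no censoring, unlike a real survival analysis.
    """
    concordant = 0.0
    pairs = 0
    for s_i, e_i in zip(scores, events):
        for s_j, e_j in zip(scores, events):
            # Only compare an event patient against a non-event patient.
            if e_i == 1 and e_j == 0:
                pairs += 1
                if s_i > s_j:
                    concordant += 1.0   # event patient ranked higher: concordant
                elif s_i == s_j:
                    concordant += 0.5   # tie counts as half
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / pairs


# Hypothetical risk scores for four patients, two of whom had an event.
risk = [0.9, 0.8, 0.3, 0.2]
had_event = [1, 0, 1, 0]
print(c_index(risk, had_event))
```

    A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so two risk scores can be compared by computing this quantity for each on the same patients.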

  13. Effect of Mindfulness Meditation on Perceived Stress Scores and Autonomic Function Tests of Pregnant Indian Women

    PubMed Central

    Jain, Reena; Kohli, Sangeeta; Batra, Swaraj

    2016-01-01

    Introduction: Various pregnancy complications, such as hypertension and preeclampsia, have been strongly correlated with maternal stress. One of the connecting links between pregnancy complications and maternal stress is mind-body intervention which can be part of Complementary and Alternative Medicine (CAM). Biologic measures of stress during pregnancy may be reduced by such interventions. Aim: To evaluate the effect of Mindfulness meditation on perceived stress scores and autonomic function tests of pregnant Indian women. Materials and Methods: Pregnant Indian women of 12 weeks gestation were randomised to two treatment groups: Test group with Mindfulness meditation and control group with their usual obstetric care. The effect of Mindfulness meditation on perceived stress scores and cardiac sympathetic functions and parasympathetic functions (Heart rate variation with respiration, lying to standing ratio, standing to lying ratio and respiratory rate) were evaluated on pregnant Indian women. Results: There was a significant decrease in perceived stress scores, a significant decrease of blood pressure response to cold pressor test and a significant increase in heart rate variability in the test group (p < 0.05), which indicates that mindfulness meditation is a powerful modulator of the sympathetic nervous system and can thereby reduce the day-to-day perceived stress in pregnant women. Conclusion: The results of this study suggest that mindfulness meditation improves parasympathetic functions in pregnant women and is a powerful modulator of the sympathetic nervous system during pregnancy. PMID:27190795

  14. Interrater Reliability of the Original and a Revised Scoring System for the Developmental Test of Visual-Motor Integration.

    ERIC Educational Resources Information Center

    Lepkin, Sheila Ratsch; Pryzwansky, Walter B.

    1983-01-01

    Investigated the interrater reliability of teachers' and school psychology externs' scoring of protocols for the Developmental Test of Visual-Motor Integration (VMI), using a revised scoring system. Results showed high reliability coefficients for all raters, regardless of the scoring system employed. The influence of rater training is discussed.…

  15. Examining the Validity of GED[R] Tests Scores with Scheduling and Setting Accommodations. GED Testing Service Research Studies, 2004-1

    ERIC Educational Resources Information Center

    George-Ezzelle, Carol E.; Skaggs, Gary

    2004-01-01

    Current testing standards call for test developers to provide evidence that testing procedures and test scores, and the inferences made based on the test scores, show evidence of validity and are comparable across subpopulations (American Educational Research Association [AERA], American Psychological Association [APA], & National Council on…

  16. Computerized Adaptive Ability Measurement.

    ERIC Educational Resources Information Center

    Weiss, David J.

    The general objective of a research program on adaptive testing was to identify several sources of potential error in test scores, and to study adaptive testing as a means for reducing these errors. Errors can result from the mismatch of item difficulty to the individual's ability; the psychological effects of testing and the test environment; the…

  17. Adults with poor reading skills: How lexical knowledge interacts with scores on standardized reading comprehension tests.

    PubMed

    McKoon, Gail; Ratcliff, Roger

    2016-01-01

    Millions of adults in the United States lack the necessary literacy skills for most living wage jobs. For students from adult learning classes, we used a lexical decision task to measure their knowledge of words and we used a decision-making model (Ratcliff's, 1978, diffusion model) to abstract the mechanisms underlying their performance from their RTs and accuracy. We also collected scores for each participant on standardized IQ tests and standardized reading tests used commonly in the education literature. We found significant correlations between the model's estimates of the strengths with which words are represented in memory and scores for some of the standardized tests but not others. The findings point to the feasibility and utility of combining a test of word knowledge, lexical decision, that is well-established in psycholinguistic research, a decision-making model that supplies information about underlying mechanisms, and standardized tests. The goal for future research is to use this combination of approaches to understand better how basic processes relate to standardized tests with the eventual aim of understanding what these tests are measuring and what the specific difficulties are for individual, low-literacy adults.

  18. Adults with poor reading skills: How lexical knowledge interacts with scores on standardized reading comprehension tests

    PubMed Central

    McKoon, Gail; Ratcliff, Roger

    2016-01-01

    Millions of adults in the United States lack the necessary literacy skills for most living wage jobs. For students from adult learning classes, we used a lexical decision task to measure their knowledge of words and we used a decision-making model (Ratcliff’s, 1978, diffusion model) to abstract the mechanisms underlying their performance from their RTs and accuracy. We also collected scores for each participant on standardized IQ tests and standardized reading tests used commonly in the education literature. We found significant correlations between the model’s estimates of the strengths with which words are represented in memory and scores for some of the standardized tests but not others. The findings point to the feasibility and utility of combining a test of word knowledge, lexical decision, that is well-established in psycholinguistic research, a decision-making model that supplies information about underlying mechanisms, and standardized tests. The goal for future research is to use this combination of approaches to understand better how basic processes relate to standardized tests with the eventual aim of understanding what these tests are measuring and what the specific difficulties are for individual, low-literacy adults. PMID:26550803

  19. Psychometrics of Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) scores.

    PubMed

    Brannick, Michael T; Wahi, Monika M; Goldin, Steven B

    2011-08-01

    A sample of 183 medical students completed the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT V2.0). Scores on the test were examined for evidence of reliability and factorial validity. Although Cronbach's alpha for the total scores was adequate (.79), many of the scales had low internal consistency (scale alphas ranged from .34 to .77; median = .48). Previous factor analyses of the MSCEIT are critiqued and the rationale for the current analysis is presented. Both confirmatory and exploratory factor analyses of the MSCEIT item parcels are reported. Pictures and faces items formed separate factors rather than loading on a Perception factor. Emotional Management appeared as a factor, but items from Blends and Facilitation failed to load consistently on any factor, rendering factors for Emotional Understanding and Emotional Facilitation problematic.
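
    The internal-consistency statistic reported above, Cronbach's alpha, can be sketched as follows. This is a minimal illustration of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name `cronbach_alpha` and the example responses are hypothetical and are not MSCEIT data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose so each tuple holds one item's scores across respondents.
    items = list(zip(*item_scores))
    item_var_sum = sum(variance(col) for col in items)
    # Variance of each respondent's total score across all items.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses: four respondents, two items.
responses = [[2, 4], [4, 2], [3, 3], [5, 5]]
print(round(cronbach_alpha(responses), 3))
```

    Low values, such as the scale-level alphas cited in the abstract, indicate that the items within a scale do not vary together strongly.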

  20. Predicting scores of the Halstead Category Test with the WAIS-III.

    PubMed

    Titus, Jeffrey B; Retzlaff, Paul D; Dean, Raymond S

    2002-09-01

    The Halstead Category Test (HCT) and the Wechsler Adult Intelligence Scale (WAIS) are two of the most widely used neuropsychological tests. Often assessment conclusions are dependent upon the comparison of these measures. Therefore, it is crucial for clinicians to know how they relate to one another. This study examined the relationship between the HCT and the WAIS-III with undergraduate psychology students. Correlational analyses were conducted between HCT scores and WAIS-III subtests, Verbal and Performance IQ, and Full Scale IQ scores. Additionally, the new WAIS-III scales (Letter-Number Sequencing, Matrix Reasoning, and Symbol Search) were further examined. Regression analyses were run to develop predictor equations for the HCT using VIQ, PIQ, and FSIQ. Finally, predictor tables were generated between the HCT and VIQ, PIQ, and FSIQ to provide assessment of brain dysfunction for clinical use.

  1. Comparison of Test Directions for Ability Tests: Impact on Young English-Language Learner and Non-ELL Students

    ERIC Educational Resources Information Center

    Lakin, Joni Marie

    2010-01-01

    Ability tests play an important role in the assessment programs of many schools. However, the inferences about ability made from such tests presume that students understand the tasks they are attempting. Task familiarity can vary by student as well as by format. By design, nonverbal reasoning tests use formats that are intended to be novel. The…

  2. Measuring intellectual ability in cerebral palsy: The comparison of three tests and their neuroimaging correlates.

    PubMed

    Ballester-Plané, Júlia; Laporta-Hoyos, Olga; Macaya, Alfons; Póo, Pilar; Meléndez-Plumed, Mar; Vázquez, Élida; Delgado, Ignacio; Zubiaurre-Elorza, Leire; Narberhaus, Ana; Toro-Tamargo, Esther; Russi, Maria Eugenia; Tenorio, Violeta; Segarra, Dolors; Pueyo, Roser

    2016-09-01

    Standard intelligence scales require both verbal and manipulative responses, making them difficult to use in cerebral palsy and leading to underestimation of patients' actual performance. This study aims to compare three intelligence tests suitable for the heterogeneity of cerebral palsy in order to identify which one(s) could be more appropriate to use. Forty-four subjects with bilateral dyskinetic cerebral palsy (26 male, mean age 23 years) completed the Raven's Coloured Progressive Matrices (RCPM), the Peabody Picture Vocabulary Test-3rd (PPVT-III) and the Wechsler Nonverbal Scale of Ability (WNV). Furthermore, a comprehensive neuropsychological battery was administered and magnetic resonance imaging was performed. The results show that PPVT-III gives limited information on cognitive performance and brain correlates, yielding lower intelligence quotient scores. The WNV provides similar outcomes as RCPM, but cases with severe motor impairment were unable to perform it. Finally, the RCPM gives more comprehensive information on cognitive performance, comprising not only visual but also verbal functions. It is also sensitive to the structural state of the brain, being related to basal ganglia, thalamus and white matter areas such as superior longitudinal fasciculus. So, the RCPM may be considered a standardized easy-to-administer tool with great potential in both clinical and research fields of bilateral cerebral palsy.

  3. Score statistic to test for genetic correlation for proband-family design.

    PubMed

    el Galta, R; van Duijn, C M; van Houwelingen, J C; Houwing-Duistermaat, J J

    2005-07-01

    In genetic epidemiological studies informative families are often oversampled to increase the power of a study. For a proband-family design, where relatives of probands are sampled, we derive the score statistic to test for clustering of binary and quantitative traits within families due to genetic factors. The derived score statistic is robust to ascertainment scheme. We considered correlation due to unspecified genetic effects and/or due to sharing alleles identical by descent (IBD) at observed marker locations in a candidate region. A simulation study was carried out to study the distribution of the statistic under the null hypothesis in small data-sets. To illustrate the score statistic, data from 33 families with type 2 diabetes mellitus (DM2) were analyzed. In addition to the binary outcome DM2 we also analyzed the quantitative outcome, body mass index (BMI). For both traits familial aggregation was highly significant. For DM2, also including IBD sharing at marker D3S3681 as a cause of correlation gave an even more significant result, which suggests the presence of a trait gene linked to this marker. We conclude that for the proband-family design the score statistic is a powerful and robust tool for detecting clustering of outcomes.

  4. Development of a Mathematical Ability Test: A Validity and Reliability Study

    ERIC Educational Resources Information Center

    Dündar, Sefa; Temel, Hasan; Gündüz, Nazan

    2016-01-01

    The identification of talented students accurately at an early age and the adaptation of the education provided to the students depending on their abilities are of great importance for the future of the countries. In this regard, this study aims to develop a mathematical ability test for the identification of the mathematical abilities of students…

  5. Rejoinder: Constructs and Measurement Principles in the Second Century of Ability Testing

    ERIC Educational Resources Information Center

    Embretson, Susan E.

    2004-01-01

    "The Second Century of Ability Testing: Some Predictions and Speculations" did not include predictions about the ability construct or the role of fundamental measurement principles. All commentators raised issues about the nature of the ability construct. The diverse viewpoints represented in these comments highlight well the complexity…

  6. Behavioural linear standardized scoring system of the Lidia cattle breed by testing in herd: estimation of genetic parameters.

    PubMed

    Pelayo, R; Solé, M; Sánchez, M J; Molina, A; Valera, M

    2016-10-01

    Docility is very important for cattle production, and many behavioural tests to measure this trait have been developed. However, very few objective behavioural tests to measure the opposite trait, 'aggressive behaviour', have been described. Therefore, the aim of this work was to validate in the Lidia cattle breed a behavioural linear standardized scoring system that measures aggressiveness and enables genetic analysis of behavioural traits expressing fearlessness and fighting ability. Reproducibility and repeatability measures were calculated for the 12 linear traits of this scoring system to assess its accuracy, and ranged from 85.3 to 94.2%, and from 66.7 to 97.9%, respectively. Genetic parameters were estimated using an animal model with a Bayesian approach. A total of 1202 behavioural records were used. The pedigree matrix contained 5001 individuals. Heritability values (with standard deviations) ranged between 0.13 (0.04) (Falls of the bull) and 0.41 (0.08) (Speed of approach to horse). Genetic correlations varied from 0.01 (0.07) to 0.90 (0.13). Finally, an exploratory factor analysis using the genetic correlation matrix was calculated. Three main factors were retained to describe the traditional genetic indexes aggressiveness, strength and mobility.

  7. Testing Students with Special Educational Needs in Large-Scale Assessments - Psychometric Properties of Test Scores and Associations with Test Taking Behavior.

    PubMed

    Pohl, Steffi; Südkamp, Anna; Hardt, Katinka; Carstensen, Claus H; Weinert, Sabine

    2016-01-01

    Assessing competencies of students with special educational needs in learning (SEN-L) poses a challenge for large-scale assessments (LSAs). For students with SEN-L, the available competence tests may fail to yield test scores of high psychometric quality, which are, at the same time, measurement invariant to test scores of general education students. We investigated whether we can identify a subgroup of students with SEN-L, for which measurement invariant competence measures of adequate psychometric quality may be obtained with tests available in LSAs. We furthermore investigated whether differences in test-taking behavior may explain dissatisfying psychometric properties and measurement non-invariance of test scores within LSAs. We relied on person fit indices and mixture distribution models to identify students with SEN-L for whom test scores with satisfactory psychometric properties and measurement invariance may be obtained. We also captured differences in test-taking behavior related to guessing and missing responses. As a result we identified a subgroup of students with SEN-L for whom competence scores of adequate psychometric quality that are measurement invariant to those of general education students were obtained. Concerning test taking behavior, there was a small number of students who unsystematically picked response options. Removing these students from the sample slightly improved item fit. Furthermore, two different patterns of missing responses were identified that explain to some extent problems in the assessments of students with SEN-L.

  8. Testing Students with Special Educational Needs in Large-Scale Assessments – Psychometric Properties of Test Scores and Associations with Test Taking Behavior

    PubMed Central

    Pohl, Steffi; Südkamp, Anna; Hardt, Katinka; Carstensen, Claus H.; Weinert, Sabine

    2016-01-01

    Assessing competencies of students with special educational needs in learning (SEN-L) poses a challenge for large-scale assessments (LSAs). For students with SEN-L, the available competence tests may fail to yield test scores of high psychometric quality, which are—at the same time—measurement invariant to test scores of general education students. We investigated whether we can identify a subgroup of students with SEN-L, for which measurement invariant competence measures of adequate psychometric quality may be obtained with tests available in LSAs. We furthermore investigated whether differences in test-taking behavior may explain dissatisfying psychometric properties and measurement non-invariance of test scores within LSAs. We relied on person fit indices and mixture distribution models to identify students with SEN-L for whom test scores with satisfactory psychometric properties and measurement invariance may be obtained. We also captured differences in test-taking behavior related to guessing and missing responses. As a result we identified a subgroup of students with SEN-L for whom competence scores of adequate psychometric quality that are measurement invariant to those of general education students were obtained. Concerning test taking behavior, there was a small number of students who unsystematically picked response options. Removing these students from the sample slightly improved item fit. Furthermore, two different patterns of missing responses were identified that explain to some extent problems in the assessments of students with SEN-L. PMID:26941665

  9. The Benefits of Preschool: Do Children Who Attend Preschool Prior to Kindergarten Achieve Higher Test Scores

    ERIC Educational Resources Information Center

    Harrington, Julie

    2015-01-01

    The purpose of this quantitative study was to determine what impact, if any, attending a four-year-old kindergarten program had on five-year-old kindergarteners' reading ability as measured by Dominie testing, compared to those five-year-olds who did not attend a four-year-old program at Inman Elementary School. The significance of this study…

  10. Comparison of physical therapy anatomy performance and anxiety scores in timed and untimed practical tests.

    PubMed

    Schwartz, Sarah M; Evans, Cathy; Agur, Anne M R

    2015-01-01

    Students in health care professional programs face many stressful tests that determine successful completion of their program. Test anxiety during these high stakes examinations can affect working memory and lead to poor outcomes. Methods of decreasing test anxiety include lengthening the time available to complete examinations or evaluating students using untimed examinations. There is currently no consensus in the literature regarding whether untimed examinations provide a benefit to test performance in clinical anatomy. This study aimed to determine the impact of timed versus untimed practical tests on Master of Physical Therapy student anatomy performance and test anxiety. Test anxiety was measured using the State-Trait Anxiety Inventory (STAI). Differences in performance, anxiety scores, and time taken were compared using paired sample Student's t-tests. Eighty-one of the 84 students completed the study and provided feedback. Students performed significantly higher on the untimed test (P = 0.005), with a significant reduction in test anxiety (P < 0.001). Students who were unsuccessful on the timed test showed the greatest improvement on the untimed test (mean = 20.4 ± 10%). Eighty-three percent (n = 69) of students preferred the untimed test, 8.4% (n = 7) the timed test, and 8.4% (n = 7) had no preference. Students took on average eight minutes longer on the untimed test. This study found that physical therapy students perform better on untimed tests, which may be related to a reduction in test anxiety. If the intended goal of evaluating health care professional students is to determine fundamental competencies, these factors should be considered when designing future curricula.
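
    The paired-sample Student's t-test named in the abstract above can be sketched as below. This is a minimal illustration of the test statistic only (mean of the paired differences divided by its standard error); the function name `paired_t` and the scores are hypothetical and are not the study's data, and the p-value lookup is left out.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired observations.

    first, second: scores for the same students under two conditions,
    in the same order.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    # Standard error of the mean difference uses the sample standard deviation.
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical scores for four students on a timed and an untimed test.
timed = [70, 65, 80, 75]
untimed = [74, 70, 82, 81]
t_stat, dof = paired_t(timed, untimed)
print(round(t_stat, 3), dof)
```

    Pairing matters here because each student serves as their own control, so the test works on within-student differences rather than on the two group means separately.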

  11. A Test for the Assessment of Pragmatic Abilities and Cognitive Substrates (APACS): Normative Data and Psychometric Properties.

    PubMed

    Arcara, Giorgio; Bambini, Valentina

    2016-01-01

    The Assessment of Pragmatic Abilities and Cognitive Substrates (APACS) test is a new tool to evaluate pragmatic abilities in clinical populations with acquired communicative deficits, ranging from schizophrenia to neurodegenerative diseases. APACS focuses on two main domains, namely discourse and non-literal language, combining traditional tasks with refined linguistic materials in Italian, in a unified framework inspired by language pragmatics. The test includes six tasks (Interview, Description, Narratives, Figurative Language 1, Humor, Figurative Language 2) and three composite scores (Pragmatic Productions, Pragmatic Comprehension, APACS Total). Psychometric properties and normative data were computed on a sample of 119 healthy participants representative of the general population. The analysis revealed acceptable internal consistency and good test-retest reliability for almost every APACS task, suggesting that items are coherent and performance is consistent over time. Factor analysis supports the validity of the test, revealing two factors possibly related to different facets and substrates of the pragmatic competence. Finally, excellent match between APACS items and scores and the pragmatic constructs measured in the test was evidenced by experts' evaluation of content validity. The performance on APACS showed a general effect of demographic variables, with a negative effect of age and a positive effect of education. The norms were calculated by means of state-of-the-art regression methods. Overall, APACS is a valuable tool for the assessment of pragmatic deficits in verbal communication. The short duration and easiness of administration make the test especially suitable to use in clinical settings. In presenting APACS, we also aim at promoting the inclusion of pragmatics in the assessment practice, as a relevant dimension in defining the patient's cognitive profile, given its vital role for communication and social interaction in daily life. The combined

  12. A Test for the Assessment of Pragmatic Abilities and Cognitive Substrates (APACS): Normative Data and Psychometric Properties

    PubMed Central

    Arcara, Giorgio; Bambini, Valentina

    2016-01-01

    The Assessment of Pragmatic Abilities and Cognitive Substrates (APACS) test is a new tool to evaluate pragmatic abilities in clinical populations with acquired communicative deficits, ranging from schizophrenia to neurodegenerative diseases. APACS focuses on two main domains, namely discourse and non-literal language, combining traditional tasks with refined linguistic materials in Italian, in a unified framework inspired by language pragmatics. The test includes six tasks (Interview, Description, Narratives, Figurative Language 1, Humor, Figurative Language 2) and three composite scores (Pragmatic Productions, Pragmatic Comprehension, APACS Total). Psychometric properties and normative data were computed on a sample of 119 healthy participants representative of the general population. The analysis revealed acceptable internal consistency and good test-retest reliability for almost every APACS task, suggesting that items are coherent and performance is consistent over time. Factor analysis supports the validity of the test, revealing two factors possibly related to different facets and substrates of the pragmatic competence. Finally, excellent match between APACS items and scores and the pragmatic constructs measured in the test was evidenced by experts' evaluation of content validity. The performance on APACS showed a general effect of demographic variables, with a negative effect of age and a positive effect of education. The norms were calculated by means of state-of-the-art regression methods. Overall, APACS is a valuable tool for the assessment of pragmatic deficits in verbal communication. The short duration and easiness of administration make the test especially suitable to use in clinical settings. In presenting APACS, we also aim at promoting the inclusion of pragmatics in the assessment practice, as a relevant dimension in defining the patient's cognitive profile, given its vital role for communication and social interaction in daily life. The combined

  13. Evaluating the Effects of Differences in Group Abilities on the Tucker and the Levine Observed-Score Methods for Common-Item Nonequivalent Groups Equating. ACT Research Report Series 2010-1

    ERIC Educational Resources Information Center

    Chen, Hanwei; Cui, Zhongmin; Zhu, Rongchun; Gao, Xiaohong

    2010-01-01

    The most critical feature of a common-item nonequivalent groups equating design is that the average score difference between the new and old groups can be accurately decomposed into a group ability difference and a form difficulty difference. Two widely used observed-score linear equating methods, the Tucker and the Levine observed-score methods,…

  14. A comparison of scoring systems and level of scorer experience on the Bender-Gestalt Test.

    PubMed

    Lacks, P B; Newport, K

    1980-08-01

    Compared the usefulness of four scoring approaches to the Bender-Gestalt Test (Hain, Hutt-Briskin, Pauker, and number of rotations) on the same sample of 50 mixed, psychiatric inpatients. Also, the accuracy of scorers of varying levels of experience was compared. Twelve different scorers were used representing three levels of expertise: "expert," "typical," and "novice." For a measure of reliability and two measures of diagnostic discrimination the Hutt-Briskin and Pauker systems were more successful than the Hain system or number of rotations. For each scoring system there were no differences in diagnostic accuracy attributable to level of past experience. It was recommended that the findings on the Pauker system be cross-validated before being used in clinical settings.

  15. Reviews of the Tests Approved by the Secretary of Education for Ability To Benefit Admissions.

    ERIC Educational Resources Information Center

    Rudner, Lawrence M.

    To comply with the new U.S. Department of Education Ability-To-Benefit policy, schools need to select tests on the Secretary's approved list. The pertinent aspects of 22 approved tests are individually summarized. The test reviews are based on examinations of the test publishers' technical documentation and the tests. Information provided in the…

  16. Detection of Item Preknowledge Using Likelihood Ratio Test and Score Test

    ERIC Educational Resources Information Center

    Sinharay, Sandip

    2017-01-01

    An increasing concern of producers of educational assessments is fraudulent behavior during the assessment (van der Linden, 2009). Benefiting from item preknowledge (e.g., Eckerly, 2017; McLeod, Lewis, & Thissen, 2003) is one type of fraudulent behavior. This article suggests two new test statistics for detecting individuals who may have…

  17. Relationship of Students' Prior Knowledge and Order of Questions on Tests to Students' Test Scores.

    ERIC Educational Resources Information Center

    Papp, Klara K.; And Others

    1987-01-01

    A study examined whether students beginning a cell biology course with prior knowledge of its three areas (genetics, histology, and biochemistry) would retain that advantage throughout the course and whether achievement was influenced by the order of questions in a test. (MSE)

  18. Predicting First-Quarter Test Scores from the New Medical College Admission Test.

    ERIC Educational Resources Information Center

    Cullen, Thomas J.; And Others

    1980-01-01

    The predictive validity of the new Medical College Admission Test as it relates to end-of-quarter examinations in anatomy, histology, physiology, biochemistry, and "ages of man" is presented. Results indicate that the Science Knowledge assessment areas of chemistry and physics and the Science Problems subtest were most useful in…

  19. Galtonian eugenics and the study of growth: the relation of body size, intelligence test score, and social circumstances in children and adults.

    PubMed

    Tanner, J M

    1966-09-01

    The attempt is made to describe and analyze the way in which mental ability, physical size, and social circumstances are related in children and adults. This example is used to develop the thesis that it is exactly at the interface of heredity and environment that positive eugenics may make a significant impact. The belief is that the positive eugenist's attention should be directed at providing the environmental stimuli most appropriate to evoke and derive from each zygote those potentialities which would best enrich and humanize the culture. Focus is on body size and mental ability, the number of children in the family, occupational or socioeconomic class, social stratification and the steady state. Among children of school age there is a significant but low correlation between body size and scores in various tests of ability and attainment, such that larger children score more highly than smaller children of the same age. This correlation diminishes when maturity is reached, but it does not totally disappear. The greater the number of children in the family, the lower their height and the less their scores in mental tests. There are also differences in height and mental ability between children in different socioeconomic groups, and these persist to a degree into adulthood. Taller women tend to rise in the social scale, both in getting jobs and in marriage, while shorter women, on average, tend to sink. It is not known in what proportions heredity and environment contribute to these effects.

  20. Providing Subscale Scores for Diagnostic Information: A Case Study when the Test Is Essentially Unidimensional

    ERIC Educational Resources Information Center

    Stone, Clement A.; Ye, Feifei; Zhu, Xiaowen; Lane, Suzanne

    2010-01-01

    Although reliability of subscale scores may be suspect, subscale scores are the most common type of diagnostic information included in student score reports. This research compared methods for augmenting the reliability of subscale scores for an 8th-grade mathematics assessment. Yen's Objective Performance Index, Wainer et al.'s augmented scores,…

  1. Linking Composite Scores: Effects of Anchor Test Length and Content Representativeness. Research Report. ETS RR-16-36

    ERIC Educational Resources Information Center

    Lin, Peng; Dorans, Neil; Weeks, Jonathan

    2016-01-01

    The nonequivalent groups with anchor test (NEAT) design is frequently used in test score equating or linking. One important assumption of the NEAT design is that the anchor test is a miniversion of the 2 tests to be equated/linked. When the content of the 2 tests is different, it is not possible for the anchor test to be adequately representative…

  2. Test Scores, Class Rank and College Performance: Lessons for Broadening Access and Promoting Success.

    PubMed

    Niu, Sunny X; Tienda, Marta

    2012-04-01

    Using administrative data for five Texas universities that differ in selectivity, this study evaluates the relative influence of two key indicators for college success: high school class rank and standardized tests. Empirical results show that class rank is the superior predictor of college performance and that test score advantages do not insulate lower ranked students from academic underperformance. Using the UT-Austin campus as a test case, we conduct a simulation to evaluate the consequences of capping students admitted automatically using both achievement metrics. We find that using class rank to cap the number of students eligible for automatic admission would have roughly uniform impacts across high schools, but imposing a minimum test score threshold on all students would have highly unequal consequences by greatly reducing the admission eligibility of the highest performing students who attend poor high schools while not jeopardizing admissibility of students who attend affluent high schools. We discuss the implications of the Texas admissions experiment for higher education in Europe.

  3. Test Scores, Class Rank and College Performance: Lessons for Broadening Access and Promoting Success

    PubMed Central

    Niu, Sunny X.; Tienda, Marta

    2012-01-01

    Using administrative data for five Texas universities that differ in selectivity, this study evaluates the relative influence of two key indicators for college success: high school class rank and standardized tests. Empirical results show that class rank is the superior predictor of college performance and that test score advantages do not insulate lower ranked students from academic underperformance. Using the UT-Austin campus as a test case, we conduct a simulation to evaluate the consequences of capping students admitted automatically using both achievement metrics. We find that using class rank to cap the number of students eligible for automatic admission would have roughly uniform impacts across high schools, but imposing a minimum test score threshold on all students would have highly unequal consequences by greatly reducing the admission eligibility of the highest performing students who attend poor high schools while not jeopardizing admissibility of students who attend affluent high schools. We discuss the implications of the Texas admissions experiment for higher education in Europe. PMID:23788828

  4. Multigroup Generalizability Analysis of Verbal, Quantitative, and Nonverbal Ability Tests for Culturally and Linguistically Diverse Students

    ERIC Educational Resources Information Center

    Lakin, Joni M.; Lai, Emily R.

    2012-01-01

    For educators seeking to differentiate instruction, cognitive ability tests sampling multiple content domains, including verbal, quantitative, and nonverbal reasoning, provide superior information about student strengths and weaknesses compared with unidimensional reasoning measures. However, these ability tests have not been fully evaluated with…

  5. Conservatism and Cognitive Ability

    ERIC Educational Resources Information Center

    Stankov, Lazar

    2009-01-01

    Conservatism and cognitive ability are negatively correlated. The evidence is based on 1254 community college students and 1600 foreign students seeking entry to United States' universities. At the individual level of analysis, conservatism scores correlate negatively with SAT, Vocabulary, and Analogy test scores. At the national level of…

  6. Interpreting the "g" Loadings of Intelligence Test Composite Scores in Light of Spearman's Law of Diminishing Returns

    ERIC Educational Resources Information Center

    Reynolds, Matthew R.

    2013-01-01

    The linear loadings of intelligence test composite scores on a general factor ("g") have been investigated recently in factor analytic studies. Spearman's law of diminishing returns (SLODR), however, implies that the "g" loadings of test scores likely decrease in magnitude as "g" increases, or they are nonlinear. The purpose of…

  7. The Relationship Between Nelson-Denny Test Scores and Academic Performance of Educational Opportunity Program Students. EAC Reports.

    ERIC Educational Resources Information Center

    Yamagishi, Midori; Gillmore, Gerald M.

    The relationship of Nelson-Denny Reading Test scores and an English course placement recommendation to academic success of Educational Opportunity Program students at the University of Washington was studied. The placement recommendation was based on a writing sample and test scores. The 207 freshmen students who entered in either 1976 or 1978…

  8. Comparison of the Bender Gestalt Test for Both Black and White Brain-Damaged Patients Using Two Scoring Systems

    ERIC Educational Resources Information Center

    Butler, Oliver T.; And Others

    1976-01-01

    This study tested for cultural bias in the Bender Visual Motor Gestalt Test. Subjects were 72 black and white patients diagnosed as either brain damaged or psychiatric. Bender protocols were scored by Pascal-Suttell and Hain systems. No race effect appeared except for the Pascal-Suttell system for which blacks scored significantly better. (Author)

  9. Effects of Knowledge of Cognitive-Moral Development and Request to Fake on Defining Issues Test P-Scores.

    ERIC Educational Resources Information Center

    Napier, John D.

    1979-01-01

    Supports claims that the "Defining Issues Test" of cognitive-moral development cannot be faked higher. Finds that instruction about cognitive-moral development affected the scores of the teacher trainees who were tested. (RL)

  10. Longitudinal Assessment of Intellectual Abilities of Children with Williams Syndrome: Multilevel Modeling of Performance on the Kaufman Brief Intelligence Test--Second Edition

    ERIC Educational Resources Information Center

    Mervis, Carolyn B.; Kistler, Doris J.; John, Angela E.; Morris, Colleen A.

    2012-01-01

    Multilevel modeling was used to address the longitudinal stability of standard scores (SSs) measuring intellectual ability for children with Williams syndrome (WS). Participants were 40 children with genetically confirmed WS who completed the Kaufman Brief Intelligence Test--Second Edition (KBIT-2; A. S. Kaufman & N. L. Kaufman, 2004) 4-7…

  11. The Development of Extraversion and Ability: Analysis of Data from a Large-Scale Longitudinal Study of Children Tested at 10-11 and 14-15 Years.

    ERIC Educational Resources Information Center

    Anthony, W. S.

    1983-01-01

    Results of analysis of correlations collected by Cookson, following Eysenck and Cookson's study of personality and ability in young people, confirm the finding from previous Cattellian test data that the more intelligent children decline in relative extraversion scores and cast doubt on Eysenck's suggestion that introverts gradually show higher…

  12. A Confirmatory Factor Analysis of Cattell-Horn-Carroll Theory and Cross-Age Invariance of the Woodcock-Johnson Tests of Cognitive Abilities III

    ERIC Educational Resources Information Center

    Taub, Gordon E.; McGrew, Kevin S.

    2004-01-01

    Establishing an instrument's factorial invariance provides the empirical foundation to compare an individual's score across time or to examine the pattern of correlations between variables in differentiated age groups. In the recently published Woodcock-Johnson Tests of Cognitive Ability (WJ COG) and Achievement (WJ ACH) Third Edition (III) the…

  13. Increased correlation coefficient between the written test score and tutors’ performance test scores after training of tutors for assessment of medical students during problem-based learning course in Malaysia

    PubMed Central

    Jaiprakash, Heethal; Min, Aung Ko Ko; Ghosh, Sarmishtha

    2016-01-01

    This paper is aimed at finding if there was a change of correlation between the written test score and tutors’ performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group’s tutors did not receive tutor training; while the second group’s tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors’ performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors’ scores in group 1 was 0.099 (p<0.001) and for group 2 was 0.305 (p<0.001). The higher correlation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course. PMID:26838577

  14. Increased correlation coefficient between the written test score and tutors' performance test scores after training of tutors for assessment of medical students during problem-based learning course in Malaysia.

    PubMed

    Jaiprakash, Heethal; Min, Aung Ko Ko; Ghosh, Sarmishtha

    2016-03-01

    This paper is aimed at finding if there was a change of correlation between the written test score and tutors' performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group's tutors did not receive tutor training; while the second group's tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors' performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors' scores in group 1 was 0.099 (p<0.001) and for group 2 was 0.305 (p<0.001). The higher correlation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course.
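
    The Pearson correlation coefficient reported in this record can be illustrated with a minimal sketch. The function name and the sample scores below are hypothetical and are not taken from the study; the code only shows how such a coefficient is computed from two paired lists of scores.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two paired score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical written-test and tutor-assessment scores for five students.
written_scores = [62, 71, 55, 80, 68]
tutor_scores = [7, 8, 6, 8, 7]
r = pearson_r(written_scores, tutor_scores)
```

    The sketch returns only the coefficient; a significance test such as the p-values quoted in the abstract would need an additional step.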

  15. Associations of maximal strength and muscular endurance test scores with cardiorespiratory fitness and body composition.

    PubMed

    Vaara, Jani P; Kyröläinen, Heikki; Niemi, Jaakko; Ohrankämmen, Olli; Häkkinen, Arja; Kocay, Sheila; Häkkinen, Keijo

    2012-08-01

    The purpose of the present study was to assess the relationships between maximal strength and muscular endurance test scores, in addition to the previously widely studied measures of body composition and maximal aerobic capacity. A total of 846 young men (25.5 ± 5.0 yrs) participated in the study. Maximal strength was measured using isometric bench press, leg extension and grip strength. Muscular endurance tests consisted of push-ups, sit-ups and repeated squats. An indirect graded cycle ergometer test was used to estimate maximal aerobic capacity (VO2max). Body composition was determined with bioelectrical impedance. Moreover, waist circumference (WC) and height were measured and body mass index (BMI) calculated. Maximal bench press was positively correlated with push-ups (r = 0.61, p < 0.001), grip strength (r = 0.34, p < 0.001) and sit-ups (r = 0.37, p < 0.001), while maximal leg extension force revealed only a weak positive correlation with repeated squats (r = 0.23, p < 0.001). However, a moderate correlation between repeated squats and VO2max was found (r = 0.55, p < 0.001). In addition, body mass (BM) and body fat correlated negatively with muscular endurance (r = -0.25 to -0.47, p < 0.001), while fat free mass (FFM) and maximal isometric strength correlated positively (r = 0.36 to 0.44, p < 0.001). In conclusion, muscular endurance test scores were related to maximal aerobic capacity and body fat content, while fat free mass was associated with maximal strength test scores and thus is a major determinant of maximal strength. A contributive role of maximal strength to muscular endurance tests could be identified for the upper, but not the lower, extremities. These findings suggest that the push-up test is indicative not only of body fat content and maximal aerobic capacity but also of maximal strength of the upper body, whereas the repeated squat test is mainly indicative of body fat content and maximal aerobic capacity, but not of maximal strength of the lower extremities.

  16. A toxicity scoring system for the 10-day whole sediment test with Corophium insidiosum (Crawford).

    PubMed

    Prato, Ermelinda; Biandolino, Francesca; Libralato, Giovanni

    2015-04-01

    This study developed a tool to evaluate the potential contamination of marine sediments by detecting the presence or absence of toxicity, in support of environmental decision-making processes. When a sample is toxic, it is important to classify its level of toxicity to understand its subsequent effects and management practices. Corophium insidiosum is a widespread and frequently recorded species along the Mediterranean Sea, North Sea and western Baltic Sea, with records also in the Atlantic Ocean and Pacific Ocean. This amphipod is found in high abundance in shallow brackish inshore areas and estuaries, including those with high turbidity. In Italy, C. insidiosum is more frequently collectable than Corophium orientale, making routine toxicity tests easier to perform. Moreover, according to the international scientific literature, C. insidiosum is more sensitive than C. orientale. Whole sediment toxicity data (10 days) with C. insidiosum were organised in a species-specific toxicity score on the basis of the minimum significant difference (MSD) approach. Thresholds to rank samples as non-toxic and toxic were based on sediment samples (n=84) from the Gulf of Taranto (Italy). A five-class toxicity score (absent, low, medium, high and very high toxicity) was developed, considering the distribution of the 90th percentile of the MSD normalised to the effects on the negative controls (samples from reference sites). This toxicity score could be useful for interpreting potential sediment impacts and providing quick, responsive management information.

  17. Acoustic radiation force impulse elastography for chronic liver disease: comparison with ultrasound-based scores of experienced radiologists, Child-Pugh scores and liver function tests.

    PubMed

    Kim, Ji Eun; Lee, Jae Young; Kim, Yoon Jun; Yoon, Jung Hwan; Kim, Se Hyung; Lee, Jeong Min; Han, Joon Koo; Choi, Byung Ihn

    2010-10-01

    The purpose of our study was to investigate whether acoustic radiation force impulse (ARFI) elastography provides better diagnostic performance for diagnosis of chronic liver disease and correlates better with Child-Pugh scores and liver function tests, compared with an ultrasound (US) scoring system based on visual assessment of conventional B-mode US images by experienced radiologists. Five hundred and twenty-one patients with clinically proven chronic liver disease (n = 293), fatty liver (n = 95) or normal liver (n = 133) were included in this study. B-mode liver US and ARFI elastography were performed in all patients. ARFI elastography was performed at least five times, with each measurement obtained at a different area of the right hepatic lobe; mean shear wave velocity (SWV) was calculated for each patient. The mean SWV was compared with US-based scores from two radiologists (based on liver surface nodularity, parenchyma echotexture and hepatic vein contour), Child-Pugh scores and liver function tests. The mean SWV of the normal liver group was 1.08 m/s ± 0.15; of the fatty liver group, 1.02 m/s ± 0.16; and of the chronic liver disease group, 1.66 m/s ± 0.60 (p < 0.001). The area under the receiver operating characteristics curve of the mean SWV in ARFI elastography was significantly higher than that of the conventional B-mode US-based scores by two radiologists (0.89 vs. 0.74 and 0.77, p < 0.05), with a sensitivity of 75.4% and a specificity of 89.5% at the cut-off value of 1.22 m/s. The sensitivity of the mean SWV was significantly higher than the US-based scores (p < 0.001), although the specificity was not (p > 0.05). The mean SWV was better correlated with Child-Pugh scores and all liver function tests (except total protein) than the US-based scores from two radiologists. In conclusion, ARFI elastography showed better diagnostic performance than visual assessment of experienced radiologists for diagnosis of chronic liver disease, as well as for

  18. The relationship between selected standardized test scores and performance in advanced placement math and science exams: Analyzing the differential effectiveness of scores for course identification and placement

    NASA Astrophysics Data System (ADS)

    Urbina, Josue N.

    There is a national need to increase the STEM-related workforce. Factors leading towards STEM careers include the number of advanced high school mathematics and science courses students complete. Florida's enrollment patterns in STEM-related Advanced Placement (AP) courses, however, reveal that only a small percentage of students enroll in these classes. Therefore, screening tools are needed to find more students for these courses who are academically ready yet have not been identified. The purpose of this study was to investigate the extent to which scores from a national standardized test, the Preliminary Scholastic Assessment Test/National Merit Scholarship Qualifying Test (PSAT/NMSQT), in conjunction with and compared to a state-mandated standardized test, the Florida Comprehensive Assessment Test (FCAT), are related to selected AP exam performance in Seminole County Public Schools. An ex post facto correlational study was conducted using 6,189 student records from the 2010-2012 academic years. Multiple regression analyses using simultaneous Full Model testing showed differential moderate to strong relationships between scores in eight of the nine AP courses examined (i.e., Biology, Environmental Science, Chemistry, Physics B, Physics C Electrical, Physics C Mechanical, Statistics, Calculus AB and BC). For example, the significant unique contribution to overall variance in AP scores was a linear combination of PSAT Math (M), Critical Reading (CR) and FCAT Reading (R) for Biology and Environmental Science. Moderate relationships for Chemistry included a linear combination of PSAT M, W (Writing) and FCAT M; a combination of FCAT M and PSAT M was most significantly associated with Calculus AB performance. These findings have implications for both research and practice. FCAT scores, in conjunction with PSAT scores, can potentially be used for specific STEM-related AP courses, as part of a systematic approach towards AP course identification and placement. For courses with

  19. The effect of constructivist teaching strategies on science test scores of middle school students

    NASA Astrophysics Data System (ADS)

    Vaca, James L., Jr.

    International studies show that the United States is lagging behind other industrialized countries in science proficiency. The studies revealed how American students showed little significant gain on standardized tests in science between 1995 and 2005. Little information is available regarding how reform in American teaching strategies in science could improve student performance on standardized testing. The purpose of this quasi-experimental quantitative study using a pretest/posttest control group design was to examine how the use of a hands-on, constructivist teaching approach with low achieving eighth grade science students affected student achievement on the 2007 Ohio Eighth Grade Science Achievement Test posttest (N = 76). The research question asked how using constructivist teaching strategies in the science classroom affected student performance on standardized tests. Two independent samples of 38 students each consisting of low achieving science students as identified by seventh grade science scores and scores on the Ohio Eighth Grade Science Half-Length Practice Test pretest were used. Four comparisons were made between the control group receiving traditional classroom instruction and the experimental group receiving constructivist instruction including: (a) pretest/posttest standard comparison, (b) comparison of the number of students who passed the posttest, (c) comparison of the six standards covered on the posttest, (d) posttest's sample means comparison. A Mann-Whitney U Test revealed that there was no significant difference between the independent sample distributions for the control group and the experimental group. These findings contribute to positive social change by investigating science teaching strategies that could be used in eighth grade science classes to improve student achievement in science.
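
    The Mann-Whitney U statistic named in this record can be sketched by direct pair counting. The function name and the scores below are hypothetical and are not the study's data; the sketch computes only the U statistics, not the p-value a full test would report.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) by counting pairs; a tie counts as one half."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0   # observation from A outranks observation from B
            elif a == b:
                u_a += 0.5   # tied pair is split between the two groups
    # The two statistics always sum to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical posttest scores for a control and an experimental group.
control = [38, 41, 35, 44, 40]
experimental = [39, 43, 36, 45, 42]
u_control, u_experimental = mann_whitney_u(control, experimental)
```

    By convention the smaller of the two values is compared against a critical value or converted to a p-value; with groups of 38 students each, as in the abstract, a normal approximation would usually be used for that step.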

  20. Participation in a coteaching classroom and students' end-of-course test scores

    NASA Astrophysics Data System (ADS)

    Debro, Ava

    General education students consistently perform poorly on standardized science tests. Coteaching is an instructional strategy that improves the achievement of students with disabilities, but very little research exists that examines the effect of coteaching classrooms on the performance of general education students. The purpose of this study was to examine the effect of coteaching classrooms on the performance of general education students. The constructivist theoretical framework provided the foundation for this research. The research question examined the effect that coteaching classrooms had on the performance of general education biology students. In this experimental design utilizing a posttest-only control group, coteaching instructional strategy was the treatment, and student performance was measured using the scores obtained from the biology end-of-course test. Data for this study was analyzed using an independent t-test. The results of this study revealed that there was not a statistically significant difference in student performance on the biology end-of-course test between treatment and control groups. More than half of the general education biology students enrolled in coteaching classrooms failed the end-of-course test. Researchers may use this study as a catalyst to examine other instructional practices that may improve student performance in science courses. The results of this study may be used to persuade coteachers of the importance of attending frequent professional development opportunities that examine a variety of coteaching instructional strategies. Improving the performance of general education students in science may improve standardized test scores, afford more students the opportunity to attend college, and ensure that students are able to compete on a global level.
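
    The independent t-test mentioned in this record can be illustrated with a minimal sketch. It assumes the pooled-variance (Student) form, which the abstract does not specify, and the function name and scores are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least 2 observations")
    df = n_a + n_b - 2
    # Weighted average of the two sample variances (assumes equal variances).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    if pooled_var == 0:
        raise ValueError("t is undefined when both groups have zero variance")
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical end-of-course scores for cotaught and non-cotaught classes.
cotaught = [64, 70, 58, 66, 61]
not_cotaught = [66, 69, 60, 68, 63]
t_value, dof = pooled_t_statistic(cotaught, not_cotaught)
```

    The t value would then be compared against the t distribution with the returned degrees of freedom to obtain a p-value.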

  1. Scoring in genetically modified organism proficiency tests based on log-transformed results.

    PubMed

    Thompson, Michael; Ellison, Stephen L R; Owen, Linda; Mathieson, Kenneth; Powell, Joanne; Key, Pauline; Wood, Roger; Damant, Andrew P

    2006-01-01

    The study considers data from 2 UK-based proficiency schemes and includes data from a total of 29 rounds and 43 test materials over a period of 3 years. The results from the 2 schemes are similar and reinforce each other. The amplification process used in quantitative polymerase chain reaction determinations predicts a mixture of normal, binomial, and lognormal distributions dominated by the latter 2. As predicted, the study results consistently follow a positively skewed distribution. Log-transformation prior to calculating z-scores is effective in establishing near-symmetric distributions that are sufficiently close to normal to justify interpretation on the basis of the normal distribution.
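
    The log-transformation step described in this record can be sketched as follows. The abstract does not give the formula, so the sketch assumes a z-score of the form (log10 of the result minus log10 of the assigned value) divided by a target standard deviation on the log10 scale; the function name and all numbers are hypothetical.

```python
import math


def log_z_score(result, assigned_value, sigma_log10):
    """z-score computed after log10-transforming result and assigned value."""
    if result <= 0 or assigned_value <= 0:
        raise ValueError("log-transformation requires positive values")
    if sigma_log10 <= 0:
        raise ValueError("sigma_log10 must be positive")
    return (math.log10(result) - math.log10(assigned_value)) / sigma_log10


# Hypothetical round: assigned value of 1.0 (e.g. % GM content) and a
# target standard deviation of 0.2 log10 units (both assumptions).
reported_results = [0.5, 1.0, 2.0]
z_scores = [log_z_score(x, 1.0, 0.2) for x in reported_results]
```

    On the log scale, a result of half the assigned value and a result of twice the assigned value receive z-scores of equal magnitude and opposite sign, which is the near-symmetry the abstract refers to.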

  2. Learning Anatomy Enhances Spatial Ability

    ERIC Educational Resources Information Center

    Vorstenbosch, Marc A. T. M.; Klaassen, Tim P. F. M.; Donders, A. R. T.; Kooloos, Jan G. M.; Bolhuis, Sanneke M.; Laan, Roland F. J. M.

    2013-01-01

    Spatial ability is an important factor in learning anatomy. Students with high scores on a mental rotation test (MRT) systematically score higher on anatomy examinations. This study aims to investigate whether, conversely, learning anatomy improves the MRT score. Five hundred first year students of medicine ("n" = 242, intervention) and…

  3. Power comparisons between similarity-based multilocus association methods, logistic regression, and score tests for haplotypes.

    PubMed

    Lin, Wan-Yu; Schaid, Daniel J

    2009-04-01

    Recently, a genomic distance-based regression for multilocus associations was proposed (Wessel and Schork [2006] Am. J. Hum. Genet. 79:792-806) in which either locus or haplotype scoring can be used to measure genetic distance. Although it allows various measures of genomic similarity and simultaneous analyses of multiple phenotypes, its power relative to other methods for case-control analyses is not well known. We compare the power of traditional methods with this new distance-based approach, for both locus-scoring and haplotype-scoring strategies. We discuss the relative power of these association methods with respect to five properties: (1) the marker informativity; (2) the number of markers; (3) the causal allele frequency; (4) the preponderance of the most common high-risk haplotype; (5) the correlation between the causal single-nucleotide polymorphism (SNP) and its flanking markers. We found that locus-based logistic regression and the global score test for haplotypes suffered from power loss when many markers were included in the analyses, due to many degrees of freedom. In contrast, the distance-based approach was not as vulnerable to more markers or more haplotypes. A genotype counting measure was more sensitive to the marker informativity and the correlation between the causal SNP and its flanking markers. After examining the impact of the five properties on power, we found that on average, the genomic distance-based regression that uses a matching measure for diplotypes was the most powerful and robust method among the seven methods we compared.

  4. Effect of Examinee Certainty on Probabilistic Test Scores and a Comparison of Scoring Methods for Probabilistic Responses.

    DTIC Science & Technology

    1983-07-01

    Department of Psychology, University of Minnesota, Minneapolis, MN 55455. This research was supported by funds from the Air Force Office of Scientific Research... Test Administration: The 30 multiple-choice analogy items chosen were then administered to 299 psychology and biology undergraduate students

  5. The Bender Gestalt Test with the Human Figure Drawing Test for Young School Children. A Manual for Use with the Koppitz Scoring System.

    ERIC Educational Resources Information Center

    Koppitz, Elizabeth Munsterberg

    Presented is a manual for scoring the Bender Gestalt Test and the Human Figure Drawing Test for screening and diagnostic uses with emotionally disturbed, brain damaged, or perceptually handicapped 5- to 11-year-old children. Given are suggestions for administering and scoring the Bender test which examines distortion of shape, rotation,…

  6. A Comparison of the Approaches of Generalizability Theory and Item Response Theory in Estimating the Reliability of Test Scores for Testlet-Composed Tests

    ERIC Educational Resources Information Center

    Lee, Guemin; Park, In-Yong

    2012-01-01

    Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…

  7. Developments and Challenges in the Use of Computer-Based Testing for Assessing Second Language Ability

    ERIC Educational Resources Information Center

    Ockey, Gary J.

    2009-01-01

    Computer-based testing (CBT) to assess second language ability has undergone remarkable development since Garret (1991) described its purpose as "the computerized administration of conventional tests" in "The Modern Language Journal." For instance, CBT has made possible the delivery of more authentic tests than traditional paper-and-pencil tests.…

  8. Individual Differences in Gender Role Beliefs Influence Spatial Ability Test Performance

    ERIC Educational Resources Information Center

    Massa, Laura J.; Mayer, Richard E.; Bohon, Lisa M.

    2005-01-01

    The gender role hypothesis posits that performance on a cognitive ability test is influenced by whether the test instructions frame the test as measuring a skill that is consistent or inconsistent with the test taker's gender role beliefs. The Bem sex role inventory was used to measure the gender role of female college students, and the group…

  9. Test and Score Data Summary for TOEFL[R] Internet-Based and Paper-Based Tests. January 2008-December 2008 Test Data

    ERIC Educational Resources Information Center

    Educational Testing Service, 2008

    2008-01-01

    The Test of English as a Foreign Language[TM], better known as TOEFL[R], is designed to measure the English-language proficiency of people whose native language is not English. TOEFL scores are accepted by more than 6,000 colleges, universities, and licensing agencies in 130 countries. The test is also used by governments, and scholarship and…

  10. On the Interchangeability of Individually Administered and Group Administered Ability Tests

    ERIC Educational Resources Information Center

    Nevo, Baruch; Sela, Roni

    2003-01-01

    This research studied the interchangeability of individually administered and group administered cognitive tests. Seventy undergraduate students took the Hebrew version of the WAIS-R (Wechsler Adult Intelligence Scale-Revised), and their IQs were measured. They also took the IPET (Israeli Psychometric Entrance Test) and their IPET scores were…

  11. Subpopulation Differences in Performance on Tests of Mental Ability: Historical Review and Annotated Bibliography

    DTIC Science & Technology

    1981-08-01

    and white (N = 142,545) job applicants, in 80 occupations throughout the country, on the Wonderlic Personnel Test (a fifty-item measure of general...22.29; (2) Among college graduates, the mean scores for blacks and whites of both sexes were 23.26 and 29.96, respectively. (The Wonderlic Test results

  12. A COMPARISON OF THE EMPIRICAL VALIDITY OF SIX TESTS OF ABILITY WITH EDUCABLE MENTAL RETARDATES.

    ERIC Educational Resources Information Center

    MUELLER, MAX W.

    AN INVESTIGATION OF THE VALIDITY OF INTELLIGENCE AND OTHER TESTS USED IN THE DIAGNOSIS OF RETARDED CHILDREN WAS PERFORMED. EXPERIMENTAL SAMPLES CONSISTED OF 101 CHILDREN SELECTED FROM SPECIAL CLASSES FOR EDUCABLE MENTALLY RETARDED (EMR) WHOSE AGES RANGED FROM 6.9 TO 10 YEARS AND WHOSE IQ SCORES RANGED FROM 50 TO 80. THE TESTS EVALUATED WERE (1)…

  13. CT densitovolumetry in children with obliterative bronchiolitis: correlation with clinical scores and pulmonary function test results

    PubMed Central

    Mocelin, Helena; Bueno, Gilberto; Irion, Klaus; Marchiori, Edson; Sarria, Edgar; Watte, Guilherme; Hochhegger, Bruno

    2013-01-01

    OBJECTIVE: To determine whether air trapping (expressed as the percentage of air trapping relative to total lung volume [AT%]) correlates with clinical and functional parameters in children with obliterative bronchiolitis (OB). METHODS: CT scans of 19 children with OB were post-processed for AT% quantification with the use of a fixed threshold of −950 HU (AT%950) and of thresholds selected with the aid of density masks (AT%DM). Patients were divided into three groups by AT% severity. We examined AT% correlations with oxygen saturation (SO2) at rest, six-minute walk distance (6MWD), minimum SO2 during the six-minute walk test (6MWT_SO2), FVC, FEV1, FEV1/FVC, and clinical parameters. RESULTS: The 6MWD was longer in the patients with larger normal lung volumes (r = 0.53). We found that AT%950 showed significant correlations (before and after the exclusion of outliers, respectively) with the clinical score (r = 0.72; 0.80), FVC (r = 0.24; 0.59), FEV1 (r = −0.58; −0.67), and FEV1/FVC (r = −0.53; r = −0.62), as did AT%DM with the clinical score (r = 0.58; r = 0.63), SO2 at rest (r = −0.40; r = −0.61), 6MWT_SO2 (r = −0.24; r = −0.55), FVC (r = −0.44; r = −0.80), FEV1 (r = −0.65; r = −0.71), and FEV1/FVC (r = −0.41; r = −0.52). CONCLUSIONS: Our results show that AT% correlates significantly with clinical scores and pulmonary function test results in children with OB. PMID:24473764

  14. [Development and clinical testing of the Russian version of the Acute Cystitis Symptom Score - ACSS].

    PubMed

    Alidjanov, J F; Abdufattaev, U A; Makhmudov, D Kh; Mirkhamidov, D Kh; Khadzhikhanov, F A; Azgamov, A V; Pilatz, A; Naber, K G; Wagenlehner, F M; Akilov, F A

    2014-01-01

    The Acute Cystitis Symptom Score (ACSS) was originally developed in the Uzbek language and has demonstrated high reliability and validity. The study aimed to develop a Russian version of the ACSS questionnaire and evaluate its psychometric properties. The ACSS questionnaire contains 18 questions: 6 on the typical symptoms of acute cystitis (AC), 4 on differential diagnosis, 3 on quality of life, and 5 on conditions that may affect the choice of treatment. Its translation and adaptation were performed according to the recommendations developed by the Mapi Research Institute. The study involved 83 Russian-speaking women (mean age, 35.6 ± 13.7 years); 38 (45.8%) patients were in the main group (patients with AC), and 45 (54.2%) were in the control group (without AC). Medical examination and appropriate treatment of the respondents were conducted in accordance with approved standards. After completing the course of therapy, 19 (50%) patients of the main group came for the control examination. There was a statistically significant difference in the scores obtained in the two groups. Score profiles positively correlated with the results of laboratory tests (rho = 0.26-0.48). Cronbach's alpha for the Russian version of the questionnaire was 0.86 (95% CI, 0.81-0.91), and the area under the curve in the ROC analysis was 0.96. The results of testing the Russian version correspond to those of the original version. The Russian version of the ACSS questionnaire has high reliability and validity and can be recommended for clinical research, for diagnosis of primary AC, and for dynamic monitoring of treatment effectiveness in Russian-speaking patients.
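
    The abstract above reports Cronbach's alpha as its internal-consistency measure. As a rough illustration only, the sketch below computes alpha from a respondents-by-items score matrix using the standard formula; the function name and the toy data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose: one tuple per item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical toy data: 3 respondents, 2 items (not from the ACSS study).
toy = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy)
```

    Sample variances (n - 1 denominator) are used for both the items and the totals, which is the usual convention; the confidence interval reported in the abstract would need a separate procedure.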

  15. Improving personality facet scores with multidimensional computer adaptive testing: an illustration with the NEO PI-R.

    PubMed

    Makransky, Guido; Mortensen, Erik Lykke; Glas, Cees A W

    2013-02-01

    Narrowly defined personality facet scores are commonly reported and used for making decisions in clinical and organizational settings. Although these facets are typically related, scoring is usually carried out for a single facet at a time. This method can be ineffective and time consuming when personality tests contain many highly correlated facets. This article investigates the possibility of increasing the precision of the NEO PI-R facet scores by scoring items with multidimensional item response theory and by efficiently administering and scoring items with multidimensional computer adaptive testing (MCAT). The increase in the precision of personality facet scores is obtained from exploiting the correlations between the facets. Results indicate that the NEO PI-R could be substantially shorter without attenuating precision when the MCAT methodology is used. Furthermore, the study shows that the MCAT methodology is particularly appropriate for constructs that have many highly correlated facets.

  16. How Do Executive Functions Fit with the Cattell-Horn-Carroll Model? Some Evidence from a Joint Factor Analysis of the Delis-Kaplan Executive Function System and the Woodcock-Johnson III Tests of Cognitive Abilities

    ERIC Educational Resources Information Center

    Floyd, Randy G.; Bergeron, Renee; Hamilton, Gloria; Parra, Gilbert R.

    2010-01-01

    This study investigated the relations among executive functions and cognitive abilities through a joint exploratory factor analysis and joint confirmatory factor analysis of 25 test scores from the Delis-Kaplan Executive Function System and the Woodcock-Johnson III Tests of Cognitive Abilities. Participants were 100 children and adolescents…

  17. Limitations of the Score-Difference Method in Detecting Cheating in Recognition Test Situations.

    ERIC Educational Resources Information Center

    Roberts, Dennis M.

    1987-01-01

    This study examines a score-difference model for the detection of cheating based on the difference between two scores for an examinee: one based on the appropriate scoring key and another based on an alternative, inappropriate key. It argues that the score-difference method could falsely accuse students as cheaters. (Author/JAZ)

  18. A multivariate spatial mixture model for areal data: examining regional differences in standardized test scores

    PubMed Central

    Neelon, Brian; Gelfand, Alan E.; Miranda, Marie Lynn

    2013-01-01

    Researchers in the health and social sciences often wish to examine joint spatial patterns for two or more related outcomes. Examples include infant birth weight and gestational length, psychosocial and behavioral indices, and educational test scores from different cognitive domains. We propose a multivariate spatial mixture model for the joint analysis of continuous individual-level outcomes that are referenced to areal units. The responses are modeled as a finite mixture of multivariate normals, which accommodates a wide range of marginal response distributions and allows investigators to examine covariate effects within subpopulations of interest. The model has a hierarchical structure built at the individual level (i.e., individuals are nested within areal units), and thus incorporates both individual- and areal-level predictors as well as spatial random effects for each mixture component. Conditional autoregressive (CAR) priors on the random effects provide spatial smoothing and allow the shape of the multivariate distribution to vary flexibly across geographic regions. We adopt a Bayesian modeling approach and develop an efficient Markov chain Monte Carlo model fitting algorithm that relies primarily on closed-form full conditionals. We use the model to explore geographic patterns in end-of-grade math and reading test scores among school-age children in North Carolina. PMID:26401059

  19. Comparison of educationally handicapped students scores on the Revised Developmental Test of Visual-Motor Integration and Bender-Gestalt.

    PubMed

    Breen, M J

    1982-06-01

    32 elementary-aged boys enrolled in a program for the emotionally disturbed were administered the Revised Beery and Bender-Gestalt. A significant correlation of .73 was found between Beery and Bender age-equivalent scores. A t test for correlated data indicated mean scores did not differ significantly from one another, but scores were quite varied. The implications of such variability are discussed.

  20. Confirmation of interrater reliability of the Marley Differential Diagnostic Scoring System for the Bender-Gestalt Test.

    PubMed

    DeCato, C M; Meldrum, D

    1989-06-01

    The Bender-Gestalt test has been one of the most popular clinical instruments for the past four decades. Much controversy has surrounded the use of this test as a screening instrument for organicity (brain dysfunction). Marley's Differential Diagnostic Scoring System was recently developed to improve the validity of the test for detecting organicity. The original standardization of the system reported very high interrater reliability. To provide an independent assessment of interscorer reliability, three raters were trained in the system and separately rated 40 protocols. Kappa coefficients for the three raters ranged from .94 to .98. Substantial interscorer reliability was obtained, Mdn = 92.5% for specific scores with three scores attaining 100% agreement, although some values were lower. These results suggest that there is a strong empirical basis for the scoring system and encourage further refinement of the scoring system to reflect central nervous system dysfunction.
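
    The abstract above reports kappa coefficients for three raters but does not say how they were computed. One common reading is pairwise Cohen's kappa, and the sketch below assumes that reading; the function name and the example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same protocols."""
    n = len(rater_a)
    # Observed agreement: share of protocols given the same label.
    p_o = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings for four protocols (1 = sign present, 0 = absent).
kappa = cohen_kappa([1, 1, 0, 0], [1, 1, 0, 1])
```

    With three raters, this function would be applied to each of the three rater pairs, which would yield a range of values like the one the abstract reports.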

  1. Good test-retest reliability for standard and advanced false-belief tasks across a wide range of abilities.

    PubMed

    Hughes, C; Adlam, A; Happé, F; Jackson, J; Taylor, A; Caspi, A

    2000-05-01

    Although tests of young children's understanding of mind have had a remarkable impact upon developmental and clinical psychological research over the past 20 years, very little is known about their reliability. Indeed, the only existing study of test-retest reliability suggests unacceptably poor results for first-order false-belief tasks (Mayes, Klin, Tercyak, Cicchetti, & Cohen, 1996), although this may in part reflect the nonstandard (video-based) procedures adopted by these authors. The present study had four major aims. The first was to re-examine the reliability of false-belief tasks, using more standard (puppet and storybook) procedures. The second was to assess whether the test-retest reliability of false-belief task performance is equivalent for children of contrasting ability levels. The third aim was to explore whether adopting an aggregate approach improves the reliability with which children's early mental-state awareness can be measured. The fourth aim was to examine for the first time the test-retest reliability of children's performances on more advanced theory-of-mind tasks. Our results suggest that most standard and advanced false-belief tasks do in fact show good test-retest reliability and internal consistency, with very strong test-retest correlations between aggregate scores for children of all levels of ability.

  2. The Information a Test Provides on an Ability Parameter. Research Report. ETS RR-07-18

    ERIC Educational Resources Information Center

    Haberman, Shelby J.

    2007-01-01

    In item-response theory, if a latent-structure model has an ability variable, then elementary information theory may be employed to provide a criterion for evaluation of the information the test provides concerning ability. This criterion may be considered even in cases in which the latent-structure model is not valid, although interpretation of…

  3. Computerized Classification Testing under the One-Parameter Logistic Response Model with Ability-Based Guessing

    ERIC Educational Resources Information Center

    Wang, Wen-Chung; Huang, Sheng-Yun

    2011-01-01

    The one-parameter logistic model with ability-based guessing (1PL-AG) has been recently developed to account for effect of ability on guessing behavior in multiple-choice items. In this study, the authors developed algorithms for computerized classification testing under the 1PL-AG and conducted a series of simulations to evaluate their…

  4. Multiple tests for wind turbine fault detection and score fusion using two-level multidimensional scaling (MDS)

    NASA Astrophysics Data System (ADS)

    Ye, Xiang; Gao, Weihua; Yan, Yanjun; Osadciw, Lisa A.

    2010-04-01

    Wind is an important renewable energy source. The energy and economic return from building wind farms justify the expensive investments in doing so. However, without an effective monitoring system, underperforming or faulty turbines will cause a huge loss in revenue. Early detection of such failures helps prevent these undesired working conditions. We develop three tests on the power curve, rotor speed curve, and pitch angle curve of an individual turbine. In each test, multiple states are defined to distinguish different working conditions, including complete shut-downs, under-performing states, abnormally frequent default states, as well as normal working states. These three tests are combined to reach a final conclusion, which is more effective than any single test. Through extensive data mining of historical data and verification from farm operators, some state combinations are discovered to be strong indicators of spindle failures, lightning strikes, anemometer faults, etc., for fault detection. In each individual test, and in the score fusion of these tests, we apply multidimensional scaling (MDS) to reduce the high dimensional feature space into a 3-dimensional visualization, from which it is easier to discover turbine working information. This approach gains a qualitative understanding of turbine performance status to detect faults, and also provides explanations on what has happened for detailed diagnostics. The state-of-the-art SCADA (Supervisory Control And Data Acquisition) system in industry can only answer the question whether there are abnormal working states, and our evaluation of multiple states in multiple tests is also promising for diagnostics. In the future, these tests can be readily incorporated in a Bayesian network for intelligent analysis and decision support.

  5. How Out-of-Level Testing Affects the Psychometric Quality of Test Scores. Out-of-Level Testing Report 2.

    ERIC Educational Resources Information Center

    Bielinski, John; Thurlow, Martha; Minnema, Jane; Scott, Jim

    This report is a review and analysis of the psychometric literature on the topic of out-of-level testing. Out-of-level testing refers to the practice of using a level of the test other than the test taken by most of the students in a student's current grade level. Much of the research on out-of-level testing was conducted in the 1970s and 1980s,…

  6. The Effect of Having Previously Attended Junior Kindergarten on "Draw-A-Classroom" Test Scores Obtained in Senior Kindergarten.

    ERIC Educational Resources Information Center

    Rogers, Rex S.

    Data are presented which show the degree to which specific prior exposure to a learning situation (Junior Kindergarten) is reflected in the scores of children who had this experience compared to a group of their peers who did not. Scores obtained in Senior Kindergarten on the Draw-a-Classroom Test (DAC) are used as the measurement method. The…

  7. Expanded Koppitz Scoring System of the Bender Gestalt Visual-Motor Test for Adolescents: A Pilot Study.

    ERIC Educational Resources Information Center

    Bolen, Larry M.; And Others

    1992-01-01

    Examined use of Bender Gestalt Visual-Motor Test with school-age adolescents over age 11. Mean error scores suggest that visual-motor development is not maturationally complete by age 11 years, 11 months. Suggests additional research focusing on extending normative sample or developing new scoring system for adolescents. (Author/NB)

  8. Sorting and Supporting: Why Double-Dose Algebra Led to Better Test Scores but More Course Failures

    ERIC Educational Resources Information Center

    Nomi, Takako; Allensworth, Elaine M.

    2013-01-01

    In 2003, Chicago schools required students entering ninth grade with below-average math scores to take two periods of algebra. This led to higher test scores for students with both above- and below-average skills, yet failure rates increased for above-average students. We examine the mechanisms behind these surprising results. Sorting by incoming…

  9. Utilizing the Six Realms of Meaning in Improving Campus Standardized Test Scores through Team Teaching and Strategic Planning

    ERIC Educational Resources Information Center

    Stevenson, Rosnisha D.; Kritsonis, William Allan

    2009-01-01

    This article will seek to utilize Dr. William Allan Kritsonis' book "Ways of Knowing Through the Realms of Meaning" (2007) as a framework to improve a campus's standardized test scores, more specifically, their TAKS (Texas Assessment of Knowledge and Skills) scores. Many campuses have an improvement plan, also known as a Campus…

  10. AN INVESTIGATION OF NON-INDEPENDENCE OF COMPONENTS OF SCORES ON MULTIPLE-CHOICE TESTS. FINAL REPORT.

    ERIC Educational Resources Information Center

    ZIMMERMAN, DONALD W.; BURKHEIMER, GRAHAM J., JR.

    INVESTIGATION IS CONTINUED INTO VARIOUS EFFECTS OF NON-INDEPENDENT ERROR INTRODUCED INTO MULTIPLE-CHOICE TEST SCORES AS A RESULT OF CHANCE GUESSING SUCCESS. A MODEL IS DEVELOPED IN WHICH THE CONCEPT OF THEORETICAL COMPONENTS OF SCORES IS NOT INTRODUCED AND IN WHICH, THEREFORE, NO ASSUMPTIONS REGARDING ANY RELATIONSHIP BETWEEN SUCH COMPONENTS NEED…

  11. Correlations Between the Porteus Maze Test Qualitative Score and Age and Recidivism Rates of Female Correctional Inmates.

    ERIC Educational Resources Information Center

    Pearson, Virginia L.

    This study investigated correlations between the Qualitative score of the Porteus Maze Test and age and rates of recidivism of correctional institution inmates. In addition, the study was structured to provide answers to the following questions: (1) Is there a relationship between age and rates of recidivism and the Conformity-Variability score of…

  12. Improving Personality Facet Scores with Multidimensional Computer Adaptive Testing: An Illustration with the Neo Pi-R

    ERIC Educational Resources Information Center

    Makransky, Guido; Mortensen, Erik Lykke; Glas, Cees A. W.

    2013-01-01

    Narrowly defined personality facet scores are commonly reported and used for making decisions in clinical and organizational settings. Although these facets are typically related, scoring is usually carried out for a single facet at a time. This method can be ineffective and time consuming when personality tests contain many highly correlated…

  13. A Brief Look at: Test Scores and the Standard Error of Measurement. E&R Report No. 10.13

    ERIC Educational Resources Information Center

    Holdzkom, David; Sumner, Brian; McMillen, Brad

    2010-01-01

    In the context of standardized testing, the standard error of measurement (SEM) is a measure of the factors other than the student's actual knowledge of the tested material that may affect the student's test score. Such factors may include distractions in the testing environment, fatigue, hunger, or even luck. This means that a student's observed…
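
    The abstract above describes the SEM conceptually and gives no formula. Under classical test theory the SEM is commonly computed as SD × sqrt(1 − reliability); the sketch below assumes that definition, and the function names and numbers are hypothetical rather than taken from the report.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: sd * sqrt(1 - reliability)."""
    return sd * sqrt(1.0 - reliability)


def score_band(observed, sd, reliability, width=1.0):
    """Band of +/- width SEMs around an observed score."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - width * sem, observed + width * sem


# Hypothetical test: score SD of 10 points and reliability of 0.91.
sem = standard_error_of_measurement(10.0, 0.91)   # about 3 points
low, high = score_band(50.0, 10.0, 0.91)          # about 47 to 53
```

    A band like this is one way to convey that an observed score is an estimate: a more reliable test yields a smaller SEM and a narrower band.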

  14. Guided-Inquiry Lessons Raise Scores on the Sixth Grade Georgia Science Test

    NASA Astrophysics Data System (ADS)

    Page, Purlie M.

    At the local level, G Middle School has the highest district-wide percentage of 6th grade science students who are not meeting standards. It is imperative that G Middle School take corrective action to reduce the number of students failing to meet state science standards. Dewey's theory of conceptual framework, which involves knowledge constructed on a person's personal experience and mind activity through active forms of learning, guided this study. The goal of the study was to determine whether inquiry-based science modules produce greater 6th grade science achievement, as measured by an equivalent instrument of the science section of the Georgia Criterion-Referenced Competency Test, when compared to traditional instruction among eastern Georgia 6th graders. The sample consisted of 230 students in the nonintervention group and 119 students in the intervention group. All students were from intact classes. At the end of the intervention, an independent t test was conducted to analyze the scores. The t test (t = 12.33, df = 304.56, p < 0.05) indicated that the difference between the means was statistically significant. This project's potential impact on social change includes increasing student motivation towards, comprehension of, and interest in science concepts. At the local level, these inquiry lessons can be shared with science teachers across grade levels and within the district to improve county-wide science scores. An increase in student interest and comprehension of science concepts could ultimately lead to the United States producing more students in the fields of science, technology, engineering, and mathematics (STEM) education.
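
    The fractional degrees of freedom reported above (df = 304.56, where a pooled-variance test on 230 + 119 students would give df = 347) are consistent with a Welch (unequal-variance) t test, although the abstract does not say which variant was used. The sketch below shows how Welch's statistic and its Welch-Satterthwaite degrees of freedom are computed; the function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each group mean.
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical scores for two small groups (not the study's data).
t_stat, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

    Because the degrees of freedom depend on the two group variances, they are generally not an integer, which is how a value such as 304.56 can arise.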

  15. The Applicability of Multidimensional Computerized Adaptive Testing for Cognitive Ability Measurement in Organizational Assessment

    ERIC Educational Resources Information Center

    Makransky, Guido; Glas, Cees A. W.

    2013-01-01

    Cognitive ability tests are widely used in organizations around the world because they have high predictive validity in selection contexts. Although these tests typically measure several subdomains, testing is usually carried out for a single subdomain at a time. This can be ineffective when the subdomains assessed are highly correlated. This…

  16. At the Interface between Language Testing and Second Language Acquisition: Language Ability and Context of Learning

    ERIC Educational Resources Information Center

    Gu, Lin

    2014-01-01

    This study investigated the relationship between latent components of academic English language ability and test takers' study-abroad and classroom learning experiences through a structural equation modeling approach in the context of TOEFL iBT® testing. Data from the TOEFL iBT public dataset were used. The results showed that test takers'…

  17. Genetic Tests for Ability?: Talent Identification and the Value of an Open Future

    ERIC Educational Resources Information Center

    Miah, Andy; Rich, Emma

    2006-01-01

    This paper explores the prospect of genetic tests for performance in physical activity and sports practices. It investigates the terminology associated with genetics, testing, selection and ability as a means towards a socio-ethical analysis of its value within sport, education and society. Our argument suggests that genetic tests need not even be…

  18. Web-Based Adaptive Testing System (WATS) for Classifying Students Academic Ability

    ERIC Educational Resources Information Center

    Lee, Jaemu; Park, Sanghoon; Kim, Kwangho

    2012-01-01

    Computer Adaptive Testing (CAT) has been highlighted as a promising assessment method to fulfill two testing purposes: estimating student academic ability and classifying student academic level. In this paper, we introduced the Web-based Adaptive Testing System (WATS) developed to support a cost effective assessment for classifying…

  19. The Woodcock-Johnson Tests of Cognitive Ability: Concurrent Validity with the WISC-R.

    ERIC Educational Resources Information Center

    Reeve, Ronald E.; And Others

    1979-01-01

    The study compared the Woodcock-Johnson Psycho-Educational Battery Tests of Cognitive Ability and the Wechsler Intelligence Scale for Children--Revised for a sample of 51 learning disabled children (7-11 years old). (Author/SBH)

  20. Protective Service Physical Ability Tests: Establishing Pass/Fail, Ranking, and Banding Procedures.

    ERIC Educational Resources Information Center

    Biddle, Dan; Sill, Nikki Shepherd

    1999-01-01

    Setting pass/fail cutoffs that accurately reflect physical ability required for job performance is a key consideration for public-sector employment testing. Top-down ranking is less appropriate than job-performance expectancy banding. (SK)

  1. Grouped to Achieve: Are There Benefits to Assigning Students to Heterogeneous Cooperative Learning Groups Based on Pre-Test Scores?

    NASA Astrophysics Data System (ADS)

    Werth, Arman Karl

    Cooperative learning has been one of the most widely used instructional practices around the world since the early 1980s. Small learning groups have been in existence since the beginning of the human race. These groups have grown in their variance and complexity over time. Classrooms are getting more diverse every year and instructors need a way to take advantage of this diversity to improve learning. The purpose of this study was to see if heterogeneous cooperative learning groups based on student achievement can be used as a differentiated instructional strategy to increase students' ability to demonstrate knowledge of science concepts and ability to do engineering design. This study includes two different groups made up of two different middle school science classrooms of 25-30 students. These students were given an engineering design problem to solve within cooperative learning groups. One class was put into heterogeneous cooperative learning groups based on students' pre-test scores. The other class was grouped based on random assignment. The study measured the difference between each class's pre-post gains, students' responses to a group interaction form and interview questions addressing their perceptions of the makeup of their groups. The findings of the study were that there was no significant difference between learning gains for the treatment and comparison groups. There was a significant difference between the treatment and comparison groups in student perceptions of their group's ability to stay on task and manage their time efficiently. Both the comparison and treatment groups had a positive perception of the composition of their cooperative learning groups.

  2. Ability Estimates That Order Individuals with Consistent Philosophies.

    ERIC Educational Resources Information Center

    Samejima, Fumiko

    Latent trait models introduced the concept of the latent trait, or ability, as distinct from the test score. There is a recent tendency to treat the test score as though it were a substitute for ability, largely because the test score is a convenient way to place individuals in order. F. Samejima (1969) has shown that, in general, the amount of…

  3. Survival analysis of colorectal cancer patients with tumor recurrence using global score test methodology

    NASA Astrophysics Data System (ADS)

    Zain, Zakiyah; Aziz, Nazrina; Ahmad, Yuhaniz; Azwan, Zairul; Raduan, Farhana; Sagap, Ismail

    2014-12-01

    Colorectal cancer is the third and the second most common cancer worldwide in men and women respectively, and the second in Malaysia for both genders. Surgery, chemotherapy and radiotherapy are among the options available for treatment of patients with colorectal cancer. In clinical trials, the main purpose is often to compare efficacy between experimental and control treatments. Treatment comparisons often involve several responses or endpoints, and this situation complicates the analysis. In the case of colorectal cancer, sets of responses concerned with survival times include: times from tumor removal until the first, the second and the third tumor recurrences, and time to death. For a patient, the time to recurrence is correlated to the overall survival. In this study, global score test methodology is used in combining the univariate score statistics for comparing treatments with respect to each survival endpoint into a single statistic. The data of tumor recurrence and overall survival of colorectal cancer patients are taken from a Malaysian hospital. The results are found to be similar to those computed using the established Wei, Lin and Weissfeld method. Key factors such as ethnicity, gender, age and stage at diagnosis are also reported.

  4. Survival analysis of colorectal cancer patients with tumor recurrence using global score test methodology

    SciTech Connect

    Zain, Zakiyah; Ahmad, Yuhaniz; Azwan, Zairul; Raduan, Farhana; Sagap, Ismail; Aziz, Nazrina

    2014-12-04

    Colorectal cancer is the third and the second most common cancer worldwide in men and women respectively, and the second in Malaysia for both genders. Surgery, chemotherapy and radiotherapy are among the options available for treatment of patients with colorectal cancer. In clinical trials, the main purpose is often to compare efficacy between experimental and control treatments. Treatment comparisons often involve several responses or endpoints, and this situation complicates the analysis. In the case of colorectal cancer, sets of responses concerned with survival times include: times from tumor removal until the first, the second and the third tumor recurrences, and time to death. For a patient, the time to recurrence is correlated to the overall survival. In this study, global score test methodology is used in combining the univariate score statistics for comparing treatments with respect to each survival endpoint into a single statistic. The data of tumor recurrence and overall survival of colorectal cancer patients are taken from a Malaysian hospital. The results are found to be similar to those computed using the established Wei, Lin and Weissfeld method. Key factors such as ethnicity, gender, age and stage at diagnosis are also reported.

  5. Premorbid IQ influence on screening tests' scores in healthy patients and patients with cognitive impairment.

    PubMed

    Alves, Lara; Simões, Mário R; Martins, Cristina; Freitas, Sandra; Santana, Isabel

    2013-06-01

    Cognitive screening tests are well-established tools for detecting cognitive impairment, but concerns regarding the influence of premorbid intelligence on patients' performance and cognitive status classification remain. Risk of inaccurate assessment especially affects elders with high or low premorbid intelligence (who are more likely to be misclassified). The present study examines the influence of premorbid intelligence assessed by the TeLPI (an irregular words reading test) on 2 cognitive screening tests, the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA), in healthy participants and patients with cognitive impairments (mild cognitive impairment and Alzheimer disease). Results show that premorbid IQ influences the MMSE and the MoCA scores in both groups, predicting variance from 8.4% to 33.2%, according to test and group analyzed. Hence, we propose that whenever the MMSE or the MoCA is used, premorbid IQ evaluation should also be considered to ensure correct interpretation and classification.

  6. An exposure-weighted score test for genetic associations integrating environmental risk factors

    PubMed Central

    Han, Summer S.; Rosenberg, Philip S.; Ghosh, Arpita; Landi, Marisa Teresa; Caporaso, Neil E.; Chatterjee, Nilanjan

    2015-01-01

    Current methods for detecting genetic associations lack full consideration of the background effects of environmental exposures. Recently proposed methods to account for environmental exposures have focused on logistic regressions with gene-environment interactions. In this report, we developed a test for genetic association, encompassing a broad range of risk models, including linear, logistic and probit, for specifying joint effects of genetic and environmental exposures. We obtained the test statistics by maximizing over a class of score tests, each of which involves modified standard tests of genetic association through a weight function. This weight function reflects the potential heterogeneity of the genetic effects by levels of environmental exposures under a particular model. Simulation studies demonstrate the robust power of these methods for detecting genetic associations under a wide range of scenarios. Applications of these methods are further illustrated using data from genome-wide association studies of type 2 diabetes with body mass index and of lung cancer risk with smoking. PMID:26134142

  7. Comparison of tests for measuring maximal exercise ability in elite swimmers

    PubMed Central

    Suk, Min-Hwa; Yu, Kyung-Hun; Shin, Yun-A

    2016-01-01

    The purpose of this study was to compare tests for measuring maximal exercise ability in elite swimmers. Male high-school elite swimmers (n=17) performed maximal exercise ability tests. The study used a crossover design in which the swimming tests (field test, water VAMEVAL test, 200-m test, and 400-m test) were performed in random order at 1-week intervals. Heart rate, ratings of perceived exertion (RPE), and lactate level were measured as physiological factors, and swimming velocity (SV), stroke rate (SR), and stroke length (SL) as mechanical factors. The changes in SV, SR, and SL did not differ significantly among the swimming tests. However, the lactate level and RPE in the 200-m test were higher than in the water VAMEVAL test, and the RPE in the 200-m and 400-m tests was higher than in the field test and the water VAMEVAL test. Heart rate and RPE were correlated between the field test and the 400-m test, and heart rate was correlated between the field test and the 200-m test. In this study, the 200-m and 400-m tests were suitable as test methods for establishing the exercise intensity appropriate for the underwater training of swimmers. PMID:27419117

  8. Psychometric properties of the Bender Gestalt Test using Lacks' version of the Hutt-Briskin scoring system.

    PubMed

    Lopez, Michael N; Perez, Jose J; Smith, Whitney E; Castillo, Wendy

    2007-01-01

    Criterion-referenced (Livingston r) and norm-referenced (Gilmer-Feldt r and Coefficient Alpha) techniques were used to calculate the internal consistency reliability of the Bender-Gestalt Test (BGT) Total Score using the 12-item Lacks system of scoring. Livingston's r was found to be .825 for the Lacks BGT cutoff score of 5. The Gilmer-Feldt and alpha coefficients for the Lacks Total Score were found to be .644 and .626, respectively. An item analysis showed that most of the BGT items (9 out of 12) were within established criteria for item difficulty; however, 7 items were found to be poor discriminators. The interscorer reliabilities based on three scorers, two scorers, and a single scorer were found to be .895, .852, and .740, respectively. Due to the low reliabilities and several inherent flaws that were identified with the Lacks scoring system, the authors recommend that users of the BGT consider alternative objective scoring systems.

  9. Predicting Academic Achievement with Cognitive Ability

    ERIC Educational Resources Information Center

    Rohde, Treena Eileen; Thompson, Lee Anne

    2007-01-01

    The purpose of the present study is to explain variation in academic achievement with general cognitive ability and specific cognitive abilities. Grade point average, Wide Range Achievement Test III scores, and SAT scores represented academic achievement. The specific cognitive abilities of interest were: working memory, processing speed, and…

  10. Academic self-concept, interest, grades, and standardized test scores: reciprocal effects models of causal ordering.

    PubMed

    Marsh, Herbert W; Trautwein, Ulrich; Lüdtke, Oliver; Köller, Olaf; Baumert, Jürgen

    2005-01-01

    Reciprocal effects models of longitudinal data show that academic self-concept is both a cause and an effect of achievement. In this study this model was extended to juxtapose self-concept with academic interest. Based on longitudinal data from 2 nationally representative samples of German 7th-grade students (Study 1: N = 5,649, M age = 13.4; Study 2: N = 2,264, M age = 13.7 years), prior self-concept significantly affected subsequent math interest, school grades, and standardized test scores, whereas prior math interest had only a small effect on subsequent math self-concept. Despite stereotypic gender differences in means, linkages relating these constructs were invariant over gender. These results demonstrate the positive effects of academic self-concept on a variety of academic outcomes and integrate self-concept with the developmental motivation literature.

  11. Assessing Growth in Young Children: A Comparison of Raw, Age-Equivalent, and Standard Scores Using the Peabody Picture Vocabulary Test

    ERIC Educational Resources Information Center

    Sullivan, Jeremy R.; Winter, Suzanne M.; Sass, Daniel A.; Svenkerud, Nicole

    2014-01-01

    Many tests provide users with several different types of scores to facilitate interpretation and description of students' performance. Common examples include raw scores, age- and grade-equivalent scores, and standard scores. However, when used within the context of assessing growth among young children, these scores should not be interchangeable…

  12. REPRODUCIBILITY OF THE MODIFIED STAR EXCURSION BALANCE TEST COMPOSITE AND SPECIFIC REACH DIRECTION SCORES

    PubMed Central

    van Lieshout, Remko; Reijneveld, Elja A.E.; van den Berg, Sandra M.; Haerkens, Gijs M.; Koenders, Niek H.; de Leeuw, Arina J.; van Oorsouw, Roel G.; Paap, Davy; Scheffer, Else; Weterings, Stijn

    2016-01-01

    Background: The mSEBT is a screening tool used to evaluate dynamic balance. Most research investigating measurement properties focused on intrarater reliability and was done in small samples. To know whether the mSEBT is useful to discriminate dynamic balance between persons and to evaluate changes in dynamic balance, more research into intra- and interrater reliability and smallest detectable change (synonymous with minimal detectable change) is needed. Purpose: To estimate intra- and interrater reliability and smallest detectable change of the mSEBT in adults at risk for ankle sprain. Study Design: Cross-sectional, test-retest design. Methods: Fifty-five healthy young adults participating in sports at risk for ankle sprain participated (mean ± SD age, 24.0 ± 2.9 years). Each participant performed three test sessions within one hour and was rated by two physical therapists (session 1, rater 1; session 2, rater 2; session 3, rater 1). Participants and raters were blinded for previous measurements. Normalized composite and reach direction scores for the right and left leg were collected. Analysis of variance was used to calculate intraclass correlation coefficient values for intra- and interrater reliability. Smallest detectable change values were calculated based on the standard error of measurement. Results: Intra- and interrater reliability for both legs was good to excellent (intraclass correlation coefficient ranging from 0.87 to 0.94). The intrarater smallest detectable change for the composite score of the right leg was 7.2% and for the left 6.2%. The interrater smallest detectable change for the composite score of the right leg was 6.9% and for the left 5.0%. Conclusion: The mSEBT is a reliable measurement instrument to discriminate dynamic balance between persons. Most smallest detectable change values of the mSEBT appear to be large. More research is needed to investigate if the mSEBT is usable for evaluative purposes. Level of Evidence: Level 2
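
    The abstract above says only that smallest detectable change (SDC) values were derived from the standard error of measurement (SEM). It does not give the formula. The sketch below uses the widely used convention SEM = SD * sqrt(1 - ICC) and SDC = 1.96 * sqrt(2) * SEM, which is an assumption here and may differ from the authors' exact computation. The function names and the example numbers are illustrative and are not taken from the study.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a between-subject SD and an ICC.

    Assumes the common convention SEM = SD * sqrt(1 - ICC).
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem: float, z: float = 1.96) -> float:
    """SDC at the individual level: z * sqrt(2) * SEM.

    sqrt(2) reflects that a change score involves two measurements,
    each carrying its own measurement error.
    """
    return z * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Hypothetical inputs: SD of 10 (in % of leg length) and ICC of 0.91.
    sem = sem_from_icc(sd=10.0, icc=0.91)
    print(f"SEM = {sem:.2f}, SDC = {smallest_detectable_change(sem):.2f}")
```

    Under this convention the SDC is in the same unit as the score itself, so for normalized reach scores it is read as a percentage, as in the abstract.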

  13. The Effect of Continuing Education on the Test Scores of a State Licensing Board Examination.

    ERIC Educational Resources Information Center

    Mergener, Michael A.

    1981-01-01

    This study compares the scores optometrists obtained on a pharmacology examination with years since licensure and type of continuing education participation preceding the examination. Recent licensees scored better than those licensed before 1953. Continuing education activity also promoted better scores. (CT)

  14. Relations of eye color to scores on Bruininks-Oseretsky Test of Motor Proficiency--Short Form.

    PubMed

    Beer, J; Fleming, P

    1989-06-01

    The Bruininks-Oseretsky Test of Motor Proficiency--Short Form (8 subtests and 15 motor skill activities) was administered individually to 28 students. Multivariate analysis of variance showed no association with differences in eye color. There were two significant sex differences on univariate F tests; boys scored better at standing broad jump than girls, while girls scored better at standing on one leg and drawing a straight line than boys.

  15. The TSCA interagency testing committee's approaches to screening and scoring chemicals and chemical groups: 1977-1983

    SciTech Connect

    Walker, J.D.

    1990-12-31

    This paper describes the TSCA interagency testing committee's (ITC) approaches to screening and scoring chemicals and chemical groups between 1977 and 1983. During this time the ITC conducted five scoring exercises to select chemicals and chemical groups for detailed review and to determine which of these chemicals and chemical groups should be added to the TSCA Section 4(e) Priority Testing List. 29 refs., 1 fig., 2 tabs.

  16. The Effect of Transient Students' Scores on the Norm of One High School's Standardized Basic Skills Test Battery.

    ERIC Educational Resources Information Center

    Hill, Carolyn Stevens

    The effect of the standardized test scores of transient students on the mean of the 9th, 10th, and 11th grade standardized test scores of a school was studied. The groups used in the study were grades 9, 10, and 11 at a new high school in Clarksburg (West Virginia). The study was conducted in the spring of 1998. Groups consisted of 204 9th…

  17. Poisson Approximation-Based Score Test for Detecting Association of Rare Variants.

    PubMed

    Fang, Hongyan; Zhang, Hong; Yang, Yaning

    2016-07-01

    Genome-wide association study (GWAS) has achieved great success in identifying genetic variants, but the nature of GWAS has determined its inherent limitations. Under the common disease rare variants (CDRV) hypothesis, the traditional association analysis methods commonly used in GWAS for common variants do not have enough power for detecting rare variants with a limited sample size. As a solution to this problem, pooling rare variants by their functions provides an efficient way for identifying susceptible genes. Rare variants typically have low frequencies of minor alleles, and the distribution of the total number of minor alleles of the rare variants can be approximated by a Poisson distribution. Based on this fact, we propose a new test method, the Poisson Approximation-based Score Test (PAST), for association analysis of rare variants. Two testing methods, namely, ePAST and mPAST, are proposed based on different strategies of pooling rare variants. Simulation results and application to the CRESCENDO cohort data show that our methods are more powerful than the existing methods.
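
    The Poisson approximation that the abstract relies on can be checked numerically. The sketch below is not the PAST, ePAST, or mPAST procedure, whose details are not given in the abstract. It only compares the exact distribution of the total minor-allele count with a Poisson distribution of the same mean, assuming independent variants and independent chromosomes. The function names, sample size, and allele frequencies are hypothetical.

```python
import math


def binom_pmf(n: int, p: float) -> list[float]:
    """Probability mass function of Binomial(n, p) on 0..n."""
    return [math.comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]


def convolve(a: list[float], b: list[float]) -> list[float]:
    """Distribution of the sum of two independent counts."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def exact_total_pmf(n_chrom: int, mafs: list[float]) -> list[float]:
    """Exact pmf of the pooled minor-allele count over all variants."""
    pmf = [1.0]
    for p in mafs:
        pmf = convolve(pmf, binom_pmf(n_chrom, p))
    return pmf


def poisson_pmf(lam: float, k: int) -> float:
    """Poisson pmf computed in log space to avoid overflow for large k."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def total_variation(n_chrom: int, mafs: list[float]) -> float:
    """Total variation distance between the exact pmf and Poisson(lam)."""
    exact = exact_total_pmf(n_chrom, mafs)
    lam = n_chrom * sum(mafs)
    inside = sum(abs(e - poisson_pmf(lam, k)) for k, e in enumerate(exact))
    # Poisson mass beyond the largest possible count also contributes.
    tail = 1.0 - sum(poisson_pmf(lam, k) for k in range(len(exact)))
    return 0.5 * (inside + max(tail, 0.0))


if __name__ == "__main__":
    # Hypothetical setting: 100 diploid subjects (200 chromosomes), 3 rare variants.
    mafs = [0.005, 0.002, 0.001]
    print(f"total variation distance = {total_variation(200, mafs):.6f}")
```

    By Le Cam's inequality the distance is at most the sum of squared success probabilities over all Bernoulli trials, here 200 * (0.005^2 + 0.002^2 + 0.001^2) = 0.006, so the approximation tightens as minor allele frequencies shrink.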

  18. Examination of Substance Use, Risk Factors, and Protective Factors on Student Academic Test Score Performance

    PubMed Central

    Arthur, Michael W.; Brown, Eric C.; Briney, John S.; Hawkins, J. David; Abbott, Robert D.; Catalano, Richard F.; Becker, Linda; Langer, Michael; Mueller, Martin T.

    2016-01-01

    BACKGROUND School administrators and teachers face difficult decisions about how best to use school resources in order to meet academic achievement goals. Many are hesitant to adopt prevention curricula that are not focused directly on academic achievement. Yet, some have hypothesized that prevention curricula can remove barriers to learning and, thus, promote achievement. This study examined relationships between school levels of student substance use and risk and protective factors that predict adolescent problem behaviors and achievement test performance in Washington State. METHODS Hierarchical Generalized Linear Models were used to examine predictive associations between school-averaged levels of substance use and risk and protective factors and Washington State students’ likelihood of meeting achievement test standards on the Washington Assessment of Student Learning, statistically controlling for demographic and economic factors known to be associated with achievement. RESULTS Results indicate that levels of substance use and risk/protective factors predicted the academic test score performance of students. Many of these effects remained significant even after controlling for model covariates. CONCLUSIONS The findings suggest that implementing prevention programs that target empirically identified risk and protective factors has the potential to positively affect students’ academic achievement. PMID:26149305

  19. A powerful score test to detect positive selection in genome-wide scans

    PubMed Central

    Zhong, Ming; Lange, Kenneth; Papp, Jeanette C; Fan, Ruzong

    2010-01-01

    One of the surest signatures of recent positive selection is a local elevation of advantageous allele frequency and linkage disequilibrium (LD). We proposed to detect such hitchhiking effects by using extended stretches of homozygosity as a surrogate indicator of recent positive selection. An extended haplotype-based homozygosity score test (EHHST) was developed to detect excess homozygosity. The EHHST conditioned on existing LD and it tested the haplotype version of the Hardy–Weinberg equilibrium. Compared with existing popular tests, which usually lack a clear distribution, the EHHST is asymptotically normal, which makes analysis and applications easier. In particular, the EHHST facilitates the computation of an asymptotic P-value instead of an empirical P-value obtained by simulation. Simulations showed that the EHHST led to appropriate false-positive rates and had power higher than or similar to that of the existing popular methods. The method was applied to HapMap Phase II data. We were able to replicate previous findings of strong positive selection in 17 autosome genomic regions out of 20 reported candidates. On the basis of high EHHST values and population differentiations, we identified 15 new candidate regions that could undergo recent selection. PMID:20461112

  20. One year test-retest reliability of neurocognitive baseline scores in 10- to 12-year olds.

    PubMed

    Moser, Rosemarie Scolaro; Schatz, Philip; Grosner, Emily; Kollias, Kelly

    2017-01-01

    How often youth athletes 10-12 years of age should undergo neurocognitive baseline testing remains an unanswered question. We sought to examine the test-retest reliability of annual ImPACT data in a sample of middle school athletes. Participants were 30 youth athletes, ages 10-12 years (Mean = 11.6, SD = 0.6) selected from a larger database of 10-18 year old athletes, who completed two consecutive annual baseline evaluations using the online version of ImPACT. Athlete assent and parental consent were obtained for all participants. Assessments were conducted either individually or in small groups of 2 to 3 athletes, under the supervision of a neuropsychologist or post-doctoral fellow. Test-retest coefficients were as follows: Verbal Memory .71, Visual Memory .35, Visual Motor Speed .69, Reaction Time .34. Intra-class Correlation Coefficients (single/average) were as follows: Verbal Memory .70/.83, Visual Memory .35/.52, Visual Motor Speed .69/.82, Reaction Time .34/.50. Regression-based measures to correct for practice effects revealed that only a small percentage of cases fell outside 90 and 95% confidence intervals, reflecting stability across assessments. Findings indicate that Verbal Memory and Visual Motor Speed scores are generally stable in 10-12 year old athletes. Nevertheless, Visual Memory Index, Reaction Time Index, and Symptom Checklist scores appear to be less reliable over time, especially compared to published data on high school athletes, suggesting the utility of re-testing on an annual basis in this younger age group.
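
    The single/average ICC pairs reported above are linked by the Spearman-Brown relation, which gives the reliability of the mean of k measurements from the single-measure ICC. The abstract does not say which ICC model was used, so the sketch below is a consistency check under that standard relation with k = 2 (two annual baselines) and is not the authors' computation. The function name is illustrative.

```python
def average_measures_icc(single_icc: float, k: int = 2) -> float:
    """Spearman-Brown step-up: ICC of the mean of k measurements."""
    return k * single_icc / (1.0 + (k - 1) * single_icc)


if __name__ == "__main__":
    # (single, average) pairs as reported in the abstract.
    reported = {
        "Verbal Memory": (0.70, 0.83),
        "Visual Memory": (0.35, 0.52),
        "Visual Motor Speed": (0.69, 0.82),
        "Reaction Time": (0.34, 0.50),
    }
    for name, (single, average) in reported.items():
        predicted = average_measures_icc(single)
        print(f"{name}: predicted {predicted:.3f}, reported {average:.2f}")
```

    Because the published single-measure values are rounded to two decimals, the predicted averages can differ from the reported ones by up to about 0.01.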