Hernandez, Barbara L. Michiels; Ward, Susan; Strickland, George
Legislative mandates and reforms hold universities accountable for student certification test performance. The purpose of this investigation was to determine if cumulative grade point average scores and the preprofessional academic skills test scores predict performance on elementary certification test (professional development) scores of…
Guzeller, Cem Oktay
In this research, the relationship between written exam scores of science and technology class of 6th, 7th, and 8th grades, project, participation in class activities and performance work, year-end academic success point averages and sub-test raw scores of LDT science of 6th, 7th and 8th grades was examined. Academic success point averages were used as…
Legg, Sue M.; Buhr, Dianne C.
Possible causes of a 16-point mean score increase for the computer adaptive form of the College Level Academic Skills Test (CLAST) in reading over the paper-and-pencil test (PPT) in reading are examined. The adaptive form of the CLAST was used in a state-wide field test in which reading, writing, and computation scores for approximately 1,000…
Meijer, Rob R.
This book discusses how to obtain test scores and, in particular, how to obtain test scores from tests that consist of a combination of multiple choice and open-ended questions. The strength of the book is that scoring solutions are presented for a diversity of real world scoring problems. (SLD)
Ickovics, Jeannette R.; Carroll-Scott, Amy; Peters, Susan M.; Schwartz, Marlene; Gilstad-Hayden, Kathryn; McCaslin, Catherine
Background: The Institute of Medicine (2012) concluded that we must “strengthen schools as the heart of health.” To intervene for better outcomes in both health and academic achievement, identifying factors that impact children is essential. Study objectives are to (1) document associations between health assets and academic achievement, and (2) examine cumulative effects of these assets on academic achievement. Methods: Participants include 940 students (grades 5 and 6) from 12 schools randomly selected from an urban district. Data include physical assessments, fitness testing, surveys, and district records. Fourteen health indicators were gathered including physical health (eg, body mass index [BMI]), health behaviors (eg, meeting recommendations for fruit/vegetable consumption), family environment (eg, family meals), and psychological well-being (eg, sleep quality). Data were collected 3-6 months prior to standardized testing. Results: On average, students reported 7.1 health assets out of 14. Those with more health assets were more likely to be at goal for standardized tests (reading/writing/mathematics), and students with the most health assets were 2.2 times more likely to achieve goal compared with students with the fewest health assets (both p < .001). Conclusions: Schools that utilize nontraditional instructional strategies to improve student health may also improve academic achievement, closing equity gaps in both health and academic achievement. PMID: 24320151
Pomplun, Mark R.
This study investigated convergent validity evidence for student growth scores with high school course grades. The Measures of Academic Progress and Educational Planning and Assessment System growth scores for approximately 1,800 ninth-grade students over 2 years were related to language arts and mathematics course grades for developmental,…
Zhao, Sihai Dave; Li, Yi
Variable screening has emerged as a crucial first step in the analysis of high-throughput data, but existing procedures can be computationally cumbersome, difficult to justify theoretically, or inapplicable to certain types of analyses. Motivated by a high-dimensional censored quantile regression problem in multiple myeloma genomics, this paper makes three contributions. First, we establish a score test-based screening framework, which is widely applicable, extremely computationally efficient, and relatively simple to justify. Secondly, we propose a resampling-based procedure for selecting the number of variables to retain after screening according to the principle of reproducibility. Finally, we propose a new iterative score test screening method which is closely related to sparse regression. In simulations we apply our methods to four different regression models and show that they can outperform existing procedures. We also apply score test screening to an analysis of gene expression data from multiple myeloma patients using a censored quantile regression model to identify high-risk genes. PMID:25124197
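As a rough illustration of score-test screening, the sketch below ranks covariates by a marginal score statistic in the simplest setting, an ordinary linear model, where the statistic for testing a zero slope reduces to n times the squared sample correlation. This is an assumption chosen for brevity: the paper above works with other models, including censored quantile regression, and the function name `score_screen` and its arguments are hypothetical.

```python
def score_screen(columns, y, keep):
    """Return indices of the `keep` covariates with the largest marginal
    score statistics (linear-model special case: n * r^2)."""
    n = len(y)
    y_mean = sum(y) / n
    y_c = [v - y_mean for v in y]
    syy = sum(v * v for v in y_c)
    stats = []
    for j, col in enumerate(columns):
        x_mean = sum(col) / n
        x_c = [v - x_mean for v in col]
        sxx = sum(v * v for v in x_c)
        sxy = sum(a * b for a, b in zip(x_c, y_c))
        # Score U = sxy; its null variance is (syy / n) * sxx,
        # so U^2 / Var0(U) = n * sxy^2 / (sxx * syy).
        stat = n * sxy * sxy / (sxx * syy) if sxx > 0 and syy > 0 else 0.0
        stats.append((stat, j))
    # Largest statistic first; ties broken by column index.
    stats.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in stats[:keep]]


# Toy data (made up for illustration): three covariates, five observations.
y = [1, 2, 3, 4, 5]
columns = [[1, 2, 3, 4, 5], [1, -1, 1, -1, 1], [5, 4, 3, 2, 2]]
selected = score_screen(columns, y, keep=2)
```

Each statistic needs only one pass over a single covariate, which is why this kind of screening scales to very large numbers of variables.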
Weinstein, Lawrence; Laverghetta, Antonio; Alexander, Ralph; Stewart, Megan
The current study is an extension of a previous investigation dealing with teacher greetings to students. The present investigation used teacher greetings with college students and academic performance (test scores). We report data using university students and in-class test performance. Students in introductory psychology who received teachers'…
Questions use of value-added assessment of student achievement to solve problems of accountability. Discusses three problems associated with value-added assessment: (1) limited accuracy of testing to measure student gains; (2) factors other than teacher or school quality possibly attributable to gains; and (3) lack of gain comparators for students…
Heinrich Stumpf; Julian C. Stanley
For every 4-year college in the United States listed in the 1998 College Handbook of the College Board, the percentages of students graduating within 6 years of entering and of students having high school grade point averages (GPAs) of at least 3.00 were recorded. The authors also obtained the College Board Scholastic Assessment Test I (SAT I) Verbal and Math
Dearman, Nancy B.; Plisko, Valena White
Looks at four sources for measuring national student performance: (1) the National Assessment of Educational Progress study of basic skills; (2) competency testing in reading, writing, and arithmetic; (3) college entrance examination scores; and (4) rates of educational attainment by sex, race, ability level, and socioeconomic status. (SK)
Academic Testing Services Strategic Plan 2009. Mission statement: Academic Testing Services provides proctored testing services administered in a secure and appropriate standardized testing environment. Vision statement: Academic Testing Services will provide quality services that are integral to recruitment
Bridging the Gap through Academic Intervention Programs: A Quantitative Study of the Efficacy of the Health Sciences and Technology Academy (HSTA) on Underrepresented Students' State Standardized Test Scores
Smith, Feon M.
The purpose of the quantitative research study was to determine if participation in the Health Sciences and Technology Academy (HSTA) led to significant differences in the math and reading/language arts scores on the West Virginia Educational Standards Test 2 (WESTEST 2), between students who participated in the program compared to students who…
A widely held view is that good schools are essential to a nation's international economic success and that high test scores on international tests of academic skills and knowledge indicate how good a nation's schools are. The widespread belief that good schools are an important contributor to a nation's economic success in the world is supported…
Green, Donald Ross
Uses of the variety of scores generated by standardized achievement tests are discussed. Desirable characteristics of scales, raw score scales, percent of correct items, percentile ranks, grade equivalents, normal curve equivalents, and scale scores are considered. The various meanings and purposes of each type of score are discussed. It is…
Paul, Clyde; Rosenkoetter, John
Total scores from a series of classroom examinations compared with the order in which students completed the tests showed a relationship between completion time and test score. The first half of the students to finish scored significantly higher than the last half. (DS)
A User's Guide to Brilliant! Test Scoring and Item Analysis, August 2008. Contents: Program Brilliant!: Test; Test Scoring Enhancements; Scoring different test forms
Green, Donald Ross
Explains achievement test scores, focusing on types, uses, meanings, and relative importance. Describes currently used scales, normal distribution curves, percentile ranks, grade equivalents, and other rating systems. Advises inclusion of more than one kind of standardized test score, since each provides different information. Includes three…
This paper by Stephen P. Klein, et al., was at the center of the Presidential campaign last week as Al Gore seized on its conclusion that the great disparity in Texas between student scores on state (Texas Assessment of Academic Skills) vs. federal (NAEP) tests suggested that the improvements claimed by Governor Bush in the state's education system were in fact inflated, possibly due to a policy of teachers teaching to the Texas tests.
Whittaker, Tiffany A.; Williams, Natasha J.; Dodd, Barbara G.
This study assessed the interpretability of scaled scores based on either number correct (NC) scoring for a paper-and-pencil test or one of two methods of scoring computer-based tests: an item pattern (IP) scoring method and a method based on equated NC scoring. The equated NC scoring method for computer-based tests was proposed as an alternative…
van der Linden, Wim J.
Two local methods for observed-score equating are applied to the problem of equating an adaptive test to a linear test. In an empirical study, the methods were evaluated against a method based on the test characteristic function (TCF) of the linear test and traditional equipercentile equating applied to the ability estimates on the adaptive test…
Traditionally, the test score represented by the number of items answered correctly was taken as an indicator of the examinee's ability level. Researchers still tend to think that the number-correct score is a way of ordering individuals with respect to the latent trait. The objective of this study is to depict the benefits of using ability…
The paper investigates if the provision of financial incentives has an impact on the performance of students in educational tests. The analysis is based on data from an experiment with high school students who answered multiple-choice items from the Third International Mathematics and Science Study (TIMSS). As in TIMSS, the setup did not discourage students from guessing. Students with a
A Study of the Relationship Between Scores on the School and College Ability Test (SCAT Series II), the College English Placement Test (CEPT) and Academic Achievement in American History and Constitution (History 27).
Schaumburg, Gary F.
This paper reports the results of an investigation of the relationship between scores on the School and College Ability Test (SCAT), the College English Placement Test (CEPT), and grades earned in American History and Constitution (History 27 at Cerritos College, California) in order to ascertain if predictability of "successful" or "unsuccessful"…
Aimee L. Webb; Usha Ramakrishnan; Aryeh D. Stein; Daniel W. Sellen; Moeza Merchant; Reynaldo Martorell
Appropriate home management can alleviate many of the consequences of diarrhea including malnutrition, impaired development, growth faltering, and mortality. Maternal cognitive ability, years of schooling, and acquired academic skills are hypothesized to improve child health by improving maternal child care practices, such as illness management. Using information collected longitudinally in 1996–1999 from 466 rural Guatemalan women with children <36 months, we
This article presents three strategies for teaching students who are taking the IELTS speaking test. The first strategy is aimed at improving confidence and uses a variety of self-help materials from the field of popular psychology. The second encourages students to think critically and invokes a range of academic perspectives. The third strategy…
The purpose of this study was to determine if there was a difference in Tennessee Comprehensive Assessment Program Modified Academic Achievement Standards (TCAP MAAS) achievement test scores for special education students who receive their instruction in the resource classroom or in an inclusion classroom. The study involved third, fourth, and…
Current thinking on validity suggests that educational institutions and individuals should evaluate their uses of test scores in the context of their fundamental goals. Regression coefficients and other traditional criterion-related validity statistics provide relevant information, but often do not, by themselves, address the fundamental reasons…
Journal of Blacks in Higher Education, 2003
Academically accomplished applicants to the nation's top colleges usually take SAT II Achievement Tests. While scoring gaps between college-bound Blacks and Whites on these tests tend to be smaller than gaps on the basic SAT, a racial scoring gap persists. However, black students appear to be making progress in closing the racial scoring gap on…
The Quality Control (QC) Guidelines are intended to increase the efficiency, precision, and accuracy of the scoring, analysis, and reporting process of testing. The QC Guidelines focus on large-scale testing operations where multiple forms of tests are created for use on set dates. However, they may also be used for a wide variety of other testing…
Kane, Michael T.
To validate an interpretation or use of test scores is to evaluate the plausibility of the claims based on the scores. An argument-based approach to validation suggests that the claims based on the test scores be outlined as an argument that specifies the inferences and supporting assumptions needed to get from test responses to score-based…
Verret, Erik Phillip
accurately and quickly. . . . Unless the whole program is carefully planned, there is danger that the scoring of tests will be allowed to drag over a period of several months until the faculty and administration, as well as the students, have, 3 lost... for Development of Computer Test Grading and Computer Maintained Course Gradebook", p. 1. 3 Grossman, Alvin and Howe, Robert L., Data Processing for Educators, pp. 152-153. 4 Op. cit., Hedges and Hope, p. 2. Associate Professor of Chemistry. When...
Providing Transparency and Credibility: The Selection of International Students for Australian Universities. An Examination of the Relationship between Scores in the International Student Admissions Test (ISAT), Final Year Academic Programs and an Australian University's Foundation Program
Lai, Kelvin; Nankervis, Susan; Story, Margot; Hodgson, Wayne; Lewenberg, Michael; Ball, Marita MacMahon
Throughout 2003-04 five cohorts of students in their final year of school studies in various Malaysian colleges and a group of students completing an Australian university foundation year in Malaysia sat the International Student Admissions Test (ISAT). The ISAT is a multiple-choice test of general academic abilities developed for students whose…
Karsten T. Hansen; James J. Heckman; Kathleen J. Mullen
This paper develops two methods for estimating the effect of schooling on achievement test scores that control for the endogeneity of schooling by postulating that both schooling and test scores are generated by a common unobserved latent ability. These methods are applied to data on schooling and test scores. Estimates from the two methods are in close agreement. We find
Pae, Hye K.
The aim of this study was to apply Rasch modeling to an examination of the psychometric properties of the "Pearson Test of English Academic" (PTE Academic). Analyzed were 140 test-takers' scores derived from the PTE Academic database. The mean age of the participants was 26.45 (SD = 5.82), ranging from 17 to 46. Conformity of the participants'…
Kuentzel, Jeffrey G.; Hetterscheidt, Lesley A.; Barnett, Douglas
The rigors of standardized testing make for numerous opportunities for examiner error, including simple computational mistakes in scoring. Although experts recommend that test scoring be double-checked, the extent to which independent double-checking would reduce scoring errors is not known. A double-checking procedure was established at a…
Hieronymus, A. N.; Stroud, James B.
Attempts to fill research gap on testing by obtaining comparisons of deviation scores, at grade levels four, seven, and ten, from the California Test of Mental Maturity, Henmon-Nelson Tests, and Lorge-Thorndike Intelligence tests. Results tabulated. (CJ)
In "Beyond Test Scores: Leading Indicators for Education," Foley and colleagues (2008) define leading indicators as those that "provide early signals of progress toward academic achievement" (p. 1) and stress that educators "need leading indicators to help them see the direction their efforts are going in and to take corrective action as soon as…
Hollingsworth, Mary Ann
This study examined the relationship between dimensions of wellness and academic performance for 634 third through fifth grade students in Title One schools in rural Mississippi, using composites of the Five Factor Wellness Inventory for Elementary Children and Reading, Language, and Math Scores of the Mississippi Curriculum Test (a state level…
Hoffman, John L.; Lowitzki, Katie E.
Using a sample of 522 students at a Lutheran university in the Southwestern United States, researchers examined differences in the predictive strength of high school grades and standardized test scores for student involvement, academic achievement, retention, and satisfaction. Findings indicate that high school grades are stronger predictors of…
This study presents a method for journal collection evaluation using citation analysis. Cost-per-use (CPU) for each title is used to measure cost-effectiveness with higher CPU scores indicating cost-effective titles. Use data are based on the impact factor and locally collected citation score of each title and are compared to the cost of managing…
Worrell, Frank C.; Mello, Zena R.
In this study, the authors examined the reliability, structural validity, and concurrent validity of Zimbardo Time Perspective Inventory (ZTPI) scores in a group of 815 academically talented adolescents. Reliability estimates of the purported factors' scores were in the low to moderate range. Exploratory factor analysis supported a five-factor…
Erik E. Noftle; Richard W. Robins
The authors examined relations between the Big Five personality traits and academic outcomes, specifically SAT scores and grade-point average (GPA). Openness was the strongest predictor of SAT verbal scores, and Conscientiousness was the strongest predictor of both high school and college GPA. These relations replicated across 4 independent samples and across 4 different personality inventories. Further analyses showed that Conscientiousness
Levine, Michael V.; Rubin, Donald B.
Appropriateness indexes (statistical formulas) for detecting suspiciously high or low scores on aptitude tests were presented, based on a simulation of the Scholastic Aptitude Test (SAT) with 3,000 simulated scores--2,800 normal and 200 suspicious. The traditional index--marginal probability--uses a model for the normal examinee's test-taking…
Wise, Vicki L.; Wise, Steven L.; Bhola, Dennison S.
Accountability for educational quality is a priority at all levels of education. Low-stakes testing is one way to measure the quality of education that students receive and make inferences about what students know and can do. Aggregate test scores from low-stakes testing programs are suspect, however, to the degree that these scores are influenced…
van der Linden, Wim J.
Presents a constrained computerized adaptive testing (CAT) algorithm that can be used to equate CAT number-correct scores to a reference test. Used an item bank from the Law School Admission Test to compare results of the algorithm with those for equipercentile observed-score equating. Discusses advantages of the approach. (SLD)
Feldt, Leonard S.
In some settings, the validity of a battery composite or a test score is enhanced by weighting some parts or items more heavily than others in the total score. This article describes methods of estimating the total score reliability coefficient when differential weights are used with items or parts.
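For orientation, the standard classical-test-theory expression for the reliability of a weighted composite can be sketched as below. It assumes measurement errors are uncorrelated across parts, so the composite error variance is the sum of w_i^2 * sd_i^2 * (1 - reliability_i). This is the textbook formula rather than necessarily any of the estimation methods the article describes, and the function and argument names are hypothetical.

```python
def weighted_composite_reliability(weights, sds, reliabilities, correlations):
    """Reliability of sum_i w_i * X_i under classical test theory,
    assuming uncorrelated measurement errors across parts.

    weights, sds, reliabilities: sequences of length k.
    correlations: k x k matrix of observed-score correlations (ones on the diagonal).
    """
    k = len(weights)
    # Observed variance of the composite: sum_i sum_j w_i w_j sd_i sd_j r_ij.
    composite_var = sum(
        weights[i] * weights[j] * sds[i] * sds[j] * correlations[i][j]
        for i in range(k)
        for j in range(k)
    )
    # Error variance of the composite: each part contributes w_i^2 * sd_i^2 * (1 - rel_i).
    error_var = sum(
        (weights[i] * sds[i]) ** 2 * (1.0 - reliabilities[i]) for i in range(k)
    )
    return 1.0 - error_var / composite_var


# Two equally weighted parts, each with reliability 0.8 and correlated 0.6 (made-up values).
example = weighted_composite_reliability(
    weights=[1, 1],
    sds=[1, 1],
    reliabilities=[0.8, 0.8],
    correlations=[[1.0, 0.6], [0.6, 1.0]],
)
```

Raising the weight on a more reliable part generally raises the composite reliability, which is the trade-off against validity that differential weighting has to balance.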
Nichta, Lawrence J., Jr.; And Others
Evaluated the Screening Test of Academic Readiness (STAR) using a sample of 28 third graders. The third graders' scores on the Peabody Individual Achievement Test were correlated with their total STAR scores from prekindergarten testing. Results showed the STAR is a useful instrument for predicting third grade achievement. (Author/JAC)
Levine, Michael V.; Rubin, Donald B.
A student may be so unlike other students that his/her aptitude test score fails to be a completely appropriate measure. We consider the problem of using the student's pattern of multiple-choice aptitude test answers to decide whether his/her score is an appropriate ability measure. (Author/CTM)
Schagen, I. P.
A model for the age standardization of test scores is presented, which is fitted to the percentile points of the raw score distribution and assumes a linear trend of each percentile with age. The model's applications in standardizing tests and diagnostic plots produced by a computer program--STANEW--are described. (SLD)
Elizabeth Burleigh; Ian Reeves; Christine McAlpine; James Davie
Objectives: the abbreviated mental test is widely used in the assessment of cognitive impairment in elderly patients. However, many doctors do not administer the full 10 questions, preferring to estimate the patient's score instead. We have studied the accuracy of doctors in predicting patients' abbreviated mental test scores. Methods: we assessed 102 patients in the geriatric unit. We asked doctors
Haberman, Shelby J; Yao, Lili; Sinharay, Sandip
In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE® General Analytical Writing and until 2009 in the case of TOEFL® iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater®. In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability. PMID:25773314
a course in English 101; nor can I earn test credit for English 102 if I have completed a course in English 102 or ENGL 112. I understand... Receiving Credit for English Composition Based on Test Scores. Updated 4
Center on Education Policy, 2010
This paper profiles Wisconsin's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in math at grades 4 and 8 and in reading at grade 8. In grade 4 reading, the percentage scoring…
J. R. Crawford; Paul H. Garthwaite
Neuropsychologists often need to estimate the abnormality of an individual patient’s test score, or test score discrepancies, when the normative or control sample against which the patient is compared is modest in size. Crawford and Howell [The Clinical Neuropsychologist 12 (1998) 482] and Crawford et al. [Journal of Clinical and Experimental Neuropsychology 20 (1998) 898] presented methods for obtaining point
It is challenging for parents and the general public to make sense of the reports on test scores that appear in the mass media. This article offers some things for readers to consider as they bring a critical eye to what is read in the papers. Usually reports on test scores in the media are quite short and focus on one or two aspects of test…
Sireci, Stephen G.; Talento-Miller, Eileen
Admissions data and first-year grade point average (GPA) data from 11 graduate management schools were analyzed to evaluate the predictive validity of Graduate Management Admission Test® (GMAT®) scores and the extent to which predictive validity held across sex and race/ethnicity. The results indicated GMAT verbal and quantitative scores had…
Academic Testing Services Strategic Plan 2010-2015. Contribute to increasing enrollment and promoting student success: provide test assessments which are integral to higher education participation and progression; provide a quality test environment conducive to optimal performance and success; provide quality
Powers, Donald E.
After adjusting for different background characteristics of students, effects on test scores were related to the length and type of test coaching programs offered. The data suggest that the test item types in the Graduate Record Examination General Test appear to show little susceptibility to formal coaching experiences. (Author/DWH)
There is often little correlation between objective tests of writing or writing components and grading by teachers. The technology that can be applied to student writing evaluation lags behind a reasoned rhetorical explanation of test results. Evaluations of writing are inadequate unless they are interpreted within a rhetorical context that…
Integrating mathematics with family and consumer sciences (FCS) has enabled youth to pass the Minnesota 8th Grade Math Basic Skills test. The test focuses on eight content areas: (1) problem solving with whole numbers and fractions; (2) problem solving with percentage/ratio; (3) number sense; (4) estimation; (5) measurement; (6) tables and…
Ezeala, Christian C.; Swami, Niraj S.; Lal, Nilesh; Hussain, Shagufta
Secondary education in Fiji ends with the Form 7 examination. Predictive validity for academic success of Form 7 scores which form the basis for admission into the Bachelor of Medicine Bachelor of Surgery programme of the Fiji School of Medicine was examined via a cohort of 129 students. Success rates for year 1 in 2008, 2009, and 2010 were 90.7…
Kretschmann, Julia; Vock, Miriam; Lüdtke, Oliver
Using German data, we examined the effects of one specific type of acceleration--grade skipping--on academic performance. Prior research on the effects of acceleration has suffered from methodological restrictions, especially due to a lack of appropriate comparison groups and a priori measurements. For this reason, propensity score matching was…
Rooney, Charles; Schaeffer, Bob
More than 275 colleges across the United States now admit some or all of their applicants without regard to Scholastic Assessment Test (SAT) or American College Testing Program (ACT) scores, and many say that the policy has increased both the diversity and the academic quality of their entering classes. Many lessons have been learned at schools…
Feeney, M. Patrick
The study evaluated a distinctive feature scoring technique for List 1 of the California Consonant Test for the purpose of improving test reliability in this test used to identify errors in speech recognition made by adult listeners (N=50) with high frequency sensorineural hearing loss. (DB)
As more colleges move to "test optional" admissions policies, the debate over the utility and interpretation of standardized-test scores continues. In this article, the author interviews Daniel Koretz, a professor of education at Harvard University and author of "Measuring Up: What Educational Testing Really Tells Us". Koretz shares his thoughts…
Center on Education Policy, 2010
This paper profiles Maryland's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased at grades 4 and 8 in both reading and math. Average annual gains were larger on the state test than…
Beatriz U Ramirez (Universidad de Santiago de Chile)
After a sudden increase in most of the individual grades in a multiple-choice test, students were asked to rank the three most relevant factors responsible for this outcome. Among eight others, the availability of a test for self-assessment before the final test was by far the most frequently mentioned (82.4% of the students). Questions applied during different course activities did not have the same effect on student scores as the "online" self-assessment test.
This article argues that so-called 'objective', scientifically 'valid and reliable' tests of aptitude such as ASAT (Australian Scholastic Aptitude Test, used in Queensland as the scaling mechanism for producing a state-wide ranked order of student merit prior to the allocation of tertiary entrance scores), in fact operate to reinforce existing biases in the education system. Drawing on an
Brown, Sarah Lee
The researcher interviewed two groups of eleventh grade students, in a rural Appalachian setting, who tended to score low on the state mandated high stakes/low stakes test to discover their efforts on the test, specifically in reading, and to obtain their opinions concerning the effects of a specific incentive or consequence. Before the eleventh…
Wise, Steven L.
Whenever the purpose of measurement is to inform an inference about a student's achievement level, it is important that we be able to trust that the student's test score accurately reflects what that student knows and can do. Such trust requires the assumption that a student's test event is not unduly influenced by construct-irrelevant factors…
Muller, Jorg M.
A new test index is defined as the probability of obtaining two randomly selected test scores (PDTS) as statistically different. After giving a concept definition of the test index, two simulation studies are presented. The first analyzes the influence of the distribution of test scores, test reliability, and sample size on PDTS within classical…
Bishop, N. Scott
This study examined the effects of different test administration conditions on reading comprehension test scores. Evidence of performance differences across district testing conditions might imply that the meanings and interpretations associated with the corresponding test scores have limited generalizability (i.e., knowing how well one reads…
Academic Testing Services. Jennifer Fidler, Lead Specialist. 2012 Red Raider Orientation. What Do We Do In Academic Testing? We provide testing to Texas Tech students credit … TSI Compliance: Students have to meet certain criteria set
Brennan, Robert L.
Kane's paper "Validating the Interpretations and Uses of Test Scores" is the most complete and clearest discussion yet available of the argument-based approach to validation. At its most basic level, validation as formulated by Kane is fundamentally a simply-stated two-step enterprise: (1) specify the claims inherent in a particular interpretation…
Report on Education Research, 1983
A new study by a pair of Harvard University researchers discounts earlier findings that coaching can substantially improve student performance on the Scholastic Aptitude Test (SAT). "There is simply insufficient evidence that large score increases are a result of a coaching program," write Rebecca…
van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas
This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
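As a point of reference for the single-administration methods named in this abstract, Cronbach's alpha can be computed directly from an examinee-by-item score matrix. The following is a minimal sketch of the standard textbook formula, not the authors' latent class approach; the function name `cronbach_alpha` and the toy data are illustrative assumptions.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of examinee rows, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])                  # number of items
    item_columns = list(zip(*scores))   # transpose: one tuple per item
    item_vars = [variance(col) for col in item_columns]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: two items that order three examinees identically.
perfectly_consistent = [[1, 1], [2, 2], [3, 3]]
# Hypothetical toy data: two uncorrelated dichotomous items.
uncorrelated = [[1, 0], [0, 1], [1, 1], [0, 0]]

alpha_high = cronbach_alpha(perfectly_consistent)
alpha_low = cronbach_alpha(uncorrelated)
```

With these toy inputs the first data set sits at the upper end of the scale and the second at zero, which is the behavior one would expect from items that agree perfectly versus items that share no variance.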
Holland, Paul W.; Thayer, Dorothy T.
Applied the theory of exponential families of distributions to the problem of fitting the univariate histograms and discrete bivariate frequency distributions that often arise in the analysis of test scores. Considers efficient computation of the maximum likelihood estimates of the parameters using Newton's Method and computationally efficient…
van der Linden, Wim J.; Wiberg, Marie
For traditional methods of observed-score equating with anchor-test designs, such as chain and poststratification equating, it is difficult to satisfy the criteria of equity and population invariance. Their equatings are therefore likely to be biased. The bias in these methods was evaluated against a simple local equating method in which the…
Grissom, Jason A.; Kalogrides, Demetra; Loeb, Susanna
Expansion of the use of student test score data to measure teacher performance has fueled recent policy interest in using those data to measure the effects of school administrators as well. However, little research has considered the capacity of student performance data to uncover principal effects. Filling this gap, this article identifies…
Marder, M.; Bansal, D.
We apply visualization and modeling methods for convective and diffusive flows to public school mathematics test scores from Texas. We obtain plots that show the most likely future and past scores of students, the effects of random processes such as guessing, and the rate at which students appear in and disappear from schools. We show that student outcomes depend strongly upon economic class, and identify the grade levels where flows of different groups diverge most strongly. Changing the effectiveness of instruction in one grade naturally leads to strongly nonlinear effects on student outcomes in subsequent grades. PMID:19805049
Helm, Denise Muesch
The principal goal of acceptance criteria is to select candidates who will graduate and transition into professional practice. However, in an attempt to increase the diversity of their student populations, educators are anxious to make changes to the traditional acceptance criteria, such as standardized test scores. Yet data indicate that standardized testing biases against certain populations of students (i.e., female, culturally diverse, and those from lower socioeconomic backgrounds). Fairer assessment measures should continue to be sought. PMID:18847114
Many studies have focused on finding the level of effect that academic locus of control, tendencies towards academic dishonesty, and test anxiety levels have had on academic self-efficacy, and providing a separate explanation ratio for each. The relationship among the effects of the academic locus of control, tendencies towards academic…
Wu, Brad C.
The additive and response patterns scoring methods within and between multiple true-false (MTF) items were examined using data for 5,000 students for each of 2 years from the mathematics portion of the national college entrance examination in Taiwan. For additive scoring at item level, response to each option was scored dichotomously and added up…
Marrah, Arleezah K.
The academic performance of African American students continues to be a concern for educators, researchers, and most importantly their community. This issue is particularly prevalent in the standardized test scores of African American students where they score on average one or more standard deviations below their Caucasian and Asian American…
Dickson, Teresa Kay
This study analyzed student test scores to determine if teacher participation in an inquiry-based professional development was able to make a statistically significant difference in student achievement levels. Test scores for objectives that assessed the critical thinking skills and problem-solving strategies modeled in a science inquiry institute were studied. Inquiry-based experiences are the cornerstones for meeting the science standards for scientific literacy. State mandated assessment tests measure the levels of student achievement and are reported as meeting minimum expectations or showing mastery for specific learning objectives. Student test scores from the Texas Assessment of Academic Skills Test (TAAS) for 8th grade science and the biology End Of Course (EOC) exams were analyzed using ANCOVA, chi square, and logistic regression, with the Iowa Test of Basic Skills (ITBS) 7th Grade Science Subtest as covariate. It was hypothesized that the students of Inquiry Institute teachers would have higher scale scores and better rates of mastery on the critical thinking objectives than the students of non-Institute teachers. It was also hypothesized that it would be possible to predict student mastery on the objectives that assessed critical thinking and problem solving based on Institute participation. This quasi-experimental study did not show a statistically significant difference between the two groups. The effects of inquiry-based professional development may not be determined by analyzing the results of the standardized tests currently being used in Texas. Inquiry training may make a difference, but because of factors such as the ceiling effect, insufficient time to implement the program, and test items that are intended to but do not address critical thinking skills, the TAAS and EOC tests may not accurately assess effects of the Inquiry Institute.
The results of this study did indicate the best predictor of student mastery for the 8th grade science TAAS and Biology EOC may possibly be prior knowledge acquired in elementary school and as demonstrated on the 7th grade ITBS science subtest.
Chafetz, Michael D; Matthews, Lee H
A New interference calculation method for the Stroop test was developed based upon a neuropsychological model of the suppression of word reading in favor of color naming. Polynomial regression equations show a significant relationship between word reading and the New interference score that closely fits the underlying prediction of the New model, while the Golden [Stroop Color and Word Test, Stoelting Co., IL, Wood Dale, 1978] model (Old) produces only a random relationship. Constructs of developmental maturation and lateralized brain damage are supported by the New but not the Old method. The New compared to the Old method also gives a significant reduction in scores in a small sample of demented patients. It would be advisable to use this New model in both cognitive and neuropsychological comparisons of different lesions or different stimulus and response demands. The New model will also help promote finer clinical inferences when an understanding relative to the patient's own baselines is necessary. PMID:15163456
Dollinger, Stephen J.; Clark, M. H.
The issue of race differences in standardized test scores and academic achievement continues to be a vexing one for behavioral scientists and society at large. Ellis and Ryan (2003) suggested that a portion of the cognitive-ability test performance differences between White/Caucasian-American and Black/African-American college students could be…
Dutrow, Anita Marceca; Houston, Charles A.
A study was conducted at Dabney S. Lancaster Community College (DSLCC) to examine the relationships between reading achievement, academic major, selected personality variables, grade point average (GPA), and scores on the College Guidance and Placement Test (CGPT). The Iowa Silent Reading Test, the Survey of Study Habits and Attitudes, and the…
Martin, John D.; And Others
The relationship between Elizur's Hostility Scoring on the Rorschach Test and the Acting-Out Score on the Hand Test was examined. Correlations between the two measures (using several scoring procedures) ranged from .40 to .64. (JKS)
Wolf, Dennis M; Williamson, Peter A
INTRODUCTION To compare the citation indices of original articles and case reports in otolaryngology journals and thereby determine whether case reports are of less interest and possibly of academically inferior value to original articles. MATERIALS AND METHODS All articles in two reputable UK otolaryngology journals (Clinical Otolaryngology and Journal of Laryngology and Otology) for 2000 and 2001 were identified. Citation indices were obtained from ISI Web of Knowledge and compared. Statistical analysis was performed using Microsoft® Office Excel 2003. RESULTS Review articles were cited most frequently with a mean of 5.21 followed by original articles with 4.28 citations and case reports with 2.40 citations. The difference in citing between original articles and case reports was statistically significant (P < 0.0001). There was no statistically significant difference in citations between review articles and original articles. CONCLUSIONS As case reports are clearly of lesser academic value than original and review articles, we suggest a scoring system incorporating journal impact factor and a scoring multiple taking into account study design. This facilitates easier comparison and recognition of publications in curricula vitae during job application. PMID:18990264
Hatcher, Donald L.
In this article, after describing one approach for teaching critical thinking (CT) that was in place at Baker University from 1990 to 2008, the author describes their experience assessing CT using three standardized exams and shows why the choice of a standardized CT test can be problematic and the results misleading. These results can be…
The relationship between selected standardized test scores and performance in advanced placement math and science exams: Analyzing the differential effectiveness of scores for course identification and placement
Urbina, Josue N.
There is a national need to increase the STEM-related workforce. Factors leading towards STEM careers include the number of advanced high school mathematics and science courses students complete. Florida's enrollment patterns in STEM-related Advanced Placement (AP) courses, however, reveal that only a small percentage of students enroll in these classes. Therefore, screening tools are needed to find more students for these courses, who are academically ready, yet have not been identified. The purpose of this study was to investigate the extent to which scores from a national standardized test, Preliminary Scholastic Assessment Test/National Merit Scholarship Qualifying Test (PSAT/NMSQT), in conjunction with and compared to a state-mandated standardized test, Florida Comprehensive Assessment Test (FCAT), are related to selected AP exam performance in Seminole County Public Schools. An ex post facto correlational study was conducted using 6,189 student records from the 2010 - 2012 academic years. Multiple regression analyses using simultaneous Full Model testing showed differential moderate to strong relationships between scores in eight of the nine AP courses (i.e., Biology, Environmental Science, Chemistry, Physics B, Physics C Electrical, Physics C Mechanical, Statistics, Calculus AB and BC) examined. For example, the significant unique contribution to overall variance in AP scores was a linear combination of PSAT Math (M), Critical Reading (CR) and FCAT Reading (R) for Biology and Environmental Science. Moderate relationships for Chemistry included a linear combination of PSAT M, W (Writing) and FCAT M; a combination of FCAT M and PSAT M was most significantly associated with Calculus AB performance. These findings have implications for both research and practice. FCAT scores, in conjunction with PSAT scores, can potentially be used for specific STEM-related AP courses, as part of a systematic approach towards AP course identification and placement.
For courses with moderate to strong relationships, validation studies and development of expectancy tables, which estimate the probability of successful performance on these AP exams, are recommended. Also, findings established a need to examine other related research issues including, but not limited to, extensive longitudinal studies and analyses of other available or prospective standardized test scores.
Doyle, William R.
Several studies have reported a positive impact of increased academic momentum on transfer from community colleges to four-year institutions. This result may be due to selection bias. Using data from the Beginning Postsecondary Students dataset, I test whether taking more credits in the first year has an impact on transfer rates among bachelor's…
Wang, Ting; Merkle, Edgar C.; Zeileis, Achim
In this paper, we consider a family of recently-proposed measurement invariance tests that are based on the scores of a fitted model. This family can be used to test for measurement invariance w.r.t. a continuous auxiliary variable, without pre-specification of subgroups. Moreover, the family can be used when one wishes to test for measurement invariance w.r.t. an ordinal auxiliary variable, yielding test statistics that are sensitive to violations that are monotonically related to the ordinal variable (and less sensitive to non-monotonic violations). The paper is specifically aimed at potential users of the tests who may wish to know (1) how the tests can be employed for their data, and (2) whether the tests can accurately identify specific model parameters that violate measurement invariance (possibly in the presence of model misspecification). After providing an overview of the tests, we illustrate their general use via the R packages lavaan and strucchange. We then describe two novel simulations that provide evidence of the tests' practical abilities. As a whole, the paper provides researchers with the tools and knowledge needed to apply these tests to general measurement invariance scenarios. PMID:24936190
Ho, James K.
Explains how spreadsheet software can be used in the design and grading of academic tests and in assigning grades. Macro programs and menu-driven software are highlighted and an example using IBM PCs and Lotus 1-2-3 software is given. (Author/LRW)
Writing Multiple Choice Tests. Academic Learning Centre, 201 Tier, 480-1481. Hierarchy of Learning … are at a party. A person walks up to you and accidentally spills a drink all over your new pants. You begin to … social psychologists, there is a cognitive bias known as the primacy effect that overemphasizes
Academic Testing Services. Jennifer Fidler, Unit Coordinator; Kimberly Lucus, Lead Specialist. 2013 Red Raider Orientation. What Do We Do In Academic Testing? We provide testing to Texas Tech students and the community - Of importance to students and parents during RRO
Friedman, A F; Wakefield, J A; Sasek, J; Schroeder, D
A new scoring procedure to be used with Spraings' technique for administering the Bender-Gestalt test in a multiple choice format is presented. Scoring weights are used instead of simply scoring each item right or wrong. The evidence presented suggests that this method of scoring would increase the value of Spraings' test in the diagnosis of perceptual deficits. PMID:833302
Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J.
Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…
This study examined the relationship between teacher education students' scores on basic skills admission tests and graduating seniors' scores on the National Teacher Examinations (NTE) at Eastern Kentucky University. The 1981-82 basic skills test scores for 262 teacher education students were compared with their NTE scores taken in 1984-85 during…
Alpay, S. Pamir
Test results in HuskyCT. Step 1: Click on "My Grades". Step 2: Click on the test title. Step 3: Click on the test score. Instructors apply settings that determine the extent of the feedback that students see after taking a test in HuskyCT and when that information becomes available. Minimal
To determine usefulness of current and previous test-day somatic cell score (SCS) in predicting test-day milk yield, test-day records from Holstein first and second calvings between 1995 and 2002 were examined. Initial selection required that cows have at least the first four test days with recorde...
Zimmerman, Donald W.
Results of this study indicate that the correlation between half-test scores over repeated splits, over persons, and over repeated testings resulting in different sets of observed scores, is given by Kuder-Richardson Formula 21. (RF)
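For readers unfamiliar with Kuder-Richardson Formula 21, the sketch below computes it in its usual textbook form from the number of items and a list of total scores. It is offered only as an illustration of the formula named in this abstract, not as the study's derivation; the function name `kr21`, the use of the population variance, and the sample totals are assumptions.

```python
from statistics import mean, pvariance


def kr21(n_items, total_scores):
    """Kuder-Richardson Formula 21 for dichotomously scored items.

    KR-21 = k / (k - 1) * (1 - M * (k - M) / (k * s2))
    where k is the number of items, M is the mean total score, and
    s2 is the variance of total scores (population variance assumed here).
    """
    k = n_items
    m = mean(total_scores)
    s2 = pvariance(total_scores)
    return k / (k - 1) * (1 - m * (k - m) / (k * s2))


# Hypothetical total scores on a 10-item test.
example = kr21(10, [2, 4, 6, 8])
```

KR-21 needs only the test length, mean, and variance of total scores, which is why it is often used when item-level data are unavailable.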
Meijer, Rob R.
Two new methods have been proposed to determine unexpected sum scores on sub-tests (testlets) both for paper-and-pencil tests and computer adaptive tests. A method based on a conservative bound using the hypergeometric distribution, denoted p, was compared with a method where the probability for each score combination was calculated using a…
Niu, Sunny X.; Tienda, Marta
Using administrative data for five Texas universities that differ in selectivity, this study evaluates the relative influence of two key indicators for college success: high school class rank and standardized tests. Empirical results show that class rank is the superior predictor of college performance and that test score advantages do not insulate lower ranked students from academic underperformance. Using the UT-Austin campus as a test case, we conduct a simulation to evaluate the consequences of capping students admitted automatically using both achievement metrics. We find that using class rank to cap the number of students eligible for automatic admission would have roughly uniform impacts across high schools, but imposing a minimum test score threshold on all students would have highly unequal consequences by greatly reducing the admission eligibility of the highest performing students who attend poor high schools while not jeopardizing admissibility of students who attend affluent high schools. We discuss the implications of the Texas admissions experiment for higher education in Europe. PMID:23788828
Branberg, Kenny; And Others
Effects of sex, education, and age on total test score on the Swedish Scholastic Aptitude Test, a college entrance examination, are studied using applicants aged over 25 with 1 to 4 years' work experience. About 10,000 applicants have taken the test annually since 1977. Genuine differences appear in each variable studied. (SLD)
Zenisky, April L.; Hambleton, Ronald K.
Test scores matter these days. Test-takers want to understand how they performed, and test score reports, particularly those for individual examinees, are the vehicles by which most people get the bulk of this information. Historically, score reports have not always met the examinees' information or usability needs, but this is clearly changing…
Lin, Miao-Hsiang; Hsiung, Chao A.
Two simple empirical approximate Bayes estimators are introduced for estimating domain scores under binomial and hypergeometric distributions respectively. Criteria are established regarding use of these functions over maximum likelihood estimation counterparts. (SLD)
Rebecca Holman; Nadine Weisscher; Cees AW Glas; Marcel GW Dijkgraaf; Marinus Vermeulen; Rob J de Haan; Robert Lindeboom
BACKGROUND: Currently, there is a lot of interest in the flexible framework offered by item banks for measuring patient relevant outcomes. However, few item banks have been developed to quantify functional status, as expressed by the ability to perform activities of daily life. This paper examines the measurement properties of the Academic Medical Center linear disability score
Rich, John D., Jr.; Fullard, William; Overton, Willis
One hundred and twelve Latino students from Philadelphia participated in this study, which examined the development of deductive reasoning across adolescence, and the relation of reasoning to test anxiety and standardized test scores. As predicted, 11th and ninth graders demonstrated significantly more advanced reasoning than seventh graders.…
Kane, Thomas J.; Staiger, Douglas O.
By the spring of 2000, forty states had begun using student test scores to rate school performance. Twenty states have gone a step further and are attaching explicit monetary rewards or sanctions to a school's test performance. In this paper, the authors focus on accountability programs in which states measure the effectiveness of individual…
Feldt, Leonard S.
Develops formulas to cope with the situation in which the reliability of test scores must be approximated even though no examinee has taken the complete instrument. Develops different estimators for part tests that are judged to be classically parallel, tau-equivalent, or congeneric. Proposes standards for differentiating among these three models.…
Pankratz, Mary; Morrison, Andrea; Plante, Elena
Differences in the standard scores for the Peabody Picture Vocabulary Test-Revised (PPVT-R; L. M. Dunn & L. M. Dunn, 1981) and the PPVT-Third Edition (PPVT-III; Dunn & Dunn, 1997b) are known to exist for children, with typically higher scores occurring on the PPVT-III. However, these tests are administered into adulthood as well, and score…
Matton, Nadine; Vautier, Stephane; Raufaste, Eric
Mean gain scores for cognitive ability tests between two sessions in a selection setting are now a robust finding, yet not fully understood. Many authors do not attribute such gain scores to an increase in the target abilities. Our approach consists of testing a longitudinal SEM model suitable to this view. We propose to model the scores' changes…
Journal of Blacks in Higher Education, 2003
Discusses the racial scoring gap on tests for admission to medical, business, law, and other graduate programs, noting that in the highest-scoring brackets on the Medical College Admission Test (MCAT), the racial gap is even larger. Whites are five times, twelve times, and seven times more likely, respectively, to score higher on the MCAT, Law…
Pope, Gregory A.; Wentzel, Carolyn; Braden, Brigitta; Anderson, Jordan
The purpose of this study was to investigate statistical relationships between gender and Alberta Achievement Testing Program scores. Achievement test scores from grades 3, 6, and 9 in all subject areas were investigated during a four-year period. Results showed statistically significant positive correlations between gender and scores in most…
Hageman, Barbara H.; Sigman, Clayton B.; Koslosky, John T.
A Test/Score/Report capability is currently being developed for the Transportable Payload Operations Control Center (TPOCC) Advanced Spacecraft Simulator (TASS) system which will automate testing of the Goddard Space Flight Center (GSFC) Payload Operations Control Center (POCC) and Mission Operations Center (MOC) software in three areas: telemetry decommutation, spacecraft command processing, and spacecraft memory load and dump processing. Automated computer control of the acceptance test process is one of the primary goals of a test team. With the proper simulation tools and user interface, the task of acceptance testing, regression testing, and repeatability of specific test procedures of a ground data system can be a simpler task. Ideally, the goal for complete automation would be to plug the operational deliverable into the simulator, press the start button, execute the test procedure, accumulate and analyze the data, score the results, and report the results to the test team along with a go/no-go recommendation. In practice, this may not be possible because of inadequate test tools, pressures of schedules, limited resources, etc. Most tests are accomplished using a certain degree of automation and test procedures that are labor intensive. This paper discusses some simulation techniques that can improve the automation of the test process. The TASS system tests the POCC/MOC software and provides a score based on the test results. The TASS system displays statistics on the success of the POCC/MOC system processing in each of the three areas as well as event messages pertaining to the Test/Score/Report processing. The TASS system also provides formatted reports documenting each step performed during the tests and the results of each step. A prototype of the Test/Score/Report capability is available and currently being used to test some POCC/MOC software deliveries.
When this capability is fully operational it should greatly reduce the time necessary to test a POCC/MOC software delivery, as well as improve the quality of the test process.
Stocking, Martha; And Others
For two tests measuring the same trait, the program BIV20 equates the scores using the two true score distributions estimated by the univariate method 20 program (see Wingersky, Lees, Lennon, and Lord, 1969) and, with these equated true scores and their distributions, estimates the bivariate distribution scores and the relative efficiency of the…
Dorans, Neil J.; Moses, Tim P.; Eignor, Daniel R.
Score equating is essential for any testing program that continually produces new editions of a test and for which the expectation is that scores from these editions have the same meaning over time. Particularly in testing programs that help make high-stakes decisions, it is extremely important that test equating be done carefully and accurately.…
Blackburn, McKinley L.
Previous research has suggested that skills reflected in test-score performance on tests such as the Armed Forces Qualification Test (AFQT) can account for some of the racial differences in average wages. I use a more complete set of test scores available with the National Longitudinal Survey of Youth 1979 Cohort to reconsider this evidence, and…
van der Linden, Wim J.
A constrained computerized adaptive testing (CAT) algorithm is presented that automatically equates the number-correct scores on adaptive tests. The algorithm can be used to equate number-correct scores across different administrations of the same adaptive test as well as to an external reference test. The constraints are derived from a set of…
May, Judy Jackson; Sanders, Eugene T. W.
Districts throughout the nation are engaged in comprehensive transformation to "turn around" low performing schools. Standardized test scores are used to gauge student achievement; however, academic gains may lag behind leading indicators such as improved school climate and effective leadership. This study examines 16 underperforming…
Relationships of National Teacher Examination Communication Skills and General Knowledge Scores with High School and College Grades, Myers-Briggs Type Indicator Characteristics, and Self-Reported Skill Ratings and Academic Problems.
Schurr, K. Terry; And Others
Relationships among National Teacher Examinations (NTE) Communication Skills and General Knowledge test scores, Myers-Briggs Type Indicator characteristics, self-reported academic problems, and 14 skill self-ratings were examined for 161 college teaching majors. After several other variables were controlled, personality variables accounted for a…
PhD Harry R. Goldberg (Johns Hopkins University Zanvyl Krieger Mind/Brain Institute and Department of Biology)
A study was conducted showing that students learn well and score higher on exams in a "Virtual Learning Environment" where they are presented with the same material that is traditionally presented in lecture.
Over the past five years, both DC Public Schools (DCPS) and public charter schools (PCS) have seen significant growth in secondary reading and math scores on the state test known as the District of Columbia Comprehensive Assessment System (DC CAS). However, scores have not improved as much at the elementary level. Reading and math scores for DCPS…
This article considers the claim that machine scoring of writing test responses agrees with human readers as much as humans agree with other humans. These claims about the reliability of machine scoring of writing are usually based on specific and constrained writing tasks, and there is reason for asking whether machine scoring of writing requires…
Smith, Richard M.; Mitchell, Virginia P.
To improve the accuracy of college placement, Rasch scoring and person-fit statistics on the Comparative Guidance and Placement test (CGP) were compared to the traditional right-only scoring. Correlations were calculated between English and mathematics course grades and scores of 1,448 entering freshmen on the reading, writing, and mathematics…
Gavett, Brandon E
The base rates of abnormal test scores in cognitively normal samples have been a focus of recent research. The goal of the current study is to illustrate how Bayes' theorem uses these base rates--along with the same base rates in cognitively impaired samples and prevalence rates of cognitive impairment--to yield probability values that are more useful for making judgments about the absence or presence of cognitive impairment. Correlation matrices, means, and standard deviations were obtained from the Wechsler Memory Scale--4th Edition (WMS-IV) Technical and Interpretive Manual and used in Monte Carlo simulations to estimate the base rates of abnormal test scores in the standardization and special groups (mixed clinical) samples. Bayes' theorem was applied to these estimates to identify probabilities of normal cognition based on the number of abnormal test scores observed. Abnormal scores were common in the standardization sample (65.4% scoring below a scaled score of 7 on at least one subtest) and more common in the mixed clinical sample (85.6% scoring below a scaled score of 7 on at least one subtest). Probabilities varied according to the number of abnormal test scores, base rates of normal cognition, and cutoff scores. The results suggest that interpretation of base rates obtained from cognitively healthy samples must also account for data from cognitively impaired samples. Bayes' theorem can help neuropsychologists answer questions about the probability that an individual examinee is cognitively healthy based on the number of abnormal test scores observed. PMID:25784058
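The Bayes' theorem step described in the preceding abstract can be sketched in a few lines. This is only an illustration, not the study's procedure: the two conditional rates (65.4% and 85.6% with at least one subtest score below a scaled score of 7) are the figures quoted in the abstract, while the 50% base rate of normal cognition and the function name `posterior_normal` are assumptions introduced here.

```python
def posterior_normal(p_evidence_given_normal, p_evidence_given_impaired, prior_normal):
    """Return P(normal cognition | evidence) using Bayes' theorem."""
    if not 0.0 <= prior_normal <= 1.0:
        raise ValueError("prior_normal must lie in [0, 1]")
    numerator = p_evidence_given_normal * prior_normal
    denominator = numerator + p_evidence_given_impaired * (1.0 - prior_normal)
    return numerator / denominator


# Rates quoted in the abstract: share of each sample with at least one
# subtest score below a scaled score of 7.
P_ABNORMAL_GIVEN_NORMAL = 0.654
P_ABNORMAL_GIVEN_CLINICAL = 0.856

# ASSUMPTION: a 50% base rate of normal cognition, chosen only for illustration.
ASSUMED_PRIOR_NORMAL = 0.5

# Probability of normal cognition when at least one abnormal score is observed.
p_normal_if_abnormal = posterior_normal(
    P_ABNORMAL_GIVEN_NORMAL, P_ABNORMAL_GIVEN_CLINICAL, ASSUMED_PRIOR_NORMAL
)

# Probability of normal cognition when no abnormal score is observed.
p_normal_if_clean = posterior_normal(
    1.0 - P_ABNORMAL_GIVEN_NORMAL, 1.0 - P_ABNORMAL_GIVEN_CLINICAL, ASSUMED_PRIOR_NORMAL
)

print(f"P(normal | >=1 abnormal score) = {p_normal_if_abnormal:.3f}")
print(f"P(normal | no abnormal score)  = {p_normal_if_clean:.3f}")
```

Changing the assumed prior shows how strongly the resulting probability depends on the base rate of normal cognition in the setting where the test is used.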
Das, Jishnu; Dercon, Stefan; Habyarimana, James; Krishnan, Pramila; Muralidharan, Karthik; Sundararaman, Venkatesh
Empirical studies of the relationship between school inputs and test scores typically do not account for the fact that households will respond to changes in school inputs. We present a dynamic household optimization model relating test scores to school and household inputs, and test its predictions in two very different low-income country…
Klesch, Heather S.
The reporting of scores on educational tests is at times misunderstood, misinterpreted, and potentially confusing to examinees and other stakeholders who may need to interpret test scores. In reporting test results to examinees, there is a need for clarity in the message communicated. As pressure rises for students to demonstrate performance at a…
K. Das; B. C. Sutradhar
A general nonlinear regression model for repeated measures data is considered. Neyman's partial score tests are derived for the significance of regression parameters as well as overdispersion components of the model. Neyman's score test is asymptotically locally optimal, and the test statistic has asymptotically a χ² distribution under the null hypothesis, with m degrees of freedom, where m is the
The purpose of this study was to compare the achievement of general education students within regular education classes to the achievement of general education students in inclusion/co-teach classes to determine whether there was a significant difference in the achievement between the two groups. The school district's inclusion/co-teach model included ongoing professional development support for teachers and administrators. General education teachers, special education teachers, and teacher assistants collaborated to develop instructional strategies to provide additional remediation to help students acquire the skills needed to master course content. This quantitative study reviewed the end-of-course test (EoCT) scores of Grade 10 physical science and math students within an urban school district. It is not known whether general education students in an inclusive/co-teach science or math course will demonstrate higher achievement on the EoCT in math or science than students not in an inclusive/co-teach classroom setting. In addition, this study sought to determine if students classified as low socioeconomic status benefited from participating in co-teaching classrooms as evidenced by standardized tests. Inferential statistics were used to determine whether there was a significant difference between the achievements of the treatment group (inclusion/co-teach) and the control group (non-inclusion/co-teach). The findings can be used to provide school districts with optional instructional strategies to implement in diverse modern classrooms to increase academic performance on state standardized tests.
Cope, Ronald T.; Kolen, Michael J.
This study compared five density estimation techniques applied to samples from a population of 272,244 examinees' ACT English Usage and Mathematics Usage raw scores. Unsmoothed frequencies, kernel method, negative hypergeometric, four-parameter beta compound binomial, and Cureton-Tukey methods were applied to 500 replications of random samples of…
Noble, Julie; Davenport, Mark; Schiel, Jeff; Pommerich, Mary
This study examined the extent to which selected high school academic variables and noncognitive characteristics of American College Testing (ACT)--tested students explain differential test performance of racial/ethnic and gender groups. Of particular interest was the extent to which the noncognitive variables, over and above course work taken,…
We apply a quantile version of the Oaxaca-Blinder decomposition to estimate the counterfactual distribution of the test scores of Black students. In the Early Childhood Longitudinal Study, Kindergarten Class of 1998-1999 (ECLS-K), we find that the gap initially appears only at the top of the distribution of test scores. As children age, however,…
Silles, Mary A.
This article, using longitudinal data from the National Child Development Study, presents new evidence on the effects of family size and birth order on test scores and behavioral development at age 7, 11 and 16. Sibling size is shown to have an adverse causal effect on test scores and behavioral development. For any given family size, first-borns…
Mertler, Craig A.
This book is designed to help K-12 teachers and administrators understand the nature of standardized tests and, in particular, the scores that result from them. This useful manual helps teachers develop the skills necessary to incorporate these test scores into various types of instructional decision making--a process known as "data-driven…
Increasing standardized test scores in reading and math is of high importance to the California Department of Education to meet requirements mandated by the No Child Left Behind (NCLB) act of 2001. More research is needed to understand the best ways to improve tests scores to meet concerns of the NCLB act. The purpose of the study was to evaluate…
There are many reasons to align tests with curricular standards, but this alignment is not sufficient to protect against score inflation. This report explains the relationship between alignment and score inflation by clarifying what is meant by inappropriate test preparation. It provides a concrete, hypothetical example that illustrates a process…
Ramos, Erica; Alfonso, Vincent C.; Schermerhorn, Susan M.
The interpretation of cognitive test scores often leads to decisions concerning the diagnosis, educational placement, and types of interventions used for children. Therefore, it is important that practitioners administer and score cognitive tests without error. This study assesses the frequency and types of examiner errors that occur during the…
Lord, Frederic M.
Given any observed number-right score on a test, a method is described for obtaining a prediction interval for the corresponding number-right score on a randomly parallel form of the same test. The interval can be written down directly from published tables of the hypergeometric distribution. (Author)
Eleonora Patacchini; Yves Zenou
We investigate the racial gap in test scores between white and non-white students in Britain both in levels and differences across the school years. We find that there is a substantial racial gap in test scores, especially between ages 7 and 11, and a less severe one between ages 11 and 16. It thus seems that nonwhites are losing ground
Roland G. Fryer Jr; Steven D. Levitt
In previous research, a substantial gap in test scores between white and black students persists, even after controlling for a wide range of observable characteristics. Using a newly available data set (the Early Childhood Longitudinal Study), we demonstrate that in stark contrast to earlier studies, the black-white test score gap among incoming kindergartners disappears when we control for a small
Roland G. Fryer Jr.; Steven D. Levitt
In previous research, a substantial gap in test scores between White and Black students persists, even after controlling for a wide range of observable characteristics. Using a newly available data set (Early Childhood Longitudinal Study), we demonstrate that in stark contrast to earlier studies, the Black-White test score gap among incoming kindergartners disappears when we control for a small number
Lockwood, J. R.; McCaffrey, Daniel F.
A common strategy for estimating treatment effects in observational studies using individual student-level data is analysis of covariance (ANCOVA) or hierarchical variants of it, in which outcomes (often standardized test scores) are regressed on pretreatment test scores, other student characteristics, and treatment group indicators. Measurement…
Berends, Mark; Penaloza, Roberto V.
Background/Context: Although there has been progress in closing the test score gaps among student groups over past decades, that progress has stalled. Many researchers have speculated why the test score gaps closed between the early 1970s and the early 1990s, but only a few have been able to empirically study how changes in school factors and…
Cascallar, Alicia S.; Dorans, Neil J.
This study compares two methods commonly used (concordance and prediction) to establish linkages between scores from tests of similar content given in different languages. Score linkages between the Verbal and Math sections of the SAT I and the corresponding sections of the Spanish-language admissions test, the Prueba de Aptitud Academica (PAA),…
Bergeron, Renee; Floyd, Randy G.
This study examined the group- and individual-level part score profiles of children with intellectual disability (ID) who participated in clinical validity studies supporting three individually administered intelligence tests. Across tests, children with ID produced group-level profiles comprising mean part scores that fell in the Low to Very Low…
Chen, Shiu-Sheng; Luoh, Ming-Ching
Using data from the Programme for International Student Assessment (PISA) and the Trends in International Mathematics and Science Study (TIMSS), we investigate the link between test scores (mathematics and science) and cross-country income differences. We would like to know whether test scores are good indicators of labor-force quality. The…
Pellicer-Sanchez, Ana; Schmitt, Norbert
Despite a number of research studies investigating the Yes-No vocabulary test format, one main question remains unanswered: What is the best scoring procedure to adjust for testee overestimation of vocabulary knowledge? Different scoring methodologies have been proposed based on the inclusion and selection of nonwords in the test. However, there…
Strand, Steve; Deary, Ian J.; Smith, Pauline
Background and aims: There is uncertainty about the extent or even existence of sex differences in the mean and variability of reasoning test scores (Jensen, 1998; Lynn, 1994; Mackintosh, 1996). This paper analyses the Cognitive Abilities Test (CAT) scores of a large and representative sample of UK pupils to determine the extent of any sex…
Cech, Scott J.
More students are taking Advanced Placement tests, but the proportion of tests receiving what is deemed a passing score has dipped, and the mean score is down for the fourth year in a row. Data released here this week by the New York City-based nonprofit organization that owns the AP brand shows that a greater-than-ever proportion of students…
Leslie Rescorla; Adena S. Rosenthal
Growth in Test of Cognitive Skills (TCS) scores and Comprehensive Tests of Basic Skills (CTBS) reading, math, and total achievement scores from 3rd to 10th grade was studied in 328 public school students in a middle-class suburban community. Surprisingly, groups differing in ability and achievement in 3rd grade made parallel progress over time, and some
Tamerah N. Hunt; Michael S. Ferrara; L. Stephen Miller; Stephen Macciocchi
Objective: Poor effort on baseline neuropsychological tests is expected to influence interpretation of post-concussion assessment scores. Our study examined effort in an athletic population to determine if poor effort affects neuropsychological test performance.
Lowe, Patricia A.; Papanastasiou, Elena C.; DeRuyck, Kimberly A.; Reynolds, Cecil R.
In this study, the authors investigated the temporal stability and construct validity of the Adult Manifest Anxiety Scale-College Version (AMAS-C; C. R. Reynolds, B. O. Richmond, & P. A. Lowe, 2003b) scores. Results indicated that the AMAS-C scores had adequate to excellent test score stability, and evidence supported the construct validity of the…
This study investigated the change in sophomore reading scores on the Florida Comprehensive Assessment Test after the implementation of an academic vocabulary program and the change in teacher knowledge and professional practice after a program of staff development in academic vocabulary. The purpose was to determine if the impact of the…
Hulett, Judie L; Weiss, Robert E; Bwibo, Nimrod O; Galal, Osman M; Drorbaugh, Natalie; Neumann, Charlotte G
Micronutrient deficiencies and suboptimal energy intake are widespread in rural Kenya, with detrimental effects on child growth and development. Sporadic school feeding programmes rarely include animal source foods (ASF). In the present study, a cluster-randomised feeding trial was undertaken to determine the impact of snacks containing ASF on district-wide, end-term standardised school test scores and nutrient intake. A total of twelve primary schools were randomly assigned to one of three isoenergetic feeding groups (a local plant-based stew (githeri) with meat, githeri plus whole milk or githeri with added oil) or a control group receiving no intervention feeding. After the initial term that served as baseline, children were fed at school for five consecutive terms over two school years from 1999 to 2001. Longitudinal analysis was used controlling for average energy intake, school attendance, and baseline socio-economic status, age, sex and maternal literacy. Children in the Meat group showed significantly greater improvements in test scores than those in all the other groups, and the Milk group showed significantly greater improvements in test scores than the Plain Githeri (githeri+oil) and Control groups. Compared with the Control group, the Meat group showed significant improvements in test scores in Arithmetic, English, Kiembu, Kiswahili and Geography. The Milk group showed significant improvements compared with the Control group in test scores in English, Kiswahili, Geography and Science. Folate, Fe, available Fe, energy per body weight, vitamin B12, Zn and riboflavin intake were significant contributors to the change in test scores. The greater improvements in test scores of children receiving ASF indicate improved academic performance, which can result in greater academic achievement. PMID:24168874
Sackett, P R; Wilk, S L
Various forms of score adjustment have been suggested and used when mean differences by gender, race, or ethnicity are found using preemployment tests. This article examines the rationales for score adjustment and describes and compares different forms of score adjustment, including within-group norming, bonus points, separate cutoffs, and banding. It reviews the legal environment for personnel selection and the circumstances leading to the passage of the Civil Rights Act of 1991. It examines score adjustment in the use of cognitive ability tests, personality inventories, interest inventories, scored biographical data, and physical ability tests and outlines the implications for testing practice of various interpretations of the Civil Rights Act of 1991. PMID:7985886
Winter, David G.
The Advisory Panel on the Scholastic Aptitude Test (SAT) Score Decline contends that many factors have contributed to the drop in scores on the SAT and many other tests. The research evidence and theory about three social motives that could be expected to play some role in test performance and academic functioning are examined: the motives for…
Berson, Barry L.
The purpose of this memo is to present tests that comprise the test battery used to select Navy personnel to train marine mammals, and to describe the scoring procedures of the tests. The test battery consists of: Biosystems General Information Test (BGIT), Personnel History Questionnaire (PHQ), Gordon Personal Inventory, Gordon Personal Profile,…
Paik, Chie Matsuzawa; Michael, William B.
The twofold purpose of this study was to investigate the reliability and construct validity of scores on the Japanese version of an academic self-concept scale titled the Dimensions of Self-Concept (DOSC) Form H and ascertain any relationships between scores on the DOSC scale and selected demographic variables, including class, gender, and…
Coyner, Sandra C.
The doctoral thesis summarized in this document investigated which set of teacher education program admissions criteria best predict achievement by examining the relationship between outcomes in the teacher education program and test scores and other indicators of academic achievement. The particular problem was to determine the predictive value…
Bradley N. Collins; E. Paul Wileyto; Michael F. G. Murphy; Marcus R. Munafò
Purpose: Research has linked prenatal tobacco exposure to neurocognitive and behavioral problems that can disrupt learning and school performance in childhood. Less is known about its effects on academic achievement in adolescence when controlling for known confounding factors (e.g., environmental tobacco smoke (ETS)). We hypothesized that prenatal tobacco exposure would decrease the likelihood of passing academic achievement tests taken
Marco, Gary L.; And Others
Data from the verbal portion of the College Entrance Examination Board Scholastic Aptitude Tests were used in an experimental test of the accuracy of equating for a variety of models in three categories: linear equating, equipercentile equating, and item characteristic curve equating. The models were tested for both mean squared error and bias.…
Zimmerman, Donald W.; Zumbo, Bruno D.
Educational and psychological testing textbooks typically warn of the inappropriateness of performing arithmetic operations and statistical analysis on percentiles instead of raw scores. This seems inconsistent with the well-established finding that transforming scores to ranks and using nonparametric methods often improves the validity and power…
Peng, Pai; Hochweber, Jan; Klieme, Eckhard
Outcome-oriented evaluation of school effectiveness is often based on student test scores in certain critical examinations. This study provides another method of evaluation--value-added--which is based on student achievement progress. This paper introduces the method of estimating the value-added score of schools in multi-level models. Based on…
Binqing Q. Wei; Walter A. Baase; Larry H. Weaver; Brian W. Matthews; Brian K. Shoichet
Prediction of interaction energies between ligands and their receptors remains a major challenge for structure-based inhibitor discovery. Much effort has been devoted to developing scoring schemes that can successfully rank the affinities of a diverse set of possible ligands to a binding site for which the structure is known. To test these scoring functions, well-characterized experimental systems can be very
Wilcox, Rand R.
A procedure is described for determining the minimal length of a mastery test given certain constraints. The procedure assumes that the testor is indifferent to misclassifying some testees who score within a specified range about the passing score. An example and table are provided. (JKS)
Dashfield, A. K.; Lambert, A. W.; Campbell, J. K.; Wilkins, D. C.
We have investigated the correlation between the scores attained on a computerised psychometric test measuring psychomotor aptitude and the learning of surgical reef knot tying. Fifteen surgical trainees performed a test of psychomotor aptitude (ADTRACK 2) from the MICROPAT testing system. They then performed a simple test of their ability to tie a surgical reef knot and were assessed by a panel of experts prior to embarking on a standardised course of instruction and practice session. The knot-tying test was repeated at the end of the day and the differences in average scores recorded. There was a significant correlation between the means of the differences in knot tying scores and ADTRACK 2 scores (r = -0.533, P < 0.05). Psychomotor abilities appear to be determinants of trainees' initial proficiency in learning to tie a surgical reef knot. PMID:11320926
Elena C. Papanastasiou; Mark D. Reckase
Because of the increased popularity of computerized adaptive testing (CAT), many admissions tests, as well as certification and licensure examinations, have been transformed from their paper-and-pencil versions to computerized adaptive versions. A major difference between paper-and-pencil tests and CAT from an examinee's point of view is that in many cases examinees are not allowed to revise their answers on CAT.
Thompson, Andrew; Wennike, Nic
The Royal College of Physicians' Acute care toolkit 10 has recommended the use of the AMB score as an aid to determining patients suitable for ambulatory care. As this score has only been previously validated in one centre, the present study calculated the score of 200 patients referred to the medical take to see if it successfully identified patients who had a length of stay of less than 12 hours. In our test centre, the score was found to have a reduced sensitivity compared with the original centre (88 vs 96%) and a positive predictive value of 39%. Therefore in our hospital this was not a useful scoring system, and other trusts need to be aware that the AMB score may not be as effective as the original study suggested. PMID:26031968
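The two statistics reported in the preceding abstract, sensitivity and positive predictive value, come from different margins of the same 2x2 table, which is why a score can have high sensitivity and still a low predictive value. The sketch below illustrates the arithmetic only. The cell counts are hypothetical and are not taken from the study; they were chosen so that 200 patients give ratios close to the reported 88% and 39%.

```python
def sensitivity(true_pos, false_neg):
    """Share of patients with the outcome whom the score flagged."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Share of flagged patients who actually had the outcome."""
    return true_pos / (true_pos + false_pos)


# HYPOTHETICAL counts for 200 patients (not data from the study):
# "positive" = score flags the patient as suitable for ambulatory care,
# "outcome"  = length of stay under 12 hours.
TRUE_POS = 22    # flagged, short stay
FALSE_NEG = 3    # not flagged, short stay
FALSE_POS = 34   # flagged, longer stay
TRUE_NEG = 141   # not flagged, longer stay

sens = sensitivity(TRUE_POS, FALSE_NEG)
ppv = positive_predictive_value(TRUE_POS, FALSE_POS)

print(f"sensitivity = {sens:.2f}")
print(f"PPV         = {ppv:.2f}")
```

With these illustrative counts most short-stay patients are flagged, yet most flagged patients still stay longer, because short stays are uncommon in the sample.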
Malone, Margaret E.
This article presents a review of the Canadian Academic English Language (CAEL) Assessment, a high stakes standardized test of the English language. It is a topic-based test that integrates listening, reading, writing and speaking. The test is designed to describe the level of English language proficiency of test takers planning to study at…
Paul, Clyde A.; Rosenkoetter, John
Questions whether higher achieving students complete an achievement test faster than lower achievers, reviews relevant research, and poses various hypotheses regarding test completion and achievement. Findings indicated that perhaps there is a tendency for better students to finish early on tests but that the correlation between test score and…
With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
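The recursion attributed to Lord and Wingersky in the preceding abstract can be sketched for the simplest case of dichotomous (right/wrong) items. This is a minimal illustration, not the generalized algorithm the article presents: the two-parameter logistic response function, the item parameters, and the function names are assumptions made here for the example.

```python
import math


def prob_correct_2pl(theta, a, b):
    """ASSUMED response model: two-parameter logistic item response function."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def number_correct_distribution(p_correct):
    """Distribution of the number-correct score given per-item success
    probabilities at a fixed proficiency, built one item at a time."""
    dist = [1.0]  # with zero items, score 0 has probability 1
    for p in p_correct:
        new = [0.0] * (len(dist) + 1)
        for score, prob in enumerate(dist):
            new[score] += prob * (1.0 - p)   # item answered incorrectly
            new[score + 1] += prob * p       # item answered correctly
        dist = new
    return dist


# HYPOTHETICAL item parameters (discrimination a, difficulty b).
ITEMS = [(1.0, -1.0), (1.2, 0.0), (0.8, 0.5), (1.5, 1.0)]
THETA = 0.0  # proficiency at which the distribution is conditioned

p_items = [prob_correct_2pl(THETA, a, b) for a, b in ITEMS]
dist = number_correct_distribution(p_items)

for score, prob in enumerate(dist):
    print(f"P(score = {score} | theta = {THETA}) = {prob:.4f}")
```

Each pass through the outer loop adds one item, so the cost grows with the square of the number of items rather than with the number of possible response patterns.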
Kim, Sooyeon; Moses, Tim
The major purpose of this study is to assess the conditions under which single scoring for constructed-response (CR) items is as effective as double scoring in the licensure testing context. We used both empirical datasets of five mixed-format licensure tests collected in actual operational settings and simulated datasets that allowed for the…
No one can dispute that tests should measure important content, and for many (but not all) purposes, tests should be aligned with curricular goals. Thus in many cases, alignment is clearly better than the alternative, and nothing that follows here argues otherwise. Unfortunately, however, this does not imply that alignment is sufficient protection…
Jancarík, Antonín; Kostelecká, Yvona
Electronic testing has become a regular part of online courses. Most learning management systems offer a wide range of tools that can be used in electronic tests. With respect to time demands, the most efficient tools are those that allow automatic assessment. The presented paper focuses on one of these tools: matching questions in which one…
High, Clennis F.
A study was conducted to identify factors affecting student performance on the Texas Academic Skills Program (TASP), a state-mandated measure designed to assess students' basic skills and competencies. TASP and Assessment of Student Skills for Entry Transfer (ASSET) scores were analyzed for 328 academic track students from 6 community colleges in…
This study investigated the effect of targeted test preparation, or coaching, on oral English as a second language test scores. The tests in question were the Basic English Skills Test Plus (BEST Plus), a scripted oral interview published by the Center for Applied Linguistics, and the Versant English Test (VET), a computer-administered and…
... minutes a night linked to lower performance in math, science ... about their homework habits, and their performance in math and science was assessed using a standardized test. ...
Creighton, Susan Dabney
There is no consensus regarding the most reliable and valid scoring methods for the assessment of higher order thinking skills. Most of the research on alternative formats has focused on the scoring of writing ability. This study examined the value of different types of performance assessment scoring guides on state mandated science and social studies tests. A proportional stratified sample of raters was randomly assigned to one of four scoring groups: checklist, analytic rubric, holistic rubric, and generic rubric. A fifth method, the weighted analytic rubric, was included by applying an algorithmic formula to the scores assigned by raters using the analytic rubric. A comparison of the mean scores for the five scoring groups suggests that there may be a difference in the way raters applied the rubric for each group. Although the literature suggests that it is possible to achieve high levels of inter-rater reliability across forms of scoring, phi coefficients of moderate strength were obtained for three of the four constructed-response items. Results for each scoring group were compared, indicating that item complexity may impact the level of inter-rater reliability and the selection of the most reliable rubric for each discipline. Analytic rubrics appear to achieve more reliable results with less complex items. A multitrait-multimethod approach was utilized to investigate the external validity of the social studies and science tasks. As expected, there tended to be a stronger association between the PACT science constructed-response scores and the science multiple-choice scores than between the science constructed-response scores and the writing ability subtest scores. A similar pattern was seen with social studies items. These results provide some evidence for the validity of the performance assessments.
A post study survey completed by raters provided qualitative information regarding their thought processes and their primary focus during the scoring process. An analysis of this data suggests that raters using alternative rubrics may have employed different strategies to score student responses.
Christian, Veronica Faye
The No Child Left Behind Act emphasized the responsibility of states to improve student academic performance. In one state, students are required to take subject-area tests and master each test to graduate; however, in some schools, many students are failing the English II test administered during students' sophomore year. Two districts have…
Looney, Marilyn A.; Gilbert, Jennie
The purpose of the study was to determine if currently used FITNESSGRAM[R] cut-off scores for the Back Saver Sit and Reach Test had the best criterion-referenced validity evidence for 6-12 year old children. Secondary analyses of an existing data set focused on the passive straight leg raise and Back Saver Sit and Reach Test flexibility scores of…
Hetzler, Ronald K.; Stickley, Christopher D.; Kimura, Iris F.
In this study, we developed allometric exponents for scaling Wingate anaerobic test (WAnT) power data that are reflective in controlling for body mass (BM) and lean body mass (LBM) and established a normative WAnT data set for college-age women. One hundred women completed a standard WAnT. Allometric exponents and percentile ranks for peak (PP)…
Watson, Charles G.; Klett, William G.
In a search for an adequate but efficient substitute, the authors have instituted three evaluations of the relationships between potential WAIS-substitutes and the WAIS itself. The present report describes the first of these researches-- a study of the relationships between the four group ability tests and the WAIS in a mental hospital setting.…
Bartosh, Oksana; Tudor, Margaret; Ferguson, Lynne; Taylor, Catherine
The present research investigated the impact of environmental education (EE) programs on student achievement in math, reading, and writing by comparing student performances on two standardized tests for environmental education schools and schools with traditional curriculum. Quantitative analysis was used to evaluate the impact of the EE programs.…
Armor, David J.; Duck, Stephanie
Recent studies have used increasingly complex methodologies to estimate the effect of peer characteristics--race, poverty, and ability--on student achievement. A paper by Hanushek, Kain, and Rivkin using Texas state testing data has received particularly wide attention because it found a large negative effect of school percent black on black math…
Poplin, Beth D.
This study examined whether students who graded and corrected their own test papers improved their learning and standardized test scores on the North Carolina end-of-course test in United States History. Four preexisting, intact classrooms of 11th grade United States History students in two different high schools formed the basis of this…
East, Pam C.
Many teachers look at standardized tests as something to be dreaded. This author and teacher looks at standardized-test scores and sees a tool to bring students' learning to new heights. This is a way for teachers to target instruction exactly where it's needed. A way to get students looking forward to end-of-the-year tests (really!) as a way to…
Sireci, Stephen G.; Han, Kyung T.; Wells, Craig S.
In the United States, when English language learners (ELLs) are tested, they are usually tested in English and their limited English proficiency is a potential cause of construct-irrelevant variance. When such irrelevancies affect test scores, inaccurate interpretations of ELLs' knowledge, skills, and abilities may occur. In this article, we…
Della-Piana, Gabriel Mario; Gardner, Michael
Background: Professional standards for validity of achievement tests have long reflected a consensus that validity is the degree to which evidence and theory support interpretations of test scores entailed by the intended uses of tests. Yet there are convincing lines of evidence that the standards are not adequately followed in practice, that…
Westbrook, Bert W.; And Others
Attempted to replicate study determining relationship between appropriateness of career choices and career maturity test scores in rural ninth grade students (N=112) using Goal Selection scale of Career Maturity Inventory Competence Test and American College Testing Program Career Planning Program. Found two career maturity measures correlated…
MacCann, Robert G.
It is shown that the Angoff and bookmarking cut scores are examples of true score equating that in the real world must be applied to observed scores. In the context of defining minimal competency, the percentage "failed" by such methods is a function of the length of the measuring instrument. It is argued that this length is largely arbitrary,…
Eldred-Skemp, Nicolia; Quinn, James W.; Chang, Hsin-wen; Rauh, Virginia A.; Rundle, Andrew; Orjuela, Manuela A.; Perera, Frederica P.
Childhood cognitive and test-taking abilities have long-term implications for educational achievement and health, and may be influenced by household environmental exposures and neighborhood contexts. This study evaluates whether age 5 scores on the Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R, administered in English) are associated with polycyclic aromatic hydrocarbon (PAH) exposure and neighborhood context variables including poverty, low educational attainment, low English language proficiency, and inadequate plumbing. The Columbia Center for Children’s Environmental Health enrolled African-American and Dominican-American New York City women during pregnancy, and conducted follow-up for subsequent childhood health outcomes including cognitive test scores. Individual outcomes were linked to data characterizing 1-km network buffers around prenatal addresses, home observations, interviews, and prenatal PAH exposure data from personal air monitors. Prenatal PAH exposure above the median predicted 3.5 point lower total WPPSI-R scores and 3.9 point lower verbal scores; the association was similar in magnitude across models with adjustments for neighborhood characteristics. Neighborhood-level low English proficiency was independently associated with 2.3 point lower mean total WPPSI-R score, 1.2 point lower verbal score, and 2.7 point lower performance score per standard deviation. Low neighborhood-level educational attainment was also associated with 2.0 point lower performance scores. In models examining effect modification, neighborhood associations were similar or diminished among the high PAH exposure group, as compared with the low PAH exposure group. Early life exposure to personal PAH exposure or selected neighborhood-level social contexts may predict lower cognitive test scores. However, these results may reflect limited geographic exposure variation and limited generalizability. PMID:24994947
Bornheimer, Deane G.
Performance of limited-English speaking graduate school applicants on the Prueba de Admision para Estudios Graduados aptitude test is compared with Graduate Record Examination results, and the validity of the two tests as predictors of academic success for bilingual doctoral students in the New York University Puerto Rico program is examined. (MSE)
Recent calls for an increase in educational accountability in K-16 resulted in an uptick of low-stakes testing and, consequently, an increased need for ensuring that students' test scores are reliable and valid representations of their true ability. Focusing on accountability testing in higher education, the current program of research was…
Lalande, John F.; Schweckendiek, Jurgen
Investigates what correlations might exist between an individual's score on the Zertifikat Deutsch als Fremdsprache and on the Oral Proficiency Interview. The tests themselves are briefly described. Results indicate that the two tests appear to correlate well in their evaluation of speaking skills. (SED)
Loewen, David Allen
This exploratory correlational study seeks to answer the question of whether a relationship exists between student average test score gains on state exams and teachers' rating of values on the Schwartz Values Survey. Eighty-seven randomly selected Kansas teachers of math and/or reading, grades four through eight, participated. Student test…
Needham, Martha Elaine
This research compares differences between standardized test scores in problem-based learning (PBL) classrooms and a traditional classroom for 6th grade students using a mixed-method, quasi-experimental and qualitative design. The research shows that problem-based learning is as effective as traditional teaching methods on standardized tests. The…
Liu, Yang; Thissen, David
Local dependence (LD) refers to the violation of the local independence assumption of most item response models. Statistics that indicate LD between a pair of items on a test or questionnaire that is being fitted with an item response model can play a useful diagnostic role in applications of item response theory. In this article, a new score test…
Jerome V. D’Agostino; Sonya J. Powers
A meta-analysis was conducted to examine the degree to which teachers’ test scores and their performance in preparation programs as measured by their collegiate grade point average (GPA) predicted their teaching competence. Results from 123 studies that yielded 715 effect sizes were analyzed, and the mediating effects of test and GPA type, criterion type, teaching level, service level, and decade
The purpose of this study was to investigate the impact of local item dependence (LID) in passage-based testlets on the test score reliability of an English as a Foreign Language (EFL) reading comprehension test from the perspective of generalizability (G) theory. Definitions and causes of LID in passage-based testlets are reviewed within the…
DEVELOPMENT OF A WEB-BASED BLIND TEST TO SCORE AND RANK HYPERSPECTRAL CLASSIFICATION ALGORITHMS K by supplying the user with additional spectral data as compared to high-resolution color imagery. The web their classification algorithms and upload their results back to the web application. The blind test site automatically
de Gobbi Porto, Fábio Henrique; Spíndola, Lívia; de Oliveira, Maira Okada; Figuerêdo do Vale, Patrícia Helena; Orsini, Marco; Nitrini, Ricardo; Dozzi Brucki, Sonia Maria
It is not easy to differentiate patients with mild cognitive impairment (MCI) from subjective memory complainers (SMC). Assessments with screening cognitive tools are essential, particularly in primary care where most patients are seen. The objective of this study was to evaluate the diagnostic accuracy of screening cognitive tests and to propose a score derived from screening tests. Elderly subjects with memory complaints were evaluated using the Mini Mental State Examination (MMSE) and the Brief Cognitive Battery (BCB). We added two delayed recalls in the MMSE (a delayed recall and a late-delayed recall, LDR), and also a phonemic fluency test of letter P fluency (LPF). A score was created based on these tests. The diagnoses were made on the basis of clinical consensus and neuropsychological testing. Receiver operating characteristic curve analyses were used to determine area under the curve (AUC), the sensitivity and specificity for each test separately and for the final proposed score. MMSE, LDR, LPF and delayed recall of BCB scores reached statistically significant differences between groups (P=0.000, 0.03, 0.001 and 0.01, respectively). Sensitivity, specificity and AUC were MMSE: 64%, 79% and 0.75 (cut off <29); LDR: 56%, 62% and 0.62 (cut off <3); LPF: 71%, 71% and 0.71 (cut off <14); delayed recall of BCB: 56%, 82% and 0.68 (cut off <9). The proposed score reached a sensitivity of 88% and 76% and specificity of 62% and 75% for cut off over 1 and over 2, respectively. The AUC was 0.81. In conclusion, a score created from screening tests is capable of discriminating MCI from SMC with moderate to good accuracy. PMID:24147213
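For readers less familiar with the diagnostic-accuracy terms in the abstract above, the following is a minimal sketch of how sensitivity and specificity are computed from classification counts at one cut-off. The function name and the counts are hypothetical illustrations and are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: cases correctly/incorrectly classified by the screening score
    tn/fp: non-cases correctly/incorrectly classified by the screening score
    """
    sensitivity = tp / (tp + fn)  # share of true cases flagged by the test
    specificity = tn / (tn + fp)  # share of non-cases correctly cleared
    return sensitivity, specificity


# Hypothetical counts for a single cut-off (not data from the abstract).
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=9, fp=3)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Repeating this calculation across every candidate cut-off and plotting sensitivity against 1 - specificity gives the receiver operating characteristic curve whose area is the AUC.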
Meadows, Sara; Herrick, David; Feiler, Anthony
The aim of the UK National Literacy Strategy is to raise standards in literacy. Strong evidence for its success has, however, been lacking: most of the available data comes from performance on tests administered in schools or from Office for Standards in Education reports and is vulnerable to suggestions of bias. An opportunistic analysis of data from a population cohort study extending over three school years compares school-based scores at school entry and at age 7-8 with independently administered scores on similar tests. The results show a small but statistically significant rise between 1998 and 1999 and between 1998 and 2000 in scores on both Key Stage 1 Reading Standard Assessment Tasks taken in schools and the reading component of the WORD test taken independently. This is clear evidence for a real rise in reading attainment over this period, which may be attributable to the children's experience of the National Literacy Strategy. PMID:18273398
Chamberlain, Gary E
I studied predictive effects of teachers and schools on test scores in fourth through eighth grade and outcomes later in life such as college attendance and earnings. For example, predict the fraction of a classroom attending college at age 20 given the test score for a different classroom in the same school with the same teacher and given the test score for a classroom in the same school with a different teacher. I would like to have predictive effects that condition on averages over many classrooms, with and without the same teacher. I set up a factor model that, under certain assumptions, makes this feasible. Administrative school district data in combination with tax data were used to calculate estimates and do inference. PMID:24101492
Homard, Catherine M
The purpose of this ex post facto correlational study was to compare exit examination scores and NCLEX-RN(®) pass rates of baccalaureate nursing students who differed in level of participation in a standardized test package. Three cohort groups emerged as a standardized test package was introduced: (a) students who did not participate in a standardized test package; (b) students with two semesters of a standardized test package; and (c) students with four semesters of a standardized test package. Benner's novice-to-expert theory framed the study in the belief that students best acquire knowledge and skills through practice and reflection. Students participating in four semesters of a standardized test package demonstrated higher exit examination scores and NCLEX-RN pass rates compared with students who did not participate in this package. This study's results could inform nurse educators about strategies to facilitate nursing student success on exit examinations and the NCLEX-RN. PMID:23413805
Ockey, Gary J.
The second language group oral is a test of second language speaking proficiency, in which a group of three or more English language learners discuss an assigned topic without interaction with interlocutors. Concerns expressed about the extent to which test takers' personal characteristics affect the scores of others in the group have limited its…
Hofer, Manfred; Kuhnle, Claudia; Kilian, Britta; Fries, Stefan
The predictive power of cognitive ability and self-control strength for self-reported grades and an achievement test was studied. It was expected that the variables use of time structure, academic procrastination, and motivational interference during learning further aid in predicting students' achievement because they are operative in situations…
Dimitrov, Dimiter M.; Raykov, Tenko; AL-Qataee, Abdullah Ali
This article is concerned with developing a measure of general academic ability (GAA) for high school graduates who apply to colleges, as well as with the identification of optimal weights of the GAA indicators in a linear combination that yields a composite score with maximal reliability and maximal predictive validity, employing the framework of…
Educational Testing Service, Princeton, NJ.
The Nairn report, The Reign of ETS, asserts that Educational Testing Service (ETS) has attempted to suppress information on the relationship of test scores to students' family income, that the relationship of Scholastic Aptitude Test (SAT) scores to income is inordinately high, and that the tests preserve the social status quo by denying…
Lee, Seon-Young; Olszewski-Kubilius, Paula
This study examined differences between students who qualified for talent search testing via scores on standardized tests and via parent nomination in their performances on the SAT or ACT and some demographic characteristics. Overall, the standardized testing group earned higher scores on the off-level tests than the parent nominated group. Asian…
Kay, Rachel E.
Over the past few decades, and especially in the past ten years, computer use in schools has increased dramatically; however there has been little research examining the effects of technology use on student achievement, specifically defined by standardized test scores. There is also concern as to how technology use differs by gender and if that…
In 2004, the National Endowment for the Arts (NEA) concluded that "literature reading is fading as a meaningful activity, especially among younger people." How can educators continue to teach students about the power of literary response when the priority is for them to achieve proficiency on standardized tests, whose scores can only be narrowly…
Herriott, Tavita S.
The purpose of this study was to determine if there was a difference in students' standardized test scores based on the instructional model their teachers used. One group of students was served under a pullout instructional model. The other was served under an inclusive model. It is not known whether or not the pullout instructional model or the…
Almond, Russell G.
Assessments consisting of only a few extended constructed response items (essays) are not typically equated using anchor test designs as there are typically too few essay prompts in each form to allow for meaningful equating. This article explores the idea that output from an automated scoring program designed to measure writing fluency (a common…
This paper considers the problem of validating placement procedures or, more precisely, of determining their educational appropriateness. At issue is determining whether a test score serves the particular educational function it was designed to serve (for example, course placement), and whether it does so in an economical way. These determinations…
Ricketts, Christine R.
This study examined the extent to which end-of-course grades are predictive of Virginia Standards of Learning test scores in nine high school content areas. It also analyzed the impact of the variables school cluster attended, gender, ethnicity, disability status, Limited English Proficiency status, and socioeconomic status on the relationship…
Among the 50 states, Florida's gains on the National Assessment of Educational Progress (NAEP) between 1992 and 2011 ranked second only to Maryland's. Florida's progress has been particularly impressive in the early grades. In 1998, Florida scored about one grade level below the national average on the 4th-grade NAEP reading test, but it was…
McEnroe, James D.
The study examined the effects of the federally funded Comprehensive School Reform (CSR) program on student performance on mandated standardized tests. The study focused on the mathematics and reading scores of Illinois public elementary and middle and junior high school students. The federal CSR program provided Illinois schools with an annual…
Rothstein, Jesse; Wozny, Nathan
Analysts often examine the black-white test score gap conditional on family income. Typically only a current income measure is available. We argue that the gap conditional on permanent income is of greater interest, and we describe a method for identifying this gap using an auxiliary data set to estimate the relationship between current and…
David J. Hebert; Alan F. Holmes
A literature review was carried out to determine existing knowledge regarding the relationship of Graduate Record Examinations Aptitude Test (GRE) scores and graduate grade point average (GGPA). Building upon this information, a study was undertaken in which data were gathered for 67 M.Ed. candidates admitted into and graduating from the University of New Hampshire Department of Education during a given
Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance." His…
Diamond, Sandra M.
The Problem: The purpose of this study was to investigate whether or not there were any statistically significant differences in the Mathematics California Standard Test scores and attendance rates for African American and Latina high school girls who participated in an afterschool program. Method: A quasi-experimental design was conducted with…
Stiefel, Leanna; Schwartz, Amy Ellen; Ellen, Ingrid Gould
We examine the size and distribution of the gap in test scores across races within New York City public schools and the factors that explain these gaps. While gaps are partially explained by differences in student characteristics, such as poverty, differences in schools attended are also important. At the same time, substantial within-school gaps…
Thomas, P. Ann
The focus of the investigation is on a sixth grade population not performing reading on grade level and not achieving high-stakes test score proficiency causing the school to fail adequate yearly progress (AYP). The lack of reading skills causes the students to repeat grades in middle school and high school. Reading technology instruction is the…
van Ginkel, Joost R.; van der Ark, L. Andries; Sijtsma, Klaas
The performance of five simple multiple imputation methods for dealing with missing data was compared. In addition, random imputation and multivariate normal imputation were used as lower and upper benchmark, respectively. Test data were simulated and item scores were deleted such that they were either missing completely at random, missing at…
Land, Warren A.; Land, Elizabeth R.
This study was directed toward determining the effect on National Teacher Education (NTE) Test Scores of changing a professional undergraduate educational program, and exploring whether there is a specific significant difference between the use of a traditional undergraduate professional educational program and a modified undergraduate…
Worrell, Frank C.; Watson, Stevie
In this study, the authors tested the viability of the expanded nigrescence (NT-E) model as operationalized by Cross Racial Identity Scale (CRIS) scores using confirmatory factor analyses. Participants were 594 Black college students from the Southeastern United States. Results indicated a good fit for NT-E's proposed six-factor structure.…
M. Laiacona; M. G. Inzaghi; A. De Tanti; E. Capitani
The Wisconsin card sorting test and the Weigl test are two neuropsychological tools widely used in clinical practice to assess frontal lobe functions. In this study we present norms useful for Italian subjects aged from 15 to 85 years, within 5–17 years of education. Concerning the Wisconsin card sorting test, a new measure of global efficiency (global score) is proposed
Richard A. Charter
The author provides statistical approaches to aid investigators in assuring that sufficiently high test score reliabilities are achieved for specific research purposes. The statistical approaches use tests of statistical significance between the obtained reliability and lowest population reliability that an investigator will tolerate. The statistical approaches work for coefficient alpha and related coefficients and for alternate-forms, split-half (2-part alpha), and
George-Ezzelle, Carol E.; Skaggs, Gary
Current testing standards call for test developers to provide evidence that testing procedures and test scores, and the inferences made based on the test scores, show evidence of validity and are comparable across subpopulations (American Educational Research Association [AERA], American Psychological Association [APA], & National Council on…
In recent years there has been growing theoretical interest in exploring the relationship between the interpretation and use of high-stakes proficiency test scores. In these discussions, the role of institutional test users (or test score consumers) has received only limited attention. This may be due, at least in part, to the lack of consensus in…
Manjunath, N K; Telles, Shirley
The performance scores of children (aged 11 to 16 years) in verbal and spatial memory tests were compared for two groups (n = 30, each), one attending a yoga camp and the other a fine arts camp. Both groups were assessed on the memory tasks initially and after ten days of their respective interventions. A control group (n = 30) was similarly studied to assess the test-retest effect. At the final assessment the yoga group showed a significant increase of 43% in spatial memory scores (Multivariate analysis, Tukey test), while the fine arts and control groups showed no change. The results suggest that yoga practice, including physical postures, yoga breathing, meditation and guided relaxation improved delayed recall of spatial information. PMID:15648409
Zimmerman, Donald W.
In order to circumvent the influence of correlation in paired-samples and repeated measures experimental designs, researchers typically perform a one-sample Student "t" test on difference scores. That procedure entails some loss of power, because it employs N - 1 degrees of freedom instead of the 2N - 2 degrees of freedom of the…
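The procedure the abstract above describes, a one-sample Student t test on difference scores, can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name and the sample scores are hypothetical, and the sketch shows only the statistic and its N - 1 degrees of freedom, not the power comparison the article goes on to make.

```python
import math
import statistics


def paired_t(before, after):
    """One-sample t statistic on difference scores, with its degrees of freedom."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)        # sample SD (n - 1 denominator)
    t = mean_d / (sd_d / math.sqrt(n))    # mean difference over its standard error
    return t, n - 1                       # N - 1 df, versus 2N - 2 for independent samples


# Hypothetical pre/post scores for N = 4 paired observations.
t_stat, df = paired_t(before=[10, 12, 9, 11], after=[12, 14, 10, 15])
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

With N = 4 pairs the test uses 3 degrees of freedom, where an independent-samples test on the same eight scores would use 6.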
Texas State Higher Education Coordinating Board, Austin.
This report provides a summary of test results from the Texas Academic Skills Program (TASP), including alternative tests taken for TASP purposes as authorized by state law, by student race/ethnicity for academic year 1999-2000. The results are provided for each Texas public university and community or technical college and for the entire state.…
There is significant potential for error in long production processes that consist of sequential stages, each of which is heavily dependent on the previous stage, such as the SER (Scoring, Equating, and Reporting) process. Quality control procedures are required in order to monitor this process and to reduce the number of mistakes to a minimum. In…
A study investigated the consistency of criteria for academic English skills as applied by teachers of academic English and science lecturers in a South African historically black university. Both groups were asked to evaluate first-year students' essays on the greenhouse effect. Results indicated a wide variation in scores and judgments within…
Powell, P. E.
Educators have recently come to consider inquiry-based instruction as a more effective method of instruction than didactic instruction. Experience-based learning theory suggests that student performance is linked to teaching method. However, research is limited on inquiry teaching and its effectiveness on preparing students to perform well on standardized tests. The purpose of the study was to investigate whether one of these two teaching methodologies was more effective in increasing student performance on standardized science tests. The quasi-experimental quantitative study comprised two stages. Stage 1 used a survey to identify teaching methods of a convenience sample of 57 teacher participants and determined level of inquiry used in instruction to place participants into instructional groups (the independent variable). Stage 2 used analysis of covariance (ANCOVA) to compare posttest scores on a standardized exam by teaching method. Additional analyses were conducted to examine the differences in science achievement by ethnicity, gender, and socioeconomic status by teaching methodology. Results demonstrated a statistically significant gain in test scores when taught using inquiry-based instruction. Subpopulation analyses indicated all groups showed improved mean standardized test scores except African American students. The findings benefit teachers and students by presenting data supporting a method of content delivery that increases teacher efficacy and produces students with a greater cognition of science content that meets the school's mission and goals.
Gitomer, Drew H.; Qi, Yi
This study concerns the "highly qualified teacher" provisions of the "Elementary and Secondary Education Act" ("ESEA," 2002), as reauthorized, and other policies at the federal, state and local levels, which have aimed to elevate the content knowledge of teachers. This examination of "Praxis II" score trends was not meant to serve as an evaluation…
Economics Understanding of Albanian High School Students: Factors Related to Achievement as Measured by Test Scores on the Test of Economic Literacy. By Dolore Bushati. ©2010. Submitted to the Department of Curriculum...
Lam, Teresa; Burns, Kharis; Dennis, Mark; Cheung, N Wah; Gunton, Jenny E
Cardiovascular disease (CVD) is the leading cause of morbidity and mortality among patients with diabetes mellitus, who have a risk of cardiovascular mortality two to four times that of people without diabetes. An individualised approach to cardiovascular risk estimation and management is needed. Over the past decades, many risk scores have been developed to predict CVD. However, few have been externally validated in a diabetic population and limited studies have examined the impact of applying a prediction model in clinical practice. Currently, guidelines are focused on testing for CVD in symptomatic patients. Atypical symptoms or silent ischemia are more common in the diabetic population, and with additional markers of vascular disease such as erectile dysfunction and autonomic neuropathy, these guidelines can be difficult to interpret. We propose an algorithm incorporating cardiovascular risk scores in combination with typical and atypical signs and symptoms to alert clinicians to consider further investigation with provocative testing. The modalities for investigation of CVD are discussed. PMID:25987961
Oden, Neal; VanVeldhuisen, Paul C.; Scott, Ingrid U.; Ip, Michael S.
We compare five closed tests for strong control of family-wide type I error (FWE) while making all pair-wise comparisons of means in clinical trials with multiple arms such as the SCORE Study. We simulated outcomes of the SCORE Study under its design hypotheses, and used p-values from chi-squared tests to compare performance of a “pairwise” closed test described below to Bonferroni and Hochberg adjusted p-values. “Pairwise” closed testing was more powerful than Hochberg’s method by several definitions of multiple-test power. Simulations over a wider parameter space, and considering other closed methods, confirmed this superiority for p-values based on normal, logistic, and Poisson distributions. The power benefit of “pair-wise” closed testing begins to disappear with 5 or more arms, and with unbalanced designs. For trials with 4 or fewer arms and balanced designs, investigators should consider using “pair-wise” closed testing in preference to Shaffer’s, Hommel’s, and Hochberg’s approaches when making all pairwise comparisons of means. If not all p-values from the closed family are available, Shaffer’s method is a good choice. PMID:21660119
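The two reference adjustments named in the abstract above, Bonferroni and Hochberg, can be sketched in a few lines. This is a minimal illustration with hypothetical p-values and function names; it does not implement the "pairwise" closed test the authors evaluate, which the abstract does not specify in enough detail to reproduce.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: multiply each by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]


def hochberg(pvals):
    """Hochberg step-up adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down; the k-th largest is multiplied by k.
    for rank in range(m - 1, -1, -1):
        idx = order[rank]
        running_min = min(running_min, (m - rank) * pvals[idx])
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for three pairwise comparisons.
p = [0.01, 0.04, 0.03]
print("Bonferroni:", bonferroni(p))
print("Hochberg:  ", hochberg(p))
```

Hochberg's adjusted p-values are never larger than Bonferroni's, which is why it serves as the stronger of the two baselines in comparisons of this kind.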
Stevens, Charlotte Bethany Rains
Nationwide, the goal of providing a productive science and math education to our youth in today's educational institutions is centering itself around the technology being utilized in these classrooms. In this age of digital technology, educational software and calculator-based laboratories (CBL) have become significant devices in the teaching of science and math for many states across the United States. The Texas Instruments graphing calculator and the Vernier LabPro interface are among the calculator-based laboratories becoming increasingly popular among middle and high school science and math teachers in many school districts across this country. In Tennessee, however, it is reported that this type of technology is not regularly utilized at the student level in most high school science classrooms, especially in the area of Physical Science (Vernier, 2006). This research explored the effect of calculator-based laboratory instruction on standardized test scores. The purpose of this study was to determine the effect of traditional teaching methods versus graphing calculator teaching methods on the state-mandated End-of-Course (EOC) Physical Science exam based on ability, gender, and ethnicity. The sample included 187 total tenth and eleventh grade physical science students, 101 of whom belonged to a control group and 87 of whom belonged to the experimental group. Physical Science End-of-Course scores obtained from the Tennessee Department of Education during the spring of 2005 and the spring of 2006 were used to examine the hypotheses. The findings of this research study suggested the type of teaching method, traditional or calculator-based, did not have an effect on standardized test scores. However, the students' ability level, as demonstrated on the End-of-Course test, had a significant effect on End-of-Course test scores.
This study focused on a limited population of high school physical science students in the middle Tennessee Putnam County area. The study should be reproduced in various school districts in the state of Tennessee to compare the findings.
G. Stennis Watson; Kenneth J Sufka; Terence J Coderre
The formalin test is a well-established model for assessing inflammatory nociceptive processes and analgesic drug effects. Previous research established the validity of an ordinal relationship among three well-defined pain behavior categories used to compute a composite pain score (CPS). However, optimal weights had not been validated. The present research used data from Coderre et al. (1993) and from Sufka and Roach
Reynolds, Matthew R.
The linear loadings of intelligence test composite scores on a general factor ("g") have been investigated recently in factor analytic studies. Spearman's law of diminishing returns (SLODR), however, implies that the "g" loadings of test scores likely decrease in magnitude as g increases, or they are nonlinear. The purpose of this study was to (a)…
Butler, Oliver T.; And Others
This study tested for cultural bias in the Bender Visual Motor Gestalt Test. Subjects were 72 black and white patients diagnosed as either brain damaged or psychiatric. Bender protocols were scored by Pascal-Suttell and Hain systems. No race effect appeared except for the Pascal-Suttell system for which blacks scored significantly better. (Author)
Kobrosly, Roni W; Seplaki, Christopher L; Jones, Courtney M; van Wijngaarden, Edwin
Objective To investigate the relationship between a measure of cumulative physiologic dysfunction and specific domains of cognitive function. Methods We examined a summary score measuring physiological dysfunction, a multisystem measure of the body’s ability to effectively adapt to physical and psychological demands, in relation to cognitive function deficits in a population of 4511 adults aged 20 to 59 who participated in the third National Health and Nutrition Examination Survey (1988–1994). Measures of cognitive function comprised three domains: working memory, visuomotor speed, and perceptual-motor speed. ‘Physiologic dysfunction’ scores summarizing measures of cardiovascular, immunologic, kidney, and liver function were explored. We used multiple linear regression models to estimate associations between cognitive function measures and physiological dysfunction scores, adjusting for socioeconomic factors, test conditions, and self-reported health factors. Results We noted a dose-response relationship between physiologic dysfunction and working memory (coefficient = 0.207, 95% CI = (0.066, 0.348), p < 0.0001) that persisted after adjustment for all covariates (p = 0.03). We did not observe any significant relationships between dysfunction scores and visuomotor (p = 0.37) or perceptual-motor ability (p = 0.33). Conclusions Our findings suggest that multisystem physiologic dysfunction is associated with working memory. Future longitudinal studies are needed to clarify the underlying mechanisms and explore the persistency of this association into later life. We suggest that such studies should incorporate physiologic data, neuroendocrine parameters, and a wide range of specific cognitive domains. PMID:22155941
Treena Eileen Rohde; Lee Anne Thompson
The purpose of the present study is to explain variation in academic achievement with general cognitive ability and specific cognitive abilities. Grade point average, Wide Range Achievement Test III scores, and SAT scores represented academic achievement. The specific cognitive abilities of interest were: working memory, processing speed, and spatial ability. General cognitive ability was measured with the Raven's Advanced Progressive
Sandra Chong; William B. Michael
For each of three samples of 213 8th-grade girls, 191 11th-grade boys, and 213 11th-grade girls attending schools in Korea, the twofold purpose of this investigation was to obtain evidence of the internal-consistency reliability and construct validity of scores on each of the five factor subscales of a Korean version of an academic self-concept instrument titled Dimensions of Self-Concept (DOSC),
Lee, Guemin; Park, In-Yong
Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…
Ünal-Karagüven, M. Hülya
Educators still cite academic motivation and test anxiety as causes of students' low performance. Knowing the factors that affect students' academic motivation and test anxiety levels can help improve their academic performance. The aim of this study was to investigate the effects of demographic variables and…
Deckersbach, T; Savage, C R; Henin, A; Mataix-Cols, D; Otto, M W; Wilhelm, S; Rauch, S L; Baer, L; Jenike, M A
The Rey-Osterrieth Complex Figure Test (RCFT) is a widely-used measure of visuospatial construction and nonverbal memory. One of the critical aspects of this test is that organizing the figure into meaningful perceptual units during copy enhances its subsequent free recall from memory. This study examined the psychometric properties of a new system for quantifying the organizational approach to the RCFT figure and compared it to another compatible scoring system. We investigated interrater reliability of both systems and explored the influences of copy organization and copy accuracy on immediate recall. Seventy-one participants meeting DSM-IV criteria for obsessive-compulsive disorder and 55 healthy control participants completed the copy and immediate free recall condition of the RCFT. Interrater reliability was evaluated by Kappa coefficients and Pearson correlations. The effects of copy organization and copy accuracy on immediate recall were evaluated using multiple regression analyses. Results indicated that the organizational approach could be assessed with high reliability using both scoring systems. Organization during copy was a strong predictor for subsequent free recall from memory using both approaches. Multiple regression analysis indicated that all organizational elements were not equally predictive of memory performance. This new system represents a very simple and reliable approach to scoring organization on the RCFT, since it requires the identification of only 5 figure components. These characteristics should contribute to its clinical utility. PMID:11094399
Edwards, Jerri D; Vance, David E; Wadley, Virginia G; Cissell, Gayla M; Roenker, Daniel L; Ball, Karlene K
The Useful Field of View test (UFOV(1)) is a measure of processing speed that predicts driving performance and other functional abilities in older adults. In comparison to a number of other visual and cognitive measures, the UFOV measure has consistently been found to be the strongest predictor of motor vehicle crashes of older adults. This measure has valuable applications in that computerized, performance-based measures that are predictive of crashes in the elderly population can provide an objective criterion for determining the need for driver restriction or rehabilitation. Administration of the UFOV test has evolved from the standard version (administered via touch-screen with the Visual Attention Analyzer) to two briefer versions, which are administered on a personal desktop computer (PC) using either a touch screen or mouse response option. These new versions of the test are briefer and require less specialized equipment, making the test more portable and practical for use in clinical settings. This study examined the reliability and validity of the scores from these two new versions. Results indicate that test-retest reliabilities of the scores from the UFOV PC versions are high (r's = 0.884 for mouse and 0.735 for touch), and performance on both PC versions correlates well with performance on the standard version (r's = 0.658 for mouse and 0.746 for touch). Furthermore, scores were highly correlated (r = 0.916) when participants used either a touch screen or a mouse to input responses. In conclusion, the reliability and validity coefficients are of sufficient magnitude to make the touch and mouse PC versions of the UFOV practical for use in clinical evaluations. PMID:16019630
Sullivan, Jeremy R.; Winter, Suzanne M.; Sass, Daniel A.; Svenkerud, Nicole
Many tests provide users with several different types of scores to facilitate interpretation and description of students' performance. Common examples include raw scores, age- and grade-equivalent scores, and standard scores. However, when used within the context of assessing growth among young children, these scores should not be…
Stewart, David W.; Hagemeier, Nicholas E.; Thigpen, Jim C.; Brooks, Lauren
Objective: To determine if the frequency of self-testing of course material prior to actual examination improves examination scores, regardless of the actual scores on the self-testing. Methods: Practice quizzes were randomly generated from a total of 1342 multiple-choice questions in pathophysiology and made available online for student self-testing. Intercorrelations, 2-way repeated measures ANOVA with post hoc tests, and 2-group comparisons following rank ordering, were conducted. Results: During each of 4 testing blocks, more than 85% of students took advantage of the self-testing process for a total of 7042 attempts. A consistent significant correlation (p≤0.05) existed between the number of practice quiz attempts and the subsequent examination scores. No difference in the number of quiz attempts was demonstrated compared to the first testing block. Exam scores for the first and second testing blocks were both higher than those for third and fourth blocks. Conclusion: Although self-testing strategies increase retrieval and retention, they are uncommon in pharmacy education. The results suggested that the number of self-testing attempts alone improved subsequent examination scores, regardless of the score for self-tests.
Walton, Gregory M; Spencer, Steven J
Past research has assumed that group differences in academic performance entirely reflect genuine differences in ability. In contrast, extending research on stereotype threat, we suggest that standard measures of academic performance are biased against non-Asian ethnic minorities and against women in quantitative fields. This bias results not from the content of performance measures, but from the context in which they are assessed: from psychological threats in common academic environments, which depress the performances of people targeted by negative intellectual stereotypes. Like the time of a track star running into a stiff headwind, such performances underestimate the true ability of stereotyped students. Two meta-analyses, combining data from 18,976 students in five countries, tested this latent-ability hypothesis. Both meta-analyses found that, under conditions that reduce psychological threat, stereotyped students performed better than nonstereotyped students at the same level of past performance. We discuss implications for the interpretation of and remedies for achievement gaps. PMID:19656335
Research conducted on the relationship between international students’ TOEFL scores and academic performance has produced contradictory results. To examine this relationship, a meta-analysis was conducted on extant studies centered on international...
Introduction: Premature withdrawal from university due to academic failure can present problems for students, families and educators. In an effort to widen the understanding regarding factors predicting academic success in higher institutions, prior academic achievement measures (preparatory school grade point average (GPA), aptitude test scores,…
He, Hua; McDermott, Michael P.
Sensitivity and specificity are common measures of the accuracy of a diagnostic test. The usual estimators of these quantities are unbiased if data on the diagnostic test result and the true disease status are obtained from all subjects in an appropriately selected sample. In some studies, verification of the true disease status is performed only for a subset of subjects, possibly depending on the result of the diagnostic test and other characteristics of the subjects. Estimators of sensitivity and specificity based on this subset of subjects are typically biased; this is known as verification bias. Methods have been proposed to correct verification bias under the assumption that the missing data on disease status are missing at random (MAR), that is, the probability of missingness depends on the true (missing) disease status only through the test result and observed covariate information. When some of the covariates are continuous, or the number of covariates is relatively large, the existing methods require parametric models for the probability of disease or the probability of verification (given the test result and covariates), and hence are subject to model misspecification. We propose a new method for correcting verification bias based on the propensity score, defined as the predicted probability of verification given the test result and observed covariates. This is estimated separately for those with positive and negative test results. The new method classifies the verified sample into several subsamples that have homogeneous propensity scores and allows correction for verification bias. Simulation studies demonstrate that the new estimators are more robust to model misspecification than existing methods, but still perform well when the models for the probability of disease and probability of verification are correctly specified. PMID:21856650
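The verification-bias problem described above can be illustrated with a deliberately simplified sketch. It is not the authors' propensity-score estimator: it assumes the verification probability is constant within each (test result, stratum) cell, estimates it as a simple proportion, and reweights verified subjects by its inverse. The function name `ipw_sensitivity_specificity` and the record layout are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def ipw_sensitivity_specificity(records):
    """Inverse-probability-weighted sensitivity and specificity.

    records: list of (test_positive, stratum, verified, diseased) tuples.
    `diseased` is ignored for unverified subjects (it may be None).
    Assumption (not from the source): verification is missing at random
    within each (test result, stratum) cell.
    """
    # Count total and verified subjects in each (test result, stratum) cell.
    cell_counts = defaultdict(lambda: [0, 0])  # key -> [n_total, n_verified]
    for test_positive, stratum, verified, _ in records:
        cell = cell_counts[(test_positive, stratum)]
        cell[0] += 1
        cell[1] += int(verified)

    # Each verified subject stands in for n_total / n_verified subjects.
    weights = {"tp": 0.0, "fn": 0.0, "tn": 0.0, "fp": 0.0}
    for test_positive, stratum, verified, diseased in records:
        if not verified:
            continue
        n_total, n_verified = cell_counts[(test_positive, stratum)]
        weight = n_total / n_verified
        if diseased:
            weights["tp" if test_positive else "fn"] += weight
        else:
            weights["fp" if test_positive else "tn"] += weight

    sensitivity = weights["tp"] / (weights["tp"] + weights["fn"])
    specificity = weights["tn"] / (weights["tn"] + weights["fp"])
    return sensitivity, specificity
```

When every test-positive subject is verified but only some test-negative subjects are, the unweighted estimates overstate sensitivity and understate specificity; the weights restore the missing test-negative subjects in proportion.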
Prato, Ermelinda; Biandolino, Francesca; Libralato, Giovanni
This study developed a tool for evaluating the potential contamination of marine sediments by detecting the presence or absence of toxicity, in support of environmental decision-making. When a sample is toxic, it is important to classify its level of toxicity in order to understand its subsequent effects and the appropriate management practices. Corophium insidiosum is a widespread and frequently recorded species along the Mediterranean Sea, North Sea and western Baltic Sea, with records also in the Atlantic and Pacific Oceans. This amphipod is found in high abundance in shallow brackish inshore areas and estuaries, including those with high turbidity. In Italy, C. insidiosum can be collected more often than Corophium orientale, which makes routine toxicity tests easier to perform. Moreover, according to the international scientific literature, C. insidiosum is more sensitive than C. orientale. Whole-sediment toxicity data (10 days) with C. insidiosum were organised into a species-specific toxicity score on the basis of the minimum significant difference (MSD) approach. Thresholds to rank samples as non-toxic or toxic were based on sediment samples (n=84) from the Gulf of Taranto (Italy). A five-class toxicity score (absent, low, medium, high and very high toxicity) was developed, considering the distribution of the 90th percentile of the MSD normalised to the effects on the negative controls (samples from reference sites). This toxicity score could be useful for interpreting the potential impacts of sediments and for providing quick, responsive management information. PMID:25773894
Makransky, Guido; Mortensen, Erik Lykke; Glas, Cees A. W.
Narrowly defined personality facet scores are commonly reported and used for making decisions in clinical and organizational settings. Although these facets are typically related, scoring is usually carried out for a single facet at a time. This method can be ineffective and time consuming when personality tests contain many highly correlated…
Stevenson, Rosnisha D.; Kritsonis, William Allan
This article will seek to utilize Dr. William Allan Kritsonis' book "Ways of Knowing Through the Realms of Meaning" (2007) as a framework to improve a campus's standardized test scores, more specifically, their TAKS (Texas Assessment of Knowledge and Skills) scores. Many campuses have an improvement plan, also known as a Campus Improvement Plan,…
Oshima, T. C.; And Others
A procedure to detect differential item functioning (DIF) is introduced that is suitable for tests with a cutoff score. DIF is assessed on a limited closed interval of thetas in which a cutoff score falls. How this approach affects the identification of DIF items is demonstrated with real data sets. (SLD)
Math 199 Contract of Understanding You are enrolled in Math 199 because your Calculus Placement Test score was below the cutoff for Math 231. Statistical data from previous semesters shows that students with placement scores of 22 and below (out of 45) have a much higher W/D/F rate in Math 231
Chowa, Gina A. N.; Masa, Rainier D.; Wretman, Christopher J.; Ansong, David
Household assets as part of youth's family background have been found to have a significant impact on youth's academic achievement. In this study, the impact of household possessions on youth's academic achievement in the Ghana YouthSave experiment is investigated. Findings support the hypothesized positive direction of the impact of household…
Admission Test Preparation Admission test scores help professional and graduate programs determine-prepared for these tests. Some are tests of aptitude in quantitative skills, verbal and analytical reasoning and/or writing ability (e.g., GRE, LSAT, GMAT), while others are tests of content knowledge (e.g., GRE Subject Tests
Bhat, Venkatraman; Wahab, Atiqa Abdul; Garg, Kailash C; Janahi, Ibrahim; Singh, Rajvir
Background and Aims: Pulmonary changes in patients with cystic fibrosis (CF) with CFTR I1234V mutation have not been extensively documented. Impact of geographic influence on phenotypical expression is largely unknown. This descriptive clinical study presents the high-resolution computed tomography (HRCT) pulmonary findings and computed tomography (CT) scoring with respect to pulmonary function tests (PFT) in a small subset of CF group. Materials and Methods: We examined 29 patients between 2 and 31 years of age with CFTR I1234V mutation. HRCT and PFT were performed within 2 weeks of each other. Imaging abnormalities on HRCT were documented and analyzed by utilizing the scoring systems described by Bhalla et al., Brody et al., Helbich et al., and Santamaria et al. Efficacy of the scoring systems with respect to PFT was compared. Statistical Analysis: Inter-observer reliability of the scoring systems was tested using intraclass correlation (ICC) between the two observers. Spearman correlation coefficients were calculated between the scoring systems and between the scoring systems and PFT results. Results: In our study, the right upper and middle lobes were the most frequently involved sites. Bronchiectasis and peribronchial thickening were the most frequent imaging findings. Scores with all four scoring systems were reproducible, with a good ICC of 0.69. There was good agreement between senior radiologists in all scoring systems. Conclusion: We noted pulmonary imaging abnormalities in a large majority (96%) of our CF patients. There was no significant difference in the CT scores observed from various systems. The CT evaluation system by Brody is detailed and time consuming, and is ideal for research and academic setup. On the other hand, the systems by Bhalla and Santamaria are easy to use, quick, and equally informative. We found the scoring system by Santamaria preferable over that of Bhalla by virtue of additional points of evaluation and ease of use, and therefore better suited for busy clinical practice. PMID:25709165
Wilcox, Rand R.
This paper describes and compares procedures for estimating the reliability of proficiency tests that are scored with latent structure models. Results suggest that the predictive estimate is the most accurate of the procedures. (Author/BW)
Marangi, Giuseppe; Ricciardi, Stefania; Orteschi, Daniela; Tenconi, Romano; Monica, Matteo Della; Scarano, Gioacchino; Battaglia, Domenica; Lettori, Donatella; Vasco, Gessica; Zollino, Marcella
Pitt-Hopkins syndrome (PTHS) is an emerging condition characterized by severe intellectual disability (ID), typical facial gestalt, and additional features, such as breathing abnormalities. Because of the overlapping phenotype of severe ID with absent speech, epilepsy, microcephaly, large mouth, and constipation, differential diagnosis of PTHS with respect to Angelman, Rett, and Mowat-Wilson syndromes represents a relevant clinical issue, and many patients are currently undergoing genetic tests for different conditions that are assumed to fall within the PTHS clinical spectrum. During a search for TCF4 mutations in 78 patients with a suspected PTHS, haploinsufficiency of TCF4 was identified in 18. By evaluating clinical features of patients with a proven TCF4 mutation with those of patients without, we noticed that, in addition to the typical facial gestalt, the PTHS phenotype results from the various combination of the following characteristics: ID with severe speech impairment, normal growth parameters at birth, postnatal microcephaly, breathing abnormalities, motor incoordination, ocular anomalies, constipation, seizures, typical behavior, and subtle brain abnormalities. On the basis of these observations, here we propose a clinically based score system as useful tool for driving a first choice molecular test for PTHS. This scoring system is also proposed for a clinically based diagnosis of PTHS in absence of a proven TCF4 mutation. PMID:22678594
Walker, J.D. [Environmental Protection Agency, Washington, DC (United States)
This paper describes the TSCA Interagency Testing Committee's (ITC) approaches to screening and scoring chemicals and chemical groups between 1977 and 1983. During this time the ITC conducted five scoring exercises to select chemicals and chemical groups for detailed review and to determine which of these chemicals and chemical groups should be added to the TSCA Section 4(e) Priority Testing List. 29 refs., 1 fig., 2 tabs.
Should We Stop Looking for a Better Scoring Algorithm for Handling Implicit Association Test Data? Test of the Role of Errors, Extreme Latencies Treatment, Scoring Formula, and Practice Trials on Reliability and Validity
Perugini, Marco; Schönbrodt, Felix
Since the development of D scores for the Implicit Association Test, few studies have examined whether there is a better scoring method. In this contribution, we tested the effect of four relevant parameters for IAT data: the treatment of extreme latencies, the error treatment, the method for computing the IAT difference, and the distinction between practice and test critical trials. For some options of these parameters, we included robust statistical methods that can provide viable alternative metrics to existing scoring algorithms, especially given the specificity of reaction time data. We thus elaborated 420 algorithms that result from the combination of all the different options and tested the main effect of the four parameters with robust statistical analyses, as well as their interaction with the type of IAT (i.e., with or without built-in penalty included in the IAT procedure). From the results, we can elaborate some recommendations. A treatment of extreme latencies is preferable, but only if it consists of replacing rather than eliminating them. Errors contain important information and should not be discarded. The D score still seems to be a good way to compute the difference, although the G score could be a good alternative, and finally it seems better not to compute the IAT difference separately for practice and test critical trials. Based on these recommendations, we propose to improve the traditional D scores with small yet effective modifications. PMID:26107176
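As a rough illustration of the kind of quantity being compared, the core of a D-type score can be sketched as the difference between mean latencies in the two critical blocks divided by the standard deviation of all latencies pooled across both blocks. This is a minimal sketch only: it omits error penalties, the treatment of extreme latencies, and the practice/test distinction discussed in the abstract, and the function name `d_score` and its argument names are hypothetical.

```python
from statistics import mean, stdev


def d_score(compatible_ms, incompatible_ms):
    """Simplified D-type score from two lists of latencies in milliseconds.

    Assumption (not from the source): no latency trimming or error penalty
    is applied; the denominator is the sample SD of all trials pooled.
    """
    pooled_sd = stdev(list(compatible_ms) + list(incompatible_ms))
    return (mean(incompatible_ms) - mean(compatible_ms)) / pooled_sd
```

Dividing by the pooled standard deviation puts the latency difference on an individual-specific scale, so participants who respond slowly overall do not automatically obtain larger scores.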
Uno, Yota; Mizukami, Hitomi; Ando, Masahiko; Yukihiro, Ryoji; Iwasaki, Yoko; Ozaki, Norio
Objective The present study evaluated the reliability and concurrent validity of the new Tanaka B Intelligence Scale, which is an intelligence test that can be administered to groups within a short period of time. Methods The new Tanaka B Intelligence Scale and Wechsler Intelligence Scale for Children-Third Edition were administered to 81 subjects (mean age ± SD 15.2±0.7 years) residing in a juvenile detention home; reliability was assessed using Cronbach’s alpha coefficient, and concurrent validity was assessed using the one-way analysis of variance intraclass correlation coefficient. Moreover, receiver operating characteristic analysis for screening for individuals who have a deficit in intellectual function (an FIQ<70) was performed. In addition, stratum-specific likelihood ratios for detection of intellectual disability were calculated. Results The Cronbach’s alpha for the new Tanaka B Intelligence Scale IQ (BIQ) was 0.86, and the intraclass correlation coefficient with FIQ was 0.83. Receiver operating characteristic analysis demonstrated an area under the curve of 0.89 (95% CI: 0.85–0.96). In addition, the stratum-specific likelihood ratio for the BIQ≤65 stratum was 13.8 (95% CI: 3.9–48.9), and the stratum-specific likelihood ratio for the BIQ≥76 stratum was 0.1 (95% CI: 0.03–0.4). Thus, intellectual disability could be ruled out or determined. Conclusion The present results demonstrated that the new Tanaka B Intelligence Scale score had high reliability and concurrent validity with the Wechsler Intelligence Scale for Children-Third Edition score. Moreover, the post-test probability for the BIQ could be calculated when screening for individuals who have a deficit in intellectual function. The new Tanaka B Intelligence Test is convenient and can be administered within a variety of settings. This enables evaluation of intellectual development even in settings where performing intelligence tests has previously been difficult. PMID:24940880
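The step from a likelihood ratio to a post-test probability follows the standard odds form of Bayes' rule: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back. The sketch below uses the two likelihood ratios reported in the abstract (13.8 and 0.1); the pre-test probability of 0.2 is a hypothetical value chosen for illustration and does not come from the study, and the function name is likewise an assumption.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Post-test probability via the odds form of Bayes' rule."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical pre-test probability; NOT reported in the abstract.
ASSUMED_PRE_TEST = 0.2

# Likelihood ratios for the lowest and highest strata, as given in the abstract.
low_stratum = post_test_probability(ASSUMED_PRE_TEST, 13.8)
high_stratum = post_test_probability(ASSUMED_PRE_TEST, 0.1)
print(f"lowest stratum: {low_stratum:.3f}, highest stratum: {high_stratum:.3f}")
```

Under that assumed pre-test probability, a score in the lowest stratum raises the probability substantially and a score in the highest stratum lowers it to a few percent, which is the sense in which the abstract says the condition can be "ruled out or determined".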
Strambler, Michael J.; Linke, Lance H.; Ward, Nadia L.
This study examines whether academic identification, or one's psychological and emotional investment in academics, mediates the association between child-reported parental educational socialization and standardized achievement test scores among a predominantly ethnic minority sample of 367 urban middle school students. We predicted that academic…
Alster, E H
The purpose of this study was to assess the effects of extended time on the algebra test performance of community college students with and without learning disabilities. Forty-four students with learning disabilities and 44 students without learning disabilities attending five California community colleges participated in the study. The students each took an algebra test under timed conditions and a comparable test under extended-time conditions. The main results were that the students with learning disabilities scored significantly lower than the students without learning disabilities under timed conditions, the scores of the students with learning disabilities increased significantly with extended time, and the scores of the students with learning disabilities under extended-time conditions did not differ significantly from the timed or extended-time scores of the students without learning disabilities. PMID:9066283
Zenisky, April L.; Hambleton, Ronald K.; Sireci, Stephen G.
How a testing agency approaches score reporting can have a significant impact on the perception of that assessment and the usefulness of the information among intended users and stakeholders. Too often, important decisions about reporting test data are left to the end of the test development cycle, but by considering the audience(s) and the kinds…
Recent jockeying for control of congressional seismic research funds has left the U.S. scientific research community uneasy about future cooperation with the federal government in the development of comprehensive nuclear test ban monitoring systems. Even though the language in a Defense authorization bill for fiscal year 1995, which cleared the Senate June 30, will likely be toned down in the House and Senate conference, the “aggressive” maneuvering that ensued to tentatively dispose the bulk of power over the interagency seismic network to the Defense Department, critics say, raises new questions about how science policy decisions are made in the United States and how committed the Congress and some federal agencies are to “reinventing” government under the Clinton-Gore plan. And for now, a hefty chunk of funding for academic seismic research is no longer a sure thing.
Tanilon, Jenny; Segers, Mien; Vedder, Paul; Tillema, Harm
This study illustrates the development and validation of an admission test, labeled as Performance Samples on Academic Tasks in Educational Sciences (PSAT-Ed), designed to assess samples of performance on academic tasks characteristic of those that would eventually be encountered by examinees in an Educational Sciences program. The test was based…
Prince, Joan Marie
Over the past years, progress in Black academic achievement, particularly in the area of science, has generally slowed or ceased. According to the 1994 NAEP assessment, twelfth-grade Black students are performing at the level of White eighth-grade students in the discipline of science (Department of Education, 1996). These students, in their last year of required schooling, are about to graduate, yet they lag at least four years behind their white counterparts in science achievement. Despite the establishment and implementation of numerous science intervention programs, Black students still suffer from a disparate gap in standardized test score achievement. The purpose of this research is to investigate teachers' perceptions of the effectiveness of an urban sciences intervention tool that was designed to assist in narrowing the Black-White science academic achievement gap. Specifically, what factors affect teachers' personal sense of instructional efficacy, and how does this translate into their outcome expectancy for student academic success? A multiple-case, replicative design, grounded in descriptive theory, was selected for the study. Multiple sources of evidence were queried to provide robust findings. These sources included a validated health sciences self-efficacy instrument, an interview protocol, a classroom observation, and a review of archival material that included case study participants' personnel files and meeting minutes. A cross-comparative analytic approach was selected for interpretation (Yin, 1994). Findings indicate that teachers attribute the success or failure of educational intervention tools in closing the Black-White test score gap to a variety of internal and external factors. These factors included a perceived lack of both monetary and personal support by the school leadership, as well as a perceived lack of parental involvement which impacted negatively on student achievement patterns. 
The case study participants displayed a depressed outcome expectancy effect for successful student achievement, which they directly attributed to the barriers stated above. If educational reforms are to be successful, the issues of teachers' perceptions of factors that inhibit their personal ability to instruct, and how that translates to student academic achievement must be addressed.
Rayder, Nicolas; And Others
Four Wechsler subscales were administered in a longitudinal design to children from the Responsive Model Follow Through Program. On the first testing, subjects' average intelligence scores were significantly lower, but on subsequent tests equivalent to or higher than national norms, calling into question Deutsch's cumulative-deficit hypothesis.…
Talento-Miller, Eileen; Rudner, Lawrence M.
The validity of Graduate Management Admission Test (GMAT) scores is examined by summarizing 273 studies conducted between 1997 and 2004. Each of the studies was conducted through the Validity Study Service of the test sponsor and contained identical variables and statistical methods. Validity coefficients from each of the studies were corrected…
Robert Saltstone; Colin Skinner; Paul Tremblay
This study is a preliminary examination of the fit of three classical test theory models of standard error of measurement to selected personality scale (MMPI) score retest data. The three models compared are the conventional standard error of measurement formula, Lord’s (1955: Lord, F. M. (1955). Estimating test reliability. Educational and Psychological Measurement, 15, 325–336) conditional standard error of measurement
Bokossa, Maxime C.; Huang, Gary G.
This report describes the imputation procedures used to deal with missing data in the National Education Longitudinal Study of 1988 (NELS:88), the only current National Center for Education Statistics (NCES) dataset that contains scores from cognitive tests given the same set of students at multiple time points. As is inevitable, cognitive test…
Hodges, James Gregory
This study examined the impact that the teaching technique known as cooperative learning had on the changes between pre- and post-test scores on all sub-categories ("induction, deduction, analysis, evaluation, inference", and "total composite") associated with the "California Critical Thinking Skills Test" (CCTST) for…
Wilcox, Rand R.
When determining criterion-referenced test length, problems of guessing are shown to be more serious than expected. A new method of scoring is presented that corrects for guessing without assuming that guessing is random. Empirical investigations of the procedure are examined. Test length can be substantially reduced. (Author/CM)
Kachewar, Smita Sushil; Dongre, Suryakant Dattatraya
Introduction: Fine-needle aspiration cytology (FNAC) is a safe, reliable and time-saving outpatient procedure that causes little discomfort to the patient and is used for detecting carcinoma of the breast. Its efficacy can be further enhanced when physical breast examination, mammography and FNAC (the triple test [TT]) are jointly taken into consideration. Aims and Objectives: The aim was to evaluate the role of the TT score (TTS) in palpable breast masses. Materials and Methods: This prospective study was carried out from May 2010 to April 2012. In the subjects referred to the Department of Pathology for FNAC of the breast mass, the TTS was calculated, and histopathological findings were noted. Results: In the study period the TTS was calculated in 200 cases out of 225 FNACs of the breast. Of 124 benign cases on cytology, only three showed discordant TTS. Out of 62 malignant cases, 61 showed concordant TTS and one case of mastitis on histopathology showed a TTS of five. Out of all the benign lesions, two cases of fibrocystic disease and a single case of phyllodes tumor gave a TTS ≥6. These cases were diagnosed as infiltrating ductal carcinoma and angiosarcoma respectively on histopathology. Histopathological correlation was possible in only 70 patients. Of these 70, 28 were from the benign category and 42 were from the malignant category. A TTS of ≥6 has a sensitivity of 97.44% and a specificity of 100%. FNAC has a sensitivity of 88.37% and a specificity of 96.42%. Conclusions: TT reliably guides evaluation of palpable breast masses. Histological correlation indicated TTS to be a better diagnostic tool than FNAC alone.
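For readers unfamiliar with how such figures are derived, sensitivity and specificity are simple ratios over a 2x2 table of test result against the histopathological reference. The abstract does not report the underlying table, so the counts in the sketch below are purely hypothetical and are not the study's data; the function name is also an assumption.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts for illustration only; NOT taken from the study.
sens, spec = sensitivity_specificity(true_pos=45, false_neg=5,
                                     true_neg=90, false_pos=10)
print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
```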
Bielinski, John; Thurlow, Martha; Minnema, Jane; Scott, Jim
This report is a review and analysis of the psychometric literature on the topic of out-of-level testing. Out-of-level testing refers to the practice of using a level of the test other than the test taken by most of the students in a student's current grade level. Much of the research on out-of-level testing was conducted in the 1970s and 1980s,…
Cavil, Jafus Kenyatta
The purpose of the present study was to estimate minimum admission requirements using cognitive measures that will maximize candidate success on the doctoral comprehensive examination. Moreover, the present study established minimum scores on the Graduate Record Examinations (verbal and quantitative components) that will maximize doctoral student…
Center on Education Policy, 2011
This paper profiles Idaho's test score trends through 2008-09. In 2007, the mean scale score on the state 4th grade reading test was 209 for non-Title I students and 205 for Title I students. In 2009, the mean scale score in 4th grade reading was 211 for non-Title I students and 208 for Title I students. Between 2007 and 2009, the mean scale score…
Center on Education Policy, 2011
This paper profiles Utah's test score trends through 2008-09. In 2004, the mean scale score on the state 4th grade reading test was 167 for non-Title I students and 164 for Title I students. In 2009 the mean scale score in 4th grade reading was 168 for non-Title I students and 164 for Title I students. Between 2004 and 2009, the mean scale score…
Center on Education Policy, 2011
This paper profiles Kansas' test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 80 for non-Title I students and 73 for Title I students. In 2009, the mean scale score in 4th grade reading was 84 for non-Title I students and 78 for Title I students. Between 2006 and 2009, the mean scale score…
Center on Education Policy, 2011
This paper profiles Maine's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 445 for non-Title I students and 438 for Title I students. In 2009, the mean scale score in 4th grade reading was 477 for non-Title I students and 441 for Title I students. Between 2006 and 2009, the mean scale score…
Ruiz-Blanco, Yasser B.; Marrero-Ponce, Yovani; García, Yamila; Puris, Amilkar; Bello, Rafael; Green, James; Sotomayor-Torres, Clivia M.
Most successful structure prediction strategies use knowledge-based functions for global optimization, in spite of their intrinsic limited potential to create new folds, while physics-based approaches are often employed only during structure refinement steps. We here propose a physics-based scoring potential intended to perform global searches of the conformational space. We introduce a dynamic test to evaluate the discrimination power of our function, and compare it with predictions of targets from the CASP-ROLL competition. Results demonstrate that this dynamic test is able to generate 3D models which outrank 59% (according to GDT_TS score) of models generated with ab initio structure prediction servers.
Marshall, Garland Ross
An Analysis of "Missouri County Agent Inventory" Test Scores and the Performance Ratings of Texas Agricultural Extension Agents. A thesis by Garland Ross Marshall, submitted to the Graduate College of Texas A&M University in partial fulfillment of the requirements for the degree of Master of Science, August 1964. Major subject: Sociology...
Mallett, Susan; Halligan, Steve; Collins, Gary S.; Altman, Doug G.
Background Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. Methods In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. Results Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. Conclusions The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests. PMID:25353643
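The two kinds of measure compared in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the tiny data set are hypothetical and are not taken from the study, and the AUC shown is the simple empirical (Mann-Whitney) estimate, not whichever ROC-fitting methods the authors used.

```python
def sensitivity_specificity(truth, calls):
    """Sensitivity and specificity from binary reference-standard labels
    (truth) and binary reader decisions (calls)."""
    tp = sum(1 for t, c in zip(truth, calls) if t and c)
    fn = sum(1 for t, c in zip(truth, calls) if t and not c)
    tn = sum(1 for t, c in zip(truth, calls) if not t and not c)
    fp = sum(1 for t, c in zip(truth, calls) if not t and c)
    return tp / (tp + fn), tn / (tn + fp)


def empirical_auc(truth, scores):
    """Empirical ROC AUC: the fraction of (positive, negative) pairs in which
    the positive case has the higher confidence score; ties count as half."""
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical reader: 3 cases with polyps, 4 without.
truth = [1, 1, 1, 0, 0, 0, 0]
calls = [1, 1, 0, 0, 0, 0, 1]        # binary report: polyp present or absent
scores = [90, 60, 0, 0, 0, 0, 70]    # confidence scores; 0 when nothing was seen

sens, spec = sensitivity_specificity(truth, calls)
auc = empirical_auc(truth, scores)
print(sens, spec, auc)
```

In this made-up example a single confident false positive (score 70) and the many zero scores both pull the AUC around, which is the type of sensitivity to confidence-score quirks that the abstract describes.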
Bohnker, Bruce K; Sack, David M; Wedierhold, Lynn; Malakooti, Mark
Physical performance and risk factors from the U.S. Navy physical readiness test (PRT) were analyzed in a retrospective, cross-sectional, population-based study using data from the Spring 2002 cycle. PRT scores were available for 22,314 active duty women and 131,287 men, and risk factor information was available for 4,254 women and 31,503 men. For risk factors, self-reported smoking rates were higher for men than women, and decreased with increasing age. Self-reported rates for elevated cholesterol and joint problems increased with increasing age. Linear regression showed body mass index increased with age for men (constant = 25.6, increasing 0.0765 per year of age over 18 years, p = 0.000) and were increasing at a lower rate for women (constant = 24.5 increasing 0.0159 per year of age over 18 years, p = 0.000). Increasing body mass index was associated with decreasing PRT performance. This analysis provides population-based information on the PRT risk factors, body mass index, and physical fitness for Navy personnel. PMID:16435757
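The regression lines reported in the abstract above can be written out as a small Python sketch. Two assumptions are made here and are not confirmed by the source: the slopes are read as 0.0765 (men) and 0.0159 (women) BMI units per year, and the reported constant is treated as the fitted BMI at age 18. The function name is hypothetical.

```python
# (constant, slope per year of age over 18), as read from the abstract.
# Assumption: the constant is the fitted BMI at age 18.
COEFFICIENTS = {
    "men": (25.6, 0.0765),
    "women": (24.5, 0.0159),
}


def predicted_bmi(age_years, sex):
    """Fitted mean body mass index for a given age and sex under the
    linear model BMI = constant + slope * (age - 18)."""
    constant, slope = COEFFICIENTS[sex]
    return constant + slope * (age_years - 18)


for age in (18, 28, 38):
    print(age, round(predicted_bmi(age, "men"), 2),
          round(predicted_bmi(age, "women"), 2))
```

Under these assumptions the fitted gap between men and women widens with age, because the slope for men is several times the slope for women.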
Modern academic libraries have a great number of information resources available online in the form of electronic catalogs, books, journals, and subject subscription databases. To determine whether users can easily retrieve the information they are seeking, academic librarians conduct usability testing of their libraries' Web sites. There has been…
Singley, Daniel B.; Lent, Robert W.; Sheu, Hung-Bin
The authors tested a social cognitive model of academic and overall life satisfaction in a sample of 769 university students. The predictors, drawn from Lent's unifying perspective on well-being and psychosocial adjustment, included social cognitive (academic self-efficacy, goal progress, social support) and personality (trait positive affect)…
Kozloff, Allison Burstein
Comprehensive academic achievement tests are routinely used by school psychologists in psycho-educational assessment batteries to identify learning disabled students. A variety of assessment measures are used across age groups to determine if a discrepancy exists between academic achievement and intellectual functioning; however, among the most…
George Washington Univ., Washington, DC. Inst. for Educational Leadership.
"Options in Education" is a radio news program which focuses on issues and developments in education. This transcript contains discussions of volunteer parent tutors in a junior high school, the feminization of the teaching profession, the test score controversy, busing as an issue in the political primaries, and busing and the role of the social…
Stacy, Brian; Lockwood, J. R.; McCaffrey, Daniel
Researchers and policymakers are interested in the causal effects of educational inputs on student achievement. Unfortunately, it is not possible to directly observe student learning, so test score data is often used as an approximate measure. To measure their achievement at a given point in time (e.g., in the spring of the school year) students…
The topic of arts integration creates continuing dialog among educators and arts advocates. This study examined the degree to which student achievement was affected when arts education is limited or eliminated from schools to meet the mandates of NCLB (2001) legislation. Standardized test scores from 12 schools in Central Mississippi were used to…
Biuk-Aghai, Robert P.
Abstract--University or college admission is a complex decision process that goes beyond simply matching test scores and admission requirements. … domestic vs. overseas student proportion, and others. Choosing the most suitable among the many thousands … Past research has suggested that students' backgrounds
Miroslava Korenova; Norbert Zilka; Zuzana Stozicka; Ondrej Bugos; Ivo Vanicky; Michal Novak
We have previously shown that transgenic rats expressing misfolded tau protein developed neurofibrillary tangles and axonal degeneration in the brain and spinal cord, which led to impairment of sensorimotor and neuromuscular functions. To quantify neurobehavioral phenotype of the transgenic rats we have designed a testing protocol and a novel scoring system – NeuroScale – that reliably reflects progression of functional
Jencks, Christopher; And Others
This volume contains eleven appendixes, varying from 5 to 165 pages, which describe the sample used in the analysis of ten surveys of American men aged 25-64 to determine the effects of family background, adolescent personality traits, cognitive test scores, and earnings in maturity. The appendixes are (1) 1970 Census 1/1000 Sample; (2) 1962…
Tannenbaum, Richard J.; Cho, Yeonsuk
In this article, we consolidate and present in one place what is known about quality indicators for setting standards so that stakeholders may be able to recognize the signs of standard-setting quality. We use the context of setting standards to associate English language test scores with language proficiency descriptions such as those presented…
Bolinger, Rex W.
Scholastic Aptitude Test (SAT) scores of Asian, Hispanic, Black, and White students with similar socioeconomic backgrounds and access to similar instruction in the same large midwestern school district were compared. Income levels were determined by using federal guidelines for free and reduced school lunches. The population of the study consisted…
Williams, Thomas O., Jr.; Fall, Anna-Maria; Eaves, Ronald C.; Woods-Groves, Suzanne
The reliability of scores for the "Draw-A-Person Intellectual Ability Test for Children, Adolescents, and Adults" is examined with a sample of 110 college students from two universities in the southeast. The alpha coefficient for the total sample and the interscorer and intrascorer reliability for a subset of 31 students are analyzed. The alpha…
Koretz, Daniel; Kim, Young-Suk
In a pair of recent studies, Fryer and Levitt (2004a, 2004b) analyzed the Early Childhood Longitudinal Study--Kindergarten Cohort (ECLS-K) to explore the characteristics of the Black-White test score gap in young children. They found that the gap grew markedly between kindergarten and the third grade and that they could predict the gap from…
Turner, Sherry L.
Thirteen percent of the 2008-2009 senior class in one southeastern state did not pass the science portion of the state's high school graduation test. Another 5% failed to pass the math portion of the graduation test, leaving these students unable to obtain a high school diploma. The purpose of this nonexperimental quantitative research study was…
Steedle, Jeffrey T.
Tests of college learning are often administered to obtain value-added scores indicating whether score gains are below, near, or above typical performance for students of given entering academic ability. This study compares the qualities of value-added scores generated by the original Collegiate Learning Assessment value-added approach and a new…
Haberman, Shelby J.
Alternative approaches are discussed for use of e-rater[R] to score the TOEFL iBT[R] Writing test. These approaches involve alternate criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…
Silberglitt, Benjamin; Hintze, John
This study outlines a formative assessment system using a consistent set of cut scores on Curriculum-Based Measurement-Reading (CBM-R) probes and investigates four statistical methods for establishing cut scores. Cut scores were established using the Minnesota statewide achievement test in reading at grade 3 as the criterion for a successful…
Fontanive, Paolo; Miccoli, Mario; Simioniuc, Anca; Angelillis, Marco; Di Bello, Vitantonio; Baggiani, Angelo; Bongiorni, Maria Grazia; Marzilli, Mario; Dini, Frank Lloyd
Although echo Doppler and biomarkers are the most common examinations performed worldwide in heart failure (HF), they are rarely considered in risk scores. In outpatients with chronic HF and left ventricular ejection fraction (LVEF) ≤45%, data on clinical status, echo Doppler variables, aminoterminal pro-type B natriuretic peptide (NT-proBNP), estimated glomerular filtration rate (eGFR), and drug therapies were combined to build up a multiparametric score. We randomly selected 250 patients to produce a derivation cohort and 388 patients were used as a testing cohort. Follow-up lasted 29 ± 23 months. The univariable predictors that entered into the multivariable Cox model were as follows: furosemide daily dose >25 mg, inability to tolerate angiotensin converting enzyme (ACE) inhibitors, inability to tolerate β-blockers, age >75 years, New York Heart Association (NYHA) >2, eGFR <60 mL/min, NT-proBNP plasma levels above the median, tricuspid plane systolic excursion (TAPSE) ≤14 mm, LV end-diastolic volume index (LVEDVi) >96 mL/m², moderate-to-severe mitral regurgitation (MR) and LVEF <30%. The scores of prognostic factors were obtained with the respective odds ratio divided by the lowest odds ratio: 4 points for furosemide dose, 3 points for age, NT-proBNP, LVEDVi, TAPSE, 2 points for inability to tolerate β-blockers, inability to tolerate ACE inhibitors, NYHA, eGFR <60 mL/min, moderate-to-severe MR, 1 point for LVEF. The multiparametric score predicted all-cause mortality either in the derivation cohort (68.4% sensitivity, 79.5% specificity, area under the curve [AUC] 78.7%) or in the testing cohort (73.7% sensitivity, 71.3% specificity, AUC 77.2%). All-cause mortality significantly increased with increasing score both in the derivation and in the testing cohort (P < 0.0001). In conclusion, this multiparametric score is able to predict mortality in chronic systolic HF. PMID:23742144
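The point assignments listed in the abstract above lend themselves to a short Python sketch of how such an additive score could be tallied. The dictionary keys and the function name are hypothetical labels introduced for illustration; the point values are those stated in the abstract, and the TAPSE cut-off is assumed to read "≤14 mm" because the comparison sign is garbled in the source. This is not a clinical tool.

```python
# Points per risk factor as listed in the abstract (key names are hypothetical).
POINTS = {
    "furosemide_gt_25mg_daily": 4,
    "age_gt_75_years": 3,
    "nt_probnp_above_median": 3,
    "lvedvi_gt_96_ml_per_m2": 3,
    "tapse_le_14_mm": 3,            # assumed direction of the garbled sign
    "beta_blocker_intolerance": 2,
    "ace_inhibitor_intolerance": 2,
    "nyha_class_gt_2": 2,
    "egfr_lt_60_ml_per_min": 2,
    "moderate_to_severe_mr": 2,
    "lvef_lt_30_percent": 1,
}


def multiparametric_score(present_factors):
    """Sum the points of the risk factors present in one patient.
    Duplicates are counted once; unknown factor names raise KeyError."""
    factors = set(present_factors)
    unknown = factors - set(POINTS)
    if unknown:
        raise KeyError(f"unknown risk factors: {sorted(unknown)}")
    return sum(POINTS[f] for f in factors)


example = ["age_gt_75_years", "nyha_class_gt_2", "lvef_lt_30_percent"]
print(multiparametric_score(example))   # 3 + 2 + 1 points
print(multiparametric_score(POINTS))    # all factors present
```

With every factor present the tally is 27 points, so under this reading the score ranges from 0 to 27.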
Wilde, Elizabeth Ty; Hollister, Robinson
This study tested the performance of nonexperimental estimators of impacts applied to a class size reduction intervention with achievement test scores as the outcome. Nonexperimental estimates of impacts were compared to "true impact" estimates provided by a random-assignment design that assessed intervention effects. Data came from Project STAR,…
Wilde, Elizabeth Ty; Hollister, Robinson
In this study we test the performance of some nonexperimental estimators of impacts applied to an educational intervention--reduction in class size--where achievement test scores were the outcome. We compare the nonexperimental estimates of the impacts to "true impact" estimates provided by a random-assignment design used to assess the…
Elisha Ofiram; Timothy A. Garvey; James D. Schwender; Francis Denis; Joseph H. Perra; Ensor E. Transfeldt; Robert B. Winter; Jill M. Wroblewski
Background The lack of a widely available scoring system for cervical degenerative spondylosis encouraged the authors to establish and validate a systematic quantitative radiographic index. Materials and methods This study included intraobserver and interobserver reliability testing among three reviewers with different years of experience. Each observer independently scored four cervical radiographs of 48 patients at separate intervals, and statistical analysis of the
Minov, Jordan; Karadzinska-Bislimovska, Jovanka; Vasilevska, Kristin; Stoleski, Saso; Mijakoski, Dragan
Introduction: COPD Assessment Test (CAT) is an 8-item questionnaire for assessment of health status in patients with chronic obstructive pulmonary disease (COPD). Objective: To evaluate the course of CAT scores during bacterial exacerbations of COPD treated in outpatient setting. Methods: We performed an observational, prospective study including 81 outpatients (57 males and 24 females, aged 43 to 74 years) with bacterial exacerbation of COPD. All participants completed CAT at initial visit (i.e. at the time of diagnosis of exacerbation and beginning of its treatment), 10 and 30 days after initial visit. Mean scores of each item, as well as the overall mean score, at these time points were compared. Results: The mean scores for each CAT question at initial visit varied from 2.6 to 3.5, whereas the mean scores for each CAT question 10 days after initial visit varied from 1.7 to 2.6. We registered significant reduction of the mean overall CAT score 10 days after initial visit as compared to its value at initial visit of 6.9 ± 2.7 points (16.8 vs 23.7; P < 0.001). The mean scores for each CAT question 30 days after initial visit varied from 1.3 to 2.4. We registered reduction of mean overall CAT score 30 days after initial visit as compared to its score 10 days after initial visit of 2.9 ± 1.2 points (13.9 vs 16.8; P < 0.005). The mean overall CAT score 30 days after initial visit was reduced by 9.8 ± 4.5 points as compared to its value at initial visit (13.9 vs 23.7; P < 0.001). Conclusion: We found significant improvement in the patient’s health status during recovery from exacerbation as compared to their health status at the time of exacerbation confirming the CAT as an effective tool to measure health status in patients with COPD. PMID:25893024
The purpose of this study was to examine the longitudinal impacts of the Science Writing Heuristic (SWH) approach on student science achievement measured by the Iowa Test of Basic Skills (ITBS). A number of studies have reported positive impact of an inquiry-based instruction on student achievement, critical thinking skills, reasoning skills, attitude toward science, etc. So far, studies have focused on exploring how an intervention affects student achievement using teacher/researcher-generated measurement. Only a few studies have attempted to explore the long-term impacts of an intervention on student science achievement measured by standardized tests. The students' science and reading ITBS data was collected from 2000 to 2011 from a school district which had adopted the SWH approach as the main approach in science classrooms since 2002. The data consisted of 12,350 data points from 3,039 students. The multilevel model for change with discontinuity in elevation and slope technique was used to analyze changes in student science achievement growth trajectories before and after adopting the SWH approach. The results showed that the SWH approach positively impacted students by initially raising science achievement scores. The initial impact was maintained and gradually increased when students were continuously exposed to the SWH approach. Disadvantaged students who were at risk of having low science achievement benefited more from experience with the SWH approach. As a result, existing problematic achievement gaps were narrowed. Moreover, students who started experience with the SWH approach as early as elementary school seemed to have better science achievement growth compared to students who started experience with the SWH approach only in high school.
The results found in this study not only confirmed the positive impacts of the SWH approach on student achievement, but also demonstrated additive impacts found when students had longitudinal experiences with the approach. By engaging in the argument-based classrooms where teachers value students' prior knowledge, encourage students to take control of their learning, and provide a non-threatening environment for students to develop big ideas through negotiation, students' achievement can be enhanced. The results also started to shed some light on sustainability of the SWH approach within the school district.
Weaver, Gabriela C.
Examines the database for the National Educational Longitudinal Study (NELS:88) for connections between student use of computers in math and science classes and their academic success. Finds that computer use was significantly correlated with gender, socioeconomic status, parent's level of education, and Item Response Theory (IRT) scores.…
Feliz-Rodriguez, Darwin; Zudaire, Santiago; Carpio, Carlos; Martínez, Elizabet; Gómez-Mendieta, Antonia; Santiago, Ana; Alvarez-Sala, Rodolfo; García-Río, Francisco
BACKGROUND: An adequate evaluation of exacerbations is a primary objective in managing patients with chronic obstructive pulmonary disease (COPD). OBJECTIVES: To define the profile of health status recovery during severe exacerbations of COPD using the COPD Assessment Test (CAT) questionnaire and to evaluate its prognostic value. METHODS: Forty-five patients with previous COPD diagnoses who were hospitalized due to severe exacerbation(s) were included in the study. These patients were treated by their respective physicians following current recommendations; health status was assessed daily using the CAT questionnaire. The CAT score, spirometry and recurrent hospitalizations were recorded one and three months after hospital discharge. RESULTS: Global initiative for chronic Obstructive Lung Disease (GOLD) stage was an independent determinant for increased CAT score during the first days of exacerbation with respect to postexacerbation values. From hospitalization day 5, the CAT score was similar to that obtained in the stable phase. Body mass index, GOLD stage and education level were related to health status recovery pattern. CAT score increase and the area under the curve of CAT recovery were inversely related to the forced expiratory volume in 1 s achieved three months after discharge (r = −0.606; P<0.001 and r = −0.532; P<0.001, respectively). Patients with recurrent hospitalizations showed higher CAT score increases and slower recovery. CONCLUSIONS: The CAT detects early health status improvement during severe COPD exacerbations. Its initial worsening and recovery pattern are related to lung function and recurrent hospitalizations. PMID:24093119
In this article, the author talks about academic jibberish. Alfie Kohn states that a great deal of academic writing is incomprehensible even to others in the same area of scholarship. Academic Jibberish may score points for the writer but does not help research or practice. The author discusses jibberish as a career strategy that impresses those…
Martinez, Josue G; Carroll, Raymond J; Muller, Samuel; Sampson, Joshua N; Chatterjee, Nilanjan
We consider the problem of score testing for certain low dimensional parameters of interest in a model that could include finite but high dimensional secondary covariates and associated nuisance parameters. We investigate the possibility of the potential gain in power by reducing the dimensionality of the secondary variables via oracle estimators such as the Adaptive Lasso. As an application, we use a recently developed framework for score tests of association of a disease outcome with an exposure of interest in the presence of a possible interaction of the exposure with other co-factors of the model. We derive the local power of such tests and show that if the primary and secondary predictors are independent, then having an oracle estimator does not improve the local power of the score test. Conversely, if they are dependent, there is the potential for power gain. Simulations are used to validate the theoretical results and explore the extent of correlation needed between the primary and secondary covariates to observe an improvement of the power of the test by using the oracle estimator. Our conclusions are likely to hold more generally beyond the model of interactions considered here. PMID:20405045
Current National Assessment of Educational Progress results continued their 40-year pattern with two-thirds of U.S. 8th graders not proficient in reading, yet formal reading and literacy instruction ends in elementary school. Lack of reading proficiency can undermine academic progress in high school. Elementary literacy instruction provides…
Jordan, Stacie L.
The purpose of this study was to determine if a relationship exists between technology skills and academic achievement among eighth-grade students. Previous studies investigated the relationship between the use of technology as a teaching tool and student outcomes, but none had specifically examined students' technology skill competencies…
Costley, Kevin C.; Bell, David; Leggett, Timothy
Poverty most likely will always be a concern in the United States, particularly in at-risk populations of the public schools. Many students in poverty enter public schools behind in academic and social skills due to a lack of quality early learning experiences. Regular education teachers work diligently to catch these children up in vital skills;…
Möller, Jens; Zimmermann, Friederike; Köller, Olaf
Background: The reciprocal I/E model (RI/EM) combines the internal/external frame of reference model (I/EM) with the reciprocal effects model (REM). The RI/EM extends the I/EM longitudinally and the REM across domains. The model predicts that, within domains, mathematics and verbal achievement (VACH) and academic self-concept have positive effects…
Gandara, Patricia; Merino, Barbara
Data collected at three schools in California with programs for students of limited English proficiency (LEP) suggests that exit rates should not be the focus of evaluations of LEP programs and that schools cannot adequately answer questions about students' academic achievement and English language acquisition by program type. (SLD)
Jones, Tracy Anne
Researchers are increasingly aware of the role of spatial skills in preparing children for future mathematics achievement (National Mathematics Advisory Panel, 2008). In addition, sex differences have been consistently documented showing boys score higher than girls in assessments of spatial ability, particularly mental rotation (Linn & Peterson,…
Nagle, Barry T.
Out-of-School Time programs and their impact on standardized college entrance exam scores for black or African-American children of single parents who have applied for a competitive college scholarship program is the study focus. Study importance is supported by the large percentage of black children raised by single parents, the large percentage…
Dietrich, Cecile C.; Lichtenberger, Eric J.
Research studies have been ambivalent about whether enrolling in community college makes completing a bachelor's degree less likely than directly enrolling in a four-year institution. This study uses propensity score matching with a posttreatment adjustment to determine the treatment effect associated with taking the community college to…
Lefgren, Lars; Sims, David
This article develops a simple model of teacher value-added to show how efficient use of information across subjects can improve the predictive ability of value-added models. Using matched student-teacher data from North Carolina, we show that the optimal use of math and reading scores improves the fit of prediction models of overall future…
Waldfogel, Jane; Zhai, Fuhua
This study examines the effects of public preschool expenditures on the math and science scores of 4th graders, holding constant child, family, and school characteristics, other relevant social expenditures, and country and year effects, in 7 Organisation for Economic Co-operation and Development (OECD) countries--Australia, Japan, the…
Center on Education Policy, 2011
This paper profiles Rhode Island's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 445 for non-Title I students and 435 for Title I students. In 2009, the mean scale score in 4th grade reading was 448 for non-Title I students and 440 for Title I students. Between 2006 and 2009, the mean…
Center on Education Policy, 2011
This paper profiles North Carolina's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade math test was 351 for non-Title I students and 347 for Title I students. In 2009, the mean scale score in 4th grade math was 354 for non-Title I students and 350 for Title I students. Between 2006 and 2009, the mean scale…
Center on Education Policy, 2011
This paper profiles Tennessee's test score trends through 2008-09. In 2004, the mean scale score on the state 4th grade reading test was 501 for non-Title I students and 486 for Title I students. In 2009, the mean scale score in 4th grade reading was 512 for non-Title I students and 495 for Title I students. Between 2004 and 2009, the mean scale…
Center on Education Policy, 2011
This paper profiles Missouri's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 661 for non-Title I students and 642 for Title I students. In 2009, the mean scale score in 4th grade reading was 661 for non-Title I students and 648 for Title I students. Between 2006 and 2009, there was no…
Center on Education Policy, 2011
This paper profiles Kentucky's test score trends through 2008-09. In 2007, the mean scale score on the state 4th grade reading test was 455 for non-Title I students and 451 for Title I students. In 2009, the mean scale score in 4th grade reading was 455 for non-Title I students and 451 for Title I students. Between 2007 and 2009, the mean scale…
Center on Education Policy, 2011
This paper profiles Colorado's test score trends through 2008-09. In 2003, the mean scale score on the state 4th grade reading test was 598 for non-Title I students and 558 for Title I students. In 2009, the mean scale score in 4th grade reading was 599 for non-Title I students and 556 for Title I students. Between 2003 and 2009, the mean scale…
Center on Education Policy, 2011
This paper profiles New Hampshire's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 445 for non-Title I students and 438 for Title I students. In 2009, the mean scale score in 4th grade reading was 448 for non-Title I students and 441 for Title I students. Between 2006 and 2009, the mean…
Center on Education Policy, 2011
This paper profiles Texas's test score trends through 2008-09. In 2005, the mean scale score on the state 4th grade reading test was 2297 for non-Title I students and 2207 for Title I students. In 2009, the mean scale score in 4th grade reading was 2334 for non-Title I students and 2235 for Title I students. Between 2005 and 2009, the mean scale…
Center on Education Policy, 2011
This paper profiles Pennsylvania's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 1390 for non-Title I students and 1220 for Title I students. In 2009, the mean scale score in 4th grade reading was 1420 for non-Title I students and 1270 for Title I students. Between 2006 and 2009, the mean…
Center on Education Policy, 2011
This paper profiles Delaware's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 474 for non-Title I students and 464 for Title I students. In 2009, the mean scale score in 4th grade reading was 478 for non-Title I students and 467 for Title I students. Between 2006 and 2009, the mean scale…
Center on Education Policy, 2011
This paper profiles Maryland's test score trends through 2008-09. In 2004, 82% of non-Title I 4th graders and 61% of Title I 4th graders scored at the proficient level on the state reading test. In 2009, 90% of non-Title I 4th graders and 78% of Title I 4th graders scored at the proficient level in reading. Between 2004 and 2009, the percentage…
Center on Education Policy, 2011
This paper profiles Massachusetts's test score trends through 2008-09. In 2006, 59% of non-Title I 4th graders and 29% of Title I 4th graders scored at the proficient level on the state reading test. In 2009, 64% of non-Title I 4th graders and 31% of Title I 4th graders scored at the proficient level in reading. Between 2006 and 2009, the…
Choi, Ick Kyu
At the University of California, Los Angeles, the Test of Oral Proficiency (TOP), an internally developed oral proficiency test, is administered to international teaching assistant (ITA) candidates to ensure an appropriate level of academic oral English proficiency. Test taker performances are rated live by two raters according to four subscales.…
Warne, Russell T.
Above-level testing is the practice of administering aptitude or academic achievement tests that are designed for typical students in higher grades or older age-groups to gifted or high-achieving students. Although widely accepted in gifted education, above-level testing has not been subject to careful psychometric scrutiny. In this study, I…
Sackett, Paul R.; Kuncel, Nathan R.; Arneson, Justin J.; Cooper, Sara R.; Waters, Shonna D.
Critics of educational admissions tests assert that tests measure nothing more than socioeconomic status (SES) and that their apparent validity in predicting academic performance is an artifact of SES. The authors examined multiple large data sets containing data on admissions and related tests, SES, and grades showing that (a) SES is related to…
Pekrun, Reinhard; Hall, Nathan C.; Goetz, Thomas; Perry, Raymond P.
A theoretical model linking boredom and academic achievement is proposed. Based on Pekrun's (2006) control-value theory of achievement emotions, the model posits that boredom and achievement reciprocally influence each other over time. Data from a longitudinal study with college students (N = 424) were used to examine the hypothesized effects. The…
Bodden, Jamie G; Needham, Robert A; Chockalingam, Nachiappan
This study assessed the basic fundamental movements of mixed martial arts (MMA) athletes using the functional movement screen (FMS) assessment and determined if an intervention program was successful at improving results. Participants were placed into 1 of the 2 groups: intervention and control groups. The intervention group was required to complete a corrective exercise program 4 times per week, and all participants were asked to continue their usual MMA training routine. A mid-intervention FMS test was included to examine if successful results were noticed sooner than the 8-week period. Results highlighted differences in FMS test scores between the control group and intervention group (p = 0.006). Post hoc testing revealed a significant increase in the FMS score of the intervention group between weeks 0 and 8 (p = 0.00) and weeks 0 and 4 (p = 0.00) and no significant increase between weeks 4 and 8 (p = 1.00). A χ² analysis revealed that the intervention group participants were more likely to have an FMS score >14 than participants in the control group at week 4 (χ² = 7.29, p < 0.01) and week 8 (χ² = 5.2, p ≤ 0.05). Finally, a greater number of participants in the intervention group were free from asymmetry at week 4 and week 8 compared with the initial test period. The results of the study suggested that a 4-week intervention program was sufficient at improving FMS scores. Most, if not all, of the movements covered on the FMS relate to many aspects of MMA training. The knowledge that the FMS can identify movement dysfunctions and, furthermore, the fact that the issues can be improved through a standardized intervention program could be advantageous to MMA coaches, thus providing the opportunity to adapt and implement new additions to training programs. PMID:23860293
Martinez, Edwin E.
This study examines the impact of instrumental music study and group chess lessons on the standardized test scores of suburban elementary public school students (grades three through five) in Levittown, New York. The study divides the students into the following groups and compares the standardized test scores of each: a) instrumental music…
Wilcox, Rand R.
The problem of determining an optimal passing score for a mastery test is discussed, when the purpose of the test is to predict success on an external criterion. For the case of constant losses for the two possible error types, a method for determining passing scores is derived. (Author/JKS)
Jeremy D. Finn; Donald A. Rock
A sample of 1,803 minority students from low-income homes was classified into 3 groups on the basis of grades, test scores, and persistence from Grade 8 through Grade 12; the classifications were academically successful school completers (…
Childs, Ruth A.; Dunn, Jennifer L.; van Barneveld, Christina; Jaciw, Andrew P.
This study compares five scoring approaches for a test of clinical reasoning skills. All of the approaches incorporate information about the correct item responses selected and the errors, such as selecting too many responses or selecting a response that is inappropriate and/or harmful to the patient. The approaches are combinations of theoretical…
Tucker, Irvin B., III; Amato, Louis
Examines the effects of academics and athletics (including football and basketball variables) on first-year college students' Scholastic Aptitude Test (SAT) scores at U.S. universities engaged in big-time intercollegiate sports. Academic variables rather than athletic success variables determine the level of average first-year students' SAT…
Shariff, Zalilah Mohd; Yasin, Zaidah Mohamed
A total of 107 Malay primary school girls (8-9 yr. old) completed a set of measurements on eating behavior (ChEAT, food neophobia scales, and dieting experience), the Rosenberg Self-Esteem Scale, body shape satisfaction, dietary intake, weight, and height. About 38% of the girls scored 20 and more on the ChEAT, and 46% of them reported dieting by reducing sugar and sweets (73%), skipping meals (67%), reducing fat foods (60%) and snacks (53%) as the most frequent methods practiced. In general, those girls with higher ChEAT scores tended to have lower self-esteem (r=.39), indicating they were more unwilling to try new foods (food neophobic) (r=.29), chose a smaller figure for desired body size (r=-.25), and were more dissatisfied with their body size (r=.31). PMID:15974357
M. Skodak; O. L. Crissey
One-fourth of the stated vocational choices of 297 girl senior students from the pre-college, commercial, and general and home economics groups of two Flint, Michigan, high schools was in office work. The concentration of highest Strong scores was in stenography, office work, home-making, and nursing, 4 occupations between which the Strong Blank does not discriminate adequately. Therefore the Strong Blank is
Awad, Germine H.
The purpose of the present study was to examine the extent to which racial identity, academic self-concept, and self-esteem predict two types of academic outcomes, grade point average (GPA), and verbal Graduate Record Examination scores. Although grades and standardized test performance are often collapsed under the category of academic…
Vijaya L. Tirunahari; Syed A. Zaidi; Rakesh Sharma; Joan Skurnick; Hormoz Ashtyani
Study objective: To compare multiple sleep latency test (MSLT) and scoring of microsleep (presence of sleep electroencephalograph between 3 and 15 s in an epoch) as a diagnostic test for excessive daytime sleepiness (EDS). Design: A retrospective study. Setting: Sleep center at a tertiary care teaching hospital. Subjects: Patients referred to a sleep center who had an MSLT and one or more of the
Lordo, R A; Feder, P I; Gettings, S D
The Cosmetic, Toiletry, and Fragrance Association (CTFA) Evaluation of Alternatives Program comprised a multi-phased study of the relationship between Draize eye irritation test data and comparable data from a selection of promising alternative (in vitro) tests. The CTFA Program was designed to determine the effectiveness and limitations of several in vitro tests over a range of different cosmetic and personal-care product types. Test materials constituted experimental formulations representative of three distinct product types. Each material was tested in vivo (according to a modified Draize eye irritation test protocol) and in vitro (according to one of up to forty different protocols). A statistical ranking and selection procedure ("concordance analysis") was used to identify those in vitro tests where the relationships between in vitro and in vivo score was sufficiently well defined to warrant further statistical analysis. In vitro test performance was then evaluated by regression modelling of these relationships. Maximum average Draize score (MAS) was utilized as the primary quantitative measure of eye irritation potential in vivo. The goodness-of-fit of the observed data to the regression model and comparison of the magnitude of upper and lower prediction-bounds on the range of probable MAS values associated with the regression model fit (prediction intervals) provide a means by which the performance of each in vitro test may be measured relative to Draize test outcome. The narrower the prediction interval (i.e. the more precise the fit), the more predictive of in vivo score (MAS) is the in vitro test result. The prediction interval thus represents uncertainty associated with Draize test prediction. Such uncertainty depends heavily on the degree of irritancy. In Phases I and II, the widths of the prediction intervals were narrowest in the region corresponding to low irritation potential; increasing widths were observed as irritation potential increased. 
In Phase III, relatively narrow prediction interval widths were observed at both the low and high end of the observed range of irritation potential; wider intervals were observed in the middle of the observed range. In general, the selected endpoints in each phase had similar average prediction interval widths and thereby differed only slightly in their ability to predict MAS to a given level of precision; any differences between endpoints tended to occur at the low and/or high ends of the observed range of irritation potential. The primary contributor to total variability associated with prediction of MAS is the deviation between the Draize score as observed in the laboratory and what is predicted by the model for a given formulation. Consistently, this component is responsible for 70% to 95% of the total variability. The other components (i.e. variability among replicate MAS and in vitro scores) could be reduced simply by increasing the number of replicate tests performed on each test formulation. However, this would have relatively little impact on the overall precision of prediction. PMID:20654467
Objective To determine the effect of clinical scores that predict streptococcal infection or rapid streptococcal antigen detection tests compared with delayed antibiotic prescribing. Design Open adaptive pragmatic parallel group randomised controlled trial. Setting Primary care in United Kingdom. Patients Patients aged ≥3 with acute sore throat. Intervention An internet programme randomised patients to targeted antibiotic use according to: delayed antibiotics (the comparator group for analyses), clinical score, or antigen test used according to clinical score. During the trial a preliminary streptococcal score (score 1, n=1129) was replaced by a more consistent score (score 2, n=631; features: fever during previous 24 hours; purulence; attends rapidly (within three days after onset of symptoms); inflamed tonsils; no cough/coryza (acronym FeverPAIN)). Outcomes Symptom severity reported by patients on a 7 point Likert scale (mean severity of sore throat/difficulty swallowing for days two to four after the consultation (primary outcome)), duration of symptoms, use of antibiotics. Results For score 1 there were no significant differences between groups. For score 2, symptom severity was documented in 80% (168/207 (81%) in delayed antibiotics group; 168/211 (80%) in clinical score group; 166/213 (78%) in antigen test group). Reported severity of symptoms was lower in the clinical score group (−0.33, 95% confidence interval −0.64 to −0.02; P=0.04), equivalent to one in three rating sore throat a slight versus moderate problem, with a similar reduction for the antigen test group (−0.30, −0.61 to −0.00; P=0.05). Symptoms rated moderately bad or worse resolved significantly faster in the clinical score group (hazard ratio 1.30, 95% confidence interval 1.03 to 1.63) but not the antigen test group (1.11, 0.88 to 1.40). In the delayed antibiotics group, 75/164 (46%) used antibiotics. 
Use of antibiotics in the clinical score group (60/161) was 29% lower (adjusted risk ratio 0.71, 95% confidence interval 0.50 to 0.95; P=0.02) and in the antigen test group (58/164) was 27% lower (0.73, 0.52 to 0.98; P=0.03). There were no significant differences in complications or reconsultations. Conclusion Targeted use of antibiotics for acute sore throat with a clinical score improves reported symptoms and reduces antibiotic use. Antigen tests used according to a clinical score provide similar benefits but with no clear advantages over a clinical score alone. Trial registration ISRCTN32027234 PMID:24114306
Florida State Dept. of Education, Tallahassee.
The College-Level Academic Skills Test (CLAST) is mandated as part of Florida's system of educational accountability. The CLAST is an achievement test that measures students' attainment of the college-level communication and mathematics skills that have been identified by the faculties of community colleges and state universities. Four subtests…
Battleson, Brenda; Booth, Austin; Weintrop, Jane
Discusses usability testing as a tool for evaluating the effectiveness and ease of use of academic library Web sites; considers human-computer interaction; reviews major usability principles; and explores the application of formal usability testing to an existing site at the University at Buffalo (NY) libraries. (Author/LRW)
Şahin, Hilal; Pekçevik, Yeliz
PURPOSE Computed tomography (CT) angiography emerges as a viable alternative technique for confirmation of brain death. However, evaluation criteria are not well established for demonstration of cerebral circulatory arrest. This retrospective study aimed to evaluate CT angiography scoring systems in diagnosis of brain death, review the literature, and compare interobserver agreement between different scales for the diagnosis of brain death. METHODS CT angiography examinations of 25 patients with a clinical diagnosis of brain death were reevaluated according to 10-, 7-, and 4-point scales. Exams were performed with a 64-slice CT scanner including unenhanced, arterial (20 s) and venous phase (60 s) scans. Subtraction images of both phases were obtained. Interobserver agreement was evaluated for the assessment of vessel opacification and diagnosis of brain death. RESULTS According to 10-, 7-, and 4-point scales, 13, 16, and 22 of 25 patients had full score, respectively. Using the clinical exam as the reference standard, sensitivities obtained for 10-, 7-, and 4-point scales were 52%, 64%, and 88%, respectively. Percent agreement between readers was 100% for 10- and 7-point scales and 88% for 4-point scale. Percent agreement for opacification of scale vessels was equally high for all three scales (93.6%, 93.7%, 91% for 10-, 7-, and 4-point scales, respectively). CONCLUSION The 4-point scale appears to be more sensitive than the 10- and 7-point scales in CT angiography evaluation for brain death. Interobserver agreement is high for all three scales when subtraction images are used. PMID:25698093
Anderson, Paul S.
The Multi-Digit Technologies (MDT) testing technique is discussed as the first major advance in computer assisted testing in several decades. The MDT testing method uses fill-in-the-blank or completion-type questions, with an alphabetized long list of possible responses. An MDT answer sheet is used to record the code number of the answer. For…
Lohman, David F.; Gambrell, James L.
Language-reduced (nonverbal) ability tests are the primary talent identification tools for ELL children. The appropriate use of such tests with low-SES and minority children is more nuanced. Whenever language-reduced tests are used for talent identification, nonverbal tests that measure more than figural reasoning abilities should be employed. For…
Versant tests are automated spoken language tests that are taken on the telephone or computer. If you would like to listen to a sample test, purchase a practice test, or view the test score after taking the test (if applicable), please visit www.VersantTest.com PART INSTRUCTIONS · Carefully read
Rodrigues, Clarissa Guimaraes; Rios-Neto, Eduardo Luiz Goncalves; de Xavier Pinto, Cristine Campos
In Brazil, the mean of math test scores for students of the fourth grade declined by approximately 0.2 standard deviation in the late 1990s. However, the potential changes in the distribution of scores have never been addressed. It is unclear if the decline was caused by deterioration in student performance levels at the upper and/or lower tails…
P. Medina-Pastor; M. Mezcua; C. Rodríguez-Torreblanca; A. R. Fernández-Alba
The obligation for accredited laboratories to participate in proficiency tests under ISO 17025, performing multiresidue methods (MRMs) for pesticide residues, involves the reporting of a large number of individual z scores, making the evaluation of the overall performance of the laboratories difficult. It entails, time and again, the need for ways to summarise the laboratory’s overall assessment into a unique
McLaughlin, J. Patrick; White, Jason T.
Outcomes measurements have always been an important part of proving to outside constituencies how you "measure up" to other schools with your business programs. A common nationally-normed exam that is used is the Major Field Achievement Test in Business from Educational Testing Services. Our paper discusses some guidelines that we are "pilot…
Munoz, Carolyn Sue
The purpose of this study was to identify the impact intensive reading instruction had for 28 students with learning disabilities at the middle school level on standardized tests. National Assessment of Educational Progress testing indicates that across the United States, learning disabled students' literacy skills are decreasing annually, and these…
Bishop, Dorothy V. M.; McDonald, David
Background: Children who meet language test criteria for specific language impairment (SLI) are not necessarily the same as those who are referred to a speech and language therapist. Aims: To consider how far this discrepancy reflects insensitivity of traditional language tests to clinically important features of language impairment. Methods &…
Kelly, Colleen; And Others
The SPINE test (SPeech INtelligibility Evaluation), designed to measure speech intelligibility of severely to profoundly hearing-impaired children, was administered to 30 hearing-impaired children (12-16 years old) to examine its validity. Results suggested that the SPINE test is a valid measure of speech intelligibility with hearing-impaired…
Williams, W. Larry; Weil, Timothy M.; Porter, James C. K.
Guided notes were employed in two undergraduate Psychology courses involving 71 students. The study design utilized an alternating treatments format to compare Traditional Lectures with Guided Notes lectures. In one of the two courses, tests were administered after each class lecture, whereas the same type of test was administered at the beginning…
Schulz, E. Matthew; Wang, Lin
In this study, items were drawn from a full-length test of 30 items in order to construct shorter tests for the purpose of making accurate pass/fail classifications with regard to a specific criterion point on the latent ability metric. A three-item parameter Item Response Theory (IRT) framework was used. The criterion point on the latent ability…
Samiran Sinha; Bhramar Mukherjee
Summary The paper considers the problem of determining the number of matched sets in 1 : M matched case-control studies with a categorical exposure having k + 1 categories, k ≥ 1. The basic interest lies in constructing a test statistic to test whether the exposure is associated with the disease. Estimates of the k odds ratios for 1
Hahn, Jinsoo; Jang, Kyungho
International comparisons of economic understanding generally require a translation of a standardized test written in English into another language. Test results can differ based on how researchers translate the English written exam into one in their own language. To confirm this hypothesis, two differently translated versions of the "Basic…
Tatsuoka, Kikumi K.; Tatsuoka, Maurice M.
The family of Weibull distributions was investigated as a model for the distributions of response times for items in computer-based criterion-referenced tests. The fit of these distributions was, with a few exceptions, good to excellent according to the Kolmogorov-Smirnov test. For a few relatively simple items, the two-parameter gamma…
Homack, Susan Rae
List of tables: 1. Demographics Among Groups of Children with ADHD and No Diagnosis; 2. Conners' Continuous Performance Test-II Variable Means for Children with ADHD and No Diagnosis; 3. Gordon Diagnostic System...
The current study seeks to replicate and extend the research on the effects of implicit theories of intelligence. Study 1 tested the hypothesis that an incremental theory, a belief that intelligence is malleable, would result in mastery...
Homack, Susan Rae
Today, there are numerous versions of the continuous performance test (CPT) used in clinical and research settings. Although CPTs may constitute a similar group of tasks with a common paradigm, they are very different in the parameters they measure...
Geluk, Christiane A; Dikkers, Riksta; Kors, Jan A; Tio, René A; Slart, Riemer HJA; Vliegenthart, Rozemarijn; Hillege, Hans L; Willems, Tineke P; de Jong, Paul E; van Gilst, Wiek H; Oudkerk, Matthijs; Zijlstra, Felix
Background Asymptomatic subjects at intermediate coronary risk may need diagnostic testing for risk stratification. Both measurement of coronary calcium scores and exercise testing are well established tests for this purpose. However, it is not clear which test should be preferred as initial diagnostic test. We evaluated the prevalence of documented coronary artery disease (CAD) according to calcium scores and exercise test results. Methods Asymptomatic subjects with ST-T changes on a rest ECG were selected from the population based PREVEND cohort study and underwent measurement of calcium scores by electron beam tomography and exercise testing. With calcium scores ≥10 or a positive exercise test, myocardial perfusion imaging (MPS) or coronary angiography (CAG) was recommended. The primary endpoint was documented obstructive CAD (≥50% stenosis). Results Of 153 subjects included, 149 subjects completed the study protocol. Calcium scores ≥400, 100–399, 10–99 and <10 were found in 16, 29, 18 and 86 subjects and the primary endpoint was present in 11 (69%), 12 (41%), 0 (0%) and 1 (1%) subjects, respectively. A positive, nondiagnostic and negative exercise test was present in 33, 27 and 89 subjects and the primary endpoint was present in 13 (39%), 5 (19%) and 6 (7%) subjects, respectively. Receiver operator characteristics analysis showed that the area under the curve, as measure of diagnostic yield, of 0.91 (95% CI 0.84–0.97) for calcium scores was superior to 0.74 (95% CI 0.64–0.83) for exercise testing (p = 0.004). Conclusion Measurement of coronary calcium scores is an appropriate initial non-invasive test in asymptomatic subjects at increased coronary risk. PMID:17629903
Gambichler, Thilo; Moussa, Georg; Sand, Michael; Sand, Daniel; Orlikov, Alexei; Altmeyer, Peter; Hoffmann, Klaus
Noninvasive imaging techniques might be of particular diagnostic value for studying and monitoring cutaneous inflammatory conditions such as contact dermatitis. We evaluate acute allergic contact dermatitis (AACD) by means of optical coherence tomography (OCT) and correlate the clinical grading of patch test reactions with the findings obtained from OCT. Twenty positive patch test reactions (+, n = 6; ++, n = 7; +++, n = 7) are investigated using a conventional OCT scanner. In comparison to the control sites, OCT of AACD showed pronounced skin folds, thickened and/or disrupted entrance signals, and a significant increase in epidermal thickness. Moreover, clearly demarcated signal-free cavities within the epidermis and considerable reduction of dermal reflectivity are demonstrated by OCT. Notably, the latter findings strongly correlate with the clinical patch test grading. OCT may be a useful tool for visualization of micromorphological features of AACD. However, before OCT can be employed as an objective parameter in grading severity of patch test reactions, larger studies are required that correlate clinical patch test readings and OCT findings with histopathology. PMID:16409095
Karatas, Hakan; Alci, Bulent; Aydin, Hasan
Test anxiety seems like a benign problem to some people, but it can be potentially serious when it leads to high levels of distress and academic failure. The aim of this study is to define the correlation among high school senior students' test anxiety, academic performance (GPA) and points of university entrance exam (UEE). The study group…
Wang, Shudong; McCall, Marty; Jiao, Hong; Harris, Gregg
The purposes of this study are twofold. First, to investigate the construct or factorial structure of a set of Reading and Mathematics computerized adaptive tests (CAT), "Measures of Academic Progress" (MAP), given in different states at different grades and academic terms. The second purpose is to investigate the invariance of test factorial…
Fida, Mariam; Kassab, Salah Eldin
Purpose The development of clinical problem-solving skills evolves over time and requires structured training and background knowledge. Computer-based case simulations (CCS) have been used for teaching and assessment of clinical reasoning skills. However, previous studies examining the psychometric properties of CCS as an assessment tool have been controversial. Furthermore, studies reporting the integration of CCS into problem-based medical curricula have been limited. Methods This study examined the psychometric properties of using CCS software (DxR Clinician) for assessment of medical students (n=130) studying in a problem-based, integrated multisystem module (Unit IX) during the academic year 2011–2012. Internal consistency reliability of CCS scores was calculated using Cronbach’s alpha statistics. The relationships between students’ scores in CCS components (clinical reasoning, diagnostic performance, and patient management) and their scores in other examination tools at the end of the unit including multiple-choice questions, short-answer questions, objective structured clinical examination (OSCE), and real patient encounters were analyzed using stepwise hierarchical linear regression. Results Internal consistency reliability of CCS scores was high (α=0.862). Inter-item correlations between students’ scores in different CCS components and their scores in CCS and other test items were statistically significant. Regression analysis indicated that OSCE scores predicted 32.7% and 35.1% of the variance in clinical reasoning and patient management scores, respectively (P<0.01). Multiple-choice question scores, however, predicted only 15.4% of the variance in diagnostic performance scores (P<0.01), while students’ scores in real patient encounters did not predict any of the CCS scores. Conclusion Students’ scores in OSCE are the most important predictors of their scores in clinical reasoning and patient management using CCS. 
However, real patient encounter assessment does not appear to test a construct similar to what is tested in CCS. PMID:25759603
Hartlage, Lawrence C.
The study investigated academic, behavioral, and psychological test performance of children diagnosed as emotionally disturbed, minimally brain injured, of dull normal intelligence, or suffering from a specific learning disability, respectively. Participating were 132 children, whose mean age was 9 years, 7 months. Multidisciplinary evaluation was…
Pekrun, Reinhard; Elliot, Andrew J.; Maier, Markus A.
The authors propose a theoretical model linking achievement goals and achievement emotions to academic performance. This model was tested in a prospective study with undergraduates (N = 213), using exam-specific assessments of both goals and emotions as predictors of exam performance in an introductory-level psychology course. The findings were…
Youyan Nie; Shun Lau; Albert K. Liau
Emphasizing task importance, which is regarded as a way of motivating engaged behavior, may increase an individual's anxiety. The present research investigated whether academic self-efficacy could moderate the maladaptive relation between task importance and test anxiety. 1978 and 1670 Grade 9 Singaporean students participated in a survey related to their learning experience and motivational processes in math and English respectively.
Harold S. Pine; Harold C. Pillsbury
Objective: Allergic disease plays a central role in the clinical practice of otolaryngology. The purpose of this study was to review the 20-year experience of an allergy clinic integrated within an otolaryngology practice at a major academic institution. Study design: We performed a retrospective database review of over 3300 otolaryngology patients referred for allergy skin testing between 1979 and 1999.
Raufelder, Diana; Hoferichter, Frances; Schneeweiss, David; Wood, Megan A.
Based on cognitive evaluation theory (CET) and organismic integration theory (OIT)--both sub-theories of self-determination theory (SDT)--the present study examined whether the academic self-regulation of youth with test anxiety can be strengthened through social and motivational relationships with peers and teachers. This study employed a large…
Ho, Andrew D.; Yu, Carol C.
Many statistical analyses benefit from the assumption that unconditional or conditional distributions are continuous and normal. More than 50 years ago in this journal, Lord and Cook chronicled departures from normality in educational tests, and Micceri similarly showed that the normality assumption is met rarely in educational and psychological…
In his thoughtful focus article, Haertel (this issue) pushes testing experts to broaden the scope of their validation efforts and to invite scholars from other disciplines to join them. He credits existing validation frameworks for helping the measurement community to identify incomplete or nonexistent validity arguments. However, he notes his…
...unnecessary further testing and surgery due to false positive...that alerts users to the risk associated with off-label...or not to proceed with surgery. In the Federal Register...warning to address the risk of off-label use...Federal Food, Drug, and Cosmetic Act (FD&C...
...unnecessary further testing and surgery due to false positive...that alerts users to the risk associated with off-label...or not to proceed with surgery. While FDA is establishing...warning to address the risk of off-label use...Federal Food, Drug, and Cosmetic Act (FD&C...
Allen, Denise A.
Little empirical evidence suggested that independent reading abilities of students enrolled in biology predicted their performance on the Biology I Graduation End-of-Course Assessment (ECA). An archival study was conducted at one Indiana urban public high school in Indianapolis, Indiana, by examining existing educational assessment data to test…
Council, Forrest M.; And Others
The study compared the driver licensing test performance of two groups of driver education students: those involved in North Carolina's multi-vehicle range program and students in the "30 and 6" program (30 hours of class instruction and six hours of "behind the wheel" instruction). It evaluated the performance of 3,049 applicants (all aged 16 and…
Sederberg, Per B.
of training-induced gains on fluid intelligence tests have fueled an explosion of interest in cognitive intelligence gain is not the only possible explanation for the observed control-adjusted far transfer across of eye movement data from 35 participants solving Raven's Advanced Progressive Matrices on two separate
Zajac, David J.
Purpose: To determine if children with repaired cleft palate and normal velopharyngeal (VP) closure as determined by aerodynamic testing exhibit greater acoustic nasalance than control children without cleft palate. Method: Pressure-flow procedures were used to identify 2 groups of children based on VP closure during the production of /p/ in the…
Psychometric Theories of Intelligence; Cattell-Horn Fluid and Crystallized Intelligence; Carroll's Three-Stratum Hierarchy...-based intelligence tests have been preferred by the educators for more than a century, due to the aforementioned advantages. Psychometric Theories of Intelligence McGrew and Flanagan (1998) noted three different research traditions in the structural analysis...
Welsh, Megan E.; D'Agostino, Jerome V.; Kaniskan, Burcu
Standards-based progress reports (SBPRs) require teachers to grade students using the performance levels reported by state tests and are an increasingly popular report card format. They may help to increase teacher familiarity with state standards, encourage teachers to exclude nonacademic factors from grades, and/or improve communication with…
Educational stakeholders are aware that school administration has become an incredibly intricate dynamic that is too complex for principals to handle alone. Test-driven accountability has made the already daunting task of school administration even more challenging. Distributed leadership presents an opportunity to explore increased leadership…
Thomas, Antoinette D.
Empirical findings have supported an inverse relationship between closeness to extended family and friends versus spouse. The three foregoing interpersonal relationships in terms of affective quality, direction and dominance were investigated, using an objective test as well as the TAT. Bellak (1986) considered that the strength of the TAT lies in…
Daniel N. Allen; Joshua E. Caron; Lisa A. Duke; Gerald Goldstein
Recent factor-analytic studies of the Halstead Category Test (HCT) indicate that its seven subtests form three factors including a Counting factor (subtests I and II), a Spatial Positional Reasoning factor (subtests III, IV, and VII), and a Proportional Reasoning factor (subtests V, VI, and VII). The sensitivity and specificity of these factors to heterogeneous forms of brain damage was examined
An examination of mathematics achievement as measured by standardized test scores and grade distribution among urban high schools to determine the relationship between student outcomes in key courses and standardized tests
Irene Gail Norde
The study examined the relationship between mathematics achievement as measured by standardized test scores and grade distribution among high schools in a large urban school district to determine if MEAP and MAT 7 scores reflect student outcomes in key courses. Statistical analysis was used to determine the relationship between student outcomes in key courses and standardized tests. An ex…
Development and administration of institutional ESL placement tests require a great deal of financial and human resources. Due to a steady increase in the number of international students studying in the United States, some US universities have started to consider using standardized test scores for ESL placement. The English Placement Test (EPT)…
Marsh, Herbert W; Hau, Kit-Tai
Academically selective schools are intended to affect academic self-concept positively, but theoretical and empirical research demonstrates that the effects are negative. The big-fish-little-pond effect (BFLPE), an application of social comparison theory to educational settings, posits that a student will have a lower academic self-concept in an academically selective school than in a nonselective school. This study, the largest cross-cultural study of the BFLPE ever undertaken, tested theoretical predictions for nationally representative samples of approximately 4,000 15-year-olds from each of 26 countries (N = 103,558) who completed the same self-concept instrument and achievement tests. Consistent with the BFLPE, the effects of school-average achievement were negative in all 26 countries (M beta = -.20, SD = .08), demonstrating the BFLPE's cross-cultural generalizability. PMID:12971085
Bagust, Jeff; Docherty, Sharon; Haynes, Wayne; Telford, Richard; Isableu, Brice
The Rod and Frame Test has been used to assess the degree to which subjects rely on the visual frame of reference to perceive vertical (visual field dependence-independence perceptual style). Early investigations found children exhibited a wide range of alignment errors, which reduced as they matured. These studies used a mechanical Rod and Frame system, and presented only mean values of grouped data. The current study also considered changes in individual performance. Changes in rod alignment accuracy in 419 school children were measured using a computer-based Rod and Frame test. Each child was tested at school Grade 2 and retested in Grades 4 and 6. The results confirmed that children displayed a wide range of alignment errors, which decreased with age but did not reach the expected adult values. Although most children showed a decrease in frame dependency over the 4 years of the study, almost 20% had increased alignment errors suggesting that they were becoming more frame-dependent. Plots of individual variation (SD) against mean error allowed the sample to be divided into 4 groups; the majority with small errors and SDs; a group with small SDs, but alignments clustering around the frame angle of 18°; a group showing large errors in the opposite direction to the frame tilt; and a small number with large SDs whose alignment appeared to be random. The errors in the last 3 groups could largely be explained by alignment of the rod to different aspects of the frame. At corresponding ages females exhibited larger alignment errors than males although this did not reach statistical significance. This study confirms that children rely more heavily on the visual frame of reference for processing spatial orientation cues. Most become less frame-dependent as they mature, but there are considerable individual differences. PMID:23724139
Ohno, Y; Kaneko, T; Inoue, T; Morikawa, Y; Yoshida, T; Fujii, A; Masuda, M; Ohno, T; Hayashi, M; Momma, J; Uchiyama, T; Chiba, K; Ikeda, N; Imanishi, Y; Itakagaki, H; Kakishima, H; Kasai, Y; Kurishita, A; Kojima, H; Matsukawa, K; Nakamura, T; Ohkoshi, K; Okumura, H; Saijo, K; Sakamoto, K; Suzuki, T; Takano, K; Tatsumi, H; Tani, N; Usami, M; Watanabe, R
A three-step interlaboratory validation of alternative methods to the Draize eye irritation test (Draize test) was conducted by the co-operation of 27 organizations including national research institutes, universities, cosmetic industries, kit suppliers and others. Twelve alternative methods were evaluated using 38 cosmetic ingredients and isotonic sodium chloride solution. Draize tests were conducted according to the OECD guidelines using the same lot of test substances as was evaluated in the alternative tests. Results were as follows. (1) Variation in Draize scores was large near the critical range (maximal average Draize total scores (MAS)=15-50) for the evaluation of cosmetic ingredients. (2) Interlaboratory variation was relatively small for the alternative tests. The mean coefficients of variation (CV%) were less than 50 for all assays except for the hen's egg-chorioallantoic membrane test (HET-CAM), chorioallantoic membrane-trypan blue staining test (CAM-TB) and haemoglobin denaturation test (HD). The CV% of these three methods came into the same range as the other tests when non-irritants were excluded from the data analysis. (3) Results for acids (pH of 10% solution <2.5), alkalis (pH of 10% solution >11.5) and alcohols (lower mono-ol) in cytotoxicity tests clearly deviated from the other samples in the comparison of cytotoxicity with Draize results. (4) Pearson's correlation coefficients (r) between results from cytotoxicity tests using serum and MAS were -0.86 to -0.92 for samples excluding acids, alkalis and alcohols. (5) When the samples were divided into liquids and powders, r of CAM-TB increased from 0.71 for all samples to 0.80 and 0.92, respectively. (6) Spearman's rank correlation coefficients between the results of alternative methods and MAS were relatively high (r>0.8) in the case of HET-CAM and CAM-TB. 
Those for cytotoxicity tests were high if the data for acids, alkalis and alcohols were excluded (SIRC-CVS: r=0.945, SIRC-NRU: r=0.931, HeLa-MTT: r=0.926, CHL-CVS: r=0.880). Exclusion of data for powdered samples also increased the coefficient of HET-CAM and CAM-TB to 0.831 and 0.863, respectively. These results suggest that no single method can constitute an evaluation system applicable to all types of test substances by itself. However, several methods will be useful for the prediction of eye irritation potential of cosmetic ingredients if they are used with clear understanding of the characteristics of those methods. PMID:20654468
Zou, Xiaoling; Zhang, Xuning
A new score report based on a mechanism of formative assessment and feedback is developed to offer individual testees not only their final scores but also their sub-scale scores, their percentile position, as well as corresponding feedback on self-regulation strategies. Structural equation modeling is adopted in the confirmatory factor analysis to…
Fehr, Charles Norman
Learning to read requires knowledge of word meanings for those words most commonly encountered in basic reading materials. Many young students lack the basic vocabulary knowledge needed to facilitate learning to read. Two randomized studies were conducted to test the effects of an online, computer-adaptive vocabulary instruction program designed…
Angela Anjorin; Helga Schmidt; Hans-Georg Posselt; Christina Smaczny; Hanns Ackermann; Michael Deimling; Thomas J. Vogl; Nasreddin Abolmaali
The aim of this study was to investigate whether the parenchymal lung damage in patients suffering from cystic fibrosis (CF) can be equivalently quantified by the Chrispin-Norman (CN) scores determined with low-field magnetic resonance imaging (MRI) and conventional chest radiography (CXR). Both scores were correlated with pulmonary function tests (PFT) and the Shwachman-Kulczycki method (SKM). To evaluate the comparability of
As states begin to demand more rigor on their high-stakes tests--and the tests evolve to incorporate revised academic standards--many officials are gambling that an initial wave of lower scores will give way to greater student achievement in the future. Changes to statewide tests and subsequent plummeting scores sparked controversy and emergency…
Bond, Timothy N.; Lang, Kevin
Although both economists and psychometricians typically treat them as interval scales, test scores are reported using ordinal scales. Using the Early Childhood Longitudinal Study and the Children of the National Longitudinal Survey, we examine the effect of order-preserving scale transformations on the evolution of the black-white reading test…
Evans, Richard M.; Surkan, Alvin J.
The recent arrival of portable computer systems with high-level language interpreters now makes it practical to rapidly develop complex testing and scoring programs. These programs permit undergraduates access, at arbitrary times, to testing as an integral part of a mastery learning strategy. Effects of introducing the computer were studied by…
Angela L. Duckworth; Patrick D. Quinn; Eli Tsukayama
The increasing prominence of standardized testing to assess student learning motivated the current investigation. We propose that standardized achievement test scores assess competencies determined more by intelligence than by self-control, whereas report card grades assess competencies determined more by self-control than by intelligence. In particular, we suggest that intelligence helps students learn and solve problems independent of formal instruction, whereas
Papay, John P.; Murnane, Richard J.; Willett, John B.
Students receive abundant information about their educational performance, but how this information affects future educational-investment decisions is not well understood. Increasingly common sources of information are state-mandated standardized tests. On these tests, students receive a score and a label that summarizes their performance. Using a…
Konstantinov, K V; Beard, K T; Goddard, M E; van der Werf, J H J
A multi-trait (MT) random regression (RR) test day (TD) model has been developed for genetic evaluation of somatic cell scores for Australian dairy cattle, where first, second and third lactations were considered as three different but correlated traits. The model includes herd-test-day, year-season, age at calving, heterosis and lactation curves modelled with Legendre polynomials as fixed effects, and random genetic and permanent environmental effects modelled with Legendre polynomials. Residual variance varied across the lactation trajectory. The genetic parameters were estimated using asreml. The heritability estimates ranged from 0.05 to 0.16. The genetic correlations between lactations and between test days within lactations were consistent with most of the published results. Preconditioned conjugate gradient algorithm with iteration on data was implemented for solving the system of equations. For reliability approximation, the method of Tier and Meyer was used. The genetic evaluation system was validated with Interbull validation method III by comparing proofs from a complete evaluation with those from an evaluation based on a data set excluding the most recent 4 years. The genetic trend estimate was in the allowed range and correlations between the two sets of proofs were very high. Additionally, the RR model was compared to the previous test day model. The correlations of proofs between both models were high (0.97) for bulls with high reliabilities. The correlations of bulls decreased with increasing incompleteness of daughter performance information. The correlations between the breeding values from two consecutive runs were high ranging from 0.97 to 0.99. The MT RR TD model was able to make effective use of available information on young bulls and cows, and could offer an opportunity to breeders to utilize estimated breeding values for first and later lactations. PMID:19646149
Education Digest: Essential Readings Condensed for Quick Review, 2004
This article presents an adaptation of an article from School Board News, January 6, 2004 edition. The article describes the effort of de-tracking students of varying ability levels, made by officials of South Side High School, in Rockville Centre, New York, and Noble High School, in North Berwick, Maine. Officials from both schools say that the…
A comparison of three learning indicators (college grade-point average [GPA], student-reported growth, and Graduate Record Examination scores) found: (1) student-reported cognitive growth survey items have a modest relative validity; (2) the attenuation associated with use of residual gain scores does not invalidate their use; and (3) GPA and…
Smith, Steven L.
There is a growing body of literature linking curriculum based measurements and general outcome measures to state test outcomes. With an increasing focus on the early identification of students at risk for academic failure, the present study investigated the predictive validity of preschool early literacy general outcome measures (GOMs) in a low…
Freeland, Amy L.; Emerson, Robert Wall; Curtis, Amy B.; Fogarty, Kieran
This article presents the findings of a secondary analysis of the National Longitudinal Transition Study 2 that explored the predictive association between training in access technology and performance on the Woodcock-Johnson Tests of Academic Achievement: III. The results indicated that the use of access technology had a limited predictive…
Benson, Nicholas; Taub, Gordon E.
The purpose of this study was to test the invariance of scores derived from the Woodcock-Johnson III Tests of Cognitive Ability (WJ III COG) and Woodcock-Johnson III Tests of Academic Achievement (WJ III ACH) across a group of students diagnosed with learning disorders (n = 994) and a matched sample of students without known clinical diagnoses (n…
Anjorin, Angela; Schmidt, Helga; Posselt, Hans-Georg; Smaczny, Christina; Ackermann, Hanns; Deimling, Michael; Vogl, Thomas J; Abolmaali, Nasreddin
The aim of this study was to investigate whether the parenchymal lung damage in patients suffering from cystic fibrosis (CF) can be equivalently quantified by the Chrispin-Norman (CN) scores determined with low-field magnetic resonance imaging (MRI) and conventional chest radiography (CXR). Both scores were correlated with pulmonary function tests (PFT) and the Shwachman-Kulczycki method (SKM). To evaluate the comparability of MRI and CXR for different states of the disease, all scores were applied to patients divided into three age groups. Seventy-three CF patients (mean SKM score: 62 +/- 8) with a median age (range) of 14 years (7-32) were included. The mean CN scores determined with both imaging methods were comparable (CXR: 12.1 +/- 4.7; MRI: 12.0 +/- 4.5) and showed high correlation (P < 0.05, R = 0.97). Only weak correlations were found between imaging, PFT, and SKM. Both imaging modalities revealed significantly more severe disease expression with age, while PFT and SKM failed to detect early signs of disease. We conclude that imaging of the lung in CF patients is capable of detecting subtle and early parenchymal destruction before lung function or clinical scoring is affected. Furthermore, low-field MRI revealed high consistency with chest radiography and may be used for a thorough follow-up while avoiding radiation exposure. PMID:18274754
Yu, Chong Ho
Many American authors expressed their concern that US competitiveness in science, technology, engineering, and mathematics (STEM) is losing ground. Using the Trends in International Mathematics and Science Study (TIMSS) 2007 data, this study investigated how academic self-concept and instrumental motivation influence science test performance among…
Miller, Joshua D; Hyatt, Courtland S; Rausher, Steven; Maples, Jessica L; Zeichner, Amos
The Elemental Psychopathy Assessment (EPA) is a relatively new self-report measure of the basic traits associated with psychopathy. Using community participants (N = 104) oversampled for the presence of psychopathic traits, we examined the convergent and criterion validity of the EPA total and factor scores (i.e., Antagonism, Emotional Stability, Disinhibition, and Narcissism) in relation to self- and informant reports of psychopathy and the general personality dimensions of the HEXACO (Honesty-Humility, Emotionality, Extraversion, Agreeableness, Conscientiousness, and Openness to Experience; Ashton & Lee, 2009), as well as self-reported scores on narcissism, Machiavellianism, and externalizing behaviors (EBs) such as antisocial behavior and aggression. The EPA total and factor scores manifested substantial positive correlations with self- and informant-reported psychopathy scores and dimensions from the HEXACO, narcissism, Machiavellianism, and EBs. The patterns of these relations became clearer and more differentiated when examined via regression analyses such that the EPA factors manifested differential relations with various aspects of psychopathy (e.g., EPA Antagonism was the only unique correlate of psychopathy traits related to callousness and manipulation). Overall, the EPA is a promising assessment tool given the breadth of its coverage, the flexibility with which it can be used (total score; 4-factor scores; 18 subscale scores), and its ties to a popular model of basic personality traits. PMID:24548152
Phelps, Richard P.
This document consists of 14 articles which appeared in the electronic news bulletin, "EducationNews.org," and which were part of a series on "Test Bashing," a discussion of the use of standardized tests that focuses on controversies surrounding the Texas Assessment of Academic Skills and test score improvements in Texas. The articles are: (1)…
John R. Hayes; Jonathan I. Groner
Background: Missing data and the retrospective, nonrandomized nature of trauma registries can decrease the quality of registry-based research. Therefore, we used multiple imputation and propensity scores to test the effect of car seats and seat belt usage on injury severity in children involved in motor vehicle crashes.
Quinn, David M.
Black-white test score gaps form in early childhood and widen over elementary school. Sociologists have debated the roles that socioeconomic status (SES) and school quality play in explaining these patterns. In this study, I replicate and extend past research using new nationally representative data from the Early Childhood Longitudinal…
Kramarz, Francis; Machin, Stephen; Ouazad, Amine
What makes a test score? There is a great deal of uncertainty surrounding the exact contribution of school quality, pupil background, and peers in educational achievement. If peers make most of the difference, then diversity and heterogeneous classrooms may narrow the gap between high- and low-performing students. If pupil background is the first…
Eighth-grade students in New Jersey take the Early Warning Test (EWT), which involves reading, writing, and mathematics. Students with EWT scores below the state level of competency take a remedial mathematics course that provides students with computer-assisted instruction (2 days per week) as well as regular classroom instruction (3 days per…
Wold, Donald C.; And Others
This study found that clinician-generated SPINE (Speech Intelligibility Evaluation) test scores were correlated with objective computer-generated measures of tongue deviancy during vowel production in 28 persons (ages 14-20) with severe/profound hearing loss. Data suggest that subjects were more deviant in their production of front vowels than…
Igoe, Deirdre; Peralta, Christopher; Jean, Lindsey; Vo, Sandra; Yep, Linda Ngan; Zabjek, Karl; Wright, F. Virginia
Preschool-aged children continually learn new skills and perfect existing ones. "Mastery motivation" is theorized to be a personality trait linked to skill learning. The Dimensions of Mastery Questionnaire (DMQ) quantifies mastery motivation. This pilot study evaluated DMQ test-retest score reliability (preschool-version) and included exploratory…
Tucker and chained linear equatings were evaluated in two testing scenarios. In Scenario 1, referred to as rater comparability scoring and equating, the anchor-to-total correlation is often very high for the new form but moderate for the reference form. This may adversely affect the results of Tucker equating, especially if the new and reference…
Carlo Cianchetti; Simona Corona; Maria Foscoliano; Daniela Contu; Giuseppina Sannio-Fancello
According to Nelson's (1976) criteria, the MCST (MWCST) is a simplification of the Wisconsin Card Sorting Test (WCST). As the MCST is particularly suitable for children, the aim of this study was to establish the normative data presently lacking for that group. The MCST was administered to 1126 normal children aged 4 to 13 years. Scoring was based on all
Anderson, Daniel; Alonzo, Julie; Tindal, Gerald
In this technical report, we document the results of a cross-validation study designed to identify optimal cut-scores for the use of the easyCBM[R] mathematics test in the state of Washington. A large sample, randomly split into two groups of roughly equal size, was used for this study. Students' performance classification on the Washington state…
Moses, Tim; Liu, Jinghua
In equating research and practice, equating functions that are smooth are typically assumed to be more accurate than equating functions with irregularities. This assumption presumes that population test score distributions are relatively smooth. In this study, two examples were used to reconsider common beliefs about smoothing and equating. The…
Kerns, Claretta M.
The purpose of this study was to examine the effectiveness of high school transition strategies for ninth grade students in comparison to the traditional high school experience of first time ninth grade students. This study compared the English End-of-Course (EOC) test scores of first time ninth grade students in a traditional high school setting…
This study investigated the relationship between the pattern of impairment on test scores of neurologically impaired children and proximity to an inactive toxic waste disposal site. Subjects (N = 147) were students, ages 6-16, classified as neurologically impaired. Seventy-six who lived within six miles of the site served as the experimental group and 71 who did not live near a site comprised the control group. Research was based on existing data available through the Child Study Team evaluation process. Attention was given to the ACID cluster of the WISC-R, the Arithmetic and Reading subtests on the WRAT, and the Koppitz scores of the Bender Visual Motor Gestalt Test. No significant difference was found between the experimental and control groups. Sex differences within the experimental group were not significant. Time of exposure and patterning of scores in the experimental group were investigated. Time had a significant main effect on WISC-R Arithmetic and Digit Span subtests, the ACID cluster and the Bender Test for the total group. Main effect for sex was significant for the WISC-R Information subtest. An interaction effect was found to be significant on the WRAT Arithmetic subtest. The longer the girls lived within the site area the lower they scored on the WISC-R Information subtest and the WRAT Arithmetic subtest. The variable exposure (interaction of distance and time) was related to lower scores on the WISC-R Arithmetic and Digit Span subtests. A two-way interaction was found on the WRAT Arithmetic subtest. The longer the females were exposed to the waste site area, the lower they scored on the WRAT Arithmetic subtest. A comparison of those children in the site area from birth and those in the area three years prior to the evaluation was done. A significant main effect was found for the Bender Gestalt.
Brown, JP; Amaechi, BT; Bader, JD; Gilbert, GH; Makhija, SK; Lozano-Pineda, J; Leo, MC; Chuhe, C; Vollmer, WM
Objectives To better understand the effectiveness of xylitol in caries prevention in adults, and to attempt improved clinical trial efficiency. Methods As part of the Xylitol for Adult Caries Trial (X-ACT), non-cavitated and cavitated caries lesions were assessed in subjects who were experiencing the disease. The trial was a test of the effectiveness of 5 grams/day of xylitol, consumed by dissolving in the mouth five 1 gram lozenges spaced across each day, compared with a sucralose placebo. For this analysis, seeking trial efficiency, 538 subjects aged 21–80, with complete data for four dental examinations were selected from the 691 randomized into the three year trial, conducted at three sites. Acceptable inter and intra examiner reliability before and during the trial was quantified using the kappa statistic. Results The mean annualized non-cavitated plus cavitated lesion transition scores in coronal and root surfaces, from sound to carious favoured xylitol over placebo, during the three cumulative periods of 12, 24, and 33 months, but these clinically and statistically non-significant differences declined in magnitude over time. Restricting the present assessment to those subjects with a higher baseline lifetime caries experience showed possible but inconsistent benefit. Conclusions There was no clear and clinically relevant preventive effect of xylitol on caries in adults with adequate fluoride exposure when non-cavitated plus cavitated lesions were assessed. This conformed to the X-ACT trial result assessing cavitated lesions. Including non-cavitated lesion assessment in this full scale, placebo controlled, multi site, randomized, double blinded clinical trial in adults experiencing dental caries, did not achieve added trial efficiency or demonstrate practical benefit of xylitol. Trial Registration ClinicalTrials.Gov NCT00393055 PMID:24205951
Wang, Jinhao; Brown, Michelle Stallone
The current research was conducted to investigate the validity of automated essay scoring (AES) by comparing group mean scores assigned by an AES tool, IntelliMetric [TM] and human raters. Data collection included administering the Texas version of the WriterPlacer "Plus" test and obtaining scores assigned by IntelliMetric [TM] and by human…
Effects of Absence and Cognitive Skills Index on Various Achievement Indicators. A Study of ISTEP Scores, Discrepancies, and School-Based Math and English Tests of 1997-1998 Seventh Grade Students at Sarah Scott Middle School, Terre Haute, Indiana.
Davis, Holly S.
This study examines the correlation between absence, cognitive skills index (CSI), and various achievement indicators such as the Indiana Statewide Testing for Educational Progress (ISTEP) test scores, discrepancies, and school-based English and mathematics tests for 64 seventh-grade students from one middle school. Scores for each of the subtests…
Gray, Sarah A; Rogers, Maria; Martinussen, Rhonda; Tannock, Rosemary
Introduction. Behavioral inattention, working memory (WM), and academic achievement share significant variance, but the direction of relationships across development is unknown. The aim of the present study was to determine whether WM mediates the pathway between inattentive behaviour and subsequent academic outcomes. Methods. 204 students from grades 1-4 (49.5% female) were recruited from elementary schools. Participants received assessments of WM and achievement at baseline and one year later. WM measures included a visual-spatial storage task and auditory-verbal storage and manipulation tasks. Teachers completed the SWAN behaviour rating scale both years. Mediation analysis with PROCESS (Hayes, 2013) was used to determine mediation pathways. Results. Teacher-rated inattention indirectly influenced math addition fluency, subtraction fluency and calculation scores through its effect on visual-spatial WM, only for boys. There was a direct relationship between inattention and math outcomes one year later for girls and boys. Children who displayed better attention had higher WM scores, and children with higher WM scores had stronger scores on math outcomes. Bias-corrected bootstrap confidence intervals for the indirect effects were entirely below zero for boys, for the three math outcomes. WM did not mediate the direct relationship between inattention and reading scores. Discussion. Findings identify inattention and WM as longitudinal predictors for math addition and subtraction fluency and math calculation outcomes one year later, with visual-spatial WM as a significant mediator for boys. Results highlight the close relationship between inattention and WM and their importance in the development of math skills. PMID:26038714
Sani, Claudia; Grilli, Leonardo
The performance of a school system can be evaluated through the learning levels of the pupils, usually summarized by school mean scores. The variability of the mean scores among schools is rarely studied in detail, though it is a crucial issue especially in primary schools: in fact, a high variability among schools raises doubts on the capacity of…
Rains, Cherri Sloan
This study was an investigation of the effectiveness of mathematics instruction using the interactive whiteboard (IWB) for 1, 2, and 3 years. Guided by Gagne's conditions of learning theory, this program evaluation study investigated the impact of receiving 1, 2, or 3 years of mathematics instruction using the IWB on mathematics scores on the…
Robert F. Bornstein
The degree to which projection plays a role in Rorschach (Rorschach, 1921/1942) responding remains controversial, in part because extant data have yielded inconclusive results. In this investigation, I examined the impact of social projection on Rorschach Oral Dependency (ROD) scores using methods adapted from social cognition research. In Study 1, I prescreened 85 college students (40 women and 45 men)
Harder, Valerie S.; Stuart, Elizabeth A.; Anthony, James C.
There is considerable interest in using propensity score (PS) statistical techniques to address questions of causal inference in psychological research. Many PS techniques exist, yet few guidelines are available to aid applied researchers in their understanding, use, and evaluation. In this study, the authors give an overview of available…
Background Recent advances in high-throughput technologies have produced a vast number of protein sequences, while the number of high-resolution structures has seen a limited increase. This has impelled the production of many strategies to build protein structures from their sequences, generating a considerable number of alternative models. The selection of the closest model to the native conformation has thus become crucial for structure prediction. Several methods have been developed to score protein models by energies, knowledge-based potentials, and combinations of both. Results Here, we present and demonstrate a theory to split the knowledge-based potentials into biologically meaningful scoring terms and to combine them in new scores to predict near-native structures. Our strategy allows circumventing the problem of defining the reference state. In this approach we give the proof for a simple and linear application that can be further improved by optimizing the combination of Zscores. Using the simplest composite score () we obtained predictions similar to state-of-the-art methods. Besides, our approach has the advantage of identifying the most relevant terms involved in the stability of the protein structure. Finally, we also use the composite Zscores to assess the conformation of models and to detect local errors. Conclusion We have introduced a method to split knowledge-based potentials and to solve the problem of defining a reference state. The new scores have detected near-native structures as accurately as state-of-the-art methods and have been successful in identifying wrongly modeled regions of many near-native conformations. PMID:19917096
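The idea of combining per-term Z-scores into one ranking score can be sketched as below. The abstract does not give the composite formula, so this is an assumption-laden illustration: each term is standardized across the candidate models themselves, the composite is an unweighted sum, and the term names and energy values are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize one scoring term across all candidate models."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]  # assumes the term is not constant


def composite_zscores(terms):
    """Sum the per-term Z-scores for each model (lower is taken to be better)."""
    per_term = [zscores(vals) for vals in terms.values()]
    return [sum(col) for col in zip(*per_term)]


# Hypothetical energies for four candidate models; the term names are illustrative only.
terms = {
    "pair": [-12.0, -9.5, -15.2, -8.1],
    "solvation": [-3.1, -2.0, -4.4, -2.6],
    "torsion": [1.2, 2.5, 0.4, 2.9],
}
scores = composite_zscores(terms)
best = min(range(len(scores)), key=scores.__getitem__)  # index of the lowest composite
```

Because each term is kept separate until the final sum, the per-term Z-scores can also be inspected individually to see which term drives a model's rank.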
Wilcox, Rand R.
A mastery test is frequently described as follows: an examinee responds to n dichotomously scored test items. Depending upon the examinee's observed (number correct) score, a mastery decision is made and the examinee is advanced to the next level of instruction. Otherwise, a nonmastery decision is made and the examinee is given remedial work. This…
García, R E
Chilean universities employ a common admission scoring system for students, based on high school grades, mathematics and verbal academic aptitude tests, and specific biology and social sciences tests. To determine the predictive value of these tests, the standardized scores obtained in the selection tests and the academic performance of 1094 first-year medical students, admitted in 1989 and 1990 to six universities, were analyzed. These students obtained high admission scores and their academic performance during the first year was low (mean grades ranged from 4.6 +/- 0.6 to 5.28 +/- 0.5 in different universities on a scale from 1 to 7). In all but one university, there was a correlation between admission scores and academic performance. Multiple regression analysis showed that admission scores explained 13% of performance and that the parameters with better predictive value were high school grades, the biology test, and the mathematics academic aptitude test. The verbal academic aptitude test did not have predictive value. PMID:7569443
Livingston, Samuel A.
To many people, standardized testing means multiple-choice testing. However, some tests contain questions that require the test taker to produce the answer, rather than simply choosing it from a list. The required response can be as simple as the writing of a single word or as complex as the design of a laboratory experiment to test a scientific…
85 boys in the New Jersey State Colony for the Feeble-minded were given the Goodenough test 3 times with 18 and 25 days between tests. The median increase and median decrease from test to test did not exceed one year. The test-retest reliability of the complete Goodenough scale ranged from .68 to .80, and the reliability of an abbreviated scale
Green, Donald Ross; Roudabush, Glenn E.
Scores on the Prescriptive Reading Inventory, the California Achievement Tests, 1970 Edition, and the Short Form Tests of Academic Aptitude were obtained for black pupils and representative samples of pupils in grades 1-3. These scores were compared in an attempt to assess bias in the Prescriptive Reading Inventory, a criterion-referenced…
Introduction COPD exacerbations have a negative impact on lung function, decrease quality of life (QoL) and increase the risk of death. The objective of this study was to assess the course of health status after an outpatient or inpatient exacerbation in patients with COPD. Methods This is an epidemiological, prospective, multicentre study that was conducted in 79 hospitals and primary care centres in Spain. Four hundred seventy-six COPD patients completed COPD assessment test (CAT) and Clinical COPD Questionnaire (CCQ) questionnaires during the 24 hours after presenting at hospital or primary care centres with symptoms of an exacerbation, and also at weeks 4–6. The scores from the CAT and CCQ were evaluated and compared at baseline and after recovery from the exacerbation. Results A total of 164 outpatients (33.7%) and 322 inpatients (66.3%) were included in the study. The majority were men (88.2%), the mean age was 69.4 years (SD = 9.5) and the mean FEV1 (%) was 47.7% (17.4%). During the exacerbation, patients presented high scores in the CAT: [mean: 22.0 (SD = 7.0)] and the CCQ: [mean: 4.4 (SD = 1.2)]. After recovery there was a significant reduction in the scores of both questionnaires [CAT: mean: -9.9 (SD = 5.1) and CCQ: mean: -3.1 (SD = 1.1)]. Both questionnaires showed a strong correlation during and after the exacerbation and the best predictor of the magnitude of improvement in the scores was the severity of each score at onset. Conclusions Due to their good correlation, CAT and CCQ can be useful tools to measure health status during an exacerbation and to evaluate recovery. However, new studies are necessary in order to identify which factors are influencing the course of the recovery of health status after a COPD exacerbation. PMID:23987232
Looney, Marilyn A.
Given that equating/linking applications are now appearing in kinesiology literature, this article provides an overview of the different types of linked test scores: equated, concordant, and predicted. It also addresses the different types of evidence required to determine whether the scores from two different field tests (measuring the same…
The measures of academic success used in this study were: mean college grade point average, value changes as measured by pre- and post-test scores on the Allport, Vernon, and Lindsey, "Study of Values," and the number of persons who fell into the following categories: dropouts, persons on academic probation, persons with grade point averages of…
Root Kustritz, Margaret V
Third-year veterinary students in a required theriogenology diagnostics course were allowed to self-select attendance at a lecture in either the evening or the next morning. One group was presented with PowerPoint slides in a traditional format (T group), and the other group was presented with PowerPoint slides in the assertion-evidence format (A-E group), which uses a single sentence and a highly relevant graphic on each slide to ensure attention is drawn to the most important points in the presentation. Students took a multiple-choice pre-test, attended lecture, and then completed a take-home assignment. All students then completed an online multiple-choice post-test and, one month later, a different online multiple-choice test to evaluate retention. Groups did not differ on pre-test, assignment, or post-test scores, and both groups showed significant gains from pre-test to post-test and from pre-test to retention test. However, the T group showed significant decline from post-test to retention test, while the A-E group did not. Short-term differences between slide designs were most likely unaffected due to required coursework immediately after lecture, but retention of material was superior with the assertion-evidence slide design. PMID:25000882
Hinderer, Katherine A; DiBartolo, Mary C; Walsh, Catherine M
In an effort to meet the demand for well-educated, high-quality nurses, schools of nursing seek to admit those candidates most likely to have both timely progression and first-time success on the National Council Licensure Examination for Registered Nurses (NCLEX-RN). Finding the right combination of academic indicators, which are most predictive of success, continues to be an ongoing challenge for entry-level baccalaureate nursing programs across the United States. This pilot study explored the relationship of a standardized admission examination, the Health Education Systems, Inc. (HESI) Admission Assessment (A(2)) Examination to preadmission grade point average (GPA), science GPA, and nursing GPA using a retrospective descriptive design. In addition, the predictive ability of the A(2) Examination, preadmission GPA, and science GPA related to timely progression and NCLEX-RN success were explored. In a sample of 89 students, no relationship was found between the A(2) Examination and preadmission GPA or science GPA. The A(2) Examination was correlated with nursing GPA and NCLEX-RN success but not with timely progression. Further studies are needed to explore the utility and predictive ability of standardized examinations such as the A(2) Examination and the contribution of such examinations to evidence-based admission decision making. PMID:25223292
Abella, Rodolfo; Urrutia, Joanne; Shneyderman, Aleksandr
Approximately 1,700 English language learners (ELLs) and former ELL students, in Grades 4 and 10, were tested using both an English-language (Stanford Achievement Test, 9th ed.) and a Spanish-language (Aprenda, 2nd ed.) achievement test. Their performances on the two tests were contrasted. The results showed that ELL students, for the most part,…
Armstrong, William B.
This is a study of the relationship between placement test scores and academic achievement, as measured by the gain in placement pre- and post-test scores after students completed a semester of English instruction. Two placement tests were administered to a cohort of students enrolling in community college English courses. Pre- and post-placement…
Whaley, Arthur L.; Noel, La Tonya
The present study tested the model minority and inferior minority assumptions by examining the relationship between academic performance and measures of behavioral health in a subsample of 3,008 (22%) participants in a nationally representative, multicultural sample of 13,601 students in the 2001 Youth Risk Behavioral Survey, comparing Asian…
Lievens, Filip; Sackett, Paul R.
This study provides conceptual and empirical arguments why an assessment of applicants' procedural knowledge about interpersonal behavior via a video-based situational judgment test might be valid for academic and postacademic success criteria. Four cohorts of medical students (N = 723) were followed from admission to employment. Procedural…
Li, Xin; Yan, Wenfan
This study followed the comparative research mode of description, interpretation, juxtaposition and comparison. Based on the literatures and data collected on the topic, the paper compared and analyzed the past, present and future of APTHS (academic proficiency test for high schools) in the two countries. Some contemplations on the common issues…
Ingrid Waldron; Ann Hickey; Cathy McPherson; Arthur Butensky; Leslie Gruss; Karen Overall; Angela Schmader; David Wohlmuth
This study has analyzed the relationships between the Type A, or coronary-prone, behavior pattern and several physiological, behavioral and psychosocial characteristics and also has tested whether an increase in academic demands was associated with an increase in students' Type A scores. The Type A behavior pattern, as assessed by the Type A score of the Jenkins Activity Survey, was not
Canivez, Gary L
The present study examined the incremental validity of Wechsler Adult Intelligence Scale-4th Edition (WAIS-IV; Wechsler, 2008a) factor index scores in predicting academic achievement on the Wechsler Individual Achievement Test-2nd Edition (WIAT-II; Psychological Corporation, 2002a) and on the Wechsler Individual Achievement Test-3rd Edition (WIAT-III; Wechsler, 2009a) beyond that predicted by the WAIS-IV Full Scale IQ (FSIQ). As with previous intelligence test incremental validity studies, the WAIS-IV FSIQ accounted for statistically significant and generally large portions of WIAT-II and WIAT-III subtest and composite score variance. WAIS-IV factor index scores combined to provide statistically significant increments in variance accounted for in most WIAT-II and WIAT-III subtest and composite scores over and above the FSIQ score; however, the effect sizes ranged from trivial to medium as observed in investigations with other intelligence tests (i.e., Glutting, Watkins, Konold, & McDermott, 2006; Youngstrom, Kogos, & Glutting, 1999). Individually, the WAIS-IV factor index scores provided trivial to small unique contributions to predicting WIAT-II and WIAT-III scores. This finding indicated that the FSIQ should retain primacy and greatest interpretive weight in WAIS-IV interpretation, as previously indicated by WAIS-IV subtest variance partitions from hierarchical exploratory factor analyses (Canivez & Watkins, 2010a, 2012b). PMID:23647042
Duckworth, Angela L; Quinn, Patrick D; Tsukayama, Eli
The increasing prominence of standardized testing to assess student learning motivated the current investigation. We propose that standardized achievement test scores assess competencies determined more by intelligence than by self-control, whereas report card grades assess competencies determined more by self-control than by intelligence. In particular, we suggest that intelligence helps students learn and solve problems independent of formal instruction, whereas self-control helps students study, complete homework, and behave positively in the classroom. Two longitudinal, prospective studies of middle school students support predictions from this model. In both samples, IQ predicted changes in standardized achievement test scores over time better than did self-control, whereas self-control predicted changes in report card grades over time better than did IQ. As expected, the effect of self-control on changes in report card grades was mediated in Study 2 by teacher ratings of homework completion and classroom conduct. In a third study, ratings of middle school teachers about the content and purpose of standardized achievement tests and report card grades were consistent with the proposed model. Implications for pedagogy and public policy are discussed. PMID:24072936
or earlier year scores are not on your UGA record. It is your responsibility to request AP scores and ensure that they have properly posted to your academic record. Scores are easily ordered through the Automated Score. If you ordered your scores more than 2 weeks ago but still do not see them on record, there may
Duckworth, Angela Lee; Seligman, Martin E. P.
Throughout elementary, middle, and high school, girls earn higher grades than boys in all major subjects. Girls, however, do not outperform boys on achievement or IQ tests. To date, explanations for the underprediction of girls' GPAs by standardized tests have focused on gender differences favoring boys on such tests. The authors' investigation…
A two-group randomized multivariate analysis of covariance (MANCOVA) was used to investigate the effects of cognitive-behavioral hypnosis in reducing test anxiety and improving academic performance in comparison to a Hawthorne control group. Subjects were enrolled in a rigorous introductory psychology course which covered an entire text in one…
Harder, Valerie S.; Stuart, Elizabeth A.; Anthony, James C.
There is considerable interest in using propensity score (PS) statistical techniques to address questions of causal inference in psychological research. Many PS techniques exist, yet few guidelines are available to aid applied researchers in their understanding, use and evaluation. This study gives an overview of available techniques for PS estimation and PS application. It also provides a way to help compare PS techniques, using the resulting measured covariate balance as the criterion for selecting between techniques. The empirical example for this study involves the potential causal relationship linking early-onset cannabis problems and subsequent negative mental health outcomes, using data from a prospective cohort study. PS techniques are described and evaluated based on their ability to balance the distributions of measured potentially confounding covariates for individuals with and without early-onset cannabis problems. This paper identifies the PS techniques that yield good statistical balance of the chosen measured covariates within the context of this particular research question and cohort. PMID:20822250
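The balance criterion described in this abstract can be illustrated with a standardized mean difference (SMD) computed before and after weighting. This is a sketch under stated assumptions, not the authors' procedure: the propensity scores are supplied directly rather than estimated, inverse-probability weighting stands in for whichever PS technique is being evaluated, the unweighted pooled standard deviation is used as the denominator, and the toy data and function names are hypothetical.

```python
from math import sqrt


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def variance(values):
    """Unweighted sample variance."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / (len(values) - 1)


def standardized_mean_difference(x, treated, weights=None):
    """(Weighted) difference in covariate means divided by the unweighted pooled SD."""
    if weights is None:
        weights = [1.0] * len(x)
    xt = [v for v, t in zip(x, treated) if t]
    xc = [v for v, t in zip(x, treated) if not t]
    wt = [w for w, t in zip(weights, treated) if t]
    wc = [w for w, t in zip(weights, treated) if not t]
    pooled_sd = sqrt((variance(xt) + variance(xc)) / 2)
    return (weighted_mean(xt, wt) - weighted_mean(xc, wc)) / pooled_sd


def ipw_weights(ps, treated):
    """Inverse-probability weights: 1/ps for exposed cases, 1/(1 - ps) for unexposed."""
    return [1 / p if t else 1 / (1 - p) for p, t in zip(ps, treated)]


# Toy data: a binary covariate that is far more common among exposed individuals.
x = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
treated = [1] * 10 + [0] * 10
ps = [0.8 if v == 1 else 0.2 for v in x]  # assumed known propensity scores

raw = standardized_mean_difference(x, treated)
balanced = standardized_mean_difference(x, treated, ipw_weights(ps, treated))
```

Comparing `raw` with `balanced` for each measured covariate gives a simple yardstick for choosing between PS techniques: the technique that brings the SMDs closest to zero yields the best measured balance.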
Gurnani, Ashita S; John, Samantha E; Gavett, Brandon E
The current study developed regression-based normative adjustments for a bi-factor model of the Brief Test of Adult Cognition by Telephone (BTACT). Archival data from the Midlife Development in the United States-II Cognitive Project were used to develop eight separate linear regression models that predicted bi-factor BTACT scores, accounting for age, education, gender, and occupation, alone and in various combinations. All regression models provided statistically significant fit to the data. A three-predictor regression model fit best and accounted for 32.8% of the variance in the global bi-factor BTACT score. The fit of the regression models was not improved by gender. Eight different regression models are presented to allow the user flexibility in applying demographic corrections to the bi-factor BTACT scores. Occupation corrections, while not widely used, may provide useful demographic adjustments for adult populations or for those individuals who have attained an occupational status not commensurate with expected educational attainment. PMID:25724515
Ward, Jennifer Henry
This study attempted to identify factors in seventh grade academics that are associated with overall success in tenth grade biology. The study addressed the following research questions: Are there significant differences in performance levels in seventh grade Criterion Referenced Competency Test (CRCT) scores in science, math, reading, and language arts associated with performance categories in tenth grade biology End of Course Test (EOCT) and the following demographic variables: gender, ethnicity, socioeconomic status, disability category, and English language proficiency level? Is there a relationship among the categorical variables on the tenth grade biology EOCT and the same five demographic variables? Retrospective causal comparative research was used on a representative sample from the middle schools in three North Georgia counties who took the four CRCTs in the 2006-2007 school year, and took the biology EOCT in the 2009-2010 school year. Chi square was used to determine the relationships of the various demographic variables on three biology EOCT performance categories. Two-way ANOVA determined relationships between the seventh grade CRCT scores of students in the various demographic groups and their performance levels on the biology EOCT. Students' performance levels on the biology EOCT matched their performance levels on the seventh grade CRCTs consistently. Females performed better than males on all seventh grade CRCTs. Black and Hispanic students did worse than White and Asian/Asian Indian students on the math CRCT. Students living in poverty did worse on reading and language arts CRCTs than students who were better off. Special education students did worse on science, reading, and language arts CRCTs than students not receiving special education services. English language learners did worse than native English speakers on all seventh grade CRCTs. These findings suggest that remedial measures may be taken in the seventh grade that could impact performance levels on the biology EOCT.
Bastian, Mauresa; Eggett, Dennis L; Jefferies, Laura K
Question placement and usage of pre-evaluation instructions (PEI) in questionnaires for food sensory analysis may bias consumers' scores via carry-over effects. Data from consumer sensory panels previously conducted at a central location, spanning 11 years and covering a broad range of food product categories, were compiled. Overall acceptance (OA) question placement was studied with categories designated as first (the first evaluation question following demographic questions), after nongustation questions (immediately following questions that do not require panelists to taste the product), and later (following all other hedonic and just-about-right [JAR] questions, but occasionally before ranking, open-ended comments, and/or intent to purchase questions). Each panel was categorized as having or not having PEI in the questionnaire; PEI are instructions that appear immediately before the first evaluation question and show panelists all attributes they will evaluate before receiving test samples. Postpanel surveys were administered regarding the self-reported effect of PEI on panelists' evaluation experience. OA scores were analyzed and compared (1) between OA question placement categories and (2) between panels with and without PEI. For most product categories, OA scores tended to be lower when asked later in the questionnaire, suggesting evidence of a carry-over effect. Usage of PEI increased OA scores by 0.10 of a 9-point hedonic scale point, which is not practically significant. Postpanel survey data showed that presence of PEI typically improved the panelists' experience. Using PEI does not appear to introduce a meaningful carry-over effect. PMID:25604650
Cummings, Steven R.; Sanders, Jason L.; Caserotti, Paolo; Rosano, Caterina; Satterfield, Suzanne; Strotmeyer, Elsa S.; Harris, Tamara B.; Simonsick, Eleanor M.; Cawthon, Peggy M.
Abstract Background Characterization of long-term health trajectory in older individuals is important for proactive health management. However, the relative prognostic value of information contained in clinical profiles of nonfrail older adults is often unclear. Methods We screened 825 phenotypic and genetic measures evaluated during the Health, Aging, and Body Composition Study (Health ABC) baseline visit (3,067 men and women aged 70–79). Variables that best predicted mortality over 13 years of follow-up were identified using 10-fold cross-validation. Results Mortality was most strongly associated with low Digit Symbol Substitution Test (DSST) score (DSST<25; 21.9% of cohort; hazard ratio [HR]=1.87±0.06) and elevated serum cystatin C (≥1.30 mg/mL; 12.1% of cohort; HR=2.25±0.07). These variables predicted mortality better than 823 other measures, including baseline age and a 45-variable health deficit index. Given elevated cystatin C (≥1.30 mg/mL), mortality risk was further increased by high serum creatinine, high abdominal visceral fat density, and smoking history (2.52 ≤ HR ≤ 3.73). Given a low DSST score (<25) combined with low-to-moderate cystatin C (<1.30 mg/mL), mortality risk was highest among those with elevated plasma resistin and smoking history (1.90 ≤ HR ≤ 2.02). Conclusions DSST score and serum cystatin C warrant priority consideration for the evaluation of mortality risk in older individuals. Both variables, taken individually, predict mortality better than chronological age or a health deficit index in well-functioning older adults (ages 70–79). DSST score and serum cystatin C can thus provide evidence-based tools for geriatric assessment. PMID:22607624
Pope, Gregory A.; Wentzel, Carolyn; Cammaert, Ron
Analysis of all January and June 2000 test scores for Alberta high school seniors found weak relationships between gender and both diploma examination and school-awarded scores. The largest statistical effect for gender was that the difference between the two sets of scores was greater for girls than boys, with school-awarded scores being higher.…
Churchwell, Dawn Earheart
This study examined the relationship between reading achievement and achievement in other subject areas. The purpose of this study was to determine if there was a correlation between reading scores as measured by the Standardized Test for the Assessment of Reading (STAR) and academic achievement in language arts, math, science, and social studies…
The purpose of this quantitative correlational research was to study the relationship between class size and students' academic achievement. Citywide language arts and math test scores for third and fifth grade students in four New York City public schools were examined using a variety of variables including (a) gender, (b) ethnicity, (c) grade…
Barclay, Sean M.; Stolte, Scott K.
Objectives. To determine a measurable definition of academic entitlement, measure academic entitlement in graduating doctor of pharmacy (PharmD) students, and compare the academic performance between students identified as more or less academically entitled. Methods. Graduating students at a private health sciences institution were asked to complete an electronic survey instrument that included demographic data, academic performance, and 2 validated academic entitlement instruments. Results. One hundred forty-one of 243 students completed the survey instrument. Fourteen (10%) students scored greater than the median total points possible on 1 or both of the academic entitlement instruments and were categorized as more academically entitled. Less academically entitled students required fewer reassessments and less remediation than more academically entitled students. The highest scoring academic entitlement items related to student perception of what professors should do for them. Conclusion. Graduating pharmacy students with lower levels of academic entitlement were more academically successful than more academically entitled students. Moving from an expert opinion approach to evidence-based decision-making in the area of academic entitlement will allow pharmacy educators to identify interventions that will decrease academic entitlement and increase academic success in pharmacy students. PMID:25147388
Griswold, Deborah E.; Barnhill, Gena P.; Myles, Brenda Smith; Hagiwara, Taku; Simpson, Richard L.
A study focused on identifying the academic characteristics of 21 children and youth who have Asperger syndrome. Students had an extraordinary range of academic achievement scores, extending from significantly above average to far below grade level. Lowest achievement scores were shown for numerical operations, listening comprehension, and written…
de la Torre, Jimmy; Patz, Richard J.
This article proposes a practical method that capitalizes on the availability of information from multiple tests measuring correlated abilities given in a single test administration. By simultaneously estimating different abilities with the use of a hierarchical Bayesian framework, more precise estimates for each ability dimension are obtained.…
Blanding, Joseph Dwayne
Standardized tests continue to be used in the United States to evaluate applicants for admission to most colleges and universities, which often results in less access for students--specifically students of color--who may have been inadequately prepared in grades K-12 for standardized testing. The purpose of this phenomenological case study was to…
New Jersey State Dept. of Education, Trenton.
The New Jersey High School Proficiency Test for grade 11 (HSPT11) replaced a similar requirement for grade 9 and became a graduation requirement in October 1993. As in the past, the writing test consists of a writing sample, which assesses student abilities to write sustained discourse, and a multiple-choice portion that assesses how well students…
VanderLaan, Ski R.
This mixed methods study (Creswell, 2008) was designed to test the influence of collaborative testing on learning using a quasi-experimental approach. This study used a modified embedded mixed method design in which the qualitative and quantitative data, associated with the secondary questions, provided a supportive role in a study based primarily…
This paper reviews findings from several studies that contribute to our understanding of cross-cultural differences in academic achievement, anxiety and self-doubt. The focus is on comparisons between Confucian Asian and European regions. Recent studies indicate that high academic achievement of students from Confucian Asian countries is…
Zhong, Ming; Zhang, Yiwei; Lange, Kenneth; Fan, Ruzong
In this article, we developed a cross-population comparison test statistic to detect chromosome regions in which there is no significant excess homozygosity in one population but homozygosity remains high in the other. We treated an extended stretch of homozygosity as a surrogate indicator of recent positive selection. Conditioned on existing linkage disequilibrium, we proposed to test the haplotype version of the Hardy–Weinberg equilibrium (HWE). For each population, we assumed that a random sample of unrelated individuals was typed on a large number of single nucleotide polymorphisms (SNPs). A pooled-test statistic was constructed by comparing the measurements of homozygosity of the two samples around a core SNP. In chromosome regions where HWE roughly holds in one population but not in the other, the pooled-test statistic yielded significant results, detecting the positive selection. We evaluated the performance of the test statistic by type I error comparison and power evaluation. We showed that the proposed test statistic was very conservative and had good power when the selected allele remains polymorphic. Then, we applied the test to HapMap Phase II data to make a comparison with previous results and to search for new candidate regions.
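As a rough illustration of the idea of testing for departure from HWE and measuring excess homozygosity, the sketch below works on genotype counts at a single biallelic SNP. This is a deliberately simplified, textbook single-SNP version and not the haplotype-based pooled statistic the abstract describes; the function names and the example counts are hypothetical.

```python
def allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of allele A estimated from genotype counts AA, AB, BB."""
    n = n_aa + n_ab + n_bb
    return (2 * n_aa + n_ab) / (2 * n)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square statistic comparing observed genotype counts
    with the counts expected under Hardy-Weinberg equilibrium.
    Assumes the SNP is polymorphic (allele frequency strictly between 0 and 1)."""
    n = n_aa + n_ab + n_bb
    p = allele_frequency(n_aa, n_ab, n_bb)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def excess_homozygosity(n_aa, n_ab, n_bb):
    """Observed homozygote fraction minus the fraction expected under HWE."""
    n = n_aa + n_ab + n_bb
    p = allele_frequency(n_aa, n_ab, n_bb)
    q = 1.0 - p
    return (n_aa + n_bb) / n - (p * p + q * q)


# Hypothetical genotype counts for the same SNP in two populations.
population_1 = (25, 50, 25)  # matches HWE proportions for p = 0.5
population_2 = (40, 20, 40)  # same allele frequency, far more homozygotes

for name, counts in (("population 1", population_1), ("population 2", population_2)):
    print(name, hwe_chi_square(*counts), excess_homozygosity(*counts))
```

A cross-population comparison in this simplified setting would contrast the two excess-homozygosity values; the method in the abstract instead works with haplotypes around a core SNP and conditions on existing linkage disequilibrium.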
The Indiana Statewide Testing for Educational Progress-Plus (ISTEP+) was designed to assess student mastery of key educational goals. The 5th grade ISTEP+ Science Test (5-GIST) is part of the ISTEP+ testing regime. The Indiana Academic Standards were developed to guide instruction in the state, and questions on the ISTEP+ were aligned with these standards. Since its inception, the use of the ISTEP+ exam has been changed to comply with the dictates of both Indiana Public Law 221 and the national No Child Left Behind act. With these modifications, the purpose of the tests has shifted from assessment of individual student academic progress to evaluation of the quality of the educational institution administering the tests. The validity of this use has never been established. The purpose of this study is to assess the validity of the 5-GIST as an instrument for assessing and forming judgments about the quality of science instruction in a particular school. ISTEP+ scores of 2 cohorts of students in a Midwestern school district were converted into Z-scores and tracked from 3rd to 5th grade. A regression line was established to account for the general aptitude and the socio-economic status (SES) of the students. Examining the residuals of the 5-GIST scores revealed that between 57% and 60% of the variance in the scores can be attributed to the general aptitude and SES of the students, leaving between 40% and 43% that can be interpreted as reflecting the effect of the school on student learning.
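The analysis described above (standardizing scores, regressing on aptitude and SES, and reading the unexplained variance from the residuals) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual procedure: the data are invented, the two predictors are hypothetical stand-ins for prior aptitude and SES, and the closed-form R² used here is the standard formula for a regression with exactly two predictors.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def correlation(x, y):
    """Pearson correlation computed as the mean product of z-scores."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


def r_squared_two_predictors(y, x1, x2):
    """Share of variance in y explained by a linear regression on x1 and x2.
    Assumes x1 and x2 are not perfectly correlated."""
    r_y1 = correlation(y, x1)
    r_y2 = correlation(y, x2)
    r_12 = correlation(x1, x2)
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Invented example data (not from the study).
prior_aptitude = [48, 55, 61, 52, 70, 66, 58, 45]   # e.g. 3rd grade scores
ses_index = [2, 3, 3, 1, 4, 2, 3, 1]                # hypothetical SES scale
science_score = [50, 57, 60, 49, 72, 61, 63, 44]    # e.g. 5th grade scores

explained = r_squared_two_predictors(science_score, prior_aptitude, ses_index)
print(f"explained by aptitude and SES: {explained:.2f}")
print(f"left in the residuals:        {1 - explained:.2f}")
```

The residual share is what the abstract interprets as the school effect; note that residual variance also contains measurement error, so that interpretation is itself an assumption of the study design.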
Background: Determining the day-to-day variation of circulating cathodic antigen (CCA) in urine and of egg counts in stool in Schistosoma mansoni (S. mansoni) infected individuals is vital to decide whether or not to rely on a single-sample test for diagnosis of schistosomiasis. In this study, the magnitude of day-to-day variation in urine-CCA test scores and in faecal egg counts was evaluated in school children in Ethiopia. Methods: A total of 620 school children (age 8 to 12 years) were examined for S. mansoni infection using double Kato-Katz and single urine-CCA cassette methods (batch 32727) on three consecutive days. Results: The prevalence of S. mansoni infection was 81.1% based on the triple urine-CCA cassette test and 53.1% based on six Kato-Katz thick smears. Among the study participants, 26.3% showed fluctuation in urine CCA and 32.4% showed fluctuation in egg output. Mean egg count, the number of cases in each class of intensity, and the intensity of cassette band color all varied over the three days of examination. Over 85% of the children whose S. mansoni infection status changed from day to day (from negative to positive or vice versa) by the Kato-Katz and CCA methods had light intensity of infection. The fluctuation in both the CCA test scores and faecal egg count was not associated with age or sex. Conclusions: The current study showed day-to-day variation in CCA and Kato-Katz test results of children infected with S. mansoni. This indicates that more than one urine or stool sample should be collected on different days for more reliable diagnosis of S. mansoni infection in low endemic areas. PMID:24742192
This paper investigates the credit scoring accuracy of five neural network models: multilayer perceptron, mixture-of-experts, radial basis function, learning vector quantization, and fuzzy adaptive resonance. The neural network credit scoring models are tested using 10-fold cross-validation with two real-world data sets. Results are benchmarked against more traditional methods under consideration for commercial applications including linear discriminant analysis, logistic regression,
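The 10-fold cross-validation procedure mentioned in the abstract above can be sketched as follows. This is a generic illustration, not the paper's experimental setup: the helper names are hypothetical, the data are invented, and a trivial majority-class rule stands in for the neural network and benchmark models.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(samples, labels, train_fn, predict_fn, k=10, seed=0):
    """Mean held-out accuracy over k folds.
    Each fold is held out once while the model is trained on the rest."""
    accuracies = []
    for held_out in k_fold_indices(len(samples), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in held]
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i]
                      for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


# Placeholder "model": always predict the most common training label.
def train_majority(train_samples, train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, sample):
    return model


# Invented data: 30 applicants, label 1 = good credit, 0 = bad credit.
applicants = [[income] for income in range(30)]
outcomes = [1 if income >= 10 else 0 for income in range(30)]

print(cross_validate(applicants, outcomes, train_majority, predict_majority))
```

Any model exposing a train and a predict function could replace the majority rule; comparing several models on the same folds (same `seed`) keeps the comparison fair.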
This article provides an introduction to the kind of computer software that is used to score student writing in some high stakes testing programs, and that is being promoted as a teaching and learning tool to schools. It sketches the state of play with machines for the scoring of writing, and describes how these machines work and what they do.…
This study uses analysis of covariance to determine which cognitive/learning factors (working memory, knowledge integration, epistemic belief of learning) or social/personality factors (test anxiety, performance-avoidance goals) might account for gender differences in SAT-V, SAT-M, and overall SAT scores. The results revealed that none of the cognitive/learning factors accounted for gender differences in SAT performance. However, the social/personality factors of test anxiety and performance-avoidance goals each separately accounted for all of the significant gender differences in SAT-V, SAT-M, and overall SAT performance. Furthermore, when the influences of both of these factors were statistically removed simultaneously, all non-significant gender differences reduced further to become trivial by Cohen's (1988) standards. Taken as a whole, these results suggest that gender differences in SAT-V, SAT-M, and overall SAT performance are a consequence of social/personality factors. PMID:23997382
Chomitz, Virginia R.; Slining, Meghan M.; McGowan, Robert J.; Mitchell, Suzanne E.; Dawson, Glen F.; Hacker, Karen A.
Objectives: To determine relationships between physical fitness and academic achievement in diverse, urban public school children. Methods: This cross-sectional study used public school data from 2004 to 2005. Academic achievement was assessed as a passing score on Massachusetts Comprehensive Assessment System (MCAS) achievement tests in…