Sample records for act test scores

  1. The validity of ACT-PEP test scores for predicting academic performance of registered nurses in BSN programs.

    PubMed

    Yang, J C; Noble, J

    1990-01-01

    This study investigated the validity of three American College Testing-Proficiency Examination Program (ACT-PEP) tests (Maternal and Child Nursing, Psychiatric/Mental Health Nursing, Adult Nursing) for predicting the academic performance of registered nurses (RNs) enrolled in bachelor's degree BSN programs nationwide. This study also examined RN students' performance on the ACT-PEP tests by their demographic characteristics: student's age, sex, race, student status (full- or part-time), and employment status (full- or part-time). The total sample for the three tests comprised 2,600 students from eight institutions nationwide. The median correlation coefficients between the three ACT-PEP tests and the semester grade point averages ranged from .36 to .56. Median correlation coefficients increased over time, supporting the stability of ACT-PEP test scores for predicting academic performance over time. The relative importance of selected independent variables for predicting academic performance was also examined; the most important variable for predicting academic performance was typically the ACT-PEP test score. Across the institutions, student demographic characteristics did not contribute significantly to explaining academic performance, over and above ACT-PEP scores.

  2. The Effects of Using Selected Metacognitive Strategies on ACT Mathematics Sub-Test Scores

    ERIC Educational Resources Information Center

    LeMay, Jeffrey W.

    2016-01-01

    This quasi-experimental post-test only control group designed quantitative study examined whether or not members of an experimental group of participants who utilized two metacognitive strategy training regimens experienced a significant increase in their ACT mathematics sub-test scores compared to a group of students who did not utilize either of…

  3. Development of Predicted Paths for ACT Aspire Score Reports. WP-2014-06

    ERIC Educational Resources Information Center

    Allen, Jeff

    2014-01-01

    This report focuses on the initial development of the predicted score paths for ACT Aspire reporting. The paths provide predicted score ranges for the next two years--as well as predicted ACT score ranges for tests administered at grades 9 and 10. Longitudinal ACT Aspire data for students tested in spring 2013 and spring 2014, as well as…

  4. The Effects of Process Oriented Guided Inquiry Learning on Secondary Student ACT Science Scores

    NASA Astrophysics Data System (ADS)

    Judd, William Lindsey

    The purpose of this study was to examine any significant difference on secondary school chemistry students' ACT Science Test scores between students taught by the Process Oriented Guided Inquiry Learning (POGIL) method versus students taught by traditional, teacher-centered pedagogy. This study also examined any difference between students taught by the POGIL method versus students taught by traditional, teacher-centered pedagogy in regard to the three different types of questions on the ACT Science Test: data representation, research summaries, and conflicting viewpoints. The sample consisted of sophomore-level students at two private, suburban Christian schools. A pretest-posttest design was used to compare the mean difference in scores from ACT issued sample test booklets before and after each group had received instruction via the POGIL method or more traditional methods. This study found that there was no significant difference in the mean difference of test scores between the two groups. This study also found that there was not a significant difference in the mean difference of scores in regard to the three different types of questions on the ACT Science Test. Further implications of this study are discussed.

  5. The effect of lab based instruction on ACT science scores

    NASA Astrophysics Data System (ADS)

    Hamilton, Michelle

    Standardized tests, although unpopular, are required for a multitude of reasons. One of these tests is the ACT. The ACT is a college readiness test that many high school juniors take to gain college admittance. Students throughout the United States are unprepared for this assessment. The average high school junior is three points behind twenty-four, the ACT-recommended score, for the science section. The science section focuses on reading text and interpreting graphs, charts, tables, and diagrams, with an emphasis on experimental design and relationships among variables. For students to become better at interpreting and understanding scientific graphics, they must have vast experience developing their own graphics. The purpose of this study was to provide students the opportunity to generate their own graphics to master interpretation of them on the ACT. According to a t-test, the results show that students who are continually exposed to creating graphs are able to understand and locate information from graphs at a significantly faster rate.

  6. Accountancy, teaching methods, sex, and American College Test scores.

    PubMed

    Heritage, J; Harper, B S; Harper, J P

    1990-10-01

    This study examines the significance of sex, methodology, academic preparation, and age as related to development of judgmental and problem-solving skills. Sex, American College Test (ACT) Mathematics scores, Composite ACT scores, grades in course work, grade point average (GPA), and age were used in studying the effects of teaching method on 96 students' ability to analyze data in financial statements. Results reflect positively on accounting students compared to the general college population and the women students in particular.

  7. Relationships of Declining Test Scores and Grade Inflation.

    ERIC Educational Resources Information Center

    Bellott, Fred K.

    The relationship between declining scores on national standardized tests and grade inflation is explored. Grade inflation refers to the indicated measure of evaluation of student performance having higher placement than is usual based on the performances. Data for this study were taken from the American College Testing (ACT) Program Class Profile…

  8. The Influence of Foreign Language Learning during Early Childhood on Standardized Test Scores

    ERIC Educational Resources Information Center

    Shaw, Tommetta

    2010-01-01

    Increasing standardized test scores in reading and math is of high importance to the California Department of Education to meet requirements mandated by the No Child Left Behind (NCLB) act of 2001. More research is needed to understand the best ways to improve test scores to meet concerns of the NCLB act. The purpose of the study was to evaluate…

  9. ACT Test Preparation Course and Its Impact on Students' College- and Career-Readiness

    ERIC Educational Resources Information Center

    Parrott, Timothy Nolan

    2012-01-01

    This study examined the effectiveness of an ACT intervention course developed for high school juniors at Anderson County High School during the 2011-2012 school year. This study compared the ACT composite test scores of the treatment group to the ACT composite test scores of the control group by using their PLAN scores as a baseline, to determine…

  10. Estimated Student Score Gain on the ACT COMP Exam: Valid Tool for Institutional Assessment?

    ERIC Educational Resources Information Center

    Banta, Trudy W.; And Others

    1987-01-01

    An institution can test seniors with the ACT College Outcome Measures Project (COMP) exam, then subtract from the senior score an estimated freshman score. Studies at the University of Tennessee, Knoxville, indicate that this method is not reliable to make judgments about the quality of general education programs. (Author/MLW)

  11. Use of Standardized Test Scores to Predict Success in a Computer Applications Course

    ERIC Educational Resources Information Center

    Harris, Robert V.; King, Stephanie B.

    2016-01-01

    The purpose of this study was to see if a relationship existed between American College Testing (ACT) scores (i.e., English, reading, mathematics, science reasoning, and composite) and student success in a computer applications course at a Mississippi community college. The study showed that while the ACT scores were excellent predictors of…

  12. The Dental Hygiene Aptitude Tests and the American College Testing Program Tests as Predictors of Scores on the National Board Dental Hygiene Examination.

    ERIC Educational Resources Information Center

    Longenbecker, Sueann; Wood, Peter H.

    1984-01-01

    Scores from the National Board Dental Hygiene Examination (NBDHE) served as the criterion variable in a comparison of the predictive validity of the Dental Hygiene Aptitude Tests (DHAT) and the ACT Assessment tests. The DHAT-Science and Verbal tests combined to produce the highest multiple correlation with NBDHE scores. (Author/DWH)

  13. The HISD Class of 1991: American College Testing Program (ACT).

    ERIC Educational Resources Information Center

    Ronacher, Karl; And Others

    This report analyzes the performance of students in the graduating class of 1991 of the Houston (Texas) Independent School District (HISD) who took the American College Testing Program (ACT) test. Eleven percent of the class of 1991, 796 students, graduated with ACT scores. Houston White, Black, and Mexican American students obtained higher…

  14. Do later wake times and increased sleep duration of 12th graders result in more studying, higher grades, and improved SAT/ACT test scores?

    PubMed

    Cole, James S

    2016-09-01

    The aim of this study was to investigate the relationship between sleep duration, wake time, and hours studying on high school grades and performance on the Scholastic Aptitude Test (SAT)/ American College Testing (ACT) college entrance exams. Data were collected from 13,071 recently graduated high school seniors who were entering college in the fall of 2014. A column proportions z test with a Bonferroni adjustment was used to analyze proportional differences. Analysis of covariance (ANCOVA) was used to examine mean group differences. Students who woke up prior to 6 a.m. and got less than 8 h of sleep (27 %) were significantly more likely to report studying 11 or more hours per week (30 %), almost double the rate compared to students who got more than 8 h of sleep and woke up the latest (16 %). Post hoc results revealed students who woke up at 7 a.m. or later reported significantly higher high school grades than all other groups (p < 0.001), with the exception of those students who woke up between 6:01 a.m. and 7:00 a.m. and got eight or more hours of sleep. The highest reported SAT/ACT scores were from the group that woke up after 7 a.m. but got less than 8 h sleep (M = 1099.5). Their scores were significantly higher than all other groups. This study provides additional evidence that increased sleep and later wake time are associated with increased high school grades. However, this study also found that students who sleep the longest also reported less studying and lower SAT/ACT scores.

  15. The Uses and Misuses of Test Scores: Technical Assistance Perspective.

    ERIC Educational Resources Information Center

    Echternacht, Gary

    The uses and misuses of standardized test results used for program evaluation as seen by a staff member of an Elementary Secondary Education Act (ESEA) Title I Technical Assistance Center are described. In ESEA Title I, test scores are used to select students for the program. Although federal requirements do not require using standardized test…

  16. ACT Average Composite by State: 2000 ACT-Tested Graduates.

    ERIC Educational Resources Information Center

    American Coll. Testing Program, Iowa City, IA.

    This table contains average composite scores by state for high school graduates who took the ACT Assessment in 2000. For each state the percentage of graduates taking the ACT Assessment and the average composite score are given, with the same information for those who completed the recommended core curriculum and those who did not, as well as for…

  17. Identifying Speech Acts in E-Mails: Toward Automated Scoring of the "TOEIC"® E-Mail Task. Research Report. ETS RR-12-16

    ERIC Educational Resources Information Center

    De Felice, Rachele; Deane, Paul

    2012-01-01

    This study proposes an approach to automatically score the "TOEIC"® Writing e-mail task. We focus on one component of the scoring rubric, which notes whether the test-takers have used particular speech acts such as requests, orders, or commitments. We developed a computational model for automated speech act identification and tested it…

  18. How Should Colleges Treat Multiple Admissions Test Scores? ACT Working Paper 2017-4

    ERIC Educational Resources Information Center

    Mattern, Krista; Radunzel, Justine; Bertling, Maria; Ho, Andrew

    2017-01-01

    The percentage of students retaking college admissions tests is rising (Harmston & Crouse, 2016). Researchers and college admissions offices currently use a variety of methods for summarizing these multiple scores. Testing companies, interested in validity evidence like correlations with college first-year grade-point averages (FYGPA), often…

  19. What We Lose in Winning the Test Score Race

    ERIC Educational Resources Information Center

    Jorgenson, Olaf

    2012-01-01

    To achieve perpetually better test results each year as mandated by the No Child Left Behind Act (NCLB), teachers in successful schools such as Leroy Anderson Elementary in San Jose, California, will "try anything" to raise scores, as the school's principal stated in an interview with "The San Jose Mercury News." In schools…

  20. Good Instructional Leadership: Principals' Actions to Increase Composite ACT School Scores

    ERIC Educational Resources Information Center

    Xu, Bo; Liu, Dongfang

    2016-01-01

    Due to increased college admission requirements and a 20-year flat-lined trend in ACT scores, it is imperative for education leaders across the nation to implement effective strategies to increase ACT composite scores. High school principals, as instructional leaders and decision makers, are the major stakeholders who are vested in the outcomes of…

  1. From High School to the Future: ACT Preparation--Too Much, Too Late. Why ACT Scores Are Low in Chicago and What It Means for Schools

    ERIC Educational Resources Information Center

    Allensworth, Elaine; Correa, Macarena; Ponisciak, Steve

    2008-01-01

    The majority of Chicago Public Schools (CPS) students are not attaining the ACT scores they are aiming for, which they need to qualify for scholarships and college acceptance. This report looks at the reasons behind students' low performance and what matters for doing well on this test. CPS students are highly motivated to do well on the ACT, and…

  2. ACT/SAT Test Preparation and Coaching Programs. What Works Clearinghouse Intervention Report

    ERIC Educational Resources Information Center

    What Works Clearinghouse, 2016

    2016-01-01

    Most colleges and universities in the United States require students to take the SAT or ACT as part of the college application process. These tests are high stakes in at least three ways. First, most universities factor scores on these tests into admissions decisions. Second, higher scores can increase a student's chances of being admitted to…

  3. Do Test Scores Buy Happiness?

    ERIC Educational Resources Information Center

    McCluskey, Neal

    2017-01-01

    Since at least the enactment of No Child Left Behind in 2002, standardized test scores have served as the primary measures of public school effectiveness. Yet, such scores fail to measure the ultimate goal of education: maximizing happiness. This exploratory analysis assesses nation level associations between test scores and happiness, controlling…

  4. A Study of the Predictability of Praxis I Examination Scores from ACT Scores and Teacher Education Program Prerequisite Courses

    ERIC Educational Resources Information Center

    Henderson, Allen R.

    2013-01-01

    This study investigated the relationship between student enrollment in certain college courses and Praxis I scores. Specifically, the study examined the predictive nature of the relationships between students' grades in college algebra, their freshman English course of choice, their ACT scores, and their Praxis I scores. The subjects consisted of…

  5. Predicting occupational personality test scores.

    PubMed

    Furnham, A; Drakeley, R

    2000-01-01

    The relationship between students' actual test scores and their self-estimated scores on the Hogan Personality Inventory (HPI; R. Hogan & J. Hogan, 1992), an omnibus personality questionnaire, was examined. Despite being given descriptive statistics and explanations of each of the dimensions measured, the students tended to overestimate their scores; yet all correlations between actual and estimated scores were positive and significant. Correlations between self-estimates and actual test scores were highest for sociability, ambition, and adjustment (r = .62 to r = .67). The results are discussed in terms of employers' use and abuse of personality assessment for job recruitment.

  6. Exploring a Source of Uneven Score Equity across the Test Score Range

    ERIC Educational Resources Information Center

    Huggins-Manley, Anne Corinne; Qiu, Yuxi; Penfield, Randall D.

    2018-01-01

    Score equity assessment (SEA) refers to an examination of population invariance of equating across two or more subpopulations of test examinees. Previous SEA studies have shown that score equity may be present for examinees scoring at particular test score ranges but absent for examinees scoring at other score ranges. No studies to date have…

  7. Methods for Improving Test Scores: The Good, the Bad, and the Ugly

    ERIC Educational Resources Information Center

    Wright, Robert J.

    2009-01-01

    The No Child Left Behind Act (NCLB 2001) has the faculties of every public and charter school scrambling to drive test scores of seven identified groups of children (African-American children, Anglo-White children, children with disabilities, Hispanic children, children of poverty, children with English language limitations, and Native-American…

  8. Do Examinees Understand Score Reports for Alternate Methods of Scoring Computer Based Tests?

    ERIC Educational Resources Information Center

    Whittaker, Tiffany A.; Williams, Natasha J.; Dodd, Barbara G.

    2011-01-01

    This study assessed the interpretability of scaled scores based on either number correct (NC) scoring for a paper-and-pencil test or one of two methods of scoring computer-based tests: an item pattern (IP) scoring method and a method based on equated NC scoring. The equated NC scoring method for computer-based tests was proposed as an alternative…

  9. How Accurate Is a Test Score?

    ERIC Educational Resources Information Center

    Doppelt, Jerome E.

    1956-01-01

    The standard error of measurement as a means for estimating the margin of error that should be allowed for in test scores is discussed. The true score measures the performance that is characteristic of the person tested; the variations, plus and minus, around the true score describe a characteristic of the test. When the standard deviation is used…
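
    A minimal sketch of the idea in this abstract, assuming the standard classical-test-theory formula SEM = SD × sqrt(1 − reliability), which the truncated abstract does not spell out. The function names and the numeric values (SD of 5.0, reliability of 0.91, observed score of 21) are hypothetical and chosen only for illustration.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Return SEM = sd * sqrt(1 - reliability) (classical test theory)."""
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed, sd, reliability, z=1.0):
    """Return (low, high): the observed score plus and minus z SEMs."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


# Hypothetical values, not taken from the record above.
sem = standard_error_of_measurement(sd=5.0, reliability=0.91)
low, high = score_band(observed=21, sd=5.0, reliability=0.91)
print(f"SEM = {sem:.2f}; band = {low:.2f} to {high:.2f}")
```

    With z = 1.0 the band covers roughly two thirds of the error distribution if errors are approximately normal; a wider band (for example z = 1.96) could be used for a more conservative margin.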

  10. See It, Be It, Write It: Using Performing Arts to Improve Writing Skills and Test Scores

    ERIC Educational Resources Information Center

    Blecher-Sass, Hope Sara; Moffitt, Maryellen

    2010-01-01

    Improve students' writing skills and boost their assessment scores while adding arts education, creativity, and fun to your writing curriculum. With this vibrant resource, improving writing skills goes hand-in-hand with improving test scores. Students learn how to use acting and visualization as prewriting activities to help them connect writing…

  11. What Do Test Score Really Mean? A Latent Class Analysis of Danish Test Score Performance

    ERIC Educational Resources Information Center

    McIntosh, James; Munk, Martin D.

    2014-01-01

    Latent class Poisson count models are used to analyse a sample of Danish test score results from a cohort of individuals born in 1954-1955, tested in 1968, and followed until 2011. The procedure takes account of unobservable effects as well as excessive zeros in the data. We show that the test scores measure manifest or measured ability as it has…

  12. Which Advanced Mathematics Courses Influence ACT Score? A State Level Analysis of the Iowa Class of 2012

    ERIC Educational Resources Information Center

    Grinstead, Mary L.

    2013-01-01

    This study explores the relationship between specific advanced mathematics courses and college readiness (as determined by ACT score). The ACT organization has found a consistent relationship between taking a minimum core number of mathematics courses and higher ACT scores (mathematics and composite) (ACT, Inc., 2012c). However, the extent to…

  13. Validating Test Score Meaning and Defending Test Score Use: Different Aims, Different Methods

    ERIC Educational Resources Information Center

    Cizek, Gregory J.

    2016-01-01

    Advances in validity theory and alacrity in validation practice have suffered because the term "validity" has been used to refer to two incompatible concerns: (1) the degree of support for specified interpretations of test scores (i.e. intended score meaning) and (2) the degree of support for specified applications (i.e. intended test…

  14. Pain scores for intravenous cannulation and arterial blood gas test among emergency department patients.

    PubMed

    Ballesteros-Peña, Sendoa; Vallejo-De la Hoz, Gorka; Fernández-Aedo, Irrintzi

    2017-12-23

    To analyse vein catheterisation and blood gas test-related pain among adult patients in the emergency department and to explore pain score-related factors. An observational and multicentre research study was performed. Patients undergoing vein catheterisation or arterial puncture for gas test were included consecutively. After each procedure, patients scored the pain experienced using the NRS-11. 780 vein catheterisations and 101 blood gas tests were analysed. Venipuncture was scored with an average score of 2.8 (95% CI: 2.6-3), and arterial puncture with 3.6 (95%CI 3.1-4). Iatrogenic pain scores were associated with moderate - high difficulty procedures (P<.001); with the choice of the humeral rather than the radial artery (P=.02) in the gas test and correlated to baseline pain in venipunctures (P<.001). Pain scores related to other variables such as sex, place of origin or needle gauge did not present statistically significant differences. Vein catheterisation and blood gas test-related pain can be considered mild to moderately and moderately painful procedures, respectively. The pain score is associated with certain variables such as the difficulty of the procedure, the anatomic area of the puncture or baseline pain. A better understanding of painful effects related to emergency nursing procedures and the factors associated with pain self-perception could help to determine when and how to act to mitigate this undesired effect. Copyright © 2017 Elsevier España, S.L.U. All rights reserved.

  15. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP

    ERIC Educational Resources Information Center

    Chudowsky, Naomi; Chudowsky, Victor

    2010-01-01

    In recent years, scores on the annual state reading and mathematics tests used for accountability have gone up in most states. These trends in state test scores do not always coincide, however, with trends on the National Assessment of Educational Progress (NAEP), the federally sponsored assessment that is administered periodically to…

  16. Estimating Total-Test Scores from Partial Scores in a Matrix Sampling Design.

    ERIC Educational Resources Information Center

    Sachar, Jane; Suppes, Patrick

    1980-01-01

    The present study compared six methods, two of which utilize the content structure of items, to estimate total-test scores using 450 students and 60 items of the 110-item Stanford Mental Arithmetic Test. Three methods yielded fairly good estimates of the total-test score. (Author/RL)

  17. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Washington

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Washington's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) decreased in grade 4 reading. In grade 4 math, the percentage scoring proficient on the state test decreased…

  18. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Utah

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Utah's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grade 8 reading. In grade 4 reading, the percentage scoring proficient on the state test showed a…

  19. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Arkansas

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Arkansas's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) went up in math at grades 4 and 8. In reading, the percentages scoring proficient on the state test went up at…

  20. EDUCATION AND PSYCHOLOGICAL TEST SCORES

    PubMed Central

    Pershad, Dwarka; Verma, S. K.

    1980-01-01

    Education, a long neglected variable affecting psychological test score, is in search of reemphasis. Some evidence for this has accumulated on the psychological tests constructed and standardized here at the department of Psychiatry, P.G.I., Chandigarh. Tentative norms prepared education wise on WAIS-Verbal section, PGI-Memory Scale, Proverb and Similarity Tests, Psychoticism Questionnaire, and PGI MQN 2, for adults, in the age range of 16-50, are reported. The results showed marked difference in the mean scores of different educational categories and thus stressed the need for reporting norms separately for different educational levels. PMID:22064617

  1. Prediction of true test scores from observed item scores and ancillary data.

    PubMed

    Haberman, Shelby J; Yao, Lili; Sinharay, Sandip

    2015-05-01

    In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE(®) General Analytical Writing and until 2009 in the case of TOEFL(®) iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater(®). In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability. © 2015 The British Psychological Society.

  2. Test/score/report: Simulation techniques for automating the test process

    NASA Technical Reports Server (NTRS)

    Hageman, Barbara H.; Sigman, Clayton B.; Koslosky, John T.

    1994-01-01

    A Test/Score/Report capability is currently being developed for the Transportable Payload Operations Control Center (TPOCC) Advanced Spacecraft Simulator (TASS) system which will automate testing of the Goddard Space Flight Center (GSFC) Payload Operations Control Center (POCC) and Mission Operations Center (MOC) software in three areas: telemetry decommutation, spacecraft command processing, and spacecraft memory load and dump processing. Automated computer control of the acceptance test process is one of the primary goals of a test team. With the proper simulation tools and user interface, the task of acceptance testing, regression testing, and repeatability of specific test procedures of a ground data system can be a simpler task. Ideally, the goal for complete automation would be to plug the operational deliverable into the simulator, press the start button, execute the test procedure, accumulate and analyze the data, score the results, and report the results to the test team along with a go/no-go recommendation. In practice, this may not be possible because of inadequate test tools, pressures of schedules, limited resources, etc. Most tests are accomplished using a certain degree of automation and test procedures that are labor intensive. This paper discusses some simulation techniques that can improve the automation of the test process. The TASS system tests the POCC/MOC software and provides a score based on the test results. The TASS system displays statistics on the success of the POCC/MOC system processing in each of the three areas as well as event messages pertaining to the Test/Score/Report processing. The TASS system also provides formatted reports documenting each step performed during the tests and the results of each step. A prototype of the Test/Score/Report capability is available and currently being used to test some POCC/MOC software deliveries. When this capability is fully operational it should greatly reduce the time necessary…

  3. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP

    ERIC Educational Resources Information Center

    Chudowsky, Naomi; Chudowsky, Victor

    2010-01-01

    This report compares state math and reading proficiency scores in grades 4 and 8 to National Assessment of Educational Progress (NAEP) basic scores for the period of 2005 to 2009. The study found that scores on state tests and NAEP have increased in most states with sufficient data. Also included with the report are profiles for the 23 states that…

  4. Estimating Total-test Scores from Partial Scores in a Matrix Sampling Design.

    ERIC Educational Resources Information Center

    Sachar, Jane; Suppes, Patrick

    It is sometimes desirable to obtain an estimated total-test score for an individual who was administered only a subset of the items in a total test. The present study compared six methods, two of which utilize the content structure of items, to estimate total-test scores using 450 students in grades 3-5 and 60 items of the 110-item Stanford Mental…

  5. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Ohio

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Ohio's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grade 4 reading and grade 8 math. In grade 8 reading, the percentage of students scoring proficient…

  6. Estimating the Reliability of a Test Battery Composite or a Test Score Based on Weighted Item Scoring

    ERIC Educational Resources Information Center

    Feldt, Leonard S.

    2004-01-01

    In some settings, the validity of a battery composite or a test score is enhanced by weighting some parts or items more heavily than others in the total score. This article describes methods of estimating the total score reliability coefficient when differential weights are used with items or parts.
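
    As an illustration of the kind of quantity this record discusses, the sketch below computes the reliability of a weighted composite using the textbook classical-test-theory formula, 1 - sum(w_i^2 * var_i * (1 - r_i)) / var_composite, which assumes uncorrelated measurement errors across parts. This is not necessarily one of the methods described in the article; the function name and the sample weights, covariances, and reliabilities are hypothetical.

```python
from typing import Sequence


def composite_reliability(
    weights: Sequence[float],
    cov: Sequence[Sequence[float]],
    reliabilities: Sequence[float],
) -> float:
    """Reliability of a weighted sum of parts (classical test theory).

    Assumes measurement errors are uncorrelated across parts.
    cov is the observed-score covariance matrix of the parts.
    """
    n = len(weights)
    # Observed variance of the weighted composite: w' * Cov * w
    composite_var = sum(
        weights[i] * weights[j] * cov[i][j] for i in range(n) for j in range(n)
    )
    # Error variance contributed by each part, scaled by its squared weight
    error_var = sum(
        weights[i] ** 2 * cov[i][i] * (1.0 - reliabilities[i]) for i in range(n)
    )
    return 1.0 - error_var / composite_var


# Hypothetical two-part battery, equal weights, parts correlated 0.5
print(composite_reliability([1.0, 1.0], [[1.0, 0.5], [0.5, 1.0]], [0.8, 0.8]))
```

    Raising the weight on the more reliable part generally raises the composite reliability, which is one reason differential weighting changes the coefficient.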

  7. ITC Guidelines on Quality Control in Scoring, Test Analysis, and Reporting of Test Scores

    ERIC Educational Resources Information Center

    Allalouf, Avi

    2014-01-01

    The Quality Control (QC) Guidelines are intended to increase the efficiency, precision, and accuracy of the scoring, analysis, and reporting process of testing. The QC Guidelines focus on large-scale testing operations where multiple forms of tests are created for use on set dates. However, they may also be used for a wide variety of other testing…

  8. [Validation of a Spanish version of the Childhood Asthma Control Test (Sc-ACT) for use in Spain].

    PubMed

    Pérez-Yarza, E G; Castro-Rodriguez, J A; Villa Asensi, J R; Garde Garde, J; Hidalgo Bermejo, F J

    2015-08-01

    The Childhood Asthma Control Test (c-ACT) is a validated tool for determining pediatric asthma control. However, it is not validated in the Spanish language in Spain. We evaluated the psychometric properties of the Spanish version of the Childhood Asthma Control Test (Sc-ACT) for assessing asthma control in children ages 4 to 11. This national, multicentre, prospective study was conducted in Spain with asthmatic children and their caregivers. Patients were assessed at 3 visits (Baseline, 2 Weeks, and 4 Months). Clinical variables included: symptoms, exacerbations, FEV1, asthma classification, PAQLQ and PACQLQ questionnaire scores, and asthma control as perceived by physicians, patients and caregivers. The Sc-ACT feasibility, validity, reliability, and sensitivity to change were assessed. A total of 394 children were included; mean (SD) time to complete the Sc-ACT was 5.3 (4.4) minutes. Sc-ACT score was correlated with asthma control as perceived by physician (-0.52), patient (-0.53), and caregiver (-0.51) and with the PAQLQ (0.56) and PACQLQ (0.55) scores. Sc-ACT was found to be significantly related to intensity and frequency of asthma symptoms. The Cronbach alpha coefficient was 0.81 and the intraclass correlation coefficient was ≥0.85 for all of the items. The global effect size of Sc-ACT was 0.55. A cutoff score of 21 or higher indicated good asthma control, and the MCID was 4 points. The Spanish version of the c-ACT was found to be a reliable and valid questionnaire for evaluating asthma control in Spanish-speaking children ages 4 to 11 in Spain. Copyright © 2014 Asociación Española de Pediatría. Published by Elsevier España, S.L.U. All rights reserved.
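
    The internal-consistency figure reported above is a Cronbach alpha coefficient. A minimal sketch of how such a coefficient is usually computed follows, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the response matrix are hypothetical and are not the study's data.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents
    item_var_sum = sum(variance(col) for col in zip(*rows))
    # Sample variance of each respondent's total score
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1.0 - item_var_sum / total_var)


# Hypothetical responses: 5 respondents, 4 items scored 0-3
responses = [
    [3, 3, 2, 3],
    [2, 2, 2, 1],
    [1, 0, 1, 1],
    [3, 2, 3, 3],
    [0, 1, 0, 1],
]
print(round(cronbach_alpha(responses), 3))
```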

  9. School accountability and the black-white test score gap.

    PubMed

    Gaddis, S Michael; Lauen, Douglas Lee

    2014-03-01

    Since at least the 1960s, researchers have closely examined the respective roles of families, neighborhoods, and schools in producing the black-white achievement gap. Although many researchers minimize the ability of schools to eliminate achievement gaps, the No Child Left Behind Act (NCLB) increased pressure on schools to do so by 2014. In this study, we examine the effects of NCLB's subgroup-specific accountability pressure on changes in black-white math and reading test score gaps using a school-level panel dataset on all North Carolina public elementary and middle schools between 2001 and 2009. Using difference-in-difference models with school fixed effects, we find that accountability pressure reduces black-white achievement gaps by raising mean black achievement without harming mean white achievement. We find no differential effects of accountability pressure based on the racial composition of schools, but schools with more affluent populations are the most successful at reducing the black-white math achievement gap. Thus, our findings suggest that school-based interventions have the potential to close test score gaps, but differences in school composition and resources play a significant role in the ability of schools to reduce racial inequality. Copyright © 2013 Elsevier Inc. All rights reserved.

  10. Summary of Score Changes (in other Tests).

    ERIC Educational Resources Information Center

    Cleary, T. Anne; McCandless, Sam A.

    Scholastic Aptitude Test (SAT) scores have declined during the last 14 years. Similar score declines have been observed in many different testing programs, many groups, and tested areas. The declines, while not large in any given year, have been consistent over time, area, and group. The period around 1965 is critical for the interpretation of…

  11. 21 CFR 1210.18 - Scoring.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Scoring. 1210.18 Section 1210.18 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) REGULATIONS UNDER... MILK ACT Inspection and Testing § 1210.18 Scoring. Scoring of sanitary conditions required by §§ 1210...

  12. 21 CFR 1210.18 - Scoring.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 21 Food and Drugs 8 2011-04-01 2011-04-01 false Scoring. 1210.18 Section 1210.18 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) REGULATIONS UNDER... MILK ACT Inspection and Testing § 1210.18 Scoring. Scoring of sanitary conditions required by §§ 1210...

  13. Testing Intelligently Includes Double-Checking Wechsler IQ Scores

    ERIC Educational Resources Information Center

    Kuentzel, Jeffrey G.; Hetterscheidt, Lesley A.; Barnett, Douglas

    2011-01-01

    The rigors of standardized testing make for numerous opportunities for examiner error, including simple computational mistakes in scoring. Although experts recommend that test scoring be double-checked, the extent to which independent double-checking would reduce scoring errors is not known. A double-checking procedure was established at a…

  14. A Comparison of an Introductory Course to SAT/ACT Scores in Predicting Student Performance

    ERIC Educational Resources Information Center

    Marsh, Crystale M.; Vandehey, Michael A.; Diekhoff, George M.

    2008-01-01

    We assessed students in General Psychology classes and examined their SAT/ACT scores, GPAs, and attempted and earned hours. Exams in General Psychology were superior to the SAT/ACT in predicting GPA, supporting the use of an introductory course as a "gateway" for identifying at-risk students and engaging them in academic services.…

  15. Test Scores and Stereotypes.

    ERIC Educational Resources Information Center

    Gose, Ben

    1995-01-01

    A psychologist's research suggests that black and female students may have lower standardized test scores and academic achievement because they have accepted stereotypes concerning their ability. Critics feel the researcher, Claude M. Steele, may be overlooking other factors. Steele has developed a program at Stanford University (California) to…

  16. Using Patterns of Summed Scores in Paper-and-Pencil Tests and Computer-Adaptive Tests to Detect Misfitting Item Score Patterns

    ERIC Educational Resources Information Center

    Meijer, Rob R.

    2004-01-01

    Two new methods have been proposed to determine unexpected sum scores on sub-tests (testlets) both for paper-and-pencil tests and computer adaptive tests. A method based on a conservative bound using the hypergeometric distribution, denoted p, was compared with a method where the probability for each score combination was calculated using a…

  17. Does Test Preparation Work? Implications for Score Validity

    ERIC Educational Resources Information Center

    Xie, Qin

    2013-01-01

    This article reports an empirical study that examined the pattern of test preparation for College English Test Band 4 (CET4) and the differential effects of test preparation practices on its scores, thereby drawing implications for CET4 score validity. Data collection involved 1,003 test takers of CET4. A pretest was administered at the beginning…

  18. Facilitating the Interpretation of English Language Proficiency Scores: Combining Scale Anchoring and Test Score Mapping Methodologies

    ERIC Educational Resources Information Center

    Powers, Donald; Schedl, Mary; Papageorgiou, Spiros

    2017-01-01

    The aim of this study was to develop, for the benefit of both test takers and test score users, enhanced "TOEFL ITP"® test score reports that go beyond the simple numerical scores that are currently reported. To do so, we applied traditional scale anchoring (proficiency scaling) to item difficulty data in order to develop performance…

  19. Per Pupil Expenditure, Graduation Rates, and ACT Scores in Tennessee School Districts

    ERIC Educational Resources Information Center

    Irvin, Jay Andrew

    2017-01-01

    The purpose of this study was to investigate and identify possible relationships between academic achievement, as measured by high school graduation rate and ACT composite scores of individual school districts within the state of Tennessee, and the per-pupil expenditure of each district. Research was conducted to determine whether a significant…

  20. The Truth about Scores Children Achieve on Tests.

    ERIC Educational Resources Information Center

    Brown, Jonathan R.

    1989-01-01

    The importance of using the standard error of measurement (SEm) in determining reliability in test scores is emphasized. The SEm is compared to the hypothetical true score for standardized tests, and procedures for calculation of the SEm are explained. (JDD)
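
    The abstract does not reproduce the calculation itself. Under classical test theory the SEm is commonly computed as SD * sqrt(1 - reliability), and a band around an observed score is then built from it. The sketch below shows that arithmetic; the function names and the example values (SD of 15, reliability of .91, observed score of 100) are illustrative assumptions, not figures from the article.

```python
import math
from typing import Tuple


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEm: SD * sqrt(1 - reliability)."""
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed: float, sem: float, z: float = 1.0) -> Tuple[float, float]:
    """Band of +/- z SEm around an observed score."""
    return observed - z * sem, observed + z * sem


sem = standard_error_of_measurement(sd=15.0, reliability=0.91)
low, high = score_band(observed=100.0, sem=sem, z=1.96)
print(f"SEm = {sem:.2f}; approximate 95% band = {low:.1f} to {high:.1f}")
```

    The band narrows as reliability rises, which is why a single observed score is better read as a range than as an exact value.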

  1. High-Stakes Testing and Student Achievement: Problems for the No Child Left Behind Act

    ERIC Educational Resources Information Center

    Nichols, Sharon L.; Glass, Gene V.; Berliner, David C.

    2005-01-01

    Under the federal No Child Left Behind Act of 2001 (NCLB), standardized test scores are the indicator used to hold schools and school districts accountable for student achievement. Each state is responsible for constructing an accountability system, attaching consequences--or stakes--for student performance. The theory of action implied by this…

  2. Categorical Differences in Statewide Standardized Testing Scores of Students with Disabilities

    ERIC Educational Resources Information Center

    Trexler, Ellen L.

    2013-01-01

    The No Child Left Behind Act requires all students be proficient in reading and mathematics by 2014, and students in subgroups to make Adequate Yearly Progress. One of these groups is students with disabilities, who continue to score well below their general education peers. This quantitative study identified scoring differences between disability…

  3. The Probability of Obtaining Two Statistically Different Test Scores as a Test Index

    ERIC Educational Resources Information Center

    Muller, Jorg M.

    2006-01-01

    A new test index is defined as the probability of obtaining two randomly selected test scores (PDTS) as statistically different. After giving a concept definition of the test index, two simulation studies are presented. The first analyzes the influence of the distribution of test scores, test reliability, and sample size on PDTS within classical…

  4. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Nevada

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Nevada's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP increased in grade 8 reading and math. Average annual gains were larger on the state test than on NAEP in both subjects. Trends in average (mean) test scores…

  5. Do Gains in Test Scores Explain Labor Market Outcomes?

    ERIC Educational Resources Information Center

    Rose, Heather

    2006-01-01

    Using data from the National Education Longitudinal Study of 1988, this article investigates whether students who made relatively large test score gains during high school had larger earnings 7 years after high school compared to students whose scores improved little. In models that control for pre-high school test scores, family background, and…

  6. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Louisiana

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Louisiana's test score trends through 2008-09. Between 2005 and 2009, trends on state tests and NAEP (National Assessment of Educational Progress) sometimes differed. On the state test, the percentages of students reaching the proficient level increased at grades 4 and 8 in both reading and math. On NAEP, the percentage of…

  7. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Tennessee

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Tennessee's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grade 8 reading and math. At grade 4, trends on the state test and NAEP differed somewhat. In…

  8. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Maryland

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Maryland's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased at grades 4 and 8 in both reading and math. Average annual gains were larger on the state test than…

  9. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Pennsylvania

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Pennsylvania's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grade 8 reading and math. Average annual gains were larger on the state test than on NAEP in…

  10. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Nebraska

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Nebraska's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the percentages reaching the basic level on NAEP (National Assessment of Educational Progress) increased at grade 4 in both reading and math. At grade 8, however, the percentages…

  11. Equating Scores from Adaptive to Linear Tests

    ERIC Educational Resources Information Center

    van der Linden, Wim J.

    2006-01-01

    Two local methods for observed-score equating are applied to the problem of equating an adaptive test to a linear test. In an empirical study, the methods were evaluated against a method based on the test characteristic function (TCF) of the linear test and traditional equipercentile equating applied to the ability estimates on the adaptive test…

  12. Stability of scores for the Slosson Full-Range Intelligence Test.

    PubMed

    Williams, Thomas O; Eaves, Ronald C; Woods-Groves, Suzanne; Mariano, Gina

    2007-08-01

    The test-retest stability of the Slosson Full-Range Intelligence Test by Algozzine, Eaves, Mann, and Vance was investigated with test scores from a sample of 103 students. With a mean interval of 13.7 mo. and different examiners for each of the two test administrations, the test-retest reliability coefficients for the Full-Range IQ, Verbal Reasoning, Abstract Reasoning, Quantitative Reasoning, and Memory were .93, .85, .80, .80, and .83, respectively. Mean differences from the test-retest scores were not statistically significantly different for any of the scales. Results suggest that Slosson scores are stable over time even when different examiners administer the test.

  13. Rocking at 81 and Rolling at 34: ROC Cut-Off Scores for the Negative Acts Questionnaire–Revised in Serbia

    PubMed Central

    Petrović, Ivana B.; Vukelić, Milica; Čizmić, Svetlana

    2017-01-01

    Researchers are still searching for ways to identify different categories of employees according to their exposure to negative acts and psychological experience of workplace bullying. We followed Notelaers and Einarsen’s application of the ROC analysis to determine the NAQ-R cut-off scores applying a “lower” and “higher” threshold. The main goal of this research was to develop and test different gold standards of personal and organizational relevance in determining the NAQ-R cut-off scores in the specific cultural and economic context of Serbia. Apart from combining self-labeling as a victim with self-perceived health, the objectives were to test the gold standards developed as a combination of self-labeling with life satisfaction, self-labeling with intention to leave and a complex gold standard based on self-labeling, self-perceived health, life satisfaction and intention to leave taken together. The ROC analysis on Serbian workforce data supports applying different gold standards. For identifying employees in a preliminary stage of bullying, the most applicable was the gold standard based on self-labeling and intention to leave (score 34 and higher). The most accurate identification of victims could be based on the most complex gold standard (score 81 and higher). This research encourages further investigation of gold standards in different cultures. PMID:28119652
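
    The abstract does not state which criterion was used to read a cut-off from the ROC curve. One common choice is Youden's J (sensitivity + specificity - 1), sketched below as an assumption rather than as the authors' procedure. The function name, scores, and gold-standard labels are hypothetical and are not the Serbian workforce data.

```python
from typing import Sequence, Tuple


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing Youden's J for the rule 'score >= cutoff'.

    labels: 1 = case according to the gold standard, 0 = non-case.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one case and one non-case")
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        sensitivity = tp / positives
        specificity = 1.0 - fp / negatives
        j = sensitivity + specificity - 1.0
        if j > best_j:  # ties keep the lowest cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical questionnaire totals and gold-standard labels
totals = [22, 25, 28, 30, 33, 36, 41, 47, 55, 62]
is_case = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
print(youden_cutoff(totals, is_case))
```

    Changing the gold standard changes the labels, and with them the cut-off, which is consistent with the record reporting different thresholds for different gold standards.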

  14. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Alaska

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Alaska's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grades 4 and 8 in math and grade 8 in reading. In grade 4 reading, the percentage reaching the…

  15. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Massachusetts

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Massachusetts' test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grade 4 reading and math and grade 8 math. Average annual gains were larger on the state test…

  16. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. California

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles California's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grades 4 and 8 in both reading and math. Average annual gains were larger on the state test…

  17. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Montana

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Montana's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grade 4 reading and math and grade 8 reading. In grade 8 math, however, the percentage proficient…

  18. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Colorado

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Colorado's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grades 4 and 8 in both reading and math. Average annual gains were generally larger on NAEP than…

  19. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Wisconsin

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Wisconsin's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in math at grades 4 and 8 and in reading at grade 8. In grade 4 reading, the percentage scoring…

  20. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Alabama

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Alabama's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grades 4 and 8 in both reading and math. Average annual gains were generally larger on the state…

  1. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Texas

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Texas' test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in reading at grades 4 and 8 and in math at grade 8. In grade 4 math, however, the percentage scoring…

  2. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Florida

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Florida's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grades 4 and 8 in both reading and math. Average annual gains were generally larger on the state…

  3. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Arizona

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Arizona's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grades 4 and 8 in both reading and math. Average annual gains were generally larger on the state…

  4. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Iowa

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles Iowa's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grade 4 reading and math and in grade 8 math. In grade 8 reading, the percentage of students reaching…

  5. Modified Balance Error Scoring System (M-BESS) test scores in athletes wearing protective equipment and cleats.

    PubMed

    Azad, Aftab Mohammad; Al Juma, Saad; Bhatti, Junaid Ahmad; Delaney, J Scott

    2016-01-01

    Balance testing is an important part of the initial concussion assessment. There is no research on the differences in Modified Balance Error Scoring System (M-BESS) scores when tested in real world as compared to control conditions. To assess the difference in M-BESS scores in athletes wearing their protective equipment and cleats on different surfaces as compared to control conditions. This cross-sectional study examined university North American football and soccer athletes. Three observers independently rated athletes performing the M-BESS test in three different conditions: (1) wearing shorts and T-shirt in bare feet on firm surface (control); (2) wearing athletic equipment with cleats on FieldTurf; and (3) wearing athletic equipment with cleats on firm surface. Mean M-BESS scores were compared between conditions. 60 participants were recruited: 39 from football (all males) and 21 from soccer (11 males and 10 females). Average age was 21.1 years (SD=1.8). Mean M-BESS scores were significantly lower (p<0.001) for cleats on FieldTurf (mean=26.3; SD=2.0) and for cleats on firm surface (mean=26.6; SD=2.1) as compared to the control condition (mean=28.4; SD=1.5). Females had lower scores than males for cleats on FieldTurf condition (24.9 (SD=1.9) vs 27.3 (SD=1.6), p=0.005). Players who had taping or bracing on their ankles/feet had lower scores when tested with cleats on firm surface condition (24.6 (SD=1.7) vs 26.9 (SD=2.0), p=0.002). Total M-BESS scores for athletes wearing protective equipment and cleats standing on FieldTurf or a firm surface are around two points lower than M-BESS scores performed on the same athletes under control conditions.

  6. Modified Balance Error Scoring System (M-BESS) test scores in athletes wearing protective equipment and cleats

    PubMed Central

    Azad, Aftab Mohammad; Al Juma, Saad; Bhatti, Junaid Ahmad; Delaney, J Scott

    2016-01-01

    Background Balance testing is an important part of the initial concussion assessment. There is no research on the differences in Modified Balance Error Scoring System (M-BESS) scores when tested in real world as compared to control conditions. Objective To assess the difference in M-BESS scores in athletes wearing their protective equipment and cleats on different surfaces as compared to control conditions. Methods This cross-sectional study examined university North American football and soccer athletes. Three observers independently rated athletes performing the M-BESS test in three different conditions: (1) wearing shorts and T-shirt in bare feet on firm surface (control); (2) wearing athletic equipment with cleats on FieldTurf; and (3) wearing athletic equipment with cleats on firm surface. Mean M-BESS scores were compared between conditions. Results 60 participants were recruited: 39 from football (all males) and 21 from soccer (11 males and 10 females). Average age was 21.1 years (SD=1.8). Mean M-BESS scores were significantly lower (p<0.001) for cleats on FieldTurf (mean=26.3; SD=2.0) and for cleats on firm surface (mean=26.6; SD=2.1) as compared to the control condition (mean=28.4; SD=1.5). Females had lower scores than males for cleats on FieldTurf condition (24.9 (SD=1.9) vs 27.3 (SD=1.6), p=0.005). Players who had taping or bracing on their ankles/feet had lower scores when tested with cleats on firm surface condition (24.6 (SD=1.7) vs 26.9 (SD=2.0), p=0.002). Conclusions Total M-BESS scores for athletes wearing protective equipment and cleats standing on FieldTurf or a firm surface are around two points lower than M-BESS scores performed on the same athletes under control conditions. PMID:27900181

  7. High-Stakes Testing and Student Achievement: Problems for the No Child Left Behind Act. Executive Summary

    ERIC Educational Resources Information Center

    Nichols, Sharon L.; Glass, Gene V.; Berliner, David C.

    2005-01-01

    Under the federal No Child Left Behind Act of 2001 (NCLB), standardized test scores are the indicator used to hold schools and school districts accountable for student achievement. Each state is responsible for constructing an accountability system, attaching consequences--or stakes--for student performance. The theory of action implied by this…

  8. THE EFFECTS ON ACHIEVEMENT TEST RESULTS OF VARYING CONDITIONS OF EXPERIMENTAL ATMOSPHERE, NOTICE OF TEST, TEST ADMINISTRATION, AND TEST SCORING.

    ERIC Educational Resources Information Center

    GOODWIN, WILLIAM L.; AND OTHERS

    NULL HYPOTHESES WERE TESTED TO DETERMINE THE DIFFERENTIAL EFFECTS OF (1) EXPERIMENTAL ATMOSPHERE AND ABSENCE OF SAME, (2) NOTICE OF TEST (10 SCHOOL DAYS) AND NO NOTICE (1 SCHOOL DAY), (3) TEACHER ADMINISTRATION AND OUTSIDE ADMINISTRATION OF TESTS, AND (4) TEACHER SCORING AND OUTSIDE SCORING OF TESTS. SIXTH-GRADE CLASSES (N=64), EACH FROM A…

  9. Score Equating and Nominally Parallel Language Tests.

    ERIC Educational Resources Information Center

    Moy, Raymond

    Score equating requires that the forms to be equated are functionally parallel. That is, the two test forms should rank order examinees in a similar fashion. In language proficiency testing situations, this assumption is often put into doubt because of the numerous tests that have been proposed as measures of language proficiency and the…

  10. Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions

    ERIC Educational Resources Information Center

    Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J.

    2010-01-01

    Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…

  11. The Correlation between Self-Determination and ACT Scores for High School Students with Disabilities

    ERIC Educational Resources Information Center

    Brown, Deitra Learchelle

    2017-01-01

    A significant gap exists between the graduation rate of students with disabilities and that of their nondisabled peers. The purpose of this quantitative correlation study was to determine the relationship self-determination had on college and career readiness using ACT scores of students with high-incidence disabilities. This study was guided by the following…

  12. Neuropsychological test scores, academic performance, and developmental disorders in Spanish-speaking children.

    PubMed

    Rosselli, M; Ardila, A; Bateman, J R; Guzmán, M

    2001-01-01

    Limited information is currently available about performance of Spanish-speaking children on different neuropsychological tests. This study was designed to (a) analyze the effects of age and sex on different neuropsychological test scores of a randomly selected sample of Spanish-speaking children, (b) analyze the value of neuropsychological test scores for predicting school performance, and (c) describe the neuropsychological profile of Spanish-speaking children with learning disabilities (LD). Two hundred ninety (141 boys, 149 girls) 6- to 11-year-old children were selected from a school in Bogotá, Colombia. Three age groups were distinguished: 6- to 7-, 8- to 9-, and 10- to 11-year-olds. Performance was measured utilizing the following neuropsychological tests: Seashore Rhythm Test, Finger Tapping Test (FTT), Grooved Pegboard Test, Children's Category Test (CCT), California Verbal Learning Test-Children's Version (CVLT-C), Benton Visual Retention Test (BVRT), and Bateria Woodcock Psicoeducativa en Español (Woodcock, 1982). Normative scores were calculated. Age effect was significant for most of the test scores. A significant sex effect was observed for 3 test scores. Intercorrelations were performed between neuropsychological test scores and academic areas (science, mathematics, Spanish, social studies, and music). In a post hoc analysis, children presenting very low scores on the reading, writing, and arithmetic achievement scales of the Woodcock battery were identified in the sample, and their neuropsychological test scores were compared with a matched normal group. Finally, a comparison was made between Colombian and American norms.

  13. Score Reporting for the 1991 Medical College Admission Test.

    ERIC Educational Resources Information Center

    Mitchell, Karen J.; Haynes, Robert

    1990-01-01

    Data used in a major review of the system for reporting scores on the Medical College Admission Test (MCAT) are presented and discussed. The data demonstrated the value of the current score-reporting system and led to retention of the 15-point MCAT score scale in 1991. (Author/MSE)

  14. Teacher Greetings Increase College Students' Test Scores

    ERIC Educational Resources Information Center

    Weinstein, Lawrence; Laverghetta, Antonio; Alexander, Ralph; Stewart, Megan

    2009-01-01

    The current study is an extension of a previous investigation dealing with teacher greetings to students. The present investigation used teacher greetings with college students and academic performance (test scores). We report data using university students and in-class test performance. Students in introductory psychology who received teachers'…

  15. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. New Mexico

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles New Mexico's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grade 4 math and grade 8 reading and math. In grade 4 reading, the percentage basic on NAEP …

  16. State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. North Dakota

    ERIC Educational Resources Information Center

    Center on Education Policy, 2010

    2010-01-01

    This paper profiles North Dakota's test score trends through 2008-09. Between 2005 and 2009, the percentage of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grades 4 and 8 in both reading and math. Average annual gains were larger on the state test…

  17. Predicting Long-Term College Success through Degree Completion Using ACT[R] Composite Score, ACT Benchmarks, and High School Grade Point Average. ACT Research Report Series, 2012 (5)

    ERIC Educational Resources Information Center

    Radunzel, Justine; Noble, Julie

    2012-01-01

    This study compared the effectiveness of ACT[R] Composite score and high school grade point average (HSGPA) for predicting long-term college success. Outcomes included annual progress towards a degree (based on cumulative credit-bearing hours earned), degree completion, and cumulative grade point average (GPA) at 150% of normal time to degree…

  18. A process dissociation approach to objective-projective test score interrelationships.

    PubMed

    Bornstein, Robert F

    2002-02-01

    Even when self-report and projective measures of a given trait or motive both predict theoretically related features of behavior, scores on the 2 tests correlate modestly with each other. This article describes a process dissociation framework for personality assessment, derived from research on implicit memory and learning, which can resolve these ostensibly conflicting results. Research on interpersonal dependency is used to illustrate 3 key steps in the process dissociation approach: (a) converging behavioral predictions, (b) modest test score intercorrelations, and (c) delineation of variables that differentially affect self-report and projective test scores. Implications of the process dissociation framework for personality assessment and test development are discussed.

  19. A prognostic scoring system for arm exercise stress testing.

    PubMed

    Xie, Yan; Xian, Hong; Chandiramani, Pooja; Bainter, Emily; Wan, Leping; Martin, Wade H

    2016-01-01

    Arm exercise stress testing may be an equivalent or better predictor of mortality outcome than pharmacological stress imaging for the ≥50% of patients unable to perform leg exercise. Thus, our objective was to develop an arm exercise ECG stress test scoring system, analogous to the Duke Treadmill Score, for predicting outcome in these individuals. In this retrospective observational cohort study, arm exercise ECG stress tests were performed in 443 consecutive veterans aged 64.1 (11.1) years (mean (SD)) between 1997 and 2002. From multivariate Cox models, arm exercise scores were developed for prediction of 5-year and 12-year all-cause and cardiovascular mortality and 5-year cardiovascular mortality or myocardial infarction (MI). Arm exercise capacity in resting metabolic equivalents (METs), 1 min heart rate recovery (HRR) and ST segment depression ≥1 mm were the stress test variables independently associated with all-cause and cardiovascular mortality by step-wise Cox analysis (all p<0.01). A score based on the relation HRR (bpm)+7.3×METs-10.5×ST depression (0=no; 1=yes) prognosticated 5-year cardiovascular mortality with a C-statistic of 0.81 before and 0.88 after adjustment for significant demographic and clinical covariates. Arm exercise scores for the other outcome end points yielded C-statistic values of 0.77-0.79 before and 0.82-0.86 after adjustment for significant covariates versus 0.64-0.72 for best fit pharmacological myocardial perfusion imaging models in a cohort of 1730 veterans who were evaluated over the same time period. Arm exercise scores, analogous to the Duke Treadmill Score, have good power for prediction of mortality or MI in patients who cannot perform leg exercise.
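
    The relation quoted in this abstract, HRR (bpm) + 7.3 × METs − 10.5 × ST depression, can be sketched as a small function. This is only an illustration of the arithmetic: the function name, argument names, and the patient values below are hypothetical, and the abstract gives no cut-offs for interpreting the result, so none are assumed here.

```python
def arm_exercise_score(hrr_bpm: float, mets: float, st_depression: bool) -> float:
    """Evaluate HRR (bpm) + 7.3 x METs - 10.5 x ST depression (0 = no, 1 = yes).

    hrr_bpm        1 min heart rate recovery, in beats per minute
    mets           arm exercise capacity, in resting metabolic equivalents
    st_depression  True if ST segment depression >= 1 mm was observed
    """
    st_term = 1 if st_depression else 0
    return hrr_bpm + 7.3 * mets - 10.5 * st_term


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the calculation.
    print(arm_exercise_score(hrr_bpm=20, mets=5.0, st_depression=False))
    print(arm_exercise_score(hrr_bpm=20, mets=5.0, st_depression=True))
```

    Under this relation, ST depression lowers the score by a fixed 10.5 points, while each additional MET raises it by 7.3.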

  20. Norm-Referenced Tests and Race-Blind Admissions: The Case for Eliminating the SAT and ACT at the University of California. Research & Occasional Paper Series: CSHE.15.17

    ERIC Educational Resources Information Center

    Geiser, Saul

    2017-01-01

    Of all college admission criteria, scores on nationally normed tests like the SAT and ACT are most affected by the socioeconomic background of the student. The effect of socioeconomic background on test scores has grown substantially at University of California over the past two decades, and tests have become more of a barrier to admission of…

  1. Test Operations Procedure (TOP) 03-2-827 Test Procedures for Video Target Scoring Using Calibration Lights

    DTIC Science & Technology

    2016-04-04

    This Test Operations Procedure (TOP) describes typical equipment and procedures to set up and operate a Video Target Scoring System (VTSS) to...lights. Subject terms: Video Target Scoring System, VTSS, witness screens, camera, target screen, light pole

  2. Semi-Quantitative Scoring of an Immunochromatographic Test for Circulating Filarial Antigen

    PubMed Central

    Chesnais, Cédric B.; Missamou, François; Pion, Sébastien D. S.; Bopda, Jean; Louya, Frédéric; Majewski, Andrew C.; Weil, Gary J.; Boussinesq, Michel

    2013-01-01

    The value of a semi-quantitative scoring of the filarial antigen test (Binax Now Filariasis card test, ICT) results was evaluated during a field survey in the Republic of Congo. One hundred and thirty-four (134) of 774 tests (17.3%) were clearly positive and were scored 1, 2, or 3; and 11 (1.4%) had questionable results. Wuchereria bancrofti microfilariae (mf) were detected in 41 of those 133 individuals with an ICT test score ≥ 1 who also had a night blood smear; none of the 11 individuals with questionable ICT results harbored night mf. Cuzick's test showed a significant trend for higher microfilarial densities in groups with higher ICT scores (P < 0.001). The ICT scores were also significantly correlated with blood mf counts. Because filarial antigen levels provide an indication of adult worm infection intensity, our results suggest that semi-quantitative reading of the ICT may be useful for grading the intensity of filarial infections in individuals and populations. PMID:24019435

  3. Selection Bias in College Admissions Test Scores

    ERIC Educational Resources Information Center

    Clark, Melissa; Rothstein, Jesse; Schanzenbach, Diane Whitmore

    2009-01-01

    Data from college admissions tests can provide a valuable measure of student achievement, but the non-representativeness of test-takers is an important concern. We examine selectivity bias in both state-level and school-level SAT and ACT averages. The degree of selectivity may differ importantly across and within schools, and across and within…

  4. Effect of two additional interventions, test and reflection, added to standard cardiopulmonary resuscitation training on seventh grade students' practical skills and willingness to act: a cluster randomised trial.

    PubMed

    Nord, Anette; Hult, Håkan; Kreitz-Sandberg, Susanne; Herlitz, Johan; Svensson, Leif; Nilsson, Lennart

    2017-06-23

    The aim of this research is to investigate if two additional interventions, test and reflection, after standard cardiopulmonary resuscitation (CPR) training facilitate learning by comparing 13-year-old students' practical skills and willingness to act. Seventh grade students in council schools of two municipalities in south-east Sweden. School classes were randomised to CPR training only (O), CPR training with a practical test including feedback (T) or CPR training with reflection and a practical test including feedback (RT). Measures of practical skills and willingness to act in a potential life-threatening situation were studied directly after training and at 6 months using a digital reporting system and a survey. A modified Cardiff test was used to register the practical skills, where scores in each of 12 items resulted in a total score of 12-48 points. The study was conducted in accordance with current European Resuscitation Council guidelines during December 2013 to October 2014. 29 classes for a total of 587 seventh grade students were included in the study. The total score of the modified Cardiff test at 6 months was the primary outcome. Secondary outcomes were the total score directly after training, the 12 individual items of the modified Cardiff test and willingness to act. At 6 months, the T and O groups scored 32 (3.9) and 30 (4.0) points, respectively (p<0.001), while the RT group scored 32 (4.2) points (not significant when compared with T). There were no significant differences in willingness to act between the groups after 6 months. A practical test including feedback directly after training improved the students' acquisition of practical CPR skills. Reflection did not increase further CPR skills. At 6-month follow-up, no intervention effect was found regarding willingness to make a life-saving effort.

  5. Effect of two additional interventions, test and reflection, added to standard cardiopulmonary resuscitation training on seventh grade students’ practical skills and willingness to act: a cluster randomised trial

    PubMed Central

    Nord, Anette; Hult, Håkan; Kreitz-Sandberg, Susanne; Herlitz, Johan; Svensson, Leif; Nilsson, Lennart

    2017-01-01

    Objectives: The aim of this research is to investigate if two additional interventions, test and reflection, after standard cardiopulmonary resuscitation (CPR) training facilitate learning by comparing 13-year-old students’ practical skills and willingness to act. Settings: Seventh grade students in council schools of two municipalities in south-east Sweden. Design: School classes were randomised to CPR training only (O), CPR training with a practical test including feedback (T) or CPR training with reflection and a practical test including feedback (RT). Measures of practical skills and willingness to act in a potential life-threatening situation were studied directly after training and at 6 months using a digital reporting system and a survey. A modified Cardiff test was used to register the practical skills, where scores in each of 12 items resulted in a total score of 12–48 points. The study was conducted in accordance with current European Resuscitation Council guidelines during December 2013 to October 2014. Participants: 29 classes for a total of 587 seventh grade students were included in the study. Primary and secondary outcome measures: The total score of the modified Cardiff test at 6 months was the primary outcome. Secondary outcomes were the total score directly after training, the 12 individual items of the modified Cardiff test and willingness to act. Results: At 6 months, the T and O groups scored 32 (3.9) and 30 (4.0) points, respectively (p<0.001), while the RT group scored 32 (4.2) points (not significant when compared with T). There were no significant differences in willingness to act between the groups after 6 months. Conclusions: A practical test including feedback directly after training improved the students’ acquisition of practical CPR skills. Reflection did not increase further CPR skills. At 6-month follow-up, no intervention effect was found regarding willingness to make a life-saving effort. PMID:28645953

  6. Critical Thinking: More than Test Scores

    ERIC Educational Resources Information Center

    Smith, Vernon G.; Szymanski, Antonia

    2013-01-01

    This article is for practicing or aspiring school administrators. The demand for excellence in public education has led to an emphasis on standardized test scores. This article explores the development of a professional enhancement program designed to prepare teachers to teach higher order thinking skills. Higher order thinking is the primary…

  7. The Black-White Test Score Gap.

    ERIC Educational Resources Information Center

    Jencks, Christopher, Ed.; Phillips, Meredith, Ed.

    The 15 chapters of this book address issues related to the continuing test score gap between black and white students. The editors argue against traditional explanations which emphasize differences in economic resources and demographic factors, and they urge that more emphasis be put on psychological and cultural factors. The book suggests studies…

  8. Test Takers and the Validity of Score Interpretations

    ERIC Educational Resources Information Center

    Kopriva, Rebecca J.; Thurlow, Martha L.; Perie, Marianne; Lazarus, Sheryl S.; Clark, Amy

    2016-01-01

    This article argues that test takers are as integral to determining validity of test scores as defining target content and conditioning inferences on test use. A principled sustained attention to how students interact with assessment opportunities is essential, as is a principled sustained evaluation of evidence confirming the validity or calling…

  9. 21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... § 866.6050 Ovarian adnexal mass assessment score test system. (a) Identification. An ovarian/adnexal mass assessment test system is a device that measures one or more proteins in serum or...

  10. ANOVA Analysis of Student Daily Test Scores in Multi-Day Test Periods

    ERIC Educational Resources Information Center

    Mouritsen, Matthew L.; Davis, Jefferson T.; Jones, Steven C.

    2016-01-01

    Instructors are often concerned when giving multiple-day tests because students taking the test later in the exam period may have an advantage over students taking the test early in the exam period due to information leakage. However, exam scores seemed to decline as students took the same test later in a multi-day exam period (Mouritsen and…

  11. Scoring Yes-No Vocabulary Tests: Reaction Time vs. Nonword Approaches

    ERIC Educational Resources Information Center

    Pellicer-Sanchez, Ana; Schmitt, Norbert

    2012-01-01

    Despite a number of research studies investigating the Yes-No vocabulary test format, one main question remains unanswered: What is the best scoring procedure to adjust for testee overestimation of vocabulary knowledge? Different scoring methodologies have been proposed based on the inclusion and selection of nonwords in the test. However, there…

  12. Increased correlation coefficient between the written test score and tutors' performance test scores after training of tutors for assessment of medical students during problem-based learning course in Malaysia.

    PubMed

    Jaiprakash, Heethal; Min, Aung Ko Ko; Ghosh, Sarmishtha

    2016-03-01

    This paper is aimed at finding if there was a change of correlation between the written test score and tutors' performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group's tutors did not receive tutor training; while the second group's tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors' performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors' scores in group 1 was 0.099 (p<0.001) and for group 2 was 0.305 (p<0.001). The higher correlation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course.

  13. The Effect of Pretest Exercise on Baseline Computerized Neurocognitive Test Scores.

    PubMed

    Pawlukiewicz, Alec; Yengo-Kahn, Aaron M; Solomon, Gary

    2017-10-01

    Baseline neurocognitive assessment plays a critical role in return-to-play decision making following sport-related concussions. Prior studies have assessed the effect of a variety of modifying factors on neurocognitive baseline test scores. However, relatively little investigation has been conducted regarding the effect of pretest exercise on baseline testing. The aim of our investigation was to determine the effect of pretest exercise on baseline Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) scores in adolescent and young adult athletes. We hypothesized that athletes undergoing self-reported strenuous exercise within 3 hours of baseline testing would perform more poorly on neurocognitive metrics and would report a greater number of symptoms than those who had not completed such exercise. Cross-sectional study; Level of evidence, 3. The ImPACT records of 18,245 adolescent and young adult athletes were retrospectively analyzed. After application of inclusion and exclusion criteria, participants were dichotomized into groups based on a positive (n = 664) or negative (n = 6609) self-reported history of strenuous exercise within 3 hours of the baseline test. Participants with a positive history of exercise were then randomly matched, based on age, sex, education level, concussion history, and hours of sleep prior to testing, on a 1:2 basis with individuals who had reported no pretest exercise. The baseline ImPACT composite scores of the 2 groups were then compared. Significant differences were observed for the ImPACT composite scores of verbal memory, visual memory, reaction time, and impulse control as well as for the total symptom score. No significant between-group difference was detected for the visual motor composite score. Furthermore, pretest exercise was associated with a significant increase in the overall frequency of invalid test results. Our results suggest a statistically significant difference in ImPACT composite scores between

  14. [Scores and stages in pneumology].

    PubMed

    Kuhn, Max

    2013-10-01

    Useful scales and classifications for patients with pulmonary diseases are discussed. The modified Medical Research Council breathlessness scale (mMRC) is a measure of disability in lung patients. The GOLD classifications, the COPD-Assessment Test (CAT) and the BODE Index are important to classify the severity of COPD and to measure the disability of these patients. The Geneva score is a clinical prediction rule used in determining the pre-test probability of pulmonary embolism. The Pulmonary Embolism Severity Index (PESI) is a scoring system used to predict 30 day mortality in patients with pulmonary embolism. The Epworth Sleepiness Scale is intended to measure daytime sleepiness in patients with sleep apnea syndrome. The Asthma Control Test (ACT) determines if asthma symptoms are well controlled.

  15. Observed-Score Equating as a Test Assembly Problem.

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Luecht, Richard M.

    1998-01-01

    Derives a set of linear conditions of item-response functions that guarantees identical observed-score distributions on two test forms. The conditions can be added as constraints to a linear programming model for test assembly. An example illustrates the use of the model for an item pool from the Law School Admissions Test (LSAT). (SLD)

  16. A Review of Scoring Algorithms for Ability and Aptitude Tests.

    ERIC Educational Resources Information Center

    Chevalier, Shirley A.

    In conventional practice, most educators and educational researchers score cognitive tests using a dichotomous right-wrong scoring system. Although simple and straightforward, this method does not take into consideration other factors, such as partial knowledge or guessing tendencies and abilities. This paper discusses alternative scoring models:…

  17. Score tests for independence in semiparametric competing risks models.

    PubMed

    Saïd, Mériem; Ghazzali, Nadia; Rivest, Louis-Paul

    2009-12-01

    A popular model for competing risks postulates the existence of a latent unobserved failure time for each risk. Assuming that these underlying failure times are independent is attractive since it allows standard statistical tools for right-censored lifetime data to be used in the analysis. This paper proposes simple independence score tests for the validity of this assumption when the individual risks are modeled using semiparametric proportional hazards regressions. It assumes that covariates are available, making the model identifiable. The score tests are derived for alternatives that specify that copulas are responsible for a possible dependency between the competing risks. The test statistics are constructed by adding to the partial likelihoods for the individual risks an explanatory variable for the dependency between the risks. A variance estimator is derived by writing the score function and the Fisher information matrix for the marginal models as stochastic integrals. Pitman efficiencies are used to compare test statistics. A simulation study and a numerical example illustrate the methodology proposed in this paper.

  18. Cognitive function and unsafe driving acts during an on-road test among community-dwelling older adults with cognitive impairments.

    PubMed

    Hotta, Ryo; Makizako, Hyuma; Doi, Takehiko; Tsutsumimoto, Kota; Nakakubo, Sho; Makino, Keitaro; Shimada, Hiroyuki

    2018-02-19

    To examine the relationship between cognitive function and unsafe driving acts among community-dwelling older adults with cognitive impairments. Participants (n = 160) were older residents of Obu, Japan, aged ≥65 years with cognitive impairments. They regularly drove and were assessed for the number of unsafe driving acts without adequate verification during an on-road test. We also evaluated cognitive function (attention, executive function and processing speed). Other examined variables included demographics, driving characteristics and visual condition. Participants were classified into two groups according to the number of unsafe driving acts as follows: high group (≥4 unsafe driving acts) and low group (≤3 unsafe driving acts). The high group participants were older in age (P < 0.001) and obtained a lower score on the symbol digit substitution task (P = 0.002) than the low group. The number of unsafe driving acts showed modest significant positive correlations with age (r = 0.396, P < 0.001). The symbol digit substitution task score was significantly associated with the number of unsafe driving acts (β = -0.196, P < 0.05) after adjusting for age group. Processing speed was associated with unsafe driving acts that became worse with increasing age. Future study will be required to longitudinally examine the influence of processing speed on traffic accidents for those with cognitive impairments. Geriatr Gerontol Int 2018.

  19. D.C. Student Test Scores Show Uneven Progress. Data Snapshot

    ERIC Educational Resources Information Center

    DuPre, Mary

    2011-01-01

    Over the past five years, both DC Public Schools (DCPS) and public charter schools (PCS) have seen significant growth in secondary reading and math scores on the state test known as the District of Columbia Comprehensive Assessment System (DC CAS). However, scores have not improved as much at the elementary level. Reading and math scores for DCPS…

  20. Reliability of Total Test Scores When Considered as Ordinal Measurements

    ERIC Educational Resources Information Center

    Biswas, Ajoy Kumar

    2006-01-01

    This article studies the ordinal reliability of (total) test scores. This study is based on a classical-type linear model of observed score (X), true score (T), and random error (E). Based on the idea of Kendall's tau-a coefficient, a measure of ordinal reliability for small-examinee populations is developed. This measure is extended to large…

  1. Correlation of Simulation Examination to Written Test Scores for Advanced Cardiac Life Support Testing: Prospective Cohort Study.

    PubMed

    Strom, Suzanne L; Anderson, Craig L; Yang, Luanna; Canales, Cecilia; Amin, Alpesh; Lotfipour, Shahram; McCoy, C Eric; Osborn, Megan Boysen; Langdorf, Mark I

    2015-11-01

    Traditional Advanced Cardiac Life Support (ACLS) courses are evaluated using written multiple-choice tests. High-fidelity simulation is a widely used adjunct to didactic content, and has been used in many specialties as a training resource as well as an evaluative tool. There are no data to our knowledge that compare simulation examination scores with written test scores for ACLS courses. To compare and correlate a novel high-fidelity simulation-based evaluation with traditional written testing for senior medical students in an ACLS course. We performed a prospective cohort study to determine the correlation between simulation-based evaluation and traditional written testing in a medical school simulation center. Students were tested on a standard acute coronary syndrome/ventricular fibrillation cardiac arrest scenario. Our primary outcome measure was correlation of exam results for 19 volunteer fourth-year medical students after a 32-hour ACLS-based Resuscitation Boot Camp course. Our secondary outcome was comparison of simulation-based vs. written outcome scores. The composite average score on the written evaluation was substantially higher (93.6%) than the simulation performance score (81.3%, absolute difference 12.3%, 95% CI [10.6-14.0%], p<0.00005). We found a statistically significant moderate correlation between simulation scenario test performance and traditional written testing (Pearson r=0.48, p=0.04), validating the new evaluation method. Simulation-based ACLS evaluation methods correlate with traditional written testing and demonstrate resuscitation knowledge and skills. Simulation may be a more discriminating and challenging testing method, as students scored higher on written evaluation methods compared to simulation.

  2. Between-District Test Score Variation, 2009-2012

    ERIC Educational Resources Information Center

    Fahle, Erin; Reardon, Sean

    2016-01-01

    Describing the variation in test scores between and within school districts is critical: (1) for policy-related and descriptive work that investigates the sorting of students among districts and the differential effectiveness of those districts; and (2) for methodological work planning future experiments or interventions. Intraclass…

  3. The Persisting Racial Scoring Gap on Graduate and Professional School Admission Tests.

    ERIC Educational Resources Information Center

    Journal of Blacks in Higher Education, 2003

    2003-01-01

    Discusses the racial scoring gap on tests for admission to medical, business, law, and other graduate programs, noting that in the highest-scoring brackets on the Medical College Admission Test (MCAT), the racial gap is even larger. Whites are five times, twelve times, and seven times more likely, respectively, to score higher on the MCAT, Law…

  4. Comparability of IQ Scores on Five Widely Used Intelligence Tests

    ERIC Educational Resources Information Center

    Hieronymus, A. N.; Stroud, James B.

    1969-01-01

    Attempts to fill research gap on testing by obtaining comparisons of deviation scores, at grade levels four, seven, and ten, from the California Test of Mental Maturity, Henmon-Nelson Tests, and Lorge-Thorndike Intelligence tests. Results tabulated. (CJ)

  5. Sex Differences in Cognitive Abilities Test Scores: A UK National Picture

    ERIC Educational Resources Information Center

    Strand, Steve; Deary, Ian J.; Smith, Pauline

    2006-01-01

    Background and aims: There is uncertainty about the extent or even existence of sex differences in the mean and variability of reasoning test scores (Jensen, 1998; Lynn, 1994; Mackintosh, 1996). This paper analyses the Cognitive Abilities Test (CAT) scores of a large and representative sample of UK pupils to determine the extent of any sex…

  6. Teacher Use of Achievement Test Score Data

    ERIC Educational Resources Information Center

    Miller, Steven C.

    2012-01-01

    The Wyoming Department of Education (WDE) has invested time and money developing standardized achievement test score reports designed to give teachers data about each of their students' levels of mastery of particular concepts in order to differentiate their instruction. The purpose of this study was to determine the extent to which eighth-grade…

  7. Generalization of the Lord-Wingersky Algorithm to Computing the Distribution of Summed Test Scores Based on Real-Number Item Scores

    ERIC Educational Resources Information Center

    Kim, Seonghoon

    2013-01-01

    With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
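
    The recursion mentioned in this abstract can be illustrated with a minimal sketch of the classic Lord-Wingersky algorithm for dichotomous (0/1) items. This is not the generalized real-number version the article presents. The 2PL item model, the function names, and the item parameters below are assumptions chosen for illustration only.

```python
import math


def p_correct(theta, a, b):
    """Probability of a correct response under a 2PL model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def lord_wingersky(probs):
    """Return [P(score = 0), ..., P(score = n)] given per-item success
    probabilities at one fixed proficiency value."""
    dist = [1.0]  # before any item is added, the score is 0 with certainty
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for score, mass in enumerate(dist):
            new[score] += mass * (1.0 - p)   # item answered incorrectly
            new[score + 1] += mass * p       # item answered correctly
        dist = new
    return dist


# Hypothetical (discrimination, difficulty) pairs, not taken from the article.
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]
theta = 0.0
probs = [p_correct(theta, a, b) for a, b in items]
dist = lord_wingersky(probs)
for score, prob in enumerate(dist):
    print(f"P(score = {score} | theta = {theta}) = {prob:.4f}")
```

    Each pass through the outer loop adds one item, so the cost grows with the square of the number of items rather than with the number of response patterns.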

  8. Misidentifying Factors Underlying Singapore's High Test Scores

    ERIC Educational Resources Information Center

    Usiskin, Zalman

    2012-01-01

    Singapore students have scored exceedingly well on international tests in mathematics. In response, there has been a desire in the United States--both at the policy level and at the school level--to emulate Singapore. Because what can be identified most easily about Singapore's school mathematics can be gleaned from curriculum documents from the…

  9. ACT Reporting Category Interpretation Guide: Version 1.0. ACT Working Paper 2016 (05)

    ERIC Educational Resources Information Center

    Powers, Sonya; Li, Dongmei; Suh, Hongwook; Harris, Deborah J.

    2016-01-01

    ACT reporting categories and ACT Readiness Ranges are new features added to the ACT score reports starting in fall 2016. For each reporting category, the number correct score, the maximum points possible, the percent correct, and the ACT Readiness Range, along with an indicator of whether the reporting category score falls within the Readiness…

  10. A weighted generalized score statistic for comparison of predictive values of diagnostic tests

    PubMed Central

    Kosinski, Andrzej S.

    2013-01-01

    Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations which are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we present, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic which incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, it always reduces to the score statistic in the independent samples situation, and it preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the weighted generalized score test statistic in a general GEE setting. PMID:22912343

  11. Using Raters from India to Score a Large-Scale Speaking Test

    ERIC Educational Resources Information Center

    Xi, Xiaoming; Mollaun, Pam

    2011-01-01

    We investigated the scoring of the Speaking section of the Test of English as a Foreign Language[TM] Internet-based (TOEFL iBT[R]) test by speakers of English and one or more Indian languages. We explored the extent to which raters from India, after being trained and certified, were able to score the TOEFL examinees with mixed first languages…

  12. The impact of testing accommodations on MCAT scores: descriptive results.

    PubMed

    Julian, Ellen R; Ingersoll, Deborah J; Etienne, Patricia M; Hilger, Anthony E

    2004-04-01

    Medical College Admission Test (MCAT) examinees with disabilities who receive accommodations receive flagged scores indicating nonstandard administration. This report compares MCAT examinees who received accommodations and their performances with standard examinees. Aggregate history records of all 1994-2000 MCAT examinees were identified as flagged (2,401) or standard (297,880), then further sorted by race/ethnicity (broadly identified as underrepresented minority and non-URM, at the time of testing) and gender. Those with flagged scores were also classified by disability (LD = learning disability, ADHD = attention deficit hyperactivity disorder, LD/ADHD = learning disability and attention deficit hyperactivity disorder, and Other = other disability) and type of accommodation. Mean MCAT scores were calculated for all groups. A group of 866 examinees took the MCAT first as a standard administration and subsequently with accommodations. In a separate analysis, their two sets of scores were compared. Less than 1% of examinees (2,401) had accommodations; of these, 55% were LD, 17% ADHD, 5% LD/ADHD, and 23% Other. Extended time was the most frequently provided accommodation. Mean flagged scores slightly exceeded mean standard scores on all MCAT sections. Examinees who retook the MCAT with accommodations after a standard administration increased their scores by six points, quadrupling the average gain of a Standard-Standard retest cohort from another study. The small but statistically significantly higher flagged scores may reflect either appropriate compensation or overly generous accommodations. Extended time had a positive impact on the scores of those who retested with this accommodation. The validity of the flagged MCAT in predicting success in medical school is not known, and further investigation is underway.

  13. Leveraging Gender Differences to Boost Test Scores

    ERIC Educational Resources Information Center

    Costello, Bill

    2008-01-01

    According to the 2004 National Assessment of Educational Progress, males who have made it through 12 years of school have significantly poorer reading skills than their female peers. In every age group, boys have been scoring lower than girls annually for more than three decades on U.S. Department of Education reading tests. The longer boys are in…

  14. Test Score Stability and the Relationship of Adult Manifest Anxiety Scale-College Version Scores to External Variables among Graduate Students

    ERIC Educational Resources Information Center

    Lowe, Patricia A.; Peyton, Vicki; Reynolds, Cecil R.

    2007-01-01

    A sample of 79 individuals participated in the present study to evaluate the test score stability (8-week test-retest interval) and construct validity of the scores of the Adult Manifest Anxiety Scale-College Version, a new measure used to assess anxiety in college students, for application to graduate-level students. Results of the study…

  15. An Approach to Scoring and Equating Tests with Binary Items: Piloting With Large-Scale Assessments

    ERIC Educational Resources Information Center

    Dimitrov, Dimiter M.

    2016-01-01

    This article describes an approach to test scoring, referred to as "delta scoring" (D-scoring), for tests with dichotomously scored items. The D-scoring uses information from item response theory (IRT) calibration to facilitate computations and interpretations in the context of large-scale assessments. The D-score is computed from the…

  16. Developing Test Score Reports that Work: The Process and Best Practices for Effective Communication

    ERIC Educational Resources Information Center

    Zenisky, April L.; Hambleton, Ronald K.

    2012-01-01

    Test scores matter these days. Test-takers want to understand how they performed, and test score reports, particularly those for individual examinees, are the vehicles by which most people get the bulk of this information. Historically, score reports have not always met the examinees' information or usability needs, but this is clearly changing…

  17. Testing Students with Special Educational Needs in Large-Scale Assessments – Psychometric Properties of Test Scores and Associations with Test Taking Behavior

    PubMed Central

    Pohl, Steffi; Südkamp, Anna; Hardt, Katinka; Carstensen, Claus H.; Weinert, Sabine

    2016-01-01

    Assessing competencies of students with special educational needs in learning (SEN-L) poses a challenge for large-scale assessments (LSAs). For students with SEN-L, the available competence tests may fail to yield test scores of high psychometric quality, which are—at the same time—measurement invariant to test scores of general education students. We investigated whether we can identify a subgroup of students with SEN-L, for which measurement invariant competence measures of adequate psychometric quality may be obtained with tests available in LSAs. We furthermore investigated whether differences in test-taking behavior may explain dissatisfying psychometric properties and measurement non-invariance of test scores within LSAs. We relied on person fit indices and mixture distribution models to identify students with SEN-L for whom test scores with satisfactory psychometric properties and measurement invariance may be obtained. We also captured differences in test-taking behavior related to guessing and missing responses. As a result we identified a subgroup of students with SEN-L for whom competence scores of adequate psychometric quality that are measurement invariant to those of general education students were obtained. Concerning test taking behavior, there was a small number of students who unsystematically picked response options. Removing these students from the sample slightly improved item fit. Furthermore, two different patterns of missing responses were identified that explain to some extent problems in the assessments of students with SEN-L. PMID:26941665

  18. Flow and diffusion of high-stakes test scores.

    PubMed

    Marder, M; Bansal, D

    2009-10-13

    We apply visualization and modeling methods for convective and diffusive flows to public school mathematics test scores from Texas. We obtain plots that show the most likely future and past scores of students, the effects of random processes such as guessing, and the rate at which students appear in and disappear from schools. We show that student outcomes depend strongly upon economic class, and identify the grade levels where flows of different groups diverge most strongly. Changing the effectiveness of instruction in one grade naturally leads to strongly nonlinear effects on student outcomes in subsequent grades.

  19. Scoring systems for the Clock Drawing Test: A historical review

    PubMed Central

    Spenciere, Bárbara; Alves, Heloisa; Charchat-Fichman, Helenice

    2017-01-01

    The Clock Drawing Test (CDT) is a simple neuropsychological screening instrument that is well accepted by patients and has solid psychometric properties. Several different CDT scoring methods have been developed, but no consensus has been reached regarding which scoring method is the most accurate. This article reviews the literature on these scoring systems and the changes they have undergone over the years. Historically, different types of scoring systems emerged. Initially, the focus was on screening for dementia, and the methods were both quantitative and semi-quantitative. Later, the need for an early diagnosis called for a scoring system that can detect subtle errors, especially those related to executive function. Therefore, qualitative analyses began to be used for both differential and early diagnoses of dementia. A widely used qualitative method was proposed by Rouleau et al. (1992). Tracing the historical path of these scoring methods is important for developing additional scoring systems and furthering dementia prevention research. PMID:29213488

  20. Effects of Test Media on Different EFL Test-Takers in Writing Scores and in the Cognitive Writing Process

    ERIC Educational Resources Information Center

    Zou, Xiao-Ling; Chen, Yan-Min

    2016-01-01

    The effects of computer and paper test media on EFL test-takers with different computer familiarity in writing scores and in the cognitive writing process have been comprehensively explored from the learners' aspect as well as on the basis of related theories and practice. The results indicate significant differences in test scores among the…

  1. The Effect of Schooling and Ability on Achievement Test Scores. NBER Working Paper Series.

    ERIC Educational Resources Information Center

    Hansen, Karsten; Heckman, James J.; Mullen, Kathleen J.

    This study developed two methods for estimating the effect of schooling on achievement test scores that control for the endogeneity of schooling by postulating that both schooling and test scores are generated by a common unobserved latent ability. The methods were applied to data on schooling and test scores. Estimates from the two methods are in…

  2. Descriptive Statistics for Modern Test Score Distributions: Skewness, Kurtosis, Discreteness, and Ceiling Effects.

    PubMed

    Ho, Andrew D; Yu, Carol C

    2015-06-01

    Many statistical analyses benefit from the assumption that unconditional or conditional distributions are continuous and normal. More than 50 years ago in this journal, Lord and Cook chronicled departures from normality in educational tests, and Micceri similarly showed that the normality assumption is met rarely in educational and psychological practice. In this article, the authors extend these previous analyses to state-level educational test score distributions that are an increasingly common target of high-stakes analysis and interpretation. Among 504 scale-score and raw-score distributions from state testing programs from recent years, nonnormal distributions are common and are often associated with particular state programs. The authors explain how scaling procedures from item response theory lead to nonnormal distributions as well as unusual patterns of discreteness. The authors recommend that distributional descriptive statistics be calculated routinely to inform model selection for large-scale test score data, and they illustrate consequences of nonnormality using sensitivity studies that compare baseline results to those from normalized score scales.
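
    The distributional descriptive statistics named in this record's title can be computed along the following lines. This is a rough sketch using moment-based (population) definitions of skewness and excess kurtosis plus two simple indicators of discreteness and ceiling effects. The function name, the indicator definitions, and the sample scores are assumptions for illustration and are not taken from the article.

```python
def describe_scores(scores, max_score):
    """Return moment-based skewness, excess kurtosis, the number of
    distinct score points, and the share of examinees at the ceiling."""
    n = len(scores)
    mean = sum(scores) / n
    # Central moments (population form, dividing by n).
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    if m2 == 0:
        raise ValueError("all scores are identical; skewness is undefined")
    return {
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,  # 0 for a normal distribution
        "distinct_points": len(set(scores)),    # crude discreteness indicator
        "ceiling_share": sum(1 for x in scores if x == max_score) / n,
    }


# Hypothetical raw scores on a 10-point test with many scores near the top.
example = [10, 10, 10, 9, 9, 8, 8, 7, 6, 4]
stats = describe_scores(example, max_score=10)
for name, value in stats.items():
    print(name, value)
```

    A negative skewness together with a large ceiling share is the kind of pattern that would argue against assuming normality in a downstream model.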

  3. Principles and Practices of Test Score Equating. Research Report. ETS RR-10-29

    ERIC Educational Resources Information Center

    Dorans, Neil J.; Moses, Tim P.; Eignor, Daniel R.

    2010-01-01

    Score equating is essential for any testing program that continually produces new editions of a test and for which the expectation is that scores from these editions have the same meaning over time. Particularly in testing programs that help make high-stakes decisions, it is extremely important that test equating be done carefully and accurately.…

  4. The Role of Test Scores in Explaining Race and Gender Differences in Wages

    ERIC Educational Resources Information Center

    Blackburn, McKinley L.

    2004-01-01

    Previous research has suggested that skills reflected in test-score performance on tests such as the Armed Forces Qualification Test (AFQT) can account for some of the racial differences in average wages. I use a more complete set of test scores available with the National Longitudinal Survey of Youth 1979 Cohort to reconsider this evidence, and…

  5. Does breastfeeding contribute to the racial gap in reading and math test scores?

    PubMed

    Peters, Kristen E; Huang, Jin; Vaughn, Michael G; Witko, Christopher

    2013-10-01

    The aim of this study was to examine the impact of divergent breastfeeding practices between Caucasian and African American mothers on the lingering achievement test gap between Caucasian and African American children. The Child Development Supplement of the Panel Study of Income Dynamics, beginning in 1997, followed a cohort of 3563 children aged 0-12 years. Reading and math test scores from 2002 for 1928 children were linked with breastfeeding history. Regression analysis was used to examine associations between ever having been breastfed and duration of breastfeeding and test scores, controlling for characteristics of child, mother, and household. African American students scored significantly lower than Caucasian children by 10.6 and 10.9 points on reading and math tests, respectively. After accounting for the impact of having been breastfed during infancy, the racial test gap decreased by 17% for reading scores and 9% for math scores. Study findings indicate that breastfeeding explains 17% and 9% of the observed gaps in reading and math scores, respectively, between African Americans and Caucasians, an effect larger than most recent educational policy interventions. Renewed efforts around policies and clinical practices that promote and remove barriers for African American mothers to breastfeed should be implemented. Copyright © 2013 Elsevier Inc. All rights reserved.

  6. Discrepancies between modified Medical Research Council dyspnea score and COPD assessment test score in patients with COPD

    PubMed Central

    Rhee, Chin Kook; Kim, Jin Woo; Hwang, Yong Il; Lee, Jin Hwa; Jung, Ki-Suck; Lee, Myung Goo; Yoo, Kwang Ha; Lee, Sang Haak; Shin, Kyeong-Cheol; Yoon, Hyoung Kyu

    2015-01-01

    Background and objective: According to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines, either a modified Medical Research Council (mMRC) dyspnea score of ≥2 or a chronic obstructive pulmonary disease (COPD) assessment test (CAT) score of ≥10 is considered to represent COPD patients who are more symptomatic. We aimed to identify the ideal CAT score that exhibits minimal discrepancy with the mMRC score. Methods: A receiver operating characteristic curve of the CAT score was generated for mMRC scores of 1 and 2. A concordance analysis was applied to quantify the association between the frequencies of patients categorized into GOLD groups A–D using symptom cutoff points. A κ-coefficient was calculated. Results: For an mMRC score of 2, a CAT score of 15 showed the maximum value of Youden’s index with a sensitivity and specificity of 0.70 and 0.66, respectively (area under the receiver operating characteristic curve [AUC] 0.74; 95% confidence interval [CI], 0.70–0.77). For an mMRC score of 1, a CAT score of 10 showed the maximum value of Youden’s index with a sensitivity and specificity of 0.77 and 0.65, respectively (AUC 0.77; 95% CI, 0.72–0.83). The κ value for concordance was highest between an mMRC score of 1 and a CAT score of 10 (0.66), followed by an mMRC score of 2 and a CAT score of 15 (0.56), an mMRC score of 2 and a CAT score of 10 (0.47), and an mMRC score of 1 and a CAT score of 15 (0.43). Conclusion: A CAT score of 10 was most concordant with an mMRC score of 1 when classifying patients with COPD into GOLD groups A–D. However, a discrepancy remains between the CAT and mMRC scoring systems. PMID:26316736
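
    Youden's index, used above to pick the CAT cutoffs, is sensitivity plus specificity minus one. The short sketch below applies that standard definition to the two sensitivity and specificity pairs reported in the abstract. The function and variable names are illustrative, and the index values it prints are derived here rather than quoted from the article.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# (mMRC reference score, CAT cutoff) -> (sensitivity, specificity),
# as reported in the abstract above.
reported = {
    (2, 15): (0.70, 0.66),
    (1, 10): (0.77, 0.65),
}

for (mmrc, cat_cutoff), (sens, spec) in reported.items():
    j = youden_index(sens, spec)
    print(f"mMRC {mmrc}, CAT cutoff {cat_cutoff}: J = {j:.2f}")
```

    In a full analysis the index would be evaluated at every candidate cutoff along the ROC curve and the cutoff with the largest value kept; only the reported optima are shown here.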

  7. An Investigation into the Relationships Between Cloze Test Scores and Informal Reading Inventory Scores of Fifth Grade Pupils.

    ERIC Educational Resources Information Center

    Walter, Richard Barry

    This study investigated the relationship between instructional level scores as determined by a cloze test and instructional level scores as determined by an informal reading inventory (IRI). Fifty male and 50 female subjects were randomly selected from the total fifth grade population of five schools chosen from a total of 22 midwestern elementary…

  8. Reduce, Reuse, Recycle: The Longitudinal Value of Local Cut Scores Using State Test Data

    ERIC Educational Resources Information Center

    Nelson, Peter M.; Van Norman, Ethan R.; VanDerHeyden, Amanda

    2017-01-01

    We used existing reading (n = 1,498) and math (n = 2,260) data to evaluate state test scores for screening middle school students. In Phase 1, state test data were used to create a research-derived cut score that was optimal for predicting state test performance the following year. In Phase 2, those cut scores were applied with future cohorts.…

  9. Online pre-race education improves test scores for volunteers at a marathon.

    PubMed

    Maxwell, Shane; Renier, Colleen; Sikka, Robby; Widstrom, Luke; Paulson, William; Christensen, Trent; Olson, David; Nelson, Benjamin

    2017-09-01

    This study examined whether an online course would lead to increased knowledge about the medical issues volunteers encounter during a marathon. Health care professionals who volunteered to provide medical coverage for an annual marathon were eligible for the study. Demographic information about medical volunteers including profession, specialty, education level and number of marathons they had volunteered for was collected. A 15-question test about the most commonly encountered medical issues was created by the authors and administered before and after the volunteers took the online educational course and compared to a pilot study the previous year. Seventy-four subjects completed the pre-test. Those who participated in the pilot study last year (N = 15) had pre-test scores that were an average of 2.4 points higher than those who did not (mean ranks: pilot study = 51.6 vs. non-pilot = 33.9, p = 0.004). Of the 74 subjects who completed the pre-test, 54 also completed the post-test. The overall post-pre mean score difference was 3.8 ± 2.7 (t = 10.5, df = 53, p < 0.001). While subjects with all levels of volunteer experience demonstrated improvement, only change among first time marathon volunteers was significantly different from the others. Subjects reporting all degree/certification levels demonstrated improvement, but no difference in improvement was found between degree/certification levels. In this follow-up to the previous year's pilot study, online education demonstrated a long-term (one-year) increase in test scores. Testing also continued to show short-term improvement in post-course test scores, compared to pre-course test scores. In general, marathon medical volunteers who had no volunteer experience demonstrated greater improvement than those who had prior volunteer experience.

  10. Examining the Validity of GED[R] Tests Scores with Scheduling and Setting Accommodations. GED Testing Service Research Studies, 2004-1

    ERIC Educational Resources Information Center

    George-Ezzelle, Carol E.; Skaggs, Gary

    2004-01-01

    Current testing standards call for test developers to provide evidence that testing procedures and test scores, and the inferences made based on the test scores, show evidence of validity and are comparable across subpopulations (American Educational Research Association [AERA], American Psychological Association [APA], & National Council on…

  11. Do We Really Become Smarter When Our Fluid-Intelligence Test Scores Improve?

    PubMed Central

    Hayes, Taylor R.; Petrov, Alexander A.; Sederberg, Per B.

    2014-01-01

    Recent reports of training-induced gains on fluid intelligence tests have fueled an explosion of interest in cognitive training—now a billion-dollar industry. The interpretation of these results is questionable because score gains can be dominated by factors that play marginal roles in the scores themselves, and because intelligence gain is not the only possible explanation for the observed control-adjusted far transfer across tasks. Here we present novel evidence that the test score gains used to measure the efficacy of cognitive training may reflect strategy refinement instead of intelligence gains. A novel scanpath analysis of eye movement data from 35 participants solving Raven’s Advanced Progressive Matrices on two separate sessions indicated that one-third of the variance of score gains could be attributed to test-taking strategy alone, as revealed by characteristic changes in eye-fixation patterns. When the strategic contaminant was partialled out, the residual score gains were no longer significant. These results are compatible with established theories of skill acquisition suggesting that procedural knowledge tacitly acquired during training can later be utilized at posttest. Our novel method and result both underline a reason to be wary of purported intelligence gains, but also provide a way forward for testing for them in the future. PMID:25395695

  12. Do We Really Become Smarter When Our Fluid-Intelligence Test Scores Improve?

    PubMed

    Hayes, Taylor R; Petrov, Alexander A; Sederberg, Per B

    2015-01-01

    Recent reports of training-induced gains on fluid intelligence tests have fueled an explosion of interest in cognitive training, now a billion-dollar industry. The interpretation of these results is questionable because score gains can be dominated by factors that play marginal roles in the scores themselves, and because intelligence gain is not the only possible explanation for the observed control-adjusted far transfer across tasks. Here we present novel evidence that the test score gains used to measure the efficacy of cognitive training may reflect strategy refinement instead of intelligence gains. A novel scanpath analysis of eye movement data from 35 participants solving Raven's Advanced Progressive Matrices on two separate sessions indicated that one-third of the variance of score gains could be attributed to test-taking strategy alone, as revealed by characteristic changes in eye-fixation patterns. When the strategic contaminant was partialled out, the residual score gains were no longer significant. These results are compatible with established theories of skill acquisition suggesting that procedural knowledge tacitly acquired during training can later be utilized at posttest. Our novel method and result both underline a reason to be wary of purported intelligence gains, but also provide a way forward for testing for them in the future.

  13. TCAP Scores and per Pupil Expenditures: Statewide Changes before and after Tennessee's First to the Top Act

    ERIC Educational Resources Information Center

    Cantrell, Martha Ely

    2013-01-01

    The purpose of this study was to investigate the relationships between the changes in Tennessee Comprehensive Assessment Program (TCAP) scores and the changes in Per Pupil Expenditures (PPE) after the enactment of "First to the Top Act of 2010" and the receipt of $501,000,000 in federal Race to the Top (RTTT) grant monies. Half of that…

  14. A weighted generalized score statistic for comparison of predictive values of diagnostic tests.

    PubMed

    Kosinski, Andrzej S

    2013-03-15

    Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.

  15. Validity of GRE General Test scores and TOEFL scores for graduate admission to a technical university in Western Europe

    NASA Astrophysics Data System (ADS)

    Zimmermann, Judith; von Davier, Alina A.; Buhmann, Joachim M.; Heinimann, Hans R.

    2018-01-01

    Graduate admission has become a critical process in tertiary education, whereby selecting valid admissions instruments is key. This study assessed the validity of Graduate Record Examination (GRE) General Test scores for admission to Master's programmes at a technical university in Europe. We investigated the indicative value of GRE scores for the Master's programme grade point average (GGPA) with and without the addition of the undergraduate GPA (UGPA) and the TOEFL score, and of GRE scores for study completion and Master's thesis performance. GRE scores explained 20% of the variation in the GGPA, while an additional 7% was explained by the TOEFL score and 3% by the UGPA. Contrary to common belief, the GRE quantitative reasoning score showed little explanatory power. GRE scores were also weakly related to study progress but not to thesis performance. Nevertheless, GRE and TOEFL scores were found to be sensible admissions instruments. Rigorous methodology was used to obtain highly reliable results.

  16. Proficiency Standards and Cut-Scores for Language Proficiency Tests.

    ERIC Educational Resources Information Center

    Moy, Raymond H.

    1984-01-01

    Discusses the problems associated with "grading on a curve," the approach often used for standard setting on language proficiency tests. Proposes four main steps presented in the setting of a non-arbitrary cut-score. These steps not only establish a proficiency standard checked by external criteria, but also check to see that the test covers the…

  17. Effort Analysis: Individual Score Validation of Achievement Test Data

    ERIC Educational Resources Information Center

    Wise, Steven L.

    2015-01-01

    Whenever the purpose of measurement is to inform an inference about a student's achievement level, it is important that we be able to trust that the student's test score accurately reflects what that student knows and can do. Such trust requires the assumption that a student's test event is not unduly influenced by construct-irrelevant factors…

  18. Student Laptop Use and Scores on Standardized Tests

    ERIC Educational Resources Information Center

    Kposowa, Augustine J.; Valdez, Amanda D.

    2013-01-01

    Objectives: The primary objective of the study was to investigate the relationship between ubiquitous laptop use and academic achievement. It was hypothesized that students with ubiquitous laptops would score on average higher on standardized tests than those without such computers. Methods: Data were obtained from two sources. First, demographic…

  19. Predicting Academic Success in First-Year Mathematics Courses Using ACT Mathematics Scores and High School Grade Point Average

    ERIC Educational Resources Information Center

    Mayo, Sandra Sims

    2012-01-01

    Improving college performance and retention is a daunting task for colleges and universities. Many institutions are taking action to increase retention rates by exploring their academic programs. Regression analysis was used to compare the effectiveness of ACT mathematics scores, high school grade point averages (HSGPA), and demographic factors…

  20. The Mediating Effect of Listening Metacognitive Awareness between Test-Taking Motivation and Listening Test Score: An Expectancy-Value Theory Approach

    PubMed Central

    Xu, Jian

    2017-01-01

    The present study investigated test-taking motivation in L2 listening testing context by applying Expectancy-Value Theory as the framework. Specifically, this study was intended to examine the complex relationships among expectancy, importance, interest, listening anxiety, listening metacognitive awareness, and listening test score using data from a large-scale and high-stakes language test among Chinese first-year undergraduates. Structural equation modeling was used to examine the mediating effect of listening metacognitive awareness on the relationship between expectancy, importance, interest, listening anxiety, and listening test score. According to the results, test takers’ listening scores can be predicted by expectancy, interest, and listening anxiety significantly. The relationship between expectancy, interest, listening anxiety, and listening test score was mediated by listening metacognitive awareness. The findings have implications for test takers to improve their test taking motivation and listening metacognitive awareness, as well as for L2 teachers to intervene in L2 listening classrooms. PMID:29312063

  1. The Mediating Effect of Listening Metacognitive Awareness between Test-Taking Motivation and Listening Test Score: An Expectancy-Value Theory Approach.

    PubMed

    Xu, Jian

    2017-01-01

    The present study investigated test-taking motivation in L2 listening testing context by applying Expectancy-Value Theory as the framework. Specifically, this study was intended to examine the complex relationships among expectancy, importance, interest, listening anxiety, listening metacognitive awareness, and listening test score using data from a large-scale and high-stakes language test among Chinese first-year undergraduates. Structural equation modeling was used to examine the mediating effect of listening metacognitive awareness on the relationship between expectancy, importance, interest, listening anxiety, and listening test score. According to the results, test takers' listening scores can be predicted by expectancy, interest, and listening anxiety significantly. The relationship between expectancy, interest, listening anxiety, and listening test score was mediated by listening metacognitive awareness. The findings have implications for test takers to improve their test taking motivation and listening metacognitive awareness, as well as for L2 teachers to intervene in L2 listening classrooms.

  2. The Dynamics of the Evolution of the Black-White Test Score Gap

    ERIC Educational Resources Information Center

    Sohn, Kitae

    2012-01-01

    We apply a quantile version of the Oaxaca-Blinder decomposition to estimate the counterfactual distribution of the test scores of Black students. In the Early Childhood Longitudinal Study, Kindergarten Class of 1998-1999 (ECLS-K), we find that the gap initially appears only at the top of the distribution of test scores. As children age, however,…

  3. Comparing Graphical and Verbal Representations of Measurement Error in Test Score Reports

    ERIC Educational Resources Information Center

    Zwick, Rebecca; Zapata-Rivera, Diego; Hegarty, Mary

    2014-01-01

    Research has shown that many educators do not understand the terminology or displays used in test score reports and that measurement error is a particularly challenging concept. We investigated graphical and verbal methods of representing measurement error associated with individual student scores. We created four alternative score reports, each…

  4. Rank score and permutation testing alternatives for regression quantile estimates

    USGS Publications Warehouse

    Cade, B.S.; Richards, J.D.; Mielke, P.W.

    2006-01-01

    Performance of quantile rank score tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1) was evaluated by simulation for models with p = 2 and 6 predictors, moderate collinearity among predictors, homogeneous and heterogeneous errors, small to moderate samples (n = 20–300), and central to upper quantiles (0.50–0.99). Test statistics evaluated were the conventional quantile rank score T statistic distributed as a χ² random variable with q degrees of freedom (where q parameters are constrained by H0) and an F statistic with its sampling distribution approximated by permutation. The permutation F-test maintained better Type I errors than the T-test for homogeneous error models with smaller n and more extreme quantiles τ. An F distributional approximation of the F statistic provided some improvements in Type I errors over the T-test for models with > 2 parameters, smaller n, and more extreme quantiles but not as much improvement as the permutation approximation. Both rank score tests required weighting to maintain correct Type I errors when heterogeneity under the alternative model increased to 5 standard deviations across the domain of X. A double permutation procedure was developed to provide valid Type I errors for the permutation F-test when null models were forced through the origin. Power was similar for conditions where both T- and F-tests maintained correct Type I errors but the F-test provided some power at smaller n and extreme quantiles when the T-test had no power because of excessively conservative Type I errors. When the double permutation scheme was required for the permutation F-test to maintain valid Type I errors, power was less than for the T-test with decreasing sample size and increasing quantiles. Confidence intervals on parameters and tolerance intervals for future predictions were constructed based on test inversion for an example application

  5. Contributions of Hamstring Stiffness to Straight-Leg-Raise and Sit-and-Reach Test Scores.

    PubMed

    Miyamoto, Naokazu; Hirata, Kosuke; Kimura, Noriko; Miyamoto-Mikami, Eri

    2018-02-01

    The passive straight-leg-raise (PSLR) and the sit-and-reach (SR) tests have been widely used to assess hamstring extensibility. However, it remains unclear to what extent hamstring stiffness (a measure of material properties) contributes to PSLR and SR test scores. Therefore, we aimed to clarify the relationship between hamstring stiffness and PSLR and SR scores using ultrasound shear wave elastography. Ninety-eight healthy subjects completed the study. Each subject completed PSLR testing, and classic and modified SR testing of the right leg. Muscle shear modulus of the biceps femoris, semitendinosus, and semimembranosus was quantified as an index of muscle stiffness. The relationships between shear modulus of each muscle and PSLR or SR scores were calculated using Pearson's product-moment correlation coefficients. Shear modulus of the semitendinosus and semimembranosus showed negative correlations with the two PSLR and two SR scores (absolute r value≤0.484). Shear modulus of the biceps femoris was significantly correlated with the PSLR score determined by the examiner and the modified SR score (absolute r value≤0.308). The present findings suggest that PSLR and SR test scores are strongly influenced by factors other than hamstring stiffness and therefore might not accurately evaluate hamstring stiffness.

  6. Manual for Scoring the Test of Directed Imagination.

    ERIC Educational Resources Information Center

    Veldman, Donald J.; And Others

    A scoring manual for the Directed Imagination Test, a projective technique wherein the subject is instructed to write four fictional stories (four minutes are allowed for each) about teachers and their experiences, is presented. The manual provides detailed instructions for rating each story by fifteen dimensions relevant to teacher education…

  7. The Relationship between Upper-Level Math Course Completion and ACT Math Sub Score Achievement

    ERIC Educational Resources Information Center

    Dial, Larry Michael

    2016-01-01

    More high school students are taking the ACT and more students are taking it at an earlier age. States such as Missouri are now testing all public and charter school students during their junior year to use the ACT as a formative assessment to drive discussions about student schedules, plans of study, and course offerings. With more data from more…

  8. AP Trends: Tests Soar, Scores Slip--Gaps between Groups Spur Equity Concerns

    ERIC Educational Resources Information Center

    Cech, Scott J.

    2008-01-01

    More students are taking Advanced Placement tests, but the proportion of tests receiving what is deemed a passing score has dipped, and the mean score is down for the fourth year in a row. Data released here this week by the New York City-based nonprofit organization that owns the AP brand shows that a greater-than-ever proportion of students…

  9. Validity Evidence for ACT Compass® Placement Tests. ACT Research Report Series 2014 (2)

    ERIC Educational Resources Information Center

    Westrick, Paul A.; Allen, Jeff

    2014-01-01

    We examined the validity of using Compass® test scores and high school grade point average (GPA) for placing students in first-year college courses and for identifying students at risk of not succeeding. Consistent with other research, the combination of high school GPA and Compass scores performed better than either measure used alone. Results…

  10. Generalized likelihood ratios for quantitative diagnostic test scores.

    PubMed

    Tandberg, D; Deely, J J; O'Malley, A J

    1997-11-01

    The reduction of quantitative diagnostic test scores to the dichotomous case is a wasteful and unnecessary simplification in the era of high-speed computing. Physicians could make better use of the information embedded in quantitative test results if modern generalized curve estimation techniques were applied to the likelihood functions of Bayes' theorem. Hand calculations could be completely avoided and computed graphical summaries provided instead. Graphs showing posttest probability of disease as a function of pretest probability with confidence intervals (POD plots) would enhance acceptance of these techniques if they were immediately available at the computer terminal when test results were retrieved. Such constructs would also provide immediate feedback to physicians when a valueless test had been ordered.
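
    The posttest-probability calculation the abstract refers to can be illustrated with the odds form of Bayes' theorem. The sketch below is only a minimal illustration, not the authors' curve-estimation method: the function name and the example pretest probability and likelihood ratio are assumptions chosen for the arithmetic, not values from the study.

```python
def posttest_probability(pretest: float, likelihood_ratio: float) -> float:
    """Odds form of Bayes' theorem: posttest odds = pretest odds * LR.

    `pretest` is the pretest probability of disease (strictly between 0 and 1);
    `likelihood_ratio` is the LR attached to the observed test score.
    Both inputs are hypothetical here.
    """
    if not 0.0 < pretest < 1.0:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood ratio must be non-negative")
    pretest_odds = pretest / (1.0 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


# Hypothetical example: pretest probability 0.2 and a likelihood ratio of 4
# give pretest odds of 0.25 and posttest odds of 1, i.e. a probability of 0.5.
print(round(posttest_probability(0.2, 4.0), 2))
```

    A likelihood ratio of 1 leaves the pretest probability unchanged, which corresponds to the "valueless test" case mentioned in the abstract.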

  11. Validity of GRE General Test Scores and TOEFL Scores for Graduate Admission to a Technical University in Western Europe

    ERIC Educational Resources Information Center

    Zimmermann, Judith; von Davier, Alina A.; Buhmann, Joachim M.; Heinimann, Hans R.

    2018-01-01

    Graduate admission has become a critical process in tertiary education, whereby selecting valid admissions instruments is key. This study assessed the validity of Graduate Record Examination (GRE) General Test scores for admission to Master's programmes at a technical university in Europe. We investigated the indicative value of GRE scores for the…

  12. The Formalization of Fairness: Issues in Testing for Measurement Invariance Using Subtest Scores

    ERIC Educational Resources Information Center

    Molenaar, Dylan; Borsboom, Denny

    2013-01-01

    Measurement invariance is an important prerequisite for the adequate comparison of group differences in test scores. In psychology, measurement invariance is typically investigated by means of linear factor analyses of subtest scores. These subtest scores typically result from summing the item scores. In this paper, we discuss 4 possible problems…

  13. College Student Profiles: Norms for the ACT Assessment, 1980-81 Edition.

    ERIC Educational Resources Information Center

    Sawyer, Richard L.

    Extensive normative information is presented on about 455,170 college freshmen entering college in 1979 at 1,099 institutions participating in the American College Testing (ACT) Assessment Program. Norms are provided for males and females on ACT test scores and high school grades by college type, affiliation, and geographic region, and by student…

  14. Estimating Achievement Gaps from Test Scores Reported in Ordinal "Proficiency" Categories

    ERIC Educational Resources Information Center

    Ho, Andrew D.; Reardon, Sean F.

    2012-01-01

    Test scores are commonly reported in a small number of ordered categories. Examples of such reporting include state accountability testing, Advanced Placement tests, and English proficiency tests. This paper introduces and evaluates methods for estimating achievement gaps on a familiar standard-deviation-unit metric using data from these ordered…

  15. A Seven-Year Follow-Up of Intelligence Test Scores of Foster Grandparents

    ERIC Educational Resources Information Center

    Troll, Lillian E.; And Others

    1976-01-01

    After seven years, a group (N=32) of originally nonemployed poverty-level older people (over 60) now employed as foster grandparents were retested with the WAIS. Three subtest scores showed stability and Digit Span showed a statistically significant drop. Neither age nor initial level of health or WAIS scores was related to test-score changes over…

  16. Explaining the black-white gap in cognitive test scores: Toward a theory of adverse impact.

    PubMed

    Cottrell, Jonathan M; Newman, Daniel A; Roisman, Glenn I

    2015-11-01

    In understanding the causes of adverse impact, a key parameter is the Black-White difference in cognitive test scores. To advance theory on why Black-White cognitive ability/knowledge test score gaps exist, and on how these gaps develop over time, the current article proposes an inductive explanatory model derived from past empirical findings. According to this theoretical model, Black-White group mean differences in cognitive test scores arise from the following racially disparate conditions: family income, maternal education, maternal verbal ability/knowledge, learning materials in the home, parenting factors (maternal sensitivity, maternal warmth and acceptance, and safe physical environment), child birth order, and child birth weight. Results from a 5-wave longitudinal growth model estimated on children in the NICHD Study of Early Child Care and Youth Development from ages 4 through 15 years show significant Black-White cognitive test score gaps throughout early development that did not grow significantly over time (i.e., significant intercept differences, but not slope differences). Importantly, the racially disparate conditions listed above can account for the relation between race and cognitive test scores. We propose a parsimonious 3-Step Model that explains how cognitive test score gaps arise, in which race relates to maternal disadvantage, which in turn relates to parenting factors, which in turn relate to cognitive test scores. This model and results offer to fill a need for theory on the etiology of the Black-White ethnic group gap in cognitive test scores, and attempt to address a missing link in the theory of adverse impact.

  17. Simple exercise test score versus cardiac stress test for the prediction of coronary artery disease in patients with type 2 diabetes.

    PubMed

    Pikto-Pietkiewicz, Witold; Przewłocka, Monika; Chybowska, Barbara; Cyciwa, Alona; Pasierski, Tomasz

    2014-01-01

    Type 2 diabetes markedly increases the risk of coronary heart disease (CHD), and screening for CHD is suggested by the guidelines. The aim of the study was to compare the diagnostic usefulness of the simple exercise test score, incorporating the clinical data and cardiac stress test results, with the standard stress test in patients with type 2 diabetes. A total of 62 consecutive patients (aged 65.4 ± 8.5 years; 32 men) with type 2 diabetes and clinical symptoms suggesting CHD underwent a stress test followed by coronary angiography. The simple score was calculated for all patients. Significant coronary stenosis was observed in 41 patients (66.1%). Stress test results were positive in 36 patients (58.1%). The mean simple score was high (65.5 ± 14.3 points). A positive linear relationship was observed between the score and the prevalence of CHD (R² = 0.19; P < 0.001) as well as its severity (R² = 0.23; P < 0.001). The area under the receiver-operating characteristic curve for the simple score was 0.74 (95% confidence interval [CI], 0.62-0.86). At the original cut-off value of 60 points, the score had a similar prognostic value to that of the standard stress test. However, in a multivariate analysis, only the simple score (odds ratio [OR], 1.46; 95% CI, 1.11-1.94; P < 0.01 for an increase in the score by 1 point) and male sex (OR, 1.57; 95% CI, 1.24-1.98; P < 0.001) remained independent predictors of CHD. In patients with type 2 diabetes, the simple score correlated with the prevalence and severity of CHD. However, the cut-off value of 60 points was inadequate in the population of diabetic patients with high risk of CHD. The simple score used instead of or together with the stress test was a better predictor of CHD than the stress test alone.

  18. A Maturing Global Testing Regime Meets the World Economy: Test Scores and Economic Growth, 1960-2012

    ERIC Educational Resources Information Center

    Kamens, David H.

    2015-01-01

    This article considers the growth of the international testing regime. It discusses sources of growth and empirically examines two related sets of issues: (1) the stability of countries' achievement scores, and (2) the influence of those national scores on subsequent economic development over different time lags. The article suggests that…

  19. Assessment Test Scores of Incoming Students, Fall 2001.

    ERIC Educational Resources Information Center

    Negron, Maggie; Breindel, Matthew

    This assessment of placement test scores in reading, math, and sentence skills from incoming students at College of the Desert (California) shows that students are overwhelmingly underprepared for study at the college. Only 15% of students were prepared in sentence skills, 27% in reading skills, 7% in math skills; only 3% were prepared in all 3…

  20. Test Score Stability and Construct Validity of the Adult Manifest Anxiety Scale-College Version Scores among College Students: A Brief Report

    ERIC Educational Resources Information Center

    Lowe, Patricia A.; Papanastasiou, Elena C.; DeRuyck, Kimberly A.; Reynolds, Cecil R.

    2005-01-01

    In this study, the authors investigated the temporal stability and construct validity of the Adult Manifest Anxiety Scale-College Version (AMAS-C; C. R. Reynolds, B. O. Richmond, & P. A. Lowe, 2003b) scores. Results indicated that the AMAS-C scores had adequate to excellent test score stability, and evidence supported the construct validity of the…

  1. The Validity of IQ Scores Derived from Readiness Screening Tests

    ERIC Educational Resources Information Center

    Telegdy, Gabriel A.

    1976-01-01

    The Screening Test of Academic Readiness (STAR) and the Peabody Picture Vocabulary Test (PPVT) were administered to 52 kindergarten children to reveal the convergent validity of IQ scores derived from the STAR. The findings raise doubts about the validity of the deviation IQs derived from the STAR. (Author)

  2. Psychometric Properties of Raw and Scale Scores on Mixed-Format Tests

    ERIC Educational Resources Information Center

    Kolen, Michael J.; Lee, Won-Chan

    2011-01-01

    This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…

  3. The Comparison of Accuracy Scores on the Paper and Pencil Testing vs. Computer-Based Testing

    ERIC Educational Resources Information Center

    Retnawati, Heri

    2015-01-01

    This study aimed to compare the accuracy of the test scores as results of Test of English Proficiency (TOEP) based on paper and pencil test (PPT) versus computer-based test (CBT). Using the participants' responses to the PPT documented from 2008-2010 and data of CBT TOEP documented in 2013-2014 on the sets of 1A, 2A, and 3A for the Listening and…

  4. 78 FR 66700 - Toxic Substances Control Act Chemical Testing; Receipt of Test Data

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-11-06

    ... Chemical Testing; Receipt of Test Data AGENCY: Environmental Protection Agency (EPA). ACTION: Notice. SUMMARY: This notice announces EPA's receipt of test data on 21 chemicals. These data were submitted pursuant to 3 test rules issued by EPA under section 4 of the Toxic Substance Control Act (TSCA). The...

  5. Student Achievement and Efficiency in Missouri Schools and the No Child Left Behind Act

    ERIC Educational Resources Information Center

    Primont, Diane F.; Domazlicky, Bruce

    2006-01-01

    The 2001 No Child Left Behind Act requires that schools make "annual yearly progress" in raising student achievement, or face possible sanctions. The No Child Left Behind Act places added emphasis on test scores, such as scores from the Missouri Assessment Program (MAP), to evaluate the performance of schools. In this paper, we investigate…

  6. Effects of Classroom Ventilation Rate and Temperature on Students' Test Scores.

    PubMed

    Haverinen-Shaughnessy, Ulla; Shaughnessy, Richard J

    2015-01-01

    Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students' mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9-7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12-13 points per each 1°C decrease in temperature within the observed range of 20-25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students.

  7. Bi-Factor MIRT Observed-Score Equating for Mixed-Format Tests

    ERIC Educational Resources Information Center

    Lee, Guemin; Lee, Won-Chan

    2016-01-01

    The main purposes of this study were to develop bi-factor multidimensional item response theory (BF-MIRT) observed-score equating procedures for mixed-format tests and to investigate relative appropriateness of the proposed procedures. Using data from a large-scale testing program, three types of pseudo data sets were formulated: matched samples,…

  8. Optimal Scoring Methods of Hand-Strength Tests in Patients with Stroke

    ERIC Educational Resources Information Center

    Huang, Sheau-Ling; Hsieh, Ching-Lin; Lin, Jau-Hong; Chen, Hui-Mei

    2011-01-01

    The purpose of this study was to determine the optimal scoring methods for measuring strength of the more-affected hand in patients with stroke by examining the effect of reducing measurement errors. Three hand-strength tests of grip, palmar pinch, and lateral pinch were administered at two sessions in 56 patients with stroke. Five scoring methods…

  9. Score Reporting in Teacher Certification Testing: A Review, Design, and Interview/Focus Group Study

    ERIC Educational Resources Information Center

    Klesch, Heather S.

    2010-01-01

    The reporting of scores on educational tests is at times misunderstood, misinterpreted, and potentially confusing to examinees and other stakeholders who may need to interpret test scores. In reporting test results to examinees, there is a need for clarity in the message communicated. As pressure rises for students to demonstrate performance at a…

  10. Clinical experience of scoring criteria for Familial Hypercholesterolaemia (FH) genetic testing in Wales.

    PubMed

    Haralambos, K; Whatley, S D; Edwards, R; Gingell, R; Townsend, D; Ashfield-Watt, P; Lansberg, P; Datta, D B N; McDowell, I F W

    2015-05-01

    Familial Hypercholesterolaemia (FH) is caused by mutations in genes of the Low Density Lipoprotein (LDL) receptor pathway. A definitive diagnosis of FH can be made by the demonstration of a pathogenic mutation. The Wales FH service has developed scoring criteria to guide selection of patients for DNA testing, for those referred to clinics with hypercholesterolaemia. The criteria are based on a modification of the Dutch Lipid Clinic scoring criteria and utilise a combination of lipid values, physical signs, personal and family history of premature cardiovascular disease. They are intended to provide clinical guidance and enable resources to be targeted in a cost effective manner. 623 patients who presented to lipid clinics across Wales had DNA testing following application of these criteria. The proportion of patients with a pathogenic mutation ranged from 4% in those scoring 5 or less up to 85% in those scoring 15 or more. LDL-cholesterol was the strongest discriminatory factor. Scores gained from physical signs, family history, coronary heart disease, and triglycerides also showed a gradient in mutation pick-up rate according to the score. These criteria provide a useful tool to guide selection of patients for DNA testing when applied by health professionals who have clinical experience of FH.

  11. Increasing Racial Isolation and Test Score Gaps in Mathematics: A 30-Year Perspective

    ERIC Educational Resources Information Center

    Berends, Mark; Penaloza, Roberto V.

    2010-01-01

    Background/Context: Although there has been progress in closing the test score gaps among student groups over past decades, that progress has stalled. Many researchers have speculated why the test score gaps closed between the early 1970s and the early 1990s, but only a few have been able to empirically study how changes in school factors and…

  12. Peer Effects and the Indigenous/Non-Indigenous Early Test-Score Gap in Peru

    ERIC Educational Resources Information Center

    Sakellariou, Chris

    2008-01-01

    This paper assesses the magnitude of the non-indigenous/indigenous test-score gap for third-year and fourth-year primary school pupils in Peru, in relation to the main family, school and peer inputs contributing to the test-score gap using the estimation method of feasible generalized least squares. The article then decomposes the gap into its…

  13. Motivating High School Students to Score Proficient on State Tests

    ERIC Educational Resources Information Center

    Brown, Sarah Lee

    2015-01-01

    The researcher interviewed two groups of eleventh grade students, in a rural Appalachian setting, who tended to score low on the state mandated high stakes/low stakes test to discover their efforts on the test, specifically in reading, and to obtain their opinions concerning the effects of a specific incentive or consequence. Before the eleventh…

  14. A Comparison of Standardized Achievement Test Scores on Right and Left Brain Dominant Fourth-Grade Students.

    ERIC Educational Resources Information Center

    Bell, Michael L.; Roubinek, Darrell L.

    1989-01-01

    Compares fourth-graders' subtest scores on the Stanford Achievement Test (SAT), the Iowa Test of Basic Skills (ITBS), and the Metropolitan Achievement Test (MAT). Finds right-brain dominant students scored better on four SAT subtests, and left-brain dominant students scored better on four ITBS subtests and two MAT subtests. (NH)

  15. Generation of GHS Scores from TEST and online sources ...

    EPA Pesticide Factsheets

    Alternatives assessment frameworks such as DfE (Design for the Environment) evaluate chemical alternatives in terms of human health effects, ecotoxicity, and fate. T.E.S.T. (Toxicity Estimation Software Tool) can be utilized to evaluate human health in terms of acute oral rat toxicity, developmental toxicity, endocrine activity, and mutagenicity. It can be used to evaluate ecotoxicity (in terms of acute fathead minnow toxicity) and fate (in terms of bioconcentration factor). It can also be used to estimate a variety of key physicochemical properties such as melting point, boiling point, vapor pressure, water solubility, and bioconcentration factor. A web-based version of T.E.S.T. is currently being developed to allow predictions to be made from other web tools. Online data sources such as NCCT's Chemistry Dashboard, REACH dossiers, or ChemHat.org can also be utilized to obtain GHS (Globally Harmonized System) scores for comparing alternatives. The purpose of this talk is to show how GHS data can be obtained from literature sources and from T.E.S.T. These data will be used to compare chemical alternatives in the alternatives assessment dashboard (a 2018 CSS product).

  16. High Test Scores: The Wrong Road to National Economic Success

    ERIC Educational Resources Information Center

    Baker, Keith

    2011-01-01

    A widely held view is that good schools are essential to a nation's international economic success and that high test scores on international tests of academic skills and knowledge indicate how good a nation's schools are. The widespread belief that good schools are an important contributor to a nation's economic success in the world is supported…

  17. Commentary: Student Cognition, the Situated Learning Context, and Test Score Interpretation

    ERIC Educational Resources Information Center

    La Marca, Paul M.

    2006-01-01

    Although it is assumed that student cognition contributes to student performance on achievement tests, it may be that current testing models lack the degree of specification necessary to warrant such inferences. With test score interpretations as the referent, the authors in this special issue address the role of student cognition in learning and…

  18. Relationships between spatial activities and scores on the mental rotation test as a function of sex.

    PubMed

    Ginn, Sheryl R; Pickens, Stefanie J

    2005-06-01

    Previous results suggested that female college students' scores on the Mental Rotations Test might be related to their prior experience with spatial tasks. For example, women who played video games scored better on the test than their non-game-playing peers, whereas playing video games was not related to men's scores. The present study examined whether participation in different types of spatial activities would be related to women's performance on the Mental Rotations Test. 31 men and 59 women enrolled at a small, private church-affiliated university and majoring in art or music as well as students who participated in intercollegiate athletics completed the Mental Rotations Test. Women's scores on the Mental Rotations Test benefitted from experience with spatial activities; the more types of experience the women had, the better their scores. Thus women who were athletes, musicians, or artists scored better than those women who had no experience with these activities. The opposite results were found for the men. Efforts are currently underway to assess how length of experience and which types of experience are related to scores.

  19. The value of Bayes' theorem for interpreting abnormal test scores in cognitively healthy and clinical samples.

    PubMed

    Gavett, Brandon E

    2015-03-01

    The base rates of abnormal test scores in cognitively normal samples have been a focus of recent research. The goal of the current study is to illustrate how Bayes' theorem uses these base rates--along with the same base rates in cognitively impaired samples and prevalence rates of cognitive impairment--to yield probability values that are more useful for making judgments about the absence or presence of cognitive impairment. Correlation matrices, means, and standard deviations were obtained from the Wechsler Memory Scale--4th Edition (WMS-IV) Technical and Interpretive Manual and used in Monte Carlo simulations to estimate the base rates of abnormal test scores in the standardization and special groups (mixed clinical) samples. Bayes' theorem was applied to these estimates to identify probabilities of normal cognition based on the number of abnormal test scores observed. Abnormal scores were common in the standardization sample (65.4% scoring below a scaled score of 7 on at least one subtest) and more common in the mixed clinical sample (85.6% scoring below a scaled score of 7 on at least one subtest). Probabilities varied according to the number of abnormal test scores, base rates of normal cognition, and cutoff scores. The results suggest that interpretation of base rates obtained from cognitively healthy samples must also account for data from cognitively impaired samples. Bayes' theorem can help neuropsychologists answer questions about the probability that an individual examinee is cognitively healthy based on the number of abnormal test scores observed.
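    The Bayes' theorem calculation described in this abstract can be sketched as follows. This is a minimal illustration, not the author's procedure: it uses only the two rates quoted above (65.4% and 85.6% with at least one subtest below a scaled score of 7), treats the standardization sample as "cognitively normal" and the mixed clinical sample as "impaired", and the base rates of normal cognition in the loop are hypothetical values chosen for illustration.

```python
def prob_normal_given_abnormal(p_abn_normal, p_abn_impaired, base_rate_normal):
    """Posterior P(normal cognition | abnormal-score pattern) via Bayes' theorem."""
    numerator = p_abn_normal * base_rate_normal
    denominator = numerator + p_abn_impaired * (1.0 - base_rate_normal)
    return numerator / denominator


# Rates quoted in the abstract: share scoring below a scaled score of 7
# on at least one subtest.
P_ABN_NORMAL = 0.654    # standardization sample (assumed to stand for "normal")
P_ABN_CLINICAL = 0.856  # mixed clinical sample (assumed to stand for "impaired")

# Hypothetical base rates of normal cognition; these are NOT from the abstract.
for base_rate in (0.50, 0.80, 0.95):
    posterior = prob_normal_given_abnormal(P_ABN_NORMAL, P_ABN_CLINICAL, base_rate)
    print(f"base rate {base_rate:.2f} -> P(normal | >=1 abnormal score) = {posterior:.3f}")
```

    Because abnormal scores are common in both samples, one abnormal score lowers the probability of normal cognition only modestly below the assumed base rate.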

  20. More than Just Test Scores

    ERIC Educational Resources Information Center

    Levin, Henry M.

    2012-01-01

    Around the world we hear considerable talk about creating world-class schools. Usually the term refers to schools whose students get very high scores on the international comparisons of student achievement such as PISA or TIMSS. The practice of restricting the meaning of exemplary schools to the narrow criterion of achievement scores is usually…

  1. Predictive effects of teachers and schools on test scores, college attendance, and earnings

    PubMed Central

    Chamberlain, Gary E.

    2013-01-01

    I studied predictive effects of teachers and schools on test scores in fourth through eighth grade and outcomes later in life such as college attendance and earnings. For example, predict the fraction of a classroom attending college at age 20 given the test score for a different classroom in the same school with the same teacher and given the test score for a classroom in the same school with a different teacher. I would like to have predictive effects that condition on averages over many classrooms, with and without the same teacher. I set up a factor model that, under certain assumptions, makes this feasible. Administrative school district data in combination with tax data were used to calculate estimates and do inference. PMID:24101492

  2. Predictive effects of teachers and schools on test scores, college attendance, and earnings.

    PubMed

    Chamberlain, Gary E

    2013-10-22

    I studied predictive effects of teachers and schools on test scores in fourth through eighth grade and outcomes later in life such as college attendance and earnings. For example, predict the fraction of a classroom attending college at age 20 given the test score for a different classroom in the same school with the same teacher and given the test score for a classroom in the same school with a different teacher. I would like to have predictive effects that condition on averages over many classrooms, with and without the same teacher. I set up a factor model that, under certain assumptions, makes this feasible. Administrative school district data in combination with tax data were used to calculate estimates and do inference.

  3. Background Variables, Levels of Aggregation, and Standardized Test Scores

    ERIC Educational Resources Information Center

    Paulson, Sharon E.; Marchant, Gregory J.

    2009-01-01

    This article examines the role of student demographic characteristics in standardized achievement test scores at both the individual level and aggregated at the state, district, school levels. For several data sets, the majority of the variance among states, districts, and schools was related to demographic characteristics. Where these background…

  4. What's in a Teacher Test? Assessing the Relationship between Teacher Test Scores and Student Secondary STEM Achievement. CEDR Working Paper. WP #2016-4

    ERIC Educational Resources Information Center

    Goldhaber, Dan; Gratz, Trevor; Theobald, Roddy

    2016-01-01

    We investigate the predictive validity of teacher credential test scores for student performance in secondary STEM classrooms in Washington state. After replicating earlier findings that teacher basic skills licensure test scores are a modest and statistically significant predictor of student math test score gains in elementary grades, we focus on…

  5. Effects of Classroom Ventilation Rate and Temperature on Students’ Test Scores

    PubMed Central

    2015-01-01

    Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a school district with 70 elementary schools (140 fifth grade classrooms) in the Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students’ mean mathematics scores (average 2286 points) increased by up to eleven points (0.5%) for each liter per second per person increase in ventilation rate within the range of 0.9–7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12–13 points for each 1°C decrease in temperature within the observed range of 20–25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students. PMID:26317643

  6. The effects of calculator-based laboratories on standardized test scores

    NASA Astrophysics Data System (ADS)

    Stevens, Charlotte Bethany Rains

    Nationwide, the goal of providing a productive science and math education to our youth in today's educational institutions is centering itself around the technology being utilized in these classrooms. In this age of digital technology, educational software and calculator-based laboratories (CBL) have become significant devices in the teaching of science and math for many states across the United States. The Texas Instruments graphing calculator and the Vernier Labpro interface are among the calculator-based laboratories becoming increasingly popular among middle and high school science and math teachers in many school districts across this country. In Tennessee, however, it is reported that this type of technology is not regularly utilized at the student level in most high school science classrooms, especially in the area of Physical Science (Vernier, 2006). This research explored the effect of calculator-based laboratory instruction on standardized test scores. The purpose of this study was to determine the effect of traditional teaching methods versus graphing calculator teaching methods on the state-mandated End-of-Course (EOC) Physical Science exam based on ability, gender, and ethnicity. The sample included 187 total tenth and eleventh grade physical science students, 101 of whom belonged to a control group and 87 of whom belonged to the experimental group. Physical Science End-of-Course scores obtained from the Tennessee Department of Education during the spring of 2005 and the spring of 2006 were used to examine the hypotheses. The findings of this research study suggested that the type of teaching method, traditional or calculator-based, did not have an effect on standardized test scores. However, the students' ability level, as demonstrated on the End-of-Course test, had a significant effect on End-of-Course test scores. This study focused on a limited population of high school physical science students in the middle Tennessee

  7. Can Percentiles Replace Raw Scores in the Statistical Analysis of Test Data?

    ERIC Educational Resources Information Center

    Zimmerman, Donald W.; Zumbo, Bruno D.

    2005-01-01

    Educational and psychological testing textbooks typically warn of the inappropriateness of performing arithmetic operations and statistical analysis on percentiles instead of raw scores. This seems inconsistent with the well-established finding that transforming scores to ranks and using nonparametric methods often improves the validity and power…

  8. Branded by a Test

    ERIC Educational Resources Information Center

    Popham, W. James

    2006-01-01

    Scholastic Aptitude Test (SAT) and American College Testing (ACT) scores are the main determinants of college entrance in the USA. It is widely assumed that these tests are predictive of success both during college and in later life, but such views are incorrect. Another widely-held view, held by many educators, is that the SAT and ACT are…

  9. Situational Effects May Account for Gain Scores in Cognitive Ability Testing: A Longitudinal SEM Approach

    ERIC Educational Resources Information Center

    Matton, Nadine; Vautier, Stephane; Raufaste, Eric

    2009-01-01

    Mean gain scores for cognitive ability tests between two sessions in a selection setting are now a robust finding, yet not fully understood. Many authors do not attribute such gain scores to an increase in the target abilities. Our approach consists of testing a longitudinal SEM model suitable to this view. We propose to model the scores' changes…

  10. Effects of Targeted Test Preparation on Scores of Two Tests of Oral English as a Second Language

    ERIC Educational Resources Information Center

    Farnsworth, Tim

    2013-01-01

    This study investigated the effect of targeted test preparation, or coaching, on oral English as a second language test scores. The tests in question were the Basic English Skills Test Plus (BEST Plus), a scripted oral interview published by the Center for Applied Linguistics, and the Versant English Test (VET), a computer-administered and…

  11. SAT and ACT Predict College GPA after Removing "g"

    ERIC Educational Resources Information Center

    Coyle, Thomas R.; Pillow, David R.

    2008-01-01

    This research examined whether the SAT and ACT would predict college grade point average (GPA) after removing g from the tests. SAT and ACT scores and freshman GPAs were obtained from a university sample (N=161) and the 1997 National Longitudinal Survey of Youth (N=8984). Structural equation modeling was used to examine relationships among g, GPA,…

  12. A knowledge-based theory of rising scores on "culture-free" tests.

    PubMed

    Fox, Mark C; Mitchum, Ainsley L

    2013-08-01

    Secular gains in intelligence test scores have perplexed researchers since they were documented by Flynn (1984, 1987). Gains are most pronounced on abstract, so-called culture-free tests, prompting Flynn (2007) to attribute them to problem-solving skills availed by scientifically advanced cultures. We propose that recent-born individuals have adopted an approach to analogy that enables them to infer higher level relations requiring roles that are not intrinsic to the objects that constitute initial representations of items. This proposal is translated into item-specific predictions about differences between cohorts in pass rates and item-response patterns on the Raven's Matrices (Flynn, 1987), a seemingly culture-free test that registers the largest Flynn effect. Consistent with predictions, archival data reveal that individuals born around 1940 are less able to map objects at higher levels of relational abstraction than individuals born around 1990. Polytomous Rasch models verify predicted violations of measurement invariance, as raw scores are found to underestimate the number of analogical rules inferred by members of the earlier cohort relative to members of the later cohort who achieve the same overall score. The work provides a plausible cognitive account of the Flynn effect, furthers understanding of the cognition of matrix reasoning, and underscores the need to consider how test-takers select item responses. PsycINFO Database Record (c) 2013 APA, all rights reserved.

  13. Acting White: A Critical Review

    ERIC Educational Resources Information Center

    Sohn, Kitae

    2011-01-01

    The hypothesis of acting White has been heatedly debated and influential over the last 20 years or so in explaining the Black-White test score gap. Recently, economists have joined the debate and started providing new theoretical and empirical analyses of the phenomenon. This paper critically reviews the arguments that have been advanced to…

  14. A Latent Class Approach to Estimating Test-Score Reliability

    ERIC Educational Resources Information Center

    van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas

    2011-01-01

    This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
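    As a point of reference for the single-administration methods named above, the standard Cronbach's alpha formula can be sketched as below. This is only the textbook coefficient, not the latent class reliability approach the study derives; the function name and the item scores are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    item_columns = list(zip(*scores))  # transpose: one tuple of scores per item
    item_variance_sum = sum(pvariance(column) for column in item_columns)
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 5 test takers, 3 items each.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 5],
    [3, 3, 4],
]
print(f"alpha = {cronbach_alpha(responses):.3f}")
```

    Population variances are used throughout; using sample variances instead gives the same alpha because the correction factor cancels in the ratio.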

  15. Experiential Awareness of the Effects of Test Score Reports.

    ERIC Educational Resources Information Center

    Bender, Robert C.

    Because most counselors have experienced a significant amount of success, they often have difficulty understanding the impact of test scores on persons who do not perform well. Counselor educators must develop experiential awareness in an area normally outside the realm of their students. To provide such an experience, 25 counselor trainees took…

  16. Benefits of Coaching on Test Scores Seen as Negligible.

    ERIC Educational Resources Information Center

    Report on Education Research, 1983

    1983-01-01

    THE FOLLOWING IS THE FULL TEXT OF THIS DOCUMENT: A new study by a pair of Harvard University researchers discounts earlier findings that coaching can substantially improve student performance on the Scholastic Aptitude Test (SAT). "There is simply insufficient evidence that large score increases are a result of a coaching program," write…

  17. Structured didactic teaching sessions improve medical student neurology clerkship test scores: a pilot study.

    PubMed

    Menkes, Daniel L; Reed, Mary

    2008-01-01

    To determine the effectiveness of didactic case-based instruction methodology to improve medical student comprehension of common neurological illnesses and neurological emergencies. Neurology department, academic university. 415 third and fourth year medical students performing a required four week neurology clerkship. Raw test scores on a 1 hour, 50-item clinical vignette based examination and open-ended questions in a post-clerkship feedback session. There was a statistically significant improvement in overall test scores (p<0.001). Didactic teaching sessions have a significant positive impact on neurology student clerkship test score performance and perception of their educational experience. Confirmation of these results across multiple specialties in a multi-center trial is warranted.

  18. Estimating Conditional Distributions of Scores on an Alternate Form of a Test. Research Report. ETS RR-15-18

    ERIC Educational Resources Information Center

    Livingston, Samuel A.; Chen, Haiwen H.

    2015-01-01

    Quantitative information about test score reliability can be presented in terms of the distribution of equated scores on an alternate form of the test for test takers with a given score on the form taken. In this paper, we describe a procedure for estimating that distribution, for any specified score on the test form taken, by estimating the joint…

  19. Scoring Dawg Core Breakoff and Retention Mechanism

    NASA Technical Reports Server (NTRS)

    Badescu, Mircea; Sherrit, Stewart; Bar-Cohen, Yoseph; Bao, Xiaoqi; Backes, Paul G.

    2011-01-01

    This novel core break-off and retention mechanism consists of a scoring dawg controlled by a set of two tubes (a drill tube and an inner tube). The drill tube and the inner tube have longitudinal concentric holes. The solution can be implemented in an eccentric tube configuration as well where the tubes have eccentric longitudinal holes. The inner tube presents at the bottom two control surfaces for controlling the orientation of the scoring dawg. The drill tube presents a sunk-in profile on the inside of the wall for housing the scoring dawg. The inner tube rotation relative to the drill tube actively controls the orientation of the scoring dawg and hence its penetration and retrieval from the core. The scoring dawg presents a shaft, two axially spaced arms, and a tooth. The two arms slide on the control surfaces of the inner tube. The tooth, when rotated, can penetrate or be extracted from the core. During drilling, the two tubes move together maintaining the scoring dawg completely outside the core. After the desired drilling depth has been reached the inner tube is rotated relative to the drill tube such that the tooth of the scoring dawg moves toward the central axis. By rotating the drill tube, the scoring dawg can score the core and so reduce its cross sectional area. The scoring dawg can also act as a stress concentrator for breaking the core in torsion or tension. After breaking the core, the scoring dawg can act as a core retention mechanism. For scoring, it requires the core to be attached to the rock. If the core is broken, the dawg can be used as a retention mechanism. The scoring dawg requires a hard-tip insert like tungsten carbide for scoring hard rocks. The relative rotation of the two tubes can be controlled manually or by an additional actuator. In the implemented design solution the bit rotation for scoring was in the same direction as the drilling. The device was tested for limestone cores and basalt cores. The torque required for breaking the

  20. Computerized scoring algorithms for the Autobiographical Memory Test.

    PubMed

    Takano, Keisuke; Gutenbrunner, Charlotte; Martens, Kris; Salmon, Karen; Raes, Filip

    2018-02-01

    Reduced specificity of autobiographical memories is a hallmark of depressive cognition. Autobiographical memory (AM) specificity is typically measured by the Autobiographical Memory Test (AMT), in which respondents are asked to describe personal memories in response to emotional cue words. Due to this free descriptive responding format, the AMT relies on experts' hand scoring for subsequent statistical analyses. This manual coding potentially impedes research activities in big data analytics such as large epidemiological studies. Here, we propose computerized algorithms to automatically score AM specificity for the Dutch (adult participants) and English (youth participants) versions of the AMT by using natural language processing and machine learning techniques. The algorithms showed reliable performances in discriminating specific and nonspecific (e.g., overgeneralized) autobiographical memories in independent testing data sets (area under the receiver operating characteristic curve > .90). Furthermore, outcome values of the algorithms (i.e., decision values of support vector machines) showed a gradient across similar (e.g., specific and extended memories) and different (e.g., specific memory and semantic associates) categories of AMT responses, suggesting that, for both adults and youth, the algorithms well capture the extent to which a memory has features of specific memories. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  1. A seven-year follow-up of intelligence test scores of foster grandparents.

    PubMed

    Troll, L E; Saltz, R; Dunin-Markiewicz, A

    1976-09-01

    After 7 years, a group of originally nonemployed poverty-level older people (over 60) who had been employed as foster grandparents were retested with the WAIS. Four WAIS subtests - Vocabulary, Similarities, Digit Span, and Block Design - were employed. Of the original group of 39, complete data were available for 28; 18 of these were still working on the project, and the other 10 had dropped out. Dropouts as a group tested lower originally and also showed more deterioration in functional health ratings over time. For the total group of 32 foster grandparents, three subtest scores showed stability over the 7 years. Only Digit Span showed a statistically significant drop. Neither age nor the initial level of health or WAIS scores was related to test-score changes over time.

  2. Robust joint score tests in the application of DNA methylation data analysis.

    PubMed

    Li, Xuan; Fu, Yuejiao; Wang, Xiaogang; Qiu, Weiliang

    2018-05-18

    Recently, differential variability has been shown to be valuable in evaluating the association of DNA methylation with the risks of complex human diseases. Statistical tests based on both differential methylation level and differential variability can be more powerful than those based only on differential methylation level. Anh and Wang (2013) proposed a joint score test (AW) to simultaneously test for differential methylation and differential variability. However, AW's method seems to be quite conservative and has not been fully compared with existing joint tests. We proposed three improved joint score tests, namely iAW.Lev, iAW.BF, and iAW.TM, and have made extensive comparisons with the joint likelihood ratio test (jointLRT), the Kolmogorov-Smirnov (KS) test, and the AW test. Systematic simulation studies showed that: 1) the three improved tests performed better (i.e., having larger power, while keeping nominal Type I error rates) than the other three tests for data with outliers and having different variances between cases and controls; 2) for data from normal distributions, the three improved tests had slightly lower power than jointLRT and AW. The analyses of two Illumina HumanMethylation27 data sets GSE37020 and GSE20080 and one Illumina Infinium MethylationEPIC data set GSE107080 demonstrated that the three improved tests had higher true validation rates than those from jointLRT, KS, and AW. The three proposed joint score tests are robust against violation of the normality assumption and the presence of outlying observations in comparison with the other three existing tests. Among the three proposed tests, iAW.BF seems to be the most robust and effective one for all simulated scenarios and also in real data analyses.

  3. The Relationship between Deductive Reasoning Ability, Test Anxiety, and Standardized Test Scores in a Latino Sample

    ERIC Educational Resources Information Center

    Rich, John D., Jr.; Fullard, William; Overton, Willis

    2011-01-01

    One hundred and twelve Latino students from Philadelphia participated in this study, which examined the development of deductive reasoning across adolescence, and the relation of reasoning to test anxiety and standardized test scores. As predicted, 11th and ninth graders demonstrated significantly more advanced reasoning than seventh graders.…

  4. A Diet Score Assessing Norwegian Adolescents’ Adherence to Dietary Recommendations—Development and Test-Retest Reproducibility of the Score

    PubMed Central

    Handeland, Katina; Kjellevold, Marian; Wik Markhus, Maria; Eide Graff, Ingvild; Frøyland, Livar; Lie, Øyvind; Skotheim, Siv; Stormark, Kjell Morten; Dahl, Lisbeth; Øyen, Jannike

    2016-01-01

    Assessment of adolescents’ dietary habits is challenging. Reliable instruments to monitor dietary trends are required to promote healthier behaviours in this group. The purpose of this cross-sectional study was to assess adolescents’ adherence to Norwegian dietary recommendations with a diet score and to report results from, and test-retest reliability of, the score. The diet score involved seven food groups and one physical activity indicator, and was applied to answers from a semi-quantitative food frequency questionnaire (FFQ) administered twice. Reproducibility of the score was assessed with Cohen’s Kappa (κ statistics) at an interval of three months. The setting was eight lower-secondary schools in Hordaland County, Norway, and subjects were adolescents (n = 472) aged 14–15 years and their caregivers. Results showed that the proportion of adolescents consistently classified by the diet score was 87.6% (κ = 0.465). For food groups, proportions ranged from 74.0% to 91.6% (κ = 0.249 to κ = 0.573). Less than 40% of the participants were found to adhere to recommendations for frequencies of eating fruits, vegetables, added sugar, and fish. Highest compliance with recommendations was seen for choosing water as a beverage and limiting the intake of red meat. The score was associated with parental socioeconomic status. The diet score was found to be reproducible at an acceptable level. Health-promoting work targeting adolescents should emphasize increasing the intake of recommended foods to approach nutritional guidelines. PMID:27483312
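    The gap between high raw agreement (87.6%) and a moderate κ (0.465) follows from how Cohen's kappa corrects for chance agreement. A minimal sketch of the standard formula is given below; the function name and the 2x2 test-retest counts are hypothetical and are not the study's data.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] is the number of subjects placed in category i at the first
    administration and category j at the second.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    p_observed = sum(table[i][i] for i in range(k)) / n
    row_marginals = [sum(table[i]) / n for i in range(k)]
    col_marginals = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance if the two ratings were independent.
    p_expected = sum(r * c for r, c in zip(row_marginals, col_marginals))
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts: a skewed table where most subjects fall in one category.
skewed = [[85, 5],
          [7, 3]]
# Hypothetical counts: a balanced table with lower raw agreement.
balanced = [[40, 10],
            [10, 40]]

print(f"skewed:   kappa = {cohens_kappa(skewed):.3f}")
print(f"balanced: kappa = {cohens_kappa(balanced):.3f}")
```

    When one category dominates, chance agreement is already high, so kappa can be modest even though most subjects are classified consistently.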

  5. A Diet Score Assessing Norwegian Adolescents' Adherence to Dietary Recommendations-Development and Test-Retest Reproducibility of the Score.

    PubMed

    Handeland, Katina; Kjellevold, Marian; Wik Markhus, Maria; Eide Graff, Ingvild; Frøyland, Livar; Lie, Øyvind; Skotheim, Siv; Stormark, Kjell Morten; Dahl, Lisbeth; Øyen, Jannike

    2016-07-29

    Assessment of adolescents' dietary habits is challenging. Reliable instruments to monitor dietary trends are required to promote healthier behaviours in this group. The purpose of this cross-sectional study was to assess adolescents' adherence to Norwegian dietary recommendations with a diet score and to report results from, and test-retest reliability of, the score. The diet score involved seven food groups and one physical activity indicator, and was applied to answers from a semi-quantitative food frequency questionnaire (FFQ) administered twice. Reproducibility of the score was assessed with Cohen's Kappa (κ statistics) at an interval of three months. The setting was eight lower-secondary schools in Hordaland County, Norway, and subjects were adolescents (n = 472) aged 14-15 years and their caregivers. Results showed that the proportion of adolescents consistently classified by the diet score was 87.6% (κ = 0.465). For food groups, proportions ranged from 74.0% to 91.6% (κ = 0.249 to κ = 0.573). Less than 40% of the participants were found to adhere to recommendations for frequencies of eating fruits, vegetables, added sugar, and fish. Highest compliance with recommendations was seen for choosing water as a beverage and limiting the intake of red meat. The score was associated with parental socioeconomic status. The diet score was found to be reproducible at an acceptable level. Health-promoting work targeting adolescents should emphasize increasing the intake of recommended foods to approach nutritional guidelines.

  6. A Bad Idea: National Standards Based on Test Scores

    ERIC Educational Resources Information Center

    Baker, Keith

    2010-01-01

    The justification for national standards is that test scores predict a nation's future economic success. There is no evidence that supports this assumption. There is evidence that it is wrong. For more than half a century, reformers have been trying to fix our schools with little success. The obvious conclusion is that something that can't be…

  7. America's Mediocre Test Scores: Education Crisis or Poverty Crisis?

    ERIC Educational Resources Information Center

    Petrilli, Michael J.; Wright, Brandon L.

    2016-01-01

    At a time when the national conversation is focused on lagging upward mobility, it is no surprise that many educators point to poverty as the explanation for mediocre test scores among U.S. students compared to those of students in other countries. If American teachers in struggling U.S. schools taught in Finland, says Finnish educator Pasi…

  8. Using Heteroskedastic Ordered Probit Models to Recover Moments of Continuous Test Score Distributions from Coarsened Data

    ERIC Educational Resources Information Center

    Reardon, Sean F.; Shear, Benjamin R.; Castellano, Katherine E.; Ho, Andrew D.

    2017-01-01

    Test score distributions of schools or demographic groups are often summarized by frequencies of students scoring in a small number of ordered proficiency categories. We show that heteroskedastic ordered probit (HETOP) models can be used to estimate means and standard deviations of multiple groups' test score distributions from such data. Because…

  9. Power and sample size evaluation for the Cochran-Mantel-Haenszel mean score (Wilcoxon rank sum) test and the Cochran-Armitage test for trend.

    PubMed

    Lachin, John M

    2011-11-10

    The power of a chi-square test, and thus the required sample size, are a function of the noncentrality parameter that can be obtained as the limiting expectation of the test statistic under an alternative hypothesis specification. Herein, we apply this principle to derive simple expressions for two tests that are commonly applied to discrete ordinal data. The Wilcoxon rank sum test for the equality of distributions in two groups is algebraically equivalent to the Mann-Whitney test. The Kruskal-Wallis test applies to multiple groups. These tests are equivalent to a Cochran-Mantel-Haenszel mean score test using rank scores for a set of C-discrete categories. Although various authors have assessed the power function of the Wilcoxon and Mann-Whitney tests, herein it is shown that the power of these tests with discrete observations, that is, with tied ranks, is readily provided by the power function of the corresponding Cochran-Mantel-Haenszel mean scores test for two and R > 2 groups. These expressions yield results virtually identical to those derived previously for rank scores and also apply to other score functions. The Cochran-Armitage test for trend assesses whether there is a monotonically increasing or decreasing trend in the proportions with a positive outcome or response over the C-ordered categories of an ordinal independent variable, for example, dose. Herein, it is shown that the power of the test is a function of the slope of the response probabilities over the ordinal scores assigned to the groups that yields simple expressions for the power of the test. Copyright © 2011 John Wiley & Sons, Ltd.
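    The relationship between the noncentrality parameter and power can be illustrated for the 1-degree-of-freedom case, which covers the two-group mean score test and the trend test. The sketch below is not the paper's derivation: it relies only on the fact that a noncentral chi-square variable with 1 df is the square of a unit-variance normal variable with mean equal to the square root of the noncentrality parameter. The function name and the noncentrality values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def chi2_power_1df(noncentrality, alpha=0.05):
    """Power of a 1-df chi-square test with the given noncentrality parameter.

    The statistic is distributed as (Z + sqrt(noncentrality))**2 with Z
    standard normal, so power is a two-tailed normal probability.
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)  # sqrt of the chi-square critical value
    shift = sqrt(noncentrality)
    return (1 - z.cdf(critical - shift)) + z.cdf(-critical - shift)


# Hypothetical noncentrality values; in practice the parameter grows in
# proportion to the sample size for a fixed alternative.
for ncp in (0.0, 2.0, 5.0, 7.85, 10.5):
    print(f"noncentrality {ncp:5.2f} -> power = {chi2_power_1df(ncp):.3f}")
```

    With a noncentrality of zero the function returns the significance level, which is a quick sanity check on the calculation.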

  10. The Implications of Family Size and Birth Order for Test Scores and Behavioral Development

    ERIC Educational Resources Information Center

    Silles, Mary A.

    2010-01-01

    This article, using longitudinal data from the National Child Development Study, presents new evidence on the effects of family size and birth order on test scores and behavioral development at age 7, 11 and 16. Sibling size is shown to have an adverse causal effect on test scores and behavioral development. For any given family size, first-borns…

  11. The Weighted Airman Promotion System: Standardizing Test Scores

    DTIC Science & Technology

    2008-01-01

    This document and trademark(s) contained herein are protected by law as indicated in a notice appearing later in this work. This electronic... Performing organization: Rand Corporation, PO Box 2138, Santa Monica

  12. Intelligence Tests and the Immigration Act of 1924.

    ERIC Educational Resources Information Center

    Snyderman, Mark; Herrnstein, R. J.

    1983-01-01

    An examination of the historical record fails to uncover any support for the claim that the racially biased Immigration Act of 1924 was passed with the help of the intelligence testing community. (GC)

  13. Qualitative Dimensions in Scoring the Rey Visual Memory Test of Malingering.

    ERIC Educational Resources Information Center

    Griffin, G. A. Elmer; And Others

    1996-01-01

    A new qualitative scoring system for the Rey Visual Memory Test was tested for its ability to distinguish between malingerers and nonmalingerers. The new system, based on the types of errors made, was able to distinguish between 53 psychiatrically disabled and 64 normal nonmalingerers, and between nonmalingerers and 91 possible malingerers. (SLD)

  14. Correlations between the Hand Test Pathology score and Personality Assessment Inventory scales for pain clinic patients.

    PubMed

    George, J M; Wagner, E E

    1995-06-01

    Pearson correlations between the Hand Test Pathology (PATH) score and Personality Assessment Inventory scales produced a cluster of relationships characteristic of an antisocial orientation. Likewise, PATH significantly differentiated between a "P" (Pathology) group flagged by a high Negative Impression score on the inventory, and an "N" (Normal) group of 100 pain patients. It was suggested that the interpretive simplicity of Hand Test scores renders the scores amenable to further correlational studies involving the inventory.

  15. Improving Test Score Reporting: Perspectives from the ETS Score Reporting Conference. Research Report. ETS RR-11-45

    ERIC Educational Resources Information Center

    Zapata-Rivera, Diego, Ed.; Zwick, Rebecca, Ed.

    2011-01-01

    This volume includes 3 papers based on presentations at a workshop on communicating assessment information to particular audiences, held at Educational Testing Service (ETS) on November 4th, 2010, to explore some issues that influence score reports and new advances that contribute to the effectiveness of these reports. Jessica Hullman, Rebecca…

  16. Standardized Testing of Special Education Students: A Comparison of Service Type and Test Scores

    ERIC Educational Resources Information Center

    Hogan-Young, Christine

    2013-01-01

    The purpose of this study was to determine if there was a difference in Tennessee Comprehensive Assessment Program Modified Academic Achievement Standards (TCAP MAAS) achievement test scores for special education students who receive their instruction in the resource classroom or in an inclusion classroom. The study involved third, fourth, and…

  17. Validity of Alternative Cut-Off Scores for the Back-Saver Sit and Reach Test

    ERIC Educational Resources Information Center

    Looney, Marilyn A.; Gilbert, Jennie

    2012-01-01

    The purpose of the study was to determine if currently used FITNESSGRAM[R] cut-off scores for the Back Saver Sit and Reach Test had the best criterion-referenced validity evidence for 6-12 year old children. Secondary analyses of an existing data set focused on the passive straight leg raise and Back Saver Sit and Reach Test flexibility scores of…

  18. Interpretation and Utilization of Scores on the Air Force Officer Qualifying Test.

    ERIC Educational Resources Information Center

    Miller, Robert E.

    The report summarizes a large body of data relevant to the proper interpretation and use of aptitude scores on the Air Force Officer Qualifying Test (AFOQT). Included are descriptions of the AFOQT testing program and the test itself. Technical data include an extensive sampling of validation studies covering predictors of success in pilot…

  19. Stochastic Processes as True-Score Models for Highly Speeded Mental Tests.

    ERIC Educational Resources Information Center

    Moore, William E.

    The previous theoretical development of the Poisson process as a strong model for the true-score theory of mental tests is discussed, and additional theoretical properties of the model from the standpoint of individual examinees are developed. The paper introduces the Erlang process as a family of test theory models and shows in the context of…

  20. A physical function test for use in the intensive care unit: validity, responsiveness, and predictive utility of the physical function ICU test (scored).

    PubMed

    Denehy, Linda; de Morton, Natalie A; Skinner, Elizabeth H; Edbrooke, Lara; Haines, Kimberley; Warrillow, Stephen; Berney, Sue

    2013-12-01

    Several tests have recently been developed to measure changes in patient strength and functional outcomes in the intensive care unit (ICU). The original Physical Function ICU Test (PFIT) demonstrates reliability and sensitivity. The aims of this study were to further develop the original PFIT, to derive an interval score (the PFIT-s), and to test the clinimetric properties of the PFIT-s. A nested cohort study was conducted. One hundred forty-four and 116 participants performed the PFIT at ICU admission and discharge, respectively. Original test components were modified using principal component analysis. Rasch analysis examined the unidimensionality of the PFIT, and an interval score was derived. Correlations tested validity, and multiple regression analyses investigated predictive ability. Responsiveness was assessed using the effect size index (ESI), and the minimal clinically important difference (MCID) was calculated. The shoulder lift component was removed. Unidimensionality of combined admission and discharge PFIT-s scores was confirmed. The PFIT-s displayed moderate convergent validity with the Timed "Up & Go" Test (r=-.60), the Six-Minute Walk Test (r=.41), and the Medical Research Council (MRC) sum score (rho=.49). The ESI of the PFIT-s was 0.82, and the MCID was 1.5 points (interval scale range=0-10). A higher admission PFIT-s score was predictive of: an MRC score of ≥48, increased likelihood of discharge home, reduced likelihood of discharge to inpatient rehabilitation, and reduced acute care hospital length of stay. Scoring of sit-to-stand assistance required is subjective, and cadence cutpoints used may not be generalizable. The PFIT-s is a safe and inexpensive test of physical function with high clinical utility. It is valid, responsive to change, and predictive of key outcomes. It is recommended that the PFIT-s be adopted to test physical function in the ICU.

  1. Univariate and Bivariate Loglinear Models for Discrete Test Score Distributions.

    ERIC Educational Resources Information Center

    Holland, Paul W.; Thayer, Dorothy T.

    2000-01-01

    Applied the theory of exponential families of distributions to the problem of fitting the univariate histograms and discrete bivariate frequency distributions that often arise in the analysis of test scores. Considers efficient computation of the maximum likelihood estimates of the parameters using Newton's Method and computationally efficient…

  2. Allele-sharing models: LOD scores and accurate linkage tests.

    PubMed

    Kong, A; Cox, N J

    1997-11-01

    Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested.

  3. Allele-sharing models: LOD scores and accurate linkage tests.

    PubMed Central

    Kong, A; Cox, N J

    1997-01-01

    Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested. PMID:9345087

  4. Beyond Correlations: Usefulness of High School GPA and Test Scores in Making College Admissions Decisions

    ERIC Educational Resources Information Center

    Sawyer, Richard

    2013-01-01

    Correlational evidence suggests that high school GPA is better than admission test scores in predicting first-year college GPA, although test scores have incremental predictive validity. The usefulness of a selection variable in making admission decisions depends in part on its predictive validity, but also on institutions' selectivity and…

  5. Graduate Students' Administration and Scoring Errors on the Woodcock-Johnson III Tests of Cognitive Abilities

    ERIC Educational Resources Information Center

    Ramos, Erica; Alfonso, Vincent C.; Schermerhorn, Susan M.

    2009-01-01

    The interpretation of cognitive test scores often leads to decisions concerning the diagnosis, educational placement, and types of interventions used for children. Therefore, it is important that practitioners administer and score cognitive tests without error. This study assesses the frequency and types of examiner errors that occur during the…

  6. Evaluating the Stability of Test Score Means for the "TOEIC"® Speaking and Writing Tests. Research Report. ETS RR-17-50

    ERIC Educational Resources Information Center

    Qu, Yanxuan; Huo, Yan; Chan, Eric; Shotts, Matthew

    2017-01-01

    For educational tests, it is critical to maintain consistency of score scales and to understand the sources of variation in score means over time. This practice helps to ensure that interpretations about test takers' abilities are comparable from one administration (or one form) to another. This study examines the consistency of reported scores…

  7. The effect of human immunodeficiency virus type 1 antibody status on military applicant aptitude test scores.

    PubMed

    Arday, D R; Brundage, J F; Gardner, L I; Goldenbaum, M; Wann, F; Wright, S

    1991-06-15

    The authors conducted a population-based study to attempt to estimate the effect of human immunodeficiency virus type 1 (HIV-1) seropositivity on Armed Services Vocational Aptitude Battery test scores in otherwise healthy individuals with early HIV-1 infection. The Armed Services Vocational Aptitude Battery is a 10-test written multiple aptitude battery administered to all civilian applicants for military enlistment prior to serologic screening for HIV-1 antibodies. A total of 975,489 induction testing records containing both Armed Services Vocational Aptitude Battery and HIV-1 results from October 1985 through March 1987 were examined. An analysis data set (n = 7,698) was constructed by choosing five controls for each of the 1,283 HIV-1-positive cases, matched on five-digit ZIP code, and a multiple linear regression analysis was performed to control for demographic and other factors that might influence test scores. Years of education was the strongest predictor of test scores, raising an applicant's score on a composite test nearly 0.16 standard deviation per year. The HIV-1-positive effect on the composite score was -0.09 standard deviation (99% confidence interval -0.17 to -0.02). Separate regressions on each component test within the battery showed HIV-1 effects between -0.39 and +0.06 standard deviation. The two Armed Services Vocational Aptitude Battery component tests felt a priori to be the most sensitive to HIV-1-positive status showed the least decrease with seropositivity. Much of the variability in test scores was not predicted by either HIV-1 serostatus or the demographic and other factors included in the model. There appeared to be little evidence of a strong HIV-1 effect.

  8. Low aerobic fitness and obesity are associated with lower standardized test scores in children.

    PubMed

    Roberts, Christian K; Freed, Benjamin; McCarthy, William J

    2010-05-01

    To investigate whether aerobic fitness and obesity in school children are associated with standardized test performance. Ethnically diverse (n = 1989) 5th, 7th, and 9th graders attending California schools comprised the sample. Aerobic fitness was determined by a 1-mile run/walk test; body mass index (BMI) was obtained from state-mandated measurements. California standardized test scores were obtained from the school district. Students whose mile run/walk times exceeded California Fitnessgram standards or whose BMI exceeded Centers for Disease Control sex- and age-specific body weight standards scored lower on California standardized math, reading, and language tests than students with desirable BMI status or fitness level, even after controlling for parent education among other covariates. Ethnic differences in standardized test scores were consistent with ethnic differences in obesity status and aerobic fitness. BMI-for-age was no longer a significant multivariate predictor when covariates included fitness level. Low aerobic fitness is common among youth and varies among ethnic groups, and aerobic fitness level predicts performance on standardized tests across ethnic groups. More research is needed to uncover the physiological mechanisms by which aerobic fitness may contribute to performance on standardized academic tests.

  9. The Emphasis of Student Test Scores in Teacher Appraisal Systems

    ERIC Educational Resources Information Center

    Smith, William C.; Kubacka, Katarzyna

    2017-01-01

    Over the past 30 years teachers have been held increasingly accountable for the quality of education in their classroom. During this transition, the line between teacher appraisals, traditionally an instrument for continuous formative teacher feedback, and summative teacher evaluations has blurred. Student test scores, as an "objective"…

  10. Rising Stars: High School's Change Process Produces Higher Test Scores.

    ERIC Educational Resources Information Center

    McCown, Claire; Runnebaum, Robert

    2001-01-01

    Presents Bishop Ward High School (Kansas) as a case study that has seen great improvements in standardized testing results by changing its approach. States that realignment of curriculum, adjusting instructional strategies, and accommodating students with special needs are important aspects of raising assessment scores in high schools. (CJW)

  11. Comparing the Effects of Elementary Music and Visual Arts Lessons on Standardized Mathematics Test Scores

    ERIC Educational Resources Information Center

    King, Molly Elizabeth

    2016-01-01

    The purpose of this quantitative, causal-comparative study was to compare the effect elementary music and visual arts lessons had on third through sixth grade standardized mathematics test scores. Inferential statistics were used to compare the differences between test scores of students who took in-school, elementary, music instruction during the…

  12. Many Children Left Behind? Textbooks and Test Scores in Kenya. NBER Working Paper No. 13300

    ERIC Educational Resources Information Center

    Glewwe, Paul; Kremer, Michael; Moulin, Sylvie

    2007-01-01

    A randomized evaluation suggests that a program which provided official textbooks to randomly selected rural Kenyan primary schools did not increase test scores for the average student. In contrast, the previous literature suggests that textbook provision has a large impact on test scores. Disaggregating the results by students' initial academic…

  13. Relationship of Elementary and Secondary School Achievement Test Scores to Later Academic Success.

    ERIC Educational Resources Information Center

    Loyd, Brenda H.; And Others

    1980-01-01

    This study investigated the relationship between achievement test scores on the Iowa Tests of Basic Skills (ITBS) and Iowa Tests of Educational Development (ITED), and high school and college grade point average. Support for the predictive validity of the ITBS and ITED achievement test batteries is provided. (Author/GK)

  14. The Impact of Inclusion and Resource Instruction on Standardized Test Scores of Special Education Students

    ERIC Educational Resources Information Center

    Derico, Vontrice L.

    2017-01-01

    The purpose of the proposed quasi-experimental quantitative study was to determine if students who were taught in the inclusive setting yielded higher standardized test scores compared to students who were taught in the resource setting. The researcher analyzed the standardized test scores, in the areas of Language Arts, Reading, and Mathematics…

  15. STABILITY OF ACADEMIC APTITUDE AND READING TEST SCORES OF MOBILE AND NON-MOBILE DISADVANTAGED CHILDREN.

    ERIC Educational Resources Information Center

    JUSTMAN, JOSEPH

    CHANGES IN ACADEMIC APTITUDE AND ACHIEVEMENT TEST SCORES OF PUPILS ATTENDING PUBLIC SCHOOLS IN DISADVANTAGED AREAS IN NEW YORK CITY WERE INVESTIGATED. AN ATTEMPT WAS MADE TO DETERMINE WHETHER VARYING DEGREES OF MOBILITY WERE ASSOCIATED WITH VARIATION IN CHANGES IN TEST SCORES. THE CUMULATIVE RECORD CARDS OF SIXTH-GRADE PUPILS WERE EXAMINED TO…

  16. Kindergarten Black-White Test Score Gaps: Replicating and Updating Previous Findings with New National Data

    ERIC Educational Resources Information Center

    Quinn, David

    2014-01-01

    A substantial body of evidence has shown large academic test score gaps between black and white students in early childhood. These gaps remain, and probably grow, as students progress through school. Many researchers have sought to explain these persistent test score gaps, and particularly, to understand the role of students' socio-economic status…

  17. The Influence of an NCLB Accountability Plan on the Distribution of Student Test Score Gains

    ERIC Educational Resources Information Center

    Springer, Matthew G.

    2008-01-01

    Previous research on the effect of accountability programs on the distribution of student test score gains is decidedly mixed. This study examines the issue by estimating an educational production function in which test score gains are a function of the incentives schools have to focus instruction on below-proficient students. NCLB's threat of…

  18. Test and Score Data Summary for TOEFL[R] Internet-Based and Paper-Based Tests. January 2008-December 2008 Test Data

    ERIC Educational Resources Information Center

    Educational Testing Service, 2008

    2008-01-01

    The Test of English as a Foreign Language[TM], better known as TOEFL[R], is designed to measure the English-language proficiency of people whose native language is not English. TOEFL scores are accepted by more than 6,000 colleges, universities, and licensing agencies in 130 countries. The test is also used by governments, and scholarship and…

  19. A Comparison of the Approaches of Generalizability Theory and Item Response Theory in Estimating the Reliability of Test Scores for Testlet-Composed Tests

    ERIC Educational Resources Information Center

    Lee, Guemin; Park, In-Yong

    2012-01-01

    Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…

  20. How Changes in Families and Schools Are Related to Trends in Black-White Test Scores

    ERIC Educational Resources Information Center

    Berends, Mark; Lucas, Samuel R.; Penaloza, Roberto V.

    2008-01-01

    Through several decades of research, a great deal has been written about trends in black-white test scores and the factors that may explain the gaps in different subject areas. Only a few studies have examined the changing relationships between gaps in students' test scores and family and school measures in nationally representative data over…

  1. Clock Drawing Test and the diagnosis of amnestic mild cognitive impairment: can more detailed scoring systems do the work?

    PubMed

    Rubínová, Eva; Nikolai, Tomáš; Marková, Hana; Siffelová, Kamila; Laczó, Jan; Hort, Jakub; Vyhnálek, Martin

    2014-01-01

    The Clock Drawing Test is a frequently used cognitive screening test with several scoring systems in elderly populations. We compare simple and complex scoring systems and evaluate the usefulness of the combination of the Clock Drawing Test with the Mini-Mental State Examination to detect patients with mild cognitive impairment. Patients with amnestic mild cognitive impairment (n = 48) and age- and education-matched controls (n = 48) underwent neuropsychological examinations, including the Clock Drawing Test and the Mini-Mental State Examination. Clock drawings were scored by three blinded raters using one simple (6-point scale) and two complex (17- and 18-point scales) systems. The sensitivity and specificity of these scoring systems used alone and in combination with the Mini-Mental State Examination were determined. Complex scoring systems, but not the simple scoring system, were significant predictors of the amnestic mild cognitive impairment diagnosis in logistic regression analysis. At equal levels of sensitivity (87.5%), the Mini-Mental State Examination showed higher specificity (31.3%, compared with 12.5% for the 17-point Clock Drawing Test scoring scale). The combination of Clock Drawing Test and Mini-Mental State Examination scores increased the area under the curve (0.72; p < .001) and increased specificity (43.8%), but did not increase sensitivity, which remained high (85.4%). A simple 6-point scoring system for the Clock Drawing Test did not differentiate between healthy elderly and patients with amnestic mild cognitive impairment in our sample. Complex scoring systems were slightly more efficient, yet still were characterized by high rates of false-positive results. We found psychometric improvement using combined scores from the Mini-Mental State Examination and the Clock Drawing Test when complex scoring systems were used. The results of this study support the benefit of using combined scores from simple methods.
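
    The sensitivity and specificity percentages quoted above can be checked against confusion-matrix counts. The sketch below is illustrative only: the counts are back-calculated from the reported percentages under the assumption that all 48 patients and all 48 controls were classified, and they are not reported in the abstract. The function names are hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of actual cases that the test flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of actual non-cases that the test leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Assumed counts (48 patients, 48 controls), inferred from the percentages.
N_PATIENTS = 48
N_CONTROLS = 48

# 87.5% sensitivity corresponds to 42 of 48 patients detected.
sens_equal_level = sensitivity(42, N_PATIENTS - 42)

# 31.3% (MMSE) and 12.5% (17-point CDT) specificity correspond to
# 15 and 6 of 48 controls correctly classified.
spec_mmse = specificity(15, N_CONTROLS - 15)
spec_cdt17 = specificity(6, N_CONTROLS - 6)

# Combined scores: 85.4% sensitivity (41/48), 43.8% specificity (21/48).
sens_combined = sensitivity(41, N_PATIENTS - 41)
spec_combined = specificity(21, N_CONTROLS - 21)

if __name__ == "__main__":
    for label, value in [
        ("sensitivity at matched level", sens_equal_level),
        ("MMSE specificity", spec_mmse),
        ("17-point CDT specificity", spec_cdt17),
        ("combined sensitivity", sens_combined),
        ("combined specificity", spec_combined),
    ]:
        print(f"{label}: {100 * value:.2f}%")
```

    Under these assumed counts, a specificity of 12.5% means that 42 of the 48 controls would be false positives, which matches the abstract's remark about high false-positive rates.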

  2. The Relationship between Academic Averages of Primary School Science and Technology Class and Test Sub-Test Scores of Placement Test of Science

    ERIC Educational Resources Information Center

    Guzeller, Cem Oktay

    2012-01-01

    This research examines the relationship between measures from the science and technology class of the 6th, 7th, and 8th grades (written exam scores, project, participation in class activities and performance work, and year-end academic success point averages) and the sub-test raw scores of LDT science for the same grades. Academic success point averages were used as…

  3. Racial Differences in Mathematics Test Scores for Advanced Mathematics Students

    ERIC Educational Resources Information Center

    Minor, Elizabeth Covay

    2016-01-01

    Research on achievement gaps has found that achievement gaps are larger for students who take advanced mathematics courses compared to students who do not. Focusing on the advanced mathematics student achievement gap, this study found that African American advanced mathematics students have significantly lower test scores and are less likely to be…

  4. Commentary on "Validating the Interpretations and Uses of Test Scores"

    ERIC Educational Resources Information Center

    Brennan, Robert L.

    2013-01-01

    Kane's paper "Validating the Interpretations and Uses of Test Scores" is the most complete and clearest discussion yet available of the argument-based approach to validation. At its most basic level, validation as formulated by Kane is fundamentally a simply-stated two-step enterprise: (1) specify the claims inherent in a particular interpretation…

  5. Using Test Scores from Students with Disabilities in Teacher Evaluation

    ERIC Educational Resources Information Center

    Buzick, Heather M.; Jones, Nathan D.

    2015-01-01

    Much of the recent focus of educational policymakers has been on improving the measurement of teacher effectiveness. Linking student growth to teacher effects has been a large part of reform efforts. To date, neither researchers nor practitioners have arrived at a consensus on how to treat test scores from students with disabilities in…

  6. Piloting a Polychotomous Partial-Credit Scoring Procedure in a Multiple-Choice Test

    ERIC Educational Resources Information Center

    Tsopanoglou, Antonios; Ypsilandis, George S.; Mouti, Anna

    2014-01-01

    Multiple-choice (MC) tests are frequently used to measure language competence because they are quick, economical and straightforward to score. While degrees of correctness have been investigated for partially correct responses in combined-response MC tests, degrees of incorrectness in distractors and the role they play in determining the…

  7. What's in a Teacher Test? Assessing the Relationship between Teacher Licensure Test Scores and Student STEM Achievement and Course-Taking. Working Paper 158

    ERIC Educational Resources Information Center

    Goldhaber, Dan; Gratz, Trevor; Theobald, Roddy

    2016-01-01

    We investigate the relationship between teacher licensure test scores and student test achievement and high school course-taking. We focus on three subject/grade combinations--middle school math, ninth-grade algebra and geometry, and ninth-grade biology--and find evidence that a teacher's basic skills test scores are modestly predictive of student…

  8. The Bender Gestalt Test with the Human Figure Drawing Test for Young School Children. A Manual for Use with the Koppitz Scoring System.

    ERIC Educational Resources Information Center

    Koppitz, Elizabeth Munsterberg

    Presented is a manual for scoring the Bender Gestalt Test and the Human Figure Drawing Test for screening and diagnostic uses with emotionally disturbed, brain damaged, or perceptually handicapped 5- to 11-year-old children. Given are suggestions for administering and scoring the Bender test which examines distortion of shape, rotation,…

  9. A pretest prognostic score to assess patients undergoing exercise or pharmacological stress testing.

    PubMed

    Morise, Anthony; Evans, Matthew; Jalisi, Farrukh; Shetty, Rajendra; Stauffer, Marc

    2007-02-01

    A previously developed pretest score was validated to stratify patients presenting for exercise testing with suspected coronary disease according to the presence of angiographic coronary disease. Our goal was to determine how well this pretest score risk stratified patients undergoing pharmacological and exercise stress tests concerning prognostic endpoints. Retrospective cohort analysis. University hospital stress laboratory. 7452 unselected ambulatory patients with symptoms of suspected coronary disease undergoing stress testing between 1995 and 2004. All-cause death, cardiac death and non-fatal myocardial infarction. The rate of all-cause death was 5.5% (CI 5.0 to 6.1) with 4.3 (SD 2.4) years of follow-up (Exercise 2.8% (CI 2.3 to 3.2) v Pharmacological group 11.9% (CI 10.5 to 13.3); p<0.001). The rate of cardiac death/myocardial infarction was 2.6% (CI 2.2 to 3.0) (Exercise 1.4% (CI 1.1 to 1.8) v Pharmacological group 5.3% (CI 4.3 to 6.2); p<0.001). In both groups, stratification by pretest score was significant for all-cause death and the combined endpoint. However, stratification was more effective in the pharmacological group using the combined endpoint rather than all-cause death. Pharmacological stress patients in intermediate and high risk groups were at higher risk than their respective exercise test cohorts. Referral for pharmacological stress testing was found to be an independent predictor of time to death (2.7 (CI 2.0 to 3.6); p<0.001). A pretest score previously validated to stratify according to angiographic outcomes, effectively risk stratified pharmacological and exercise stress patients according to the combined endpoint of cardiac death/myocardial infarction.

  10. TOEFL iBT Speaking Test Scores as Indicators of Oral Communicative Language Proficiency

    ERIC Educational Resources Information Center

    Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela

    2012-01-01

    Scores assigned by trained raters and by an automated scoring system (SpeechRater[TM]) on the speaking section of the TOEFL iBT[TM] were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…

  11. Association between the gait pattern characteristics of older people and their two-step test scores.

    PubMed

    Kobayashi, Yoshiyuki; Ogata, Toru

    2018-04-27

    The Two-Step test is one of three official tests authorized by the Japanese Orthopedic Association to evaluate the risk of locomotive syndrome (a condition of reduced mobility caused by an impairment of the locomotive organs). It has been reported that the Two-Step test score has a good correlation with one's walking ability; however, its association with the gait pattern of older people during normal walking is still unknown. Therefore, this study aims to clarify the associations between the gait patterns of older people observed during normal walking and their Two-Step test scores. We analyzed the whole waveforms obtained from the lower-extremity joint angles and joint moments of 26 older people in various stages of locomotive syndrome using principal component analysis (PCA). The PCA was conducted using a 260 × 2424 input matrix constructed from the participants' time-normalized pelvic and right-lower-limb-joint angles along three axes (ten trials of 26 participants, 101 time points, 4 angles, 3 axes, and 2 variable types per trial). The Pearson product-moment correlation coefficient between the scores of the principal component vectors (PCVs) and the scores of the Two-Step test revealed that only one PCV (PCV 2) among the 61 obtained relevant PCVs is significantly related to the score of the Two-Step test. We therefore concluded that the joint angles and joint moments related to PCV 2 (ankle plantar-flexion, ankle plantar-flexor moments during the late stance phase, and ranges of motion and moments on the hip, knee, and ankle joints in the sagittal plane during the entire stance phase) are the motions associated with the Two-Step test.
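
    The dimensions of the PCA input matrix follow directly from the counts given in the abstract, as the short sketch below shows. It also includes a plain Pearson product-moment correlation, the statistic the abstract names for relating component scores to Two-Step scores. The constant and function names are hypothetical, and no study data are reproduced; the example values in the final lines are arbitrary.

```python
from math import sqrt

# Counts as stated in the abstract.
TRIALS_PER_PARTICIPANT = 10
PARTICIPANTS = 26
TIME_POINTS = 101
ANGLES = 4
AXES = 3
VARIABLE_TYPES = 2


def input_matrix_shape() -> tuple[int, int]:
    """Rows are trials; columns are all waveform samples from one trial."""
    rows = TRIALS_PER_PARTICIPANT * PARTICIPANTS
    cols = TIME_POINTS * ANGLES * AXES * VARIABLE_TYPES
    return rows, cols


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    print(input_matrix_shape())
    # Arbitrary illustrative values, not data from the study.
    print(pearson_r([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
```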

  12. Mixed handedness and achievement test scores of middle school boys.

    PubMed

    Sarma, P S B

    2008-10-01

    The purpose of the study was to replicate findings of an earlier study of fourth grade boys manifesting mixed handedness with a sample. Among 32 mixed-handed boys in Grades 6 to 8, the right-handed writer, left-handed thrower group obtained low spelling scores (Normal Curve Equivalent Scores) on the California Achievement Test significantly more frequently than the left-handed writer, right-handed thrower group. These findings are consistent with data for Grade 4 boys in the earlier study. Findings strengthen the hypotheses that mixed handedness is not a unitary neuropsychological entity and that boys who write with the right hand and throw with the left hand might be at risk for certain academic deficits.

  13. Validity and reliability of Abbreviated Mental Test Score (AMTS) among older Iranian.

    PubMed

    Foroughan, Mahshid; Wahlund, Lars-Olof; Jafari, Zahra; Rahgozar, Mehdi; Farahani, Ida G; Rashedi, Vahid

    2017-11-01

    Cognitive impairment is common among older people and is associated with increased morbidity and mortality. The main aim of this study was to evaluate the validity of the Persian version of the Abbreviated Mental Test Score (AMTS) as a screening tool for dementia. Data were obtained from a cross-sectional study. One hundred and one older adults who were members of the Iranian Alzheimer Association and 101 of their siblings were entered into this study by convenience sampling. The Diagnostic and Statistical Manual of Mental Disorders, 4th edition, criteria for diagnosing dementia and the Mini-Mental State Examination were used as the study tools. The gathered data were analyzed by the Mann-Whitney U-test, the Kruskal-Wallis test, Spearman's rank correlation coefficient, and the receiver-operating characteristic. The AMTS could successfully differentiate the dementia group from the non-dementia group. Scores were significantly correlated with Diagnostic and Statistical Manual of Mental Disorders diagnosis for dementia and Mini-Mental State Examination scores (P < 0.001). Educational level (P < 0.001) and male sex (P = 0.015) were positively associated with AMTS, whereas (P < 0.001) was negatively associated with AMTS. Total Cronbach's α coefficient was 0.90. The scores 6 and 7 showed the optimum balance between sensitivity (99% and 94%, respectively) and specificity (85% and 86%, respectively). The Persian version of the AMTS is a valid cognitive assessment tool for older Iranian adults and can be used for dementia screening in Iran. © 2017 Japanese Psychogeriatric Society.
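
    One common way to compare candidate cut-offs is Youden's J statistic (sensitivity + specificity - 1). The abstract does not say which criterion the authors used, so the sketch below is an assumption applied to the reported figures rather than a reconstruction of their analysis.

```python
# Sensitivity and specificity at each AMTS cut-off, as reported in the abstract.
reported = {6: (0.99, 0.85), 7: (0.94, 0.86)}

def youden_j(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1

# J per cut-off, rounded to two decimals for display.
j_by_cutoff = {
    cutoff: round(youden_j(se, sp), 2) for cutoff, (se, sp) in reported.items()
}
print(j_by_cutoff)
```

    Under this criterion both cut-offs score close together, which is consistent with the abstract naming both 6 and 7.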

  14. A general equation to obtain multiple cut-off scores on a test from multinomial logistic regression.

    PubMed

    Bersabé, Rosa; Rivas, Teresa

    2010-05-01

    The authors derive a general equation to compute multiple cut-offs on a total test score in order to classify individuals into more than two ordinal categories. The equation is derived from the multinomial logistic regression (MLR) model, which is an extension of the binary logistic regression (BLR) model to accommodate polytomous outcome variables. From this analytical procedure, cut-off scores are established at the test score (the predictor variable) at which an individual is as likely to be in category j as in category j+1 of an ordinal outcome variable. The application of the complete procedure is illustrated by an example with data from an actual study on eating disorders. In this example, two cut-off scores on the Eating Attitudes Test (EAT-26) scores are obtained in order to classify individuals into three ordinal categories: asymptomatic, symptomatic and eating disorder. Diagnoses were made from the responses to a self-report (Q-EDD) that operationalises DSM-IV criteria for eating disorders. Alternatives to the MLR model to set multiple cut-off scores are discussed.
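
    The equal-likelihood condition described above can be sketched under the standard baseline-category parametrization, log(P(j)/P(ref)) = a_j + b_j·x, where setting P(j) = P(j+1) gives x = (a_j - a_{j+1}) / (b_{j+1} - b_j). This is a minimal illustration with hypothetical coefficients; the function names are not from the paper, and the authors' published equation may be written differently.

```python
import math

def category_probs(x, intercepts, slopes):
    """Baseline-category MLR probabilities; category 0 is the reference (a=0, b=0)."""
    logits = [0.0] + [a + b * x for a, b in zip(intercepts, slopes)]
    denom = sum(math.exp(z) for z in logits)
    return [math.exp(z) / denom for z in logits]

def cutoff(a_j, b_j, a_next, b_next):
    """Test score at which categories j and j+1 are equally likely."""
    return (a_j - a_next) / (b_next - b_j)

# Hypothetical coefficients for three ordinal categories (0, 1, 2).
intercepts = [-4.0, -10.0]   # a_1, a_2
slopes = [0.4, 0.7]          # b_1, b_2

cut_01 = cutoff(0.0, 0.0, intercepts[0], slopes[0])                   # category 0 vs 1
cut_12 = cutoff(intercepts[0], slopes[0], intercepts[1], slopes[1])   # category 1 vs 2
print(cut_01, cut_12)
```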

  15. Report: States See Test-Score Gains

    ERIC Educational Resources Information Center

    Viadero, Debra

    2004-01-01

    This article discusses a report from Education Trust, a Washington-based research and advocacy group. The report says almost half the states have seen rising math scores on their state exams for elementary school pupils since the federal No Child Left Behind law was enacted. It also states that reading scores have improved among 4th and 5th…

  16. School Choice in Suburbia: Test Scores, Race, and Housing Markets

    ERIC Educational Resources Information Center

    Dougherty, Jack; Harelson, Jeffrey; Maloney, Laura; Murphy, Drew; Smith, Russell; Snow, Michael; Zannoni, Diane

    2009-01-01

    Home buyers exercise school choice when shopping for a private residence due to its location in a public school district or attendance area. In this quantitative study of one Connecticut suburban district, we measure the effect of elementary school test scores and racial composition on home buyers' willingness to purchase single-family homes over…

  17. The Effect of Mobility on Texas Assessment of Knowledge and Skills Test Scores

    ERIC Educational Resources Information Center

    Alvarez, Ray

    2006-01-01

    This research studies the effects of mobility on the high-stakes test scores of a Title I South Central Texas school district. The study involved 10, 5th-grade elementary feeder school populations graduating to the 6th grade in 3 middle schools. The researcher compared the 1st administration scores of the Texas Assessment of Knowledge and Skills…

  18. Effects of correcting for prematurity on cognitive test scores in childhood.

    PubMed

    Wilson-Ching, Michelle; Pascoe, Leona; Doyle, Lex W; Anderson, Peter J

    2014-03-01

    The American Academy of Pediatrics recommends that test scores should be corrected for prematurity up to 3 years of age, but this practice varies greatly in both clinical and research settings. The aim of this study was to contrast the effects of using chronological age and those of using corrected age on measures of cognitive outcome across childhood. A theoretical model was constructed using norms from the Bayley Scales of Infant and Toddler Development, Third Edition; the Wechsler Preschool and Primary Scale of Intelligence, Third Edition Australian; and the Wechsler Intelligence Scales for Children, Fourth Edition Australian. Baseline scores representing different levels of functioning (70, below average; 85, borderline; and 100, average) were recalculated using the normative data for ages 6 months to 16 years to account for 1, 2, 3 and 4 months of prematurity. The model created depicted the difference in standardised scores between chronological and corrected age. Compared with scores corrected for prematurity, the absolute reduction in scores using chronological age was greater for increasing degree of prematurity, younger ages at assessment and higher baseline scores and was substantial even beyond 3 years of age. However, the pattern was erratic, with considerable fluctuation evident across different ages and baseline scores. Chronological age results in a lowering of scores at all ages for preterm-born subjects that is greater in the first few years and in those born at earlier gestational ages. Whether or not to correct for prematurity depends upon the context of the assessment. © 2014 The Authors. Journal of Paediatrics and Child Health © 2014 Paediatrics and Child Health Division (Royal Australasian College of Physicians).
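
    Corrected age is chronological age minus the degree of prematurity. The sketch below shows how large that adjustment is relative to the child's age at assessment, which gives some intuition for why younger ages are affected more. It does not model the test norms used in the study, and the function names are illustrative.

```python
def corrected_age_months(chronological_months, months_premature):
    """Corrected age = chronological age minus the degree of prematurity."""
    return chronological_months - months_premature

def relative_adjustment(chronological_months, months_premature):
    """Share of the chronological age removed by the correction."""
    return months_premature / chronological_months

# Age range from the abstract (6 months to 16 years), for 4 months of prematurity.
for age in (6, 36, 192):
    print(age, corrected_age_months(age, 4), f"{relative_adjustment(age, 4):.1%}")
```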

  19. How Parents Can Help Kids Improve Test Scores: Taking the Stakes out of Literacy Testing

    ERIC Educational Resources Information Center

    Schneider, Steven

    2006-01-01

    In order to meet the goals of No Child Left Behind, standardized testing is preeminent as the sole indicator determining whether states all across America demonstrate adequate yearly progress regarding the improvement of student achievement in literacy education. This book will help teachers and parents raise children's scores on standardized…

  20. The Effects of Group Members' Personalities on a Test Taker's L2 Group Oral Discussion Test Scores

    ERIC Educational Resources Information Center

    Ockey, Gary J.

    2009-01-01

    The second language group oral is a test of second language speaking proficiency, in which a group of three or more English language learners discuss an assigned topic without interaction with interlocutors. Concerns expressed about the extent to which test takers' personal characteristics affect the scores of others in the group have limited its…

  1. Noncognitive Skills and the Gender Disparities in Test Scores and Teacher Assessments: Evidence from Primary School

    ERIC Educational Resources Information Center

    Cornwell, Christopher; Mustard, David B.; Van Parys, Jessica

    2013-01-01

    Using data from the 1998-99 ECLS-K cohort, we show that the grades awarded by teachers are not aligned with test scores. Girls in every racial category outperform boys on reading tests, while boys score at least as well on math and science tests as girls. However, boys in all racial categories across all subject areas are not represented in…

  2. Web-based training and interrater reliability testing for scoring the Hamilton Depression Rating Scale.

    PubMed

    Rosen, Jules; Mulsant, Benoit H; Marino, Patricia; Groening, Christopher; Young, Robert C; Fox, Debra

    2008-10-30

    Despite the importance of establishing shared scoring conventions and assessing interrater reliability in clinical trials in psychiatry, these elements are often overlooked. Obstacles to rater training and reliability testing include logistic difficulties in providing live training sessions, or mailing videotapes of patients to multiple sites and collecting the data for analysis. To address some of these obstacles, a web-based interactive video system was developed. It uses actors of diverse ages, gender and race to train raters how to score the Hamilton Depression Rating Scale and to assess interrater reliability. This system was tested with a group of experienced and novice raters within a single site. It was subsequently used to train raters of a federally funded multi-center clinical trial on scoring conventions and to test their interrater reliability. The advantages and limitations of using interactive video technology to improve the quality of clinical trials are discussed.

  3. Predicting changes in clinical status of young asthmatics: clinical scores or objective parameters?

    PubMed

    Leung, Ting F; Ko, Fanny W S; Wong, Gary W K; Li, Chung Y; Yung, Edmund; Hui, David S C; Lai, Christopher K W

    2009-05-01

    Preventing asthma exacerbation is an important goal of asthma management. The existing clinical tools are not good at predicting asthma exacerbations in young children. The Childhood Asthma Control Test (C-ACT) was recently published as a simple tool for assessing disease control in young children. This study investigated C-ACT and other disease-related factors for indicating longitudinal changes in asthma status and predicting asthma exacerbations. During the same clinic visit, asthma patients aged 4-11 years completed the Chinese version of C-ACT and underwent exhaled nitric oxide and spirometric measurements. Blinded to these results, the same investigator assigned Disease Severity Score (DSS) and rated asthma control according to Global Initiative for Asthma. Asthma exacerbations during the next 6 months were recorded. Ninety-seven patients were recruited, with their mean (standard deviation [SD]) age being 9.2 (2.0) years. Thirty-six (37.1%) patients had uncontrolled asthma at baseline. C-ACT, DSS, and FEV(1) differed among patients with different control status (P < 0.001 for C-ACT and DSS; P = 0.028 for FEV(1)). Thirty-two patients had asthma exacerbations during the 6-month follow-up. Changes in patients' C-ACT scores correlated with changes in asthma control status, DSS, and FEV(1) (P = 0.019, 0.034, and 0.020, respectively). C-ACT score was lower among patients with asthma exacerbations (mean [SD]: 22.9 [4.2] vs. 24.5 [2.1]; P = 0.015). Logistic regression confirmed that the occurrence of asthma exacerbations was associated only with baseline C-ACT (B = -0.203, P = 0.042). In conclusion, C-ACT is better than DSS and objective parameters in reflecting changes in asthma status and predicting asthma exacerbations in young children. (c) 2009 Wiley-Liss, Inc.
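
    A logistic regression coefficient is easier to read as an odds ratio, exp(B). The sketch below converts the reported B, assuming it is a natural-log coefficient per one-point increase in baseline C-ACT; the abstract does not state the units.

```python
import math

b = -0.203                 # coefficient for baseline C-ACT, from the abstract
odds_ratio = math.exp(b)   # multiplicative change in the odds of exacerbation per point
print(round(odds_ratio, 3))
```

    Under that assumption, each additional C-ACT point corresponds to odds of exacerbation roughly 18% lower.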

  4. Opportunity to learn: Investigating possible predictors for pre-course Test Of Astronomy STandards TOAST scores

    NASA Astrophysics Data System (ADS)

    Berryhill, Katie J.

    As astronomy education researchers become more interested in experimentally testing innovative teaching strategies to enhance learning in introductory astronomy survey courses ("ASTRO 101"), scholars are placing increased attention toward better understanding factors impacting student gain scores on the widely used Test Of Astronomy STandards (TOAST). Usually used in a pre-test and post-test research design, one might naturally assume that the pre-course differences observed between high- and low-scoring college students might be due in large part to their pre-existing motivation, interest, experience in science, and attitudes about astronomy. To explore this notion, 11 non-science majoring undergraduates taking ASTRO 101 at west coast community colleges were interviewed in the first few weeks of the course to better understand students' pre-existing affect toward learning astronomy with an eye toward predicting student success. In answering this question, we hope to contribute to our understanding of the incoming knowledge of students taking undergraduate introductory astronomy classes, but also gain insight into how faculty can best meet those students' needs and assist them in achieving success. Perhaps surprisingly, there was only weak correlation between students' motivation toward learning astronomy and their pre-test scores. Instead, the most fruitful predictor of TOAST pre-test scores was the quantity of pre-existing, informal, self-directed astronomy learning experiences.

  5. Individual Differences in Digit Span, Susceptibility to Proactive Interference, and Aptitude/Achievement Test Scores.

    ERIC Educational Resources Information Center

    Dempster, Frank N.; Cooney, John B.

    1982-01-01

    Individual differences in digit span, susceptibility to proactive interference, and various aptitude/achievement test scores were investigated in two experiments with college students. Results indicated that digit span was strongly correlated with aptitude/achievement scores, but did not indicate that susceptibility to proactive interference…

  6. Baseline Severity as Predictor of Change in St George’s Respiratory Questionnaire Scores in Trials of Long-acting Bronchodilators with COPD Patients

    PubMed Central

    Jones, Paul W.; Gelhorn, Heather; Karlsson, Niklas; Menjoge, Shailendra; Müllerova, Hana; Rennard, Stephen I.; Tal-Singer, Ruth; Wilson, Hilary; Merrill, Debora; Tabberer, Maggie

    2017-01-01

    Background: In trials of long-acting bronchodilators, health status is an important trial outcome; however, the influence of baseline severity on response measured by St George’s Respiratory Questionnaire (SGRQ) is not known. We have compared SGRQ changes between patients with chronic obstructive pulmonary disease (COPD) of mild-moderate severity or dyspnea (Global initiative for chronic Obstructive Lung disease [GOLD] grades 1 and 2; modified Medical Research Council [mMRC] grades 1 and 2) to those with severe-very severe severity or dyspnea (GOLD grades 3 and 4; mMRC grades 3 and 4). Methods: Combined individual patient data from the COPD Biomarkers Qualification Consortium database (trials of long-acting bronchodilators) were used, comprising patients from short-term (≤1-year duration; n=10802) and medium-term (2-4 years’ duration; n=8963) studies. A repeated measures analysis of variance (ANOVA) was used to determine the effects of baseline severity (GOLD/mMRC) on SGRQ response to treatment. All treatment arms were combined. Results: In short-term studies, milder patients showed a greater response than those with more severe disease in terms of GOLD grade (partial η² = 0.03, p < 0.0001) and mMRC grade (partial η² = 0.05, p < 0.0001). Similar results were seen in the medium-term studies (GOLD: partial η² = 0.02, p < 0.0001; mMRC: partial η² = 0.05, p < 0.0001). Conclusions: Patients with less severe airflow limitation and less severe dyspnea showed larger improvements in SGRQ score than more severely obstructed or dyspneic patients. Although these severity influences are small (2%-5% of the variance in SGRQ score), they do suggest that pre-specified separate analyses are warranted to test for differences in response, based on baseline severity. PMID:28848922

  7. A pretest prognostic score to assess patients undergoing exercise or pharmacological stress testing

    PubMed Central

    Morise, Anthony; Evans, Matthew; Jalisi, Farrukh; Shetty, Rajendra; Stauffer, Marc

    2007-01-01

    Objective: A previously developed pretest score was validated to stratify patients presenting for exercise testing with suspected coronary disease according to the presence of angiographic coronary disease. Our goal was to determine how well this pretest score risk stratified patients undergoing pharmacological and exercise stress tests concerning prognostic endpoints. Design: Retrospective cohort analysis. Setting: University hospital stress laboratory. Patients: 7452 unselected ambulatory patients with symptoms of suspected coronary disease undergoing stress testing between 1995 and 2004. Main outcome measures: All-cause death, cardiac death and non-fatal myocardial infarction. Results: The rate of all-cause death was 5.5% (CI 5.0 to 6.1) with 4.3 (SD 2.4) years of follow-up (Exercise 2.8% (CI 2.3 to 3.2) v Pharmacological group 11.9% (CI 10.5 to 13.3); p<0.001). The rate of cardiac death/myocardial infarction was 2.6% (CI 2.2 to 3.0) (Exercise 1.4% (CI 1.1 to 1.8) v Pharmacological group 5.3% (CI 4.3 to 6.2); p<0.001). In both groups, stratification by pretest score was significant for all-cause death and the combined endpoint. However, stratification was more effective in the pharmacological group using the combined endpoint rather than all-cause death. Pharmacological stress patients in intermediate and high risk groups were at higher risk than their respective exercise test cohorts. Referral for pharmacological stress testing was found to be an independent predictor of time to death (2.7 (CI 2.0 to 3.6); p<0.001). Conclusion: A pretest score previously validated to stratify according to angiographic outcomes effectively risk stratified pharmacological and exercise stress patients according to the combined endpoint of cardiac death/myocardial infarction. PMID:17228070

  8. Construction of an Exome-Wide Risk Score for Schizophrenia Based on a Weighted Burden Test.

    PubMed

    Curtis, David

    2018-01-01

    Polygenic risk scores obtained as a weighted sum of associated variants can be used to explore association in additional data sets and to assign risk scores to individuals. The methods used to derive polygenic risk scores from common SNPs are not suitable for variants detected in whole exome sequencing studies. Rare variants, which may have major effects, are seen too infrequently to judge whether they are associated and may not be shared between training and test subjects. A method is proposed whereby variants are weighted according to their frequency, their annotations and the genes they affect. A weighted sum across all variants provides an individual risk score. Scores constructed in this way are used in a weighted burden test and are shown to be significantly different between schizophrenia cases and controls using a five-way cross-validation procedure. This approach represents a first attempt to summarise exome sequence variation into a summary risk score, which could be combined with risk scores from common variants and from environmental factors. It is hoped that the method could be developed further. © 2017 John Wiley & Sons Ltd/University College London.
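
    The core idea, an individual score formed as a weighted sum across variants, can be sketched as follows. The variant names, weights and allele counts are hypothetical, and the abstract does not give the actual weighting scheme (which combines frequency, annotation and gene), so this only illustrates the summation step.

```python
def risk_score(genotype, weights):
    """Weighted sum of allele counts for one subject."""
    return sum(weights[variant] * count for variant, count in genotype.items())

# Hypothetical per-variant weights; in the described method these would reflect
# frequency, annotation and the gene affected.
weights = {"var_a": 5.0, "var_b": 1.5, "var_c": 0.2}

# Hypothetical subject carrying one copy of var_a and two copies of var_c.
subject = {"var_a": 1, "var_c": 2}
print(risk_score(subject, weights))
```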

  9. Can Machine Scoring Deal with Broad and Open Writing Tests as Well as Human Readers?

    ERIC Educational Resources Information Center

    McCurry, Doug

    2010-01-01

    This article considers the claim that machine scoring of writing test responses agrees with human readers as much as humans agree with other humans. These claims about the reliability of machine scoring of writing are usually based on specific and constrained writing tasks, and there is reason for asking whether machine scoring of writing requires…

  10. Pediatric residents' learning styles and temperaments and their relationships to standardized test scores.

    PubMed

    Tuli, Sanjeev Y; Thompson, Lindsay A; Saliba, Heidi; Black, Erik W; Ryan, Kathleen A; Kelly, Maria N; Novak, Maureen; Mellott, Jane; Tuli, Sonal S

    2011-12-01

    Board certification is an important professional qualification and a prerequisite for credentialing, and the Accreditation Council for Graduate Medical Education (ACGME) assesses board certification rates as a component of residency program effectiveness. To date, research has shown that preresidency measures, including National Board of Medical Examiners scores, Alpha Omega Alpha Honor Medical Society membership, or medical school grades poorly predict postresidency board examination scores. However, learning styles and temperament have been identified as factors that affect test-taking performance. The purpose of this study is to characterize the learning styles and temperaments of pediatric residents and to evaluate their relationships to yearly in-service and postresidency board examination scores. This cross-sectional study analyzed the learning styles and temperaments of current and past pediatric residents by administration of 3 validated tools: the Kolb Learning Style Inventory, the Keirsey Temperament Sorter, and the Felder-Silverman Learning Style test. These results were compared with known, normative, general and medical population data and evaluated for correlation to in-service examination and postresidency board examination scores. The predominant learning style for pediatric residents was converging 44% (33 of 75 residents) and the predominant temperament was guardian 61% (34 of 56 residents). The learning style and temperament distribution of the residents was significantly different from published population data (P = .002 and .04, respectively). Learning styles, with one exception, were found to be unrelated to standardized test scores. The predominant learning style and temperament of pediatric residents is significantly different than that of the populations of general and medical trainees. However, learning styles and temperament do not predict outcomes on standardized in-service and board examinations in pediatric residents.

  11. Spinal appearance questionnaire: factor analysis, scoring, reliability, and validity testing.

    PubMed

    Carreon, Leah Y; Sanders, James O; Polly, David W; Sucato, Daniel J; Parent, Stefan; Roy-Beaudry, Marjolaine; Hopkins, Jeffrey; McClung, Anna; Bratcher, Kelly R; Diamond, Beverly E

    2011-08-15

    Cross sectional. This study presents the factor analysis of the Spinal Appearance Questionnaire (SAQ) and its psychometric properties. Although the SAQ has been administered to a large sample of patients with adolescent idiopathic scoliosis (AIS) treated surgically, its psychometric properties have not been fully evaluated. This study presents the factor analysis and scoring of the SAQ and evaluates its psychometric properties. The SAQ and the Scoliosis Research Society-22 (SRS-22) were administered to AIS patients who were being observed, braced or scheduled for surgery. Standard demographic data and radiographic measures including Lenke type and curve magnitude were also collected. Of the 1802 patients, 83% were female, with a mean age of 14.8 years and mean initial Cobb angle of 55.8° (range, 0°-123°). From the 32 items of the SAQ, 15 loaded on two factors with consistent and significant correlations across all Lenke types. There is an Appearance (items 1-10) and an Expectations factor (items 12-15). Responses are summed giving a range of 5 to 50 for the Appearance domain and 5 to 20 for the Expectations domain. The Cronbach's α was 0.88 for both domains and Total score with a test-retest reliability of 0.81 for Appearance and 0.91 for Expectations. Correlations with major curve magnitude were higher for the SAQ Appearance and SAQ Total scores compared to correlations between the SRS Appearance and SRS Total scores. The SAQ and SRS-22 Scores were statistically significantly different in patients who were scheduled for surgery compared to those who were observed or braced. The SAQ is a valid measure of self-image in patients with AIS with greater correlation to curve magnitude than SRS Appearance and Total score. It also discriminates patients who require surgery from those who do not.
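
    Cronbach's α, the internal-consistency statistic reported above, is k/(k-1) · (1 - Σ item variances / variance of total scores) for k items. The sketch below applies that standard formula to hypothetical responses; it is not the SAQ data, and the function name is illustrative.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding every respondent's score."""
    k = len(item_scores)
    totals = [sum(responses) for responses in zip(*item_scores)]  # per-respondent totals
    item_variance_sum = sum(pvariance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / pvariance(totals))

# Hypothetical responses: 3 items answered by 5 respondents.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(items)
print(round(alpha, 2))
```

    Items that rise and fall together across respondents push α toward 1, which is why values such as 0.88 are read as good internal consistency.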

  12. Selection Bias in College Admissions Test Scores. NBER Working Paper No. 14265

    ERIC Educational Resources Information Center

    Clark, Melissa; Rothstein, Jesse; Schanzenbach, Diane Whitmore

    2008-01-01

    Data from college admissions tests can provide a valuable measure of student achievement, but the non-representativeness of test-takers is an important concern. We examine selectivity bias in both state-level and school-level SAT and ACT averages. The degree of selectivity may differ importantly across and within schools, and across and within…

  13. A Comparison of Three Methods for Computing Scale Score Conditional Standard Errors of Measurement. ACT Research Report Series, 2013 (7)

    ERIC Educational Resources Information Center

    Woodruff, David; Traynor, Anne; Cui, Zhongmin; Fang, Yu

    2013-01-01

    Professional standards for educational testing recommend that both the overall standard error of measurement and the conditional standard error of measurement (CSEM) be computed on the score scale used to report scores to examinees. Several methods have been developed to compute scale score CSEMs. This paper compares three methods, based on…

  14. A Study of the Relationship between the ACT College Mathematics Readiness Standard and College Mathematics Achievement

    ERIC Educational Resources Information Center

    Harwell, Michael; Moreno, Mario; Post, Thomas

    2016-01-01

    This study examined the relationship between the American College Testing (ACT) college mathematics readiness standard and college mathematics achievement using a sample of students who met or exceeded the minimum 3 years high school mathematics coursework recommended by ACT. According to ACT, a student who scores 22 or higher on the ACT…

  15. Effect of Item Arrangement, Knowledge of Arrangement, and Test Anxiety on Two Scoring Methods.

    ERIC Educational Resources Information Center

    Plake, Barbara S.; And Others

    1981-01-01

    Number right and elimination scores were analyzed on a college level mathematics exam assembled from pretest data. Anxiety measures were administered along with the experimental forms to undergraduates. Results suggest that neither test scores nor attitudes are influenced by item order, knowledge thereof, or anxiety level. (Author/GK)

  16. ACER Mathematics Profile Series: Number Test. (Test Booklet, Answer and Record Sheet, Score Key, and Teachers Handbook).

    ERIC Educational Resources Information Center

    Cornish, Greg; Wines, Robin

    The Number Test of the ACER Mathematics Profile Series contains 30 items for each of three suggested grade levels: 7-8, 8-9, and 9-10. Raw scores on all tests in the ACER Mathematics Profile Series (Number, Operations, Space and Measurement) are converted to a common scale called MAPS, a major feature of the Series. Based on the Rasch Model,…

  17. Linking Scores from Tests of Similar Content Given in Different Languages: An Illustration Involving Methodological Alternatives

    ERIC Educational Resources Information Center

    Cascallar, Alicia S.; Dorans, Neil J.

    2005-01-01

    This study compares two methods commonly used (concordance and prediction) to establish linkages between scores from tests of similar content given in different languages. Score linkages between the Verbal and Math sections of the SAT I and the corresponding sections of the Spanish-language admissions test, the Prueba de Aptitud Academica (PAA),…

  18. Effect of vowel context on test-retest nasalance score variability in children with and without cleft palate.

    PubMed

    Ha, Seunghee; Jung, Seungeun; Koh, Kyung S

    2018-06-01

    The purpose of this study was to determine whether test-retest nasalance score variability differs between Korean children with and without cleft palate (CP) and whether vowel context influences variability in nasalance scores. Thirty-four 3-to-5-year-old children with and without CP participated in the study. Three 8-syllable speech stimuli devoid of nasal consonants were used for data collection. Each stimulus was loaded with high, low, or mixed vowels, respectively. All participants were asked to repeat the speech stimuli twice after the examiner, and an immediate test-retest nasalance score was assessed with no headgear change. Children with CP exhibited significantly greater absolute difference in nasalance scores than children without CP. Variability in nasalance scores was significantly different for the vowel context, and the high vowel sentence showed a significantly larger difference in nasalance scores than the low vowel sentence. The cumulative frequencies indicated that, for children with CP in the high vowel sentence, only 8 of 17 (47%) repeated nasalance scores were within 5 points. Test-retest nasalance score variability was greater for children with CP than children without CP, and there was greater variability for the high vowel sentence(s) for both groups. Copyright © 2018 Elsevier B.V. All rights reserved.

  19. Scoring Method of a Situational Judgment Test: Influence on Internal Consistency Reliability, Adverse Impact and Correlation with Personality?

    ERIC Educational Resources Information Center

    De Leng, W. E.; Stegers-Jager, K. M.; Husbands, A.; Dowell, J. S.; Born, M. Ph.; Themmen, A. P.

    2017-01-01

    Situational Judgment Tests (SJTs) are increasingly used for medical school selection. Scoring an SJT is more complicated than scoring a knowledge test, because there are no objectively correct answers. The scoring method of an SJT may influence the construct and concurrent validity and the adverse impact with respect to non-traditional students.…

  20. Decision making under internal uncertainty: the case of multiple-choice tests with different scoring rules.

    PubMed

    Bereby-Meyer, Yoella; Meyer, Joachim; Budescu, David V

    2003-02-01

    This paper assesses framing effects on decision making with internal uncertainty, i.e., partial knowledge, by focusing on examinees' behavior in multiple-choice (MC) tests with different scoring rules. In two experiments participants answered a general-knowledge MC test that consisted of 34 solvable and 6 unsolvable items. Experiment 1 studied two scoring rules involving Positive (only gains) and Negative (only losses) scores. Although answering all items was the dominating strategy for both rules, the results revealed a greater tendency to answer under the Negative scoring rule. These results are in line with the predictions derived from Prospect Theory (PT) [Econometrica 47 (1979) 263]. The second experiment studied two scoring rules, which allowed respondents to exhibit partial knowledge. Under the Inclusion-scoring rule the respondents mark all answers that could be correct, and under the Exclusion-scoring rule they exclude all answers that might be incorrect. As predicted by PT, respondents took more risks under the Inclusion rule than under the Exclusion rule. The results illustrate that the basic process that underlies choice behavior under internal uncertainty and especially the effect of framing is similar to the process of choice under external uncertainty and can be described quite accurately by PT. Copyright 2002 Elsevier Science B.V.

  1. Effects of Scoring by Section and Independent Scorers' Patterns on Scorer Reliability in Biology Essay Tests

    ERIC Educational Resources Information Center

    Ebuoh, Casmir N.; Ezeudu, S. A.

    2015-01-01

    The study investigated the effects of scoring by section, use of independent scorers and conventional patterns on scorer reliability in Biology essay tests. The literature review revealed that the conventional pattern of scoring all items at a time in essay tests had been criticized for not being reliable. The study was a true experimental study…

  2. An Analysis of Cross Racial Identity Scale Scores Using Classical Test Theory and Rasch Item Response Models

    ERIC Educational Resources Information Center

    Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie

    2013-01-01

    Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…

  3. Pediatric Residents' Learning Styles and Temperaments and Their Relationships to Standardized Test Scores

    PubMed Central

    Tuli, Sanjeev Y.; Thompson, Lindsay A.; Saliba, Heidi; Black, Erik W.; Ryan, Kathleen A.; Kelly, Maria N.; Novak, Maureen; Mellott, Jane; Tuli, Sonal S.

    2011-01-01

    Background Board certification is an important professional qualification and a prerequisite for credentialing, and the Accreditation Council for Graduate Medical Education (ACGME) assesses board certification rates as a component of residency program effectiveness. To date, research has shown that preresidency measures, including National Board of Medical Examiners scores, Alpha Omega Alpha Honor Medical Society membership, or medical school grades poorly predict postresidency board examination scores. However, learning styles and temperament have been identified as factors that affect test-taking performance. The purpose of this study is to characterize the learning styles and temperaments of pediatric residents and to evaluate their relationships to yearly in-service and postresidency board examination scores. Methods This cross-sectional study analyzed the learning styles and temperaments of current and past pediatric residents by administration of 3 validated tools: the Kolb Learning Style Inventory, the Keirsey Temperament Sorter, and the Felder-Silverman Learning Style test. These results were compared with known, normative, general and medical population data and evaluated for correlation to in-service examination and postresidency board examination scores. Results The predominant learning style for pediatric residents was converging 44% (33 of 75 residents) and the predominant temperament was guardian 61% (34 of 56 residents). The learning style and temperament distribution of the residents was significantly different from published population data (P = .002 and .04, respectively). Learning styles, with one exception, were found to be unrelated to standardized test scores. Conclusions The predominant learning style and temperament of pediatric residents is significantly different than that of the populations of general and medical trainees. However, learning styles and temperament do not predict outcomes on standardized in-service and board

  4. Fine-Tuning Cross-Battery Assessment Procedures: After Follow-Up Testing, Use All Valid Scores, Cohesive or Not

    ERIC Educational Resources Information Center

    Schneider, W. Joel; Roman, Zachary

    2018-01-01

    We used data simulations to test whether composites consisting of cohesive subtest scores are more accurate than composites consisting of divergent subtest scores. We demonstrate that when multivariate normality holds, divergent and cohesive scores are equally accurate. Furthermore, excluding divergent scores results in biased estimates of…

  5. EXPLORATION OF SCORE AGREEMENT ON A MODIFIED UPPER QUARTER Y-BALANCE TEST KIT AS COMPARED TO THE UPPER QUARTER Y-BALANCE TEST.

    PubMed

    Cramer, Josh; Quintero, Miguel; Rhinehart, Alex; Rutherford, Caitlin; Nasypany, Alan; May, James; Baker, Russell T

    2017-02-01

    Physical performance measures (PPMs) such as The Star Excursion Balance Test (SEBT) and the Y-Balance Test (YBT) are functional movement tests used to assess participants' dynamic balance, which can be a vital component in physical exams to identify predisposing factors for risk of injury. The YBT is a functional assessment tool for the upper and lower body. It evolved from the SEBT, which has been previously used in research as a lower body functional assessment. It is comprised of fewer movement directions, which help limit fatigue. The YBT kit is a commercialized tool, which may pose barriers for clinicians with limited budgets and/or strict approval process for purchasing capital items in their clinics, especially healthcare providers in the secondary school setting. The cost may also pose a barrier for researchers with limited budgets. A less expensive, easy-to-make kit may provide clinicians an opportunity to integrate functional testing into their evaluation or research. The purpose of this pilot study was to describe a cost-efficient method to gather participants' upper quarter YBT (UQYBT) measurements and examine the inter- and intra-rater score agreement between this method and the commercial YBT measurements. A convenience sample of 20 physically active participants volunteered to participate in a comparison study of the Upper Quarter Y-Balance Test (UQYBT) using the commercialized kit and the Modified Upper Quarter Y-Balance Test kit (mUQYBT) made with three cloth tape measures, athletic tape, a goniometer and three 2x4x8 wood blocks. A Pearson Product Moment correlation and Bland-Altman analyses were used to examine the relationship between intra-rater scores comparing the UQYBT and mUQYBT. Inter-rater scores were analyzed using intraclass correlation coefficients (ICC) (2,1) and Bland-Altman analyses. All Pearson Product Moment r-values for intra-rater scores were greater than .96 and statistically significant at p<0.05. Coefficients of

  6. Testing for Accountability: A Balancing Act That Challenges Current Testing Practices and Theories

    ERIC Educational Resources Information Center

    Brennan, Robert L.

    2015-01-01

    Koretz, in his article published in this issue, provides compelling arguments that the high stakes currently associated with accountability testing lead to behavioral changes in students, teachers, and other stakeholders that often have negative consequences, such as inflated scores. Koretz goes on to argue that these negative consequences require…

  7. Student Neighborhoods, Schools, and Test Score Growth: Evidence from Milwaukee, Wisconsin

    ERIC Educational Resources Information Center

    Carlson, Deven; Cowen, Joshua M.

    2015-01-01

    Schools and neighborhoods are thought to be two of the most important contextual influences on student academic outcomes. Drawing on a unique data set that permits simultaneous estimation of neighborhood and school contributions to student test score gains, we analyze the distributions of these contributions to consider the relative importance of…

  8. Teachers' Perceptions and Expectations and the Black-White Test Score Gap.

    ERIC Educational Resources Information Center

    Ferguson, Ronald F.

    2003-01-01

    Evaluates how schools can positively affect the test score gap between black and white students by examining two potential sources for this difference: teachers and students. Offers evidence for the proposition that teachers' perceptions, expectations, and behaviors interact with students' beliefs, behaviors, and work habits in ways that help to…

  9. The Effect of Stakes on Accountability Test Scores and Pass Rates

    ERIC Educational Resources Information Center

    Steedle, Jeffrey T.; Grochowalski, Joseph

    2017-01-01

    Students may not fully demonstrate their knowledge and skills on accountability tests if there are no stakes attached to individual performance. In that case, assessment results may not accurately reflect student achievement, so the validity of score interpretations and uses suffers. For this study, matched samples of students taking state…

  10. Effects of Analytical and Holistic Scoring Patterns on Scorer Reliability in Biology Essay Tests

    ERIC Educational Resources Information Center

    Ebuoh, Casmir N.

    2018-01-01

    The literature revealed that the patterns/methods of scoring essay tests had been criticized for not being reliable, and this unreliability is likely to be greater in internal examinations than in external examinations. The purpose of this study is to find out the effects of analytical and holistic scoring patterns on scorer reliability in…

  11. Impact of a standardized test package on exit examination scores and NCLEX-RN outcomes.

    PubMed

    Homard, Catherine M

    2013-03-01

    The purpose of this ex post facto correlational study was to compare exit examination scores and NCLEX-RN(®) pass rates of baccalaureate nursing students who differed in level of participation in a standardized test package. Three cohort groups emerged as a standardized test package was introduced: (a) students who did not participate in a standardized test package; (b) students with two semesters of a standardized test package; and (c) students with four semesters of a standardized test package. Benner's novice-to-expert theory framed the study in the belief that students best acquire knowledge and skills through practice and reflection. Students participating in four semesters of a standardized test package demonstrated higher exit examination scores and NCLEX-RN pass rates compared with students who did not participate in this package. This study's results could inform nurse educators about strategies to facilitate nursing student success on exit examinations and the NCLEX-RN. Copyright 2013, SLACK Incorporated.

  12. [Relationship between unipedal stance test score and center of pressure velocity in elderly].

    PubMed

    Rodrigo Antonio, Guzmán; Rony, Silvestre; Francisco Aniceto, Rodríguez; David Andrés, Arriagada; Pablo Andrés, Ortega

    2011-01-01

    Frequent falls are one of the most important health problems in the elderly population. The unipedal stance test (UPST) assesses postural stability and is used in fall risk measures. Despite this, there is little information about its relationship with posturographic parameters (PP) that characterize postural stability. Center of pressure velocity (CoPV) is one of the best PP that describes postural stability. The aim of this study was to analyze the relation between UPST score and CoPV in the elderly population. A sample of 38 healthy elderly subjects was divided into two groups according to their UPST score, low performance (LP, n=11) and high performance (HP, n=27). The correlation between UPST score and COP mean velocity (CoPmV), recorded from a posturographic test, was analyzed between both groups. An inverse correlation between UPST score and CoPmV was found in both groups. However, this was higher in the LP group (r=-0.69, P=.02) compared to the HP (r=-0.39, P=.04). Based on the results of this investigation, it may be concluded that the achievement on UPST has an inverse relationship with CoPmV, especially in subjects with low performance in the UPST. Copyright © 2010 SEGG. Published by Elsevier Espana. All rights reserved.

  13. Linkage analysis in nuclear families. 2: Relationship between affected sib-pair tests and lod score analysis.

    PubMed

    Knapp, M; Seuchter, S A; Baur, M P

    1994-01-01

    It is believed that the main advantage of affected sib-pair tests is that their application requires no information about the underlying genetic mechanism of the disease. However, here it is proved that the mean test, which can be considered the most prominent of the affected sib-pair tests, is equivalent to lod score analysis for an assumed recessive mode of inheritance, irrespective of the true mode of the disease. Further relationships of certain sib-pair tests and lod score analysis under specific assumed genetic modes are investigated.

  14. 21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... surgery is planned, is malignant. The test is for adjunctive use, in the context of a negative primary clinical and radiological evaluation, to augment the identification of patients whose gynecologic surgery... § 866.1(e). (c) Black box warning. Under section 520(e) of the Federal Food, Drug, and Cosmetic Act...

  15. Association of Health Sciences Reasoning Test scores with academic and experiential performance.

    PubMed

    Cox, Wendy C; McLaughlin, Jacqueline E

    2014-05-15

    To assess the association of scores on the Health Sciences Reasoning Test (HSRT) with academic and experiential performance in a doctor of pharmacy (PharmD) curriculum. The HSRT was administered to 329 first-year (P1) PharmD students. Performance on the HSRT and its subscales was compared with academic performance in 29 courses throughout the curriculum and with performance in advanced pharmacy practice experiences (APPEs). Significant positive correlations were found between course grades in 8 courses and HSRT overall scores. All significant correlations were accounted for by pharmaceutical care laboratory courses, therapeutics courses, and a law and ethics course. There was a lack of moderate to strong correlation between HSRT scores and academic and experiential performance. The usefulness of the HSRT as a tool for predicting student success may be limited.

  16. Consistency of SAT® I: Reasoning Test Score Conversions. Research Report. ETS RR-08-67

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; Guo, Hongwen; Liu, Jinghua; Dorans, Neil J.

    2008-01-01

    This study uses historical data to explore the consistency of SAT® I: Reasoning Test score conversions and to examine trends in scaled score means. During the period from April 1995 to December 2003, both Verbal (V) and Math (M) means display substantial seasonality, and a slight increasing trend for both is observed. SAT Math means increase more…

  17. Do Standardized Tests Penalize Deep-Thinking, Creative, or Conscientious Students?: Some Personality Correlates of Graduate Record Examinations Test Scores

    ERIC Educational Resources Information Center

    Powers, Donald E.; Kaufman, James C.

    2004-01-01

    The objective of the study reported here was to explore the relationship of Graduate Record Examinations (GRE) General Test scores to selected personality traits--conscientiousness, rationality, ingenuity, quickness, creativity, and depth. A sample of 342 GRE test takers completed short personality inventory scales for each trait. Analyses…

  18. Demographically Adjusted Groups for Equating Test Scores. Research Report. ETS RR-14-30

    ERIC Educational Resources Information Center

    Livingston, Samuel A.

    2014-01-01

    In this study, I investigated 2 procedures intended to create test-taker groups of equal ability by poststratifying on a composite variable created from demographic information. In one procedure, the stratifying variable was the composite variable that best predicted the test score. In the other procedure, the stratifying variable was the…

  19. 21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... ovarian/adnexal mass assessment test system is a device that measures one or more proteins in serum or... § 866.1(e). (c) Black box warning. Under section 520(e) of the Federal Food, Drug, and Cosmetic Act... box and must appear in all advertising, labeling, and promotional material for these devices. That...

  20. 21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... ovarian/adnexal mass assessment test system is a device that measures one or more proteins in serum or... § 866.1(e). (c) Black box warning. Under section 520(e) of the Federal Food, Drug, and Cosmetic Act... box and must appear in all advertising, labeling, and promotional material for these devices. That...

  1. CaPTHUS scoring model in primary hyperparathyroidism: can it eliminate the need for ioPTH testing?

    PubMed

    Elfenbein, Dawn M; Weber, Sara; Schneider, David F; Sippel, Rebecca S; Chen, Herbert

    2015-04-01

    The CaPTHUS model was reported to have a positive predictive value of 100 % to correctly predict single-gland disease in patients with primary hyperparathyroidism, thus obviating the need for intraoperative parathyroid hormone (ioPTH) testing. We sought to apply the CaPTHUS scoring model in our patient population and assess its utility in predicting long-term biochemical cure. We retrospectively reviewed all parathyroidectomies for primary hyperparathyroidism performed at our university hospital from 2003 to 2012. We routinely perform ioPTH testing. Biochemical cure was defined as a normal calcium level at 6 months. A total of 1,421 patients met the inclusion criteria: 78 % of patients had a single adenoma at the time of surgery, 98 % had a normal serum calcium at 1 week postoperatively, and 96 % had a normal serum calcium level 6 months postoperatively. Using the CaPTHUS scoring model, 307 patients (22.5 %) had a score of ≥ 3, with a positive predictive value of 91 % for single adenoma. A CaPTHUS score of ≥ 3 had a positive predictive value of 98 % for biochemical cure at 1 week as well as at 6 months. In our population, where ioPTH testing is used routinely to guide use of bilateral exploration, patients with a preoperative CaPTHUS score of ≥ 3 had good long-term biochemical cure rates. However, the model only predicted adenoma in 91 % of cases. If minimally invasive parathyroidectomy without ioPTH testing had been done for these patients, the cure rate would have dropped from 98 % to an unacceptable 89 %. Even in these patients with high CaPTHUS scores, multigland disease is present in almost 10 %, and ioPTH testing is necessary.
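    For readers less familiar with the metric, the positive predictive value quoted above is the share of test-positive patients who truly have the condition. The sketch below is illustrative only: the abstract reports 307 patients with a score of 3 or more and a 91 % value, but not the underlying count, so the figure of 279 single-adenoma patients is back-calculated here and should be read as an assumption.

```python
def positive_predictive_value(true_positives, false_positives):
    """PPV = TP / (TP + FP): share of positive calls that are correct."""
    return true_positives / (true_positives + false_positives)


# 307 patients had a CaPTHUS score >= 3 (from the abstract).
flagged = 307
# Assumed count: back-calculated from the reported 91 %, not given in the abstract.
single_adenoma = 279
multigland = flagged - single_adenoma

ppv = positive_predictive_value(single_adenoma, multigland)
print(f"PPV for single adenoma: {ppv:.0%}")
print(f"Multigland disease among flagged patients: {multigland / flagged:.1%}")
```

    Under this assumed count, roughly 9 % of high-score patients would have multigland disease, in line with the abstract's "almost 10 %".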

  2. ADAMTS13 test and/or PLASMIC clinical score in management of acquired thrombotic thrombocytopenic purpura: a cost-effective analysis.

    PubMed

    Kim, Chong H; Simmons, Sierra C; Williams, Lance A; Staley, Elizabeth M; Zheng, X Long; Pham, Huy P

    2017-11-01

    The ADAMTS13 test distinguishes thrombotic thrombocytopenic purpura (TTP) from other thrombotic microangiopathies (TMAs). The PLASMIC score helps determine the pretest probability of ADAMTS13 deficiency. Due to inherent limitations of both tests, and potential adverse effects and cost of unnecessary treatments, we performed a cost-effectiveness analysis (CEA) investigating the benefits of incorporating an in-hospital ADAMTS13 test and/or PLASMIC score into our clinical practice. A CEA model was created to compare four scenarios for patients with TMAs, utilizing either an in-house or a send-out ADAMTS13 assay with or without prior risk stratification using PLASMIC scoring. Model variables, including probabilities and costs, were gathered from the medical literature, except for the ADAMTS13 send-out and in-house tests, which were obtained from our institutional data. If only the cost is considered, in-house ADAMTS13 test for patients with intermediate- to high-risk PLASMIC score is the least expensive option ($4,732/patient). If effectiveness is assessed as measured by the number of averted deaths, send-out ADAMTS13 test is the most effective. Considering the cost/effectiveness ratio, the in-house ADAMTS13 test in patients with intermediate- to high-risk PLASMIC score is the best option, followed by the in-house ADAMTS13 test without the PLASMIC score. In patients with clinical presentations of TMAs, having an in-hospital ADAMTS13 test to promptly establish the diagnosis of TTP appears to be cost-effective. Utilizing the PLASMIC score further increases the cost-effectiveness of the in-house ADAMTS13 test. Our findings indicate the benefit of having a rapid and reliable in-house ADAMTS13 test, especially in the tertiary medical center. © 2017 AABB.

  3. Interpreting the "g" Loadings of Intelligence Test Composite Scores in Light of Spearman's Law of Diminishing Returns

    ERIC Educational Resources Information Center

    Reynolds, Matthew R.

    2013-01-01

    The linear loadings of intelligence test composite scores on a general factor ("g") have been investigated recently in factor analytic studies. Spearman's law of diminishing returns (SLODR), however, implies that the "g" loadings of test scores likely decrease in magnitude as g increases, or they are nonlinear. The purpose of…

  4. School Readiness and the Draw-a-Man Test: An Empirically Derived Alternative to Harris' Scoring System.

    ERIC Educational Resources Information Center

    Simner, Marvin L.

    1985-01-01

    An abbreviated scoring system for the Goodenough-Harris Draw-A-Man Test found that three items had the same overall potential for correctly identifying at-risk kindergarteners as more time-consuming scoring methods. (CL)

  5. Economic impact of 21-gene recurrence score testing on early-stage breast cancer in Ireland.

    PubMed

    Smyth, Lillian; Watson, Geoff; Walsh, Elaine M; Kelly, Catherine M; Keane, Maccon; Kennedy, M John; Grogan, Liam; Hennessy, Bryan T; O'Reilly, Seamus; Coate, Linda E; O'Connor, Miriam; Quinn, Cecily; Verleger, Katharina; Schoeman, Olaf; O'Reilly, Susan; Walshe, Janice M

    2015-10-01

    The 21-gene test is a validated multi-gene diagnostic test that predicts chemotherapy (CT) benefit in oestrogen receptor positive (ER+), lymph node-negative (N0) breast cancer (BC) patients (pts). Ireland was the first public health care system to reimburse this test in Europe. Study objectives were to assess the impact of this test on decision-making and to analyse the economic impact of testing. Between October 2011 and February 2013, a national, retrospective, cross-sectional observational study of ER+, N0 BC pts tested with the 21-gene test was conducted. Surveyed breast medical oncologists provided the assumption for the decision impact analysis that grade (G) 1 pts would not have received CT before testing and G2/3 pts would have received CT before testing. Descriptive statistical analyses were performed. 592 pts were identified; low, intermediate and high recurrence scores were identified in 53, 36 and 10 % of pts, respectively. 384 (70 %) pts had G2, 129 (22 %) G3 and 76 (13 %) G1 tumours. Post testing, 345 pts (59 %) experienced a change in CT decision; 339 changed to hormone therapy alone and 6 were advised to receive CT. 172 (30 %) pts received CT: 12 (3.9 %) of pts with low scores, 108 (50.9 %) of pts with intermediate risk scores and 50 (90.9 %) of pts with high risk scores. Net reduction in CT use was 58 % and net savings achieved were €793,565. Since public reimbursement, the introduction of the 21-gene test has resulted in a significant reduction in chemotherapy administration and cost savings for the Irish public healthcare system.
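    The abstract does not spell out how the 58 % net reduction was derived. One reading that reproduces the figure, offered here as an assumption rather than the authors' stated method, is to apply the oncologists' assumption that all G2 and G3 patients would have received chemotherapy before testing and compare that with the 172 patients who actually received it, as a share of all 592 patients.

```python
def net_reduction(ct_before, ct_after, total_patients):
    """Drop in chemotherapy use as a share of the whole cohort."""
    return (ct_before - ct_after) / total_patients


total_patients = 592
g2_patients, g3_patients = 384, 129

# Assumed pre-test practice (from the surveyed oncologists): all G2/G3 receive CT.
ct_before = g2_patients + g3_patients
# Patients who received CT after testing (from the abstract).
ct_after = 172

reduction = net_reduction(ct_before, ct_after, total_patients)
print(f"CT before testing (assumed): {ct_before}")
print(f"Net reduction in CT use: {reduction:.0%}")
```

    This is a consistency check on the reported numbers only; it says nothing about how the cost savings were calculated.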

  6. Association between the Medical College Admission Test scores and Alpha Omega Alpha Medical Honors Society membership.

    PubMed

    Gauer, Jacqueline L; Jackson, J Brooks

    2017-01-01

    Medical schools worldwide are faced with the challenge of selecting from among many qualified applicants. One factor that might help admissions committees identify future exceptional medical students is scores on standardized entrance exams. The purpose of this study was to determine the association between scores on the most commonly used standardized medical school entrance exam in the USA, the Medical College Admission Test (MCAT), and election to the US medical honors society, Alpha Omega Alpha (AOA). MCAT scores and AOA membership data were analyzed for all the students pursuing Doctor of Medicine degrees at the University of Minnesota Medical School and who graduated between 2012-2016 (n=1,309). An independent-samples t-test found a significant difference (t=6.132, p<0.001) in MCAT scores between those who were elected to AOA (n=179) and those who were not (n=1,130). On average, students who were elected to AOA had composite MCAT scores of 1.65 points higher than those who were not. Percentages of students elected to AOA gradually but inconsistently increased with MCAT score. No student who scored <27 on the MCAT was elected to AOA. Among students with MCAT scores at the 99th percentile or above (scores of ≥38), 13 of 48 (27.1%) were elected to AOA. Election to AOA during medical school was significantly associated with higher MCAT scores. Admissions committees should carefully consider the role of standardized entrance exam scores, in the context of a holistic review, when selecting for exceptional medical students.

  7. Association between the Medical College Admission Test scores and Alpha Omega Alpha Medical Honors Society membership

    PubMed Central

    Gauer, Jacqueline L; Jackson, J Brooks

    2017-01-01

    Introduction Medical schools worldwide are faced with the challenge of selecting from among many qualified applicants. One factor that might help admissions committees identify future exceptional medical students is scores on standardized entrance exams. The purpose of this study was to determine the association between scores on the most commonly used standardized medical school entrance exam in the USA, the Medical College Admission Test (MCAT), and election to the US medical honors society, Alpha Omega Alpha (AOA). Method MCAT scores and AOA membership data were analyzed for all the students pursuing Doctor of Medicine degrees at the University of Minnesota Medical School and who graduated between 2012–2016 (n=1,309). Results An independent-samples t-test found a significant difference (t=6.132, p<0.001) in MCAT scores between those who were elected to AOA (n=179) and those who were not (n=1,130). On average, students who were elected to AOA had composite MCAT scores of 1.65 points higher than those who were not. Percentages of students elected to AOA gradually but inconsistently increased with MCAT score. No student who scored <27 on the MCAT was elected to AOA. Among students with MCAT scores at the 99th percentile or above (scores of ≥38), 13 of 48 (27.1%) were elected to AOA. Discussion Election to AOA during medical school was significantly associated with higher MCAT scores. Admissions committees should carefully consider the role of standardized entrance exam scores, in the context of a holistic review, when selecting for exceptional medical students. PMID:28979178

  8. Two for One: Using QAR to Increase Reading Comprehension and Improve Test Scores

    ERIC Educational Resources Information Center

    Green, Susan

    2016-01-01

    This teaching tip describes an intervention used in a third-grade classroom implemented to help students pass an end-of-grade reading comprehension test. Low scores on a practice end-of-grade comprehension test prompted a re-examination of classroom reading instruction and a plan for intervention. This teaching tip describes the phases implemented…

  9. Testing the effects of long-acting steroids in edema and ecchymosis after closed rhinoplasty

    PubMed Central

    Gutierrez, Santiago; Wuesthoff, Carolina

    2014-01-01

    BACKGROUND: Steroids have proven to be of some benefit in rhinoplasty edema and ecchymosis when administered at a high and repeated dose. OBJECTIVE: To evaluate the effects of single-dose, long-acting intramuscular steroids on postoperative edema and ecchymosis after closed rhinoplasty with osteotomies compared with placebo. METHODS: A randomized, double-blinded, placebo-controlled trial was performed. Fifty-four patients were randomly assigned to two groups: 28 received a single dose of long-acting dexamethasone (mean [± SD] dose 16±4 mg) immediately before anesthetic induction; the remaining 26 received an intramuscular injection of saline solution. The same surgeon performed all surgeries, with patients under general anesthesia. Acetaminophen was the only analgesic used to control postoperative pain. High-resolution digital photographs were taken on postoperative days 1, 3, 7 and 14. Scoring was performed separately for eyelid swelling and ecchymosis by an independent observer using a graded scale (0 to 5) for edema and a scoring system (0 to 13) for ecchymosis. RESULTS: No statistically significant differences in terms of age, sex or amount of bleeding during surgery were found between the two groups. No statistically significant difference was observed in the decrease of both ecchymosis and edema between placebo and high-dose, long-acting dexamethasone. A statistically significant difference in operation time was found, favouring the steroid group. No severe complications were observed due to steroid use. DISCUSSION: Osteotomies are basically a form of (controlled) trauma, with considerable disruption of the abundant blood vessels in this facial region and, therefore, are associated with undesirable effects. A recent meta-analysis failed to show benefits of the use of steroids after postoperative day 3. Only a trend toward reduction in edema and ecchymosis with the use of long-acting steroids compared with placebo was demonstrated in the present study

  10. Estimating Teacher Effectiveness from Two-Year Changes in Students' Test Scores

    ERIC Educational Resources Information Center

    Leigh, Andrew

    2010-01-01

    Using a dataset covering over 10,000 Australian school teachers and over 90,000 pupils, I estimate how effective teachers are in raising students' test scores. Since the exams are biennial, it is necessary to take account of the teacher's work in the intervening year. Even adjusting for measurement error, the teacher fixed effects are widely…

  11. Can Tracking Raise the Test Scores of High-Ability Minority Students?

    ERIC Educational Resources Information Center

    Card, David; Giuliano, Laura

    2016-01-01

    We evaluate a tracking program in a large urban district where schools with at least one gifted fourth grader create a separate "gifted/high achiever" classroom. Most seats are filled by non-gifted high achievers, ranked by previous-year test scores. We study the program's effects on the high achievers using (1) a rank-based regression…

  12. Hypothesis Testing as an Act of Rationality

    NASA Astrophysics Data System (ADS)

    Nearing, Grey

    2017-04-01

    Statistical hypothesis testing is ad hoc in two ways. First, setting probabilistic rejection criteria is, as Neyman (1957) put it, an act of will rather than an act of rationality. Second, physical theories like conservation laws do not inherently admit probabilistic predictions, and so we must use what are called epistemic bridge principles to connect model predictions with the actual methods of hypothesis testing. In practice, these bridge principles are likelihood functions, error functions, or performance metrics. I propose that the reason we are faced with these problems is that we have historically failed to account for a fundamental component of basic logic - namely the portion of logic that explains how epistemic states evolve in the presence of empirical data. This component of Cox' (1946) calculitic logic is called information theory (Knuth, 2005), and adding information theory to our hypothetico-deductive account of science yields straightforward solutions to both of the above problems. This also yields a straightforward method for dealing with Popper's (1963) problem of verisimilitude by facilitating a quantitative approach to measuring process isomorphism. In practice, this involves data assimilation. Finally, information theory allows us to reliably bound measures of epistemic uncertainty, thereby avoiding the problem of Bayesian incoherency under misspecified priors (Grünwald, 2006). I therefore propose solutions to four of the fundamental problems inherent in both hypothetico-deductive and/or Bayesian hypothesis testing. - Neyman (1957) Inductive Behavior as a Basic Concept of Philosophy of Science. - Cox (1946) Probability, Frequency and Reasonable Expectation. - Knuth (2005) Lattice Duality: The Origin of Probability and Entropy. - Grünwald (2006). Bayesian Inconsistency under Misspecification. - Popper (1963) Conjectures and Refutations: The Growth of Scientific Knowledge.

  13. Test-retest reliability and minimal detectable change scores for the timed "up & go" test, the six-minute walk test, and gait speed in people with Alzheimer disease.

    PubMed

    Ries, Julie D; Echternach, John L; Nof, Leah; Gagnon Blodgett, Michelle

    2009-06-01

    With the increasing incidence of Alzheimer disease (AD), determining the validity and reliability of outcome measures for people with this disease is necessary. The goals of this study were to assess test-retest reliability of data for the Timed "Up & Go" Test (TUG), the Six-Minute Walk Test (6MWT), and gait speed and to calculate minimal detectable change (MDC) scores for each outcome measure. Performance differences between groups with mild to moderate AD and moderately severe to severe AD (as determined by the Functional Assessment Staging [FAST] scale) were studied. This was a prospective, nonexperimental, descriptive methodological study. Background data collected for 51 people with AD included: use of an assistive device, Mini-Mental Status Examination scores, and FAST scale scores. Each participant engaged in 2 test sessions, separated by a 30- to 60-minute rest period, which included 2 TUG trials, 1 6MWT trial, and 2 gait speed trials using a computerized gait assessment system. A specific cuing protocol was followed to achieve optimal performance during test sessions. Test-retest reliability values for the TUG, the 6MWT, and gait speed were high for all participants together and for the mild to moderate AD and moderately severe to severe AD groups separately (intraclass correlation coefficients ≥ .973); however, individual variability of performance also was high. Calculated MDC scores at the 90% confidence interval were: TUG = 4.09 seconds, 6MWT = 33.5 m (110 ft), and gait speed = 9.4 cm/s. The 2 groups were significantly different in performance of clinical tests, with the participants who were more cognitively impaired being more physically and functionally impaired. A single researcher for data collection limited sample numbers and prohibited blinding to dementia level. The TUG, the 6MWT, and gait speed are reliable outcome measures for use with people with AD, recognizing that individual variability of performance is high. Minimal detectable change
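
    The abstract does not state how the MDC values were computed. A minimal sketch of the conventional calculation, assuming MDC90 = 1.645 × SEM × √2 with SEM = SD × √(1 − ICC); the function names and the input values in the usage note are hypothetical and are not data from the study:

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM = SD * sqrt(1 - ICC), the conventional definition (assumed here)."""
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sd: float, icc: float, z: float = 1.645) -> float:
    """MDC = z * SEM * sqrt(2); z = 1.645 gives the 90% level (MDC90)."""
    # sqrt(2) accounts for measurement error in both the test and the retest.
    return z * standard_error_of_measurement(sd, icc) * math.sqrt(2.0)
```

    For instance, minimal_detectable_change(10.0, 0.75) uses a made-up standard deviation of 10 units and a made-up ICC of .75; a very high ICC shrinks the SEM and therefore the MDC.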

  14. Evaluating the Effectiveness of the USA Testprep Intervention to Increase High School Test Scores

    ERIC Educational Resources Information Center

    Christian, Veronica Faye

    2012-01-01

    The No Child Left Behind Act emphasized the responsibility of states to improve student academic performance. In one state, students are required to take subject-area tests and master each test to graduate; however, in some schools, many students are failing the English II test administered during students' sophomore year. Two districts have…

  15. Estimation and test for linkage between markers: a comparison of lod score and χ² test in a linkage study of maritime pine (Pinus pinaster Ait.).

    PubMed

    Gerber, S; Rodolphe, F

    1994-06-01

    The first step in the construction of a linkage map involves the estimation and test for linkage between all possible pairs of markers. The lod score method is used in many linkage studies for the latter purpose. In contrast with classical statistical tests, this method does not rely on the choice of a first-type error level. We thus provide a comparison between the lod score and a χ² test on linkage data from a gymnosperm, the maritime pine. The lod score appears to be a very conservative test with the usual thresholds. Its severity depends on the type of data used.
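
    To illustrate why a lod threshold can be more conservative than a χ² test, here is a minimal sketch for the simplest textbook case, which is an assumption and not necessarily the design used in the study: n informative gametes with known phase, of which k are recombinant. The function names are hypothetical.

```python
import math


def _xlog10(count: int, p: float) -> float:
    """count * log10(p), with the convention 0 * log10(0) = 0."""
    return 0.0 if count == 0 else count * math.log10(p)


def lod_score(n: int, k: int) -> float:
    """Lod score at the maximum-likelihood recombination fraction.

    theta_hat = k / n, capped at 0.5 (free recombination).
    """
    if not 0 <= k <= n or n == 0:
        raise ValueError("need n > 0 and 0 <= k <= n")
    theta = min(k / n, 0.5)
    log_l_hat = _xlog10(k, theta) + _xlog10(n - k, 1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_hat - log_l_null


def chi_square_linkage(n: int, k: int) -> float:
    """Pearson chi-square (1 df) for a 1:1 recombinant:parental ratio."""
    if not 0 <= k <= n or n == 0:
        raise ValueError("need n > 0 and 0 <= k <= n")
    return (n - 2 * k) ** 2 / n


# A lod of 3 corresponds to a likelihood-ratio statistic of 2 * ln(10) * 3,
# far above 3.84, the 5% critical value of a chi-square with 1 df.
LOD3_AS_LIKELIHOOD_RATIO = 2.0 * math.log(10.0) * 3.0
```

    With n = 40 and k = 10 in this sketch, the χ² statistic is 10 (significant at the 5% level), while the lod score stays below the usual threshold of 3.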

  16. Validation of undergraduate medical student script concordance test (SCT) scores on the clinical assessment of the acute abdomen.

    PubMed

    Goos, Matthias; Schubach, Fabian; Seifert, Gabriel; Boeker, Martin

    2016-08-17

    Health professionals often manage medical problems in critical situations under time pressure and on the basis of vague information. In recent years, dual process theory has provided a framework of cognitive processes to assist students in developing clinical reasoning skills, which are especially critical in surgery due to the high workload and the elevated stress levels. However, clinical reasoning skills can be observed only indirectly and the corresponding constructs are difficult to measure in order to assess student performance. The script concordance test has been established in this field. A number of studies suggest that the test delivers a valid assessment of clinical reasoning. However, different scoring methods have been suggested. They reflect different interpretations of the underlying construct. In this work we want to shed light on the theoretical framework of script theory and give an idea of script concordance testing. We constructed a script concordance test in the clinical context of "acute abdomen" and compared previously proposed scores with regard to their validity. A test comprising 52 items in 18 clinical scenarios was developed, revised along the guidelines and administered to 56 4th- and 5th-year medical students at the end of a blended-learning seminar. We scored the answers using five different scoring methods (distance (2×), aggregate (2×), single best answer) and compared the scoring keys, the resulting final scores and Cronbach's α after normalization of the raw scores. All scores except the single best answer calculation achieved acceptable reliability scores (≥ 0.75), as measured by Cronbach's α. Students were clearly distinguishable from the experts, whose results were set to a mean of 80 and SD of 5 by the normalization process. With the two aggregate scoring methods, the students' mean values were between 62.5 (AGGPEN) and 63.9 (AGG), equivalent to about three expert SD below the experts' mean value (Cronbach's α : 0.76 (AGGPEN

  17. The relationship between selected standardized test scores and performance in advanced placement math and science exams: Analyzing the differential effectiveness of scores for course identification and placement

    NASA Astrophysics Data System (ADS)

    Urbina, Josue N.

    There is a national need to increase the STEM-related workforce. Factors leading towards STEM careers include the number of advanced high school mathematics and science courses students complete. Florida's enrollment patterns in STEM-related Advanced Placement (AP) courses, however, reveal that only a small percentage of students enroll in these classes. Therefore, screening tools are needed to find more students for these courses who are academically ready yet have not been identified. The purpose of this study was to investigate the extent to which scores from a national standardized test, the Preliminary Scholastic Assessment Test/National Merit Scholarship Qualifying Test (PSAT/NMSQT), in conjunction with and compared to a state-mandated standardized test, the Florida Comprehensive Assessment Test (FCAT), are related to selected AP exam performance in Seminole County Public Schools. An ex post facto correlational study was conducted using 6,189 student records from the 2010-2012 academic years. Multiple regression analyses using simultaneous Full Model testing showed differential moderate to strong relationships between scores in eight of the nine AP courses (i.e., Biology, Environmental Science, Chemistry, Physics B, Physics C Electrical, Physics C Mechanical, Statistics, Calculus AB and BC) examined. For example, the significant unique contribution to overall variance in AP scores was a linear combination of PSAT Math (M), Critical Reading (CR) and FCAT Reading (R) for Biology and Environmental Science. Moderate relationships for Chemistry included a linear combination of PSAT M, W (Writing) and FCAT M; a combination of FCAT M and PSAT M was most significantly associated with Calculus AB performance. These findings have implications for both research and practice. FCAT scores, in conjunction with PSAT scores, can potentially be used for specific STEM-related AP courses, as part of a systematic approach towards AP course identification and placement. For courses with

  18. The Effect of School Poverty on Racial Gaps in Tests Scores: The Case of the Minnesota Basic Standards Tests

    ERIC Educational Resources Information Center

    Myers, Samuel L.; Kim, Hyeoneui; Mandala, Cheryl

    2004-01-01

    Data from the 1996, 1998 and 1999 Minnesota comprehensive statewide testing of eighth graders are used to analyze whether African American students perform worse than the white students who attend high-poverty schools. The analyses conclude that the African American-White test score gap is attributable more to racial discrimination and racial treatment…

  19. What's in a Teacher Test? Assessing the Relationship between Teacher Licensure Test Scores and Student STEM Achievement and Course-Taking. CEDR Working Paper. WP #2016-11

    ERIC Educational Resources Information Center

    Goldhaber, Dan; Gratz, Trevor; Theobald, Roddy

    2016-01-01

    We investigate the relationship between teacher licensure test scores and student test achievement and high school course-taking. We focus on three subject/grade combinations-- middle school math, ninth-grade algebra and geometry, and ninth-grade biology--and find evidence that a teacher's basic skills test scores are modestly predictive of…

  20. Cognitive test scores in male adolescent cigarette smokers compared to non-smokers: a population-based study.

    PubMed

    Weiser, Mark; Zarka, Salman; Werbeloff, Nomi; Kravitz, Efrat; Lubin, Gad

    2010-02-01

    Although previous studies indicate that people with lower intelligence quotient (IQ) scores are more likely to become cigarette smokers, IQ scores of siblings discordant for smoking and of adolescents who began smoking between ages 18-21 years have not been studied systematically. Each year a random sample of Israeli military recruits complete a smoking questionnaire. Cognitive functioning is assessed by the military using standardized tests equivalent to IQ. Of 20 221 18-year-old males, 28.5% reported smoking at least one cigarette a day (smokers). An unadjusted comparison found that smokers scored 0.41 effect sizes (ES, P < 0.001) lower than non-smokers; adjusted analyses remained significant (adjusted ES = 0.27, P < 0.001). Adolescents smoking one to five, six to 10, 11-20 and 21+ cigarettes/day had cognitive test scores 0.14, 0.22, 0.33 and 0.5 adjusted ES poorer than those of non-smokers (P < 0.001). Adolescents who did not smoke by age 18, and then began to smoke between ages 18-21 had lower cognitive test scores compared to never-smokers (adjusted ES = 0.14, P < 0.001). An analysis of brothers discordant for smoking found that smoking brothers had lower cognitive scores than non-smoking brothers (adjusted ES = 0.27; P = 0.014). Controlled analyses from this large population-based cohort of male adolescents indicate that IQ scores are lower in male adolescents who smoke compared to non-smokers and in brothers who smoke compared to their non-smoking brothers. The IQs of adolescents who began smoking between ages 18-21 are lower than those of non-smokers. Adolescents with poorer IQ scores might be targeted for programmes designed to prevent smoking.

  1. Investigating Score Dependability in English/Chinese Interpreter Certification Performance Testing: A Generalizability Theory Approach

    ERIC Educational Resources Information Center

    Han, Chao

    2016-01-01

    As a property of test scores, reliability/dependability constitutes an important psychometric consideration, and it underpins the validity of measurement results. A review of interpreter certification performance tests (ICPTs) reveals that (a) although reliability/dependability checking has been recognized as an important concern, its theoretical…

  2. The reliability and validity of qualitative scores for the Controlled Oral Word Association Test.

    PubMed

    Ross, Thomas P; Calhoun, Emily; Cox, Tara; Wenner, Carolyn; Kono, Whitney; Pleasant, Morgan

    2007-05-01

    The reliability and validity of two qualitative scoring systems for the Controlled Oral Word Association Test [Benton, A. L., Hamsher, de S. K., & Sivan, A. B. (1983). Multilingual aphasia examination (2nd ed.). Iowa City, IA: AJA Associates] were examined in 108 healthy young adults. The scoring systems developed by Troyer et al. [Troyer, A. K., Moscovitch, M., & Winocur, G. (1997). Clustering and switching as two components of verbal fluency: Evidence from younger and older healthy adults. Neuropsychology, 11, 138-146] and by Abwender et al. [Abwender, D. A., Swan, J. G., Bowerman, J. T., & Connolly, S. W. (2001a). Qualitative analysis of verbal fluency output: Review and comparison of several scoring methods. Assessment, 8, 323-336] each demonstrated excellent interrater reliability (all indices at or above r(icc)=.9). Consistent with previous research [e.g., Ross, T. P. (2003). The reliability of cluster and switch scores for the COWAT. Archives of Clinical Neuropsychology, 18, 153-164], test-retest reliability coefficients (N=53; M interval 44.6 days) for the qualitative scores were modest to poor (r(icc)=.6 to .4 range). Correlations among COWAT scores, measures of executive functioning, verbal learning, working memory, and vocabulary were examined. The idea that qualitative scores represent distinct executive functions such as cognitive flexibility or strategy utilization was not supported. We offer the interpretation that COWAT performance may require the ability to retrieve words in a non-routine manner while suppressing habitual responses and associated processing interference, presumably due to a spread of activation across semantic or lexical networks. This interpretation, though speculative at present, implies that clustering and switching on the COWAT may not be entirely deliberate, but rather an artifact of a passive (i.e., state-dependent) process. Ideas for future research, most noticeably experimental studies using cognitive methods (e.g., priming), are

  3. Student Test Scores: How the Sausage Is Made and Why You Should Care. Evidence Speaks Reports, Vol 1, #25

    ERIC Educational Resources Information Center

    Jacob, Brian A.

    2016-01-01

    Contrary to popular belief, modern cognitive assessments--including the new Common Core tests--produce test scores based on sophisticated statistical models rather than the simple percent of items a student answers correctly. While there are good reasons for this, it means that reported test scores depend on many decisions made by test designers,…

  4. Exploring Validity of Computer-Based Test Scores with Examinees' Response Behaviors and Response Times

    ERIC Educational Resources Information Center

    Sahin, Füsun

    2017-01-01

    Examining the testing processes, as well as the scores, is needed for a complete understanding of validity and fairness of computer-based assessments. Examinees' rapid-guessing and insufficient familiarity with computers have been found to be major issues that weaken the validity arguments of scores. This study has three goals: (a) improving…

  5. The effect of instructional methodology on high school students natural sciences standardized tests scores

    NASA Astrophysics Data System (ADS)

    Powell, P. E.

    Educators have recently come to consider inquiry-based instruction a more effective method of instruction than didactic instruction. Experience-based learning theory suggests that student performance is linked to teaching method. However, research is limited on inquiry teaching and its effectiveness in preparing students to perform well on standardized tests. The purpose of the study was to investigate whether one of these two teaching methodologies was more effective in increasing student performance on standardized science tests. The quasi-experimental quantitative study comprised two stages. Stage 1 used a survey to identify teaching methods of a convenience sample of 57 teacher participants and determined the level of inquiry used in instruction to place participants into instructional groups (the independent variable). Stage 2 used analysis of covariance (ANCOVA) to compare posttest scores on a standardized exam by teaching method. Additional analyses were conducted to examine the differences in science achievement by ethnicity, gender, and socioeconomic status by teaching methodology. Results demonstrated a statistically significant gain in test scores when students were taught using inquiry-based instruction. Subpopulation analyses indicated all groups showed improved mean standardized test scores except African American students. The findings benefit teachers and students by presenting data supporting a method of content delivery that increases teacher efficacy and produces students with a greater cognition of science content that meets the school's mission and goals.

  6. ACT Profile Report: State. Graduating Class 2016. New Hampshire

    ERIC Educational Resources Information Center

    ACT, Inc., 2016

    2016-01-01

    This report provides information about the performance of New Hampshire's 2016 graduating seniors who took the ACT as sophomores, juniors, or seniors; and self-reported at the time of testing that they were scheduled to graduate in 2016. Beginning with the Graduating Class of 2013, all students whose scores are college reportable, both standard…

  7. Test Scores, Class Rank and College Performance: Lessons for Broadening Access and Promoting Success.

    PubMed

    Niu, Sunny X; Tienda, Marta

    2012-04-01

    Using administrative data for five Texas universities that differ in selectivity, this study evaluates the relative influence of two key indicators for college success: high school class rank and standardized tests. Empirical results show that class rank is the superior predictor of college performance and that test score advantages do not insulate lower ranked students from academic underperformance. Using the UT-Austin campus as a test case, we conduct a simulation to evaluate the consequences of capping students admitted automatically using both achievement metrics. We find that using class rank to cap the number of students eligible for automatic admission would have roughly uniform impacts across high schools, but imposing a minimum test score threshold on all students would have highly unequal consequences by greatly reducing the admission eligibility of the highest performing students who attend poor high schools while not jeopardizing admissibility of students who attend affluent high schools. We discuss the implications of the Texas admissions experiment for higher education in Europe.

  8. Role of test motivation in intelligence testing.

    PubMed

    Duckworth, Angela Lee; Quinn, Patrick D; Lynam, Donald R; Loeber, Rolf; Stouthamer-Loeber, Magda

    2011-05-10

    Intelligence tests are widely assumed to measure maximal intellectual performance, and predictive associations between intelligence quotient (IQ) scores and later-life outcomes are typically interpreted as unbiased estimates of the effect of intellectual ability on academic, professional, and social life outcomes. The current investigation critically examines these assumptions and finds evidence against both. First, we examined whether motivation is less than maximal on intelligence tests administered in the context of low-stakes research situations. Specifically, we completed a meta-analysis of random-assignment experiments testing the effects of material incentives on intelligence-test performance on a collective 2,008 participants. Incentives increased IQ scores by an average of 0.64 SD, with larger effects for individuals with lower baseline IQ scores. Second, we tested whether individual differences in motivation during IQ testing can spuriously inflate the predictive validity of intelligence for life outcomes. Trained observers rated test motivation among 251 adolescent boys completing intelligence tests using a 15-min "thin-slice" video sample. IQ score predicted life outcomes, including academic performance in adolescence and criminal convictions, employment, and years of education in early adulthood. After adjusting for the influence of test motivation, however, the predictive validity of intelligence for life outcomes was significantly diminished, particularly for nonacademic outcomes. Collectively, our findings suggest that, under low-stakes research conditions, some individuals try harder than others, and, in this context, test motivation can act as a third-variable confound that inflates estimates of the predictive validity of intelligence for life outcomes.

  9. Role of test motivation in intelligence testing

    PubMed Central

    Duckworth, Angela Lee; Quinn, Patrick D.; Lynam, Donald R.; Loeber, Rolf; Stouthamer-Loeber, Magda

    2011-01-01

    Intelligence tests are widely assumed to measure maximal intellectual performance, and predictive associations between intelligence quotient (IQ) scores and later-life outcomes are typically interpreted as unbiased estimates of the effect of intellectual ability on academic, professional, and social life outcomes. The current investigation critically examines these assumptions and finds evidence against both. First, we examined whether motivation is less than maximal on intelligence tests administered in the context of low-stakes research situations. Specifically, we completed a meta-analysis of random-assignment experiments testing the effects of material incentives on intelligence-test performance on a collective 2,008 participants. Incentives increased IQ scores by an average of 0.64 SD, with larger effects for individuals with lower baseline IQ scores. Second, we tested whether individual differences in motivation during IQ testing can spuriously inflate the predictive validity of intelligence for life outcomes. Trained observers rated test motivation among 251 adolescent boys completing intelligence tests using a 15-min “thin-slice” video sample. IQ score predicted life outcomes, including academic performance in adolescence and criminal convictions, employment, and years of education in early adulthood. After adjusting for the influence of test motivation, however, the predictive validity of intelligence for life outcomes was significantly diminished, particularly for nonacademic outcomes. Collectively, our findings suggest that, under low-stakes research conditions, some individuals try harder than others, and, in this context, test motivation can act as a third-variable confound that inflates estimates of the predictive validity of intelligence for life outcomes. PMID:21518867

  10. Maximal exercise testing variables and 10-year survival: fitness risk score derivation from the FIT Project.

    PubMed

    Ahmed, Haitham M; Al-Mallah, Mouaz H; McEvoy, John W; Nasir, Khurram; Blumenthal, Roger S; Jones, Steven R; Brawner, Clinton A; Keteyian, Steven J; Blaha, Michael J

    2015-03-01

    To determine which routinely collected exercise test variables most strongly correlate with survival and to derive a fitness risk score that can be used to predict 10-year survival. This was a retrospective cohort study of 58,020 adults aged 18 to 96 years who were free of established heart disease and were referred for an exercise stress test from January 1, 1991, through May 31, 2009. Demographic, clinical, exercise, and mortality data were collected on all patients as part of the Henry Ford ExercIse Testing (FIT) Project. Cox proportional hazards models were used to identify exercise test variables most predictive of survival. A "FIT Treadmill Score" was then derived from the β coefficients of the model with the highest survival discrimination. The median age of the 58,020 participants was 53 years (interquartile range, 45-62 years), and 28,201 (49%) were female. Over a median of 10 years (interquartile range, 8-14 years), 6456 patients (11%) died. After age and sex, peak metabolic equivalents of task and percentage of maximum predicted heart rate achieved were most highly predictive of survival (P<.001). Subsequent addition of baseline blood pressure and heart rate, change in vital signs, double product, and risk factor data did not further improve survival discrimination. The FIT Treadmill Score, calculated as [percentage of maximum predicted heart rate + 12(metabolic equivalents of task) - 4(age) + 43 if female], ranged from -200 to 200 across the cohort, was near normally distributed, and was found to be highly predictive of 10-year survival (Harrell C statistic, 0.811). The FIT Treadmill Score is easily attainable from any standard exercise test and translates basic treadmill performance measures into a fitness-related mortality risk score. The FIT Treadmill Score should be validated in external populations. Copyright © 2015 Mayo Foundation for Medical Education and Research. Published by Elsevier Inc. All rights reserved.
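
    The bracketed formula above translates directly into code. A minimal sketch, assuming the heart-rate term is entered as a percentage (e.g. 90 for 90%) and age in years; the function and parameter names are hypothetical:

```python
def fit_treadmill_score(
    pct_max_predicted_hr: float,
    mets: float,
    age_years: float,
    female: bool,
) -> float:
    """FIT Treadmill Score as given in the abstract.

    score = %max predicted HR + 12 * METs - 4 * age + 43 if female
    """
    if pct_max_predicted_hr < 0 or mets < 0 or age_years < 0:
        raise ValueError("inputs must be non-negative")
    sex_term = 43.0 if female else 0.0
    return pct_max_predicted_hr + 12.0 * mets - 4.0 * age_years + sex_term
```

    For example, a 50-year-old man reaching 90% of maximum predicted heart rate at 10 METs would score 90 + 120 - 200 = 10 under this formula; the same performance by a woman would score 53. The abstract reports that scores ranged from -200 to 200 across the cohort.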

  11. Relationships between the handball-specific complex test, non-specific field tests and the match performance score in elite professional handball players.

    PubMed

    Hermassi, Souhail; Chelly, Mohamed-Souhaiel; Wollny, Rainer; Hoffmeyer, Birgit; Fieseler, Georg; Schulze, Stephan; Irlenbusch, Lars; Delank, Karl-Stefan; Shephard, Roy J; Bartels, Thomas; Schwesig, René

    2018-06-01

    This study assessed the validity of the handball-specific complex test (HBCT) and two non-specific field tests in professional elite handball athletes, using the match performance score (MPS) as the gold standard of performance. Thirteen elite male handball players (age: 27.4±4.8 years; premier German league) performed the HBCT, the Yo-Yo Intermittent Recovery (YYIR) test and a repeated shuttle sprint ability (RSA) test at the beginning of pre-season training. The RSA results were evaluated in terms of best time, total time, and fatigue decrement. Heart rates (HR) were assessed at selected times throughout all tests; the recovery HR was measured immediately post-test and 10 minutes later. The match performance score was based on various handball specific parameters (e.g., field goals, assists, steals, blocks, and technical mistakes) as seen during all matches of the immediately subsequent season (2015/2016). The parameters of run 1, run 2, and HR recovery at minutes 6 and 10 of the RSA test all showed a variance of more than 10% (range: 11-15%). However, the variance of scores for the YYIR test was much smaller (range: 1-7%). The resting HR (r2=0.18), HR recovery at minute 10 (r2=0.10), lactate concentration at rest (r2=0.17), recovery of heart rate from 0 to 10 minutes (r2=0.15), and velocity of second throw at first trial (r2=0.37) were the most valid HBCT parameters. Much effort is necessary to assess MPS and to develop valid tests. Speed and the rate of functional recovery seem the best predictors of competitive performance for elite handball players.

  12. Automated Essay Scoring versus Human Scoring: A Comparative Study

    ERIC Educational Resources Information Center

    Wang, Jinhao; Brown, Michelle Stallone

    2007-01-01

    The current research was conducted to investigate the validity of automated essay scoring (AES) by comparing group mean scores assigned by an AES tool, IntelliMetric [TM] and human raters. Data collection included administering the Texas version of the WriterPlacer "Plus" test and obtaining scores assigned by IntelliMetric [TM] and by…

  13. College Math Assessment: SAT Scores vs. College Math Placement Scores

    ERIC Educational Resources Information Center

    Foley-Peres, Kathleen; Poirier, Dawn

    2008-01-01

    Many colleges and universities use SAT math scores or math placement tests to place students in the appropriate math course. This study compares the use of math placement scores and SAT scores for 188 freshman students. The students' grades and faculty observations were analyzed to determine if the SAT scores and/or college math assessment scores…

  14. A Comparison of Scores on the WISC-R and Lorge-Thorndike Intelligence Test for Disadvantaged Black Elementary School Children

    ERIC Educational Resources Information Center

    Lowe, James D.; Karnes, Frances A.

    1976-01-01

    It is indicated that, although the scores [obtained on both tests] are significantly correlated, the tests yield significantly different scores with the Lorge-Thorndike consistently overestimating the WISC-R full scale I.Q. (Author)

  15. The Effect of Four Intervention Programs on Standardized Test Scores by Gender

    ERIC Educational Resources Information Center

    Cryder, Rebecca E.

    2012-01-01

    This quantitative correlational study involved the analysis, by gender, of the effect of four intervention programs at an Arizona middle school as seen on Arizona's Instrument to Measure Standards (AIMS) test scores. These four intervention programs included: Advancement Via Individual Determination (AVID), a planner stamping system, a World…

  16. International Test Score Comparisons and Educational Policy: A Review of the Critiques

    ERIC Educational Resources Information Center

    Carnoy, Martin

    2015-01-01

    Stanford education professor Martin Carnoy examines four main critiques of how international test results are used in policymaking. Of particular interest are critiques of the policy analyses published by the Program for International Student Assessment (PISA). Using average PISA scores as a comparative measure of student achievement is misleading…

  17. Using the EZ-Diffusion Model to Score a Single-Category Implicit Association Test of Physical Activity

    PubMed Central

    Rebar, Amanda L.; Ram, Nilam; Conroy, David E.

    2014-01-01

    Objective: The Single-Category Implicit Association Test (SC-IAT) has been used as a method for assessing automatic evaluations of physical activity, but measurement artifact or consciously-held attitudes could be confounding the outcome scores of these measures. The objective of these two studies was to address these measurement concerns by testing the validity of a novel SC-IAT scoring technique. Design: Study 1 was a cross-sectional study, and study 2 was a prospective study. Method: In study 1, undergraduate students (N = 104) completed SC-IATs for physical activity, flowers, and sedentary behavior. In study 2, undergraduate students (N = 91) completed a SC-IAT for physical activity, self-reported affective and instrumental attitudes toward physical activity, physical activity intentions, and wore an accelerometer for two weeks. The EZ-diffusion model was used to decompose the SC-IAT into three process component scores including the information processing efficiency score. Results: In study 1, a series of structural equation model comparisons revealed that the information processing score did not share variability across distinct SC-IATs, suggesting it does not represent systematic measurement artifact. In study 2, the information processing efficiency score was shown to be unrelated to self-reported affective and instrumental attitudes toward physical activity, and positively related to physical activity behavior, above and beyond the traditional D-score of the SC-IAT. Conclusions: The information processing efficiency score is a valid measure of automatic evaluations of physical activity. PMID:25484621
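
    The abstract does not spell out the decomposition. A minimal sketch of the standard closed-form EZ-diffusion equations (Wagenmakers and colleagues, 2007), assuming they are what was applied here: the proportion correct, the variance of response times and the mean response time (both in seconds) yield the drift rate, the boundary separation and the nondecision time. Treating the drift rate as the "information processing efficiency score" is an interpretation, and the function name and scaling parameter s = 0.1 are assumptions.

```python
import math


def ez_diffusion(pc: float, vrt: float, mrt: float, s: float = 0.1):
    """Return (drift rate v, boundary separation a, nondecision time Ter).

    pc  : proportion of correct responses (not 0, 0.5 or 1)
    vrt : variance of response times, in seconds squared
    mrt : mean response time, in seconds
    s   : scaling parameter (0.1 by convention)
    """
    if not 0.0 < pc < 1.0 or pc == 0.5:
        raise ValueError("pc must be in (0, 1) and different from 0.5")
    if vrt <= 0.0:
        raise ValueError("vrt must be positive")
    logit = math.log(pc / (1.0 - pc))
    x = logit * (logit * pc**2 - logit * pc + pc - 0.5) / vrt
    # Drift rate takes the sign of (pc - 0.5).
    v = math.copysign(s * x**0.25, pc - 0.5)
    a = s**2 * logit / v
    y = -v * a / s**2
    # Mean decision time implied by v and a; the remainder is nondecision time.
    mdt = (a / (2.0 * v)) * (1.0 - math.exp(y)) / (1.0 + math.exp(y))
    ter = mrt - mdt
    return v, a, ter
```

    Calling ez_diffusion(0.802, 0.112, 0.723) gives a drift rate near 0.1, a boundary separation near 0.14 and a nondecision time near 0.3 s; these inputs are illustrative and do not come from the studies summarized above.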

  18. Integrated Application of Active Controls (IAAC) technology to an advanced subsonic transport project: Test act system validation

    NASA Technical Reports Server (NTRS)

    1985-01-01

    The primary objective of the Test Active Control Technology (ACT) System laboratory tests was to verify and validate the system concept, hardware, and software. The initial lab tests were open loop hardware tests of the Test ACT System as designed and built. During the course of the testing, minor problems were uncovered and corrected. Major software tests were run. The initial software testing was also open loop. These tests examined pitch control laws, wing load alleviation, signal selection/fault detection (SSFD), and output management. The Test ACT System was modified to interface with the direct drive valve (DDV) modules. The initial testing identified problem areas with DDV nonlinearities, valve friction induced limit cycling, DDV control loop instability, and channel command mismatch. The other DDV issue investigated was the ability to detect and isolate failures. Some simple schemes for failure detection were tested but were not completely satisfactory. The Test ACT System architecture continues to appear promising for ACT/FBW applications in systems that must be immune to worst case generic digital faults, and be able to tolerate two sequential nongeneric faults with no reduction in performance. The challenge in such an implementation would be to keep the analog element sufficiently simple to achieve the necessary reliability.

  19. The Apgar score has survived the test of time.

    PubMed

    Finster, Mieczyslaw; Wood, Margaret

    2005-04-01

    In 1953, Virginia Apgar, M.D. published her proposal for a new method of evaluation of the newborn infant. The avowed purpose of this paper was to establish a simple and clear classification of newborn infants which can be used to compare the results of obstetric practices, types of maternal pain relief and the results of resuscitation. Having considered several objective signs pertaining to the condition of the infant at birth she selected five that could be evaluated and taught to the delivery room personnel without difficulty. These signs were heart rate, respiratory effort, reflex irritability, muscle tone and color. Sixty seconds after the complete birth of the baby a rating of zero, one or two was given to each sign, depending on whether it was absent or present. Virginia Apgar reviewed anesthesia records of 1025 infants born alive at Columbia Presbyterian Medical Center during the period of this report. All had been rated by her method. Infants in poor condition scored 0-2, infants in fair condition scored 3-7, while scores 8-10 were achieved by infants in good condition. The most favorable score 1 min after birth was obtained by infants delivered vaginally with the occiput the presenting part (average 8.4). Newborns delivered by version and breech extraction had the lowest score (average 6.3). Infants delivered by cesarean section were more vigorous (average score 8.0) when spinal was the method of anesthesia versus an average score of 5.0 when general anesthesia was used. Correlating the 60 s score with neonatal mortality, Virginia found that mature infants receiving 0, 1 or 2 scores had a neonatal death rate of 14%; those scoring 3, 4, 5, 6 or 7 had a death rate of 1.1%; and those in the 8-10 score group had a death rate of 0.13%. She concluded that the prognosis of an infant is excellent if he receives one of the upper three scores, and poor if one of the lowest three scores.
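
    The scoring rule described in this abstract (five signs, each rated 0, 1, or 2, summed to a 0-10 total and grouped into poor, fair, and good condition) can be written as a short sketch. The dictionary keys and function names are hypothetical and chosen only for illustration; the cut points are the ones quoted in the abstract.

```python
SIGNS = ("heart_rate", "respiratory_effort", "reflex_irritability",
         "muscle_tone", "color")


def apgar_total(ratings):
    """Sum the five sign ratings; each rating must be 0, 1, or 2."""
    for sign in SIGNS:
        if ratings[sign] not in (0, 1, 2):
            raise ValueError(f"{sign} must be rated 0, 1, or 2")
    return sum(ratings[sign] for sign in SIGNS)


def apgar_condition(total):
    """Map a total score to the condition groups quoted in the abstract."""
    if not 0 <= total <= 10:
        raise ValueError("total must be between 0 and 10")
    if total <= 2:
        return "poor"   # scores 0-2
    if total <= 7:
        return "fair"   # scores 3-7
    return "good"       # scores 8-10
```

    For example, an infant rated 2 on every sign except color (rated 1) totals 9 and falls in the "good" group.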

  20. An approach to analyzing a single subject's scores obtained in a standardized test with application to the Aachen Aphasia Test (AAT).

    PubMed

    Willmes, K

    1985-08-01

    Methods for the analysis of a single subject's test profile(s) proposed by Huber (1973) are applied to the Aachen Aphasia Test (AAT). The procedures are based on the classical test theory model (Lord & Novick, 1968) and are suited for any (achievement) test with standard norms from a large standardization sample and satisfactory reliability estimates. Two test profiles of a Wernicke's aphasic, obtained before and after a 3-month period of speech therapy, are analyzed using inferential comparisons between (groups of) subtest scores on one test application and between two test administrations for single (groups of) subtests. For each of these comparisons, the two aspects of (i) significant (reliable) differences in performance beyond measurement error and (ii) the diagnostic validity of that difference in the reference population of aphasic patients are assessed. Significant differences between standardized subtest scores and a remarkably better preserved reading and writing ability could be found for both test administrations using the multiple test procedure of Holm (1979). Comparison of both profiles revealed an overall increase in performance for each subtest as well as changes in level of performance relations between pairs of subtests.
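
    The multiple test procedure of Holm (1979) named in this abstract is a step-down adjustment: p-values are sorted in ascending order and the k-th smallest is compared against alpha / (m - k + 1). A minimal sketch follows; the function name and the example p-values are hypothetical and are not taken from the AAT analysis.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where Holm's procedure rejects the null."""
    m = len(p_values)
    # Indices of the p-values, smallest first.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # rank is 0-based, so the threshold is alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject
```

    With the illustrative p-values [0.01, 0.04, 0.03, 0.005] and alpha = 0.05, the procedure rejects the first and fourth hypotheses and retains the other two.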

  1. Results of thermal test of metallic molybdenum disk target and fast-acting valve testing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Virgo, M.; Chemerisov, S.; Gromov, R.

    2016-12-01

    This report describes the irradiation conditions for thermal testing of helium-cooled metallic disk targets that was conducted on March 9, 2016, at the Argonne National Laboratory electron linac. The four disks in this irradiation were pressed and sintered by Oak Ridge National Laboratory from molybdenum metal powder. Two of those disks were instrumented with thermocouples. Also reported are results of testing a fast-acting-valve system, which was designed to protect the accelerator in case of a target-window failure.

  2. From Test Scores to Language Use: Emergent Bilinguals Using English to Accomplish Academic Tasks

    ERIC Educational Resources Information Center

    Rodriguez-Mojica, Claudia

    2018-01-01

    Prominent discourses about emergent bilinguals' academic abilities tend to focus on performance as measured by test scores and perpetuate the message that emergent bilinguals trail far behind their peers. When we remove the constraints of formal testing situations, what can emergent bilinguals do in English as they engage in naturally occurring…

  3. Comparison of Standardized Test Scores from Traditional Classrooms and Those Using Problem-Based Learning

    ERIC Educational Resources Information Center

    Needham, Martha Elaine

    2010-01-01

    This research compares differences between standardized test scores in problem-based learning (PBL) classrooms and a traditional classroom for 6th grade students using a mixed-method, quasi-experimental and qualitative design. The research shows that problem-based learning is as effective as traditional teaching methods on standardized tests. The…

  4. Co-Educational Tutorial Classes and Their Significance on Gendered Test Scores of Wollo University Students: A Before-After Analyses

    ERIC Educational Resources Information Center

    Gidey, Mu'uz

    2015-01-01

    This action research is carried out in a practical class room setting to devise an innovative way of administering tutorial classes to improve students' learning competence with particular reference to gendered test scores. A before-after test score analyses of mean and standard deviations along with t-statistical tests of hypotheses of second…

  5. CK-MM Polymorphism is Associated With Physical Fitness Test Scores in Military Recruits.

    PubMed

    Sprouse, Courtney; Tosi, Laura L; Gordish-Dressman, Heather; Abdel-Ghani, Mai S; Panchapakesan, Karuna; Niederberger, Brenda; Devaney, Joseph M; Kelly, Karen R

    2015-09-01

    Muscle-specific creatine kinase is thought to play an integral role in maintaining energy homeostasis by providing a supply of creatine phosphate. The genetic variant, rs8111989, contributes to individual differences in physical performance, and thus the purpose of this study was to determine if rs8111989 variant is predictive of Physical Fitness Test (PFT) scores in male, military infantry recruits. DNA was extracted from whole blood, and genotyping was performed in 176 Marines. Relationships between PFT measures (run, sit-ups, and pull-ups) and genotype were determined. Participants with 2 copies of the T allele for rs8111989 variant had higher PFT scores for run time, pull-ups, and total PFT score. Specifically, participants with 2 copies of the TT allele (variant) (n = 97) demonstrated an overall higher total PFT score as compared with those with one copy of the C allele (n = 79) (TT: 250 ± 31 vs. 238 ± 31; p = 0.02), run score (TT: 82 ± 10 vs. 78 ± 11; p = 0.04) and pull-up score (TT: 78 ± 11 vs. 65 ± 21; p = 0.04) or those with the CC/CT genotype. These results demonstrate an association between physical performance measures and genetic variation in the muscle-specific creatine kinase gene (rs8111989). Reprint & Copyright © 2015 Association of Military Surgeons of the U.S.

  6. The Sequential Probability Ratio Test: An efficient alternative to exact binomial testing for Clean Water Act 303(d) evaluation.

    PubMed

    Chen, Connie; Gribble, Matthew O; Bartroff, Jay; Bay, Steven M; Goldstein, Larry

    2017-05-01

    The United States's Clean Water Act stipulates in section 303(d) that states must identify impaired water bodies for which total maximum daily loads (TMDLs) of pollution inputs into water bodies are developed. Decision-making procedures about how to list, or delist, water bodies as impaired, or not, per Clean Water Act 303(d) differ across states. In states such as California, whether or not a particular monitoring sample suggests that water quality is impaired can be regarded as a binary outcome variable, and California's current regulatory framework invokes a version of the exact binomial test to consolidate evidence across samples and assess whether the overall water body complies with the Clean Water Act. Here, we contrast the performance of California's exact binomial test with one potential alternative, the Sequential Probability Ratio Test (SPRT). The SPRT uses a sequential testing framework, testing samples as they become available and evaluating evidence as it emerges, rather than measuring all the samples and calculating a test statistic at the end of the data collection process. Through simulations and theoretical derivations, we demonstrate that the SPRT on average requires fewer samples to be measured to have comparable Type I and Type II error rates as the current fixed-sample binomial test. Policymakers might consider efficient alternatives such as SPRT to current procedure. Copyright © 2017 Elsevier Ltd. All rights reserved.
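
    The sequential logic described in this abstract can be sketched with Wald's classic SPRT for binary outcomes. This is a generic illustration, not the authors' implementation: the exceedance rates p0 and p1, the error rates, and the function name are all assumed values chosen for the example, and they are not California's regulatory parameters.

```python
import math


def sprt_bernoulli(samples, p0, p1, alpha=0.05, beta=0.05):
    """Wald SPRT for H0: p = p0 versus H1: p = p1 (with p1 > p0).

    samples: iterable of 0/1 outcomes (1 = sample exceeds the criterion)
    Returns (decision, n_used), where decision is "accept_h0",
    "accept_h1", or "continue" if the data ran out before a boundary was hit.
    """
    upper = math.log((1 - beta) / alpha)   # crossing it favours H1
    lower = math.log(beta / (1 - alpha))   # crossing it favours H0
    llr = 0.0  # running log-likelihood ratio
    n = 0
    for n, x in enumerate(samples, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept_h1", n
        if llr <= lower:
            return "accept_h0", n
    return "continue", n
```

    With the illustrative rates p0 = 0.1 and p1 = 0.3, three consecutive exceedances are enough to stop in favour of H1, whereas a run of clean samples stops in favour of H0 after twelve. A fixed-sample test would instead wait for the full sample before evaluating the evidence.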

  7. Linking U.S. School District Test Score Distributions to a Common Scale. CEPA Working Paper No. 16-09

    ERIC Educational Resources Information Center

    Reardon, Sean F.; Kalogrides, Demetra; Ho, Andrew D.

    2017-01-01

    There is no comprehensive database of U.S. district-level test scores that is comparable across states. We describe and evaluate a method for constructing such a database. First, we estimate linear, reliability-adjusted linking transformations from state test score scales to the scale of the National Assessment of Educational Progress (NAEP). We…

  8. The Disaggregation of Value-Added Test Scores to Assess Learning Outcomes in Economics Courses

    ERIC Educational Resources Information Center

    Walstad, William B.; Wagner, Jamie

    2016-01-01

    This study disaggregates posttest, pretest, and value-added or difference scores in economics into four types of economic learning: positive, retained, negative, and zero. The types are derived from patterns of student responses to individual items on a multiple-choice test. The micro and macro data from the "Test of Understanding in College…

  9. A Guide for Setting the Cut-Scores to Minimize Weighted Classification Errors in Test Batteries

    ERIC Educational Resources Information Center

    Grabovsky, Irina; Wainer, Howard

    2017-01-01

    In this article, we extend the methodology of the Cut-Score Operating Function that we introduced previously and apply it to a testing scenario with multiple independent components and different testing policies. We derive analytically the overall classification error rate for a test battery under the policy when several retakes are allowed for…

  10. Participation in a coteaching classroom and students' end-of-course test scores

    NASA Astrophysics Data System (ADS)

    Debro, Ava

    General education students consistently perform poorly on standardized science tests. Coteaching is an instructional strategy that improves the achievement of students with disabilities, but very little research exists that examines the effect of coteaching classrooms on the performance of general education students. The purpose of this study was to examine the effect of coteaching classrooms on the performance of general education students. The constructivist theoretical framework provided the foundation for this research. The research question examined the effect that coteaching classrooms had on the performance of general education biology students. In this experimental design utilizing a posttest-only control group, coteaching instructional strategy was the treatment, and student performance was measured using the scores obtained from the biology end-of-course test. Data for this study was analyzed using an independent t-test. The results of this study revealed that there was not a statistically significant difference in student performance on the biology end-of-course test between treatment and control groups. More than half of the general education biology students enrolled in coteaching classrooms failed the end-of-course test. Researchers may use this study as a catalyst to examine other instructional practices that may improve student performance in science courses. The results of this study may be used to persuade coteachers of the importance of attending frequent professional development opportunities that examine a variety of coteaching instructional strategies. Improving the performance of general education students in science may improve standardized test scores, afford more students the opportunity to attend college, and ensure that students are able to compete on a global level.

  11. Raise Test Scores without Selling Your Soul: An Interview with Scott Mandel

    ERIC Educational Resources Information Center

    Curriculum Review, 2006

    2006-01-01

    With his 10th book, Improving Test Scores: A Practical Approach for Teachers and Administrators, Scott Mandel outlines steps educators can take to boost achievement on standardized exams while maintaining the integrity of their day-to-day teaching. Mandel, who holds a Ph.D. in curriculum and instruction from USC, teaches history and English at…

  12. Linear score tests for variance components in linear mixed models and applications to genetic association studies.

    PubMed

    Qu, Long; Guennel, Tobias; Marshall, Scott L

    2013-12-01

    Following the rapid development of genome-scale genotyping technologies, genetic association mapping has become a popular tool to detect genomic regions responsible for certain (disease) phenotypes, especially in early-phase pharmacogenomic studies with limited sample size. In response to such applications, a good association test needs to be (1) applicable to a wide range of possible genetic models, including, but not limited to, the presence of gene-by-environment or gene-by-gene interactions and non-linearity of a group of marker effects, (2) accurate in small samples, fast to compute on the genomic scale, and amenable to large scale multiple testing corrections, and (3) reasonably powerful to locate causal genomic regions. The kernel machine method represented in linear mixed models provides a viable solution by transforming the problem into testing the nullity of variance components. In this study, we consider score-based tests by choosing a statistic linear in the score function. When the model under the null hypothesis has only one error variance parameter, our test is exact in finite samples. When the null model has more than one variance parameter, we develop a new moment-based approximation that performs well in simulations. Through simulations and analysis of real data, we demonstrate that the new test possesses most of the aforementioned characteristics, especially when compared to existing quadratic score tests or restricted likelihood ratio tests. © 2013, The International Biometric Society.

  13. Maintenance of Wakefulness Test scores and driving performance in sleep disorder patients and controls.

    PubMed

    Philip, Pierre; Chaufton, Cyril; Taillard, Jacques; Sagaspe, Patricia; Léger, Damien; Raimondi, Monika; Vakulin, Andrew; Capelli, Aurore

    2013-08-01

    Sleepiness at the wheel is a risk factor for traffic accidents. Past studies have demonstrated the validity of the Maintenance of Wakefulness Test (MWT) scores as a predictor of driving impairment in untreated patients with obstructive sleep apnea syndrome (OSAS), but there is limited information on the validity of the maintenance of wakefulness test by MWT in predicting driving impairment in patients with hypersomnias of central origin (narcolepsy or idiopathic hypersomnia). The aim of this study was to compare the MWT scores with driving performance in sleep disorder patients and controls. 19 patients suffering from hypersomnias of central origin (9 narcoleptics and 10 idiopathic hypersomnia), 17 OSAS patients and 14 healthy controls performed a MWT (4×40-minute trials) and a 40-minute driving session on a real car driving simulator. Participants were divided into 4 groups defined by their MWT sleep latency scores. The groups were pathological (sleep latency 0-19 min), intermediate (20-33 min), alert (34-40 min) and control (>34 min). The main driving performance outcome was the number of inappropriate line crossings (ILCs) during the 40 minute drive test. Patients with pathological MWT sleep latency scores (0-19 min) displayed statistically significantly more ILC than patients from the intermediate, alert and control groups (F (3, 46)=7.47, p<0.001). Pathological sleep latencies on the MWT predicted driving impairment in patients suffering from hypersomnias of central origin as well as in OSAS patients. MWT is an objective measure of daytime sleepiness that appears to be useful in estimating the driving performance in sleepy patients. Copyright © 2013 Elsevier B.V. All rights reserved.

  14. Self Adapted Testing as Formative Assessment: Effects of Feedback and Scoring on Engagement and Performance

    ERIC Educational Resources Information Center

    Arieli-Attali, Meirav

    2016-01-01

    This dissertation investigated the feasibility of self-adapted testing (SAT) as a formative assessment tool with the focus on learning. Under two different orientation goals--to excel on a test (performance goal) or to learn from the test (learning goal)--I examined the effect of different scoring rules provided as interactive feedback, on test…

  15. Do Neurocognitive SCAT3 Baseline Test Scores Differ Between Footballers (Soccer) Living With and Without Disability? A Cross-Sectional Study.

    PubMed

    Weiler, Richard; van Mechelen, Willem; Fuller, Colin; Ahmed, Osman Hassan; Verhagen, Evert

    2018-01-01

    To determine if baseline Sport Concussion Assessment Tool, third Edition (SCAT3) scores differ between athletes with and without disability. Cross-sectional comparison of preseason baseline SCAT3 scores for a range of England international footballers. Team doctors and physiotherapists supporting England football teams recorded players' SCAT3 baseline tests from August 1, 2013 to July 31, 2014. A convenience sample of 249 England footballers, of whom 185 were players without disability (male: 119; female: 66) and 64 were players with disability (male learning disability: 17; male cerebral palsy: 28; male blind: 10; female deaf: 9). Between-group comparisons of median SCAT3 total and section scores were made using nonparametric Mann-Whitney-Wilcoxon ranked-sum test. All footballers with disability scored higher symptom severity scores compared with male players without disability. Male footballers with learning disability demonstrated no significant difference in the total number of symptoms, but recorded significantly lower scores on immediate memory and delayed recall compared with male players without disability. Male blind footballers scored significantly higher for total concentration and delayed recall, and male footballers with cerebral palsy scored significantly higher on balance testing and immediate memory, when compared with male players without disability. Female footballers with deafness scored significantly higher for total concentration and balance testing than female footballers without disability. This study suggests that significant differences exist between SCAT3 baseline section scores for footballers with and without disability. Concussion consensus guidelines should recognize these differences and produce guidelines that are specific for the growing number of athletes living with disability.

  16. Improving Personality Facet Scores with Multidimensional Computer Adaptive Testing: An Illustration with the Neo Pi-R

    ERIC Educational Resources Information Center

    Makransky, Guido; Mortensen, Erik Lykke; Glas, Cees A. W.

    2013-01-01

    Narrowly defined personality facet scores are commonly reported and used for making decisions in clinical and organizational settings. Although these facets are typically related, scoring is usually carried out for a single facet at a time. This method can be ineffective and time consuming when personality tests contain many highly correlated…

  17. Are students' impressions of improved learning through active learning methods reflected by improved test scores?

    PubMed

    Everly, Marcee C

    2013-02-01

    To report the transformation from lecture to more active learning methods in a maternity nursing course and to evaluate whether student perception of improved learning through active-learning methods is supported by improved test scores. The process of transforming a course into an active-learning model of teaching is described. A voluntary mid-semester survey for student acceptance of the new teaching method was conducted. Course examination results, from both a standardized exam and a cumulative final exam, among students who received lecture in the classroom and students who had active learning activities in the classroom were compared. Active learning activities were very acceptable to students. The majority of students reported learning more from having active-learning activities in the classroom rather than lecture-only and this belief was supported by improved test scores. Students who had active learning activities in the classroom scored significantly higher on a standardized assessment test than students who received lecture only. The findings support the use of student reflection to evaluate the effectiveness of active-learning methods and help validate the use of student reflection of improved learning in other research projects. Copyright © 2011 Elsevier Ltd. All rights reserved.

  18. Changes in Student Populations and Average Test Scores of Dutch Primary Schools

    ERIC Educational Resources Information Center

    Luyten, Hans; de Wolf, Inge

    2011-01-01

    This article focuses on the relation between student population characteristics and average test scores per school in the final grade of primary education from a dynamic perspective. Aggregated data of over 5,000 Dutch primary schools covering a 6-year period were used to study the relation between changes in school populations and shifts in mean…

  19. States Eyeing Expense of Hand-Scored Tests in Light of NCLB Rules

    ERIC Educational Resources Information Center

    Archer, Jeff

    2005-01-01

    When students put down their pencils at the end of Connecticut's testing each year, another intensive process begins. Hundreds of trained evaluators work day and night for about a month to score the written responses. Although expensive, the use of open-ended questions drives the kind of instruction that state leaders say they want in their…

  20. Updating prognosis of cirrhosis by Cox's regression model using Child-Pugh score and aminopyrine breath test as time-dependent covariates.

    PubMed

    Merkel, C; Morabito, A; Sacerdoti, D; Bolognesi, M; Angeli, P; Gatta, A

    1998-06-01

    The determination of aminopyrine breath test on entry into the study was recently shown to improve the accuracy of prediction of death based on the Child-Pugh classification, but the possible usefulness of serial determinations of both parameters has not been assessed. In the present study, we aimed at evaluating whether serial determinations of aminopyrine breath test and Child-Pugh score improve prognostic accuracy in patients with cirrhosis, compared with determinations obtained only on admission. In 74 patients with liver cirrhosis aminopyrine breath test and Child-Pugh score were obtained upon entry into the study. Patients were followed with sequential aminopyrine breath tests and assessments of the Child-Pugh score every 4-6 months. A total number of 232 determinations were obtained. During follow-up 45 patients died, on average after 12 months of follow-up. Child-Pugh score improved in the beginning of follow-up, and then remained fairly constant; aminopyrine breath test showed no improvement in the beginning of follow-up, but rather a slowly progressive decline. In patients who died, both the Child-Pugh score and the metabolism of aminopyrine were significantly more impaired in the last year preceding death (p < 0.05). Applying Cox's regression model with time-dependent covariates, Child-Pugh score and aminopyrine breath test were independent significant predictors of survival. The model with time-dependent covariates explained the observed survival much better than the model with time-fixed covariates (chi-sq. explained by regression = 31.45 vs 11.97; d.f. = 2; p = 0.0000001 vs 0.003). These data suggest that serial determinations of Child-Pugh score and aminopyrine breath test can be used to efficiently update prognosis of cirrhosis.

  1. Assessing Growth in Young Children: A Comparison of Raw, Age-Equivalent, and Standard Scores Using the Peabody Picture Vocabulary Test

    ERIC Educational Resources Information Center

    Sullivan, Jeremy R.; Winter, Suzanne M.; Sass, Daniel A.; Svenkerud, Nicole

    2014-01-01

    Many tests provide users with several different types of scores to facilitate interpretation and description of students' performance. Common examples include raw scores, age- and grade-equivalent scores, and standard scores. However, when used within the context of assessing growth among young children, these scores should not be interchangeable…

  2. The Impact of the 2004 Hurricanes on Florida Comprehensive Assessment Test Scores: Implications for School Counselors

    ERIC Educational Resources Information Center

    Baggerly, Jennifer; Ferretti, Larissa K.

    2008-01-01

    What is the impact of natural disasters on students' statewide assessment scores? To answer this question, Florida Comprehensive Assessment Test (FCAT) scores of 55,881 students in grades 4 through 10 were analyzed to determine if there were significant decreases after the 2004 hurricanes. Results reveal that there was statistical but no practical…

  3. Lower Quarter Y-Balance Test Scores and Lower Extremity Injury in NCAA Division I Athletes.

    PubMed

    Lai, Wilson C; Wang, Dean; Chen, James B; Vail, Jeremy; Rugg, Caitlin M; Hame, Sharon L

    2017-08-01

    Functional movement tests that are predictive of injury risk in National Collegiate Athletic Association (NCAA) athletes are useful tools for sports medicine professionals. The Lower Quarter Y-Balance Test (YBT-LQ) measures single-leg balance and reach distances in 3 directions. To assess whether the YBT-LQ predicts the laterality and risk of sports-related lower extremity (LE) injury in NCAA athletes. Case-control study; Level of evidence, 3. The YBT-LQ was administered to 294 NCAA Division I athletes from 21 sports during preparticipation physical examinations at a single institution. Athletes were followed prospectively over the course of the corresponding season. Correlation analysis was performed between the laterality of reach asymmetry and composite scores (CS) versus the laterality of injury. Receiver operating characteristic (ROC) analysis was used to determine the optimal asymmetry cutoff score for YBT-LQ. A multivariate regression analysis adjusting for sex, sport type, body mass index, and history of prior LE surgery was performed to assess predictors of earlier and higher rates of injury. Neither the laterality of reach asymmetry nor the CS correlated with the laterality of injury. ROC analysis found optimal cutoff scores of 2, 9, and 3 cm for anterior, posteromedial, and posterolateral reach, respectively. All of these potential cutoff scores, along with a cutoff score of 4 cm used in the majority of prior studies, were associated with poor sensitivity and specificity. Furthermore, none of the asymmetric cutoff scores were associated with earlier or increased rate of injury in the multivariate analyses. YBT-LQ scores alone do not predict LE injury in this collegiate athlete population. Sports medicine professionals should be cautioned against using the YBT-LQ alone to screen for injury risk in collegiate athletes.

  4. Toxic Substances Control Act Test Submissions 2.0 (TSCATS 2.0)

    EPA Pesticide Factsheets

    The Toxic Substances Control Act Test Submissions 2.0 (TSCATS 2.0) tracks the submissions of health and safety data submitted to the EPA either as required or voluntarily under certain sections of TSCA.

  5. Improvement in intelligence test scores from 6 to 10 years in children of teenage mothers.

    PubMed

    Cornelius, Marie D; Goldschmidt, Lidush; De Genna, Natacha M; Richardson, Gale A; Leech, Sharon L; Day, Richard

    2010-06-01

    This study investigates change in IQ scores among 290 children born to teenage mothers and identifies social, economic, and environmental variables that may be associated with change in intelligence test performance. The children of 290 teenage mothers (72% African-American and 28% European American) were assessed with the Stanford-Binet Intelligence Scale-4th Edition at ages 6 and 10. The mean composite score at age 6 was 84.8 and 91.2 at age 10, an improvement of 6.4 points. Significant cross-sectional predictors at both ages 6 and 10 of higher Stanford-Binet Intelligence Scale scores were maternal cognitive ability, school grade, white ethnicity, and caregiver education. Having more children in the household significantly predicted lower Stanford-Binet Intelligence Scale scores at age 6. Higher satisfaction with maternal social support predicted higher Stanford-Binet Intelligence Scale scores at age 10. Change in IQ scores was not related to maternal socioeconomic status, social support, home environment, ethnicity, or family interactions. Custodial stability was associated with an improvement in IQ scores, whereas increase in caregiver depression was related to decline in IQ scores. Our findings suggest that improvement in IQ scores of offspring of teenage mothers may be related to stability of maternal custody. More research is needed to determine the impact of the maturation of adolescent mothers' parenting and the role of early education on improvement in cognitive abilities.

  6. Sequential Neighborhood Effects: The Effect of Long-Term Exposure to Concentrated Disadvantage on Children's Reading and Math Test Scores.

    PubMed

    Hicks, Andrew L; Handcock, Mark S; Sastry, Narayan; Pebley, Anne R

    2018-02-01

    Prior research has suggested that children living in a disadvantaged neighborhood have lower achievement test scores, but these studies typically have not estimated causal effects that account for neighborhood choice. Recent studies used propensity score methods to account for the endogeneity of neighborhood exposures, comparing disadvantaged and nondisadvantaged neighborhoods. We develop an alternative propensity function approach in which cumulative neighborhood effects are modeled as a continuous treatment variable. This approach offers several advantages. We use our approach to examine the cumulative effects of neighborhood disadvantage on reading and math test scores in Los Angeles. Our substantive results indicate that recency of exposure to disadvantaged neighborhoods may be more important than average exposure for children's test scores. We conclude that studies of child development should consider both average cumulative neighborhood exposure and the timing of this exposure.

  7. Sequential Neighborhood Effects: The Effect of Long-Term Exposure to Concentrated Disadvantage on Children's Reading and Math Test Scores

    PubMed Central

    Hicks, Andrew L.; Handcock, Mark S.; Sastry, Narayan

    2018-01-01

    Prior research has suggested that children living in a disadvantaged neighborhood have lower achievement test scores, but these studies typically have not estimated causal effects that account for neighborhood choice. Recent studies used propensity score methods to account for the endogeneity of neighborhood exposures, comparing disadvantaged and nondisadvantaged neighborhoods. We develop an alternative propensity function approach in which cumulative neighborhood effects are modeled as a continuous treatment variable. This approach offers several advantages. We use our approach to examine the cumulative effects of neighborhood disadvantage on reading and math test scores in Los Angeles. Our substantive results indicate that recency of exposure to disadvantaged neighborhoods may be more important than average exposure for children's test scores. We conclude that studies of child development should consider both average cumulative neighborhood exposure and the timing of this exposure. PMID:29192386

  8. Refining Ovarian Cancer Test accuracy Scores (ROCkeTS): protocol for a prospective longitudinal test accuracy study to validate new risk scores in women with symptoms of suspected ovarian cancer

    PubMed Central

    Sundar, Sudha; Rick, Caroline; Dowling, Francis; Au, Pui; Rai, Nirmala; Champaneria, Rita; Stobart, Hilary; Neal, Richard; Davenport, Clare; Mallett, Susan; Sutton, Andrew; Kehoe, Sean; Timmerman, Dirk; Bourne, Tom; Van Calster, Ben; Gentry-Maharaj, Aleksandra; Deeks, Jon

    2016-01-01

    Introduction Ovarian cancer (OC) is associated with non-specific symptoms such as bloating, making accurate diagnosis challenging: only 1 in 3 women with OC presents through primary care referral. National Institute for Health and Care Excellence guidelines recommend sequential testing with CA125 and routine ultrasound in primary care. However, these diagnostic tests have limited sensitivity or specificity. Improving accurate triage in women with vague symptoms is likely to improve mortality by streamlining referral and care pathways. The Refining Ovarian Cancer Test Accuracy Scores (ROCkeTS; HTA 13/13/01) project will derive and validate new tests/risk prediction models that estimate the probability of having OC in women with symptoms. This protocol refers to the prospective study only (phase III). Methods and analysis ROCkeTS comprises four parallel phases. The full ROCkeTS protocol can be found at http://www.birmingham.ac.uk/ROCKETS. Phase III is a prospective test accuracy study. The study will recruit 2450 patients from 15 UK sites. Recruited patients complete symptom and anxiety questionnaires, donate a serum sample and undergo ultrasound scored as per International Ovarian Tumour Analysis (IOTA) criteria. Recruitment is at rapid access clinics, emergency departments and elective clinics. Models to be evaluated include those based on ultrasound derived by the IOTA group and novel models derived from analysis of existing data sets. Estimates of sensitivity, specificity, c-statistic (area under receiver operating curve), positive predictive value and negative predictive value of diagnostic tests are evaluated and a calibration plot for models will be presented. ROCkeTS has received ethical approval from the NHS West Midlands REC (14/WM/1241) and is registered on the controlled trials website (ISRCTN17160843) and the National Institute of Health Research Cancer and Reproductive Health portfolios. PMID:27507231

  9. Generalizing Terwilliger's likelihood approach: a new score statistic to test for genetic association.

    PubMed

    el Galta, Rachid; Uitte de Willige, Shirley; de Visser, Marieke C H; Helmer, Quinta; Hsu, Li; Houwing-Duistermaat, Jeanine J

    2007-09-24

    In this paper, we propose a one degree of freedom test for association between a candidate gene and a binary trait. This method is a generalization of Terwilliger's likelihood ratio statistic and is especially powerful for the situation of one associated haplotype. As an alternative to the likelihood ratio statistic, we derive a score statistic, which has a tractable expression. For haplotype analysis, we assume that phase is known. By means of a simulation study, we compare the performance of the score statistic to Pearson's chi-square statistic and the likelihood ratio statistic proposed by Terwilliger. We illustrate the method on three candidate genes studied in the Leiden Thrombophilia Study. We conclude that the statistic follows a chi square distribution under the null hypothesis and that the score statistic is more powerful than Terwilliger's likelihood ratio statistic when the associated haplotype has frequency between 0.1 and 0.4 and has a small impact on the studied disorder. With regard to Pearson's chi-square statistic, the score statistic has more power when the associated haplotype has frequency above 0.2 and the number of variants is above five.

  10. An Evaluation of Three Approximate Item Response Theory Models for Equating Test Scores.

    ERIC Educational Resources Information Center

    Marco, Gary L.; And Others

    Three item response models were evaluated for estimating item parameters and equating test scores. The models, which approximated the traditional three-parameter model, included: (1) the Rasch one-parameter model, operationalized in the BICAL computer program; (2) an approximate three-parameter logistic model based on coarse group data divided…

  11. Using College Admission Test Scores to Clarify High School Placement. Leading Indicator Spotlight

    ERIC Educational Resources Information Center

    Flug, Susanna

    2010-01-01

    In "Beyond Test Scores: Leading Indicators for Education," Foley and colleagues (2008) define leading indicators as those that "provide early signals of progress toward academic achievement" (p. 1) and stress that educators "need leading indicators to help them see the direction their efforts are going in and to take…

  12. Differences of wells scores accuracy, caprini scores and padua scores in deep vein thrombosis diagnosis

    NASA Astrophysics Data System (ADS)

    Gatot, D.; Mardia, A. I.

    2018-03-01

    Deep vein thrombosis (DVT) is a venous thrombus in the lower limbs. Diagnosis is made using venography or compression ultrasound. However, these examinations are not yet available in some health facilities. Therefore, many scoring systems have been developed for the diagnosis of DVT. Scoring methods are practical and safe to use, and are efficient and effective in terms of treatment and costs. The existing scoring systems are the Wells, Caprini and Padua scores. There have been many studies comparing the accuracy of these scores, but not in Medan. Therefore, we are interested in comparative research on the Wells, Caprini and Padua scores in Medan. An observational, analytical, case-control study was conducted to perform diagnostic tests on the Wells, Caprini and Padua scores to predict the risk of DVT. The study was conducted at H. Adam Malik Hospital in Medan. Of a total of 72 subjects, 39 (54.2%) were men and the mean age was 53.14 years. The Wells, Caprini and Padua scores had sensitivities of 80.6%, 61.1% and 50%, respectively; specificities of 80.65, 66.7% and 75%, respectively; and accuracies of 87.5%, 64.3% and 65.7%, respectively. The Wells score has better sensitivity, specificity and accuracy than the Caprini and Padua scores in diagnosing DVT.
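
    For readers unfamiliar with the three measures reported above, the sketch below shows how sensitivity, specificity and accuracy are conventionally computed from a 2x2 table of test results against a reference standard. The counts are hypothetical (chosen only to total 72 subjects) and are not taken from the study; the function name is likewise an illustration, not anything the authors describe.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, accuracy) from 2x2 counts.

    tp/fn: diseased subjects with a positive/negative test
    tn/fp: non-diseased subjects with a negative/positive test
    """
    sensitivity = tp / (tp + fn)          # true positives among the diseased
    specificity = tn / (tn + fp)          # true negatives among the non-diseased
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # all correct calls among all subjects
    return sensitivity, specificity, accuracy


# Hypothetical counts for 72 subjects (36 cases, 36 controls); NOT study data.
sens, spec, acc = diagnostic_metrics(tp=29, fp=7, fn=7, tn=29)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

    Under this definition, accuracy is a weighted average of sensitivity and specificity (weighted by the proportion of diseased and non-diseased subjects), so it always lies between the two. A reported accuracy outside that range would suggest that a different definition was used.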

  13. Correcting Two-Sample "z" and "t" Tests for Correlation: An Alternative to One-Sample Tests on Difference Scores

    ERIC Educational Resources Information Center

    Zimmerman, Donald W.

    2012-01-01

    In order to circumvent the influence of correlation in paired-samples and repeated measures experimental designs, researchers typically perform a one-sample Student "t" test on difference scores. That procedure entails some loss of power, because it employs N - 1 degrees of freedom instead of the 2N - 2 degrees of freedom of the…
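
    The degrees-of-freedom contrast described above can be made concrete with a minimal sketch that computes both standard tests on the same paired data: the one-sample t test on difference scores (N - 1 degrees of freedom) and the pooled two-sample t test (2N - 2 degrees of freedom). The data and function names are hypothetical. The abstract is truncated, so the correction the author proposes is not reproduced here; this only shows the two procedures being compared.

```python
import math
import statistics


def paired_t(x, y):
    """One-sample t test on difference scores; returns (t, df) with df = N - 1."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1


def pooled_two_sample_t(x, y):
    """Pooled-variance two-sample t test for equal group sizes.

    Returns (t, df) with df = 2N - 2. This version ignores any correlation
    between the paired observations.
    """
    n = len(x)
    pooled_var = (statistics.variance(x) + statistics.variance(y)) / 2
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled_var * 2 / n)
    return t, 2 * n - 2


# Hypothetical paired scores for N = 5 subjects.
before = [1, 2, 3, 4, 5]
after = [2, 2, 4, 4, 6]
print(paired_t(before, after))
print(pooled_two_sample_t(before, after))
```

    With positively correlated pairs such as these, the uncorrected two-sample statistic is much smaller in magnitude than the paired one, which illustrates why the correlation has to be accounted for in some way.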

  14. Loanwords and Vocabulary Size Test Scores: A Case of Different Estimates for Different L1 Learners

    ERIC Educational Resources Information Center

    Laufer, Batia; McLean, Stuart

    2016-01-01

    The article investigated how the inclusion of loanwords in vocabulary size tests affected the test scores of two L1 groups of EFL learners: Hebrew and Japanese. New BNC- and COCA-based vocabulary size tests were constructed in three modalities: word form recall, word form recognition, and word meaning recall. Depending on the test modality, the…

  15. Insights into Using "TOEIC"® Test Scores to Inform Human Resource Management Decisions. Research Report. ETS RR-17-48

    ERIC Educational Resources Information Center

    Oliveri, María Elena; Tannenbaum, Richard J.

    2017-01-01

    This report explores the ways in which human resource (HR) managers use "TOEIC"® scores to inform hiring, promotion, and training decisions in an international workplace. Two data sources were used (a) previously collected test users' testimonials that described managers' use of TOEIC scores to inform HR decisions and (b) test-use…

  16. Exploration of Analysis Methods for Diagnostic Imaging Tests: Problems with ROC AUC and Confidence Scores in CT Colonography

    PubMed Central

    Mallett, Susan; Halligan, Steve; Collins, Gary S.; Altman, Doug G.

    2014-01-01

    Background Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. Methods In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. Results Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. Conclusions The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests. PMID:25353643
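
    The contrast between the two approaches can be sketched as follows. Sensitivity and specificity are computed from each reader's binary call, whereas a nonparametric ROC AUC is computed from confidence scores as the proportion of (positive case, negative case) pairs in which the positive case received the higher score, with ties counted as one half. The scores below are hypothetical and simply mimic the pattern the abstract describes (many zero scores when no polyp was identified); the function names are illustrative, and this empirical AUC is only one of several ROC methods, not necessarily the one used in the study.

```python
def empirical_auc(pos_scores, neg_scores):
    """Nonparametric ROC AUC (Mann-Whitney form); ties count as 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sens_spec(pos_scores, neg_scores, threshold=0):
    """Sensitivity and specificity when a case is called positive if score > threshold."""
    sensitivity = sum(s > threshold for s in pos_scores) / len(pos_scores)
    specificity = sum(s <= threshold for s in neg_scores) / len(neg_scores)
    return sensitivity, specificity


# Hypothetical confidence scores (0-100); zero means "no polyp reported".
with_polyps = [0, 0, 80, 90]
without_polyps = [0, 0, 0, 10]
print(empirical_auc(with_polyps, without_polyps))
print(sens_spec(with_polyps, without_polyps))
```

    In this toy data a single low-confidence false positive and the block of tied zero scores both move the AUC, while sensitivity and specificity depend only on the binary calls.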

  17. Exploration of analysis methods for diagnostic imaging tests: problems with ROC AUC and confidence scores in CT colonography.

    PubMed

    Mallett, Susan; Halligan, Steve; Collins, Gary S; Altman, Doug G

    2014-01-01

    Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests.

  18. Comprehensive School Reform and Standardized Test Scores in Illinois Elementary and Middle Schools

    ERIC Educational Resources Information Center

    McEnroe, James D.

    2010-01-01

    The study examined the effects of the federally funded Comprehensive School Reform (CSR) program on student performance on mandated standardized tests. The study focused on the mathematics and reading scores of Illinois public elementary and middle and junior high school students. The federal CSR program provided Illinois schools with an annual…

  19. Relationship of Friends, Physical Education, and State Test Scores: Implications for School Counselors

    ERIC Educational Resources Information Center

    Hollingsworth, Mary Ann

    2010-01-01

    This study examined the relationship between dimensions of wellness and academic performance for 634 third through fifth grade students in Title One schools in rural Mississippi, using composites of the Five Factor Wellness Inventory for Elementary Children and Reading, Language, and Math Scores of the Mississippi Curriculum Test (a state level…

  20. Depressive status explains a significant amount of the variance in COPD assessment test (CAT) scores

    PubMed Central

    Miravitlles, Marc; Molina, Jesús; Quintano, José Antonio; Campuzano, Anna; Pérez, Joselín; Roncero, Carlos

    2018-01-01

    Background COPD assessment test (CAT) is a short, easy-to-complete health status tool that has been incorporated into the multidimensional assessment of COPD in order to guide therapy; therefore, it is important to understand the factors determining CAT scores. Methods This is a post hoc analysis of a cross-sectional, observational study conducted in respiratory medicine departments and primary care centers in Spain with the aim of identifying the factors determining CAT scores, focusing particularly on the cognitive status measured by the Mini-Mental State Examination (MMSE) and levels of depression measured by the short Beck Depression Inventory (BDI). Results A total of 684 COPD patients were analyzed; 84.1% were men, the mean age of patients was 68.7 years, and the mean forced expiratory volume in 1 second (%) was 55.1%. Mean CAT score was 21.8. CAT scores correlated with the MMSE score (Pearson’s coefficient r=−0.371) and the BDI (r=0.620), both p<0.001. In the multivariate analysis, the usual COPD severity variables (age, dyspnea, lung function, and comorbidity) together with MMSE and BDI scores were significantly associated with CAT scores and explained 45% of the variability. However, a model including only MMSE and BDI scores explained up to 40% and BDI alone explained 38% of the CAT variance. Conclusion CAT scores are associated with clinical variables of severity of COPD. However, cognitive status and, in particular, the level of depression explain a larger percentage of the variance in the CAT scores than the usual COPD clinical severity variables. PMID:29563782
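
    As a quick consistency check on the figures above: for a single predictor, the proportion of variance explained equals the squared Pearson correlation. The short sketch below applies that identity to the two coefficients reported in the abstract; the function name is illustrative.

```python
def variance_explained(r):
    """Proportion of variance explained by a single predictor with Pearson correlation r."""
    return r ** 2


r_bdi = 0.620    # CAT vs. BDI, as reported in the abstract
r_mmse = -0.371  # CAT vs. MMSE, as reported in the abstract

print(f"BDI alone:  {variance_explained(r_bdi):.1%}")
print(f"MMSE alone: {variance_explained(r_mmse):.1%}")
```

    The squared BDI correlation is about 0.38, which agrees with the abstract's statement that BDI alone explained 38% of the CAT variance.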

  1. Depressive status explains a significant amount of the variance in COPD assessment test (CAT) scores.

    PubMed

    Miravitlles, Marc; Molina, Jesús; Quintano, José Antonio; Campuzano, Anna; Pérez, Joselín; Roncero, Carlos

    2018-01-01

    COPD assessment test (CAT) is a short, easy-to-complete health status tool that has been incorporated into the multidimensional assessment of COPD in order to guide therapy; therefore, it is important to understand the factors determining CAT scores. This is a post hoc analysis of a cross-sectional, observational study conducted in respiratory medicine departments and primary care centers in Spain with the aim of identifying the factors determining CAT scores, focusing particularly on the cognitive status measured by the Mini-Mental State Examination (MMSE) and levels of depression measured by the short Beck Depression Inventory (BDI). A total of 684 COPD patients were analyzed; 84.1% were men, the mean age of patients was 68.7 years, and the mean forced expiratory volume in 1 second (%) was 55.1%. Mean CAT score was 21.8. CAT scores correlated with the MMSE score (Pearson's coefficient r = −0.371) and the BDI (r = 0.620), both p < 0.001. In the multivariate analysis, the usual COPD severity variables (age, dyspnea, lung function, and comorbidity) together with MMSE and BDI scores were significantly associated with CAT scores and explained 45% of the variability. However, a model including only MMSE and BDI scores explained up to 40% and BDI alone explained 38% of the CAT variance. CAT scores are associated with clinical variables of severity of COPD. However, cognitive status and, in particular, the level of depression explain a larger percentage of the variance in the CAT scores than the usual COPD clinical severity variables.

  2. Meta-Analyses of the Relationship of Creative Achievement to both IQ and Divergent Thinking Test Scores

    ERIC Educational Resources Information Center

    Kim, Kyung Hee

    2008-01-01

    There is disagreement among researchers about whether IQ tests or divergent thinking (DT) tests are better predictors of creative achievement. Resolving this dispute is complicated by the fact that some research has shown a relationship between IQ and DT test scores (e.g., Runco & Albert, 1986; Wallach, 1970). The present study conducted…

  3. Specific algorithm method of scoring the Clock Drawing Test applied in cognitively normal elderly

    PubMed Central

    Mendes-Santos, Liana Chaves; Mograbi, Daniel; Spenciere, Bárbara; Charchat-Fichman, Helenice

    2015-01-01

    The Clock Drawing Test (CDT) is an inexpensive, fast and easily administered measure of cognitive function, especially in the elderly. This instrument is a popular clinical tool widely used in screening for cognitive disorders and dementia. The CDT can be applied in different ways and scoring procedures also vary. Objective The aims of this study were to analyze the performance of elderly on the CDT and evaluate inter-rater reliability of the CDT scored by using a specific algorithm method adapted from Sunderland et al. (1989). Methods We analyzed the CDT of 100 cognitively normal elderly aged 60 years or older. The CDT ("free-drawn") and Mini-Mental State Examination (MMSE) were administered to all participants. Six independent examiners scored the CDT of 30 participants to evaluate inter-rater reliability. Results and Conclusion A score of 5 on the proposed algorithm ("Numbers in reverse order or concentrated"), equivalent to 5 points on the original Sunderland scale, was the most frequent (53.5%). The CDT specific algorithm method used had high inter-rater reliability (p<0.01), and mean score ranged from 5.06 to 5.96. The high frequency of an overall score of 5 points may suggest the need to create more nuanced evaluation criteria, which are sensitive to differences in levels of impairment in visuoconstructive and executive abilities during aging. PMID:29213954

  4. An Investigation of Calculator Use on Employment Tests of Mathematical Ability: Effects on Reliability, Validity, Test Scores, and Speed of Completion

    ERIC Educational Resources Information Center

    Bing, Mark N.; Stewart, Susan M.; Davison, H. Kristl

    2009-01-01

    Handheld calculators have been used on the job for more than 30 years, yet the degree to which these devices can affect performance on employment tests of mathematical ability has not been thoroughly examined. This study used a within-subjects research design (N = 167) to investigate the effects of calculator use on test score reliability, test…

  5. Developing Local Oral Reading Fluency Cut Scores for Predicting High-Stakes Test Performance

    ERIC Educational Resources Information Center

    Grapin, Sally L.; Kranzler, John H.; Waldron, Nancy; Joyce-Beaulieu, Diana; Algina, James

    2017-01-01

    This study evaluated the classification accuracy of a second grade oral reading fluency curriculum-based measure (R-CBM) in predicting third grade state test performance. It also compared the long-term classification accuracy of local and publisher-recommended R-CBM cut scores. Participants were 266 students who were divided into a calibration…

  6. Racial Differences in Test Preparation Strategies: A Commentary on "Shadow Education, American Style: Test Preparation, the SAT and College Enrollment"

    ERIC Educational Resources Information Center

    Alon, Sigal

    2010-01-01

    Claudia Buchmann, Dennis Condron and Vincent Roscigno's study, titled "Shadow Education, American Style: Test Preparation, the SAT and College Enrollment," demonstrates that vigorous use of expensive test preparation tools, such as private classes and tutors, significantly boosts scores on standardized exams such as the SAT or ACT. This…

  7. The Pooling-score (P-score): inter- and intra-rater reliability in endoscopic assessment of the severity of dysphagia.

    PubMed

    Farneti, D; Fattori, B; Nacci, A; Mancini, V; Simonelli, M; Ruoppolo, G; Genovese, E

    2014-04-01

    This study evaluated the intra- and inter-rater reliability of the Pooling score (P-score) in clinical endoscopic evaluation of severity of swallowing disorder, considering excess residue in the pharynx and larynx. The score (minimum 4 - maximum 11) is obtained by the sum of the scores given to the site of the bolus, the amount and ability to control residue/bolus pooling, the latter assessed on the basis of cough, raclage, number of dry voluntary or reflex swallowing acts (< 2, 2-5, > 5). Four judges evaluated 30 short films of pharyngeal transit of 10 solid (1/4 of a cracker), 11 creamy (1 tablespoon of jam) and 9 liquid (1 tablespoon of 5 cc of water coloured with methylene blue, 1 ml in 100 ml) boluses in 23 subjects (10 M/13 F, age from 31 to 76 yrs, mean age 58.56±11.76 years) with different pathologies. The films were randomly distributed on two CDs, which differed in terms of the sequence of the films, and were given to judges (after an explanatory session) at time 0, 24 hours later (time 1) and after 7 days (time 2). The inter- and intra-rater reliability of the P-score was calculated using the intra-class correlation coefficient (ICC; 3,k). The possibility that consistency of boluses could affect the scoring of the films was considered. The ICC for site, amount, management and the P-score total was found to be, respectively, 0.999, 0.997, 1.00 and 0.999. Clinical evaluation of a criterion of severity of a swallowing disorder remains a crucial point in the management of patients with pathologies that predispose to complications. The P-score, derived from static and dynamic parameters, yielded a very high correlation among the scores attributed by the four judges during observations carried out at different times. Bolus consistencies did not affect the outcome of the test: the analysis of variance, performed to verify if the scores attributed by the four judges to the parameters selected, might be influenced by the different consistencies of the boluses

  8. ACTS 118x: High Speed TCP Interoperability Testing

    NASA Technical Reports Server (NTRS)

    Brooks, David E.; Buffinton, Craig; Beering, Dave R.; Welch, Arun; Ivancic, William D.; Zernic, Mike; Hoder, Douglas J.

    1999-01-01

    With the recent explosion of the Internet and the enormous business opportunities available to communication system providers, great interest has developed in improving the efficiency of data transfer over satellite links using the Transmission Control Protocol (TCP) of the Internet Protocol (IP) suite. NASA's ACTS experiments program initiated a series of TCP experiments to demonstrate scalability of TCP/IP and determine to what extent the protocol can be optimized over a 622 Mbps satellite link. Through partnerships with government technology-oriented labs and the computer, telecommunication, and satellite industries, NASA Glenn was able to: (1) promote the development of interoperable, high-performance TCP/IP implementations across multiple computing / operating platforms; (2) work with the satellite industry to answer outstanding questions regarding the use of standard protocols (TCP/IP and ATM) for the delivery of advanced data services, and for use in spacecraft architectures; and (3) conduct a series of TCP/IP interoperability tests over OC12 ATM over a satellite network in a multi-vendor environment using ACTS. The experiments' various network configurations and the results are presented.

  9. Understanding the Role of "SES," Ethnicity, and Discipline Infractions in Students' Standardized Test Scores

    ERIC Educational Resources Information Center

    Koca, Fatih

    2017-01-01

    The goal of the current study is to examine the impact of students' socioeconomic status, ethnicity, and discipline infractions on their standardized test scores in Indiana, USA. Data for this study were extracted from the Indiana Department of Education. ISTEP is a criterion-referenced standardized test. It consists of items that assess a student's…

  10. Physiologic Dysfunction Scores and Cognitive Function Test Performance in United States Adults

    PubMed Central

    Kobrosly, Roni W; Seplaki, Christopher L; Jones, Courtney M; van Wijngaarden, Edwin

    2013-01-01

    Objective To investigate the relationship between a measure of cumulative physiologic dysfunction and specific domains of cognitive function. Methods We examined a summary score measuring physiological dysfunction, a multisystem measure of the body’s ability to effectively adapt to physical and psychological demands, in relation to cognitive function deficits in a population of 4511 adults aged 20 to 59 who participated in the third National Health and Nutrition Examination Survey (1988–1994). Measures of cognitive function comprised three domains: working memory, visuomotor speed, and perceptual-motor speed. ‘Physiologic dysfunction’ scores summarizing measures of cardiovascular, immunologic, kidney, and liver function were explored. We used multiple linear regression models to estimate associations between cognitive function measures and physiological dysfunction scores, adjusting for socioeconomic factors, test conditions, and self-reported health factors. Results We noted a dose-response relationship between physiologic dysfunction and working memory (coefficient = 0.207, 95% CI = (0.066, 0.348), p < 0.0001) that persisted after adjustment for all covariates (p = 0.03). We did not observe any significant relationships between dysfunction scores and visuomotor (p = 0.37) or perceptual-motor ability (p = 0.33). Conclusions Our findings suggest that multisystem physiologic dysfunction is associated with working memory. Future longitudinal studies are needed to clarify the underlying mechanisms and explore the persistency of this association into later life. We suggest that such studies should incorporate physiologic data, neuroendocrine parameters, and a wide range of specific cognitive domains. PMID:22155941

  11. Automated Essay Scoring versus Human Scoring: A Correlational Study

    ERIC Educational Resources Information Center

    Wang, Jinhao; Brown, Michelle Stallone

    2008-01-01

    The purpose of the current study was to analyze the relationship between automated essay scoring (AES) and human scoring in order to determine the validity and usefulness of AES for large-scale placement tests. Specifically, a correlational research design was used to examine the correlations between AES performance and human raters' performance.…

  12. Turkish version of the modified Constant-Murley score and standardized test protocol: reliability and validity.

    PubMed

    Çelik, Derya

    2016-01-01

    The Constant-Murley score (CMS) is widely used to evaluate disabilities associated with shoulder injuries, but it has been criticized for relying on imprecise terminology and a lack of standardized methodology. A modified guideline, therefore, was published in 2008 with several recommendations. This new version has not yet been translated or culturally adapted for Turkish-speaking populations. The purpose of this study was to translate and cross-culturally adapt the modified CMS and its test protocol, as well as define and measure its reliability and validity. The modified CMS was translated into Turkish, consistent with published methodological guidelines. The measurement properties of the Turkish version of the modified CMS were tested in 30 patients (12 males, 18 females; mean age: 59.5±13.5 years) with a variety of shoulder pathologies. Intraclass correlation coefficients (ICC) were used to estimate test-retest reliability. Construct validity was analyzed with the Turkish version of the American Shoulder and Elbow Surgeons (ASES) Standardized Shoulder Assessment Form and Short-Form Health Survey (SF-12). No difficulties were found in the translation process. The Turkish version of the modified CMS showed excellent test-retest reliability (ICC=0.86). The correlation coefficients between the Turkish version of the modified CMS and the ASES, SF-12-physical component score, and SF-12 mental component scores were found to be 0.48, 0.35, and 0.05, respectively. No floor or ceiling effects were found. The translation and cultural adaptation of the modified CMS and its standardized test protocol into Turkish were successful. The Turkish version of the modified CMS has sufficient reliability and validity to measure a variety of shoulder disorders for Turkish-speaking individuals.

  13. Multidimensional CAT Item Selection Methods for Domain Scores and Composite Scores: Theory and Applications

    ERIC Educational Resources Information Center

    Yao, Lihua

    2012-01-01

    Multidimensional computer adaptive testing (MCAT) can provide higher precision and reliability or reduce test length when compared with unidimensional CAT or with the paper-and-pencil test. This study compared five item selection procedures in the MCAT framework for both domain scores and overall scores through simulation by varying the structure…

  14. Using Automated Essay Scores as an Anchor When Equating Constructed Response Writing Tests

    ERIC Educational Resources Information Center

    Almond, Russell G.

    2014-01-01

    Assessments consisting of only a few extended constructed response items (essays) are not typically equated using anchor test designs as there are typically too few essay prompts in each form to allow for meaningful equating. This article explores the idea that output from an automated scoring program designed to measure writing fluency (a common…

  15. Development and Validation of Scores from an Instrument Measuring Student Test-Taking Motivation

    ERIC Educational Resources Information Center

    Eklof, Hanna

    2006-01-01

    Using the expectancy-value model of achievement motivation as a basis, this study's purpose is to develop, apply, and validate scores from a self-report instrument measuring student test-taking motivation. Sampled evidence of construct validity for the present sample indicates that a number of the items in the instrument could be used as an…

  16. The Fight's Not Always Fixed: Using Literary Response to Transcend Standardized Test Scores

    ERIC Educational Resources Information Center

    Avila, JuliAnna

    2012-01-01

    In 2004, the National Endowment for the Arts (NEA) concluded that "literature reading is fading as a meaningful activity, especially among younger people." How can educators continue to teach students about the power of literary response when the priority is for them to achieve proficiency on standardized tests, whose scores can only be narrowly…

  17. Implications of Deployed and Nondeployed Fathers on Seventh Graders' California Achievement Test Scores during a Military Crisis.

    ERIC Educational Resources Information Center

    Pisano, Mark C.

    The differences in California Achievement Test (CAT) scores from 1990 to 1991 in seventh graders, currently enrolled in Albritton Junior High School in the Fort Bragg Schools, of deployed and nondeployed fathers were analyzed. CAT percentile scores from 1990 and 1991 (1991 being the year of "Desert Storm") were obtained in reading, math…

  18. Using a Concept-Grounded, Curriculum-Based Measure in Mathematics To Predict Statewide Test Scores for Middle School Students with LD.

    ERIC Educational Resources Information Center

    Helwig, Robert; Anderson, Lisbeth; Tindal, Gerald

    2002-01-01

    An 11-item math concept curriculum-based measure (CBM) was administered to 171 eighth grade students. Scores were correlated with scores from a computer adaptive test designed in conjunction with the state to approximate the official statewide mathematics achievement tests. Correlations for general education students and students with learning…

  19. Zertifikat Deutsch als Fremdsprache and the Oral Proficiency Interview: A Comparison of Test Scores and Examinations.

    ERIC Educational Resources Information Center

    Lalande, John F.; Schweckendiek, Jurgen

    1986-01-01

    Investigates what correlations might exist between an individual's score on the Zertifikat Deutsch als Fremdsprache and on the Oral Proficiency Interview. The tests themselves are briefly described. Results indicate that the two tests appear to correlate well in their evaluation of speaking skills. (SED)

  20. Linking Composite Scores: Effects of Anchor Test Length and Content Representativeness. Research Report. ETS RR-16-36

    ERIC Educational Resources Information Center

    Lin, Peng; Dorans, Neil; Weeks, Jonathan

    2016-01-01

    The nonequivalent groups with anchor test (NEAT) design is frequently used in test score equating or linking. One important assumption of the NEAT design is that the anchor test is a miniversion of the 2 tests to be equated/linked. When the content of the 2 tests is different, it is not possible for the anchor test to be adequately representative…

  1. A sup-score test for the cure fraction in mixture models for long-term survivors.

    PubMed

    Hsu, Wei-Wen; Todem, David; Kim, KyungMann

    2016-12-01

    The evaluation of cure fractions in oncology research under the well known cure rate model has attracted considerable attention in the literature, but most of the existing testing procedures have relied on restrictive assumptions. A common assumption has been to restrict the cure fraction to a constant under alternatives to homogeneity, thereby neglecting any information from covariates. This article extends the literature by developing a score-based statistic that incorporates covariate information to detect cure fractions, with the existing testing procedure serving as a special case. A complication of this extension, however, is that the implied hypotheses are not typical and standard regularity conditions to conduct the test may not even hold. Using empirical processes arguments, we construct a sup-score test statistic for cure fractions and establish its limiting null distribution as a functional of mixtures of chi-square processes. In practice, we suggest a simple resampling procedure to approximate this limiting distribution. Our simulation results show that the proposed test can greatly improve efficiency over tests that neglect the heterogeneity of the cure fraction under the alternative. The practical utility of the methodology is illustrated using ovarian cancer survival data with long-term follow-up from the surveillance, epidemiology, and end results registry. © 2016, The International Biometric Society.

  2. Rey's Auditory Verbal Learning Test scores can be predicted from whole brain MRI in Alzheimer's disease.

    PubMed

    Moradi, Elaheh; Hallikainen, Ilona; Hänninen, Tuomo; Tohka, Jussi

    2017-01-01

    Rey's Auditory Verbal Learning Test (RAVLT) is a powerful neuropsychological tool for testing episodic memory, which is widely used for the cognitive assessment in dementia and pre-dementia conditions. Several studies have shown that an impairment in RAVLT scores reflects well the underlying pathology caused by Alzheimer's disease (AD), thus making RAVLT an effective early marker to detect AD in persons with memory complaints. We investigated the association between RAVLT scores (RAVLT Immediate and RAVLT Percent Forgetting) and the structural brain atrophy caused by AD. The aim was to comprehensively study to what extent the RAVLT scores are predictable based on structural magnetic resonance imaging (MRI) data using machine learning approaches as well as to find the most important brain regions for the estimation of RAVLT scores. For this, we built a predictive model to estimate RAVLT scores from gray matter density via an elastic net penalized linear regression model. The proposed approach provided highly significant cross-validated correlation between the estimated and observed RAVLT Immediate (R = 0.50) and RAVLT Percent Forgetting (R = 0.43) in a dataset consisting of 806 AD, mild cognitive impairment (MCI) or healthy subjects. In addition, the selected machine learning method provided more accurate estimates of RAVLT scores than the relevance vector regression used earlier for the estimation of RAVLT based on MRI data. The top predictors were medial temporal lobe structures and amygdala for the estimation of RAVLT Immediate and angular gyrus, hippocampus and amygdala for the estimation of RAVLT Percent Forgetting. Further, the conversion of MCI subjects to AD within 3 years could be predicted based on either observed or estimated RAVLT scores with an accuracy comparable to MRI-based biomarkers.

  3. Changing abilities vs. changing tasks: Examining validity degradation with test scores and college performance criteria both assessed longitudinally.

    PubMed

    Dahlke, Jeffrey A; Kostal, Jack W; Sackett, Paul R; Kuncel, Nathan R

    2018-05-03

    We explore potential explanations for validity degradation using a unique predictive validation data set containing up to four consecutive years of high school students' cognitive test scores and four complete years of those students' college grades. This data set permits analyses that disentangle the effects of predictor-score age and timing of criterion measurements on validity degradation. We investigate the extent to which validity degradation is explained by criterion dynamism versus the limited shelf-life of ability scores. We also explore whether validity degradation is attributable to fluctuations in criterion variability over time and/or GPA contamination from individual differences in course-taking patterns. Analyses of multiyear predictor data suggest that changes to the determinants of performance over time have much stronger effects on validity degradation than does the shelf-life of cognitive test scores. The age of predictor scores had only a modest relationship with criterion-related validity when the criterion measurement occasion was held constant. Practical implications and recommendations for future research are discussed. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  4. Clinical score and rapid antigen detection test to guide antibiotic use for sore throats: randomised controlled trial of PRISM (primary care streptococcal management).

    PubMed

    Little, Paul; Hobbs, F D Richard; Moore, Michael; Mant, David; Williamson, Ian; McNulty, Cliodna; Cheng, Ying Edith; Leydon, Geraldine; McManus, Richard; Kelly, Joanne; Barnett, Jane; Glasziou, Paul; Mullee, Mark

    2013-10-10

    To determine the effect of clinical scores that predict streptococcal infection or rapid streptococcal antigen detection tests compared with delayed antibiotic prescribing. Open adaptive pragmatic parallel group randomised controlled trial. Primary care in United Kingdom. Patients aged ≥ 3 with acute sore throat. An internet programme randomised patients to targeted antibiotic use according to: delayed antibiotics (the comparator group for analyses), clinical score, or antigen test used according to clinical score. During the trial a preliminary streptococcal score (score 1, n=1129) was replaced by a more consistent score (score 2, n=631; features: fever during previous 24 hours; purulence; attends rapidly (within three days after onset of symptoms); inflamed tonsils; no cough/coryza (acronym FeverPAIN)). Symptom severity reported by patients on a 7 point Likert scale (mean severity of sore throat/difficulty swallowing for days two to four after the consultation (primary outcome)), duration of symptoms, use of antibiotics. For score 1 there were no significant differences between groups. For score 2, symptom severity was documented in 80% (168/207 (81%) in delayed antibiotics group; 168/211 (80%) in clinical score group; 166/213 (78%) in antigen test group). Reported severity of symptoms was lower in the clinical score group (-0.33, 95% confidence interval -0.64 to -0.02; P=0.04), equivalent to one in three rating sore throat a slight versus moderate problem, with a similar reduction for the antigen test group (-0.30, -0.61 to -0.00; P=0.05). Symptoms rated moderately bad or worse resolved significantly faster in the clinical score group (hazard ratio 1.30, 95% confidence interval 1.03 to 1.63) but not the antigen test group (1.11, 0.88 to 1.40). In the delayed antibiotics group, 75/164 (46%) used antibiotics. Use of antibiotics in the clinical score group (60/161) was 29% lower (adjusted risk ratio 0.71, 95% confidence interval 0.50 to 0.95; P=0.02) and in the

  5. Towards reporting standards for neuropsychological study results: A proposal to minimize communication errors with standardized qualitative descriptors for normalized test scores.

    PubMed

    Schoenberg, Mike R; Rum, Ruba S

    2017-11-01

    Rapid, clear and efficient communication of neuropsychological results is essential to benefit patient care. Errors in communication are a leading cause of medical errors; nevertheless, there remains a lack of consistency in how neuropsychological scores are communicated. A major limitation in the communication of neuropsychological results is the inconsistent use of qualitative descriptors for standardized test scores and the use of vague terminology. A PubMed search from 1 Jan 2007 to 1 Aug 2016 was conducted to identify guidelines or consensus statements for the description and reporting of qualitative terms to communicate neuropsychological test scores. The review found the use of confusing and overlapping terms to describe various ranges of percentile standardized test scores. In response, we propose a simplified set of qualitative descriptors for normalized test scores (Q-Simple) as a means to reduce errors in communicating test results. The Q-Simple qualitative terms are: 'very superior', 'superior', 'high average', 'average', 'low average', 'borderline' and 'abnormal/impaired'. A case example illustrates the proposed Q-Simple qualitative classification system to communicate neuropsychological results for neurosurgical planning. The Q-Simple qualitative descriptor system is intended as a means to improve and standardize communication of standardized neuropsychological test scores. Research is needed to further evaluate neuropsychological communication errors. Conveying the clinical implications of neuropsychological results in a manner that minimizes risk for communication errors is a quintessential component of evidence-based practice. Copyright © 2017 Elsevier B.V. All rights reserved.

  6. Effect of Mindfulness Meditation on Perceived Stress Scores and Autonomic Function Tests of Pregnant Indian Women.

    PubMed

    Muthukrishnan, Shobitha; Jain, Reena; Kohli, Sangeeta; Batra, Swaraj

    2016-04-01

    Various pregnancy complications, such as hypertension and preeclampsia, have been strongly correlated with maternal stress. One of the connecting links between pregnancy complications and maternal stress is mind-body intervention which can be part of Complementary and Alternative Medicine (CAM). Biologic measures of stress during pregnancy may get reduced by such interventions. To evaluate the effect of Mindfulness meditation on perceived stress scores and autonomic function tests of pregnant Indian women. Pregnant Indian women of 12 weeks gestation were randomised to two treatment groups: Test group with Mindfulness meditation and control group with their usual obstetric care. The effect of Mindfulness meditation on perceived stress scores and cardiac sympathetic functions and parasympathetic functions (Heart rate variation with respiration, lying to standing ratio, standing to lying ratio and respiratory rate) were evaluated on pregnant Indian women. There was a significant decrease in perceived stress scores, a significant decrease of blood pressure response to cold pressor test and a significant increase in heart rate variability in the test group (p < 0.05), which indicates that mindfulness meditation is a powerful modulator of the sympathetic nervous system and can thereby reduce the day-to-day perceived stress in pregnant women. The results of this study suggest that mindfulness meditation improves parasympathetic functions in pregnant women and is a powerful modulator of the sympathetic nervous system during pregnancy.

  7. Segregation and the Black-White Test Score Gap. NBER Working Paper No. 12988

    ERIC Educational Resources Information Center

    Vigdor, Jacob; Ludwig, Jens

    2007-01-01

    The mid-1980s witnessed breaks in two important trends related to race and schooling. School segregation, which had been declining, began a period of relative stasis. Black-white test score gaps, which had also been declining, also stagnated. The notion that these two phenomena may be related is also supported by basic cross-sectional evidence. We…

  8. Adults with poor reading skills: How lexical knowledge interacts with scores on standardized reading comprehension tests

    PubMed Central

    McKoon, Gail; Ratcliff, Roger

    2016-01-01

    Millions of adults in the United States lack the necessary literacy skills for most living wage jobs. For students from adult learning classes, we used a lexical decision task to measure their knowledge of words and we used a decision-making model (Ratcliff’s, 1978, diffusion model) to abstract the mechanisms underlying their performance from their RTs and accuracy. We also collected scores for each participant on standardized IQ tests and standardized reading tests used commonly in the education literature. We found significant correlations between the model’s estimates of the strengths with which words are represented in memory and scores for some of the standardized tests but not others. The findings point to the feasibility and utility of combining a test of word knowledge, lexical decision, that is well-established in psycholinguistic research, a decision-making model that supplies information about underlying mechanisms, and standardized tests. The goal for future research is to use this combination of approaches to understand better how basic processes relate to standardized tests with the eventual aim of understanding what these tests are measuring and what the specific difficulties are for individual, low-literacy adults. PMID:26550803

  9. Adults with poor reading skills: How lexical knowledge interacts with scores on standardized reading comprehension tests.

    PubMed

    McKoon, Gail; Ratcliff, Roger

    2016-01-01

    Millions of adults in the United States lack the necessary literacy skills for most living wage jobs. For students from adult learning classes, we used a lexical decision task to measure their knowledge of words and we used a decision-making model (Ratcliff's, 1978, diffusion model) to abstract the mechanisms underlying their performance from their RTs and accuracy. We also collected scores for each participant on standardized IQ tests and standardized reading tests used commonly in the education literature. We found significant correlations between the model's estimates of the strengths with which words are represented in memory and scores for some of the standardized tests but not others. The findings point to the feasibility and utility of combining a test of word knowledge, lexical decision, that is well-established in psycholinguistic research, a decision-making model that supplies information about underlying mechanisms, and standardized tests. The goal for future research is to use this combination of approaches to understand better how basic processes relate to standardized tests with the eventual aim of understanding what these tests are measuring and what the specific difficulties are for individual, low-literacy adults. Copyright © 2015. Published by Elsevier B.V.

  10. Automated Scoring of Speaking Tasks in the Test of English-for-Teaching ("TEFT"™). Research Report. ETS RR-15-31

    ERIC Educational Resources Information Center

    Zechner, Klaus; Chen, Lei; Davis, Larry; Evanini, Keelan; Lee, Chong Min; Leong, Chee Wee; Wang, Xinhao; Yoon, Su-Youn

    2015-01-01

    This research report presents a summary of research and development efforts devoted to creating scoring models for automatically scoring spoken item responses of a pilot administration of the Test of English-for-Teaching ("TEFT"™) within the "ELTeach"™ framework. The test consists of items for all four language modalities:…

  11. Responsiveness of migraine-ACT and MIDAS questionnaires for assessing migraine therapy.

    PubMed

    García, María Luisa; Baos, Vicente; Láinez, Miguel; Pascual, Julio; López-Gil, Arturo

    2008-10-01

    Migraine is frequently undertreated. The 4-item Migraine Assessment of Current Therapy (Migraine-ACT) questionnaire is a simple and reliable tool to identify patients requiring a change in current acute migraine treatment. To investigate the responsiveness of the Migraine-ACT tool, and compare it with that of the Migraine Disability Assessment (MIDAS) questionnaire, for patients with migraine at 1100 primary care sites in Spain. Patients eligible for this open-label, 2-visit prospective study reported migraine for >1 year and ≥1 migraine attack per month and were new to the clinic or on follow-up care for <6 months. Validated Spanish versions of the Migraine-ACT and MIDAS questionnaires were administered, and patient satisfaction with treatment was recorded, at baseline and at 3 months. A total of 3272 patients, 78% female, were enrolled, and 2877 (88%) returned for the 3-month visit. Investigators changed baseline migraine treatment for 72% of returning patients; 85% and 80% of these patients had improved Migraine-ACT and MIDAS scores at 3 months, respectively. Patients who reported being completely or very satisfied with migraine treatment numbered 492 (15%) at baseline and 1406 (49%) at 3 months. Migraine-ACT and MIDAS score agreement for improvement at 3 months was poor (kappa = 0.339). Both the mean MIDAS score and the distribution of Migraine-ACT scores improved over the course of 3 months; however, Migraine-ACT scores were significantly (P < .001) more sensitive (83% vs 75%) and specific (72% vs 58%) than MIDAS scores. The area under the curve in the receiver-operating characteristic analysis was significantly (P < .0001) greater for Migraine-ACT (0.82) as compared with the MIDAS (0.70) questionnaire. These results suggest that the Migraine-ACT questionnaire can be used more reliably than the MIDAS questionnaire for detecting improvements in treatment of new and follow-up patients with migraine.

  12. Refining Ovarian Cancer Test accuracy Scores (ROCkeTS): protocol for a prospective longitudinal test accuracy study to validate new risk scores in women with symptoms of suspected ovarian cancer.

    PubMed

    Sundar, Sudha; Rick, Caroline; Dowling, Francis; Au, Pui; Snell, Kym; Rai, Nirmala; Champaneria, Rita; Stobart, Hilary; Neal, Richard; Davenport, Clare; Mallett, Susan; Sutton, Andrew; Kehoe, Sean; Timmerman, Dirk; Bourne, Tom; Van Calster, Ben; Gentry-Maharaj, Aleksandra; Menon, Usha; Deeks, Jon

    2016-08-09

    Ovarian cancer (OC) is associated with non-specific symptoms such as bloating, making accurate diagnosis challenging: only 1 in 3 women with OC presents through primary care referral. National Institute for Health and Care Excellence guidelines recommend sequential testing with CA125 and routine ultrasound in primary care. However, these diagnostic tests have limited sensitivity or specificity. Improving accurate triage in women with vague symptoms is likely to improve mortality by streamlining referral and care pathways. The Refining Ovarian Cancer Test Accuracy Scores (ROCkeTS; HTA 13/13/01) project will derive and validate new tests/risk prediction models that estimate the probability of having OC in women with symptoms. This protocol refers to the prospective study only (phase III). ROCkeTS comprises four parallel phases. The full ROCkeTS protocol can be found at http://www.birmingham.ac.uk/ROCKETS. Phase III is a prospective test accuracy study. The study will recruit 2450 patients from 15 UK sites. Recruited patients complete symptom and anxiety questionnaires, donate a serum sample and undergo ultrasound scored as per International Ovarian Tumour Analysis (IOTA) criteria. Recruitment is at rapid access clinics, emergency departments and elective clinics. Models to be evaluated include those based on ultrasound derived by the IOTA group and novel models derived from analysis of existing data sets. Estimates of sensitivity, specificity, c-statistic (area under receiver operating curve), positive predictive value and negative predictive value of diagnostic tests are evaluated and a calibration plot for models will be presented. ROCkeTS has received ethical approval from the NHS West Midlands REC (14/WM/1241) and is registered on the controlled trials website (ISRCTN17160843) and the National Institute of Health Research Cancer and Reproductive Health portfolios. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted

  13. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Utah

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Utah's test score trends through 2008-09. In 2004, the mean scale score on the state 4th grade reading test was 167 for non-Title I students and 164 for Title I students. In 2009 the mean scale score in 4th grade reading was 168 for non-Title I students and 164 for Title I students. Between 2004 and 2009, the mean scale score…

  14. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Colorado

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Colorado's test score trends through 2008-09. In 2003, the mean scale score on the state 4th grade reading test was 598 for non-Title I students and 558 for Title I students. In 2009, the mean scale score in 4th grade reading was 599 for non-Title I students and 556 for Title I students. Between 2003 and 2009, the mean scale…

  15. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Maryland

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Maryland's test score trends through 2008-09. In 2004, 82% of non-Title I 4th graders and 61% of Title I 4th graders scored at the proficient level on the state reading test. In 2009, 90% of non-Title I 4th graders and 78% of Title I 4th graders scored at the proficient level in reading. Between 2004 and 2009, the percentage…

  16. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Delaware

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Delaware's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 474 for non-Title I students and 464 for Title I students. In 2009, the mean scale score in 4th grade reading was 478 for non-Title I students and 467 for Title I students. Between 2006 and 2009, the mean scale…

  17. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Massachusetts

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Massachusetts's test score trends through 2008-09. In 2006, 59% of non-Title I 4th graders and 29% of Title I 4th graders scored at the proficient level on the state reading test. In 2009, 64% of non-Title I 4th graders and 31% of Title I 4th graders scored at the proficient level in reading. Between 2006 and 2009, the…

  18. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Missouri

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Missouri's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 661 for non-Title I students and 642 for Title I students. In 2009, the mean scale score in 4th grade reading was 661 for non-Title I students and 648 for Title I students. Between 2006 and 2009, there was no…

  19. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? Kentucky

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles Kentucky's test score trends through 2008-09. In 2007, the mean scale score on the state 4th grade reading test was 455 for non-Title I students and 451 for Title I students. In 2009, the mean scale score in 4th grade reading was 455 for non-Title I students and 451 for Title I students. Between 2007 and 2009, the mean scale…

  20. Impossible Scores Resulting in Zero Frequencies in the Anchor Test: Impact on Smoothing and Equating. Research Report. ETS RR-08-10

    ERIC Educational Resources Information Center

    Puhan, Gautam; vonDavier, Alina; Gupta, Shaloo

    2008-01-01

    Equating under the external anchor design is frequently conducted using scaled scores on the anchor test. However, scaled scores often lead to the unique problem of creating zero frequencies in the score distribution because there may not always be a one-to-one correspondence between raw and scaled scores. For example, raw scores of 17 and 18 may…

  1. Establishing the Validity of TOEIC Bridge™ Test Scores for Students in Colombia, Chile, and Ecuador. Research Report. ETS RR-08-58

    ERIC Educational Resources Information Center

    Sinharay, Sandip; Feng, Ying; Saldivia, Luis; Powers, Donald E.; Ginuta, Anthony; Simpson, Annabelle; Weng, Vincent

    2008-01-01

    The validity of TOEIC Bridge™ scores as a measure of English language skill was examined from the standpoint of a unified concept of test validity. In this study, more than 6,000 test takers in 3 Latin American countries (Chile, Colombia, and Ecuador) took 1 form of the TOEIC Bridge test, and their scores were compared to additional information…

  2. Rugby versus Soccer in South Africa: Content Familiarity Contributes to Cross-Cultural Differences in Cognitive Test Scores

    ERIC Educational Resources Information Center

    Malda, Maike; van de Vijver, Fons J. R.; Temane, Q. Michael

    2010-01-01

    In this study, cross-cultural differences in cognitive test scores are hypothesized to depend on a test's cultural complexity (Cultural Complexity Hypothesis: CCH), here conceptualized as its content familiarity, rather than on its cognitive complexity (Spearman's Hypothesis: SH). The content familiarity of tests assessing short-term memory,…

  3. Speech perception scores in cochlear implant recipients: An analysis of ceiling effects in the CUNY sentence test (Quiet) in post-lingually deafened cochlear implant recipients.

    PubMed

    Ebrahimi-Madiseh, Azadeh; Eikelboom, Robert H; Jayakody, Dona Mp; Atlas, Marcus D

    2016-01-01

    To evaluate the clinical utility of the City University of New York sentence test in a cohort of post-lingually deafened cochlear implant recipients over time. 117 post-lingually deafened, Australian English-speaking CI recipients aged between 23 and 98 years (M = 66 years; SD = 15.09) were recruited. CUNY sentence test scores in quiet were collated and analysed at two cut-offs, 95% and 100%, as ceiling scores. CUNY sentence scores ranged from 4% to 100% (M = 86.75; SD = 20.65), with 38.8% of participants scoring 95% and 16.5% of participants reaching the 100% score. The percentage of participants reaching the 95% and 100% ceiling scores increased over time (6 and 12 months post-implantation). The distribution of all post-operative CUNY test scores skewed to the right with 82% of test scores reaching above 90%. This study demonstrates that the CUNY test cannot be used as a valid tool to measure the speech perception skills of post-lingually deafened CI recipients over time. This may be overcome by using adaptive test protocols or linguistically, cognitively or contextually demanding test materials. The high percentage of CI recipients achieving ceiling scores for the CUNY sentence test in quiet at 3 months post-implantation questions the validity of using CUNY in the CI assessment test battery and limits its application for use in longitudinal studies evaluating CI outcomes. Further studies are required to examine different methods to overcome this problem.

  4. Guided-Inquiry Lessons Raise Scores on the Sixth Grade Georgia Science Test

    NASA Astrophysics Data System (ADS)

    Page, Purlie M.

    At the local level, G Middle School has the highest district-wide percentage of 6th grade science students who are not meeting standards. It is imperative that G middle school take corrective action to reduce the number of students failing to meet state science standards. Dewey's theory of conceptual framework, which involves knowledge constructed on a person's personal experience and mind activity through active forms of learning, guided this study. The goal of the study was to determine whether inquiry-based science modules produce greater 6th grade science achievement, as measured by an equivalent instrument of the science section of the Georgia Criterion-Referenced Competency Test, when compared to traditional instruction among eastern Georgia 6th graders. The sample consisted of 230 students in the nonintervention group and 119 students in the intervention group. All students were from intact classes. At the end of the intervention, an independent t test was conducted to analyze the scores. According to the study t test, (t = 12.33, df = 304.56, p < 0.05), the difference between the means was statistically significant. This project's potential impact on social change includes increasing student motivation towards, comprehension of, and interest in science concepts. At the local level, these inquiry lessons can be shared with science teachers across grade levels and within the district to improve county-wide science scores. An increase in student interest and comprehension of science concepts could ultimately lead to the United States producing more students in the fields of science, technology, engineering, and mathematics (STEM) education.

  5. Fluticasone propionate in clinically suspected asthma patients with negative methacholine challenge test.

    PubMed

    Peiman, Soheil; Abtahi, Hamidreza; Akhondzadeh, Shahin; Safavi, Enayat; Moin, Mostafa; Rahimi Foroushani, Abbas

    2017-07-01

    Despite reports of response to steroid inhaler in some clinically suspected asthma patients with negative methacholine challenge test (CSA/MCT-), treatment in these patients has not been prospectively studied. We studied the role of a 12 week high dose inhaled fluticasone trial in CSA/MCT- patients. After a 2 week run-in period, CSA/MCT- patients were treated with 12 weeks of fluticasone propionate 1000 µg/day. The Asthma Control Test (ACT), numeric cough score (NCS) and bronchodilator use were compared with their pretreatment values. Thirty-four of 42 CSA/MCT- patients completed the study. Mean pretreatment ACT score (pACT) was significantly increased after treatment (14.7 ± 3.37 to 20.9 ± 3.1, P < 0.001). Posttreatment values of daytime (1.0 ± 1.0) and night-time (0.6 ± 0.9) NCS decreased compared to their pretreatment values (2.8 ± 1.1 and 1.9 ± 1.3, respectively; P < 0.001). ACT score change (ΔACT) was significantly greater in those with pACT < 15 than in those ≥15 (P < 0.001). Fifteen of 21 patients with ΔACT > 5 did not need to use bronchodilator for their symptom relief. Wheeze disappeared in all six patients with ΔACT > 5 after the trial. Six months after the study, steroid inhaler continued to be used by 72.2% of patients. A significant portion of CSA/MCT- patients (especially those with pretreatment ACT score <15) respond to high dose fluticasone inhaler in terms of symptom relief, disappearance of wheeze and need for bronchodilator use. ΔACT could not be predicted with any individual symptoms or signs before MCT, % FEV1 decline or symptoms during MCT and exhaled nitric oxide. © 2015 John Wiley & Sons Ltd.

  6. Improvements in manual dexterity relate to improvements in cognitive planning after assisted cycling therapy (ACT) in adolescents with Down syndrome.

    PubMed

    Holzapfel, Simon D; Ringenbach, Shannon D R; Mulvey, Genna M; Sandoval-Menendez, Amber M; Cook, Megan R; Ganger, Rachel O; Bennett, Kristen

    2015-01-01

    We have previously reported beneficial effects of acute (i.e., single session) Assisted Cycling Therapy (ACT) on manual dexterity and cognitive planning ability in adolescents with Down syndrome (DS). In the present study, we report the chronic effects of eight weeks of ACT, voluntary cycling (VC), and no cycling (NC), on the same measures in adolescents with DS. Participants completed 8 weeks of ACT, VC, or NC. Those in the ACT and VC groups completed 30 min sessions three times per week on a stationary bicycle. During ACT, the mechanical motor of the bicycle augmented the cadence to a rate which was on average 79% faster than the voluntary cadence. During VC, the participants pedaled at a self-selected rate. Unimanual dexterity scores as measured with the Purdue Pegboard test (PPT) improved significantly more for the ACT and VC groups compared to the NC group. ACT led to greater improvements than VC and NC in the assembly sub-test, which is a task that requires more advanced temporal and spatial processing. The ACT group improved significantly more than the VC group and non-significantly more than the NC group in cognitive planning ability as measured by the Tower of London test (ToL). There were also significant correlations between the assembly subtest of the PPT and all measures of the ToL. These correlations were stronger during post-testing than pre-testing. Pre-post changes in the combined PPT score and ToL number of correct moves correlated positively in the ACT group. These results support the efficacy of the salutary effects of ACT on global fine motor function and executive function in DS. Additionally, the performance on complex bimanual dexterity tasks appears to be related to the capacity of cognitive planning ability. This research has important implications for persons with movement deficits that affect activities of daily living. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. Medical devices; ovarian adnexal mass assessment score test system; labeling; black box restrictions. Final rule.

    PubMed

    2011-12-30

    The Food and Drug Administration (FDA) is amending the regulation classifying ovarian adnexal mass assessment score test systems to restrict these devices so that a prescribed warning statement that addresses a risk identified in the special controls guidance document must be in a black box and must appear in all labeling, advertising, and promotional material. The black box warning mitigates the risk to health associated with off-label use as a screening test, stand-alone diagnostic test, or as a test to determine whether or not to proceed with surgery.

  8. The effect of constructivist teaching strategies on science test scores of middle school students

    NASA Astrophysics Data System (ADS)

    Vaca, James L., Jr.

    International studies show that the United States is lagging behind other industrialized countries in science proficiency. The studies revealed how American students showed little significant gain on standardized tests in science between 1995 and 2005. Little information is available regarding how reform in American teaching strategies in science could improve student performance on standardized testing. The purpose of this quasi-experimental quantitative study using a pretest/posttest control group design was to examine how the use of a hands-on, constructivist teaching approach with low achieving eighth grade science students affected student achievement on the 2007 Ohio Eighth Grade Science Achievement Test posttest (N = 76). The research question asked how using constructivist teaching strategies in the science classroom affected student performance on standardized tests. Two independent samples of 38 students each consisting of low achieving science students as identified by seventh grade science scores and scores on the Ohio Eighth Grade Science Half-Length Practice Test pretest were used. Four comparisons were made between the control group receiving traditional classroom instruction and the experimental group receiving constructivist instruction including: (a) pretest/posttest standard comparison, (b) comparison of the number of students who passed the posttest, (c) comparison of the six standards covered on the posttest, (d) posttest's sample means comparison. A Mann-Whitney U Test revealed that there was no significant difference between the independent sample distributions for the control group and the experimental group. These findings contribute to positive social change by investigating science teaching strategies that could be used in eighth grade science classes to improve student achievement in science.

  9. Assessing the Usefulness of SAT and ACT Tests in Minority Admissions

    ERIC Educational Resources Information Center

    Micceri, Theodore

    2010-01-01

    This study sought to determine whether the use of standardized test scores contributes any useful information regarding First Time in College (FTIC) students' probable success at USF, using more detailed analysis of underrepresented minorities and women, who Micceri (2009) shows, experience substantial negative bias relative to males and whites on…

  10. Predicting Student Success in a Major's Introductory Biology Course via Logistic Regression Analysis of Scientific Reasoning Ability and Mathematics Scores

    NASA Astrophysics Data System (ADS)

    Thompson, E. David; Bowling, Bethany V.; Markle, Ross E.

    2018-02-01

    Studies over the last 30 years have considered various factors related to student success in introductory biology courses. While much of the available literature suggests that the best predictors of success in a college course are prior college grade point average (GPA) and class attendance, faculty often require a valuable predictor of success in those courses wherein the majority of students are in the first semester and have no previous record of college GPA or attendance. In this study, we evaluated the efficacy of the ACT Mathematics subject exam and Lawson's Classroom Test of Scientific Reasoning in predicting success in a major's introductory biology course. A logistic regression was utilized to determine the effectiveness of a combination of scientific reasoning (SR) scores and ACT math (ACT-M) scores to predict student success. In summary, we found that the model—with both SR and ACT-M as significant predictors—could be an effective predictor of student success and thus could potentially be useful in practical decision making for the course, such as directing students to support services at an early point in the semester.

  11. Predicting better performance on a college preparedness test from narrative comprehension at the age of 6 years: An fMRI study.

    PubMed

    Horowitz-Kraus, Tzipi; Eaton, Kenneth; Farah, Rola; Hajinazarian, Ardag; Vannest, Jennifer; Holland, Scott K

    2015-12-10

    To investigate whether high performance on college preparedness tests at 18 years of age can be predicted from brain activation patterns during narrative comprehension at 5-7 years of age. In this longitudinal study, functional MRI data during an auditory narrative-comprehension task were acquired from 15 children (5-7 years of age) who also provided their American College Testing (ACT) scores at the age of 18 years. Active voxels during the narrative-comprehension task were correlated with both composite ACT scores and the reading-comprehension component of the exam. Higher composite ACT scores and behavioral scores for reading comprehension were positively correlated with greater activation in frontal and anterior brain regions during the narrative-comprehension task. Our results suggest that neural circuits supporting higher ACT performance are predictable from a narrative-comprehension task at the age of 5-7 years. This supports a critical role for the anterior cingulate cortex, which is a part of the cingulo-opercular cognitive-control network early in development, as a facilitator for better ACT scores. This study highlights that shared neural circuits that support overall ACT performance and neural circuits that support reading comprehension both rely on neural circuits related to narrative comprehension in childhood, suggesting that interventions involving narrative comprehension should be considered for individuals with reading and other academic difficulties. Copyright © 2015 Elsevier B.V. All rights reserved.

  12. An Investigation of the Relationship Between Readiness Test Scores for Kindergarten Children and Achievement Scores Obtained at the End of Grades One and Two. S.S.T.A. Research Centre Report No. 62.

    ERIC Educational Resources Information Center

    Warkentin, Lena

    The primary purpose of this study was to investigate the relationship between Metropolitan Readiness Test (MRT) scores in kindergarten (MRTK) and grade one (MRT1) with the reading scores of the Canadian Tests of Basic Skills (CTBS) at the end of grades one (CTBSR1) and two (CTBSR2). A secondary purpose of the study was to determine whether the…

  13. Alternate Test Procedures to Perform Clean Water Act Monitoring for Region 9

    EPA Pesticide Factsheets

    When performing Clean Water Act monitoring, parties interested in using a method not approved in 40 CFR Part 136 must apply to use the alternate test procedure (ATP) in the Region in which the discharging facility is located.

  14. Psychometric Evaluation of the Lower Extremity Computerized Adaptive Test, the Modified Harris Hip Score, and the Hip Outcome Score.

    PubMed

    Hung, Man; Hon, Shirley D; Cheng, Christine; Franklin, Jeremy D; Aoki, Stephen K; Anderson, Mike B; Kapron, Ashley L; Peters, Christopher L; Pelt, Christopher E

    2014-12-01

    The applicability and validity of many patient-reported outcome measures in the high-functioning population are not well understood. To compare the psychometric properties of the modified Harris Hip Score (mHHS), the Hip Outcome Score activities of daily living subscale (HOS-ADL) and sports (HOS-sports), and the Lower Extremity Computerized Adaptive Test (LE CAT). The hypothesis was that all instruments would perform well but that the LE CAT would show superiority psychometrically because a combination of CAT and a large item bank allows for a high degree of measurement precision. Cohort study (diagnosis); Level of evidence, 2. Data were collected from 472 advanced-age, active participants from the Huntsman World Senior Games in 2012. Validity evidences were examined through item fit, dimensionality, monotonicity, local independence, differential item functioning, person raw score to measure correlation, and instrument coverage (ie, ceiling and floor effects), and reliability evidences were examined through Cronbach alpha and person separation index. All instruments demonstrated good item fit, unidimensionality, monotonicity, local independence, and person raw score to measure correlations. The HOS-ADL had high ceiling effects of 36.02%, and the mHHS had ceiling effects of 27.54%. The LE CAT had ceiling effects of 8.47%, and the HOS-sports had no ceiling effects. None of the instruments had any floor effects. The mHHS had a very low Cronbach alpha of 0.41 and an extremely low person separation index of 0.08. Reliabilities for the LE CAT were excellent and for the HOS-ADL and HOS-sports were good. The LE CAT showed better psychometric properties overall than the HOS-ADL, HOS-sports, and mHHS for the senior population. The mHHS demonstrated pronounced ceiling effects and poor reliabilities that should be of concern. The high ceiling effects for the HOS-ADL were also of concern. The LE CAT was superior in all psychometric aspects examined in this study. Future
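    Cronbach alpha, one of the reliability measures named in this record, has a standard closed form: alpha = k/(k-1) × (1 - sum of item variances / variance of total scores), for k items. The sketch below implements that textbook formula only. The function name and the item responses are hypothetical, and nothing here reproduces the study's data or its Rasch-based person separation index.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach alpha for a list of items, each a list of per-respondent scores."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: 3 items answered by 4 respondents.
responses = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
print(round(cronbach_alpha(responses), 3))
```

    Values near 1 indicate that the items vary together, so a value as low as the 0.41 reported for the mHHS means item variance is large relative to the variance of the total score.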

  15. Psychometric Evaluation of the Lower Extremity Computerized Adaptive Test, the Modified Harris Hip Score, and the Hip Outcome Score

    PubMed Central

    Hung, Man; Hon, Shirley D.; Cheng, Christine; Franklin, Jeremy D.; Aoki, Stephen K.; Anderson, Mike B.; Kapron, Ashley L.; Peters, Christopher L.; Pelt, Christopher E.

    2014-01-01

    Background: The applicability and validity of many patient-reported outcome measures in the high-functioning population are not well understood. Purpose: To compare the psychometric properties of the modified Harris Hip Score (mHHS), the Hip Outcome Score activities of daily living subscale (HOS-ADL) and sports (HOS-sports), and the Lower Extremity Computerized Adaptive Test (LE CAT). The hypothesis was that all instruments would perform well but that the LE CAT would show superiority psychometrically because a combination of CAT and a large item bank allows for a high degree of measurement precision. Study Design: Cohort study (diagnosis); Level of evidence, 2. Methods: Data were collected from 472 advanced-age, active participants from the Huntsman World Senior Games in 2012. Validity evidences were examined through item fit, dimensionality, monotonicity, local independence, differential item functioning, person raw score to measure correlation, and instrument coverage (ie, ceiling and floor effects), and reliability evidences were examined through Cronbach alpha and person separation index. Results: All instruments demonstrated good item fit, unidimensionality, monotonicity, local independence, and person raw score to measure correlations. The HOS-ADL had high ceiling effects of 36.02%, and the mHHS had ceiling effects of 27.54%. The LE CAT had ceiling effects of 8.47%, and the HOS-sports had no ceiling effects. None of the instruments had any floor effects. The mHHS had a very low Cronbach alpha of 0.41 and an extremely low person separation index of 0.08. Reliabilities for the LE CAT were excellent and for the HOS-ADL and HOS-sports were good. Conclusion: The LE CAT showed better psychometric properties overall than the HOS-ADL, HOS-sports, and mHHS for the senior population. The mHHS demonstrated pronounced ceiling effects and poor reliabilities that should be of concern. The high ceiling effects for the HOS-ADL were also of concern. The LE CAT was superior

  16. End of Course Grades and Standardized Test Scores: Are Grades Predictive of Student Achievement?

    ERIC Educational Resources Information Center

    Ricketts, Christine R.

    2010-01-01

    This study examined the extent to which end-of-course grades are predictive of Virginia Standards of Learning test scores in nine high school content areas. It also analyzed the impact of the variables school cluster attended, gender, ethnicity, disability status, Limited English Proficiency status, and socioeconomic status on the relationship…

  17. Test Scores, Dropout Rates, and Transfer Rates as Alternative Indicators of High School Performance

    ERIC Educational Resources Information Center

    Rumberger, Russell W.; Palardy, Gregory J.

    2005-01-01

    This study investigated the relationships among several different indicators of high school performance: test scores, dropout rates, transfer rates, and attrition rates. Hierarchical linear models were used to analyze panel data from a sample of 14,199 students who took part in the National Education Longitudinal Survey of 1988. The results…

  18. Evaluation of Score Interpretive Information from the Perspective of Failed and Passed Test-Takers.

    ERIC Educational Resources Information Center

    Shannon, Gregory A.

    Candidates who had taken examinations for certification required by the American Production and Inventory Control Society (APICS) were surveyed to acquire feedback about the effectiveness of score interpretive information given to test takers. Those sampled included 488 passers and 389 failers of the Inventory Management (IM) examination and 457…

  19. ACT Test

    MedlinePlus

    ... Sample Required? A blood sample drawn from a vein in your arm Test Preparation Needed? None Looking ... is obtained by inserting a needle into a vein in the arm. Is any test preparation needed ...

  20. A risk score for predicting coronary artery disease in women with angina pectoris and abnormal stress test finding.

    PubMed

    Lo, Monica Y; Bonthala, Nirupama; Holper, Elizabeth M; Banks, Kamakki; Murphy, Sabina A; McGuire, Darren K; de Lemos, James A; Khera, Amit

    2013-03-15

    Women with angina pectoris and abnormal stress test findings commonly have no epicardial coronary artery disease (CAD) at catheterization. The aim of the present study was to develop a risk score to predict obstructive CAD in such patients. Data were analyzed from 337 consecutive women with angina pectoris and abnormal stress test findings who underwent cardiac catheterization at our center from 2003 to 2007. Forward selection multivariate logistic regression analysis was used to identify the independent predictors of CAD, defined by ≥50% diameter stenosis in ≥1 epicardial coronary artery. The independent predictors included age ≥55 years (odds ratio 2.3, 95% confidence interval 1.3 to 4.0), body mass index <30 kg/m² (odds ratio 1.9, 95% confidence interval 1.1 to 3.1), smoking (odds ratio 2.6, 95% confidence interval 1.4 to 4.8), low high-density lipoprotein cholesterol (odds ratio 2.9, 95% confidence interval 1.5 to 5.5), family history of premature CAD (odds ratio 2.4, 95% confidence interval 1.0 to 5.7), lateral abnormality on stress imaging (odds ratio 2.8, 95% confidence interval 1.5 to 5.5), and exercise capacity <5 metabolic equivalents (odds ratio 2.4, 95% confidence interval 1.1 to 5.6). When each variable was assigned 1 point and the points were summed to constitute a risk score, a graded association was seen between the score and prevalent CAD (p for trend <0.001). The risk score demonstrated good discrimination with a cross-validated c-statistic of 0.745 (95% confidence interval 0.70 to 0.79), and an optimized cutpoint of a score of ≤2 included 62% of the subjects and had a negative predictive value of 80%. In conclusion, a simple clinical risk score of 7 characteristics can help differentiate those more or less likely to have CAD among women with angina pectoris and abnormal stress test findings. This tool, if validated, could help to guide testing strategies in women with angina pectoris. Copyright © 2013 Elsevier Inc. All rights reserved.
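    The scoring rule described in this record (one point per predictor present, summed, with a cutpoint of ≤2) can be sketched as follows. The field names, the dictionary layout, and the function names are hypothetical; the abstract does not give the study's variable coding, and this is an illustration of the arithmetic, not a clinical tool.

```python
# Hypothetical field names for the seven predictors listed in the abstract.
PREDICTORS = (
    "age_55_or_older",
    "bmi_below_30",
    "smoker",
    "low_hdl",
    "family_history_premature_cad",
    "lateral_stress_abnormality",
    "exercise_capacity_below_5_mets",
)


def risk_score(patient):
    """Sum one point for each predictor flagged True; missing fields count as 0."""
    return sum(1 for name in PREDICTORS if patient.get(name, False))


def at_or_below_cutpoint(patient, cutpoint=2):
    """True when the score falls at or below the abstract's cutpoint of 2."""
    return risk_score(patient) <= cutpoint


# Hypothetical patient, for illustration only.
example = {"age_55_or_older": True, "smoker": True, "low_hdl": True}
print(risk_score(example), at_or_below_cutpoint(example))
```

    Equal weighting keeps the score easy to compute by hand, at the cost of ignoring the differences among the reported odds ratios (1.9 to 2.9).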

  1. Predictive validity of the classroom strategies scale-observer form on statewide testing scores: an initial investigation.

    PubMed

    Reddy, Linda A; Fabiano, Gregory A; Dudek, Christopher M; Hsu, Louis

    2013-12-01

    The present study examined the validity of a teacher observation measure, the Classroom Strategies Scale-Observer Form (CSS), as a predictor of student performance on statewide tests of mathematics and English language arts. The CSS is a teacher practice observational measure that assesses evidence-based instructional and behavioral management practices in elementary school. A series of two-level hierarchical generalized linear models were fitted to data of a sample of 662 third- through fifth-grade students to assess whether CSS Part 2 Instructional Strategy and Behavioral Management Strategy scale discrepancy scores (i.e., ∑ |recommended frequency − frequency ratings|) predicted statewide mathematics and English language arts proficiency scores when percentage of minority students in schools was controlled. Results indicated that the Instructional Strategy scale discrepancy scores significantly predicted mathematics and English language arts proficiency scores: Relatively larger discrepancies on observer ratings of what teachers did versus what should have been done were associated with lower proficiency scores. Results offer initial evidence of the predictive validity of the CSS Part 2 Instructional Strategy discrepancy scores on student academic outcomes. PsycINFO Database Record (c) 2013 APA, all rights reserved.
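    The discrepancy score defined in this record is a sum of absolute differences between the recommended frequency and the observed frequency rating across items. A minimal sketch of that arithmetic follows. The function name, the rating values, and the number of items are hypothetical and are not taken from the CSS itself.

```python
def discrepancy_score(recommended, observed):
    """Sum of |recommended frequency - observed frequency rating| over items."""
    if len(recommended) != len(observed):
        raise ValueError("both rating lists must cover the same items")
    return sum(abs(r - o) for r, o in zip(recommended, observed))


# Hypothetical ratings for three items, for illustration only.
print(discrepancy_score([7, 5, 6], [4, 5, 7]))
```

    Because absolute values are used, doing a practice more often than recommended and less often than recommended both raise the score, so larger totals mean a larger overall gap between observed and recommended practice.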

  2. ETS Psychometric Contributions: Focus on Test Scores. Research Report. ETS RR-13-15. ETS R&D Scientific and Policy Contributions Series. ETS SPC-13-03

    ERIC Educational Resources Information Center

    Moses, Tim

    2013-01-01

    The purpose of this report is to review ETS psychometric contributions that focus on test scores. Two major sections review contributions based on assessing test scores' measurement characteristics and other contributions about using test scores as predictors in correlational and regression relationships. An additional section reviews additional…

  3. GalaxyDock BP2 score: a hybrid scoring function for accurate protein-ligand docking

    NASA Astrophysics Data System (ADS)

    Baek, Minkyung; Shin, Woong-Hee; Chung, Hwan Won; Seok, Chaok

    2017-07-01

    Protein-ligand docking is a useful tool for providing atomic-level understanding of protein functions in nature and design principles for artificial ligands or proteins with desired properties. The ability to identify the true binding pose of a ligand to a target protein among numerous possible candidate poses is an essential requirement for successful protein-ligand docking. Many previously developed docking scoring functions were trained to reproduce experimental binding affinities and were also used for scoring binding poses. However, in this study, we developed a new docking scoring function, called GalaxyDock BP2 Score, by directly training the scoring power of binding poses. This function is a hybrid of physics-based, empirical, and knowledge-based score terms that are balanced to strengthen the advantages of each component. The performance of the new scoring function exhibits significant improvement over existing scoring functions in decoy pose discrimination tests. In addition, when the score is used with the GalaxyDock2 protein-ligand docking program, it outperformed other state-of-the-art docking programs in docking tests on the Astex diverse set, the Cross2009 benchmark set, and the Astex non-native set. GalaxyDock BP2 Score and GalaxyDock2 with this score are freely available at http://galaxy.seoklab.org/softwares/galaxydock.html.

  4. Opportunity to Learn: Investigating Possible Predictors for Pre-Course "Test Of Astronomy STandards" TOAST Scores

    ERIC Educational Resources Information Center

    Berryhill, Katie J.; Slater, Timothy F.

    2017-01-01

    As discipline-based astronomy education researchers become more interested in experimentally testing innovative teaching strategies to enhance learning in undergraduate introductory astronomy survey courses ("ASTRO 101"), scholars are placing increased attention toward better understanding factors impacting student gain scores on the…

  5. Longitudinal Improvement in Balance Error Scoring System Scores among NCAA Division-I Football Athletes.

    PubMed

    Mathiasen, Ross; Hogrefe, Christopher; Harland, Kari; Peterson, Andrew; Smoot, M Kyle

    2018-02-15

    The Balance Error Scoring System (BESS) is a commonly used concussion assessment tool. Recent studies have questioned the stability and reliability of baseline BESS scores. The purpose of this longitudinal prospective cohort study is to examine differences in yearly baseline BESS scores in athletes participating on an NCAA Division-I football team. NCAA Division-I freshman football athletes were videotaped performing the BESS test at matriculation and after 1 year of participation in the football program. Twenty-three athletes were enrolled in year 1 of the study, and 25 athletes were enrolled in year 2. Those athletes enrolled in year 1 were again videotaped after year 2 of the study. The paired t-test was used to assess for change in score over time for the firm surface, foam surface, and the cumulative BESS score. Additionally, inter- and intrarater reliability values were calculated. Cumulative errors on the BESS significantly decreased from a mean of 20.3 at baseline to 16.8 after 1 year of participation. The mean number of errors following the second year of participation was 15.0. Inter-rater reliability for the cumulative score ranged from 0.65 to 0.75. Intrarater reliability was 0.81. After 1 year of participation, there is a statistically and clinically significant improvement in BESS scores in an NCAA Division-I football program. Although additional improvement in BESS scores was noted after a second year of participation, it did not reach statistical significance. Football athletes should undergo baseline BESS testing at least yearly if the BESS is to be optimally useful as a diagnostic test for concussion.

  6. Poisson Approximation-Based Score Test for Detecting Association of Rare Variants.

    PubMed

    Fang, Hongyan; Zhang, Hong; Yang, Yaning

    2016-07-01

    Genome-wide association study (GWAS) has achieved great success in identifying genetic variants, but the nature of GWAS has determined its inherent limitations. Under the common disease rare variants (CDRV) hypothesis, the traditional association analysis methods commonly used in GWAS for common variants do not have enough power for detecting rare variants with a limited sample size. As a solution to this problem, pooling rare variants by their functions provides an efficient way for identifying susceptible genes. Rare variants typically have low frequencies of minor alleles, and the distribution of the total number of minor alleles of the rare variants can be approximated by a Poisson distribution. Based on this fact, we propose a new test method, the Poisson Approximation-based Score Test (PAST), for association analysis of rare variants. Two testing methods, namely, ePAST and mPAST, are proposed based on different strategies of pooling rare variants. Simulation results and application to the CRESCENDO cohort data show that our methods are more powerful than the existing methods. © 2016 John Wiley & Sons Ltd/University College London.
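    The approximation this record relies on can be checked numerically. The sketch below is not the PAST statistic; it only compares the exact distribution of a count of rare events with a Poisson distribution whose mean is the sum of the event probabilities. It assumes each allele copy is an independent Bernoulli trial, and the frequencies and function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing exactly k events with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def exact_count_pmf(probs):
    """Exact pmf of the number of successes among independent Bernoulli trials."""
    pmf = [1.0]  # before any trial, the count is 0 with probability 1
    for p in probs:
        updated = [0.0] * (len(pmf) + 1)
        for count, mass in enumerate(pmf):
            updated[count] += mass * (1 - p)   # this trial is not a minor allele
            updated[count + 1] += mass * p     # this trial is a minor allele
        pmf = updated
    return pmf


# Hypothetical setting: 200 allele copies, each a minor allele with probability 0.01.
probs = [0.01] * 200
exact = exact_count_pmf(probs)
lam = sum(probs)
for k in range(5):
    print(k, round(exact[k], 4), round(poisson_pmf(k, lam), 4))
```

    The two columns agree closely when every probability is small, which matches the low minor-allele frequencies the record describes; the agreement degrades as individual probabilities grow.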

  7. Utilizing the Six Realms of Meaning in Improving Campus Standardized Test Scores through Team Teaching and Strategic Planning

    ERIC Educational Resources Information Center

    Stevenson, Rosnisha D.; Kritsonis, William Allan

    2009-01-01

    This article will seek to utilize Dr. William Allan Kritsonis' book "Ways of Knowing Through the Realms of Meaning" (2007) as a framework to improve a campus's standardized test scores, more specifically, their TAKS (Texas Assessment of Knowledge and Skills) scores. Many campuses have an improvement plan, also known as a Campus…

  8. Integrating GIS in the Middle School Curriculum: Impacts on Diverse Students' Standardized Test Scores

    ERIC Educational Resources Information Center

    Goldstein, Donna; Alibrandi, Marsha

    2013-01-01

    This case study conducted with 1,425 middle school students in Palm Beach County, Florida, included a treatment group receiving GIS instruction (256) and a control group without GIS instruction (1,169). Quantitative analyses on standardized test scores indicated that inclusion of GIS in middle school curriculum had a significant effect on student…

  9. The Impact of Cooperative Learning on Critical Thinking Test Scores of Associate's Degree Graduates in Southwest Virginia

    ERIC Educational Resources Information Center

    Hodges, James Gregory

    2013-01-01

    This study examined the impact that the teaching technique known as cooperative learning had on the changes between pre- and post-test scores on all sub-categories ("induction, deduction, analysis, evaluation, inference", and "total composite") associated with the "California Critical Thinking Skills Test" (CCTST) for…

  10. Visual scoring of non cavitated caries lesions and clinical trial efficiency, testing xylitol in caries-active adults.

    PubMed

    Brown, John P; Amaechi, Bennett T; Bader, James D; Gilbert, Gregg H; Makhija, Sonia K; Lozano-Pineda, Juanita; Leo, Michael C; Chen, Chuhe; Vollmer, William M

    2014-06-01

    To better understand the effectiveness of xylitol in caries prevention in adults and to attempt improved clinical trial efficiency. As part of the Xylitol for Adult Caries Trial (X-ACT), non cavitated and cavitated caries lesions were assessed in subjects who were experiencing the disease. The trial was a test of the effectiveness of 5 g/day of xylitol, consumed by dissolving in the mouth five 1 g lozenges spaced across each day, compared with a sucralose placebo. For this analysis, seeking trial efficiency, 538 subjects aged 21-80, with complete data for four dental examinations, were selected from the 691 randomized into the 3-year trial, conducted at three sites. Acceptable inter- and intra-examiner reliability before and during the trial was quantified using the kappa statistic. The mean annualized noncavitated plus cavitated lesion transition scores in coronal and root surfaces, from sound to carious favoured xylitol over placebo, during the three cumulative periods of 12, 24, and 33 months, but these clinically and statistically nonsignificant differences declined in magnitude over time. Restricting the present assessment to those subjects with a higher baseline lifetime caries experience showed possible but inconsistent benefit. There was no clear and clinically relevant preventive effect of xylitol on caries in adults with adequate fluoride exposure when non cavitated plus cavitated lesions were assessed. This conformed to the X-ACT trial result assessing cavitated lesions. Including non cavitated lesion assessment in this full-scale, placebo-controlled, multisite, randomized, double-blinded clinical trial in adults experiencing dental caries did not achieve added trial efficiency or demonstrate practical benefit of xylitol. ClinicalTrials.Gov NCT00393055. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  11. Visual scoring of non-cavitated caries lesions and clinical trial efficiency, testing xylitol in caries active adults

    PubMed Central

    Brown, JP; Amaechi, BT; Bader, JD; Gilbert, GH; Makhija, SK; Lozano-Pineda, J; Leo, MC; Chuhe, C; Vollmer, WM

    2013-01-01

    Objectives: To better understand the effectiveness of xylitol in caries prevention in adults, and to attempt improved clinical trial efficiency. Methods: As part of the Xylitol for Adult Caries Trial (X-ACT), non-cavitated and cavitated caries lesions were assessed in subjects who were experiencing the disease. The trial was a test of the effectiveness of 5 grams/day of xylitol, consumed by dissolving in the mouth five 1 gram lozenges spaced across each day, compared with a sucralose placebo. For this analysis, seeking trial efficiency, 538 subjects aged 21–80, with complete data for four dental examinations were selected from the 691 randomized into the three year trial, conducted at three sites. Acceptable inter and intra examiner reliability before and during the trial was quantified using the kappa statistic. Results: The mean annualized non-cavitated plus cavitated lesion transition scores in coronal and root surfaces, from sound to carious favoured xylitol over placebo, during the three cumulative periods of 12, 24, and 33 months, but these clinically and statistically non-significant differences declined in magnitude over time. Restricting the present assessment to those subjects with a higher baseline lifetime caries experience showed possible but inconsistent benefit. Conclusions: There was no clear and clinically relevant preventive effect of xylitol on caries in adults with adequate fluoride exposure when non-cavitated plus cavitated lesions were assessed. This conformed to the X-ACT trial result assessing cavitated lesions. Including non-cavitated lesion assessment in this full scale, placebo controlled, multi site, randomized, double blinded clinical trial in adults experiencing dental caries, did not achieve added trial efficiency or demonstrate practical benefit of xylitol. Trial Registration: ClinicalTrials.Gov NCT00393055. PMID: 24205951

  12. The Validity of ITBS Reading Comprehension Test Scores for Learning Disabled and Non Learning Disabled Students under Extended-Time Conditions.

    ERIC Educational Resources Information Center

    Huesman, Ronald L., Jr.; Frisbie, David A.

    This study investigated the effect of extended-time limits in terms of performance levels and score comparability for reading comprehension scores on the Iowa Tests of Basic Skills (ITBS). The first part of the study compared the average reading comprehension scores on the ITBS of 61 sixth-graders with learning disabilities and 397 non learning…

  13. SAT Scores, 2012-13: Wake County Public School System (WCPSS). Measuring Up. D&A Report No. 13.22

    ERIC Educational Resources Information Center

    Muli, Juliana; Gilleland, Kevin; McMillen, Brad

    2014-01-01

    As the ACT has become part of North Carolina's mandatory testing program, SAT participation in Wake County Public School System (WCPSS) and North Carolina has declined in recent years. However, SAT performance in WCPSS remains high compared to state and national averages. In 2012-13, students in WCPSS continued to score 50-60 points higher on the…

  14. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? North Carolina

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles North Carolina's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade math test was 351 for non-Title I students and 347 for Title I students. In 2009, the mean scale score in 4th grade math was 354 for non-Title I students and 350 for Title I students. Between 2006 and 2009, the mean scale…

  15. State Test Score Trends through 2008-09, Part 4: Is Achievement Improving and Are Gaps Narrowing for Title I Students? New Hampshire

    ERIC Educational Resources Information Center

    Center on Education Policy, 2011

    2011-01-01

    This paper profiles New Hampshire's test score trends through 2008-09. In 2006, the mean scale score on the state 4th grade reading test was 445 for non-Title I students and 438 for Title I students. In 2009, the mean scale score in 4th grade reading was 448 for non-Title I students and 441 for Title I students. Between 2006 and 2009, the mean…

  16. Science Teacher Efficacy and Outcome Expectancy as Predictors of Students' End-of-Instruction (EOI) Biology I Test Scores

    ERIC Educational Resources Information Center

    Angle, Julie; Moseley, Christine

    2009-01-01

    The purpose of this study was to compare teacher efficacy beliefs of secondary Biology I teachers whose students' mean scores on the statewide End-of-Instruction (EOI) Biology I test met or exceeded the state academic proficiency level (Proficient Group) to teacher efficacy beliefs of secondary Biology I teachers whose students' mean scores on the…

  17. Consistency of maternal telephone administration of the asthma control test using postpartum recall compared to repeated measures during pregnancy.

    PubMed

    Xu, Ronghui; Li, Mofei; Johnson, Diana L; Luo, Yunjun; Chambers, Christina D

    2017-05-01

    Suboptimal asthma control during pregnancy may impact perinatal outcomes. U.S. guidelines recommend questionnaires to assess asthma control, including the Asthma Control Test (ACT). It is unknown in a research setting to what extent recall differs by the time between symptom occurrence and the administration of the questionnaire. Between 2009 and 2014, 196 pregnant asthmatic women were recruited by the Organization of Teratology Information Specialists (OTIS) MotherToBaby Pregnancy Studies. Participants were administered the ACT at enrollment, gestational weeks 20 and 32, and shortly after delivery. The same women were also administered the ACT retrospectively at approximately 6 months postpartum. The Pearson correlation coefficients between the in-pregnancy and retrospective continuous ACT scores for the 1st, 2nd and 3rd trimesters were: 0.67 (95% CI: 0.58, 0.74), 0.61 (0.52, 0.70) and 0.65 (0.56, 0.72), respectively. When dichotomized into well-controlled asthma (ACT score ≥ 20) versus otherwise, the chi-square test for all three trimesters resulted in p values <0.0001. Cohen's Kappa statistics for the same dichotomized scores were 0.51, 0.45 and 0.40 for each trimester, respectively. There was no evidence that adverse outcome of pregnancy (recall bias) influenced postpartum responses. The retrospectively recalled ACT score obtained postpartum was substantially different compared to in-pregnancy administration of the same questionnaire, which could reflect test-retest variability as well as attenuation of recall. Documentation of the magnitude and direction of these differences could be useful in interpretation of the impact of asthma control when the ACT is used in retrospective case-control studies for pregnancy outcomes.
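
    The agreement statistic reported in this abstract can be illustrated with a minimal sketch. It assumes only the dichotomization stated above (ACT score ≥ 20 counts as well-controlled) and the standard definition of Cohen's kappa; the scores, variable names, and helper functions below are hypothetical and are not taken from the study.

```python
def well_controlled(score, cutoff=20):
    """Dichotomize an ACT score: True if well-controlled (score >= cutoff)."""
    return score >= cutoff


def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    labels = set(a) | set(b)
    # Observed agreement: share of pairs with identical labels.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    p_e = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical ACT scores for six women (illustrative only, not study data).
in_pregnancy = [22, 18, 25, 19, 21, 15]
postpartum_recall = [23, 20, 24, 17, 18, 14]

a = [well_controlled(s) for s in in_pregnancy]
b = [well_controlled(s) for s in postpartum_recall]
kappa = cohens_kappa(a, b)
print(f"kappa = {kappa:.2f}")
```

    Kappa discounts the agreement expected by chance, so it can be modest even when the continuous scores correlate fairly well, as in the figures reported above.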

  18. The Causes and Consequences of Test Score Manipulation: Evidence from the New York Regents Examinations. CEPA Working Paper No. 16-08

    ERIC Educational Resources Information Center

    Dee, Thomas S.; Dobbie, Will; Jacob, Brian A.; Rockoff, Jonah

    2016-01-01

    In this paper, we show that the design and decentralized, school-based scoring of New York's high school exit exams--the Regents Examinations--led to the systematic manipulation of test scores just below important proficiency cutoffs. Our estimates suggest that teachers inflate approximately 40 percent of test scores near the proficiency cutoffs.…

  19. Do classroom ventilation rates in California elementary schools influence standardized test scores? Results from a prospective study.

    PubMed

    Mendell, M J; Eliseeva, E A; Davies, M M; Lobscheid, A

    2016-08-01

    Limited evidence has associated lower ventilation rates (VRs) in schools with reduced student learning or achievement. We analyzed longitudinal data collected over two school years from 150 classrooms in 28 schools within three California school districts. We estimated daily classroom VRs from real-time indoor carbon dioxide measured by web-connected sensors. School districts provided individual-level scores on standard tests in Math and English, and classroom-level demographic data. Analyses assessing learning effects used two VR metrics: average VRs for 30 days prior to tests, and proportion of prior daily VRs above specified thresholds during the year. We estimated relationships between scores and VR metrics in multivariate models with generalized estimating equations. All school districts had median school-year VRs below the California VR standard. Most models showed some positive associations of VRs with test scores; however, estimates varied in magnitude and few 95% confidence intervals excluded the null. Combined-district models estimated statistically significant increases of 0.6 points (P = 0.01) on English tests for each 10% increase in prior 30-day VRs. Estimated increases in Math were of similar magnitude but not statistically significant. Findings suggest potential small positive associations between classroom VRs and learning. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.

  20. Permanent Income and the Black-White Test Score Gap. NBER Working Paper No. 17610

    ERIC Educational Resources Information Center

    Rothstein, Jesse; Wozny, Nathan

    2011-01-01

    Analysts often examine the black-white test score gap conditional on family income. Typically only a current income measure is available. We argue that the gap conditional on permanent income is of greater interest, and we describe a method for identifying this gap using an auxiliary data set to estimate the relationship between current and…

  1. The Black-White Test Score Gap through Third Grade. NBER Working Paper No. 11049

    ERIC Educational Resources Information Center

    Fryer, Roland G.; Levitt, Steven D.

    2005-01-01

    This paper describes basic facts regarding the black-white test score gap over the first four years of school. Black children enter school substantially behind their white counterparts in reading and math, but including a small number of covariates erases the gap. Over the first four years of school, however, blacks lose substantial ground…

  2. Clinical trials transparency and the Trial and Experimental Studies Transparency (TEST) act.

    PubMed

    Logvinov, Ilana

    2014-03-01

    Clinical trial research is the cornerstone for successful advancement of medicine that provides hope for millions of people in the future. Full transparency in clinical trials may allow independent investigators to evaluate study designs, perform additional analysis of data, and potentially eliminate duplicate studies. The current regulatory system and publishers rely on investigators and pharmaceutical industries for complete and accurate reporting of results from completed clinical trials. Legislation seems to be the only way to enforce mandatory disclosure of results. The Trial and Experimental Studies Transparency (TEST) Act of 2012 was introduced to legislators in the United States to promote greater transparency in the research industry. Public safety and advancement of science are the driving forces for the proposed policy change. The TEST Act may benefit society and researchers; however, there are major concerns with participants' privacy and intellectual property protection. Copyright © 2014 Elsevier Inc. All rights reserved.

  3. A robust method using propensity score stratification for correcting verification bias for binary tests

    PubMed Central

    He, Hua; McDermott, Michael P.

    2012-01-01

    Sensitivity and specificity are common measures of the accuracy of a diagnostic test. The usual estimators of these quantities are unbiased if data on the diagnostic test result and the true disease status are obtained from all subjects in an appropriately selected sample. In some studies, verification of the true disease status is performed only for a subset of subjects, possibly depending on the result of the diagnostic test and other characteristics of the subjects. Estimators of sensitivity and specificity based on this subset of subjects are typically biased; this is known as verification bias. Methods have been proposed to correct verification bias under the assumption that the missing data on disease status are missing at random (MAR), that is, the probability of missingness depends on the true (missing) disease status only through the test result and observed covariate information. When some of the covariates are continuous, or the number of covariates is relatively large, the existing methods require parametric models for the probability of disease or the probability of verification (given the test result and covariates), and hence are subject to model misspecification. We propose a new method for correcting verification bias based on the propensity score, defined as the predicted probability of verification given the test result and observed covariates. This is estimated separately for those with positive and negative test results. The new method classifies the verified sample into several subsamples that have homogeneous propensity scores and allows correction for verification bias. Simulation studies demonstrate that the new estimators are more robust to model misspecification than existing methods, but still perform well when the models for the probability of disease and probability of verification are correctly specified. PMID:21856650
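
    The verification bias described in this abstract can be illustrated with a short sketch. It uses hypothetical counts and assumes the verification probabilities are known and depend only on the test result. The correction shown is simple inverse-probability weighting, used here only for illustration; it is not the propensity score stratification method proposed by the authors.

```python
def sensitivity(tp, fn):
    """Proportion of truly diseased subjects with a positive test."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of truly non-diseased subjects with a negative test."""
    return tn / (tn + fp)


# Hypothetical full-sample counts (assumed for illustration, not from the paper).
tp, fn, fp, tn = 80, 20, 30, 170

# Assumed verification probabilities: every test-positive subject is verified,
# but only 10% of test-negative subjects are.
p_verify_pos, p_verify_neg = 1.0, 0.1

# Expected counts among verified subjects only.
v_tp, v_fp = tp * p_verify_pos, fp * p_verify_pos
v_fn, v_tn = fn * p_verify_neg, tn * p_verify_neg

# Naive estimates from the verified subset are biased.
naive_sens = sensitivity(v_tp, v_fn)
naive_spec = specificity(v_tn, v_fp)

# Weighting each verified subject by 1 / P(verified | test result)
# recovers the full-sample values in this idealized setting.
ipw_sens = sensitivity(v_tp / p_verify_pos, v_fn / p_verify_neg)
ipw_spec = specificity(v_tn / p_verify_neg, v_fp / p_verify_pos)

print(f"naive:    sensitivity={naive_sens:.3f}, specificity={naive_spec:.3f}")
print(f"weighted: sensitivity={ipw_sens:.3f}, specificity={ipw_spec:.3f}")
```

    In this idealized case, verifying test-negative subjects less often inflates the naive sensitivity and deflates the naive specificity. In practice the verification probabilities must be estimated, which is where the model misspecification discussed in the abstract can arise.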

  4. A Brief Look at: Test Scores and the Standard Error of Measurement. E&R Report No. 10.13

    ERIC Educational Resources Information Center

    Holdzkom, David; Sumner, Brian; McMillen, Brad

    2010-01-01

    In the context of standardized testing, the standard error of measurement (SEM) is a measure of the factors other than the student's actual knowledge of the tested material that may affect the student's test score. Such factors may include distractions in the testing environment, fatigue, hunger, or even luck. This means that a student's observed…

  5. Enrollment Management Trends Report, 2012: A Snapshot of the 2011 ACT-Tested High School Graduates

    ERIC Educational Resources Information Center

    ACT, Inc., 2012

    2012-01-01

    ACT created the "Enrollment Management Trends Report" to provide enrollment managers and other college administrators with information about students' patterns during the college choice process of the 2011 high school graduates who took the ACT[R] test. More than 1.6 million students--roughly half of the graduating class of 2011--took…

  6. Associations between cadmium exposure and neurocognitive test scores in a cross-sectional study of US adults.

    PubMed

    Ciesielski, Timothy; Bellinger, David C; Schwartz, Joel; Hauser, Russ; Wright, Robert O

    2013-02-05

    The relationship between low-level environmental cadmium exposure and neurotoxicity has not been well studied in adults. Our goal was to evaluate associations between neurocognitive exam scores and a biomarker of cumulative cadmium exposure among adults in the Third National Health and Nutrition Examination Survey (NHANES III). NHANES III is a nationally representative cross-sectional survey of the U.S. population conducted between 1988 and 1994. We analyzed data from a subset of participants, age 20-59, who participated in a computer-based neurocognitive evaluation. There were four outcome measures: the Simple Reaction Time Test (SRTT: visual motor speed), the Symbol Digit Substitution Test (SDST: attention/perception), the Serial Digit Learning Test (SDLT) trials-to-criterion, and the SDLT total-error-score (SDLT-tests: learning recall/short-term memory). We fit multivariable-adjusted models to estimate associations between urinary cadmium concentrations and test scores. 5662 participants underwent neurocognitive screening, and 5572 (98%) of these had a urinary cadmium level available. Prior to multivariable adjustment, higher urinary cadmium concentration was associated with worse performance in each of the 4 outcomes. After multivariable adjustment, most of these relationships were not significant, and age was the most influential variable in reducing the association magnitudes. However, among never-smokers with no known occupational cadmium exposure, the relationship between urinary cadmium and SDST score (attention/perception) was significant: a 1 μg/L increase in urinary cadmium corresponded to a 1.93% (95%CI: 0.05, 3.81) decrement in performance. These results suggest that higher cumulative cadmium exposure in adults may be related to subtly decreased performance in tasks requiring attention and perception, particularly among those adults whose cadmium exposure is primarily through diet (no smoking or work-based cadmium exposure). This association was observed among exposure levels

  7. Comparison of baseline and post-concussion ImPACT test scores in young athletes with stimulant-treated and untreated ADHD.

    PubMed

    Gardner, Ryan M; Yengo-Kahn, Aaron; Bonfield, Christopher M; Solomon, Gary S

    2017-02-01

    Baseline and post-concussion neurocognitive testing is useful in managing concussed athletes. Attention deficit hyperactivity disorder (ADHD) and stimulant medications are recognized as potential modifiers of performance on neurocognitive testing by the Concussion in Sport Group. Our goal was to assess whether individuals with ADHD perform differently on post-concussion testing and if this difference is related to the use of stimulants. This was a retrospective case-control study in which 4373 athletes underwent baseline and post-concussion testing using the ImPACT battery. 277 athletes self-reported a history of ADHD, of whom 206 reported no stimulant treatment and 69 reported stimulant treatment. Each group was matched with participants reporting no history of ADHD or stimulant use on several biopsychosocial characteristics. Non-parametric tests were used to assess ImPACT composite score differences between groups. Participants with ADHD had worse verbal memory, visual memory, visual motor speed, and reaction time scores than matched controls at baseline and post-concussion, all with p ≤ .001 and |r|≥ 0.100. Athletes without stimulant treatment had lower verbal memory, visual memory, visual motor speed, and reaction time scores than controls at baseline (p ≤ 0.01, |r|≥ 0.100 [except verbal memory, r = -0.088]) and post-concussion (p = 0.000, |r|> 0.100). Athletes with stimulant treatment had lower verbal memory (Baseline: p = 0.047, r = -0.108; Post-concussion: p = 0.023, r = -0.124) and visual memory scores (Baseline: p = 0.013, r = -0.134; Post-concussion: p = 0.003, r = -0.162) but equivalent visual motor speed and reaction time scores versus controls at baseline and post-concussion. ADHD-specific baseline and post-concussion neuropsychological profiles, as well as stimulant medication status, may need to be considered when interpreting ImPACT test results. Further investigation into the effects of ADHD and stimulant use on recovery from

  8. The effect of an intervention program on functional movement screen test scores in mixed martial arts athletes.

    PubMed

    Bodden, Jamie G; Needham, Robert A; Chockalingam, Nachiappan

    2015-01-01

    This study assessed the basic fundamental movements of mixed martial arts (MMA) athletes using the functional movement screen (FMS) assessment and determined if an intervention program was successful at improving results. Participants were placed into 1 of 2 groups: intervention or control. The intervention group was required to complete a corrective exercise program 4 times per week, and all participants were asked to continue their usual MMA training routine. A mid-intervention FMS test was included to examine if successful results were noticed sooner than the 8-week period. Results highlighted differences in FMS test scores between the control group and intervention group (p = 0.006). Post hoc testing revealed a significant increase in the FMS score of the intervention group between weeks 0 and 8 (p = 0.00) and weeks 0 and 4 (p = 0.00) and no significant increase between weeks 4 and 8 (p = 1.00). A χ² analysis revealed that the intervention group participants were more likely to have an FMS score >14 than participants in the control group at week 4 (χ² = 7.29, p < 0.01) and week 8 (χ² = 5.2, p ≤ 0.05). Finally, a greater number of participants in the intervention group were free from asymmetry at week 4 and week 8 compared with the initial test period. The results of the study suggested that a 4-week intervention program was sufficient to improve FMS scores. Most, if not all, of the movements covered on the FMS relate to many aspects of MMA training. The knowledge that the FMS can identify movement dysfunctions and, furthermore, the fact that the issues can be improved through a standardized intervention program could be advantageous to MMA coaches, thus providing the opportunity to adapt and implement new additions to training programs.

  9. Interpreting the g loadings of intelligence test composite scores in light of Spearman's law of diminishing returns.

    PubMed

    Reynolds, Matthew R

    2013-03-01

    The linear loadings of intelligence test composite scores on a general factor (g) have been investigated recently in factor analytic studies. Spearman's law of diminishing returns (SLODR), however, implies that the g loadings of test scores likely decrease in magnitude as g increases, or they are nonlinear. The purpose of this study was to (a) investigate whether the g loadings of composite scores from the Differential Ability Scales (2nd ed.) (DAS-II, C. D. Elliott, 2007a, Differential Ability Scales (2nd ed.). San Antonio, TX: Pearson) were nonlinear and (b) if they were nonlinear, to compare them with linear g loadings to demonstrate how SLODR alters the interpretation of these loadings. Linear and nonlinear confirmatory factor analysis (CFA) models were used to model Nonverbal Reasoning, Verbal Ability, Visual Spatial Ability, Working Memory, and Processing Speed composite scores in four age groups (5-6, 7-8, 9-13, and 14-17) from the DAS-II norming sample. The nonlinear CFA models provided better fit to the data than did the linear models. In support of SLODR, estimates obtained from the nonlinear CFAs indicated that g loadings decreased as g level increased. The nonlinear portion for the nonverbal reasoning loading, however, was not statistically significant across the age groups. Knowledge of general ability level informs composite score interpretation because g is less likely to produce differences, or is measured less, in those scores at higher g levels. One implication is that it may be more important to examine the pattern of specific abilities at higher general ability levels. PsycINFO Database Record (c) 2013 APA, all rights reserved.

  10. Increasing the reliability of the fluid/crystallized difference score from the Kaufman Adolescent and Adult Intelligence Test with reliable component analysis.

    PubMed

    Caruso, J C

    2001-06-01

    The unreliability of difference scores is a well-documented phenomenon in the social sciences and has led researchers and practitioners to interpret differences cautiously, if at all. In the case of the Kaufman Adolescent and Adult Intelligence Test (KAIT), the unreliability of the difference between the Fluid IQ and the Crystallized IQ is due to the high correlation between the two scales. The consequences of the lack of precision with which differences are identified are wide confidence intervals and low-powered significance tests (i.e., large differences are required to be declared statistically significant). Reliable component analysis (RCA) was performed on the subtests of the KAIT in order to address these problems. RCA is a new data reduction technique that results in uncorrelated component scores with maximum proportions of reliable variance. Results indicate that the scores defined by RCA have discriminant and convergent validity (with respect to the equally weighted scores) and that differences between the scores, derived from a single testing session, were more reliable than differences derived from equal weighting for each age group (11-14 years, 15-34 years, 35-85+ years). This reliability advantage results in narrower confidence intervals around difference scores and smaller differences required for statistical significance.

  11. Identifying uncontrolled asthma in young children: clinical scores or objective variables?

    PubMed

    Leung, T F; Ko, F W S; Sy, H Y; Wong, E; Li, C Y; Yung, E; Hui, D S C; Wong, G W K; Lai, C K W

    2009-03-01

    Several international asthma guidelines emphasize the importance of assessing asthma control. However, there are limited data on the usefulness of available assessment tools in indicating disease control in young asthmatics. This study investigated the ability of the Chinese version of the Childhood Asthma Control Test (C-ACT) and other disease-related factors to identify uncontrolled asthma (UA) in young children. During the same clinic visit, asthma patients 4 to 11 years of age completed the C-ACT and underwent exhaled nitric oxide and spirometric measurements. Blinded to these results, the same investigator assigned a Disease Severity Score (DSS) and rated asthma control according to the Global Initiative for Asthma. The mean (SD) age of 113 recruited patients was 9.1 (2.0) years, and 35% of them had UA. C-ACT, DSS and forced expiratory volume in 1 second (FEV(1)) differed among patients with different control status (p < 0.001 for C-ACT and DSS; p = 0.014 for FEV(1)). Logistic regression confirmed that UA was associated with DSS (p < 0.001), PEF (p = 0.002), C-ACT (p = 0.011), and FEV(1) (p = 0.012). By ROC analysis, C-ACT and DSS were the best predictors for UA (p < 0.001), followed by PEF (p = 0.006) and FEV(1) (p = 0.007). When analyzed by the Classification and Regression Tree (CART) approach, the sequential use of DSS and C-ACT had 77% sensitivity and 84% specificity in identifying UA. C-ACT is better than objective parameters in identifying young Chinese children with UA.

  12. Testing the radiosurgery-based arteriovenous malformation score and the modified Spetzler-Martin grading system to predict radiosurgical outcome.

    PubMed

    Andrade-Souza, Yuri M; Zadeh, Gelareh; Ramani, Meera; Scora, Daryl; Tsao, May N; Schwartz, Michael L

    2005-10-01

    The aim of this study was to validate the radiosurgery-based arteriovenous malformation (AVM) score and the modified Spetzler-Martin grading system to predict radiosurgical outcome. One hundred thirty-six patients with brain AVMs were randomly selected. These patients had undergone a linear accelerator radiosurgical procedure at a single center between 1989 and 2000. Patients were divided into four groups according to an AVM score, which was calculated from the lesion volume, lesion location, and patient age (Group 1, AVM score <1; Group 2, AVM score 1-1.49; Group 3, AVM score 1.5-2; and Group 4, AVM score >2). Patients with a Spetzler-Martin Grade III AVM were divided into Grades IIIA (lesion >3 cm) and IIIB (lesion <3 cm). Sixty-two female (45.6%) and 74 male (54.4%) patients with a median age of 37.5 years (mean 37.5 years, range 5-77 years) were followed up for a median of 40 months. The median tumor margin dose was 15 Gy (mean 17.23 Gy, range 15-25 Gy). The proportions of excellent outcomes according to the AVM score were as follows: 91.7% for Group 1, 74.1% for Group 2, 60% for Group 3, and 33.3% for Group 4 (chi-square test, degrees of freedom (df) = 3, p < 0.001). Based on the modified Spetzler-Martin system, Grade I lesions had 88.9% excellent results; Grade II, 69.6%; Grade IIIB, 61.5%; and Grades IIIA and IV, 44.8% (chi-square test, df = 3, p = 0.047). The radiosurgery-based AVM score can be used accurately to predict excellent results following a single radiosurgical treatment for AVM. The modified Spetzler-Martin system can also predict radiosurgical results for AVMs, thus making it possible to use this system while deciding between surgery and radiosurgery.

  13. Interpreting Linked Psychomotor Performance Scores

    ERIC Educational Resources Information Center

    Looney, Marilyn A.

    2013-01-01

    Given that equating/linking applications are now appearing in kinesiology literature, this article provides an overview of the different types of linked test scores: equated, concordant, and predicted. It also addresses the different types of evidence required to determine whether the scores from two different field tests (measuring the same…

  14. Why women perform better in college than admission scores would predict: Exploring the roles of conscientiousness and course-taking patterns.

    PubMed

    Keiser, Heidi N; Sackett, Paul R; Kuncel, Nathan R; Brothen, Thomas

    2016-04-01

    Women typically obtain higher subsequent college GPAs than men with the same admissions test score. A common reaction is to attribute this to a flaw in the admissions test. We explore the possibility that this underprediction of women's performance reflects gender differences in conscientiousness and college course-taking patterns. In Study 1, we focus on using the ACT to predict performance in a single, large course where performance is decomposed into cognitive (exam and quiz scores) and less cognitive, discretionary components (discussion and extra credit points). The ACT does not underpredict females' cognitive performance, but it does underpredict female performance on the less cognitive, discretionary components of academic performance, because it fails to measure and account for the personality trait of conscientiousness. In Study 2, we create 2 course-difficulty indices (Course Challenge and Mean Aptitude in Course) and add them to an HLM regression model to see if they reduce the degree to which SAT scores underpredict female performance. Including Course Challenge does result in a modest reduction of the gender coefficient; however, including Mean Aptitude in Course does not. Thus, differences in course-taking patterns are a partial (albeit small) explanation for the common finding of differential prediction by gender. (c) 2016 APA, all rights reserved.

  15. Error Rates in Measuring Teacher and School Performance Based on Student Test Score Gains. NCEE 2010-4004

    ERIC Educational Resources Information Center

    Schochet, Peter Z.; Chiang, Hanley S.

    2010-01-01

    This paper addresses likely error rates for measuring teacher and school performance in the upper elementary grades using value-added models applied to student test score gain data. Using realistic performance measurement system schemes based on hypothesis testing, we develop error rate formulas based on OLS and Empirical Bayes estimators.…

  16. 16 CFR 1611.37 - Reasonable and representative tests under section 8 of the Act.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... FLAMMABLE FABRICS ACT REGULATIONS STANDARD FOR THE FLAMMABILITY OF VINYL PLASTIC FILM Rules and Regulations..., on initial test a film or a textile fabric with a nitro-cellulose fiber, finish or coating, does not... production shall be deemed reasonable and representative tests for such film or textile fabric. (d...

  17. 16 CFR 1611.37 - Reasonable and representative tests under section 8 of the Act.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... FLAMMABLE FABRICS ACT REGULATIONS STANDARD FOR THE FLAMMABILITY OF VINYL PLASTIC FILM Rules and Regulations..., on initial test a film or a textile fabric with a nitro-cellulose fiber, finish or coating, does not... production shall be deemed reasonable and representative tests for such film or textile fabric. (d...

  18. Achievement Testing in the No Child Left Behind Era: The Arkansas Benchmark

    ERIC Educational Resources Information Center

    Hall, John D.; Howerton, D. Lynn; Jones, Craig H.

    2008-01-01

    The No Child Left Behind Act and the accountability movement in public education caused many states to develop criterion-referenced academic achievement tests. Scores from these tests are often used to make high stakes decisions. Even so, these tests typically do not receive independent psychometric scrutiny. We evaluated the 2005 Arkansas…

  19. Your move: The effect of chess on mathematics test scores.

    PubMed

    Rosholm, Michael; Mikkelsen, Mai Bjørnskov; Gumede, Kamilla

    2017-01-01

    We analyse the effect of substituting a weekly mathematics lesson in primary school grades 1-3 with a lesson in mathematics based on chess instruction. We use data from the City of Aarhus in Denmark, combining test score data with a comprehensive data set obtained from administrative registers. We use two different methodological approaches to identify and estimate treatment effects and we tend to find positive effects, indicating that knowledge acquired through chess play can be transferred to the domain of mathematics. We also find larger impacts for unhappy children and children who are bored in school, perhaps because chess instruction facilitates learning by providing an alternative approach to mathematics for these children. The results are encouraging and suggest that chess may be an important and effective tool for improving mathematical capacity in young students.

  20. Determinants of Academic Attainment in the United States: A Quantile Regression Analysis of Test Scores

    ERIC Educational Resources Information Center

    Haile, Getinet Astatike; Nguyen, Anh Ngoc

    2008-01-01

    We investigate the determinants of high school students' academic attainment in mathematics, reading and science in the United States; focusing particularly on possible differential impacts of ethnicity and family background across the distribution of test scores. Using data from the NELS2000 and employing quantile regression, we find two…