Rubínová, Eva; Nikolai, Tomáš; Marková, Hana; Siffelová, Kamila; Laczó, Jan; Hort, Jakub; Vyhnálek, Martin
2014-01-01
The Clock Drawing Test is a frequently used cognitive screening test with several scoring systems in elderly populations. We compare simple and complex scoring systems and evaluate the usefulness of the combination of the Clock Drawing Test with the Mini-Mental State Examination to detect patients with mild cognitive impairment. Patients with amnestic mild cognitive impairment (n = 48) and age- and education-matched controls (n = 48) underwent neuropsychological examinations, including the Clock Drawing Test and the Mini-Mental State Examination. Clock drawings were scored by three blinded raters using one simple (6-point scale) and two complex (17- and 18-point scales) systems. The sensitivity and specificity of these scoring systems used alone and in combination with the Mini-Mental State Examination were determined. Complex scoring systems, but not the simple scoring system, were significant predictors of the amnestic mild cognitive impairment diagnosis in logistic regression analysis. At equal levels of sensitivity (87.5%), the Mini-Mental State Examination showed higher specificity (31.3%, compared with 12.5% for the 17-point Clock Drawing Test scoring scale). The combination of Clock Drawing Test and Mini-Mental State Examination scores increased the area under the curve (0.72; p < .001) and increased specificity (43.8%), but did not increase sensitivity, which remained high (85.4%). A simple 6-point scoring system for the Clock Drawing Test did not differentiate between healthy elderly and patients with amnestic mild cognitive impairment in our sample. Complex scoring systems were slightly more efficient, yet still were characterized by high rates of false-positive results. We found psychometric improvement using combined scores from the Mini-Mental State Examination and the Clock Drawing Test when complex scoring systems were used. The results of this study support the benefit of using combined scores from simple methods.
21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.
Code of Federal Regulations, 2011 CFR
2011-04-01
... 21 Food and Drugs 8 2011-04-01 2011-04-01 false Ovarian adnexal mass assessment score test system... immunological Test Systems § 866.6050 Ovarian adnexal mass assessment score test system. (a) Identification. An ovarian/adnexal mass assessment test system is a device that measures one or more proteins in serum or...
2016-04-04
Final 3. DATES COVERED (From - To) 4. TITLE AND SUBTITLE Test Operations Procedure (TOP) 03-2-827 Test Procedures for Video Target Scoring Using...ABSTRACT This Test Operations Procedure (TOP) describes typical equipment and procedures to setup and operate a Video Target Scoring System (VTSS) to...lights. 15. SUBJECT TERMS Video Target Scoring System, VTSS, witness screens, camera, target screen, light pole 16. SECURITY
Score Reporting for the 1991 Medical College Admission Test.
ERIC Educational Resources Information Center
Mitchell, Karen J.; Haynes, Robert
1990-01-01
Data used in a major review of the system for reporting scores on the Medical College Admission Test (MCAT) are presented and discussed. The data demonstrated the value of the current score-reporting system and led to retention of the 15-point MCAT score scale in 1991. (Author/MSE)
ERIC Educational Resources Information Center
Sheehan, Dwayne P.; Lafave, Mark R.; Katz, Larry
2011-01-01
This study was designed to test the intra- and inter-rater reliability of the University of North Carolina's Balance Error Scoring System in 9- and 10-year-old children. Additionally, a modified version of the Balance Error Scoring System was tested to determine if it was more sensitive in this population ("raw scores"). Forty-six…
ERIC Educational Resources Information Center
Simner, Marvin L.
1985-01-01
An abbreviated scoring system for the Goodenough-Harris Draw-A-Man Test found that three items had the same overall potential for correctly identifying at-risk kindergarteners as more time-consuming scoring methods. (CL)
Test/score/report: Simulation techniques for automating the test process
NASA Technical Reports Server (NTRS)
Hageman, Barbara H.; Sigman, Clayton B.; Koslosky, John T.
1994-01-01
A Test/Score/Report capability is currently being developed for the Transportable Payload Operations Control Center (TPOCC) Advanced Spacecraft Simulator (TASS) system which will automate testing of the Goddard Space Flight Center (GSFC) Payload Operations Control Center (POCC) and Mission Operations Center (MOC) software in three areas: telemetry decommutation, spacecraft command processing, and spacecraft memory load and dump processing. Automated computer control of the acceptance test process is one of the primary goals of a test team. With the proper simulation tools and user interface, the task of acceptance testing, regression testing, and repeatability of specific test procedures of a ground data system can be a simpler task. Ideally, the goal for complete automation would be to plug the operational deliverable into the simulator, press the start button, execute the test procedure, accumulate and analyze the data, score the results, and report the results to the test team along with a go/no recommendation to the test team. In practice, this may not be possible because of inadequate test tools, pressures of schedules, limited resources, etc. Most tests are accomplished using a certain degree of automation and test procedures that are labor intensive. This paper discusses some simulation techniques that can improve the automation of the test process. The TASS system tests the POCC/MOC software and provides a score based on the test results. The TASS system displays statistics on the success of the POCC/MOC system processing in each of the three areas as well as event messages pertaining to the Test/Score/Report processing. The TASS system also provides formatted reports documenting each step performed during the tests and the results of each step. A prototype of the Test/Score/Report capability is available and currently being used to test some POCC/MOC software deliveries. When this capability is fully operational it should greatly reduce the time necessary to test a POCC/MOC software delivery, as well as improve the quality of the test process.
McKeough, D Michael; Mattern-Baxter, Katrin; Barakatt, Edward
2010-01-01
The purpose of this study was to determine if a computer-aided instruction learning module improves students' knowledge of the neuroanatomy/physiology and clinical examination of the dorsal column-medial lemniscal (DCML) system. Sixty-one physical therapy students enrolled in a clinical neurology course in entry-level PT educational programs at two universities participated in the study. Students from University-1 (U1;) had not had a previous neuroanatomy course, while students from University-2 (U2;) had taken a neuroanatomy course in the previous semester. Before and after working with the learning module, students took a paper-and-pencil test on the neuroanatomy/physiology and clinical examination of the DCML system. Kruskal-Wallis one-way ANOVA and Mann-Whitney tests were used to determine if differences existed between neuroanatomy/physiology examination scores and clinical examination scores before and after taking the learning module, and between student groups based on university attended. For students from U1, neuroanatomy/physiology post-test scores improved significantly over pre-test scores (p < 0.001), while post-test scores of students from U2 did not (p = 0.60). Neuroanatomy/physiology pre-test scores from U2 were significantly better than those from U1 (p < 0.001); there was no significant difference in post-test scores (p = 0.062). Clinical examination pre-test and post-test scores from U2 were significantly better than those from U1 (p < 0.001). Clinical examination post-test scores improved significantly from the pre-test scores for both U1 (p < 0.001) and U2 (p < 0.001).
Alsalaheen, Bara; Haines, Jamie; Yorke, Amy; Broglio, Steven P
2015-12-01
To examine the reliability, convergent, and discriminant validity of the limits of stability (LOS) test to assess dynamic postural stability in adolescents using a portable forceplate system. Cross-sectional reliability observational study. School setting. Adolescents (N=36) completed all measures during the first session. To examine the reliability of the LOS test, a subset of 15 participants repeated the LOS test after 1 week. Not applicable. Outcome measurements included the LOS test, Balance Error Scoring System, Instrumented Balance Error Scoring System, and Modified Clinical Test for Sensory Interaction on Balance. A significant relation was observed among LOS composite scores (r=.36-.87, P<.05). However, no relation was observed between LOS and static balance outcome measurements. The reliability of the LOS composite scores ranged from moderate to good (intraclass correlation coefficient model 2,1=.73-.96). The results suggest that the LOS composite scores provide unique information about dynamic postural stability, and the LOS test completed at 100% of the theoretical limit appeared to be a reliable test of dynamic postural stability in adolescents. Clinicians should use dynamic balance measurement as part of their balance assessment and should not use static balance testing (eg, Balance Error Scoring System) to make inferences about dynamic balance, especially when balance assessment is used to determine rehabilitation outcomes, or when making return to play decisions after injury. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Gera, G; Freeman, D L; Blackinton, M T; Horak, F B; King, L
2016-02-01
Balance deficits in people with Parkinson's disease can affect any of the multiple systems encompassing balance control. Thus, identification of the specific deficit is crucial in customizing balance rehabilitation. The sensory organization test, a test of sensory integration for balance control, is sometimes used in isolation to identify balance deficits in people with Parkinson's disease. More recently, the Mini-Balance Evaluations Systems Test, a clinical scale that tests multiple domains of balance control, has begun to be used to assess balance in patients with Parkinson's disease. The purpose of our study was to compare the use of Sensory Organization Test and Mini-Balance Evaluations Systems Test in identifying balance deficits in people with Parkinson's disease. 45 participants (27M, 18F; 65.2 ± 8.2 years) with idiopathic Parkinson's disease participated in the cross-sectional study. Balance assessment was performed using the Sensory Organization Test and the Mini-Balance Evaluations Systems Test. People were classified into normal and abnormal balance based on the established cutoff scores (normal balance: Sensory Organization Test >69; Mini-Balance Evaluations Systems Test >73). More subjects were classified as having abnormal balance with the Mini-Balance Evaluations Systems Test (71% abnormal) than with the Sensory Organization Test (24% abnormal) in our cohort of people with Parkinson's disease. There were no subjects with a normal Mini-Balance Evaluations Systems Test score but abnormal Sensory Organization Test score. In contrast, there were 21 subjects who had an abnormal Mini-Balance Evaluations Systems Test score but normal Sensory Organization Test scores. Findings from this study suggest that investigation of sensory integration deficits, alone, may not be able to identify all types of balance deficits found in patients with Parkinson's disease. Thus, a comprehensive approach should be used to test of multiple balance systems to provide customized rehabilitation.
Arneja, Jugpal S; Narasimhan, Kailash; Bouwman, David; Bridge, Patrick D
2009-12-01
In-training evaluations in graduate medical education have typically been challenging. Although the majority of standardized examination delivery methods have become computer-based, in-training examinations generally remain pencil-paper-based, if they are performed at all. Audience response systems present a novel way to stimulate and evaluate the resident-learner. The purpose of this study was to assess the outcomes of audience response systems testing as compared with traditional testing in a plastic surgery residency program. A prospective 1-year pilot study of 10 plastic surgery residents was performed using audience response systems-delivered testing for the first half of the academic year and traditional pencil-paper testing for the second half. Examination content was based on monthly "Core Quest" curriculum conferences. Quantitative outcome measures included comparison of pretest and posttest and cumulative test scores of both formats. Qualitative outcomes from the individual participants were obtained by questionnaire. When using the audience response systems format, pretest and posttest mean scores were 67.5 and 82.5 percent, respectively; using traditional pencil-paper format, scores were 56.5 percent and 79.5 percent. A comparison of the cumulative mean audience response systems score (85.0 percent) and traditional pencil-paper score (75.0 percent) revealed statistically significantly higher scores with audience response systems (p = 0.01). Qualitative outcomes revealed increased conference enthusiasm, greater enjoyment of testing, and no user difficulties with the audience response systems technology. The audience response systems modality of in-training evaluation captures participant interest and reinforces material more effectively than traditional pencil-paper testing does. The advantages include a more interactive learning environment, stimulation of class participation, immediate feedback to residents, and immediate tabulation of results for the educator. Disadvantages include start-up costs and lead-time preparation.
Shenker, Bennett S
2014-02-01
To validate a scoring system that evaluates the ability of Internet search engines to correctly predict diagnoses when symptoms are used as search terms. We developed a five point scoring system to evaluate the diagnostic accuracy of Internet search engines. We identified twenty diagnoses common to a primary care setting to validate the scoring system. One investigator entered the symptoms for each diagnosis into three Internet search engines (Google, Bing, and Ask) and saved the first five webpages from each search. Other investigators reviewed the webpages and assigned a diagnostic accuracy score. They rescored a random sample of webpages two weeks later. To validate the five point scoring system, we calculated convergent validity and test-retest reliability using Kendall's W and Spearman's rho, respectively. We used the Kruskal-Wallis test to look for differences in accuracy scores for the three Internet search engines. A total of 600 webpages were reviewed. Kendall's W for the raters was 0.71 (p<0.0001). Spearman's rho for test-retest reliability was 0.72 (p<0.0001). There was no difference in scores based on Internet search engine. We found a significant difference in scores based on the webpage's order on the Internet search engine webpage (p=0.007). Pairwise comparisons revealed higher scores in the first webpages vs. the fourth (corr p=0.009) and fifth (corr p=0.017). However, this significance was lost when creating composite scores. The five point scoring system to assess diagnostic accuracy of Internet search engines is a valid and reliable instrument. The scoring system may be used in future Internet research. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Scoring systems for the Clock Drawing Test: A historical review
Spenciere, Bárbara; Alves, Heloisa; Charchat-Fichman, Helenice
2017-01-01
The Clock Drawing Test (CDT) is a simple neuropsychological screening instrument that is well accepted by patients and has solid psychometric properties. Several different CDT scoring methods have been developed, but no consensus has been reached regarding which scoring method is the most accurate. This article reviews the literature on these scoring systems and the changes they have undergone over the years. Historically, different types of scoring systems emerged. Initially, the focus was on screening for dementia, and the methods were both quantitative and semi-quantitative. Later, the need for an early diagnosis called for a scoring system that can detect subtle errors, especially those related to executive function. Therefore, qualitative analyses began to be used for both differential and early diagnoses of dementia. A widely used qualitative method was proposed by Rouleau et al. (1992). Tracing the historical path of these scoring methods is important for developing additional scoring systems and furthering dementia prevention research. PMID:29213488
Karr, Justin E; Garcia-Barrera, Mauricio A; Holdnack, James A; Iverson, Grant L
2017-05-01
Executive function consists of multiple cognitive processes that operate as an interactive system to produce volitional goal-oriented behavior, governed in large part by frontal microstructural and physiological networks. Identification of deficits in executive function in those with neurological or psychiatric conditions can be difficult because the normal variation in executive function test scores, in healthy adults when multiple tests are used, is largely unknown. This study addresses that gap in the literature by examining the prevalence of low scores on a brief battery of executive function tests. The sample consisted of 1,050 healthy individuals (ages 16-89) from the standardization sample for the Delis-Kaplan Executive Function System (D-KEFS). Seven individual test scores from the Trail Making Test, Color-Word Interference Test, and Verbal Fluency Test were analyzed. Low test scores, as defined by commonly used clinical cut-offs (i.e., ≤25th, 16th, 9th, 5th, and 2nd percentiles), occurred commonly among the adult portion of the D-KEFS normative sample (e.g., 62.8% of the sample had one or more scores ≤16th percentile, 36.1% had one or more scores ≤5th percentile), and the prevalence of low scores increased with lower intelligence and fewer years of education. The multivariate base rates (BR) in this article allow clinicians to understand the normal frequency of low scores in the general population. By use of these BRs, clinicians and researchers can improve the accuracy with which they identify executive dysfunction in clinical groups, such as those with traumatic brain injury or neurodegenerative diseases. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
Heparin-Induced Thrombocytopenia Antibody Test
... HIT II is clinically suspected. There is a pre-test scoring system that is typically used to determine ... The HIT antibody test is performed when this pre-scoring test shows that a person has a moderate to ...
A Review of Scoring Algorithms for Ability and Aptitude Tests.
ERIC Educational Resources Information Center
Chevalier, Shirley A.
In conventional practice, most educators and educational researchers score cognitive tests using a dichotomous right-wrong scoring system. Although simple and straightforward, this method does not take into consideration other factors, such as partial knowledge or guessing tendencies and abilities. This paper discusses alternative scoring models:…
Qualitative Dimensions in Scoring the Rey Visual Memory Test of Malingering.
ERIC Educational Resources Information Center
Griffin, G. A. Elmer; And Others
1996-01-01
A new qualitative scoring system for the Rey Visual Memory Test was tested for its ability to distinguish between malingerers and nonmalingerers. The new system, based on the types of errors made, was able to distinguish between 53 psychiatrically disabled and 64 normal nonmalingerers, and between nonmalingerers and 91 possible malingerers. (SLD)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rades, Dirk, E-mail: Rades.Dirk@gmx.net; Dziggel, Liesa; Haatanen, Tiina
2011-07-15
Purpose: To create and validate scoring systems for intracerebral control (IC) and overall survival (OS) of patients irradiated for brain metastases. Methods and Materials: In this study, 1,797 patients were randomly assigned to the test (n = 1,198) or the validation group (n = 599). Two scoring systems were developed, one for IC and another for OS. The scores included prognostic factors found significant on multivariate analyses. Age, performance status, extracerebral metastases, interval tumor diagnosis to RT, and number of brain metastases were associated with OS. Tumor type, performance status, interval, and number of brain metastases were associated with IC.more » The score for each factor was determined by dividing the 6-month IC or OS rate (given in percent) by 10. The total score represented the sum of the scores for each factor. The score groups of the test group were compared with the corresponding score groups of the validation group. Results: In the test group, 6-month IC rates were 17% for 14-18 points, 49% for 19-23 points, and 77% for 24-27 points (p < 0.0001). IC rates in the validation group were 19%, 52%, and 77%, respectively (p < 0.0001). In the test group, 6-month OS rates were 9% for 15-19 points, 41% for 20-25 points, and 78% for 26-30 points (p < 0.0001). OS rates in the validation group were 7%, 39%, and 79%, respectively (p < 0.0001). Conclusions: Patients irradiated for brain metastases can be given scores to estimate OS and IC. IC and OS rates of the validation group were similar to the test group demonstrating the validity and reproducibility of both scores.« less
ERIC Educational Resources Information Center
Koppitz, Elizabeth Munsterberg
Presented is a manual for scoring the Bender Gestalt Test and the Human Figure Drawing Test for screening and diagnostic uses with emotionally disturbed, brain damaged, or perceptually handicapped 5- to 11-year-old children. Given are suggestions for administering and scoring the Bender test which examines distortion of shape, rotation,…
Karr, Justin E; Garcia-Barrera, Mauricio A; Holdnack, James A; Iverson, Grant L
2018-01-01
Multivariate base rates allow for the simultaneous statistical interpretation of multiple test scores, quantifying the normal frequency of low scores on a test battery. This study provides multivariate base rates for the Delis-Kaplan Executive Function System (D-KEFS). The D-KEFS consists of 9 tests with 16 Total Achievement scores (i.e. primary indicators of executive function ability). Stratified by education and intelligence, multivariate base rates were derived for the full D-KEFS and an abbreviated four-test battery (i.e. Trail Making, Color-Word Interference, Verbal Fluency, and Tower Test) using the adult portion of the normative sample (ages 16-89). Multivariate base rates are provided for the full and four-test D-KEFS batteries, calculated using five low score cutoffs (i.e. ≤25th, 16th, 9th, 5th, and 2nd percentiles). Low scores occurred commonly among the D-KEFS normative sample, with 82.6 and 71.8% of participants obtaining at least one score ≤16th percentile for the full and four-test batteries, respectively. Intelligence and education were inversely related to low score frequency. The base rates provided herein allow clinicians to interpret multiple D-KEFS scores simultaneously for the full D-KEFS and an abbreviated battery of commonly administered tests. The use of these base rates will support clinicians when differentiating between normal variations in cognitive performance and true executive function deficits.
Terwee, Caroline B; Mokkink, Lidwine B; Knol, Dirk L; Ostelo, Raymond W J G; Bouter, Lex M; de Vet, Henrica C W
2012-05-01
The COSMIN checklist is a standardized tool for assessing the methodological quality of studies on measurement properties. It contains 9 boxes, each dealing with one measurement property, with 5-18 items per box about design aspects and statistical methods. Our aim was to develop a scoring system for the COSMIN checklist to calculate quality scores per measurement property when using the checklist in systematic reviews of measurement properties. The scoring system was developed based on discussions among experts and testing of the scoring system on 46 articles from a systematic review. Four response options were defined for each COSMIN item (excellent, good, fair, and poor). A quality score per measurement property is obtained by taking the lowest rating of any item in a box ("worst score counts"). Specific criteria for excellent, good, fair, and poor quality for each COSMIN item are described. In defining the criteria, the "worst score counts" algorithm was taken into consideration. This means that only fatal flaws were defined as poor quality. The scores of the 46 articles show how the scoring system can be used to provide an overview of the methodological quality of studies included in a systematic review of measurement properties. Based on experience in testing this scoring system on 46 articles, the COSMIN checklist with the proposed scoring system seems to be a useful tool for assessing the methodological quality of studies included in systematic reviews of measurement properties.
The rat whole embryo culture assay using the Dysmorphology Score system.
Zhang, Cindy; Panzica-Kelly, Julie; Augustine-Rauch, Karen
2013-01-01
The rat whole embryo culture (WEC) system has been used extensively for characterizing teratogenic properties of test chemicals. In this chapter, we describe the methodology for culturing rat embryos as well as a new morphological score system, the Dysmorphology Score (DMS) system for assessing morphology of mid gestation (gestational day 11) rat embryos. In contrast to the developmental stage focused scoring associated with the Brown and Fabro score system, this new score system assesses the respective degree of severity of dysmorphology, which delineates normal from abnormal morphology of specific embryonic structures and organ systems. This score system generates an approach that allows rapid identification and quantification of adverse developmental findings, making it conducive for characterization of compounds for teratogenic properties and screening activities.
Automated essay scoring and the future of educational assessment in medical education.
Gierl, Mark J; Latifi, Syed; Lai, Hollis; Boulais, André-Philippe; De Champlain, André
2014-10-01
Constructed-response tasks, which range from short-answer tests to essay questions, are included in assessments of medical knowledge because they allow educators to measure students' ability to think, reason, solve complex problems, communicate and collaborate through their use of writing. However, constructed-response tasks are also costly to administer and challenging to score because they rely on human raters. One alternative to the manual scoring process is to integrate computer technology with writing assessment. The process of scoring written responses using computer programs is known as 'automated essay scoring' (AES). An AES system uses a computer program that builds a scoring model by extracting linguistic features from a constructed-response prompt that has been pre-scored by human raters and then, using machine learning algorithms, maps the linguistic features to the human scores so that the computer can be used to classify (i.e. score or grade) the responses of a new group of students. The accuracy of the score classification can be evaluated using different measures of agreement. Automated essay scoring provides a method for scoring constructed-response tests that complements the current use of selected-response testing in medical education. The method can serve medical educators by providing the summative scores required for high-stakes testing. It can also serve medical students by providing them with detailed feedback as part of a formative assessment process. Automated essay scoring systems yield scores that consistently agree with those of human raters at a level as high, if not higher, as the level of agreement among human raters themselves. The system offers medical educators many benefits for scoring constructed-response tasks, such as improving the consistency of scoring, reducing the time required for scoring and reporting, minimising the costs of scoring, and providing students with immediate feedback on constructed-response tasks. © 2014 John Wiley & Sons Ltd.
Self-Monitoring Assessments for Educational Accountability Systems
ERIC Educational Resources Information Center
Koretz, Daniel; Beguin, Anton
2010-01-01
Test-based accountability is now the cornerstone of U.S. education policy, and it is becoming more important in many other nations as well. Educators sometimes respond to test-based accountability in ways that produce score inflation. In the past, score inflation has usually been evaluated by comparing trends in scores on a high-stakes test to…
D.C. Student Test Scores Show Uneven Progress. Data Snapshot
ERIC Educational Resources Information Center
DuPre, Mary
2011-01-01
Over the past five years, both DC Public Schools (DCPS) and public charter schools (PCS) have seen significant growth in secondary reading and math scores on the state test known as the District of Columbia Comprehensive Assessment System (DC CAS). However, scores have not improved as much at the elementary level. Reading and math scores for DCPS…
Datta, Rakesh; Datta, Karuna; Venkatesh, M D
2015-07-01
The classical didactic lecture has been the cornerstone of the theoretical undergraduate medical education. Their efficacy however reduces due to reduced interaction and short attention span of the students. It is hypothesized that the interactive response pad obviates some of these drawbacks. The aim of this study was to evaluate the effectiveness of an interactive response system by comparing it with conventional classroom teaching. A prospective comparative longitudinal study was conducted on 192 students who were exposed to either conventional or interactive teaching over 20 classes. Pre-test, Post-test and retentions test (post 8-12 weeks) scores were collated and statistically analysed. An independent observer measured number of student interactions in each class. Pre-test scores from both groups were similar (p = 0.71). There was significant improvement in both post test scores when compared to pre-test scores in either method (p < 0.001). The interactive post-test score was better than conventional post test score (p < 0.001) by 8-10% (95% CI-difference of means - 8.2%-9.24%-10.3%). The interactive retention test score was better than conventional retention test score (p < 0.001) by 15-18% (95% CI-difference of means - 15.0%-16.64%-18.2%). There were 51 participative events in the interactive group vs 25 in the conventional group. The Interactive Response Pad method was efficacious in teaching. Students taught with the interactive method were likely to score 8-10% higher (statistically significant) in the immediate post class time and 15-18% higher (statistically significant) after 8-12 weeks. The number of student-teacher interactions increases when using the interactive response pads.
Making the Cut in Gifted Selection: Score Combination Rules and Their Impact on Program Diversity
ERIC Educational Resources Information Center
Lakin, Joni M.
2018-01-01
The recommendation of using "multiple measures" is common in policy guidelines for gifted and talented assessment systems. However, the integration of multiple test scores in a system that uses cut-scores requires choosing between different methods of combining quantitative scores. Past research has indicated that OR combination rules…
An Evaluation of the IntelliMetric[SM] Essay Scoring System
ERIC Educational Resources Information Center
Rudner, Lawrence M.; Garcia, Veronica; Welch, Catherine
2006-01-01
This report provides a two-part evaluation of the IntelliMetric[SM] automated essay scoring system based on its performance scoring essays from the Analytic Writing Assessment of the Graduate Management Admission Test[TM] (GMAT[TM]). The IntelliMetric system performance is first compared to that of individual human raters, a Bayesian system…
Consistency of Standard Setting in an Augmented State Testing System
ERIC Educational Resources Information Center
Lissitz, Robert W.; Wei, Hua
2008-01-01
In this article we address the issue of consistency in standard setting in the context of an augmented state testing program. Information gained from the external NRT scores is used to help make an informed decision on the determination of cut scores on the state test. The consistency of cut scores on the CRT across grades is maintained by forcing…
Effective Score Reporting of Non-Norm-Referenced Assessment.
ERIC Educational Resources Information Center
Haenn, Joseph F.; And Others
The Maryland State Department of Education issued a request for proposals to develop a score reporting system for the Maryland Functional Testing Program. RMC Research Corporation conducted a literature review of extant literature and developed a national survey of non-norm-referenced test score reporting practices. This comprehensive analysis of…
ERIC Educational Resources Information Center
Bennett, Randy Elliot; And Others
1990-01-01
The relationship of an expert-system-scored constrained free-response item type to multiple-choice and free-response items was studied using data for 614 students on the College Board's Advanced Placement Computer Science (APCS) Examination. Implications for testing and the APCS test are discussed. (SLD)
Characteristics of the Test Components of the IELTS Battery: Australian Trial Data.
ERIC Educational Resources Information Center
Griffin, Patrick
Results of the International English Language Testing System (IELTS) battery trials in Australia are reported. The IELTS tests of productive language skills use direct assessment strategies and subjective scoring according to detailed guidelines. The receptive skills tests use indirect assessment strategies and clerical scoring procedures.…
Klein, A A; Collier, T; Yeates, J; Miles, L F; Fletcher, S N; Evans, C; Richards, T
2017-09-01
A simple and accurate scoring system to predict risk of transfusion for patients undergoing cardiac surgery is lacking. We identified independent risk factors associated with transfusion by performing univariate analysis, followed by logistic regression. We then simplified the score to an integer-based system and tested it using the area under the receiver operator characteristic (AUC) statistic with a Hosmer-Lemeshow goodness-of-fit test. Finally, the scoring system was applied to the external validation dataset and the same statistical methods applied to test the accuracy of the ACTA-PORT score. Several factors were independently associated with risk of transfusion, including age, sex, body surface area, logistic EuroSCORE, preoperative haemoglobin and creatinine, and type of surgery. In our primary dataset, the score accurately predicted risk of perioperative transfusion in cardiac surgery patients with an AUC of 0.76. The external validation confirmed accuracy of the scoring method with an AUC of 0.84 and good agreement across all scores, with a minor tendency to under-estimate transfusion risk in very high-risk patients. The ACTA-PORT score is a reliable, validated tool for predicting risk of transfusion for patients undergoing cardiac surgery. This and other scores can be used in research studies for risk adjustment when assessing outcomes, and might also be incorporated into a Patient Blood Management programme. © The Author 2017. Published by Oxford University Press on behalf of the British Journal of Anaesthesia. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Interobserver Reliability of the Total Body Score System for Quantifying Human Decomposition.
Dabbs, Gretchen R; Connor, Melissa; Bytheway, Joan A
2016-03-01
Several authors have tested the accuracy of the Total Body Score (TBS) method for quantifying decomposition, but none have examined the reliability of the method as a scoring system by testing interobserver error rates. Sixteen participants used the TBS system to score 59 observation packets including photographs and written descriptions of 13 human cadavers in different stages of decomposition (postmortem interval: 2-186 days). Data analysis used a two-way random model intraclass correlation in SPSS (v. 17.0). The TBS method showed "almost perfect" agreement between observers, with average absolute correlation coefficients of 0.990 and average consistency correlation coefficients of 0.991. While the TBS method may have sources of error, scoring reliability is not one of them. Individual component scores were examined, and the influences of education and experience levels were investigated. Overall, the trunk component scores were the least concordant. Suggestions are made to improve the reliability of the TBS method. © 2016 American Academy of Forensic Sciences.
Integral criteria for large-scale multiple fingerprint solutions
NASA Astrophysics Data System (ADS)
Ushmaev, Oleg S.; Novikov, Sergey O.
2004-08-01
We propose the definition and analysis of the optimal integral similarity score criterion for large scale multmodal civil ID systems. Firstly, the general properties of score distributions for genuine and impostor matches for different systems and input devices are investigated. The empirical statistics was taken from the real biometric tests. Then we carry out the analysis of simultaneous score distributions for a number of combined biometric tests and primary for ultiple fingerprint solutions. The explicit and approximate relations for optimal integral score, which provides the least value of the FRR while the FAR is predefined, have been obtained. The results of real multiple fingerprint test show good correspondence with the theoretical results in the wide range of the False Acceptance and the False Rejection Rates.
ERIC Educational Resources Information Center
McLean, James E.; Kaufman, Alan S.
1995-01-01
The six Holland-based Interest Scale scores yielded by the Harrington-O'Shea Career Decision-Making System (CDM) (T. Harrington and A. O'Shea, 1982) were related to sex, race, and performance on the Kaufman Adolescent and Adult Intelligence Test for 254 adolescents and young adults. CDM scores did not relate to most of the variables studied, and…
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-23
...: Ovarian Adnexal Mass Assessment Score Test System; Availability AGENCY: Food and Drug Administration, HHS... assessment score test system into class II (special controls) under section 513(f)(2) of the Federal Food... DEPARTMENT OF HEALTH AND HUMAN SERVICES Food and Drug Administration [Docket No. FDA-2011-D-0028...
The Development of a New Method of Idiographic Measurement for Dynamic Assessment Intervention
ERIC Educational Resources Information Center
Hurley, Emma; Murphy, Raegan
2015-01-01
This paper proposes a new method of idiographic measurement for dynamic assessment (DA) intervention. There are two main methods of measurement for DA intervention; split-half tests and integrated scoring systems. Split-half tests of ability have proved useful from a research perspective. Integrated scoring systems coupled with case studies are…
Measures of Partial Knowledge and Unexpected Responses in Multiple-Choice Tests
ERIC Educational Resources Information Center
Chang, Shao-Hua; Lin, Pei-Chun; Lin, Zih-Chuan
2007-01-01
This study investigates differences in the partial scoring performance of examinees in elimination testing and conventional dichotomous scoring of multiple-choice tests implemented on a computer-based system. Elimination testing that uses the same set of multiple-choice items rewards examinees with partial knowledge over those who are simply…
Schoenberg, Mike R; Rum, Ruba S
2017-11-01
Rapid, clear and efficient communication of neuropsychological results is essential to benefit patient care. Errors in communication are a lead cause of medical errors; nevertheless, there remains a lack of consistency in how neuropsychological scores are communicated. A major limitation in the communication of neuropsychological results is the inconsistent use of qualitative descriptors for standardized test scores and the use of vague terminology. PubMed search from 1 Jan 2007 to 1 Aug 2016 to identify guidelines or consensus statements for the description and reporting of qualitative terms to communicate neuropsychological test scores was conducted. The review found the use of confusing and overlapping terms to describe various ranges of percentile standardized test scores. In response, we propose a simplified set of qualitative descriptors for normalized test scores (Q-Simple) as a means to reduce errors in communicating test results. The Q-Simple qualitative terms are: 'very superior', 'superior', 'high average', 'average', 'low average', 'borderline' and 'abnormal/impaired'. A case example illustrates the proposed Q-Simple qualitative classification system to communicate neuropsychological results for neurosurgical planning. The Q-Simple qualitative descriptor system is aimed as a means to improve and standardize communication of standardized neuropsychological test scores. Research are needed to further evaluate neuropsychological communication errors. Conveying the clinical implications of neuropsychological results in a manner that minimizes risk for communication errors is a quintessential component of evidence-based practice. Copyright © 2017 Elsevier B.V. All rights reserved.
van der Laan, Tallie M J; Postema, Sietke G; Reneman, Michiel F; Bongers, Raoul M; van der Sluis, Corry K
2018-02-10
Reliability study. Quantifying compensatory movements during work-related tasks may help to prevent musculoskeletal complaints in individuals with upper limb absence. (1) To develop a qualitative scoring system for rating compensatory shoulder and trunk movements in upper limb prosthesis wearers during the performance of functional capacity evaluation tests adjusted for use by 1-handed individuals (functional capacity evaluation-one handed [FCE-OH]); (2) to examine the interrater and intrarater reliability of the scoring system; and (3) to assess its feasibility. Movement patterns of 12 videotaped upper limb prosthesis wearers and 20 controls were analyzed. Compensatory movements were defined for each FCE-OH test, and a scoring system was developed, pilot tested, and adjusted. During reliability testing, 18 raters (12 FCE experts and 6 physiotherapists/gait analysts) scored videotapes of upper limb prosthesis wearers performing 4 FCE-OH tests 2 times (2 weeks apart). Agreement was expressed in % and kappa value. Feasibility (focus area's "acceptability", "demand," and "implementation") was determined by using a questionnaire. After 2 rounds of pilot testing and adjusting, reliability of a third version was tested. The interrater reliability for the first and second rating sessions were к = 0.54 (confidence interval [CI]: 0.52-0.57) and к = 0.64 (CI: 0.61-0.66), respectively. The intrarater reliability was к = 0.77 (CI: 0.72-0.82). The feasibility was good but could be improved by a training program. It seems possible to identify compensatory movements in upper limb prosthesis wearers during the performance of FCE-OH tests reliably by observation using the developed observational scoring system. Interrater reliability was satisfactory in most instances; intrarater reliability was good. Feasibility was established. Copyright © 2018 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.
Rosen, Jules; Mulsant, Benoit H; Marino, Patricia; Groening, Christopher; Young, Robert C; Fox, Debra
2008-10-30
Despite the importance of establishing shared scoring conventions and assessing interrater reliability in clinical trials in psychiatry, these elements are often overlooked. Obstacles to rater training and reliability testing include logistic difficulties in providing live training sessions, or mailing videotapes of patients to multiple sites and collecting the data for analysis. To address some of these obstacles, a web-based interactive video system was developed. It uses actors of diverse ages, gender and race to train raters how to score the Hamilton Depression Rating Scale and to assess interrater reliability. This system was tested with a group of experienced and novice raters within a single site. It was subsequently used to train raters of a federally funded multi-center clinical trial on scoring conventions and to test their interrater reliability. The advantages and limitations of using interactive video technology to improve the quality of clinical trials are discussed.
Accurate prediction of pregnancy viability by means of a simple scoring system.
Bottomley, Cecilia; Van Belle, Vanya; Kirk, Emma; Van Huffel, Sabine; Timmerman, Dirk; Bourne, Tom
2013-01-01
What is the performance of a simple scoring system to predict whether women will have an ongoing viable intrauterine pregnancy beyond the first trimester? A simple scoring system using demographic and initial ultrasound variables accurately predicts pregnancy viability beyond the first trimester with an area under the curve (AUC) in a receiver operating characteristic curve of 0.924 [95% confidence interval (CI) 0.900-0.947] on an independent test set. Individual demographic and ultrasound factors, such as maternal age, vaginal bleeding and gestational sac size, are strong predictors of miscarriage. Previous mathematical models have combined individual risk factors with reasonable performance. A simple scoring system derived from a mathematical model that can be easily implemented in clinical practice has not previously been described for the prediction of ongoing viability. This was a prospective observational study in a single early pregnancy assessment centre during a 9-month period. A cohort of 1881 consecutive women undergoing transvaginal ultrasound scan at a gestational age <84 days were included. Women were excluded if the first trimester outcome was not known. Demographic features, symptoms and ultrasound variables were tested for their influence on ongoing viability. Logistic regression was used to determine the influence on first trimester viability from demographics and symptoms alone, ultrasound findings alone and then from all the variables combined. Each model was developed on a training data set, and a simple scoring system was derived from this. This scoring system was tested on an independent test data set. The final outcome based on a total of 1435 participants was an ongoing viable pregnancy in 885 (61.7%) and early pregnancy loss in 550 (38.3%) women. The scoring system using significant demographic variables alone (maternal age and amount of bleeding) to predict ongoing viability gave an AUC of 0.724 (95% CI = 0.692-0.756) in the training set and 0.729 (95% CI = 0.684-0.774) in the test set. The scoring system using significant ultrasound variables alone (mean gestation sac diameter, mean yolk sac diameter and the presence of fetal heart beat) gave an AUC of 0.873 (95% CI = 0.850-0.897) and 0.900 (95% CI = 0.871-0.928) in the training and the test sets, respectively. The final scoring system using demographic and ultrasound variables together gave an AUC of 0.901 (95% CI = 0.881-0.920) and 0.924 (CI = 0.900-0.947) in the training and the test sets, respectively. After defining the cut-off at which the sensitivity is 0.90 on the training set, this model performed with a sensitivity of 0.92, specificity of 0.73, positive predictive value of 84.7% and negative predictive value of 85.4% in the test set. BMI and smoking variables were a potential omission in the data collection and might further improve the model performance if included. A further limitation is the absence of information on either bleeding or pain in 18% of women. Caution should be exercised before implementation of this scoring system prior to further external validation studies This simple scoring system incorporates readily available data that are routinely collected in clinical practice and does not rely on complex data entry. As such it could, unlike most mathematical models, be easily incorporated into normal early pregnancy care, where women may appreciate an individualized calculation of the likelihood of ongoing pregnancy viability. Research by V.V.B. supported by Research Council KUL: GOA MaNet, PFV/10/002 (OPTEC), several PhD/postdoc & fellow grants; IWT: TBM070706-IOTA3, PhD Grants; IBBT; Belgian Federal Science Policy Office: IUAP P7/(DYSCO, `Dynamical systems, control and optimization', 2012-2017). T.B. is supported by the Imperial Healthcare NHS Trust NIHR Biomedical Research Centre. Not applicable.
Graphical method for comparative statistical study of vaccine potency tests.
Pay, T W; Hingley, P J
1984-03-01
Producers and consumers are interested in some of the intrinsic characteristics of vaccine potency assays for the comparative evaluation of suitable experimental design. A graphical method is developed which represents the precision of test results, the sensitivity of such results to changes in dosage, and the relevance of the results in the way they reflect the protection afforded in the host species. The graphs can be constructed from Producer's scores and Consumer's scores on each of the scales of test score, antigen dose and probability of protection against disease. A method for calculating these scores is suggested and illustrated for single and multiple component vaccines, for tests which do or do not employ a standard reference preparation, and for tests which employ quantitative or quantal systems of scoring.
O'Grady, Anthony; Allen, David; Happerfield, Lisa; Johnson, Nicola; Provenzano, Elena; Pinder, Sarah E; Tee, Lilian; Gu, Mai; Kay, Elaine W
2010-12-01
Immunohistochemistry (IHC) is used as the frontline assay to determine HER2 status in invasive breast cancer patients. The aim of the study was to compare the performance of the Leica Oracle HER2 Bond IHC System (Oracle) with the current most readily accepted Dako HercepTest (HercepTest), using both commercially validated and modified ASCO/CAP and UK HER2 IHC scoring guidelines. A total of 445 breast cancer samples from 3 international clinical HER2 referral centers were stained with the 2 test systems and scored in a blinded fashion by experienced pathologists. The overall agreement between the 2 tests in a 3×3 (negative, equivocal and positive) analysis shows a concordance of 86.7% and 86.3%, respectively when analyzed using commercially validated and modified ASCO/CAP and UK HER2 IHC scoring guidelines. There is a good concordance between the Oracle and the HercepTest. The advantages of a complete fully automated test such as the Oracle include standardization of key analytical factors and improved turn around time. The implementation of the modified ASCO/CAP and UK HER2 IHC scoring guidelines has minimal effect on either assay interpretation, showing that Oracle can be used as a methodology for accurately determining HER2 IHC status in formalin fixed, paraffin-embedded breast cancer tissue.
ERIC Educational Resources Information Center
Green, Anthony
2007-01-01
This study investigated whether dedicated test preparation classes gave learners an advantage in improving their writing test scores. Score gains following instruction on a measure of academic writing skills--the International English Language Testing System (IELTS) academic writing test--were compared across language courses of three types; all…
Mearini, Luigi; Zucchi, Alessandro; Nunzi, Elisabetta; Di Biase, Manuel; Bini, Vittorio; Costantini, Elisabetta
2015-07-01
To date, there is no overall consensus on the definition of cure after surgery for pelvic organ prolapse (POP). The aim of the study was to design and test the scoring system S.A.C.S. (Satisfaction-Anatomy-Continence-Safety) to assess and compare the outcomes of POP repair. A total of 233 women underwent open sacrocolpopexy. The S.A.C.S. outcome scoring system was scheduled at 24 months of follow-up, and each component was detected according to: Satisfaction by mean of Patient Global Improvement Inventory scale, Anatomy by mean of POP Quantification system and bulge symptom, Continence by mean of pad use, and Safety by mean of the Clavien-Dindo classification of surgical complications. Each component produced a binary nominal categorical variable (1 or 0), with a total score of 4 representing cure. As a comparative tool, patients answered a simple yes/no question: "If you had to undergo surgery all over again, would you still do it?". The degree of concordance was estimated using Cohen's Kappa test. According to the S.A.C.S. scoring system, only 160 patients (68.6 %) reached the maximum score of cure. Sensitivity of the S.A.C.S. score was 74.1 %, specificity was 90 %, total diagnostic capacity was 75.5 %. The S.A.C.S. score internal consistency was good; the k-coefficient was higher for the satisfaction component of the score (k = 0.560). This study proposes an original, simple post-operative scoring system integrating satisfaction, anatomy, continence, and safety reports for patients undergoing surgery for POP, providing a complete, although perfectible, method to accurately report outcomes in all clinical scenarios.
Yoo, Jeong-Ju; Chung, Goh Eun; Lee, Jeong-Hoon; Nam, Joon Yeul; Chang, Young; Lee, Jeong Min; Lee, Dong Ho; Kim, Hwi Young; Cho, Eun Ju; Yu, Su Jong; Kim, Yoon Jun; Yoon, Jung-Hwan
2018-04-01
Advanced hepatocellular carcinoma (HCC) is associated with various clinical conditions including major vessel invasion, metastasis, and poor performance status. The aim of this study was to establish a prognostic scoring system and to propose a sub-classification of the Barcelona-Clinic Liver Cancer (BCLC) stage C. This retrospective study included consecutive patientswho received sorafenib for BCLC stage C HCC at a single tertiary hospital in Korea. A Cox proportional hazard model was used to develop a scoring system, and internal validationwas performed by a 5-fold cross-validation. The performance of the model in predicting risk was assessed by the area under the curve and the Hosmer-Lemeshow test. A total of 612 BCLC stage C HCC patients were sub- classified into strata depending on their performance status. Five independent prognostic factors (Child-Pugh score, α-fetoprotein, tumor type, extrahepatic metastasis, and portal vein invasion) were identified and used in the prognostic scoring system. This scoring system showed good discrimination (area under the receiver operating characteristic curve, 0.734 to 0.818) and calibration functions (both p < 0.05 by the Hosmer-Lemeshow test at 1 month and 12 months, respectively). The differences in survival among the different risk groups classified by the total score were significant (p < 0.001 by the log-rank test in both the Eastern Cooperative Oncology Group 0 and 1 strata). The heterogeneity of patientswith BCLC stage C HCC requires sub-classification of advanced HCC. A prognostic scoring system with five independent factors is useful in predicting the survival of patients with BCLC stage C HCC.
Instrumented sparring vest to aid in martial arts scoring.
Harrigan, Katie; Logan, Rachel; Sluti, Anne; Rogge, Renee
2006-01-01
Competitors in certain martial arts, such as Taekwondo, are required to wear protective vests during competition. This article outlines the design and fabrication of an instrumented martial arts sparring vest that will aid in martial arts scoring, which is currently a work in progress. After fabrication, this instrumented vest and associated system will not only provide the same protection as before modification, but will also report the location and force magnitude of strikes applied to the vest. This will aid in scoring of martial arts competitions, as it will determine if a strike is forceful enough to be considered deliberate and therefore is a valid point-scoring strike. This will make scoring of competitions unbiased and equal for all competitors, something that is difficult to achieve based solely on a judge's assessment by observation. The system will also indicate the probable injury resulting from a strike, for example, no injury, bruising or bone fracture. If a competitor's strike force is excessive and serious injury could result, the system will indicate this so action can be taken, such as penalty or disqualification of a competitor. Both tissue testing and force testing will be conducted prior to vest fabrication to determine estimates of forces that will damage tissue and typical forces experienced during competition. After testing is complete, the system will be fabricated and the testing results will be implemented into the associated software.
Normative data for the Clock Drawing Test for French-Quebec mid- and older aged healthy adults.
Turcotte, Valérie; Gagnon, Marie-Eve; Joubert, Sven; Rouleau, Isabelle; Gagnon, Jean-François; Escudier, Frédérique; Koski, Lisa; Potvin, Olivier; Macoir, Joël; Hudon, Carol
2018-05-09
The Clock Drawing Test (CDT) is frequently used to screen for cognitive impairment, however, normative data for Rouleau et al.'s scoring system are scarce. The present study aims to provide norms for Rouleau et al.'s scoring system that are tailored to Quebec French-speaking mid- and older aged healthy adults. Six researchers from various research centers across the Province of Quebec (Canada) sent anonymous data for 593 (391 women) healthy community-dwelling volunteers (age range: 43-93 years; education range: 5-23 years) who completed the CDT 'drawing on command' version. This command version (setting the clock hands to 11:10, without a pre-drawn circle) was administrated as part of a more extensive neuropsychological assessment, or along with cognitive screening instruments. Each drawn clock was scored according to the quantitative criteria set by Rouleau et al.'s scoring system. CDT scores were significantly correlated with age (r(592) = -.132, p = .001) and years of education (r(592) = .116, p = .005), but not with sex (r(592) = .065, p = .112). Since data were skewed towards higher test scores, the percentiles method was used for analysis. Percentile ranks stratified by age and education are presented. These normative data for Rouleau et al.'s scoring system will contribute towards adequately screening for cognitive decline in Quebec French-speaking healthy adults, by also taking into account individual characteristics such as age and education.
TOEFL iBT Speaking Test Scores as Indicators of Oral Communicative Language Proficiency
ERIC Educational Resources Information Center
Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela
2012-01-01
Scores assigned by trained raters and by an automated scoring system (SpeechRater[TM]) on the speaking section of the TOEFL iBT[TM] were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…
Fulks, Michael; Stout, Robert L; Dolan, Vera F
2012-01-01
Evaluate the degree of medium to longer term mortality prediction possible from a scoring system covering all laboratory testing used for life insurance applicants, as well as blood pressure and build measurements. Using the results of testing for life insurance applicants who reported a Social Security number in conjunction with the Social Security Death Master File, the mortality associated with each test result was defined by age and sex. The individual mortality scores for each test were combined for each individual and a composite mortality risk score was developed. This score was then tested against the insurance applicant dataset to evaluate its ability to discriminate risk across age and sex. The composite risk score was highly predictive of all-cause mortality risk in a linear manner from the best to worst quintile of scores in a nearly identical fashion for each sex and decade of age. Laboratory studies, blood pressure and build from life insurance applicants can be used to create scoring that predicts all-cause mortality across age and sex. Such an approach may hold promise for preventative health screening as well.
Clock Drawing as a Screen for Impaired Driving in Aging and Dementia: Is It Worth the Time?
Manning, Kevin J.; Davis, Jennifer D.; Papandonatos, George D.; Ott, Brian R.
2014-01-01
Clock drawing is recommended by medical and transportation authorities as a screening test for unsafe drivers. The objective of the present study was to assess the usefulness of different clock drawing systems as screening measures of driving performance in 122 healthy and cognitively impaired older drivers. Clock drawing was measured using four different scoring systems. Driving outcomes included global ratings of safety and the error rate on a standardized on-road test. Findings revealed that clock drawing was significantly correlated with the driving score on the road test for each of the scoring systems. However, receiver operator curve analyses showed limited clinical utility for clock drawing as a screening instrument for impaired on-road driving performance with the area under the curve ranging from 0.53 to 0.61. Results from this study indicate that clock drawing has limited utility as a solitary screening measure of on-road driving, even when considering a variety of scoring approaches. PMID:24296110
Clock drawing as a screen for impaired driving in aging and dementia: is it worth the time?
Manning, Kevin J; Davis, Jennifer D; Papandonatos, George D; Ott, Brian R
2014-02-01
Clock drawing is recommended by medical and transportation authorities as a screening test for unsafe drivers. The objective of the present study was to assess the usefulness of different clock drawing systems as screening measures of driving performance in 122 healthy and cognitively impaired older drivers. Clock drawing was measured using four different scoring systems. Driving outcomes included global ratings of safety and the error rate on a standardized on-road test. Findings revealed that clock drawing was significantly correlated with the driving score on the road test for each of the scoring systems. However, receiver operator curve analyses showed limited clinical utility for clock drawing as a screening instrument for impaired on-road driving performance with the area under the curve ranging from 0.53 to 0.61. Results from this study indicate that clock drawing has limited utility as a solitary screening measure of on-road driving, even when considering a variety of scoring approaches.
Hilton, C; Fisher, W; Lopez, A; Sanders, C
1997-09-01
To design and test a simple, easily modifiable system for calculating faculty productivity in teaching, research, administration, and patient care in which all areas of endeavor would be recognized and high productivity in one area would produce results similar to high productivity in another at the Louisiana State University School of Medicine in New Orleans. A relative-value and time-based system was designed in 1996 so that similar efforts in the four areas would produce similar scores, and a profile reflecting the authors' estimates of high productivity ("super faculty") was developed for each area. The activity profiles of 17 faculty members were used to test the system. "Super-faculty" scores in all areas were similar. The faculty members' mean scores were higher for teaching and research than for administration and patient care, and all four mean scores were substantially lower than the respective totals for the "super faculty". In each category the scores of those faculty members who scored above the mean in that category were used to calculate new mean scores. The mean scores for these faculty members were similar to those for the "super faculty" in teaching and research but were substantially lower for administration and patient care. When the mean total score of the eight faculty members predicted to have total scores below the group mean was compared with the mean total score of the nine faculty members predicted to have total scores above the group mean, the difference was significant (p < .0001). For the former, every score in each category was below the mean, with the exception of one faculty member's score in one category. Of the latter, eight had higher scores in teaching and four had higher scores in teaching and research combined. This system provides a quantitative method for the equal recognition of faculty productivity in a number of areas, and it may be useful as a starting point for other academic units exploring similar issues.
Student Tests for Teacher Evaluation: A Critique.
ERIC Educational Resources Information Center
Florio, David H.
1986-01-01
This article supports Edward Haertel's views on inappropriate use of student test scores in evaluating teachers. Tests scores may identify a few incompetent teachers, but may bring new ailments to schools. The article argues that even the system proposed by Haertal may become subject to abuse by mechanistic or autocratic administrative practices.…
Monitoring the Performance of Human and Automated Scores for Spoken Responses
ERIC Educational Resources Information Center
Wang, Zhen; Zechner, Klaus; Sun, Yu
2018-01-01
As automated scoring systems for spoken responses are increasingly used in language assessments, testing organizations need to analyze their performance, as compared to human raters, across several dimensions, for example, on individual items or based on subgroups of test takers. In addition, there is a need in testing organizations to establish…
2011-12-30
The Food and Drug Administration (FDA) is amending the regulation classifying ovarian adnexal mass assessment score test systems to restrict these devices so that a prescribed warning statement that addresses a risk identified in the special controls guidance document must be in a black box and must appear in all labeling, advertising, and promotional material. The black box warning mitigates the risk to health associated with off-label use as a screening test, stand-alone diagnostic test, or as a test to determine whether or not to proceed with surgery.
ERIC Educational Resources Information Center
Thum, Yeow Meng; Bhattacharya, Suman Kumar
To better describe individual behavior within a system, this paper uses a sample of longitudinal test scores from a large urban school system to consider hierarchical Bayes estimation of a multilevel linear regression model in which each individual regression slope of test score on time switches at some unknown point in time, "kj."…
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-23
... yield a single result for the likelihood that an adnexal pelvic mass in a woman is malignant. Such a... test system measures one or more analytes in serum and combines the values into a single score that is then used to determine the likelihood that the pre-surgical adnexal mass in a woman not yet referred to...
The Scorer Reliability of Self-Scored Interest Inventories.
ERIC Educational Resources Information Center
O'Shea, Arthur J.; Harrington, Thomas F.
1980-01-01
Describes the procedures the authors of the System for Career Decision-Making (CDM) followed in establishing client scoring reliability. Authors recommend that manuals of self-scored inventories provide data establishing scorer reliability, that scoring be supervised, and that APGA test standards deal directly with scorer reliability. (Author)
A diagnostic scoring system for myxedema coma.
Popoveniuc, Geanina; Chandra, Tanu; Sud, Anchal; Sharma, Meeta; Blackman, Marc R; Burman, Kenneth D; Mete, Mihriye; Desale, Sameer; Wartofsky, Leonard
2014-08-01
To develop diagnostic criteria for myxedema coma (MC), a decompensated state of extreme hypothyroidism with a high mortality rate if untreated, in order to facilitate its early recognition and treatment. The frequencies of characteristics associated with MC were assessed retrospectively in patients from our institutions in order to derive a semiquantitative diagnostic point scale that was further applied on selected patients whose data were retrieved from the literature. Logistic regression analysis was used to test the predictive power of the score. Receiver operating characteristic (ROC) curve analysis was performed to test the discriminative power of the score. Of the 21 patients examined, 7 were reclassified as not having MC (non-MC), and they were used as controls. The scoring system included a composite of alterations of thermoregulatory, central nervous, cardiovascular, gastrointestinal, and metabolic systems, and presence or absence of a precipitating event. All 14 of our MC patients had a score of ≥60, whereas 6 of 7 non-MC patients had scores of 25 to 50. A total of 16 of 22 MC patients whose data were retrieved from the literature had a score ≥60, and 6 of 22 of these patients scored between 45 and 55. The odds ratio per each score unit increase as a continuum was 1.09 (95% confidence interval [CI], 1.01 to 1.16; P = .019); a score of 60 identified coma, with an odds ratio of 1.22. The area under the ROC curve was 0.88 (95% CI, 0.65 to 1.00), and the score of 60 had 100% sensitivity and 85.71% specificity. A score ≥60 in the proposed scoring system is potentially diagnostic for MC, whereas scores between 45 and 59 could classify patients at risk for MC.
Tong, L; Ang, A; Vernon, S; Zambarakji, H; Bhan, A; Sung, V; Page, S
2001-01-01
AIM—To assess the use of the Heidelberg retina tomograph (HRT) in screening for sight threatening diabetic macular oedema in a hospital diabetic clinic, using a new subjective analysis system (SCORE). METHODS—200 eyes of 100 consecutive diabetic patients attending a diabetologist's clinic were studied, all eyes had an acuity of 6/9 or better. All patients underwent clinical examination by an ophthalmologist. Using the HRT, one good scan was obtained for each eye centred on the fovea. A System for Classification and Ordering of Retinal Edema (SCORE) was developed using subjective assessment of the colour map and the reflectivity image. The interobserver agreement of using this method to detect macular oedema was assessed by two observers (ophthalmic trainees) who were familiarised with SCORE by studying standard pictures of eyes not in the study. All scans were graded from 0-6 and test positive cases were defined as having a SCORE value of 0-2. The sensitivity of SCORE was assessed by pooling the data with an additional 88 scans of 88 eyes in order to reduce the confidence interval of the index. RESULTS—12 eyes in eight out of the 100 patients had macular oedema clinically. Three scans in three patients could not be analysed because of poor scan quality. In the additional group of scans 76 out of 88 eyes had macular oedema clinically. The scoring system had a specificity of 99% (95% CI 96-100) and sensitivity of 67% (95% CI 57-76). The predictive value of a negative test was 87% (95% CI 82-99), and that of a positive test was 95% (95% CI 86-99). The mean difference of the SCORE value between two observers was -0.2 (95% CI -0.5 to +0.07). CONCLUSIONS—These data suggest that SCORE is potentially useful for detecting diabetic macular oedema in hospital diabetic patients. PMID:11133709
Andrade-Souza, Yuri M; Zadeh, Gelareh; Ramani, Meera; Scora, Daryl; Tsao, May N; Schwartz, Michael L
2005-10-01
The aim of this study was to validate the radiosurgery-based arteriovenous malformation (AVM) score and the modified Spetzler-Martin grading system to predict radiosurgical outcome. One hundred thirty-six patients with brain AVMs were randomly selected. These patients had undergone a linear accelerator radiosurgical procedure at a single center between 1989 and 2000. Patients were divided into four groups according to an AVM score, which was calculated from the lesion volume, lesion location, and patient age (Group 1, AVM score <1; Group 2, AVM score 1-1.49; Group 3, AVM score 1.5-2; and Group 4, AVM score >2). Patients with a Spetzler-Martin Grade III AVM were divided into Grades IIIA (lesion >3 cm) and IIIB (lesion <3 cm). Sixty-two female (45.6%) and 74 male (54.4%) patients with a median age of 37.5 years (mean 37.5 years, range 5-77 years) were followed up for a median of 40 months. The median tumor margin dose was 15 Gy (mean 17.23 Gy, range 15-25 Gy). The proportions of excellent outcomes according to the AVM score were as follows: 91.7% for Group 1, 74.1% for Group 2, 60% for Group 3, and 33.3% for Group 4 (chi-square test, degrees of freedom (df) = 3, p < 0.001). Based on the modified Spetzler-Martin system, Grade I lesions had 88.9% excellent results; Grade II, 69.6%; Grade IIIB, 61.5%; and Grades IIIA and IV, 44.8% (chi-square test, df = 3, p = 0.047). The radiosurgery-based AVM score can be used accurately to predict excellent results following a single radiosurgical treatment for AVM. The modified Spetzler-Martin system can also predict radiosurgical results for AVMs, thus making it possible to use this system while deciding between surgery and radiosurgery.
Everard, Eoin; Lyons, Mark; Harrison, Andrew J
2018-06-01
To examine the association of injury with the Functional Movement Screen (FMS) and Landing Error Scoring System (LESS) in military recruits undergoing an intensive 16-week training block. Prospective cohort study. One hundred and thirty-two entry-level male soldiers (18-25years) were tested using the FMS and LESS. The participants underwent an intensive 16-week training program with injury data recorded daily. Chi-squared statistics were used to examine associations between injury risk and (1) poor LESS scores, (2) any score of 1 on the FMS and (3) composite FMS score of ≤14. A composite FMS score of ≤14 was not a significant predictor of injury. LESS scores of >5 and having a score of 1 on any FMS test were significantly associated with injury. LESS scores had greater relative risk, sensitivity and specificity (2.2 (95% CI=1.48-3.34); 71% and 87% respectively) than scores of 1 on the FMS (relative risk=1.32 (95% CI=1.0-1.7); sensitivity=50% and specificity=76%). There was no association between composite FMS score and injury but LESS scores and scores of 1 in the FMS test were significantly associated with injury in varying degrees. LESS scores had a much better association with injury than both any scores of 1 on the FMS and a combination of LESS scores and scores of 1 on the FMS. Furthermore, the LESS provides comparable information related to injury risk as other well-established markers associated with injury such as age, muscular strength and previous injury. Copyright © 2017. Published by Elsevier Ltd.
Vingerhoets, Johan; Nijs, Steven; Tambuyzer, Lotke; Hoogstoel, Annemie; Anderson, David; Picchio, Gaston
2012-01-01
The aims of this study were to compare various genotypic scoring systems commonly used to predict virological outcome to etravirine, and examine their concordance with etravirine phenotypic susceptibility. Six etravirine genotypic scoring systems were assessed: Tibotec 2010 (based on 20 mutations; TBT 20), Monogram, Stanford HIVdb, ANRS, Rega (based on 37, 30, 27 and 49 mutations, respectively) and virco(®)TYPE HIV-1 (predicted fold change based on genotype). Samples from treatment-experienced patients who participated in the DUET trials and with both genotypic and phenotypic data (n=403) were assessed using each scoring system. Results were retrospectively correlated with virological response in DUET. κ coefficients were calculated to estimate the degree of correlation between the different scoring systems. Correlation between the five scoring systems and the TBT 20 system was approximately 90%. Virological response by etravirine susceptibility was comparable regardless of which scoring system was utilized, with 70-74% of DUET patients determined as susceptible to etravirine by the different scoring systems achieving plasma viral load <50 HIV-1 RNA copies/ml. In samples classed as phenotypically susceptible to etravirine (fold change in 50% effective concentration ≤3), correlations with genotypic score were consistently high across scoring systems (≥70%). In general, the etravirine genotypic scoring systems produced similar results, and genotype-phenotype concordance was high. As such, phenotypic interpretations, and in their absence all genotypic scoring systems investigated, may be used to reliably predict the activity of etravirine.
Comparing usability testing outcomes and functions of six electronic nursing record systems.
Cho, Insook; Kim, Eunman; Choi, Woan Heui; Staggers, Nancy
2016-04-01
This study examined the usability of six differing electronic nursing record (ENR) systems on the efficiency, proficiency and available functions for documenting nursing care and subsequently compared the results to nurses' perceived satisfaction from a previous study. The six hospitals had different ENR systems, all with narrative nursing notes in use for more than three years. Stratified by type of nursing unit, 54 staff nurses were digitally recorded during on-site usability testing by employing validated patient care scenarios and think-aloud protocols. The time to complete specific tasks was also measured. Qualitative performance data were converted into scores on efficiency (relevancy), proficiency (accuracy), and a competency index using scoring schemes described by McGuire and Babbott. Six nurse managers and the researchers completed assessments of available ENR functions and examined computerized nursing process components including the linkages among them. For the usability test, participants' mean efficiency score was 94.2% (95% CI, 91.4-96.9%). The mean proficiency was 60.6% (95% CI, 54.3-66.8%), and the mean competency index was 59.5% (95% CI, 52.9-66.0). Efficiency scores were significantly different across ENRs as was the time to complete tasks, ranging from 226.3 to 457.2s (χ(2)=12.3, P=0.031; χ(2)=11.2, P=0.048). No significant differences were seen for proficiency scores. The coverage of the various ENRs' nursing process ranged from 67% to 100%, but only two systems had complete integration of nursing components. Two systems with high efficiency and proficiency scores had much lower usability test scores and perceived user satisfaction along with more complex navigation patterns. In terms of system usability and functions, different levels of sophistication of and interaction performance with ENR systems exist in practice. This suggests that ENRs may have variable impacts on clinical outcomes and care quality. Future studies are needed to explore ENR impact on nursing care quality, efficiency, and safety. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Warren, Ruth M L; Thompson, Deborah; Pointon, Linda J; Hoff, Rebecca; Gilbert, Fiona J; Padhani, Anwar R; Easton, Douglas F; Lakhani, Sunil R; Leach, Martin O
2006-06-01
To evaluate prospectively the accuracy of a lesion classification system designed for use in a magnetic resonance (MR) imaging high-breast-cancer-risk screening study. All participating patients provided written informed consent. Ethics committee approval was obtained. The results of 1541 contrast material-enhanced breast MR imaging examinations were analyzed; 1441 screening examinations were performed in 638 women aged 24-51 years at high risk for breast cancer, and 100 examinations were performed in 100 women aged 23-81 years. Lesion analysis was performed in 991 breasts, which were divided into design (491 breasts) and testing (500 breasts) sets. The reference standard was histologic analysis of biopsy samples, fine-needle aspiration cytology, or minimal follow-up of 24 months. The scoring system involved the use of five features: morphology (MOR), pattern of enhancement (POE), percentage of maximal focal enhancement (PMFE), maximal signal intensity-time ratio (MITR), and pattern of contrast material washout (POCW). The system was evaluated by means of (a) assessment of interreader agreement, as expressed in kappa statistics, for 315 breasts in which both readers analyzed the same lesion, (b) assessment of the diagnostic accuracy of the scored components with receiver operating characteristic curve analysis, and (c) logistic regression analysis to determine which components of the scoring system were critical to the final score. A new simplified scoring system developed with the design set was applied to the testing set. There was moderate reader agreement regarding overall lesion outcome (ie, malignant, suspicious, or benign) (kappa=0.58) and less agreement regarding the scored components. The area under the receiver operating characteristic curve (AUC) for the overall lesion score, 0.88, was higher than the AUC for any one component. The components MOR, POE, and POCW yielded the best overall result. PMFE and MITR did not contribute to diagnostic utility. Applying a simplified scoring system to the testing set yielded a nonsignificantly (P=.2) higher AUC than did applying the original scoring system (sensitivity, 84%; specificity, 86.0%). Good diagnostic accuracy can be achieved by using simple qualitative descriptors of lesion enhancement, including POCW. In the context of screening, quantitative enhancement parameters appear to be less useful for lesion characterization. Copyright (c) RSNA, 2006.
A Milestone-Based Evaluation System-The Cure for Grade Inflation?
Kuo, Lindsay E; Hoffman, Rebecca L; Morris, Jon B; Williams, Noel N; Malachesky, Mark; Huth, Laura E; Kelz, Rachel R
2015-01-01
Controversy exists over the optimal use of the Milestones in the process of resident evaluation and feedback. We sought to evaluate the performance of a Milestones-based feedback system in comparison to a traditional model. The traditional evaluation system (TES) consisted of a generic 16-item survey using a 5-point Likert scale ranging from 1 to 5, and a free-text comments section. The Milestones-based evaluation system (MBES) was launched in July 2014, ranging from 0 to 4. Individual milestones were mapped to rotations based on resident educational goals by postgraduate year (PGY). The MBES consisted of a survey with a maximum of 7 items, followed by a free-text comment section. Within each evaluation system, an overall composite score was calculated for each categorical general surgical resident. To scale the 2 systems for comparison, TES scores were adjusted downward by 1 point. Descriptive statistics were performed. Univariate analysis was performed with the Wilcoxon signed-rank test. A test for trend across PGY was used for the MBES only. In the traditional system, the median score was 3.66 (range: 3.2-4.0). There was no meaningful difference in the median score by PGY. In the new system, the median score was 2.69 (range: 1.5-3.7, p < 0.01). The median score differed across PGY and increased by PGY of training (p < 0.01). There was an increase in differences between median scores by PGY. On using the milestones to facilitate faculty evaluation of resident knowledge and skill, there was a trend in increasing score by PGY of training. In the MBES, scores could be used to better discriminate resident skill and knowledge levels and resulted in improved differentiation in scoring by PGY. The use of the milestones as a basis for evaluation enabled the program to provide more meaningful feedback to residents and represents an improvement in surgical education. Copyright © 2015 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Analysis of 2009-10 WCPSS SAT Scores. Measuring Up. E&R Report No. 10.25
ERIC Educational Resources Information Center
Holdzkom, David; Gilleland, Kevin
2010-01-01
Wake County Public School System (WCPSS) students continue to fare well on the SAT test as compared with students in the state and nation. While there was a decline in average test scores in 2009-10 as compared with the prior year, the posted scores continue a trend of measurable improvement over time. Over the past 20 years, the average SAT…
Azad, Aftab Mohammad; Al Juma, Saad; Bhatti, Junaid Ahmad; Delaney, J Scott
2016-01-01
Balance testing is an important part of the initial concussion assessment. There is no research on the differences in Modified Balance Error Scoring System (M-BESS) scores when tested in real world as compared to control conditions. To assess the difference in M-BESS scores in athletes wearing their protective equipment and cleats on different surfaces as compared to control conditions. This cross-sectional study examined university North American football and soccer athletes. Three observers independently rated athletes performing the M-BESS test in three different conditions: (1) wearing shorts and T-shirt in bare feet on firm surface (control); (2) wearing athletic equipment with cleats on FieldTurf; and (3) wearing athletic equipment with cleats on firm surface. Mean M-BESS scores were compared between conditions. 60 participants were recruited: 39 from football (all males) and 21 from soccer (11 males and 10 females). Average age was 21.1 years (SD=1.8). Mean M-BESS scores were significantly lower (p<0.001) for cleats on FieldTurf (mean=26.3; SD=2.0) and for cleats on firm surface (mean=26.6; SD=2.1) as compared to the control condition (mean=28.4; SD=1.5). Females had lower scores than males for cleats on FieldTurf condition (24.9 (SD=1.9) vs 27.3 (SD=1.6), p=0.005). Players who had taping or bracing on their ankles/feet had lower scores when tested with cleats on firm surface condition (24.6 (SD=1.7) vs 26.9 (SD=2.0), p=0.002). Total M-BESS scores for athletes wearing protective equipment and cleats standing on FieldTurf or a firm surface are around two points lower than M-BESS scores performed on the same athletes under control conditions.
Azad, Aftab Mohammad; Al Juma, Saad; Bhatti, Junaid Ahmad; Delaney, J Scott
2016-01-01
Background Balance testing is an important part of the initial concussion assessment. There is no research on the differences in Modified Balance Error Scoring System (M-BESS) scores when tested in real world as compared to control conditions. Objective To assess the difference in M-BESS scores in athletes wearing their protective equipment and cleats on different surfaces as compared to control conditions. Methods This cross-sectional study examined university North American football and soccer athletes. Three observers independently rated athletes performing the M-BESS test in three different conditions: (1) wearing shorts and T-shirt in bare feet on firm surface (control); (2) wearing athletic equipment with cleats on FieldTurf; and (3) wearing athletic equipment with cleats on firm surface. Mean M-BESS scores were compared between conditions. Results 60 participants were recruited: 39 from football (all males) and 21 from soccer (11 males and 10 females). Average age was 21.1 years (SD=1.8). Mean M-BESS scores were significantly lower (p<0.001) for cleats on FieldTurf (mean=26.3; SD=2.0) and for cleats on firm surface (mean=26.6; SD=2.1) as compared to the control condition (mean=28.4; SD=1.5). Females had lower scores than males for cleats on FieldTurf condition (24.9 (SD=1.9) vs 27.3 (SD=1.6), p=0.005). Players who had taping or bracing on their ankles/feet had lower scores when tested with cleats on firm surface condition (24.6 (SD=1.7) vs 26.9 (SD=2.0), p=0.002). Conclusions Total M-BESS scores for athletes wearing protective equipment and cleats standing on FieldTurf or a firm surface are around two points lower than M-BESS scores performed on the same athletes under control conditions. PMID:27900181
Booker, Simon; Alfahad, Nawaf; Scott, Martin; Gooding, Ben; Wallace, W Angus
2015-01-01
To investigate shoulder scoring systems used in Europe and North America and how outcomes might be classified after shoulder joint replacement. All research papers published in four major journals in 2012 and 2013 were reviewed for the shoulder scoring systems used in their published papers. A method of identifying how outcomes after shoulder arthroplasty might be used to categorize patients into fair, good, very good and excellent outcomes was explored using the outcome evaluations from patients treated in our own unit. A total of 174 research articles that were published in the four journals used some form of shoulder scoring system. The outcome from shoulder arthroplasty in our unit has been evaluated using the constant score (CS) and the oxford shoulder score and these scores have been used to evaluate individual patient outcomes. CSs of < 30 = unsatisfactory; 30-39 = fair; 40-59 = good; 60-69 = very good; and 70 and over = excellent. The most popular shoulder scoring systems in North America were Simple Shoulder Test and American shoulder and elbow surgeons standard shoulder assessment form score and in Europe CS, Oxford Shoulder Score and DASH score. PMID:25793164
Can dual processing theory explain physics students' performance on the Force Concept Inventory?
NASA Astrophysics Data System (ADS)
Wood, Anna K.; Galloway, Ross K.; Hardy, Judy
2016-12-01
According to dual processing theory there are two types, or modes, of thinking: system 1, which involves intuitive and nonreflective thinking, and system 2, which is more deliberate and requires conscious effort and thought. The Cognitive Reflection Test (CRT) is a widely used and robust three item instrument that measures the tendency to override system 1 thinking and to engage in reflective, system 2 thinking. Each item on the CRT has an intuitive (but wrong) answer that must be rejected in order to answer the item correctly. We therefore hypothesized that performance on the CRT may give useful insights into the cognitive processes involved in learning physics, where success involves rejecting the common, intuitive ideas about the world (often called misconceptions) and instead carefully applying physical concepts. This paper presents initial results from an ongoing study examining the relationship between students' CRT scores and their performance on the Force Concept Inventory (FCI), which tests students' understanding of Newtonian mechanics. We find that a higher CRT score predicts a higher FCI score for both precourse and postcourse tests. However, we also find that the FCI normalized gain is independent of CRT score. The implications of these results are discussed.
ERIC Educational Resources Information Center
Evans, Richard M.; Surkan, Alvin J.
The recent arrival of portable computer systems with high-level language interpreters now makes it practical to rapidly develop complex testing and scoring programs. These programs permit undergraduates access, at arbitrary times, to testing as an integral part of a mastery learning strategy. Effects of introducing the computer were studied by…
ERIC Educational Resources Information Center
Schochet, Peter Z.; Chiang, Hanley S.
2010-01-01
This paper addresses likely error rates for measuring teacher and school performance in the upper elementary grades using value-added models applied to student test score gain data. Using realistic performance measurement system schemes based on hypothesis testing, we develop error rate formulas based on OLS and Empirical Bayes estimators.…
Assessing Associative Distance among Ideas Elicited by Tests of Divergent Thinking
ERIC Educational Resources Information Center
Acar, Selcuk; Runco, Mark A.
2014-01-01
Tests of divergent thinking represent the most commonly used assessment of creative potential. Typically they are scored for total ideational output (fluency), ideational originality, and, sometimes, ideational flexibility. That scoring system provides little information about the underlying process and about the associations among ideas. It also…
Federal Register 2010, 2011, 2012, 2013, 2014
2010-09-30
... Automation, Inc. (``Amistar'') of San Marcos, California; Techno Soft Systemnics, Inc. (``Techno Soft'') of... the claim terms ``test,'' ``match score surface,'' and ``gradient direction,'' all of his infringement... complainants' proposed construction for the claim terms ``test,'' ``match score surface,'' and ``gradient...
The Emphasis of Student Test Scores in Teacher Appraisal Systems
ERIC Educational Resources Information Center
Smith, William C.; Kubacka, Katarzyna
2017-01-01
Over the past 30 years teachers have been held increasingly accountable for the quality of education in their classroom. During this transition, the line between teacher appraisals, traditionally an instrument for continuous formative teacher feedback, and summative teacher evaluations has blurred. Student test scores, as an "objective"…
Mathiasen, Ross; Hogrefe, Christopher; Harland, Kari; Peterson, Andrew; Smoot, M Kyle
2018-02-15
The Balance Error Scoring System (BESS) is a commonly used concussion assessment tool. Recent studies have questioned the stability and reliability of baseline BESS scores. The purpose of this longitudinal prospective cohort study is to examine differences in yearly baseline BESS scores in athletes participating on an NCAA Division-I football team. NCAA Division-I freshman football athletes were videotaped performing the BESS test at matriculation and after 1 year of participation in the football program. Twenty-three athletes were enrolled in year 1 of the study, and 25 athletes were enrolled in year 2. Those athletes enrolled in year 1 were again videotaped after year 2 of the study. The paired t-test was used to assess for change in score over time for the firm surface, foam surface, and the cumulative BESS score. Additionally, inter- and intrarater reliability values were calculated. Cumulative errors on the BESS significantly decreased from a mean of 20.3 at baseline to 16.8 after 1 year of participation. The mean number of errors following the second year of participation was 15.0. Inter-rater reliability for the cumulative score ranged from 0.65 to 0.75. Intrarater reliability was 0.81. After 1 year of participation, there is a statistically and clinically significant improvement in BESS scores in an NCAA Division-I football program. Although additional improvement in BESS scores was noted after a second year of participation, it did not reach statistical significance. Football athletes should undergo baseline BESS testing at least yearly if the BESS is to be optimally useful as a diagnostic test for concussion.
Evaluating Pekin duck walking ability using a treadmill performance test.
Byrd, C J; Main, R P; Makagon, M M
2016-10-01
Gait scoring is the most popular method for assessing the walking ability of poultry species. Although inexpensive and easy to implement, gait scoring systems are often criticized for being subjective. Using a treadmill performance test we assessed whether observable differences in Pekin duck walking ability identified using a gait scoring system translated to differences in walking performance. One hundred and eighty ducks were selected using a three-category gait scoring system (GS0 = smooth gait, n = 55; GS0.5 = labored walk without easily identifiable impediment, n = 56; GS1 = obvious impediment, n = 59) and the amount of time each duck was able to sustain walking on a treadmill at a speed of 0.31 m/s was evaluated. The walking test ended when each duck met one of three elimination criteria: (1) The duck walked for a maximum time of ten minutes, (2) the duck required support from the observer's hand for more than three seconds in order to continue walking on the treadmill, or (3) the duck sat down on the treadmill and made no attempt to stand despite receiving assistance from the observer. Data were analyzed in SAS 9.4 using PROC GLM. Tukey's multiple comparison test was used to compare differences in time spent walking between gait scores. Significant differences were found between all gait scores (P < 0.05). Behavioral correlates of walking performance were investigated. Video recorded during the treadmill test was analyzed for counts of sitting, standing, and leaning behaviors. Data were analyzed in SAS 9.4 using a negative binomial model for count data. No differences were found between gait scores for counts of sitting, standing, and leaning behaviors (P > 0.05). In conclusion, the amount of time spent walking on the treadmill corresponded to gait score and was an effective measurement for quantifying Pekin duck walking ability. The test could be a valuable tool for assessing the development of walking issues or the effectiveness of treatments aimed at promoting leg health. © 2016 Poultry Science Association Inc.
Patterson, Brendan M; Orvets, Nathan D; Aleem, Alexander W; Keener, Jay D; Calfee, Ryan P; Nixon, Devon C; Chamberlain, Aaron M
2018-06-01
The Patient-Reported Outcomes Measurement Information System (PROMIS) is being used to assess outcomes in many patient populations despite limited validation. The purpose of this study was to investigate the relationship between American Shoulder and Elbow Surgeons (ASES) and Simple Shoulder Test (SST) scores and PROMIS Physical Function (PF) and Upper Extremity (UE) function scores collected preoperatively in patients undergoing rotator cuff repair. This cross-sectional study analyzed 164 consecutive patients undergoing arthroscopic rotator cuff repair. Study inclusion required preoperative completion of the ASES and SST evaluations, as well as the PROMIS PF, UE, and Pain Interference computerized adaptive tests. Descriptive statistics were produced, and Pearson correlation coefficients were calculated between each of the outcome measures. Average PROMIS UE scores indicated greater impairment than PROMIS PF scores (34 vs 44). Three percent of patients reached the PROMIS UE ceiling score of 56. PROMIS PF scores demonstrated a weak correlation with ASES scores (r = 0.43, P < .001) and a moderate correlation with SST scores (r = 0.51, P < .001). PROMIS UE scores demonstrated a moderate correlation with both ASES scores (r = 0.59, P < .001) and SST scores (r = 0.62, P < .001). PROMIS Pain Interference scores demonstrated weak negative correlations with both ASES scores (r = -0.43, P < .001) and SST scores (r = -0.41, P < .001). Patients answered fewer questions on average using the PROMIS PF and UE instruments as compared with the ASES and SST instruments. PROMIS UE scores indicate greater impairment and demonstrate a stronger correlation with the legacy shoulder scores than PROMIS PF scores in patients with symptomatic rotator cuff tears. PROMIS computerized adaptive tests allow for more efficient patient-reported outcome data collection compared with traditional outcome scores. Copyright © 2018 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
Taylor, K; Parashar, D; Bouverat, G; Poulos, A; Gullien, R; Stewart, E; Aarre, R; Crystal, P; Wallis, M
2017-11-01
Optimum mammography positioning technique is necessary to maximise cancer detection. Current criteria for mammography appraisal lack reliability and validity with a need to develop a more objective system. We aimed to establish current international practice in assessing image quality (IQ), of screening mammograms then develop and validate a reproducible assessment tool. A questionnaire sent to centres in countries undertaking population screening identified practice, participants for an expert panel (EP) of radiologists/radiographers and a testing panel (TP) of radiographers. The EP developed category criteria and descriptors using a modified Delphi process to agree definitions. The EP scored 12 screening mammograms to test agreement then a main set of 178 cases. Weighted scores were derived for each descriptor enabling calculation of numerical parameters for each new category. The TP then scored the main set. Statistical analysis included ANOVA, t-tests and Kendall's coefficient. 11 centres in 8 countries responded forming an EP of 7 members and TP of 44 members. The EP showed moderate agreement when the scoring the mini test set W = 0.50 p < 0.001 and the main set W = 0.55 p < 0.001, 'posterior nipple line' being the most difficult descriptor. The weighted total scores differentiated the 4 new categories Perfect, Good, Adequate and Inadequate (p < 0.001). We have developed an assessment tool by Delphi consensus and weighted consensus criteria. We have successfully tabulated a range of numerical scores for each new category providing the first validated and reproducible mammography IQ scoring system. Copyright © 2017 The College of Radiographers. Published by Elsevier Ltd. All rights reserved.
Dumitrescu, Gabriel; Januszkiewicz, Anna; Ågren, Anna; Magnusson, Maria; Wahlin, Staffan; Wernerman, Jan
2017-01-01
Abstract The severity of liver disease is assessed by scoring systems, which include the conventional coagulation test prothrombin time-the international normalized ratio (PT-INR). However, PT-INR is not predictive of bleeding in liver disease and thromboelastometry (ROTEM) has been suggested to give a better overview of the coagulation system in these patients. It has now been suggested that coagulation as reflected by tromboelastomety may also be used for prognostic purposes. The objective of our study was to investigate whether thrombelastometry may discriminate the degree of liver insufficiency according to the scoring systems Child Pugh and Model for End-stage Liver Disease (MELD). Forty patients with chronic liver disease of different etiologies and stages were included in this observational cross-sectional study. The severity of liver disease was evaluated using the Child-Pugh score and the MELD score, and blood samples for biochemistry, conventional coagulation tests, and ROTEM were collected at the time of the final assessment for liver transplantation. Statistical comparisons for the studied parameters with scores of severity were made using Spearman correlation test and receiver-operating characteristic (ROC) curves. Spearman correlation coefficients indicated that the thromboelastometric parameters did not correlate with Child-Pugh or MELD scores. The ROC curves of the thromboelastometric parameters could not differentiate advanced stages from early stages of liver cirrhosis. Standard ROTEM cannot discriminate the stage of chronic liver disease in patients with severe chronic liver disease. PMID:28591054
A prognostic scoring system for arm exercise stress testing.
Xie, Yan; Xian, Hong; Chandiramani, Pooja; Bainter, Emily; Wan, Leping; Martin, Wade H
2016-01-01
Arm exercise stress testing may be an equivalent or better predictor of mortality outcome than pharmacological stress imaging for the ≥50% for patients unable to perform leg exercise. Thus, our objective was to develop an arm exercise ECG stress test scoring system, analogous to the Duke Treadmill Score, for predicting outcome in these individuals. In this retrospective observational cohort study, arm exercise ECG stress tests were performed in 443 consecutive veterans aged 64.1 (11.1) years. (mean (SD)) between 1997 and 2002. From multivariate Cox models, arm exercise scores were developed for prediction of 5-year and 12-year all-cause and cardiovascular mortality and 5-year cardiovascular mortality or myocardial infarction (MI). Arm exercise capacity in resting metabolic equivalents (METs), 1 min heart rate recovery (HRR) and ST segment depression ≥1 mm were the stress test variables independently associated with all-cause and cardiovascular mortality by step-wise Cox analysis (all p<0.01). A score based on the relation HRR (bpm)+7.3×METs-10.5×ST depression (0=no; 1=yes) prognosticated 5-year cardiovascular mortality with a C-statistic of 0.81 before and 0.88 after adjustment for significant demographic and clinical covariates. Arm exercise scores for the other outcome end points yielded C-statistic values of 0.77-0.79 before and 0.82-0.86 after adjustment for significant covariates versus 0.64-0.72 for best fit pharmacological myocardial perfusion imaging models in a cohort of 1730 veterans who were evaluated over the same time period. Arm exercise scores, analogous to the Duke Treadmill Score, have good power for prediction of mortality or MI in patients who cannot perform leg exercise.
Wearable Improved Vision System for Color Vision Deficiency Correction
Riccio, Daniel; Di Perna, Luigi; Sanniti Di Baja, Gabriella; De Nino, Maurizio; Rossi, Settimio; Testa, Francesco; Simonelli, Francesca; Frucci, Maria
2017-01-01
Color vision deficiency (CVD) is an extremely frequent vision impairment that compromises the ability to recognize colors. In order to improve color vision in a subject with CVD, we designed and developed a wearable improved vision system based on an augmented reality device. The system was validated in a clinical pilot study on 24 subjects with CVD (18 males and 6 females, aged 37.4 ± 14.2 years). The primary outcome was the improvement in the Ishihara Vision Test score with the correction proposed by our system. The Ishihara test score significantly improved (\\documentclass[12pt]{minimal} \\usepackage{amsmath} \\usepackage{wasysym} \\usepackage{amsfonts} \\usepackage{amssymb} \\usepackage{amsbsy} \\usepackage{upgreek} \\usepackage{mathrsfs} \\setlength{\\oddsidemargin}{-69pt} \\begin{document} }{}$p = 0.03$ \\end{document}) from 5.8 ± 3.0 without correction to 14.8 ± 5.0 with correction. Almost all patients showed an improvement in color vision, as shown by the increased test scores. Moreover, with our system, 12 subjects (50%) passed the vision color test as normal vision subjects. The development and preliminary validation of the proposed platform confirm that a wearable augmented-reality device could be an effective aid to improve color vision in subjects with CVD. PMID:28507827
Sharma, Renuka; Kapoor, Raj
2016-01-01
Objectives: Blood pressure estimation is a key skill for medical practitioners. It is routinely taught to undergraduate medical students using an aneroid sphygmomanometer. However, the conceptual understanding in the practical remains limited. We conducted the following study to evaluate the efficacy of digital data acquisition systems as an adjunct to the sphygmomanometer to teach blood pressure. Methods: Fifty-seven first-year medical students participated in the study. An MCQ test of 15 questions, consisting of 10 conceptual and five factual questions, was administered twice – pre- and post-demonstration of blood pressure measurement using a digital data acquisition system. In addition, qualitative feedback was also obtained. Results: Median scores were 7 (6 - 8) and 3 (1.5 - 4) in pre-test sessions for conceptual and factual questions, respectively. Post-test scores showed a significant improvement in both categories (10 (9 - 10) and 4 (4 - 4.5), respectively, Mann-Whitney U test, p < 0.0001). Student feedback also indicated that the digital system enhanced learning and student participation. Conclusions: Student feedback regarding the demonstrations was uniformly positive, which was also reflected in significantly improved post-test scores. We conclude that parallel demonstration on digital systems and the sphygmomanometer will enhance student engagement and understanding of blood pressure measurement. PMID:27660735
Translation and validation of the Dutch new Knee Society Scoring System ©.
Van Der Straeten, Catherine; Witvrouw, Erik; Willems, Tine; Bellemans, Johan; Victor, Jan
2013-11-01
A new version of The Knee Society Knee Scoring System(©) (KSS) has recently been developed. Before this scale can be used in non-English-speaking populations, it has to be translated and validated for a particular population. We evaluated the construct and content validity, the test-retest reliability, and the internal consistency of the Dutch version of the New Knee Society KSS. A Dutch translation was performed using a forward-backward translation protocol. We tested the construct validity of the Dutch New KSS by comparing it with the Dutch versions of the WOMAC, Knee Injury and Osteoarthritis Outcome Score (KOOS), and SF-12 scores in 137 patients undergoing total knee arthroplasty (TKA). Content validity was assessed by comparing pre- and postoperative scores and by checking floor and ceiling effects. To evaluate test-retest reliability and consistency, 47 patients completed the questionnaire a second time with a mean of 8 days interval (range, 2-20 days) between tests. Construct validity was demonstrated because the Dutch New KSS correlated well with the Dutch WOMAC (r = -0.751; p < 0.001), Dutch KOOS (r = -0.723; p < 0.001), and Dutch SF-12 (r = 0.569; p < 0.001). There was a significant difference between pre- and postoperative scores (p < 0.001) in line with the other scores. Test-retest reliability proved excellent with an intraclass correlation coefficient between 0.73 and 0.92 depending on the domain tested. Consistency as indicated by Cronbach's alpha ranging from 0.84 to 0.96 was good to excellent. As demonstrated by the validation procedure, the Dutch New KSS is an excellent instrument to evaluate TKA outcome in Dutch-speaking patients.
Automated Trait Scores for "GRE"® Writing Tasks. Research Report. ETS RR-15-15
ERIC Educational Resources Information Center
Attali, Yigal; Sinharay, Sandip
2015-01-01
The "e-rater"® automated essay scoring system is used operationally in the scoring of the argument and issue tasks that form the Analytical Writing measure of the "GRE"® General Test. For each of these tasks, this study explored the value added of reporting 4 trait scores for each of these 2 tasks over the total e-rater score.…
NASA Astrophysics Data System (ADS)
Meilinda; Rustaman, N. Y.; Firman, H.; Tjasyono, B.
2018-05-01
The Climate Change System Thinking Instrument (CCSTI) is developed to measure a system thinking ability in the concept of climate change. CCSTI is developed in four phase’s development including instrument draft development, validation and evaluation including readable material test, expert validation, and field test. The result of field test is analyzed by looking at the readability score in Cronbach’s alpha test. Draft instrument is tested on college students majoring in Biology Education, Physics Education, and Chemistry Education randomly with a total number of 80 college students. Score of Content Validation Index at 0.86, which means that the CCSTI developed are categorized as very appropriate with question indicators and Cronbach’s alpha about 0.605 which mean categorized undesirable to minimal acceptable. From 45 questions of system thinking, there are 37 valid questions spread in four indicators of system thinking, which are system thinking phase I (pre-requirement), system thinking phase II (basic), system thinking phase III (intermediate), and system thinking phase IV (coherent expert).
ERIC Educational Resources Information Center
Chodorow, Martin; Burstein, Jill
2004-01-01
This study examines the relation between essay length and holistic scores assigned to Test of English as a Foreign Language[TM] (TOEFL[R]) essays by e-rater[R], the automated essay scoring system developed by ETS. Results show that an early version of the system, e-rater99, accounted for little variance in human reader scores beyond that which…
Bourgioti, Charis; Chatoupis, Konstantinos; Panourgias, Evangelia; Tzavara, Chara; Sarris, Kyrillos; Rodolakis, Alexandros; Moulopoulos, Lia Angela
2015-10-01
To report discriminant MRI features between cervical and endometrial carcinomas and to design an MRI- scoring system, with the potential to predict the origin of uterine cancer (cervix or endometrium) in histologically indeterminate cases. Dedicated pelvic MRIs of 77 patients with uterine tumors involving both cervix and corpus were retrospectively analyzed by two experts in female imaging. Seven MRI tumor characteristics were statistically tested for their discriminant ability for tumor origin compared to final histology: tumor location, perfusion pattern, rim enhancement, depth of myometrial invasion, cervical stromal integrity, intracavitary mass, and retained endometrial secretions. Kappa values were estimated to assess the levels of inter-rater reliability. On the basis of positive likelihood ratio values, an MRI-score was assigned. K value was excellent for most of the imaging criteria. Using ROC curve analysis, the estimated optimal cut-off for the MRI-scoring system was 4 with 96.6% sensitivity and 100% specificity. Using a ≥4 cut-off for cervical cancers and <4 for endometrial cancers, 97.4% of the patients were correctly classified. 2/58 patients with cervical cancer had MRI score <4 and none of the patients with endometrial cancer had MRI score >4. The area under curve of the MRI-scoring system was 0.99 (95% CI 0.98-1.00). When the MRI-score was applied to 20/77 patients with indeterminate initial biopsy and to 5/26 surgically treated patients with erroneous pre-op histology, all cases were correctly classified. The produced MRI-scoring system may be a reliable problem-solving tool for the differential diagnosis of cervical vs. endometrial cancer in cases of equivocal histology.
Chowriappa, Ashirwad J; Shi, Yi; Raza, Syed Johar; Ahmed, Kamran; Stegemann, Andrew; Wilding, Gregory; Kaouk, Jihad; Peabody, James O; Menon, Mani; Hassett, James M; Kesavadas, Thenkurussi; Guru, Khurshid A
2013-12-01
A standardized scoring system does not exist in virtual reality-based assessment metrics to describe safe and crucial surgical skills in robot-assisted surgery. This study aims to develop an assessment score along with its construct validation. All subjects performed key tasks on previously validated Fundamental Skills of Robotic Surgery curriculum, which were recorded, and metrics were stored. After an expert consensus for the purpose of content validation (Delphi), critical safety determining procedural steps were identified from the Fundamental Skills of Robotic Surgery curriculum and a hierarchical task decomposition of multiple parameters using a variety of metrics was used to develop Robotic Skills Assessment Score (RSA-Score). Robotic Skills Assessment mainly focuses on safety in operative field, critical error, economy, bimanual dexterity, and time. Following, the RSA-Score was further evaluated for construct validation and feasibility. Spearman correlation tests performed between tasks using the RSA-Scores indicate no cross correlation. Wilcoxon rank sum tests were performed between the two groups. The proposed RSA-Score was evaluated on non-robotic surgeons (n = 15) and on expert-robotic surgeons (n = 12). The expert group demonstrated significantly better performance on all four tasks in comparison to the novice group. Validation of the RSA-Score in this study was carried out on the Robotic Surgical Simulator. The RSA-Score is a valid scoring system that could be incorporated in any virtual reality-based surgical simulator to achieve standardized assessment of fundamental surgical tents during robot-assisted surgery. Copyright © 2013 Elsevier Inc. All rights reserved.
Intuitive Sense of Number Correlates With Math Scores on College-Entrance Examination
Libertus, Melissa E.; Odic, Darko; Halberda, Justin
2012-01-01
Many educated adults possess exact mathematical abilities in addition to an approximate, intuitive sense of number, often referred to as the Approximate Number System (ANS). Here we investigate the link between ANS precision and mathematics performance in adults by testing participants on an ANS-precision test and collecting their scores on the Scholastic Aptitude Test (SAT), a standardized college-entrance exam in the USA. In two correlational studies, we found that ANS precision correlated with SAT-Quantitative (i.e., mathematics) scores. This relationship remained robust even when controlling for SAT-Verbal scores, suggesting a small but specific relationship between our primitive sense for number and formal mathematical abilities. PMID:23098904
Improved perceptual-motor performance measurement system
NASA Technical Reports Server (NTRS)
Parker, J. F., Jr.; Reilly, R. E.
1969-01-01
Battery of tests determines the primary dimensions of perceptual-motor performance. Eighteen basic measures range from simple tests to sophisticated electronic devices. Improved system has one unit for the subject containing test display and response elements, and one for the experimenter where test setups, programming, and scoring are accomplished.
The Weighted Airman Promotion System: Standardizing Test Scores
2008-01-01
This document and trademark( s ) contained herein are protected by law as indicated in a notice appearing later in this work. This electronic...SUBTITLE The Weighted Airman Promotion System. Standardizing Test Scores 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR( S ) 5d...PROJECT NUMBER 5e. TASK NUMBER 5f. WORK UNIT NUMBER 7. PERFORMING ORGANIZATION NAME( S ) AND ADDRESS(ES) Rand Corporation,PO Box 2138,Santa Monica
Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J
2014-05-01
Recent guidelines advocate sports medicine professionals to use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores were determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.
Wang-Hsu, Elizabeth; Smith, Susan S
2017-01-10
Falls are a common cause of injuries and hospital admissions in older adults. Balance limitation is a potentially modifiable factor contributing to falls. The Balance Evaluation Systems Test (BESTest), a clinical balance measure, categorizes balance into 6 underlying subsystems. Each of the subsystems is scored individually and summed to obtain a total score. The reliability of the BESTest and its individual subsystems has been reported in patients with various neurological disorders and cancer survivors. However, the reliability and minimal detectable change (MDC) of the BESTest with community-dwelling older adults have not been reported. The purposes of our study were to (1) determine the interrater and test-retest reliability of the BESTest total and subsystem scores; and (2) estimate the MDC of the BESTest and its individual subsystem scores with community-dwelling older adults. We used a prospective cohort methodological design. Community-dwelling older adults (N = 70; aged 70-94 years; mean = 85.0 [5.5] years) were recruited from a senior independent living community. Trained testers (N = 3) administered the BESTest. All participants were tested with the BESTest by the same tester initially and then retested 7 to 14 days later. With 32 of the participants, a second tester concurrently scored the retest for interrater reliability. Testers were blinded to each other's scores. Intraclass correlation coefficients [ICC(2,1)] were used to determine the interrater and test-retest reliability. Test-retest reliability was also analyzed using method error and the associated coefficients of variation (CVME). MDC was calculated using standard error of measurement. Interrater reliability (N = 32) of the BESTest total score was ICC(2, 1) = 0.97 (95% confidence interval [CI], 0.94-0.99). The ICCs for the individual subsystem scores ranged from 0.85 to 0.94. Test-retest reliability (N = 70) of the BESTest total score was ICC(2,1) = 0.93 (95% CI, 0.89-0.96). ICCs for the individual subsystem scores ranged from 0.72 to 0.89. The CVME (N = 70) of the BESTest total score was 4.1%. The CVME for the subsystem scores ranged from 5.0% to 10.7%. MDC (N = 70) for the BESTest total score at the 95% CI was 7.6%, or 8.2 points. MDC at the 95% CI for subsystem scores ranged from 11.7% to 19.0% (2.1-3.4 points). Results demonstrated generally good to excellent interrater and test-retest reliability in both the BESTest total and subsystem scores with community-dwelling older adults. The BESTest total and individual subsystem scores demonstrate good to excellent interrater and test-retest reliability with community-dwelling older adults. A change of 7.6% (8.2 points) or more in the BESTest total and a percentage change ranged from 11.7% to 19.0% (2.1-3.4 points) in the subsystem scores are suggested for clinicians to be 95% confident of true change when evaluating change in this population.
The Effect of Four Intervention Programs on Standardized Test Scores by Gender
ERIC Educational Resources Information Center
Cryder, Rebecca E.
2012-01-01
This quantitative correlational study involved the analysis, by gender, of the effect of four intervention programs at an Arizona middle school as seen on Arizona's Instrument to Measure Standards (AIMS) test scores. These four intervention programs included: Advancement Via Individual Determination (AVID), a planner stamping system, a World…
Improving School Accountability Measures. NBER Working Paper Series.
ERIC Educational Resources Information Center
Kane, Thomas J.; Staiger, Douglas O.
A growing number of states are using annual school-level test scores as part of their school accountability systems. This paper highlights an under-appreciated weakness of that approach, the imprecision of school-level test score means, and proposes a method for discerning signal from noise in annual school report cards. Using methods developed in…
Olivry, Thierry; Linder, Keith E; Paps, Judy S; Bizikova, Petra; Dunston, Stan; Donne, Nathalie; Mondoulet, Lucie
2012-12-01
Patch tests with allergens are used for the evaluation of cellular hypersensitivity to food and environmental allergens in dogs and humans with atopic dermatitis. Viaskin is a novel allergen epicutaneous delivery system that enhances epidermal allergen capture by immune cells. To compare the use of Viaskin and Finn chamber patch tests in dogs hypersensitive to mite allergens. Empty control or Dermatophagoides farinae house dust mite-containing Viaskin or Finn chamber patches were applied to the thoracic skin of six mite-hypersensitive Maltese-beagle crossbred atopic dogs. Lesions were graded 49 and 72 h after patch test application, and skin biopsies were collected after 72 h. Overall microscopic inflammation, eosinophil and T-lymphocyte infiltrations were scored. Positive macroscopic patch test reactions developed at five of six Viaskin application sites and four of six Finn chamber application sites. Median microscopic epidermal and dermal inflammation, as well as eosinophil and CD3 T-lymphocyte dermal scores were always higher in biopsies collected at Viaskin than at Finn chamber sites. Microscopic inflammation scores were significantly higher after mite allergen-containing Viaskin compared with empty patches, but this was not the case for mite-containing Finn chambers compared with control chambers. Scores obtained using Viaskin were not significantly different from those obtained using Finn chambers. Macroscopic and microscopic scores were significantly correlated. In mite-allergic dogs, Viaskin epicutaneous delivery systems appear to induce stronger allergen-specific inflammation than currently used Finn chamber patch tests. Consequently, Viaskin patches might offer a better alternative for screening cellular hypersensitivity to food and environmental allergens. © 2012 The Authors. Veterinary Dermatology © 2012 ESVD and ACVD.
Dong, Zhao; Nath, Anjali; Guo, Jing; Bhaumik, Urmi; Chin, May Y; Dong, Sherry; Marshall, Erica; Murphy, Johnna S; Sandel, Megan T; Sommer, Susan J; Ursprung, W W Sanouri; Woods, Elizabeth R; Reid, Margaret; Adamkiewicz, Gary
2018-01-01
To test the applicability of the Environmental Scoring System, a quick and simple approach for quantitatively measuring environmental triggers collected during home visits, and to evaluate its contribution to improving asthma outcomes among various child asthma programs. We pooled and analyzed data from multiple child asthma programs in the Greater Boston Area, Massachusetts, collected in 2011 to 2016, to examine the association of environmental scores (ES) with measures of asthma outcomes and compare the results across programs. Our analysis showed that demographics were important contributors to variability in asthma outcomes and total ES, and largely explained the differences among programs at baseline. Among all programs in general, we found that asthma outcomes were significantly improved and total ES significantly reduced over visits, with the total Asthma Control Test score negatively associated with total ES. Our study demonstrated that the Environmental Scoring System is a useful tool for measuring home asthma triggers and can be applied regardless of program and survey designs, and that demographics of the target population may influence the improvement in asthma outcomes.
National trends in safety performance of electronic health record systems in children's hospitals.
Chaparro, Juan D; Classen, David C; Danforth, Melissa; Stockwell, David C; Longhurst, Christopher A
2017-03-01
To evaluate the safety of computerized physician order entry (CPOE) and associated clinical decision support (CDS) systems in electronic health record (EHR) systems at pediatric inpatient facilities in the US using the Leapfrog Group's pediatric CPOE evaluation tool. The Leapfrog pediatric CPOE evaluation tool, a previously validated tool to assess the ability of a CPOE system to identify orders that could potentially lead to patient harm, was used to evaluate 41 pediatric hospitals over a 2-year period. Evaluation of the last available test for each institution was performed, assessing performance overall as well as by decision support category (eg, drug-drug, dosing limits). Longitudinal analysis of test performance was also carried out to assess the impact of testing and the overall trend of CPOE performance in pediatric hospitals. Pediatric CPOE systems were able to identify 62% of potential medication errors in the test scenarios, but ranged widely from 23-91% in the institutions tested. The highest scoring categories included drug-allergy interactions, dosing limits (both daily and cumulative), and inappropriate routes of administration. We found that hospitals with longer periods since their CPOE implementation did not have better scores upon initial testing, but after initial testing there was a consistent improvement in testing scores of 4 percentage points per year. Pediatric computerized physician order entry (CPOE) systems on average are able to intercept a majority of potential medication errors, but vary widely among implementations. Prospective and repeated testing using the Leapfrog Group's evaluation tool is associated with improved ability to intercept potential medication errors. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Bronchiectasis: correlation of high-resolution CT findings with health-related quality of life.
Eshed, I; Minski, I; Katz, R; Jones, P W; Priel, I E
2007-02-01
To evaluate the relationship between the severity of bronchiectatic diseases, as evident on high-resolution computed tomography (HRCT) and the patient's quality of life measured using the St George's Respiratory Questionnaire (SGRQ). Forty-six patients (25 women, 21 men, mean age: 63 years) with bronchiectatic disease as evident on recent HRCT examinations were recruited. Each patient completed the SGRQ and underwent respiratory function tests. HRCT findings were blindly and independently scored by two radiologists, using the modified Bhalla scoring system. The relationships between HRCT scores, SGRQ scores and pulmonary function tests were evaluated. The patients' total CT score did not correlate with the SGRQ scores. However, patients with more advanced disease on HRCT, significantly differed in their SGRQ scores from patients with milder bronchiectatic disease. A significant correlation was found between the CT scores for the middle and distal lung zones and the activity, impacts and total SGRQ scores. No correlation was found between CT scores and respiratory function test indices. However, a significant correlation was found between the SGRQ scores and most of the respiratory function test indices. A correlation between the severity of bronchiectatic disease as expressed in HRCT and the health-related quality of life exists in patients with a more severe bronchiectatic disease but not in patients with mild disease. Such correlation depends on the location of the bronchiectasis in the pulmonary tree.
Baykaner, Khan Richard; Huckvale, Mark; Whiteley, Iya; Andreeva, Svetlana; Ryumin, Oleg
2015-01-01
Automatic systems for estimating operator fatigue have application in safety-critical environments. A system which could estimate level of fatigue from speech would have application in domains where operators engage in regular verbal communication as part of their duties. Previous studies on the prediction of fatigue from speech have been limited because of their reliance on subjective ratings and because they lack comparison to other methods for assessing fatigue. In this paper, we present an analysis of voice recordings and psychophysiological test scores collected from seven aerospace personnel during a training task in which they remained awake for 60 h. We show that voice features and test scores are affected by both the total time spent awake and the time position within each subject's circadian cycle. However, we show that time spent awake and time-of-day information are poor predictors of the test results, while voice features can give good predictions of the psychophysiological test scores and sleep latency. Mean absolute errors of prediction are possible within about 17.5% for sleep latency and 5-12% for test scores. We discuss the implications for the use of voice as a means to monitor the effects of fatigue on cognitive performance in practical applications.
Baykaner, Khan Richard; Huckvale, Mark; Whiteley, Iya; Andreeva, Svetlana; Ryumin, Oleg
2015-01-01
Automatic systems for estimating operator fatigue have application in safety-critical environments. A system which could estimate level of fatigue from speech would have application in domains where operators engage in regular verbal communication as part of their duties. Previous studies on the prediction of fatigue from speech have been limited because of their reliance on subjective ratings and because they lack comparison to other methods for assessing fatigue. In this paper, we present an analysis of voice recordings and psychophysiological test scores collected from seven aerospace personnel during a training task in which they remained awake for 60 h. We show that voice features and test scores are affected by both the total time spent awake and the time position within each subject’s circadian cycle. However, we show that time spent awake and time-of-day information are poor predictors of the test results, while voice features can give good predictions of the psychophysiological test scores and sleep latency. Mean absolute errors of prediction are possible within about 17.5% for sleep latency and 5–12% for test scores. We discuss the implications for the use of voice as a means to monitor the effects of fatigue on cognitive performance in practical applications. PMID:26380259
Sachan, D; Gupta, N; Agarwal, P; Chaudhary, R
2011-08-01
Heparin-induced thrombocytopenia (HIT) should be diagnosed clinically as well as by laboratory assays for timely recognition, prevention and management of complications. To evaluate the clinical utility of pre-test clinical scoring system in combination with two immunoassays for the diagnosis of HIT in cardiac surgery patients. A total of 100 consecutive patients undergoing cardiac surgery were studied. Pre-test clinical scoring was carried out in patients with thrombocytopenia and further tested by two immunoassays, i.e., Heparin platelet factor 4 (H-PF4) enzyme-linked immunosorbent assay (ELISA) and particle gel immunoassay (PaGIA). Of the 100 patients studied, 42 patients developed thrombocytopenia post-operatively. On pre-test clinical scoring, low T-score was observed in 6 patients, intermediate in 28 and high score in 8 patients, whereas 19 patients (45.2%) were positive by H-PF4 ELISA and 10 (23.8%) by PaGIA for H-PF4 antibody. The difference in the incidence of clinically significant HIT antibodies in the three categories was statistically significant. A good correlation was also observed with ELISA optical density, T-scoring and PaGIA. Pre-test clinical scoring correlates well with the development of H-PF4 antibodies which are incriminated in the causation of thrombotic complications in patients with HIT. We also propose a protocol for diagnosing patients with clinical suspicion of HIT using pre-test clinical scoring and immunoassay. © 2011 The Authors. Transfusion Medicine © 2011 British Blood Transfusion Society.
Bonasia, Davide Edoardo; Marmotti, Antongiulio; Massa, Alessandro Domenico Felice; Ferro, Andrea; Blonna, Davide; Castoldi, Filippo; Rossi, Roberto
2015-09-01
In the last two decades, many surgical techniques have been described for articular cartilage repair. Reliable histological scoring systems are fundamental tools to evaluate new procedures. Several histological scoring systems have been described, and these can be divided in elementary and comprehensive scores, according to the number of sub-items. The aim of this study was to test the inter- and intra-observer reliability of ten main scores used for the histological evaluation of in vivo cartilage repair. The authors tested the starting hypothesis that elementary scores would show superior intra- and inter-observer reliability compared with comprehensive scores. Fifty histological sections obtained from the trochlea of New Zealand Rabbit and stained with Safranin-O fast green were used. The histological sections were analysed by 4 observers: 2 experienced in cartilage histology and 2 inexperienced. Histological evaluations were performed at time 1 and time 2, separated by a 30-day interval. The following scores were used: Mankin, O'Driscoll, Pineda, Wakitani, Fortier, Selleres, ICRS, ICRSII, Oswestry (OsScore) and modified O'Driscoll. Intra- and inter-observer reliability were evaluated for each score. In addition, the pavement-ceiling effect and the Bland-Altman Coefficient of Repeatability were then evaluated for each sub-item of every score. Intra-observer reliability was high for all observers in every score, even though the reliability was significantly lower for non-expert observers compared with expert counterparts. In terms of Coefficient of Repeatability, some scores performed better (O'Driscoll, Modified O'Driscoll and ICRSII) than others (Fortier, Seller). Inter-observer reliability was high for all observers in every score, but significantly lower for non-expert compared with expert observers. In expert hands, all the scores showed high intra- and inter-observer reliability, independently of the complexity. Although every score has advantages and disadvantages, ICRSII, O'Driscoll and Modified O'Driscoll scores should be preferred for the evaluation of in vivo cartilage repair in animal models.
Web-based education in systems-based practice: a randomized trial.
Kerfoot, B Price; Conlin, Paul R; Travison, Thomas; McMahon, Graham T
2007-02-26
All accredited US residency programs are expected to offer curricula and evaluate their residents in 6 general competencies. Medical schools are now adopting similar competency frameworks. We investigated whether a Web-based program could effectively teach and assess elements of systems-based practice. We enrolled 276 medical students and 417 residents in the fields of surgery, medicine, obstetrics-gynecology, and emergency medicine in a 9-week randomized, controlled, crossover educational trial. Participants were asked to sequentially complete validated Web-based modules on patient safety and the US health care system. The primary outcome measure was performance on a 26-item validated online test administered before, between, and after the participants completed the modules. Six hundred forty (92.4%) of the 693 enrollees participated in the study; 512 (80.0%) of the participants completed all 3 tests. Participants' test scores improved significantly after completion of the first module (P<.001). Overall learning from the 9-week Web-based program, as measured by the increase in scores (posttest scores minus pretest scores), was 16 percentage points (95% confidence interval, 14-17 percentage points; P<.001) in public safety topics and 22 percentage points (95% confidence interval, 20-23 percentage points; P<.001) in US health care system topics. A Web-based educational program on systems-based practice competencies generated significant and durable learning across a broad range of medical students and residents.
McLean, James M; Brumby-Rendell, Oscar; Lisle, Ryan; Brazier, Jacob; Dunn, Kieran; Gill, Tiffany; Hill, Catherine L; Mandziak, Daniel; Leith, Jordan
2018-05-01
The aim was to assess whether the Knee Society Score, Oxford Knee Score (OKS) and Knee Injury and Osteoarthritis Outcome Score (KOOS) were comparable in asymptomatic, healthy, individuals of different age, gender and ethnicity, across two remote continents. The purpose of this study was to establish normal population values for these scores using an electronic data collection system. There is no difference in clinical knee scores in an asymptomatic population when comparing age, gender and ethnicity, across two remote continents. 312 Australian and 314 Canadian citizens, aged 18-94 years, with no active knee pain, injury or pathology in the ipsilateral knee corresponding to their dominant arm, were evaluated. A knee examination was performed and participants completed an electronically administered questionnaire covering the subjective components of the knee scores. The cohorts were age- and gender-matched. Chi-square tests, Fisher's exact test and Poisson regression models were used where appropriate, to investigate the association between knee scores, age, gender, ethnicity and nationality. There was a significant inverse relationship between age and all assessment tools. OKS recorded a significant difference between gender with females scoring on average 1% lower score. There was no significant difference between international cohorts when comparing all assessment tools. An electronic, multi-centre data collection system can be effectively utilized to assess remote international cohorts. Differences in gender, age, ethnicity and nationality should be taken into consideration when using knee scores to compare to pathological patient scores. This study has established an electronic, normal control group for future studies using the Knee society, Oxford, and KOOS knee scores. Diagnostic Level II.
Brief Report: Development of the Adolescent Empathy and Systemizing Quotients
ERIC Educational Resources Information Center
Auyeung, Bonnie; Allison, Carrie; Wheelwright, Sally; Baron-Cohen, Simon
2012-01-01
Adolescent versions of the Empathy Quotient (EQ) and Systemizing Quotient (SQ) were developed and administered to n = 1,030 parents of typically developing adolescents, aged 12-16 years. Both measures showed good test-retest reliability and high internal consistency. Girls scored significantly higher on the EQ, and boys scored significantly higher…
Measuring Teacher Effectiveness with the Pennsylvania Value-Added Assessment System
ERIC Educational Resources Information Center
Bowen, Naomi
2017-01-01
The purpose of this research was to determine if the Pennsylvania Value-Added Assessment System Average Growth Index (PVAAS AGI) scores, derived from standardized tests and calculated for Pennsylvania schools, provide a valid and reliable assessment of teacher effectiveness, as these scores are currently used to derive 15% of the annual…
Developing and Evaluating a Machine-Scorable, Constrained Constructed-Response Item.
ERIC Educational Resources Information Center
Braun, Henry I.; And Others
The use of constructed response items in large scale standardized testing has been hampered by the costs and difficulties associated with obtaining reliable scores. The advent of expert systems may signal the eventual removal of this impediment. This study investigated the accuracy with which expert systems could score a new, non-multiple choice…
Iwata, Shintaro; Uehara, Kosuke; Ogura, Koichi; Akiyama, Toru; Shinoda, Yusuke; Yonemoto, Tsukasa; Kawai, Akira
2016-09-01
The Musculoskeletal Tumor Society (MSTS) scoring system is a widely used functional evaluation tool for patients treated for musculoskeletal tumors. Although the MSTS scoring system has been validated in English and Brazilian Portuguese, a Japanese version of the MSTS scoring system has not yet been validated. We sought to determine whether a Japanese-language translation of the MSTS scoring system for the lower extremity had (1) sufficient reliability and internal consistency, (2) adequate construct validity, and (3) reasonable criterion validity compared with the Toronto Extremity Salvage Score (TESS) and SF-36 using psychometric analysis. The Japanese version of the MSTS scoring system was developed using accepted guidelines, which included translation of the English version of the MSTS into Japanese by five native Japanese bilingual musculoskeletal oncology surgeons and integrated into one document. One hundred patients with a diagnosis of intermediate or malignant bone or soft tissue tumors located in the lower extremity and who had undergone tumor resection with or without reconstruction or amputation participated in this study. Reliability was evaluated by test-retest analysis, and internal consistency was established by Cronbach's alpha coefficient. Construct validity was evaluated using the principal factor analysis and Akaike information criterion network. Criterion validity was evaluated by comparing the MSTS scoring system with the TESS and SF-36. Test-retest analysis showed a high intraclass correlation coefficient (0.92; 95% CI, 0.88-0.95), indicating high reliability of the Japanese version of the MSTS scoring system, although a considerable ceiling effect was observed, with 23 patients (23%) given the maximum score. Cronbach's alpha coefficient was 0.87 (95% CI, 0.82-0.90), suggesting a high level of internal consistency. Factor analysis revealed that all items had high loading values and communalities; we identified a central role for the items "walking" and "gait" according to the Akaike information criterion network. The total MSTS score was correlated with that of the TESS (r = 0.81; 95% CI, 0.73-0.87; p < 0.001) and the physical component summary and physical functioning of the SF-36. The Japanese-language translation of the MSTS scoring system for the lower extremity has sufficient reliability and reasonable validity. Nevertheless, the observation of a ceiling effect suggests poor ability of this system to discriminate from among patients who have a high level of function.
Using the Teach-Back Method in Patient Education to Improve Patient Satisfaction.
Centrella-Nigro, Andrea M; Alexander, Catherine
2017-01-01
This quasi-experimental research study used two similar nursing units to test the effects of teach back on Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) scores. A pretest-posttest design tested 24 nurses' knowledge, attitudes, and beliefs about teach back. Education specialists provided a 1-hour teaching session on teach back to all nurses in the intervention unit. A significant improvement in knowledge scores in the pretest-posttest was found using paired t tests (p = .002). Qualitative analysis of nurses' comments demonstrated strong support for teach back in the post-test. The HCAHPS scores were not significantly improved in the intervention unit when compared with the control unit. More research needs to be conducted to determine the effectiveness of teach back on HCAHPS scores. J Contin Educ Nurs. 2017;48(1):47-52. Copyright 2017, SLACK Incorporated.
Hakim, Renée Marie; Salvo, Charles J; Balent, Anthony; Keyasko, Michael; McGlynn, Deidre
2015-02-01
A recent systematic review supported the use of strength and balance training for older adults at risk for falls, and provided preliminary evidence for those with peripheral neuropathy (PN). However, the role of gaming systems in fall risk reduction was not explored. The purpose of this case report was to describe the use of the Nintendo® Wii™ Fit gaming system to train standing balance in a community-dwelling older adult with PN and a history of recurrent near falls. A 76-year-old patient with bilateral PN participated in 1 h of Nintendo® Wii™ Fit balance training, two times a week for 6 weeks. Examination was conducted using a Computerized Dynamic Posturography system (i.e. Sensory Organization Test (SOT), Limits of Stability (LOS), Adaptation Test (ADT) and Motor Control Test (MCT) and clinical testing with the Berg Balance Scale (BBS), Timed Up and Go (TUG), Activities-specific Balance Confidence (ABC) scale and 30-s Chair Stand. Following training, sensory integration scores on the SOT were unchanged. Maximum excursion abilities improved by a range of 37-86% on the LOS test. MCT scores improved for amplitude with forward translations and ADT scores improved for downward platform rotations. Clinical scores improved on the BBS (28/56-34/56), ABC (57.5-70.6%) and TUG (14.9-10.9 s) which indicated reduced fall risk. Balance training with a gaming system showed promise as a feasible, objective and enjoyable method to improve physical performance and reduce fall risk in an individual with PN.
Chevalier, Thérèse M.; Stewart, Garth; Nelson, Monty; McInerney, Robert J.; Brodie, Norman
2016-01-01
It has been well documented that IQ scores calculated using Canadian norms are generally 2–5 points lower than those calculated using American norms on the Wechsler IQ scales. However, recent findings have demonstrated that the difference may be significantly larger for individuals with certain demographic characteristics, and this has prompted discussion about the appropriateness of using the Canadian normative system with a clinical population in Canada. This study compared the interpretive effects of applying the American and Canadian normative systems in a clinical sample. We used a multivariate analysis of variance (ANOVA) to calculate differences between IQ and Index scores in a clinical sample, and mixed model ANOVAs to assess the pattern of differences across age and ability level. As expected, Full Scale IQ scores calculated using Canadian norms were systematically lower than those calculated using American norms, but differences were significantly larger for individuals classified as having extremely low or borderline intellectual functioning when compared with those who scored in the average range. Implications of clinically different conclusions for up to 52.8% of patients based on these discrepancies highlight a unique dilemma facing Canadian clinicians, and underscore the need for caution when choosing a normative system with which to interpret WAIS-IV results in the context of a neuropsychological test battery in Canada. Based on these findings, we offer guidelines for best practice for Canadian clinicians when interpreting data from neuropsychological test batteries that include different normative systems, and suggestions to assist with future test development. PMID:27246955
Pallante-Kichura, Andrea L.; Bae, Won C.; Du, Jiang; Statum, Sheronda; Wolfson, Tanya; Gamst, Anthony C.; Cory, Esther; Amiel, David; Bugbee, William D.; Sah, Robert L.; Chung, Christine B.
2014-01-01
Objective: To describe and apply a semiquantitative MRI scoring system for multifeature analysis of cartilage defect repair in the knee by osteochondral allografts and to correlate this scoring system with histopathologic, micro–computed tomography (µCT), and biomechanical reference standards using a goat repair model. Design: Fourteen adult goats had 2 osteochondral allografts implanted into each knee: one in the medial femoral condyle and one in the lateral trochlea. At 12 months, goats were euthanized and MRI was performed. Two blinded radiologists independently rated 9 primary features for each graft, including cartilage signal, fill, edge integration, surface congruity, calcified cartilage integrity, subchondral bone plate congruity, subchondral bone marrow signal, osseous integration, and presence of cystic changes. Four ancillary features of the joint were also evaluated, including opposing cartilage, meniscal tears, synovitis, and fat-pad scarring. Comparison was made with histologic and µCT reference standards as well as biomechanical measures. Interobserver agreement and agreement with reference standards was assessed. Cohen’s κ, Spearman’s correlation, and Kruskal-Wallis tests were used as appropriate. Results: There was substantial agreement (κ > 0.6, P < 0.001) for each MRI feature and with comparison against reference standards, except for cartilage edge integration (κ = 0.6). There was a strong positive correlation between MRI and reference standard scores (ρ = 0.86, P < 0.01). Osteochondral allograft MRI scoring system was sensitive to differences in outcomes between the types of allografts. Conclusions: We have described a comprehensive MRI scoring system for osteochondral allografts and have validated this scoring system with histopathologic and µCT reference standards as well as biomechanical indentation testing. PMID:24489999
Imaging tools to measure treatment response in gout.
Dalbeth, Nicola; Doyle, Anthony J
2018-01-01
Imaging tests are in clinical use for diagnosis, assessment of disease severity and as a marker of treatment response in people with gout. Various imaging tests have differing properties for assessing the three key disease domains in gout: urate deposition (including tophus burden), joint inflammation and structural joint damage. Dual-energy CT allows measurement of urate deposition and bone damage, and ultrasonography allows assessment of all three domains. Scoring systems have been described that allow radiological quantification of disease severity and these scoring systems may play a role in assessing the response to treatment in gout. This article reviews the properties of imaging tests, describes the available scoring systems for quantification of disease severity and discusses the challenges and controversies regarding the use of imaging tools to measure treatment response in gout. © The Author 2018. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Seager, Anna L; Shah, Ume-Kulsoom; Brüsehafer, Katja; Wills, John; Manshian, Bella; Chapman, Katherine E; Thomas, Adam D; Scott, Andrew D; Doherty, Ann T; Doak, Shareen H; Johnson, George E; Jenkins, Gareth J S
2014-05-01
Micronucleus (MN) induction is an established cytogenetic end point for evaluating structural and numerical chromosomal alterations in genotoxicity testing. A semi-automated scoring protocol for the assessment of MN preparations from human cell lines and a 3D skin cell model has been developed and validated. Following exposure to a range of test agents, slides were stained with 4'-6-diamidino-2-phenylindole (DAPI) and scanned by use of the MicroNuc module of metafer 4, after the development of a modified classifier for selecting MN in binucleate cells. A common difficulty observed with automated systems is an artefactual output of high false positives, in the case of the metafer system this is mainly due to the loss of cytoplasmic boundaries during slide preparation. Slide quality is paramount to obtain accurate results. We show here that to avoid elevated artefactual-positive MN outputs, diffuse cell density and low-intensity nuclear staining are critical. Comparisons between visual (Giemsa stained) and automated (DAPI stained) MN frequencies and dose-response curves were highly correlated (R (2) = 0.70 for hydrogen peroxide, R (2) = 0.98 for menadione, R (2) = 0.99 for mitomycin C, R (2) = 0.89 for potassium bromate and R (2) = 0.68 for quantum dots), indicating the system is adequate to produce biologically relevant and reliable results. Metafer offers many advantages over conventional scoring including increased output and statistical power, and reduced scoring subjectivity, labour and costs. Further, the metafer system is easily adaptable for use with a range of different cells, both suspension and adherent human cell lines. Awareness of the points raised here reduces the automatic positive errors flagged and drastically reduces slide scoring time, making metafer an ideal candidate for genotoxic biomonitoring and population studies and regulatory genotoxic testing.
Rodrigues, Letícia C.; Marques, Aline P.; Barros, Paula B.; Michaelsen, Stella M.
2014-01-01
BACKGROUND: The Balance Evaluation Systems Test (BESTest) was recently created to allow the development of treatments according to the specific balance system affected in each patient. The Brazilian version of the BESTest has not been specifically tested after stroke. OBJECTIVE: To evaluate the intra- and inter-rater reliability and concurrent and convergent validity of the total score of the BESTest and BESTest sections for adults with hemiparesis after stroke. METHOD: The study included 16 subjects (61.1±7.5 years) with chronic hemiparesis (54.5±43.5 months after stroke). The BESTest was administered by two raters in the same week and one of the raters repeated the test after a one-week interval. Intraclass correlation coefficient (ICC) was calculated to assess intra- and interrater reliability. Concurrent validity with the Berg Balance Scale (BBS) and convergent validity with the Activities-specific Balance Confidence scale (ABC-Brazil) were assessed using Pearson's correlation coefficient. RESULTS: Both the BESTest total score (ICC=0.98) and the BESTest sections (ICC between 0.85 and 0.96) have excellent intrarater reliability. Interrater reliability for the total score was excellent (ICC=0.93) and, for the sections, it ranged between 0.71 and 0.94. The correlation coefficient between the BESTest and the BBS and ABC-Brazil were 0.78 and 0.59, respectively. CONCLUSIONS: The Brazilian version of the BESTest demonstrated adequate reliability when measured by sections and could identify what balance system was affected in patients after stroke. Concurrent validity was excellent with the BBS total score and good to excellent with the sections. The total scores but not the sections present adequate convergent validity with the ABC-Brazil. However, other psychometric properties should be further investigated. PMID:25003281
Aslam, Tariq M; Tahir, Humza J; Parry, Neil R A; Murray, Ian J; Kwak, Kun; Heyes, Richard; Salleh, Mahani M; Czanner, Gabriela; Ashworth, Jane
2016-10-01
To report on the utility of a computer tablet-based method for automated testing of visual acuity in children based on the principles of game design. We describe the testing procedure and present repeatability as well as agreement of the score with accepted visual acuity measures. Reliability and validity study. Setting: Manchester Royal Eye Hospital Pediatric Ophthalmology Outpatients Department. Total of 112 sequentially recruited patients. For each patient 1 eye was tested with the Mobile Assessment of Vision by intERactIve Computer for Children (MAVERIC-C) system, consisting of a software application running on a computer tablet, housed in a bespoke viewing chamber. The application elicited touch screen responses using a game design to encourage compliance and automatically acquire visual acuity scores of participating patients. Acuity was then assessed by an examiner with a standard chart-based near ETDRS acuity test before the MAVERIC-C assessment was repeated. Reliability of MAVERIC-C near visual acuity score and agreement of MAVERIC-C score with near ETDRS chart for visual acuity. Altogether, 106 children (95%) completed the MAVERIC-C system without assistance. The vision scores demonstrated satisfactory reliability, with test-retest VA scores having a mean difference of 0.001 (SD ±0.136) and limits of agreement of 2 SD (LOA) of ±0.267. Comparison with the near EDTRS chart showed agreement with a mean difference of -0.0879 (±0.106) with LOA of ±0.208. This study demonstrates promising utility for software using a game design to enable automated testing of acuity in children with ophthalmic disease in an objective and accurate manner. Copyright © 2016 Elsevier Inc. All rights reserved.
... among others. Each test has its own scoring system. In general, IQ tests are only one way to measure how well a person functions. Other factors, such as genetics and environment, should be considered.
Economic impact of 21-gene recurrence score testing on early-stage breast cancer in Ireland.
Smyth, Lillian; Watson, Geoff; Walsh, Elaine M; Kelly, Catherine M; Keane, Maccon; Kennedy, M John; Grogan, Liam; Hennessy, Bryan T; O'Reilly, Seamus; Coate, Linda E; O'Connor, Miriam; Quinn, Cecily; Verleger, Katharina; Schoeman, Olaf; O'Reilly, Susan; Walshe, Janice M
2015-10-01
The 21-gene test is a validated multi-gene diagnostic test that predicts chemotherapy (CT) benefit in oestrogen receptor positive (ER+), lymph node-negative (N0) breast cancer (BC) patients (pts). Ireland was the first public health care system to reimburse this test in Europe. Study objectives were to assess the impact of this test on decision-making and to analyse the economic impact of testing. Between October 2011 and February 2013, a national, retrospective, cross-sectional observational study of ER+, N0 BC pts tested with the 21-gene test was conducted. Surveyed breast medical oncologists, provided the assumption for the decision impact analysis that grade (G) 1 pts would not have received CT before testing and G2/3 pts would have received CT before testing. Descriptive statistical analyses were performed. 592 pts were identified; Low, intermediate and high recurrence score were identified in 53, 36 and 10 % pts, respectively. 384 (70 %) pts had G2, 129 (22 %) G3 and 76 (13 %) G1 tumours. Post testing, 345 pts (59 %) experienced a change in CT decision; 339 changed to hormone therapy alone and 6 advised to receive CT. 172 (30 %) pts received CT, 12 (3.9 %) of pts with low scores, 108 (50.9 %) of intermediate risk and 50 (90.9 %) of pts with high risk scores. Net reduction in CT use was 58 % and net savings achieved were €793,565. Since public reimbursement, the introduction of the 21-gene test has resulted in a significant reduction in chemotherapy administration and cost savings for the Irish public healthcare system.
Eaton, Joshua Seth; Miller, Paul E; Bentley, Ellison; Thomasy, Sara M; Murphy, Christopher J
2017-12-01
To present a semiquantitative ocular scoring system comprising elements and criteria that address many of the limitations associated with systems commonly used in preclinical studies, providing enhanced cross-species applicability and predictive value in modern ocular drug and device development. Revisions to the ocular scoring systems of McDonald-Shadduck and Hackett-McDonald were conducted by board-certified veterinary ophthalmologists at Ocular Services On Demand (OSOD) over the execution of hundreds of in vivo preclinical ocular drug and device development studies and general toxicological investigations. This semiquantitative preclinical ocular toxicology scoring (SPOTS) system was driven by limitations of previously published systems identified by our group's recent review of slit lamp-based scoring systems in clinical ophthalmology, toxicology, and vision science. The SPOTS system provides scoring criteria for the anterior segment, posterior segment, and characterization of intravitreal test articles. Key elements include: standardized slit lamp settings; expansion of criteria to enhance applicability to nonrabbit species; refinement and disambiguation of scoring criteria for corneal opacity, fluorescein staining severity, and aqueous flare; introduction of novel criteria for scoring of aqueous and anterior vitreous cell; and introduction of criteria for findings observed with drugs/devices targeting the posterior segment. A modified Standardization of Uveitis Nomenclature (SUN) system is also introduced to facilitate accurate use of SUN's criteria in laboratory species. The SPOTS systems provide criteria that stand to enhance the applicability of semiquantitative scoring criteria to the full range of laboratory species, in the context of modern approaches to ocular therapeutics and drug delivery and drug and device development.
Reliability and Normative Data for the Dynamic Visual Acuity Test for Vestibular Screening.
Riska, Kristal M; Hall, Courtney D
2016-06-01
The purpose of this study was to determine reliability of computerized dynamic visual acuity (DVA) testing and to determine reference values for younger and older adults. A primary function of the vestibular system is to maintain gaze stability during head motion. The DVA test quantifies gaze stabilization with the head moving versus stationary. Commercially available computerized systems allow clinicians to incorporate DVA into their assessment; however, information regarding reliability and normative values of these systems is sparse. Forty-six healthy adults, grouped by age, with normal vestibular function were recruited. Each participant completed computerized DVA testing including static visual acuity, minimum perception time, and DVA using the NeuroCom inVision System. Testing was performed by two examiners in the same session and then repeated at a follow-up session 3 to 14 days later. Intraclass correlation coefficients (ICCs) were used to determine inter-rater and test-retest reliability. ICCs for inter-rater reliability ranged from 0.323 to 0.937 and from 0.434 to 0.909 for horizontal and vertical head movements, respectively. ICCs for test-retest reliability ranged from 0.154 to 0.856 and from 0.377 to 0.9062 for horizontal and vertical head movements, respectively. Overall, raw scores (left/right DVA and up/down DVA) were more reliable than DVA loss scores. Reliability of a commercially available DVA system has poor-to-fair reliability for DVA loss scores. The use of a convergence paradigm and not incorporating the forced choice paradigm may contribute to poor reliability.
Relationship of the functional movement screen in-line lunge to power, speed, and balance measures.
Hartigan, Erin H; Lawrence, Michael; Bisson, Brian M; Torgerson, Erik; Knight, Ryan C
2014-05-01
The in-line lunge of the Functional Movement Screen (FMS) evaluates lateral stability, balance, and movement asymmetries. Athletes who score poorly on the in-line lunge should avoid activities requiring power or speed until scores are improved, yet relationships between the in-line lunge scores and other measures of balance, power, and speed are unknown. (1) Lunge scores will correlate with center of pressure (COP), maximum jump height (MJH), and 36.6-meter sprint time and (2) there will be no differences between limbs on lunge scores, MJH, or COP. Descriptive laboratory study. Level 3. Thirty-seven healthy, active participants completed the first 3 tasks of the FMS (eg, deep squat, hurdle step, in-line lunge), unilateral drop jumps, and 36.6-meter sprints. A 3-dimensional motion analysis system captured MJH. Force platforms measured COP excursion. A laser timing system measured 36.6-m sprint time. Statistical analyses were used to determine whether a relationship existed between lunge scores and COP, MJH, and 36.6-m speed (Spearman rho tests) and whether differences existed between limbs in lunge scores (Wilcoxon signed-rank test), MJH, and COP (paired t tests). Lunge scores were not significantly correlated with COP, MJH, or 36.6-m sprint time. Lunge scores, COP excursion, and MJH were not statistically different between limbs. Performance on the FMS in-line lunge was not related to balance, power, or speed. Healthy participants were symmetrical in lunging measures and MJH. Scores on the FMS in-line lunge should not be attributed to power, speed, or balance performance without further examination. However, assessing limb symmetry appears to be clinically relevant.
Personalized Risk Scoring for Critical Care Prognosis Using Mixtures of Gaussian Processes.
Alaa, Ahmed M; Yoon, Jinsung; Hu, Scott; van der Schaar, Mihaela
2018-01-01
In this paper, we develop a personalized real-time risk scoring algorithm that provides timely and granular assessments for the clinical acuity of ward patients based on their (temporal) lab tests and vital signs; the proposed risk scoring system ensures timely intensive care unit admissions for clinically deteriorating patients. The risk scoring system is based on the idea of sequential hypothesis testing under an uncertain time horizon. The system learns a set of latent patient subtypes from the offline electronic health record data, and trains a mixture of Gaussian Process experts, where each expert models the physiological data streams associated with a specific patient subtype. Transfer learning techniques are used to learn the relationship between a patient's latent subtype and her static admission information (e.g., age, gender, transfer status, ICD-9 codes, etc). Experiments conducted on data from a heterogeneous cohort of 6321 patients admitted to Ronald Reagan UCLA medical center show that our score significantly outperforms the currently deployed risk scores, such as the Rothman index, MEWS, APACHE, and SOFA scores, in terms of timeliness, true positive rate, and positive predictive value. Our results reflect the importance of adopting the concepts of personalized medicine in critical care settings; significant accuracy and timeliness gains can be achieved by accounting for the patients' heterogeneity. The proposed risk scoring methodology can confer huge clinical and social benefits on a massive number of critically ill inpatients who exhibit adverse outcomes including, but not limited to, cardiac arrests, respiratory arrests, and septic shocks.
Cho, Jung-Jin; Kim, Ji-Yong
2011-09-01
In-training examination (ITE) is a cognitive examination similar to the written test, but it is different from the Clinical Practice Examination of the Korean Academy of Family Medicine (KAFM) Certification Examination (CE). The objective of this is to estimate the positive predictive value of the KAFM-ITE for identifying residents at risk for poor performance on the three types of KAFM-CE. 372 residents who completed the KAFM-CE in 2011 were included. We compared the mean KAFM-CE scores with ITE experience. We evaluated the correlation and the positive predictive value (PPV) of ITE for the multiple choice question (MCQ) scores of 1st written test & 2nd slide examination, the total clinical practice examination scores, and the total sum of 2nd test. 275 out of 372 residents completed ITE. Those who completed ITE had significantly higher MCQ scores of 1st written test than those who did not. The correlation of ITE scores with 1st written MCQ (0.627) was found to be the highest among the other kinds of CE. The PPV of the ITE score for 1st written MCQ scores was 0.672. The PPV of the ITE score ranged from 0.376 to 0.502. The score of the KAFM ITE has acceptable positive predictive value that could be used as a part of comprehensive evaluation system for residents in cognitive field.
2016-12-22
included assessments and instruments, descriptive statistics were calculated. Independent-samples t-tests were conducted using participant survey scores...integrity tests within a multimodal system. Both conditions included the Military Acute Concussion Evaluation (MACE) and an Ease-of-Use survey . Mean scores...for the Ease-of-Use survey and mean test administration times for each measure were compared. Administrative feedback was also considered for
Mossadegh, Somayyeh; He, Shan; Parker, Paul
2016-05-01
Various injury severity scores exist for trauma; it is known that they do not correlate accurately to military injuries. A promising anatomical scoring system for blast pelvic and perineal injury led to the development of an improved scoring system using machine-learning techniques. An unbiased genetic algorithm selected optimal anatomical and physiological parameters from 118 military cases. A Naïve Bayesian model was built using the proposed parameters to predict the probability of survival. Ten-fold cross validation was employed to evaluate its performance. Our model significantly out-performed Injury Severity Score (ISS), Trauma ISS, New ISS, and the Revised Trauma Score in virtually all areas; positive predictive value 0.8941, specificity 0.9027, accuracy 0.9056, and area under curve 0.9059. A two-sample t test showed that the predictive performance of the proposed scoring system was significantly better than the other systems (p < 0.001). With limited resources and the simplest of Bayesian methodologies, we have demonstrated that the Naïve Bayesian model performed significantly better in virtually all areas assessed by current scoring systems used for trauma. This is encouraging and highlights that more can be done to improve trauma systems not only for our military injured, but also for civilian trauma victims. Reprint & Copyright © 2016 Association of Military Surgeons of the U.S.
Pantazes, Robert J; Saraf, Manish C; Maranas, Costas D
2007-08-01
In this paper, we introduce and test two new sequence-based protein scoring systems (i.e. S1, S2) for assessing the likelihood that a given protein hybrid will be functional. By binning together amino acids with similar properties (i.e. volume, hydrophobicity and charge) the scoring systems S1 and S2 allow for the quantification of the severity of mismatched interactions in the hybrids. The S2 scoring system is found to be able to significantly functionally enrich a cytochrome P450 library over other scoring methods. Given this scoring base, we subsequently constructed two separate optimization formulations (i.e. OPTCOMB and OPTOLIGO) for optimally designing protein combinatorial libraries involving recombination or mutations, respectively. Notably, two separate versions of OPTCOMB are generated (i.e. model M1, M2) with the latter allowing for position-dependent parental fragment skipping. Computational benchmarking results demonstrate the efficacy of models OPTCOMB and OPTOLIGO to generate high scoring libraries of a prespecified size.
ERIC Educational Resources Information Center
Wheelock, Anne
Scores on the Massachusetts Comprehensive Assessment System (MCAS) tests are used to select exemplary schools in Massachusetts, and the schools thus identified can receive awards from three different programs. This study examined the evidence about the use of MCAS scores to assess school quality. These three programs use MCAS to identify exemplary…
Airborne Turbulence Detection System Certification Tool Set
NASA Technical Reports Server (NTRS)
Hamilton, David W.; Proctor, Fred H.
2006-01-01
A methodology and a corresponding set of simulation tools for testing and evaluating turbulence detection sensors has been presented. The tool set is available to industry and the FAA for certification of radar based airborne turbulence detection systems. The tool set consists of simulated data sets representing convectively induced turbulence, an airborne radar simulation system, hazard tables to convert the radar observable to an aircraft load, documentation, a hazard metric "truth" algorithm, and criteria for scoring the predictions. Analysis indicates that flight test data supports spatial buffers for scoring detections. Also, flight data and demonstrations with the tool set suggest the need for a magnitude buffer.
Trait impulsivity predicts D-KEFS tower test performance in university students.
Lyvers, Michael; Basch, Vanessa; Duff, Helen; Edwards, Mark S
2015-01-01
The present study examined a widely used self-report index of trait impulsiveness in relation to performance on a well-known neuropsychological executive function test in 70 university undergraduate students (50 women, 20 men) aged 18 to 24 years old. Participants completed the Barratt Impulsiveness Scale (BIS-11) and the Frontal Systems Behavior Scale (FrSBe), after which they performed the Tower Test of the Delis-Kaplan Executive Function System. Hierarchical linear regression showed that after controlling for gender, current alcohol consumption, age at onset of weekly alcohol use, and FrSBe scores, BIS-11 significantly predicted Tower Test Achievement scores, β = -.44, p < .01. The results indicate that self-reported impulsiveness is associated with poorer executive cognitive performance even in a sample likely to be characterized by relatively high general cognitive functioning (i.e., university students). The results also support the role of inhibition as a key aspect of executive task performance. Elevated scores on the BIS-11 and FrSBe are known to be linked to risky drinking in young adults as confirmed in this sample; however, only BIS-11 predicted Tower Test performance.
A simple scoring system based on neutrophil count in sepsis patients.
Ueda, Takahiro; Aoyama-Ishikawa, Michiko; Nakao, Atsunori; Yamada, Taihei; Usami, Makoto; Kotani, Joji
2014-03-01
The assessment of critically ill patients is often a challenge for clinicians. There are a number of scoring systems such as Acute Physiology and Chronic Health Evaluation II (APACHE II), Sequential Organ Failure Assessment (SOFA) and C-reactive protein test (CRP), which have been shown to correlate with outcome in a variety of Intensive Care Unit (ICU) patients. Therefore, use of repeated measures of these preexisting scores over time is a reasonable attempt to assess the severity of organ dysfunction and predict outcome in critically ill patients. Several reports suggest that the neutrophil is a useful marker of sepsis. However, since both a large number and a small number of neutrophils indicate a severe situation, neutrophil count is difficult to use to directly predict patients'. We proposed a novel scoring system identify predictive factors using a simple blood cell count that may be associated with mortality in ICU patients. Our novel scoring system (n-score) was calculated as follows: ranges of neutrophils of 0-4999 cells/mm(3) and 5000-9999 cells/mm(3) were defined as 3 and 1 points, respectively. When the neutrophil count was over 10,000 cells/mm(3), the score was calculated by dividing the number of cells by 10,000. Then, 1 or 2 points were added when patients were female or male, respectively. We hypothesize that n-score may be a simple and easy scoring system to estimate mortality of the patients with sepsis and severe sepsis/septic shock without requirement of special methods or special measuring equipment, and may be as reliable as the APACHE II score or SOFA score. The retrospective evaluation was conducted at the Department of Emergency, Disaster and Critical Care Medicine at the Hyogo College of Medicine. Seventy-seven patients who were admitted to the emergency center and diagnosed sepsis or severe sepsis/septic shock between June 2007 and December 2012 and gave informed consent were enrolled. The n-score was significantly higher in non-survivors of sepsis and severe sepsis/septic shock (p<0.01, t-test) than in survivors. The ROC curve showed a sensitivity of 61.5% and a specificity of 80.4% at an n-score of 3.8 points; the area under the curve was 0.736. In addition, n-score correlated with APACHE II score (p<0.01, R=0.378) and SOFA score (p<0.05, R=0.256) on admission. Based on these preliminary evaluations, we hypothesize that n-score may be a useful scoring system to detect risk of death in sepsis and severe sepsis/septic shock. Copyright © 2014 Elsevier Ltd. All rights reserved.
Logistical Consideration in Computer-Based Screening of Astronaut Applicants
NASA Technical Reports Server (NTRS)
Galarza, Laura
2000-01-01
This presentation reviews the logistical, ergonomic, and psychometric issues and data related to the development and operational use of a computer-based system for the psychological screening of astronaut applicants. The Behavioral Health and Performance Group (BHPG) at the Johnson Space Center upgraded its astronaut psychological screening and selection procedures for the 1999 astronaut applicants and subsequent astronaut selection cycles. The questionnaires, tests, and inventories were upgraded from a paper-and-pencil system to a computer-based system. Members of the BHPG and a computer programmer designed and developed needed interfaces (screens, buttons, etc.) and programs for the astronaut psychological assessment system. This intranet-based system included the user-friendly computer-based administration of tests, test scoring, generation of reports, the integration of test administration and test output to a single system, and a complete database for past, present, and future selection data. Upon completion of the system development phase, four beta and usability tests were conducted with the newly developed system. The first three tests included 1 to 3 participants each. The final system test was conducted with 23 participants tested simultaneously. Usability and ergonomic data were collected from the system (beta) test participants and from 1999 astronaut applicants who volunteered the information in exchange for anonymity. Beta and usability test data were analyzed to examine operational, ergonomic, programming, test administration and scoring issues related to computer-based testing. Results showed a preference for computer-based testing over paper-and -pencil procedures. The data also reflected specific ergonomic, usability, psychometric, and logistical concerns that should be taken into account in future selection cycles. Conclusion. Psychological, psychometric, human and logistical factors must be examined and considered carefully when developing and using a computer-based system for psychological screening and selection.
ERIC Educational Resources Information Center
Bifulco, Robert; Ladd, Helen F.
2007-01-01
Using panel data that track individual students from year to year, we examine the effects of charter schools in North Carolina on racial segregation and black-white test score gaps. We find that North Carolina's system of charter schools has increased the racial isolation of both black and white students, and has widened the achievement gap.…
A dysmorphology score system for assessing embryo abnormalities in rat whole embryo culture.
Zhang, Cindy X; Danberry, Tracy; Jacobs, Mary Ann; Augustine-Rauch, Karen
2010-12-01
The rodent whole embryo culture (WEC) system is a well-established model for characterizing developmental toxicity of test compounds and conducting mechanistic studies. Laboratories have taken various approaches in describing type and severity of developmental findings of organogenesis-stage rodent embryos, but the Brown and Fabro morphological score system is commonly used as a quantitative approach. The associated score criteria is based upon developmental stage and growth parameters, where a series of embryonic structures are assessed and assigned respective scores relative to their gestational stage, with a Total Morphological Score (TMS) assigned to the embryo. This score system is beneficial because it assesses a series of stage-specific anatomical landmarks, facilitating harmonized evaluation across laboratories. Although the TMS provides a quantitative approach to assess growth and determine developmental delay, it is limited to its ability to identify and/or delineate subtle or structure-specific abnormalities. Because of this, the TMS may not be sufficiently sensitive for identifying compounds that induce structure or organ-selective effects. This study describes a distinct morphological score system called the "Dysmorphology Score System (DMS system)" that has been developed for assessing gestation day 11 (approximately 20-26 somite stage) rat embryos using numerical scores to differentiate normal from abnormal morphology and define the respective severity of dysmorphology of specific embryonic structures and organ systems. This method can also be used in scoring mouse embryos of the equivalent developmental stage. The DMS system enhances capabilities to rank-order compounds based upon teratogenic potency, conduct structure- relationships of chemicals, and develop statistical prediction models to support abbreviated developmental toxicity screens. © 2010 Wiley-Liss, Inc.
Test-Based Accountability: The Promise and the Perils
ERIC Educational Resources Information Center
Loveless, Tom
2005-01-01
In the early 1990s, states began establishing standards in academic subjects backed by test-based accountability systems to see that the standards were met. Incentives were implemented for schools and students based on pupil test scores. These early accountability systems paved the way for passage of landmark federal legislation, the No Child Left…
Keeping Scores: Audited Self-Monitoring of High-Stakes Testing Environments
ERIC Educational Resources Information Center
Padilla, Raymond; Richards, Michael
2006-01-01
To address a public relations problem faced by a large urban public school district in Texas, we conducted action research that resulted in an audited self-monitoring system for high-stakes testing environments. The system monitors violations of testing protocols while identifying and disseminating best practices to improve the education of…
The Automated Assessment of Postural Stability: Balance Detection Algorithm.
Napoli, Alessandro; Glass, Stephen M; Tucker, Carole; Obeid, Iyad
2017-12-01
Impaired balance is a common indicator of mild traumatic brain injury, concussion and musculoskeletal injury. Given the clinical relevance of such injuries, especially in military settings, it is paramount to develop more accurate and reliable on-field evaluation tools. This work presents the design and implementation of the automated assessment of postural stability (AAPS) system, for on-field evaluations following concussion. The AAPS is a computer system, based on inexpensive off-the-shelf components and custom software, that aims to automatically and reliably evaluate balance deficits, by replicating a known on-field clinical test, namely, the Balance Error Scoring System (BESS). The AAPS main innovation is its balance error detection algorithm that has been designed to acquire data from a Microsoft Kinect ® sensor and convert them into clinically-relevant BESS scores, using the same detection criteria defined by the original BESS test. In order to assess the AAPS balance evaluation capability, a total of 15 healthy subjects (7 male, 8 female) were required to perform the BESS test, while simultaneously being tracked by a Kinect 2.0 sensor and a professional-grade motion capture system (Qualisys AB, Gothenburg, Sweden). High definition videos with BESS trials were scored off-line by three experienced observers for reference scores. AAPS performance was assessed by comparing the AAPS automated scores to those derived by three experienced observers. Our results show that the AAPS error detection algorithm presented here can accurately and precisely detect balance deficits with performance levels that are comparable to those of experienced medical personnel. Specifically, agreement levels between the AAPS algorithm and the human average BESS scores ranging between 87.9% (single-leg on foam) and 99.8% (double-leg on firm ground) were detected. Moreover, statistically significant differences in balance scores were not detected by an ANOVA test with alpha equal to 0.05. Despite some level of disagreement between human and AAPS-generated scores, the use of an automated system yields important advantages over currently available human-based alternatives. These results underscore the value of using the AAPS, that can be quickly deployed in the field and/or in outdoor settings with minimal set-up time. Finally, the AAPS can record multiple error types and their time course with extremely high temporal resolution. These features are not achievable by humans, who cannot keep track of multiple balance errors with such a high resolution. Together, these results suggest that computerized BESS calculation may provide more accurate and consistent measures of balance than those derived from human experts.
ERIC Educational Resources Information Center
Ballou, Dale; Springer, Matthew G.
2015-01-01
Our aim in this article is to draw attention to some underappreciated problems in the design and implementation of evaluation systems that incorporate value-added measures. We focus on four: (1) taking into account measurement error in teacher assessments, (2) revising teachers' scores as more information becomes available about their students,…
Performance of high school male athletes on the Functional Movement Screen™.
Smith, Laura J; Creps, James R; Bean, Ryan; Rodda, Becky; Alsalaheen, Bara
2017-09-01
(1) Describe the performance of the Functional Movement Screen™ (FMS™) by reporting the proportion of adolescents with a score of ≤14 and the frequency of asymmetries in a cross-sectional sample; (2) explore associations between FMS™ to age and body mass, and explore the construct validity of the FMS™ against common postural stability measures; (3) examine the inter-rater and test-retest reliability of the FMS™ in adolescents. Cross-sectional. Field-setting. 94 male high-school athletes. The FMS™, Y-Balance Test (YBT) and Balance Error Scoring System (BESS). The median FMS™ composite score was 16 (9-21), 33% of participants scored below the suggested injury risk cutoff composite score of ≤14, and 62.8% had at least one asymmetry. No relationship was observed between the FMS™ to common static/dynamic balance tests. The inter-rater reliability of the FMS™ composite score suggested good reliability (ICC = 0.88, CI 95%:0.77, 0.94) and test-retest reliability for FMS™ composite scores was good with ICC = 0.83 (CI 95%:0.56, 0.95). FMS™ results should be interpreted cautiously with attention to the asymmetries identified during the screen, regardless of composite score. The lack of relationship between the FMS™ and other balance measures supports the notion that multiple screening tests should be used in order to provide a comprehensive picture of the adolescent athlete. Copyright © 2017 Elsevier Ltd. All rights reserved.
Peng, Jian-Hong; Fang, Yu-Jing; Li, Cai-Xia; Ou, Qing-Jian; Jiang, Wu; Lu, Shi-Xun; Lu, Zhen-Hai; Li, Pei-Xing; Yun, Jing-Ping; Zhang, Rong-Xin; Pan, Zhi-Zhong; Wan, De Sen
2016-04-19
Nearly 20% patients with stage II A colon cancer will develop recurrent disease post-operatively. The present study aims to develop a scoring system based on Artificial Neural Network (ANN) model for predicting 10-year survival outcome. The clinical and molecular data of 117 stage II A colon cancer patients from Sun Yat-sen University Cancer Center were used for training set and test set; poor pathological grading (score 49), reduced expression of TGFBR2 (score 33), over-expression of TGF-β (score 45), MAPK (score 32), pin1 (score 100), β-catenin in tumor tissue (score 50) and reduced expression of TGF-β in normal mucosa (score 22) were selected as the prognostic risk predictors. According to the developed scoring system, the patients were divided into 3 subgroups, which were supposed with higher, moderate and lower risk levels. As a result, for the 3 subgroups, the 10-year overall survival (OS) rates were 16.7%, 62.9% and 100% (P < 0.001); and the 10-year disease free survival (DFS) rates were 16.7%, 61.8% and 98.8% (P < 0.001) respectively. It showed that this scoring system for stage II A colon cancer could help to predict long-term survival and screen out high-risk individuals for more vigorous treatment.
Erci, Behice
2012-04-01
This article is a report of a quasi-experimental study of the effectiveness of the Omaha System intervention on the women's health promotion lifestyle profile and the quality of life. The Omaha System is a model for organizing, documenting and evaluating the outcomes of comprehensive, community-based, client-centred care. Therefore, the Omaha System is important for public health nurses whose aim is to protect and promote health. However, few studies addressed the influence of the Omaha System on health promotion activities or quality of life in adult population. The design of the study was one-group pre-test and post-test. The study took place in Turkey in 2007; the sample comprised 76 women from an urban primary healthcare centre. The women completed questionnaires consisting of demographical characteristics, the health promotion lifestyle profile scale developed by Walker and colleagues and the quality of life scale developed by Burckhardt and colleagues. The researcher then visited selected women in their home weekly or biweekly for a 4-month period. At the end of intervention, the scales were applied to the women as the post-test. The mean scores of self-actualization, health responsibility, interpersonal support, stress management subscales of the health promotion lifestyle profile and the total score increased in post-test, except for nutrition subscale. There were statistically significant differences between pre- and post-test scores. This study demonstrated that the Omaha System intervention increases health promotion lifestyle profile of the women. It is recommended as a nursing care to health promotion. © 2011 Blackwell Publishing Ltd.
Puranik, Ameya D; Nair, Gopinathan; Aggarwal, Rajiv; Bandyopadhyay, Abhijit; Shinto, Ajit; Zade, Anand
2013-04-01
The study aimed at developing a scoring system for scintigraphic grading of gastro-esophageal reflux (GER), on gastro-esophageal reflux scintigraphy (GERS) and comparison of clinical and scintigraphic scores, pre- and post-treatment. A total of 39 cases with clinically symptomatic GER underwent 99mTc sulfur colloid GERS; scores were assigned based on the clinical and scintigraphic parameters. Post domperidone GERS was performed after completion of treatment. Follow up GERS was performed and clinical and scintigraphic parameters were compared with baseline parameters. Paired t-test on pre and post domperidone treatment clinical scores showed that the decline in post-treatment scores was highly significant, with P value < 0.001. The scintigraphic scoring system had a sensitivity of 93.9% in assessing treatment response to domperidone, specificity of 83.3% i.e., 83.3% of children with no decline in scintigraphic scores show no clinical response to Domperidone. The scintigraphic scoring system had a positive predictive value of 96.9% and a negative predictive value of 71.4%. GERS with its quantitative parameters is a good investigation for assessing the severity of reflux and also for following children post-treatment.
Goebel, L; Orth, P; Cucchiarini, M; Pape, D; Madry, H
2017-04-01
To correlate osteochondral repair assessed by validated macroscopic scoring systems with established semiquantitative histological analyses in an ovine model and to test the hypothesis that important macroscopic individual categories correlate with their corresponding histological counterparts. In the weight-bearing portion of medial femoral condyles (n = 38) of 19 female adult Merino sheep (age 2-4 years; weight 70 ± 20 kg) full-thickness chondral defects were created (size 4 × 8 mm; International Cartilage Repair Society (ICRS) grade 3C) and treated with Pridie drilling. After sacrifice, 1520 blinded macroscopic observations from three observers at 2-3 time points including five different macroscopic scoring systems demonstrating all grades of cartilage repair where correlated with corresponding categories from 418 blinded histological sections. Categories "defect fill" and "total points" of different macroscopic scoring systems correlated well with their histological counterparts from the Wakitani and Sellers scores (all P ≤ 0.001). "Integration" was assessed in both histological scoring systems and in the macroscopic ICRS, Oswestry and Jung scores. Here, a significant relationship always existed (0.020 ≤ P ≤ 0.049), except for Wakitani and Oswestry (P = 0.054). No relationship was observed for the "surface" between histology and macroscopy (all P > 0.05). Major individual morphological categories "defect fill" and "integration", and "total points" of macroscopic scoring systems correlate with their corresponding categories in elementary and complex histological scoring systems. Thus, macroscopy allows to precisely predict key histological aspects of articular cartilage repair, underlining the specific value of macroscopic scoring for examining cartilage repair. Copyright © 2016 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
VALIDITY OF THE REARRANGEMENT EXERCISE AS A PREDICTOR OF ESSAY WRITING ABILITY.
ERIC Educational Resources Information Center
CONRY, JULIANNE JOYCE
DATA FROM THE PARAGRAPH ORGANIZATION PORTION OF THE CEEB ENGLISH COMPOSITION TEST (ECT) WERE CONVERTED TO THE ORIGINAL RANK-ORDER AND WERE THEN RESCORED BY THREE SYSTEMS USING SPEARMAN'S RHO TO DETERMINE WHICH METHOD YIELDED SCORES THAT CORRELATED BEST WITH TOTAL ESSAY SCORES. TWO OF THE METHODS INVESTIGATED, ONE IN WHICH THE NUMBER OF SCORES WAS…
The Contribution of Human Factors in Military System Development: Methodological Considerations
1980-07-01
Risk/Uncertainty Analysis - Project Scoring - Utility Scales - Relevance Tree Techniques (Reverse Factor Analysis) 2. Computer Simulation Simulation...effectiveness of mathematical models for R&D project selection. Management Science, April 1973, 18. 6-43 .1~ *.-. Souder, W.E. h scoring methodology for...per some interval PROFICIENCY test scores (written) RADIATION radiation effects aircrew performance on radiation environments REACTION TIME 1) (time
Rades, Dirk; Dziggel, Liesa; Nagy, Viorica; Segedin, Barbara; Lohynska, Radka; Veninga, Theo; Khoa, Mai T; Trang, Ngo T; Schild, Steven E
2013-07-01
Survival scores for patients with brain metastasis exist. However, the treatment regimens used to create these scores were heterogeneous. This study aimed to develop and validate a survival score in homogeneously treated patients. Eight-hundred-and-eighty-two patients receiving 10 × 3Gy of WBRT alone were randomly assigned to a test group (N=441) or a validation group (N=441). In the multivariate analysis of the test group, age, performance status, extracranial metastasis, and systemic treatment prior to WBRT were independent predictors of survival. The score for each factor was determined by dividing the 6-month survival rate (in %) by 10. Scores were summed and total scores ranged from 6 to 19 points. Patients were divided into four prognostic groups. The 6-month survival rates were 4% for 6-9 points, 29% for 10-14 points, 62% for 15-17 points, and 93% for 17-18 points (p<0.001) in the test group. The survival rates were 3%, 28%, 54% and 96%, respectively (p<0.001) in the validation group. Since the 6-month survival rates in the validation group were very similar to the test group, this new score (WBRT-30) appears valid and reproducible. It can help making treatment choices and stratifying patients in future trials. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
An Index to Objectively Score Supraglottic Abnormalities in Refractory Asthma
Good, James T.; Rollins, Donald R.; Curran-Everett, Douglas; Lommatzsch, Steven E.; Carolan, Brendan J.; Stubenrauch, Peter C.
2014-01-01
Background: Patients with refractory asthma frequently have elements of laryngopharyngeal reflux (LPR) with potential aspiration contributing to their poor control. We previously reported on a supraglottic index (SGI) scoring system that helps in the evaluation of LPR with potential aspiration. However, to further the usefulness of this SGI scoring system for bronchoscopists, a teaching system was developed that included both interobserver and intraobserver reproducibility. Methods: Five pulmonologists with expertise in fiber-optic bronchoscopy but novice to the SGI participated. A training system was developed that could be used via Internet interaction to make this learning technique widely available. Results: By the final testing, there was excellent interreader agreement (κ of at least 0.81), thus documenting reproducibility in scoring the SGI. For the measure of intrareader consistency, one reader was arbitrarily selected to rescore the final test 4 weeks later and had a κ value of 0.93, with a 95% CI of 0.79 to 1.00. Conclusions: In this study, we demonstrate that with an organized educational approach, bronchoscopists can develop skills to have highly reproducible assessment and scoring of supraglottic abnormalities. The SGI can be used to determine which patients need additional intervention to determine causes of LPR and gastroesophageal reflux. Identification of this problem in patients with refractory asthma allows for personal, individual directed therapy to improve asthma control. PMID:24202552
Jaipuria, Jiten; Suryavanshi, Manav; Sen, Tridib K
2016-12-01
To assess the reliability of the Guy's Stone Score, the Seoul National University Renal Stone Complexity (S-ReSC) score and the S.T.O.N.E. scores in percutaneous nephrolithotomy (PCNL), and assess their utility in discriminating outcomes [stone free rate (SFR), complications, need for multiple PCNL sessions, and auxiliary procedures] valid across parameters of experience of surgeon, independence from surgical approach, and variations in institution-specific instrumentation. A prospectively maintained database of two tertiary institutions was analysed (606 cases). Institutes differed in instrumentation, while the overall surgical team comprised: two trainees (experience <100 cases), two junior consultants (experience 100-200 cases), and two senior surgeons (experience >1000 cases). Scores were assigned and re-assigned after 4 months by one trainee and an expert surgeon. Inter-rater and test-retest agreement were analysed by Cohen's κ and intraclass correlation coefficient. Multivariate logistic regression models were created adjusting outcomes for the institution, comorbidity, Amplatz size, access tract location, the number of punctures, the experience level of the surgeon, and individual scoring system, and receiver operating curves were analysed for comparison. Despite some areas of inconsistencies, individually all scores had excellent inter-rater and test-retest concordance. On multivariable analyses, while the experience of the surgeon and surgical approach characteristics (such as access tract location, Amplatz size, and number of punctures) remained independently associated with different outcomes in varying combinations, calculus complexity scores were found consistently to be independently associated with all outcomes. The S-ReSC score had a superior association with SFR, the need for multiple PCNL sessions, and auxiliary procedures. Individually all scoring systems performed well. On cross comparison, the S-ReSC score consistently emerged to be more superiorly associated with all outcomes, signifying the importance of the distributional complexity of the calculus (which also indirectly amalgamates the influence of stone number, size, and anatomical location) in discriminating outcomes. Our study proves the utility of scoring systems in prognosticating multiple outcomes and also clarifies important aspects of their practical application including future roles such as benchmarking, audit, training, and objective assessment of surgical technique modifications. © 2016 The Authors BJU International © 2016 BJU International Published by John Wiley & Sons Ltd.
Muthukrishnan, Shobitha; Jain, Reena; Kohli, Sangeeta; Batra, Swaraj
2016-04-01
Various pregnancy complications like hypertension, preeclampsia have been strongly correlated with maternal stress. One of the connecting links between pregnancy complications and maternal stress is mind-body intervention which can be part of Complementary and Alternative Medicine (CAM). Biologic measures of stress during pregnancy may get reduced by such interventions. To evaluate the effect of Mindfulness meditation on perceived stress scores and autonomic function tests of pregnant Indian women. Pregnant Indian women of 12 weeks gestation were randomised to two treatment groups: Test group with Mindfulness meditation and control group with their usual obstetric care. The effect of Mindfulness meditation on perceived stress scores and cardiac sympathetic functions and parasympathetic functions (Heart rate variation with respiration, lying to standing ratio, standing to lying ratio and respiratory rate) were evaluated on pregnant Indian women. There was a significant decrease in perceived stress scores, a significant decrease of blood pressure response to cold pressor test and a significant increase in heart rate variability in the test group (p< 0.05, significant) which indicates that mindfulness meditation is a powerful modulator of the sympathetic nervous system and can thereby reduce the day-to-day perceived stress in pregnant women. The results of this study suggest that mindfulness meditation improves parasympathetic functions in pregnant women and is a powerful modulator of the sympathetic nervous system during pregnancy.
Kim, Ji-Yong
2011-01-01
Background In-training examination (ITE) is a cognitive examination similar to the written test, but it is different from the Clinical Practice Examination of the Korean Academy of Family Medicine (KAFM) Certification Examination (CE). The objective of this is to estimate the positive predictive value of the KAFM-ITE for identifying residents at risk for poor performance on the three types of KAFM-CE. Methods 372 residents who completed the KAFM-CE in 2011 were included. We compared the mean KAFM-CE scores with ITE experience. We evaluated the correlation and the positive predictive value (PPV) of ITE for the multiple choice question (MCQ) scores of 1st written test & 2nd slide examination, the total clinical practice examination scores, and the total sum of 2nd test. Results 275 out of 372 residents completed ITE. Those who completed ITE had significantly higher MCQ scores of 1st written test than those who did not. The correlation of ITE scores with 1st written MCQ (0.627) was found to be the highest among the other kinds of CE. The PPV of the ITE score for 1st written MCQ scores was 0.672. The PPV of the ITE score ranged from 0.376 to 0.502. Conclusion The score of the KAFM ITE has acceptable positive predictive value that could be used as a part of comprehensive evaluation system for residents in cognitive field. PMID:22745873
Lameness scoring system for dairy cows using force plates and artificial intelligence.
Ghotoorlar, S Mokaram; Ghamsari, S Mehdi; Nowrouzian, I; Ghotoorlar, S Mokaram; Ghidary, S Shiry
2012-02-04
Lameness scoring is a routine procedure in dairy industry to screen the herds for new cases of lameness. Subjective lameness scoring, which is the most popular lameness detection and screening method in dairy herds, has several limitations. They include low intra-observer and inter-observer agreement and the discrete nature of the scores which limits its usage in monitoring the lameness. The aim of this study is to develop an automated lameness scoring system comparable with conventional subjective lameness scoring by means of artificial neural networks. The system is composed of four balanced force plates installed in a hoof-trimming box. A group of 105 dairy cows was used for the study. Twenty-three features extracted from ground reaction force (GRF) data were used in a computer training process which was performed on 60 per cent of the data. The remaining 40 per cent of the data were used to test the trained system. Repeatability of the lameness scoring system was determined by GRF samples from 25 cows, captured at two different times from the same animals. The mean sd was 0.31 and the mean coefficient of variation was 14.55 per cent, which represents a high repeatability in comparison with subjective vision-based scoring methods. Although the highest sensitivity and specificity values were seen in locomotion score groups 1 and 4, the automatic lameness system was both sensitive and specific in all groups. The sensitivity and specificity were higher than 72 per cent in locomotion score groups 1 to 4, and it was 100 per cent specific and 50 per cent sensitive for group 5.
Squires, Liza; Li, Yunfeng; Civil, Richard; Paller, Amy S.
2010-01-01
Objective: To characterize dermal reactions and examine methylphenidate (MPH) sensitization in subjects receiving methylphenidate transdermal system (MTS). Method: This multicenter, open-label, dose-optimization study utilized MTS doses of 10, 15, 20, and 30 mg in children aged 6 to 12 years, inclusive (N = 305), with a DSM-IV-TR primary diagnosis of attention-deficit/hyperactivity disorder. The study was conducted between January 8, 2007, and August 23, 2007. Subjects wore MTS on their hips for 9 hours per day, alternating sides daily for a total of 7 weeks. Assessments included the Experience of Discomfort scale, Transdermal System Adherence scale, and Dermal Response Scale (DRS; 0 = no irritation, 7 = strong reaction). On-study reevaluations were conducted to characterize DRS scores ≥ 4. Epicutaneous allergy patch testing was conducted for DRS scores ≥ 6, persistent DRS scores ≥ 4, DRS score increase following an assessment of ≥ 4, or DRS scores of 4 or 5 following elective discontinuation. Results: Approximately half of subjects experienced definite erythema at the patch site that generally dissipated within 24 hours. Four subjects experienced a DRS score of 4 (1%): erythema in 1 subject resolved on study treatment, 2 cases resolved poststudy and subjects tolerated oral MPH, and 1 subject discontinued treatment. The latter subject was referred for patch testing and was diagnosed with allergic contact sensitization to MPH. Conclusions: Few severe dermal effects were seen with MTS treatment. Dermal reactions were characterized as contact dermatitis and dissipated rapidly. On patch testing, 1 subject (0.3%) manifested sensitization to MPH. Trial Registration: clinicaltrials.gov Identifier: NCT00434213 PMID:21494336
SAT Scores, 2012-13: Wake County Public School System (WCPSS). Measuring Up. D&A Report No. 13.22
ERIC Educational Resources Information Center
Muli, Juliana; Gilleland, Kevin; McMillen, Brad
2014-01-01
As the ACT has become part of North Carolina's mandatory testing program, SAT participation in Wake County Public School System (WCPSS) and North Carolina has declined in recent years. However, SAT performance in WCPSS remains high compared to state and national averages. In 2012-13, students in WCPSS continued to score 50-60 points higher on the…
ERIC Educational Resources Information Center
Maxwell, June B.
2010-01-01
In the state of Georgia, local school systems are under pressure to increase at-risk middle school students' state scores in reading and math. At the data site, the local school system implemented a supplemental education service (SES) program for at-risk students in order to pass the Georgia Criterion Referenced Competency Test (CRCT) in reading…
Systematic Review of Plant-Based Homeopathic Basic Research: An Update.
Ücker, Annekathrin; Baumgartner, Stephan; Sokol, Anezka; Huber, Roman; Doesburg, Paul; Jäger, Tim
2018-05-01
Plant-based test systems have been described as a useful tool for investigating possible effects of homeopathic preparations. The last reviews of this research field were published in 2009/2011. Due to recent developments in the field, an update is warranted. Publications on plant-based test systems were analysed with regard to publication quality, reproducibility and potential for further research. A literature search was conducted in online databases and specific journals, including publications from 2008 to 2017 dealing with plant-based test systems in homeopathic basic research. To be included, they had to contain statistical analysis and fulfil quality criteria according to a pre-defined manuscript information score (MIS). Publications scoring at least 5 points (maximum 10 points) were assumed to be adequate. They were analysed for the use of adequate controls, outcome and reproducibility. Seventy-four publications on plant-based test systems were found. Thirty-nine publications were either abstracts or proceedings of conferences and were excluded. From the remaining 35 publications, 26 reached a score of 5 or higher in the MIS. Adequate controls were used in 13 of these publications. All of them described specific effects of homeopathic preparations. The publication quality still varied: a substantial number of publications (23%) did not adequately document the methods used. Four reported on replication trials. One replication trial found effects of homeopathic preparations comparable to the original study. Three replication trials failed to confirm the original study but identified possible external influencing factors. Five publications described novel plant-based test systems. Eight trials used systematic negative control experiments to document test system stability. Regarding research design, future trials should implement adequate controls to identify specific effects of homeopathic preparations and include systematic negative control experiments. Further external and internal replication trials, and control of influencing factors, are needed to verify results. Standardised test systems should be developed. The Faculty of Homeopathy.
Examining the Feasibility and Effect of Transitioning GED Tests to Computer
ERIC Educational Resources Information Center
Higgins, Jennifer; Patterson, Margaret Becker; Bozman, Martha; Katz, Michael
2010-01-01
This study examined the feasibility of administering GED Tests using a computer based testing system with embedded accessibility tools and the impact on test scores and test-taker experience when GED Tests are transitioned from paper to computer. Nineteen test centers across five states successfully installed the computer based testing program,…
NASA Astrophysics Data System (ADS)
Christensen, Hannah; Moroz, Irene; Palmer, Tim
2015-04-01
Forecast verification is important across scientific disciplines as it provides a framework for evaluating the performance of a forecasting system. In the atmospheric sciences, probabilistic skill scores are often used for verification as they provide a way of unambiguously ranking the performance of different probabilistic forecasts. In order to be useful, a skill score must be proper -- it must encourage honesty in the forecaster, and reward forecasts which are reliable and which have good resolution. A new score, the Error-spread Score (ES), is proposed which is particularly suitable for evaluation of ensemble forecasts. It is formulated with respect to the moments of the forecast. The ES is confirmed to be a proper score, and is therefore sensitive to both resolution and reliability. The ES is tested on forecasts made using the Lorenz '96 system, and found to be useful for summarising the skill of the forecasts. The European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble prediction system (EPS) is evaluated using the ES. Its performance is compared to a perfect statistical probabilistic forecast -- the ECMWF high resolution deterministic forecast dressed with the observed error distribution. This generates a forecast that is perfectly reliable if considered over all time, but which does not vary from day to day with the predictability of the atmospheric flow. The ES distinguishes between the dynamically reliable EPS forecasts and the statically reliable dressed deterministic forecasts. Other skill scores are tested and found to be comparatively insensitive to this desirable forecast quality. The ES is used to evaluate seasonal range ensemble forecasts made with the ECMWF System 4. The ensemble forecasts are found to be skilful when compared with climatological or persistence forecasts, though this skill is dependent on region and time of year.
Mulligan, Ivan; Boland, Mark; Payette, Justin
2012-07-01
Prospective cohort. To identify the prevalence of neurocognitive and balance deficits in collegiate football players 48 hours following competition. Neurocognitive testing, balance assessments, and subjective report of symptoms are a commonly used test battery in examining athletes when concussion is suspected. Previous literature suggests many concussions go unreported. Little research exists examining the prevalence of neurocognitive or balance deficits in athletes who do not report concussion-like symptoms to a health care provider. Forty-five Division IA collegiate football players participated in this study. Preseason baseline scores using the Balance Error Scoring System, the Immediate Post-Concussion Assessment and Cognitive Testing, and the Postconcussion Symptom Scale were compared to posttest results obtained 48 hours following a game. Prevalence of symptoms was analyzed and reported. Thirty-two (71%) of the 45 athletes tested demonstrated at least 1 deficit in either the Postconcussion Symptom Scale, Balance Error Scoring System, or at least 1 composite score of the Immediate Post-Concussion Assessment and Cognitive Testing. Nineteen of the 32 subjects demonstrated a change in 2 or more categories of neurocognitive and balance function. In a cohort of football players tested 48 hours following their last game of the season, who did not seek medical attention related to a concussion, a significant number demonstrated limitations in neurocognitive and balance performance, suggesting that further research may need to be performed to improve recognition of an athlete's deficits and to improve the ability to assess concussion. Differential diagnosis/symptom prevalence, level 3b.
Automated aortic calcium scoring on low-dose chest computed tomography
DOE Office of Scientific and Technical Information (OSTI.GOV)
Isgum, Ivana; Rutten, Annemarieke; Prokop, Mathias
Purpose: Thoracic computed tomography (CT) scans provide information about cardiovascular risk status. These scans are non-ECG synchronized, thus precise quantification of coronary calcifications is difficult. Aortic calcium scoring is less sensitive to cardiac motion, so it is an alternative to coronary calcium scoring as an indicator of cardiovascular risk. The authors developed and evaluated a computer-aided system for automatic detection and quantification of aortic calcifications in low-dose noncontrast-enhanced chest CT. Methods: The system was trained and tested on scans from participants of a lung cancer screening trial. A total of 433 low-dose, non-ECG-synchronized, noncontrast-enhanced 16 detector row examinations of themore » chest was randomly divided into 340 training and 93 test data sets. A first observer manually identified aortic calcifications on training and test scans. A second observer did the same on the test scans only. First, a multiatlas-based segmentation method was developed to delineate the aorta. Segmented volume was thresholded and potential calcifications (candidate objects) were extracted by three-dimensional connected component labeling. Due to image resolution and noise, in rare cases extracted candidate objects were connected to the spine. They were separated into a part outside and parts inside the aorta, and only the latter was further analyzed. All candidate objects were represented by 63 features describing their size, position, and texture. Subsequently, a two-stage classification with a selection of features and k-nearest neighbor classifiers was performed. Based on the detected aortic calcifications, total calcium volume score was determined for each subject. Results: The computer system correctly detected, on the average, 945 mm{sup 3} out of 965 mm{sup 3} (97.9%) calcified plaque volume in the aorta with an average of 64 mm{sup 3} of false positive volume per scan. Spearman rank correlation coefficient was {rho}=0.960 between the system and the first observer compared to {rho}=0.961 between the two observers. Conclusions: Automatic calcium scoring in the aorta thus appears feasible with good correlation between manual and automatic scoring.« less
Barisoni, Laura; Troost, Jonathan P; Nast, Cynthia; Bagnasco, Serena; Avila-Casado, Carmen; Hodgin, Jeffrey; Palmer, Matthew; Rosenberg, Avi; Gasim, Adil; Liensziewski, Chrysta; Merlino, Lino; Chien, Hui-Ping; Chang, Anthony; Meehan, Shane M; Gaut, Joseph; Song, Peter; Holzman, Lawrence; Gibson, Debbie; Kretzler, Matthias; Gillespie, Brenda W; Hewitt, Stephen M
2016-07-01
The multicenter Nephrotic Syndrome Study Network (NEPTUNE) digital pathology scoring system employs a novel and comprehensive methodology to document pathologic features from whole-slide images, immunofluorescence and ultrastructural digital images. To estimate inter- and intra-reader concordance of this descriptor-based approach, data from 12 pathologists (eight NEPTUNE and four non-NEPTUNE) with experience from training to 30 years were collected. A descriptor reference manual was generated and a webinar-based protocol for consensus/cross-training implemented. Intra-reader concordance for 51 glomerular descriptors was evaluated on jpeg images by seven NEPTUNE pathologists scoring 131 glomeruli three times (Tests I, II, and III), each test following a consensus webinar review. Inter-reader concordance of glomerular descriptors was evaluated in 315 glomeruli by all pathologists; interstitial fibrosis and tubular atrophy (244 cases, whole-slide images) and four ultrastructural podocyte descriptors (178 cases, jpeg images) were evaluated once by six and five pathologists, respectively. Cohen's kappa for inter-reader concordance for 48/51 glomerular descriptors with sufficient observations was moderate (0.40
Huang, Min H; Miller, Kara; Smith, Kristin; Fredrickson, Kayle; Shilling, Tracy
2016-01-01
Cancer is primarily a disease of older adults. About 77% of all cancers are diagnosed in persons aged 55 years and older. Cancer and its treatment can cause diverse sequelae impacting body systems underlying balance control. No study has examined the psychometric properties of balance assessment tools in older cancer survivors, presenting a significant challenge in the selection of outcome measures for clinicians treating this fast-growing population. This study aimed to determine the reliability, validity, and minimal detectable change (MDC) of the Balance Evaluation System Test (BESTest), Mini-Balance Evaluation Systems Test (Mini-BESTest), and Brief-Balance Evaluation Systems Test (Brief-BESTest) in community-dwelling older cancer survivors. This study was a cross-sectional design. Twenty breast and 8 prostate cancer survivors participated [age (SD) = 68.4 (8.13) years]. The BESTest and Activity-specific Balance Confidence (ABC) Scale were administered during the first session. Scores of Mini-BESTest and Brief-BESTest were extracted on the basis of the scores of BESTest. The BESTest was repeated within 1 to 2 weeks by the same rater to determine the test-retest reliability. For the analysis of the inter-rater reliability, 21 participants were randomly selected to be evaluated by 2 raters. A primary rater administered the test. The 2 raters independently and concurrently scored the performance of the participants. Each rater recorded the ratings separately on the scoring sheet. No discussion among the raters was allowed throughout the testing. Intraclass correlation coefficients (ICCs), standard error of measurement, minimal detectable change (MDC), and Bland-Altman plots were calculated. Concurrent validity of these balance tests with the ABC Scale was examined using the Spearman correlation. The BESTest, Mini-BESTest, and Brief-BESTest had high test-retest (ICC = 0.90-0.94) and interrater reliability (ICC = 0.86-0.96), small standard error of measurement (0.86-2.47 points), and MDC (2.39-6.86 points). The Bland-Altman plot revealed no systematic errors. The scores of BESTest, Mini-BEST, and Brief-BEST were correlated significantly with those of ABC Scale (P < .01), supporting their concurrent validity. The BESTest, Mini-BESTest, and Brief-BESTest showed high interrater and test-retest reliability, and excellent concurrent validity with the ABC Scale for community-dwelling cancer survivors aged 55 years and older who had completed cancer treatments for at least 3 months. Future studies are necessary to determine the predictive values for determining fall risks using balance assessment tools in older cancer survivors. Clinicians can utilize the BESTest and its short versions to evaluate balance problems in community-dwelling older cancer survivors and apply the established MDC to assess the intervention outcomes.
Kim, Yun Hak; Jeong, Dae Cheon; Pak, Kyoungjune; Goh, Tae Sik; Lee, Chi-Seung; Han, Myoung-Eun; Kim, Ji-Young; Liangwen, Liu; Kim, Chi Dae; Jang, Jeon Yeob; Cha, Wonjae; Oh, Sae-Ock
2017-09-29
Accurate prediction of prognosis is critical for therapeutic decisions regarding cancer patients. Many previously developed prognostic scoring systems have limitations in reflecting recent progress in the field of cancer biology such as microarray, next-generation sequencing, and signaling pathways. To develop a new prognostic scoring system for cancer patients, we used mRNA expression and clinical data in various independent breast cancer cohorts (n=1214) from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) and Gene Expression Omnibus (GEO). A new prognostic score that reflects gene network inherent in genomic big data was calculated using Network-Regularized high-dimensional Cox-regression (Net-score). We compared its discriminatory power with those of two previously used statistical methods: stepwise variable selection via univariate Cox regression (Uni-score) and Cox regression via Elastic net (Enet-score). The Net scoring system showed better discriminatory power in prediction of disease-specific survival (DSS) than other statistical methods (p=0 in METABRIC training cohort, p=0.000331, 4.58e-06 in two METABRIC validation cohorts) when accuracy was examined by log-rank test. Notably, comparison of C-index and AUC values in receiver operating characteristic analysis at 5 years showed fewer differences between training and validation cohorts with the Net scoring system than other statistical methods, suggesting minimal overfitting. The Net-based scoring system also successfully predicted prognosis in various independent GEO cohorts with high discriminatory power. In conclusion, the Net-based scoring system showed better discriminative power than previous statistical methods in prognostic prediction for breast cancer patients. This new system will mark a new era in prognosis prediction for cancer patients.
Kim, Yun Hak; Jeong, Dae Cheon; Pak, Kyoungjune; Goh, Tae Sik; Lee, Chi-Seung; Han, Myoung-Eun; Kim, Ji-Young; Liangwen, Liu; Kim, Chi Dae; Jang, Jeon Yeob; Cha, Wonjae; Oh, Sae-Ock
2017-01-01
Accurate prediction of prognosis is critical for therapeutic decisions regarding cancer patients. Many previously developed prognostic scoring systems have limitations in reflecting recent progress in the field of cancer biology such as microarray, next-generation sequencing, and signaling pathways. To develop a new prognostic scoring system for cancer patients, we used mRNA expression and clinical data in various independent breast cancer cohorts (n=1214) from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) and Gene Expression Omnibus (GEO). A new prognostic score that reflects gene network inherent in genomic big data was calculated using Network-Regularized high-dimensional Cox-regression (Net-score). We compared its discriminatory power with those of two previously used statistical methods: stepwise variable selection via univariate Cox regression (Uni-score) and Cox regression via Elastic net (Enet-score). The Net scoring system showed better discriminatory power in prediction of disease-specific survival (DSS) than other statistical methods (p=0 in METABRIC training cohort, p=0.000331, 4.58e-06 in two METABRIC validation cohorts) when accuracy was examined by log-rank test. Notably, comparison of C-index and AUC values in receiver operating characteristic analysis at 5 years showed fewer differences between training and validation cohorts with the Net scoring system than other statistical methods, suggesting minimal overfitting. The Net-based scoring system also successfully predicted prognosis in various independent GEO cohorts with high discriminatory power. In conclusion, the Net-based scoring system showed better discriminative power than previous statistical methods in prognostic prediction for breast cancer patients. This new system will mark a new era in prognosis prediction for cancer patients. PMID:29100405
Baron-Cohen, Simon; Richler, Jennifer; Bisarya, Dheraj; Gurunathan, Nhishanth; Wheelwright, Sally
2003-01-01
Systemizing is the drive to analyse systems or construct systems. A recent model of psychological sex differences suggests that this is a major dimension in which the sexes differ, with males being more drawn to systemize than females. Currently, there are no self-report measures to assess this important dimension. A second major dimension of sex differences is empathizing (the drive to identify mental states and respond to these with an appropriate emotion). Previous studies find females score higher on empathy measures. We report a new self-report questionnaire, the Systemizing Quotient (SQ), for use with adults of normal intelligence. It contains 40 systemizing items and 20 control items. On each systemizing item, a person can score 2, 1 or 0, so the SQ has a maximum score of 80 and a minimum of zero. In Study 1, we measured the SQ of n = 278 adults (114 males, 164 females) from a general population, to test for predicted sex differences (male superiority) in systemizing. All subjects were also given the Empathy Quotient (EQ) to test if previous reports of female superiority would be replicated. In Study 2 we employed the SQ and the EQ with n = 47 adults (33 males, 14 females) with Asperger syndrome (AS) or high-functioning autism (HFA), who are predicted to be either normal or superior at systemizing, but impaired at empathizing. Their scores were compared with n = 47 matched adults from the general population in Study 1. In Study 1, as predicted, normal adult males scored significantly higher than females on the SQ and significantly lower on the EQ. In Study 2, again as predicted, adults with AS/HFA scored significantly higher on the SQ than matched controls, and significantly lower on the EQ than matched controls. The SQ reveals both a sex difference in systemizing in the general population and an unusually strong drive to systemize in AS/HFA. These results are discussed in relation to two linked theories: the 'empathizing-systemizing' (E-S) theory of sex differences and the extreme male brain (EMB) theory of autism. PMID:12639333
Phantom evaluation of the effect of film processing on mammographic screen-film combinations.
McLean, D; Rickard, M T
1994-08-01
Mammographic image quality should be optimal for diagnosis, and the film contrast can be manipulated by altering development parameters. In this study phantom test objects were radiographed and processed for a given range of developer temperatures and times for four film-screen systems. Radiologists scored the phantom test objects on the resultant films to evaluate the effect on diagnosis of varying image contrast. While for three film-screen systems processing led to appreciable contrast differences, for only one film system did maximum contrast correspond with optimal phantom test object scoring. The inability to show an effect on diagnosis in all cases is possibly due to the variation in radiologist responses found in this study and in normal clinical circumstances. Other technical factors such as changes in film fog, grain and mottle may contribute to the study findings.
Pobocik, Tamara
2015-01-01
This quantitative research study used a pretest/posttest design and reviewed how an educational electronic documentation system helped nursing students to identify the accurate "related to" statement of the nursing diagnosis for the patient in the case study. Students in the sample population were senior nursing students in a bachelor of science nursing program in the northeastern United States. Two distinct groups were used for a control and intervention group. The intervention group used the educational electronic documentation system for three class assignments. Both groups were given a pretest and posttest case study. The Accuracy Tool was used to score the students' responses to the related to statement of a nursing diagnosis given at the end of the case study. The scores of the Accuracy Tool were analyzed, and then the numeric scores were placed in SPSS, and the paired t test scores were analyzed for statistical significance. The intervention group's scores were statistically different from the pretest scores to posttest scores, while the control group's scores remained the same from pretest to posttest. The recommendation to nursing education is to use the educational electronic documentation system as a teaching pedagogy to help nursing students prepare for nursing practice. © 2014 NANDA International, Inc.
Yan, Chong; Song, Jie; Pang, Song; Yi, Fangfang; Xi, Jianying; Zhou, Lei; Ding, Ding; Wang, Weifeng; Qiao, Kai; Zhao, Chongbo
2018-02-01
Repetitive nerve stimulation (RNS) is a valuable diagnostic method for myasthenia gravis (MG). However, its association with clinical severity was scarcely studied. We reviewed medical records and retrospectively enrolled 121 generalized MG patients. Sensitivity of different muscles to RNS and clinical scoring systems was evaluated. RNS testing revealed facial muscles have the highest positive rate, followed by proximal muscles and distal muscles, with the palpebral portion of the orbicularis oculi muscle most sensitive. Amplitude decrement of compound muscle action potential (CMAP) in the palpebral portion of the orbicularis oculi muscle is related to quantitative myasthenia gravis (QMG) scores, MG-specific manual muscle testing (MMT) scores and myasthenia gravis-related activities of daily living (MG-ADL) scores. We suggest that RNS testing of the palpebral portion of the orbicularis oculi muscle is a potential assessment indicator in patients with generalized MG. Copyright © 2017 Elsevier Ltd. All rights reserved.
Comparison of an expert system with other clinical scores for the evaluation of severity of asthma.
Gautier, V; Rédier, H; Pujol, J L; Bousquet, J; Proudhon, H; Michel, C; Daurès, J P; Michel, F B; Godard, P
1996-01-01
"Asthmaexpert" was produced at the special request of several clinicians in order to obtain a better understanding of the medical decisions taken by clinical experts in the management of asthmatic patients. In order to assess the severity of asthma, a new score called Artificial Intelligence score (AI score), produced by Asthmaexpert, was compared with three other scores (Aas, Hargreave and Brooks). One hundred patients were enrolled prospectively in the study during their first consultation in the out-patient clinic. Distribution of severity level according to the different scores was studied, and the reliability between AI and other scores was evaluated by Kappa and MacNemar tests. Correlations with functional parameters were performed. The AI score assessed higher levels of severity than the other scores (Kappa = 18, 28 and 10% for Aas, Hargreave and Brooks, respectively) with significant MacNemar test in all cases. There was a significant correlation between AI score and forced expiratory volume in one second (FEV1) (r = 0.73). These data indicate that the AI score is a severity score which defines higher levels of severity than the chosen scores. Correlations for functional parameters are good. This score appears easy to use for the first consultation of an asthmatic patient.
Prognostic score to predict mortality during TB treatment in TB/HIV co-infected patients.
Nguyen, Duc T; Jenkins, Helen E; Graviss, Edward A
2018-01-01
Estimating mortality risk during TB treatment in HIV co-infected patients is challenging for health professionals, especially in a low TB prevalence population, due to the lack of a standardized prognostic system. The current study aimed to develop and validate a simple mortality prognostic scoring system for TB/HIV co-infected patients. Using data from the CDC's Tuberculosis Genotyping Information Management System of TB patients in Texas reported from 01/2010 through 12/2016, age ≥15 years, HIV(+), and outcome being "completed" or "died", we developed and internally validated a mortality prognostic score using multiple logistic regression. Model discrimination was determined by the area under the receiver operating characteristic (ROC) curve (AUC). The model's good calibration was determined by a non-significant Hosmer-Lemeshow's goodness of fit test. Among the 450 patients included in the analysis, 57 (12.7%) died during TB treatment. The final prognostic score used six characteristics (age, residence in long-term care facility, meningeal TB, chest x-ray, culture positive, and culture not converted/unknown), which are routinely collected by TB programs. Prognostic scores were categorized into three groups that predicted mortality: low-risk (<20 points), medium-risk (20-25 points) and high-risk (>25 points). The model had good discrimination and calibration (AUC = 0.82; 0.80 in bootstrap validation), and a non-significant Hosmer-Lemeshow test p = 0.71. Our simple validated mortality prognostic scoring system can be a practical tool for health professionals in identifying TB/HIV co-infected patients with high mortality risk.
Gu, X; Fang, Z-M; Liu, Y; Lin, S-L; Han, B; Zhang, R; Chen, X
2014-01-01
Three-dimensional fluid-attenuated inversion recovery magnetic resonance imaging of the inner ear after intratympanic injection of gadolinium, together with magnetic resonance imaging scoring of the perilymphatic space, were used to investigate the positive identification rate of hydrops and determine the technique's diagnostic value for delayed endolymphatic hydrops. Twenty-five patients with delayed endolymphatic hydrops underwent pure tone audiometry, bithermal caloric testing, vestibular-evoked myogenic potential testing and three-dimensional magnetic resonance imaging of the inner ear after bilateral intratympanic injection of gadolinium. The perilymphatic space of the scanned images was analysed to investigate the positive identification rate of endolymphatic hydrops. According to the magnetic resonance imaging scoring of the perilymphatic space and the diagnostic standard, 84 per cent of the patients examined had endolymphatic hydrops. In comparison, the positive identification rates for vestibular-evoked myogenic potential and bithermal caloric testing were 52 per cent and 72 per cent respectively. Three-dimensional magnetic resonance imaging after intratympanic injection of gadolinium is valuable in the diagnosis of delayed endolymphatic hydrops and its classification. The perilymphatic space scoring system improved the diagnostic accuracy of magnetic resonance imaging.
Pollner, Péter; Horváth, Anna; Mezei, Tamás; Banczerowski, Péter; Czigléczki, Gábor
2018-04-01
Metastatic spinal diseases are common health problems and there is no consensus on the appropriate treatment of metastases in several conditions. Using clinical measures (e.g., survival time and functional status), prognosis prediction systems advise on the appropriate interventions. The aim of this article is to assess and compare 4 widely used scoring systems (revised Tokuhashi, Tomita, van der Linden, and modified Bauer scores) on a single-center cohort. A retrospective study was designed of 329 patients who were subjected to surgery because of metastatic spinal diseases. Subpopulations according to the classifications of the 4 scoring systems were identified. The overall survival was calculated with the Kaplan-Meier formula. The difference between the survival curves of subpopulations was analyzed with log-rank tests. The consistency rates for the 4 scoring systems are calculated as well. The follow-up period was 8 years. The median survival time was 222 days. The overall survival of prognostic categories in 3 scoring systems was significantly different from each other, but we found no differences between the categories of the van der Linden system. In this cohort, the revised Tokuhashi system gave the best approximation for survival, with a mean predictive capability 60.5%. The evaluation of 4 standard scoring systems showed that 3 were self-consistent, although none of systems was able to predict the survival in our cohort. Based on the predictive capability, the revised Tokuhashi system may provide the best predictions with careful examination of individual cases. Copyright © 2018 Elsevier Inc. All rights reserved.
Analysis of Lethality and Malformations During Zebrafish (Danio rerio) Development.
Raghunath, Azhwar; Perumal, Ekambaram
2018-01-01
The versatility offered by zebrafish (Danio rerio) makes it a powerful and an attractive vertebrate model in developmental toxicity and teratogenicity assays. Apart from the newly introduced chemicals as drugs, xenobiotics also induce abnormal developmental abnormalities and congenital malformations in living organisms. Over the recent decades, zebrafish embryo/larva has emerged as a potential tool to test teratogenicity potential of these chemicals. Zebrafish responds to compounds as mammals do as they share similarities in their development, metabolism, physiology, and signaling pathways with that of mammals. The methodology used by the different scientists varies enormously in the zebrafish embryotoxicity test. In this chapter, we present methods to assess lethality and malformations during zebrafish development. We propose two major malformations scoring systems: binomial and relative morphological scoring systems to assess the malformations in zebrafish embryos/larvae. Based on the scoring of the malformations, the test compound can be classified as a teratogen or a nonteratogen and its teratogenic potential is evaluated.
The impact of stroke on emotional intelligence
2010-01-01
Background Emotional intelligence (EI) is important for personal, social and career success and has been linked to the frontal anterior cingulate, insula and amygdala regions. Aim To ascertain which stroke lesion sites impair emotional intelligence and relation to current frontal assessment measurements. Methods One hundred consecutive, non aphasic, independently functioning patients post stroke were evaluated with the Bar-On emotional intelligence test, "known as the Emotional Quotient Inventory (EQ-i)" and frontal tests that included the Wisconsin Card Sorting Test (WCST) and Frontal Systems Behavioral Inventory (FRSBE) for correlational validity. The results of a screening, bedside frontal network syndrome test (FNS) and NIHSS to document neurological deficit were also recorded. Lesion location was determined by the Cerefy digital, coxial brain atlas. Results After exclusions (n = 8), patients tested (n = 92, mean age 50.1, CI: 52.9, 47.3 years) revealed that EQ-i scores were correlated (negatively) with all FRSBE T sub-scores (apathy, disinhibition, executive, total), with self-reported scores correlating better than family reported scores. Regression analysis revealed age and FRSBE total scores as the most influential variables. The WCST error percentage T score did not correlate with the EQ-i scores. Based on ANOVA, there were significant differences among the lesion sites with the lowest mean EQ-i scores associated with temporal (71.5) and frontal (87.3) lesions followed by subtentorial (91.7), subcortical gray (92.6) and white (95.2) matter, and the highest scores associated with parieto-occipital lesions (113.1). Conclusions 1) Stroke impairs EI and is associated with apathy, disinhibition and executive functioning. 2) EI is associated with frontal, temporal, subcortical and subtentorial stroke syndromes. PMID:21029468
An Interactive Software System for Computer-Assisted Testing
ERIC Educational Resources Information Center
Howze, Glenn
1978-01-01
This paper describes an interactive computer software system developed at Tuskegee Institute which is designed to allow flexibility in the development, administration, and scoring of examinations. (Author)
Pauchard, J-Y; Gehri, M; Vaudaux, B
2013-01-16
The McIsaac scoring system is a tool designed to predict the probability of streptococcal pharyngitis in children aged 3 to 17 years with a sore throat. Although it does not allow the physician to make the diagnosis of streptococcal pharyngitis, it enables to identify those children with a sore throat in whom rapid antigen detection tests have a good predictive value.
[Scores and stages in pneumology].
Kuhn, Max
2013-10-01
Useful scales and classifications for patients with pulmonary diseases are discussed. The modified Medical Research Council breathlessness scale (mMRC) is a measure of disability in lung patients. The GOLD classifications, the COPD-Assessment Test (CAT) and the BODE Index are important to classify the severity of COPD and to measure the disability of these patients. The Geneva score is a clinical prediction rule used in determining the pre-test probability of pulmonary embolism. The Pulmonary Embolism Severity Index (PESI) is a scoring system used to predict 30 day mortality in patients with pulmonary embolism. The Epworth Sleepiness Scale is intended to measure daytime sleepiness in patients with sleep apnea syndrome. The Asthma Controll Test (ACT) determines if asthma symptoms are well controlled.
Electronystagmography outcome and neuropsychological findings in tinnitus patients.
Jozefowicz-Korczynska, Magdalena; Ciechomska, Elzbieta Agata; Pajor, Anna Maria
2005-01-01
Because psychological aspects often are underscored in the generation of tinnitus, we assessed the neuropsychological status in our group of patients. We found an increased number of abnormal electronystagmography (ENG) recordings in tinnitus patients. The aim of this study was to compare the ENG outcome with the patients' neuropsychological status. We carried out the study on 69 subjects complaining of tinnitus and on 43 healthy persons. We performed clinical neurootological examinations and ENG tests on all patients. Neuropsychological evaluation was conducted by means of the Beck Depression Inventory (BDI), the Hospital Anxiety and Depression (HAD) test, the Mini Mental Status (MMS) test, and the Trail-Making Test (TMT). In 46 patients (66.6%), we found abnormal ENG outcomes (central, 42%; peripheral, 13.0%; mixed, 11.6%). Neuropsychological tests revealed abnormal scores: for the BDI, 43.5% of patients; for the HAD-A, 72.5%; for the HAD-D, 47.8%; for the MMS, 27.5%; and for the TMT, 55.1%. We did not find correlation between the ENG outcomes and neuropsychological test scores. We did not find correlation between the overall ENG outcomes and neuropsychological test scores, with one exception; we found the occurrence of abnormal neuropsychological test scores and the ENG outcome indicating central vestibular dysfunction. Our study showed that despite a high frequency of vestibular system dysfunction signs and a high incidence of abnormal neuropsychological test scores in tinnitus patients, only one correlation existed between these two results.
77 FR 68123 - Privacy Act of 1974; System of Records Notice
Federal Register 2010, 2011, 2012, 2013, 2014
2012-11-15
... test records, including registrant's first and last name, evaluation data, pretest and posttest scores..., pretest and posttest scores, and registration information, will be disclosed to accrediting bodies (such... educational information, training, best practices, and tools to health professionals as one initiative to help...
More Points for "Strivers": The New Affirmative Action?
ERIC Educational Resources Information Center
Gose, Ben
1999-01-01
Researchers at Educational Testing Service and elsewhere are devising methods that could help admissions officers measure educational disadvantage more systematically. One system identifies "strivers," any student scoring more than 200 points above the average score of peers with similar backgrounds, taking into account 14 variables such…
Corner, E J; Wood, H; Englebretsen, C; Thomas, A; Grant, R L; Nikoletou, D; Soni, N
2013-03-01
To develop a scoring system to measure physical morbidity in critical care - the Chelsea Critical Care Physical Assessment Tool (CPAx). The development process was iterative involving content validity indices (CVI), a focus group and an observational study of 33 patients to test construct validity against the Medical Research Council score for muscle strength, peak cough flow, Australian Therapy Outcome Measures score, Glasgow Coma Scale score, Bloomsbury sedation score, Sequential Organ Failure Assessment score, Short Form 36 (SF-36) score, days of mechanical ventilation and inter-rater reliability. Trauma and general critical care patients from two London teaching hospitals. Users of the CPAx felt that it possessed content validity, giving a final CVI of 1.00 (P<0.05). Construct validation data showed moderate to strong significant correlations between the CPAx score and all secondary measures, apart from the mental component of the SF-36 which demonstrated weak correlation with the CPAx score (r=0.024, P=0.720). Reliability testing showed internal consistency of α=0.798 and inter-rater reliability of κ=0.988 (95% confidence interval 0.791 to 1.000) between five raters. This pilot work supports proof of concept of the CPAx as a measure of physical morbidity in the critical care population, and is a cogent argument for further investigation of the scoring system. Copyright © 2012 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.
Factors contributing to speech perception scores in long-term pediatric cochlear implant users.
Davidson, Lisa S; Geers, Ann E; Blamey, Peter J; Tobey, Emily A; Brenner, Christine A
2011-02-01
The objectives of this report are to (1) describe the speech perception abilities of long-term pediatric cochlear implant (CI) recipients by comparing scores obtained at elementary school (CI-E, 8 to 9 yrs) with scores obtained at high school (CI-HS, 15 to 18 yrs); (2) evaluate speech perception abilities in demanding listening conditions (i.e., noise and lower intensity levels) at adolescence; and (3) examine the relation of speech perception scores to speech and language development over this longitudinal timeframe. All 112 teenagers were part of a previous nationwide study of 8- and 9-yr-olds (N = 181) who received a CI between 2 and 5 yrs of age. The test battery included (1) the Lexical Neighborhood Test (LNT; hard and easy word lists); (2) the Bamford Kowal Bench sentence test; (3) the Children's Auditory-Visual Enhancement Test; (4) the Test of Auditory Comprehension of Language at CI-E; (5) the Peabody Picture Vocabulary Test at CI-HS; and (6) the McGarr sentences (consonants correct) at CI-E and CI-HS. CI-HS speech perception was measured in both optimal and demanding listening conditions (i.e., background noise and low-intensity level). Speech perception scores were compared based on age at test, lexical difficulty of stimuli, listening environment (optimal and demanding), input mode (visual and auditory-visual), and language age. All group mean scores significantly increased with age across the two test sessions. Scores of adolescents significantly decreased in demanding listening conditions. The effect of lexical difficulty on the LNT scores, as evidenced by the difference in performance between easy versus hard lists, increased with age and decreased for adolescents in challenging listening conditions. Calculated curves for percent correct speech perception scores (LNT and Bamford Kowal Bench) and consonants correct on the McGarr sentences plotted against age-equivalent language scores on the Test of Auditory Comprehension of Language and Peabody Picture Vocabulary Test achieved asymptote at similar ages, around 10 to 11 yrs. On average, children receiving CIs between 2 and 5 yrs of age exhibited significant improvement on tests of speech perception, lipreading, speech production, and language skills measured between primary grades and adolescence. Evidence suggests that improvement in speech perception scores with age reflects increased spoken language level up to a language age of about 10 yrs. Speech perception performance significantly decreased with softer stimulus intensity level and with introduction of background noise. Upgrades to newer speech processing strategies and greater use of frequency-modulated systems may be beneficial for ameliorating performance under these demanding listening conditions.
Hammad, Shaza M.; El-Wassefy, Noha; Maher, Ahmed; Fawakerji, Shafik M.
2017-01-01
ABSTRACT Objective: To evaluate the effect of silica dioxide (SiO2) nanofillers in different bonding systems on shear bond strength (SBS) and mode of failure of orthodontic brackets at two experimental times. Methods: Ninety-six intact premolars were divided into four groups: A) Conventional acid-etch and primer Transbond XT; B) Transbond Plus self-etch primer; and two self-etch bonding systems reinforced with silica dioxide nanofiller at different concentrations: C) Futurabond DC at 1%; D) Optibond All-in-One at 7%. Each group was allocated into two subgroups (n = 12) according to experimental time (12 and 24 hours). SBS test was performed using a universal testing machine. ARI scores were determined under a stereomicroscope. Scanning electron microscopy (SEM) and transmission electron microscopy (TEM) were used to determine the size and distribution of nanofillers. One-way ANOVA was used to compare SBS followed by the post-hoc Tukey test. The chi-square test was used to evaluate ARI scores. Results: Mean SBS of Futurabond DC and Optibond All-in-One were significantly lower than conventional system, and there were no significant differences between means SBS obtained with all self-etch bonding systems used in the study. Lower ARI scores were found for Futurabond DC and Optibond All-in-One. There was no significant difference of SBS and ARI obtained at either time points for all bonding systems. Relative homogeneous distribution of the fillers was observed with the bonding systems. Conclusion: Two nanofilled systems revealed the lowest bond strengths, but still clinically acceptable and less adhesive was left on enamel. It is advisable not to load the brackets immediately to the maximum. PMID:28444018
Ma, Qing-Bian; Fu, Yuan-Wei; Feng, Lu; Zhai, Qiang-Rong; Liang, Yang; Wu, Meng; Zheng, Ya-An
2017-07-05
Since the 1980s, severity of illness scoring systems has gained increasing popularity in Intensive Care Units (ICUs). Physicians used them for predicting mortality and assessing illness severity in clinical trials. The objective of this study was to assess the performance of Simplified Acute Physiology Score 3 (SAPS 3) and its customized equation for Australasia (Australasia SAPS 3, SAPS 3 [AUS]) in predicting clinical prognosis and hospital mortality in emergency ICU (EICU). A retrospective analysis of the EICU including 463 patients was conducted between January 2013 and December 2015 in the EICU of Peking University Third Hospital. The worst physiological data of enrolled patients were collected within 24 h after admission to calculate SAPS 3 score and predicted mortality by regression equation. Discrimination between survivals and deaths was assessed by the area under the receiver operator characteristic curve (AUC). Calibration was evaluated by Hosmer-Lemeshow goodness-of-fit test through calculating the ratio of observed-to-expected numbers of deaths which is known as the standardized mortality ratio (SMR). A total of 463 patients were enrolled in the study, and the observed hospital mortality was 26.1% (121/463). The patients enrolled were divided into survivors and nonsurvivors. Age, SAPS 3 score, Acute Physiology and Chronic Health Evaluation Score II (APACHE II), and predicted mortality were significantly higher in nonsurvivors than survivors (P < 0.05 or P < 0.01). The AUC (95% confidence intervals [CI s]) for SAPS 3 score was 0.836 (0.796-0.876). The maximum of Youden's index, cutoff, sensitivity, and specificity of SAPS 3 score were 0.526%, 70.5 points, 66.9%, and 85.7%, respectively. The Hosmer-Lemeshow goodness-of-fit test for SAPS 3 demonstrated a Chi-square test score of 10.25, P = 0.33, SMR (95% CI) = 0.63 (0.52-0.76). The Hosmer-Lemeshow goodness-of-fit test for SAPS 3 (AUS) demonstrated a Chi-square test score of 9.55, P = 0.38, SMR (95% CI) = 0.68 (0.57-0.81). Univariate and multivariate analyses were conducted for biochemical variables that were probably correlated to prognosis. Eventually, blood urea nitrogen (BUN), albumin,lactate and free triiodothyronine (FT3) were selected as independent risk factors for predicting prognosis. The SAPS 3 score system exhibited satisfactory performance even superior to APACHE II in discrimination. In predicting hospital mortality, SAPS 3 did not exhibit good calibration and overestimated hospital mortality, which demonstrated that SAPS 3 needs improvement in the future.
ERIC Educational Resources Information Center
Wilcox, Timothy Eugene
2012-01-01
The purpose of this study was to determine if there were differences in MCT2 scores between students who attended a school district that used MSPMS and students who attended a school district that did not use MSPMS. The data for this study were archived and consisted of math and language arts MCT2 scores for two groups of students. The independent…
NASA Astrophysics Data System (ADS)
Gatot, D.; Mardia, A. I.
2018-03-01
Deep Vein Thrombosis (DVT) is the venous thrombus in lower limbs. Diagnosis is by using venography or ultrasound compression. However, these examinations are not available yet in some health facilities. Therefore many scoring systems are developed for the diagnosis of DVT. The scoring method is practical and safe to use in addition to efficacy, and effectiveness in terms of treatment and costs. The existing scoring systems are wells, caprini and padua score. There have been many studies comparing the accuracy of this score but not in Medan. Therefore, we are interested in comparative research of wells, capriniand padua score in Medan.An observational, analytical, case-control study was conducted to perform diagnostic tests on the wells, caprini and padua score to predict the risk of DVT. The study was at H. Adam Malik Hospital in Medan.From a total of 72 subjects, 39 people (54.2%) are men and the mean age are 53.14 years. Wells score, caprini score and padua score has a sensitivity of 80.6%; 61.1%, 50% respectively; specificity of 80.65; 66.7%; 75% respectively, and accuracy of 87.5%; 64.3%; 65.7% respectively.Wells score has better sensitivity, specificity and accuracy than caprini and padua score in diagnosing DVT.
Ghirardelli, Alyssa; Quinn, Valerie; Sugerman, Sharon
2011-01-01
To develop a retail grocery instrument with weighted scoring to be used as an indicator of the food environment. Twenty six retail food stores in low-income areas in California. Observational. Inter-rater reliability for grocery store survey instrument. Description of store scoring methodology weighted to emphasize availability of healthful food. Type A intra-class correlation coefficients (ICC) with absolute agreement definition or a κ test for measures using ranges as categories. Measures of availability and price of fruits and vegetables performed well in reliability testing (κ = 0.681-0.800). Items for vegetable quality were better than for fruit (ICC 0.708 vs 0.528). Kappa scores indicated low to moderate agreement (0.372-0.674) on external store marketing measures and higher scores for internal store marketing. "Next to" the checkout counter was more reliable than "within 6 feet." Health departments using the store scoring system reported it as the most useful communication of neighborhood findings. There was good reliability of the measures among the research pairs. The local store scores can show the need to bring in resources and to provide access to fruits and vegetables and other healthful food. Copyright © 2011 Society for Nutrition Education. Published by Elsevier Inc. All rights reserved.
Monte Carlo Approach for Reliability Estimations in Generalizability Studies.
ERIC Educational Resources Information Center
Dimitrov, Dimiter M.
A Monte Carlo approach is proposed, using the Statistical Analysis System (SAS) programming language, for estimating reliability coefficients in generalizability theory studies. Test scores are generated by a probabilistic model that considers the probability for a person with a given ability score to answer an item with a given difficulty…
Christofidis, Melany J; Hill, Andrew; Horswill, Mark S; Watson, Marcus O
2016-01-01
To systematically evaluate the impact of several design features on chart-users' detection of patient deterioration on observation charts with early-warning scoring-systems. Research has shown that observation chart design affects the speed and accuracy with which abnormal observations are detected. However, little is known about the contribution of individual design features to these effects. A 2 × 2 × 2 × 2 mixed factorial design, with data-recording format (drawn dots vs. written numbers), scoring-system integration (integrated colour-based system vs. non-integrated tabular system) and scoring-row placement (grouped vs. separate) varied within-participants and scores (present vs. absent) varied between-participants by random assignment. 205 novice chart-users, tested between March 2011-March 2014, completed 64 trials where they saw real patient data presented on an observation chart. Each participant saw eight cases (four containing abnormal observations) on each of eight designs (which represented a factorial combination of the within-participants variables). On each trial, they assessed whether any of the observations were physiologically abnormal, or whether all observations were normal. Response times and error rates were recorded for each design. Participants responded faster (scores present and absent) and made fewer errors (scores absent) using drawn-dot (vs. written-number) observations and an integrated colour-based (vs. non-integrated tabular) scoring-system. Participants responded faster using grouped (vs. separate) scoring-rows when scores were absent, but separate scoring-rows when scores were present. Our findings suggest that several individual design features can affect novice chart-users' ability to detect patient deterioration. More broadly, the study further demonstrates the need to evaluate chart designs empirically. © 2015 John Wiley & Sons Ltd.
Houssaini, Allal; Assoumou, Lambert; Miller, Veronica; Calvez, Vincent; Marcelin, Anne-Geneviève; Flandre, Philippe
2013-01-01
Background Several attempts have been made to determine HIV-1 resistance from genotype resistance testing. We compare scoring methods for building weighted genotyping scores and commonly used systems to determine whether the virus of a HIV-infected patient is resistant. Methods and Principal Findings Three statistical methods (linear discriminant analysis, support vector machine and logistic regression) are used to determine the weight of mutations involved in HIV resistance. We compared these weighted scores with known interpretation systems (ANRS, REGA and Stanford HIV-db) to classify patients as resistant or not. Our methodology is illustrated on the Forum for Collaborative HIV Research didanosine database (N = 1453). The database was divided into four samples according to the country of enrolment (France, USA/Canada, Italy and Spain/UK/Switzerland). The total sample and the four country-based samples allow external validation (one sample is used to estimate a score and the other samples are used to validate it). We used the observed precision to compare the performance of newly derived scores with other interpretation systems. Our results show that newly derived scores performed better than or similar to existing interpretation systems, even with external validation sets. No difference was found between the three methods investigated. Our analysis identified four new mutations associated with didanosine resistance: D123S, Q207K, H208Y and K223Q. Conclusions We explored the potential of three statistical methods to construct weighted scores for didanosine resistance. Our proposed scores performed at least as well as already existing interpretation systems and previously unrecognized didanosine-resistance associated mutations were identified. This approach could be used for building scores of genotypic resistance to other antiretroviral drugs. PMID:23555613
Sussman, Jeremy B; Wiitala, Wyndy L; Zawistowski, Matthew; Hofer, Timothy P; Bentley, Douglas; Hayward, Rodney A
2017-09-01
Accurately estimating cardiovascular risk is fundamental to good decision-making in cardiovascular disease (CVD) prevention, but risk scores developed in one population often perform poorly in dissimilar populations. We sought to examine whether a large integrated health system can use their electronic health data to better predict individual patients' risk of developing CVD. We created a cohort using all patients ages 45-80 who used Department of Veterans Affairs (VA) ambulatory care services in 2006 with no history of CVD, heart failure, or loop diuretics. Our outcome variable was new-onset CVD in 2007-2011. We then developed a series of recalibrated scores, including a fully refit "VA Risk Score-CVD (VARS-CVD)." We tested the different scores using standard measures of prediction quality. For the 1,512,092 patients in the study, the Atherosclerotic cardiovascular disease risk score had similar discrimination as the VARS-CVD (c-statistic of 0.66 in men and 0.73 in women), but the Atherosclerotic cardiovascular disease model had poor calibration, predicting 63% more events than observed. Calibration was excellent in the fully recalibrated VARS-CVD tool, but simpler techniques tested proved less reliable. We found that local electronic health record data can be used to estimate CVD better than an established risk score based on research populations. Recalibration improved estimates dramatically, and the type of recalibration was important. Such tools can also easily be integrated into health system's electronic health record and can be more readily updated.
Kambugu, Andrew; Thompson, Jennifer; Hakim, James; Tumukunde, Dinah; van Oosterhout, Joep J; Mwebaze, Raymond; Hoppe, Anne; Abach, James; Kwobah, Charles; Arenas-Pinto, Alejandro; Walker, Sarah A; Paton, Nicholas I
2016-04-15
To assess neurocognitive function at the first-line antiretroviral therapy failure and change on the second-line therapy. Randomized controlled trial was conducted in 5 sub-Saharan African countries. Patients failing the first-line therapy according to WHO criteria after >12 months on non-nucleoside reverse transcriptase inhibitors-based regimens were randomized to the second-line therapy (open-label) with lopinavir/ritonavir (400 mg/100 mg twice daily) plus either 2-3 clinician-selected nucleoside reverse transcriptase inhibitors, raltegravir, or as monotherapy after 12-week induction with raltegravir. Neurocognitive function was tested at baseline, weeks 48 and 96 using color trails tests 1 and 2, and the Grooved Pegboard test. Test results were converted to an average of the 3 individual test z-scores. A total of 1036 patients (90% of those >18 years enrolled at 13 evaluable sites) had valid baseline tests (58% women, median: 38 years, viral load: 65,000 copies per milliliter, CD4 count: 73 cells per cubic millimeter). Mean (SD) baseline z-score was -2.96 (1.74); lower baseline z-scores were independently associated with older age, lower body weight, higher viral load, lower hemoglobin, less education, fewer weekly working hours, previous central nervous system disease, and taking fluconazole (P < 0.05 in multivariable model). Z-score was increased by mean (SE) of +1.23 (0.04) after 96 weeks on the second-line therapy (P < 0.001; n = 915 evaluable), with no evidence of difference between the treatment arms (P = 0.35). Patients in sub-Saharan Africa failing the first-line therapy had low neurocognitive function test scores, but performance improved on the second-line therapy. Regimens with more central nervous system-penetrating drugs did not enhance neurocognitive recovery indicating this need not be a primary consideration in choosing a second-line regimen.
Miller, Delyana Ivanova; Talbot, Vincent; Gagnon, Michèle; Messier, Claude
2013-01-01
Interactive voice response (IVR) systems are computer programs, which interact with people to provide a number of services from business to health care. We examined the ability of an IVR system to administer and score a verbal fluency task (fruits) and the digit span forward and backward in 158 community dwelling people aged between 65 and 92 years of age (full scale IQ of 68–134). Only six participants could not complete all tasks mostly due to early technical problems in the study. Participants were also administered the Wechsler Intelligence Scale fourth edition (WAIS-IV) and Wechsler Memory Scale fourth edition subtests. The IVR system correctly recognized 90% of the fruits in the verbal fluency task and 93–95% of the number sequences in the digit span. The IVR system typically underestimated the performance of participants because of voice recognition errors. In the digit span, these errors led to the erroneous discontinuation of the test: however the correlation between IVR scoring and clinical scoring was still high (93–95%). The correlation between the IVR verbal fluency and the WAIS-IV Similarities subtest was 0.31. The correlation between the IVR digit span forward and backward and the in-person administration was 0.46. We discuss how valid and useful IVR systems are for neuropsychological testing in the elderly. PMID:23950755
Reed, Susan G.; Adibi, Shawn S.; Coover, Mullen; Gellin, Robert G.; Wahlquist, Amy E.; AbdulRahiman, Anitha; Hamil, Lindsey H.; Walji, Muhammad F.; O’Neill, Paula; Kalenderian, Elsbeth
2015-01-01
The Consortium for Oral Health Research and Informatics (COHRI) is leading the way in use of the Dental Diagnostic System (DDS) terminology in the axiUm electronic health record (EHR). This collaborative pilot study had two aims: 1) to investigate whether use of the DDS terms positively impacted predoctoral dental students’ critical thinking skills measured by the Health Sciences Reasoning Test (HSRT), and 2) to refine study protocols. The study design was a natural experiment with cross-sectional data collection using the HSRT for 15 classes (2013–17) of students at three dental schools. Characteristics of students who had been exposed to the DDS terms were compared with students who had not, and the differences were tested by t-tests or chi-square tests. Generalized linear models were used to evaluate the relationship between exposure and outcome on the overall critical thinking score. The results showed that exposure was significantly related to overall score (p=0.01), with not-exposed students having lower mean overall scores. This study thus demonstrated a positive impact of using the DDS terminology in an EHR on the critical thinking skills of predoctoral dental students in three COHRI schools as measured by their overall score on the HSRT. These preliminary findings support future research to further evaluate a proposed model of critical thinking in clinical dentistry. PMID:26034034
Reed, Susan G; Adibi, Shawn S; Coover, Mullen; Gellin, Robert G; Wahlquist, Amy E; AbdulRahiman, Anitha; Hamil, Lindsey H; Walji, Muhammad F; O'Neill, Paula; Kalenderian, Elsbeth
2015-06-01
The Consortium for Oral Health Research and Informatics (COHRI) is leading the way in use of the Dental Diagnostic System (DDS) terminology in the axiUm electronic health record (EHR). This collaborative pilot study had two aims: 1) to investigate whether use of the DDS terms positively impacted predoctoral dental students' critical thinking skills measured by the Health Sciences Reasoning Test (HSRT), and 2) to refine study protocols. The study design was a natural experiment with cross-sectional data collection using the HSRT for 15 classes (2013-17) of students at three dental schools. Characteristics of students who had been exposed to the DDS terms were compared with students who had not, and the differences were tested by t-tests or chi-square tests. Generalized linear models were used to evaluate the relationship between exposure and outcome on the overall critical thinking score. The results showed that exposure was significantly related to overall score (p=0.01), with not-exposed students having lower mean overall scores. This study thus demonstrated a positive impact of using the DDS terminology in an EHR on the critical thinking skills of predoctoral dental students in three COHRI schools as measured by their overall score on the HSRT. These preliminary findings support future research to further evaluate a proposed model of critical thinking in clinical dentistry.
40 CFR 51.365 - Data collection.
Code of Federal Regulations, 2012 CFR
2012-07-01
... test start time and the time final emission scores are determined; (6) Vehicle Identification Number... enforcement of an I/M program. The program shall gather test data on individual vehicles, as well as quality... equipment is required or those test procedures relying upon a vehicle's OBD system). (a) Test data. The goal...
40 CFR 51.365 - Data collection.
Code of Federal Regulations, 2013 CFR
2013-07-01
... test start time and the time final emission scores are determined; (6) Vehicle Identification Number... enforcement of an I/M program. The program shall gather test data on individual vehicles, as well as quality... equipment is required or those test procedures relying upon a vehicle's OBD system). (a) Test data. The goal...
Morrison, Philippa K; Harris, Patricia A; Maltin, Charlotte A; Grove-White, Dai; Argo, Caroline McG
2017-01-01
Anatomically distinct adipose tissues represent variable risks to metabolic health in man and some other mammals. Quantitative-imaging of internal adipose depots is problematic in large animals and associations between regional adiposity and health are poorly understood. This study aimed to develop and test a semi-quantitative system (EQUIFAT) which could be applied to regional adipose tissues. Anatomically-defined, photographic images of adipose depots (omental, mesenteric, epicardial, rump) were collected from 38 animals immediately post-mortem. Images were ranked and depot-specific descriptors were developed (1 = no fat visible; 5 = excessive fat present). Nuchal-crest and ventro-abdominal-retroperitoneal adipose depot depths (cm) were transformed to categorical 5 point scores. The repeatability and reliability of EQUIFAT was independently tested by 24 observers. When half scores were permitted, inter-observer agreement was substantial (average κw: mesenteric, 0.79; omental, 0.79; rump 0.61) or moderate (average κw; epicardial, 0.60). Intra-observer repeatability was tested by 8 observers on 2 occasions. Kappa analysis indicated perfect (omental and mesenteric) and substantial agreement (epicardial and rump) between attempts. A further 207 animals were evaluated ante-mortem (age, height, breed-type, gender, body condition score [BCS]) and again immediately post-mortem (EQUIFAT scores, carcass weight). Multivariable, random effect linear regression models were fitted (breed as random effect; BCS as outcome variable). Only height, carcass weight, omental and retroperitoneal EQUIFAT scores remained as explanatory variables in the final model. The EQUIFAT scores developed here demonstrate clear functional differences between regional adipose depots and future studies could be directed towards describing associations between adiposity and disease risk in surgical and post-mortem situations.
Ng, Alex L K; Choy, Bonnie N K; Chan, Tommy C Y; Wong, Ian Y H; Lai, Jimmy S M; Mok, Mo Yin
2017-07-01
To compare tear osmolarity (TO) and other dry eye parameters in rheumatoid arthritis (RA) patients with or without secondary Sjogren syndrome (sSS). Consecutive patients with RA were divided into a sSS group and no-sSS group using conventional diagnostic criteria by rheumatologists using symptomatology, Schirmer test score, and anti-Ro or anti-La autoantibody status. The TO, Ocular Surface Disease Index, dry eye disease (DED) parameters [such as tear breakup time (TBUT) and corneal staining score] and the systemic inflammatory markers [erythrocyte sedimentation rate (ESR) and C-reactive protein (CRP)] were compared. Correlation analyses between TO and the DED parameters and inflammatory markers were also performed. A total of 42 cases with mean age 54.8 ± 12.3 were included, with 12 patients (29%) having sSS and 30 (71%) without sSS. TO was increased in both groups (329 ± 20 and 319 ± 25 mOsm/L, respectively), but no statistically significant difference was found between the 2 groups (P = 0.126). RA with sSS had significantly shorter TBUT, higher corneal staining score, and ESR CRP levels (P < 0.05). TO did not correlate with the Schirmer test score, but had significant positive correlations with age, corneal staining score, ESR, and CRP levels, and a significant negative correlation with TBUT. TO was increased in RA patients with or without sSS. There was no significant correlation between TO and the Schirmer test score, and the physician could not use TO to diagnose sSS. However, TO correlated well with both DED parameters (TBUT and corneal staining score) and systemic inflammatory markers (ESR and CRP).
Morrison, Philippa K.; Harris, Patricia A.; Maltin, Charlotte A.; Grove-White, Dai; Argo, Caroline McG.
2017-01-01
Anatomically distinct adipose tissues represent variable risks to metabolic health in man and some other mammals. Quantitative-imaging of internal adipose depots is problematic in large animals and associations between regional adiposity and health are poorly understood. This study aimed to develop and test a semi-quantitative system (EQUIFAT) which could be applied to regional adipose tissues. Anatomically-defined, photographic images of adipose depots (omental, mesenteric, epicardial, rump) were collected from 38 animals immediately post-mortem. Images were ranked and depot-specific descriptors were developed (1 = no fat visible; 5 = excessive fat present). Nuchal-crest and ventro-abdominal-retroperitoneal adipose depot depths (cm) were transformed to categorical 5 point scores. The repeatability and reliability of EQUIFAT was independently tested by 24 observers. When half scores were permitted, inter-observer agreement was substantial (average κw: mesenteric, 0.79; omental, 0.79; rump 0.61) or moderate (average κw; epicardial, 0.60). Intra-observer repeatability was tested by 8 observers on 2 occasions. Kappa analysis indicated perfect (omental and mesenteric) and substantial agreement (epicardial and rump) between attempts. A further 207 animals were evaluated ante-mortem (age, height, breed-type, gender, body condition score [BCS]) and again immediately post-mortem (EQUIFAT scores, carcass weight). Multivariable, random effect linear regression models were fitted (breed as random effect; BCS as outcome variable). Only height, carcass weight, omental and retroperitoneal EQUIFAT scores remained as explanatory variables in the final model. The EQUIFAT scores developed here demonstrate clear functional differences between regional adipose depots and future studies could be directed towards describing associations between adiposity and disease risk in surgical and post-mortem situations. PMID:28296956
An EEG-Based Person Authentication System with Open-Set Capability Combining Eye Blinking Signals
Wu, Qunjian; Zeng, Ying; Zhang, Chi; Tong, Li; Yan, Bin
2018-01-01
The electroencephalogram (EEG) signal represents a subject’s specific brain activity patterns and is considered as an ideal biometric given its superior forgery prevention. However, the accuracy and stability of the current EEG-based person authentication systems are still unsatisfactory in practical application. In this paper, a multi-task EEG-based person authentication system combining eye blinking is proposed, which can achieve high precision and robustness. Firstly, we design a novel EEG-based biometric evoked paradigm using self- or non-self-face rapid serial visual presentation (RSVP). The designed paradigm could obtain a distinct and stable biometric trait from EEG with a lower time cost. Secondly, the event-related potential (ERP) features and morphological features are extracted from EEG signals and eye blinking signals, respectively. Thirdly, convolutional neural network and back propagation neural network are severally designed to gain the score estimation of EEG features and eye blinking features. Finally, a score fusion technology based on least square method is proposed to get the final estimation score. The performance of multi-task authentication system is improved significantly compared to the system using EEG only, with an increasing average accuracy from 92.4% to 97.6%. Moreover, open-set authentication tests for additional imposters and permanence tests for users are conducted to simulate the practical scenarios, which have never been employed in previous EEG-based person authentication systems. A mean false accepted rate (FAR) of 3.90% and a mean false rejected rate (FRR) of 3.87% are accomplished in open-set authentication tests and permanence tests, respectively, which illustrate the open-set authentication and permanence capability of our systems. PMID:29364848
An EEG-Based Person Authentication System with Open-Set Capability Combining Eye Blinking Signals.
Wu, Qunjian; Zeng, Ying; Zhang, Chi; Tong, Li; Yan, Bin
2018-01-24
The electroencephalogram (EEG) signal represents a subject's specific brain activity patterns and is considered as an ideal biometric given its superior forgery prevention. However, the accuracy and stability of the current EEG-based person authentication systems are still unsatisfactory in practical application. In this paper, a multi-task EEG-based person authentication system combining eye blinking is proposed, which can achieve high precision and robustness. Firstly, we design a novel EEG-based biometric evoked paradigm using self- or non-self-face rapid serial visual presentation (RSVP). The designed paradigm could obtain a distinct and stable biometric trait from EEG with a lower time cost. Secondly, the event-related potential (ERP) features and morphological features are extracted from EEG signals and eye blinking signals, respectively. Thirdly, convolutional neural network and back propagation neural network are severally designed to gain the score estimation of EEG features and eye blinking features. Finally, a score fusion technology based on least square method is proposed to get the final estimation score. The performance of multi-task authentication system is improved significantly compared to the system using EEG only, with an increasing average accuracy from 92.4% to 97.6%. Moreover, open-set authentication tests for additional imposters and permanence tests for users are conducted to simulate the practical scenarios, which have never been employed in previous EEG-based person authentication systems. A mean false accepted rate (FAR) of 3.90% and a mean false rejected rate (FRR) of 3.87% are accomplished in open-set authentication tests and permanence tests, respectively, which illustrate the open-set authentication and permanence capability of our systems.
Comprehensive Adult Student Assessment Systems Braille Reading Assessment: An Exploratory Study
ERIC Educational Resources Information Center
Posey, Virginia K.; Henderson, Barbara W.
2012-01-01
Introduction: This exploratory study determined whether transcribing selected test items on an adult life and work skills reading test into braille could maintain the same approximate scale-score range and maintain fitness within the item response theory model as used by the Comprehensive Adult Student Assessment Systems (CASAS) for developing…
Accountability Is More than a Test Score
ERIC Educational Resources Information Center
Turnipseed, Stephan; Darling-Hammond, Linda
2015-01-01
The number one quality business leaders look for in employees is creativity and yet the U.S. education system undermines the development of the higher-order skills that promote creativity by its dogged focus on multiple-choice tests. Stephan Turnipseed and Linda DarlingHammond discuss the kind of rich accountability system that will help students…
ERIC Educational Resources Information Center
Sahin, Alpaslan; Almus, Kadir; Willson, Victor
2017-01-01
This study examined the high schools' state tests performances in mathematics, reading, and science of an open-enrollment STEM-focused charter school system,Harmony Public Schools(HPS), between 2010 and 2013, and compared them with the performance of matched traditional public schools (TPS) in Texas. After propensity score matching, 12 HPS schools…
Stræde, Mia; Brabrand, Mikkel
2014-01-01
Clinical scores can be of aid to predict early mortality after admission to a medical admission unit. A developed scoring system needs to be externally validated to minimise the risk of the discriminatory power and calibration to be falsely elevated. We performed the present study with the objective of validating the Simple Clinical Score (SCS) and the HOTEL score, two existing risk stratification systems that predict mortality for medical patients based solely on clinical information, but not only vital signs. Pre-planned prospective observational cohort study. Danish 460-bed regional teaching hospital. We included 3046 consecutive patients from 2 October 2008 until 19 February 2009. 26 (0.9%) died within one calendar day and 196 (6.4%) died within 30 days. We calculated SCS for 1080 patients. We found an AUROC of 0.960 (95% confidence interval [CI], 0.932 to 0.988) for 24-hours mortality and 0.826 (95% CI, 0.774-0.879) for 30-day mortality, and goodness-of-fit test, χ(2) = 2.68 (10 degrees of freedom), P = 0.998 and χ(2) = 4.00, P = 0.947, respectively. We included 1470 patients when calculating the HOTEL score. Discriminatory power (AUROC) was 0.931 (95% CI, 0.901-0.962) for 24-hours mortality and goodness-of-fit test, χ(2) = 5.56 (10 degrees of freedom), P = 0.234. We find that both the SCS and HOTEL scores showed an excellent to outstanding ability in identifying patients at high risk of dying with good or acceptable precision.
ERIC Educational Resources Information Center
Arcuino, Cathy Lee T.
2013-01-01
The purpose of this study was to examine if the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) are related to academic success defined by final cumulative grade point average (GPA). The data sample, from three Midwestern universities, was comprised of international graduate students who…
Timely diagnosis of dairy calf respiratory disease using a standardized scoring system.
McGuirk, Sheila M; Peek, Simon F
2014-12-01
Respiratory disease of young dairy calves is a significant cause of morbidity, mortality, economic loss, and animal welfare concern but there is no gold standard diagnostic test for antemortem diagnosis. Clinical signs typically used to make a diagnosis of respiratory disease of calves are fever, cough, ocular or nasal discharge, abnormal breathing, and auscultation of abnormal lung sounds. Unfortunately, routine screening of calves for respiratory disease on the farm is rarely performed and until more comprehensive, practical and affordable respiratory disease-screening tools such as accelerometers, pedometers, appetite monitors, feed consumption detection systems, remote temperature recording devices, radiant heat detectors, electronic stethoscopes, and thoracic ultrasound are validated, timely diagnosis of respiratory disease can be facilitated using a standardized scoring system. We have developed a scoring system that attributes severity scores to each of four clinical parameters; rectal temperature, cough, nasal discharge, ocular discharge or ear position. A total respiratory score of five points or higher (provided that at least two abnormal parameters are observed) can be used to distinguish affected from unaffected calves. This can be applied as a screening tool twice-weekly to identify pre-weaned calves with respiratory disease thereby facilitating early detection. Coupled with effective treatment protocols, this scoring system will reduce post-weaning pneumonia, chronic pneumonia, and otitis media.
Sleeper, Mark D; Kenyon, Lisa K; Elliott, James M; Cheng, M Samuel
2016-12-01
Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts' USA-Gymnastics competitive level to calculate the coefficient of determination (r 2 ). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. The relationship between total MGFMT scores and subjects' current USA-Gymnastics competitive level was found to be good (r 2 = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level 3.
Rostami, Reza; Sadeghi, Vahid; Zarei, Jamileh; Haddadi, Parvaneh; Mohazzab-Torabi, Saman; Salamati, Payman
2013-04-01
The aim of this study was to compare the Persian version of the wechsler intelligence scale for children - fourth edition (WISC-IV) and cognitive assessment system (CAS) tests, to determine the correlation between their scales and to evaluate the probable concurrent validity of these tests in patients with learning disorders. One-hundered-sixty-two children with learning disorder who were presented at Atieh Comprehensive Psychiatry Center were selected in a consecutive non-randomized order. All of the patients were assessed based on WISC-IV and CAS scores questionnaires. Pearson correlation coefficient was used to analyze the correlation between the data and to assess the concurrent validity of the two tests. Linear regression was used for statistical modeling. The type one error was considered 5% in maximum. There was a strong correlation between total score of WISC-IV test and total score of CAS test in the patients (r=0.75, P<0.001). The correlations among the other scales were mostly high and all of them were statistically significant (P<0.001). A linear regression model was obtained (α = 0.51, β = 0.81 and P<0.001). There is an acceptable correlation between the WISC-IV scales and CAS test in children with learning disorders. A concurrent validity is established between the two tests and their scales.
Rostami, Reza; Sadeghi, Vahid; Zarei, Jamileh; Haddadi, Parvaneh; Mohazzab-Torabi, Saman; Salamati, Payman
2013-01-01
Objective The aim of this study was to compare the Persian version of the wechsler intelligence scale for children - fourth edition (WISC-IV) and cognitive assessment system (CAS) tests, to determine the correlation between their scales and to evaluate the probable concurrent validity of these tests in patients with learning disorders. Methods One-hundered-sixty-two children with learning disorder who were presented at Atieh Comprehensive Psychiatry Center were selected in a consecutive non-randomized order. All of the patients were assessed based on WISC-IV and CAS scores questionnaires. Pearson correlation coefficient was used to analyze the correlation between the data and to assess the concurrent validity of the two tests. Linear regression was used for statistical modeling. The type one error was considered 5% in maximum. Findings There was a strong correlation between total score of WISC-IV test and total score of CAS test in the patients (r=0.75, P<0.001). The correlations among the other scales were mostly high and all of them were statistically significant (P<0.001). A linear regression model was obtained (α = 0.51, β = 0.81 and P<0.001). Conclusion There is an acceptable correlation between the WISC-IV scales and CAS test in children with learning disorders. A concurrent validity is established between the two tests and their scales. PMID:23724180
Bacci, Elizabeth D.; Leidy, Nancy K.; Poon, Jiat-Ling; Stringer, Sonja; Memoli, Matthew J.; Han, Alison; Fairchok, Mary P.; Coles, Christian; Owens, Jackie; Chen, Wei-Ju; Arnold, John C.; Danaher, Patrick J.; Lalani, Tahaniyat; Burgess, Timothy H.; Millar, Eugene V.; Ridore, Michelande; Hernández, Andrés; Rodríguez-Zulueta, Patricia; Ortega-Gallegos, Hilda; Galindo-Fraga, Arturo; Ruiz-Palacios, Guillermo M.; Pett, Sarah; Fischer, William; Gillor, Daniel; Moreno Macias, Laura; DuVal, Anna; Rothman, Richard; Dugas, Andrea; Guerrero, M. Lourdes
2018-01-01
Background The inFLUenza Patient Reported Outcome (FLU-PRO) measure is a daily diary assessing signs/symptoms of influenza across six body systems: Nose, Throat, Eyes, Chest/Respiratory, Gastrointestinal, Body/Systemic, developed and tested in adults with influenza. Objectives This study tested the reliability, validity, and responsiveness of FLU-PRO scores in adults with influenza-like illness (ILI). Methods Data from the prospective, observational study used to develop and test the FLU-PRO in influenza virus positive patients were analyzed. Adults (≥18 years) presenting with influenza symptoms in outpatient settings in the US, UK, Mexico, and South America were enrolled, tested for influenza virus, and asked to complete the 37-item draft FLU-PRO daily for up to 14-days. Analyses were performed on data from patients testing negative. Reliability of the final, 32-item FLU-PRO was estimated using Cronbach’s alpha (α; Day 1) and intraclass correlation coefficients (ICC; 2-day reproducibility). Convergent and known-groups validity were assessed using patient global assessments of influenza severity (PGA). Patient report of return to usual health was used to assess responsiveness (Day 1–7). Results The analytical sample included 220 ILI patients (mean age = 39.3, 64.1% female, 88.6% white). Sixty-one (28%) were hospitalized at some point in their illness. Internal consistency reliability (α) of FLU-PRO Total score was 0.90 and ranged from 0.72–0.86 for domain scores. Reproducibility (Day 1–2) was 0.64 for Total, ranging from 0.46–0.78 for domain scores. Day 1 FLU-PRO scores correlated (≥0.30) with the PGA (except Gastrointestinal) and were significantly different across PGA severity groups (Total: F = 81.7, p<0.001; subscales: F = 6.9–62.2; p<0.01). Mean score improvements Day 1–7 were significantly greater in patients reporting return to usual health compared with those who did not (p<0.05, Total and subscales, except Gastrointestinal and Eyes). Conclusions Results suggest FLU-PRO scores are reliable, valid, and responsive in adults with influenza-like illness. PMID:29566007
Multilingual Data Selection for Low Resource Speech Recognition
2016-09-12
Figure 1: Identification of language clusters using scores from an LID system training languages used in the Base and OP1 evaluation periods of the Babel...the posterior scores over frames. For a set of languages that are used to train the lan- guage identification (LID) network, pairs of languages that...which are combined during test time to produce 10 dimensional language 3854 Figure 3: Identification of language clusters using scores from individually
Building an Evaluation Scale using Item Response Theory.
Lalor, John P; Wu, Hao; Yu, Hong
2016-11-01
Evaluation of NLP methods requires testing against a previously vetted gold-standard test set and reporting standard metrics (accuracy/precision/recall/F1). The current assumption is that all items in a given test set are equal with regards to difficulty and discriminating power. We propose Item Response Theory (IRT) from psychometrics as an alternative means for gold-standard test-set generation and NLP system evaluation. IRT is able to describe characteristics of individual items - their difficulty and discriminating power - and can account for these characteristics in its estimation of human intelligence or ability for an NLP task. In this paper, we demonstrate IRT by generating a gold-standard test set for Recognizing Textual Entailment. By collecting a large number of human responses and fitting our IRT model, we show that our IRT model compares NLP systems with the performance in a human population and is able to provide more insight into system performance than standard evaluation metrics. We show that a high accuracy score does not always imply a high IRT score, which depends on the item characteristics and the response pattern.
Building an Evaluation Scale using Item Response Theory
Lalor, John P.; Wu, Hao; Yu, Hong
2016-01-01
Evaluation of NLP methods requires testing against a previously vetted gold-standard test set and reporting standard metrics (accuracy/precision/recall/F1). The current assumption is that all items in a given test set are equal with regards to difficulty and discriminating power. We propose Item Response Theory (IRT) from psychometrics as an alternative means for gold-standard test-set generation and NLP system evaluation. IRT is able to describe characteristics of individual items - their difficulty and discriminating power - and can account for these characteristics in its estimation of human intelligence or ability for an NLP task. In this paper, we demonstrate IRT by generating a gold-standard test set for Recognizing Textual Entailment. By collecting a large number of human responses and fitting our IRT model, we show that our IRT model compares NLP systems with the performance in a human population and is able to provide more insight into system performance than standard evaluation metrics. We show that a high accuracy score does not always imply a high IRT score, which depends on the item characteristics and the response pattern.1 PMID:28004039
Understanding Elementary Astronomy by Making Drawing-Based Models
NASA Astrophysics Data System (ADS)
van Joolingen, W. R.; Aukes, Annika V. A.; Gijlers, H.; Bollen, L.
2015-04-01
Modeling is an important approach in the teaching and learning of science. In this study, we attempt to bring modeling within the reach of young children by creating the SimSketch modeling system, which is based on freehand drawings that can be turned into simulations. This system was used by 247 children (ages ranging from 7 to 15) to create a drawing-based model of the solar system. The results show that children in the target age group are capable of creating a drawing-based model of the solar system and can use it to show the situations in which eclipses occur. Structural equation modeling predicting post-test knowledge scores based on learners' pre-test knowledge scores, the quality of their drawings and motivational aspects yielded some evidence that such drawing contributes to learning. Consequences for using modeling with young children are considered.
Sawle, Leanne; Freeman, Jennifer; Marsden, Jonathan
2017-04-01
Balance is a complex construct, affected by multiple components such as strength and co-ordination. However, whilst assessing an athlete's dynamic balance is an important part of clinical examination, there is no gold standard measure. The multiple single-leg hop-stabilization test is a functional test which may offer a method of evaluating the dynamic attributes of balance, but it needs to show adequate intra-tester reliability. The purpose of this study was to assess the intra-rater reliability of a dynamic balance test, the multiple single-leg hop-stabilization test on the dominant and non-dominant legs. Intra-rater reliability study. Fifteen active participants were tested twice with a 10-minute break between tests. The outcome measure was the multiple single-leg hop-stabilization test score, based on a clinically assessed numerical scoring system. Results were analysed using an Intraclass Correlations Coefficient (ICC 2,1 ) and Bland-Altman plots. Regression analyses explored relationships between test scores, leg dominance, age and training (an alpha level of p = 0.05 was selected). ICCs for intra-rater reliability were 0.85 for the dominant and non-dominant legs (confidence intervals = 0.62-0.95 and 0.61-0.95 respectively). Bland-Altman plots showed scores within two standard deviations. A significant correlation was observed between the dominant and non-dominant leg on balance scores (R 2 =0.49, p<0.05), and better balance was associated with younger participants in their non-dominant leg (R 2 =0.28, p<0.05) and their dominant leg (R 2 =0.39, p<0.05), and a higher number of hours spent training for the non-dominant leg R 2 =0.37, p<0.05). The multiple single-leg hop-stabilisation test demonstrated strong intra-tester reliability with active participants. Younger participants who trained more, have better balance scores. This test may be a useful measure for evaluating the dynamic attributes of balance. 3.
How White Teachers Experience and Think about Race in Professional Development
ERIC Educational Resources Information Center
Marcy, Renee
2010-01-01
The public educational system in the United States fails to proficiently educate a majority of African American, Latino/a, and students from low-income backgrounds. Test score statistics show an average scaled score gap of twenty-six points between African American and White students (National Center for Education Statistics, 2007). The term…
Automated Essay Scoring: Psychometric Guidelines and Practices
ERIC Educational Resources Information Center
Ramineni, Chaitanya; Williamson, David M.
2013-01-01
In this paper, we provide an overview of psychometric procedures and guidelines Educational Testing Service (ETS) uses to evaluate automated essay scoring for operational use. We briefly describe the e-rater system, the procedures and criteria used to evaluate e-rater, implications for a range of potential uses of e-rater, and directions for…
Dexterity and Bench Assembly Work Productivity in Adults with Mild Mental Retardation.
ERIC Educational Resources Information Center
Serr, Russell; And Others
1994-01-01
This study compared dexterity scores using the Vocational Transit Test System and bench assembly work productivity in 30 adults with mild mental retardation. Moderately high correlations were found between work output and motor coordination, manual dexterity, finger dexterity (with and without assembly), and total dexterity score. Finger dexterity…
Efficiency in the Community College Sector: Stochastic Frontier Analysis
ERIC Educational Resources Information Center
Agasisti, Tommaso; Belfield, Clive
2017-01-01
This paper estimates technical efficiency scores across the community college sector in the United States. Using stochastic frontier analysis and data from the Integrated Postsecondary Education Data System for 2003-2010, we estimate efficiency scores for 950 community colleges and perform a series of sensitivity tests to check for robustness. We…
Development of a computed tomography-based scoring system for necrotizing soft-tissue infections.
McGillicuddy, Edward A; Lischuk, Andrew W; Schuster, Kevin M; Kaplan, Lewis J; Maung, Adrian; Lui, Felix Y; Bokhari, S A Jamal; Davis, Kimberly A
2011-04-01
Necrotizing soft-tissue infections (NSTIs) are associated with significant morbidity and mortality, but a definitive nonsurgical diagnostic test remains elusive. Despite the widespread use of computed tomography (CT) as a diagnostic adjunct, there is little data that definitively correlate CT findings with the presence of NSTI. Our goal was the development of a CT-based scoring system to discriminate non-NSTI from NSTI. Patients older than 17 years undergoing CT for evaluation of soft-tissue infection at a tertiary care medical center over a 10-year period (2000-2009) were included. Abstracted data included comorbidities and social history, physical examination, laboratory findings, and operative and pathologic findings. NSTI was defined as soft-tissue necrosis in the dictated operative note or the accompanying pathology report. CT scans were reviewed by a radiologist blinded to clinical and laboratory data. A scoring system was developed and the area under the receiver operating characteristic curve was calculated. During the study period, 305 patients underwent CT scanning (57% men; mean age, 47.4 years). Forty-four patients (14.4%) evaluated had an NSTI. A scoring system was retrospectively developed (table). A score >6 points was 86.3% sensitive and 91.5% specific for the diagnosis of NSTI (positive predictive value, 63.3%; negative predictive value, 85.5%). The area under the receiver operating characteristic curve was 0.928 (95% confidence interval, 0.893-0.964). The mean score of the non-NSTI group was 2.74. We have developed a CT scoring system that is both sensitive and specific for the diagnosis of NSTIs. This system may allow clinicians to more accurately diagnose NSTIs. Prospective validation of this scoring system is planned.
Krecsák, László; Micsik, Tamás; Kiszler, Gábor; Krenács, Tibor; Szabó, Dániel; Jónás, Viktor; Császár, Gergely; Czuni, László; Gurzó, Péter; Ficsor, Levente; Molnár, Béla
2011-01-18
The immunohistochemical detection of estrogen (ER) and progesterone (PR) receptors in breast cancer is routinely used for prognostic and predictive testing. Whole slide digitalization supported by dedicated software tools allows quantization of the image objects (e.g. cell membrane, nuclei) and an unbiased analysis of immunostaining results. Validation studies of image analysis applications for the detection of ER and PR in breast cancer specimens provided strong concordance between the pathologist's manual assessment of slides and scoring performed using different software applications. The effectiveness of two connected semi-automated image analysis software (NuclearQuant v. 1.13 application for Pannoramic™ Viewer v. 1.14) for determination of ER and PR status in formalin-fixed paraffin embedded breast cancer specimens immunostained with the automated Leica Bond Max system was studied. First the detection algorithm was calibrated to the scores provided an independent assessors (pathologist), using selected areas from 38 small digital slides (created from 16 cases) containing a mean number of 195 cells. Each cell was manually marked and scored according to the Allred-system combining frequency and intensity scores. The performance of the calibrated algorithm was tested on 16 cases (14 invasive ductal carcinoma, 2 invasive lobular carcinoma) against the pathologist's manual scoring of digital slides. The detection was calibrated to 87 percent object detection agreement and almost perfect Total Score agreement (Cohen's kappa 0.859, quadratic weighted kappa 0.986) from slight or moderate agreement at the start of the study, using the un-calibrated algorithm. The performance of the application was tested against the pathologist's manual scoring of digital slides on 53 regions of interest of 16 ER and PR slides covering all positivity ranges, and the quadratic weighted kappa provided almost perfect agreement (κ = 0.981) among the two scoring schemes. NuclearQuant v. 1.13 application for Pannoramic™ Viewer v. 1.14 software application proved to be a reliable image analysis tool for pathologists testing ER and PR status in breast cancer.
Debris Evaluation after Root Canal Shaping with Rotating and Reciprocating Single-File Systems
Dagna, Alberto; Gastaldo, Giulia; Beltrami, Riccardo; Poggio, Claudio
2016-01-01
This study evaluated the root canal dentine surface by scanning electron microscope (SEM) after shaping with two reciprocating single-file NiTi systems and two rotating single-file NiTi systems, in order to verify the presence/absence of the smear layer and the presence/absence of open tubules along the walls of each sample; Forty-eight single-rooted teeth were divided into four groups and shaped with OneShape (OS), F6 SkyTaper (F6), WaveOne (WO) and Reciproc and irrigated using 5.25% NaOCl and 17% EDTA. Root canal walls were analyzed by SEM at a standard magnification of 2500×. The presence/absence of the smear layer and the presence/absence of open tubules at the coronal, middle, and apical third of each canal were estimated using a five-step scale for scores. Numeric data were analyzed using Kruskal-Wallis and Mann-Whitney U statistical tests and significance was predetermined at P < 0.05; The Kruskal-Wallis ANOVA for debris score showed significant differences among the NiTi systems (P < 0.05). The Mann-Whitney test confirmed that reciprocating systems presented significantly higher score values than rotating files. The same results were assessed considering the smear layer scores. ANOVA confirmed that the apical third of the canal maintained a higher quantity of debris and smear layer after preparation of all the samples; Single-use NiTi systems used in continuous rotation appeared to be more effective than reciprocating instruments in leaving clean walls. The reciprocating systems produced more debris and smear layer than rotating instruments. PMID:27763503
Earlam, S; Glover, C; Davies, M; Fordy, C; Allen-Mersh, T G
1997-05-01
Since systemic and regional (HAI) fluorinated pyrimidine chemotherapies offer similar survival benefit in treatment of colorectal liver metastases (CLM), we sought to identify their impact on quality of life (QoL), which might be a useful indicator of treatment preference. We compared QoL in 135 CLM patients managed by symptom control (n = 49 patients), systemic fluorouracil (5FU)/folinic acid (n = 35), or hepatic arterial floxuridine (FUDR) (n = 51). Full blood count and liver function tests, World Health Organization (WHO) toxicity criteria, and QoL (Rotterdam Symptom Checklist [RSC], the Sickness Impact Profile [SIP], and the Hospital Anxiety and Depression scale [HAD]) were measured monthly in all patients. The HAD anxiety score was significantly increased in symptom control compared with chemotherapy patients 1 month after randomization. There was a significant increase in RSC physical score (repeated measures, P = .05), and in scores for sore mouth (P < .01), dry mouth (P < .01), and tingling hands and feet (P < .01) in systemic chemotherapy compared with symptom control patients. Significant QoL differences (repeated measures and Mann-Whitney U [MWU]) between HAI and symptom control patients were not detected. Systemic chemotherapy patients lived for significantly longer (log-rank test, P < or = .0001) with abnormal HAD anxiety, RSC psychosocial, or RSC sore mouth scores compared with HAI patients, but there were no overall survival differences. Randomization to symptom control only was associated with increased anxiety. QoL with systemic chemotherapy was impaired by side effects. HAI was associated with similar survival to systemic chemotherapy but with better sustained QoL.
Standardizing an approach to the evaluation of implementation science proposals.
Crable, Erika L; Biancarelli, Dea; Walkey, Allan J; Allen, Caitlin G; Proctor, Enola K; Drainoni, Mari-Lynn
2018-05-29
The fields of implementation and improvement sciences have experienced rapid growth in recent years. However, research that seeks to inform health care change may have difficulty translating core components of implementation and improvement sciences within the traditional paradigms used to evaluate efficacy and effectiveness research. A review of implementation and improvement sciences grant proposals within an academic medical center using a traditional National Institutes of Health framework highlighted the need for tools that could assist investigators and reviewers in describing and evaluating proposed implementation and improvement sciences research. We operationalized existing recommendations for writing implementation science proposals as the ImplemeNtation and Improvement Science Proposals Evaluation CriTeria (INSPECT) scoring system. The resulting system was applied to pilot grants submitted to a call for implementation and improvement science proposals at an academic medical center. We evaluated the reliability of the INSPECT system using Krippendorff's alpha coefficients and explored the utility of the INSPECT system to characterize common deficiencies in implementation research proposals. We scored 30 research proposals using the INSPECT system. Proposals received a median cumulative score of 7 out of a possible score of 30. Across individual elements of INSPECT, proposals scored highest for criteria rating evidence of a care or quality gap. Proposals generally performed poorly on all other criteria. Most proposals received scores of 0 for criteria identifying an evidence-based practice or treatment (50%), conceptual model and theoretical justification (70%), setting's readiness to adopt new services/treatment/programs (54%), implementation strategy/process (67%), and measurement and analysis (70%). Inter-coder reliability testing showed excellent reliability (Krippendorff's alpha coefficient 0.88) for the application of the scoring system overall and demonstrated reliability scores ranging from 0.77 to 0.99 for individual elements. The INSPECT scoring system presents a new scoring criteria with a high degree of inter-rater reliability and utility for evaluating the quality of implementation and improvement sciences grant proposals.
How Have State Level Standards-Based Tests Related to Norm-Referenced Tests in Alaska?.
ERIC Educational Resources Information Center
Fenton, Ray
This overview of the Alaska system for test development, scoring, and reporting explored differences and similarities between norm-referenced and standards-based tests. The current Alaska testing program is based on legislation passed in 1997 and 1998, and is designed to meet the requirements of the federal No Child Left Behind Legislation. In…
NASA Astrophysics Data System (ADS)
Wei, Jun; Sahiner, Berkman; Hadjiiski, Lubomir M.; Chan, Heang-Ping; Helvie, Mark A.; Roubidoux, Marilyn A.; Zhou, Chuan; Ge, Jun; Zhang, Yiheng
2006-03-01
We are developing a two-view information fusion method to improve the performance of our CAD system for mass detection. Mass candidates on each mammogram were first detected with our single-view CAD system. Potential object pairs on the two-view mammograms were then identified by using the distance between the object and the nipple. Morphological features, Hessian feature, correlation coefficients between the two paired objects and texture features were used as input to train a similarity classifier that estimated a similarity scores for each pair. Finally, a linear discriminant analysis (LDA) classifier was used to fuse the score from the single-view CAD system and the similarity score. A data set of 475 patients containing 972 mammograms with 475 biopsy-proven masses was used to train and test the CAD system. All cases contained the CC view and the MLO or LM view. We randomly divided the data set into two independent sets of 243 cases and 232 cases. The training and testing were performed using the 2-fold cross validation method. The detection performance of the CAD system was assessed by free response receiver operating characteristic (FROC) analysis. The average test FROC curve was obtained from averaging the FP rates at the same sensitivity along the two corresponding test FROC curves from the 2-fold cross validation. At the case-based sensitivities of 90%, 85% and 80% on the test set, the single-view CAD system achieved an FP rate of 2.0, 1.5, and 1.2 FPs/image, respectively. With the two-view fusion system, the FP rates were reduced to 1.7, 1.3, and 1.0 FPs/image, respectively, at the corresponding sensitivities. The improvement was found to be statistically significant (p<0.05) by the AFROC method. Our results indicate that the two-view fusion scheme can improve the performance of mass detection on mammograms.
Comparing PETS and GEPT in China and Taiwan
ERIC Educational Resources Information Center
Wu, Mei
2012-01-01
This paper compares the Public English Test System (PETS) administered in mainland, China and the General English Proficiency Test (GEPT) administered in Taiwan, from the aspects of test levels, test contents and scoring weight. Compared with the PETS, the GEPT is found to value the English productive skills more, and have a greater ability to…
What To Look for in ESL Admission Tests: Cambridge Certificate Exams, IELTS, and TOEFL.
ERIC Educational Resources Information Center
Chalhoub-Deville, Micheline; Turner, Carolyn E.
2000-01-01
Familiarizes test users with issues to consider when employing assessments for screening and admission purposes. Examines the purpose, content, and scoring methods of three English-as-a-Second-Language admissions tests--the Cambridge certificate exams, International English Language Teaching System, and Test of English as a Foreign…
NASA Technical Reports Server (NTRS)
Olorenshaw, Lex; Trawick, David
1991-01-01
The purpose was to develop a speech recognition system to be able to detect speech which is pronounced incorrectly, given that the text of the spoken speech is known to the recognizer. Better mechanisms are provided for using speech recognition in a literacy tutor application. Using a combination of scoring normalization techniques and cheater-mode decoding, a reasonable acceptance/rejection threshold was provided. In continuous speech, the system was tested to be able to provide above 80 pct. correct acceptance of words, while correctly rejecting over 80 pct. of incorrectly pronounced words.
Transcriptional Responses Reveal Similarities Between Preclinical Rat Liver Testing Systems.
Liu, Zhichao; Delavan, Brian; Roberts, Ruth; Tong, Weida
2018-01-01
Toxicogenomics (TGx) is an important tool to gain an enhanced understanding of toxicity at the molecular level. Previously, we developed a pair ranking (PRank) method to assess in vitro to in vivo extrapolation (IVIVE) using toxicogenomic datasets from the Open Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System (TG-GATEs) database. With this method, we investiagted three important questions that were not addressed in our previous study: (1) is a 1-day in vivo short-term assay able to replace the 28-day standard and expensive toxicological assay? (2) are some biological processes more conservative across different preclinical testing systems than others? and (3) do these preclinical testing systems have the similar resolution in differentiating drugs by their therapeutic uses? For question 1, a high similarity was noted (PRank score = 0.90), indicating the potential utility of shorter term in vivo studies to predict outcome in longer term and more expensive in vivo model systems. There was a moderate similarity between rat primary hepatocytes and in vivo repeat-dose studies (PRank score = 0.71) but a low similarity (PRank score = 0.56) between rat primary hepatocytes and in vivo single dose studies. To address question 2, we limited the analysis to gene sets relevant to specific toxicogenomic pathways and we found that pathways such as lipid metabolism were consistently over-represented in all three assay systems. For question 3, all three preclinical assay systems could distinguish compounds from different therapeutic categories. This suggests that any noted differences in assay systems was biological process-dependent and furthermore that all three systems have utility in assessing drug responses within a certain drug class. In conclusion, this comparison of three commonly used rat TGx systems provides useful information in utility and application of TGx assays.
Pontone, Gianluca; Di Bella, Gianluca; Silvia, Castelletti; Maestrini, Viviana; Festa, Pierluigi; Ait-Ali, Lamia; Masci, Pier Giorgio; Monti, Lorenzo; di Giovine, Gabriella; De Lazzari, Manuel; Cipriani, Alberto; Guaricci, Andrea I; Dellegrottaglie, Santo; Pepe, Alessia; Marra, Martina Perazzolo; Aquaro, Giovanni D
2017-04-01
The current document was developed by the working group on the 'application of cardiac magnetic resonance' of the Italian Society of Cardiology to provide a perspective on the current state of technical advances and clinical cardiac magnetic resonance applications and to inform cardiologists how to implement their clinical and diagnostic pathway with the introduction of this technique in the clinical practice. Appropriateness criteria were defined using a score system: score 1-3 = inappropriate (test is not generally acceptable and is not a reasonable approach for the indication), score 4-6 = uncertain (test may be generally acceptable and may be a reasonable approach for the indication but more research and/or patient information is needed to classify the indication definitively) and score 7-9 = appropriate (test is generally acceptable and is a reasonable approach for the indication).
Akashi, Masaya; Teraoka, Shun; Kakei, Yasumasa; Kusumoto, Junya; Hasegawa, Takumi; Minamikawa, Tsutomu; Hashikawa, Kazunobu; Komori, Takahide
2018-04-01
This study aimed to evaluate posttreatment soft-tissue changes in patients with oral cancer with computed tomography (CT). To accomplish that purpose, a scoring system was established, referring to the criteria of lower leg lymphedema (LE). One hundred and six necks in 95 patients who underwent oral oncologic surgery with neck dissection (ND) were analyzed retrospectively using routine follow-up CT images. A two-point scoring system to evaluate soft-tissue changes (so-called "LE score") was established as follows: Necks with a "honeycombing" appearance were assigned 1 point. Necks with "taller than wide" fat lobules were assigned 1 point. Necks with neither appearance were assigned 0 points. Comparisons between patients with LE score ≥1 and LE score = 0 at 6 months postoperatively were performed using the Fisher exact test for discrete variables and the Mann-Whitney U test for continuous variables. Univariate predictors associated with posttreatment changes (i.e., LE score ≥1 at 6 months postoperatively) were entered into a multivariate logistic regression analysis. Values of p < 0.05 were considered to indicate statistical significance. The occurrence of the posttreatment soft-tissue changes was 32%. Multivariate logistic regression analysis showed that postoperative radiation therapy (RT) and bilateral ND were potential risk factors of posttreatment soft-tissue changes on CT images. Sequential evaluation of "honeycombing" and the "taller than wide" appearances on routine follow-up CT revealed the persistence of posttreatment soft-tissue changes in patients who underwent oral cancer treatment, and those potential risk factors were postoperative RT and bilateral ND.
Left behind by Design: Proficiency Counts and Test-Based Accountability. Working Paper
ERIC Educational Resources Information Center
Neal, Derek; Schanzenbach, Diane Whitmore
2009-01-01
Many test-based accountability systems, including the No Child Left Behind Act of 2001 (NCLB), place great weight on the numbers of students who score at or above specified proficiency levels in various subjects. Accountability systems based on these metrics often provide incentives for teachers and principals to target children near current…
21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.
Code of Federal Regulations, 2012 CFR
2012-04-01
... ovarian/adnexal mass assessment test system is a device that measures one or more proteins in serum or... § 866.1(e). (c) Black box warning. Under section 520(e) of the Federal Food, Drug, and Cosmetic Act... box and must appear in all advertising, labeling, and promotional material for these devices. That...
21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.
Code of Federal Regulations, 2014 CFR
2014-04-01
... ovarian/adnexal mass assessment test system is a device that measures one or more proteins in serum or... § 866.1(e). (c) Black box warning. Under section 520(e) of the Federal Food, Drug, and Cosmetic Act... box and must appear in all advertising, labeling, and promotional material for these devices. That...
ERIC Educational Resources Information Center
Neal, Derek; Schanzenbach, Diane Whitmore
2007-01-01
Many test-based accountability systems, including the No Child Left Behind Act of 2001 (NCLB), place great weight on the numbers of students who score at or above specified proficiency levels in various subjects. Accountability systems based on these metrics often provide incentives for teachers and principals to target children near current…
Statewide Articulated Assessment System. 1994-1995 Summary Report.
ERIC Educational Resources Information Center
New Mexico State Dept. of Education, Santa Fe. Assessment and Evaluation Unit.
Results from the component tests of the New Mexico Statewide Articulated Assessment System, an elementary-level assessment, are presented. The New Mexico Achievement Assessment, uses the Iowa Tests of Basic Skills to assess the achievement of students in grades 3, 5 and 8. Score increases were seen for students in grade 3 in mathematics, in grade…
Doğanay Erdoğan, Beyza; Elhan, Atilla Halİl; Kaskatı, Osman Tolga; Öztuna, Derya; Küçükdeveci, Ayşe Adile; Kutlay, Şehim; Tennant, Alan
2017-10-01
This study aimed to explore the potential of an inclusive and fully integrated measurement system for the Activities component of the International Classification of Functioning, Disability and Health (ICF), incorporating four classical scales, including the Health Assessment Questionnaire (HAQ), and a Computerized Adaptive Testing (CAT). Three hundred patients with rheumatoid arthritis (RA) answered relevant questions from four questionnaires. Rasch analysis was performed to create an item bank using this item pool. A further 100 RA patients were recruited for a CAT application. Both real and simulated CATs were applied and the agreement between these CAT-based scores and 'paper-pencil' scores was evaluated with intraclass correlation coefficient (ICC). Anchoring strategies were used to obtain a direct translation from the item bank common metric to the HAQ score. Mean age of 300 patients was 52.3 ± 11.7 years; disease duration was 11.3 ± 8.0 years; 74.7% were women. After testing for the assumptions of Rasch analysis, a 28-item Activities item bank was created. The agreement between CAT-based scores and paper-pencil scores were high (ICC = 0.993). Using those HAQ items in the item bank as anchoring items, another Rasch analysis was performed with HAQ-8 scores as separate items together with anchoring items. Finally a conversion table of the item bank common metric to the HAQ scores was created. A fully integrated and inclusive health assessment system, illustrating the Activities component of the ICF, was built to assess RA patients. Raw score to metric conversions and vice versa were available, giving access to the metric by a simple look-up table. © 2015 Asia Pacific League of Associations for Rheumatology and Wiley Publishing Asia Pty Ltd.
The reliability and validity of qualitative scores for the Controlled Oral Word Association Test.
Ross, Thomas P; Calhoun, Emily; Cox, Tara; Wenner, Carolyn; Kono, Whitney; Pleasant, Morgan
2007-05-01
The reliability and validity of two qualitative scoring systems for the Controlled Oral Word Association Test [Benton, A. L., Hamsher, de S. K., & Sivan, A. B. (1983). Multilingual aplasia examination (2nd ed.). Iowa City, IA: AJA Associates] were examined in 108 healthy young adults. The scoring systems developed by Troyer et al. [Troyer, A. K., Moscovich, M., & Winocur, G. (1997). Clustering and switching as two components of verbal fluency: Evidence from younger and older healthy adults. Neuropsychology, 11, 138-146] and by Abwender et al. [Abwender, D. A., Swan, J. G., Bowerman, J. T., & Connolly, S. W. (2001a). Qualitative analysis of verbal fluency output: Review and comparison of several scoring methods. Assessment, 8, 323-336] each demonstrated excellent interrater reliability (all indices at or above r(icc)=.9). Consistent with previous research [e.g., Ross, T. P. (2003). The reliability of cluster and switch scores for the COWAT. Archives of Clinical Psychology, 18, 153-164), test-retest reliability coefficients (N=53; M interval 44.6 days) for the qualitative scores were modest to poor (r(icc)=.6 to .4 range). Correlations among COWAT scores, measures of executive functioning, verbal learning, working memory, and vocabulary were examined. The idea that qualitative scores represent distinct executive functions such as cognitive flexibility or strategy utilization was not supported. We offer the interpretation that COWAT performance may require the ability to retrieve words in a non-routine manner while suppressing habitual responses and associated processing interference, presumably due to a spread of activation across semantic or lexical networks. This interpretation, though speculative at present, implies that clustering and switching on the COWAT may not be entirely deliberate, but rather an artifact of a passive (i.e., state-dependent) process. Ideas for future research, most noticeably experimental studies using cognitive methods (e.g., priming), are discussed.
Aoki, Tomonori; Nagata, Naoyoshi; Shimbo, Takuro; Niikura, Ryota; Sakurai, Toshiyuki; Moriyasu, Shiori; Okubo, Hidetaka; Sekine, Katsunori; Watanabe, Kazuhiro; Yokoi, Chizu; Yanase, Mikio; Akiyama, Junichi; Mizokami, Masashi; Uemura, Naomi
2016-11-01
We aimed to develop and validate a risk scoring system to determine the risk of severe lower gastrointestinal bleeding (LGIB) and predict patient outcomes. We first performed a retrospective analysis of data from 439 patients emergently hospitalized for acute LGIB at the National Center for Global Health and Medicine in Japan, from January 2009 through December 2013. We used data on comorbidities, medication, presenting symptoms, and vital signs, and laboratory test results to develop a scoring system for severe LGIB (defined as continuous and/or recurrent bleeding). We validated the risk score in a prospective study of 161 patients with acute LGIB admitted to the same center from April 2014 through April 2015. We assessed the system's accuracy in predicting patient outcome using area under the receiver operating characteristics curve (AUC) analysis. All patients underwent colonoscopy. In the first study, 29% of the patients developed severe LGIB. We devised a risk scoring system based on nonsteroidal anti-inflammatory drugs use, no diarrhea, no abdominal tenderness, blood pressure of 100 mm Hg or lower, antiplatelet drugs use, albumin level less than 3.0 g/dL, disease scores of 2 or higher, and syncope (NOBLADS), which all were independent correlates of severe LGIB. Severe LGIB developed in 75.7% of patients with scores of 5 or higher compared with 2% of patients without any of the factors correlated with severe LGIB (P < .001). The NOBLADS score determined the severity of LGIB with an AUC value of 0.77. In the validation (second) study, severe LGIB developed in 35% of patients; the NOBLADS score predicted the severity of LGIB with an AUC value of 0.76. Higher NOBLADS scores were associated with a requirement for blood transfusion, longer hospital stay, and intervention (P < .05 for trend). We developed and validated a scoring system for risk of severe LGIB based on 8 factors (NOBLADS score). The system also determined the risk for blood transfusion, longer hospital stay, and intervention. It might be used in decision making regarding intervention and management. Copyright © 2016 AGA Institute. Published by Elsevier Inc. All rights reserved.
Li, R; Li, C T; Zhao, S M; Li, H X; Li, L; Wu, R G; Zhang, C C; Sun, H Y
2017-04-01
To establish a query table of IBS critical value and identification power for the detection systems with different numbers of STR loci under different false judgment standards. Samples of 267 pairs of full siblings and 360 pairs of unrelated individuals were collected and 19 autosomal STR loci were genotyped by Golden e ye™ 20A system. The full siblings were determined using IBS scoring method according to the 'Regulation for biological full sibling testing'. The critical values and identification power for the detection systems with different numbers of STR loci under different false judgment standards were calculated by theoretical methods. According to the formal IBS scoring criteria, the identification power of full siblings and unrelated individuals was 0.764 0 and the rate of false judgment was 0. The results of theoretical calculation were consistent with that of sample observation. The query table of IBS critical value for identification of full sibling detection systems with different numbers of STR loci was successfully established. The IBS scoring method defined by the regulation has high detection efficiency and low false judgment rate, which provides a relatively conservative result. The query table of IBS critical value for identification of full sibling detection systems with different numbers of STR loci provides an important reference data for the result judgment of full sibling testing and owns a considerable practical value. Copyright© by the Editorial Department of Journal of Forensic Medicine
Research on Operation Assessment Method for Energy Meter
NASA Astrophysics Data System (ADS)
Chen, Xiangqun; Huang, Rui; Shen, Liman; chen, Hao; Xiong, Dezhi; Xiao, Xiangqi; Liu, Mouhai; Xu, Renheng
2018-03-01
The existing electric energy meter rotation maintenance strategy regularly checks the electric energy meter and evaluates the state. It only considers the influence of time factors, neglects the influence of other factors, leads to the inaccuracy of the evaluation, and causes the waste of resources. In order to evaluate the running state of the electric energy meter in time, a method of the operation evaluation of the electric energy meter is proposed. The method is based on extracting the existing data acquisition system, marketing business system and metrology production scheduling platform that affect the state of energy meters, and classified into error stability, operational reliability, potential risks and other factors according to the influencing factors, based on the above basic test score, inspecting score, monitoring score, score of family defect detection. Then, according to the evaluation model according to the scoring, we evaluate electric energy meter operating state, and finally put forward the corresponding maintenance strategy of rotation.
A Method of Evaluating Operation of Electric Energy Meter
NASA Astrophysics Data System (ADS)
Chen, Xiangqun; Li, Tianyang; Cao, Fei; Chu, Pengfei; Zhao, Xinwang; Huang, Rui; Liu, Liping; Zhang, Chenglin
2018-05-01
The existing electric energy meter rotation maintenance strategy regularly checks the electric energy meter and evaluates the state. It only considers the influence of time factors, neglects the influence of other factors, leads to the inaccuracy of the evaluation, and causes the waste of resources. In order to evaluate the running state of the electric energy meter in time, a method of the operation evaluation of the electric energy meter is proposed. The method is based on extracting the existing data acquisition system, marketing business system and metrology production scheduling platform that affect the state of energy meters, and classified into error stability, operational reliability, potential risks and other factors according to the influencing factors, based on the above basic test score, inspecting score, monitoring score, score of family defect detection. Then, according to the evaluation model according to the scoring, we evaluate electric energy meter operating state, and finally put forward the corresponding maintenance strategy of rotation.
Faron, Matthew L.; Buchan, Blake W.; Hyke, Josh; Madisen, Neil; Lillie, Jennifer L.; Granato, Paul A.; Wilson, Deborah A.; Procop, Gary W.; Novak-Weekley, Susan; Marlowe, Elizabeth; Cumpio, Joven; Griego-Fullbright, Christen; Kindig, Sandra; Timm, Karen; Young, Stephen; Ledeboer, Nathan A.
2015-01-01
The prompt and accurate identification of bacterial pathogens is fundamental to patient health and outcome. Recent advances in matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) have revolutionized bacterial identification in the clinical laboratory, but uniform incorporation of this technology in the U.S. market has been delayed by a lack of FDA-cleared systems. In this study, we conducted a multicenter evaluation of the MALDI Biotyper CA (MBT-CA) System (Bruker Daltonics Inc, Billerica, MA) for the identification of aerobic gram-negative bacteria as part of a 510(k) submission to the FDA. A total of 2,263 aerobic gram negative bacterial isolates were tested representing 23 genera and 61 species. Isolates were collected from various clinical sources and results obtained from the MBT-CA System were compared to DNA sequencing and/or biochemical testing. Isolates that failed to report as a "high confidence species ID" [log(score) ≥2.00] were re-tested using an extraction method. The MBT-CA System identified 96.8% and 3.1% of isolates with either a "high confidence" or a "low confidence" [log(score) value between 1.70 and <2.00] species ID, respectively. Two isolates did not produce acceptable confidence scores after extraction. The MBT-CA System correctly identified 99.8% (2,258/2,263) to genus and 98.2% (2,222/2,263) to species level. These data demonstrate that the MBT-CA System provides accurate results for the identification of aerobic gram-negative bacteria. PMID:26529504
Faron, Matthew L; Buchan, Blake W; Hyke, Josh; Madisen, Neil; Lillie, Jennifer L; Granato, Paul A; Wilson, Deborah A; Procop, Gary W; Novak-Weekley, Susan; Marlowe, Elizabeth; Cumpio, Joven; Griego-Fullbright, Christen; Kindig, Sandra; Timm, Karen; Young, Stephen; Ledeboer, Nathan A
2015-01-01
The prompt and accurate identification of bacterial pathogens is fundamental to patient health and outcome. Recent advances in matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) have revolutionized bacterial identification in the clinical laboratory, but uniform incorporation of this technology in the U.S. market has been delayed by a lack of FDA-cleared systems. In this study, we conducted a multicenter evaluation of the MALDI Biotyper CA (MBT-CA) System (Bruker Daltonics Inc, Billerica, MA) for the identification of aerobic gram-negative bacteria as part of a 510(k) submission to the FDA. A total of 2,263 aerobic gram negative bacterial isolates were tested representing 23 genera and 61 species. Isolates were collected from various clinical sources and results obtained from the MBT-CA System were compared to DNA sequencing and/or biochemical testing. Isolates that failed to report as a "high confidence species ID" [log(score) ≥2.00] were re-tested using an extraction method. The MBT-CA System identified 96.8% and 3.1% of isolates with either a "high confidence" or a "low confidence" [log(score) value between 1.70 and <2.00] species ID, respectively. Two isolates did not produce acceptable confidence scores after extraction. The MBT-CA System correctly identified 99.8% (2,258/2,263) to genus and 98.2% (2,222/2,263) to species level. These data demonstrate that the MBT-CA System provides accurate results for the identification of aerobic gram-negative bacteria.
Picchio, Gaston; Vingerhoets, Johan; Tambuyzer, Lotke; Coakley, Eoin; Haddad, Mojgan; Witek, James
2011-12-01
Abstract The prevalence of susceptibility to etravirine was investigated among clinical samples submitted for routine clinical testing in the United States using two separate weighted genotypic scoring systems. The presence of etravirine mutations and susceptibility to etravirine by phenotype of clinical samples from HIV-1-infected patients, submitted to Monogram Biosciences for routine resistance testing between June 2008 and June 2009, were analyzed. Susceptibility by genotype was determined using the Monogram and Tibotec etravirine-weighted genotypic scoring systems, with scores of ≤3 and ≤2, respectively, indicating full susceptibility. Susceptibility by phenotype was determined using the PhenoSense HIV assay, with lower and higher clinical cut-offs of 2.9 and 10, respectively. The frequency of individual etravirine mutations and the impact of the K103N mutation on susceptibility to etravirine by genotype were also determined. Among the 5482 samples with ≥1 defined nonnucleoside reverse transcriptase inhibitor (NNRTI) mutations associated with resistance, 67% were classed as susceptible to etravirine by genotype by both scoring systems. Susceptibility to etravirine by phenotype was higher (76%). The proportion of first-generation NNRTI-resistant samples with (n=3598) and without (n=1884) K103N with susceptibility to etravirine by genotype was 77% and 49%, respectively. Among samples susceptible to first-generation NNRTIs (n=9458), >99% of samples were susceptible to etravirine by phenotype (FC <2.9); the remaining samples had FC ≥2.9-10. In summary, among samples submitted for routine clinical testing in the United States, a high proportion of samples with first-generation NNRTI resistance was susceptible to etravirine by genotype and phenotype. A higher proportion of NNRTI-resistant samples with K103N than without was susceptible to etravirine.
Junghaenel, Doerte U.; Schneider, Stefan; Stone, Arthur A.; Christodoulou, Christopher; Broderick, Joan E.
2014-01-01
Objective This study examined the ecological validity and clinical utility of NIH Patient Reported-Outcomes Measurement Information System (PROMIS®) instruments for anger, depression, and fatigue in women with premenstrual symptoms. Methods One-hundred women completed daily diaries and weekly PROMIS assessments over 4 weeks. Weekly assessments were administered through Computerized Adaptive Testing (CAT). Weekly CATs and corresponding daily scores were compared to evaluate ecological validity. To test clinical utility, we examined if CATs could detect changes in symptom levels, if these changes mirrored those obtained from daily scores, and if CATs could identify clinically meaningful premenstrual symptom change. Results PROMIS CAT scores were higher in the pre-menstrual than the baseline (ps < .0001) and post-menstrual (ps < .0001) weeks. The correlations between CATs and aggregated daily scores ranged from .73 to .88 supporting ecological validity. Mean CAT scores showed systematic changes in accordance with the menstrual cycle and the magnitudes of the changes were similar to those obtained from the daily scores. Finally, Receiver Operating Characteristic (ROC) analyses demonstrated the ability of the CATs to discriminate between women with and without clinically meaningful premenstrual symptom change. Conclusions PROMIS CAT instruments for anger, depression, and fatigue demonstrated validity and utility in premenstrual symptom assessment. The results provide encouraging initial evidence of the utility of PROMIS instruments for the measurement of affective premenstrual symptoms. PMID:24630180
Verhaegh, Pauline; Bavalia, Roisin; Winkens, Bjorn; Masclee, Ad; Jonkers, Daisy; Koek, Ger
2018-06-01
Nonalcoholic fatty liver disease is a rapidly increasing health problem. Liver biopsy analysis is the most sensitive test to differentiate between nonalcoholic steatohepatitis (NASH) and simple steatosis (SS), but noninvasive methods are needed. We performed a systematic review and meta-analysis of noninvasive tests for differentiating NASH from SS, focusing on blood markers. We performed a systematic search of the PubMed, Medline and Embase (1990-2016) databases using defined keywords, limited to full-text papers in English and human adults, and identified 2608 articles. Two independent reviewers screened the articles and identified 122 eligible articles that used liver biopsy as reference standard. If at least 2 studies were available, pooled sensitivity (sens p ) and specificity (spec p ) values were determined using the Meta-Analysis Package for R (metafor). In the 122 studies analyzed, 219 different blood markers (107 single markers and 112 scoring systems) were identified to differentiate NASH from simple steatosis, and 22 other diagnostic tests were studied. Markers identified related to several pathophysiological mechanisms. The markers analyzed in the largest proportions of studies were alanine aminotransferase (sens p , 63.5% and spec p , 74.4%) within routine biochemical tests, adiponectin (sensp, 72.0% and spec p , 75.7%) within inflammatory markers, CK18-M30 (sens p , 68.4% and spec p , 74.2%) within markers of cell death or proliferation and homeostatic model assessment of insulin resistance (sens p , 69.0% and spec p , 72.7%) within the metabolic markers. Two scoring systems could also be pooled: the NASH test (differentiated NASH from borderline NASH plus simple steatosis with 22.9% sens p and 95.3% spec p ) and the GlycoNASH test (67.1% sens p and 63.8% spec p ). In the meta-analysis, we found no test to differentiate NASH from SS with a high level of pooled sensitivity and specificity (≥80%). However, some blood markers, when included in scoring systems in single studies, identified patients with NASH with ≥80% sensitivity and specificity. Replication studies and more standardized study designs are urgently needed. At present, no marker or scoring system can be recommended for use in clinical practice to differentiate NASH from simple steatosis. Copyright © 2018 AGA Institute. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Andrini, V. S.
2018-05-01
The objectives of the research are to develop the learning video for the flipped classroom model for Open University’s student and to know the effectiveness of the video. The development of the video used Research and Development ADDIE design (Analyses, Design, Development, Implementation, Evaluation). The sampling used purposive sampling was 28 students in Open University of Nganjuk. The techniques of data collection were the observation data to know the problems of the students, and learning facilities, the test (pre-test and post-test) to know a knowledge aspect, a questionnaire to know advisability of video learning, a structured interview to confirm their answer. The result of the expert of matter and media showed that the average product score was 3.75 of 4 or very good, the small-scale test showed that the average score was 3.60 of 4 and the large-scale test showed that the average score was 3.80 of 4, it had a very good category. The t-test with paired sample test showed that sig. (2-tailed) < 0.05. The N-gain score of pre and post test was 0.55, it had the medium category. It can be concluded that the development of the learning video for flipped classroom was effective to be implemented.
The developmental eye movement (DEM) test and Cantonese-speaking children in Hong Kong SAR, China.
Pang, Peter C; Lam, Carly S; Woo, George C
2010-07-01
There is no published norm for the Developmental Eye Movement (DEM) Test for Cantonese-speaking Chinese children. This study aimed to determine the normative values of this test for Cantonese-speaking Chinese children in Hong Kong SAR and to compare the results with the published norms of English-speaking and Spanish-speaking children. Cantonese-speaking students aged from 6 to 11 years were tested by the DEM test in Cantonese and a digital recorder was used to record the process. The DEM scores for the 305 students were determined by listening again to the audio records after the test and computed by using the formula from the DEM manual, except that the 'vertical scores' were adjusted by taking the vertical errors into consideration. The results were compared with other norms that have been published. Our subjects made more vertical errors than in other normative studies and adjusted vertical scores were proposed. In both adjusted vertical and horizontal scores, the Cantonese-speaking children completed the tests much faster than the norms for English- and Spanish-speaking children, the differences of the means being significant (p < 0.0001) in all age groups. The DEM norms may be affected by differences in languages, cultures and education systems among different ethnicities. The norms of the DEM test are proposed for Cantonese-speaking children in Hong Kong SAR, China.
Developmental Eye Movement (DEM) Test Norms for Mandarin Chinese-Speaking Chinese Children.
Xie, Yachun; Shi, Chunmei; Tong, Meiling; Zhang, Min; Li, Tingting; Xu, Yaqin; Guo, Xirong; Hong, Qin; Chi, Xia
2016-01-01
The Developmental Eye Movement (DEM) test is commonly used as a clinical visual-verbal ocular motor assessment tool to screen and diagnose reading problems at the onset. No established norm exists for using the DEM test with Mandarin Chinese-speaking Chinese children. This study aims to establish the normative values of the DEM test for the Mandarin Chinese-speaking population in China; it also aims to compare the values with three other published norms for English-, Spanish-, and Cantonese-speaking Chinese children. A random stratified sampling method was used to recruit children from eight kindergartens and eight primary schools in the main urban and suburban areas of Nanjing. A total of 1,425 Mandarin Chinese-speaking children aged 5 to 12 years took the DEM test in Mandarin Chinese. A digital recorder was used to record the process. All of the subjects completed a symptomatology survey, and their DEM scores were determined by a trained tester. The scores were computed using the formula in the DEM manual, except that the "vertical scores" were adjusted by taking the vertical errors into consideration. The results were compared with the three other published norms. In our subjects, a general decrease with age was observed for the four eye movement indexes: vertical score, adjusted horizontal score, ratio, and total error. For both the vertical and adjusted horizontal scores, the Mandarin Chinese-speaking children completed the tests much more quickly than the norms for English- and Spanish-speaking children. However, the same group completed the test slightly more slowly than the norms for Cantonese-speaking children. The differences in the means were significant (P<0.001) in all age groups. For several ages, the scores obtained in this study were significantly different from the reported scores of Cantonese-speaking Chinese children (P<0.005). Compared with English-speaking children, only the vertical score of the 6-year-old group, the vertical-horizontal time ratio of the 8-year-old group and the errors of 9-year-old group had no significant difference (P>0.05); compared with Spanish-speaking children, the scores were statistically significant (P<0.001) for the total error scores of the age groups, except the 6-, 9-, 10-, and 11-year-old age groups (P>0.05). DEM norms may be affected by differences in language, cultural, and educational systems among various ethnicities. The norms of the DEM test are proposed for use with Mandarin Chinese-speaking children in Nanjing and will be proposed for children throughout China.
Stræde, Mia; Brabrand, Mikkel
2014-01-01
Background Clinical scores can be of aid to predict early mortality after admission to a medical admission unit. A developed scoring system needs to be externally validated to minimise the risk of the discriminatory power and calibration to be falsely elevated. We performed the present study with the objective of validating the Simple Clinical Score (SCS) and the HOTEL score, two existing risk stratification systems that predict mortality for medical patients based solely on clinical information, but not only vital signs. Methods Pre-planned prospective observational cohort study. Setting Danish 460-bed regional teaching hospital. Findings We included 3046 consecutive patients from 2 October 2008 until 19 February 2009. 26 (0.9%) died within one calendar day and 196 (6.4%) died within 30 days. We calculated SCS for 1080 patients. We found an AUROC of 0.960 (95% confidence interval [CI], 0.932 to 0.988) for 24-hours mortality and 0.826 (95% CI, 0.774–0.879) for 30-day mortality, and goodness-of-fit test, χ2 = 2.68 (10 degrees of freedom), P = 0.998 and χ2 = 4.00, P = 0.947, respectively. We included 1470 patients when calculating the HOTEL score. Discriminatory power (AUROC) was 0.931 (95% CI, 0.901–0.962) for 24-hours mortality and goodness-of-fit test, χ2 = 5.56 (10 degrees of freedom), P = 0.234. Conclusion We find that both the SCS and HOTEL scores showed an excellent to outstanding ability in identifying patients at high risk of dying with good or acceptable precision. PMID:25144186
[Value of brain MR imaging in infants with a severe idiopathic apparent life threatening event].
Christophe, C; Boutemy, R; Christiaens, F; Fonteyne, C; Ziereisen, F; Dan, B
2000-01-01
Prognostic value of a magnetic resonance imaging (MRI) scoring system in infants with a severe apparent life threatening event (ALTE). Ten infants with an ALTE (aged between 6 and 31 weeks) were clinically graded according to the PRISM score and evaluated with EEG, evoked potentials and MRI. The 18 MRIs obtained were distributed in 3 classes according to the delay after which they were obtained; class A (n=5): within the first 48 hours after the event, class B (n=7): between day 3 and 8 and class C (n=6): between day 9 and 50. The 18 MRIs were evaluated retrospectively using a scoring system based on 3 categories of lesions: edema, basal ganglia injury and watershed injuries. Five infants died between day 2 and day 15 after the event. The five surviving infants had follow up neurodevelopmental testing after 38 to 77 months. There was no correlation between the 5 MRIs of class A and the neurological outcome. For the MRIs of class B and C, the scoring system can be of great value when combined with the scores of EEG, EP and PRISM. The scoring system for MRI performed within 48 hours after the event is falsely reassuring. MRI can be helpful as early as 3 days after the event when combined with the score of the electrophysiological investigations and the PRISM.
Sefton, Gerri; Lane, Steven; Killen, Roger; Black, Stuart; Lyon, Max; Ampah, Pearl; Sproule, Cathryn; Loren-Gosling, Dominic; Richards, Caitlin; Spinty, Jean; Holloway, Colette; Davies, Coral; Wilson, April; Chean, Chung Shen; Carter, Bernie; Carrol, E D
2017-05-01
Pediatric Early Warning Scores are advocated to assist health professionals to identify early signs of serious illness or deterioration in hospitalized children. Scores are derived from the weighting applied to recorded vital signs and clinical observations reflecting deviation from a predetermined "norm." Higher aggregate scores trigger an escalation in care aimed at preventing critical deterioration. Process errors made while recording these data, including plotting or calculation errors, have the potential to impede the reliability of the score. To test this hypothesis, we conducted a controlled study of documentation using five clinical vignettes. We measured the accuracy of vital sign recording, score calculation, and time taken to complete documentation using a handheld electronic physiological surveillance system, VitalPAC Pediatric, compared with traditional paper-based charts. We explored the user acceptability of both methods using a Web-based survey. Twenty-three staff participated in the controlled study. The electronic physiological surveillance system improved the accuracy of vital sign recording, 98.5% versus 85.6%, P < .02, Pediatric Early Warning Score calculation, 94.6% versus 55.7%, P < .02, and saved time, 68 versus 98 seconds, compared with paper-based documentation, P < .002. Twenty-nine staff completed the Web-based survey. They perceived that the electronic physiological surveillance system offered safety benefits by reducing human error while providing instant visibility of recorded data to the entire clinical team.
The validation of the visual analogue scale for patient satisfaction after total hip arthroplasty.
Brokelman, Roy B G; Haverkamp, Daniel; van Loon, Corné; Hol, Annemiek; van Kampen, Albert; Veth, Rene
2012-06-01
INTRODUCTION: Patient satisfaction becomes more important in our modern health care system. The assessment of satisfaction is difficult because it is a multifactorial item for which no golden standard exists. One of the potential methods of measuring satisfaction is by using the well-known visual analogue scale (VAS). In this study, we validated VAS for satisfaction. PATIENT AND METHODS: In this prospective study, we studied 147 patients (153 hips). The construct validity was measured using the Spearman correlation test that compares the satisfaction VAS with the Harris hip score, pain VAS at rest and during activity, Oxford hip score, Short Form 36 and Western Ontario McMaster Universities Osteoarthritis Index. The reliability was tested using the intra-class coefficient. RESULTS: The Pearson correlation test showed correlations in the range of 0.40-0.80. The satisfaction VAS had a high correlation between the pain VAS and Oxford hip score, which could mean that pain is one of the most important factors in patient satisfaction. The intra-class coefficient was 0.95. CONCLUSIONS: There is a moderate to mark degree of correlation between the satisfaction VAS and the currently available subjective and objective scoring systems. The intra-class coefficient of 0.95 indicates an excellent test-retest reliability. The VAS satisfaction is a simple instrument to quantify the satisfaction of a patient after total hip arthroplasty. In this study, we showed that the satisfaction VAS has a good validity and reliability.
Ammitzbøll-Danielsen, Mads; Østergaard, Mikkel; Naredo, Esperanza; Terslev, Lene
2016-12-01
The aim was to evaluate the metric properties of the semi-quantitative OMERACT US scoring system vs a novel quantitative US scoring system for tenosynovitis, by testing its intra- and inter-reader reliability, sensitivity to change and comparison with clinical tenosynovitis scoring in a 6-month follow-up study. US and clinical assessments of the tendon sheaths of the clinically most affected hand and foot were performed at baseline, 3 and 6 months in 51 patients with RA. Tenosynovitis was assessed using the semi-quantitative scoring system (0-3) proposed by the OMERACT US group and a new quantitative US evaluation (0-100). A sum for US grey scale (GS), colour Doppler (CD) and pixel index (PI), respectively, was calculated for each patient. In 20 patients, intra- and inter-observer agreement was established between two independent investigators. A binary clinical tenosynovitis score was performed, calculating a sum score per patient. The intra- and inter-observer agreements for US tenosynovitis assessments were very good at baseline and for change for GS and CD, but less good for PI. The smallest detectable change was 0.97 for GS, 0.93 for CD and 30.1 for PI. The sensitivity to change from month 0 to 6 was high for GS and CD, and slightly higher than for clinical tenosynovitis score and PI. This study demonstrated an excellent intra- and inter-reader agreement between two investigators for the OMERACT US scoring system for tenosynovitis and a high ability to detect changes over time. Quantitative assessment by PI did not add further information. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Scoring system for differentiating perforated and non-perforated pediatric appendicitis.
Blumfield, Einat; Yang, Daniel; Grossman, Joshua
2017-10-01
Appendicitis is the most common indication for emergency pediatric surgery and its most significant complication is perforation. Perforated appendicitis (PA) may be managed conservatively, whereas non-perforated appendicitis (NP) is managed surgically. Recent studies have shown that ultrasound (US) is effective for differentiating between PA and NP, and does not expose pediatric patients to ionizing radiation. The purpose of this study is to enhance the accuracy of differentiation with a novel scoring system based on clinical, laboratory, and US findings. This retrospective study included 243 patients aged 2-17 years who presented between 2006 and 2013 with surgically proven appendicitis, of whom 60 had perforation. Clinical and laboratory data were collected and US images evaluated by a pediatric radiologist. To create the scoring system, point values were assigned to each parameter. A randomly selected training sample of 137 subjects was used to create a scoring prediction model. The model was tested on the remaining 106 patients. Scores of ≥6, ≥11, and ≥15 yielded specificities of 64, 91, and 99%, and sensitivities of 96, 61, and 29%, respectively (p < 0.001). We have designed a scoring system incorporating clinical, laboratory, and sonographic findings which can differentiate PA from NP with high specificity.
Kenyon, Lisa K.; Elliott, James M; Cheng, M. Samuel
2016-01-01
Purpose/Background Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. Methods A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts’ USA-Gymnastics competitive level to calculate the coefficient of determination (r2). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. Results The relationship between total MGFMT scores and subjects’ current USA-Gymnastics competitive level was found to be good (r2 = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). Conclusions The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level of Evidence Level 3 PMID:27999723
Evaluation of medical students using the "qi, blood, and fluid" system of Kampo medicine.
Arai, Makoto; Arai, Katsuhiko; Hioki, Chizuko; Takashi, Masanori; Matsumoto, Kaori; Honda, Masamitsu; Izumi, Shun-ichiro
2013-04-20
Although "qi, blood, and fluid" (QBF) is the most important concept for patients in Kampo medicine, there are few studies about the conditions of the QBF system among healthy populations. We used QBF pattern scores to determine whether or not medical students, presumed to be healthy, had any potentially pathological conditions. Six consecutive fourth-year classes totaling 652 medical students evaluated their own QBF conditions using Terasawa's QBF pattern scores. The six conditions: "qi deficiency" (QD), "qi stagnation" (QS), "qi counterflow" (QC), "blood deficiency" (BD), "blood stasis" (BS), and "fluid disturbance" (FD), were categorized according to Terasawa's criteria. The Mann-Whitney U test was used to compare the score differences between the genders, Chi-square test was used to examine gender differences in the QBF diagnoses, and the Spearman's rank-order correlation coefficient analysis was used to analyze the correlation between each category of QBF. In all, 44.6% of the students met at least one diagnostic criterion in the QBF system. QC, BD, BS, and FD were established more in females, and QD and QS were established without gender differences. Most students who were presumed to be healthy were revealed to have some potentially pathological conditions using the QBF system.
ERIC Educational Resources Information Center
Sugranes, Maria R.; Snider, Larry C.
1985-01-01
Describes the development of an automated library instruction records management system using microcomputer technology. Development described includes assessment of need, exploration of options, system design, and operational development. System products are identified and operational results are reported based on actual system performance.…
McIver, Kerry L.; Brown, William H.; Pfeiffer, Karin A.; Dowda, Marsha; Pate, Russell R.
2016-01-01
Purpose This study describes the development and pilot testing of the Observational System for Recording Physical Activity-Elementary School (OSRAC-E) version. Methods This system was developed to observe and document the levels and types of physical activity and physical and social contexts of physical activity in elementary school students during the school day. Inter-observer agreement scores and summary data were calculated. Results All categories had Kappa statistics above 0.80, with the exception of the activity initiator category. Inter-observer agreement scores were 96% or greater. The OSRAC-E was shown to be a reliable observation system that allows researchers to assess physical activity behaviors, the contexts of those behaviors, and the effectiveness of physical activity interventions in the school environment. Conclusion The OSRAC-E can yield data with high interobserver reliability and provide relatively extensive contextual information about physical activity of students in elementary schools. PMID:26889587
Jaya, Ziningi; Drain, Paul K.
2017-01-01
Introduction Rapid HIV tests have improved access to HIV diagnosis and treatment by providing quick and convenient testing in rural clinics and resource-limited settings. In this study, we evaluated the quality management system for voluntary and provider-initiated point-of-care HIV testing in primary healthcare (PHC) clinics in rural KwaZulu-Natal (KZN), South Africa. Material and methods We conducted a quality assessment audit in eleven PHC clinics that offer voluntary HIV testing and counselling in rural KZN, South Africa from August 2015 to October 2016. All the participating clinics were purposively selected from the province-wide survey of diagnostic services. We completed an on-site monitoring checklist, adopted from the WHO guidelines for assuring accuracy and reliability of HIV rapid tests, to assess the quality management system for HIV rapid testing at each clinic. To determine clinic’s compliance to WHO quality standards for HIV rapid testing the following quality measure was used, a 3-point scale (high, moderate and poor). A high score was defined as a percentage rating of 90 to 100%, moderate was defined as a percentage rating of 70 to 90%, and poor was defined as a percentage rating of less than 70%. Clinic audit scores were summarized and compared. We employed Pearson pair wise correlation coefficient to determine correlations between clinics audit scores and clinic and clinics characteristics. Linear regression model was computed to estimate statistical significance of the correlates. Correlations were reported as significant at p ≤0.05. Results Nine out of 11 audited rural PHC clinics are located outside 20Km of the nearest town and hospital. Majority (18.2%) of the audited rural PHC clinics reported that HIV rapid test was performed by HIV lay counsellors. Overall, ten clinics were rated moderate, in terms of their compliance to the stipulated WHO guidelines. Audit results showed that rural PHC clinics’ average rating score for compliance to the WHO guidelines ranged between 64.4% (CI: 44%– 84%) and 89.2% (CI: 74%– 100%).Ten out of eleven of the clinics were rated as moderate (70–89%). All clinic have scored highest for the following audit component: equipment; process control and specimen management; and facility ad safety, with 100%. Clinics obtained the lowest scores for the assessment audit component followed by process improvement and organisation, with 40.9% (CI: 15.7–66.1%), 45.5% (CI: 10.4–80.5%) and 56.8% (CI: 31.8 81.8%), respectively. A statistically significant correlation was observed between the following: category of staff performing the HIV rapid tests in the audited clinics and service and satisfactory audit component; weekly average number of patients using the audited PHC clinics and service and satisfactory audit component; number of HIV lay counsellors in the audited clinics and quality control audit component with p<0.05. Discussion In the small audit of primary healthcare clinics located within the rural part of KwaZulu-Natal, results revealed an overall moderate rating of the quality management system for rapid HIV testing. Improvements in the organisation, quality control, process improvement and assessment components could enable a higher quality assurance rating for rural HIV testing in KwaZulu-Natal. PMID:28829801
Jaya, Ziningi; Drain, Paul K; Mashamba-Thompson, Tivani P
2017-01-01
Rapid HIV tests have improved access to HIV diagnosis and treatment by providing quick and convenient testing in rural clinics and resource-limited settings. In this study, we evaluated the quality management system for voluntary and provider-initiated point-of-care HIV testing in primary healthcare (PHC) clinics in rural KwaZulu-Natal (KZN), South Africa. We conducted a quality assessment audit in eleven PHC clinics that offer voluntary HIV testing and counselling in rural KZN, South Africa from August 2015 to October 2016. All the participating clinics were purposively selected from the province-wide survey of diagnostic services. We completed an on-site monitoring checklist, adopted from the WHO guidelines for assuring accuracy and reliability of HIV rapid tests, to assess the quality management system for HIV rapid testing at each clinic. To determine clinic's compliance to WHO quality standards for HIV rapid testing the following quality measure was used, a 3-point scale (high, moderate and poor). A high score was defined as a percentage rating of 90 to 100%, moderate was defined as a percentage rating of 70 to 90%, and poor was defined as a percentage rating of less than 70%. Clinic audit scores were summarized and compared. We employed Pearson pair wise correlation coefficient to determine correlations between clinics audit scores and clinic and clinics characteristics. Linear regression model was computed to estimate statistical significance of the correlates. Correlations were reported as significant at p ≤0.05. Nine out of 11 audited rural PHC clinics are located outside 20Km of the nearest town and hospital. Majority (18.2%) of the audited rural PHC clinics reported that HIV rapid test was performed by HIV lay counsellors. Overall, ten clinics were rated moderate, in terms of their compliance to the stipulated WHO guidelines. Audit results showed that rural PHC clinics' average rating score for compliance to the WHO guidelines ranged between 64.4% (CI: 44%- 84%) and 89.2% (CI: 74%- 100%).Ten out of eleven of the clinics were rated as moderate (70-89%). All clinic have scored highest for the following audit component: equipment; process control and specimen management; and facility ad safety, with 100%. Clinics obtained the lowest scores for the assessment audit component followed by process improvement and organisation, with 40.9% (CI: 15.7-66.1%), 45.5% (CI: 10.4-80.5%) and 56.8% (CI: 31.8 81.8%), respectively. A statistically significant correlation was observed between the following: category of staff performing the HIV rapid tests in the audited clinics and service and satisfactory audit component; weekly average number of patients using the audited PHC clinics and service and satisfactory audit component; number of HIV lay counsellors in the audited clinics and quality control audit component with p<0.05. In the small audit of primary healthcare clinics located within the rural part of KwaZulu-Natal, results revealed an overall moderate rating of the quality management system for rapid HIV testing. Improvements in the organisation, quality control, process improvement and assessment components could enable a higher quality assurance rating for rural HIV testing in KwaZulu-Natal.
Maddali Bongi, S; Del Rosso, A; Miniati, I; Galluccio, F; Landi, G; Tai, G; Matucci-Cerinic, M
2012-09-01
In systemic sclerosis (SSc), mouth and face involvement leads to problems in oral health-related quality of life (OHRQoL). Mouth Handicap in Systemic Sclerosis scale (MHISS) is a 12-item questionnaire specifically quantifying mouth disability in SSc, organized in 3 subscales. Our aim was to validate Italian version of MHISS, by assessing its test-retest reliability and internal and external consistency in Italian SSc patients. Forty SSc patients (7 dSSc, 33 lSSc; age and disease duration: 57.27 ± 11.41, 9.4 ± 4.4 years; 22 with sicca syndrome) were evaluated with MHISS. MHISS was translated following a forward-backward translation procedure, with independent translations and counter-translation. Test-retest reliability was evaluated, comparing the results of two administrations, with intraclass correlation coefficient (ICC). Internal consistency was assessed by Cronbach's α and external consistency by comparison with mouth opening. MHISS has a good test-retest reliability (ICC: 0.93) and internal consistency (Cronbach's α:0.99). A good external consistency was confirmed by correlation with mouth opening (rho: -0,3869, p: 0.0137). Total MHISS score was 17.65 ± 5.20, with scores of subscale 1 (reduced mouth opening) of 6.60 ± 2.85 and scores of subscales 2 (sicca syndrome) and 3 (aesthetic concerns) of 7.82 ± 2.59 and 3.22 ± 1.14. Total and subscale 2 scores are higher in dSSc than in lSSc. This result may be due to the higher presence of sicca syndrome in dSSc than in lSSc (p = 0.0109). Our results support validity and reliability in Italian SSc patients of MHISS, specifically measuring SSc OHRQoL.
León-Justel, Antonio; Madrazo-Atutxa, Ainara; Alvarez-Rios, Ana I; Infantes-Fontán, Rocio; Garcia-Arnés, Juan A; Lillo-Muñoz, Juan A; Aulinas, Anna; Urgell-Rull, Eulàlia; Boronat, Mauro; Sánchez-de-Abajo, Ana; Fajardo-Montañana, Carmen; Ortuño-Alonso, Mario; Salinas-Vert, Isabel; Granada, Maria L; Cano, David A; Leal-Cerro, Alfonso
2016-10-01
Cushing's syndrome (CS) is challenging to diagnose. Increased prevalence of CS in specific patient populations has been reported, but routine screening for CS remains questionable. To decrease the diagnostic delay and improve disease outcomes, simple new screening methods for CS in at-risk populations are needed. To develop and validate a simple scoring system to predict CS based on clinical signs and an easy-to-use biochemical test. Observational, prospective, multicenter. Referral hospital. A cohort of 353 patients attending endocrinology units for outpatient visits. All patients were evaluated with late-night salivary cortisol (LNSC) and a low-dose dexamethasone suppression test for CS. Diagnosis or exclusion of CS. Twenty-six cases of CS were diagnosed in the cohort. A risk scoring system was developed by logistic regression analysis, and cutoff values were derived from a receiver operating characteristic curve. This risk score included clinical signs and symptoms (muscular atrophy, osteoporosis, and dorsocervical fat pad) and LNSC levels. The estimated area under the receiver operating characteristic curve was 0.93, with a sensitivity of 96.2% and specificity of 82.9%. We developed a risk score to predict CS in an at-risk population. This score may help to identify at-risk patients in non-endocrinological settings such as primary care, but external validation is warranted.
An Inmate Classification System Based on PCL: SV Factor Scores in a Sample of Prison Inmates
ERIC Educational Resources Information Center
Wogan, Michael; Mackenzie, Marci
2007-01-01
Psychopaths represent a significant management challenge in a prison population. A sample of ninety-five male inmates from three medium security prisons was tested using the Hare Psychopathy Checklist: Screening Version (PCL:SV). Using traditional criteria, 22% of the inmates were classified as psychopaths. Scores on the two factor dimensions of…
ERIC Educational Resources Information Center
Reeves, Edward B.
The system of high-stakes accountability in the Kentucky public schools raises the question of whether teachers and administrators should be held accountable if test scores are influenced by external factors over which educators have no control. This study investigates whether such external factors , or "contextual effects," bias the…
Roethke, M C; Kuru, T H; Schultze, S; Tichy, D; Kopp-Schneider, A; Fenchel, M; Schlemmer, H-P; Hadaschik, B A
2014-02-01
To evaluate the Prostate Imaging Reporting and Data System (PI-RADS) proposed by the European Society of Urogenital Radiology (ESUR) for detection of prostate cancer (PCa) by multiparametric magnetic resonance imaging (mpMRI) in a consecutive cohort of patients with magnetic resonance/transrectal ultrasound (MR/TRUS) fusion-guided biopsy. Suspicious lesions on mpMRI at 3.0 T were scored according to the PI-RADS system before MR/TRUS fusion-guided biopsy and correlated to histopathology results. Statistical correlation was obtained by a Mann-Whitney U test. Receiver operating characteristics (ROC) and optimal thresholds were calculated. In 64 patients, 128/445 positive biopsy cores were obtained out of 95 suspicious regions of interest (ROIs). PCa was present in 27/64 (42%) of the patients. ROC results for the aggregated PI-RADS scores exhibited higher areas under the curve compared to those of the Likert score. Sensitivity/Specificity for the following thresholds were calculated: 85 %/73 % and 67 %/92 % for PI-RADS scores of 9 and 10, respectively; 85 %/60 % and 56 %/97 % for Likert scores of 3 and 4, respectively [corrected. The standardised ESUR PI-RADS system is beneficial to indicate the likelihood of PCa of suspicious lesions on mpMRI. It is also valuable to identify locations to be targeted with biopsy. The aggregated PI-RADS score achieved better results compared to the single five-point Likert score. • The ESUR PI-RADS scoring system was evaluated using multiparametric 3.0-T MRI. • To investigate suspicious findings, transperineal MR/TRUS fusion-guided biopsy was used. • PI-RADS can guide biopsy locations and improve detection of clinically significant cancer. • Biopsy procedures can be optimised, reducing unnecessary negative biopsies for patients. • The PI-RADS scoring system may contribute to more effective prostate MRI.
Clinical scoring system in the evaluation of adult pharyngitis.
Seppälä, H; Lahtonen, R; Ziegler, T; Meurman, O; Hakkarainen, K; Miettinen, A; Arstila, P; Eskola, J; Saikku, P; Huovinen, P
1993-03-01
To compare results of a clinical scoring system for diagnosis of group A streptococcal pharyngitis with microbiologic results, when several different pharyngeal pathogens were tested simultaneously. Evaluation of clinical manifestations of 106 adult patients with pharyngitis of different microbial origin. General private practice; Health Center Pulssi, Turku, Finland. Adult patients whose chief complaints were sore throats. A symptom score that was assigned to each patient according to the total number of certain signs and symptoms that are postulated to increase the probability of group A streptococcal pharyngitis and blood measurements for infection. The highest symptom scores, 3 and 4, were found in 21 patients. These patients had pharyngitis due to group A streptococcus (four patients), group C streptococcus (four patients), group G streptococcus (two patients), group F streptococcus, Mycoplasma pneumoniae, Chlamydia pneumoniae, influenza A virus, influenza B virus, herpes simplex type 1 virus (two patients), and coxsackie B4 virus. No pathogen could be identified from three of the 21 patients. The C-reactive protein values and the leukocyte counts were raised significantly more often in streptococcal infections than in infections of other origin; the P values were .00016 and .028, respectively. Use of a clinical scoring system alone for diagnosis of pharyngitis may lead to improper use of anti-microbial agents. There is a need for accurate microbiologic diagnostic procedures in general practice to determine proper treatment of pharyngitis as well as to test the effect of antibacterial and, in the future, antiviral treatment in respiratory tract infections.
Pierce, Wesly; Mazur, Joseph; Greenberg, Charles; Mueller, Joan; Foster, Joyce; Lazarchick, John
2013-01-01
Over-diagnosis of heparin-induced thrombocytopenia (HIT) results in costly and unnecessary laboratory screening and treatment with direct thrombin inhibitors. Our aim was to evaluate the utility of the 4Ts scoring system to predict HIT in multiple ICU settings and to characterize our treatment of these cases. Eighty-two patients from multiple ICU settings who underwent laboratory testing for HIT were classified as low-, intermediate-, or high-risk patients based on retrospectively adjudicated 4Ts scores. These results were compared with platelet-factor 4 enzyme-linked immunosorbent assays (PF4 ELISAs), optical density (OD) values, and serotonin-release assays (SRAs) to assess the utility of the 4Ts score to rule out ICU-related HIT and reduce laboratory and drug expenditures. Of the 82 patients reviewed, only 12 (11.4%) were PF4-positive and only 1 (1.2%) was SRA-positive for HIT. Heparin was discontinued in only 63.4% of patients suspected to have HIT. There were no significant differences in mean day of platelet fall, mean platelet nadir, and mean percent fall in platelet count between PF4-positive and negative patients (all p > 0.2). There was, however, a significantly higher proportion of patients with an intermediate to high 4Ts score in the PF4-positive group than in the PF4-negative group (66% vs. 30%, respectively; p = 0.02). The mean PF4 OD value in patients with intermediate to high 4Ts scores was significantly higher than in patients with low 4Ts scores (0.658 vs. 0.258, respectively; p < 0.001). The negative predictive values of the 4Ts score relative to the PF4 and SRA were 92% and 100%, respectively. The estimated laboratory and pharmacologic cost avoidance potential of the scoring system in this cohort was $21,450. Our modified 4Ts scoring system appears to be an effective tool for predicting HIT in the ICU and could avoid significant drug and laboratory expenditures if implemented prospectively. The clinical management of patients suspected of HIT is highly variable at our institution. Clinical protocols and education encouraging the proper identification and treatment of suspected HIT need to be established.
Manfredini, A F; Malagoni, A M; Litmanen, H; Zhukovskaja, L; Jeannier, P; Dal Follo, D; Felisatti, M; Besseberg, A; Geistlinger, M; Bayer, P; Carrabre, J E
2011-03-01
Substances and methods used to increase oxygen blood transport and physical performance can be detected in the blood, but the screening of the athletes to be tested remains a critical issue for the International Federations. This project, AR.I.E.T.T.A., aimed to develop a software capable of analysing athletes' hematological and performance profiles to detect abnormal patterns. One-hundred eighty athletes belonging to the International Biathlon Union gave written informed consent to have their hematological data, previously collected according to anti-doping rules, used to develop the AR.I.E.T.T.A. software. Software was developed with the included sections: 1) log-in; 2) data-entry: where data are loaded, stored and grouped; 3) analysis: where data are analysed, validated scores are calculated, and parameters are simultaneously displayed as statistics, tables and graphs, and individual or subpopulation profiles; 4) screening: where an immediate evaluation of the risk score of the present sample and/or the athlete under study is obtained. The sample risk score or AR.I.E.T.T.A. score is calculated by a simple computational system combining different parameters (absolute values and intra-individual variations) considered concurrently. The AR.I.E.T.T.A. score is obtained by the sum of the deviation units derived from each parameter, considering the shift of the present value from the reference values, based on the number of standard deviations. AR.I.E.T.T.A. enables a quick evaluation of blood results assisting surveillance programs and perform timely target testing controls on athletes by the International Federations. Future studies aiming to validate the AR.I.E.T.T.A. score and improve the diagnostic accuracy will improve the system.
Koami, Hiroyuki; Sakamoto, Yuichiro; Sakurai, Ryota; Ohta, Miho; Imahase, Hisashi; Yahata, Mayuko; Umeka, Mitsuru; Miike, Toru; Nagashima, Futoshi; Iwamura, Takashi; Yamada, Kosuke Chris; Inoue, Satoshi
2016-08-01
The aim of this study is to evaluate the hematological differences between septic and traumatic disseminated intravascular coagulation (DIC) using the rotational thromboelastometry (ROTEM).This retrospective study includes all sepsis or severe trauma patients transported to our emergency department who underwent ROTEM from 2013 to 2014. All patients were divided into 2 groups based on the presence of DIC diagnosed by the Japanese Association for Acute Medicine (JAAM) DIC score. We statistically analyzed the demographics, clinical characteristics, laboratory data, ROTEM findings (EXTEM and FIBTEM), and outcome.Fifty-seven patients (30 sepsis and 27 severe trauma) were included in primary analysis. Sepsis cases were significantly older and had higher systemic inflammatory response syndrome (SIRS) scores, whereas there were no significant differences in other parameters including Acute Physiology and Chronic Health Evaluation (APACHE) II score, sequential organ failure assessment (SOFA) score. Twenty-six patients (14 sepsis and 12 severe trauma) were diagnosed with DIC. The Septic DIC (S-DIC) group was significantly older and had higher DIC scores than the traumatic DIC (T-DIC) group. Hematologic examination revealed significantly higher CRP, fibrinogen, lower FDP, DD, and higher FDP/DD ratio were found in the S-DIC group in comparison with the T-DIC group. ROTEM findings showed that the A10, A20, and MCF in the FIBTEM test were significantly higher in the S-DIC group. However, no statistical differences were confirmed in the LI30, LI45, and ML in EXTEM test.The plasma fibrinogen level and fibrinogen based clot firmness in whole-blood test revealed statistical significance between septic and traumatic DIC patients.
Dougados, Maxime; Jousse-Joulin, Sandrine; Mistretta, Frederic; d'Agostino, Maria-Antonietta; Backhaus, Marina; Bentin, Jacques; Chalès, Gérard; Chary-Valckenaere, Isabelle; Conaghan, Philip; Etchepare, Fabien; Gaudin, Philippe; Grassi, Walter; van der Heijde, Désirée; Sellam, Jérémie; Naredo, Esperanza; Szkudlarek, Marcin; Wakefield, Richard; Saraux, Alain
2010-05-01
To evaluate different global ultrasonographic (US) synovitis scoring systems as potential outcome measures of rheumatoid arthritis (RA) according to the Outcome Measures in Rheumatoid Arthritis Clinical Trials (OMERACT) filter. To study selected global scoring systems, for the clinical, B mode and power Doppler techniques, the following joints were evaluated: 28 joints (28-joint Disease Activity Score (DAS28)), 20 joints (metacarpophalangeals (MCPs) + metatarsophalangeals (MTPs)) and 38 joints (28 joints + MTPs) using either a binary (yes/no) or a 0-3 grade. The study was a prospective, 4-month duration follow-up of 76 patients with RA requiring anti-tumour necrosis factor (TNF) therapy (complete follow-up data: 66 patients). Intraobserver reliability was evaluated using the intraclass correlation coefficient (ICC), construct validity was evaluated using the Cronbach alpha test and external validity was evaluated using level of correlation between scoring system and C reactive protein (CRP). Sensitivity to change was evaluated using the standardised response mean. Discriminating capacity was evaluated using the standardised mean differences in patients considered by the doctor as significantly improved or not at the end of the study. Different clinimetric properties of various US scoring systems were at least as good as the clinical scores with, for example, intraobserver reliability ranging from 0.61 to 0.97 versus from 0.53 to 0.82, construct validity ranging from 0.76 to 0.89 versus from 0.76 to 0.88, correlation with CRP ranging from 0.28 to 0.34 versus from 0.28 to 0.35 and sensitivity to change ranging from 0.60 to 1.21 versus from 0.96 to 1.36 for US versus clinical scoring systems, respectively. This study suggests that US evaluation of synovitis is an outcome measure at least as relevant as physical examination. Further studies are required in order to achieve optimal US scoring systems for monitoring patients with RA in clinical trials and in clinical practice.
ERIC Educational Resources Information Center
Ferrara, Steve
2017-01-01
Test security is not an end in itself; it is important because we want to be able to make valid interpretations from test scores. In this article, I propose a framework for comprehensive test security systems: prevention, detection, investigation, and resolution. The article discusses threats to test security, roles and responsibilities, rigorous…
ERIC Educational Resources Information Center
Floyd, Randy G.; Bergeron, Renee; Hamilton, Gloria; Parra, Gilbert R.
2010-01-01
This study investigated the relations among executive functions and cognitive abilities through a joint exploratory factor analysis and joint confirmatory factor analysis of 25 test scores from the Delis-Kaplan Executive Function System and the Woodcock-Johnson III Tests of Cognitive Abilities. Participants were 100 children and adolescents…
Prognostic scores in oesophageal or gastric variceal bleeding.
Ohmann, C; Stöltzing, H; Wins, L; Busch, E; Thon, K
1990-05-01
Numerous scoring systems have been developed for the prediction of outcome of variceal bleeding; however, only a few have been evaluated adequately. The object of this study was to improve the classical Child-Pugh score (CPS) and to test other scores from the literature. Patients (n = 82) with endoscopically confirmed variceal bleeding and long-term sclerotherapy were included in the study. Linear logistic regression (LR) was applied to different sets of prognostic variables with regard to 30-day mortality. In addition, scores from the literature were evaluated on the data set. Performance was measured by the accuracy and receiver-operating characteristic curves. The application of LR to all five CPS variables (accuracy, 80%) was superior to the classical CPS (70%). LR with selection from the CPS variables or from other sets of variables resulted in no improvement. Compared with CPS only three scores from the literature, mainly based on subsets of the CPS variables, showed an improved accuracy. It is concluded that CPS is still a good scoring system; however, it can be improved by statistical analysis using the same variables.
Fama, Rosemary; Sullivan, Edith V; Sassoon, Stephanie A; Pfefferbaum, Adolf; Zahr, Natalie M
2016-12-01
Executive functioning and episodic memory impairment occur in HIV infection (HIV) and chronic alcoholism (ALC). Comorbidity of these conditions (HIV + ALC) is prevalent and heightens risk of vulnerability to separate and compounded deficits. Age and disease-related variables can also serve as mediators of cognitive impairment and should be considered, given the extended longevity of HIV-infected individuals in this era of improved pharmacological therapy. HIV, ALC, HIV + ALC, and normal controls (NC) were administered traditional and computerized tests of executive function and episodic memory. Test scores were expressed as age- and education-corrected Z-scores; selective tests were averaged to compute Executive Function and Episodic Memory Composite scores. Efficiency scores were calculated for tests with accuracy and response times. HIV, ALC, and HIV + ALC had lower scores than NC on Executive Function and Episodic Memory Composites, with HIV + ALC even lower than ALC and HIV on the Episodic Memory Composite. Impairments in planning and free recall of visuospatial material were observed in ALC, whereas impairments in psychomotor speed, sequencing, narrative free recall, and pattern recognition were observed in HIV. Lower decision-making efficiency scores than NC occurred in all 3 clinical groups. In ALC, age and lifetime alcohol consumption were each unique predictors of Executive Function and Episodic Memory Composite scores. In HIV + ALC, age was a unique predictor of Episodic Memory Composite score. Disease-specific and disease-overlapping patterns of impairment in HIV, ALC, and HIV + ALC have implications regarding brain systems disrupted by each disease and clinical ramifications regarding the complexities and compounded damping of cognitive functioning associated with dual diagnosis that may be exacerbated with aging. Copyright © 2016 by the Research Society on Alcoholism.
Aawar, Nadine; Moore, Richard; Riley, Stephen; Salek, Sam
2016-07-01
High Renal Quality of Life Profile (RQLP) scores are associated with impaired health-related quality of life; however, the clinical meaning of the scores is difficult for clinicians and healthcare planners to interpret. The aim of this study was to determine clinical significance of RQLP scores which could be used to aid clinical decision-making. The anchor-based technique (a method for categorizing numeric scores to ease interpretation) was used to develop a categorization system for the RQLP scores using a global question (GQ). The GQ scores (i.e. no effect to extremely large effect) were mapped against the RQLP scores, and intraclass correlation coefficient (ICC) was used to test their agreement. The RQLP and the GQ were administered to 260 adult patients (males = 165 and females = 95) with chronic renal failure (CRF). The mean RQLP score was 67.2, median = 61, SD = 41.5, and range 0-172. The mean GQ score was 1.74, median = 2, SD = 1.27, and range 0-4. The mean, mode, and median of the GQ scores for each RQLP score were used to devise several sets of categories of RQLP score, and the ICC test of agreement was calculated. The proposed set of RQLP score banding for adoption includes: 0-20 = no effect on patient's life (GQ = 0, n = 35); 21-51 = small effect on patient's life (GQ = 1, n = 66); 52-93 = moderate effect on patient's life (GQ = 2, n = 87); 94-134 = very large effect on patient's life (GQ = 3, n = 54); and 135-172 = extremely large effect on patient's life (GQ = 4, n = 18). The ICC coefficient for the proposed banding system was 0.80. The proposed categorization of the RQLP will aid the clinical interpretation of change in RQLP score informing treatment decision-making in routine practice.
ERIC Educational Resources Information Center
Heun, Christopher
2006-01-01
With federal funding riding on test scores, student performance is crucial. Information systems are an important ingredient of assessments that are sometimes overlooked. Small rural school districts, such as Orange County, Virginia, often have only a modest IT budget and must find ways to improve the reliability and redundancy of their systems.…
Personnel Management in the Military: Effects of Retirement Policies on the Retention of Personnel.
1986-01-01
observable characteristics as education level, AFQT scores, and promotion speed, see Ward and Tan (1984). 2 7EFMS is a decision support system being developed...34’For an operationalization of the concept of "quality" by looking at such observable characteristics as education level, AFQT scores, and * ’.,promotion...probabilities would have to deal with predicting the test scores for airmen by using information on entry characteristics such as education level and
Screening for cognitive dysfunction in Huntington's disease with the clock drawing test.
Terwindt, Paul W; Hubers, Anna A M; Giltay, Erik J; van der Mast, Rose C; van Duijn, Erik
2016-09-01
The aim of the study is to investigate the performance of the clock drawing test as a screening tool for cognitive impairment in Huntington's disease (HD) mutation carriers. The performance of the clock drawing test was assessed in 65 mutation carriers using the Shulman and the Freund scoring systems. The mini-mental state examination, the Symbol Digit Modalities Test, the Verbal Fluency Test, and the Stroop tests were used as comparisons for the evaluation of cognitive functioning. Correlations of the clock drawing test with various cognitive tests (convergent validity), neuropsychiatric characteristics (divergent validity) and clinical characteristics were analysed using the Spearman's rank correlation coefficient. Receiver-operator characteristic analyses were performed for the clock drawing test against both the mini-mental state examination and against a composite variable for executive cognitive functioning to assess optimal cut-off scores. Inter-rater reliability was high for both the Shulman and Freund scoring systems (ICC = 0.95 and ICC = 0.90 respectively). The clock drawing tests showed moderate to high correlations with the composite variable for executive cognitive functioning (mean ρ = 0.75) and weaker correlations with the mini-mental state examination (mean ρ = 0.62). Mean sensitivity of the clock drawing tests was 0.82 and mean specificity was 0.79, whereas the mean positive predictive value was 0.66 and the mean negative predictive value was 0.87. The clock drawing test is a suitable screening instrument for cognitive dysfunction in HD, because it was shown to be accurate, particularly so with respect to executive cognitive functioning, and is easy and quick to use. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
The effectiveness of four-cavity treatment systems in sealing amalgam restorations.
Morrow, Leean A; Wilson, Nairn H F
2002-01-01
Amalgam does not bond to tooth tissue; therefore, restorations using such material are prone to leakage despite the deposition of corrosion products. This study evaluated the effectiveness of four cavity treatment systems placed in vivo in sealing restorations of amalgam. Four cavity treatment systems were investigated in this study: Cervitec, Gluma One Bond, Panavia 21 and Copaliner Dentin Varnish and Sealant. No cavity treatment was placed in an additional group to serve as a control. The teeth were extracted within 15 minutes of restoration placement. The specimens were thermocycled (5-55 +/- 2 degrees C, 500 cycles), immersed in a dye solution, sectioned and scored for leakage. Scanning electron microscopy also examined features of the tooth/restoration interfaces. There were statistically significant differences among the groups regarding leakage scores (p = 0.00). None of the materials tested consistently prevented leakage; however, use of Copaliner Dentin Varnish and Sealant resulted in less overall, occlusal and cervical microleakage than any other systems tested. Significantly more leakage was observed in relation to the cervical portions of the cavities (p = 0.00). No significant differences were identified between the leakage scores obtained for the buccal and palatal (lingual) cavities and the different tooth types (p = 0.52 and 0.83, respectively). A level of significance of 0.05 was selected in all cases. The benefits of the materials tested in this study need to be evaluated using robust, long-term clinical studies. Further work should continue to develop laboratory tests that predict the behavior and performance of cavity sealants in clinical service.
Analysis of the Korean Navy Selection Process for the Naval Post Graduate School
1988-06-01
OUTCOME OF ECL TESTING SCORE..........................54 C. OUTCOME OF TOEFL TESTING SCORE.......................55 D. PLOT OF NPS GRADE WITH ECL...TESTING SCORE..............55 E. PLOT OF NPS GRADE WIHT NA GRADE......................56 F. PLOT OF NPS GRADE WITH TOEFL TESTING SCORE............56...OF ECL TESTING SCORE ............. 30 Table S. EXPECTANCY TABLE OF NAG ............................ 31 Table 9. EXPECTANCY TABLE OF TOEFL TESTING SCORE
Vision and academic performance of learning disabled children.
Wharry, R E; Kirkpatrick, S W
1986-02-01
The purpose of this study was to assess difference in academic performance among myopic, hyperopic, and emmetropic children who were learning disabled. More specifically, myopic children were expected to perform better on mathematical and spatial tasks than would hyperopic ones and that hyperopic and emmetropic children would perform better on verbal measures than would myopic ones. For 439 learning disabled students visual anomalies were determined via a Generated Retinal Reflex Image Screening System. Test data were obtained from school files. Partial support for the hypothesis was obtained. Myopic learning disabled children outperformed hyperopic and emmetropic children on the Key Math test. Myopic children scored better than hyperopic children on the WRAT Reading subtest and on the Durrell Analysis of Reading Difficulty Oral Reading Comprehension, Oral Rate, Flashword, and Spelling subtests, and on the Key Math Measurement and Total Scores. Severity of refractive error significantly affected the Wechsler Intelligence Scale for Children--Revised Full Scale, Performance Scale, Verbal Scale, and Digit Span scores but did not affect any academic test scores. Several other findings were also reported. Those with nonametropic problems scored higher than those without problems on the Key Math Time subtest. Implications supportive of the theories of Benbow and Benbow and Geschwind and Behan were stated.
Design and implementation of online automatic judging system
NASA Astrophysics Data System (ADS)
Liang, Haohui; Chen, Chaojie; Zhong, Xiuyu; Chen, Yuefeng
2017-06-01
For lower efficiency and poorer reliability in programming training and competition by currently artificial judgment, design an Online Automatic Judging (referred to as OAJ) System. The OAJ system including the sandbox judging side and Web side, realizes functions of automatically compiling and running the tested codes, and generating evaluation scores and corresponding reports. To prevent malicious codes from damaging system, the OAJ system utilizes sandbox, ensuring the safety of the system. The OAJ system uses thread pools to achieve parallel test, and adopt database optimization mechanism, such as horizontal split table, to improve the system performance and resources utilization rate. The test results show that the system has high performance, high reliability, high stability and excellent extensibility.
Measuring the Reliability of Picture Story Exercises like the TAT
Gruber, Nicole; Kreuzpointner, Ludwig
2013-01-01
As frequently reported, psychometric assessments on Picture Story Exercises, especially variations of the Thematic Apperception Test, mostly reveal inadequate scores for internal consistency. We demonstrate that the reason for this apparent shortcoming is not caused by the coding system itself but from the incorrect use of internal consistency coefficients, especially Cronbach’s α. This problem could be eliminated by using the category-scores as items instead of the picture-scores. In addition to a theoretical explanation we prove mathematically why the use of category-scores produces an adequate internal consistency estimation and examine our idea empirically with the origin data set of the Thematic Apperception Test by Heckhausen and two additional data sets. We found generally higher values when using the category-scores as items instead of picture-scores. From an empirical and theoretical point of view, the estimated reliability is also superior to each category within a picture as item measuring. When comparing our suggestion with a multifaceted Rasch-model we provide evidence that our procedure better fits the underlying principles of PSE. PMID:24348902
A nutrient profiling assessment of packaged foods using two star-based front-of-pack labels.
Carrad, Amy M; Louie, Jimmy Chun Yu; Yeatman, Heather R; Dunford, Elizabeth K; Neal, Bruce C; Flood, Victoria M
2016-08-01
To compare two front-of-pack nutrition labelling systems for the assessment of packaged foods and drinks with Australian Dietary Guidelines. A cross-sectional nutrient profiling assessment. Food and drink products (n 20 225) were categorised into scoring levels using criteria for the Institute of Medicine (IOM) three-star system and the five-star Australian Health Star Rating (HSR). The effectiveness of these systems to categorise foods in accordance with Australian Dietary Guidelines was explored. The study was conducted in Australia, using a comprehensive food database. Packaged food and drink products (n 20 225) available in Australia. Using the IOM three-star system, the majority (55 %) of products scored the minimum 0 points and 25·5 % scored the maximum 3 points. Using HSR criteria, the greatest proportion of products (15·2 %) scored three-and-a-half stars from a possible five and 12·5 % received the lowest rating of a half-star. Very few products (4·1 %) scored five stars. Products considered core foods and drinks in Australian Dietary Guidelines received higher scores than discretionary foods in all food categories for both labelling systems (all P<0·05; Mann-Whitney U test), with the exception of fish products using IOM three-star criteria (P=0·603). The largest discrepancies in median score between the two systems were for the food categories edible oils, convenience foods and dairy. Both the IOM three-star and Australian HSR front-of-pack labelling systems rated packaged foods and drinks broadly in line with Australian Dietary Guidelines by assigning core foods higher ratings and discretionary foods lower ratings.
Introduction to an Open Source Internet-Based Testing Program for Medical Student Examinations
2009-01-01
The author developed a freely available open source internet-based testing program for medical examination. PHP and Java script were used as the programming language and postgreSQL as the database management system on an Apache web server and Linux operating system. The system approach was that a super user inputs the items, each school administrator inputs the examinees' information, and examinees access the system. The examinee's score is displayed immediately after examination with item analysis. The set-up of the system beginning with installation is described. This may help medical professors to easily adopt an internet-based testing system for medical education. PMID:20046457
Introduction to an open source internet-based testing program for medical student examinations.
Lee, Yoon-Hwan
2009-12-20
The author developed a freely available open source internet-based testing program for medical examination. PHP and Java script were used as the programming language and postgreSQL as the database management system on an Apache web server and Linux operating system. The system approach was that a super user inputs the items, each school administrator inputs the examinees' information, and examinees access the system. The examinee's score is displayed immediately after examination with item analysis. The set-up of the system beginning with installation is described. This may help medical professors to easily adopt an internet-based testing system for medical education.
Do Examinees Understand Score Reports for Alternate Methods of Scoring Computer Based Tests?
ERIC Educational Resources Information Center
Whittaker, Tiffany A.; Williams, Natasha J.; Dodd, Barbara G.
2011-01-01
This study assessed the interpretability of scaled scores based on either number correct (NC) scoring for a paper-and-pencil test or one of two methods of scoring computer-based tests: an item pattern (IP) scoring method and a method based on equated NC scoring. The equated NC scoring method for computer-based tests was proposed as an alternative…
Sharifi, Parvane; Rahmati, Abbas; Saber, Maryam
2013-10-01
To evaluate the effect of note-taking skills training on the achievement motivation in learning. The experimental study comprised graduate students of the 2010-11 batch at Kerman's Bahonar University and Kerman's Medical Sciences University, Iran. The study sample included 110 people; 55 in the test group, and 55 in the control group. They were randomly selected and replaced through the single-stage cluster sampling. To collect the data, a questionnaire was used. Pre-test was performed before the training session in two groups. After training course, a post-test was taken. For data analysis, the independent t-test, was used. The average pre-test score of the test group was 182 +/- 34.15, while for the control group it was 191 +/- 30.37 (p < 0.089). After the training, the post-test showed statistically significant change. The test group scored 220 +/- 20.94 against the controls who scored 195 +/- 27.26 (p < 0.001). The findings showed that achievement motivation in learning increased significantly after imparting training in note-taking skills. Authorities in the educational system should invest more for promotion of such skills.
Beese, Mark E; Joy, Elizabeth; Switzler, Craig L; Hicks-Little, Charlie A
2015-08-01
Single-sport specialization (SSS) is becoming more prevalent in youth athletes. Deficits in functional movement have been shown to predispose athletes to injury. It is unclear whether a link exists between SSS and the development of functional movement deficits that predispose SSS athletes to an increased risk of knee injury. To determine whether functional movement deficits exist in SSS athletes compared with multi-sport (M-S) athletes. Cross-sectional study. Soccer practice fields. A total of 40 (21 SSS [age = 15.05 ± 1.2 years], 19 M-S [age = 15.32 ± 1.2 years]) female high school athlete volunteers were recruited through local soccer clubs. All SSS athletes played soccer. Participants were grouped into 2 categories: SSS and M-S. All participants completed 3 trials of the standard Landing Error Scoring System (LESS) jump-landing task. They performed a double-legged jump from a 30-cm platform, landing on a rubber mat at a distance of half their body height. Upon landing, participants immediately performed a maximal vertical jump. Values were assigned to each trial using the LESS scoring criteria. We averaged the 3 scored trials and then used a Mann-Whitney U test to test for differences between groups. Participant scores from the jump-landing assessment for each group were also placed into the 4 defined LESS categories for group comparison using a Pearson χ(2) test. The α level was set a priori at .05. Mean scores were 6.84 ± 1.81 for the SSS group and 6.07 ± 1.93 for the M-S group. We observed no differences between groups (z = -1.44, P = .15). A Pearson χ(2) analysis revealed that the proportions of athletes classified as having excellent, good, moderate, or poor LESS scores were not different between the SSS and M-S groups ([Formula: see text] = 1.999, P = .57). Participation in soccer alone compared with multiple sports did not affect LESS scores in adolescent female soccer players. However, the LESS scores indicated that most female adolescent athletes may be at an increased risk for knee injury, regardless of the number of sports played.
Chronic viral hepatitis: the histology report.
Guido, Maria; Mangia, Alessandra; Faa, Gavino
2011-03-01
In chronic viral hepatitis, the role of liver biopsy as a diagnostic test has seen a decline, paralleled by its increasing importance for prognostic purposes. Nowadays, the main indication for liver biopsy in chronic viral hepatitis is to assess the severity of the disease, in terms of both necro-inflammation (grade) and fibrosis (stage), which is important for prognosis and therapeutic management. Several scoring systems have been proposed for grading and staging chronic viral hepatitis and there is no a general consensus on the best system to be used in the daily practice. All scoring systems have their drawbacks and all may be affected by sampling and observer variability. Whatever the system used, a histological score is a reductive approach since damage in chronic viral hepatitis is a complex biological process. Thus, scoring systems are not intended to replace the detailed, descriptive, pathology report. In fact, lesions other than those scored for grading and staging may have clinical relevance and should be assessed and reported. This paper aims to provide a systematic approach to the interpretation of liver biopsies obtained in cases of chronic viral hepatitis, with the hope of helping general pathologists in their diagnostic practice. Copyright © 2011 Editrice Gastroenterologica Italiana S.r.l. Published by Elsevier Ltd.. All rights reserved.
Relapses vs. reactions in multibacillary leprosy: proposal of new relapse criteria.
Linder, Katharina; Zia, Mutaher; Kern, Winfried V; Pfau, Ruth K M; Wagner, Dirk
2008-03-01
To compare a new scoring system for multibacillary (MB) leprosy relapses, which combines time factor, risk factors and clinical presentation at relapse, to WHO criteria. Data were collected on all relapses diagnosed between 1998 and 2004 at the Marie-Adelaide-Centre in Karachi, Pakistan, including case histories, clinical manifestations, follow-up, bacterial indices, treatment and contacts. For the diagnosis of MB relapses a simple scoring system was developed and validated on a data-set of mouse foot pads (MFP)-confirmed relapses (Leprosy Reviews, 76, 2005, 241). Its sensitivity was further evaluated in the Karachi relapse cohort. The P-value was calculated with McNemar's test with continuity correction. The new scoring system that combines time factor, risk factors and clinical presentation at relapse had a higher sensitivity in MFP-confirmed relapses than the WHO-criteria (95%vs. 65%, P < 0.01). The sensitivity of the scoring system was also significantly higher than the WHO criteria in the 57 cases of MB-relapses diagnosed in Karachi (72%vs. 54%, P < 0.05). This new simple scoring system for diagnosing MB-relapses in leprosy should be further validated in a prospective study to confirm its superior sensitivity and to evaluate the specificity of these criteria by using MFP-confirmation for patients presenting with signs of activity after treatment.
Elkjær, Karina; Labouriau, Rodrigo; Ancker, Marie-Louise; Gustafsson, Hans; Callesen, Henrik
2013-12-01
A detailed study of 398,237 lactations of Danish Holstein dairy cows was undertaken. The objective was to investigate the information gained by evaluating vaginal discharge in cows from 5 to 19 days post-partum (p.p.) using an ordinal scale from 0 to 9. The study focused on the interval from calving to first insemination (CFI) and the non-return rate 56 days after first insemination (NR56), adjusted for the confounders milk production and body condition score (BCS). For the analyses, BCS was evaluated on the same day that the uterine score was made. Milk production was defined as test-day milk yield in the first month p.p. The study showed that the evaluation of vaginal discharge according to this score system permitted ranking of cows according to CFI and NR56, i.e. an increasing uterine score was associated with a significantly longer time from calving to first insemination and significantly reduced the probability of success of the first insemination. Reproductive success was already affected if the uterine score had reached 4 (i.e. before the discharge smelled abnormally). The negative effect on CFI and NR56 increased as the uterine score increased, which suggested that the uterine scoring system was a useful guide to dairy producers. Copyright © 2013 Elsevier Ltd. All rights reserved.
Meil, William M; LaPorte, David J; Mills, John A; Sesti, Ann; Collins, Sunshine M; Stiver, Alyssa G
2016-11-01
The development of substance use and addiction has been linked to impaired executive function which relies on systems that converge in the prefrontal cortex. This study examined several measures of executive function as predictors of college student alcohol, tobacco, and marijuana use frequency and abuse. College students (N=321) were administered the Delis-Kaplan Executive Function System (D-KEFS) test battery, the Sensation Seeking Scale V (SSSV), the Frontal Systems Behavioral Scale (FrSBe), the Perceived Stress Scale (PSS), the Michigan Alcohol Screening Test (MAST), the Fagerstrom Test of Nicotine Dependence (FTND). Alcohol use frequency was predicted by sensation seeking and FrSBe Disinhibition scores, but the latter only emerged as a unique predictor for binge drinking frequency. Sex and Disinhibition, Apathy and Executive Function FrSBe subscales predicted the frequency of tobacco use. FrSBe scores uniquely predicted tobacco use among daily users. Marijuana use frequency was predicted by sensation seeking, sex, perceived stress, and FrSBe Disinhibition scores, but only sensation seeking predicted daily use after controlling for other variables. FrSBe Disinhibition scores reached levels considered to be clinically significant for frequent binge drinkers and daily marijuana users. Sensation seeking emerged as the predominate predictor of the early stages of alcohol and tobacco related problems. These results suggest ecologically based self-report measures of frontal lobe function and sensation seeking are significant predictors of use frequency among college students and the extent of frontal dysfunction may be clinically significant among some heavy users. Copyright © 2016 Elsevier Ltd. All rights reserved.
Lin, Deng-Juin; Li, Ya-Hsin; Pai, Jar-Yuan; Sheu, Ing-Cheau; Glen, Robert; Chou, Ming-Jen; Lee, Ching-Yi
2009-12-19
Chronic kidney disease (CKD) is a serious public health problem in Taiwan and the world. The most effective, affordable treatments involve early prevention/detection/intervention, requiring screening. Successfully implementing CKD programs requires good patient participation, affected by patient perceptions of screening service quality. Service quality improvements can help make such programs more successful. Thus, good tools for assessing service quality perceptions are important. to investigate using a modified SERVQUAL questionnaire in assessing patient expectations, perceptions, and loyalty towards kidney disease screening service quality. 1595 kidney disease screening program patients in Taichung City were requested to complete and return a modified kidney disease screening SERVQUAL questionnaire. 1187 returned them. Incomplete ones (102) were culled and 1085 were chosen as effective for use. Paired t-tests, correlation tests, ANOVA, LSD test, and factor analysis identified the characteristics and factors of service quality. The paired t-test tested expectation score and perception score gaps. A structural equation modeling system examined satisfaction-based components' relationships. The effective response rate was 91.4%. Several methods verified validity. Cronbach's alpha on internal reliability was above 0.902. On patient satisfaction, expectation scores are high: 6.50 (0.82), but perception scores are significantly lower 6.14 (1.02). Older patients' perception scores are lower than younger patients'. Expectation and perception scores for patients with different types of jobs are significantly different. Patients higher on education have lower scores for expectation (r = -0.09) and perception (r = -0.26). Factor analysis identified three factors in the 22 item SERVQUAL form, which account for 80.8% of the total variance for the expectation scores and 86.9% of the total variance for the satisfaction scores. Expectation and perception score gaps in all 22 items are significant. The goodness-of-fit summary of the SEM results indicates that expectations and perceptions are positively correlated, perceptions and loyalty are positively correlated, but expectations and loyalty are not positively correlated. The results of this research suggest that the SERVQUAL instrument is a useful measurement tool in assessing and monitoring service quality in kidney disease screening services, enabling the staff to identify where service improvements are needed from the patients' perspectives.
Automated Generation and Assessment of Autonomous Systems Test Cases
NASA Technical Reports Server (NTRS)
Barltrop, Kevin J.; Friberg, Kenneth H.; Horvath, Gregory A.
2008-01-01
This slide presentation reviews some of the issues concerning verification and validation testing of autonomous spacecraft routinely culminates in the exploration of anomalous or faulted mission-like scenarios using the work involved during the Dawn mission's tests as examples. Prioritizing which scenarios to develop usually comes down to focusing on the most vulnerable areas and ensuring the best return on investment of test time. Rules-of-thumb strategies often come into play, such as injecting applicable anomalies prior to, during, and after system state changes; or, creating cases that ensure good safety-net algorithm coverage. Although experience and judgment in test selection can lead to high levels of confidence about the majority of a system's autonomy, it's likely that important test cases are overlooked. One method to fill in potential test coverage gaps is to automatically generate and execute test cases using algorithms that ensure desirable properties about the coverage. For example, generate cases for all possible fault monitors, and across all state change boundaries. Of course, the scope of coverage is determined by the test environment capabilities, where a faster-than-real-time, high-fidelity, software-only simulation would allow the broadest coverage. Even real-time systems that can be replicated and run in parallel, and that have reliable set-up and operations features provide an excellent resource for automated testing. Making detailed predictions for the outcome of such tests can be difficult, and when algorithmic means are employed to produce hundreds or even thousands of cases, generating predicts individually is impractical, and generating predicts with tools requires executable models of the design and environment that themselves require a complete test program. Therefore, evaluating the results of large number of mission scenario tests poses special challenges. A good approach to address this problem is to automatically score the results based on a range of metrics. Although the specific means of scoring depends highly on the application, the use of formal scoring - metrics has high value in identifying and prioritizing anomalies, and in presenting an overall picture of the state of the test program. In this paper we present a case study based on automatic generation and assessment of faulted test runs for the Dawn mission, and discuss its role in optimizing the allocation of resources for completing the test program.
Clinical use of the ABO-Scoring Index: reliability and subtraction frequency.
Lieber, William S; Carlson, Sean K; Baumrind, Sheldon; Poulton, Donald R
2003-10-01
This study tested the reliability and subtraction frequency of the study model-scoring system of the American Board of Orthodontists (ABO). We used a sample of 36 posttreatment study models that were selected randomly from six different orthodontic offices. Intrajudge and interjudge reliability was calculated using nonparametric statistics (Spearman rank coefficient, Wilcoxon, Kruskal-Wallis, and Mann-Whitney tests). We found differences ranging from 3 to 6 subtraction points (total score) for intrajudge scoring between two sessions. For overall total ABO score, the average correlation was .77. Intrajudge correlation was greatest for occlusal relationships and least for interproximal contacts. Interjudge correlation for ABO score averaged r = .85. Correlation was greatest for buccolingual inclination and least for overjet. The data show that some judges, on average, were much more lenient than others and that this resulted in a range of total scores between 19.7 and 27.5. Most of the deductions were found in the buccal segments and most were related to the second molars. We present these findings in the context of clinicians preparing for the ABO phase III examination and for orthodontists in their ongoing evaluation of clinical results.
High-Stakes Testing in Education: Science and Practice in K-12 Settings
ERIC Educational Resources Information Center
Bovaird, James A., Ed.; Geisinger, Kurt F., Ed.; Buckendahl, Chad W., Ed.
2011-01-01
Educational assessment and, more broadly, educational research in the United States have entered into an era characterized by a dramatic increase in the prevalence and importance of test score use in accountability systems. This volume covers a selection of contemporary issues about testing science and practice that impact the nation's public…
Developmental Eye Movement (DEM) Test Norms for Mandarin Chinese-Speaking Chinese Children
Tong, Meiling; Zhang, Min; Li, Tingting; Xu, Yaqin; Guo, Xirong; Hong, Qin; Chi, Xia
2016-01-01
The Developmental Eye Movement (DEM) test is commonly used as a clinical visual-verbal ocular motor assessment tool to screen and diagnose reading problems at the onset. No established norm exists for using the DEM test with Mandarin Chinese-speaking Chinese children. This study aims to establish the normative values of the DEM test for the Mandarin Chinese-speaking population in China; it also aims to compare the values with three other published norms for English-, Spanish-, and Cantonese-speaking Chinese children. A random stratified sampling method was used to recruit children from eight kindergartens and eight primary schools in the main urban and suburban areas of Nanjing. A total of 1,425 Mandarin Chinese-speaking children aged 5 to 12 years took the DEM test in Mandarin Chinese. A digital recorder was used to record the process. All of the subjects completed a symptomatology survey, and their DEM scores were determined by a trained tester. The scores were computed using the formula in the DEM manual, except that the “vertical scores” were adjusted by taking the vertical errors into consideration. The results were compared with the three other published norms. In our subjects, a general decrease with age was observed for the four eye movement indexes: vertical score, adjusted horizontal score, ratio, and total error. For both the vertical and adjusted horizontal scores, the Mandarin Chinese-speaking children completed the tests much more quickly than the norms for English- and Spanish-speaking children. However, the same group completed the test slightly more slowly than the norms for Cantonese-speaking children. The differences in the means were significant (P<0.001) in all age groups. For several ages, the scores obtained in this study were significantly different from the reported scores of Cantonese-speaking Chinese children (P<0.005). Compared with English-speaking children, only the vertical score of the 6-year-old group, the vertical-horizontal time ratio of the 8-year-old group and the errors of 9-year-old group had no significant difference (P>0.05); compared with Spanish-speaking children, the scores were statistically significant (P<0.001) for the total error scores of the age groups, except the 6-, 9-, 10-, and 11-year-old age groups (P>0.05). DEM norms may be affected by differences in language, cultural, and educational systems among various ethnicities. The norms of the DEM test are proposed for use with Mandarin Chinese-speaking children in Nanjing and will be proposed for children throughout China. PMID:26881754
Zimmermann, Laura J; Ferrucci, Luigi; Kiang Liu; Lu Tian; Guralnik, Jack M; Criqui, Michael H; Yihua Liao; McDermott, Mary M
2011-06-01
We hypothesized that, in the absence of clinically recognized dementia, cognitive dysfunction measured by the clock draw test (CDT) is associated with greater functional impairment in men and women with peripheral artery disease (PAD). Participants were men and women aged 60 years and older with Mini-Mental Status Examination scores ≥ 24 with PAD (n = 335) and without PAD (n = 234). We evaluated the 6-minute walk test, 4-meter walking velocity at usual and fastest pace, the Short Physical Performance Battery (SPPB), and accelerometer-measured physical activity. CDTs were scored using the Shulman system as follows: Category 1 (worst): CDT score 0-2; Category 2: CDT score 3; Category 3 (best): CDT score 4-5. Results were adjusted for age, sex, race, education, ankle-brachial index (ABI), and comorbidities. In individuals with PAD, lower CDT scores were associated with slower 4-meter usual-paced walking velocity (Category 1: 0.78 meters/second; Category 2: 0.83 meters/second; Category 3: 0.86 meters/second; p-trend = 0.025) and lower physical activity (Category 1: 420 activity units; Category 2: 677 activity units; Category 3: 701 activity units; p-trend = 0.045). Poorer CDT scores were also associated with worse functional performance in individuals without PAD (usual and fast-paced walking velocity and SPPB, p-trend = 0.022, 0.043, and 0.031, respectively). In conclusion, cognitive impairment identified with CDT is independently associated with greater functional impairment in older, dementia-free individuals with and without PAD. Longitudinal studies are necessary to explore whether baseline CDT scores and changes in CDT scores over time can predict long-term decline in functional performance in individuals with and without PAD.
Esmaeili, Alireza; Stewart, Andrew M; Hopkins, William G; Elias, George P; Lazarus, Brendan H; Rowell, Amber E; Aughey, Robert J
2018-01-01
Aim: The sit and reach test (S&R), dorsiflexion lunge test (DLT), and adductor squeeze test (AST) are commonly used in weekly musculoskeletal screening for athlete monitoring and injury prevention purposes. The aim of this study was to determine the normal week to week variability of the test scores, individual differences in variability, and the effects of training load on the scores. Methods: Forty-four elite Australian rules footballers from one club completed the weekly screening tests on day 2 or 3 post-main training (pre-season) or post-match (in-season) over a 10 month season. Ratings of perceived exertion and session duration for all training sessions were used to derive various measures of training load via both simple summations and exponentially weighted moving averages. Data were analyzed via linear and quadratic mixed modeling and interpreted using magnitude-based inference. Results: Substantial small to moderate variability was found for the tests at both season phases; for example over the in-season, the normal variability ±90% confidence limits were as follows: S&R ±1.01 cm, ±0.12; DLT ±0.48 cm, ±0.06; AST ±7.4%, ±0.6%. Small individual differences in variability existed for the S&R and AST (factor standard deviations between 1.31 and 1.66). All measures of training load had trivial effects on the screening scores. Conclusion: A change in a test score larger than the normal variability is required to be considered a true change. Athlete monitoring and flagging systems need to account for the individual differences in variability. The tests are not sensitive to internal training load when conducted 2 or 3 days post-training or post-match, and the scores should be interpreted cautiously when used as measures of recovery.
Repeat neurobehavioral study of borderline personality disorder.
van Reekum, R; Links, P S; Finlayson, M A; Boyle, M; Boiago, I; Ostrander, L A; Moustacalis, E
1996-01-01
Previous research has tentatively identified a large subgroup of patients with borderline personality disorder (BPD) with histories of developmental or acquired brain insults. Similarly, these studies have demonstrated a possible biological correlation between the severity of BPD and the number of previous brain insults. The possibility of frontal system cognitive dysfunction in BPD has been raised. This single-blind, case-control study of BPD showed that 13 of 24 subjects with BPD had suffered a brain insult. Correlations between neurodevelopmental/acquired brain injury score and the diagnostic interview for borderline (DIB) score (r = 0.47), and between frontal system cognitive functioning and DIB score (r = -0.37) were seen. Neurocognitive testing and comparison with a cohort of subjects with traumatic brain injury (TBI) showed a pattern of similar cognitive functioning between the 2 groups, with the only differences on individual tests being in the direction of worse functioning in the group with BPD on 2 tasks. These results support the hypotheses described above. The main limitation reflects the low numbers of subjects. PMID:8580113
Overview of different scoring systems in Fournier’s Gangrene and assessment of prognostic factors
Doluoğlu, Ömer Gökhan; Karagöz, Mehmet Ali; Kılınç, Muhammet Fatih; Karakan, Tolga; Yücetürk, Cem Nedim; Sarıcı, Haşmet; Özgür, Berat Cem; Eroğlu, Muzaffer
2016-01-01
Objective In this study we aimed to evaluate prognostic factors for the survival of patients with Fournier’s gangrene (FG), and overview different validated scoring systems for outcome prediction. Material and methods We retrospectively analyzed the data of 39 patients treated for FG in our clinic. Data were collected on medical history, symptoms, physical examination findings, vital signs, laboratory parameters at admission and at the end of treatment, timing and extent of surgical debridement, and the antibiotic treatment used. The Fournier’s Gangrene Severity Index (FGSI) and Charlson Comorbidity Index (CCI) were used to predict outcome. The data were analyzed in relation with the survival of the patients. Mann-Whitney U test, chi -square test, Wilcoxon signed rank test, and Cox regression analysis were used for the statistical analysis. Results Of 39 patients analyzed, 8 (20.5%) died and 31 (79.5%) survived. The median FGSI score on admission was 2 (0–9) for the survivors and 6 (2–14) for the non-survivors (p=0.004). The median CCI scores of the survivors and non-survivors were 2 (0–10) and 6.5 (5–11), respectively (p=0.001). Except for urea, albumin and hematocrit levels, no significant differences were found between survivors and non-survivors for other laboratory parameters on admission. Lower albumin levels and advanced age were found to be associated with mortality. Conclusion High blood urea, low albumin, and low hematocrit levels were associated with poor prognosis. High CCI and FGSI scores could be associated with a poor prognosis in patients with FG. PMID:27635295
Kisala, Pamela A; Tulsky, David S; Kalpakjian, Claire Z; Heinemann, Allen W; Pohlig, Ryan T; Carle, Adam; Choi, Seung W
2015-05-01
To develop a calibrated item bank and computer adaptive test to assess anxiety symptoms in individuals with spinal cord injury (SCI), transform scores to the Patient Reported Outcomes Measurement Information System (PROMIS) metric, and create a statistical linkage with the Generalized Anxiety Disorder (GAD)-7, a widely used anxiety measure. Grounded-theory based qualitative item development methods; large-scale item calibration field testing; confirmatory factor analysis; graded response model item response theory analyses; statistical linking techniques to transform scores to a PROMIS metric; and linkage with the GAD-7. Setting Five SCI Model System centers and one Department of Veterans Affairs medical center in the United States. Participants Adults with traumatic SCI. Spinal Cord Injury-Quality of Life (SCI-QOL) Anxiety Item Bank Seven hundred sixteen individuals with traumatic SCI completed 38 items assessing anxiety, 17 of which were PROMIS items. After 13 items (including 2 PROMIS items) were removed, factor analyses confirmed unidimensionality. Item response theory analyses were used to estimate slopes and thresholds for the final 25 items (15 from PROMIS). The observed Pearson correlation between the SCI-QOL Anxiety and GAD-7 scores was 0.67. The SCI-QOL Anxiety item bank demonstrates excellent psychometric properties and is available as a computer adaptive test or short form for research and clinical applications. SCI-QOL Anxiety scores have been transformed to the PROMIS metric and we provide a method to link SCI-QOL Anxiety scores with those of the GAD-7.
Ferrie, Joseph P; Rolf, Karen; Troesken, Werner
2012-01-01
Higher prior exposure to water-borne lead among male World War Two U.S. Army enlistees was associated with lower intelligence test scores. Exposure was proxied by urban residence and the water pH levels of the cities where enlistees lived in 1930. Army General Classification Test scores were six points lower (nearly 1/3 standard deviation) where pH was 6 (so the water lead concentration for a given amount of lead piping was higher) than where pH was 7 (so the concentration was lower). This difference rose with time exposed. At this time, the dangers of exposure to lead in water were not widely known and lead was ubiquitous in water systems, so these results are not likely the effect of individuals selecting into locations with different levels of exposure. Copyright © 2011 Elsevier B.V. All rights reserved.
Testing the Predictive Validity of the Hendrich II Fall Risk Model.
Jung, Hyesil; Park, Hyeoun-Ae
2018-03-01
Cumulative data on patient fall risk have been compiled in electronic medical records systems, and it is possible to test the validity of fall-risk assessment tools using these data between the times of admission and occurrence of a fall. The Hendrich II Fall Risk Model scores assessed during three time points of hospital stays were extracted and used for testing the predictive validity: (a) upon admission, (b) when the maximum fall-risk score from admission to falling or discharge, and (c) immediately before falling or discharge. Predictive validity was examined using seven predictive indicators. In addition, logistic regression analysis was used to identify factors that significantly affect the occurrence of a fall. Among the different time points, the maximum fall-risk score assessed between admission and falling or discharge showed the best predictive performance. Confusion or disorientation and having a poor ability to rise from a sitting position were significant risk factors for a fall.
Ferreira, João Gomes; Bricker, Suzanne B; Simas, Teresa Castro
2007-03-01
The Assessment of Estuarine Trophic Status (ASSETS) screening model has been extended to allow its application to both estuarine and coastal systems. The model, which combines elements of pressure, state and response, was tested on four systems: Maryland Coastal Bays and Long Island Sound in the United States and The Firth of Clyde (Scotland) and Tagus Estuary (Portugal) in the European Union. The overall scores were: Maryland Coastal Bays: Bad; Firth of Clyde: Poor; Tagus Estuary: Good. Long Island Sound was modelled along a timeline, using 1991 data (score: Bad) and 2002 data (score: Moderate). The improvement registered for Long Island Sound is a consequence of the reduction in nutrient loading, and the ASSETS score changed accordingly. The two main areas where developments are needed are (a) In the definition of type-specific ranges for eutrophication parameters, due to the recognition that natural or pristine conditions may vary widely, and the use of a uniform set of thresholds artificially penalizes some systems and potentially leads to misclassification; (b) In the definition and quantification of measures which will result in an improved state through a change in pressures, as well as in the definition of appropriate metrics through which response may be assessed. One possibility is the use of detailed research models where different response scenarios potentially produce changes in pressure and state. These outputs may be used to drive screening models and analyze the suitability of candidate metrics for evaluating management options.
Atashi, Alireza; Amini, Shahram; Tashnizi, Mohammad Abbasi; Moeinipour, Ali Asghar; Aazami, Mathias Hossain; Tohidnezhad, Fariba; Ghasemi, Erfan; Eslami, Saeid
2018-01-01
Introduction The European System for Cardiac Operative Risk Evaluation II (EuroSCORE II) is a prediction model which maps 18 predictors to a 30-day post-operative risk of death concentrating on accurate stratification of candidate patients for cardiac surgery. Objective The objective of this study was to determine the performance of the EuroSCORE II risk-analysis predictions among patients who underwent heart surgeries in one area of Iran. Methods A retrospective cohort study was conducted to collect the required variables for all consecutive patients who underwent heart surgeries at Emam Reza hospital, Northeast Iran between 2014 and 2015. Univariate and multivariate analysis were performed to identify covariates which significantly contribute to higher EuroSCORE II in our population. External validation was performed by comparing the real and expected mortality using area under the receiver operating characteristic curve (AUC) for discrimination assessment. Also, Brier Score and Hosmer-Lemeshow goodness-of-fit test were used to show the overall performance and calibration level, respectively. Results Two thousand five hundred eight one (59.6% males) were included. The observed mortality rate was 3.3%, but EuroSCORE II had a prediction of 4.7%. Although the overall performance was acceptable (Brier score=0.047), the model showed poor discriminatory power by AUC=0.667 (sensitivity=61.90, and specificity=66.24) and calibration (Hosmer-Lemeshow test, P<0.01). Conclusion Our study showed that the EuroSCORE II discrimination power is less than optimal for outcome prediction and less accurate for resource allocation programs. It highlights the need for recalibration of this risk stratification tool aiming to improve post cardiac surgery outcome predictions in Iran. PMID:29617500
ERIC Educational Resources Information Center
Foorman, Barbara R.; Petscher, Yaacov
2011-01-01
In Florida, mean proficiency scores are reported on the Florida Comprehensive Achievement Test (FCAT) as well as recommended learning gains from the developmental scale score. Florida now has another within-year measure of growth in reading comprehension from the Florida Assessments for Instruction in Reading (FAIR). The FAIR reading comprehension…
ERIC Educational Resources Information Center
Pivovarova, Margarita; Amrein-Beardsley, Audrey
2018-01-01
While states are no longer required to set up teacher evaluation systems based in significant part on student test scores, quite a few continue to use value-added (VAMs) or student growth percentile (SGP) models for that purpose. In this study, we analyzed three years of teacher data to illustrate the performance of teachers' median growth…
ERIC Educational Resources Information Center
South, Emogene
2014-01-01
The purpose of this study was to determine if a difference in achievement scores exist between students who attended the Johnson County School System preschool program and those who did not as measured by standardized TCAP achievement test Reading/Language Arts and Math scores of students in the third and fourth grades. The variables of grade…
ERIC Educational Resources Information Center
Khoshsima, Hooshang; Saed, Amin; Mousaei, Fatemeh
2018-01-01
Language proficiency tests have become common instruments to judge people based on their performance. Thus, the scores on language proficiency tests, such as the International English Language Testing System (IELTS) or Teaching English as a Foreign Language (TOEFL), play a crucial role in the test-takers' lives. Because of increasing demands on…
Molina, Gustavo Fabián; Cabral, Ricardo Juan; Mazzola, Ignacio; Lascano, Laura Brain; Frencken, Jo E
2013-01-01
The Atraumatic Restorative Treatment (ART) approach was suggested to be a suitable method to treat enamel and dentine carious lesions in patients with disabilities. The use of a restorative glass-ionomer with optimal mechanical properties is, therefore, very important. To test the null-hypotheses that no difference in diametral tensile, compressive and flexural strengths exists between: (1) The EQUIA system and (2) The Chemfil Rock (encapsulated glass-ionomers; test materials) and the Fuji 9 Gold Label and the Ketac Molar Easymix (hand-mixed conventional glass-ionomers; control materials); (3) The EQUIA system and Chemfil Rock. Specimens for testing flexural (n = 240) and diametral tensile (n=80) strengths were prepared according to standardized specifications; the compressive strength (n=80) was measured using a tooth-model of a class II ART restoration. ANOVA and Tukey B tests were used to test for significant differences between dependent and independent variables. The EQUIA system and Chemfil Rock had significantly higher mean scores for all the three strength variables than the Fuji 9 Gold Label and Ketac Molar Easymix (α=0.05). The EQUIA system had significant higher mean scores for diametral tensile and flexural strengths than the Chemfil Rock (α=0.05). The two encapsulated high-viscosity glass-ionomers had significantly higher test values for diametral tensile, flexural and compressive strengths than the commonly used hand-mixed high-viscosity glass-ionomers.
Assessment of the severity of injuries to hands by powered wood splitters.
Lindqvist, Aron; Berglund, Maria; von Kieseritzky, Johanna; Nilsson, Olle
2010-11-01
Our aim was to rate the severity of injuries to hands by powered wood splitters. The patients were identified from a computerised registry, and the cause of injury was confirmed by written questionnaire and structured telephone interview. Information about the anatomy of the injury was gathered from patients' records and radiographs. Severity of injury was rated according to the Hand Injury Severity Scoring System (HISS system) and the Injury Severity Score (ISS). The reliability of HISS rating was tested. The mean Hand Injury Severity Score (HISS) was 63 and the mean ISS was 3.7. Twenty-five (19%) of patients had minor, 41 (31%) had moderate, 30 (23%) had severe, and 35 (27 %) had major injuries when scored by the HISS system. Children's injuries were more severe than those of adults. There was no difference in severity between injuries made by wedge and screw splitters. It is not possible to avoid serious hand injuries from powered wood splitters completely by prohibiting one of the two main types of splitter.
ERIC Educational Resources Information Center
Behizadeh, Nadia; Engelhard, George, Jr.
2015-01-01
In his focus article, Koretz (this issue) argues that accountability has become the primary function of large-scale testing in the United States. He then points out that tests being used for accountability purposes are flawed and that the high-stakes nature of these tests creates a context that encourages score inflation. Koretz is concerned about…
ERIC Educational Resources Information Center
Enomoto, Ernestine K.; Conley, Sharon
2007-01-01
Schools employ educational technology to comply with pressures for greater accountability and efficiency in conducting operations. Specifically, schools use "management information systems" designed to automate data collection of student attendance, grades, test scores, and so on. These management information systems (MIS) employed…
An Expert System for On-Site Instructional Advice.
ERIC Educational Resources Information Center
Martindale, Elizabeth S.; Hofmeister, Alan M.
1988-01-01
Describes Written Language Consultant, an expert system designed to help teachers teach special education students how to write business letters. Three main components of the system are described, including entry of students' test scores; analysis of teachers' uses of classroom time and management techniques; and suggestions for improving test…
Junghaenel, Doerte U; Schneider, Stefan; Stone, Arthur A; Christodoulou, Christopher; Broderick, Joan E
2014-04-01
This study examined the ecological validity and clinical utility of NIH Patient Reported-Outcomes Measurement Information System (PROMIS®) instruments for anger, depression, and fatigue in women with premenstrual symptoms. One-hundred women completed daily diaries and weekly PROMIS assessments over 4weeks. Weekly assessments were administered through Computerized Adaptive Testing (CAT). Weekly CATs and corresponding daily scores were compared to evaluate ecological validity. To test clinical utility, we examined if CATs could detect changes in symptom levels, if these changes mirrored those obtained from daily scores, and if CATs could identify clinically meaningful premenstrual symptom change. PROMIS CAT scores were higher in the pre-menstrual than the baseline (ps<.0001) and post-menstrual (ps<.0001) weeks. The correlations between CATs and aggregated daily scores ranged from .73 to .88 supporting ecological validity. Mean CAT scores showed systematic changes in accordance with the menstrual cycle and the magnitudes of the changes were similar to those obtained from the daily scores. Finally, Receiver Operating Characteristic (ROC) analyses demonstrated the ability of the CATs to discriminate between women with and without clinically meaningful premenstrual symptom change. PROMIS CAT instruments for anger, depression, and fatigue demonstrated validity and utility in premenstrual symptom assessment. The results provide encouraging initial evidence of the utility of PROMIS instruments for the measurement of affective premenstrual symptoms. Copyright © 2014 Elsevier Inc. All rights reserved.
Sefton, Gerri; Lane, Steven; Killen, Roger; Black, Stuart; Lyon, Max; Ampah, Pearl; Sproule, Cathryn; Loren-Gosling, Dominic; Richards, Caitlin; Spinty, Jean; Holloway, Colette; Davies, Coral; Wilson, April; Chean, Chung Shen; Carter, Bernie; Carrol, E.D.
2017-01-01
Pediatric Early Warning Scores are advocated to assist health professionals to identify early signs of serious illness or deterioration in hospitalized children. Scores are derived from the weighting applied to recorded vital signs and clinical observations reflecting deviation from a predetermined “norm.” Higher aggregate scores trigger an escalation in care aimed at preventing critical deterioration. Process errors made while recording these data, including plotting or calculation errors, have the potential to impede the reliability of the score. To test this hypothesis, we conducted a controlled study of documentation using five clinical vignettes. We measured the accuracy of vital sign recording, score calculation, and time taken to complete documentation using a handheld electronic physiological surveillance system, VitalPAC Pediatric, compared with traditional paper-based charts. We explored the user acceptability of both methods using a Web-based survey. Twenty-three staff participated in the controlled study. The electronic physiological surveillance system improved the accuracy of vital sign recording, 98.5% versus 85.6%, P < .02, Pediatric Early Warning Score calculation, 94.6% versus 55.7%, P < .02, and saved time, 68 versus 98 seconds, compared with paper-based documentation, P < .002. Twenty-nine staff completed the Web-based survey. They perceived that the electronic physiological surveillance system offered safety benefits by reducing human error while providing instant visibility of recorded data to the entire clinical team. PMID:27832032
Yamada, Keiko; Muranaga, Shingo; Shinozaki, Tomohiro; Nakamura, Kozo; Tanaka, Sakae; Ogata, Toru
2018-01-26
Mobility decrease is reportedly age-dependent in community dwelling elderly, and a major factor of disability in the geriatric population. The purpose of this study is to examine whether mobility decrease, as assessed using a set of tests, is similarly age-dependent in elderly adults who already have disability. One hundred thirty-five community-dwelling elderly (54 men, 81 women) with disability and 1469 independent community dwellers (1009 men, 460 women) were analyzed. Disability was defined having a certified need for care under the long-term care insurance system in Japan. Lower extremity mobility decrease was quantified using the Locomotive Syndrome Risk Test, which comprises the two-step test, stand-up test, and 25-Question Geriatric Locomotive Function Scale (GLFS-25). Multivariable regression analyses indicated no age-related decrease in the three test scores among elderly with disability, whereas these scores all decreased with age among independent community dwellers. All the test scores decreased as care level increased. Mobility decrease among elderly adults with disability is unrelated to age. However, the severity of care level is associated with mobility decrease.
Audit and internal quality control in immunohistochemistry
Maxwell, P; McCluggage, W
2000-01-01
Aims—Although positive and negative controls are performed and checked in surgical pathology cases undergoing immunohistochemistry, internal quality control procedures for immunohistochemistry are not well described. This study, comprising a retrospective audit, aims to describe a method of internal quality control for immunohistochemistry. A scoring system that allows comparison between cases is described. Methods—Two positive tissue controls for each month over a three year period (1996–1998) of the 10 antibodies used most frequently were evaluated. All test cases undergoing immunohistochemistry in the months of April in this three year period were also studied. When the test case was completely negative for a given antibody, the corresponding positive tissue control from that day was examined. A marking system was devised whereby each immunohistochemical slide was assessed out of a possible score of 8 to take account of staining intensity, uniformity, specificity, background, and counterstaining. Using this scoring system, cases were classified as showing optimal (7–8), borderline (5–6), or unacceptable (0–4) staining. Results—Most positive tissue controls showed either optimal or borderline staining with the exception of neurone specific enolase (NSE), where most slides were unacceptable or borderline as a result of a combination of low intensity, poor specificity, and excessive background staining. All test cases showed either optimal or borderline staining with the exception of a single case stained for NSE, which was unacceptable. Conclusions—This retrospective audit shows that immunohistochemically stained slides can be assessed using this scoring system. With most antibodies, acceptable staining was achieved in most cases. However, there were problems with staining for NSE, which needs to be reviewed. Laboratories should use a system such as this to evaluate which antibodies regularly result in poor staining so that they can be excluded from panels. Routine evaluation of immunohistochemical staining should become part of everyday internal quality control procedures. Key Words: immunohistochemistry • audit • internal quality control PMID:11265178
Predictors of operating room extubation in adult cardiac surgery.
Subramaniam, Kathirvel; DeAndrade, Diana S; Mandell, Daniel R; Althouse, Andrew D; Manmohan, Rajan; Esper, Stephen A; Varga, Jeffrey M; Badhwar, Vinay
2017-11-01
The primary objective of the study was to identify perioperative factors associated with successful immediate extubation in the operating room after adult cardiac surgery. The secondary objective was to derive a simplified predictive scoring system to guide clinicians in operating room extubation. All 1518 patients in this retrospective cohort study underwent standardized fast-track cardiac anesthetic protocol during adult cardiac surgery. Perioperative variables between patients who had successful extubation in the operating room versus in the intensive care unit were retrospectively analyzed using both univariate and multivariable logistic regression analyses. A predictive score of successful operating room extubation was constructed from the multivariable results of 800 patients (derivation set), and the scoring system was further tested using a validation set of 398 patients. Younger age, lower body mass index, higher preoperative serum albumin, absence of chronic lung disease and diabetes, less-invasive surgical approach, isolated coronary bypass surgery, elective surgery, and lower doses of intraoperative intravenous fentanyl were independently associated with higher probability of operating room extubation. The extubation prediction score created in a derivation set of patients performed well in the validation set. Patient scores less than 0 had a minimal probability of successful operating room extubation. Operating room extubation was highly predicted with scores of 5 or greater. Perioperative factors that are independently associated with successful operating room extubation after adult cardiac operations were identified, and an operating room extubation prediction scoring system was validated. This scoring system may be used to guide safe operating room extubation after cardiac operations. Copyright © 2017 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.
The effect of obesity on the rate of heparin-induced thrombocytopenia.
Marler, Jacob L; Jones, G Morgan; Wheeler, Brian J; Alshaya, Abdulrahman; Hartmann, Jonathan L; Oliphant, Carrie S
2018-06-01
: Heparin-induced thrombocytopenia (HIT) occurs in patients receiving heparin-containing products due to the formation of platelet-activating antibodies to heparin and platelet factor 4. Diagnosis includes utilization of a scoring system known as the 4-T score, and HIT laboratory assays. Recently, obesity was identified as a potential factor associated with the development of HIT. The objective of this study was to evaluate the association of HIT with obesity in ICU and general medicine patients. We performed a chart review of adult patients within the Methodist Healthcare System, and included patients who had an ELISA and serotonin release assay laboratory tests reported within same hospital admission in which they also had documented receipt of heparin. Obese patients were compared with nonobese patients (BMI < 30) for the primary outcome of HIT occurrence, and secondary outcomes including rate of thrombosis, 4-T scores, and ELISA optical density values. We also generated a 5-T score by including one additional point for those with a BMI of 30 or more to determine the predictive value of this score in identifying HIT. Obesity was confirmed to be a risk factor for HIT, and the 5-T score model was also predictive of the development of HIT. However, the 5-T score was not statistically more predictive of HIT than the 4-T score. Predicting HIT remains challenging and novel markers of HIT are needed to improve HIT recognition. Although obesity did not improve the 4-T score, it may improve the predictability of other scoring systems, and further investigation is warranted.
Wanat, Matthew; Fitousis, Kalliopi; Hall, Jeff; Rice, Lawrence
2013-06-01
The diagnosis of heparin-induced thrombocytopenia (HIT) may be challenging in critically ill patients, as heparin exposures are ubiquitous, and thrombocytopenia is common. Unwarranted ordering and incorrect interpretation of heparin antibody tests can expose a patient to adverse drug events and imposes a significant economic burden on our health care system. A prospective, observational study was performed over 4 months on all adult patients located in 5 intensive care units, with a heparin antibody test ordered. A platelet factor 4/heparin enzyme-linked immunosorbent assay (ELISA) test was ordered in 131 patients. In total, 110 patients had a low 4Ts score (0-3), and of these 103 had a negative ELISA result. In patients with a low 4Ts score, 0 (0%) of 110 had an optical density value >1.0. One hundred twenty-nine patients (98%) had another possible cause of thrombocytopenia identified. In critically ill patients, low 4Ts scores indicate a low probability of HIT, and heparin antibody testing in these patients is not useful.
Everard, Eoin M; Harrison, Andrew J; Lyons, Mark
2017-05-01
Everard, EM, Harrison, AJ, and Lyons, M. Examining the relationship between the functional movement screen and the landing error scoring system in an active, male collegiate population. J Strength Cond Res 31(5): 1265-1272, 2017-In recent years, there has been an increasing focus on movement screening as the principal aspect of preparticipation testing. Two of the most common movement screening tools are the Functional Movement Screen (FMS) and the Landing Error Scoring System (LESS). Several studies have examined the reliability and validity of these tools, but so far, there have been no studies comparing the results of these 2 screening tools against each other. Therefore, the purpose of this study was to determine the relationship between FMS scores and LESS scores. Ninety-eight male college athletes actively competing in sport (Gaelic games, soccer, athletics, boxing/mixed martial arts, Olympic weightlifting) participated in the study and performed the FMS and LESS screens. Both the 21-point and 100-point scoring systems were used to score the FMS. Spearman's correlation coefficients were used to determine the relationship between the 2 screening scores. The results showed a significant moderate correlation between FMS and LESS scores (rho 100 and 21 point = -0.528; -0.487; p < 0.001). In addition, r values of 0.26 and 0.23 indicate a poor shared variance between the 2 screens. The results indicate that performing well in one of the screens does not necessarily equate to performing well in the other. This has practical implications as it highlights that both screens may assess different movement patterns and should not be used as a substitute for each other.
Østergaard, Mikkel; Eshed, Iris; Althoff, Christian E; Poggenborg, Rene P; Diekhoff, Torsten; Krabbe, Simon; Weckbach, Sabine; Lambert, Robert G W; Pedersen, Susanne J; Maksymowych, Walter P; Peterfy, Charles G; Freeston, Jane; Bird, Paul; Conaghan, Philip G; Hermann, Kay-Geert A
2017-11-01
Whole-body magnetic resonance imaging (WB-MRI) is a relatively new technique that can enable assessment of the overall inflammatory status of people with arthritis, but standards for image acquisition, definitions of key pathologies, and a quantification system are required. Our aim was to perform a systematic literature review (SLR) and to develop consensus definitions of key pathologies, anatomical locations for assessment, a set of MRI sequences and imaging planes for the different body regions, and a preliminary scoring system for WB-MRI in inflammatory arthritis. An SLR was initially performed, searching for WB-MRI studies in arthritis, osteoarthritis, spondyloarthritis, or enthesitis. These results were presented to a meeting of the MRI in Arthritis Working Group together with an MR image review. Following this, preliminary standards for WB-MRI in inflammatory arthritides were developed with further iteration at the Working Group meetings at the Outcome Measures in Rheumatology (OMERACT) 2016. The SLR identified 10 relevant original articles (7 cross-sectional and 3 longitudinal, mostly focusing on synovitis and/or enthesitis in spondyloarthritis, 4 with reproducibility data). The Working Group decided on inflammation in peripheral joints and entheses as primary focus areas, and then developed consensus MRI definitions for these pathologies, selected anatomical locations for assessment, agreed on a core set of MRI sequences and imaging planes for the different regions, and proposed a preliminary scoring system. It was decided to test and further develop the system by iterative multireader exercises. These first steps in developing an OMERACT WB-MRI scoring system for use in inflammatory arthritides offer a framework for further testing and refinement.
Exploring a Source of Uneven Score Equity across the Test Score Range
ERIC Educational Resources Information Center
Huggins-Manley, Anne Corinne; Qiu, Yuxi; Penfield, Randall D.
2018-01-01
Score equity assessment (SEA) refers to an examination of population invariance of equating across two or more subpopulations of test examinees. Previous SEA studies have shown that score equity may be present for examinees scoring at particular test score ranges but absent for examinees scoring at other score ranges. No studies to date have…
Wasano, K; Ishikawa, T; Kawasaki, T; Yamamoto, S; Tomisato, S; Shinden, S; Minami, S; Wakabayashi, T; Ogawa, K
2017-12-01
We describe a novel scoring system, the facial Palsy Prognosis Prediction score (PPP score), which we test for reliability in predicting pre-therapeutic prognosis of facial palsy. We aimed to use readily available patient data that all clinicians have access to before starting treatment. Multicenter case series with chart review. Three tertiary care hospitals. We obtained haematological and demographic data from 468 facial palsy patients who were treated between 2010 and 2014 in three tertiary care hospitals. Patients were categorised as having Bell's palsy or Ramsey Hunt's palsy. We compared the data of recovered and unrecovered patients. PPP scores consisted of combinatorial threshold values of continuous patient data (eg platelet count) and categorical variables (eg gender) that best predicted recovery. We created separate PPP scores for Bell's palsy patients (PPP-B) and for Ramsey Hunt's palsy patients (PPP-H). The PPP-B score included age (≥65 years), gender (male) and neutrophil-to-lymphocyte ratio (≥2.9). The PPP-H score included age (≥50 years), monocyte rate (≥6.0%), mean corpuscular volume (≥95 fl) and platelet count (≤200 000 /μL). Patient recovery rate significantly decreased with increasing PPP scores (both PPP-B and PPP-H) in a step-wise manner. PPP scores (ie PPP-B score and PPP-H score) ≥2 were associated with worse than average prognosis. Palsy Prognosis Prediction scores are useful for predicting prognosis of facial palsy before beginning treatment. © 2017 John Wiley & Sons Ltd.
Predictive validity of pre-admission assessments on medical student performance.
Dabaliz, Al-Awwab; Kaadan, Samy; Dabbagh, M Marwan; Barakat, Abdulaziz; Shareef, Mohammad Abrar; Al-Tannir, Mohamad; Obeidat, Akef; Mohamed, Ayman
2017-11-24
To examine the predictive validity of pre-admission variables on students' performance in a medical school in Saudi Arabia. In this retrospective study, we collected admission and college performance data for 737 students in preclinical and clinical years. Data included high school scores and other standardized test scores, such as those of the National Achievement Test and the General Aptitude Test. Additionally, we included the scores of the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) exams. Those datasets were then compared with college performance indicators, namely the cumulative Grade Point Average (cGPA) and progress test, using multivariate linear regression analysis. In preclinical years, both the National Achievement Test (p=0.04, B=0.08) and TOEFL (p=0.017, B=0.01) scores were positive predictors of cGPA, whereas the General Aptitude Test (p=0.048, B=-0.05) negatively predicted cGPA. Moreover, none of the pre-admission variables were predictive of progress test performance in the same group. On the other hand, none of the pre-admission variables were predictive of cGPA in clinical years. Overall, cGPA strongly predict-ed students' progress test performance (p<0.001 and B=19.02). Only the National Achievement Test and TOEFL significantly predicted performance in preclinical years. However, these variables do not predict progress test performance, meaning that they do not predict the functional knowledge reflected in the progress test. We report various strengths and deficiencies in the current medical college admission criteria, and call for employing more sensitive and valid ones that predict student performance and functional knowledge, especially in the clinical years.
Predictive validity of pre-admission assessments on medical student performance
Dabaliz, Al-Awwab; Kaadan, Samy; Dabbagh, M. Marwan; Barakat, Abdulaziz; Shareef, Mohammad Abrar; Al-Tannir, Mohamad; Obeidat, Akef
2017-01-01
Objectives To examine the predictive validity of pre-admission variables on students’ performance in a medical school in Saudi Arabia. Methods In this retrospective study, we collected admission and college performance data for 737 students in preclinical and clinical years. Data included high school scores and other standardized test scores, such as those of the National Achievement Test and the General Aptitude Test. Additionally, we included the scores of the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) exams. Those datasets were then compared with college performance indicators, namely the cumulative Grade Point Average (cGPA) and progress test, using multivariate linear regression analysis. Results In preclinical years, both the National Achievement Test (p=0.04, B=0.08) and TOEFL (p=0.017, B=0.01) scores were positive predictors of cGPA, whereas the General Aptitude Test (p=0.048, B=-0.05) negatively predicted cGPA. Moreover, none of the pre-admission variables were predictive of progress test performance in the same group. On the other hand, none of the pre-admission variables were predictive of cGPA in clinical years. Overall, cGPA strongly predict-ed students’ progress test performance (p<0.001 and B=19.02). Conclusions Only the National Achievement Test and TOEFL significantly predicted performance in preclinical years. However, these variables do not predict progress test performance, meaning that they do not predict the functional knowledge reflected in the progress test. We report various strengths and deficiencies in the current medical college admission criteria, and call for employing more sensitive and valid ones that predict student performance and functional knowledge, especially in the clinical years. PMID:29176032
Love, William J; Lehenbauer, Terry W; Kass, Philip H; Van Eenennaam, Alison L; Aly, Sharif S
2014-01-01
Several clinical scoring systems for diagnosis of bovine respiratory disease (BRD) in calves have been proposed. However, such systems were based on subjective judgment, rather than statistical methods, to weight scores. Data from a pair-matched case-control study on a California calf raising facility was used to develop three novel scoring systems to diagnose BRD in preweaned dairy calves. Disease status was assigned using both clinical signs and diagnostic test results for BRD-associated pathogens. Regression coefficients were used to weight score values. The systems presented use nasal and ocular discharge, rectal temperature, ear and head carriage, coughing, and respiratory quality as predictors. The systems developed in this research utilize fewer severity categories of clinical signs, require less calf handling, and had excellent agreement (Kappa > 0.8) when compared to an earlier scoring system. The first scoring system dichotomized all clinical predictors but required inducing a cough. The second scoring system removed induced cough as a clinical abnormality but required distinguishing between three levels of nasal discharge severity. The third system removed induced cough and forced a dichotomized variable for nasal discharge. The first system presented in this study used the following predictors and assigned values: coughing (induced or spontaneous coughing, 2 points), nasal discharge (any discharge, 3 points), ocular discharge (any discharge, 2 points), ear and head carriage (ear droop or head tilt, 5 points), fever (≥39.2°C or 102.5°F, 2 points), and respiratory quality (abnormal respiration, 2 points). Calves were categorized "BRD positive" if their total score was ≥4. This system correctly classified 95.4% cases and 88.6% controls. The second presented system categorized the predictors and assigned weights as follows: coughing (spontaneous only, 2 points), mild nasal discharge (unilateral, serous, or watery discharge, 3 points), moderate to severe nasal discharge (bilateral, cloudy, mucoid, mucopurlent, or copious discharge, 5 points), ocular discharge (any discharge, 1 point), ear and head carriage (ear droop or head tilt, 5 points), fever (≥39.2°C, 2 points), and respiratory quality (abnormal respiration, 2 points). Calves were categorized "BRD positive" if their total score was ≥4. This system correctly classified 89.3% cases and 92.8% controls. The third presented system used the following predictors and scores: coughing (spontaneous only, 2 points), nasal discharge (any, 4 points), ocular discharge (any, 2 points), ear and head carriage (ear droop or head tilt, 5 points), fever (≥39.2°C, 2 points), and respiratory quality (abnormal respiration, 2 points). Calves were categorized "BRD positive" if their total score was ≥5. This system correctly classified 89.4% cases and 90.8% controls. Each of the proposed systems offer few levels of clinical signs and data-based weights for on-farm diagnosis of BRD in dairy calves.
Love, William J.; Lehenbauer, Terry W.; Kass, Philip H.; Van Eenennaam, Alison L.
2014-01-01
Several clinical scoring systems for diagnosis of bovine respiratory disease (BRD) in calves have been proposed. However, such systems were based on subjective judgment, rather than statistical methods, to weight scores. Data from a pair-matched case-control study on a California calf raising facility was used to develop three novel scoring systems to diagnose BRD in preweaned dairy calves. Disease status was assigned using both clinical signs and diagnostic test results for BRD-associated pathogens. Regression coefficients were used to weight score values. The systems presented use nasal and ocular discharge, rectal temperature, ear and head carriage, coughing, and respiratory quality as predictors. The systems developed in this research utilize fewer severity categories of clinical signs, require less calf handling, and had excellent agreement (Kappa > 0.8) when compared to an earlier scoring system. The first scoring system dichotomized all clinical predictors but required inducing a cough. The second scoring system removed induced cough as a clinical abnormality but required distinguishing between three levels of nasal discharge severity. The third system removed induced cough and forced a dichotomized variable for nasal discharge. The first system presented in this study used the following predictors and assigned values: coughing (induced or spontaneous coughing, 2 points), nasal discharge (any discharge, 3 points), ocular discharge (any discharge, 2 points), ear and head carriage (ear droop or head tilt, 5 points), fever (≥39.2°C or 102.5°F, 2 points), and respiratory quality (abnormal respiration, 2 points). Calves were categorized “BRD positive” if their total score was ≥4. This system correctly classified 95.4% cases and 88.6% controls. The second presented system categorized the predictors and assigned weights as follows: coughing (spontaneous only, 2 points), mild nasal discharge (unilateral, serous, or watery discharge, 3 points), moderate to severe nasal discharge (bilateral, cloudy, mucoid, mucopurlent, or copious discharge, 5 points), ocular discharge (any discharge, 1 point), ear and head carriage (ear droop or head tilt, 5 points), fever (≥39.2°C, 2 points), and respiratory quality (abnormal respiration, 2 points). Calves were categorized “BRD positive” if their total score was ≥4. This system correctly classified 89.3% cases and 92.8% controls. The third presented system used the following predictors and scores: coughing (spontaneous only, 2 points), nasal discharge (any, 4 points), ocular discharge (any, 2 points), ear and head carriage (ear droop or head tilt, 5 points), fever (≥39.2°C, 2 points), and respiratory quality (abnormal respiration, 2 points). Calves were categorized “BRD positive” if their total score was ≥5. This system correctly classified 89.4% cases and 90.8% controls. Each of the proposed systems offer few levels of clinical signs and data-based weights for on-farm diagnosis of BRD in dairy calves. PMID:24482759
Signal amplification of FISH for automated detection using image cytometry.
Truong, K; Boenders, J; Maciorowski, Z; Vielh, P; Dutrillaux, B; Malfoy, B; Bourgeois, C A
1997-05-01
The purpose of this study was to improve the detection of FISH signals, in order that spot counting by a fully automated image cytometer be comparable to that obtained visually under the microscope. Two systems of spot scoring, visual and automated counting, were investigated in parallel on stimulated human lymphocytes with FISH using a biotinylated centromeric probe for chromosome 3. Signal characteristics were first analyzed on images recorded with a coupled charge device (CCD) camera. Number of spots per nucleus were scored visually on these recorded images versus automatically with a DISCOVERY image analyzer. Several fluochromes, amplification and pretreatments were tested. Our results for both visual and automated scoring show that the tyramide amplification system (TSA) gives the best amplification of signal if pepsin treatment is applied prior to FISH. Accuracy of the automated scoring, however, remained low (58% of nuclei containing two spots) compared to the visual scoring because of the high intranuclear variation between FISH spots.
Rhee, Chin Kook; Kim, Jin Woo; Hwang, Yong Il; Lee, Jin Hwa; Jung, Ki-Suck; Lee, Myung Goo; Yoo, Kwang Ha; Lee, Sang Haak; Shin, Kyeong-Cheol; Yoon, Hyoung Kyu
2015-01-01
Background and objective According to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines, either a modified Medical Research Council (mMRC) dyspnea score of ≥2 or a chronic obstructive pulmonary disease (COPD) assessment test (CAT) score of ≥10 is considered to represent COPD patients who are more symptomatic. We aimed to identify the ideal CAT score that exhibits minimal discrepancy with the mMRC score. Methods A receiver operating characteristic curve of the CAT score was generated for an mMRC scores of 1 and 2. A concordance analysis was applied to quantify the association between the frequencies of patients categorized into GOLD groups A–D using symptom cutoff points. A κ-coefficient was calculated. Results For an mMRC score of 2, a CAT score of 15 showed the maximum value of Youden’s index with a sensitivity and specificity of 0.70 and 0.66, respectively (area under the receiver operating characteristic curve [AUC] 0.74; 95% confidence interval [CI], 0.70–0.77). For an mMRC score of 1, a CAT score of 10 showed the maximum value of Youden’s index with a sensitivity and specificity of 0.77 and 0.65, respectively (AUC 0.77; 95% CI, 0.72–0.83). The κ value for concordance was highest between an mMRC score of 1 and a CAT score of 10 (0.66), followed by an mMRC score of 2 and a CAT score of 15 (0.56), an mMRC score of 2 and a CAT score of 10 (0.47), and an mMRC score of 1 and a CAT score of 15 (0.43). Conclusion A CAT score of 10 was most concordant with an mMRC score of 1 when classifying patients with COPD into GOLD groups A–D. However, a discrepancy remains between the CAT and mMRC scoring systems. PMID:26316736
Bhargava, Rahul; Kumar, Prachi; Kaur, Avinash; Kumar, Manjushri; Mishra, Anurag
2014-07-01
To compare the diagnostic value and accuracy of dry eye scoring system (DESS), conjunctival impression cytology (CIC), tear film breakup time (TBUT), and Schirmer's test in computer users. A case-control study was done at two referral eye centers. Eyes of 344 computer users were compared to 371 eyes of age and sex matched controls. Dry eye questionnaire (DESS) was administered to both groups and they further underwent measurement of TBUT, Schirmer's, and CIC. Correlation analysis was performed between DESS, CIC, TBUT, and Schirmer's test scores. A Pearson's coefficient of the linear expression (R (2)) of 0.5 or more was statistically significant. The mean age in cases (26.05 ± 4.06 years) was comparable to controls (25.67 ± 3.65 years) (P = 0.465). The mean symptom score in computer users was significantly higher as compared to controls (P < 0.001). Mean TBUT, Schirmer's test values, and goblet cell density were significantly reduced in computer users (P < 0.001). TBUT, Schirmer's, and CIC were abnormal in 48.5%, 29.1%, and 38.4% symptomatic computer users respectively as compared to 8%, 6.7%, and 7.3% symptomatic controls respectively. On correlation analysis, there was a significant (inverse) association of dry eye symptoms (DESS) with TBUT and CIC scores (R (2) > 0.5), in contrast to Schirmer's scores (R(2) < 0.5). Duration of computer usage had a significant effect on dry eye symptoms severity, TBUT, and CIC scores as compared to Schirmer's test. DESS should be used in combination with TBUT and CIC for dry eye evaluation in computer users.
Kaido, Minako; Ishida, Reiko; Dogru, Murat; Tsubota, Kazuo
2011-09-01
To investigate the relation of functional visual acuity (FVA) measurements with dry eye test parameters and to compare the testing methods with and without blink suppression and anesthetic instillation. A prospective comparative case series. Thirty right eyes of 30 dry eye patients and 25 right eyes of 25 normal subjects seen at Keio University School of Medicine, Department of Ophthalmology were studied. FVA testing was performed using a FVA measurement system with two different approaches, one in which measurements were made under natural blinking conditions without topical anesthesia (FVA-N) and the other in which the measurements were made under the blink suppression condition with topical anesthetic eye drops (FVA-BS). Tear function examinations, such as the Schirmer test, tear film break-up time, and fluorescein and Rose Bengal vital staining as ocular surface evaluation, were performed. The mean logMAR FVA-N scores and logMAR Landolt visual acuity scores were significantly lower in the dry eye subjects than in the healthy controls (p < 0.05), while there were no statistical differences between the logMAR FVA-BS scores of the dry eye subjects and those of the healthy controls. There was a significant correlation between the logMAR Landolt visual acuities and the logMAR FVA-N and logMAR FVA-BS scores. The FVA-N scores correlated significantly with tear quantities, tear stability and, especially, the ocular surface vital staining scores. FVA measurements performed under natural blinking significantly reflected the tear functions and ocular surface status of the eye and would appear to be a reliable method of FVA testing. FVA measurement is also an accurate predictor of dry eye status.
Automated Assessment of Postural Stability (AAPS)
2016-10-01
with human volunteers and used our preliminary data to quantify system calibration and limitations of performance. We have also compared our system’s...scoring. Furthermore, we have begun the process of testing with human volunteers and used our preliminary data to quantify system calibration and
Privacy Impact Assessment for the Lead-based Paint System of Records
The Lead-based Paint System of Records collects personally identifiable information, test scores, and submitted fees. Learn how this data is collected, how it will be used, access to the data, the purpose of data collection, and record retention policies.
Vadalà, A; De Carli, A; Vulpiani, M C; Iorio, R; Vetrano, M; Scapellato, S; Suarez, T; Di Salvo, F; Ferretti, A
2012-12-01
The aim of this paper was to report clinical, functional and radiological results of 80 patients surgically treated with a combined mini-open and percutaneous surgical repair as proposed by Kakiuchi. All patients were evaluated with a physical examination, evaluation scales, a functional test (Ergo-jump Bosco System), and an ultrasonographic exam along with Power Doppler Ultrasonography (PDU) (S/S). At a mean follow-up of 58 months no cases of rerupture were detected. VISA-A evaluation scale showed an excellent score in 63 patients (78.75%), a good score in 14 patients (17.5%), a fair score in two patients (2.5%), and a poor score in one patient (1.25%). Hannover scale showed an excellent score in 63 patients (78.75%), and a good score in 17 patients (21.25%). Ergo-Jump evaluation showed a 2.07% mean deficit of the affected limb at the Squatting Jump test, a 3.26% mean deficit at the Counter Movement Jump test, and a 0.0062% mean improvement at the Repetitive Jump test. Ultrasonographic exam showed in all cases a satisfactory recovery of the integrity of the operated tendon. The mean AP and LL widths showed a significant increase of 7.13±2.97 mm (+56.1%) and of 4.01±2.36 mm (+43.81%) respectively. According to the modified Öhberg score scale, PDU exam showed a grade +1 in 16 patients (20%) and a grade +2 in seven cases (8.7%). The absence of rerupture cases, the satisfactory functional and ultrasonographic results of the patients included in this study cause us to consider this technique as reliable and effective even in young high-demand patients.
Ries, Julie D; Echternach, John L; Nof, Leah; Gagnon Blodgett, Michelle
2009-06-01
With the increasing incidence of Alzheimer disease (AD), determining the validity and reliability of outcome measures for people with this disease is necessary. The goals of this study were to assess test-retest reliability of data for the Timed "Up & Go" Test (TUG), the Six-Minute Walk Test (6MWT), and gait speed and to calculate minimal detectable change (MDC) scores for each outcome measure. Performance differences between groups with mild to moderate AD and moderately severe to severe AD (as determined by the Functional Assessment Staging [FAST] scale) were studied. This was a prospective, nonexperimental, descriptive methodological study. Background data collected for 51 people with AD included: use of an assistive device, Mini-Mental Status Examination scores, and FAST scale scores. Each participant engaged in 2 test sessions, separated by a 30- to 60-minute rest period, which included 2 TUG trials, 1 6MWT trial, and 2 gait speed trials using a computerized gait assessment system. A specific cuing protocol was followed to achieve optimal performance during test sessions. Test-retest reliability values for the TUG, the 6MWT, and gait speed were high for all participants together and for the mild to moderate AD and moderately severe to severe AD groups separately (intraclass correlation coefficients > or = .973); however, individual variability of performance also was high. Calculated MDC scores at the 90% confidence interval were: TUG=4.09 seconds, 6MWT=33.5 m (110 ft), and gait speed=9.4 cm/s. The 2 groups were significantly different in performance of clinical tests, with the participants who were more cognitively impaired being more physically and functionally impaired. A single researcher for data collection limited sample numbers and prohibited blinding to dementia level. The TUG, the 6MWT, and gait speed are reliable outcome measures for use with people with AD, recognizing that individual variability of performance is high. Minimal detectable change scores at the 90% confidence interval can be used to assess change in performance over time and the impact of treatment.
Reliability of sonographic assessment of tendinopathy in tennis elbow.
Poltawski, Leon; Ali, Syed; Jayaram, Vijay; Watson, Tim
2012-01-01
To assess the reliability and compute the minimum detectable change using sonographic scales to quantify the extent of pathology and hyperaemia in the common extensor tendon in people with tennis elbow. The lateral elbows of 19 people with tennis elbow were assessed sonographically twice, 1-2 weeks apart. Greyscale and power Doppler images were recorded for subsequent rating of abnormalities. Tendon thickening, hypoechogenicity, fibrillar disruption and calcification were each rated on four-point scales, and scores were summed to provide an overall rating of structural abnormality; hyperaemia was scored on a five point scale. Inter-rater reliability was established using the intraclass correlation coefficient (ICC) to compare scores assigned independently to the same set of images by a radiologist and a physiotherapist with training in musculoskeletal imaging. Test-retest reliability was assessed by comparing scores assigned by the physiotherapist to images recorded at the two sessions. The minimum detectable change (MDC) was calculated from the test-retest reliability data. ICC values for inter-rater reliability ranged from 0.35 (95% CI: 0.05, 0.60) for fibrillar disruption to 0.77 (0.55, 0.88) for overall greyscale score, and 0.89 (0.79, 0.95) for hyperaemia. Test-retest reliability ranged from 0.70 (0.48, 0.84) for tendon thickening to 0.82 (0.66, 0.90) for overall greyscale score and 0.86 (0.73, 0.93) for calcification. The MDC for the greyscale total score was 2.0/12 and for the hyperaemia score was 1.1/5. The sonographic scoring system used in this study may be used reliably to quantify tendon abnormalities and change over time. A relatively inexperienced imager can conduct the assessment and use the rating scales reliably.
The Scoring of Matching Questions Tests: A Closer Look
ERIC Educational Resources Information Center
Jancarík, Antonín; Kostelecká, Yvona
2015-01-01
Electronic testing has become a regular part of online courses. Most learning management systems offer a wide range of tools that can be used in electronic tests. With respect to time demands, the most efficient tools are those that allow automatic assessment. The presented paper focuses on one of these tools: matching questions in which one…
Neurodevelopmental Assessment of the Young Child: The State of the Art
ERIC Educational Resources Information Center
Allen, Marilee C.
2005-01-01
A wide variety of tests are available to assess the central nervous system (CNS) function of the toddler and preschool-aged child. These tests vary as to function; qualities and abilities tapped; facility with which they can be learned, administered, and scored; availability of test materials and manuals or training videos; and strength of…
Validation of the Seating and Mobility Script Concordance Test
ERIC Educational Resources Information Center
Cohen, Laura J.; Fitzgerald, Shirley G.; Lane, Suzanne; Boninger, Michael L.; Minkel, Jean; McCue, Michael
2009-01-01
The purpose of this study was to develop the scoring system for the Seating and Mobility Script Concordance Test (SMSCT), obtain and appraise internal and external structure evidence, and assess the validity of the SMSCT. The SMSCT purpose is to provide a method for testing knowledge of seating and mobility prescription. A sample of 106 therapists…
Sauerbruch, T; Ansari, H; Wotzka, R; Soehendra, N; Köpcke, W
1988-01-08
Prospective prognosis systems for predicting half-year death-rate after bleeding from oesophageal varices and sclerotherapy were tested on 129 patients. The receiver-operating-characteristic curves of three discriminant scores were compared with the Child-Pugh classification. It was found that the latter is still the best for prognosticating the course of the disease. A simplified discriminant score which contains as its only factors bilirubin and the Quick value does, however, give nearly as good information.
Validity of a novel computerized screening test system for mild cognitive impairment.
Park, Jin-Hyuck; Jung, Minye; Kim, Jongbae; Park, Hae Yean; Kim, Jung-Ran; Park, Ji-Hyuk
2018-06-20
ABSTRACTBackground:The mobile screening test system for screening mild cognitive impairment (mSTS-MCI) was developed for clinical use. However, the clinical usefulness of mSTS-MCI to detect elderly with MCI from those who are cognitively healthy has yet to be validated. Moreover, the comparability between this system and traditional screening tests for MCI has not been evaluated. The purpose of this study was to examine the validity and reliability of the mSTS-MCI and confirm the cut-off scores to detect MCI. The data were collected from 107 healthy elderly people and 74 elderly people with MCI. Concurrent validity was examined using the Korean version of Montreal Cognitive Assessment (MoCA-K) as a gold standard test, and test-retest reliability was investigated using 30 of the study participants at four-week intervals. The sensitivity, specificity, positive predictive value, and negative predictive value (NPV) were confirmed through Receiver Operating Characteristic (ROC) analysis, and the cut-off scores for elderly people with MCI were identified. Concurrent validity showed statistically significant correlations between the mSTS-MCI and MoCA-K and test-rests reliability indicated high correlation. As a result of screening predictability, the mSTS-MCI had a higher NPV than the MoCA-K. The mSTS-MCI was identified as a system with a high degree of validity and reliability. In addition, the mSTS-MCI showed high screening predictability, indicating it can be used in the clinical field as a screening test system for mild cognitive impairment.
Ahmadian, Leila; Dorosti, Nafise; Khajouei, Reza; Gohari, Sadrieh Hajesmaeel
2017-06-01
Hospital Information Systems (HIS) are used for easy access to information, improvement of documentation and reducing errors. Nonetheless, using these systems is faced with some barriers and obstacles. This study identifies the challenges and the obstacles of using these systems in the academic and non-academic hospitals in Kerman. This is a cross-sectional study which was carried out in 2015. The statistical population in this study consisted of the nurses who had been working in the academic and non-academic hospitals in Kerman. A questionnaire consisting of two sections was used. The first section consisted of the demographic information of the participants and the second section comprised 34 questions about the challenges of HIS use. Data were analyzed by the descriptive and statistical analysis (t-test, and ANOVA) using SPSS 19 software. The most common and important challenges in the academic hospitals were about human environment factors, particularly "negative attitude of society toward using HIS". In the non-academic hospitals, the most common and important challenges were related to human factors, and among them, "no incentive to use system" was the main factor. The results of the t-test method revealed that there was a significant relationship between gender and the mean score of challenges related to the organizational environment category in the academic hospitals and between familiarity with HIS and mean score of human environment factors (p<0.05). The results of the ANOVA test also revealed that the educational degree and work experience in the healthcare environment (years) in the academic hospitals have a significant relationship with the mean score related to the hardware challenges, as well, experience with HIS has a significant relationship, with the mean score related to the human challenges (p<0.05). The most important challenges in using the information systems are the factors related to the human environment and the human factors. The results of this study can bring a good perspective to the policy makers and the managers regarding obstacles of using HISs from the nurses' perspective, so that they can solve their problems and can successfully implement these systems.
Humanizing Assessment Reports with a Computer.
ERIC Educational Resources Information Center
Mathews, Walter M.
Five computerized narrative assessment reports are discussed. These are: (1) the Teaching Information Processing System Student Report, used for a college economics course; (2) the Preliminary Scholastic Aptitude Test (PSAT) Score Report; (3) the Programmed Composition of Psychological Test Reports employed at the Mayo Clinic for reporting results…
Chacko, Shiny
2014-01-01
The conceptual framework of the study, undertaken in select health centres of New Delhi, was based on General System Model. The research approach was evaluative with one group pre-test and post-test design. The study population comprised of Community Health Workers working in selected centres in Najafgarh, Delhi. Purposive sampling technique was used to select a sample of 30 Community Health Workers. A structured knowledge questionnaire was developed to assess the knowledge of subjects. A Structured Teaching Programme was developed to enhance the knowledge of Community Health Workers. Pre-test was given on day 1 and Structured Teaching Programme administered on same day. Post-test was conducted on day 7. Most of the Community Health Workers were in the age group of 21-30 years with academic qualification up to Higher Secondary level. Maximum Community Health Workers had professional qualification as ANM/MPHW (female). Majority of the Community Health Workers had experience up to 5 years. Initially there was deficit in scores of knowledge of Community Health Workers regarding Visual Inspection with Acetic Acid (VIA) test. Mean post-test knowledge scores of Community Health Workers were found to be signifi- cantly higher than their mean pre-test knowledge score. The Community Health Workers after expo- sure to Structured Teaching Programme gained a significant positive relationship between post-test knowledge scores. The study reveals the efficacy of Structured Teaching Programme in enhancing the knowledge of Community Health Workers regarding VIA test and a need for conducting a regular and well planned health teaching programme on VIA test for improving their knowledge on VIA test for the early detection and diagnosis of cervical cancer.
Tomita, Hirofumi; Masugi, Yohei; Hoshino, Ken; Fuchimoto, Yasushi; Fujino, Akihiro; Shimojima, Naoki; Ebinuma, Hirotoshi; Saito, Hidetsugu; Sakamoto, Michiie; Kuroda, Tatsuo
2014-06-01
Although liver fibrosis is an important predictor of outcomes for biliary atresia (BA), postsurgical native liver histology has not been well reported. Here, we retrospectively evaluated postsurgical native liver histology, and developed and assessed a novel scoring system - the BA liver fibrosis (BALF) score for non-invasively predicting liver fibrosis grades. We identified 259 native liver specimens from 91 BA patients. Of these, 180 specimens, obtained from 62 patients aged ≥1 year at examination, were used to develop the BALF scoring system. The BALF score equation was determined according to the prediction of histological fibrosis grades by multivariate ordered logistic regression analysis. The diagnostic powers of the BALF score and several non-invasive markers were assessed by area under the receiver operating characteristic curve (AUROC) analyses. Natural logarithms of the serum total bilirubin, γ-glutamyltransferase, and albumin levels, and age were selected as significantly independent variables for the BALF score equation. The BALF score had a good diagnostic power (AUROCs=0.86-0.94, p<0.001) and good diagnostic accuracy (79.4-93.3%) for each fibrosis grade. The BALF score revealed a strong correlation with fibrosis grade (r=0.77, p<0.001), and was the preferable non-invasive marker for diagnosing fibrosis grades ⩾F2. In a serial liver histology subgroup analysis, 7/15 patients exhibited liver fibrosis improvement with BALF scores being equivalent to histological fibrosis grades of F0-1. In postsurgical BA patients aged ⩾1year, the BALF score is a potential non-invasive marker of native liver fibrosis. Copyright © 2014 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.
The variability of software scoring of the CDMAM phantom associated with a limited number of images
NASA Astrophysics Data System (ADS)
Yang, Chang-Ying J.; Van Metter, Richard
2007-03-01
Software scoring approaches provide an attractive alternative to human evaluation of CDMAM images from digital mammography systems, particularly for annual quality control testing as recommended by the European Protocol for the Quality Control of the Physical and Technical Aspects of Mammography Screening (EPQCM). Methods for correlating CDCOM-based results with human observer performance have been proposed. A common feature of all methods is the use of a small number (at most eight) of CDMAM images to evaluate the system. This study focuses on the potential variability in the estimated system performance that is associated with these methods. Sets of 36 CDMAM images were acquired under carefully controlled conditions from three different digital mammography systems. The threshold visibility thickness (TVT) for each disk diameter was determined using previously reported post-analysis methods from the CDCOM scorings for a randomly selected group of eight images for one measurement trial. This random selection process was repeated 3000 times to estimate the variability in the resulting TVT values for each disk diameter. The results from using different post-analysis methods, different random selection strategies and different digital systems were compared. Additional variability of the 0.1 mm disk diameter was explored by comparing the results from two different image data sets acquired under the same conditions from the same system. The magnitude and the type of error estimated for experimental data was explained through modeling. The modeled results also suggest a limitation in the current phantom design for the 0.1 mm diameter disks. Through modeling, it was also found that, because of the binomial statistic nature of the CDMAM test, the true variability of the test could be underestimated by the commonly used method of random re-sampling.
Instance-Based Question Answering
2006-12-01
answer clustering, composition, and scoring. Moreover, with the effort dedicated to improving monolingual system performance, system parameters are...text collections: document type, manual or automatic annotations (if any), and stylistic and notational differences in technical terms. Monolingual ...forum in which cross language retrieval systems and question answering systems are tested for various Eu- ropean languages. The CLEF QA monolingual task
Reddy, Yogesh N V; Carter, Rickey E; Obokata, Masaru; Redfield, Margaret M; Borlaug, Barry A
2018-05-23
Background -Diagnosis of heart failure with preserved ejection fraction (HFpEF) is challenging in euvolemic patients with dyspnea, and no evidence-based criteria are available. We sought to develop and then validate non-invasive diagnostic criteria that could be used to estimate the likelihood that HFpEF is present among patients with unexplained dyspnea in order to guide further testing. Methods -Consecutive patients with unexplained dyspnea referred for invasive hemodynamic exercise testing were retrospectively evaluated. Diagnosis of HFpEF (case) or non-cardiac dyspnea (control) was ascertained by invasive hemodynamic exercise testing. Logistic regression was performed to evaluate the ability of clinical findings to discriminate cases from controls. A scoring system was developed and then validated in a separate test cohort. Results -The derivation cohort included 414 consecutive patients (267 HFpEF and 147 controls, HFpEF prevalence 64%). The test cohort included 100 consecutive patients (61 HFpEF, prevalence 61%). Obesity, atrial fibrillation, age>60 years, treatment with 2 or more antihypertensives, echocardiographic E/e' ratio>9 and echocardiographic pulmonary artery systolic pressure>35 mmHg were selected as the final set of predictive variables. A weighted score based on these six variables was used to create a composite score (H 2 FPEF score) ranging from 0-9. The odds of HFpEF doubled for each 1 unit score increase [OR 1.98 [1.74-2.30], p<0.0001], with an AUC of 0.841 (p<0.0001). The H 2 FPEF score was superior to a currently-used algorithm based upon expert consensus (increase in AUC of +0.169 [+0.120 to +0.217], p<0.0001). Performance in the independent test cohort was maintained [AUC 0.886, p<0.0001]. Conclusions -The H 2 FPEF score, which relies upon simple clinical characteristics and echocardiography, enables discrimination of HFpEF from non-cardiac causes of dyspnea, and can assist in determination of the need for further diagnostic testing in the evaluation of patients with unexplained exertional dyspnea.
Miyashita, Theresa L; Diakogeorgiou, Eleni; Marrie, Kaitlyn
Investigation into the effect of cumulative subconcussive head impacts has yielded various results in the literature, with many supporting a link to neurological deficits. Little research has been conducted on men's lacrosse and associated balance deficits from head impacts. (1) Athletes will commit more errors on the postseason Balance Error Scoring System (BESS) test. (2) There will be a positive correlation to change in BESS scores and head impact exposure data. Prospective longitudinal study. Level 3. Thirty-four Division I men's lacrosse players (age, 19.59 ± 1.42 years) wore helmets instrumented with a sensor to collect head impact exposure data over the course of a competitive season. Players completed a BESS test at the start and end of the competitive season. The number of errors from pre- to postseason increased during the double-leg stance on foam ( P < 0.001), tandem stance on foam ( P = 0.009), total number of errors on a firm surface ( P = 0.042), and total number of errors on a foam surface ( P = 0.007). There were significant correlations only between the total errors on a foam surface and linear acceleration ( P = 0.038, r = 0.36), head injury criteria ( P = 0.024, r = 0.39), and Gadd Severity Index scores ( P = 0.031, r = 0.37). Changes in the total number of errors on a foam surface may be considered a sensitive measure to detect balance deficits associated with cumulative subconcussive head impacts sustained over the course of 1 lacrosse season, as measured by average linear acceleration, head injury criteria, and Gadd Severity Index scores. If there is microtrauma to the vestibular system due to repetitive subconcussive impacts, only an assessment that highly stresses the vestibular system may be able to detect these changes. Cumulative subconcussive impacts may result in neurocognitive dysfunction, including balance deficits, which are associated with an increased risk for injury. The development of a strategy to reduce total number of head impacts may curb the associated sequelae. Incorporation of a modified BESS test, firm surface only, may not be recommended as it may not detect changes due to repetitive impacts over the course of a competitive season.
Ray, Midge N; Houston, Thomas K; Yu, Feliciano B; Menachemi, Nir; Maisiak, Richard S; Allison, Jeroan J; Berner, Eta S
2006-01-01
The authors developed and evaluated a rating scale, the Attitudes toward Handheld Decision Support Software Scale (H-DSS), to assess physician attitudes about handheld decision support systems. The authors conducted a prospective assessment of psychometric characteristics of the H-DSS including reliability, validity, and responsiveness. Participants were 82 Internal Medicine residents. A higher score on each of the 14 five-point Likert scale items reflected a more positive attitude about handheld DSS. The H-DSS score is the mean across the fourteen items. Attitudes toward the use of the handheld DSS were assessed prior to and six months after receiving the handheld device. Cronbach's Alpha was used to assess internal consistency reliability. Pearson correlations were used to estimate and detect significant associations between scale scores and other measures (validity). Paired sample t-tests were used to test for changes in the mean attitude scale score (responsiveness) and for differences between groups. Internal consistency reliability for the scale was alpha = 0.73. In testing validity, moderate correlations were noted between the attitude scale scores and self-reported Personal Digital Assistant (PDA) usage in the hospital (correlation coefficient = 0.55) and clinic (0.48), p < 0.05 for both. The scale was responsive, in that it detected the expected increase in scores between the two administrations (3.99 (s.d. = 0.35) vs. 4.08, (s.d. = 0.34), p < 0.005). The authors' evaluation showed that the H-DSS scale was reliable, valid, and responsive. The scale can be used to guide future handheld DSS development and implementation.
An Approach to Scoring and Equating Tests with Binary Items: Piloting With Large-Scale Assessments
ERIC Educational Resources Information Center
Dimitrov, Dimiter M.
2016-01-01
This article describes an approach to test scoring, referred to as "delta scoring" (D-scoring), for tests with dichotomously scored items. The D-scoring uses information from item response theory (IRT) calibration to facilitate computations and interpretations in the context of large-scale assessments. The D-score is computed from the…
Evaluation of Consumer Understanding of Different Front-of-Package Nutrition Labels, 2010–2011
Bragg, Marie A.; Seamans, Marissa J.; Mechulan, Regine L.; Novak, Nicole; Brownell, Kelly D.
2012-01-01
Introduction Governments throughout the world are using or considering various front-of-package (FOP) food labeling systems to provide nutrition information to consumers. Our web-based study tested consumer understanding of different FOP labeling systems. Methods Adult participants (N = 480) were randomized to 1 of 5 groups to evaluate FOP labels: 1) no label; 2) multiple traffic light (MTL); 3) MTL plus daily caloric requirement icon (MTL+caloric intake); 4) traffic light with specific nutrients to limit based on food category (TL+SNL); or 5) the Choices logo. Total percentage correct quiz scores were created reflecting participants’ ability to select the healthier of 2 foods and estimate amounts of saturated fat, sugar, and sodium in foods. Participants also rated products on taste, healthfulness, and how likely they were to purchase the product. Quiz scores and product perceptions were compared with 1-way analysis of variance followed by post-hoc Tukey tests. Results The MTL+caloric intake group (mean [standard deviation], 73.3% [6.9%]) and Choices group (72.5% [13.2%]) significantly outperformed the no label group (67.8% [10.3%]) and the TL+SNL group (65.8% [7.3%]) in selecting the more healthful product on the healthier product quiz. The MTL and MTL+caloric intake groups achieved average scores of more than 90% on the saturated fat, sugar, and sodium quizzes, which were significantly better than the no label and Choices group average scores, which were between 34% and 47%. Conclusion An MTL+caloric intake label and the Choices symbol hold promise as FOP labeling systems and require further testing in different environments and population subgroups. PMID:22995103
Performance characteristics of NuVal and the Overall Nutritional Quality Index (ONQI).
Katz, David L; Njike, Valentine Y; Rhee, Lauren Q; Reingold, Arthur; Ayoob, Keith T
2010-04-01
Improving diets has considerable potential to improve health, but progress in this area has been limited, and advice to increase fruit and vegetable intake has largely gone unheeded. Our objective was to test the performance characteristics of the Overall Nutritional Quality Index (ONQI), a tool designed to help improve dietary patterns one well-informed choice at a time. The ONQI was developed by a multidisciplinary group of nutrition and public health scientists independent of food industry interests and is the basis for the NuVal Nutritional Guidance System. Dietary guidelines, existing nutritional scoring systems, and other pertinent scientific literature were reviewed. An algorithm incorporating >30 entries that represent both micro- and macronutrient properties of foods, as well as weighting coefficients representing epidemiologic associations between nutrients and health outcomes, was developed and subjected to consumer research and testing of performance characteristics. ONQI and expert panel rankings correlated highly (R = 0.92, P < 0.001). In consumer testing, approximately 80% of >800 study participants indicated that the ONQI would influence their purchase intent. ONQI scoring distinguished the more-healthful DASH (Dietary Approaches to Stop Hypertension) diet (mean score: 46) from the typical American diet according to the National Health and Nutrition Examination Survey (NHANES) 2003-2006 (mean score: 26.5; P < 0.01). In linear regression analysis of the NHANES 2003-2006 populations (n = 15,900), the NuVal system was significantly associated with the Healthy Eating Index 2005 (P < 0.0001). Recently generated data from ongoing studies indicate favorable effects on purchase patterns and significant correlation with health outcomes in large cohorts of men and women followed for decades. NuVal offers universally applicable nutrition guidance that is independent of food industry interests and is supported by consumer research and scientific evaluation of its performance characteristics.
Wright, Alexander; Lyttleton, Oliver; Lewis, Paul; Quirke, Philip; Treanor, Darren
2011-01-01
Background: Tissue MicroArrays (TMAs) are a high throughput technology for rapid analysis of protein expression across hundreds of patient samples. Often, data relating to TMAs is specific to the clinical trial or experiment it is being used for, and not interoperable. The Tissue Microarray Data Exchange Specification (TMA DES) is a set of eXtensible Markup Language (XML)-based protocols for storing and sharing digitized Tissue Microarray data. XML data are enclosed by named tags which serve as identifiers. These tag names can be Common Data Elements (CDEs), which have a predefined meaning or semantics. By using this specification in a laboratory setting with increasing demands for digital pathology integration, we found that the data structure lacked the ability to cope with digital slide imaging in respect to web-enabled digital pathology systems and advanced scoring techniques. Materials and Methods: By employing user centric design, and observing behavior in relation to TMA scoring and associated data, the TMA DES format was extended to accommodate the current limitations. This was done with specific focus on developing a generic tool for handling any given scoring system, and utilizing data for multiple observations and observers. Results: DTDs were created to validate the extensions of the TMA DES protocol, and a test set of data containing scores for 6,708 TMA core images was generated. The XML was then read into an image processing algorithm to utilize the digital pathology data extensions, and scoring results were easily stored alongside the existing multiple pathologist scores. Conclusions: By extending the TMA DES format to include digital pathology data and customizable scoring systems for TMAs, the new system facilitates the collaboration between pathologists and organizations, and can be used in automatic or manual data analysis. This allows complying systems to effectively communicate complex and varied scoring data. PMID:21572508
Creativity and Performativity Policies in Primary School Cultures
ERIC Educational Resources Information Center
Troman, Geoff; Jeffrey, Bob; Raggl, Andrea
2007-01-01
Cultures of performativity in English primary schools refer to systems and relationships of: target-setting; Ofsted inspections; school league tables constructed from pupil test scores; performance management; performance related pay; threshold assessment; and advanced skills teachers. Systems which demand that teachers "perform" and in…
Li, Yang; Yang, Jianyi
2017-04-24
The prediction of protein-ligand binding affinity has recently been improved remarkably by machine-learning-based scoring functions. For example, using a set of simple descriptors representing the atomic distance counts, the RF-Score improves the Pearson correlation coefficient to about 0.8 on the core set of the PDBbind 2007 database, which is significantly higher than the performance of any conventional scoring function on the same benchmark. A few studies have been made to discuss the performance of machine-learning-based methods, but the reason for this improvement remains unclear. In this study, by systemically controlling the structural and sequence similarity between the training and test proteins of the PDBbind benchmark, we demonstrate that protein structural and sequence similarity makes a significant impact on machine-learning-based methods. After removal of training proteins that are highly similar to the test proteins identified by structure alignment and sequence alignment, machine-learning-based methods trained on the new training sets do not outperform the conventional scoring functions any more. On the contrary, the performance of conventional functions like X-Score is relatively stable no matter what training data are used to fit the weights of its energy terms.
Azzi, Salah; Salem, Jennifer; Thibaud, Nathalie; Chantot-Bastaraud, Sandra; Lieber, Eli; Netchine, Irène; Harbison, Madeleine D
2015-01-01
Background Multiple clinical scoring systems have been proposed for Silver-Russell syndrome (SRS). Here we aimed to test a clinical scoring system for SRS and to analyse the correlation between (epi)genotype and phenotype. Subjects and methods Sixty-nine patients were examined by two physicians. Clinical scores were generated for all patients, with a new, six-item scoring system: (1) small for gestational age, birth length and/or weight ≤−2SDS, (2) postnatal growth retardation (height ≤−2SDS), (3) relative macrocephaly at birth, (4) body asymmetry, (5) feeding difficulties and/or body mass index (BMI) ≤−2SDS in toddlers; (6) protruding forehead at the age of 1–3 years. Subjects were considered to have likely SRS if they met at least four of these six criteria. Molecular investigations were performed blind to the clinical data. Results The 69 patients were classified into two groups (Likely-SRS (n=60), Unlikely-SRS (n=9)). Forty-six Likely-SRS patients (76.7%) displayed either 11p15 ICR1 hypomethylation (n=35; 58.3%) or maternal UPD of chromosome 7 (mUPD7) (n=11; 18.3%). Eight Unlikely-SRS patients had neither ICR1 hypomethylation nor mUPD7, whereas one patient had mUPD7. The clinical score and molecular results yielded four groups that differed significantly overall and for individual scoring system factors. Further molecular screening led identifying chromosomal abnormalities in Likely-SRS-double-negative and Unlikely-SRS groups. Four Likely-SRS-double negative patients carried a DLK1/GTL2 IG-DMR hypomethylation, a mUPD16; a mUPD20 and a de novo 1q21 microdeletion. Conclusions This new scoring system is very sensitive (98%) for the detection of patients with SRS with demonstrated molecular abnormalities. Given its clinical and molecular heterogeneity, SRS could be considered as a spectrum. PMID:25951829
New scoring system for intra-abdominal injury diagnosis after blunt trauma.
Shojaee, Majid; Faridaalaee, Gholamreza; Yousefifard, Mahmoud; Yaseri, Mehdi; Arhami Dolatabadi, Ali; Sabzghabaei, Anita; Malekirastekenari, Ali
2014-01-01
An accurate scoring system for intra-abdominal injury (IAI) based on clinical manifestation and examination may decrease unnecessary CT scans, save time, and reduce healthcare cost. This study is designed to provide a new scoring system for a better diagnosis of IAI after blunt trauma. This prospective observational study was performed from April 2011 to October 2012 on patients aged above 18 years and suspected with blunt abdominal trauma (BAT) admitted to the emergency department (ED) of Imam Hussein Hospital and Shohadaye Hafte Tir Hospital. All patients were assessed and treated based on Advanced Trauma Life Support and ED protocol. Diagnosis was done according to CT scan findings, which was considered as the gold standard. Data were gathered based on patient's history, physical exam, ultrasound and CT scan findings by a general practitioner who was not blind to this study. Chi-square test and logistic regression were done. Factors with significant relationship with CT scan were imported in multivariate regression models, where a coefficient (β) was given based on the contribution of each of them. Scoring system was developed based on the obtained total β of each factor. Altogether 261 patients (80.1% male) were enrolled (48 cases of IAI). A 24-point blunt abdominal trauma scoring system (BATSS) was developed. Patients were divided into three groups including low (score<8), moderate (8≤score<12) and high risk (score≥12). In high risk group immediate laparotomy should be done, moderate group needs further assessments, and low risk group should be kept under observation. Low risk patients did not show positive CT-scans (specificity 100%). Conversely, all high risk patients had positive CT-scan findings (sensitivity 100%). The receiver operating characteristic curve indicated a close relationship between the results of CT scan and BATSS (sensitivity=99.3%). The present scoring system furnishes a high precision and reproducible diagnostic tool for BAT detection and has the potential to reduce unnecessary CT scan and cut unnecessary costs.
Merk, Josef; Schlotz, Wolff; Falter, Thomas
2017-01-01
This study presents a new measure of value systems, the Motivational Value Systems Questionnaire (MVSQ), which is based on a theory of value systems by psychologist Clare W. Graves. The purpose of the instrument is to help people identify their personal hierarchies of value systems and thus become more aware of what motivates and demotivates them in work-related contexts. The MVSQ is a forced-choice (FC) measure, making it quicker to complete and more difficult to intentionally distort, but also more difficult to assess its psychometric properties due to ipsativity of FC data compared to rating scales. To overcome limitations of ipsative data, a Thurstonian IRT (TIRT) model was fitted to the questionnaire data, based on a broad sample of N = 1,217 professionals and students. Comparison of normative (IRT) scale scores and ipsative scores suggested that MVSQ IRT scores are largely freed from restrictions due to ipsativity and thus allow interindividual comparison of scale scores. Empirical reliability was estimated using a sample-based simulation approach which showed acceptable and good estimates and, on average, slightly higher test-retest reliabilities. Further, validation studies provided evidence on both construct validity and criterion-related validity. Scale score correlations and associations of scores with both age and gender were largely in line with theoretically- and empirically-based expectations, and results of a multitrait-multimethod analysis supports convergent and discriminant construct validity. Criterion validity was assessed by examining the relation of value system preferences to departmental affiliation which revealed significant relations in line with prior hypothesizing. These findings demonstrate the good psychometric properties of the MVSQ and support its application in the assessment of value systems in work-related contexts. PMID:28979228
Merk, Josef; Schlotz, Wolff; Falter, Thomas
2017-01-01
This study presents a new measure of value systems, the Motivational Value Systems Questionnaire (MVSQ), which is based on a theory of value systems by psychologist Clare W. Graves. The purpose of the instrument is to help people identify their personal hierarchies of value systems and thus become more aware of what motivates and demotivates them in work-related contexts. The MVSQ is a forced-choice (FC) measure, making it quicker to complete and more difficult to intentionally distort, but also more difficult to assess its psychometric properties due to ipsativity of FC data compared to rating scales. To overcome limitations of ipsative data, a Thurstonian IRT (TIRT) model was fitted to the questionnaire data, based on a broad sample of N = 1,217 professionals and students. Comparison of normative (IRT) scale scores and ipsative scores suggested that MVSQ IRT scores are largely freed from restrictions due to ipsativity and thus allow interindividual comparison of scale scores. Empirical reliability was estimated using a sample-based simulation approach which showed acceptable and good estimates and, on average, slightly higher test-retest reliabilities. Further, validation studies provided evidence on both construct validity and criterion-related validity. Scale score correlations and associations of scores with both age and gender were largely in line with theoretically- and empirically-based expectations, and results of a multitrait-multimethod analysis supports convergent and discriminant construct validity. Criterion validity was assessed by examining the relation of value system preferences to departmental affiliation which revealed significant relations in line with prior hypothesizing. These findings demonstrate the good psychometric properties of the MVSQ and support its application in the assessment of value systems in work-related contexts.
[Development and validation of the Visual Analogue Scale (VAS) Spine Score].
Knop, C; Oeser, M; Bastian, L; Lange, U; Zdichavsky, M; Blauth, M
2001-06-01
The aim of the study was the development and validation of a new subjective rating scale for assessment of outcome in patients with thoracolumbar fractures and fracture dislocations. The VAS spine score consists of 19 score items, using 100-mm visual analogue scales. The items are answered by the patients independently of rater assessment. To measure the analogue scales and calculate the score, a computer-aided system was evolved consisting of self-developed software and digitizer board. The overall score is the mean of all items answered with values between 0 and 100. The individual score loss is calculated as the difference between the preinjury score and at follow-up with values between 0 and 100. The VAS spine score was tested for reliability with a group of 136 healthy volunteers. We performed a test-retest study with an interval of 24 h. For statistical analysis of the validity, we prospectively followed a group of 53 patients with the new outcome score. We chose patients with injuries of the thoracolumbar spine, all having been operatively treated by combined posterior-anterior stabilization and fusion between 1994 and 1996. In the reference group, the average test score was 91.95 (58-100) and 92.10 (58-100) at retest. The mean individual difference between test and retest scored 1.037 (0-8). A high reliability was proved by a strong correlation with a coefficient of 0.976 (p < 0.001). A high internal consistency of the VAS spine score was shown by a Cronbach-alpha of 0.9117. The mean score for the preinjury status of the patients was comparable to the reference group, amounting to 89.60 (21-100). The mean score at the time of implant removal was significantly (p < 0.001) decreased to 58.25 (13-97). Until the time of follow-up a significant (p < 0.001) increase was noted, and the group scored 66.08 (15-100) at follow-up. This was a significant (p < 0.001) difference compared with the preinjury status. The individual score loss averaged 24.1 (0-80). In the patient group we also noted a Cronbach-alpha > 0.95, indicating a high internal consistency. With the VAS spine score the authors have inaugurated a new tool for outcome measurement in the treatment of patients with thoracolumbar injuries. The study has proved the score to be both reliable and valid. The application of the score is helpful in analyzing the subjective outcome, and the results can be correlated with objective measures. The score is a useful tool for comparative clinical studies, addressing the outcome after different methods of treatment.
DIFAS: Differential Item Functioning Analysis System. Computer Program Exchange
ERIC Educational Resources Information Center
Penfield, Randall D.
2005-01-01
Differential item functioning (DIF) is an important consideration in assessing the validity of test scores (Camilli & Shepard, 1994). A variety of statistical procedures have been developed to assess DIF in tests of dichotomous (Hills, 1989; Millsap & Everson, 1993) and polytomous (Penfield & Lam, 2000; Potenza & Dorans, 1995) items. Some of these…
Bressel, Eadric; Yonker, Joshua C; Kras, John; Heath, Edward M
2007-01-01
Context: How athletes from different sports perform on balance tests is not well understood. When prescribing balance exercises to athletes in different sports, it may be important to recognize performance variations. Objective: To compare static and dynamic balance among collegiate athletes competing or training in soccer, basketball, and gymnastics. Design: A quasi-experimental, between-groups design. Independent variables included limb (dominant and nondominant) and sport played. Setting: A university athletic training facility. Patients or Other Participants: Thirty-four female volunteers who competed in National Collegiate Athletic Association Division I soccer (n = 11), basketball (n = 11), or gymnastics (n = 12). Intervention(s): To assess static balance, participants performed 3 stance variations (double leg, single leg, and tandem leg) on 2 surfaces (stiff and compliant). For assessment of dynamic balance, participants performed multidirectional maximal single-leg reaches from a unilateral base of support. Main Outcome Measure(s): Errors from the Balance Error Scoring System and normalized leg reach distances from the Star Excursion Balance Test were used to assess static and dynamic balance, respectively. Results: Balance Error Scoring System error scores for the gymnastics group were 55% lower than for the basketball group (P = .01), and Star Excursion Balance Test scores were 7% higher in the soccer group than the basketball group (P = .04). Conclusions: Gymnasts and soccer players did not differ in terms of static and dynamic balance. In contrast, basketball players displayed inferior static balance compared with gymnasts and inferior dynamic balance compared with soccer players. PMID:17597942
Conrad, Mark F; Kang, Jeanwan; Mukhopadhyay, Shankha; Patel, Virendra I; LaMuraglia, Glenn M; Cambria, Richard P
2013-10-01
The benefit of carotid endarterectomy (CEA) over medical therapy in patients with asymptomatic carotid artery stenosis is predicated upon a life expectancy of at least 5 years after the procedure. The goal of this study was to create a scoring system for prediction of 5-year survival after CEA that can be used to triage patients with ACAS. All patients who underwent CEA for severe asymptomatic carotid stenosis from 1989 to 2005 were identified. Long-term survival was determined by a review of hospital records and the social security death index. Because all patients had at least 5-year follow-up, a logistic regression of predictors of survival at 5 years was performed and the odds ratios associated with particular significant comorbidities were used to create a scoring system to predict survival. The scoring system was then validated within the cohort using the Hosmer-Lemeshow Test and a derivation/validation receiver operating characteristic (ROC) curve. There were 2004 CEA performed in 1791 patients. The average follow-up was 130 ± 49 months. The clinical profile of the cohort data included 84% hypertension, 56% coronary artery disease (CAD), 24% diabetes, and 71% on statins. The 30-day stroke rate was 1.1% and the death rate was 0.7%. The actual 5-year survival was 73%. Logistic regression yielded the following predictors of mortality: age (by decade) (odds ratio [OR] = 1.8, P < 0.0001), CAD (OR = 1.5, P = 0.0007), chronic obstructive pulmonary disease (OR = 2.5; P < 0.0001), diabetes (OR = 1.7, P < 0.0001), neck radiation (OR = 2.6, P = 0.005), no statin (OR = 2.1, P < 0.0001), and creatinine more than 1.5 (OR = 2.6, P < 0.0001). These variables were then assigned a hierarchal point scoring system in accordance with the OR value. The 5-year survival based on the scoring system was as follows: 0 to 5 points = 92.5%, 6 to 8 points = 83.6%, 9 to 11 points = 63.7%, 12 to 14 points = 46.5%, and more than 15 points = 33.8%. The Hosmer-Lemeshow test validated the scoring system (P = 0.26) and there was no difference in the ROC curves (C statistic = 0.74 vs 0.73). This validated scoring system can be a useful tool for determining which patients are likely to benefit most from CEA based on the probability of long-term survival. Given that the 5-year survival of patients in the medical arm of the asymptomatic CEA trials was 60% to 70%, it is reasonable to conclude that patients who score 0 to 8 points are excellent candidates for CEA whereas most patients with ≥12 points should be managed with medical therapy alone.
Grantcharov, Teodor P; Bardram, Linda; Funch-Jensen, Peter; Rosenberg, Jacob
2003-02-01
The study was carried out to analyze the learning rate for laparoscopic skills on a virtual reality training system and to establish whether the simulator was able to differentiate between surgeons with different laparoscopic experience. Forty-one surgeons were divided into three groups according to their experience in laparoscopic surgery: masters (group 1, performed more than 100 cholecystectomies), intermediates (group 2, between 15 and 80 cholecystectomies), and beginners (group 3, fewer than 10 cholecystectomies) were included in the study. The participants were tested on the Minimally Invasive Surgical Trainer-Virtual Reality (MIST-VR) 10 consecutive times within a 1-month period. Assessment of laparoscopic skills included time, errors, and economy of hand movement, measured by the simulator. The learning curves regarding time reached plateau after the second repetition for group 1, the fifth repetition for group 2, and the seventh repetition for group 3 (Friedman's tests P <0.05). Experienced surgeons did not improve their error or economy of movement scores (Friedman's tests, P >0.2) indicating the absence of a learning curve for these parameters. Group 2 error scores reached plateau after the first repetition, and group 3 after the fifth repetition. Group 2 improved their economy of movement score up to the third repetition and group 3 up to the sixth repetition (Friedman's tests, P <0.05). Experienced surgeons (group 1) demonstrated best performance parameters, followed by group 2 and group 3 (Mann-Whitney test P <0.05). Different learning curves existed for surgeons with different laparoscopic background. The familiarization rate on the simulator was proportional to the operative experience of the surgeons. Experienced surgeons demonstrated best laparoscopic performance on the simulator, followed by those with intermediate experience and the beginners. These differences indicate that the scoring system of MIST-VR is sensitive and specific to measuring skills relevant for laparoscopic surgery.
Assessing Freshman Engineering Students' Understanding of Ethical Behavior.
Henslee, Amber M; Murray, Susan L; Olbricht, Gayla R; Ludlow, Douglas K; Hays, Malcolm E; Nelson, Hannah M
2017-02-01
Academic dishonesty, including cheating and plagiarism, is on the rise in colleges, particularly among engineering students. While students decide to engage in these behaviors for many different reasons, academic integrity training can help improve their understanding of ethical decision making. The two studies outlined in this paper assess the effectiveness of an online module in increasing academic integrity among first semester engineering students. Study 1 tested the effectiveness of an academic honesty tutorial by using a between groups design with a Time 1- and Time 2-test. An academic honesty quiz assessed participants' knowledge at both time points. Study 2, which incorporated an improved version of the module and quiz, utilized a between groups design with three assessment time points. The additional Time 3-test allowed researchers to test for retention of information. Results were analyzed using ANCOVA and t tests. In Study 1, the experimental group exhibited significant improvement on the plagiarism items, but not the total score. However, at Time 2 there was no significant difference between groups after controlling for Time 1 scores. In Study 2, between- and within-group analyses suggest there was a significant improvement in total scores, but not plagiarism scores, after exposure to the tutorial. Overall, the academic integrity module impacted participants as evidenced by changes in total score and on specific plagiarism items. Although future implementation of the tutorial and quiz would benefit from modifications to reduce ceiling effects and improve assessment of knowledge, the results suggest such tutorial may be one valuable element in a systems approach to improving the academic integrity of engineering students.
O’Malley, Natasha T.; Cunningham, Michael; Leung, Frankie; Blauth, Michael; Kates, Stephen L.
2011-01-01
Background: Surgical education is continually expanding to encompass new techniques and technologies. It is vital that educational activity is directed at gaps in knowledge and ability to improve the quality of learning. Aim: The aim of this study is to describe a published learning assessment toolkit when applied to participants attending AOTrauma Orthogeriatric Fracture courses. Methods: Precourse, participants received a questionnaire covering 10 competencies to assess knowledge gaps and a 20-question clinical knowledge test. The knowledge gap between perceived and desired knowledge was correlated with clinical knowledge test results to help course faculty focus the course curriculum to meet identified educational needs. A commitment to change survey was also administered. Results: Over 3 courses, 48% of registered attendees responded to the precourse survey, 44.5% responded postcourse. The precourse gap scores were generally highest for 2 competencies (“address secondary prevention,” “build a system of care”) indicating a higher level of motivation to learn in these topics and lowest for a variety of competencies (eg. “restore function early,” “co-manage patient care in the US surgeons group”) indicating lower motivation to learn in these competencies. These precourse gap scores guided adaptations in the course structure. Postcourse gaps were reduced in the 4 cohorts. Large improvements were seen in “Address secondary prevention” and “Build a system of care” in many of the cohorts. Competencies with the lowest precourse knowledge test scores were noted in each cohort. Where low pretest scores were noted, it highlighted the need for faculty to put appropriate emphasis on these topics in the delivery of the course content. Conclusion: The technique of evaluating and identifying gaps in knowledge and ability allows course designers to focus on areas of deficits. Measurable success was shown with a subjectively decreased gap score and objectively improved clinical knowledge, as demonstrated by improved test results after course completion. PMID:23569686
Park, Juhyun; Kang, Minyong; Jeong, Chang Wook; Oh, Sohee; Lee, Jeong Woo; Lee, Seung Bae; Son, Hwancheol; Jeong, Hyeon; Cho, Sung Yong
2015-08-01
The modified Seoul National University Renal Stone Complexity scoring system (S-ReSC-R) for retrograde intrarenal surgery (RIRS) was developed as a tool to predict stone-free rate (SFR) after RIRS. We externally validated the S-ReSC-R. We retrospectively reviewed 159 patients who underwent RIRS. The S-ReSC-R was assigned from 1 to 12 according to the location and number of sites involved. The stone-free status was defined as no evidence of a stone or with clinically insignificant residual fragment stones less than 2 mm. Interobserver and test-retest reliabilities were evaluated. Statistical performance of the prediction model was assessed by its predictive accuracy, predictive probability, and clinical usefulness. Overall SFR was 73.0%. The SFRs were 86.7%, 70.2%, and 48.6% in low-score (1-2), intermediate-score (3-4), and high-score (5-12) groups, respectively (p<0.001). External validation of S-ReSC-R revealed an area under the curve (AUC) of 0.731 (95% CI 0.650-0.813). The AUC of the three-titered S-ReSC-R was 0.701 (95% CI 0.609-0.794). The calibration plot showed that the predicted probability of SFR had a concordance comparable to that of observed frequency. The Hosmer-Lemeshow goodness of fit test revealed a p-value of 0.01 for the S-ReSC-R and 0.90 for the three-titered S-ReSC-R. Interobserver and test-retest reliabilities revealed an almost perfect level of agreement. The present study proved the predictive value of S-ReSC-R to predict SFR following RIRS in an independent cohort. Interobserver and test-retest reliabilities confirmed that S-ReSC-R was reliable and valid.
Ukkola-Vuoti, Liisa; Kanduri, Chakravarthi; Oikkonen, Jaana; Buck, Gemma; Blancher, Christine; Raijas, Pirre; Karma, Kai; Lähdesmäki, Harri; Järvelä, Irma
2013-01-01
Music perception and practice represent complex cognitive functions of the human brain. Recently, evidence for the molecular genetic background of music related phenotypes has been obtained. In order to further elucidate the molecular background of musical phenotypes we analyzed genome wide copy number variations (CNVs) in five extended pedigrees and in 172 unrelated subjects characterized for musical aptitude and creative functions in music. Musical aptitude was defined by combination of the scores of three music tests (COMB scores): auditory structuring ability, Seashores test for pitch and for time. Data on creativity in music (herein composing, improvising and/or arranging music) was surveyed using a web-based questionnaire.Several CNVRs containing genes that affect neurodevelopment, learning and memory were detected. A deletion at 5q31.1 covering the protocadherin-α gene cluster (Pcdha 1-9) was found co-segregating with low music test scores (COMB) in both sample sets. Pcdha is involved in neural migration, differentiation and synaptogenesis. Creativity in music was found to co-segregate with a duplication covering glucose mutarotase gene (GALM) at 2p22. GALM has influence on serotonin release and membrane trafficking of the human serotonin transporter. Interestingly, genes related to serotonergic systems have been shown to associate not only with psychiatric disorders but also with creativity and music perception. Both, Pcdha and GALM, are related to the serotonergic systems influencing cognitive and motor functions, important for music perception and practice. Finally, a 1.3 Mb duplication was identified in a subject with low COMB scores in the region previously linked with absolute pitch (AP) at 8q24. No differences in the CNV burden was detected among the high/low music test scores or creative/non-creative groups. In summary, CNVs and genes found in this study are related to cognitive functions. Our result suggests new candidate genes for music perception related traits and supports the previous results from AP study.
Oikkonen, Jaana; Buck, Gemma; Blancher, Christine; Raijas, Pirre; Karma, Kai; Lähdesmäki, Harri; Järvelä, Irma
2013-01-01
Music perception and practice represent complex cognitive functions of the human brain. Recently, evidence for the molecular genetic background of music related phenotypes has been obtained. In order to further elucidate the molecular background of musical phenotypes we analyzed genome wide copy number variations (CNVs) in five extended pedigrees and in 172 unrelated subjects characterized for musical aptitude and creative functions in music. Musical aptitude was defined by combination of the scores of three music tests (COMB scores): auditory structuring ability, Seashores test for pitch and for time. Data on creativity in music (herein composing, improvising and/or arranging music) was surveyed using a web-based questionnaire. Several CNVRs containing genes that affect neurodevelopment, learning and memory were detected. A deletion at 5q31.1 covering the protocadherin-α gene cluster (Pcdha 1-9) was found co-segregating with low music test scores (COMB) in both sample sets. Pcdha is involved in neural migration, differentiation and synaptogenesis. Creativity in music was found to co-segregate with a duplication covering glucose mutarotase gene (GALM) at 2p22. GALM has influence on serotonin release and membrane trafficking of the human serotonin transporter. Interestingly, genes related to serotonergic systems have been shown to associate not only with psychiatric disorders but also with creativity and music perception. Both, Pcdha and GALM, are related to the serotonergic systems influencing cognitive and motor functions, important for music perception and practice. Finally, a 1.3 Mb duplication was identified in a subject with low COMB scores in the region previously linked with absolute pitch (AP) at 8q24. No differences in the CNV burden was detected among the high/low music test scores or creative/non-creative groups. In summary, CNVs and genes found in this study are related to cognitive functions. Our result suggests new candidate genes for music perception related traits and supports the previous results from AP study. PMID:23460800
Mitra, Nilesh Kumar; Barua, Ankur
2015-03-03
The impact of web-based formative assessment practices on performance of undergraduate medical students in summative assessments is not widely studied. This study was conducted among third-year undergraduate medical students of a designated university in Malaysia to compare the effect, on performance in summative assessment, of repeated computer-based formative assessment with automated feedback with that of single paper-based formative assessment with face-to face feedback. This quasi-randomized trial was conducted among two groups of undergraduate medical students who were selected by stratified random technique from a cohort undertaking the Musculoskeletal module. The control group C (n = 102) was subjected to a paper-based formative MCQ test. The experimental group E (n = 65) was provided three online formative MCQ tests with automated feedback. The summative MCQ test scores for both these groups were collected after the completion of the module. In this study, no significant difference was observed between the mean summative scores of the two groups. However, Band 1 students from group E with higher entry qualification showed higher mean score in the summative assessment. A trivial, but significant and positive correlation (r(2) = +0.328) was observed between the online formative test scores and summative assessment scores of group E. The proportionate increase of performance in group E was found to be almost double than group C. The use of computer based formative test with automated feedback improved the performance of the students with better academic background in the summative assessment. Computer-based formative test can be explored as an optional addition to the curriculum of pre-clinical integrated medical program to improve the performance of the students with higher academic ability.
Conditional Standard Errors of Measurement for Composite Scores Using IRT
ERIC Educational Resources Information Center
Kolen, Michael J.; Wang, Tianyou; Lee, Won-Chan
2012-01-01
Composite scores are often formed from test scores on educational achievement test batteries to provide a single index of achievement over two or more content areas or two or more item types on that test. Composite scores are subject to measurement error, and as with scores on individual tests, the amount of error variability typically depends on…
Corneal staining patterns in vernal keratoconjunctivitis: the new VKC-CLEK scoring scale.
Leonardi, Andrea; Lazzarini, Daniela; La Gloria Valerio, Alvise; Scalora, Tania; Fregona, Iva
2018-01-24
To propose a new scoring system in the assessment of ocular surface epithelial damage in vernal keratoconjunctivitis (VKC). 25 consecutive patients with VKC (50 eyes) were evaluated using the Quality of Life in children with VKC (QUICK) questionnaire and objective clinical measures: fluorescein and lissamine green staining and cornea confocal microscopy (Heidelberg Retina Tomography 3). Oxford, Van Bljsterweld and a new system, the VKC-Collaborative Longitudinal Evaluation of Keratoconus study (CLEK) (VKC-CLEK) scores, were used to evaluate the epithelial damage after staining. Mean Oxford and VKC-CLEK scores were significantly different after fluorescein staining (P<0.001), but significantly correlated (P<0.001; r=0.649). The same data were obtained comparing Van Bljsterweld and VKC-CLEK after lissamine green staining (P<0.001; r=0.760). In patient with limbal VKC, a statistically significant difference was found comparing new VKC-CLEK scores and Oxford or Van Bljsterweld scores (P<0.001), but not in tarsal VKC. A statistically superior concordance was found between QUICK and VKC-CLEK scores compared with standard staining scores values (P<0.001). Oxford and Van Bijsterveld scores are not adequate for the evaluation of the epithelial damage in patients with limbal VKC because the staining patterns considered for these tests do not correspond to the staining patterns in patients with VKC. We propose a new scoring system, VKC-CLEK, to better evaluate both limbal and tarsal epithelial damage in patients with VKC. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Bosslet, Gabriel T; Carlos, W Graham; Tybor, David J; McCallister, Jennifer; Huebert, Candace; Henderson, Ashley; Miles, Matthew C; Twigg, Homer; Sears, Catherine R; Brown, Cynthia; Farber, Mark O; Lahm, Tim; Buckley, John D
2017-04-01
Few data have been published regarding scoring tools for selection of postgraduate medical trainee candidates that have wide applicability. The authors present a novel scoring tool developed to assist postgraduate programs in generating an institution-specific rank list derived from selected elements of the U.S. Electronic Residency Application System (ERAS) application. The authors developed and validated an ERAS and interview day scoring tool at five pulmonary and critical care fellowship programs: the ERAS Application Scoring Tool-Interview Scoring Tool. This scoring tool was then tested for intrarater correlation versus subjective rankings of ERAS applications. The process for development of the tool was performed at four other institutions, and it was performed alongside and compared with the "traditional" ranking methods at the five programs and compared with the submitted National Residency Match Program rank list. The ERAS Application Scoring Tool correlated highly with subjective faculty rankings at the primary institution (average Spearman's r = 0.77). The ERAS Application Scoring Tool-Interview Scoring Tool method correlated well with traditional ranking methodology at all five institutions (Spearman's r = 0.54, 0.65, 0.72, 0.77, and 0.84). This study validates a process for selecting and weighting components of the ERAS application and interview day to create a customizable, institution-specific tool for ranking candidates to postgraduate medical education programs. This scoring system can be used in future studies to compare the outcomes of fellowship training.
Altuntas, Nilgun; Turkyilmaz, Canan; Yildiz, Havva; Kulali, Ferit; Hirfanoglu, Ibrahim; Onal, Esra; Ergenekon, Ebru; Koç, Esin; Atalay, Yıldız
2014-05-01
We aimed to evaluate the validity and reliability of the Infant Breastfeeding Assessment Tool (IBFAT), the Mother Baby Assessment (MBA) Tool, and the LATCH scoring system. Mothers who delivered healthy, full-term infants in the Obstetrics & Gynecology Service of Gazi University, Ankara, Turkey, between December 2013 and January 2014 and their infants were included in the study. Forty-six randomly selected breastfeeding sessions were monitored and scored simultaneously by three researchers (Raters 1, 2, and 3) using LATCH, IBFAT, and the MBA Tool. Researchers put the score sheets in an envelope in order to hide them from each other. The compatibility of the scores given by three researchers was assessed by statistical methods. We found positive and significant correlation coefficients between 0.81 to 0.88 for the total MBA score, between 0.90 to 0.95 for the total IBFAT score, and between 0.85 to 0.91 for the total LATCH score. Correlation coefficients testing these three tools ranged from 0.71 to 0.88, with the minimum value being noted for the correlation between LATCH and IBFAT scores and the maximum value being noted for the correlation between LATCH and MBA scores. We found positive and significant correlations between researchers' scores for 46 observations using the three assessment tools. This study showed that these above-mentioned tools were compatible for the assessment of the efficiency of breastfeeding.
MOLINA, Gustavo Fabián; CABRAL, Ricardo Juan; MAZZOLA, Ignacio; BRAIN LASCANO, Laura; FRENCKEN, Jo. E.
2013-01-01
The Atraumatic Restorative Treatment (ART) approach was suggested to be a suitable method to treat enamel and dentine carious lesions in patients with disabilities. The use of a restorative glass-ionomer with optimal mechanical properties is, therefore, very important. Objective: To test the null-hypotheses that no difference in diametral tensile, compressive and flexural strengths exists between: (1) The EQUIA system and (2) The Chemfil Rock (encapsulated glass-ionomers; test materials) and the Fuji 9 Gold Label and the Ketac Molar Easymix (hand-mixed conventional glass-ionomers; control materials); (3) The EQUIA system and Chemfil Rock. Material and Methods: Specimens for testing flexural (n=240) and diametral tensile (n=80) strengths were prepared according to standardized specifications; the compressive strength (n=80) was measured using a tooth-model of a class II ART restoration. ANOVA and Tukey B tests were used to test for significant differences between dependent and independent variables. Results: The EQUIA system and Chemfil Rock had significantly higher mean scores for all the three strength variables than the Fuji 9 Gold Label and Ketac Molar Easymix (α=0.05). The EQUIA system had significant higher mean scores for diametral tensile and flexural strengths than the Chemfil Rock (α=0.05). Conclusion: The two encapsulated high-viscosity glass-ionomers had significantly higher test values for diametral tensile, flexural and compressive strengths than the commonly used hand-mixed high-viscosity glass-ionomers. PMID:23857657
2009-01-01
Background Chronic kidney disease (CKD) is a serious public health problem in Taiwan and the world. The most effective, affordable treatments involve early prevention/detection/intervention, requiring screening. Successfully implementing CKD programs requires good patient participation, affected by patient perceptions of screening service quality. Service quality improvements can help make such programs more successful. Thus, good tools for assessing service quality perceptions are important. Aim: to investigate using a modified SERVQUAL questionnaire in assessing patient expectations, perceptions, and loyalty towards kidney disease screening service quality. Method 1595 kidney disease screening program patients in Taichung City were requested to complete and return a modified kidney disease screening SERVQUAL questionnaire. 1187 returned them. Incomplete ones (102) were culled and 1085 were chosen as effective for use. Paired t-tests, correlation tests, ANOVA, LSD test, and factor analysis identified the characteristics and factors of service quality. The paired t-test tested expectation score and perception score gaps. A structural equation modeling system examined satisfaction-based components' relationships. Results The effective response rate was 91.4%. Several methods verified validity. Cronbach's alpha on internal reliability was above 0.902. On patient satisfaction, expectation scores are high: 6.50 (0.82), but perception scores are significantly lower 6.14 (1.02). Older patients' perception scores are lower than younger patients'. Expectation and perception scores for patients with different types of jobs are significantly different. Patients higher on education have lower scores for expectation (r = -0.09) and perception (r = -0.26). Factor analysis identified three factors in the 22 item SERVQUAL form, which account for 80.8% of the total variance for the expectation scores and 86.9% of the total variance for the satisfaction scores. Expectation and perception score gaps in all 22 items are significant. The goodness-of-fit summary of the SEM results indicates that expectations and perceptions are positively correlated, perceptions and loyalty are positively correlated, but expectations and loyalty are not positively correlated. Conclusions The results of this research suggest that the SERVQUAL instrument is a useful measurement tool in assessing and monitoring service quality in kidney disease screening services, enabling the staff to identify where service improvements are needed from the patients' perspectives. PMID:20021684
ERIC Educational Resources Information Center
Livingston, Samuel A.; Chen, Haiwen H.
2015-01-01
Quantitative information about test score reliability can be presented in terms of the distribution of equated scores on an alternate form of the test for test takers with a given score on the form taken. In this paper, we describe a procedure for estimating that distribution, for any specified score on the test form taken, by estimating the joint…
A Rasch-Based Validation of the Hooper Visual Organization Test in Chinese-Speaking Children
ERIC Educational Resources Information Center
Wuang, Yee-Pay; Wang, Li-Chen; Su, Chwen-Yng
2010-01-01
The aim of this study was to examine the validation of the Hooper Visual Organization Test (HVOT) for use in children by testing for item fit, unidimensionality, item hierarchy, reliability, and screening capacity. A modified scoring system was devised for the HVOT so that children received some credit for being able to describe the function of…
Why Lessons Learned from the Past Require Haertel's Expanded Scope for Test Validation
ERIC Educational Resources Information Center
Shepard, Lorrie A.
2013-01-01
In his article, Haertel (this issue) asks a fundamental question about how use of a test is expected to cause improvements in the educational system and in learning. He also considers how test validity should be investigated and argues for a more expansive view of validity that does not stop with scoring or generalization (the more technical and…
Results on the Slosson Drawing Coordination Test with Appalachian Sheltered Workshop Clients.
ERIC Educational Resources Information Center
Rogers, George W., Jr.; Richmond, Bert O.
Fifty-four clients (13- to 52-years-old) in an Appalachian sheltered workshop were administered the Slosson Drawing Coordination Test (SDCT) and the Bender Visual Motor Gestalt Test. Twenty-nine Ss were labeled possibly brain damaged by the SDCT, and 17 Ss by the M. Hutt scoring system for the Bender-Gestalt. Two psychologists using all available…
Khan, Fehmeda Farrukh; Numan, Ahsan; Khawaja, Khadija Irfan; Atif, Ali; Fatima, Aziz; Masud, Faisal
2015-01-01
Early diagnosis of distal peripheral neuropathy (DSPN) the commonest diabetes complications, helps prevent significant morbidity. Clinical parameters are useful for detection, but subjectivity and lack of operator proficiency often results in inaccuracies. Comparative diagnostic accuracy of Diabetic Neuropathy Symptom (DNS) score and Diabetic Neuropathy Examination (DNE) score in detecting DSPN confirmed by nerve conduction studies (NCS) has not been evaluated. This study compares the performance of these scores in predicting the presence of electro physiologically proven DSPN. The objective of this, study was to compare the diagnostic accuracy of DNS and DNE scores in detecting NCS proven DSPN in type-2 diabetics, and to determine the frequency of sub-clinical DSPN among type-2 diabetics. In this cross-sectional study the DNS score and DNE score were determined in 110 diagnosed type-2 diabetic patients. NCS were carried out and amplitudes, velocities and latencies of sensory and motor nerves in lower limb were recorded. Comparison between the two clinical diagnostic modalities and NCS using Pearson's chi square test showed a significant association between NCS and DNE scores (p-value =.003, specificity 93%). The DNS score performed poorly in comparison (p-value = .068, specificity 77%). When the two scores were taken in combination the specificity in diagnosing DSPN was greater (p-value = .018, specificity 96%) than either alone. 33% of patients had subclinical neuropathy. DNE score alone and in combination with DNS score is reliable in predicting DSPN and is more specific than DNS score in evaluating DSPN. Both tests lack sensitivity. Patients without any evidence of clinical neuropathy manifest abnormalities on NCS.
Sierra-Guzmán, Rafael; Jiménez, Fernando; Abián-Vicén, Javier
2018-05-01
Previous studies have reported the factors contributing to chronic ankle instability, which could lead to more effective treatments. However, factors such as the reflex response and ankle muscle strength have not been taken into account in previous investigations. Fifty recreational athletes with chronic ankle instability and 55 healthy controls were recruited. Peroneal reaction time in response to sudden inversion, isokinetic evertor muscle strength and dynamic balance with the Star Excursion Balance Test and the Biodex Stability System were measured. The relationship between the Cumberland Ankle Instability Tool score and performance on each test was assessed and a backward multiple linear regression analysis was conducted. Participants with chronic ankle instability showed prolonged peroneal reaction time, poor performance in the Biodex Stability System and decreased reach distance in the Star Excursion Balance Test. No significant differences were found in eversion and inversion peak torque. Moderate correlations were found between the Cumberland Ankle Instability Tool score and the peroneal reaction time and performance on the Star Excursion Balance Test. Peroneus brevis reaction time and the posteromedial and lateral directions of the Star Excursion Balance Test accounted for 36% of the variance in the Cumberland Ankle Instability Tool. Dynamic balance deficits and delayed peroneal reaction time are present in participants with chronic ankle instability. Peroneus brevis reaction time and the posteromedial and lateral directions of the Star Excursion Balance Test were the main contributing factors to the Cumberland Ankle Instability Tool score. No clear strength impairments were reported in unstable ankles. Copyright © 2018 Elsevier Ltd. All rights reserved.
Giarenis, Ilias; Musonda, Patrick; Mastoroudes, Heleni; Robinson, Dudley; Cardozo, Linda
2016-10-01
Traditionally, urodynamic studies (UDS) have been used to assess lower urinary tract symptoms (LUTS), but their routine use is now discouraged. While urodynamic stress incontinence is strongly associated with the symptom of stress urinary incontinence (SUI) and a positive cough test, there is a weak relationship between symptoms of overactive bladder and detrusor overactivity (DO). The aim of our study was to develop a model to predict DO in women with LUTS. This prospective study included consecutive women with LUTS attending a urodynamic clinic. All women underwent a comprehensive clinical and urodynamic assessment. The effect of each variable on the odds of DO was estimated both by univariate analysis and adjusted analysis using logistic regression. 1006 women with LUTS were included in the study with 374 patients (37%) diagnosed with DO. The factors considered to be the best predictors of DO were urgency urinary incontinence, urge rating/void and parity (p-value<0.01). The absence of SUI, vaginal bulging and previous continence surgery were also good predictors of DO (p-value<0.01). We have created a prediction model for DO based on our best predictors. In our scoring system, presence of UUI scores 5; mean urge rating/void≥3 scores 3; parity≥2 scores 2; previous continence surgery scores -1; presence of SUI scores -1; and the complaint of vaginal bulging scores -1. If a criterion is absent, then the score is 0 and the total score can vary from a value of -3 to +10. The Receiver Operating Characteristic (ROC) analysis for the overall cut-off points revealed an area under the curve of 0.748 (95%CI 0.741, 0.755). This model is able to predict DO more accurately than a symptomatic diagnosis alone, in women with LUTS. The introduction of this scoring system as a screening tool into clinical practice may reduce the need for expensive and invasive tests to diagnose DO, but cannot replace UDS completely. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
ERIC Educational Resources Information Center
Saatcioglu, Argun; Skrtic, Thomas M.; DeLuca, Thomas A.
2016-01-01
The overuse of test accommodations (e.g., test readers, extra time, and calculators) for students with disabilities is a potential means of gaming the accountability system because it can inflate proficiency gains. However, no direct evidence on this problem exists, and findings on whether or not test accommodations improve test scores are…
Jakimov, Tamara; Mrdović, Igor; Filipović, Branka; Zdravković, Marija; Djoković, Aleksandra; Hinić, Saša; Milić, Nataša; Filipović, Branislav
2017-12-31
To compare the prognostic performance of three major risk scoring systems including global registry for acute coronary events (GRACE), thrombolysis in myocardial infarction (TIMI), and prediction of 30-day major adverse cardiovascular events after primary percutaneous coronary intervention (RISK-PCI). This single-center retrospective study involved 200 patients with acute coronary syndrome (ACS) who underwent invasive diagnostic approach, ie, coronary angiography and myocardial revascularization if appropriate, in the period from January 2014 to July 2014. The GRACE, TIMI, and RISK-PCI risk scores were compared for their predictive ability. The primary endpoint was a composite 30-day major adverse cardiovascular event (MACE), which included death, urgent target-vessel revascularization (TVR), stroke, and non-fatal recurrent myocardial infarction (REMI). The c-statistics of the tested scores for 30-day MACE or area under the receiver operating characteristic curve (AUC) with confidence intervals (CI) were as follows: RISK-PCI (AUC=0.94; 95% CI 1.790-4.353), the GRACE score on admission (AUC=0.73; 95% CI 1.013-1.045), the GRACE score on discharge (AUC=0.65; 95% CI 0.999-1.033). The RISK-PCI score was the only score that could predict TVR (AUC=0.91; 95% CI 1.392-2.882). The RISK-PCI scoring system showed an excellent discriminative potential for 30-day death (AUC=0.96; 95% CI 1.339-3.548) in comparison with the GRACE scores on admission (AUC=0.88; 95% CI 1.018-1.072) and on discharge (AUC=0.78; 95% CI 1.000-1.058). In comparison with the GRACE and TIMI scores, RISK-PCI score showed a non-inferior ability to predict 30-day MACE and death in ACS patients. Moreover, RISK-PCI was the only scoring system that could predict recurrent ischemia requiring TVR.
Jakimov, Tamara; Mrdović, Igor; Filipović, Branka; Zdravković, Marija; Djoković, Aleksandra; Hinić, Saša; Milić, Nataša; Filipović, Branislav
2017-01-01
Aim To compare the prognostic performance of three major risk scoring systems including global registry for acute coronary events (GRACE), thrombolysis in myocardial infarction (TIMI), and prediction of 30-day major adverse cardiovascular events after primary percutaneous coronary intervention (RISK-PCI). Methods This single-center retrospective study involved 200 patients with acute coronary syndrome (ACS) who underwent invasive diagnostic approach, ie, coronary angiography and myocardial revascularization if appropriate, in the period from January 2014 to July 2014. The GRACE, TIMI, and RISK-PCI risk scores were compared for their predictive ability. The primary endpoint was a composite 30-day major adverse cardiovascular event (MACE), which included death, urgent target-vessel revascularization (TVR), stroke, and non-fatal recurrent myocardial infarction (REMI). Results The c-statistics of the tested scores for 30-day MACE or area under the receiver operating characteristic curve (AUC) with confidence intervals (CI) were as follows: RISK-PCI (AUC = 0.94; 95% CI 1.790-4.353), the GRACE score on admission (AUC = 0.73; 95% CI 1.013-1.045), the GRACE score on discharge (AUC = 0.65; 95% CI 0.999-1.033). The RISK-PCI score was the only score that could predict TVR (AUC = 0.91; 95% CI 1.392-2.882). The RISK-PCI scoring system showed an excellent discriminative potential for 30-day death (AUC = 0.96; 95% CI 1.339-3.548) in comparison with the GRACE scores on admission (AUC = 0.88; 95% CI 1.018-1.072) and on discharge (AUC = 0.78; 95% CI 1.000-1.058). Conclusions In comparison with the GRACE and TIMI scores, RISK-PCI score showed a non-inferior ability to predict 30-day MACE and death in ACS patients. Moreover, RISK-PCI was the only scoring system that could predict recurrent ischemia requiring TVR. PMID:29308832
ERIC Educational Resources Information Center
Miller-Whitehead, Marie
Evidence provided by analysis of science scale scores on the McGraw-Hill CTB/4 science test for grades 2 through 8 in Tennessee, part of the Tennessee Comprehensive Assessment Program (TCAP), shows that it is possible for high achieving school systems to show continuous improvement from year to year. These results would tend to offset fears that…
ERIC Educational Resources Information Center
Joe, Jilliam N.; Tocci, Cynthia M.; Holtzman, Steven L.; Williams, Jean C.
2013-01-01
The purpose of this paper is to provide states and school districts with processes they can use to help ensure high-quality data collection during teacher observations. Educational Testing Service's (ETS's) goal in writing it is to share the knowledge and expertise they gained: (1) from designing and implementing scoring processes for the Measures…
Nurses' Evaluation of Their Use and Mastery in Health Assessment Skills: Selected Iran's Hospitals
Adib-Hajbaghery, Mohsen; Safa, Azade
2013-01-01
Background: Health assessment skills are of the most important skills which nurses require. The more precise assessment, the better results would be obtained and the quality of patient care would be improved. However, in Iran, few studies have investigated nurses’ assessment skills. Objectives: This study was aimed to assessnurses' evaluation of the learned skills of health assessment and their use. Materials and Methods: This cross-sectional study was conducted on 200 nurses in Isfahan province hospitals. Data was collected by a questionnaire including demographic data and 120 health assessment skills. Nurses scored their frequency of using and proficiency in skills. Statistical analysis was conducted by ANOVA, Tukey test and independent sample T-tests. Results: The highest level of using and proficiency in skills was related to taking history. Nurses received 87.25% of score in this field. The lowest level of application was in assessment of the urogenital system so that nurses received 16.37% of score in this area. Also the lowest proficiency was in assessment of the nervous system and nurses received 34.58% of score in this area. Conclusions: The level of nurses' proficiency in the health assessment skills was not satisfactory. Modifying the curriculum and cooperating of nurse managers and nursing schools can help to improve the situation. PMID:25414875
Nurses' Evaluation of Their Use and Mastery in Health Assessment Skills: Selected Iran's Hospitals.
Adib-Hajbaghery, Mohsen; Safa, Azade
2013-09-01
Health assessment skills are of the most important skills which nurses require. The more precise assessment, the better results would be obtained and the quality of patient care would be improved. However, in Iran, few studies have investigated nurses' assessment skills. This study was aimed to assessnurses' evaluation of the learned skills of health assessment and their use. This cross-sectional study was conducted on 200 nurses in Isfahan province hospitals. Data was collected by a questionnaire including demographic data and 120 health assessment skills. Nurses scored their frequency of using and proficiency in skills. Statistical analysis was conducted by ANOVA, Tukey test and independent sample T-tests. The highest level of using and proficiency in skills was related to taking history. Nurses received 87.25% of score in this field. The lowest level of application was in assessment of the urogenital system so that nurses received 16.37% of score in this area. Also the lowest proficiency was in assessment of the nervous system and nurses received 34.58% of score in this area. The level of nurses' proficiency in the health assessment skills was not satisfactory. Modifying the curriculum and cooperating of nurse managers and nursing schools can help to improve the situation.
Creativity measured by divergent thinking is associated with two axes of autistic characteristics
Takeuchi, Hikaru; Taki, Yasuyuki; Sekiguchi, Atsushi; Nouchi, Rui; Kotozaki, Yuka; Nakagawa, Seishu; Miyauchi, Carlos M.; Iizuka, Kunio; Yokoyama, Ryoichi; Shinada, Takamitsu; Yamamoto, Yuki; Hanawa, Sugiko; Araki, Tsuyoshi; Hashizume, Hiroshi
2014-01-01
Creativity generally involves the conception of original and valuable ideas, and it plays a key role in scientific achievement. Moreover, individuals with autistic spectrum conditions (ASCs) tend to achieve in scientific fields. Recently, it has been proposed that low empathizing and high systemizing characterize individuals with ASCs. Empathizing is the drive to identify the mental status of other individuals and respond to it with an appropriate emotion; systemizing is the drive to analyze a system. It has been proposed that this higher systemizing underlies the scientific achievement of individuals with ASCs, suggesting the possible positive association between creativity and systemizing. However, previous findings on the association between ASCs and creativity were conflicting. Conversely, previous studies have suggested an association between prosocial traits and creativity, indicating the possible association between empathizing and creativity. Here we investigated the association between creativity measured by divergent thinking (CDT) and empathizing, systemizing, and the discrepancy between systemizing and empathizing, which is called D score. CDT was measured using the S-A creativity test. The individual degree of empathizing (empathizing quotient, EQ) and that of systemizing (systemizing quotient, SQ), and D score was measured via a validated questionnaire (SQ and EQ questionnaires). The results showed that higher CDT was significantly and positively correlated with both the score of EQ and the score of SQ but not with D score. These results suggest that CDT is positively associated with one of the characteristics of ASCs (analytical aspects), while exhibiting a negative association with another (lower social aspects). Therefore, the discrepancy between systemizing and empathizing, which is strongly associated with autistic tendency, was not associated with CDT. PMID:25191299
ERIC Educational Resources Information Center
Haga, Wayne; Moreno, Abel; Segall, Mark
2012-01-01
In this paper, we compare the performance of Computer Information Systems (CIS) majors on the Information Systems Analyst (ISA) Certification Exam. The impact that the form of delivery of information systems coursework may have on the exam score is studied. Using a sample that spans three years, we test for significant differences between scores…
Bologna, Matteo; Berardelli, Isabella; Paparella, Giulia; Marsili, Luca; Ricciardi, Lucia; Fabbrini, Giovanni; Berardelli, Alfredo
2016-01-01
Altered emotional processing, including reduced emotion facial expression and defective emotion recognition, has been reported in patients with Parkinson's disease (PD). However, few studies have objectively investigated facial expression abnormalities in PD using neurophysiological techniques. It is not known whether altered facial expression and recognition in PD are related. To investigate possible deficits in facial emotion expression and emotion recognition and their relationship, if any, in patients with PD. Eighteen patients with PD and 16 healthy controls were enrolled in this study. Facial expressions of emotion were recorded using a 3D optoelectronic system and analyzed using the facial action coding system. Possible deficits in emotion recognition were assessed using the Ekman test. Participants were assessed in one experimental session. Possible relationship between the kinematic variables of facial emotion expression, the Ekman test scores, and clinical and demographic data in patients were evaluated using the Spearman's test and multiple regression analysis. The facial expression of all six basic emotions had slower velocity and lower amplitude in patients in comparison to healthy controls (all P s < 0.05). Patients also yielded worse Ekman global score and disgust, sadness, and fear sub-scores than healthy controls (all P s < 0.001). Altered facial expression kinematics and emotion recognition deficits were unrelated in patients (all P s > 0.05). Finally, no relationship emerged between kinematic variables of facial emotion expression, the Ekman test scores, and clinical and demographic data in patients (all P s > 0.05). The results in this study provide further evidence of altered emotional processing in PD. The lack of any correlation between altered facial emotion expression kinematics and emotion recognition deficits in patients suggests that these abnormalities are mediated by separate pathophysiological mechanisms.
Nord, Anette; Hult, Håkan; Kreitz-Sandberg, Susanne; Herlitz, Johan; Svensson, Leif; Nilsson, Lennart
2017-06-23
The aim of this research is to investigate if two additional interventions, test and reflection, after standard cardiopulmonary resuscitation (CPR) training facilitate learning by comparing 13-year-old students' practical skills and willingness to act. Seventh grade students in council schools of two municipalities in south-east Sweden. School classes were randomised to CPR training only (O), CPR training with a practical test including feedback (T) or CPR training with reflection and a practical test including feedback (RT). Measures of practical skills and willingness to act in a potential life-threatening situation were studied directly after training and at 6 months using a digital reporting system and a survey. A modified Cardiff test was used to register the practical skills, where scores in each of 12 items resulted in a total score of 12-48 points. The study was conducted in accordance with current European Resuscitation Council guidelines during December 2013 to October 2014. 29 classes for a total of 587 seventh grade students were included in the study. The total score of the modified Cardiff test at 6 months was the primary outcome. Secondary outcomes were the total score directly after training, the 12 individual items of the modified Cardiff test and willingness to act. At 6 months, the T and O groups scored 32 (3.9) and 30 (4.0) points, respectively (p<0.001), while the RT group scored 32 (4.2) points (not significant when compared with T). There were no significant differences in willingness to act between the groups after 6 months. A practical test including feedback directly after training improved the students' acquisition of practical CPR skills. Reflection did not increase further CPR skills. At 6-month follow-up, no intervention effect was found regarding willingness to make a life-saving effort. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Nord, Anette; Hult, Håkan; Kreitz-Sandberg, Susanne; Herlitz, Johan; Svensson, Leif; Nilsson, Lennart
2017-01-01
Objectives The aim of this research is to investigate if two additional interventions, test and reflection, after standard cardiopulmonary resuscitation (CPR) training facilitate learning by comparing 13-year-old students’ practical skills and willingness to act. Settings Seventh grade students in council schools of two municipalities in south-east Sweden. Design School classes were randomised to CPR training only (O), CPR training with a practical test including feedback (T) or CPR training with reflection and a practical test including feedback (RT). Measures of practical skills and willingness to act in a potential life-threatening situation were studied directly after training and at 6 months using a digital reporting system and a survey. A modified Cardiff test was used to register the practical skills, where scores in each of 12 items resulted in a total score of 12–48 points. The study was conducted in accordance with current European Resuscitation Council guidelines during December 2013 to October 2014. Participants 29 classes for a total of 587 seventh grade students were included in the study. Primary and secondary outcome measures The total score of the modified Cardiff test at 6 months was the primary outcome. Secondary outcomes were the total score directly after training, the 12 individual items of the modified Cardiff test and willingness to act. Results At 6 months, the T and O groups scored 32 (3.9) and 30 (4.0) points, respectively (p<0.001), while the RT group scored 32 (4.2) points (not significant when compared with T). There were no significant differences in willingness to act between the groups after 6 months. Conclusions A practical test including feedback directly after training improved the students’ acquisition of practical CPR skills. Reflection did not increase further CPR skills. At 6-month follow-up, no intervention effect was found regarding willingness to make a life-saving effort. PMID:28645953
Machino, Masaaki; Yukawa, Yasutsugu; Imagama, Shiro; Ito, Keigo; Katayama, Yoshito; Matsumoto, Tomohiro; Inoue, Taro; Ouchida, Jun; Tomita, Keisuke; Ishiguro, Naoki; Kato, Fumihiko
2016-05-01
A prospective cohort study. The purpose of this study was to compare surgical outcomes between non-elderly and elderly patients with cervical spondylotic myelopathy (CSM) who underwent laminoplasty. Since age at the time of surgery influences the surgical outcome, we designed a large-scale cohort study to examine the surgical outcome for CSM from a single operative procedure used exclusively in elderly patients. A total of 505 consecutive patients with CSM (311 men; 194 women) were prospectively enrolled. The mean age was 66.6 years (range, 41-91), and the average postoperative follow-up period was 26.5 ± 12.5 months. Patients were divided into three groups according to age: non-elderly (<65 yr, n = 201), young-old (65-74 yr, n = 186), and old-old (≥75 yr, n = 118). Pre- and postoperative neurological status was evaluated using the Japanese Orthopaedic Association scoring system for cervical myelopathy (JOA score) and quantifiable tests-the 10-s grip and release test (10-s G&R test) and the 10-s step test. Mean achieved JOA scores in non-elderly, young-old, and old-old groups were 3.1, 3.2, and 3.0, respectively, with no significant difference among three groups (P = 0.5735). Mean preoperative 10-s G&R test results were 17.3, 14.4, and 13.0, respectively, indicating a significant decrease with increasing age, whereas postoperative results significantly improved in all groups (21.0, 17.9, and 16.3, respectively). Similarly, the 10-s step test significantly decreased with age, with preoperative scores of 14.3, 11.5, and 8.6, respectively, whereas postoperative scores improved to 17.3, 14.9, and 12.5, respectively. The three groups showed no significant difference in the rate of postoperative complications. Elderly patients adequately recovered from laminoplasty in terms of achieved JOA score, the 10-s G&R test, and the 10-s step test. Therefore, laminoplasty for CSM is beneficial in elderly patients. 2.
Lascano, Danny; Finkelstein, Julia B; Barlow, LaMont J; Kabat, Daniel; RoyChoudhury, Arindam; Caso, Jorge R; DeCastro, G Joel; Gold, William; McKiernan, James M
2015-12-01
To evaluate whether there is a correlation between publicized health ranking systems and surgical outcomes after radical cystectomy (RC) in New York State (NYS). Using the Statewide Planning and Research Cooperative System, data were collected in an aggregated fashion per hospital for the 20 hospitals with the highest RC volume in NYS from 2009 to 2012. Hospital characteristics were obtained from the publicly available sources such as the Centers for Medicare and Medicaid Services. Publicized ranking systems evaluated included the US News & World Health Report for Urology ranking (USHR), Healthgrades (HG) score, and Consumer Reports (CR) safety ranking. Outcomes measured included mortality, readmissions, and causes of readmissions. CR safety scores were inversely associated with overall death at 90 days after surgery (R = -0.527, P = .030), number of readmissions (R = -0.608, P = .030), and readmissions because of surgical complications (R = -0.523, P = .031) on a Pearson correlation test. On Kendall rank tau test, USHR and HG were not associated with any outcome of interest, although the scores correlated with increasing RC volume. In our analysis of 20 hospitals with the highest RC volume in NYS, USHR and HG scores were not strongly associated with any clinical outcome after RC. CR performed well in comparison with USHR and HG. Nevertheless, better metrics are needed to compare hospitals and to incorporate curative rates for morbid surgeries. Copyright © 2015 Elsevier Inc. All rights reserved.
Longo, Caterina; Casari, Alice; De Pace, Barbara; Simonazzi, Silvia; Mazzaglia, Giovanna; Pellacani, Giovanni
2013-02-01
Many instrumental devices have been testing in analysing and quantifying the skin aging signs. However, histopathology still remains the only methods that allow a microscopic assessment of the skin. However, a skin biopsy is not feasible in aesthetically critical areas such as the face. Recently, confocal microscopy has been discovered as a noninvasive tool with a nearly histologic resolution. Distinct morphologic confocal aspects on facial skin have been described and correlated with the histopathologic counterparts. In our study we aim to develop an easy to use confocal aging score to quantify the skin aging related signs. A sample of facial skin of fifty volunteers has been subjected to confocal imaging. Combining the previously identified confocal features, three different semi-quantitative scores were calculated: - epidermal disarray score (irregular honeycombed pattern + epidermal thickness + furrow pattern); - epidermal hyperplasia score (mottled pigmentation + extent of polycyclic papillary + epidermal thickness; - collagen score (curled fibers, 2 for huddles of collagen, 1 for coarse collagen structures, and 0 for thin reticulated collagen) The epidermal disarray score showed a stable trend up to 65 years and a dramatic increase in the elderly subjects epidermal. Hyperplasia score was characterized by an ascending trend from younger subjects to middle age. The total collagen score showed a progressive trend with age with a different proportion of distinct collagen type. RCM is a powerful, noninvasive technique that could permit to microscopically quantify the aging signs and to test cosmetic efficacy. © 2012 John Wiley & Sons A/S.
Genetics Home Reference: Cowden syndrome
... MS, Eng C. A clinical scoring system for selection of patients for PTEN mutation testing is proposed ... should consult with a qualified healthcare professional . About Selection Criteria for Links Data Files & API Site Map ...
Cunha, Burke A; Syed, Uzma; Stroll, Stephanie; Mickail, Nardeen; Laguerre, Marianne
2009-01-01
In spring 2009, a novel strain of influenza A originating in Veracruz, Mexico, quickly spread to the United States and throughout the world. This influenza A virus was the product of gene reassortment of 4 different genetic elements: human influenza, swine influenza, avian influenza, and Eurasian swine influenza. In the United States, New York was the epicenter of the swine influenza (H1N1) pandemic. Hospital emergency departments (EDs) were inundated with patients with influenza-like illnesses (ILIs) requesting screening for H1N1. Our ED screening, as well as many others, used a rapid screening test for influenza A (QuickVue A/B) because H1N1 was a variant of influenza A. The definitive laboratory test i.e., RT-PCR for H1N1 was developed by the Centers for Disease Control (Atlanta, GA) and subsequently distributed to health departments. Because of the extraordinary volume of test requests, health authorities restricted reverse transcription polymerase chain reaction (RT-PCR) testing. Hence most EDs, including our own, were dependent on rapid influenza diagnostic tests (RIDTs) for swine influenza. A positive rapid influenza A test was usually predictive of RT-PCR H1N1 positivity, but the rapid influenza A screening test (QuickVue A/B) was associated with 30% false negatives. The inability to rely on RIDTs for H1N1 diagnosis resulted in underdiagnosing H1N1. Confronted with adults admitted with ILIs, negative RIDTs, and restricted RT-PCR testing, there was a critical need to develop clinical criteria to diagnose probable swine influenza H1N1 pneumonia. During the pandemic, the Infectious Disease Division at Winthrop-University Hospital developed clinical criteria for adult admitted patients with ILIs and negative RIDTs. Similar to the one developed for the clinical diagnosis of legionnaire's disease. The Winthrop-University Hospital Infectious Disease Division's diagnostic weighted point score system for swine influenza H1N1 pneumonia is based on key clinical and laboratory features. During the "herald" wave of the swine influenza H1N1 pandemic, the diagnostic weighted point score system accurately identified probable swine influenza H1N1 pneumonia and accurately differentiated swine influenza H1N1 pneumonia from ILIs and other viral and bacterial community-acquired pneumonias. In hospitalized adults with ILIs and negative RIDTs, the diagnostic weighted diagnostic point score system, may be used to make a presumptive clinical diagnosis of swine influenza H1N1 pneumonia.
Validation of measures from the smartphone sway balance application: a pilot study.
Patterson, Jeremy A; Amick, Ryan Z; Thummar, Tarunkumar; Rogers, Michael E
2014-04-01
A number of different balance assessment techniques are currently available and widely used. These include both subjective and objective assessments. The ability to provide quantitative measures of balance and posture is the benefit of objective tools, however these instruments are not generally utilized outside of research laboratory settings due to cost, complexity of operation, size, duration of assessment, and general practicality. The purpose of this pilot study was to assess the value and validity of using software developed to access the iPod and iPhone accelerometers output and translate that to the measurement of human balance. Thirty healthy college-aged individuals (13 male, 17 female; age = 26.1 ± 8.5 years) volunteered. Participants performed a static Athlete's Single Leg Test protocol for 10 sec, on a Biodex Balance System SD while concurrently utilizing a mobile device with balance software. Anterior/posterior stability was recorded using both devices, described as the displacement in degrees from level, and was termed the "balance score." There were no significant differences between the two reported balance scores (p = 0.818. Mean balance score on the balance platform was 1.41 ± 0.90, as compared to 1.38 ± 0.72 using the mobile device. There is a need for a valid, convenient, and cost-effective tool to objectively measure balance. Results of this study are promising, as balance score derived from the Smartphone accelerometers were consistent with balance scores obtained from a previously validated balance system. However, further investigation is necessary as this version of the mobile software only assessed balance in the anterior/posterior direction. Additionally, further testing is necessary on a healthy populations and as well as those with impairment of the motor control system. Level 2b (Observational study of validity)(1.)
Alvarez-Twose, I; González-de-Olano, D; Sánchez-Muñoz, L; Matito, A; Jara-Acevedo, M; Teodosio, C; García-Montero, A; Morgado, J M; Orfao, A; Escribano, L
2012-01-01
A variable percentage of patients with systemic mast cell (MC) activation symptoms meet criteria for systemic mastocytosis (SM). We prospectively evaluated the clinical utility of the REMA score versus serum baseline tryptase (sBt) levels for predicting MC clonality and SM in 158 patients with systemic MC activation symptoms in the absence of mastocytosis in the skin (MIS). World Health Organization criteria for SM were applied in all cases. MC clonality was defined as the presence of KIT-mutated MC or by a clonal HUMARA test. The REMA score consisted of the assignment of positive or negative points as follows: male (+1), female (-1), sBt <15 μg/l (-1) or >25 μg/l (+2), presence (-2) or absence (+1) of pruritus, hives or angioedema and presence (+3) of presyncope or syncope. Efficiency of the REMA score for predicting MC clonality and SM was assessed by receiver operating characteristic (ROC) curve analyses and compared to those obtained by means of sBt levels alone. Molecular studies revealed the presence of clonal MC in 68/80 SM cases and in 11/78 patients who did not meet the criteria for SM. ROC curve analyses confirmed the greater sensitivity and a similar specificity of the REMA score versus sBt levels (84 vs. 59% and 74 vs. 70% for MC clonality and 87 vs. 62% and 73 vs. 71% for SM, respectively). Our results confirm the clinical utility of the REMA score to predict MC clonality and SM in patients suffering from systemic MC activation symptoms without MIS. Copyright © 2011 S. Karger AG, Basel.
Functional Performance and Balance in the Oldest-Old.
Kafri, Michal; Hutzler, Yeshayahu; Korsensky, Olga; Laufer, Yocheved
2017-06-01
The group of individuals 85 years and over (termed oldest-old) is the fastest-growing population in the Western world. Although daily functional abilities and balance capabilities are known to decrease as an individual grows older, little is known about the balance and functional characteristics of the oldest-old population. The aims of this study were to characterize balance control, functional abilities, and balance self-efficacy in the oldest-old, to test the correlations between these constructs, and to explore differences between fallers and nonfallers in this age group. Forty-five individuals living in an assisted living facility who ambulated independently participated in the study. The mean age was 90.3 (3.7) years. Function was tested using the Late-Life Function and Disability Instrument (LLFDI). Balance was tested with the mini-Balance Evaluation System Test (mini-BESTest) and the Timed Up and Go (TUG) test. Balance self-efficacy was tested with the Activities-Specific Balance Confidence (ABC) scale. The mean total function LLFDI score was 63.2 (11.4). The mean mini-BESTest score was 69.8% (18.6%) and the mean TUG time was 12.6 (6.9) seconds. The mean ABC score was 80.2% (14.2%). Good correlation (r > 0.7) was observed between the ABC and the function component of the LLFDI, as well as with the lower extremity domains. Correlations between the mini-BESTest scores and the LLFDI were fair to moderate (r's range: 0.38-0.62). Age and ABC scores were significant independent explanators of LLFDI score (P = .0141 and P = .0009, respectively). Fallers and nonfallers differed significantly across all outcome measures scores, except for TUG and for the "Reactive Postural Control" and "Sensory Orientation" domains of the mini-BESTest. The results of this study provide normative data regarding the balance and functional abilities of the oldest-old, and indicate a strong association between self-efficacy and function. These results emphasize the importance of incorporating strategies that maintain and improve balance self-efficacy in interventions aimed at enhancing the functional level of this cohort.
Reliability and validity analysis of the open-source Chinese Foot and Ankle Outcome Score (FAOS).
Ling, Samuel K K; Chan, Vincent; Ho, Karen; Ling, Fona; Lui, T H
2017-12-21
Develop the first reliable and validated open-source outcome scoring system in the Chinese language for foot and ankle problems. Translation of the English FAOS into Chinese following regular protocols. First, two forward-translations were created separately, these were then combined into a preliminary version by an expert committee, and was subsequently back-translated into English. The process was repeated until the original and back translations were congruent. This version was then field tested on actual patients who provided feedback for modification. The final Chinese FAOS version was then tested for reliability and validity. Reliability analysis was performed on 20 subjects while validity analysis was performed on 50 subjects. Tools used to validate the Chinese FAOS were the SF36 and Pain Numeric Rating Scale (NRS). Internal consistency between the FAOS subgroups was measured using Cronbach's alpha. Spearman's correlation was calculated between each subgroup in the FAOS, SF36 and NRS. The Chinese FAOS passed both reliability and validity testing; meaning it is reliable, internally consistent and correlates positively with the SF36 and the NRS. The Chinese FAOS is a free, open-source scoring system that can be used to provide a relatively standardised outcome measure for foot and ankle studies. Copyright © 2017 Elsevier Ltd. All rights reserved.
Validity and reliability of Nintendo Wii Fit balance scores.
Wikstrom, Erik A
2012-01-01
Interactive gaming systems have the potential to help rehabilitate patients with musculoskeletal conditions. The Nintendo Wii Balance Board, which is part of the Wii Fit game, could be an effective tool to monitor progress during rehabilitation because the board and game can provide objective measures of balance. However, the validity and reliability of Wii Fit balance scores remain unknown. To determine the concurrent validity of balance scores produced by the Wii Fit game and the intrasession and intersession reliability of Wii Fit balance scores. Descriptive laboratory study. Sports medicine research laboratory. Forty-five recreationally active participants (age = 27.0 ± 9.8 years, height = 170.9 ± 9.2 cm, mass = 72.4 ± 11.8 kg) with a heterogeneous history of lower extremity injury. Participants completed a single-limb-stance task on a force plate and the Star Excursion Balance Test (SEBT) during the first test session. Twelve Wii Fit balance activities were completed during 2 test sessions separated by 1 week. Postural sway in the anteroposterior (AP) and mediolateral (ML) directions and the AP, ML, and resultant center-of-pressure (COP) excursions were calculated from the single-limb stance. The normalized reach distance was recorded for the anterior, posteromedial, and posterolateral directions of the SEBT. Wii Fit balance scores that the game software generated also were recorded. All 96 of the calculated correlation coefficients among Wii Fit activity outcomes and established balance outcomes were interpreted as poor (r < 0.50). Intrasession reliability for Wii Fit balance activity scores ranged from good (intraclass correlation coefficient [ICC] = 0.80) to poor (ICC = 0.39), with 8 activities having poor intrasession reliability. Similarly, 11 of the 12 Wii Fit balance activity scores demonstrated poor intersession reliability, with scores ranging from fair (ICC = 0.74) to poor (ICC = 0.29). Wii Fit balance activity scores had poor concurrent validity relative to COP outcomes and SEBT reach distances. In addition, the included Wii Fit balance activity scores generally had poor intrasession and intersession reliability.
Carrillo-Larco, Rodrigo M; Miranda, J Jaime; Gilman, Robert H; Medina-Lezama, Josefina; Chirinos-Pacheco, Julio A; Muñoz-Retamozo, Paola V; Smeeth, Liam; Checkley, William; Bernabe-Ortiz, Antonio
2017-11-29
Chronic Kidney Disease (CKD) represents a great burden for the patient and the health system, particularly if diagnosed at late stages. Consequently, tools to identify patients at high risk of having CKD are needed, particularly in limited-resources settings where laboratory facilities are scarce. This study aimed to develop a risk score for prevalent undiagnosed CKD using data from four settings in Peru: a complete risk score including all associated risk factors and another excluding laboratory-based variables. Cross-sectional study. We used two population-based studies: one for developing and internal validation (CRONICAS), and another (PREVENCION) for external validation. Risk factors included clinical- and laboratory-based variables, among others: sex, age, hypertension and obesity; and lipid profile, anemia and glucose metabolism. The outcome was undiagnosed CKD: eGFR < 60 ml/min/1.73m 2 . We tested the performance of the risk scores using the area under the receiver operating characteristic (ROC) curve, sensitivity, specificity, positive/negative predictive values and positive/negative likelihood ratios. Participants in both studies averaged 57.7 years old, and over 50% were females. Age, hypertension and anemia were strongly associated with undiagnosed CKD. In the external validation, at a cut-off point of 2, the complete and laboratory-free risk scores performed similarly well with a ROC area of 76.2% and 76.0%, respectively (P = 0.784). The best assessment parameter of these risk scores was their negative predictive value: 99.1% and 99.0% for the complete and laboratory-free, respectively. The developed risk scores showed a moderate performance as a screening test. People with a score of ≥ 2 points should undergo further testing to rule out CKD. Using the laboratory-free risk score is a practical approach in developing countries where laboratories are not readily available and undiagnosed CKD has significant morbidity and mortality.
Harris, L K; Whay, H R; Murrell, J C
2018-04-01
This study investigated the effects of osteoarthritis (OA) on somatosensory processing in dogs using mechanical threshold testing. A pressure algometer was used to measure mechanical thresholds in 27 dogs with presumed hind limb osteoarthritis and 28 healthy dogs. Mechanical thresholds were measured at the stifles, radii and sternum, and were correlated with scores from an owner questionnaire and a clinical checklist, a scoring system that quantified clinical signs of osteoarthritis. The effects of age and bodyweight on mechanical thresholds were also investigated. Multiple regression models indicated that, when bodyweight was taken into account, dogs with presumed osteoarthritis had lower mechanical thresholds at the stifles than control dogs, but not at other sites. Non-parametric correlations showed that clinical checklist scores and questionnaire scores were negatively correlated with mechanical thresholds at the stifles. The results suggest that mechanical threshold testing using a pressure algometer can detect primary, and possibly secondary, hyperalgesia in dogs with presumed osteoarthritis. This suggests that the mechanical threshold testing protocol used in this study might facilitate assessment of somatosensory changes associated with disease progression or response to treatment. Copyright © 2017. Published by Elsevier Ltd.
Walters, Glenn D
2017-12-01
There is some consensus on the value of cognitive-behaviourally informed interventions in the criminal justice system, but uncertainty about which components are of critical value. To test the hypothesis that change in prisoners - criminal thinking and institutional misconduct - will both follow completion of a brief cognitive behavioural intervention. A one-group pre-test-post-test quasi-experimental design was used to assess change on the General Criminal Thinking (GCT) scale of the Psychological Inventory of Criminal Thinking Styles among 219 male prisoners completing a 10-week cognitive behavioural intervention, referred to as 'Lifestyle Issues'. Institutional misconduct was measured for 1 year prior to completion of the course and 2 years subsequently. Using variable-oriented analysis, post-test GCT scores were compared with change in prison conduct, controlling for the pre-test thinking scores. Calculations were repeated by using person-oriented analysis. Prisoners who displayed a drop in GCT scores between pre-test and post-test levels were significantly more likely to show a reduction in prison misconduct, whereas prison misconduct was likely to escalate among those who displayed a rise in criminal thinking scores from pre-test to post-test. These findings must still be regarded as preliminary, but taken together with other work and with cognitive behavioural theory, they suggest that development of more prosocial thinking and abilities may have an early beneficial effect on institutional behaviour. Their measurement may offer a practical way in which men could be assessed for readiness to return to the community. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Kim, Bong Hyun; Kim, Kyuseok; Nam, Hae Jeong
2017-01-31
Many previous studies of electroacupuncture used combined therapy of electroacupuncture and systemic manual acupuncture, so it was uncertain which treatment was effective. This study evaluated and compared the effects of systemic manual acupuncture, periauricular electroacupuncture and distal electroacupuncture for treating patients with tinnitus. A randomized, parallel, open-labeled exploratory trial was conducted. Subjects aged 20-75 years who had suffered from idiopathic tinnitus for > 2 weeks were recruited from May 2013 to April 2014. The subjects were divided into three groups by systemic manual acupuncture group (MA), periauricular electroacupuncture group (PE), and distal electroacupuncture group (DE). The groups were selected by random drawing. Nine acupoints (TE 17, TE21, SI19, GB2, GB8, ST36, ST37, TE3 and TE9), two periauricular acupoints (TE17 and TE21), and four distal acupoints (TE3, TE9, ST36, and ST37) were selected. The treatment sessions were performed twice weekly for a total of eight sessions over 4 weeks. Outcomes were the tinnitus handicap inventory (THI) score and the loud and uncomfortable visual analogue scales (VAS). Demographic and clinical characteristics of all participants were compared between the groups upon admission using one-way analysis of variance (ANOVA). One-way ANOVA was used to evaluate the THI, VAS loud , and VAS uncomfortable scores. The least significant difference test was used as a post-hoc test. Thirty-nine subjects were eligible and their data were analyzed. No difference in THI and VAS loudness scores was observed in between groups. The VAS uncomfortable scores decreased significantly in MA and DE compared with those in PE. Within the group, all three treatments showed some effect on THI, VAS loudness scores and VAS uncomfortable scores after treatment except DE in THI. There was no statistically significant difference between systemic manual acupuncture, periauricular electroacupuncture and distal electroacupuncture in tinnitus. However, all three treatments had some effect on tinnitus within the group before and after treatment. Systemic manual acupuncture and distal electroacupuncture have some effect on VAS uncomfortable . KCT0001991 by CRIS (Clinical Research Information Service), 2016-8-1, retrospectively registered.
Rosselli, M; Ardila, A; Bateman, J R; Guzmán, M
2001-01-01
Limited information is currently available about performance of Spanish-speaking children on different neuropsychological tests. This study was designed to (a) analyze the effects of age and sex on different neuropsychological test scores of a randomly selected sample of Spanish-speaking children, (b) analyze the value of neuropsychological test scores for predicting school performance, and (c) describe the neuropsychological profile of Spanish-speaking children with learning disabilities (LD). Two hundred ninety (141 boys, 149 girls) 6- to 11-year-old children were selected from a school in Bogotá, Colombia. Three age groups were distinguished: 6- to 7-, 8- to 9-, and 10- to 11-year-olds. Performance was measured utilizing the following neuropsychological tests: Seashore Rhythm Test, Finger Tapping Test (FTT), Grooved Pegboard Test, Children's Category Test (CCT), California Verbal Learning Test-Children's Version (CVLT-C), Benton Visual Retention Test (BVRT), and Bateria Woodcock Psicoeducativa en Español (Woodcock, 1982). Normative scores were calculated. Age effect was significant for most of the test scores. A significant sex effect was observed for 3 test scores. Intercorrelations were performed between neuropsychological test scores and academic areas (science, mathematics, Spanish, social studies, and music). In a post hoc analysis, children presenting very low scores on the reading, writing, and arithmetic achievement scales of the Woodcock battery were identified in the sample, and their neuropsychological test scores were compared with a matched normal group. Finally, a comparison was made between Colombian and American norms.
Van Norman, Ethan R; Nelson, Peter M; Klingbeil, David A
2017-09-01
Educators need recommendations to improve screening practices without limiting students' instructional opportunities. Repurposing previous years' state test scores has shown promise in identifying at-risk students within multitiered systems of support. However, researchers have not directly compared the diagnostic accuracy of previous years' state test scores with data collected during fall screening periods to identify at-risk students. In addition, the benefit of using previous state test scores in conjunction with data from a separate measure to identify at-risk students has not been explored. The diagnostic accuracy of 3 types of screening approaches were tested to predict proficiency on end-of-year high-stakes assessments: state test data obtained during the previous year, data from a different measure administered in the fall, and both measures combined (i.e., a gated model). Extant reading and math data (N = 2,996) from 10 schools in the Midwest were analyzed. When used alone, both measures yielded similar sensitivity and specificity values. The gated model yielded superior specificity values compared with using either measure alone, at the expense of sensitivity. Implications, limitations, and ideas for future research are discussed. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
What Makes Nations Intelligent?
Hunt, Earl
2012-05-01
Modern society is driven by the use of cognitive artifacts: physical instruments or styles of reasoning that amplify our ability to think. The artifacts range from writing systems to computers. In everyday life, a person demonstrates intelligence by showing skill in using these artifacts. Intelligence tests and their surrogates force examinees to exhibit some of these skills but not others. This is why test scores correlate substantially but not perfectly with a variety of measures of socioeconomic success. The same thing is true at the international level. Nations can be evaluated by the extent to which their citizens score well on cognitive tests, including both avowed intelligence tests and a variety of tests of academic achievement. The resulting scores are substantially correlated with various indices of national wealth, health, environmental quality, and schooling and with a vaguer variable, social commitment to innovation. These environmental variables are suggested as causes of the differences in general cognitive skills between national populations. It is conceivable that differences in gene pools also contribute to international and, within nations, group differences in cognitive skills, but at present it is impossible to evaluate the extent of genetic influences. © The Author(s) 2012.
Anticipating and Incorporating Stakeholder Feedback When Developing Value-Added Models
ERIC Educational Resources Information Center
Balch, Ryan; Koedel, Cory
2014-01-01
State and local education agencies across the United States are increasingly adopting rigorous teacher evaluation systems. Most systems formally incorporate teacher performance as measured by student test-score growth, sometimes by state mandate. An important consideration that will influence the long-term persistence and efficacy of these systems…
The Impact of Flagging on the Admission Process.
ERIC Educational Resources Information Center
Cahalan-Laitusis, Cara; Mandinach, Ellen B.; Camara, Wayne J.
2003-01-01
Study explored issues surrounding flagging test scores taken under non-standard conditions and how the admission process could better serve students with disabilities. Respondents to survey felt current system was not adequately serving subgroups of students, believing some non-disabled students were manipulating the system to gain an advantage on…
Sargénius, Hanna L; Bylsma, Frederick W; Lydersen, Stian; Hestad, Knut
2017-01-01
The aims of this study were to investigate visual-construction and organizational strategy among individuals with severe obesity, as measured by the Rey Complex Figure Test (RCFT), and to examine the validity of the Q-score as a measure for the quality of performance on the RCFT. Ninety-six non-demented morbidly obese (MO) patients and 100 healthy controls (HC) completed the RCFT. Their performance was calculated by applying the standard scoring criteria. The quality of the copying process was evaluated per the directions of the Q-score scoring system. Results revealed that the MO did not perform significantly lower than the HC on Copy accuracy (mean difference -0.302, CI -1.374 to 0.769, p = 0.579). In contrast, the groups did statistically differ from each other, with MO performing poorer than the HC on the Q-score (mean -1.784, CI -3.237 to -0.331, p = 0.016) and the Unit points (mean -1.409, CI -2.291 to -0.528, p = 0.002), but not on the Order points score (mean -0.351, CI -0.994 to 0.293, p = 0.284). Differences on the Unit score and the Q-score were slightly reduced when adjusting for gender, age, and education. This study presents evidence supporting the presence of inefficiency in visuospatial constructional ability among MO patients. We believe we have found an indication that the Q-score captures a wider range of cognitive processes that are not described by traditional scoring methods. Rather than considering accuracy and placement of the different elements only, the Q-score focuses more on how the subject has approached the task.
SPIDERplan: A tool to support decision-making in radiation therapy treatment plan assessment.
Ventura, Tiago; Lopes, Maria do Carmo; Ferreira, Brigida Costa; Khouri, Leila
2016-01-01
In this work, a graphical method for radiotherapy treatment plan assessment and comparison, named SPIDERplan, is proposed. It aims to support plan approval allowing independent and consistent comparisons of different treatment techniques, algorithms or treatment planning systems. Optimized plans from modern radiotherapy are not easy to evaluate and compare because of their inherent multicriterial nature. The clinical decision on the best treatment plan is mostly based on subjective options. SPIDERplan combines a graphical analysis with a scoring index. Customized radar plots based on the categorization of structures into groups and on the determination of individual structures scores are generated. To each group and structure, an angular amplitude is assigned expressing the clinical importance defined by the radiation oncologist. Completing the graphical evaluation, a global plan score, based on the structures score and their clinical weights, is determined. After a necessary clinical validation of the group weights, SPIDERplan efficacy, to compare and rank different plans, was tested through a planning exercise where plans had been generated for a nasal cavity case using different treatment planning systems. SPIDERplan method was applied to the dose metrics achieved by the nasal cavity test plans. The generated diagrams and scores successfully ranked the plans according to the prescribed dose objectives and constraints and the radiation oncologist priorities, after a necessary clinical validation process. SPIDERplan enables a fast and consistent evaluation of plan quality considering all targets and organs at risk.
A Standardized DNA Variant Scoring System for Pathogenicity Assessments in Mendelian Disorders
Karbassi, Izabela; Maston, Glenn A.; Love, Angela; DiVincenzo, Christina; Braastad, Corey D.; Elzinga, Christopher D.; Bright, Alison R.; Previte, Domenic; Zhang, Ke; Rowland, Charles M.; McCarthy, Michele; Lapierre, Jennifer L.; Dubois, Felicita; Medeiros, Katelyn A.; Batish, Sat Dev; Jones, Jeffrey; Liaquat, Khalida; Hoffman, Carol A.; Jaremko, Malgorzata; Wang, Zhenyuan; Sun, Weimin; Buller‐Burckle, Arlene; Strom, Charles M.; Keiles, Steven B.
2015-01-01
ABSTRACT We developed a rules‐based scoring system to classify DNA variants into five categories including pathogenic, likely pathogenic, variant of uncertain significance (VUS), likely benign, and benign. Over 16,500 pathogenicity assessments on 11,894 variants from 338 genes were analyzed for pathogenicity based on prediction tools, population frequency, co‐occurrence, segregation, and functional studies collected from internal and external sources. Scores were calculated by trained scientists using a quantitative framework that assigned differential weighting to these five types of data. We performed descriptive and comparative statistics on the dataset and tested interobserver concordance among the trained scientists. Private variants defined as variants found within single families (n = 5,182), were either VUS (80.5%; n = 4,169) or likely pathogenic (19.5%; n = 1,013). The remaining variants (n = 6,712) were VUS (38.4%; n = 2,577) or likely benign/benign (34.7%; n = 2,327) or likely pathogenic/pathogenic (26.9%, n = 1,808). Exact agreement between the trained scientists on the final variant score was 98.5% [95% confidence interval (CI) (98.0, 98.9)] with an interobserver consistency of 97% [95% CI (91.5, 99.4)]. Variant scores were stable and showed increasing odds of being in agreement with new data when re‐evaluated periodically. This carefully curated, standardized variant pathogenicity scoring system provides reliable pathogenicity scores for DNA variants encountered in a clinical laboratory setting. PMID:26467025
A Standardized DNA Variant Scoring System for Pathogenicity Assessments in Mendelian Disorders.
Karbassi, Izabela; Maston, Glenn A; Love, Angela; DiVincenzo, Christina; Braastad, Corey D; Elzinga, Christopher D; Bright, Alison R; Previte, Domenic; Zhang, Ke; Rowland, Charles M; McCarthy, Michele; Lapierre, Jennifer L; Dubois, Felicita; Medeiros, Katelyn A; Batish, Sat Dev; Jones, Jeffrey; Liaquat, Khalida; Hoffman, Carol A; Jaremko, Malgorzata; Wang, Zhenyuan; Sun, Weimin; Buller-Burckle, Arlene; Strom, Charles M; Keiles, Steven B; Higgins, Joseph J
2016-01-01
We developed a rules-based scoring system to classify DNA variants into five categories including pathogenic, likely pathogenic, variant of uncertain significance (VUS), likely benign, and benign. Over 16,500 pathogenicity assessments on 11,894 variants from 338 genes were analyzed for pathogenicity based on prediction tools, population frequency, co-occurrence, segregation, and functional studies collected from internal and external sources. Scores were calculated by trained scientists using a quantitative framework that assigned differential weighting to these five types of data. We performed descriptive and comparative statistics on the dataset and tested interobserver concordance among the trained scientists. Private variants defined as variants found within single families (n = 5,182), were either VUS (80.5%; n = 4,169) or likely pathogenic (19.5%; n = 1,013). The remaining variants (n = 6,712) were VUS (38.4%; n = 2,577) or likely benign/benign (34.7%; n = 2,327) or likely pathogenic/pathogenic (26.9%, n = 1,808). Exact agreement between the trained scientists on the final variant score was 98.5% [95% confidence interval (CI) (98.0, 98.9)] with an interobserver consistency of 97% [95% CI (91.5, 99.4)]. Variant scores were stable and showed increasing odds of being in agreement with new data when re-evaluated periodically. This carefully curated, standardized variant pathogenicity scoring system provides reliable pathogenicity scores for DNA variants encountered in a clinical laboratory setting. © 2015 The Authors. **Human Mutation published by Wiley Periodicals, Inc.
ERIC Educational Resources Information Center
Powers, Donald; Schedl, Mary; Papageorgiou, Spiros
2017-01-01
The aim of this study was to develop, for the benefit of both test takers and test score users, enhanced "TOEFL ITP"® test score reports that go beyond the simple numerical scores that are currently reported. To do so, we applied traditional scale anchoring (proficiency scaling) to item difficulty data in order to develop performance…
Physical Education and Its Effect on Elementary Testing Results
ERIC Educational Resources Information Center
Tremarche, Pamela V.; Robinson, Ellyn M.; Graham, Louise B.
2007-01-01
This study was designed to determine the impact of increased quality Physical Education time on Massachusetts Comprehensive Assessment System (MCAS) standardized scores. The MCAS test was given to 311 fourth-grade students in two Southeastern communities in Massachusetts, within a two-month period in April and May of 2001. The participants were…
21 CFR 866.6050 - Ovarian adnexal mass assessment score test system.
Code of Federal Regulations, 2013 CFR
2013-04-01
... surgery is planned, is malignant. The test is for adjunctive use, in the context of a negative primary clinical and radiological evaluation, to augment the identification of patients whose gynecologic surgery... § 866.1(e). (c) Black box warning. Under section 520(e) of the Federal Food, Drug, and Cosmetic Act...
A Student Data Base: An Aid to Student Selection, Program Evaluation, and Management Decision Making
ERIC Educational Resources Information Center
And Others; Maynard, Diane
1974-01-01
The authors outline a proposed student information system incorporating a cross-section of student characteristics to provide a basis for longitudinal analysis and an examination of changes in students. (Data might include standard biographical information, achievement test scores, and information obtained from a required test battery in…
Alignment of Standards and Assessments as an Accountability Criterion. ERIC Digest.
ERIC Educational Resources Information Center
La Marca, Paul M.
This digest provides an overview of the concept of alignment and the role it plays in assessment and accountability systems. It also discusses methodological issues affecting the study of alignment and explores the relationship between alignment and test score interpretation. Alignment refers to the degree of match between test content and subject…
Hackethal, A; Immenroth, M; Bürger, T
2006-04-01
The Minimally Invasive Surgical Trainer-Virtual Reality (MIST-VR) simulator is validated for laparoscopy training, but benchmarks and target scores for assessing single tasks are needed. Control data for the MIST-VR traversal task scenario were collected from 61 novices who performed the task 10 times over 3 days (1 h daily). Data were collected on the time taken, error score, economy of movement, and total score. Test differences were analyzed through percentage scores and t-tests for paired samples. Improvement was greatest over tests 1 to 5 (improvement: test(1.2), 38.07%; p = 0.000; test(4.5), 10.66%; p = 0.010): between tests 5 and 10, improvement slowed and scores stabilized. Variation in participants' performance fell steadily over the 10 tests. Trainees should perform at least 10 tests of the traversal task-five to get used to the equipment and task (automation phase; target total score, 95.16) and five to stabilize and consolidate performance (test 10 target total score, 74.11).
Merriman, W J; Barnett, B E
1995-12-01
This study was undertaken to explore the relationship between language skills and gross-motor skills of 28 preschool children from two private preschools in New York City. Pearson product-moment correlation coefficients were calculated for language (revised Preschool Language Scale) and gross motor (Test of Gross Motor Development) scores. Locomotor skills were significantly related to both auditory comprehension and verbal ability while object control scores did not correlate significantly with either language score. These results were discussed in terms of previous research and with reference to dynamical systems theory. Suggestions for research were made.
Personality type of the glaucoma patient.
Lim, Michele C; Shiba, Diana R; Clark, Ingrid J; Kim, Daniel Y; Styles, Douglas E; Brandt, James D; Watnik, Mitchell R; Barthelow, Isaac J
2007-12-01
To characterize the personality profile of glaucoma subjects. One hundred eight subjects including 56 open-angle glaucoma (OAG) and 52 controls were given the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) test and all performed automated perimetry. Clinical and demographic information which could relate to personality type was collected. OAG subjects had significantly higher Hypochondriasis (Hs; P=0.0082), Hysteria (Hy; P=0.0056), and Health Concerns (HEA; P=0.0025) mean scores than the control group. OAG subjects also had a significantly greater frequency of clinically abnormal score for hysteria (P=0.0262), and health concerns (P=0.0018). Multivariate analysis of variance revealed that Hypochondriasis, Hysteria, and Health Concerns scores were related to number of systemic medications used and to diagnostic group. Other potential explanatory variables such as sex, ethnicity, number of medical problems, length of glaucoma diagnosis, occurrence of glaucoma surgery, intraocular pressure, and visual status (logMAR, visual field indices) were not related to these personality scores. Patients with a diagnosis of OAG had more abnormal MMPI-2 scores in areas that focus upon concerns of somatic complaints and poor health. The use of systemic medications, which may be a constant reminder of illness, is a factor that may contribute to higher MMPI-2 scores.
Brunet, Jennifer; Valette, Xavier; Buklas, Dimitrios; Lehoux, Philippe; Verrier, Pierre; Sauneuf, Bertrand; Ivascau, Calin; Dalibert, Yves; Seguin, Amélie; Terzi, Nicolas; Babatasi, Gérard; du Cheyron, Damien; Parienti, Jean-Jacques; Daubin, Cédric
2017-07-01
We aimed to test the performance of PRESERVE and RESP scores to predict death in patients with severe ARDS receiving extracorporeal membrane oxygenation (ECMO) with different case mixes. All consecutive patients treated with ECMO for refractory ARDS, regardless of cause, in the Caen University Hospital in northwestern France over the last decade were included in a retrospective cohort study. The receiver operating characteristic curves of each score were plotted, and the area under the curve was computed to assess their performance in predicting mortality (c-index). Forty-one subjects were included. Pre-ECMO ventilator settings were: mean V T , 6.1 ± 0.9 mL/kg; breathing frequency, 32 ± 4 breaths/min; PEEP, 11 ± 4 cm H 2 O; peak inspiratory pressure, 48 ± 9 cm H 2 O; plateau pressure, 30.4 ± 4.4 cm H 2 O. At ECMO initiation, blood gas results were: pH 7.22 ± 0.17, P aO 2 /F IO 2 = 63 ± 22 mm Hg; P aCO 2 = 56 ± 18 mm Hg; F IO 2 = 99 ± 2%. Pre-ECMO data were available in 35 and 27 subjects for calculation of the PRESERVE score and RESP score, respectively. Pre-ECMO scoring system results were: median PRESERVE score, 4 (interquartile range 2-5), and median RESP score, 0 (interquartile range -2 to 2). Twenty-three subjects (56%) died, including 19 receiving ECMO. In univariate analysis, plateau pressure ( P = .031), driving pressure ( P = <.001), and compliance ( P = .02) recorded at the time of ECMO initiation as well as the PRESERVE score ( P = .032) were significantly associated with mortality. With a c-index of 0.69 (95% CI 0.53-0.87), the PRESERVE score had better discrimination than the RESP score (c-index of 0.60 [95% CI 0.41-0.78]) for predicting mortality. The use of these scores in helping physicians to determine the patients with ARDS most likely to benefit from ECMO should be limited in clinical practice because of their relatively poor performance in predicting death in subjects with severe ARDS receiving ECMO support. Before widespread use is initiated, these scoring systems should be tested in large prospective studies of subjects with severe ARDS undergoing ECMO treatment. Copyright © 2017 by Daedalus Enterprises.
A practical scoring system to predict mortality in patients with perforated peptic ulcer.
Menekse, Ebru; Kocer, Belma; Topcu, Ramazan; Olmez, Aydemir; Tez, Mesut; Kayaalp, Cuneyt
2015-01-01
The mortality rate of perforated peptic ulcer is still high particularly for aged patients and all the existing scoring systems to predict mortality are complicated or based on history taking which is not always reliable for elderly patients. This study's aim was to develop an easy and applicable scoring system to predict mortality based on hospital admission data. Total 227 patients operated for perforated peptic ulcer in two centers were included. All data that may be potential predictors with respect to hospital mortality were retrospectively analyzed. The mortality and morbidity rates were 10.1% and 24.2%, respectively. Multivariated analysis pointed out three parameters corresponding 1 point for each which were age >65 years, albumin ≤1,5 g/dl and BUN >45 mg/dl. Its prediction rate was high with 0,931 (95% CI, 0,890 to 0,961) value of AUC. The hospital mortality rates for none, one, two and three positive results were zero, 7.1%, 34.4% and 88.9%, respectively. Because the new system consists only age and routinely measured two simple laboratory tests (albumin and BUN), its application is easy and prediction power is satisfactory. Verification of this new scoring system is required by large scale multicenter studies.
Sepsis mortality prediction with the Quotient Basis Kernel.
Ribas Ripoll, Vicent J; Vellido, Alfredo; Romero, Enrique; Ruiz-Rodríguez, Juan Carlos
2014-05-01
This paper presents an algorithm to assess the risk of death in patients with sepsis. Sepsis is a common clinical syndrome in the intensive care unit (ICU) that can lead to severe sepsis, a severe state of septic shock or multi-organ failure. The proposed algorithm may be implemented as part of a clinical decision support system that can be used in combination with the scores deployed in the ICU to improve the accuracy, sensitivity and specificity of mortality prediction for patients with sepsis. In this paper, we used the Simplified Acute Physiology Score (SAPS) for ICU patients and the Sequential Organ Failure Assessment (SOFA) to build our kernels and algorithms. In the proposed method, we embed the available data in a suitable feature space and use algorithms based on linear algebra, geometry and statistics for inference. We present a simplified version of the Fisher kernel (practical Fisher kernel for multinomial distributions), as well as a novel kernel that we named the Quotient Basis Kernel (QBK). These kernels are used as the basis for mortality prediction using soft-margin support vector machines. The two new kernels presented are compared against other generative kernels based on the Jensen-Shannon metric (centred, exponential and inverse) and other widely used kernels (linear, polynomial and Gaussian). Clinical relevance is also evaluated by comparing these results with logistic regression and the standard clinical prediction method based on the initial SAPS score. As described in this paper, we tested the new methods via cross-validation with a cohort of 400 test patients. The results obtained using our methods compare favourably with those obtained using alternative kernels (80.18% accuracy for the QBK) and the standard clinical prediction method, which are based on the basal SAPS score or logistic regression (71.32% and 71.55%, respectively). The QBK presented a sensitivity and specificity of 79.34% and 83.24%, which outperformed the other kernels analysed, logistic regression and the standard clinical prediction method based on the basal SAPS score. Several scoring systems for patients with sepsis have been introduced and developed over the last 30 years. They allow for the assessment of the severity of disease and provide an estimate of in-hospital mortality. Physiology-based scoring systems are applied to critically ill patients and have a number of advantages over diagnosis-based systems. Severity score systems are often used to stratify critically ill patients for possible inclusion in clinical trials. In this paper, we present an effective algorithm that combines both scoring methodologies for the assessment of death in patients with sepsis that can be used to improve the sensitivity and specificity of the currently available methods. Copyright © 2014 Elsevier B.V. All rights reserved.
Estimating Total-Test Scores from Partial Scores in a Matrix Sampling Design.
ERIC Educational Resources Information Center
Sachar, Jane; Suppes, Patrick
1980-01-01
The present study compared six methods, two of which utilize the content structure of items, to estimate total-test scores using 450 students and 60 items of the 110-item Stanford Mental Arithmetic Test. Three methods yielded fairly good estimates of the total-test score. (Author/RL)
Reliable scar scoring system to assess photographs of burn patients.
Mecott, Gabriel A; Finnerty, Celeste C; Herndon, David N; Al-Mousawi, Ahmed M; Branski, Ludwik K; Hegde, Sachin; Kraft, Robert; Williams, Felicia N; Maldonado, Susana A; Rivero, Haidy G; Rodriguez-Escobar, Noe; Jeschke, Marc G
2015-12-01
Several scar-scoring scales exist to clinically monitor burn scar development and maturation. Although scoring scars through direct clinical examination is ideal, scars must sometimes be scored from photographs. No scar scale currently exists for the latter purpose. We modified a previously described scar scale (Yeong et al., J Burn Care Rehabil 1997) and tested the reliability of this new scale in assessing burn scars from photographs. The new scale consisted of three parameters as follows: scar height, surface appearance, and color mismatch. Each parameter was assigned a score of 1 (best) to 4 (worst), generating a total score of 3-12. Five physicians with burns training scored 120 representative photographs using the original and modified scales. Reliability was analyzed using coefficient of agreement, Cronbach alpha, intraclass correlation coefficient, variance, and coefficient of variance. Analysis of variance was performed using the Kruskal-Wallis test. Color mismatch and scar height scores were validated by analyzing actual height and color differences. The intraclass correlation coefficient, the coefficient of agreement, and Cronbach alpha were higher for the modified scale than those of the original scale. The original scale produced more variance than that in the modified scale. Subanalysis demonstrated that, for all categories, the modified scale had greater correlation and reliability than the original scale. The correlation between color mismatch scores and actual color differences was 0.84 and between scar height scores and actual height was 0.81. The modified scar scale is a simple, reliable, and useful scale for evaluating photographs of burn patients. Copyright © 2015 Elsevier Inc. All rights reserved.
A standardized test battery for the study of synesthesia
Eagleman, David M.; Kagan, Arielle D.; Nelson, Stephanie S.; Sagaram, Deepak; Sarma, Anand K.
2014-01-01
Synesthesia is an unusual condition in which stimulation of one modality evokes sensation or experience in another modality. Although discussed in the literature well over a century ago, synesthesia slipped out of the scientific spotlight for decades because of the difficulty in verifying and quantifying private perceptual experiences. In recent years, the study of synesthesia has enjoyed a renaissance due to the introduction of tests that demonstrate the reality of the condition, its automatic and involuntary nature, and its measurable perceptual consequences. However, while several research groups now study synesthesia, there is no single protocol for comparing, contrasting and pooling synesthetic subjects across these groups. There is no standard battery of tests, no quantifiable scoring system, and no standard phrasing of questions. Additionally, the tests that exist offer no means for data comparison. To remedy this deficit we have devised the Synesthesia Battery. This unified collection of tests is freely accessible online (http://www.synesthete.org). It consists of a questionnaire and several online software programs, and test results are immediately available for use by synesthetes and invited researchers. Performance on the tests is quantified with a standard scoring system. We introduce several novel tests here, and offer the software for running the tests. By presenting standardized procedures for testing and comparing subjects, this endeavor hopes to speed scientific progress in synesthesia research. PMID:16919755
Joshi, Shreedhar S; Anthony, G; Manasa, D; Ashwini, T; Jagadeesh, A M; Borde, Deepak P; Bhat, Seetharam; Manjunath, C N
2014-01-01
To validate Aristotle basic complexity and Aristotle comprehensive complexity (ABC and ACC) and risk adjustment in congenital heart surgery-1 (RACHS-1) prediction models for in hospital mortality after surgery for congenital heart disease in a single surgical unit. Patients younger than 18 years, who had undergone surgery for congenital heart diseases from July 2007 to July 2013 were enrolled. Scoring for ABC and ACC scoring and assigning to RACHS-1 categories were done retrospectively from retrieved case files. Discriminative power of scoring systems was assessed with area under curve (AUC) of receiver operating curves (ROC). Calibration (test for goodness of fit of the model) was measured with Hosmer-Lemeshow modification of χ2 test. Net reclassification improvement (NRI) and integrated discrimination improvement (IDI) were applied to assess reclassification. A total of 1150 cases were assessed with an all-cause in-hospital mortality rate of 7.91%. When modeled for multivariate regression analysis, the ABC (χ2 = 8.24, P = 0.08), ACC (χ2 = 4.17 , P = 0.57) and RACHS-1 (χ2 = 2.13 , P = 0.14) scores showed good overall performance. The AUC was 0.677 with 95% confidence interval (CI) of 0.61-0.73 for ABC score, 0.704 (95% CI: 0.64-0.76) for ACC score and for RACHS-1 it was 0.607 (95%CI: 0.55-0.66). ACC had an improved predictability in comparison to RACHS-1 and ABC on analysis with NRI and IDI. ACC predicted mortality better than ABC and RCAHS-1 models. A national database will help in developing predictive models unique to our populations, till then, ACC scoring model can be used to analyze individual performances and compare with other institutes.
The creation, management, and use of data quality information for life cycle assessment.
Edelen, Ashley; Ingwersen, Wesley W
2018-04-01
Despite growing access to data, questions of "best fit" data and the appropriate use of results in supporting decision making still plague the life cycle assessment (LCA) community. This discussion paper addresses revisions to assessing data quality captured in a new US Environmental Protection Agency guidance document as well as additional recommendations on data quality creation, management, and use in LCA databases and studies. Existing data quality systems and approaches in LCA were reviewed and tested. The evaluations resulted in a revision to a commonly used pedigree matrix, for which flow and process level data quality indicators are described, more clarity for scoring criteria, and further guidance on interpretation are given. Increased training for practitioners on data quality application and its limits are recommended. A multi-faceted approach to data quality assessment utilizing the pedigree method alongside uncertainty analysis in result interpretation is recommended. A method of data quality score aggregation is proposed and recommendations for usage of data quality scores in existing data are made to enable improved use of data quality scores in LCA results interpretation. Roles for data generators, data repositories, and data users are described in LCA data quality management. Guidance is provided on using data with data quality scores from other systems alongside data with scores from the new system. The new pedigree matrix and recommended data quality aggregation procedure can now be implemented in openLCA software. Additional ways in which data quality assessment might be improved and expanded are described. Interoperability efforts in LCA data should focus on descriptors to enable user scoring of data quality rather than translation of existing scores. Developing and using data quality indicators for additional dimensions of LCA data, and automation of data quality scoring through metadata extraction and comparison to goal and scope are needed.
Diagnostic accuracy of sleep bruxism scoring in absence of audio-video recording: a pilot study.
Carra, Maria Clotilde; Huynh, Nelly; Lavigne, Gilles J
2015-03-01
Based on the most recent polysomnographic (PSG) research diagnostic criteria, sleep bruxism is diagnosed when >2 rhythmic masticatory muscle activity (RMMA)/h of sleep are scored on the masseter and/or temporalis muscles. These criteria have not yet been validated for portable PSG systems. This pilot study aimed to assess the diagnostic accuracy of scoring sleep bruxism in absence of audio-video recordings. Ten subjects (mean age 24.7 ± 2.2) with a clinical diagnosis of sleep bruxism spent one night in the sleep laboratory. PSG were performed with a portable system (type 2) while audio-video was recorded. Sleep studies were scored by the same examiner three times: (1) without, (2) with, and (3) without audio-video in order to test the intra-scoring and intra-examiner reliability for RMMA scoring. The RMMA event-by-event concordance rate between scoring without audio-video and with audio-video was 68.3 %. Overall, the RMMA index was overestimated by 23.8 % without audio-video. However, the intra-class correlation coefficient (ICC) between scorings with and without audio-video was good (ICC = 0.91; p < 0.001); the intra-examiner reliability was high (ICC = 0.97; p < 0.001). The clinical diagnosis of sleep bruxism was confirmed in 8/10 subjects based on scoring without audio-video and in 6/10 subjects with audio-video. Although the absence of audio-video recording, the diagnostic accuracy of assessing RMMA with portable PSG systems appeared to remain good, supporting their use for both research and clinical purposes. However, the risk of moderate overestimation in absence of audio-video must be taken into account.
Adjorlolo, Samuel
2018-06-01
The sociocultural differences between Western and sub-Saharan African countries make it imperative to standardize neuropsychological tests in the latter. However, Western-normed tests are frequently administered in sub-Saharan Africa because of challenges hampering standardization efforts. Yet a salient topical issue in the cross-cultural neuropsychology literature relates to the utility of Western-normed neuropsychological tests in minority groups, non-Caucasians, and by extension Ghanaians. Consequently, this study investigates the diagnostic accuracy, sensitivity, and specificity of executive function (EF) tests (The Stroop Test, Trail Making Test, and Controlled Oral Word Association Test), and a Revised Quick Cognitive Screening Test (RQCST) in a sample of 50 patients diagnosed with moderate traumatic brain injury and 50 healthy controls in Ghana. The EF test scores showed good diagnostic accuracy, with area under the curve (AUC) values of the Trail Making Test scores ranging from .746 to .902. With respect to the Stroop Test scores, the AUC values ranged from .793 to .898, while Controlled Oral Word Association Test had AUC value of .787. The RQCST scores discriminated between the groups, with AUC values ranging from .674 to .912. The AUC values of composite EF score and a neuropsychological score created from EF and RQCST scores were .936 and. 942, respectively. Additionally, the Stroop Test, Trail Making Test, EF composite score, and RQCST scores showed good to excellent sensitivities and specificities. In general, this study has shown that commonly used EF tests in Western countries have diagnostic accuracy, sensitivity, and specificity when administered in Ghanaian samples. The findings and implications of the study are discussed.
Serizawa, Toru; Higuchi, Yoshinori; Nagano, Osamu; Hirai, Tatsuo; Ono, Junichi; Saeki, Naokatsu; Miyakawa, Akifumi
2012-12-01
The authors conducted validity testing of the 5 major reported indices for radiosurgically treated brain metastases- the original Radiation Therapy Oncology Group's Recursive Partitioning Analysis (RPA), the Score Index for Radiosurgery in Brain Metastases (SIR), the Basic Score for Brain Metastases (BSBM), the Graded Prognostic Assessment (GPA), and the subclassification of RPA Class II proposed by Yamamoto-in nearly 2500 cases treated with Gamma Knife surgery (GKS), focusing on the preservation of neurological function as well as the traditional endpoint of overall survival. The authors analyzed data from 2445 cases treated with GKS by the first author (T.S.), the primary surgeon. The patient group consisted of 1716 patients treated between January 1998 and March 2008 (the Chiba series) and 729 patients treated between April 2008 and December 2011 (the Tokyo series). The interval from the date of GKS until the date of the patient's death (overall survival) and impaired activities of daily living (qualitative survival) were calculated using the Kaplan-Meier method, while the absolute risk for two adjacent classes of each grading system and both hazard ratios and 95% confidence intervals were estimated using the Cox proportional hazards model. For overall survival, there were highly statistically significant differences between each two adjacent patient groups characterized by class or score (all p values < 0.001), except for GPA Scores 3.5-4.0 and 3.0. The SIR showed the best statistical results for predicting preservation of neurological function. Although no other grading systems yielded statistically significant differences in qualitative survival, the BSBM and the modified RPA appeared to be better than the original RPA and GPA. The modified RPA subclassification, proposed by Yamamoto, is well balanced in scoring simplicity with respect to case number distribution and statistical results for overall survival. However, a new or revised grading system is necessary for predicting qualitative survival and for selecting the optimal treatment for patients with brain metastasis treated by GKS.
Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions
ERIC Educational Resources Information Center
Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J.
2010-01-01
Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…
Using arborescences to estimate hierarchicalness in directed complex networks
2018-01-01
Complex networks are a useful tool for the understanding of complex systems. One of the emerging properties of such systems is their tendency to form hierarchies: networks can be organized in levels, with nodes in each level exerting control on the ones beneath them. In this paper, we focus on the problem of estimating how hierarchical a directed network is. We propose a structural argument: a network has a strong top-down organization if we need to delete only few edges to reduce it to a perfect hierarchy—an arborescence. In an arborescence, all edges point away from the root and there are no horizontal connections, both characteristics we desire in our idealization of what a perfect hierarchy requires. We test our arborescence score in synthetic and real-world directed networks against the current state of the art in hierarchy detection: agony, flow hierarchy and global reaching centrality. These tests highlight that our arborescence score is intuitive and we can visualize it; it is able to better distinguish between networks with and without a hierarchical structure; it agrees the most with the literature about the hierarchy of well-studied complex systems; and it is not just a score, but it provides an overall scheme of the underlying hierarchy of any directed complex network. PMID:29381761
Inoue, Ayako; Oshita, Harumi; Maruyama, Yoshihiro; Tanaka, Yoshihiro; Ishitobi, Yoshinobu; Kawano, Aimi; Ikeda, Rie; Ando, Tomoko; Aizawa, Saeko; Masuda, Koji; Higuma, Haruka; Kanehisa, Masayuki; Ninomiya, Taiga; Akiyoshi, Jotaro
2015-07-30
Borderline personality disorder (BPD) is characterized by affective instability, unstable relationships, and identity disturbance. We measured salivary alpha-amylase (sAA) and salivary cortisol levels in all participants during exposure to the Trier Social Stress Test (TSST) and an electric stimulation stress. Seventy-two BPD patients were compared with 377 age- and gender- matched controls. The State and Trait versions of the Spielberger Anxiety Inventory test (STAI-S and STAI-T, respectively), the Profile of Mood State (POMS) tests, and the Beck Depression Inventory (BDI), the Depression and Anxiety Cognition Scale (DACS) were administered to participants before electrical stimulation. Following TSST exposure, salivary cortisol levels significantly decreased in female patients and significantly increased in male patients compared with controls. POMS tension-anxiety, depression-dejection, anger-hostility, fatigue, and confusion scores were significantly increased in BPD patients compared with controls. In contrast, vigor scores were significantly decreased in BPD patients relative to controls. Furthermore, STAI-T and STAI-S anxiety scores and BDI scores were significantly increased in BPD patient compared with controls. DACS scores were significantly increased in BPD patient compared with controls. Different stressors (e.g., psychological or physical) induced different responses in the HPA and SAM systems in female or male BPD patients. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Shiha, G; Seif, S; Eldesoky, A; Elbasiony, M; Soliman, R; Metwally, A; Zalata, K; Mikhail, N
2017-05-01
A simple non-invasive score (Fibrofast, FIB-5) was developed using five routine laboratory tests (ALT, AST, alkaline phosphatase, albumin and platelets count) for the detection of significant hepatic fibrosis in patients with chronic hepatitis C. The FIB-4 index is a non-invasive test for the assessment of liver fibrosis, and a score of ≤1.45 enables the correct identification of patients who have non-significant (F0-1) from significant fibrosis (F2-4), and could avoid liver biopsy. The aim of this study was to compare the performance characteristics of FIB-5 and FIB-4 to differentiate between non-significant and significant fibrosis. A cross-sectional study included 604 chronic HCV patients. All liver biopsies were scored using the METAVIR system. Both FIB-5 and FIB-4 scores were measured and the performance characteristics were calculated using the ROC curve. The performance characteristics of FIB-5 at ≥7.5 and FIB-4 at ≤1.45 for the differentiation between non-significant fibrosis and significant fibrosis were: specificity 94.4%, PPV 85.7%, and specificity 54.9%, PPV 55.7% respectively. FIB-5 score at the new cutoff is superior to FIB-4 index for the differentiation between non-significant and significant fibrosis.
Healthcare teams as complex adaptive systems: Focus on interpersonal interaction.
Pype, Peter; Krystallidou, Demi; Deveugele, Myriam; Mertens, Fien; Rubinelli, Sara; Devisch, Ignaas
2017-11-01
The aim of this study is to test the feasibility of a tool to objectify the functioning of healthcare teams operating in the complexity zone, and to evaluate its usefulness in identifying areas for team quality improvement. We distributed The Complex Adaptive Leadership (CAL™) Organisational Capability Questionnaire (OCQ) to all members of one palliative care team (n=15) and to palliative care physicians in Flanders, Belgium (n=15). Group discussions were held on feasibility aspects and on the low scoring topics. Data was analysed calculating descriptive statistics (sum score, mean and standard deviation). The one sample T-Test was used to detect differences within each group. Both groups of participants reached mean scores ranging from good to excellent. The one sample T test showed statistically significant differences between participants' sum scores within each group (p<0,001). Group discussion led to suggestions for quality improvement e.g. enhanced feedback strategies between team members. The questionnaire used in our study shows to be a feasible and useful instrument for the evaluation of the palliative care teams' day-to-day operations and to identify areas for quality improvement. The CAL™OCQ is a promising instrument to evaluate any healthcare team functioning. A group discussion on the questionnaire scores can serve as a starting point to identify targets for quality improvement initiatives. Copyright © 2017 Elsevier B.V. All rights reserved.
Measuring and Analyzing Cognitive Skills at the Platoon Level
1990-03-01
s~jt tnheted ofe cursJsfe n h ai fcma fiinv 1fie trea’,; .)f ult\\ u01 this rsearch r t detrmed whics ariismot inunTes de cisoe makie abilites...admiinister the test to a group of Non -Comnmissioned Officers from Fort Ord. California. The test is 2iven once at the beginning of the Basic Non ...C. MTHOPOLEM............................................3 11. TEST DESIGN AND SCORING SYSTEM...........................5 A. TEST SCENARIO
ERIC Educational Resources Information Center
Lowe, Patricia A.
2015-01-01
The present study examined measurement invariance across gender and gender differences on two measures of test anxiety developed for U.S. middle and high school, and college students. It was hypothesized that measurement invariance and gender differences would be found on the two measures of test anxiety, suggesting no separate scoring system is…
Topographic characterisation of dental implants for commercial use.
Mendoza-Arnau, A; Vallecillo-Capilla, M-F; Cabrerizo-Vílchez, M-Á; Rosales-Leal, J-I
2016-09-01
To characterize the surface topography of several dental implants for commercial use. Dental implants analyzed were Certain (Biomet 3i), Tissue Level (Straumann), Interna (BTI), MG-InHex (MozoGrau), SPI (Alphabio) and Hikelt (Bioner). Surface topography was ascertained using a confocal microscope with white light. Roughness parameters obtained were: Ra, Rq, Rv, Rp, Rt, Rsk and Rku. The results were analysed using single-factor ANOVA and Student-Neuman-Keuls (p<0.05) tests. Certain and Hikelt obtained the highest Ra and Rq scores, followed by Tissue Level. Interna and SPI obtained lower scores, and MG-InHex obtained the lowest score. Rv scores followed the same trend. Certain obtained the highest Rp score, followed by SPI and Hikelt, then Interna and Tissue Level. MG-InHex obtained the lowest scores. Certain obtained the highest Rt score, followed by Interna and Hikelt, then SPI and Tissue Level. The lowest scores were for MG-InHex. Rsk was negative (punctured surface) in the MG-InHex, SPI and Tissue Level systems, and positive (pointed surface) in the other systems. Rku was higher than 3 (Leptokurtic) in Tissue Level, Interna, MG-InHex and SPI, and lower than 3 (Platykurtic) in Certain and Hikelt. The type of implant determines surface topography, and there are differences in the roughness parameters of the various makes of implants for clinical use.
Kwak, Jihoon; Genovesio, Auguste; Kang, Myungjoo; Hansen, Michael Adsett Edberg; Han, Sung-Jun
2015-01-01
Genotoxicity testing is an important component of toxicity assessment. As illustrated by the European registration, evaluation, authorization, and restriction of chemicals (REACH) directive, it concerns all the chemicals used in industry. The commonly used in vivo mammalian tests appear to be ill adapted to tackle the large compound sets involved, due to throughput, cost, and ethical issues. The somatic mutation and recombination test (SMART) represents a more scalable alternative, since it uses Drosophila, which develops faster and requires less infrastructure. Despite these advantages, the manual scoring of the hairs on Drosophila wings required for the SMART limits its usage. To overcome this limitation, we have developed an automated SMART readout. It consists of automated imaging, followed by an image analysis pipeline that measures individual wing genotoxicity scores. Finally, we have developed a wing score-based dose-dependency approach that can provide genotoxicity profiles. We have validated our method using 6 compounds, obtaining profiles almost identical to those obtained from manual measures, even for low-genotoxicity compounds such as urethane. The automated SMART, with its faster and more reliable readout, fulfills the need for a high-throughput in vivo test. The flexible imaging strategy we describe and the analysis tools we provide should facilitate the optimization and dissemination of our methods. PMID:25830368
Pulse wave velocity and cognitive function in older adults.
Zhong, Wenjun; Cruickshanks, Karen J; Schubert, Carla R; Carlsson, Cynthia M; Chappell, Richard J; Klein, Barbara E K; Klein, Ronald; Acher, Charles W
2014-01-01
Arterial stiffness may be associated with cognitive function. In this study, pulse wave velocity (PWV) was measured from the carotid to femoral (CF-PWV) and from the carotid to radial (CR-PWV) with the Complior SP System. Cognitive function was measured by 6 tests of executive function, psychomotor speed, memory, and language fluency. A total of 1433 participants were included (mean age 75 y, 43% men). Adjusting for age, sex, education, pulse rate, hemoglobin A1C, high-density lipoprotein cholesterol, hypertension, cardiovascular disease history, smoking, drinking, and depression symptoms, a CF-PWV>12 m/s was associated with a lower Mini-Mental State Examination score (coefficient: -0.31, SE: 0.11, P=0.005), fewer words recalled on Auditory Verbal Learning Test (coefficient: -1.10, SE: 0.43, P=0.01), and lower score on the composite cognition score (coefficient: -0.10, SE: 0.05, P=0.04) and marginally significantly associated with longer time to complete Trail Making Test-part B (coefficient: 6.30, SE: 3.41, P=0.06), CF-PWV was not associated with Trail Making Test-part A, Digit Symbol Substation Test, or Verbal Fluency Test. No associations were found between CR-PWV and cognitive performance measures. Higher large artery stiffness was associated with worse cognitive function, and longitudinal studies are needed to confirm these associations.
Jaiprakash, Heethal; Min, Aung Ko Ko; Ghosh, Sarmishtha
2016-03-01
This paper is aimed at finding if there was a change of correlation between the written test score and tutors' performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group's tutors did not receive tutor training; while the second group's tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors' performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors' scores in group 1 was 0.099 (p<0.001) and for group 2 was 0.305 (p<0.001). The higher correlation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course.
Confidence Intervals for Weighted Composite Scores under the Compound Binomial Error Model
ERIC Educational Resources Information Center
Kim, Kyung Yong; Lee, Won-Chan
2018-01-01
Reporting confidence intervals with test scores helps test users make important decisions about examinees by providing information about the precision of test scores. Although a variety of estimation procedures based on the binomial error model are available for computing intervals for test scores, these procedures assume that items are randomly…
Is the NIHSS Certification Process Too Lenient?
Hills, Nancy K.; Josephson, S. Andrew; Lyden, Patrick D.; Johnston, S. Claiborne
2009-01-01
Background and Purpose The National Institutes of Health Stroke Scale (NIHSS) is a widely used measure of neurological function in clinical trials and patient assessment; inter-rater scoring variability could impact communications and trial power. The manner in which the rater certification test is scored yields multiple correct answers that have changed over time. We examined the range of possible total NIHSS scores from answers given in certification tests by over 7,000 individual raters who were certified. Methods We analyzed the results of all raters who completed one of two standard multiple-patient videotaped certification examinations between 1998 and 2004. The range for the correct score, calculated using NIHSS ‘correct answers’, was determined for each patient. The distribution of scores derived from those who passed the certification test then was examined. Results A total of 6,268 raters scored 5 patients on Test 1; 1,240 scored 6 patients on Test 2. Using a National Stroke Association (NSA) answer key, we found that correct total scores ranged from 2 correct scores to as many as 12 different correct total scores. Among raters who achieved a passing score and were therefore qualified to administer the NIHSS, score distributions were even wider, with 1 certification patient receiving 18 different correct total scores. Conclusions Allowing multiple acceptable answers for questions on the NIHSS certification test introduces scoring variability. It seems reasonable to assume that the wider the range of acceptable answers in the certification test, the greater the variability in the performance of the test in trials and clinical practice by certified examiners. Greater consistency may be achieved by deriving a set of ‘best’ answers through expert consensus on all questions where this is possible, then teaching raters how to derive these answers using a required interactive training module. PMID:19295205
Pelvic-floor strength in women with incontinence as assessed by the brink scale.
FitzGerald, Mary P; Burgio, Kathryn L; Borello-France, Diane F; Menefee, Shawn A; Schaffer, Joseph; Kraus, Stephen; Mallett, Veronica T; Xu, Yan
2007-10-01
The purpose of this study was to describe how clinical pelvic-floor muscle (PFM) strength (force-generating capacity) is related to patient characteristics, lower urinary tract symptoms, and fecal incontinence symptoms. Data were obtained from 643 women who were participating in a randomized surgical trial for treatment of stress urinary incontinence. Patient demographic variables, baseline urinary and fecal incontinence symptom questionnaires, urodynamic data and urinary diary data, pad test results, and standardized assessment of pelvic organ support were compared with PFM strength as described by the Brink scoring system. Bivariate analysis of factors associated with the Brink scale score was done using analysis of variance and linear regression. Multivariate analysis included patient variables that were significant on bivariate analysis. The mean Brink scale score was 9 (SD=2) and did not vary widely in this large, but highly select, patient sample. We found a weak, but statistically strong, relationship between age and Brink score. Brink scores were not related to diary and pad test measures of incontinence severity. Overall, PFM strength was good in this sample of women with stress incontinence. Scores tended to be similar, and it is possible that the Brink scale does not reflect real clinical differences in PFM strength.
Sex Differences in Vestibular/Ocular and Neurocognitive Outcomes After Sport-Related Concussion.
Sufrinko, Alicia M; Mucha, Anne; Covassin, Tracey; Marchetti, Greg; Elbin, R J; Collins, Michael W; Kontos, Anthony P
2017-03-01
To examine sex differences in vestibular and oculomotor symptoms and impairment in athletes with sport-related concussion (SRC). The secondary purpose was to replicate previously reported sex differences in total concussion symptoms, and performance on neurocognitive and balance testing. Prospective cross-sectional study of consecutively enrolled clinic patients within 21 days of a SRC. Specialty Concussion Clinic. Included male (n = 36) and female (n = 28) athletes ages 9 to 18 years. Vestibular symptoms and impairment was measured with the Vestibular/Ocular Motor Screening (VOMS). Participants completed the Immediate Post-concussion Assessment and Cognitive Test (ImPACT), Post-concussion Symptom Scale (PCSS), and Balance Error Scoring System (BESS). Sex differences on clinical measures. Females had higher PCSS scores (P = 0.01) and greater VOMS vestibular ocular reflex (VOR) score (P = 0.01) compared with males. There were no sex differences on BESS or ImPACT. Total PCSS scores together with female sex accounted for 45% of the variance in VOR scores. Findings suggest higher VOR scores after SRC in female compared with male athletes. Findings did not extend to other components of the VOMS tool suggesting that sex differences may be specific to certain types of vestibular impairment after SRC. Additional research on the clinical significance of the current findings is needed.
NCAP test improvements with pretensioners and load limiters.
Walz, Marie
2004-03-01
New Car Assessment Program (NCAP) test scores, measured by the United States Department of Transportation's (USDOT) National Highway Traffic Safety Administration (NHTSA), were analyzed in order to assess the benefits of equipping safety belt systems with pretensioners and load limiters. Safety belt pretensioners retract the safety belt almost instantly in a crash to remove excess slack. They tie the occupant to the vehicle's deceleration early during the crash, reducing the peak load experienced by the occupant. Load limiters and other energy management systems allow safety belts to yield in a crash, preventing the shoulder belt from directing too much energy on the chest of the occupant. In NCAP tests, vehicles are crashed into a fixed barrier at 35 mph. During the test, instruments measure the accelerations of the head and chest, as well as the force on the legs of anthropomorphic dummies secured in the vehicle by safety belts. NCAP data from model year 1998 through 2001 cars and light trucks were examined. The combination of pretensioners and load limiters is estimated to reduce Head Injury Criterion (HIC) by 232, chest acceleration by an average of 6.6 g's, and chest deflection (displacement) by 10.6 mm, for drivers and right front passengers. The unit used to measure chest acceleration (g) is defined as a unit of force equal to the force exerted by gravity. All of these reductions are statistically significant. When looked at individually, pretensioners are more effective in reducing HIC scores for both drivers and right front passengers, as well as chest acceleration and chest deflection scores for drivers. Load limiters show greater reductions in chest acceleration and chest deflection scores for right front passengers. By contrast, in make-models for which neither load limiters nor pretensioners have been added, there is little change during 1998 to 2001 in HIC, chest acceleration, or chest deflection values in NCAP tests.
De Gori, Marco; Adamczewski, Benjamin; Jenny, Jean-Yves
2017-06-01
The purpose of the study was to use the cumulative summation (CUSUM) test to assess the learning curve during the introduction of a new surgical technique (patient-specific instrumentation) in total knee arthroplasty (TKA) in an academic department. The first 50TKAs operated on at an academic department using patient-specific templates (PSTs) were scheduled to enter the study. All patients had a preoperative computed tomography scan evaluation to plan bone resections. The PSTs were positioned intraoperatively according to the best-fit technique and their three-dimensional orientation was recorded by a navigation system. The position of the femur and tibia PST was compared to the planned position for four items for each component: coronal and sagittal orientation, medial and lateral height of resection. Items were summarized to obtain knee, femur and tibia PST scores, respectively. These scores were plotted according to chronological order and included in a CUSUM analysis. The tested hypothesis was that the PST process for TKA was immediately under control after its introduction. CUSUM test showed that positioning of the PST significantly differed from the target throughout the study. There was a significant difference between all scores and the maximal score. No case obtained the maximal score of eight points. The study was interrupted after 20 cases because of this negative evaluation. The CUSUM test is effective in monitoring the learning curve when introducing a new surgical procedure. Introducing PST for TKA in an academic department may be associated with a long-lasting learning curve. The study was registered on the clinical.gov website (Identifier NCT02429245). Copyright © 2017 Elsevier B.V. All rights reserved.
Liinamo, A E; Karjalainen, L; Ojala, M; Vilva, V
1997-03-01
Data from field trials of Finnish Hounds between 1988 and 1992 in Finland were used to estimate genetic parameters and environmental effects for measures of hunting performance using REML procedures and an animal model. The original data set included 28,791 field trial records from 5,666 dogs. Males and females had equal hunting performance, whereas experience acquired by age improved trial results compared with results for young dogs (P < .001). Results were mostly better on snow than on bare ground (P < .001), and testing areas, years, months, and their interactions affected results (P < .001). Estimates of heritabilities and repeatabilities were low for most of the 28 measures, mainly due to large residual variances. The highest heritabilities were for frequency of tonguing (h2 = .15), pursuit score (h2 = .13), tongue score (h2 = .13), ghost trailing score (h2 = .12), and merit and final score (both h2 = .11). Estimates of phenotypic and genetic correlations were positive and moderate or high for search scores, pursuit scores, and final scores but lower for other studied measures. The results suggest that, due to low heritabilities, evaluation of breeding values for Finnish Hounds with respect to their hunting ability should be based on animal model BLUP methods instead of mere performance testing. The evaluation system of field trials should also be revised for more reliability.
Rathnakar, Surag Kajoor; Vishnu, Vikram Hubbanageri; Muniyappa, Shridhar; Prasath, Arun
2017-02-01
Acute Pancreatitis (AP) is one of the common conditions encountered in the emergency room. The course of the disease ranges from mild form to severe acute form. Most of these episodes are mild and spontaneously subsiding within 3 to 5 days. In contrast, Severe Acute Pancreatitis (SAP) occurring in around 15-20% of all cases, mortality can range between 10 to 85% across various centres and countries. In such a situation we need an indicator which can predict the outcome of an attack, as severe or mild, as early as possible and such an indicator should be sensitive and specific enough to trust upon. PANC-3 scoring is such a scoring system in predicting the outcome of an attack of AP. To assess the accuracy and predictability of PANC-3 scoring system over APACHE II in predicting severity in an attack of AP. This prospective study was conducted on 82 patients admitted with the diagnosis of pancreatitis. Investigations to evaluate PANC-3 and APACHE II were done on all the patients and the PANC-3 and APACHE II score was calculated. PANC-3 score has a sensitivity of 82.6% and specificity of 77.9%, the test had a Positive Predictive Value (PPV) of 0.59 and Negative Predictive Value (NPV) of 0.92. Sensitivity of APACHE II in predicting SAP was 91.3% and specificity was 96.6% with PPV of 0.91, NPV was 0.96. Our study shows that PANC-3 can be used to predict the severity of pancreatitis as efficiently as APACHE II. The interpretation of PANC-3 does not need expertise and can be applied at the time of admission which is an advantage when compared to classical scoring systems.
Adkin, A; Brouwer, A; Simons, R R L; Smith, R P; Arnold, M E; Broughan, J; Kosmider, R; Downs, S H
2016-01-01
Identifying and ranking cattle herds with a higher risk of being or becoming infected on known risk factors can help target farm biosecurity, surveillance schemes and reduce spread through animal trading. This paper describes a quantitative approach to develop risk scores, based on the probability of infection in a herd with bovine tuberculosis (bTB), to be used in a risk-based trading (RBT) scheme in England and Wales. To produce a practical scoring system the risk factors included need to be simple and quick to understand, sufficiently informative and derived from centralised national databases to enable verification and assess compliance. A logistic regression identified herd history of bTB, local bTB prevalence, herd size and movements of animals onto farms in batches from high risk areas as being significantly associated with the probability of bTB infection on farm. Risk factors were assigned points using the estimated odds ratios to weight them. The farm risk score was defined as the sum of these individual points yielding a range from 1 to 5 and was calculated for each cattle farm that was trading animals in England and Wales at the start of a year. Within 12 months, of those farms tested, 30.3% of score 5 farms had a breakdown (sensitivity). Of farms scoring 1-4 only 5.4% incurred a breakdown (1-specificity). The use of this risk scoring system within RBT has the potential to reduce infected cattle movements; however, there are cost implications in ensuring that the information underpinning any system is accurate and up to date. Crown Copyright © 2015. Published by Elsevier B.V. All rights reserved.
Yi, Eunhee S; Boland, Jennifer M; Maleszewski, Joseph J; Roden, Anja C; Oliveira, Andre M; Aubry, Marie-Christine; Erickson-Johnson, Michele R; Caron, Bolette L; Li, Yan; Tang, Hui; Stoddard, Shawn; Wampfler, Jason; Kulig, Kimary; Yang, Ping
2011-03-01
Accurate, cost-effective methods for testing anaplastic lymphoma kinase gene rearrangement (ALK+) are needed to select patients with non-small cell lung carcinoma for ALK-inhibitor therapy. Fluorescent in situ hybridization (FISH) is used to detect ALK+, but it is expensive and not routinely available. We explored the potential of an immunohistochemistry (IHC) scoring system as an affordable, accessible approach. One hundred one samples were obtained from an enriched cohort of never-smokers with adenocarcinoma from the Mayo Clinic Lung Cancer Cohort. IHC was performed using the ALK1 monoclonal antibody with ADVANCE detection system (Dako) and FISH with dual-color, break-apart probe (Abbott Molecular) on formalin-fixed, paraffin-embedded tissue. Cases were assessed as IHC score 0 (no staining; n = 69), 1+ (faint cytoplasmic staining, n = 21), 2+ (moderate, smooth cytoplasmic staining; n = 3), or 3+ (intense, granular cytoplasmic staining in ≥10% of tumor cells; n = 8). All IHC 3+ cases were FISH+, whereas 1 of 3 IHC 2+ and 1 of 21 IHC 1+ cases were FISH+. All 69 IHC 0 cases were FISH-. Considering FISH a gold-standard reference in this study, sensitivity and specificity of IHC were 90 and 97.8%, respectively, when 2+ and 3+ were regarded as IHC positive and 0 and 1+ as IHC negative. IHC scoring correlates with FISH and may be a useful algorithm in testing ALK+ by FISH in non-small cell lung carcinoma, similar to human epidermal growth factor-2 testing in breast cancer. Further study is needed to validate this approach.
Spectral-Temporal Modulated Ripple Discrimination by Children With Cochlear Implants.
Landsberger, David M; Padilla, Monica; Martinez, Amy S; Eisenberg, Laurie S
A postlingually implanted adult typically develops hearing with an intact auditory system, followed by periods of deafness (or near deafness) and adaptation to the implant. For an early implanted child whose brain is highly plastic, the auditory system matures with consistent input from a cochlear implant. It is likely that the auditory system of early implanted cochlear implant users is fundamentally different than postlingually implanted adults. The purpose of this study is to compare the basic psychophysical capabilities and limitations of these two populations on a spectral resolution task to determine potential effects of early deprivation and plasticity. Performance on a spectral resolution task (Spectral-temporally Modulated Ripple Test [SMRT]) was measured for 20 bilaterally implanted, prelingually deafened children (between 5 and 13 years of age) and 20 hearing children within the same age range. Additionally, 15 bilaterally implanted, postlingually deafened adults, and 10 hearing adults were tested on the same task. Cochlear implant users (adults and children) were tested bilaterally, and with each ear alone. Hearing listeners (adults and children) were tested with the unprocessed SMRT and with a vocoded version that simulates an 8-channel cochlear implant. For children with normal hearing, a positive correlation was found between age and SMRT score for both the unprocessed and vocoded versions. Older hearing children performed similarly to hearing adults in both the unprocessed and vocoded test conditions. However, for children with cochlear implants, no significant relationship was found between SMRT score and chronological age, age at implantation, or years of implant experience. Performance by children with cochlear implants was poorer than performance by cochlear implanted adults. It was also found that children implanted sequentially tended to have better scores with the first implant compared with the second implant. This difference was not observed for adults. An additional finding was that SMRT score was negatively correlated with age for adults with implants. Results from this study suggest that basic psychophysical capabilities of early implanted children and postlingually implanted adults differ when assessed in the sound field using their personal implant processors. Because spectral resolution does not improve with age for early implanted children, it seems likely that the sparse representation of the signal provided by a cochlear implant limits spectral resolution development. These results are supported by the finding that postlingually implanted adults, whose auditory systems matured before the onset of hearing loss, perform significantly better than early implanted children on the spectral resolution test.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xie, Xueqian; Greuter, Marcel J. W.; Groen, Jaap M.
Purpose: Coronary artery calcium score, traditionally based on electrocardiography (ECG)-triggered computed tomography (CT), predicts cardiovascular risk. However, nontriggered CT is extensively utilized. The study-purpose is to evaluate the in vitro agreement in coronary calcium score between nontriggered thoracic CT and ECG-triggered cardiac CT.Methods: Three artificial coronary arteries containing calcifications of different densities (high, medium, and low), and sizes (large, medium, and small), were studied in a moving cardiac phantom. Two 64-detector CT systems were used. The phantom moved at 0–90 mm/s in nontriggered low-dose CT as index test, and at 0–30 mm/s in ECG-triggered CT as reference. Differences in calciummore » scores between nontriggered and ECG-triggered CT were analyzed by t-test and 95% confidence interval. The sensitivity to detect calcification was calculated as the percentage of positive calcium scores.Results: Overall, calcium scores in nontriggered CT were not significantly different to those in ECG-triggered CT (p > 0.05). Calcium scores in nontriggered CT were within the 95% confidence interval of calcium scores in ECG-triggered CT, except predominantly at higher velocities (≥50 mm/s) for the high-density and large-size calcifications. The sensitivity for a nonzero calcium score was 100% for large calcifications, but 46%± 11% for small calcifications in nontriggered CT.Conclusions: When performing multiple measurements, good agreement in positive calcium scores is found between nontriggered thoracic and ECG-triggered cardiac CT. Agreement decreases with increasing coronary velocity. From this phantom study, it can be concluded that a high calcium score can be detected by nontriggered CT, and thus, that nontriggered CT likely can identify individuals at high risk of cardiovascular disease. On the other hand, a zero calcium score in nontriggered CT does not reliably exclude coronary calcification.« less
Melanoma detection using a mobile phone app
NASA Astrophysics Data System (ADS)
Diniz, Luciano E.; Ennser, K.
2016-03-01
Mobile phones have had their processing power greatly increased since their invention a few decades ago. As a direct result of Moore's Law, this improvement has made available several applications that were impossible before. The aim of this project is to develop a mobile phone app, integrated with its camera coupled to an amplifying lens, to help distinguish melanoma. The proposed device has the capability of processing skin mole images and suggesting, using a score system, if it is a case of melanoma or not. This score system is based on the ABCDE signs of melanoma, and takes into account the area, the perimeter and the colors present in the nevus. It was calibrated and tested using images from the PH2 Dermoscopic Image Database from Pedro Hispano Hospital. The results show that the system created can be useful, with an accuracy of up to 100% for malign cases and 80% for benign cases (including common and atypical moles), when used in the test group.
Introduction of the non-technical skills for surgeons (NOTSS) system in a Japanese cancer center.
Tsuburaya, Akira; Soma, Takahiro; Yoshikawa, Takaki; Cho, Haruhiko; Miki, Tamotsu; Uramatsu, Masashi; Fujisawa, Yoshikazu; Youngson, George; Yule, Steven
2016-12-01
Non-technical skills rating systems, which are designed to support surgical performance, have been introduced worldwide, but not officially in Japan. We performed a pilot study to evaluate the "non-technical skills for surgeons" (NOTSS) rating system in a major Japanese cancer center. Upper gastrointestinal surgeons were selected as trainers or trainees. The trainers attended a master-class on NOTSS, which included simulated demo-videos, to promote consistency across the assessments. The trainers thereafter commenced observing the trainees and whole teams, utilizing the NOTSS and "observational teamwork assessment for surgery" (OTAS) rating systems, before and after their education. Four trainers and six trainees were involved in this study. Test scores for understanding human factors and the NOTSS system were 5.89 ± 1.69 and 8.00 ± 1.32 before and after the e-learning, respectively (mean ± SD, p = 0.010). The OTAS scores for the whole team improved significantly after the trainees' education in five out of nine stages (p < 0.05). There were no differences in the NOTSS scores before and after education, with a small improvement in the total scores for the "teamwork and communication" and "leadership" categories. These findings demonstrate that implementing the NOTSS system is feasible in Japan. Education of both surgical trainers and trainees would contribute to better team performance.
Toffan, Adam; Alexander, Marion J L; Peeler, Jason
2017-07-28
The purpose of the study was to compare the most effective joint movements, segment velocities and body positions to perform the fastest and most accurate pass of high school and university football quarterbacks. Secondary purposes were to develop a quarterback throwing test to assess skill level, to determine which kinematic variables were different between high school and university athletes as well as to determine which variables were significant predictors of quarterback throwing test performance. Ten high school and ten university athletes were filmed for the study, performing nine passes at a target and two passes for maximum distance. Thirty variables were measured using Dartfish Team Pro 4.5.2 video analysis system, and Microsoft Excel was used for statistical analysis. University athletes scored slightly higher than the high school athletes on the throwing test, however this result was not statistically significant. Correlation analysis and forward stepwise multiple regression analysis was performed on both the high school players and the university players in order to determine which variables were significant predictors of throwing test score. Ball velocity was determined to have the strongest predictive effect on throwing test score (r = 0.900) for the high school athletes, however, position of the back foot at release was also determined to be important (r = 0.661) for the university group. Several significant differences in throwing technique between groups were noted during the pass, however, body position at release showed the greatest differences between the two groups. High school players could benefit from more complete weight transfer and decreased throw time to increase throwing test score. University athletes could benefit from increased throw time and greater range of motion in external shoulder rotation and trunk rotation to increase their throwing test score. Coaches and practitioners will be able to use the findings of this research to help improve these and related throwing variables in their high school and university quarterbacks.
State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Utah
ERIC Educational Resources Information Center
Center on Education Policy, 2010
2010-01-01
This paper profiles Utah's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) increased in grade 8 reading. In grade 4 reading, the percentage scoring proficient on the state test showed a…
State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP. Washington
ERIC Educational Resources Information Center
Center on Education Policy, 2010
2010-01-01
This paper profiles Washington's test score trends through 2008-09. Between 2005 and 2009, the percentages of students reaching the proficient level on the state test and the basic level on NAEP (National Assessment of Educational Progress) decreased in grade 4 reading. In grade 4 math, the percentage scoring proficient on the state test decreased…
ERIC Educational Resources Information Center
Goldhaber, Dan; Gratz, Trevor; Theobald, Roddy
2016-01-01
We investigate the predictive validity of teacher credential test scores for student performance in secondary STEM classrooms in Washington state. After replicating earlier findings that teacher basic skills licensure test scores are a modest and statistically significant predictor of student math test score gains in elementary grades, we focus on…
ERIC Educational Resources Information Center
Meijer, Rob R.
2004-01-01
Two new methods have been proposed to determine unexpected sum scores on sub-tests (testlets) both for paper-and-pencil tests and computer adaptive tests. A method based on a conservative bound using the hypergeometric distribution, denoted p, was compared with a method where the probability for each score combination was calculated using a…
Fabbiani, Massimiliano; Grima, Pierfrancesco; Milanini, Benedetta; Mondi, Annalisa; Baldonero, Eleonora; Ciccarelli, Nicoletta; Cauda, Roberto; Silveri, Maria C; De Luca, Andrea; Di Giambenedetto, Simona
2015-01-01
The aim of the study was to explore how viral resistance and antiretroviral central nervous system (CNS) penetration could impact on cognitive performance of HIV-infected patients. We performed a multicentre cross-sectional study enrolling HIV-infected patients undergoing neuropsychological testing, with a previous genotypic resistance test on plasma samples. CNS penetration-effectiveness (CPE) scores and genotypic susceptibility scores (GSS) were calculated for each regimen. A composite score (CPE-GSS) was then constructed. Factors associated with cognitive impairment were investigated by logistic regression analysis. A total of 215 patients were included. Mean CPE was 7.1 (95% CI 6.9, 7.3) with 206 (95.8%) patients showing a CPE≥6. GSS correction decreased the CPE value in 21.4% (mean 6.5, 95% CI 6.3, 6.7), 26.5% (mean 6.4, 95% CI 6.1, 6.6) and 24.2% (mean 6.4, 95% CI 6.2, 6.6) of subjects using ANRS, HIVDB and REGA rules, respectively. Overall, 66 (30.7%) patients were considered cognitively impaired. No significant association could be demonstrated between CPE and cognitive impairment. However, higher GSS-CPE was associated with a lower risk of cognitive impairment (CPE-GSSANRS odds ratio 0.75, P=0.022; CPE-GSSHIVDB odds ratio 0.77, P=0.038; CPE-GSSREGA odds ratio 0.78, P=0.038). Overall, a cutoff of CPE-GSS≥5 seemed the most discriminatory according to each different interpretation system. GSS-corrected CPE score showed a better correlation with neurocognitive performance than the standard CPE score. These results suggest that antiretroviral drug susceptibility, besides drug CNS penetration, can play a role in the control of HIV-associated neurocognitive disorders.
Wiltsey Stirman, Shannon; Marques, Luana; Creed, Torrey A; Gutner, Cassidy A; DeRubeis, Robert; Barnett, Paul G; Kuhn, Eric; Suvak, Michael; Owen, Jason; Vogt, Dawne; Jo, Booil; Schoenwald, Sonja; Johnson, Clara; Mallard, Kera; Beristianos, Matthew; La Bash, Heidi
2018-05-22
Identifying scalable strategies for assessing fidelity is a key challenge in implementation science. However, for psychosocial interventions, the existing, reliable ways to test treatment fidelity quality are often labor intensive, and less burdensome strategies may not reflect actual clinical practice. Cognitive behavioral therapies (CBTs) provide clinicians with a set of effective core elements to help treat a multitude of disorders, which, evidence suggests, need to be delivered with fidelity to maximize potential client impact. The current "gold standard" for rating CBTs is rating recordings of therapy sessions, which is extremely time-consuming and requires a substantial amount of initial training. Although CBTs can vary based on the target disorder, one common element employed in most CBTs is the use of worksheets to identify specific behaviors and thoughts that affect a client's ability to recover. The present study will develop and evaluate an innovative new approach to rate CBT fidelity, by developing a universal CBT scoring system based on worksheets completed in therapy sessions. To develop a scoring system for CBT worksheets, we will compile common CBT elements from a variety of CBT worksheets for a range of psychiatric disorders and create adherence and competence measures. We will collect archival worksheets from past studies to test the scoring system and assess test-retest reliability. To evaluate whether CBT worksheet scoring accurately reflects clinician fidelity, we will recruit clinicians who are engaged in a CBT for depression, anxiety, and/or posttraumatic stress disorder. Clinicians and clients will transmit routine therapy materials produced in session (e.g., worksheets, clinical notes, session recordings) to the study team after each session. We will compare observer-rated fidelity, clinical notes, and fidelity-rated worksheets to identify the most effective and efficient method to assess clinician fidelity. Clients will also be randomly assigned to either complete the CBT worksheets on paper forms or on a mobile application (app) to learn if worksheet format influences clinician and client experience or differs in terms of reflecting fidelity. Scoring fidelity using CBT worksheets may allow clinics to test fidelity in a short and effective manner, enhancing continuous quality improvement in the workplace. Clinicians and clinics can use such data to improve clinician fidelity in real time, leading to improved patient outcomes. ClinicalTrials.gov NCT03479398 . Retrospectively registered March 20, 2018.
Validation study of an electronic method of condensed outcomes tools reporting in orthopaedics.
Farr, Jack; Verma, Nikhil; Cole, Brian J
2013-12-01
Patient-reported outcomes (PRO) instruments are a vital source of data for evaluating the efficacy of medical treatments. Historically, outcomes instruments have been designed, validated, and implemented as paper-based questionnaires. The collection of paper-based outcomes information may result in patients becoming fatigued as they respond to redundant questions. This problem is exacerbated when multiple PRO measures are provided to a single patient. In addition, the management and analysis of data collected in paper format involves labor-intensive processes to score and render the data analyzable. Computer-based outcomes systems have the potential to mitigate these problems by reformatting multiple outcomes tools into a single, user-friendly tool.The study aimed to determine whether the electronic outcomes system presented produces results comparable with the test-retest correlations reported for the corresponding orthopedic paper-based outcomes instruments.The study is designed as a crossover study based on consecutive orthopaedic patients arriving at one of two designated orthopedic knee clinics.Patients were assigned to complete either a paper or a computer-administered questionnaire based on a similar set of questions (Knee injury and Osteoarthritis Outcome Score, International Knee Documentation Committee form, 36-Item Short Form survey, version 1, Lysholm Knee Scoring Scale). Each patient completed the same surveys using the other instrument, so that all patients had completed both paper and electronic versions. Correlations between the results from the two modes were studied and compared with test-retest data from the original validation studies.The original validation studies established test-retest reliability by computing correlation coefficients for two administrations of the paper instrument. Those correlation coefficients were all in the range of 0.7 to 0.9, which was deemed satisfactory. The present study computed correlation coefficients between the paper and electronic modes of administration. These correlation coefficients demonstrated similar results with an overall value of 0.86.On the basis of the correlation coefficients, the electronic application of commonly used knee outcome scores compare variably to the traditional paper variants with a high rate of test-retest correlation. This equivalence supports the use of the condensed electronic outcomes system and validates comparison of scores between electronic and paper modes. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
ERIC Educational Resources Information Center
Feldt, Leonard S.
2004-01-01
In some settings, the validity of a battery composite or a test score is enhanced by weighting some parts or items more heavily than others in the total score. This article describes methods of estimating the total score reliability coefficient when differential weights are used with items or parts.
Tennant, Alan; Küçükdeveci, Ayse A; Kutlay, Sehim; Elhan, Atilla H
2006-03-23
The Middlesex Elderly Assessment of Mental State (MEAMS) was developed as a screening test to detect cognitive impairment in the elderly. It includes 12 subtests, each having a 'pass score'. A series of tasks were undertaken to adapt the measure for use in the adult population in Turkey and to determine the validity of existing cut points for passing subtests, given the wide range of educational level in the Turkish population. This study focuses on identifying and validating the scoring system of the MEAMS for Turkish adult population. After the translation procedure, 350 normal subjects and 158 acquired brain injury patients were assessed by the Turkish version of MEAMS. Initially, appropriate pass scores for the normal population were determined through ANOVA post-hoc tests according to age, gender and education. Rasch analysis was then used to test the internal construct validity of the scale and the validity of the cut points for pass scores on the pooled data by using Differential Item Functioning (DIF) analysis within the framework of the Rasch model. Data with the initially modified pass scores were analyzed. DIF was found for certain subtests by age and education, but not for gender. Following this, pass scores were further adjusted and data re-fitted to the model. All subtests were found to fit the Rasch model (mean item fit 0.184, SD 0.319; person fit -0.224, SD 0.557) and DIF was then found to be absent. Thus the final pass scores for all subtests were determined. The MEAMS offers a valid assessment of cognitive state for the adult Turkish population, and the revised cut points accommodate for age and education. Further studies are required to ascertain the validity in different diagnostic groups.
Management of heart failure in the new era: the role of scores.
Mantegazza, Valentina; Badagliacca, Roberto; Nodari, Savina; Parati, Gianfranco; Lombardi, Carolina; Di Somma, Salvatore; Carluccio, Erberto; Dini, Frank Lloyd; Correale, Michele; Magrì, Damiano; Agostoni, Piergiuseppe
2016-08-01
Heart failure is a widespread syndrome involving several organs, still characterized by high mortality and morbidity, and whose clinical course is heterogeneous and hardly predictable.In this scenario, the assessment of heart failure prognosis represents a fundamental step in clinical practice. A single parameter is always unable to provide a very precise prognosis. Therefore, risk scores based on multiple parameters have been introduced, but their clinical utility is still modest. In this review, we evaluated several prognostic models for acute, right, chronic, and end-stage heart failure based on multiple parameters. In particular, for chronic heart failure we considered risk scores essentially based on clinical evaluation, comorbidities analysis, baroreflex sensitivity, heart rate variability, sleep disorders, laboratory tests, echocardiographic imaging, and cardiopulmonary exercise test parameters. What is at present established is that a single parameter is not sufficient for an accurate prediction of prognosis in heart failure because of the complex nature of the disease. However, none of the scoring systems available is widely used, being in some cases complex, not user-friendly, or based on expensive or not easily available parameters. We believe that multiparametric scores for risk assessment in heart failure are promising but their widespread use needs to be experienced.
Pompeu, J E; Arduini, L A; Botelho, A R; Fonseca, M B F; Pompeu, S M A A; Torriani-Pasin, C; Deutsch, J E
2014-06-01
To assess the feasibility, safety and outcomes of playing Microsoft Kinect Adventures™ for people with Parkinson's disease in order to guide the design of a randomised clinical trial. Single-group, blinded trial. Rehabilitation Center of São Camilo University, Brazil. Seven patients (six males, one female) with Parkinson's disease (Hoehn and Yahr Stages 2 and 3). Fourteen 60-minute sessions, three times per week, playing four games of Kinect Adventures! The feasibility and safety outcomes were patients' game performance and adverse events, respectively. The clinical outcomes were the 6-minute walk test, Balance Evaluation System Test, Dynamic Gait Index and Parkinson's Disease Questionnaire (PDQ-39). Patients' scores for the four games showed improvement. The mean [standard deviation (SD)] scores in the first and last sessions of the Space Pop game were 151 (36) and 198 (29), respectively [mean (SD) difference 47 (7), 95% confidence interval 15 to 79]. There were no adverse events. Improvements were also seen in the 6-minute walk test, Balance Evaluation System Test, Dynamic Gait Index and PDQ-39 following training. Kinect-based training was safe and feasible for people with Parkinson's disease (Hoehn and Yahr Stages 2 and 3). Patients improved their scores for all four games. No serious adverse events occurred during training with Kinect Adventures!, which promoted improvement in activities (balance and gait), body functions (cardiopulmonary aptitude) and participation (quality of life). Copyright © 2013 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.
Isaia, Federica; Gyurko, Robert; Roomian, Tamar C; Hawley, Charles E
2018-04-06
The Root Coverage Esthetic Score (RES) was published in 2009 as an esthetic scoring system to measure visible final outcomes of root coverage procedures performed on Miller I and II recession defects. The aim of this study was to evaluate the intra-examiner, intra-group, and inter-examiner reliability of the (Root Coverage Esthetic Score) RES when used among periodontal faculty, post-graduate students in periodontology, and pre-doctoral DMD students when using the RES at Tufts University School of Dental Medicine (TUSDM). Thirty-three participants (12 second year DMD students, 11 periodontal residents, and 10 faculty members) were assembled to evaluate 25 baseline and 6-months post-treatment outcomes of mucogingival surgeries using the RES. Each projection was shown for 30 seconds during which the participants were asked to use the RES scoring system to evaluate the surgical outcomes. The results were then recorded on a standardized worksheet grid. To test intra-examiner reliability, 7 of the 25 projections were shown twice. Intra-examiner reliability and inter-examiner reliability were assessed using intraclass correlation coefficient using a two-way mixed effects model, and stratified by education level. PG residents had the highest tendency to agree with each other with an interclass correlation (ICC) of 0.53 (95%CI 0.36 - 0.74). DMD students had an ICC: 0.51 (95%CI: 0.33 - 0.75), and PG faculty members produced an ICC: 0.41 (95%CI: 0.24 - 0.64). There was no statistically significant difference in ICC among the three groups of participants (Kruskal-Wallis test, P = 0.2440). When the data for each RES element were then combined, the mean ICC for the total interrater agreement for RES was 0.48 (95% CI: 0.32-0.71). This corresponds to an overall moderate agreement among all participants using the RES to evaluate the 25 surgical outcomes. The intra-examiner reliability within each of the three groups was quite high. The highest mean ICC was produced by the PG Faculty (0.908). The mean ICCs for PG residents was 0.867, and the mean ICC for DMD students was 0.855. The Kruskal-Wallis test (p = 0.46) failed to find any statistical difference in intra-examiner reliability between the three groups of participants CONCLUSIONS: The RES is a "moderately" reliable scoring system for mucogingival treatments in a dental school setting and can be used even by operators with different level of periodontal experience. This scoring system can be repeated by the same examiner obtaining reliable results. This article is protected by copyright. All rights reserved. © 2018 American Academy of Periodontology.
Dalziell, Andrew; Boyle, James; Mutrie, Nanette
2015-01-01
This study will extend on a pilot study and will evaluate the impact of a novel approach to PE, Better Movers and Thinkers (BMT), on students' cognition, physical activity habits, and gross motor coordination (GMC). The study will involve six mainstream state schools with students aged 9-11 years. Three schools will be allocated as the intervention condition and three as the control condition. The design of the study is a 16-week intervention with pre-, post- and 6 month follow-up measurements taken using the 'Cognitive Assessment System (CAS)' GMC tests, and the 'Physical Activity Habits Questionnaire for Children (PAQ-C).' Qualitative data will be gathered using student focus groups and class teacher interviews in each of the six schools. ANCOVA will be used to evaluate any effect of intervention comparing pre-test scores with post-test scores and then pre-test scores with 6 month follow-up scores. Qualitative data will be analysed through an iterative process using grounded theory. This protocol provides the details of the rationale and design of the study and details of the intervention, outcome measures, and the recruitment process. The study will address gaps within current research by evaluating if a change of approach in the delivery of PE within schools has an effect on children's cognition, PA habits, and GMC within a Scottish setting.
Quantitative traits for the tail suspension test: automation, optimization, and BXD RI mapping.
Lad, Heena V; Liu, Lin; Payá-Cano, José L; Fernandes, Cathy; Schalkwyk, Leonard C
2007-07-01
Immobility in the tail suspension test (TST) is considered a model of despair in a stressful situation, and acute treatment with antidepressants reduces immobility. Inbred strains of mouse exhibit widely differing baseline levels of immobility in the TST and several quantitative trait loci (QTLs) have been nominated. The labor of manual scoring and various scoring criteria make obtaining robust data and comparisons across different laboratories problematic. Several studies have validated strain gauge and video analysis methods by comparison with manual scoring. We set out to find objective criteria for automated scoring parameters that maximize the biological information obtained, using a video tracking system on tapes of tail suspension tests of 24 lines of the BXD recombinant inbred panel and the progenitor strains C57BL/6J and DBA/2J. The maximum genetic effect size is captured using the highest time resolution and a low mobility threshold. Dissecting the trait further by comparing genetic association of multiple measures reveals good evidence for loci involved in immobility on chromosomes 4 and 15. These are best seen when using a high threshold for immobility, despite the overall better heritability at the lower threshold. A second trial of the test has greater duration of immobility and a completely different genetic profile. Frequency of mobility is also an independent phenotype, with a distal chromosome 1 locus.
Race, Socioeconomic Status, and Implicit Bias: Implications for Closing the Achievement Gap
NASA Astrophysics Data System (ADS)
Schlosser, Elizabeth Auretta Cox
This study accessed the relationship between race, socioeconomic status, age and the race implicit bias held by middle and high school science teachers in Mobile and Baldwin County Public School Systems. Seventy-nine participants were administered the race Implicit Association Test (race IAT), created by Greenwald, A. G., Nosek, B. A., & Banaji, M. R., (2003) and a demographic survey. Quantitative analysis using analysis of variances, ANOVA and t-tests were used in this study. An ANOVA was performed comparing the race IAT scores of African American science teachers and their Caucasian counterparts. A statically significant difference was found (F = .4.56, p = .01). An ANOVA was also performed using the race IAT scores comparing the age of the participants; the analysis yielded no statistical difference based on age. A t-test was performed comparing the race IAT scores of African American teachers who taught at either Title I or non-Title I schools; no statistical difference was found between groups (t = -17.985, p < .001). A t-test was also performed comparing the race IAT scores of Caucasian teachers who taught at either Title I or non-Title I schools; a statistically significant difference was found between groups ( t = 2.44, p > .001). This research examines the implications of the achievement gap among African American and Caucasian students in science.
Middle-School Understanding of the Greenhouse Effect using a NetLogo Computer Model
NASA Astrophysics Data System (ADS)
Schultz, L.; Koons, P. O.; Schauffler, M.
2009-12-01
We investigated the effectiveness of a freely available agent based, modeling program as a learning tool for seventh and eighth grade students to explore the greenhouse effect without added curriculum. The investigation was conducted at two Maine middle-schools with 136 seventh-grade students and 11 eighth-grade students in eight classes. Students were given a pre-test that consisted of a concept map, a free-response question, and multiple-choice questions about how the greenhouse effect influences the Earth's temperature. The computer model simulates the greenhouse effect and allows students to manipulate atmospheric and surface conditions to observe the effects on the Earth’s temperature. Students explored the Greenhouse Effect model for approximately twenty minutes with only two focus questions for guidance. After the exploration period, students were given a post-test that was identical to the pre-test. Parametric post-test analysis of the assessments indicated middle-school students gained in their understanding about how the greenhouse effect influences the Earth's temperature after exploring the computer model for approximately twenty minutes. The magnitude of the changes in pre- and post-test concept map and free-response scores were small (average free-response post-test score of 7.0) compared to an expert's score (48), indicating that students understood only a few of the system relationships. While students gained in their understanding about the greenhouse effect, there was evidence that students held onto their misconceptions that (1) carbon dioxide in the atmosphere deteriorates the ozone layer, (2) the greenhouse effect is a result of humans burning fossil fuels, and (3) infrared and visible light have similar behaviors with greenhouse gases. We recommend using the Greenhouse Effect computer model with guided inquiry to focus students’ investigations on the system relationships in the model.
Vujaklija, Ivan; Roche, Aidan D; Hasenoehrl, Timothy; Sturma, Agnes; Amsuess, Sebastian; Farina, Dario; Aszmann, Oskar C
2017-01-01
Missing an upper limb dramatically impairs daily-life activities. Efforts in overcoming the issues arising from this disability have been made in both academia and industry, although their clinical outcome is still limited. Translation of prosthetic research into clinics has been challenging because of the difficulties in meeting the necessary requirements of the market. In this perspective article, we suggest that one relevant factor determining the relatively small clinical impact of myocontrol algorithms for upper limb prostheses is the limit of commonly used laboratory performance metrics. The laboratory conditions, in which the majority of the solutions are being evaluated, fail to sufficiently replicate real-life challenges. We qualitatively support this argument with representative data from seven transradial amputees. Their ability to control a myoelectric prosthesis was tested by measuring the accuracy of offline EMG signal classification, as a typical laboratory performance metrics, as well as by clinical scores when performing standard tests of daily living. Despite all subjects reaching relatively high classification accuracy offline, their clinical scores varied greatly and were not strongly predicted by classification accuracy. We therefore support the suggestion to test myocontrol systems using clinical tests on amputees, fully fitted with sockets and prostheses highly resembling the systems they would use in daily living, as evaluation benchmark. Agreement on this level of testing for systems developed in research laboratories would facilitate clinically relevant progresses in this field.
State Test Score Trends through 2008-09, Part 1: Rising Scores on State Tests and NAEP
ERIC Educational Resources Information Center
Chudowsky, Naomi; Chudowsky, Victor
2010-01-01
In recent years, scores on the annual state reading and mathematics tests used for accountability have gone up in most states. These trends in state test scores do not always coincide, however, with trends on the National Assessment of Educational Progress (NAEP), the federally sponsored assessment that is administered periodically to…
ERIC Educational Resources Information Center
Doppelt, Jerome E.
1956-01-01
The standard error of measurement as a means for estimating the margin of error that should be allowed for in test scores is discussed. The true score measures the performance that is characteristic of the person tested; the variations, plus and minus, around the true score describe a characteristic of the test. When the standard deviation is used…
ERIC Educational Resources Information Center
Bell, Michael L.; Roubinek, Darrell L.
1989-01-01
Compares fourth-graders' subtest scores on the Stanford Achievement Test (SAT), the Iowa Test of Basic Skills (ITBS), and the Metropolitan Achievement Test (MAT). Finds right-brain dominant students scored better on four SAT subtests, and left-brain dominant students scored better on four ITBS subtests and two MAT subtests. (NH)
Developing Test Score Reports that Work: The Process and Best Practices for Effective Communication
ERIC Educational Resources Information Center
Zenisky, April L.; Hambleton, Ronald K.
2012-01-01
Test scores matter these days. Test-takers want to understand how they performed, and test score reports, particularly those for individual examinees, are the vehicles by which most people get the bulk of this information. Historically, score reports have not always met the examinees' information or usability needs, but this is clearly changing…
The efficacy of commercially available veterinary diets recommended for dogs with atopic dermatitis.
Glos, Katharina; Linek, Monika; Loewenstein, Christine; Mayer, Ursula; Mueller, Ralf S
2008-10-01
The classical treatments for dogs with atopic dermatitis have traditionally been oral antipruritic drugs, allergen-specific immunotherapy and topical therapy. Fifty dogs with atopic dermatitis were included in this multicentred, double-blinded, randomized study to compare clinical response to an 8-week period of feeding one of three commercial veterinary foods marketed for dogs with atopic dermatitis (diets A-C) or a widely distributed supermarket food (diet D). Atopic dermatitis was diagnosed using Willemse's criteria and through the exclusion of differential diagnoses. Fourteen dogs were assigned to diet A and 12 dogs each to diet B, C or D. Flea and tick control using a monthly fipronil spot-on product was administered for a minimum of 4 weeks prior to inclusion in the study and during the study period. Evaluations were made monthly. These included lesion scores, using an established scoring system (canine atopic dermatitis extent and severity index, CADESI-03) and owner evaluation of pruritus level using a visual analogue scale. After 8 weeks on the new diets, there was a significant improvement in CADESI and pruritus scores with diet B (Wilcoxon test, P = 0.043 and paired t-test, P = 0.012, respectively), in pruritus scores with diet A (paired t-test, P = 0.019) and in CADESI scores with diet D (Wilcoxon test, P = 0.037). No significant changes were detected with diet C. Based on the results of this study, in addition to the conventional therapies, changing the diet of dogs with atopic dermatitis may be a useful adjunctive therapeutic measure.
Lau, Y Y W; Pluske, J R; Fleming, P A
2015-08-01
Under intensive pig husbandry, outdoor systems offer a more complex physical and social environment compared with indoor systems (farrowing sheds). As the rearing environment affects behavioural development, it can, therefore, influence behavioural responses of pigs to stressful environments in later stages of production. We tested how the rearing environment influenced behavioural responses to a novel arena test in piglets on the day that they were weaned and mixed into large groups. We recorded video footage and compared the behavioural responses of 30 outdoor-raised and 30 farrowing shed-raised piglets tested in an experimental arena and sequentially exposed to four challenges (each for 5 min) on the day of weaning. Quantitative and qualitative behavioural measures were recorded using time budgets and scoring demeanour or 'qualitative behavioural expression' (using Qualitative Behavioural Assessment (QBA)). When held in isolation (challenge 1), both groups were scored as more 'scared/worried', while outdoor-raised piglets spent more time eating and jumping against the arena walls. Both groups interacted with a plastic ball (challenge 2: exposure to a novel object) during which they were scored as more 'playful/curious' than other challenges. When a food bowl was introduced (challenge 3), farrowing shed-raised piglets were more interested in playing with the food bowl itself, whereas outdoor-raised piglets spent more time eating the feed. Finally, there were no significant differences in social behaviour (challenge 4: introduction of another piglet) between the two groups in terms of the latency to contact each other, amount of time recorded engaged in aggressive/non-aggressive social interactions or QBA scores. Although piglets spent 30% of their time interacting with the other piglet, and half of this time (47%) was engaged in negative interactions (pushing, biting), the levels of aggression were not different between the two groups. Overall, outdoor-raised piglets ate more and were scored as more 'calm/passive', whereas farrowing shed-raised piglets spent more time investigating their environment and were scored as more 'playful/inquisitive'. In conclusion, we did not find differences in behaviour between outdoor-raised and farrowing shed-raised piglets that would highlight welfare issues. The differences found in this study may reflect conflicting affective states, with responses to confinement, neophobia and motivation for exploration evident.