validated scoring system: Topics by Science.gov

Sample records for validated scoring system

Principles for valid histopathologic scoring in research

PubMed Central

Gibson-Corley, Katherine N.; Olivier, Alicia K.; Meyerholz, David K.

2013-01-01

Histopathologic scoring is a tool by which semi-quantitative data can be obtained from tissues. Initially, a thorough understanding of the experimental design, study objectives and methods is required to allow the pathologist to appropriately examine tissues and develop lesion scoring approaches. Many principles go into the development of a scoring system such as tissue examination, lesion identification, scoring definitions and consistency in interpretation. Masking (a.k.a. “blinding”) of the pathologist to experimental groups is often necessary to constrain bias and multiple mechanisms are available. Development of a tissue scoring system requires appreciation of the attributes and limitations of the data (e.g. nominal, ordinal, interval and ratio data) to be evaluated. Incidence, ordinal and rank methods of tissue scoring are demonstrated along with key principles for statistical analyses and reporting. Validation of a scoring system occurs through two principal measures: 1) validation of repeatability and 2) validation of tissue pathobiology. Understanding key principles of tissue scoring can help in the development and/or optimization of scoring systems so as to consistently yield meaningful and valid scoring data. PMID:23558974
The accuracy of Internet search engines to predict diagnoses from symptoms can be assessed with a validated scoring system.

PubMed

Shenker, Bennett S

2014-02-01

To validate a scoring system that evaluates the ability of Internet search engines to correctly predict diagnoses when symptoms are used as search terms. We developed a five point scoring system to evaluate the diagnostic accuracy of Internet search engines. We identified twenty diagnoses common to a primary care setting to validate the scoring system. One investigator entered the symptoms for each diagnosis into three Internet search engines (Google, Bing, and Ask) and saved the first five webpages from each search. Other investigators reviewed the webpages and assigned a diagnostic accuracy score. They rescored a random sample of webpages two weeks later. To validate the five point scoring system, we calculated convergent validity and test-retest reliability using Kendall's W and Spearman's rho, respectively. We used the Kruskal-Wallis test to look for differences in accuracy scores for the three Internet search engines. A total of 600 webpages were reviewed. Kendall's W for the raters was 0.71 (p<0.0001). Spearman's rho for test-retest reliability was 0.72 (p<0.0001). There was no difference in scores based on Internet search engine. We found a significant difference in scores based on the webpage's order on the Internet search engine webpage (p=0.007). Pairwise comparisons revealed higher scores in the first webpages vs. the fourth (corr p=0.009) and fifth (corr p=0.017). However, this significance was lost when creating composite scores. The five point scoring system to assess diagnostic accuracy of Internet search engines is a valid and reliable instrument. The scoring system may be used in future Internet research. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
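The two statistics named in this abstract can be illustrated with a short sketch. It is not the authors' analysis code: the function names are hypothetical, only the Python standard library is used, and Kendall's W is computed without a tie correction, so the value is only approximate when raters assign tied scores (which a five-point scale will often produce).

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed scores."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def kendalls_w(ratings):
    """Kendall's W for m raters scoring the same n items (no tie correction)."""
    m, n = len(ratings), len(ratings[0])
    ranked = [average_ranks(r) for r in ratings]
    totals = [sum(r[i] for r in ranked) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))
```

In this sketch, `spearman_rho(first_scores, rescored)` would correspond to a test-retest comparison, and `kendalls_w([rater_a, rater_b, rater_c])` to agreement among raters; the argument names are placeholders.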
Establishment and Validation of GV-SAPS II Scoring System for Non-Diabetic Critically Ill Patients.

PubMed

Liu, Wen-Yue; Lin, Shi-Gang; Zhu, Gui-Qi; Poucke, Sven Van; Braddock, Martin; Zhang, Zhongheng; Mao, Zhi; Shen, Fei-Xia; Zheng, Ming-Hua

2016-01-01

Recently, glucose variability (GV) has been reported as an independent risk factor for mortality in non-diabetic critically ill patients. However, GV is not currently incorporated in any severity scoring system for critically ill patients. The aim of this study was to establish and validate a modified Simplified Acute Physiology Score II scoring system (SAPS II), integrated with GV parameters and named GV-SAPS II, specifically for non-diabetic critically ill patients to predict short-term and long-term mortality. Training and validation cohorts were extracted from the Multiparameter Intelligent Monitoring in Intensive Care database III version 1.3 (MIMIC-III v1.3). The GV-SAPS II score was constructed by Cox proportional hazard regression analysis and compared with the original SAPS II, Sepsis-related Organ Failure Assessment Score (SOFA) and Elixhauser scoring systems using area under the curve of the receiver operator characteristic (auROC) curve. 4,895 and 5,048 eligible individuals were included in the training and validation cohorts, respectively. The GV-SAPS II score was established with four independent risk factors, including hyperglycemia, hypoglycemia, standard deviation of blood glucose levels (GluSD), and SAPS II score. In the validation cohort, the auROC values of the new scoring system were 0.824 (95% CI: 0.813-0.834, P < 0.001) and 0.738 (95% CI: 0.725-0.750, P < 0.001), respectively for 30 days and 9 months, which were significantly higher than other models used in our study (all P < 0.001). Moreover, Kaplan-Meier plots demonstrated significantly worse outcomes in higher GV-SAPS II score groups both for 30-day and 9-month mortality endpoints (all P < 0.001). We established and validated a modified prognostic scoring system that integrated glucose variability for non-diabetic critically ill patients, named GV-SAPS II. It demonstrated a superior prognostic capability and may be an optimal scoring system for prognostic evaluation in this patient group.
Validating a Prognostic Scoring System for Postmastectomy Locoregional Recurrence in Breast Cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cheng, Skye Hung-Chun, E-mail: skye@kfsyscc.org; Clinical Research Office, Koo Foundation Sun Yat-Sen Cancer Center, Taipei, Taiwan; Department of Radiation Oncology, Duke University Medical Center, Durham, North Carolina

2013-03-15

Purpose: This study is designed to validate a previously developed locoregional recurrence risk (LRR) scoring system and further define which groups of patients with breast cancer would benefit from postmastectomy radiation therapy (PMRT). Methods and Materials: An LRR risk scoring system was developed previously at our institution using breast cancer patients initially treated with modified radical mastectomy between 1990 and 2001. The LRR score comprised 4 factors: patient age, lymphovascular invasion, estrogen receptor negativity, and number of involved lymph nodes. We sought to validate the original study by examining a new dataset of 1545 patients treated between 2002 and 2007. Results: The 1545 patients were scored according to the previously developed criteria: 920 (59.6%) were low risk (score 0-1), 493 (31.9%) intermediate risk (score 2-3), and 132 (8.5%) were high risk (score ≥4). The 5-year locoregional control rates with and without PMRT in low-risk, intermediate-risk, and high-risk groups were 98% versus 97% (P=.41), 97% versus 91% (P=.0005), and 89% versus 50% (P=.0002) respectively. Conclusions: This analysis of an additional 1545 patients treated between 2002 and 2007 validates our previously reported LRR scoring system and suggests appropriate patients for whom PMRT will be beneficial. Independent validation of this scoring system by other institutions is recommended.
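The risk-group cutoffs reported in this abstract can be written as a small lookup. This is only a sketch: the abstract does not give the points assigned to each of the 4 factors, so the function below takes an already computed total score, and its name is hypothetical.

```python
def lrr_risk_group(score):
    """Map a total LRR score to the risk groups quoted in the abstract.

    Cutoffs from the abstract: 0-1 low, 2-3 intermediate, >=4 high.
    How the total is built from the four factors is not stated there.
    """
    if score < 0:
        raise ValueError("score must be non-negative")
    if score <= 1:
        return "low"
    if score <= 3:
        return "intermediate"
    return "high"
```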
Reliability and Validity of a Japanese-language and Culturally Adapted Version of the Musculoskeletal Tumor Society Scoring System for the Lower Extremity.

PubMed

Iwata, Shintaro; Uehara, Kosuke; Ogura, Koichi; Akiyama, Toru; Shinoda, Yusuke; Yonemoto, Tsukasa; Kawai, Akira

2016-09-01

The Musculoskeletal Tumor Society (MSTS) scoring system is a widely used functional evaluation tool for patients treated for musculoskeletal tumors. Although the MSTS scoring system has been validated in English and Brazilian Portuguese, a Japanese version of the MSTS scoring system has not yet been validated. We sought to determine whether a Japanese-language translation of the MSTS scoring system for the lower extremity had (1) sufficient reliability and internal consistency, (2) adequate construct validity, and (3) reasonable criterion validity compared with the Toronto Extremity Salvage Score (TESS) and SF-36 using psychometric analysis. The Japanese version of the MSTS scoring system was developed using accepted guidelines, which included translation of the English version of the MSTS into Japanese by five native Japanese bilingual musculoskeletal oncology surgeons and integrated into one document. One hundred patients with a diagnosis of intermediate or malignant bone or soft tissue tumors located in the lower extremity and who had undergone tumor resection with or without reconstruction or amputation participated in this study. Reliability was evaluated by test-retest analysis, and internal consistency was established by Cronbach's alpha coefficient. Construct validity was evaluated using the principal factor analysis and Akaike information criterion network. Criterion validity was evaluated by comparing the MSTS scoring system with the TESS and SF-36. Test-retest analysis showed a high intraclass correlation coefficient (0.92; 95% CI, 0.88-0.95), indicating high reliability of the Japanese version of the MSTS scoring system, although a considerable ceiling effect was observed, with 23 patients (23%) given the maximum score. Cronbach's alpha coefficient was 0.87 (95% CI, 0.82-0.90), suggesting a high level of internal consistency. 
Factor analysis revealed that all items had high loading values and communalities; we identified a central role for the items "walking" and "gait" according to the Akaike information criterion network. The total MSTS score was correlated with that of the TESS (r = 0.81; 95% CI, 0.73-0.87; p < 0.001) and the physical component summary and physical functioning of the SF-36. The Japanese-language translation of the MSTS scoring system for the lower extremity has sufficient reliability and reasonable validity. Nevertheless, the observation of a ceiling effect suggests poor ability of this system to discriminate from among patients who have a high level of function.
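Two of the quantities reported in this abstract, Cronbach's alpha and the ceiling effect, follow from simple formulas. The sketch below uses the standard definition of alpha, k/(k-1) × (1 − sum of item variances / variance of total scores); the function names and the `max_score` parameter are illustrative assumptions, and the abstract does not state which software the authors used.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    columns = list(zip(*item_scores))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def ceiling_fraction(total_scores, max_score):
    """Fraction of respondents at the maximum possible total score."""
    return sum(1 for s in total_scores if s == max_score) / len(total_scores)
```

With 23 of 100 patients at the maximum, `ceiling_fraction` would return 0.23, matching the 23% ceiling effect quoted above.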
Development and validation of a composite scoring system for robot-assisted surgical training--the Robotic Skills Assessment Score.

PubMed

Chowriappa, Ashirwad J; Shi, Yi; Raza, Syed Johar; Ahmed, Kamran; Stegemann, Andrew; Wilding, Gregory; Kaouk, Jihad; Peabody, James O; Menon, Mani; Hassett, James M; Kesavadas, Thenkurussi; Guru, Khurshid A

2013-12-01

A standardized scoring system does not exist in virtual reality-based assessment metrics to describe safe and crucial surgical skills in robot-assisted surgery. This study aims to develop an assessment score along with its construct validation. All subjects performed key tasks on the previously validated Fundamental Skills of Robotic Surgery curriculum, which were recorded, and metrics were stored. After an expert consensus for the purpose of content validation (Delphi), critical safety-determining procedural steps were identified from the Fundamental Skills of Robotic Surgery curriculum, and a hierarchical task decomposition of multiple parameters using a variety of metrics was used to develop the Robotic Skills Assessment Score (RSA-Score). Robotic Skills Assessment mainly focuses on safety in operative field, critical error, economy, bimanual dexterity, and time. Subsequently, the RSA-Score was further evaluated for construct validation and feasibility. Spearman correlation tests performed between tasks using the RSA-Scores indicate no cross correlation. Wilcoxon rank sum tests were performed between the two groups. The proposed RSA-Score was evaluated on non-robotic surgeons (n = 15) and on expert-robotic surgeons (n = 12). The expert group demonstrated significantly better performance on all four tasks in comparison to the novice group. Validation of the RSA-Score in this study was carried out on the Robotic Surgical Simulator. The RSA-Score is a valid scoring system that could be incorporated in any virtual reality-based surgical simulator to achieve standardized assessment of fundamental surgical tenets during robot-assisted surgery. Copyright © 2013 Elsevier Inc. All rights reserved.
A scoring system for ascertainment of incident stroke; the Risk Index Score (RISc).

PubMed

Kass-Hout, T A; Moyé, L A; Smith, M A; Morgenstern, L B

2006-01-01

The main objective of this study was to develop and validate a computer-based statistical algorithm that could be translated into a simple scoring system in order to ascertain incident stroke cases using hospital admission medical records data. The Risk Index Score (RISc) algorithm was developed using data collected prospectively by the Brain Attack Surveillance in Corpus Christi (BASIC) project, 2000. The validity of RISc was evaluated by estimating the concordance of scoring system stroke ascertainment to stroke ascertainment by physician and/or abstractor review of hospital admission records. RISc was developed on 1718 randomly selected patients (training set) and then statistically validated on an independent sample of 858 patients (validation set). A multivariable logistic model was used to develop RISc and subsequently evaluated by goodness-of-fit and receiver operating characteristic (ROC) analyses. The higher the value of RISc, the higher the patient's risk of potential stroke. The study showed RISc was well calibrated and discriminated those who had potential stroke from those that did not on initial screening. In this study we developed and validated a rapid, easy, efficient, and accurate method to ascertain incident stroke cases from routine hospital admission records for epidemiologic investigations. Validation of this scoring system was achieved statistically; however, clinical validation in a community hospital setting is warranted.
Establishment and Validation of GV-SAPS II Scoring System for Non-Diabetic Critically Ill Patients

PubMed Central

Liu, Wen-Yue; Lin, Shi-Gang; Zhu, Gui-Qi; Poucke, Sven Van; Braddock, Martin; Zhang, Zhongheng; Mao, Zhi; Shen, Fei-Xia

2016-01-01

Background and Aims Recently, glucose variability (GV) has been reported as an independent risk factor for mortality in non-diabetic critically ill patients. However, GV is not currently incorporated in any severity scoring system for critically ill patients. The aim of this study was to establish and validate a modified Simplified Acute Physiology Score II scoring system (SAPS II), integrated with GV parameters and named GV-SAPS II, specifically for non-diabetic critically ill patients to predict short-term and long-term mortality. Methods Training and validation cohorts were extracted from the Multiparameter Intelligent Monitoring in Intensive Care database III version 1.3 (MIMIC-III v1.3). The GV-SAPS II score was constructed by Cox proportional hazard regression analysis and compared with the original SAPS II, Sepsis-related Organ Failure Assessment Score (SOFA) and Elixhauser scoring systems using area under the curve of the receiver operator characteristic (auROC) curve. Results 4,895 and 5,048 eligible individuals were included in the training and validation cohorts, respectively. The GV-SAPS II score was established with four independent risk factors, including hyperglycemia, hypoglycemia, standard deviation of blood glucose levels (GluSD), and SAPS II score. In the validation cohort, the auROC values of the new scoring system were 0.824 (95% CI: 0.813–0.834, P < 0.001) and 0.738 (95% CI: 0.725–0.750, P < 0.001), respectively for 30 days and 9 months, which were significantly higher than other models used in our study (all P < 0.001). Moreover, Kaplan-Meier plots demonstrated significantly worse outcomes in higher GV-SAPS II score groups both for 30-day and 9-month mortality endpoints (all P < 0.001). Conclusions We established and validated a modified prognostic scoring system that integrated glucose variability for non-diabetic critically ill patients, named GV-SAPS II.
It demonstrated a superior prognostic capability and may be an optimal scoring system for prognostic evaluation in this patient group. PMID:27824941
Scoring Systems to Estimate Intracerebral Control and Survival Rates of Patients Irradiated for Brain Metastases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rades, Dirk, E-mail: Rades.Dirk@gmx.net; Dziggel, Liesa; Haatanen, Tiina

2011-07-15

Purpose: To create and validate scoring systems for intracerebral control (IC) and overall survival (OS) of patients irradiated for brain metastases. Methods and Materials: In this study, 1,797 patients were randomly assigned to the test (n = 1,198) or the validation group (n = 599). Two scoring systems were developed, one for IC and another for OS. The scores included prognostic factors found significant on multivariate analyses. Age, performance status, extracerebral metastases, interval tumor diagnosis to RT, and number of brain metastases were associated with OS. Tumor type, performance status, interval, and number of brain metastases were associated with IC. The score for each factor was determined by dividing the 6-month IC or OS rate (given in percent) by 10. The total score represented the sum of the scores for each factor. The score groups of the test group were compared with the corresponding score groups of the validation group. Results: In the test group, 6-month IC rates were 17% for 14-18 points, 49% for 19-23 points, and 77% for 24-27 points (p < 0.0001). IC rates in the validation group were 19%, 52%, and 77%, respectively (p < 0.0001). In the test group, 6-month OS rates were 9% for 15-19 points, 41% for 20-25 points, and 78% for 26-30 points (p < 0.0001). OS rates in the validation group were 7%, 39%, and 79%, respectively (p < 0.0001). Conclusions: Patients irradiated for brain metastases can be given scores to estimate OS and IC. IC and OS rates of the validation group were similar to the test group demonstrating the validity and reproducibility of both scores.
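The point assignment described in this abstract (factor score = 6-month rate in percent divided by 10, total = sum over factors) can be sketched as follows. Rounding each factor score to a whole point is an assumption, suggested only by the integer score ranges in the results; the per-factor rates in the example are invented for illustration and are not from the study.

```python
def factor_points(rate_percent):
    """Points for one factor: 6-month rate (%) / 10.

    ASSUMPTION: rounded half-up to a whole point; the abstract does not
    state the rounding rule.
    """
    return int(rate_percent / 10 + 0.5)


def total_score(rates_by_factor):
    """Total score: sum of the per-factor points."""
    return sum(factor_points(rate) for rate in rates_by_factor.values())


# OS score groups and test-group 6-month OS rates (%) as reported in the abstract.
OS_GROUPS = [((15, 19), 9), ((20, 25), 41), ((26, 30), 78)]


def os_group(total):
    """Return (score range, test-group 6-month OS rate) for a total OS score."""
    for (low, high), rate in OS_GROUPS:
        if low <= total <= high:
            return (low, high), rate
    raise ValueError("total outside the reported 15-30 range")


# Hypothetical per-factor 6-month OS rates (%) for one patient profile.
hypothetical_rates = {
    "age": 52,
    "performance_status": 61,
    "extracerebral_metastases": 58,
    "interval_diagnosis_to_rt": 47,
    "number_of_brain_metastases": 63,
}
```

For the hypothetical profile above, the per-factor points are 5, 6, 6, 5 and 6, giving a total of 28, which falls in the 26-30 group.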
Elbow-specific clinical rating systems: extent of established validity, reliability, and responsiveness.

PubMed

The, Bertram; Reininga, Inge H F; El Moumni, Mostafa; Eygendaal, Denise

2013-10-01

The modern standard of evaluating treatment results includes the use of rating systems. Elbow-specific rating systems are frequently used in studies aiming at elbow-specific pathology. However, proper validation studies seem to be relatively sparse. In addition, these scoring systems might not always be used for appropriate populations of interest. Both of these issues might give rise to invalid conclusions being reported in the literature. Our aim was to investigate the extent to which the available elbow-specific outcome measurement tools have been validated and the quality of the validation itself. We also aimed to provide characteristics of the populations used for validation of these scales to enable clinicians to use them appropriately. A literature search identified 17 studies of 12 different elbow-specific scoring systems. These were assessed for validity, reliability, and responsiveness characteristics. The quality of these assessments was rated according to the Consensus Based Standards for the Selection of Health Measurement Instruments (COSMIN) checklist criteria, a standardized and validated tool developed specifically for this purpose. Currently, the only elbow-specific rating system that is validated using high-quality methodology is the Oxford Elbow Score, a patient-administered outcome measure tool that has been validated on heterogeneous study populations. Other rating systems still have to be proven in the future to be as good as the Oxford Elbow Score for clinical or research purposes. Additional validation studies are needed. Copyright © 2013 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Mosby, Inc. All rights reserved.
Psychometric properties including reliability, validity and responsiveness of the Majeed pelvic score in patients with chronic sacroiliac joint pain.

PubMed

Bajada, Stefan; Mohanty, Khitish

2016-06-01

The Majeed scoring system is a disease-specific outcome measure that was originally designed to assess pelvic injuries. The aim of this study was to determine the psychometric properties of the Majeed scoring system for chronic sacroiliac joint pain. Internal consistency, content validity, criterion validity, construct validity and responsiveness to change were assessed prospectively for the Majeed scoring system in a cohort of 60 patients diagnosed with sacroiliac joint pain. This diagnosis was confirmed with CT-guided sacroiliac joint anaesthetic block. The overall Majeed score showed acceptable internal consistency (Cronbach alpha = 0.63). Similarly, it showed acceptable floor (0 %) and ceiling (0 %) effects. On the other hand, the domains of pain, work, sitting and sexual intercourse had high (>30 %) floor effects. Significant correlation with the physical component of the Short Form-36 (p = 0.005) and Oswestry disability index (p ≤ 0.001) was found, indicating acceptable criterion validity. The overall Majeed score showed acceptable construct validity with all five developed hypotheses showing significance (p ≤ 0.05). The overall Majeed score showed acceptable responsiveness to change with a large (≥0.80) effect size and standardized response mean. Overall the Majeed scoring system demonstrated acceptable psychometric properties for outcome assessment in chronic sacroiliac joint pain. Thus, its use in this condition is adequate. However, some domains demonstrated suboptimal performance, indicating that improvement might be achieved with the development of an outcome measure specific for sacroiliac joint dysfunction and degeneration.
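The responsiveness and floor-effect measures mentioned in this abstract have commonly used definitions, sketched below. The abstract does not say which exact formulas the authors applied, so these are assumptions: effect size as mean change divided by the baseline standard deviation, and standardized response mean as mean change divided by the standard deviation of the change scores. All function names are hypothetical.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """Mean change divided by the standard deviation of baseline scores."""
    change = [after - before for before, after in zip(baseline, follow_up)]
    return mean(change) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the standard deviation of the change scores."""
    change = [after - before for before, after in zip(baseline, follow_up)]
    return mean(change) / stdev(change)


def floor_fraction(scores, min_score):
    """Fraction of respondents at the lowest possible score."""
    return sum(1 for s in scores if s == min_score) / len(scores)
```

Under these definitions, a value of 0.80 or more is the threshold the abstract describes as large, and a `floor_fraction` above 0.30 corresponds to the high floor effects reported for some domains.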
Establishment and validation of the scoring system for preoperative prediction of central lymph node metastasis in papillary thyroid carcinoma.

PubMed

Liu, Wen; Cheng, Ruochuan; Ma, Yunhai; Wang, Dan; Su, Yanjun; Diao, Chang; Zhang, Jianming; Qian, Jun; Liu, Jin

2018-05-03

Early preoperative diagnosis of central lymph node metastasis (CNM) is crucial to improve survival rates among patients with papillary thyroid carcinoma (PTC). Here, we analyzed clinical data from 2862 PTC patients and developed a scoring system using multivariable logistic regression, which was then tested in a validation group. The predictive diagnostic effectiveness of the scoring system was evaluated based on consistency, discrimination ability, and accuracy. The scoring system considered seven variables: gender, age, tumor size, microcalcification, resistance index >0.7, multiple nodular lesions, and extrathyroid extension. The area under the receiver operating characteristic curve (AUC) was 0.742, indicating good discrimination. Using 5 points as a diagnostic threshold, the validation group had an AUC of 0.758, indicating good discrimination and consistency in the scoring system. The sensitivity of this predictive model for preoperative diagnosis of CNM was 4 times higher than that of direct ultrasound diagnosis. These data indicate that the CNM prediction model would improve preoperative diagnostic sensitivity for CNM in patients with papillary thyroid carcinoma.
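Several records on this page report discrimination as an AUC and accuracy at a score threshold. A minimal sketch of both follows, using the pairwise (Mann-Whitney) interpretation of the AUC. Treating a score at or above the threshold as a positive prediction is an assumption, since the abstract does not state the direction of the cutoff; the function names and example scores are illustrative only.

```python
def auc(scores_positive, scores_negative):
    """AUC as the probability that a positive case outscores a negative case.

    Ties between a positive and a negative case count as half a win.
    """
    wins = sum(
        (p > n) + 0.5 * (p == n)
        for p in scores_positive
        for n in scores_negative
    )
    return wins / (len(scores_positive) * len(scores_negative))


def sensitivity_specificity(scores_positive, scores_negative, threshold):
    """Sensitivity and specificity when score >= threshold predicts disease."""
    sensitivity = sum(s >= threshold for s in scores_positive) / len(scores_positive)
    specificity = sum(s < threshold for s in scores_negative) / len(scores_negative)
    return sensitivity, specificity
```

The pairwise form is quadratic in sample size, which is fine for illustration; rank-based formulas give the same value more efficiently on large cohorts.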
Validation of prognostic scores for clinical outcomes in cirrhotic patients with acute variceal bleeding.

PubMed

Motola-Kuba, Miguel; Escobedo-Arzate, Angélica; Tellez-Avila, Félix; Altamirano, José; Aguilar-Olivos, Nancy; González-Angulo, Alberto; Zamarripa-Dorsey, Felipe; Uribe, Misael; Chávez-Tapia, Norberto C

Background. The Rockall, Glasgow-Blatchford, and AIMS65 are useful and validated scoring systems for predicting the outcomes of patients with nonvariceal gastrointestinal bleeding. However, there is no validated evidence for using them to predict outcomes in variceal bleeding. The aim of this study was to evaluate and compare the prognostic accuracy of different nonvariceal bleeding scores with other liver-specific scoring systems in cirrhotic patients. This was a retrospective multicenter study that included 160 cirrhotic patients with acute variceal bleeding. The AUROCs to predict in-hospital mortality and rebleeding were analyzed for each scoring system. Overall in-hospital mortality occurred in 13% and in-hospital rebleeding in 12% of patients. The systems with the best AUROC value for predicting mortality were MELD (0.828; 95% CI 0.748-0.909), and AIMS65 (0.817; 95% CI 0.724-0.909). The best score systems for predicting rebleeding were Glasgow-Blatchford (0.756; 95% CI 0.640-0.827), and Rockall (0.691; 95% CI 0.580-0.802). In addition to liver-specific scores, the AIMS65 score is accurate for predicting in-hospital mortality in cirrhotic patients with acute variceal bleeding. Other scoring systems might be useful for predicting significant clinical outcomes in these patients.
Development and Validation of a Risk Scoring System for Severe Acute Lower Gastrointestinal Bleeding.

PubMed

Aoki, Tomonori; Nagata, Naoyoshi; Shimbo, Takuro; Niikura, Ryota; Sakurai, Toshiyuki; Moriyasu, Shiori; Okubo, Hidetaka; Sekine, Katsunori; Watanabe, Kazuhiro; Yokoi, Chizu; Yanase, Mikio; Akiyama, Junichi; Mizokami, Masashi; Uemura, Naomi

2016-11-01

We aimed to develop and validate a risk scoring system to determine the risk of severe lower gastrointestinal bleeding (LGIB) and predict patient outcomes. We first performed a retrospective analysis of data from 439 patients emergently hospitalized for acute LGIB at the National Center for Global Health and Medicine in Japan, from January 2009 through December 2013. We used data on comorbidities, medication, presenting symptoms, vital signs, and laboratory test results to develop a scoring system for severe LGIB (defined as continuous and/or recurrent bleeding). We validated the risk score in a prospective study of 161 patients with acute LGIB admitted to the same center from April 2014 through April 2015. We assessed the system's accuracy in predicting patient outcome using area under the receiver operating characteristics curve (AUC) analysis. All patients underwent colonoscopy. In the first study, 29% of the patients developed severe LGIB. We devised a risk scoring system based on nonsteroidal anti-inflammatory drug use, no diarrhea, no abdominal tenderness, blood pressure of 100 mm Hg or lower, antiplatelet drug use, albumin level less than 3.0 g/dL, disease scores of 2 or higher, and syncope (NOBLADS), which all were independent correlates of severe LGIB. Severe LGIB developed in 75.7% of patients with scores of 5 or higher compared with 2% of patients without any of the factors correlated with severe LGIB (P < .001). The NOBLADS score determined the severity of LGIB with an AUC value of 0.77. In the validation (second) study, severe LGIB developed in 35% of patients; the NOBLADS score predicted the severity of LGIB with an AUC value of 0.76. Higher NOBLADS scores were associated with a requirement for blood transfusion, longer hospital stay, and intervention (P < .05 for trend). We developed and validated a scoring system for risk of severe LGIB based on 8 factors (NOBLADS score).
The system also determined the risk for blood transfusion, longer hospital stay, and intervention. It might be used in decision making regarding intervention and management. Copyright © 2016 AGA Institute. Published by Elsevier Inc. All rights reserved.
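The eight NOBLADS factors listed in this abstract can be tallied as in the sketch below. Assigning one point per factor is an assumption, since the abstract names the factors and a cutoff of 5 but not the weights; the `Patient` class and its field names are hypothetical, and the field for "disease scores" simply takes whatever comorbidity score the study used.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    nsaid_use: bool
    diarrhea: bool
    abdominal_tenderness: bool
    blood_pressure_mmhg: float
    antiplatelet_use: bool
    albumin_g_dl: float
    disease_score: int
    syncope: bool


def noblads_score(p):
    """Count the NOBLADS factors present (ASSUMPTION: one point per factor)."""
    factors = [
        p.nsaid_use,                  # N: NSAID use
        not p.diarrhea,               # O: no diarrhea
        not p.abdominal_tenderness,   # no abdominal tenderness
        p.blood_pressure_mmhg <= 100, # B: blood pressure of 100 mm Hg or lower
        p.antiplatelet_use,           # A: antiplatelet drug use
        p.albumin_g_dl < 3.0,         # L: albumin below 3.0 g/dL
        p.disease_score >= 2,         # D: disease score of 2 or higher
        p.syncope,                    # S: syncope
    ]
    return sum(factors)
```

Under this assumption the score ranges from 0 to 8, and the abstract's "scores of 5 or higher" group corresponds to `noblads_score(p) >= 5`.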
Simple Scoring System and Artificial Neural Network for Knee Osteoarthritis Risk Prediction: A Cross-Sectional Study

PubMed Central

Yoo, Tae Keun; Kim, Deok Won; Choi, Soo Beom; Oh, Ein; Park, Jee Soo

2016-01-01

Background Knee osteoarthritis (OA) is the most common joint disease of adults worldwide. Since the treatments for advanced radiographic knee OA are limited, clinicians face a significant challenge of identifying patients who are at high risk of OA in a timely and appropriate way. Therefore, we developed a simple self-assessment scoring system and an improved artificial neural network (ANN) model for knee OA. Methods The Fifth Korea National Health and Nutrition Examination Surveys (KNHANES V-1) data were used to develop a scoring system and ANN for radiographic knee OA. A logistic regression analysis was used to determine the predictors of the scoring system. The ANN was constructed using 1777 participants and validated internally on 888 participants in the KNHANES V-1. The predictors of the scoring system were selected as the inputs of the ANN. External validation was performed using 4731 participants in the Osteoarthritis Initiative (OAI). Area under the curve (AUC) of the receiver operating characteristic was calculated to compare the prediction models. Results The scoring system and ANN were built using the independent predictors including sex, age, body mass index, educational status, hypertension, moderate physical activity, and knee pain. In the internal validation, both scoring system and ANN predicted radiographic knee OA (AUC 0.73 versus 0.81, p<0.001) and symptomatic knee OA (AUC 0.88 versus 0.94, p<0.001) with good discriminative ability. In the external validation, both scoring system and ANN showed lower discriminative ability in predicting radiographic knee OA (AUC 0.62 versus 0.67, p<0.001) and symptomatic knee OA (AUC 0.70 versus 0.76, p<0.001). Conclusions The self-assessment scoring system may be useful for identifying the adults at high risk for knee OA. The performance of the scoring system is improved significantly by the ANN. We provided an ANN calculator to simply predict the knee OA risk. PMID:26859664
Results of an Internet survey determining the most frequently used ankle scores by AOFAS members.

PubMed

Lau, Johnny T C; Mahomed, Nizar M; Schon, Lew C

2005-06-01

With technological advances in ankle arthroplasty, there has been parallel development in the outcome instruments used to assess the results of surgery. The literature recommends the use of valid, reliable, and responsive ankle scores, but the ankle scores commonly used in clinical practice remain undefined. An internet survey of members of the American Orthopaedic Foot and Ankle Society (AOFAS) was conducted to determine which three ankle scores they perceived as most commonly used in the literature, which ones they believe are validated, which ones they prefer, and which they use in practice. According to respondents, the three most commonly used scores were the AOFAS Ankle score, the Foot Function Index (FFI), and the Musculoskeletal Outcomes Data Evaluation and Management System (MODEMS). The respondents believed that the AOFAS Ankle score, FFI, and MODEMS were validated. The FFI and MODEMS are validated, but the AOFAS ankle score is not validated. Most respondents preferred using the AOFAS Ankle score. The use of the empirical AOFAS Ankle score continues among AOFAS members.
Automatic, semi-automatic and manual validation of urban drainage data.

PubMed

Branisavljević, N; Prodanović, D; Pavlović, D

2010-01-01

Advances in sensor technology and the possibility of automated long distance data transmission have made continuous measurements the preferable way of monitoring urban drainage processes. Usually, the collected data have to be processed by an expert in order to detect and mark erroneous data, remove them and replace them with interpolated data. In general, the first step in detecting erroneous or anomalous data is called data quality assessment or data validation. Data validation consists of three parts: data preparation, validation score generation and score interpretation. This paper presents the overall framework for the data quality improvement system, suitable for automatic, semi-automatic or manual operation. The first two steps of the validation process are explained in more detail, using several validation methods on the same set of real-case data from the Belgrade sewer system. The final part of the validation process, score interpretation, needs to be further investigated on the developed system.
A new extranodal scoring system based on the prognostically relevant extranodal sites in diffuse large B-cell lymphoma, not otherwise specified treated with chemoimmunotherapy.

PubMed

Hwang, Hee Sang; Yoon, Dok Hyun; Suh, Cheolwon; Huh, Jooryung

2016-08-01

Extranodal involvement is a well-known prognostic factor in patients with diffuse large B-cell lymphomas (DLBCL). Nevertheless, the prognostic impact of the extranodal scoring system included in the conventional international prognostic index (IPI) has been questioned in an era where rituximab treatment has become widespread. We investigated the prognostic impacts of individual sites of extranodal involvement in 761 patients with DLBCL who received rituximab-based chemoimmunotherapy. Subsequently, we established a new extranodal scoring system based on extranodal sites, showing significant prognostic correlation, and compared this system with conventional scoring systems, such as the IPI and the National Comprehensive Cancer Network-IPI (NCCN-IPI). An internal validation procedure, using bootstrapped samples, was also performed for both univariate and multivariate models. Using multivariate analysis with a backward variable selection, we found nine extranodal sites (the liver, lung, spleen, central nervous system, bone marrow, kidney, skin, adrenal glands, and peritoneum) that remained significant for use in the final model. Our newly established extranodal scoring system, based on these sites, was better correlated with patient survival than standard scoring systems, such as the IPI and the NCCN-IPI. Internal validation by bootstrapping demonstrated an improvement in model performance of our modified extranodal scoring system. Our new extranodal scoring system, based on the prognostically relevant sites, may improve the performance of conventional prognostic models of DLBCL in the rituximab era and warrants further external validation using large study populations.
Design and internal validation of an obstetric early warning score: secondary analysis of the Intensive Care National Audit and Research Centre Case Mix Programme database.

PubMed

Carle, C; Alexander, P; Columb, M; Johal, J

2013-04-01

We designed and internally validated an aggregate weighted early warning scoring system specific to the obstetric population that has the potential for use in the ward environment. Direct obstetric admissions from the Intensive Care National Audit and Research Centre's Case Mix Programme Database were randomly allocated to model development (n = 2240) or validation (n = 2200) sets. Physiological variables collected during the first 24 h of critical care admission were analysed. Logistic regression analysis for mortality in the model development set was initially used to create a statistically based early warning score. The statistical score was then modified to create a clinically acceptable early warning score. Important features of this clinical obstetric early warning score are that the variables are weighted according to their statistical importance, a surrogate for the FiO2/PaO2 relationship is included, conscious level is assessed using a simplified alert/not alert variable, and the score, trigger thresholds and response are consistent with the new non-obstetric National Early Warning Score system. The statistical and clinical early warning scores were internally validated using the validation set. The area under the receiver operating characteristic curve was 0.995 (95% CI 0.992-0.998) for the statistical score and 0.957 (95% CI 0.923-0.991) for the clinical score. Pre-existing empirically designed early warning scores were also validated in the same way for comparison. The area under the receiver operating characteristic curve was 0.955 (95% CI 0.922-0.988) for Swanton et al.'s Modified Early Obstetric Warning System, 0.937 (95% CI 0.884-0.991) for the obstetric early warning score suggested in the 2003-2005 Report on Confidential Enquiries into Maternal Deaths in the UK, and 0.973 (95% CI 0.957-0.989) for the non-obstetric National Early Warning Score.
This highlights that the new clinical obstetric early warning score has an excellent ability to discriminate survivors from non-survivors in this critical care data set. Further work is needed to validate our new clinical early warning score externally in the obstetric ward environment. Anaesthesia © 2013 The Association of Anaesthetists of Great Britain and Ireland.
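The area under the receiver operating characteristic curve quoted in this abstract is a measure of discrimination. It equals the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor. The Python sketch below computes it by direct pairwise comparison; the function name and the toy data are assumptions for illustration, and the sketch does not reproduce the confidence intervals reported above.

```python
def auroc(scores, died):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, d in zip(scores, died) if d]      # scores of non-survivors
    neg = [s for s, d in zip(scores, died) if not d]  # scores of survivors
    if not pos or not neg:
        raise ValueError("need at least one case in each outcome group")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up early warning scores and outcomes (1 = died, 0 = survived).
example_scores = [2, 3, 5, 7, 8, 9]
example_died = [0, 0, 1, 0, 1, 1]
print(auroc(example_scores, example_died))  # 8 of 9 pairs are ordered correctly
```

The pairwise form is quadratic in the number of patients, which is fine for a sketch. A rank-based calculation gives the same value more efficiently on large data sets.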
Student-Centered Reliability, Concurrent Validity and Instructional Sensitivity in Scoring of Students' Concept Maps in a University Science Laboratory

ERIC Educational Resources Information Center

Kaya, Osman Nafiz; Kilic, Ziya

2004-01-01

Student-centered approach of scoring the concept maps consisted of three elements namely symbol system, individual portfolio and scoring scheme. We scored student-constructed concept maps based on 5 concept map criteria: validity of concepts, adequacy of propositions, significance of cross-links, relevancy of examples, and interconnectedness. With…

Measuring Quality in Rural Kindergarten Classrooms: Reliability and Validity Evidence for the Classroom Assessment Scoring System, Kindergarten-Third Grade (CLASS K-3)

ERIC Educational Resources Information Center

Sandilos, Lia E.

2012-01-01

The purpose of the current study was to evaluate the structural validity and stability of scores on a measure of global classroom quality, the Classroom Assessment Scoring System, Kindergarten-Third Grade (CLASS K-3; Pianta, La Paro, & Hamre, 2008). Using data from a sample of 417 kindergarten classrooms in the rural Southern and Mid-Atlantic…
The Motivational Value Systems Questionnaire (MVSQ): Psychometric Analysis Using a Forced Choice Thurstonian IRT Model

PubMed Central

Merk, Josef; Schlotz, Wolff; Falter, Thomas

2017-01-01

This study presents a new measure of value systems, the Motivational Value Systems Questionnaire (MVSQ), which is based on a theory of value systems by psychologist Clare W. Graves. The purpose of the instrument is to help people identify their personal hierarchies of value systems and thus become more aware of what motivates and demotivates them in work-related contexts. The MVSQ is a forced-choice (FC) measure, making it quicker to complete and more difficult to intentionally distort, but also more difficult to assess its psychometric properties due to ipsativity of FC data compared to rating scales. To overcome limitations of ipsative data, a Thurstonian IRT (TIRT) model was fitted to the questionnaire data, based on a broad sample of N = 1,217 professionals and students. Comparison of normative (IRT) scale scores and ipsative scores suggested that MVSQ IRT scores are largely freed from restrictions due to ipsativity and thus allow interindividual comparison of scale scores. Empirical reliability was estimated using a sample-based simulation approach which showed acceptable and good estimates and, on average, slightly higher test-retest reliabilities. Further, validation studies provided evidence on both construct validity and criterion-related validity. Scale score correlations and associations of scores with both age and gender were largely in line with theoretically- and empirically-based expectations, and results of a multitrait-multimethod analysis support convergent and discriminant construct validity. Criterion validity was assessed by examining the relation of value system preferences to departmental affiliation which revealed significant relations in line with prior hypothesizing. These findings demonstrate the good psychometric properties of the MVSQ and support its application in the assessment of value systems in work-related contexts. PMID:28979228
Validity and reliability of a novel immunosuppressive adverse effects scoring system in renal transplant recipients.

PubMed

Meaney, Calvin J; Arabi, Ziad; Venuto, Rocco C; Consiglio, Joseph D; Wilding, Gregory E; Tornatore, Kathleen M

2014-06-12

After renal transplantation, many patients experience adverse effects from maintenance immunosuppressive drugs. When these adverse effects occur, patient adherence with immunosuppression may be reduced and impact allograft survival. If these adverse effects could be prospectively monitored in an objective manner and possibly prevented, adherence to immunosuppressive regimens could be optimized and allograft survival improved. Prospective, standardized clinical approaches to assess immunosuppressive adverse effects by health care providers are limited. Therefore, we developed and evaluated the application, reliability and validity of a novel adverse effects scoring system in renal transplant recipients receiving calcineurin inhibitor (cyclosporine or tacrolimus) and mycophenolic acid based immunosuppressive therapy. The scoring system included 18 non-renal adverse effects organized into gastrointestinal, central nervous system and aesthetic domains developed by a multidisciplinary physician group. Nephrologists employed this standardized adverse effect evaluation in stable renal transplant patients using physical exam, review of systems, recent laboratory results, and medication adherence assessment during a clinic visit. Stable renal transplant recipients in two clinical studies were evaluated and received immunosuppressive regimens comprised of either cyclosporine or tacrolimus with mycophenolic acid. Face, content, and construct validity were assessed to document these adverse effect evaluations. Inter-rater reliability was determined using the Kappa statistic and intra-class correlation. A total of 58 renal transplant recipients were assessed using the adverse effects scoring system confirming face validity. Nephrologists (subject matter experts) rated the 18 adverse effects as: 3.1 ± 0.75 out of 4 (maximum) regarding clinical importance to verify content validity. 
The adverse effects scoring system detected a 1.75-fold increase in gastrointestinal adverse effects (p=0.008) in renal transplant recipients receiving tacrolimus and mycophenolic acid compared to the cyclosporine regimen. This finding demonstrated construct validity. The intra-class correlation was 0.81 (95% confidence interval: 0.65-0.90) and the Kappa statistic was 0.68 ± 0.25 for all 18 adverse effects, indicating substantial inter-rater reliability. This evaluation of the immunosuppressive adverse effects scoring system in stable renal transplant recipients substantiated its face, content and construct validity as well as its inter-rater reliability. The scoring system may facilitate prospective, standardized clinical monitoring of immunosuppressive adverse drug effects in stable renal transplant recipients and improve medication adherence.
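The Kappa statistic reported in this abstract measures agreement between raters beyond what chance alone would produce. The Python sketch below assumes two raters and Cohen's unweighted kappa; the abstract does not state which kappa variant was used, and the ratings shown are made up.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's unweighted kappa for two raters scoring the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of subjects")
    n = len(rater_a)
    # Proportion of subjects on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical presence (1) or absence (0) of one adverse effect in 10 patients.
nephrologist_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
nephrologist_2 = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]
print(round(cohen_kappa(nephrologist_1, nephrologist_2), 3))  # 0.6
```

Here the raters agree on 8 of 10 patients (0.8) against a chance agreement of 0.5, which gives kappa = (0.8 - 0.5) / (1 - 0.5) = 0.6.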
British isles lupus assessment group 2004 index is valid for assessment of disease activity in systemic lupus erythematosus

PubMed Central

Yee, Chee-Seng; Farewell, Vernon; Isenberg, David A; Rahman, Anisur; Teh, Lee-Suan; Griffiths, Bridget; Bruce, Ian N; Ahmad, Yasmeen; Prabu, Athiveeraramapandian; Akil, Mohammed; McHugh, Neil; D'Cruz, David; Khamashta, Munther A; Maddison, Peter; Gordon, Caroline

2007-01-01

Objective To determine the construct and criterion validity of the British Isles Lupus Assessment Group 2004 (BILAG-2004) index for assessing disease activity in systemic lupus erythematosus (SLE). Methods Patients with SLE were recruited into a multicenter cross-sectional study. Data on SLE disease activity (scores on the BILAG-2004 index, Classic BILAG index, and Systemic Lupus Erythematosus Disease Activity Index 2000 [SLEDAI-2K]), investigations, and therapy were collected. Overall BILAG-2004 and overall Classic BILAG scores were determined by the highest score achieved in any of the individual systems in the respective index. Erythrocyte sedimentation rates (ESRs), C3 levels, C4 levels, anti–double-stranded DNA (anti-dsDNA) levels, and SLEDAI-2K scores were used in the analysis of construct validity, and increase in therapy was used as the criterion for active disease in the analysis of criterion validity. Statistical analyses were performed using ordinal logistic regression for construct validity and logistic regression for criterion validity. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Results Of the 369 patients with SLE, 92.7% were women, 59.9% were white, 18.4% were Afro-Caribbean and 18.4% were South Asian. Their mean ± SD age was 41.6 ± 13.2 years and mean disease duration was 8.8 ± 7.7 years. More than 1 assessment was obtained on 88.6% of the patients, and a total of 1,510 assessments were obtained. Increasing overall scores on the BILAG-2004 index were associated with increasing ESRs, decreasing C3 levels, decreasing C4 levels, elevated anti-dsDNA levels, and increasing SLEDAI-2K scores (all P < 0.01). Increase in therapy was observed more frequently in patients with overall BILAG-2004 scores reflecting higher disease activity. 
Scores indicating active disease (overall BILAG-2004 scores of A and B) were significantly associated with increase in therapy (odds ratio [OR] 19.3, P < 0.01). The BILAG-2004 and Classic BILAG indices had comparable sensitivity, specificity, PPV, and NPV. Conclusion These findings show that the BILAG-2004 index has construct and criterion validity. PMID:18050213
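The four accuracy measures named in this abstract all come from a 2x2 table of index result against criterion. In this study the index result would be an overall BILAG-2004 score of A or B, and the criterion would be an increase in therapy. The Python sketch below uses made-up counts, since the abstract does not report the underlying table, and the function name is an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from 2x2 counts.

    tp: index positive, criterion positive   fp: index positive, criterion negative
    fn: index negative, criterion positive   tn: index negative, criterion negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # criterion-positive cases detected
        "specificity": tn / (tn + fp),  # criterion-negative cases ruled out
        "ppv": tp / (tp + fp),          # index positives that are true positives
        "npv": tn / (tn + fn),          # index negatives that are true negatives
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=80, fp=40, fn=20, tn=360)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

PPV and NPV depend on how common the criterion is in the sample, so they do not transfer directly to populations with a different proportion of active disease.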
Multiattribute health utility scoring for the computerized adaptive measure CAT-5D-QOL was developed and validated.

PubMed

Kopec, Jacek A; Sayre, Eric C; Rogers, Pamela; Davis, Aileen M; Badley, Elizabeth M; Anis, Aslam H; Abrahamowicz, Michal; Russell, Lara; Rahman, Md Mushfiqur; Esdaile, John M

2015-10-01

The CAT-5D-QOL is a previously reported item response theory (IRT)-based computerized adaptive tool to measure five domains (attributes) of health-related quality of life. The objective of this study was to develop and validate a multiattribute health utility (MAHU) scoring method for this instrument. The MAHU scoring system was developed in two stages. In phase I, we obtained standard gamble (SG) utilities for 75 hypothetical health states in which only one domain varied (15 states per domain). In phase II, we obtained SG utilities for 256 multiattribute states. We fit a multiplicative regression model to predict SG utilities from the five IRT domain scores. The prediction model was constrained using data from phase I. We validated MAHU scores by comparing them with the Health Utilities Index Mark 3 (HUI3) and directly measured utilities and by assessing between-group discrimination. MAHU scores have a theoretical range from -0.842 to 1. In the validation study, the scores were, on average, higher than HUI3 utilities and lower than directly measured SG utilities. MAHU scores correlated strongly with the HUI3 (Spearman ρ = 0.78) and discriminated well between groups expected to differ in health status. Results reported here provide initial evidence supporting the validity of the MAHU scoring system for the CAT-5D-QOL. Copyright © 2015 Elsevier Inc. All rights reserved.
Complex and elementary histological scoring systems for articular cartilage repair.

PubMed

Orth, Patrick; Madry, Henning

2015-08-01

The repair of articular cartilage defects is increasingly moving into the focus of experimental and clinical investigations. Histological analysis is the gold standard for a valid and objective evaluation of cartilaginous repair tissue and predominantly relies on the use of established scoring systems. In the past three decades, numerous elementary and complex scoring systems have been described and modified, including those of O'Driscoll, Pineda, Wakitani, Sellers and Fortier for entire defects as well as those according to the International Cartilage Repair Society (ICRS-I/II) for osteochondral tissue biopsies. Yet, this coexistence of different grading scales inconsistently addressing diverse parameters may impede comparability between reported study outcomes. Furthermore, validation of these histological scoring systems has only seldom been performed to date. The aim of this review is (1) to give a comprehensive overview and to compare the most important established histological scoring systems for articular cartilage repair, (2) to describe their specific advantages and pitfalls, and (3) to provide valid recommendations for their use in translational and clinical studies of articular cartilage repair.
Risk score to predict gastrointestinal bleeding after acute ischemic stroke.

PubMed

Ji, Ruijun; Shen, Haipeng; Pan, Yuesong; Wang, Penglian; Liu, Gaifen; Wang, Yilong; Li, Hao; Singhal, Aneesh B; Wang, Yongjun

2014-07-25

Gastrointestinal bleeding (GIB) is a common and often serious complication after stroke. Although several risk factors for post-stroke GIB have been identified, no reliable or validated scoring system is currently available to predict GIB after acute stroke in routine clinical practice or clinical trials. In the present study, we aimed to develop and validate a risk model (acute ischemic stroke associated gastrointestinal bleeding score, the AIS-GIB score) to predict in-hospital GIB after acute ischemic stroke. The AIS-GIB score was developed from data in the China National Stroke Registry (CNSR). Eligible patients in the CNSR were randomly divided into derivation (60%) and internal validation (40%) cohorts. External validation was performed using data from the prospective Chinese Intracranial Atherosclerosis Study (CICAS). Independent predictors of in-hospital GIB were obtained using multivariable logistic regression in the derivation cohort, and β-coefficients were used to generate a point scoring system for the AIS-GIB. The area under the receiver operating characteristic curve (AUROC) and the Hosmer-Lemeshow goodness-of-fit test were used to assess model discrimination and calibration, respectively. A total of 8,820, 5,882, and 2,938 patients were enrolled in the derivation, internal validation and external validation cohorts, respectively. The overall rate of in-hospital GIB after AIS was 2.6%, 2.3%, and 1.5% in the derivation, internal validation, and external validation cohorts, respectively. An 18-point AIS-GIB score was developed from the set of independent predictors of GIB including age, gender, history of hypertension, hepatic cirrhosis, peptic ulcer or previous GIB, pre-stroke dependence, admission National Institutes of Health stroke scale score, Glasgow Coma Scale score and stroke subtype (Oxfordshire). The AIS-GIB score showed good discrimination in the derivation (0.79; 95% CI, 0.764-0.825), internal (0.78; 95% CI, 0.74-0.82) and external (0.76; 95% CI, 0.71-0.82) validation cohorts.
The AIS-GIB score was well calibrated in the derivation (P = 0.42), internal (P = 0.45) and external (P = 0.86) validation cohorts. The AIS-GIB score is a valid clinical grading scale to predict in-hospital GIB after AIS. Further studies on the effect of the AIS-GIB score on reducing GIB and improving outcome after AIS are warranted.
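The abstract states that β-coefficients from the logistic regression were used to generate the point scoring system, without giving the conversion rule. One common approach, shown here only as an assumption, is to divide each coefficient by the smallest one and round to the nearest integer. The predictor names and coefficient values in this Python sketch are hypothetical and are not the published AIS-GIB weights.

```python
def points_from_betas(betas, unit=None):
    """Convert regression coefficients to integer points.

    Each coefficient is divided by `unit` (default: the smallest non-zero
    absolute coefficient) and rounded to the nearest integer.
    """
    if unit is None:
        unit = min(abs(b) for b in betas.values() if b != 0)
    return {name: round(b / unit) for name, b in betas.items()}


def total_score(points, present):
    """Sum the points of the predictors present in one patient."""
    return sum(points[name] for name in present)


# Hypothetical coefficients, NOT taken from the AIS-GIB paper.
hypothetical_betas = {
    "age_80_or_older": 0.62,
    "hepatic_cirrhosis": 1.21,
    "peptic_ulcer_or_previous_gib": 0.95,
    "pre_stroke_dependence": 0.33,
}
points = points_from_betas(hypothetical_betas)
print(points)
print(total_score(points, ["hepatic_cirrhosis", "pre_stroke_dependence"]))
```

Rounding trades some accuracy for bedside usability, which is why a point score should have its discrimination and calibration rechecked after conversion, as the study did with the AUROC and the Hosmer-Lemeshow test.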
Validation of Automated Scoring of Oral Reading

ERIC Educational Resources Information Center

Balogh, Jennifer; Bernstein, Jared; Cheng, Jian; Van Moere, Alistair; Townshend, Brent; Suzuki, Masanori

2012-01-01

A two-part experiment is presented that validates a new measurement tool for scoring oral reading ability. Data collected by the U.S. government in a large-scale literacy assessment of adults were analyzed by a system called VersaReader that uses automatic speech recognition and speech processing technologies to score oral reading fluency. In the…
Validation of the nursing workload scoring systems "Nursing Activities Score" (NAS), and "Therapeutic Intervention Scoring System For Critically Ill Children" (TISS-C) in a Greek Paediatric Intensive Care Unit.

PubMed

Nieri, Alexandra-Stavroula; Manousaki, Kalliopi; Kalafati, Maria; Padilha, Katia Grilio; Stafseth, Siv K; Katsoulas, Theodoros; Matziou, Vasiliki; Giannakopoulou, Margarita

2018-04-11

To assess the reliability and validity of the Greek version of Nursing Activities Score (NAS), and Therapeutic Intervention Scoring System for Critically Ill Children (TISS-C) in a Greek Paediatric Intensive Care Unit (PICU). A methodological study was performed in one PICU of the largest Paediatric Hospital in Athens, Greece. The culturally adapted and validated Greek NAS version, enriched according to the Norwegian paediatric one (P-NAS), was used. TISS-C and the Norwegian paediatric interventions were translated into Greek and back-translated. The Therapeutic Intervention Scoring System (TISS-28) was used as a gold standard. Two independent observers simultaneously recorded 30 daily P-NAS and TISS-C records. In total, 188 daily P-NAS, TISS-C and TISS-28 reports from a sample of 29 patients were obtained over 5 weeks. Descriptive statistics, reliability and validity measures were applied using SPSS (ver 22.0) (p ≤ 0.05). Kappa was 0.963 for P-NAS and 0.9895 for TISS-C (p < 0.001) and the Intraclass Correlation Coefficient for all scale items of TISS-C was 1.00 (p < 0.001). P-NAS, TISS-28 and TISS-C measurements were significantly correlated (0.680 ≤ rho ≤ 0.743, p < 0.001). The mean score (±SD) for TISS-28, P-NAS and TISS-C was 23.05 (±5.72), 58.14 (±13.98) and 20.21 (±9.66), respectively. These results support the validity of the P-NAS and TISS-C scales for use in Greek PICUs. Copyright © 2018 Elsevier Ltd. All rights reserved.
Scoring and staging systems using cox linear regression modeling and recursive partitioning.

PubMed

Lee, J W; Um, S H; Lee, J B; Mun, J; Cho, H

2006-01-01

Scoring and staging systems are used to determine the order and class of data according to predictors. Systems used for medical data, such as the Child-Turcotte-Pugh scoring and staging systems for ordering and classifying patients with liver disease, are often derived strictly from physicians' experience and intuition. We construct objective and data-based scoring/staging systems using statistical methods. We consider Cox linear regression modeling and recursive partitioning techniques for censored survival data. In particular, to obtain a target number of stages we propose cross-validation and amalgamation algorithms. We also propose an algorithm for constructing scoring and staging systems by integrating local Cox linear regression models into recursive partitioning, so that we can retain the merits of both methods such as superior predictive accuracy, ease of use, and detection of interactions between predictors. The staging system construction algorithms are compared by cross-validation evaluation of real data. The data-based cross-validation comparison shows that Cox linear regression modeling is somewhat better than recursive partitioning when there are only continuous predictors, while recursive partitioning is better when there are significant categorical predictors. The proposed local Cox linear recursive partitioning has better predictive accuracy than Cox linear modeling and simple recursive partitioning. This study indicates that integrating local linear modeling into recursive partitioning can significantly improve prediction accuracy in constructing scoring and staging systems.
A Validation of the Classroom Assessment Scoring System in Finnish Kindergartens

ERIC Educational Resources Information Center

Pakarinen, Eija; Lerkkanen, Marja-Kristiina; Poikkeus, Anna-Maija; Kiuru, Noona; Siekkinen, Martti; Rasku-Puttonen, Helena; Nurmi, Jari-Erik

2010-01-01

Research Findings: This study examined the validity and reliability of the Classroom Assessment Scoring System (CLASS; R. C. Pianta, K. M. La Paro, & B. K. Hamre, 2008) in Finnish kindergartens. A pair of trained observers used the CLASS to observe 49 kindergarten teachers (47 female, 2 male) on two different days. Questionnaires measuring…
Validation of risk assessment scoring systems for an audit of elective surgery for gastrointestinal cancer in elderly patients: an audit.

PubMed

Wakabayashi, Hisao; Sano, Takanori; Yachida, Shinichi; Okano, Keiichi; Izuishi, Kunihiko; Suzuki, Yasuyuki

2007-10-01

The goal of this study was to validate the usefulness of risk assessment scoring systems for a surgical audit in elective digestive surgery for elderly patients. The validated scoring systems used were the Physiological and Operative Severity Score for enUmeration of Mortality and morbidity (POSSUM) and the Portsmouth predictor equation for mortality (P-POSSUM). This study involved 153 consecutive patients aged 75 years and older who underwent elective gastric or colorectal surgery between July 2004 and June 2006. A retrospective analysis was performed on data collected prior to each surgery. The predicted mortality and morbidity risks were calculated using each of the scoring systems and were used to obtain the observed/expected (O/E) mortality and morbidity ratios. New logistic regression equations for morbidity and mortality were then calculated using the scores from the POSSUM system and applied retrospectively. The O/E ratio for morbidity obtained from the POSSUM score was 0.23. The O/E ratios for mortality from the POSSUM score and the P-POSSUM were 0.15 and 0.38, respectively. With the new equations based on POSSUM scores, the O/E ratio increased to 0.88. Both the POSSUM and P-POSSUM over-predicted morbidity and mortality in elective gastrointestinal surgery for malignant tumors in elderly patients. However, if a surgical unit makes appropriate calculations using its own patient series and updates these equations, the POSSUM system can be useful in the risk assessment for surgery in elderly patients.
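The O/E ratio used in this audit divides the number of observed events by the number expected, where the expected number is the sum of the per-patient predicted risks. A ratio below 1 means the scoring system over-predicts, as the abstract reports for POSSUM and P-POSSUM. The Python sketch below uses made-up outcomes and risks; the function name is an assumption.

```python
def oe_ratio(outcomes, predicted_risks):
    """Observed/expected ratio for a binary outcome.

    outcomes: 1 if the event (e.g. death) occurred, else 0, one per patient.
    predicted_risks: model-predicted probability of the event for each patient.
    """
    if len(outcomes) != len(predicted_risks):
        raise ValueError("one predicted risk is needed per patient")
    expected = sum(predicted_risks)
    if expected == 0:
        raise ValueError("expected number of events is zero")
    return sum(outcomes) / expected


# Ten hypothetical patients: one observed death, 2.5 deaths expected.
observed_deaths = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [0.1, 0.2, 0.5, 0.3, 0.4, 0.2, 0.1, 0.3, 0.2, 0.2]
print(round(oe_ratio(observed_deaths, predicted), 2))  # 0.4
```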
[Validation of the Glasgow-Blatchford Scoring System to predict mortality in patients with upper gastrointestinal bleeding in a hospital of Lima, Peru (June 2012-December 2013)].

PubMed

Cassana, Alessandra; Scialom, Silvia; Segura, Eddy R; Chacaltana, Alfonso

2015-07-01

Upper gastrointestinal bleeding is a major cause of hospitalization and the most prevalent emergency worldwide, with a mortality rate of up to 14%. In Peru, there have not been any studies on the use of the Glasgow-Blatchford Scoring System to predict mortality in upper gastrointestinal bleeding. The aim of this study is to perform an external validation of the Glasgow-Blatchford Scoring System and to establish the best cutoff for predicting mortality in upper gastrointestinal bleeding in a hospital of Lima, Peru. This was a longitudinal, retrospective, analytical validation study, with data from patients with a clinical and endoscopic diagnosis of upper gastrointestinal bleeding treated at the Gastrointestinal Hemorrhage Unit of the Hospital Nacional Edgardo Rebagliati Martins between June 2012 and December 2013. We calculated the area under the curve for the receiver operating characteristic of the Glasgow-Blatchford Scoring System to predict mortality with a 95% confidence interval. A total of 339 records were analyzed. 57.5% were male and the mean age (standard deviation) was 67.0 (15.7) years. The median of the Glasgow-Blatchford Scoring System obtained in the population was 12. The ROC analysis for death gave an area under the curve of 0.59 (95% CI 0.5-0.7). Stratifying by type of upper gastrointestinal bleeding resulted in an area under the curve of 0.66 (95% CI 0.53-0.78) for non-variceal type. In this population, the Glasgow-Blatchford Scoring System has no diagnostic validity for predicting mortality.
Development and validation of the Sports Athlete Foot and Ankle Score: an instrument for sports-related ankle injuries.

PubMed

Morssinkhof, M L A; Wang, O; James, L; van der Heide, H J L; Winson, I G

2013-09-01

Many existing scoring systems assess ankle function, but there is no evidence that any of them has been validated in a group of patients with a higher demand on their ankle function. Problems include ceiling effects, an inability to detect change, and the lack of a sports subscale. The aim of this study was to create a validated self-administered scoring system for ankle injuries in the higher performing athlete. First, 26 patients were interviewed to solicit opinions needed to create the final score, which is modified from the Foot and Ankle Outcome Score (FAOS). Second, the Sports Athlete Foot and Ankle Score (SAFAS) was validated in a group of 25 athletes with and 14 athletes without ankle injury. It is a self-administered, region-specific sports foot and ankle score that contains four subscales assessing the levels of symptoms, pain, daily living and sports. The Spearman correlation coefficients between SAFAS and the Foot and Ankle Ability Measure (FAAM) ranged from 0.78 to 0.88. Content validity was established by key informant interviews, expert opinions and a high satisfaction rate of 75%. Cronbach's alpha indicated good internal consistency of each subscale, ranging from 0.77 to 0.92. SAFAS has shown good evidence of being a valid instrument for assessing sports-related ankle injuries in high-performing athletes. Copyright © 2013 European Foot and Ankle Society. Published by Elsevier Ltd. All rights reserved.
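Cronbach's alpha, used in this abstract to report internal consistency of each subscale, compares the sum of the item variances with the variance of the total score. The Python sketch below applies the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of totals), to made-up item responses; the function name and data are assumptions for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one subscale.

    items: one list of scores per item, each holding one score per respondent
    in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score of each respondent across all items.
    totals = [sum(responses) for responses in zip(*items)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var / total_var)


# Three hypothetical items answered by five respondents.
subscale = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 3, 3, 5, 5],
]
print(round(cronbach_alpha(subscale), 3))  # 0.987
```

The sketch uses sample variances (n - 1 denominator) for both the items and the totals, so the choice of denominator cancels out in the ratio.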
Bowel Endometriosis Syndrome: a new scoring system for pelvic organ dysfunction and quality of life.

PubMed

Riiskjær, M; Egekvist, A G; Hartwell, D; Forman, A; Seyer-Hansen, M; Kesmodel, U S

2017-09-01

Is it possible to develop a validated score that can identify women with Bowel Endometriosis Syndrome (BENS) and be used to monitor the effect of medical and surgical treatment? The BENS score can be used to identify women with BENS and to monitor the effect of medical and surgical treatment of women suffering from bowel endometriosis. Endometriosis is a heterogeneous disease with extensive variation in anatomical and clinical presentation, and symptoms do not always correspond to the disease burden. Current endometriosis scoring systems are mainly based on anatomical and surgical findings. The score was developed and validated from a cohort of 525 women with medically or surgically treated bowel endometriosis from Aarhus and Copenhagen University Hospitals, Denmark. Patients filled in questionnaires on pelvic pain, quality of life (QoL) and urinary, sexual and bowel function. Items were selected for the final score using clinical and statistical criteria. The chosen variables were included in a multivariate analysis. Individual score values were assigned to the items to form the BENS score, which was divided into 'no BENS', 'minor BENS' and 'major BENS'. Internal and external validations were performed. The six most important items were 'pelvic pain', 'use of analgesics', 'dyschezia', 'straining to urinate', 'fecal urgency' and 'satisfaction with sexual life'. The range of the BENS score (0-28) was divided into 0-8 (no BENS), 9-16 (minor BENS) and 17-28 (major BENS). External validation showed a significant association between BENS score and QoL (P = 0.0001). The BENS scoring system is limited by the fact that it was developed from a single endometriosis unit in Denmark, making it susceptible to social, cultural and demographic bias. It is the first endometriosis classification system to be based directly on the symptomatology of the patient. Validation in other languages will promote comparison of treatments and results across borders.
No external funding was either sought or obtained for this study. A.F. is an investigator for Bayer, outside this work. © The Author 2017. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
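The score bands reported in the BENS abstract (0-8, 9-16 and 17-28) can be expressed as a simple lookup. The Python sketch below is illustrative only: the function name `bens_category` is an assumption, and because the abstract does not give the point values of the six items, the function takes an already-summed total score.

```python
def bens_category(total_score: int) -> str:
    """Map a total BENS score (0-28) to the bands reported in the abstract.

    Hypothetical helper: per-item point values are not given in the
    abstract, so the caller must supply the summed score.
    """
    if not 0 <= total_score <= 28:
        raise ValueError("BENS score must be between 0 and 28")
    if total_score <= 8:
        return "no BENS"
    if total_score <= 16:
        return "minor BENS"
    return "major BENS"


if __name__ == "__main__":
    for score in (5, 12, 20):
        print(score, bens_category(score))
```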
The Chelsea critical care physical assessment tool (CPAx): validation of an innovative new tool to measure physical morbidity in the general adult critical care population; an observational proof-of-concept pilot study.

PubMed

Corner, E J; Wood, H; Englebretsen, C; Thomas, A; Grant, R L; Nikoletou, D; Soni, N

2013-03-01

To develop a scoring system to measure physical morbidity in critical care - the Chelsea Critical Care Physical Assessment Tool (CPAx). The development process was iterative involving content validity indices (CVI), a focus group and an observational study of 33 patients to test construct validity against the Medical Research Council score for muscle strength, peak cough flow, Australian Therapy Outcome Measures score, Glasgow Coma Scale score, Bloomsbury sedation score, Sequential Organ Failure Assessment score, Short Form 36 (SF-36) score, days of mechanical ventilation and inter-rater reliability. Trauma and general critical care patients from two London teaching hospitals. Users of the CPAx felt that it possessed content validity, giving a final CVI of 1.00 (P<0.05). Construct validation data showed moderate to strong significant correlations between the CPAx score and all secondary measures, apart from the mental component of the SF-36 which demonstrated weak correlation with the CPAx score (r=0.024, P=0.720). Reliability testing showed internal consistency of α=0.798 and inter-rater reliability of κ=0.988 (95% confidence interval 0.791 to 1.000) between five raters. This pilot work supports proof of concept of the CPAx as a measure of physical morbidity in the critical care population, and is a cogent argument for further investigation of the scoring system. Copyright © 2012 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.
Proposal and validation of a new model to estimate survival for hepatocellular carcinoma patients.

PubMed

Liu, Po-Hong; Hsu, Chia-Yang; Hsia, Cheng-Yuan; Lee, Yun-Hsuan; Huang, Yi-Hsiang; Su, Chien-Wei; Lee, Fa-Yauh; Lin, Han-Chieh; Huo, Teh-Ia

2016-08-01

The survival of hepatocellular carcinoma (HCC) patients is heterogeneous. We aim to develop and validate a simple prognostic model to estimate survival for HCC patients (MESH score). A total of 3182 patients were randomised into derivation and validation cohorts. Multivariate analysis was used to identify independent predictors of survival in the derivation cohort. The validation cohort was employed to examine the prognostic capabilities. The MESH score allocated 1 point for each of the following parameters: large tumour (beyond Milan criteria), presence of vascular invasion or metastasis, Child-Turcotte-Pugh score ≥6, performance status ≥2, serum alpha-fetoprotein level ≥20 ng/ml, and serum alkaline phosphatase ≥200 IU/L, with a maximum of 6 points. In the validation cohort, significant survival differences were found across all MESH scores from 0 to 6 (all p < 0.01). The MESH system was associated with the highest homogeneity and lowest corrected Akaike information criterion compared with Barcelona Clínic Liver Cancer, Hong Kong Liver Cancer (HKLC), Cancer of the Liver Italian Program, Taipei Integrated Scoring and model to estimate survival in ambulatory HCC Patients systems. The prognostic accuracy of the MESH scores remained constant in patients with hepatitis B- or hepatitis C-related HCC. The MESH score can also discriminate survival for patients from early to advanced stages of HCC. This newly proposed simple and accurate survival model provides enhanced prognostic accuracy for HCC. The MESH system is a useful supplement to the BCLC and HKLC classification schemes in refining treatment strategies. Copyright © 2016 Elsevier Ltd. All rights reserved.
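The one-point-per-parameter rule described in the MESH abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name `HccPatient`, its field names and the function `mesh_score` are assumptions, and only the six thresholds come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class HccPatient:
    """Hypothetical record; field names are illustrative assumptions."""
    beyond_milan: bool                     # large tumour (beyond Milan criteria)
    vascular_invasion_or_metastasis: bool
    ctp_score: int                         # Child-Turcotte-Pugh score
    performance_status: int
    afp_ng_per_ml: float                   # serum alpha-fetoprotein
    alp_iu_per_l: float                    # serum alkaline phosphatase


def mesh_score(p: HccPatient) -> int:
    """Add 1 point for each criterion met, giving a total from 0 to 6."""
    criteria = [
        p.beyond_milan,
        p.vascular_invasion_or_metastasis,
        p.ctp_score >= 6,
        p.performance_status >= 2,
        p.afp_ng_per_ml >= 20,
        p.alp_iu_per_l >= 200,
    ]
    return sum(criteria)


if __name__ == "__main__":
    # Invented example values, for illustration only.
    patient = HccPatient(True, False, 6, 1, 35.0, 150.0)
    print(mesh_score(patient))
```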
Recognition of Atypical Symptoms of Acute Myocardial Infarction: Development and Validation of a Risk Scoring System.

PubMed

Li, Polly W C; Yu, Doris S F

Atypical symptom presentation in patients with acute myocardial infarction (AMI) is associated with longer delay in care seeking and poorer prognosis. Symptom recognition in these patients is a challenging task. Our purpose in this risk prediction model development study was to develop and validate a risk scoring system for estimating cumulative risk for atypical AMI presentation. A consecutive sample was recruited for the developmental (n = 300) and validation (n = 97) cohorts. Symptom experience was measured with the validated Chinese version of the Symptoms of Acute Coronary Syndromes Inventory. Potential predictors were identified from the literature. Multivariable logistic regression was performed to identify significant predictors. A risk scoring system was then constructed by assigning weights to each significant predictor according to their β coefficients. Five independent predictors for atypical symptom presentation were older age (≥75 years), female gender, diabetes mellitus, history of AMI, and absence of hyperlipidemia. The Hosmer and Lemeshow test (χ² = 4.47, df = 6, P = .62) indicated that this predictive model was adequate to predict the outcome. Acceptable discrimination was demonstrated, with an area under the receiver operating characteristic curve of 0.74 (95% confidence interval, 0.67-0.82) (P < .001). The predictive power of this risk scoring system was confirmed in the validation cohort. Atypical AMI presentation is common. A simple risk scoring system developed on the basis of the 5 identified predictors can raise awareness of atypical AMI presentation and promote symptom recognition by estimating the cumulative risk for an individual to present with atypical AMI symptoms.
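The abstract above says weights were assigned according to regression coefficients but does not report the coefficients or the exact conversion rule. The Python sketch below shows one common convention (dividing each coefficient by the smallest one and rounding to an integer), which may differ from what the authors did. Every coefficient value is invented for illustration, and the names `points_from_coefficients` and `risk_score` are assumptions.

```python
def points_from_coefficients(betas: dict[str, float]) -> dict[str, int]:
    """Convert regression coefficients to integer points.

    Assumed convention: scale by the smallest absolute coefficient,
    then round. The source abstract does not specify its rule.
    """
    smallest = min(abs(b) for b in betas.values())
    if smallest == 0:
        raise ValueError("coefficients must be non-zero")
    return {name: round(b / smallest) for name, b in betas.items()}


def risk_score(points: dict[str, int], present: set[str]) -> int:
    """Sum the points of the predictors present in one patient."""
    return sum(pts for name, pts in points.items() if name in present)


# Hypothetical coefficients: NOT taken from the study.
example_betas = {
    "age_ge_75": 1.0,
    "female": 0.5,
    "diabetes": 0.6,
    "prior_ami": 0.5,
    "no_hyperlipidemia": 1.4,
}

if __name__ == "__main__":
    pts = points_from_coefficients(example_betas)
    print(pts)
    print(risk_score(pts, {"age_ge_75", "diabetes"}))
```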
Predicting the need for massive transfusion in trauma patients: the Traumatic Bleeding Severity Score.

PubMed

Ogura, Takayuki; Nakamura, Yoshihiko; Nakano, Minoru; Izawa, Yoshimitsu; Nakamura, Mitsunobu; Fujizuka, Kenji; Suzukawa, Masayuki; Lefor, Alan T

2014-05-01

The ability to easily predict the need for massive transfusion may improve the process of care, allowing early mobilization of resources. There are currently no clear criteria to activate massive transfusion in severely injured trauma patients. The aims of this study were to create a scoring system to predict the need for massive transfusion and then to validate this scoring system. We reviewed the records of 119 severely injured trauma patients and identified massive transfusion predictors using statistical methods. Each predictor was converted into a simple score based on the odds ratio in a multivariate logistic regression analysis. The Traumatic Bleeding Severity Score (TBSS) was defined as the sum of the component scores. The predictive value of the TBSS for massive transfusion was then validated, using data from 113 severely injured trauma patients. Receiver operating characteristic curve analysis was performed to compare the results of TBSS with the Trauma-Associated Severe Hemorrhage score and the Assessment of Blood Consumption score. In the development phase, five predictors of massive transfusion were identified, including age, systolic blood pressure, the Focused Assessment with Sonography for Trauma scan, severity of pelvic fracture, and lactate level. The maximum TBSS is 57 points. In the validation study, the average TBSS in patients who received massive transfusion was significantly greater (24.2 [6.7]) than the score of patients who did not (6.2 [4.7]) (p < 0.01). The area under the receiver operating characteristic curve, sensitivity, and specificity for a TBSS greater than 15 points was 0.985 (significantly higher than the other scoring systems evaluated at 0.892 and 0.813, respectively), 97.4%, and 96.2%, respectively. The TBSS is simple to calculate using an available iOS application and is accurate in predicting the need for massive transfusion. 
Additional multicenter studies are needed to further validate this scoring system and further assess its utility. Prognostic study, level III.
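Sensitivity and specificity at a fixed cutoff, as reported for a TBSS greater than 15 points, can be computed as in the following Python sketch. The function name `sensitivity_specificity` is an assumption, and the example scores and outcomes are invented for illustration, not study data.

```python
def sensitivity_specificity(scores, outcomes, cutoff):
    """Return (sensitivity, specificity) when 'score > cutoff' is the positive call.

    scores   -- numeric scores, one per patient
    outcomes -- booleans, True if the event (e.g. massive transfusion) occurred
    """
    tp = fp = tn = fn = 0
    for score, outcome in zip(scores, outcomes):
        predicted = score > cutoff
        if predicted and outcome:
            tp += 1
        elif predicted and not outcome:
            fp += 1
        elif not predicted and outcome:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative outcome")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data, invented for illustration only.
example_scores = [24, 30, 12, 6, 4, 18, 20, 3]
example_outcomes = [True, True, True, False, False, False, True, False]

if __name__ == "__main__":
    sens, spec = sensitivity_specificity(example_scores, example_outcomes, cutoff=15)
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```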

Scoring Methods for Building Genotypic Scores: An Application to Didanosine Resistance in a Large Derivation Set

PubMed Central

Houssaini, Allal; Assoumou, Lambert; Miller, Veronica; Calvez, Vincent; Marcelin, Anne-Geneviève; Flandre, Philippe

2013-01-01

Background Several attempts have been made to determine HIV-1 resistance from genotype resistance testing. We compare scoring methods for building weighted genotyping scores and commonly used systems to determine whether the virus of a HIV-infected patient is resistant. Methods and Principal Findings Three statistical methods (linear discriminant analysis, support vector machine and logistic regression) are used to determine the weight of mutations involved in HIV resistance. We compared these weighted scores with known interpretation systems (ANRS, REGA and Stanford HIV-db) to classify patients as resistant or not. Our methodology is illustrated on the Forum for Collaborative HIV Research didanosine database (N = 1453). The database was divided into four samples according to the country of enrolment (France, USA/Canada, Italy and Spain/UK/Switzerland). The total sample and the four country-based samples allow external validation (one sample is used to estimate a score and the other samples are used to validate it). We used the observed precision to compare the performance of newly derived scores with other interpretation systems. Our results show that newly derived scores performed better than or similar to existing interpretation systems, even with external validation sets. No difference was found between the three methods investigated. Our analysis identified four new mutations associated with didanosine resistance: D123S, Q207K, H208Y and K223Q. Conclusions We explored the potential of three statistical methods to construct weighted scores for didanosine resistance. Our proposed scores performed at least as well as already existing interpretation systems and previously unrecognized didanosine-resistance associated mutations were identified. This approach could be used for building scores of genotypic resistance to other antiretroviral drugs. PMID:23555613
Assessment of Stone Complexity for PCNL: A Systematic Review of the Literature, How Best Can We Record Stone Complexity in PCNL?

PubMed

Withington, John; Armitage, James; Finch, William; Wiseman, Oliver; Glass, Jonathan; Burgess, Neil

2016-01-01

This study aims to systematically review the literature reporting tools for scoring stone complexity and the stratification of outcomes by stone complexity. In doing so, we aim to determine whether the evidence favors uniform adoption of any one scoring system. PubMed and Embase databases were systematically searched for relevant studies from 2004 to 2014. Reports selected according to predetermined inclusion and exclusion criteria were appraised in terms of methodologic quality and their findings summarized in structured tables. After review, 15 studies were considered suitable for inclusion. Four distinct scoring systems were identified, along with a further five studies that aimed to validate aspects of those scoring systems. Six studies reported the stratification of outcomes by stone complexity, without specifically defining a scoring system. All studies reported some correlation between stone complexity and stone clearance. Correlation with complications was less clearly established, where investigated. This review does not allow us to firmly recommend one scoring system over the others. However, the quality of evidence supporting validation of the Guy's Stone Score is marginally superior, according to the criteria applied in this study. Further evaluation of the interobserver reliability of this scoring system is required.
Risk Factors for Venous Thromboembolism in Pediatric Trauma Patients and Validation of a Novel Scoring System: The Risk of Clots in Kids with Trauma (ROCKIT score)

PubMed Central

Yen, Jennifer; Van Arendonk, Kyle J.; Streiff, Michael B.; McNamara, LeAnn; Stewart, F. Dylan; Conner G, Kim G; Thompson, Richard E.; Haut, Elliott R.; Takemoto, Clifford M.

2017-01-01

OBJECTIVES Identify risk factors for venous thromboembolism (VTE) and develop a VTE risk assessment model for pediatric trauma patients. DESIGN, SETTING, AND PATIENTS We performed a retrospective review of patients 21 years and younger who were hospitalized following traumatic injuries at the Johns Hopkins level 1 adult and pediatric trauma center (1987-2011). The clinical characteristics of patients with and without VTE were compared, and multivariable logistic regression analysis was used to identify independent risk factors for VTE. Weighted risk assessment scoring systems were developed based on these and previously identified factors from patients in the National Trauma Data Bank (NTDB 2008-2010); the scoring systems were validated in this cohort from Johns Hopkins as well as a cohort of pediatric admissions from the NTDB (2011-2012). MAIN RESULTS Forty-nine of 17,366 pediatric trauma patients (0.28%) were diagnosed with VTE after admission to our trauma center. After adjusting for potential confounders, VTE was independently associated with older age, surgery, blood transfusion, higher Injury Severity Score (ISS), and lower Glasgow Coma Scale (GCS) score. These and additional factors were identified in 402,329 pediatric patients from the NTDB from 2008-2010; independent risk factors from the logistic regression analysis of this NTDB cohort were selected and incorporated into weighted risk assessment scoring systems. Two models were developed and were cross-validated in 2 separate pediatric trauma cohorts: 1) 282,535 patients in the NTDB from 2011 to 2012 and 2) 17,366 patients from Johns Hopkins. The receiver operating characteristic curves for these models in the validation cohorts had areas under the curve ranging from 90% to 94%. CONCLUSIONS VTE is infrequent after trauma in pediatric patients. We developed weighted scoring systems to stratify pediatric trauma patients at risk for VTE.
These systems may have potential to guide risk-appropriate VTE prophylaxis in children after trauma. PMID:26963757
Translation and validation of the Dutch new Knee Society Scoring System ©.

PubMed

Van Der Straeten, Catherine; Witvrouw, Erik; Willems, Tine; Bellemans, Johan; Victor, Jan

2013-11-01

A new version of The Knee Society Knee Scoring System(©) (KSS) has recently been developed. Before this scale can be used in non-English-speaking populations, it has to be translated and validated for a particular population. We evaluated the construct and content validity, the test-retest reliability, and the internal consistency of the Dutch version of the New Knee Society KSS. A Dutch translation was performed using a forward-backward translation protocol. We tested the construct validity of the Dutch New KSS by comparing it with the Dutch versions of the WOMAC, Knee Injury and Osteoarthritis Outcome Score (KOOS), and SF-12 scores in 137 patients undergoing total knee arthroplasty (TKA). Content validity was assessed by comparing pre- and postoperative scores and by checking floor and ceiling effects. To evaluate test-retest reliability and consistency, 47 patients completed the questionnaire a second time with a mean interval of 8 days (range, 2-20 days) between tests. Construct validity was demonstrated because the Dutch New KSS correlated well with the Dutch WOMAC (r = -0.751; p < 0.001), Dutch KOOS (r = -0.723; p < 0.001), and Dutch SF-12 (r = 0.569; p < 0.001). There was a significant difference between pre- and postoperative scores (p < 0.001) in line with the other scores. Test-retest reliability proved excellent with an intraclass correlation coefficient between 0.73 and 0.92 depending on the domain tested. Consistency as indicated by Cronbach's alpha ranging from 0.84 to 0.96 was good to excellent. As demonstrated by the validation procedure, the Dutch New KSS is an excellent instrument to evaluate TKA outcome in Dutch-speaking patients.
The ACTA PORT-score for predicting perioperative risk of blood transfusion for adult cardiac surgery.

PubMed

Klein, A A; Collier, T; Yeates, J; Miles, L F; Fletcher, S N; Evans, C; Richards, T

2017-09-01

A simple and accurate scoring system to predict risk of transfusion for patients undergoing cardiac surgery is lacking. We identified independent risk factors associated with transfusion by performing univariate analysis, followed by logistic regression. We then simplified the score to an integer-based system and tested it using the area under the receiver operator characteristic (AUC) statistic with a Hosmer-Lemeshow goodness-of-fit test. Finally, the scoring system was applied to the external validation dataset and the same statistical methods applied to test the accuracy of the ACTA-PORT score. Several factors were independently associated with risk of transfusion, including age, sex, body surface area, logistic EuroSCORE, preoperative haemoglobin and creatinine, and type of surgery. In our primary dataset, the score accurately predicted risk of perioperative transfusion in cardiac surgery patients with an AUC of 0.76. The external validation confirmed accuracy of the scoring method with an AUC of 0.84 and good agreement across all scores, with a minor tendency to under-estimate transfusion risk in very high-risk patients. The ACTA-PORT score is a reliable, validated tool for predicting risk of transfusion for patients undergoing cardiac surgery. This and other scores can be used in research studies for risk adjustment when assessing outcomes, and might also be incorporated into a Patient Blood Management programme. © The Author 2017. Published by Oxford University Press on behalf of the British Journal of Anaesthesia. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Preliminary validation of 2 magnetic resonance image scoring systems for osteoarthritis of the hip according to the OMERACT filter.

PubMed

Maksymowych, Walter P; Cibere, Jolanda; Loeuille, Damien; Weber, Ulrich; Zubler, Veronika; Roemer, Frank W; Jaremko, Jacob L; Sayre, Eric C; Lambert, Robert G W

2014-02-01

Development of a validated magnetic resonance image (MRI) scoring system is essential in hip OA because radiographs are insensitive to change. We assessed the feasibility and reliability of 2 previously developed scoring methods: (1) the Hip Inflammation MRI Scoring System (HIMRISS) and (2) the Hip Osteoarthritis MRI Scoring System (HOAMS). Six readers (3 radiologists, 3 rheumatologists) participated in 2 reading exercises. In Reading Exercise 1, MRI of the hip of 20 subjects were read at a single time point followed by further standardization of methodology. In Reading Exercise 2, MRI of the hip of 18 subjects from a randomized controlled trial, assessed at 2 timepoints, and 27 subjects from a cross-sectional study were read for HIMRISS and HOAMS bone marrow lesions (BML) and synovitis. Reliability was assessed using intraclass correlation coefficient (ICC) and kappa statistics. Both methods were considered feasible. For Reading 1, HIMRISS ICC were 0.52, 0.61, 0.70, and 0.58 for femoral BML, acetabular BML, effusion, and total scores, respectively; and for HOAMS, summed BML and synovitis ICC were 0.52 and 0.46, respectively. For Reading 2, HIMRISS and HOAMS ICC for BML and synovitis-effusion improved substantially. Interobserver reliability for change scores was 0.81 and 0.71 for HIMRISS femoral and HOAMS summed BML, respectively. Responsiveness and discrimination were moderate to high for synovitis-effusion. Significant associations were noted between BML or synovitis scores and Western Ontario and McMaster Universities Osteoarthritis Index pain scores for baseline values (p ≤ 0.001). The BML and synovitis-effusion components of both HIMRISS and HOAMS scoring systems are feasible and reliable, and should be validated further.
Development and initial validation of the Bedside Paediatric Early Warning System score

PubMed Central

2009-01-01

Introduction Adverse outcomes following clinical deterioration in children admitted to hospital wards are frequently preventable. Identification of children for referral to critical care experts remains problematic. Our objective was to develop and validate a simple bedside score to quantify severity of illness in hospitalized children. Methods A case-control design was used to evaluate 11 candidate items and identify a pragmatic score for routine bedside use. Case-patients were urgently admitted to the intensive care unit (ICU). Control-patients had no 'code blue', ICU admission or care restrictions. Validation was performed using two prospectively collected datasets. Results Data from 60 case-patients and 120 control-patients were obtained. Four out of eleven candidate items were removed. The seven-item Bedside Paediatric Early Warning System (PEWS) score ranges from 0–26. The mean maximum scores were 10.1 in case-patients and 3.4 in control-patients. The area under the receiver operating characteristics curve was 0.91, compared with 0.84 for the retrospective nurse-rating of patient risk for near or actual cardiopulmonary arrest. At a score of 8 the sensitivity and specificity were 82% and 93%, respectively. The score increased over 24 hours preceding urgent paediatric intensive care unit (PICU) admission (P < 0.0001). In 436 urgent consultations, the Bedside PEWS score was higher in patients admitted to the ICU than patients who were not admitted (P < 0.0001). Conclusions We developed and performed the initial validation of the Bedside PEWS score. This 7-item score can quantify severity of illness in hospitalized children and identify critically ill children with at least one hour's notice. Prospective validation in other populations is required before clinical application. PMID:19678924
Designing, Evaluating, and Deploying Automated Scoring Systems with Validity in Mind: Methodological Design Decisions

ERIC Educational Resources Information Center

Rupp, André A.

2018-01-01

This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…
Validity of the Child Facial Coding System for the Assessment of Acute Pain in Children With Cerebral Palsy.

PubMed

Hadden, Kellie L; LeFort, Sandra; O'Brien, Michelle; Coyte, Peter C; Guerriere, Denise N

2016-04-01

The purpose of the current study was to examine the concurrent and discriminant validity of the Child Facial Coding System for children with cerebral palsy. Eighty-five children (mean = 8.35 years, SD = 4.72 years) were videotaped during a passive joint stretch with their physiotherapist and during 3 time segments: baseline, passive joint stretch, and recovery. Children's pain responses were rated from videotape using the Numerical Rating Scale and Child Facial Coding System. Results indicated that Child Facial Coding System scores during the passive joint stretch significantly correlated with Numerical Rating Scale scores (r = .72, P < .01). Child Facial Coding System scores were also significantly higher during the passive joint stretch than the baseline and recovery segments (P < .001). Facial activity was not significantly correlated with the developmental measures. These findings suggest that the Child Facial Coding System is a valid method of identifying pain in children with cerebral palsy. © The Author(s) 2015.
Ecological Validity and Clinical Utility of Patient-Reported Outcomes Measurement Information System (PROMIS®) instruments for detecting premenstrual symptoms of depression, anger, and fatigue

PubMed Central

Junghaenel, Doerte U.; Schneider, Stefan; Stone, Arthur A.; Christodoulou, Christopher; Broderick, Joan E.

2014-01-01

Objective This study examined the ecological validity and clinical utility of NIH Patient-Reported Outcomes Measurement Information System (PROMIS®) instruments for anger, depression, and fatigue in women with premenstrual symptoms. Methods One hundred women completed daily diaries and weekly PROMIS assessments over 4 weeks. Weekly assessments were administered through Computerized Adaptive Testing (CAT). Weekly CATs and corresponding daily scores were compared to evaluate ecological validity. To test clinical utility, we examined if CATs could detect changes in symptom levels, if these changes mirrored those obtained from daily scores, and if CATs could identify clinically meaningful premenstrual symptom change. Results PROMIS CAT scores were higher in the pre-menstrual than the baseline (ps < .0001) and post-menstrual (ps < .0001) weeks. The correlations between CATs and aggregated daily scores ranged from .73 to .88, supporting ecological validity. Mean CAT scores showed systematic changes in accordance with the menstrual cycle and the magnitudes of the changes were similar to those obtained from the daily scores. Finally, Receiver Operating Characteristic (ROC) analyses demonstrated the ability of the CATs to discriminate between women with and without clinically meaningful premenstrual symptom change. Conclusions PROMIS CAT instruments for anger, depression, and fatigue demonstrated validity and utility in premenstrual symptom assessment. The results provide encouraging initial evidence of the utility of PROMIS instruments for the measurement of affective premenstrual symptoms. PMID:24630180
Development and initial validation of an endoscopic part-task training box.

PubMed

Thompson, Christopher C; Jirapinyo, Pichamol; Kumar, Nitin; Ou, Amy; Camacho, Andrew; Lengyel, Balazs; Ryan, Michele B

2014-09-01

There is currently no objective and validated methodology available to assess the progress of endoscopy trainees or to determine when technical competence has been achieved. The aims of the current study were to develop an endoscopic part-task simulator and to assess scoring system validity. Fundamental endoscopic skills were determined via kinematic analysis, literature review, and expert interviews. Simulator prototypes and scoring systems were developed to reflect these skills. Validity evidence for content, internal structure, and response process was evaluated. The final training box consisted of five modules (knob control, torque, retroflexion, polypectomy, and navigation and loop reduction). A total of 5 minutes were permitted per module with extra points for early completion. Content validity index (CVI)-realism was 0.88, CVI-relevance was 1.00, and CVI-representativeness was 0.88, giving a composite CVI of 0.92. Overall, 82 % of participants considered the simulator to be capable of differentiating between ability levels, and 93 % thought the simulator should be used to assess ability prior to performing procedures in patients. Inter-item assessment revealed correlations from 0.67 to 0.93, suggesting that tasks were sufficiently correlated to assess the same underlying construct, with each task remaining independent. Each module represented 16.0 % - 26.1 % of the total score, suggesting that no module contributed disproportionately to the composite score. Average box scores were 272.6 and 284.4 (P = 0.94) when performed sequentially, and average score for all participants with proctor 1 was 297.6 and 308.1 with proctor 2 (P = 0.94), suggesting reproducibility and minimal error associated with test administration. A part-task training box and scoring system were developed to assess fundamental endoscopic skills, and validity evidence regarding content, internal structure, and response process was demonstrated. © Georg Thieme Verlag KG Stuttgart · New York.
Prognostic scoring systems for myelodysplastic syndromes (MDS) in a population-based setting: a report from the Swedish MDS register.

PubMed

Moreno Berggren, Daniel; Folkvaljon, Yasin; Engvall, Marie; Sundberg, Johan; Lambe, Mats; Antunovic, Petar; Garelius, Hege; Lorenz, Fryderyk; Nilsson, Lars; Rasmussen, Bengt; Lehmann, Sören; Hellström-Lindberg, Eva; Jädersten, Martin; Ejerblad, Elisabeth

2018-06-01

The myelodysplastic syndromes (MDS) have highly variable outcomes and prognostic scoring systems are important tools for risk assessment and to guide therapeutic decisions. However, few population-based studies have compared the value of the different scoring systems. With data from the nationwide Swedish population-based MDS register we validated the International Prognostic Scoring System (IPSS), revised IPSS (IPSS-R) and the World Health Organization (WHO) Classification-based Prognostic Scoring System (WPSS). We also present population-based data on incidence, clinical characteristics including detailed cytogenetics and outcome from the register. The study encompassed 1329 patients reported to the register between 2009 and 2013, 14% of these had therapy-related MDS (t-MDS). Based on the MDS register, the yearly crude incidence of MDS in Sweden was 2·9 per 100 000 inhabitants. IPSS-R had a significantly better prognostic power than IPSS (P < 0·001). There was a trend for better prognostic power of IPSS-R compared to WPSS (P = 0·05) and for WPSS compared to IPSS (P = 0·07). IPSS-R was superior to both IPSS and WPSS for patients aged ≤70 years. Patients with t-MDS had a worse outcome compared to de novo MDS (d-MDS), however, the validity of the prognostic scoring systems was comparable for d-MDS and t-MDS. In conclusion, population-based studies are important to validate prognostic scores in a 'real-world' setting. In our nationwide cohort, the IPSS-R showed the best predictive power. © 2018 John Wiley & Sons Ltd.
Development and validation of climate change system thinking instrument (CCSTI) for measuring system thinking on climate change content

NASA Astrophysics Data System (ADS)

Meilinda; Rustaman, N. Y.; Firman, H.; Tjasyono, B.

2018-05-01

The Climate Change System Thinking Instrument (CCSTI) was developed to measure system thinking ability on the topic of climate change. CCSTI was developed in four phases: instrument draft development, followed by validation and evaluation consisting of a readability test of the material, expert validation, and a field test. The field test results were analyzed using the reliability score from Cronbach's alpha. The draft instrument was tested on 80 randomly selected college students majoring in Biology Education, Physics Education, and Chemistry Education. The Content Validation Index was 0.86, meaning that the CCSTI items are categorized as very appropriate to the question indicators, and Cronbach's alpha was about 0.605, which falls in the undesirable to minimally acceptable range. Of the 45 system thinking questions, 37 were valid, spread across four indicators of system thinking: system thinking phase I (pre-requirement), system thinking phase II (basic), system thinking phase III (intermediate), and system thinking phase IV (coherent expert).
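Cronbach's alpha, the statistic cited in the CCSTI abstract, follows the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores), where k is the number of items. The Python sketch below applies that formula; the function name `cronbach_alpha` and the tiny response matrix are hypothetical and unrelated to the study's data.

```python
from statistics import variance


def cronbach_alpha(responses: list[list[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    responses[i][j] is the score of respondent i on item j.
    Uses sample variances throughout (statistics.variance).
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*responses))
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


if __name__ == "__main__":
    # Hypothetical responses from three people to two items.
    demo = [[1, 2], [2, 1], [3, 3]]
    print(round(cronbach_alpha(demo), 3))
```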
The colostomy impact score: development and validation of a patient reported outcome measure for rectal cancer patients with a permanent colostomy. A population-based study.

PubMed

Thyø, A; Emmertsen, K J; Pinkney, T D; Christensen, P; Laurberg, S

2017-01-01

The aim was to develop and validate a simple scoring system evaluating the impact of colostomy dysfunction on quality of life (QOL) in patients with a permanent stoma after rectal cancer treatment. In this population-based study, 610 patients with a permanent colostomy after previous rectal cancer treatment during the period 2001-2007 completed two questionnaires: (i) the basic stoma questionnaire consisting of 22 items about stoma function with one anchor question addressing the overall stoma impact on QOL and (ii) the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire (EORTC QLQ) C30. Answers from half of the cohort were used to develop the score and subsequently validated on the remaining half. Logistic regression analyses identified and selected items for the score and multivariate analysis established the score value allocated to each item. The colostomy impact score includes seven items with a total range from 0 to 38 points. A score of ≥ 10 indicates major colostomy impact (Major CI). The score has a sensitivity of 85.7% for detecting patients with significant stoma impact on QOL. Using the EORTC QLQ scales, patients with Major CI experienced significant impairment in their QOL compared to the Minor CI group. This new scoring system appears valid for the assessment of the impact on QOL from having a permanent colostomy in a Danish rectal cancer population. It requires validation in non-Danish populations prior to its acceptance as a valuable patient-reported outcome measure for patients internationally. Colorectal Disease © 2016 The Association of Coloproctology of Great Britain and Ireland.
Methodologies for semiquantitative evaluation of hip osteoarthritis by magnetic resonance imaging: approaches based on the whole organ and focused on active lesions.

PubMed

Jaremko, Jacob L; Lambert, Robert G W; Zubler, Veronika; Weber, Ulrich; Loeuille, Damien; Roemer, Frank W; Cibere, Jolanda; Pianta, Marcus; Gracey, David; Conaghan, Philip; Ostergaard, Mikkel; Maksymowych, Walter P

2014-02-01

As a wider variety of therapeutic options for osteoarthritis (OA) becomes available, there is an increasing need to objectively evaluate disease severity on magnetic resonance imaging (MRI). This is more technically challenging at the hip than at the knee, and as a result, few systematic scoring systems exist. The OMERACT (Outcome Measures in Rheumatology) filter of truth, discrimination, and feasibility can be used to validate image-based scoring systems. Our objective was (1) to review the imaging features relevant to the assessment of severity and progression of hip OA; and (2) to review currently used methods to grade these features in existing hip OA scoring systems. A systematic literature review was conducted. MEDLINE keyword search was performed for features of arthropathy (such as hip + bone marrow edema or lesion, synovitis, cyst, effusion, cartilage, etc.) and scoring system (hip + OA + MRI + score or grade), with a secondary manual search for additional references in the retrieved publications. Findings relevant to the severity of hip OA include imaging markers associated with inflammation (bone marrow lesion, synovitis, effusion), structural damage (cartilage loss, osteophytes, subchondral cysts, labral tears), and predisposing geometric factors (hip dysplasia, femoral-acetabular impingement). Two approaches to the semiquantitative assessment of hip OA are represented by Hip OA MRI Scoring System (HOAMS), a comprehensive whole organ assessment of nearly all findings, and the Hip Inflammation MRI Scoring System (HIMRISS), which selectively scores only active lesions (bone marrow lesion, synovitis/effusion). Validation is presently confined to limited assessment of reliability. Two methods for semiquantitative assessment of hip OA on MRI have been described and validation according to the OMERACT Filter is limited to evaluation of reliability.
Validation of an MRI Brain Injury and Growth Scoring System in Very Preterm Infants Scanned at 29- to 35-Week Postmenstrual Age.

PubMed

George, J M; Fiori, S; Fripp, J; Pannek, K; Bursle, J; Moldrich, R X; Guzzetta, A; Coulthard, A; Ware, R S; Rose, S E; Colditz, P B; Boyd, R N

2017-07-01

The diagnostic and prognostic potential of brain MR imaging before term-equivalent age is limited until valid MR imaging scoring systems are available. This study aimed to validate an MR imaging scoring system of brain injury and impaired growth for use at 29 to 35 weeks postmenstrual age in infants born at <31 weeks gestational age. Eighty-three infants in a prospective cohort study underwent early 3T MR imaging between 29 and 35 weeks' postmenstrual age (mean, 32+2 ± 1+3 weeks; 49 males, born at median gestation of 28+4 weeks; range, 23+6 to 30+6 weeks; mean birthweight, 1068 ± 312 g). Seventy-seven infants had a second MR scan at term-equivalent age (mean, 40+6 ± 1+3 weeks). Structural images were scored using a modified scoring system which generated WM, cortical gray matter, deep gray matter, cerebellar, and global scores. Outcome at 12-months corrected age (mean, 12 months 4 days ± 1+2 weeks) consisted of the Bayley Scales of Infant and Toddler Development, 3rd ed. (Bayley III), and the Neuro-Sensory Motor Developmental Assessment. Early MR imaging global, WM, and deep gray matter scores were negatively associated with Bayley III motor (regression coefficient for global score β = -1.31; 95% CI, -2.39 to -0.23; P = .02), cognitive (β = -1.52; 95% CI, -2.39 to -0.65; P < .01) and the Neuro-Sensory Motor Developmental Assessment outcomes (β = -1.73; 95% CI, -3.19 to -0.28; P = .02). Early MR imaging cerebellar scores were negatively associated with the Neuro-Sensory Motor Developmental Assessment (β = -5.99; 95% CI, -11.82 to -0.16; P = .04). Results were reconfirmed at term-equivalent-age MR imaging. This clinically accessible MR imaging scoring system is valid for use at 29 to 35 weeks postmenstrual age in infants born very preterm. It enables identification of infants at risk of adverse outcomes before the current standard of term-equivalent age. © 2017 by American Journal of Neuroradiology.
Prognostic score to predict mortality during TB treatment in TB/HIV co-infected patients.

PubMed

Nguyen, Duc T; Jenkins, Helen E; Graviss, Edward A

2018-01-01

Estimating mortality risk during TB treatment in HIV co-infected patients is challenging for health professionals, especially in a low TB prevalence population, due to the lack of a standardized prognostic system. The current study aimed to develop and validate a simple mortality prognostic scoring system for TB/HIV co-infected patients. Using data from the CDC's Tuberculosis Genotyping Information Management System of TB patients in Texas reported from 01/2010 through 12/2016, age ≥15 years, HIV(+), and outcome being "completed" or "died", we developed and internally validated a mortality prognostic score using multiple logistic regression. Model discrimination was determined by the area under the receiver operating characteristic (ROC) curve (AUC). The model's good calibration was determined by a non-significant Hosmer-Lemeshow's goodness of fit test. Among the 450 patients included in the analysis, 57 (12.7%) died during TB treatment. The final prognostic score used six characteristics (age, residence in long-term care facility, meningeal TB, chest x-ray, culture positive, and culture not converted/unknown), which are routinely collected by TB programs. Prognostic scores were categorized into three groups that predicted mortality: low-risk (<20 points), medium-risk (20-25 points) and high-risk (>25 points). The model had good discrimination and calibration (AUC = 0.82; 0.80 in bootstrap validation), and a non-significant Hosmer-Lemeshow test p = 0.71. Our simple validated mortality prognostic scoring system can be a practical tool for health professionals in identifying TB/HIV co-infected patients with high mortality risk.
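The three risk groups described in the abstract above can be expressed as a simple threshold lookup. This is a minimal sketch, assuming the total prognostic score has already been computed; the abstract does not give the points assigned to each of the six characteristics, so they are not reproduced here, and the function name `tb_hiv_risk_group` is hypothetical.

```python
def tb_hiv_risk_group(points):
    """Map a total prognostic score to the risk group named in the abstract.

    Cut-offs from the abstract: low-risk (<20 points),
    medium-risk (20-25 points), high-risk (>25 points).
    """
    if points < 20:
        return "low-risk"
    if points <= 25:
        return "medium-risk"
    return "high-risk"


# Hypothetical totals, for illustration only.
for total in (12, 20, 25, 31):
    print(total, tb_hiv_risk_group(total))
```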
The Individualized Classroom Assessment Scoring System (inCLASS): Preliminary Reliability and Validity of a System for Observing Preschoolers’ Competence in Classroom Interactions

PubMed Central

Downer, Jason T.; Booren, Leslie M.; Lima, Olivia K.; Luckner, Amy E.; Pianta, Robert C.

2012-01-01

This paper introduces the Individualized Classroom Assessment Scoring System (inCLASS), an observation tool that targets children’s interactions in preschool classrooms with teachers, peers, and tasks. In particular, initial evidence is reported of the extent to which the inCLASS meets the following psychometric criteria: inter-rater reliability, normal distributions and adequate range, construct validity, and criterion-related validity. These initial findings suggest that the inCLASS has the potential to provide an authentic, contextualized assessment of young children’s classroom behaviors. Future directions for research with the inCLASS are discussed. PMID:23175598
The Effects of Differing Response Criteria on the Assessment of Writing Competence.

ERIC Educational Resources Information Center

Winters, Lynn

The purpose of this study was to investigate the relative validities of four essay scoring systems, reflecting alternative conceptualizations of the writing process, for identifying "competent" writers. Each rater was trained in two of the four scoring systems: General Impression Scoring (GI), Diederich Expository Scale (DES), CSE…
Preliminary Face and Construct Validation Study of a Virtual Basic Laparoscopic Skill Trainer

PubMed Central

Sankaranarayanan, Ganesh; Lin, Henry; Arikatla, Venkata S.; Mulcare, Maureen; Zhang, Likun; Derevianko, Alexandre; Lim, Robert; Fobert, David; Cao, Caroline; Schwaitzberg, Steven D.; Jones, Daniel B.

2010-01-01

BACKGROUND: The Virtual Basic Laparoscopic Skill Trainer (VBLaST™) is a developing virtual-reality–based surgical skill training system that incorporates several of the tasks of the Fundamentals of Laparoscopic Surgery (FLS) training system. This study aimed to evaluate the face and construct validity of the VBLaST™ system. MATERIALS AND METHODS: Thirty-nine subjects were voluntarily recruited at the Beth Israel Deaconess Medical Center (Boston, MA) and classified into two groups: experts (PGY 5, fellow and practicing surgeons) and novice (PGY 1–4). They were then asked to perform three FLS tasks, consisting of peg transfer, pattern cutting, and endoloop, on both the VBLaST and FLS systems. The VBLaST performance scores were automatically computed, while the FLS scores were rated by a trained evaluator. Face validity was assessed using a 5-point Likert scale, varying from not realistic/useful (1) to very realistic/useful (5). RESULTS: Face-validity scores showed that the VBLaST system was significantly realistic in portraying the three FLS tasks (3.95 ± 0.909), as well as the reality in trocar placement and tool movements (3.67 ± 0.874). Construct-validity results show that VBLaST was able to differentiate between the expert and novice group (P = 0.015). However, of the two tasks used for evaluating VBLaST, only the peg-transfer task showed a significant difference between the expert and novice groups (P = 0.003). Spearman correlation coefficient analysis between the two scores showed significant correlation for the peg-transfer task (Spearman coefficient 0.364; P = 0.023). CONCLUSIONS: VBLaST demonstrated significant face and construct validity. A further set of studies, involving improvement to the current VBLaST system, is needed to thoroughly demonstrate face and construct validity for all the tasks. PMID:20201683

Measuring the impact of diagnostic decision support on the quality of clinical decision making: development of a reliable and valid composite score.

PubMed

Ramnarayan, Padmanabhan; Kapoor, Ritika R; Coren, Michael; Nanduri, Vasantha; Tomlinson, Amanda L; Taylor, Paul M; Wyatt, Jeremy C; Britto, Joseph F

2003-01-01

Few previous studies evaluating the benefits of diagnostic decision support systems have simultaneously measured changes in diagnostic quality and clinical management prompted by use of the system. This report describes a reliable and valid scoring technique to measure the quality of clinical decision plans in an acute medical setting, where diagnostic decision support tools might prove most useful. Sets of differential diagnoses and clinical management plans generated by 71 clinicians for six simulated cases, before and after decision support from a Web-based pediatric differential diagnostic tool (ISABEL), were used. A composite quality score was calculated separately for each diagnostic and management plan by considering the appropriateness value of each component diagnostic or management suggestion, a weighted sum of individual suggestion ratings, relevance of the entire plan, and its comprehensiveness. The reliability and validity (face, concurrent, construct, and content) of these two final scores were examined. Two hundred fifty-two diagnostic and 350 management suggestions were included in the interrater reliability analysis. There was good agreement between raters (intraclass correlation coefficient, 0.79 for diagnoses, and 0.72 for management). No counterintuitive scores were demonstrated on visual inspection of the sets. Content validity was verified by a consultation process with pediatricians. Both scores discriminated adequately between the plans of consultants and medical students and correlated well with clinicians' subjective opinions of overall plan quality (Spearman rho 0.65, p < 0.01). The diagnostic and management scores for each episode showed moderate correlation (r = 0.51). The scores described can be used as key outcome measures in a larger study to fully assess the value of diagnostic decision aids, such as the ISABEL system.
Evaluation of several ultrasonography scoring systems for synovitis and comparison to clinical examination: results from a prospective multicentre study of rheumatoid arthritis.

PubMed

Dougados, Maxime; Jousse-Joulin, Sandrine; Mistretta, Frederic; d'Agostino, Maria-Antonietta; Backhaus, Marina; Bentin, Jacques; Chalès, Gérard; Chary-Valckenaere, Isabelle; Conaghan, Philip; Etchepare, Fabien; Gaudin, Philippe; Grassi, Walter; van der Heijde, Désirée; Sellam, Jérémie; Naredo, Esperanza; Szkudlarek, Marcin; Wakefield, Richard; Saraux, Alain

2010-05-01

To evaluate different global ultrasonographic (US) synovitis scoring systems as potential outcome measures of rheumatoid arthritis (RA) according to the Outcome Measures in Rheumatoid Arthritis Clinical Trials (OMERACT) filter. To study selected global scoring systems, for the clinical, B mode and power Doppler techniques, the following joints were evaluated: 28 joints (28-joint Disease Activity Score (DAS28)), 20 joints (metacarpophalangeals (MCPs) + metatarsophalangeals (MTPs)) and 38 joints (28 joints + MTPs) using either a binary (yes/no) or a 0-3 grade. The study was a prospective, 4-month duration follow-up of 76 patients with RA requiring anti-tumour necrosis factor (TNF) therapy (complete follow-up data: 66 patients). Intraobserver reliability was evaluated using the intraclass correlation coefficient (ICC), construct validity was evaluated using the Cronbach alpha test and external validity was evaluated using level of correlation between scoring system and C reactive protein (CRP). Sensitivity to change was evaluated using the standardised response mean. Discriminating capacity was evaluated using the standardised mean differences in patients considered by the doctor as significantly improved or not at the end of the study. Different clinimetric properties of various US scoring systems were at least as good as the clinical scores with, for example, intraobserver reliability ranging from 0.61 to 0.97 versus from 0.53 to 0.82, construct validity ranging from 0.76 to 0.89 versus from 0.76 to 0.88, correlation with CRP ranging from 0.28 to 0.34 versus from 0.28 to 0.35 and sensitivity to change ranging from 0.60 to 1.21 versus from 0.96 to 1.36 for US versus clinical scoring systems, respectively. This study suggests that US evaluation of synovitis is an outcome measure at least as relevant as physical examination. Further studies are required in order to achieve optimal US scoring systems for monitoring patients with RA in clinical trials and in clinical practice.
Automated Pressure Injury Risk Assessment System Incorporated Into an Electronic Health Record System.

PubMed

Jin, Yinji; Jin, Taixian; Lee, Sun-Mi

Pressure injury risk assessment is the first step toward preventing pressure injuries, but traditional assessment tools are time-consuming, resulting in work overload and fatigue for nurses. The objectives of the study were to build an automated pressure injury risk assessment system (Auto-PIRAS) that can assess pressure injury risk using data, without requiring nurses to collect or input additional data, and to evaluate the validity of this assessment tool. A retrospective case-control study and a system development study were conducted in a 1,355-bed university hospital in Seoul, South Korea. A total of 1,305 pressure injury patients and 5,220 nonpressure injury patients participated for the development of a risk scoring algorithm: 687 and 2,748 for the validation of the algorithm and 237 and 994 for validation after clinical implementation, respectively. A total of 4,211 pressure injury-related clinical variables were extracted from the electronic health record (EHR) systems to develop a risk scoring algorithm, which was validated and incorporated into the EHR. That program was further evaluated for predictive and concurrent validity. Auto-PIRAS, incorporated into the EHR system, assigned a risk assessment score of high, moderate, or low and displayed this on the Kardex nursing record screen. Risk scores were updated nightly according to 10 predetermined risk factors. The predictive validity measures of the algorithm validation stage were as follows: sensitivity = .87, specificity = .90, positive predictive value = .68, negative predictive value = .97, Youden index = .77, and the area under the receiver operating characteristic curve = .95. The predictive validity measures of the Braden Scale were as follows: sensitivity = .77, specificity = .93, positive predictive value = .72, negative predictive value = .95, Youden index = .70, and the area under the receiver operating characteristic curve = .85. The kappa of the Auto-PIRAS and Braden Scale risk classification result was .73. The predictive performance of the Auto-PIRAS was similar to Braden Scale assessments conducted by nurses. Auto-PIRAS is expected to be used as a system that assesses pressure injury risk automatically without additional data collection by nurses.
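The predictive validity measures listed in the abstract above follow from a 2x2 confusion matrix, and the Youden index is sensitivity plus specificity minus one. The following sketch applies those standard definitions; the function names and the example counts are hypothetical, and only the sensitivity and specificity values passed to `youden_index` come from the abstract.

```python
def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


def predictive_measures(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "youden": youden_index(sensitivity, specificity),
    }


# Values reported in the abstract: Auto-PIRAS, then the Braden Scale.
print(round(youden_index(0.87, 0.90), 2))
print(round(youden_index(0.77, 0.93), 2))

# Hypothetical counts, for illustration only (not the study's data).
print(predictive_measures(tp=8, fp=1, tn=9, fn=2))
```

Unlike sensitivity and specificity, the predictive values depend on how common the condition is in the sample, so they cannot be recovered from sensitivity and specificity alone.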
Development and validation of a premature ejaculation diagnostic tool.

PubMed

Symonds, Tara; Perelman, Michael A; Althof, Stanley; Giuliano, François; Martin, Mona; May, Kathryn; Abraham, Lucy; Crossland, Anna; Morris, Mark

2007-08-01

Diagnosis of premature ejaculation (PE) for clinical trial purposes has typically relied on intravaginal ejaculation latency time (IELT) for entry, but this parameter does not capture the multidimensional nature of PE. Therefore, the aim was to develop a brief, multidimensional, psychometrically validated instrument for diagnosing PE status. The questionnaire development involved three stages: (1) Five focus groups and six individual interviews were conducted to develop the content; (2) psychometric validation using three different groups of men; and (3) generation of a scoring system. For psychometric validation/scoring system development, data was collected from (1) men with PE based on clinician diagnosis, using DSM-IV-TR, who also had IELTs ≤2 min (n=292); (2) men self-reporting PE (n=309); and (3) men self-reporting no-PE (n=701). Standard psychometric analyses were conducted to produce the final questionnaire. Sensitivity/specificity analysis was used to determine an appropriate scoring system. The qualitative research identified 9 items to capture the essence of DSM-IV-TR PE classification. The psychometric validation resulted in a 5-item, unidimensional measure, which captures the essence of DSM-IV-TR: control, frequency, minimal stimulation, distress, and interpersonal difficulty. Sensitivity/specificity analyses suggested a score of ≤8 indicated no-PE, 9 and 10 probable PE, and ≥11 PE. The development and validation of this new PE diagnostic tool has resulted in a new, user-friendly, and brief self-report questionnaire for use in clinical trials to diagnose PE.
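The cut-offs suggested by the sensitivity/specificity analyses in the abstract above map a total questionnaire score to one of three categories. This is a minimal sketch of that mapping, assuming integer total scores; the function name `classify_pe_score` is hypothetical, and the item-level scoring is not described in the abstract.

```python
def classify_pe_score(total_score):
    """Classify a total questionnaire score using the abstract's cut-offs.

    A score of 8 or less indicates no-PE, 9 and 10 probable PE,
    and 11 or more PE.
    """
    if total_score <= 8:
        return "no-PE"
    if total_score <= 10:
        return "probable PE"
    return "PE"


# Hypothetical totals, for illustration only.
for score in (5, 9, 10, 14):
    print(score, classify_pe_score(score))
```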
Reliability of the Balance Evaluation Systems Test (BESTest) and BESTest sections for adults with hemiparesis

PubMed Central

Rodrigues, Letícia C.; Marques, Aline P.; Barros, Paula B.; Michaelsen, Stella M.

2014-01-01

BACKGROUND: The Balance Evaluation Systems Test (BESTest) was recently created to allow the development of treatments according to the specific balance system affected in each patient. The Brazilian version of the BESTest has not been specifically tested after stroke. OBJECTIVE: To evaluate the intra- and inter-rater reliability and concurrent and convergent validity of the total score of the BESTest and BESTest sections for adults with hemiparesis after stroke. METHOD: The study included 16 subjects (61.1±7.5 years) with chronic hemiparesis (54.5±43.5 months after stroke). The BESTest was administered by two raters in the same week and one of the raters repeated the test after a one-week interval. Intraclass correlation coefficient (ICC) was calculated to assess intra- and interrater reliability. Concurrent validity with the Berg Balance Scale (BBS) and convergent validity with the Activities-specific Balance Confidence scale (ABC-Brazil) were assessed using Pearson's correlation coefficient. RESULTS: Both the BESTest total score (ICC=0.98) and the BESTest sections (ICC between 0.85 and 0.96) have excellent intrarater reliability. Interrater reliability for the total score was excellent (ICC=0.93) and, for the sections, it ranged between 0.71 and 0.94. The correlation coefficient between the BESTest and the BBS and ABC-Brazil were 0.78 and 0.59, respectively. CONCLUSIONS: The Brazilian version of the BESTest demonstrated adequate reliability when measured by sections and could identify what balance system was affected in patients after stroke. Concurrent validity was excellent with the BBS total score and good to excellent with the sections. The total scores but not the sections present adequate convergent validity with the ABC-Brazil. However, other psychometric properties should be further investigated. PMID:25003281
An alternative to the balance error scoring system: using a low-cost balance board to improve the validity/reliability of sports-related concussion balance testing.

PubMed

Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J

2014-05-01

Recent guidelines advise sports medicine professionals to use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. The design combined criterion validity testing relative to a gold standard with 7-day test-retest reliability, carried out in a university biomechanics laboratory with thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores was determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.
A simple weighted scoring system to guide surgical decision-making in patients with parapneumonic pleural effusion.

PubMed

Chang, Che-Chia; Chen, Tzu-Ping; Yeh, Chi-Hsiao; Huang, Pin-Fu; Wang, Yao-Chang; Yin, Shun-Ying

2016-11-01

The selection of ideal candidates for surgical intervention among patients with parapneumonic pleural effusion remains challenging. In this retrospective study, we sought to identify the main predictors of surgical treatment and devise a simple scoring system to guide surgical decision-making. Between 2005 and 2014, we identified 276 patients with parapneumonic pleural effusion. Patients in the training set (n=201) were divided into two groups according to their treatment modality (non-surgery vs. surgery). Using multivariable logistic regression analysis, we devised a scoring system to guide surgical decision-making. The score was subsequently validated in an independent set of 75 patients. A white blood cell count >13,500/µL, pleuritic pain, loculations, and split pleura sign were identified as independent predictors of surgical treatment. A weighted score based on these factors was devised, as follows: white blood cell count >13,500/µL (one point), pleuritic pain (one point), loculations (two points), and split pleura sign (three points). A score >4 was associated with a surgical approach with a sensitivity of 93.4%, a specificity of 82.4%, and an area under curve (AUC) of 0.879 (95% confidence interval: 0.828-0.930). In the validation set, a sensitivity of 94.3% and a specificity of 79.6% were found (AUC=0.869). The proposed scoring system reliably identifies patients with parapneumonic pleural effusion who are candidates for surgery. Pending independent external validation, our score may inform the appropriate use of surgical interventions in this clinical setting.
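The weighted score described in the abstract above sums fixed points for four findings and compares the total with a cut-off of more than 4. The following sketch encodes those weights and that threshold as stated; the function and parameter names are hypothetical, and the code is an illustration rather than a clinical tool.

```python
def pleural_surgery_score(wbc_per_ul, pleuritic_pain, loculations, split_pleura_sign):
    """Weighted score using the point values given in the abstract."""
    score = 0
    if wbc_per_ul > 13_500:  # white blood cell count >13,500/µL: one point
        score += 1
    if pleuritic_pain:       # one point
        score += 1
    if loculations:          # two points
        score += 2
    if split_pleura_sign:    # three points
        score += 3
    return score


def associated_with_surgery(score):
    """The abstract reports that a score >4 was associated with surgery."""
    return score > 4


# Hypothetical patient, for illustration only.
s = pleural_surgery_score(15_000, False, True, True)
print(s, associated_with_surgery(s))
```

With these weights the maximum total is 7, and the cut-off can only be exceeded when the split pleura sign is present together with at least one other finding.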
Predictors of operating room extubation in adult cardiac surgery.

PubMed

Subramaniam, Kathirvel; DeAndrade, Diana S; Mandell, Daniel R; Althouse, Andrew D; Manmohan, Rajan; Esper, Stephen A; Varga, Jeffrey M; Badhwar, Vinay

2017-11-01

The primary objective of the study was to identify perioperative factors associated with successful immediate extubation in the operating room after adult cardiac surgery. The secondary objective was to derive a simplified predictive scoring system to guide clinicians in operating room extubation. All 1518 patients in this retrospective cohort study underwent standardized fast-track cardiac anesthetic protocol during adult cardiac surgery. Perioperative variables between patients who had successful extubation in the operating room versus in the intensive care unit were retrospectively analyzed using both univariate and multivariable logistic regression analyses. A predictive score of successful operating room extubation was constructed from the multivariable results of 800 patients (derivation set), and the scoring system was further tested using a validation set of 398 patients. Younger age, lower body mass index, higher preoperative serum albumin, absence of chronic lung disease and diabetes, less-invasive surgical approach, isolated coronary bypass surgery, elective surgery, and lower doses of intraoperative intravenous fentanyl were independently associated with higher probability of operating room extubation. The extubation prediction score created in a derivation set of patients performed well in the validation set. Patient scores less than 0 had a minimal probability of successful operating room extubation. Operating room extubation was highly predicted with scores of 5 or greater. Perioperative factors that are independently associated with successful operating room extubation after adult cardiac operations were identified, and an operating room extubation prediction scoring system was validated. This scoring system may be used to guide safe operating room extubation after cardiac operations. Copyright © 2017 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.
Development and validation of a prognostic score to predict mortality in patients with acute-on-chronic liver failure.

PubMed

Jalan, Rajiv; Saliba, Faouzi; Pavesi, Marco; Amoros, Alex; Moreau, Richard; Ginès, Pere; Levesque, Eric; Durand, Francois; Angeli, Paolo; Caraceni, Paolo; Hopf, Corinna; Alessandria, Carlo; Rodriguez, Ezequiel; Solis-Muñoz, Pablo; Laleman, Wim; Trebicka, Jonel; Zeuzem, Stefan; Gustot, Thierry; Mookerjee, Rajeshwar; Elkrief, Laure; Soriano, German; Cordoba, Joan; Morando, Filippo; Gerbes, Alexander; Agarwal, Banwari; Samuel, Didier; Bernardi, Mauro; Arroyo, Vicente

2014-11-01

Acute-on-chronic liver failure (ACLF) is a frequent syndrome (30% prevalence), characterized by acute decompensation of cirrhosis, organ failure(s) and high short-term mortality. This study develops and validates a specific prognostic score for ACLF patients. Data from 1349 patients included in the CANONIC study were used. First, a simplified organ function scoring system (CLIF Consortium Organ Failure score, CLIF-C OFs) was developed to diagnose ACLF using data from all patients. Subsequently, in 275 patients with ACLF, CLIF-C OFs and two other independent predictors of mortality (age and white blood cell count) were combined to develop a specific prognostic score for ACLF (CLIF Consortium ACLF score [CLIF-C ACLFs]). A concordance index (C-index) was used to compare the discrimination abilities of CLIF-C ACLF, MELD, MELD-sodium (MELD-Na), and Child-Pugh (CPs) scores. The CLIF-C ACLFs was validated in an external cohort and assessed for sequential use. The CLIF-C ACLFs showed a significantly higher predictive accuracy than MELDs, MELD-Nas, and CPs, reducing (19-28%) the corresponding prediction error rates at all main time points after ACLF diagnosis (28, 90, 180, and 365 days) in both the CANONIC and the external validation cohort. CLIF-C ACLFs computed at 48 h, 3-7 days, and 8-15 days after ACLF diagnosis predicted the 28-day mortality significantly better than at diagnosis. The CLIF-C ACLFs at ACLF diagnosis is superior to the MELDs and MELD-Nas in predicting mortality. The CLIF-C ACLFs is a clinically relevant, validated scoring system that can be used sequentially to stratify the risk of mortality in ACLF patients. Copyright © 2014 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.
Development of a novel score for the prediction of hospital mortality in patients with severe sepsis: the use of electronic healthcare records with LASSO regression.

PubMed

Zhang, Zhongheng; Hong, Yucai

2017-07-25

There are several disease severity scores being used for the prediction of mortality in critically ill patients. However, none of them was developed and validated specifically for patients with severe sepsis. The present study aimed to develop a novel prediction score for severe sepsis. A total of 3206 patients with severe sepsis were enrolled, including 1054 non-survivors and 2152 survivors. The LASSO score showed the best discrimination (area under curve: 0.772; 95% confidence interval: 0.735-0.810) in the validation cohort as compared with other scores such as simplified acute physiology score II, acute physiological score III, Logistic organ dysfunction system, sequential organ failure assessment score, and Oxford Acute Severity of Illness Score. The calibration slope was 0.889 and Brier value was 0.173. The study employed a single center database called Medical Information Mart for Intensive Care-III (MIMIC-III) for analysis. Severe sepsis was defined as infection and acute organ dysfunction. Clinical and laboratory variables used in clinical routines were included for screening. Subjects without missing values were included, and the whole dataset was split into training and validation cohorts. The score was coined LASSO score because variable selection was performed using the least absolute shrinkage and selection operator (LASSO) technique. Finally, the LASSO score was evaluated for its discrimination and calibration in the validation cohort. The study developed the LASSO score for mortality prediction in patients with severe sepsis. Although the score had good discrimination and calibration in a randomly selected subsample, external validations are still required.
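The Brier value reported in the abstract above is the mean squared difference between predicted probabilities and observed binary outcomes, with lower values indicating better predictions. This is a minimal sketch of that standard definition; the function name `brier_score` and the example values are hypothetical and are not taken from the study.

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(predicted_probs) != len(outcomes):
        raise ValueError("inputs must have the same length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_probs, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical predictions and outcomes (1 = died, 0 = survived).
probs = [0.9, 0.2, 0.6, 0.1]
died = [1, 0, 1, 0]
print(brier_score(probs, died))
```

A model that always predicts 0.5 scores 0.25, which is a common reference point when reading a reported Brier value.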
Ecological validity and clinical utility of Patient-Reported Outcomes Measurement Information System (PROMIS®) instruments for detecting premenstrual symptoms of depression, anger, and fatigue.

PubMed

Junghaenel, Doerte U; Schneider, Stefan; Stone, Arthur A; Christodoulou, Christopher; Broderick, Joan E

2014-04-01

This study examined the ecological validity and clinical utility of NIH Patient Reported-Outcomes Measurement Information System (PROMIS®) instruments for anger, depression, and fatigue in women with premenstrual symptoms. One hundred women completed daily diaries and weekly PROMIS assessments over 4 weeks. Weekly assessments were administered through Computerized Adaptive Testing (CAT). Weekly CATs and corresponding daily scores were compared to evaluate ecological validity. To test clinical utility, we examined if CATs could detect changes in symptom levels, if these changes mirrored those obtained from daily scores, and if CATs could identify clinically meaningful premenstrual symptom change. PROMIS CAT scores were higher in the pre-menstrual than the baseline (ps<.0001) and post-menstrual (ps<.0001) weeks. The correlations between CATs and aggregated daily scores ranged from .73 to .88, supporting ecological validity. Mean CAT scores showed systematic changes in accordance with the menstrual cycle and the magnitudes of the changes were similar to those obtained from the daily scores. Finally, Receiver Operating Characteristic (ROC) analyses demonstrated the ability of the CATs to discriminate between women with and without clinically meaningful premenstrual symptom change. PROMIS CAT instruments for anger, depression, and fatigue demonstrated validity and utility in premenstrual symptom assessment. The results provide encouraging initial evidence of the utility of PROMIS instruments for the measurement of affective premenstrual symptoms. Copyright © 2014 Elsevier Inc. All rights reserved.
Validation of the modified Ranson versus Glasgow score for pancreatitis in a Singaporean population.

PubMed

Tan, Yong Hui Alvin; Rafi, Shumaila; Tyebally Fang, Mirriam; Hwang, Stephen; Lim, Ee Wen; Ngu, James; Tan, Su-Ming

2017-09-01

The characteristics of patients with acute pancreatitis in multi-ethnic Singapore differ from those of the populations used in formulating the modified Ranson and Glasgow scores. The use of these scoring systems has not previously been validated in the Singaporean setting. This study aims to validate and compare the prognostic use of the modified Ranson and Glasgow scores, and to determine the superiority of one score over the other in predicting the outcome for acute pancreatitis in the Singaporean population. This is a 3-year retrospective study of patients diagnosed with acute pancreatitis at our centre. Patients with chronic pancreatitis, acute on chronic pancreatitis, iatrogenic pancreatitis, pancreatic cancer as well as those with incomplete Ranson or Glasgow scores were excluded from the study. Case notes and computer records were reviewed for local complications of pancreatitis and organ failure. Receiver operator characteristic (ROC) curves of the Ranson and Glasgow scores were plotted for the prediction of severity and mortality. Between January 2010 and December 2012, 230 cases were diagnosed with acute pancreatitis. A majority of the patients had mild pancreatitis (n = 194, 84.3%), and the overall 30-day mortality rate was 3.5% (n = 8). ROC of the Ranson and Glasgow scoring systems for mortality showed an area under curve (AUC) of 0.854 (P = 0.001) and 0.776 (P = 0.008), respectively. For severity, the AUC for the modified Ranson and Glasgow score was calculated to be 0.694 and 0.668, respectively. The ROC curves of Ranson and Glasgow scores for mortality are comparable with those published in earlier studies. In a Singaporean population, the Ranson score is more accurate in the prediction of mortality. However, both scoring systems are poor predictors for severity of acute pancreatitis. © 2015 Royal Australasian College of Surgeons.
Pediatric Heart Donor Assessment Tool (PH-DAT): A novel donor risk scoring system to predict 1-year mortality in pediatric heart transplantation.

PubMed

Zafar, Farhan; Jaquiss, Robert D; Almond, Christopher S; Lorts, Angela; Chin, Clifford; Rizwan, Raheel; Bryant, Roosevelt; Tweddell, James S; Morales, David L S

2018-03-01

In this study we sought to quantify hazards associated with various donor factors into a cumulative risk scoring system (the Pediatric Heart Donor Assessment Tool, or PH-DAT) to predict 1-year mortality after pediatric heart transplantation (PHT). PHT data with complete donor information (5,732) were randomly divided into a derivation cohort and a validation cohort (3:1). From the derivation cohort, donor-specific variables associated with 1-year mortality (exploratory p-value < 0.2) were incorporated into a multivariate logistic regression model. Scores were assigned to independent predictors (p < 0.05) based on relative odds ratios (ORs). The final model had an acceptable predictive value (c-statistic = 0.62). The significant 5 variables (ischemic time, stroke as the cause of death, donor-to-recipient height ratio, donor left ventricular ejection fraction, glomerular filtration rate) were used for the scoring system. The validation cohort demonstrated a strong correlation between the observed and expected rates of 1-year mortality (r = 0.87). The risk of 1-year mortality increases by 11% (OR 1.11 [1.08 to 1.14]; p < 0.001) in the derivation cohort and 9% (OR 1.09 [1.04 to 1.14]; p = 0.001) in the validation cohort with an increase of 1-point in score. Mortality risk increased 5 times from the lowest to the highest donor score in this cohort. Based on this model, a donor score range of 10 to 28 predicted 1-year recipient mortality of 11% to 31%. This novel pediatric-specific, donor risk scoring system appears capable of predicting post-transplant mortality. Although the PH-DAT may benefit organ allocation and assessment of recipient risk while controlling for donor risk, prospective validation of this model is warranted. Copyright © 2018 International Society for the Heart and Lung Transplantation. Published by Elsevier Inc. All rights reserved.
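The per-point odds ratio reported above (OR 1.11 in the derivation cohort) compounds multiplicatively across points under a logistic model. The sketch below illustrates that arithmetic only; the baseline risk of 10% is a hypothetical value chosen for illustration, and the function names are not from the study.

```python
def odds_multiplier(or_per_point, points):
    """Factor by which the odds change after a given point increase,
    assuming a constant odds ratio per point (logistic-model assumption)."""
    return or_per_point ** points


def shifted_risk(baseline_risk, or_per_point, points):
    """Convert a baseline probability to odds, apply the multiplier,
    and convert back to a probability."""
    odds = baseline_risk / (1 - baseline_risk)
    new_odds = odds * odds_multiplier(or_per_point, points)
    return new_odds / (1 + new_odds)


# A 5-point increase at OR 1.11 per point multiplies the odds by about 1.69.
print(round(odds_multiplier(1.11, 5), 3))
# Hypothetical baseline risk of 10%, shifted by 5 points.
print(round(shifted_risk(0.10, 1.11, 5), 3))
```

Note that an odds ratio acts on odds, not on probabilities, so an 11% increase in odds per point is not the same as an 11% increase in mortality risk.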
Validation of cytogenetic risk groups according to International Prognostic Scoring Systems by peripheral blood CD34+FISH: results from a German diagnostic study in comparison with an international control group

PubMed Central

Braulke, Friederike; Platzbecker, Uwe; Müller-Thomas, Catharina; Götze, Katharina; Germing, Ulrich; Brümmendorf, Tim H.; Nolte, Florian; Hofmann, Wolf-Karsten; Giagounidis, Aristoteles A. N.; Lübbert, Michael; Greenberg, Peter L.; Bennett, John M.; Solé, Francesc; Mallo, Mar; Slovak, Marilyn L.; Ohyashiki, Kazuma; Le Beau, Michelle M.; Tüchler, Heinz; Pfeilstöcker, Michael; Nösslinger, Thomas; Hildebrandt, Barbara; Shirneshan, Katayoon; Aul, Carlo; Stauder, Reinhard; Sperr, Wolfgang R.; Valent, Peter; Fonatsch, Christa; Trümper, Lorenz; Haase, Detlef; Schanz, Julie

2015-01-01

International Prognostic Scoring Systems are used to determine the individual risk profile of myelodysplastic syndrome patients. For the assessment of International Prognostic Scoring Systems, an adequate chromosome banding analysis of the bone marrow is essential. Cytogenetic information is not available for a substantial number of patients (5%–20%) with dry marrow or an insufficient number of metaphase cells. For these patients, a valid risk classification is impossible. In the study presented here, the International Prognostic Scoring Systems were validated based on fluorescence in situ hybridization analyses using extended probe panels applied to cluster of differentiation 34 positive (CD34+) peripheral blood cells of 328 MDS patients of our prospective multicenter German diagnostic study and compared to chromosome banding results of 2902 previously published patients with myelodysplastic syndromes. For cytogenetic risk classification by fluorescence in situ hybridization analyses of CD34+ peripheral blood cells, the groups differed significantly for overall and leukemia-free survival by uni- and multivariate analyses without discrepancies between treated and untreated patients. Including cytogenetic data of fluorescence in situ hybridization analyses of peripheral CD34+ blood cells (instead of bone marrow banding analysis) into the complete International Prognostic Scoring System assessment, the prognostic risk groups separated significantly for overall and leukemia-free survival. Our data show that a reliable stratification to the risk groups of the International Prognostic Scoring Systems is possible from peripheral blood in patients with missing chromosome banding analysis by using a comprehensive probe panel (clinicaltrials.gov identifier:01355913). PMID:25344522
Scoring systems for outcome prediction in patients with perforated peptic ulcer.

PubMed

Thorsen, Kenneth; Søreide, Jon Arne; Søreide, Kjetil

2013-04-10

Patients with perforated peptic ulcer (PPU) often present with acute, severe illness that carries a high risk for morbidity and mortality. Mortality ranges from 3-40% and several prognostic scoring systems have been suggested. The aim of this study was to review the available scoring systems for PPU patients, and to assess whether there is evidence to prefer one to the other. We searched PubMed for the MeSH terms "perforated peptic ulcer", "scoring systems", "risk factors", "outcome prediction", "mortality", "morbidity" and the combinations of these terms. In addition to relevant scores introduced in the past (e.g. Boey score), we included recent studies (published between January 2000 and December 2012) that reported on scoring systems for prediction of morbidity and mortality in PPU patients. A total of ten different scoring systems used to predict outcome in PPU patients were identified: the Boey score, the Hacettepe score, the Jabalpur score, the peptic ulcer perforation (PULP) score, the ASA score, the Charlson comorbidity index, the sepsis score, the Mannheim Peritonitis Index (MPI), the Acute physiology and chronic health evaluation II (APACHE II), the simplified acute physiology score II (SAPS II), the Mortality probability models II (MPM II), the Physiological and Operative Severity Score for the enumeration of Mortality and Morbidity physical sub-score (POSSUM-phys score). Only four of the scores were specifically constructed for PPU patients. In five studies the accuracy of outcome prediction of different scoring systems was evaluated by receiver operating characteristics curve (ROC) analysis, and the corresponding area under the curve (AUC) among studies compared. Considerable variation in performance both between different scores and between different studies was found, with the lowest and highest AUC reported between 0.63 and 0.98, respectively.
While the Boey score and the ASA score are most commonly used to predict outcome for PPU patients, considerable variations in accuracy for outcome prediction were shown. Other scoring systems are hampered by a lack of validation or by their complexity that precludes routine clinical use. While the PULP score seems promising it needs external validation before widespread use.
[Scoring systems in intensive care medicine : principles, models, application and limits].

PubMed

Fleig, V; Brenck, F; Wolff, M; Weigand, M A

2011-10-01

Scoring systems are used in all diagnostic areas of medicine. Several parameters are evaluated and rated with points according to their value in order to simplify a complex clinical situation with a score. The application ranges from the classification of disease severity through determining the number of staff for the intensive care unit (ICU) to the evaluation of new therapies under study conditions. Since the introduction of scoring systems in the 1980s a variety of different score models has been developed. The scoring systems that are employed in intensive care and are discussed in this article can be categorized into prognostic scores, expenses scores and disease-specific scores. Since the introduction of compulsory recording of two scoring systems for accounting in the German diagnosis-related groups (DRG) system, these tools have gained more importance for all intensive care physicians. Problems remain in the valid calculation of scores and interpretation of the results.
Calculating the risk of a pancreatic fistula after a pancreaticoduodenectomy: a systematic review

PubMed Central

Vallance, Abigail E; Young, Alastair L; Macutkiewicz, Christian; Roberts, Keith J; Smith, Andrew M

2015-01-01

Background A post-operative pancreatic fistula (POPF) is a major cause of morbidity and mortality after a pancreaticoduodenectomy (PD). This systematic review aimed to identify all scoring systems to predict POPF after a PD, consider their clinical applicability and assess the study quality. Method An electronic search was performed of Medline (1946–2014) and EMBASE (1996–2014) databases. Results were screened according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, and quality assessed according to the QUIPS (quality in prognostic studies) tool. Results Six eligible scoring systems were identified. Five studies used the International Study Group on Pancreatic Fistula (ISGPF) definition. The proposed scores feature between two and five variables and of the 16 total variables, the majority (12) featured in only one score. Three scores could be fully completed pre-operatively whereas 1 score included intra-operative and two studies post-operative variables. Four scores were internally validated and of these, two scores have been subject to subsequent multicentre review. The median QUIPS score was 38 out of 50 (range 16–50). Conclusion These scores show potential in calculating the individualized patient risk of POPF. There is, however, much variation in current scoring systems and further validation in large multicentre cohorts is now needed. PMID:26456948
Validity of the Japanese Orthopaedic Association scoring system based on patient-reported improvement after posterior lumbar interbody fusion.

PubMed

Fujimori, Takahito; Okuda, Shinya; Iwasaki, Motoki; Yamasaki, Ryoji; Maeno, Takafumi; Yamashita, Tomoya; Matsumoto, Tomiya; Wada, Eiji; Oda, Takenori

2016-06-01

The Japanese Orthopaedic Association (JOA) scoring system is a physician-based outcome that has been used to evaluate treatment effectiveness after lumbar surgery. However, patient-centered evaluation becomes increasingly important. There is no study that has examined the relationship between the JOA scoring system and patients' self-reported improvement. The purpose of the present study was to validate the JOA scoring system for assessment of patient-reported improvement after lumbar surgery. This is a retrospective review of prospectively collected data. The patient sample included 273 mail-in responders of the 466 consecutive patients who underwent posterior lumbar interbody fusion for spondylolisthesis between 1996 and 2008 in a single hospital. The outcome measures were the JOA scoring system and patients' self-reported improvement. Two hundred seventy three patients were divided into five anchoring groups based on self-reported improvement from "Much better" to "Much worse." Outcomes (ie, recovery rate, amount of change from preoperative condition, and postoperative score) based on the JOA scoring system were compared among groups. Using the patient's self-reported improvement scale as an anchor, the association among each of the outcomes was examined. The cutoff point and the area under the curve (AUC) that differentiated "Improved" from "Neither improved nor worse" was calculated using receiver operating characteristic (ROC) curve analysis. The recovery rate and postoperative score were significantly different in 9 of 10 pairs of anchoring groups. The amount of change was significantly different in six pairs. Spearman correlation coefficient for the 5-point scale anchors of patients' self-reported improvement was 0.20 (p=.001) for the baseline score, 0.31 (p<.001) for the amount of change, 0.55 (p<.001) for the recovery rate, and 0.56 (p<.001) for the postoperative score. 
According to ROC analysis, the best cutoff points and AUCs were 13 points and 0.69, respectively, for the amount of change, 67% and 0.73, respectively, for recovery rate, and 23 points and 0.72, respectively, for postoperative score. The JOA scoring system is a valid method for assessment of patients' self-reported improvement. Patients' self-reported improvement is more likely to be associated with the final condition, such as postoperative score or recovery rate, rather than the change from the preoperative condition. Copyright © 2016 Elsevier Inc. All rights reserved.
Multicenter Validation of a Customizable Scoring Tool for Selection of Trainees for a Residency or Fellowship Program. The EAST-IST Study.

PubMed

Bosslet, Gabriel T; Carlos, W Graham; Tybor, David J; McCallister, Jennifer; Huebert, Candace; Henderson, Ashley; Miles, Matthew C; Twigg, Homer; Sears, Catherine R; Brown, Cynthia; Farber, Mark O; Lahm, Tim; Buckley, John D

2017-04-01

Few data have been published regarding scoring tools for selection of postgraduate medical trainee candidates that have wide applicability. The authors present a novel scoring tool developed to assist postgraduate programs in generating an institution-specific rank list derived from selected elements of the U.S. Electronic Residency Application System (ERAS) application. The authors developed and validated an ERAS and interview day scoring tool at five pulmonary and critical care fellowship programs: the ERAS Application Scoring Tool-Interview Scoring Tool. This scoring tool was then tested for intrarater correlation versus subjective rankings of ERAS applications. The process for development of the tool was performed at four other institutions, and it was performed alongside and compared with the "traditional" ranking methods at the five programs and compared with the submitted National Residency Match Program rank list. The ERAS Application Scoring Tool correlated highly with subjective faculty rankings at the primary institution (average Spearman's r = 0.77). The ERAS Application Scoring Tool-Interview Scoring Tool method correlated well with traditional ranking methodology at all five institutions (Spearman's r = 0.54, 0.65, 0.72, 0.77, and 0.84). This study validates a process for selecting and weighting components of the ERAS application and interview day to create a customizable, institution-specific tool for ranking candidates to postgraduate medical education programs. This scoring system can be used in future studies to compare the outcomes of fellowship training.
Measuring Life Stress: A Comparison of the Predictive Validity of Different Scoring Systems for the Social Readjustment Rating Scale.

ERIC Educational Resources Information Center

McGrath, Robert E. V.; Burkhart, Barry R.

1983-01-01

Assessed whether accounting for variables in the scoring of the Social Readjustment Rating Scale (SRRS) would improve the predictive validity of the inventory. Results from 107 sets of questionnaires showed that income and level of education are significant predictors of the capacity to cope with stress. (JAC)

Scoring systems for outcome prediction in patients with perforated peptic ulcer

PubMed Central

2013-01-01

Background Patients with perforated peptic ulcer (PPU) often present with acute, severe illness that carries a high risk for morbidity and mortality. Mortality ranges from 3-40% and several prognostic scoring systems have been suggested. The aim of this study was to review the available scoring systems for PPU patients, and to assess whether there is evidence to prefer one to the other. Material and methods We searched PubMed for the MeSH terms “perforated peptic ulcer”, “scoring systems”, “risk factors”, “outcome prediction”, “mortality”, “morbidity” and the combinations of these terms. In addition to relevant scores introduced in the past (e.g. Boey score), we included recent studies (published between January 2000 and December 2012) that reported on scoring systems for prediction of morbidity and mortality in PPU patients. Results A total of ten different scoring systems used to predict outcome in PPU patients were identified: the Boey score, the Hacettepe score, the Jabalpur score, the peptic ulcer perforation (PULP) score, the ASA score, the Charlson comorbidity index, the sepsis score, the Mannheim Peritonitis Index (MPI), the Acute physiology and chronic health evaluation II (APACHE II), the simplified acute physiology score II (SAPS II), the Mortality probability models II (MPM II), the Physiological and Operative Severity Score for the enumeration of Mortality and Morbidity physical sub-score (POSSUM-phys score). Only four of the scores were specifically constructed for PPU patients. In five studies the accuracy of outcome prediction of different scoring systems was evaluated by receiver operating characteristics curve (ROC) analysis, and the corresponding area under the curve (AUC) among studies compared. Considerable variation in performance both between different scores and between different studies was found, with the lowest and highest AUC reported between 0.63 and 0.98, respectively.
Conclusion While the Boey score and the ASA score are most commonly used to predict outcome for PPU patients, considerable variations in accuracy for outcome prediction were shown. Other scoring systems are hampered by a lack of validation or by their complexity that precludes routine clinical use. While the PULP score seems promising it needs external validation before widespread use. PMID:23574922
A new survival score for patients with brain metastases who received whole-brain radiotherapy (WBRT) alone.

PubMed

Rades, Dirk; Dziggel, Liesa; Nagy, Viorica; Segedin, Barbara; Lohynska, Radka; Veninga, Theo; Khoa, Mai T; Trang, Ngo T; Schild, Steven E

2013-07-01

Survival scores for patients with brain metastasis exist. However, the treatment regimens used to create these scores were heterogeneous. This study aimed to develop and validate a survival score in homogeneously treated patients. Eight hundred eighty-two patients receiving 10 × 3 Gy of WBRT alone were randomly assigned to a test group (N=441) or a validation group (N=441). In the multivariate analysis of the test group, age, performance status, extracranial metastasis, and systemic treatment prior to WBRT were independent predictors of survival. The score for each factor was determined by dividing the 6-month survival rate (in %) by 10. Scores were summed and total scores ranged from 6 to 19 points. Patients were divided into four prognostic groups. The 6-month survival rates were 4% for 6-9 points, 29% for 10-14 points, 62% for 15-17 points, and 93% for 17-18 points (p<0.001) in the test group. The survival rates were 3%, 28%, 54% and 96%, respectively (p<0.001) in the validation group. Since the 6-month survival rates in the validation group were very similar to the test group, this new score (WBRT-30) appears valid and reproducible. It can help in making treatment choices and stratifying patients in future trials. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
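The scoring rule described above (points per factor = 6-month survival rate in % divided by 10, summed across the four factors) can be sketched as follows. The survival percentages are hypothetical placeholders, not the study's values, and rounding each quotient to the nearest integer is an assumption, since the abstract does not state how fractions were handled.

```python
def factor_points(six_month_survival_pct):
    """Points for one prognostic factor level: survival rate (%) / 10.
    Rounding to the nearest integer is an assumption."""
    return round(six_month_survival_pct / 10)


def total_score(survival_pcts):
    """Sum the points across all prognostic factors for one patient."""
    return sum(factor_points(p) for p in survival_pcts)


# Hypothetical 6-month survival rates (%) for one patient's factor levels.
patient = {
    "age": 42,
    "performance_status": 51,
    "extracranial_metastasis": 38,
    "prior_systemic_treatment": 27,
}

print(total_score(patient.values()))  # 4 + 5 + 4 + 3
```

Deriving points directly from observed survival rates keeps the score easy to compute by hand, at the cost of tying the weights to the survival distribution of the derivation sample.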
Reliability and Validity of the Musculoskeletal Tumor Society Scoring System for the Upper Extremity in Japanese Patients.

PubMed

Uehara, Kosuke; Ogura, Koichi; Akiyama, Toru; Shinoda, Yusuke; Iwata, Shintaro; Kobayashi, Eisuke; Tanzawa, Yoshikazu; Yonemoto, Tsukasa; Kawano, Hirotaka; Kawai, Akira

2017-09-01

The Musculoskeletal Tumor Society (MSTS) scoring system developed in 1993 is a widely used disease-specific evaluation tool for assessment of physical function in patients with musculoskeletal tumors; however, only a few studies have confirmed its reliability and validity. The aim of this study was to validate the MSTS scoring system for the upper extremity (MSTS-UE) in Japanese patients with musculoskeletal tumors for use by others in research. Does the MSTS-UE have: (1) sufficient reliability and internal consistency; (2) adequate construct validity; and (3) reasonable criterion validity in comparison to the Toronto Extremity Salvage Score (TESS) or SF-36? Reliability was performed using test-retest analysis, and internal consistency was evaluated with Cronbach's alpha coefficient. Construct validity was evaluated using a scree plot to confirm the construct number and the Akaike information criterion network. Criterion validity was evaluated by comparing the MSTS-UE with the TESS and SF-36. The test-retest reliability with intraclass correlation coefficient (0.95; 95% CI, 0.91-0.97) was excellent, and internal consistency with Cronbach's α (0.7; 95% CI, 0.53-0.81) was acceptable. There were no ceiling and floor effects. The Akaike Information Criterion network showed that lifting ability, pain, and dexterity played central roles among the components. The MSTS-UE showed substantial correlation with the TESS scoring scale (r = 0.75; p < 0.001) and fair correlation with the SF-36 physical component summary (r = 0.37; p = 0.007). Although the MSTS-UE showed slight correlation with the SF-36 mental component summary, the emotional acceptance component of the MSTS-UE showed fair correlation (r = 0.29; p = 0.039). We can conclude that the MSTS is not an adequate measure of general health-related quality of life; however, this system was designed mainly to be a simple measure of function in a single extremity. 
To evaluate the mental state of patients with musculoskeletal tumors in the upper extremity, further study is needed.
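Internal consistency in the abstract above is reported as Cronbach's alpha. A minimal sketch of the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), is shown below. The item scores are hypothetical, and the sketch makes no claim about the software the authors used.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per questionnaire item, all of equal
    length (one value per patient). Uses sample variances throughout."""
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]  # per-patient total score
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


# Hypothetical scores for 3 items across 4 patients.
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]

print(round(cronbach_alpha(items), 3))
```

Alpha approaches 1 when items rise and fall together across patients, which is why values around 0.7 are commonly described as acceptable.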
Validation of the FAMACHA© system in South American camelids.

PubMed

Storey, Bobby E; Williamson, Lisa H; Howell, Sue B; Terrill, Thomas H; Berghaus, Roy; Vidyashankar, Anand N; Kaplan, Ray M

2017-08-30

Haemonchus contortus resistant to multiple anthelmintics threatens the viability of the small ruminant industry in areas where this parasite is prevalent. In response to this situation, the FAMACHA© system was developed and validated for use with small ruminants as a way to detect clinical anemia associated with haemonchosis. Given that H. contortus and multiple anthelmintic resistance is a similar problem in camelids, the FAMACHA© system might also provide the same benefits. To address this need, a validation study of the FAMACHA© system was conducted on 21 alpaca and llama farms over a 2-year period. H. contortus was the predominant nematode parasite on 17 of the 21 farms (10 alpaca and 7 llama farms) enrolled in the study, based on fecal culture results. The FAMACHA© card was used to score the color of the lower palpebral (lower eyelid) conjunctiva on a 1-5 scale. Packed cell volume (PCV) values were measured and compared to FAMACHA© scores using FAMACHA© score cutoffs of ≥3 or ≥4 and with anemia defined as a PCV ≤15%, ≤17%, or ≤20%. PCV was significantly associated with FAMACHA© score, fecal egg count (FEC), and body condition score (BCS), regardless of the FAMACHA© cutoff score or the PCV% chosen to define clinical anemia (p<0.01 in all cases). The use of FAMACHA© scores ≥3 and PCV ≤15% indicating anemia provided the best sensitivity (96.4% vs 92.9% for FAMACHA© ≥4), whereas FAMACHA© scores ≥4 and PCV ≤20% provided the best specificity (94.2% vs 69.1% for FAMACHA© ≥3). The data from this study support the FAMACHA© system as a useful tool for detecting clinical anemia in camelids suffering from haemonchosis. Parameters for making treatment decisions based on FAMACHA© score in camelids should mirror those established for small ruminants. Copyright © 2017 Elsevier B.V. All rights reserved.
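The trade-off between score cutoff and anemia definition described above can be illustrated with a small sensitivity and specificity calculation. The animal records below are hypothetical; only the cutoff structure (a test is positive at or above a FAMACHA© score cutoff, and anemia is a PCV at or below a threshold, as defined earlier in the abstract) follows the text.

```python
def sens_spec(records, score_cutoff, pcv_cutoff):
    """records: (famacha_score, pcv_percent) pairs.
    Test positive: score >= score_cutoff. Anemic: PCV <= pcv_cutoff.
    Returns (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for score, pcv in records:
        anemic = pcv <= pcv_cutoff
        positive = score >= score_cutoff
        if anemic and positive:
            tp += 1
        elif anemic:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical (score, PCV %) observations for eight animals.
animals = [(1, 30), (2, 28), (3, 22), (3, 14), (4, 13), (5, 10), (2, 15), (4, 21)]

print(sens_spec(animals, score_cutoff=3, pcv_cutoff=15))
print(sens_spec(animals, score_cutoff=4, pcv_cutoff=20))
```

In this toy data, as in the study, the lower score cutoff catches more anemic animals, while the higher cutoff produces fewer false positives.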
Measurement of COPD Severity Using a Survey-Based Score

PubMed Central

Omachi, Theodore A.; Katz, Patricia P.; Yelin, Edward H.; Iribarren, Carlos; Blanc, Paul D.

2010-01-01

Background: A comprehensive survey-based COPD severity score has usefulness for epidemiologic and health outcomes research. We previously developed and validated the survey-based COPD Severity Score without using lung function or other physiologic measurements. In this study, we aimed to further validate the severity score in a different COPD cohort and using a combination of patient-reported and objective physiologic measurements. Methods: Using data from the Function, Living, Outcomes, and Work cohort study of COPD, we evaluated the concurrent and predictive validity of the COPD Severity Score among 1,202 subjects. The survey instrument is a 35-point score based on symptoms, medication and oxygen use, and prior hospitalization or intubation for COPD. Subjects were systemically assessed using structured telephone survey, spirometry, and 6-min walk testing. Results: We found evidence to support concurrent validity of the score. Higher COPD Severity Score values were associated with poorer FEV1 (r = −0.38), FEV1% predicted (r = −0.40), Body mass, Obstruction, Dyspnea, Exercise Index (r = 0.57), and distance walked in 6 min (r = −0.43) (P < .0001 in all cases). Greater COPD severity was also related to poorer generic physical health status (r = −0.49) and disease-specific health-related quality of life (r = 0.57) (P < .0001). The score also demonstrated predictive validity. It was also associated with a greater prospective risk of acute exacerbation of COPD defined as ED visits (hazard ratio [HR], 1.31; 95% CI, 1.24-1.39), hospitalizations (HR, 1.59; 95% CI, 1.44-1.75), and either measure of hospital-based care for COPD (HR, 1.34; 95% CI, 1.26-1.41) (P < .0001 in all cases). Conclusion: The COPD Severity Score is a valid survey-based measure of disease-specific severity, both in terms of concurrent and predictive validity. The score is a psychometrically sound instrument for use in epidemiologic and outcomes research in COPD. PMID:20040611
Is Effective Teaching Stable?

ERIC Educational Resources Information Center

Patrick, Helen; Mantzicopoulos, Panayota

2016-01-01

The authors examined the ecological validity of using observation-based scores to evaluate individual teachers' effectiveness, mirroring their use by school administrators. Using the Classroom Assessment Scoring System, the authors asked (a) how similar are teachers' emotional support, classroom organization, and instructional support scores from…
Prediction of Waitlist Mortality in Adult Heart Transplant Candidates: The Candidate Risk Score.

PubMed

Jasseron, Carine; Legeai, Camille; Jacquelinet, Christian; Leprince, Pascal; Cantrelle, Christelle; Audry, Benoît; Porcher, Raphael; Bastien, Olivier; Dorent, Richard

2017-09-01

The cardiac allocation system in France is currently based on urgency and geography. Medical urgency is defined by therapies without considering objective patient mortality risk factors. This study aimed to develop a waitlist mortality risk score from commonly available candidate variables. The study included all patients, aged 16 years or older, registered on the national registry CRISTAL for first single-organ heart transplantation between January 2010 and December 2014. This population was randomly divided in a 2:1 ratio into derivation and validation cohorts. The association of variables at listing with 1-year waitlist death or delisting for worsening medical condition was assessed within the derivation cohort. The predictors were used to generate a candidate risk score (CRS). Validation of the CRS was performed in the validation cohort. Concordance probability estimation (CPE) was used to evaluate the discriminative capacity of the models. During the study period, 2333 patients were newly listed. The derivation (n = 1555) and the validation cohorts (n = 778) were similar. Short-term mechanical circulatory support, natriuretic peptide decile, glomerular filtration rate, and total bilirubin level were included in a simplified model and incorporated into the score. The CPE of the CRS was 0.73 in the derivation cohort and 0.71 in the validation cohort. The correlation between observed and expected 1-year waitlist mortality in the validation cohort was 0.87. The candidate risk score provides an accurate objective prediction of waitlist mortality. It is currently being used to develop a modified cardiac allocation system in France.
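The random 2:1 division into derivation and validation cohorts described above can be sketched as a shuffled split. The patient identifiers, the fixed seed, and the choice to round the derivation cohort size down are assumptions for illustration; the abstract does not describe the randomization procedure.

```python
import random


def split_cohort(ids, derivation_fraction=2 / 3, seed=0):
    """Shuffle patient identifiers and split them into derivation and
    validation cohorts. The seed is fixed only for reproducibility."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_derivation = int(len(shuffled) * derivation_fraction)  # rounds down
    return shuffled[:n_derivation], shuffled[n_derivation:]


# 2333 hypothetical patient identifiers, matching the cohort size reported.
derivation, validation = split_cohort(range(2333))
print(len(derivation), len(validation))
```

With 2333 patients, a two-thirds split rounded down gives cohort sizes equal to those reported in the abstract.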
Prediction of outcome in asphyxiated newborns treated with hypothermia: Is a MRI scoring system described before the cooling era still useful?

PubMed

Al Amrani, Fatema; Marcovitz, Jaclyn; Sanon, Priscille-Nice; Khairy, May; Saint-Martin, Christine; Shevell, Michael; Wintermark, Pia

2018-05-01

To determine whether an MRI scoring system, which was validated in the pre-cooling era, can still predict the neurodevelopmental outcome of asphyxiated newborns treated with hypothermia at 2 years of age. We conducted a retrospective cohort study of asphyxiated newborns treated with hypothermia. An MRI scoring system, which was validated in the pre-cooling era, was used to grade the severity of brain injury on the neonatal brain MRI. Their neurodevelopment was assessed around 2 years of age; adverse outcome included cerebral palsy, global developmental delay, and/or epilepsy. One hundred and sixty-nine newborns were included. Among the 131 newborns who survived and had a brain MRI during the neonatal period, 92% were evaluated around 2 years of age or later. Of these newborns, 37% displayed brain injury, and 23% developed an adverse outcome. Asphyxiated newborns treated with hypothermia who had an adverse outcome had a significantly higher MRI score (p <0.001) compared to those without an adverse outcome. An MRI scoring system that was validated before the cooling era is still able to reliably differentiate which of the asphyxiated newborns treated with hypothermia were more prone to develop an adverse outcome around 2 years of age. Copyright © 2018 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.
A novel early risk assessment tool for detecting clinical outcomes in patients with heat-related illness (J-ERATO score): Development and validation in independent cohorts in Japan.

PubMed

Hayashida, Kei; Kondo, Yutaka; Hifumi, Toru; Shimazaki, Junya; Oda, Yasutaka; Shiraishi, Shinichiro; Fukuda, Tatsuma; Sasaki, Junichi; Shimizu, Keiki

2018-01-01

We sought to develop a novel risk assessment tool to predict the clinical outcomes after heat-related illness. This was a prospective, multicenter observational study. Patients who were transferred to emergency hospitals in Japan with heat-related illness were registered. The sample was divided into two parts: 60% to construct the score and 40% to validate it. A binary logistic regression model was used to predict hospital admission as a primary outcome. The resulting model was transformed into a scoring system. A total of 3,001 eligible patients were analyzed. There was no difference in variables between development and validation cohorts. Based on the result of a logistic regression model in the development phase (n = 1,805), the J-ERATO score was defined as the sum of the six binary components in the prehospital setting (respiratory rate ≥22/min, Glasgow coma scale <15, systolic blood pressure ≤100 mmHg, heart rate ≥100 bpm, body temperature ≥38°C, and age ≥65 y), for a total score ranging from 0 to 6. In the validation phase (n = 1,196), the score had excellent discrimination (C-statistic 0.84; 95% CI 0.79-0.89, P<0.0001) and calibration (P>0.2 by Hosmer-Lemeshow test). The observed proportion of hospital admission increased with increasing J-ERATO score (score = 0, 5.0%; score = 1, 15.0%; score = 2, 24.6%; score = 3, 38.6%; score = 4, 68.0%; score = 5, 85.2%; score = 6, 96.4%). Multivariate analyses showed that the J-ERATO score was an independent positive predictor of hospital admission (adjusted OR, 2.43; 95% CI, 2.06-2.87; P<0.001), intensive care unit (ICU) admission (3.73; 2.95-4.72; P<0.001) and in-hospital mortality (1.65; 1.18-2.32; P = 0.004). The J-ERATO score is simple to assess and can facilitate the identification of patients at higher risk of heat-related hospitalization. This scoring system is also significantly associated with a higher likelihood of ICU admission and in-hospital mortality after heat-related hospitalization.
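Because the J-ERATO score is described as a plain sum of six binary prehospital components, it can be sketched in a few lines. The thresholds and the observed admission proportions below are copied from the abstract; the class name, field names, and function name are hypothetical, and this is an illustration rather than a clinical tool.

```python
from dataclasses import dataclass


@dataclass
class PrehospitalVitals:
    respiratory_rate: float   # breaths per minute
    glasgow_coma_scale: int   # 3 to 15
    systolic_bp: float        # mmHg
    heart_rate: float         # beats per minute
    body_temperature: float   # degrees Celsius
    age: float                # years


def j_erato_score(v: PrehospitalVitals) -> int:
    """Sum of the six binary components listed in the abstract (range 0 to 6)."""
    components = [
        v.respiratory_rate >= 22,
        v.glasgow_coma_scale < 15,
        v.systolic_bp <= 100,
        v.heart_rate >= 100,
        v.body_temperature >= 38.0,
        v.age >= 65,
    ]
    return sum(components)


# Observed proportion of hospital admission (%) by score in the validation phase,
# as reported in the abstract.
OBSERVED_ADMISSION_PCT = {0: 5.0, 1: 15.0, 2: 24.6, 3: 38.6, 4: 68.0, 5: 85.2, 6: 96.4}

patient = PrehospitalVitals(24, 14, 95, 110, 39.0, 70)
score = j_erato_score(patient)
print(score, OBSERVED_ADMISSION_PCT[score])
```

The lookup table reports observed proportions in the validation cohort; it is not a calibrated probability model.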
Validity Evidence and Scoring Guidelines for Standardized Patient Encounters and Patient Notes From a Multisite Study of Clinical Performance Examinations in Seven Medical Schools.

PubMed

Park, Yoon Soo; Hyderi, Abbas; Heine, Nancy; May, Win; Nevins, Andrew; Lee, Ming; Bordage, Georges; Yudkowsky, Rachel

2017-11-01

To examine validity evidence of local graduation competency examination scores from seven medical schools using shared cases and to provide rater training protocols and guidelines for scoring patient notes (PNs). Between May and August 2016, clinical cases were developed, shared, and administered across seven medical schools (990 students participated). Raters were calibrated using training protocols, and guidelines were developed collaboratively across sites to standardize scoring. Data included scores from standardized patient encounters for history taking, physical examination, and PNs. Descriptive statistics were used to examine scores from the different assessment components. Generalizability studies (G-studies) using variance components were conducted to estimate reliability for composite scores. Validity evidence was collected for response process (rater perception), internal structure (variance components, reliability), relations to other variables (interassessment correlations), and consequences (composite score). Student performance varied by case and task. In the PNs, justification of differential diagnosis was the most discriminating task. G-studies showed that schools accounted for less than 1% of total variance; however, for the PNs, there were differences in scores for varying cases and tasks across schools, indicating a school effect. Composite score reliability was maximized when the PN was weighted between 30% and 40%. Raters preferred using case-specific scoring guidelines with clear point-scoring systems. This multisite study presents validity evidence for PN scores based on scoring rubric and case-specific scoring guidelines that offer rigor and feedback for learners. Variability in PN scores across participating sites may signal different approaches to teaching clinical reasoning among medical schools.
Validation of cytogenetic risk groups according to International Prognostic Scoring Systems by peripheral blood CD34+FISH: results from a German diagnostic study in comparison with an international control group.

PubMed

Braulke, Friederike; Platzbecker, Uwe; Müller-Thomas, Catharina; Götze, Katharina; Germing, Ulrich; Brümmendorf, Tim H; Nolte, Florian; Hofmann, Wolf-Karsten; Giagounidis, Aristoteles A N; Lübbert, Michael; Greenberg, Peter L; Bennett, John M; Solé, Francesc; Mallo, Mar; Slovak, Marilyn L; Ohyashiki, Kazuma; Le Beau, Michelle M; Tüchler, Heinz; Pfeilstöcker, Michael; Nösslinger, Thomas; Hildebrandt, Barbara; Shirneshan, Katayoon; Aul, Carlo; Stauder, Reinhard; Sperr, Wolfgang R; Valent, Peter; Fonatsch, Christa; Trümper, Lorenz; Haase, Detlef; Schanz, Julie

2015-02-01

International Prognostic Scoring Systems are used to determine the individual risk profile of myelodysplastic syndrome patients. For the assessment of International Prognostic Scoring Systems, an adequate chromosome banding analysis of the bone marrow is essential. Cytogenetic information is not available for a substantial number of patients (5%-20%) with dry marrow or an insufficient number of metaphase cells. For these patients, a valid risk classification is impossible. In the study presented here, the International Prognostic Scoring Systems were validated based on fluorescence in situ hybridization analyses using extended probe panels applied to cluster of differentiation 34 positive (CD34(+)) peripheral blood cells of 328 MDS patients of our prospective multicenter German diagnostic study and compared to chromosome banding results of 2902 previously published patients with myelodysplastic syndromes. For cytogenetic risk classification by fluorescence in situ hybridization analyses of CD34(+) peripheral blood cells, the groups differed significantly for overall and leukemia-free survival by uni- and multivariate analyses without discrepancies between treated and untreated patients. Including cytogenetic data of fluorescence in situ hybridization analyses of peripheral CD34(+) blood cells (instead of bone marrow banding analysis) into the complete International Prognostic Scoring System assessment, the prognostic risk groups separated significantly for overall and leukemia-free survival. Our data show that a reliable stratification to the risk groups of the International Prognostic Scoring Systems is possible from peripheral blood in patients with missing chromosome banding analysis by using a comprehensive probe panel (clinicaltrials.gov identifier:01355913). Copyright© Ferrata Storti Foundation.
Validation of measures from the smartphone sway balance application: a pilot study.

PubMed

Patterson, Jeremy A; Amick, Ryan Z; Thummar, Tarunkumar; Rogers, Michael E

2014-04-01

A number of different balance assessment techniques are currently available and widely used. These include both subjective and objective assessments. The ability to provide quantitative measures of balance and posture is the benefit of objective tools; however, these instruments are not generally utilized outside of research laboratory settings due to cost, complexity of operation, size, duration of assessment, and general practicality. The purpose of this pilot study was to assess the value and validity of using software developed to access iPod and iPhone accelerometer output and translate it into a measurement of human balance. Thirty healthy college-aged individuals (13 male, 17 female; age = 26.1 ± 8.5 years) volunteered. Participants performed a static Athlete's Single Leg Test protocol for 10 sec on a Biodex Balance System SD while concurrently utilizing a mobile device with balance software. Anterior/posterior stability was recorded using both devices, described as the displacement in degrees from level, and was termed the "balance score." There were no significant differences between the two reported balance scores (p = 0.818). Mean balance score on the balance platform was 1.41 ± 0.90, as compared to 1.38 ± 0.72 using the mobile device. There is a need for a valid, convenient, and cost-effective tool to objectively measure balance. Results of this study are promising, as balance scores derived from the smartphone accelerometers were consistent with balance scores obtained from a previously validated balance system. However, further investigation is necessary, as this version of the mobile software only assessed balance in the anterior/posterior direction. Additionally, further testing is necessary in healthy populations as well as in those with impairment of the motor control system. Level 2b (observational study of validity).
Gene network inherent in genomic big data improves the accuracy of prognostic prediction for cancer patients.

PubMed

Kim, Yun Hak; Jeong, Dae Cheon; Pak, Kyoungjune; Goh, Tae Sik; Lee, Chi-Seung; Han, Myoung-Eun; Kim, Ji-Young; Liangwen, Liu; Kim, Chi Dae; Jang, Jeon Yeob; Cha, Wonjae; Oh, Sae-Ock

2017-09-29

Accurate prediction of prognosis is critical for therapeutic decisions regarding cancer patients. Many previously developed prognostic scoring systems have limitations in reflecting recent progress in the field of cancer biology such as microarray, next-generation sequencing, and signaling pathways. To develop a new prognostic scoring system for cancer patients, we used mRNA expression and clinical data in various independent breast cancer cohorts (n=1214) from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) and Gene Expression Omnibus (GEO). A new prognostic score that reflects gene network inherent in genomic big data was calculated using Network-Regularized high-dimensional Cox-regression (Net-score). We compared its discriminatory power with those of two previously used statistical methods: stepwise variable selection via univariate Cox regression (Uni-score) and Cox regression via Elastic net (Enet-score). The Net scoring system showed better discriminatory power in prediction of disease-specific survival (DSS) than other statistical methods (p=0 in METABRIC training cohort, p=0.000331, 4.58e-06 in two METABRIC validation cohorts) when accuracy was examined by log-rank test. Notably, comparison of C-index and AUC values in receiver operating characteristic analysis at 5 years showed fewer differences between training and validation cohorts with the Net scoring system than other statistical methods, suggesting minimal overfitting. The Net-based scoring system also successfully predicted prognosis in various independent GEO cohorts with high discriminatory power. In conclusion, the Net-based scoring system showed better discriminative power than previous statistical methods in prognostic prediction for breast cancer patients. This new system will mark a new era in prognosis prediction for cancer patients.
Gene network inherent in genomic big data improves the accuracy of prognostic prediction for cancer patients

PubMed Central

Kim, Yun Hak; Jeong, Dae Cheon; Pak, Kyoungjune; Goh, Tae Sik; Lee, Chi-Seung; Han, Myoung-Eun; Kim, Ji-Young; Liangwen, Liu; Kim, Chi Dae; Jang, Jeon Yeob; Cha, Wonjae; Oh, Sae-Ock

2017-01-01

Accurate prediction of prognosis is critical for therapeutic decisions regarding cancer patients. Many previously developed prognostic scoring systems have limitations in reflecting recent progress in the field of cancer biology such as microarray, next-generation sequencing, and signaling pathways. To develop a new prognostic scoring system for cancer patients, we used mRNA expression and clinical data in various independent breast cancer cohorts (n=1214) from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) and Gene Expression Omnibus (GEO). A new prognostic score that reflects gene network inherent in genomic big data was calculated using Network-Regularized high-dimensional Cox-regression (Net-score). We compared its discriminatory power with those of two previously used statistical methods: stepwise variable selection via univariate Cox regression (Uni-score) and Cox regression via Elastic net (Enet-score). The Net scoring system showed better discriminatory power in prediction of disease-specific survival (DSS) than other statistical methods (p=0 in METABRIC training cohort, p=0.000331, 4.58e-06 in two METABRIC validation cohorts) when accuracy was examined by log-rank test. Notably, comparison of C-index and AUC values in receiver operating characteristic analysis at 5 years showed fewer differences between training and validation cohorts with the Net scoring system than other statistical methods, suggesting minimal overfitting. The Net-based scoring system also successfully predicted prognosis in various independent GEO cohorts with high discriminatory power. In conclusion, the Net-based scoring system showed better discriminative power than previous statistical methods in prognostic prediction for breast cancer patients. This new system will mark a new era in prognosis prediction for cancer patients. PMID:29100405
Improving results for carotid artery stenting by validation of the anatomic scoring system for carotid artery stenting with patient-specific simulated rehearsal.

PubMed

Willaert, Willem I M; Cheshire, Nicholas J; Aggarwal, Rajesh; Van Herzeele, Isabelle; Stansby, Gerard; Macdonald, Sumaira; Vermassen, Frank E

2012-12-01

Carotid artery stenting (CAS) is a technically demanding procedure with a risk of periprocedural stroke. A scoring system based on anatomic criteria has been developed to facilitate patient selection for CAS. Advancements in simulation science also enable case evaluation through patient-specific virtual reality (VR) rehearsal on an endovascular simulator. This study aimed to validate the anatomic scoring system for CAS using the patient-specific VR technology. Three patients were selected and graded according to the CAS scoring system (maximum score, 9): one easy (score, <4.9), one intermediate (score, 5.0-5.9), and one difficult (score, >7.0). The three cases were performed on the simulator in random order by 20 novice interventionalists pretrained in CAS. Technical performances were assessed using simulator-based metrics and expert-based ratings. The interventionalists took significantly longer to perform the difficult CAS case (median, 31.6 vs 19.7 vs 14.6 minutes; P<.0001) compared with the intermediate and easy cases; similarly, more fluoroscopy time (20.7 vs 12.1 vs 8.2 minutes; P<.0001), contrast volume (56.5 vs 51.5 vs 50.0 mL; P=.0060), and roadmaps (10 vs 9 vs 9; P=.0040) were used. The quality of performance declined significantly as the cases became more challenging (score, 24 vs 22 vs 19; P<.0001). The anatomic scoring system for CAS can predict the difficulty of a CAS procedure as measured by patient-specific VR. This scoring system, with or without the additional use of patient-specific VR, can guide novice interventionalists in selecting appropriate patients for CAS. This may reduce the perioperative stroke risk and enhance patient safety. Copyright © 2012 Society for Vascular Surgery. Published by Mosby, Inc. All rights reserved.
Construct and face validity of the American College of Surgeons/Association of Program Directors in Surgery laparoscopic troubleshooting team training exercise.

PubMed

Arain, Nabeel A; Hogg, Deborah C; Gala, Rajiv B; Bhoja, Ravi; Tesfay, Seifu T; Webb, Erin M; Scott, Daniel J

2012-01-01

Our aim was to develop an objective scoring system and evaluate construct and face validity for a laparoscopic troubleshooting team training exercise. Surgery and gynecology novices (n = 14) and experts (n = 10) participated. Assessments included the following: time-out, scenario decision making (SDM) score (based on essential treatments rendered and completion time), operating room communication assessment (investigator developed), line operations safety audits (teamwork), and National Aeronautics and Space Administration-Task Load Index (workload). Significant differences were detected for SDM scores for scenarios 1 (192 vs 278; P = .01) and 3 (129 vs 225; P = .004), operating room communication assessment (67 vs 91; P = .002), and line operations safety audits (58 vs 87; P = .001), but not for time-out (46 vs 51) or scenario 2 SDM score (301 vs 322). Workload was similar for both groups and face validity (8.8 on a 10-point scale) was strongly supported. Objective decision-making scoring for 2 of 3 scenarios and communication and teamwork ratings showed construct validity. Face validity and participant feedback were excellent. Copyright © 2012 Elsevier Inc. All rights reserved.
Validation of a Predictive Scoring System for Deep Sternal Wound Infection after Bilateral Internal Thoracic Artery Grafting in a Cohort of French Patients.

PubMed

Perrotti, Andrea; Gatti, Giuseppe; Dorigo, Enrica; Sinagra, Gianfranco; Pappalardo, Aniello; Chocron, Sidney

The Gatti score is a weighted scoring system based on risk factors for deep sternal wound infection (DSWI) that was created in an Italian center to predict DSWI risk after bilateral internal thoracic artery (BITA) grafting. No external evaluation based on validation samples derived from other surgical centers has been performed. The aim of this study is to perform this validation. During 2015, BITA grafts were used as skeletonized conduits in all 255 consecutive patients with multi-vessel coronary disease who underwent isolated coronary bypass surgery at the Department of Thoracic and Cardio-Vascular Surgery, University Hospital Jean Minjoz, Besançon, France. Baseline characteristics, operative data, and immediate outcomes of every patient were collected prospectively. A DSWI risk score was assigned to each patient pre-operatively. The discriminative power of both models of the Gatti score, pre-operative and combined, was assessed by calculating the area under the receiver operating characteristic curve. Fourteen (5.5%) patients had DSWI. Major differences in both baseline patient characteristics and surgical techniques were found between this series and the original series from which the Gatti score was derived. The area under the receiver operating characteristic curve was 0.78 (95% confidence interval: 0.64-0.92) for the pre-operative model and 0.84 (95% confidence interval: 0.69-0.98) for the combined model. The Gatti score has proven to be effective even in a cohort of French patients despite major differences from the original Italian series. Multi-center validation studies must be performed before introducing the score into clinical practice.
Reliability and Construct Validity of the Patient-Reported Outcomes Measurement Information System (PROMIS) Instruments in Women with Fibromyalgia.

PubMed

Merriwether, Ericka N; Rakel, Barbara A; Zimmerman, Miriam B; Dailey, Dana L; Vance, Carol G T; Darghosian, Leon; Golchha, Meenakshi; Geasland, Katherine M; Chimenti, Ruth; Crofford, Leslie J; Sluka, Kathleen A

2017-08-01

The Patient-Reported Outcomes Measurement Information System (PROMIS) was developed to standardize measurement of clinically relevant patient-reported outcomes. This study evaluated the reliability and construct validity of select PROMIS static short-form (SF) instruments in women with fibromyalgia. Analysis of baseline data from the Fibromyalgia Activity Study with TENS (FAST), a randomized controlled trial of the efficacy of transcutaneous electrical nerve stimulation. Dual site, university-based outpatient clinics. Women aged 20 to 67 years diagnosed with fibromyalgia. Participants completed the Revised Fibromyalgia Impact Questionnaire (FIQR) and 10 PROMIS static SF instruments. Internal consistency was calculated using Cronbach alpha. Convergent validity was examined against the FIQR using Pearson correlation and multiple regression analysis. PROMIS static SF instruments had fair to high internal consistency (Cronbach α = 0.58 to 0.94, P < 0.05). PROMIS 'physical function' domain score was highly correlated with FIQR 'function' score (r = -0.73). The PROMIS 'total' score was highly correlated with the FIQR total score (r = -0.72). Correlations with FIQR total score of each of the three PROMIS domain scores were r = -0.65 for 'physical function,' r = -0.63 for 'global,' and r = -0.57 for 'symptom' domain. PROMIS 'physical function,' 'global,' and 'symptom' scores explained 58% of the FIQR total score variance. Select PROMIS static SF instruments demonstrate convergent validity with the FIQR, a legacy measure of fibromyalgia disease severity. These results highlight the potential utility of select PROMIS static SFs for assessment and tracking of patient-reported outcomes in fibromyalgia. © 2016 American Academy of Pain Medicine. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
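The internal consistency statistic used in this abstract, Cronbach's alpha, follows the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The sketch below applies that formula to made-up item scores; the function name and the toy data are assumptions and have no connection to the PROMIS or FIQR data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per item, all of equal length
    (one entry per respondent). Sample variances (n - 1) are used throughout.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    totals = [sum(responses) for responses in zip(*items)]  # total score per respondent
    sum_item_variances = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Toy data: two items answered by four respondents (illustrative values only).
toy_items = [[1, 2, 3, 4], [2, 1, 4, 3]]
print(cronbach_alpha(toy_items))
```

Alpha rises as items covary more strongly relative to their individual variances, which is why it is read as a measure of internal consistency.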
Summed and Weighted Summary Scores for the Medsger Disease Severity Scale Compared with the Physician's Global Assessment of Disease Severity in Systemic Sclerosis.

PubMed

Harel, Daphna; Hudson, Marie; Iliescu, Alexandra; Baron, Murray; Steele, Russell

2016-08-01

To develop a weighted summary score for the Medsger Disease Severity Scale (DSS) and to compare its measurement properties with those of a summed DSS score and a physician's global assessment (PGA) of severity score in systemic sclerosis (SSc). Data from 875 patients with SSc enrolled in a multisite observational research cohort were extracted from a central database. Item response theory was used to estimate weights for the DSS weighted score. Intraclass correlation coefficients (ICC) and convergent, discriminative, and predictive validity of the 3 summary measures in relation to patient-reported outcomes (PRO) and mortality were compared. Mean PGA was 2.69 (SD 2.16, range 0-10), mean DSS summed score was 8.60 (SD 4.02, range 0-36), and mean DSS weighted score was 8.11 (SD 4.05, range 0-36). ICC were similar for all 3 measures [PGA 6.9%, 95% credible intervals (CrI) 2.1-16.2; DSS summed score 2.5%, 95% CrI 0.4-6.7; DSS weighted score 2.0%, 95% CrI 0.1-5.6]. Convergent and discriminative validity of the 3 measures for PRO were largely similar. In Cox proportional hazards models adjusting for age and sex, the 3 measures had similar predictive ability for mortality (adjusted R(2) 13.9% for PGA, 12.3% for DSS summed score, and 10.7% DSS weighted score). The 3 summary scores appear valid and perform similarly. However, there were some concerns with the weights computed for individual DSS scales, with unexpected low weights attributed to lung, heart, and kidney, leading the PGA to be the preferred measure at this time. Further work refining the DSS could improve the measurement properties of the DSS summary scores.
External Validity of a Risk Stratification Score Predicting Early Distant Brain Failure and Salvage Whole Brain Radiation Therapy After Stereotactic Radiosurgery for Brain Metastases.

PubMed

Press, Robert H; Boselli, Danielle M; Symanowski, James T; Lankford, Scott P; McCammon, Robert J; Moeller, Benjamin J; Heinzerling, John H; Fasola, Carolina E; Burri, Stuart H; Patel, Kirtesh R; Asher, Anthony L; Sumrall, Ashley L; Curran, Walter J; Shu, Hui-Kuo G; Crocker, Ian R; Prabhu, Roshan S

2017-07-01

A scoring system using pretreatment factors was recently published for predicting the risk of early (≤6 months) distant brain failure (DBF) and salvage whole brain radiation therapy (WBRT) after stereotactic radiosurgery (SRS) alone. Four risk factors were identified: (1) lack of prior WBRT; (2) melanoma or breast histologic features; (3) multiple brain metastases; and (4) total volume of brain metastases <1.3 cm³, with each factor assigned 1 point. The purpose of this study was to assess the validity of this scoring system and its appropriateness for clinical use in an independent external patient population. We reviewed the records of 247 patients with 388 brain metastases treated with SRS between 2010 and 2013 at Levine Cancer Institute. The Press (Emory) risk score was calculated and applied to the validation cohort population, and subsequent risk groups were analyzed using cumulative incidence. The low-risk (LR) group had a significantly lower risk of early DBF than did the high-risk (HR) group (22.6% vs 44%, P=.004), but there was no difference between the HR and intermediate-risk (IR) groups (41.2% vs 44%, P=.79). Total lesion volume <1.3 cm³ (P=.004), malignant melanoma (P=.007), and multiple metastases (P<.001) were validated as predictors for early DBF. Prior WBRT and breast cancer histologic features did not retain prognostic significance. Risk stratification for early salvage WBRT was similar, with a trend toward an increased risk for HR compared with LR (P=.09) but no difference between IR and HR (P=.53). The 3-level Emory risk score was shown to not be externally valid, but the model was able to stratify between 2 levels (LR and not-LR [combined IR and HR]) for early (≤6 months) DBF. These results reinforce the importance of validating predictive models in independent cohorts. Further refinement of this scoring system with molecular information and in additional contemporary patient populations is warranted. Copyright © 2017 Elsevier Inc. All rights reserved.
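The point assignment described in this abstract (four pretreatment factors, 1 point each) can be sketched as below. The function and parameter names are hypothetical, and treating "multiple" as more than one metastasis is an assumption. The abstract does not state which point totals map to the low-, intermediate-, and high-risk groups, so the sketch stops at the point total.

```python
def press_risk_points(prior_wbrt: bool, histology: str,
                      n_metastases: int, total_volume_cm3: float) -> int:
    """Count the four risk factors named in the abstract, 1 point each (0 to 4)."""
    factors = [
        not prior_wbrt,                              # (1) lack of prior WBRT
        histology.lower() in {"melanoma", "breast"},  # (2) melanoma or breast histology
        n_metastases > 1,                            # (3) multiple metastases (assumed: more than one)
        total_volume_cm3 < 1.3,                      # (4) total volume below 1.3 cm^3
    ]
    return sum(factors)


print(press_risk_points(prior_wbrt=False, histology="Melanoma",
                        n_metastases=3, total_volume_cm3=0.8))
```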

The Pancreatitis Activity Scoring System predicts clinical outcomes in acute pancreatitis: findings from a prospective cohort study.

PubMed

Buxbaum, James; Quezada, Michael; Chong, Bradford; Gupta, Nikhil; Yu, Chung Yao; Lane, Christianne; Da, Ben; Leung, Kenneth; Shulman, Ira; Pandol, Stephen; Wu, Bechien

2018-05-01

The Pancreatitis Activity Scoring System (PASS) has been derived by an international group of experts via a modified Delphi process. Our aim was to perform an external validation study to assess for concordance of the PASS score with high face validity clinical outcomes and determine specific meaningful thresholds to assist in application of this scoring system in a large prospectively ascertained cohort. We analyzed data from a prospective cohort study of consecutive patients admitted to the Los Angeles County Hospital between March 2015 and March 2017. Patients were identified using an emergency department paging system and electronic alert system. Comprehensive characterization included substance use history, pancreatitis etiology, biochemical profile, and detailed clinical course. We calculated the PASS score at admission, discharge, and at 12 h increments during the hospitalization. We performed several analyses to assess the relationship between the PASS score and outcomes at various points during hospitalization as well as following discharge. Using multivariable logistic regression analysis, we assessed the relationship between admission PASS score and risk of severe pancreatitis. PASS score performance was compared to established systems used to predict severe pancreatitis. Additional inpatient outcomes assessed included local complications, length of stay, development of systemic inflammatory response syndrome (SIRS), and intensive care unit (ICU) admission. We also assessed whether the PASS score at discharge was associated with early readmission (re-hospitalization for pancreatitis symptoms and complications within 30 days of discharge). A total of 439 patients were enrolled, their mean age was 42 (±15) years, and 53% were male. Admission PASS score >140 was associated with moderately severe and severe pancreatitis (OR 3.5 [95% CI 2.0, 6.3]), ICU admission (OR 4.9 [2.5, 9.4]), local complications (3.0 [1.6, 5.7]), and development of SIRS (OR 2.9 [1.8, 4.5]) as well as prolongation of hospitalization by a mean of 1.5 (1.3-1.7) days. For the prediction of moderately severe/severe pancreatitis, the PASS score (AUC = 0.71) was comparable to the more established Ranson's (AUC = 0.63), Glasgow (AUC = 0.72), Panc3 (AUC = 0.57), and HAPS (AUC = 0.54) scoring systems. Discharge PASS score >60 was associated with early readmission (OR 5.0 [2.4, 10.7]). The PASS score is associated with important clinical outcomes in acute pancreatitis. The ability of the score to forecast important clinical events at different points in the disease course suggests that it is a valid measure of activity in patients with acute pancreatitis.
Development and validation of immune dysfunction score to predict 28-day mortality of sepsis patients

PubMed Central

Fang, Wen-Feng; Douglas, Ivor S.; Chen, Yu-Mu; Lin, Chiung-Yu; Kao, Hsu-Ching; Fang, Ying-Tang; Huang, Chi-Han; Chang, Ya-Ting; Huang, Kuo-Tung; Wang, Yi-His; Wang, Chin-Chou

2017-01-01

Background: Sepsis-induced immune dysfunction, ranging from cytokine storm to immunoparalysis, impacts outcomes. Monitoring immune dysfunction enables better risk stratification and mortality prediction and is mandatory before wide application of immunoadjuvant therapies. We aimed to develop and validate a scoring system according to patients’ immune dysfunction status for 28-day mortality prediction. Methods: This was a prospective observational study of a cohort of adult sepsis patients admitted to the ICU between August 2013 and June 2016 at Kaohsiung Chang Gung Memorial Hospital in Taiwan. We evaluated immune dysfunction status through measurement of baseline plasma cytokine levels, monocyte human leukocyte antigen-DR expression by flow cytometry, and stimulated immune response using the post-LPS-stimulation cytokine elevation ratio. An immune dysfunction score was created for 28-day mortality prediction and was validated. Results: A total of 151 patients were enrolled. Data from the first 106 consecutive septic patients comprised the training cohort, and data from the other 45 patients comprised the validation cohort. Among the 106 patients, 21 died and 85 were still alive on day 28 after ICU admission (mortality rate, 19.8%). Independent predictive factors revealed via multivariate logistic regression analysis included segmented neutrophil-to-monocyte ratio, granulocyte-colony stimulating factor, interleukin-10, and monocyte human leukocyte antigen-antigen D–related levels, all of which were selected to construct the score, which predicted 28-day mortality with an area under the curve of 0.853 and 0.789 in the training and validation cohorts, respectively. Conclusions: The immune dysfunction scoring system developed here, which includes plasma granulocyte-colony stimulating factor level, interleukin-10 level, serum segmented neutrophil-to-monocyte ratio, and monocyte human leukocyte antigen-antigen D–related expression, appears valid and reproducible for predicting 28-day mortality. PMID:29073262
Can we have an overall osteoarthritis severity score for the patellofemoral joint using magnetic resonance imaging? Reliability and validity.

PubMed

Kobayashi, Sarah; Peduto, Anthony; Simic, Milena; Fransen, Marlene; Refshauge, Kathryn; Mah, Jean; Pappas, Evangelos

2018-04-01

This work aimed to assess inter-rater reliability and agreement of a magnetic resonance imaging (MRI)-based Kellgren and Lawrence (K&L) grading for patellofemoral joint osteoarthritis (OA) and to validate it against the MRI Osteoarthritis Knee Score (MOAKS). MRI scans from people aged 45 to 75 years with chronic knee pain participating in a randomised clinical trial evaluating dietary supplements were utilised. Fifty participants were randomly selected and scored using the MRI-based K&L grading using axial and sagittal MRI scans. For the inter-rater reliability assessment, raters were blinded to clinical information, radiology reports and other rater results. Intra- and inter-rater reliability and agreement were evaluated using the intra-class correlation coefficient (ICC) and Cohen's weighted kappa. There was a 2-week interval between the first and second readings for intra-rater reliability. Validity was assessed using the MOAKS and evaluated using Spearman's correlation coefficient. Intra-rater reliability of the K&L system was excellent: ICC 0.91 (95% CI 0.82-0.95); weighted kappa ĸ = 0.69. Inter-rater reliability was high (ICC 0.88; 95% CI 0.79-0.93), while agreement between raters was moderate (ĸ = 0.49-0.57). Validity analysis demonstrated a strong correlation between the total MOAKS features score and the K&L grading system (ρ = 0.62-0.67) but weak correlations when compared with individual MOAKS features (ρ = 0.19-0.61). The high reliability and good agreement show consistency in grading the severity of patellofemoral OA with the MRI-based K&L score. Our validity results suggest that the scale may be useful, particularly in the clinical environment. Future research should validate this method against clinical findings.
Validation of a Clinical Scoring System for Outcome Prediction in Dogs with Acute Kidney Injury Managed by Hemodialysis.

PubMed

Segev, G; Langston, C; Takada, K; Kass, P H; Cowgill, L D

2016-05-01

A scoring system for outcome prediction in dogs with acute kidney injury (AKI) recently has been developed but has not been validated. The scoring system previously developed for outcome prediction will accurately predict outcome in a validation cohort of dogs with AKI managed with hemodialysis. One hundred fifteen client-owned dogs with AKI. Medical records of dogs with AKI treated by hemodialysis between 2011 and 2015 were reviewed. Dogs were included only if all variables required to calculate the final predictive score were available, and the 30-day outcome was known. A predictive score for 3 models was calculated for each dog. Logistic regression was used to evaluate the association of the final predictive score with each model's outcome. Receiver operating curve (ROC) analyses were performed to determine sensitivity and specificity for each model based on previously established cut-off values. Higher scores for each model were associated with decreased survival probability (P < .001). Based on previously established cut-off values, 3 models (models A, B, C) were associated with sensitivities/specificities of 73/75%, 71/80%, and 75/86%, respectively, and correctly classified 74-80% of the dogs. All models were simple to apply and allowed outcome prediction that closely corresponded with actual outcome in an independent cohort. As expected, accuracies were slightly lower compared with those from the previously reported cohort used initially to develop the models. Copyright © 2016 The Authors. Journal of Veterinary Internal Medicine published by Wiley Periodicals, Inc. on behalf of the American College of Veterinary Internal Medicine.
A Risk-Scoring System for Predicting Methicillin Resistance in Community-Onset Staphylococcus aureus Bacteremia in Korea.

PubMed

Suh, Hyeon Jeong; Park, Wan Beom; Jung, Sook-In; Song, Kyoung-Ho; Kwak, Yee Gyung; Kim, Kye-Hyung; Hwang, Jeong-Hwan; Yun, Na Ra; Jang, Hee-Chang; Kim, Young Keun; Kim, Nak-Hyun; Park, Kyung-Hwa; Kang, Seung Ji; Lee, Shinwon; Kim, Eu Suk; Kim, Hong Bin

2018-06-01

We aimed to develop a simple scoring system to predict risk for methicillin resistance in community-onset Staphylococcus aureus bacteremia (CO-SAB) by identifying the clinical and epidemiological risk factors for community-onset methicillin-resistant S. aureus (MRSA). We retrospectively analyzed data from three multicenter cohort studies in Korea in which patient information was prospectively collected and risk factors for methicillin resistance in CO-SAB were identified. We then developed and validated a risk-scoring system. From the analysis of 1,802 cases of CO-SAB, we included in the scoring system the four most powerful predictors of methicillin resistance that we identified: underlying hematologic disease (-1 point), endovascular infection as the primary site of infection (-1 point), history of hospitalization or surgery in ≤1 year (+0.5 points), and previous isolation of MRSA in ≤6 months (+1.5 points). With this scoring system, cases were classified into low (less than -0.5), intermediate (-0.5 to 1.5), and high (≥1.5) risk groups. The proportions of MRSA cases in each group were 24.7% (22/89), 39.0% (607/1,557), and 78.8% (123/156), respectively, and 16.7% (1/6), 33.8% (112/331), and 76.9% (10/13) in a validation set. This risk-scoring system for methicillin resistance in CO-SAB may help physicians select appropriate empirical antibiotics more quickly.
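The point assignments in the abstract above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as stated in the abstract, not the authors' implementation: the function and parameter names are hypothetical, and the handling of the boundary at 1.5 (treated here as high risk, following "≥1.5") is an assumption.

```python
def mrsa_risk_score(hematologic_disease: bool,
                    endovascular_primary_site: bool,
                    hospitalized_or_surgery_past_year: bool,
                    mrsa_isolated_past_6_months: bool) -> float:
    """Sum the four point values quoted in the abstract (names are hypothetical)."""
    score = 0.0
    if hematologic_disease:
        score -= 1.0   # underlying hematologic disease: -1 point
    if endovascular_primary_site:
        score -= 1.0   # endovascular infection as primary site: -1 point
    if hospitalized_or_surgery_past_year:
        score += 0.5   # hospitalization or surgery in <=1 year: +0.5 points
    if mrsa_isolated_past_6_months:
        score += 1.5   # previous MRSA isolation in <=6 months: +1.5 points
    return score


def mrsa_risk_group(score: float) -> str:
    """Map a score to a risk group; a score of exactly 1.5 is assumed to be 'high'."""
    if score < -0.5:
        return "low"
    if score < 1.5:
        return "intermediate"
    return "high"


if __name__ == "__main__":
    s = mrsa_risk_score(False, False, True, True)
    print(s, mrsa_risk_group(s))
```

Because the point values are multiples of 0.5, the comparisons are exact in floating point and no rounding tolerance is needed.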
Validation of the Social Appearance Anxiety Scale in Patients with Systemic Sclerosis: A Scleroderma Patient-centered Intervention Network Cohort Study.

PubMed

Mills, Sarah D; Kwakkenbos, Linda; Carrier, Marie-Eve; Gholizadeh, Shadi; Fox, Rina S; Jewett, Lisa R; Gottesman, Karen; Roesch, Scott C; Thombs, Brett D; Malcarne, Vanessa L

2018-01-17

Systemic sclerosis (SSc) is an autoimmune disease that can cause disfiguring changes in appearance. This study examined the structural validity, internal consistency reliability, convergent validity, and measurement equivalence of the Social Appearance Anxiety Scale (SAAS) across SSc disease subtypes. Patients enrolled in the Scleroderma Patient-centered Intervention Network Cohort completed the SAAS and measures of appearance-related concerns and psychological distress. Confirmatory factor analysis (CFA) was used to examine the structural validity of the SAAS. Multiple-group CFA was used to determine if SAAS scores can be compared across patients with limited and diffuse disease subtypes. Cronbach's alpha was used to examine internal consistency reliability. Correlations of SAAS scores with measures of body image dissatisfaction, fear of negative evaluation, social anxiety, and depression were used to examine convergent validity. SAAS scores were hypothesized to be positively associated with all convergent validity measures, with correlations significant and moderate to large in size. A total of 938 patients with SSc were included. CFA supported a one-factor structure (CFI: .92; SRMR: .04; RMSEA: .08), and multiple-group CFA indicated that the scalar invariance model best fit the data. Internal consistency reliability was good in the total sample (α = .96) and in disease subgroups. Overall, evidence of convergent validity was found with measures of body image dissatisfaction, fear of negative evaluation, social anxiety, and depression. The SAAS can be reliably and validly used to assess fear of appearance evaluation in patients with SSc, and SAAS scores can be meaningfully compared across disease subtypes. This article is protected by copyright. All rights reserved.
Age, PaO2/FIO2, and Plateau Pressure Score: A Proposal for a Simple Outcome Score in Patients With the Acute Respiratory Distress Syndrome.

PubMed

Villar, Jesús; Ambrós, Alfonso; Soler, Juan Alfonso; Martínez, Domingo; Ferrando, Carlos; Solano, Rosario; Mosteiro, Fernando; Blanco, Jesús; Martín-Rodríguez, Carmen; Fernández, María Del Mar; López, Julia; Díaz-Domínguez, Francisco J; Andaluz-Ojeda, David; Merayo, Eleuterio; Pérez-Méndez, Lina; Fernández, Rosa Lidia; Kacmarek, Robert M

2016-07-01

Although there is general agreement on the characteristic features of the acute respiratory distress syndrome, we lack a scoring system that predicts acute respiratory distress syndrome outcome with high probability. Our objective was to develop an outcome score that clinicians could easily calculate at the bedside to predict the risk of death of acute respiratory distress syndrome patients 24 hours after diagnosis. A prospective, multicenter, observational, descriptive, and validation study. A network of multidisciplinary ICUs. Six-hundred patients meeting Berlin criteria for moderate and severe acute respiratory distress syndrome enrolled in two independent cohorts treated with lung-protective ventilation. None. Using individual demographic, pulmonary, and systemic data at 24 hours after acute respiratory distress syndrome diagnosis, we derived our prediction score in 300 acute respiratory distress syndrome patients based on stratification of variable values into tertiles, and validated in an independent cohort of 300 acute respiratory distress syndrome patients. Primary outcome was in-hospital mortality. We found that a 9-point score based on patient's age, PaO2/FIO2 ratio, and plateau pressure at 24 hours after acute respiratory distress syndrome diagnosis was associated with death. Patients with a score greater than 7 had a mortality of 83.3% (relative risk, 5.7; 95% CI, 3.0-11.0), whereas patients with scores less than 5 had a mortality of 14.5% (p < 0.0000001). We confirmed the predictive validity of the score in a validation cohort. A simple 9-point score based on the values of age, PaO2/FIO2 ratio, and plateau pressure calculated at 24 hours on protective ventilation after acute respiratory distress syndrome diagnosis could be used in real time for rating prognosis of acute respiratory distress syndrome patients with high probability.
External validation of scoring systems in risk stratification of upper gastrointestinal bleeding.

PubMed

Anchu, Anna Cherian; Mohsina, Subair; Sureshkumar, Sathasivam; Mahalakshmy, T; Kate, Vikram

2017-03-01

The aim of this study was to externally validate the four commonly used scoring systems in the risk stratification of patients with upper gastrointestinal bleed (UGIB). Patients with UGIB who underwent endoscopy within 24 h of presentation were stratified prospectively using the pre-endoscopy Rockall score (PRS) >0, complete Rockall score (CRS) >2, Glasgow Blatchford bleeding scores (GBS) >3, and modified GBS (m-GBS) >3 scores. Patients were followed up to 30 days. Prognostic accuracy of the scores was assessed by comparing areas under the curve (AUC) in terms of overall risk stratification, re-bleeding, mortality, need for intervention, and length of hospitalization. One hundred and seventy-five patients were studied. All four scores performed better in the overall risk stratification on AUC [PRS = 0.566 (CI: 0.481-0.651; p = 0.043); CRS = 0.712 (CI: 0.634-0.790; p < 0.001); GBS = 0.810 (CI: 0.744-0.877; p < 0.001); m-GBS = 0.802 (CI: 0.734-0.871; p < 0.001)], whereas only CRS achieved significance in identifying re-bleed [AUC = 0.679 (CI: 0.579-0.780; p = 0.003)]. All the scoring systems except PRS were found to be significantly better in detecting 30-day mortality with a high AUC (CRS = 0.798, p = 0.042; GBS = 0.833, p = 0.023; m-GBS = 0.816, p = 0.031). All four scores demonstrated significant accuracy in the risk stratification of non-variceal patients; however, only GBS and m-GBS were significant in variceal etiology. Higher cutoff scores achieved better sensitivity/specificity [RS > 0 (50/60.8), CRS > 1 (87.5/50.6), GBS > 7 (88.5/63.3), m-GBS > 7 (82.3/72.6)] in the risk stratification. GBS and m-GBS appear to be more valid in risk stratification of UGIB patients in this region. Higher cutoff values achieved better predictive accuracy.
Reliability and validity analysis of the open-source Chinese Foot and Ankle Outcome Score (FAOS).

PubMed

Ling, Samuel K K; Chan, Vincent; Ho, Karen; Ling, Fona; Lui, T H

2017-12-21

To develop the first reliable and validated open-source outcome scoring system in the Chinese language for foot and ankle problems. The English FAOS was translated into Chinese following regular protocols. First, two forward translations were created separately; these were then combined into a preliminary version by an expert committee, which was subsequently back-translated into English. The process was repeated until the original and back translations were congruent. This version was then field tested on actual patients who provided feedback for modification. The final Chinese FAOS version was then tested for reliability and validity. Reliability analysis was performed on 20 subjects while validity analysis was performed on 50 subjects. Tools used to validate the Chinese FAOS were the SF36 and Pain Numeric Rating Scale (NRS). Internal consistency between the FAOS subgroups was measured using Cronbach's alpha. Spearman's correlation was calculated between each subgroup in the FAOS, SF36 and NRS. The Chinese FAOS passed both reliability and validity testing, meaning it is reliable, internally consistent and correlates positively with the SF36 and the NRS. The Chinese FAOS is a free, open-source scoring system that can be used to provide a relatively standardised outcome measure for foot and ankle studies. Copyright © 2017 Elsevier Ltd. All rights reserved.
Validation of patient determined disease steps (PDDS) scale scores in persons with multiple sclerosis.

PubMed

Learmonth, Yvonne C; Motl, Robert W; Sandroff, Brian M; Pula, John H; Cadavid, Diego

2013-04-25

The Patient Determined Disease Steps (PDDS) is a promising patient-reported outcome (PRO) of disability in multiple sclerosis (MS). To date, there is limited evidence regarding the validity of PDDS scores, despite its sound conceptual development and broad inclusion in MS research. This study examined the validity of the PDDS based on (1) the association with Expanded Disability Status Scale (EDSS) scores and (2) the pattern of associations between PDDS and EDSS scores with Functional System (FS) scores as well as ambulatory and other outcomes. 96 persons with MS provided demographic/clinical information, completed the PDDS and other PROs including the Multiple Sclerosis Walking Scale-12 (MSWS-12), and underwent a neurological examination for generating FS and EDSS scores. Participants completed assessments of cognition, ambulation including the 6-minute walk (6 MW), and wore an accelerometer during waking hours over seven days. There was a strong correlation between EDSS and PDDS scores (ρ = .783). PDDS and EDSS scores were strongly correlated with Pyramidal (ρ = .578 and ρ = .647, respectively) and Cerebellar (ρ = .501 and ρ = .528, respectively) FS scores as well as 6 MW distance (ρ = .704 and ρ = .805, respectively), MSWS-12 scores (ρ = .801 and ρ = .729, respectively), and accelerometer steps/day (ρ = -.740 and ρ = -.717, respectively). This study provides novel evidence supporting the PDDS as a valid PRO of disability in MS.
Development and validation of a surgical-pathologic staging and scoring system for cervical cancer.

PubMed

Li, Shuang; Li, Xiong; Zhang, Yuan; Zhou, Hang; Tang, Fangxu; Jia, Yao; Hu, Ting; Sun, Haiying; Yang, Ru; Chen, Yile; Cheng, Xiaodong; Lv, Weiguo; Wu, Li; Zhou, Jin; Wang, Shaoshuai; Huang, Kecheng; Wang, Lin; Yao, Yuan; Yang, Qifeng; Yang, Xingsheng; Zhang, Qinghua; Han, Xiaobing; Lin, Zhongqiu; Xing, Hui; Qu, Pengpeng; Cai, Hongbing; Song, Xiaojie; Tian, Xiaoyu; Shen, Jian; Xi, Ling; Li, Kezhen; Deng, Dongrui; Wang, Hui; Wang, Changyu; Wu, Mingfu; Zhu, Tao; Chen, Gang; Gao, Qinglei; Wang, Shixuan; Hu, Junbo; Kong, Beihua; Xie, Xing; Ma, Ding

2016-04-12

Most cervical cancer patients worldwide receive surgical treatments, and yet the current International Federation of Gynecology and Obstetrics (FIGO) staging system does not consider surgical-pathologic data. We propose a more comprehensive and prognostically valuable surgical-pathologic staging and scoring system (SPSs). Records from 4,220 eligible cervical cancer cases (Cohort 1) were screened for surgical-pathologic risk factors. We constructed a surgical-pathologic staging and SPSs, which was subsequently validated in a prospective study of 1,104 cervical cancer patients (Cohort 2). In Cohort 1, seven independent risk factors were associated with patient outcome: lymph node metastasis (LNM), parametrial involvement, histological type, grade, tumor size, stromal invasion, and lymph-vascular space invasion (LVSI). The FIGO staging system was revised and expanded into a surgical-pathologic staging system by including additional criteria of LNM, stromal invasion, and LVSI. LNM was subdivided into three categories based on number and location of metastases. Inclusion of all seven prognostic risk factors improves practical applicability. Patients were stratified into three SPSs risk categories: zero-, low-, and high-score with scores of 0, 1 to 3, and ≥4 (P=1.08E-45; P=6.15E-55). In Cohort 2, 5-year overall survival (OS) and disease-free survival (DFS) outcomes decreased with increased SPSs scores (P=9.04E-15; P=3.23E-16), validating the approach. Surgical-pathologic staging and SPSs show greater homogeneity and discriminatory utility than FIGO staging. Surgical-pathologic staging and SPSs improve characterization of tumor severity and disease invasion, which may more accurately predict outcome and guide postoperative therapy.
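The three SPSs risk categories named in the abstract above (scores of 0, 1 to 3, and ≥4) amount to a simple threshold mapping. A minimal sketch follows, assuming the total score is a non-negative integer. The abstract does not give the points assigned to each risk factor, so the function takes the total as input, and its name is hypothetical.

```python
def spss_risk_category(total_score: int) -> str:
    """Map an SPSs total to the categories quoted in the abstract.

    How the total is computed from the seven risk factors is not stated
    in the abstract, so it is supplied by the caller here.
    """
    if total_score < 0:
        raise ValueError("SPSs total is assumed to be non-negative")
    if total_score == 0:
        return "zero-score"
    if total_score <= 3:
        return "low-score"
    return "high-score"


if __name__ == "__main__":
    for total in (0, 2, 5):
        print(total, spss_risk_category(total))
```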
Simplification of a scoring system maintained overall accuracy but decreased the proportion classified as low risk.

PubMed

Sanders, Sharon; Flaws, Dylan; Than, Martin; Pickering, John W; Doust, Jenny; Glasziou, Paul

2016-01-01

Scoring systems are developed to assist clinicians in making a diagnosis. However, their uptake is often limited because they are cumbersome to use, requiring information on many predictors, or complicated calculations. We examined whether, and how, simplifications affected the performance of a validated score for identifying adults with chest pain in an emergency department who have low risk of major adverse cardiac events. We simplified the Emergency Department Assessment of Chest pain Score (EDACS) by three methods: (1) giving equal weight to each predictor included in the score, (2) reducing the number of predictors, and (3) using both methods--giving equal weight to a reduced number of predictors. The diagnostic accuracy of the simplified scores was compared with the original score in the derivation (n = 1,974) and validation (n = 909) data sets. There was no difference in the overall accuracy of the simplified versions of the score compared with the original EDACS as measured by the area under the receiver operating characteristic curve (0.74 to 0.75 for simplified versions vs. 0.75 for the original score in the validation cohort). With score cut-offs set to maintain the sensitivity of the combination of score and tests (electrocardiogram and cardiac troponin) at a level acceptable to clinicians (99%), simplification reduced the proportion of patients classified as low risk from 50% with the original score to between 22% and 42%. Simplification of a clinical score resulted in similar overall accuracy but reduced the proportion classified as low risk and therefore eligible for early discharge compared with the original score. Whether the trade-off is acceptable will depend on the context in which the score is to be used. Developers of clinical scores should consider simplification as a method to increase uptake, but further studies are needed to determine the best methods of deriving and evaluating simplified scores. Copyright © 2016 Elsevier Inc. All rights reserved.
Derivation, Validation and Application of a Pragmatic Risk Prediction Index for Benchmarking of Surgical Outcomes.

PubMed

Spence, Richard T; Chang, David C; Kaafarani, Haytham M A; Panieri, Eugenio; Anderson, Geoffrey A; Hutter, Matthew M

2018-02-01

Despite the existence of multiple validated risk assessment and quality benchmarking tools in surgery, their utility outside of high-income countries is limited. We sought to derive, validate and apply a scoring system that is both (1) feasible, and (2) reliably predicts mortality in a middle-income country (MIC) context. A 5-step methodology was used: (1) development of a de novo surgical outcomes database modeled around the American College of Surgeons' National Surgical Quality Improvement Program (ACS-NSQIP) in South Africa (SA dataset), (2) use of the resultant data to identify all predictors of in-hospital death with more than 90% capture indicating feasibility of collection, (3) use of these predictors to derive and validate an integer-based score that reliably predicts in-hospital death in the 2012 ACS-NSQIP, (4) application of the score in the original SA dataset to demonstrate its performance, (5) identification of threshold cutoffs of the score to prompt action and drive quality improvement. Following steps one to three above, the 13-point Codman's score was derived and validated on 211,737 and 109,079 patients, respectively, and includes: age 65 (1), partially or completely dependent functional status (1), preoperative transfusions ≥4 units (1), emergency operation (2), sepsis or septic shock (2), American Society of Anesthesiologists score ≥3 (3), and operative procedure (1-3). Application of the score to 373 patients in the SA dataset showed good discrimination and calibration to predict an in-hospital death. A Codman Score of 8 is an optimal cutoff point for defining expected and unexpected deaths. We have designed a novel risk prediction score specific for a MIC context. The Codman Score can prove useful for both (1) preoperative decision-making and (2) benchmarking the quality of surgical care in MICs.
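The component weights listed in the abstract above sum to the stated 13-point maximum (1 + 1 + 1 + 2 + 2 + 3 + 3), which the sketch below reproduces. It is an illustration under stated assumptions, not the published calculator: all names are hypothetical, the age criterion is passed in as a boolean because the abstract does not make the comparison explicit, and treating a score of 8 or more as "at or above the cutoff" is an assumption about the direction of the threshold.

```python
from dataclasses import dataclass

CODMAN_CUTOFF = 8  # cutoff quoted in the abstract; inclusive direction is assumed


@dataclass
class CodmanInputs:
    meets_age_criterion: bool           # 1 point (exact age comparison not given)
    dependent_functional_status: bool   # 1 point (partially or completely dependent)
    preop_transfusion_ge_4_units: bool  # 1 point
    emergency_operation: bool           # 2 points
    sepsis_or_septic_shock: bool        # 2 points
    asa_class_ge_3: bool                # 3 points
    procedure_points: int               # 1 to 3 points, by operative procedure


def codman_score(x: CodmanInputs) -> int:
    """Sum the component points quoted in the abstract (maximum 13)."""
    if not 1 <= x.procedure_points <= 3:
        raise ValueError("procedure_points must be 1, 2 or 3")
    return (1 * x.meets_age_criterion
            + 1 * x.dependent_functional_status
            + 1 * x.preop_transfusion_ge_4_units
            + 2 * x.emergency_operation
            + 2 * x.sepsis_or_septic_shock
            + 3 * x.asa_class_ge_3
            + x.procedure_points)


def at_or_above_cutoff(score: int) -> bool:
    """True when the score reaches the assumed inclusive cutoff of 8."""
    return score >= CODMAN_CUTOFF


if __name__ == "__main__":
    patient = CodmanInputs(True, False, False, True, True, True, 2)
    total = codman_score(patient)
    print(total, at_or_above_cutoff(total))
```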
External validation of the simple clinical score and the HOTEL score, two scores for predicting short-term mortality after admission to an acute medical unit.

PubMed

Stræde, Mia; Brabrand, Mikkel

2014-01-01

Clinical scores can help predict early mortality after admission to a medical admission unit. A developed scoring system needs to be externally validated to minimise the risk of its discriminatory power and calibration being falsely elevated. We performed the present study with the objective of validating the Simple Clinical Score (SCS) and the HOTEL score, two existing risk stratification systems that predict mortality for medical patients based solely on clinical information (not limited to vital signs). Pre-planned prospective observational cohort study. Danish 460-bed regional teaching hospital. We included 3046 consecutive patients from 2 October 2008 until 19 February 2009. 26 (0.9%) died within one calendar day and 196 (6.4%) died within 30 days. We calculated SCS for 1080 patients. We found an AUROC of 0.960 (95% confidence interval [CI], 0.932-0.988) for 24-hour mortality and 0.826 (95% CI, 0.774-0.879) for 30-day mortality, and goodness-of-fit test, χ(2) = 2.68 (10 degrees of freedom), P = 0.998 and χ(2) = 4.00, P = 0.947, respectively. We included 1470 patients when calculating the HOTEL score. Discriminatory power (AUROC) was 0.931 (95% CI, 0.901-0.962) for 24-hour mortality and goodness-of-fit test, χ(2) = 5.56 (10 degrees of freedom), P = 0.234. We found that both the SCS and HOTEL scores showed an excellent to outstanding ability in identifying patients at high risk of dying with good or acceptable precision.
WHipple-ABACUS, a simple, validated risk score for 30-day mortality after pancreaticoduodenectomy developed using the ACS-NSQIP database.

PubMed

Gleeson, Elizabeth M; Shaikh, Mohammad F; Shewokis, Patricia A; Clarke, John R; Meyers, William C; Pitt, Henry A; Bowne, Wilbur B

2016-11-01

Pancreaticoduodenectomy needs simple, validated risk models to better identify 30-day mortality. The goal of this study is to develop a simple risk score to predict 30-day mortality after pancreaticoduodenectomy. We reviewed cases of pancreaticoduodenectomy from 2005-2012 in the American College of Surgeons-National Surgical Quality Improvement Program databases. Logistic regression was used to identify preoperative risk factors for morbidity and mortality from a development cohort. Scores were created using weighted beta coefficients, and predictive accuracy was assessed on the validation cohort using receiver operator characteristic curves and measuring area under the curve. The 30-day mortality rate was 2.7% for patients who underwent pancreaticoduodenectomy (n = 14,993). We identified 8 independent risk factors. The score created from weighted beta coefficients had an area under the curve of 0.71 (95% confidence interval, 0.66-0.77) on the validation cohort. Using the score WHipple-ABACUS (hypertension With medication + History of cardiac surgery + Age >62 + 2 × Bleeding disorder + Albumin <3.5 g/dL + 2 × disseminated Cancer + 2 × Use of steroids + 2 × Systemic inflammatory response syndrome), mortality rates increase with increasing score (P < .001). While other risk scores exist for 30-day mortality after pancreaticoduodenectomy, we present a simple, validated score developed using exclusively preoperative predictors surgeons could use to identify patients at risk for this procedure. Copyright © 2016 Elsevier Inc. All rights reserved.
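The WHipple-ABACUS formula is written out in the abstract above, so it can be transcribed directly. The sketch below does only that: the function and parameter names are hypothetical, albumin is assumed to be in g/dL, and no mapping from score to mortality rate is included because the abstract does not report one.

```python
def whipple_abacus(hypertension_with_medication: bool,
                   history_of_cardiac_surgery: bool,
                   age_years: float,
                   bleeding_disorder: bool,
                   albumin_g_dl: float,
                   disseminated_cancer: bool,
                   steroid_use: bool,
                   sirs: bool) -> int:
    """Transcription of the score formula quoted in the abstract (range 0-12)."""
    return (1 * hypertension_with_medication   # hypertension With medication
            + 1 * history_of_cardiac_surgery   # History of cardiac surgery
            + 1 * (age_years > 62)             # Age >62
            + 2 * bleeding_disorder            # 2 x Bleeding disorder
            + 1 * (albumin_g_dl < 3.5)         # Albumin <3.5 g/dL
            + 2 * disseminated_cancer          # 2 x disseminated Cancer
            + 2 * steroid_use                  # 2 x Use of steroids
            + 2 * sirs)                        # 2 x Systemic inflammatory response syndrome


if __name__ == "__main__":
    print(whipple_abacus(True, False, 70, False, 3.2, False, True, False))
```

Both thresholds are strict inequalities as written in the abstract, so an age of exactly 62 or an albumin of exactly 3.5 g/dL contributes no points.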
Validity of Walk Score® as a measure of neighborhood walkability in Japan.

PubMed

Koohsari, Mohammad Javad; Sugiyama, Takemi; Hanibuchi, Tomoya; Shibata, Ai; Ishii, Kaori; Liao, Yung; Oka, Koichiro

2018-03-01

Objective measures of environmental attributes have been used to understand how neighborhood environments relate to physical activity. However, this method relies on detailed spatial data, which are often not easily available. Walk Score® is a free, publicly available web-based tool that shows how walkable a given location is based on objectively-derived proximity to several types of local destinations and street connectivity. To date, several studies have tested the concurrent validity of Walk Score as a measure of neighborhood walkability in the USA and Canada. However, it is unknown whether Walk Score is a valid measure in other regions. The current study examined how Walk Score is correlated with objectively-derived attributes of neighborhood walkability, for residential addresses in Japan. Walk Scores were obtained for 1072 residential addresses in urban and rural areas in Japan. Five environmental attributes (residential density, intersection density, number of local destinations, sidewalk availability, and access to public transportation) were calculated using geographic information systems for each address. Pearson's correlation coefficients between Walk Score and these environmental attributes were calculated (conducted in May 2017). Significant positive correlations were observed between Walk Score and environmental attributes relevant to walking. Walk Score was most closely associated with intersection density (r = 0.82) and with the number of local destinations (r = 0.77). Walk Score appears to be a valid measure of neighborhood walkability in Japan. Walk Score will allow urban designers and public health practitioners to identify walkability of local areas without relying on detailed geographic data.
Additional Support for the Information Systems Analyst Exam as a Valid Program Assessment Tool

ERIC Educational Resources Information Center

Carpenter, Donald A.; Snyder, Johnny; Slauson, Gayla Jo; Bridge, Morgan K.

2011-01-01

This paper presents a statistical analysis to support the notion that the Information Systems Analyst (ISA) exam can be used as a program assessment tool in addition to measuring student performance. It compares ISA exam scores earned by students in one particular Computer Information Systems program with scores earned by the same students on the…
Measuring Teacher Effectiveness with the Pennsylvania Value-Added Assessment System

ERIC Educational Resources Information Center

Bowen, Naomi

2017-01-01

The purpose of this research was to determine if the Pennsylvania Value-Added Assessment System Average Growth Index (PVAAS AGI) scores, derived from standardized tests and calculated for Pennsylvania schools, provide a valid and reliable assessment of teacher effectiveness, as these scores are currently used to derive 15% of the annual…
Molecular Classification Substitutes for the Prognostic Variables Stage, Age, and MYCN Status in Neuroblastoma Risk Assessment.

PubMed

Rosswog, Carolina; Schmidt, Rene; Oberthuer, André; Juraeva, Dilafruz; Brors, Benedikt; Engesser, Anne; Kahlert, Yvonne; Volland, Ruth; Bartenhagen, Christoph; Simon, Thorsten; Berthold, Frank; Hero, Barbara; Faldum, Andreas; Fischer, Matthias

2017-12-01

Current risk stratification systems for neuroblastoma patients consider clinical, histopathological, and genetic variables, and additional prognostic markers have been proposed in recent years. We here sought to select highly informative covariates in a multistep strategy based on consecutive Cox regression models, resulting in a risk score that integrates hazard ratios of prognostic variables. A cohort of 695 neuroblastoma patients was divided into a discovery set (n=75) for multigene predictor generation, a training set (n=411) for risk score development, and a validation set (n=209). Relevant prognostic variables were identified by stepwise multivariable L1-penalized least absolute shrinkage and selection operator (LASSO) Cox regression, followed by backward selection in multivariable Cox regression, and then integrated into a novel risk score. The variables stage, age, MYCN status, and two multigene predictors, NB-th24 and NB-th44, were selected as independent prognostic markers by LASSO Cox regression analysis. Following backward selection, only the multigene predictors were retained in the final model. Integration of these classifiers in a risk scoring system distinguished three patient subgroups that differed substantially in their outcome. The scoring system discriminated patients with diverging outcome in the validation cohort (5-year event-free survival, 84.9±3.4 vs 63.6±14.5 vs 31.0±5.4; P<.001), and its prognostic value was validated by multivariable analysis. We here propose a translational strategy for developing risk assessment systems based on hazard ratios of relevant prognostic variables. Our final neuroblastoma risk score comprised two multigene predictors only, supporting the notion that molecular properties of the tumor cells strongly impact clinical courses of neuroblastoma patients. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Development and Initial Validation of the Macrophage Activation Syndrome/Primary Hemophagocytic Lymphohistiocytosis Score, a Diagnostic Tool that Differentiates Primary Hemophagocytic Lymphohistiocytosis from Macrophage Activation Syndrome.

PubMed

Minoia, Francesca; Bovis, Francesca; Davì, Sergio; Insalaco, Antonella; Lehmberg, Kai; Shenoi, Susan; Weitzman, Sheila; Espada, Graciela; Gao, Yi-Jin; Anton, Jordi; Kitoh, Toshiyuki; Kasapcopur, Ozgur; Sanner, Helga; Merino, Rosa; Astigarraga, Itziar; Alessio, Maria; Jeng, Michael; Chasnyk, Vyacheslav; Nichols, Kim E; Huasong, Zeng; Li, Caifeng; Micalizzi, Concetta; Ruperto, Nicolino; Martini, Alberto; Cron, Randy Q; Ravelli, Angelo; Horne, AnnaCarin

2017-10-01

To develop and validate a diagnostic score that assists in discriminating primary hemophagocytic lymphohistiocytosis (pHLH) from macrophage activation syndrome (MAS) related to systemic juvenile idiopathic arthritis. The clinical, laboratory, and histopathologic features of 362 patients with MAS and 258 patients with pHLH were collected in a multinational collaborative study. Eighty percent of the population was assessed to develop the score and the remaining 20% constituted the validation sample. Variables that entered the best fitted model of logistic regression were assigned a score, based on their statistical weight. The MAS/HLH (MH) score was made up with the individual scores of selected variables. The cutoff in the MH score that discriminated pHLH from MAS best was calculated by means of receiver operating characteristic curve analysis. Score performance was examined in both developmental and validation samples. Six variables composed the MH score: age at onset, neutrophil count, fibrinogen, splenomegaly, platelet count, and hemoglobin. The MH score ranged from 0 to 123, and its median value was 97 (1st-3rd quartile 75-123) and 12 (1st-3rd quartile 11-34) in pHLH and MAS, respectively. The probability of a diagnosis of pHLH ranged from <1% for a score of <11 to >99% for a score of ≥123. A cutoff value of ≥60 revealed the best performance in discriminating pHLH from MAS. The MH score is a powerful tool that may aid practitioners to identify patients who are more likely to have pHLH and, thus, could be prioritized for functional and genetic testing. Copyright © 2017 Elsevier Inc. All rights reserved.

Validity and reliability of a pilot scale for assessment of multiple system atrophy symptoms.

PubMed

Matsushima, Masaaki; Yabe, Ichiro; Takahashi, Ikuko; Hirotani, Makoto; Kano, Takahiro; Horiuchi, Kazuhiro; Houzen, Hideki; Sasaki, Hidenao

2017-01-01

Multiple system atrophy (MSA) is a rare progressive neurodegenerative disorder for which a brief yet sensitive scale is required for use in clinical trials and general screening. We previously compared several scales for the assessment of MSA symptoms and devised an eight-item pilot scale with large standardized response mean [handwriting, finger taps, transfers, standing with feet together, turning trunk, turning 360°, gait, body sway]. The aim of the present study is to investigate the validity and reliability of a simple pilot scale for assessment of multiple system atrophy symptoms. Thirty-two patients with MSA (15 male/17 female; 20 cerebellar subtype [MSA-C]/12 parkinsonian subtype [MSA-P]) were prospectively registered between January 1, 2014 and February 28, 2015. Patients were evaluated by two independent raters using the Unified MSA Rating Scale (UMSARS), Scale for Assessment and Rating of Ataxia (SARA), and the pilot scale. Correlations between UMSARS, SARA, pilot scale scores, intraclass correlation coefficients (ICCs), and Cronbach's alpha coefficients were calculated. Pilot scale scores significantly correlated with scores for UMSARS Parts I, II, and IV as well as with SARA scores. Intra-rater and inter-rater ICCs and Cronbach's alpha coefficients remained high (> 0.94) for all measures. The results of the present study indicate the validity and reliability of the eight-item pilot scale, particularly for the assessment of symptoms in patients with early state multiple system atrophy.
Relapses vs. reactions in multibacillary leprosy: proposal of new relapse criteria.

PubMed

Linder, Katharina; Zia, Mutaher; Kern, Winfried V; Pfau, Ruth K M; Wagner, Dirk

2008-03-01

To compare a new scoring system for multibacillary (MB) leprosy relapses, which combines time factor, risk factors and clinical presentation at relapse, to WHO criteria. Data were collected on all relapses diagnosed between 1998 and 2004 at the Marie-Adelaide-Centre in Karachi, Pakistan, including case histories, clinical manifestations, follow-up, bacterial indices, treatment and contacts. For the diagnosis of MB relapses a simple scoring system was developed and validated on a data-set of mouse foot pad (MFP)-confirmed relapses (Leprosy Reviews, 76, 2005, 241). Its sensitivity was further evaluated in the Karachi relapse cohort. The P-value was calculated with McNemar's test with continuity correction. The new scoring system that combines time factor, risk factors and clinical presentation at relapse had a higher sensitivity in MFP-confirmed relapses than the WHO criteria (95% vs. 65%, P < 0.01). The sensitivity of the scoring system was also significantly higher than that of the WHO criteria in the 57 cases of MB relapse diagnosed in Karachi (72% vs. 54%, P < 0.05). This new simple scoring system for diagnosing MB relapses in leprosy should be further validated in a prospective study to confirm its superior sensitivity and to evaluate the specificity of these criteria by using MFP confirmation for patients presenting with signs of activity after treatment.
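The paired comparison named above, McNemar's test with continuity correction, can be sketched as follows. This is an illustrative implementation, not the authors' analysis: `b` and `c` stand for the two discordant counts (cases detected by one set of criteria but missed by the other), which the abstract does not report, so the example counts are hypothetical.

```python
import math


def mcnemar_cc(b, c):
    """McNemar's chi-square with continuity correction (1 degree of freedom).

    b, c: the two discordant-pair counts from a paired 2x2 table.
    Returns (statistic, p_value).
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    # Continuity correction: subtract 1 from |b - c|, floored at zero.
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical discordant counts, for illustration only.
stat, p_value = mcnemar_cc(10, 2)
```

Only the discordant pairs enter the statistic; relapses detected (or missed) by both sets of criteria carry no information about which is more sensitive.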
A comparison of three developmental stage scoring systems.

PubMed

Dawson, Theo Linda

2002-01-01

In social psychological research the stage metaphor has fallen into disfavor due to concerns about bias, reliability, and validity. To address some of these issues, I employ a multidimensional partial credit analysis comparing moral judgment interviews scored with the Standard Issue Scoring System (SISS) (Colby and Kohlberg, 1987b), evaluative reasoning interviews scored with the Good Life Scoring System (GLSS) (Armon, 1984b), and Good Education interviews scored with the Hierarchical Complexity Scoring System (HCSS) (Commons, Danaher, Miller, and Dawson, 2000). A total of 209 participants between the ages of 5 and 86 were interviewed. The multidimensional model reveals that even though the scoring systems rely upon different criteria and the data were collected using different methods and scored by different teams of raters, the SISS, GLSS, and HCSS all appear to measure the same latent variable. The HCSS exhibits more internal consistency than the SISS and GLSS, and solves some methodological problems introduced by the content dependency of the SISS and GLSS. These results and their implications are elaborated.
Feasibility and validity of animal-based indicators for on-farm welfare assessment of thermal stress in dairy goats

NASA Astrophysics Data System (ADS)

Battini, Monica; Barbieri, Sara; Fioni, Luna; Mattiello, Silvana

2016-02-01

This investigation tested the feasibility and validity of indicators of cold and heat stress in dairy goats for on-farm welfare assessment protocols. The study was performed on two intensive dairy farms in Italy. Two different 3-point scale (0-2) scoring systems were applied to assess cold and heat stress. Cold and heat stress scores were visually assessed from outside the pen in the morning, afternoon and evening in January-February, April-May and July 2013 for a total of nine observation sessions per farm. Temperature (°C), relative humidity (%) and wind speed (km/h) were recorded and Thermal Heat Index (THI) was calculated. The sessions were allocated to three climatic seasons, depending on THI ranges: cold (<50), neutral (50-65) and hot (>65). Score 2 was rarely assessed; therefore, scores 1 and 2 were aggregated for statistical analysis. The number of goats suffering from cold stress was significantly higher in the cold season than in the neutral (P < 0.01) and hot (P < 0.001) seasons. Signs of heat stress were recorded only in the hot season (P < 0.001). The visual assessment from outside the pen confirms the on-farm feasibility of both indicators: no constraint was found and the time required was less than 10 min. Our results show that cold and heat stress scores are valid indicators to detect thermal stress in intensively managed dairy goats. The use of a binary scoring system (presence/absence), merging scores 1 and 2, may be a further refinement to improve the feasibility. This study also allows the prediction of optimal ranges of THI for dairy goat breeds in intensive husbandry systems, with a comfort zone between 55 and 70.
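The season allocation and the proposed binary score described above can be written down directly. The sketch below uses only the THI ranges stated in the abstract; the function names are illustrative, and the THI formula itself is not given in the abstract, so THI is taken as an input.

```python
def thi_season(thi):
    """Allocate an observation session to a climatic season by its THI value.

    Ranges follow the abstract: cold (<50), neutral (50-65), hot (>65).
    """
    if thi < 50:
        return "cold"
    if thi <= 65:
        return "neutral"
    return "hot"


def binary_stress_score(score):
    """Merge the 3-point score (0-2) into presence/absence, as the abstract suggests."""
    if score not in (0, 1, 2):
        raise ValueError("score must be 0, 1 or 2")
    return 1 if score >= 1 else 0
```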
Mammographic image quality in relation to positioning of the breast: A multicentre international evaluation of the assessment systems currently used, to provide an evidence base for establishing a standardised method of assessment.

PubMed

Taylor, K; Parashar, D; Bouverat, G; Poulos, A; Gullien, R; Stewart, E; Aarre, R; Crystal, P; Wallis, M

2017-11-01

Optimum mammography positioning technique is necessary to maximise cancer detection. Current criteria for mammography appraisal lack reliability and validity, with a need to develop a more objective system. We aimed to establish current international practice in assessing image quality (IQ) of screening mammograms, then develop and validate a reproducible assessment tool. A questionnaire sent to centres in countries undertaking population screening identified practice, as well as participants for an expert panel (EP) of radiologists/radiographers and a testing panel (TP) of radiographers. The EP developed category criteria and descriptors using a modified Delphi process to agree definitions. The EP scored 12 screening mammograms to test agreement, then a main set of 178 cases. Weighted scores were derived for each descriptor, enabling calculation of numerical parameters for each new category. The TP then scored the main set. Statistical analysis included ANOVA, t-tests and Kendall's coefficient. 11 centres in 8 countries responded, forming an EP of 7 members and a TP of 44 members. The EP showed moderate agreement when scoring the mini test set (W = 0.50, p < 0.001) and the main set (W = 0.55, p < 0.001), with 'posterior nipple line' being the most difficult descriptor. The weighted total scores differentiated the 4 new categories Perfect, Good, Adequate and Inadequate (p < 0.001). We have developed an assessment tool by Delphi consensus and weighted consensus criteria. We have successfully tabulated a range of numerical scores for each new category, providing the first validated and reproducible mammography IQ scoring system. Copyright © 2017 The College of Radiographers. Published by Elsevier Ltd. All rights reserved.
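Kendall's coefficient of concordance (W), the agreement statistic reported above, can be sketched in a few lines. This is a simplified version that assumes each rater supplies a complete ranking without ties; the abstract does not say whether a tie correction was applied, and the function name is illustrative.

```python
def kendalls_w(ranks):
    """Kendall's W for m raters ranking n items (no tie correction).

    ranks: list of m lists; ranks[i][j] is the rank rater i gives item j,
    using the ranks 1..n exactly once per rater.
    """
    m = len(ranks)
    n = len(ranks[0])
    if m < 2 or n < 2:
        raise ValueError("need at least two raters and two items")
    # Sum of ranks received by each item across all raters.
    rank_sums = [sum(col) for col in zip(*ranks)]
    mean_sum = m * (n + 1) / 2
    # Squared deviations of the rank sums from their mean.
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))
```

W is 1 when all raters rank the items identically and 0 when the rank sums are all equal, so the reported 0.50 to 0.55 sits midway between the two.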
Development and validation of a prognostic scoring system for patients with chronic myelomonocytic leukemia.

PubMed

Such, Esperanza; Germing, Ulrich; Malcovati, Luca; Cervera, José; Kuendgen, Andrea; Della Porta, Matteo G; Nomdedeu, Benet; Arenillas, Leonor; Luño, Elisa; Xicoy, Blanca; Amigo, Mari L; Valcarcel, David; Nachtkamp, Kathrin; Ambaglio, Ilaria; Hildebrandt, Barbara; Lorenzo, Ignacio; Cazzola, Mario; Sanz, Guillermo

2013-04-11

The natural course of chronic myelomonocytic leukemia (CMML) is highly variable but a widely accepted prognostic scoring system for patients with CMML is not available. The main aim of this study was to develop a new CMML-specific prognostic scoring system (CPSS) in a large series of 558 patients with CMML (training cohort, Spanish Group of Myelodysplastic Syndromes) and to validate it in an independent series of 274 patients (validation cohort, Heinrich Heine University Hospital, Düsseldorf, Germany, and San Matteo Hospital, Pavia, Italy). The most relevant variables for overall survival (OS) and evolution to acute myeloblastic leukemia (AML) were FAB and WHO CMML subtypes, CMML-specific cytogenetic risk classification, and red blood cell (RBC) transfusion dependency. CPSS was able to segregate patients into 4 clearly different risk groups for OS (P < .001) and risk of AML evolution (P < .001) and its predictive capability was confirmed in the validation cohort. An alternative CPSS with hemoglobin instead of RBC transfusion dependency offered almost identical prognostic capability. This study confirms the prognostic impact of FAB and WHO subtypes, recognizes the importance of RBC transfusion dependency and cytogenetics, and offers a simple and powerful CPSS for accurately assessing prognosis and planning therapy in patients with CMML.
Testing the Predictive Validity of the Hendrich II Fall Risk Model.

PubMed

Jung, Hyesil; Park, Hyeoun-Ae

2018-03-01

Cumulative data on patient fall risk have been compiled in electronic medical records systems, and it is possible to test the validity of fall-risk assessment tools using these data between the times of admission and occurrence of a fall. The Hendrich II Fall Risk Model scores assessed at three time points during hospital stays were extracted and used for testing the predictive validity: (a) upon admission, (b) at the maximum fall-risk score recorded between admission and falling or discharge, and (c) immediately before falling or discharge. Predictive validity was examined using seven predictive indicators. In addition, logistic regression analysis was used to identify factors that significantly affect the occurrence of a fall. Among the different time points, the maximum fall-risk score assessed between admission and falling or discharge showed the best predictive performance. Confusion or disorientation and having a poor ability to rise from a sitting position were significant risk factors for a fall.
Validation of Gujarati Version of ABILOCO-Kids Questionnaire.

PubMed

Diwan, Shraddha; Diwan, Jasmin; Patel, Pankaj; Bansal, Ankita B

2015-10-01

ABILOCO-Kids is a measure of locomotion ability for children with cerebral palsy (CP) aged 6 to 15 years & is available in English & French. To validate the Gujarati version of the ABILOCO-Kids questionnaire for use in clinical research on the Gujarati population. The ABILOCO-Kids questionnaire was translated into Gujarati from English using the forward-backward-forward method. To ensure face & content validity of the Gujarati version using the group consensus method, each item was examined by a group of experts having a mean experience of 24.62 years in the field of paediatrics and paediatric physiotherapy. Each item was analysed for content, meaning, wording, format, ease of administration & scoring. Each item was scored by the expert group as either accepted, rejected or accepted with modification. The procedure was continued until 80% consensus was reached for all items. Concurrent validity was examined on 55 children with cerebral palsy (6-15 years) of all Gross Motor Functional Classification System (GMFCS) levels & all clinical types by correlating the ABILOCO-Kids score with the Gross Motor Functional Measure (GMFM) & GMFCS. In phase 1 of validation, 16 items were accepted as is; 22 items were accepted with modification & 3 items went for phase 2 validation. For concurrent validity, a highly significant positive correlation was found between the ABILOCO-Kids score & total GMFM (r=0.713, p<0.005) & a highly significant negative correlation with GMFCS (r= -0.778, p<0.005). The Gujarati translated version of the ABILOCO-Kids questionnaire has good face & content validity as well as concurrent validity & can be used to measure caregiver-reported locomotion ability in children with CP.
Development and validation of a surgical-pathologic staging and scoring system for cervical cancer

PubMed Central

Zhou, Hang; Tang, Fangxu; Jia, Yao; Hu, Ting; Sun, Haiying; Yang, Ru; Chen, Yile; Cheng, Xiaodong; Lv, Weiguo; Wu, Li; Zhou, Jin; Wang, Shaoshuai; Huang, Kecheng; Wang, Lin; Yao, Yuan; Yang, Qifeng; Yang, Xingsheng; Zhang, Qinghua; Han, Xiaobing; Lin, Zhongqiu; Xing, Hui; Qu, Pengpeng; Cai, Hongbing; Song, Xiaojie; Tian, Xiaoyu; Shen, Jian; Xi, Ling; Li, Kezhen; Deng, Dongrui; Wang, Hui; Wang, Changyu; Wu, Mingfu; Zhu, Tao; Chen, Gang; Gao, Qinglei; Wang, Shixuan; Hu, Junbo; Kong, Beihua; Xie, Xing; Ma, Ding

2016-01-01

Background Most cervical cancer patients worldwide receive surgical treatments, and yet the current International Federation of Gynecology and Obstetrics (FIGO) staging system does not consider surgical-pathologic data. We propose a more comprehensive and prognostically valuable surgical-pathologic staging and scoring system (SPSs). Methods Records from 4,220 eligible cervical cancer cases (Cohort 1) were screened for surgical-pathologic risk factors. We constructed a surgical-pathologic staging and SPSs, which was subsequently validated in a prospective study of 1,104 cervical cancer patients (Cohort 2). Results In Cohort 1, seven independent risk factors were associated with patient outcome: lymph node metastasis (LNM), parametrial involvement, histological type, grade, tumor size, stromal invasion, and lymph-vascular space invasion (LVSI). The FIGO staging system was revised and expanded into a surgical-pathologic staging system by including additional criteria of LNM, stromal invasion, and LVSI. LNM was subdivided into three categories based on number and location of metastases. Inclusion of all seven prognostic risk factors improves practical applicability. Patients were stratified into three SPSs risk categories: zero-, low-, and high-score with scores of 0, 1 to 3, and ≥4 (P=1.08E-45; P=6.15E-55). In Cohort 2, 5-year overall survival (OS) and disease-free survival (DFS) outcomes decreased with increased SPSs scores (P=9.04E-15; P=3.23E-16), validating the approach. Surgical-pathologic staging and SPSs show greater homogeneity and discriminatory utility than FIGO staging. Conclusions Surgical-pathologic staging and SPSs improve characterization of tumor severity and disease invasion, which may more accurately predict outcome and guide postoperative therapy. PMID:27014971
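The three SPSs risk categories stated above map onto a simple lookup. The sketch below assumes the total score is a non-negative integer; how individual risk factors contribute points is not given in the abstract, so the total is taken as an input and the function name is illustrative.

```python
def sps_risk_category(total_score):
    """Map a total SPSs score to the risk category named in the abstract.

    Categories: 0 -> zero-score, 1 to 3 -> low-score, >=4 -> high-score.
    """
    if total_score < 0:
        raise ValueError("score must be non-negative")
    if total_score == 0:
        return "zero-score"
    if total_score <= 3:
        return "low-score"
    return "high-score"
```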
Reviewing Reliability and Validity of Information for University Educational Evaluation

NASA Astrophysics Data System (ADS)

Otsuka, Yusaku

To better utilize evaluations in higher education, it is necessary to share the methods of reviewing reliability and validity of examination scores and grades, and to accumulate and share data for confirming results. Before the GPA system is first introduced into a university or college, the reliability of examination scores and grades, especially for essay examinations, must be assured. Validity is a complicated concept, so should be assured in various ways, including using professional audits, theoretical models, and statistical data analysis. Because individual students and teachers are continually improving, using evaluations to appraise their progress is not always compatible with using evaluations in appraising the implementation of accountability in various departments or the university overall. To better utilize evaluations and improve higher education, evaluations should be integrated into the current system by sharing the vision of an academic learning community and promoting interaction between students and teachers based on sufficiently reliable and validated evaluation tools.
Development and Validation of a Scoring System to Predict Outcomes of Patients With Primary Biliary Cirrhosis Receiving Ursodeoxycholic Acid Therapy.

PubMed

Lammers, Willem J; Hirschfield, Gideon M; Corpechot, Christophe; Nevens, Frederik; Lindor, Keith D; Janssen, Harry L A; Floreani, Annarosa; Ponsioen, Cyriel Y; Mayo, Marlyn J; Invernizzi, Pietro; Battezzati, Pier M; Parés, Albert; Burroughs, Andrew K; Mason, Andrew L; Kowdley, Kris V; Kumagi, Teru; Harms, Maren H; Trivedi, Palak J; Poupon, Raoul; Cheung, Angela; Lleo, Ana; Caballeria, Llorenç; Hansen, Bettina E; van Buuren, Henk R

2015-12-01

Approaches to risk stratification for patients with primary biliary cirrhosis (PBC) are limited, single-center based, and often dichotomous. We aimed to develop and validate a better model for determining prognoses of patients with PBC. We performed an international, multicenter meta-analysis of 4119 patients with PBC treated with ursodeoxycholic acid at liver centers in 8 European and North American countries. Patients were randomly assigned to derivation (n = 2488 [60%]) and validation cohorts (n = 1631 [40%]). A risk score (GLOBE score) to predict transplantation-free survival was developed and validated with univariate and multivariable Cox regression analyses using clinical and biochemical variables obtained after 1 year of ursodeoxycholic acid therapy. Risk score outcomes were compared with the survival of age-, sex-, and calendar time-matched members of the general population. The prognostic ability of the GLOBE score was evaluated alongside those of the Barcelona, Paris-1, Rotterdam, Toronto, and Paris-2 criteria. Age (hazard ratio = 1.05; 95% confidence interval [CI]: 1.04-1.06; P < .0001); levels of bilirubin (hazard ratio = 2.56; 95% CI: 2.22-2.95; P < .0001), albumin (hazard ratio = 0.10; 95% CI: 0.05-0.24; P < .0001), and alkaline phosphatase (hazard ratio = 1.40; 95% CI: 1.18-1.67; P = .0002); and platelet count (hazard ratio/10 units decrease = 0.97; 95% CI: 0.96-0.99; P < .0001) were all independently associated with death or liver transplantation (C-statistic derivation, 0.81; 95% CI: 0.79-0.83, and validation cohort, 0.82; 95% CI: 0.79-0.84). Patients with risk scores >0.30 had significantly shorter times of transplant-free survival than matched healthy individuals (P < .0001). The GLOBE score identified patients who would survive for 5 years and 10 years (responders) with positive predictive values of 98% and 88%, respectively. 
Up to 22% and 21% of events and nonevents, respectively, 10 years after initiation of treatment were correctly reclassified in comparison with earlier proposed criteria. In subgroups of patients aged <45, 45-52, 52-58, 58-66, and ≥66 years, age-specific GLOBE-score thresholds beyond which survival significantly deviated from matched healthy individuals were -0.52, 0.01, 0.60, 1.01 and 1.69, respectively. Transplant-free survival could still be accurately calculated by the GLOBE score with laboratory values collected at 2-5 years after treatment. We developed and validated a scoring system (the GLOBE score) to predict transplant-free survival of ursodeoxycholic acid-treated patients with PBC. This score might be used to select strategies for treatment and care. Copyright © 2015 AGA Institute. Published by Elsevier Inc. All rights reserved.
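The age-specific thresholds listed above lend themselves to a small lookup. The sketch below is an illustration only: the abstract's age bands share their boundaries (45-52, 52-58, ...), so the code assumes each band includes its lower bound and excludes its upper bound. The GLOBE score formula is not given in the abstract and is taken as an input; the names are hypothetical.

```python
from bisect import bisect_right

# Lower bounds of the second to fifth age bands (assumed left-closed).
AGE_CUTS = [45, 52, 58, 66]
# Thresholds for ages <45, 45-52, 52-58, 58-66 and >=66, as stated in the abstract.
THRESHOLDS = [-0.52, 0.01, 0.60, 1.01, 1.69]


def globe_threshold(age):
    """Return the age-specific GLOBE-score threshold for a patient's age in years."""
    return THRESHOLDS[bisect_right(AGE_CUTS, age)]


def exceeds_threshold(age, globe_score):
    """True if the score lies beyond the threshold for the patient's age band."""
    return globe_score > globe_threshold(age)
```

Under this reading, a score of 0.30 would lie beyond the threshold for a 50-year-old (0.01) but not for a 60-year-old (1.01).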
An immunohistochemical and fluorescence in situ hybridization-based comparison between the Oracle HER2 Bond Immunohistochemical System, Dako HercepTest, and Vysis PathVysion HER2 FISH using both commercially validated and modified ASCO/CAP and United Kingdom HER2 IHC scoring guidelines.

PubMed

O'Grady, Anthony; Allen, David; Happerfield, Lisa; Johnson, Nicola; Provenzano, Elena; Pinder, Sarah E; Tee, Lilian; Gu, Mai; Kay, Elaine W

2010-12-01

Immunohistochemistry (IHC) is used as the frontline assay to determine HER2 status in invasive breast cancer patients. The aim of the study was to compare the performance of the Leica Oracle HER2 Bond IHC System (Oracle) with the current most readily accepted Dako HercepTest (HercepTest), using both commercially validated and modified ASCO/CAP and UK HER2 IHC scoring guidelines. A total of 445 breast cancer samples from 3 international clinical HER2 referral centers were stained with the 2 test systems and scored in a blinded fashion by experienced pathologists. The overall agreement between the 2 tests in a 3×3 (negative, equivocal and positive) analysis shows a concordance of 86.7% and 86.3%, respectively when analyzed using commercially validated and modified ASCO/CAP and UK HER2 IHC scoring guidelines. There is a good concordance between the Oracle and the HercepTest. The advantages of a complete fully automated test such as the Oracle include standardization of key analytical factors and improved turn around time. The implementation of the modified ASCO/CAP and UK HER2 IHC scoring guidelines has minimal effect on either assay interpretation, showing that Oracle can be used as a methodology for accurately determining HER2 IHC status in formalin fixed, paraffin-embedded breast cancer tissue.
English Cross-Cultural Translation and Validation of the Neuromuscular Score: A System for Motor Function Classification in Patients With Neuromuscular Diseases

PubMed Central

Vuillerot, Carole; Meilleur, Katherine G.; Jain, Minal; Waite, Melissa; Wu, Tianxia; Linton, Melody; Datsgir, Jahannaz; Donkervoort, Sandra; Leach, Meganne E.; Rutkowski, Anne; Rippert, Pascal; Payan, Christine; Iwaz, Jean; Hamroun, Dalil; Bérard, Carole; Poirot, Isabelle; Bönnemann, Carsten G.

2016-01-01

Objective To develop and validate an English version of the Neuromuscular (NM)-Score, a classification for patients with NM diseases in each of the 3 motor function domains: D1, standing and transfers; D2, axial and proximal motor function; and D3, distal motor function. Design Validation survey. Setting Patients seen at a medical research center between June and September 2013. Participants Consecutive patients (N = 42) aged 5 to 19 years with a confirmed or suspected diagnosis of congenital muscular dystrophy. Interventions Not applicable. Main Outcome Measures An English version of the NM-Score was developed by a 9-person expert panel that assessed its content validity and semantic equivalence. Its concurrent validity was tested against criterion standards (Brooke Scale, Motor Function Measure [MFM], activity limitations for patients with upper and/or lower limb impairments [ACTIVLIM], Jebsen Test, and myometry measurements). Informant agreement between patient/caregiver (P/C)-reported and medical doctor (MD)-reported NM scores was measured by weighted kappa. Results Significant correlation coefficients were found between NM scores and criterion standards. The highest correlations were found between NM-score D1 and MFM score D1 (ρ = −.944, P<.0001), ACTIVLIM (ρ = −.895, P<.0001), and hip abduction strength by myometry (ρ = −.811, P<.0001). Informant agreement between P/C-reported and MD-reported NM scores was high for D1 (κ = .801; 95% confidence interval [CI], .701–.914) but moderate for D2 (κ = .592; 95% CI, .412–.773) and D3 (κ = .485; 95% CI, .290–.680). Correlation coefficients between the NM scores and the criterion standards did not significantly differ between P/C-reported and MD-reported NM scores. Conclusions Patients and physicians completed the English NM-Score easily and accurately. 
The English version is a reliable and valid instrument that can be used in clinical practice and research to describe the functional abilities of patients with NM diseases. PMID:24862765
Development and Validation of a Disease Severity Scoring Model for Pediatric Sepsis.

PubMed

Hu, Li; Zhu, Yimin; Chen, Mengshi; Li, Xun; Lu, Xiulan; Liang, Ying; Tan, Hongzhuan

2016-07-01

Multiple severity scoring systems have been devised and evaluated in adult sepsis, but a simplified scoring model for pediatric sepsis has not yet been developed. This study aimed to develop and validate a new scoring model to stratify the severity of pediatric sepsis, thus assisting the treatment of sepsis in children. Data from 634 consecutive patients who presented with sepsis at Children's hospital of Hunan province in China in 2011-2013 were analyzed, with 476 patients placed in the training group and 158 patients in the validation group. Stepwise discriminant analysis was used to develop the accurate discriminate model. A simplified scoring model was generated using weightings defined by the discriminate coefficients. The discriminant ability of the model was tested by receiver operating characteristic (ROC) curves. The discriminant analysis showed that prothrombin time, D-dimer, total bilirubin, serum total protein, uric acid, PaO2/FiO2 ratio, and myoglobin were associated with severity of sepsis. These seven variables were assigned values of 4, 3, 3, 4, 3, 3, 3 respectively based on the standardized discriminant coefficients. Patients with higher scores had higher risk of severe sepsis. The areas under the ROC curve (AROC) were 0.836 for the accurate discriminate model and 0.825 for the simplified scoring model in the validation group. The proposed disease severity scoring model for pediatric sepsis showed adequate discriminatory capacity and sufficient accuracy, which has important clinical significance in evaluating the severity of pediatric sepsis and predicting its progress.
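A rough sketch of how the seven weights above could be combined into a total follows. The abstract gives the weights but not the cut-offs that mark a value as abnormal, so the code assumes each variable is already flagged as abnormal or not and contributes its full weight when flagged. That dichotomy, and all names used, are assumptions rather than the authors' specification.

```python
# Weights as listed in the abstract, in the same order as the variables.
SEPSIS_WEIGHTS = {
    "prothrombin_time": 4,
    "d_dimer": 3,
    "total_bilirubin": 3,
    "serum_total_protein": 4,
    "uric_acid": 3,
    "pao2_fio2_ratio": 3,
    "myoglobin": 3,
}


def simplified_sepsis_score(abnormal_variables):
    """Sum the weights of the variables flagged abnormal (hypothetical flagging rule)."""
    unknown = set(abnormal_variables) - set(SEPSIS_WEIGHTS)
    if unknown:
        raise KeyError(f"unknown variables: {sorted(unknown)}")
    return sum(SEPSIS_WEIGHTS[name] for name in set(abnormal_variables))
```

Under this assumption the total ranges from 0 to 23, with higher totals indicating higher risk of severe sepsis, as the abstract reports.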
A Supervised Learning Process to Validate Online Disease Reports for Use in Predictive Models.

PubMed

Patching, Helena M M; Hudson, Laurence M; Cooke, Warrick; Garcia, Andres J; Hay, Simon I; Roberts, Mark; Moyes, Catherine L

2015-12-01

Pathogen distribution models that predict spatial variation in disease occurrence require data from a large number of geographic locations to generate disease risk maps. Traditionally, this process has used data from public health reporting systems; however, using online reports of new infections could speed up the process dramatically. Data from both public health systems and online sources must be validated before they can be used, but no mechanisms exist to validate data from online media reports. We have developed a supervised learning process to validate geolocated disease outbreak data in a timely manner. The process uses three input features, the data source and two metrics derived from the location of each disease occurrence. The location of disease occurrence provides information on the probability of disease occurrence at that location based on environmental and socioeconomic factors and the distance within or outside the current known disease extent. The process also uses validation scores, generated by disease experts who review a subset of the data, to build a training data set. The aim of the supervised learning process is to generate validation scores that can be used as weights going into the pathogen distribution model. After analyzing the three input features and testing the performance of alternative processes, we selected a cascade of ensembles comprising logistic regressors. Parameter values for the training data subset size, number of predictors, and number of layers in the cascade were tested before the process was deployed. The final configuration was tested using data for two contrasting diseases (dengue and cholera), and 66%-79% of data points were assigned a validation score. The remaining data points are scored by the experts, and the results inform the training data set for the next set of predictors, as well as going to the pathogen distribution model. 
The new supervised learning process has been implemented within our live site and is being used to validate the data that our system uses to produce updated predictive disease maps on a weekly basis.
Development and validation of a predictive score for perioperative transfusion in patients with hepatocellular carcinoma undergoing liver resection.

PubMed

Wang, Hai-Qing; Yang, Jian; Yang, Jia-Yin; Wang, Wen-Tao; Yan, Lu-Nan

2015-08-01

Liver resection is a major surgery that often requires perioperative blood transfusion. Predicting the need for blood transfusion for patients undergoing liver resection is of great importance. The present study aimed to develop and validate a model for predicting transfusion requirement in HBV-related hepatocellular carcinoma patients undergoing liver resection. A total of 1543 consecutive liver resections were included in the study. A randomly selected sample set of 1080 cases (70% of the study cohort) was used to develop a predictive score for transfusion requirement and the remaining 30% (n=463) was used to validate the score. Based on the preoperative and predictable intraoperative parameters, logistic regression was used to identify risk factors and to create an integer score for the prediction of transfusion requirement. Extrahepatic procedure, major liver resection, hemoglobin level and platelet count were identified as independent predictors for transfusion requirement by logistic regression analysis. A score system integrating these 4 factors was stratified into three groups which could predict the risk of transfusion, with a rate of 11.4%, 24.7% and 57.4% for low, moderate and high risk, respectively. The prediction model appeared accurate with good discriminatory abilities, generating an area under the receiver operating characteristic curve of 0.736 in the development set and 0.709 in the validation set. We have developed and validated an integer-based risk score to predict perioperative transfusion for patients undergoing liver resection in a high-volume surgical center. This score allows patients at high risk to be identified and may alter transfusion practices.
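The random 70%/30% division into development and validation sets described above can be sketched as follows. The seed, function name and use of integer case identifiers are illustrative assumptions; the abstract does not describe how the randomisation was carried out.

```python
import random


def split_cohort(case_ids, dev_fraction=0.7, seed=0):
    """Randomly split case identifiers into development and validation sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_fraction)
    return shuffled[:n_dev], shuffled[n_dev:]


# 1543 placeholder identifiers, matching the cohort size in the abstract.
development, validation = split_cohort(range(1543))
```

With 1543 cases and a 0.7 fraction this yields 1080 development and 463 validation cases, the same sizes reported in the abstract.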
External Validation of the Simple Clinical Score and the HOTEL Score, Two Scores for Predicting Short-Term Mortality after Admission to an Acute Medical Unit

PubMed Central

Stræde, Mia; Brabrand, Mikkel

2014-01-01

Background Clinical scores can help predict early mortality after admission to a medical admission unit. A newly developed scoring system needs to be externally validated to minimise the risk that its discriminatory power and calibration are overestimated. We performed the present study with the objective of validating the Simple Clinical Score (SCS) and the HOTEL score, two existing risk stratification systems that predict mortality for medical patients based solely on clinical information, though not only on vital signs. Methods Pre-planned prospective observational cohort study. Setting Danish 460-bed regional teaching hospital. Findings We included 3046 consecutive patients from 2 October 2008 until 19 February 2009. 26 (0.9%) died within one calendar day and 196 (6.4%) died within 30 days. We calculated SCS for 1080 patients. We found an AUROC of 0.960 (95% confidence interval [CI], 0.932 to 0.988) for 24-hour mortality and 0.826 (95% CI, 0.774–0.879) for 30-day mortality, and goodness-of-fit test, χ2 = 2.68 (10 degrees of freedom), P = 0.998 and χ2 = 4.00, P = 0.947, respectively. We included 1470 patients when calculating the HOTEL score. Discriminatory power (AUROC) was 0.931 (95% CI, 0.901–0.962) for 24-hour mortality and goodness-of-fit test, χ2 = 5.56 (10 degrees of freedom), P = 0.234. Conclusion Both the SCS and HOTEL scores showed an excellent to outstanding ability to identify patients at high risk of dying, with good or acceptable precision. PMID:25144186
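The discriminatory power reported above as AUROC has a direct interpretation: the probability that a randomly chosen patient who died has a higher score than a randomly chosen survivor. The sketch below computes it by pairwise comparison, counting ties as one half; it is an illustration with a hypothetical function name, not the authors' software.

```python
def auroc(scores_events, scores_nonevents):
    """Area under the ROC curve via pairwise comparison of scores.

    scores_events: risk scores of patients who had the event (e.g. died).
    scores_nonevents: risk scores of patients who did not.
    """
    if not scores_events or not scores_nonevents:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for e in scores_events:
        for n in scores_nonevents:
            if e > n:
                wins += 1.0   # event patient ranked higher: concordant pair
            elif e == n:
                wins += 0.5   # tie counts as half
    return wins / (len(scores_events) * len(scores_nonevents))
```

A value of 0.5 corresponds to chance and 1.0 to perfect separation, which places the reported 0.960 and 0.931 for 24-hour mortality near the top of the range.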
Defining High Risk Patients for Endovascular Aneurysm Repair: A National Analysis

PubMed Central

Egorova, Natalia; Giacovelli, Jeannine K.; Gelijns, Annetine; Mureebe, Leila; Greco, Giampaolo; Morrissey, Nicholas; Nowygrod, Roman; Moskowitz, Alan; McKinsey, James; Kent, K. Craig

2011-01-01

Background Endovascular aneurysm repair (EVAR) is commonly used as a minimally invasive technique for repairing infrarenal aortic aneurysms. There have been recent concerns that a subset of high-risk patients experience unfavorable outcomes with this intervention. To determine whether such a high-risk cohort exists and to identify the characteristics of these patients, we analyzed the outcomes of Medicare patients treated with EVAR from 2000–2006. Methods and Results We identified 66,943 patients who underwent EVAR from the Inpatient Medicare database. The overall 30-day mortality was 1.6%. A risk model for perioperative mortality was developed by randomly selecting 44,630 patients; the other 1/3 of the dataset was used to validate the model. The model was deemed reliable (Hosmer-Lemeshow statistic p=0.25 for the development and p=0.24 for the validation model) and accurate (c=0.735 and c=0.731 for the development and the validation model, respectively). In our scoring system, where the score for each factor ranged between 1 and 7, the following were identified as significant baseline factors that predict mortality: renal failure with dialysis (score=7), renal failure without dialysis (score=3), clinically significant lower extremity ischemia (score=5), patient age ≥85 (score=3), 75–84 (score=2), 70–74 (score=1), heart failure (score=3), chronic liver disease (score=3), female gender (score=2), neurological disorders (score=2), chronic pulmonary disease (score=2), surgeon experience in EVAR <3 procedures (score=1) and hospital annual volume in EVAR <7 procedures (score=1). The majority of Medicare patients who were treated (96.6%, n=64,651) had a score of 9 or less, which correlated with a mortality < 5%. Only 3.4% of patients had a mortality ≥ 5% and 0.8% of patients (n=509) had a score of 13 or higher, which correlated with a mortality >10%. Conclusion We conclude that there is a high-risk cohort of patients that should not be treated with EVAR; however, this cohort is small. 
Our scoring system, which is based on patient and institutional factors, provides criteria that can be easily used by clinicians to quantify perioperative risk for EVAR candidates. PMID:19782526
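
As a rough illustration of how the point values listed in this abstract add up, the following Python sketch totals a patient's score and maps it to the reported mortality bands. The factor labels and function names are hypothetical, the band for scores of 10 to 12 is inferred from the statement that scores of 9 or less correlated with mortality < 5%, and the sketch is neither the authors' implementation nor a clinical tool.

```python
# Point values as listed in the abstract; the key names are hypothetical labels.
EVAR_POINTS = {
    "renal_failure_dialysis": 7,
    "renal_failure_no_dialysis": 3,
    "lower_extremity_ischemia": 5,
    "heart_failure": 3,
    "chronic_liver_disease": 3,
    "female": 2,
    "neurological_disorder": 2,
    "chronic_pulmonary_disease": 2,
    "surgeon_evar_experience_lt_3": 1,
    "hospital_annual_evar_volume_lt_7": 1,
}


def age_points(age):
    """Age bands from the abstract: >=85 -> 3, 75-84 -> 2, 70-74 -> 1, else 0."""
    if age >= 85:
        return 3
    if age >= 75:
        return 2
    if age >= 70:
        return 1
    return 0


def evar_score(age, factors):
    """Sum the age points and the points of every factor that is present."""
    present = set(factors)
    unknown = present - set(EVAR_POINTS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    # Assumption: renal failure is scored with OR without dialysis, never both.
    if {"renal_failure_dialysis", "renal_failure_no_dialysis"} <= present:
        raise ValueError("choose only one renal failure category")
    return age_points(age) + sum(EVAR_POINTS[name] for name in present)


def mortality_band(score):
    """Map a total score to the perioperative mortality band in the abstract."""
    if score >= 13:
        return ">10%"
    if score > 9:
        return ">=5%"  # inferred band for scores 10-12 (assumption)
    return "<5%"


if __name__ == "__main__":
    total = evar_score(80, ["renal_failure_dialysis", "lower_extremity_ischemia"])
    print(total, mortality_band(total))
```

Keeping the point table separate from the banding function makes it easy to check each value against the abstract.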
A risk prediction model for determining appropriateness of CEA in patients with asymptomatic carotid artery stenosis.

PubMed

Conrad, Mark F; Kang, Jeanwan; Mukhopadhyay, Shankha; Patel, Virendra I; LaMuraglia, Glenn M; Cambria, Richard P

2013-10-01

The benefit of carotid endarterectomy (CEA) over medical therapy in patients with asymptomatic carotid artery stenosis is predicated upon a life expectancy of at least 5 years after the procedure. The goal of this study was to create a scoring system for prediction of 5-year survival after CEA that can be used to triage patients with asymptomatic carotid artery stenosis (ACAS). All patients who underwent CEA for severe asymptomatic carotid stenosis from 1989 to 2005 were identified. Long-term survival was determined by a review of hospital records and the social security death index. Because all patients had at least 5-year follow-up, a logistic regression of predictors of survival at 5 years was performed and the odds ratios associated with particular significant comorbidities were used to create a scoring system to predict survival. The scoring system was then validated within the cohort using the Hosmer-Lemeshow Test and a derivation/validation receiver operating characteristic (ROC) curve. A total of 2004 CEAs were performed in 1791 patients. The average follow-up was 130 ± 49 months. The clinical profile of the cohort data included 84% hypertension, 56% coronary artery disease (CAD), 24% diabetes, and 71% on statins. The 30-day stroke rate was 1.1% and the death rate was 0.7%. The actual 5-year survival was 73%. Logistic regression yielded the following predictors of mortality: age (by decade) (odds ratio [OR] = 1.8, P < 0.0001), CAD (OR = 1.5, P = 0.0007), chronic obstructive pulmonary disease (OR = 2.5; P < 0.0001), diabetes (OR = 1.7, P < 0.0001), neck radiation (OR = 2.6, P = 0.005), no statin (OR = 2.1, P < 0.0001), and creatinine more than 1.5 (OR = 2.6, P < 0.0001). These variables were then assigned a hierarchical point scoring system in accordance with the OR value. The 5-year survival based on the scoring system was as follows: 0 to 5 points = 92.5%, 6 to 8 points = 83.6%, 9 to 11 points = 63.7%, 12 to 14 points = 46.5%, and more than 15 points = 33.8%.
The Hosmer-Lemeshow test validated the scoring system (P = 0.26) and there was no difference in the ROC curves (C statistic = 0.74 vs 0.73). This validated scoring system can be a useful tool for determining which patients are likely to benefit most from CEA based on the probability of long-term survival. Given that the 5-year survival of patients in the medical arm of the asymptomatic CEA trials was 60% to 70%, it is reasonable to conclude that patients who score 0 to 8 points are excellent candidates for CEA whereas most patients with ≥12 points should be managed with medical therapy alone.
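
The survival bands and the triage suggestion in this abstract can be expressed as a small lookup. The sketch below is illustrative only: the per-variable point assignments are not given in the abstract, so it takes a total score as input; the function names are hypothetical; and a score of exactly 15 is placed in the top band as an assumption, because the abstract says "more than 15 points".

```python
# (lowest score in band, reported 5-year survival in %), highest band first.
# Assumption: a score of exactly 15 is treated as part of the top band.
CEA_SURVIVAL_BANDS = [
    (15, 33.8),
    (12, 46.5),
    (9, 63.7),
    (6, 83.6),
    (0, 92.5),
]


def five_year_survival(points):
    """Return the reported 5-year survival (%) for a total point score."""
    if points < 0:
        raise ValueError("points must be non-negative")
    for lower_bound, survival in CEA_SURVIVAL_BANDS:
        if points >= lower_bound:
            return survival


def suggested_management(points):
    """Triage suggestion stated in the abstract's conclusion."""
    if points <= 8:
        return "CEA candidate"
    if points >= 12:
        return "medical therapy"
    return "not specified in abstract"  # scores 9-11 are not addressed


if __name__ == "__main__":
    for score in (3, 7, 10, 13):
        print(score, five_year_survival(score), suggested_management(score))
```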
A comprehensive scoring system to measure healthy community design in land use plans and regulations.

PubMed

Maiden, Kristin M; Kaplan, Marina; Walling, Lee Ann; Miller, Patricia P; Crist, Gina

2017-02-01

Comprehensive land use plans and their corresponding regulations play a role in determining the nature of the built environment and community design, which are factors that influence population health and health disparities. To determine the degree to which a plan addresses healthy living and active design, there is a need for a systematic, reliable and valid method of analyzing and scoring health-related content in plans and regulations. This paper describes the development and validation of a scoring tool designed to measure the strength and comprehensiveness of health-related content found in land use plans and the corresponding regulations. The measures are scored based on the presence of a specific item and the specificity and action-orientation of language. To establish reliability and validity, 42 land use plans and regulations from across the United States were scored January-April 2016. Results of the psychometric analysis indicate the scorecard is a reliable scoring tool for land use plans and regulations related to healthy living and active design. Intraclass correlation coefficient (ICC) scores showed strong inter-rater reliability for total strength and comprehensiveness. ICC scores for total implementation scores showed acceptable consistency among scorers. Cronbach's alpha values for all focus areas were acceptable. Strong content validity was measured through a committee vetting process. The development of this tool has far-reaching implications, bringing standardization of measurement to the field of land use plan assessment, and paving the way for systematic inclusion of health-related design principles, policies, and requirements in land use plans and their corresponding regulations. Copyright © 2016 Elsevier Inc. All rights reserved.

Derivation and validation of a diagnostic score based on case-mix groups to predict 30-day death or urgent readmission.

PubMed

van Walraven, Carl; Wong, Jenna; Forster, Alan J

2012-01-01

Between 5% and 10% of patients die or are urgently readmitted within 30 days of discharge from hospital. Readmission risk indexes have either excluded acute diagnoses or modelled them as multiple distinct variables. In this study, we derived and validated a score summarizing the influence of acute hospital diagnoses and procedures on death or urgent readmission within 30 days. From population-based hospital abstracts in Ontario, we randomly sampled 200 000 discharges between April 2003 and March 2009 and determined who had been readmitted urgently or died within 30 days of discharge. We used generalized estimating equation modelling, with a sample of 100 000 patients, to measure the adjusted association of various case-mix groups (CMGs; homogeneous groups of acute care inpatients with similar clinical and resource-utilization characteristics) with 30-day death or urgent readmission. This final model was transformed into a scoring system that was validated in the remaining 100 000 patients. Patients in the derivation set belonged to 1 of 506 CMGs and had a 6.8% risk of 30-day death or urgent readmission. Forty-seven CMG codes (more than half of which were directly related to chronic diseases) were independently associated with this outcome, which led to a CMG score that ranged from -6 to 7 points. The CMG score was significantly associated with 30-day death or urgent readmission (unadjusted odds ratio for a 1-point increase in CMG score 1.52, 95% confidence interval [CI] 1.49-1.56). Alone, the CMG score was only moderately discriminative (C statistic 0.650, 95% CI 0.644-0.656). However, when the CMG score was added to a validated risk index for death or readmission, the C statistic increased to 0.759 (95% CI 0.753-0.765). The CMG score was well calibrated for 30-day death or readmission.
In this study, we developed a scoring system for acute hospital diagnoses and procedures that could be used as part of a risk-adjustment methodology for analyses of postdischarge outcomes.
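
To make the reported per-point odds ratio concrete, the short Python sketch below shows how an odds ratio of 1.52 per 1-point increase compounds over larger score differences. It assumes the association is log-linear across the whole score range (the usual reading of a per-unit odds ratio from a logistic-type model); the function name is hypothetical and the calculation uses only the unadjusted point estimate from the abstract.

```python
OR_PER_POINT = 1.52  # unadjusted odds ratio per 1-point increase (from the abstract)


def odds_ratio_for_difference(points_difference, or_per_point=OR_PER_POINT):
    """Odds ratio implied by a score difference, assuming log-linearity."""
    return or_per_point ** points_difference


if __name__ == "__main__":
    # Compare a few score differences, including the full range (-6 to 7 = 13 points).
    for diff in (1, 3, 13):
        print(diff, round(odds_ratio_for_difference(diff), 2))
```

Under this assumption, the 13-point span between the lowest and highest CMG scores corresponds to an odds ratio of roughly 230, which is an extrapolation of the point estimate rather than a figure reported by the authors.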
Validation of pathological grading systems for predicting metastatic potential in pheochromocytoma and paraganglioma

PubMed Central

Koh, Jung-Min; Ahn, Seong Hee; Kim, Hyeonmok; Kim, Beom-Jun; Sung, Tae-Yon; Kim, Young Hoon; Hong, Suck Joon; Song, Dong Eun

2017-01-01

Purpose The Grading system for Adrenal Pheochromocytoma and Paraganglioma (GAPP) was proposed for predicting the metastatic potential of pheochromocytoma and paraganglioma to overcome the limitations of the Pheochromocytoma of the Adrenal Scaled Score (PASS). However, to date, no study validating the GAPP has been conducted, and previous studies did not include mutations in the succinate dehydrogenase type B (SDHB) gene in the score calculation. In this retrospective cohort study, we validated the prediction ability of GAPP and assessed whether it would be improved by inclusion of the loss of SDHB immunohistochemical staining. Methods We divided the tumors into non-metastatic and metastatic groups based on the presence of synchronous or metachronous metastases. The GAPP score and PASS at the initial operation were measured. Moreover, we combined some GAPP parameters with the immunohistochemical staining of SDHB to obtain a modified GAPP (M-GAPP) score. Results Metastasis occurred in 15/72 (20.8%) patients, with a mean follow-up of 43.5 months. Loss of SDHB staining was more frequent (P = 0.044) in the metastatic group. The GAPP score (P = 0.006), PASS (P = 0.003), and M-GAPP score (P<0.001) were all higher in the metastatic group. Twelve of 40 (30.0%) moderately or poorly differentiated tumors, as defined by the GAPP score, and 12/34 (35.3%) tumors with a PASS ≥4 were metastatic. Conversely, 10/19 (52.6%) tumors with an M-GAPP score ≥3 were metastatic. The area under the curve of the M-GAPP score (0.822) was significantly higher than that of the GAPP (0.728) (P = 0.012), but similar to that of the PASS (0.753) (P = 0.411). The GAPP (P = 0.032) and M-GAPP scores (P = 0.040), but not PASS (P = 0.200), negatively correlated with metastasis-free survival. Conclusion The GAPP was validated, and M-GAPP, a combination of some GAPP parameters and loss of SDHB staining, might be useful for the prediction of the metastatic potential of pheochromocytoma and paraganglioma. 
PMID:29117221
Reliability and Construct Validity of Limits of Stability Test in Adolescents Using a Portable Forceplate System.

PubMed

Alsalaheen, Bara; Haines, Jamie; Yorke, Amy; Broglio, Steven P

2015-12-01

To examine the reliability, convergent, and discriminant validity of the limits of stability (LOS) test to assess dynamic postural stability in adolescents using a portable forceplate system. Cross-sectional reliability observational study. School setting. Adolescents (N=36) completed all measures during the first session. To examine the reliability of the LOS test, a subset of 15 participants repeated the LOS test after 1 week. Not applicable. Outcome measurements included the LOS test, Balance Error Scoring System, Instrumented Balance Error Scoring System, and Modified Clinical Test for Sensory Interaction on Balance. A significant relation was observed among LOS composite scores (r=.36-.87, P<.05). However, no relation was observed between LOS and static balance outcome measurements. The reliability of the LOS composite scores ranged from moderate to good (intraclass correlation coefficient model 2,1=.73-.96). The results suggest that the LOS composite scores provide unique information about dynamic postural stability, and the LOS test completed at 100% of the theoretical limit appeared to be a reliable test of dynamic postural stability in adolescents. Clinicians should use dynamic balance measurement as part of their balance assessment and should not use static balance testing (eg, Balance Error Scoring System) to make inferences about dynamic balance, especially when balance assessment is used to determine rehabilitation outcomes, or when making return to play decisions after injury. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Validation of a prediction model that allows direct comparison of the Oxford Knee Score and American Knee Society clinical rating system.

PubMed

Maempel, J F; Clement, N D; Brenkel, I J; Walmsley, P J

2015-04-01

This study demonstrates a significant correlation between the American Knee Society (AKS) Clinical Rating System and the Oxford Knee Score (OKS) and provides a validated prediction tool to estimate score conversion. A total of 1022 patients were prospectively clinically assessed five years after TKR and completed AKS assessments and an OKS questionnaire. Multivariate regression analysis demonstrated significant correlations between OKS and the AKS knee and function scores but a stronger correlation (r = 0.68, p < 0.001) when using the sum of the AKS knee and function scores. Addition of body mass index and age (other statistically significant predictors of OKS) to the algorithm did not significantly increase the predictive value. The simple regression model was used to predict the OKS in a group of 236 patients who were clinically assessed nine to ten years after TKR using the AKS system. The predicted OKS was compared with actual OKS in the second group. Intra-class correlation demonstrated excellent reliability (r = 0.81, 95% confidence intervals 0.75 to 0.85) for the combined knee and function score when used to predict OKS. Our findings will facilitate comparison of outcome data from studies and registries using either the OKS or the AKS scores and may also be of value for those undertaking meta-analyses and systematic reviews. ©2015 The British Editorial Society of Bone & Joint Surgery.
Development of an instrument to measure medical students' perceptions of the assessment environment: initial validation.

PubMed

Sim, Joong Hiong; Tong, Wen Ting; Hong, Wei-Han; Vadivelu, Jamuna; Hassan, Hamimah

2015-01-01

Assessment environment, synonymous with climate or atmosphere, is multifaceted. Although there are valid and reliable instruments for measuring the educational environment, there is no validated instrument for measuring the assessment environment in medical programs. This study aimed to develop an instrument for measuring students' perceptions of the assessment environment in an undergraduate medical program and to examine the psychometric properties of the new instrument. The Assessment Environment Questionnaire (AEQ), a 40-item, four-point (1=Strongly Disagree to 4=Strongly Agree) Likert scale instrument designed by the authors, was administered to medical undergraduates from the authors' institution. The response rate was 626/794 (78.84%). To establish construct validity, exploratory factor analysis (EFA) with principal component analysis and varimax rotation was conducted. To examine the internal consistency reliability of the instrument, Cronbach's α was computed. Mean scores for the entire AEQ and for each factor/subscale were calculated. Mean AEQ scores of students from different academic years and sex were examined. Six hundred and eleven completed questionnaires were analysed. EFA extracted four factors: feedback mechanism (seven items), learning and performance (five items), information on assessment (five items), and assessment system/procedure (three items), which together explained 56.72% of the variance. Based on the four extracted factors/subscales, the AEQ was reduced to 20 items. Cronbach's α for the 20-item AEQ was 0.89, whereas Cronbach's α for the four factors/subscales ranged from 0.71 to 0.87. Mean score for the AEQ was 2.68/4.00. The factor/subscale of 'feedback mechanism' recorded the lowest mean (2.39/4.00), whereas the factor/subscale of 'assessment system/procedure' scored the highest mean (2.92/4.00). Significant differences were found among the AEQ scores of students from different academic years. 
The AEQ is a valid and reliable instrument. Initial validation supports its use to measure students' perceptions of the assessment environment in an undergraduate medical program.
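
Cronbach's α, used here to assess internal consistency, can be computed from a respondents-by-items table with the standard formula α = k/(k-1) × (1 - Σ item variances / variance of total scores). The following Python sketch uses only the standard library; the function name and the toy responses are made up for illustration and are not AEQ data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical four-point Likert responses (rows = students, columns = items).
    toy = [[1, 2, 2], [3, 3, 4], [2, 2, 3], [4, 4, 4]]
    print(round(cronbach_alpha(toy), 3))
```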
The Reliability and Validity of the Thoracolumbar Injury Classification System in Pediatric Spine Trauma.

PubMed

Savage, Jason W; Moore, Timothy A; Arnold, Paul M; Thakur, Nikhil; Hsu, Wellington K; Patel, Alpesh A; McCarthy, Kathryn; Schroeder, Gregory D; Vaccaro, Alexander R; Dimar, John R; Anderson, Paul A

2015-09-15

The thoracolumbar injury classification system (TLICS) was evaluated in 20 consecutive pediatric spine trauma cases. The purpose of this study was to determine the reliability and validity of the TLICS in pediatric spine trauma. The TLICS was developed to improve the categorization and management of thoracolumbar trauma. TLICS has been shown to have good reliability and validity in the adult population. The clinical and radiographical findings of 20 pediatric thoracolumbar fractures were prospectively presented to 20 surgeons with disparate levels of training and experience with spinal trauma. These injuries were consecutively scored using the TLICS. Cohen unweighted κ coefficients and Spearman rank order correlation values were calculated for the key parameters (injury morphology, status of posterior ligamentous complex, neurological status, TLICS total score, and proposed management) to assess the inter-rater reliabilities. Five surgeons scored the same cases 3 months later to assess the intra-rater reliability. The actual management of each case was then compared with the treatment recommended by the TLICS algorithm to assess validity. The inter-rater κ statistics of all subgroups (injury morphology, status of the posterior ligamentous complex, neurological status, TLICS total score, and proposed treatment) were within the range of moderate to substantial reproducibility (0.524-0.958). All subgroups had excellent intra-rater reliability (0.748-1.000). The various indices for validity were calculated (80.3% correct, 0.836 sensitivity, 0.785 specificity, 0.676 positive predictive value, 0.899 negative predictive value). Overall, TLICS demonstrated good validity. The TLICS has good reliability and validity when used in the pediatric population. 
The inter-rater reliability of predicting management and indices for validity are lower than those in adults with thoracolumbar fractures, which is likely due to differences in the way children are treated for certain types of injuries. TLICS can be used to reliably categorize thoracolumbar injuries in the pediatric population; however, modifications may be needed to better guide treatment in this specific patient population. Level of Evidence: 4.
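
The validity indices quoted in this abstract (percent correct, sensitivity, specificity, positive and negative predictive value) all derive from a 2×2 table comparing the recommended treatment with the actual management. The sketch below shows the arithmetic in Python. The counts are hypothetical, since the abstract does not report the underlying table, and treating one management category as "positive" is an assumption made only for illustration.

```python
def validity_indices(tp, fp, fn, tn):
    """Standard 2x2 indices from true/false positive and negative counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # proportion classified correctly
        "sensitivity": tp / (tp + fn),   # true positives among actual positives
        "specificity": tn / (tn + fp),   # true negatives among actual negatives
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the study.
    for name, value in validity_indices(tp=40, fp=10, fn=5, tn=45).items():
        print(f"{name}: {value:.3f}")
```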
Derivation and Evaluation of a Risk-Scoring Tool to Predict Participant Attrition in a Lifestyle Intervention Project.

PubMed

Jiang, Luohua; Yang, Jing; Huang, Haixiao; Johnson, Ann; Dill, Edward J; Beals, Janette; Manson, Spero M; Roubideaux, Yvette

2016-05-01

Participant attrition in clinical trials and community-based interventions is a serious, common, and costly problem. In order to develop a simple predictive scoring system that can quantify the risk of participant attrition in a lifestyle intervention project, we analyzed data from the Special Diabetes Program for Indians Diabetes Prevention Program (SDPI-DP), an evidence-based lifestyle intervention to prevent diabetes in 36 American Indian and Alaska Native communities. SDPI-DP participants were randomly divided into a derivation cohort (n = 1600) and a validation cohort (n = 801). Logistic regressions were used to develop a scoring system from the derivation cohort. The discriminatory power and calibration properties of the system were assessed using the validation cohort. Seven independent factors predicted program attrition: gender, age, household income, comorbidity, chronic pain, site's user population size, and average age of site staff. Six factors predicted long-term attrition: gender, age, marital status, chronic pain, site's user population size, and average age of site staff. Each model exhibited moderate to fair discriminatory power (C statistic in the validation set: 0.70 for program attrition, and 0.66 for long-term attrition) and excellent calibration. The resulting scoring system offers a low-technology approach to identify participants at elevated risk for attrition in future similar behavioral modification intervention projects, which may inform appropriate allocation of retention resources. This approach also serves as a model for other efforts to prevent participant attrition.
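
The C statistic reported here as a measure of discriminatory power is the probability that a randomly chosen participant who had the outcome received a higher risk score than a randomly chosen participant who did not, with ties counted as one half. A minimal Python sketch of that pairwise definition follows; the function name and example values are hypothetical, and the quadratic loop is meant for clarity rather than for large datasets.

```python
def c_statistic(scores, outcomes):
    """Concordance (C) statistic for binary outcomes coded 1 (event) or 0."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for event_score in events:
        for non_event_score in non_events:
            if event_score > non_event_score:
                concordant += 1.0   # event ranked above non-event
            elif event_score == non_event_score:
                concordant += 0.5   # ties count as half
    return concordant / (len(events) * len(non_events))


if __name__ == "__main__":
    # Hypothetical risk scores and attrition outcomes.
    print(c_statistic([1, 4, 3, 8], [0, 0, 1, 1]))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.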
Procedure-specific assessment tool for flexible pharyngo-laryngoscopy: gathering validity evidence and setting pass-fail standards.

PubMed

Melchiors, Jacob; Petersen, K; Todsen, T; Bohr, A; Konge, Lars; von Buchwald, Christian

2018-06-01

The attainment of specific identifiable competencies is the primary measure of progress in the modern medical education system. To be feasible, the system therefore requires a method for accurately assessing competence. Evidence of validity needs to be gathered before an assessment tool can be implemented in the training and assessment of physicians. According to contemporary validity theory, this evidence must be gathered from specific sources in a structured and rigorous manner. Flexible pharyngo-laryngoscopy (FPL) is a central procedure for the otorhinolaryngologist. We aim to evaluate the flexible pharyngo-laryngoscopy assessment tool (FLEXPAT) created in a previous study and to establish a pass-fail level for proficiency. Eighteen physicians with different levels of experience (novices, intermediates, and experienced) were recruited to the study. Each performed an FPL on two patients. These procedures were video recorded, blinded, and assessed by two specialists. The score was expressed as the percentage of the maximum possible score. Cronbach's α was used to analyze internal consistency of the data, and a generalizability analysis was performed. The scores of the three different groups were explored, and a pass-fail level was determined using the contrasting groups' standard setting method. Internal consistency was strong with a Cronbach's α of 0.86. We found a generalizability coefficient of 0.72, which is sufficient for moderate-stakes assessment. We found a significant difference between the novice and experienced groups (p < 0.001) and strong correlation between experience and score (Pearson's r = 0.75). The pass-fail level was established at 72% of the maximum score. Applying this pass-fail level in the test population resulted in half of the intermediary group receiving a failing score. We gathered validity evidence for the FLEXPAT according to the contemporary framework as described by Messick.
Our results support a claim of validity and are comparable to other studies exploring clinical assessment tools. The high rate of physicians underperforming in the intermediary group demonstrates the need for continued educational intervention. Based on our work, we recommend the use of the FLEXPAT in clinical assessment of FPL and the application of a pass-fail level of 72% for proficiency.
Development of a Comprehensive Osteochondral Allograft MRI Scoring System (OCAMRISS) With Histopathologic, Micro–Computed Tomography, and Biomechanical Validation

PubMed Central

Pallante-Kichura, Andrea L.; Bae, Won C.; Du, Jiang; Statum, Sheronda; Wolfson, Tanya; Gamst, Anthony C.; Cory, Esther; Amiel, David; Bugbee, William D.; Sah, Robert L.; Chung, Christine B.

2014-01-01

Objective: To describe and apply a semiquantitative MRI scoring system for multifeature analysis of cartilage defect repair in the knee by osteochondral allografts and to correlate this scoring system with histopathologic, micro–computed tomography (µCT), and biomechanical reference standards using a goat repair model. Design: Fourteen adult goats had 2 osteochondral allografts implanted into each knee: one in the medial femoral condyle and one in the lateral trochlea. At 12 months, goats were euthanized and MRI was performed. Two blinded radiologists independently rated 9 primary features for each graft, including cartilage signal, fill, edge integration, surface congruity, calcified cartilage integrity, subchondral bone plate congruity, subchondral bone marrow signal, osseous integration, and presence of cystic changes. Four ancillary features of the joint were also evaluated, including opposing cartilage, meniscal tears, synovitis, and fat-pad scarring. Comparison was made with histologic and µCT reference standards as well as biomechanical measures. Interobserver agreement and agreement with reference standards was assessed. Cohen’s κ, Spearman’s correlation, and Kruskal-Wallis tests were used as appropriate. Results: There was substantial agreement (κ > 0.6, P < 0.001) for each MRI feature and with comparison against reference standards, except for cartilage edge integration (κ = 0.6). There was a strong positive correlation between MRI and reference standard scores (ρ = 0.86, P < 0.01). Osteochondral allograft MRI scoring system was sensitive to differences in outcomes between the types of allografts. Conclusions: We have described a comprehensive MRI scoring system for osteochondral allografts and have validated this scoring system with histopathologic and µCT reference standards as well as biomechanical indentation testing. PMID:24489999
Validation of Clinical Scoring Systems ART and ABCR after Transarterial Chemoembolization of Hepatocellular Carcinoma.

PubMed

Kloeckner, Roman; Pitton, Michael B; Dueber, Christoph; Schmidtmann, Irene; Galle, Peter R; Koch, Sandra; Wörns, Marcus A; Weinmann, Arndt

2017-01-01

To perform an external validation of the Assessment for Retreatment with Transarterial Chemoembolization (ART) and α-fetoprotein (AFP), Barcelona Clinic Liver Cancer (BCLC), Child-Pugh, and response (ABCR) scores and to compare them in terms of prognostic power. From 2000 to 2015, 871 patients with hepatocellular carcinoma underwent transarterial chemoembolization at a tertiary referral hospital, and 176 met all inclusion and exclusion criteria for both scores and were analyzed. Nineteen percent (n = 34) had BCLC stage A disease and 81% had stage B disease. Thirty-nine patients (22%) presented with elevated AFP levels. Overall survival was calculated. Scores were validated and compared with a Harrell C-index, integrated Brier score (IBS), and prediction error curves. Before the second chemoembolization procedure, 22 patients (12%) showed an increase of 1 point in Child-Pugh score and 51 patients (22%) had an increase of ≥ 2 points. Thirty-one patients (23%) showed a > 25% increase in aspartate aminotransferase level, and 114 (65%) showed a response to treatment. Consequently, 127 patients (72%) had a low ART score and 49 (28%) had a high ART score. One hundred fifty-eight patients (90%) had a low ABCR score, whereas 18 (10%) had a high ABCR score. Low and high ART score groups had median survival durations of 20.8 and 15.3 mo, respectively. Harrell C-indexes were 0.572 and 0.608, and IBSs were 0.135 and 0.128, for ART and ABCR, respectively. For both scores, an increase in Child-Pugh score ≥ 2 points and a radiologic response were significantly associated with survival. Both scores were of limited predictive value, and neither was sufficient to support clear-cut clinical decisions. Further effort is necessary to determine criteria for making valid clinical predictions. Copyright © 2017 SIR. Published by Elsevier Inc. All rights reserved.
Development and Validation of a Chronic Pancreatitis Prognosis Score in 2 Independent Cohorts.

PubMed

Beyer, Georg; Mahajan, Ujjwal M; Budde, Christoph; Bulla, Thomas J; Kohlmann, Thomas; Kuhlmann, Louise; Schütte, Kerstin; Aghdassi, Ali A; Weber, Eckhard; Weiss, F Ulrich; Drewes, Asbjørn M; Olesen, Søren S; Lerch, Markus M; Mayerle, Julia

2017-12-01

The clinical course of chronic pancreatitis is unpredictable. There is no model to assess disease severity or progression or predict patient outcomes. We performed a prospective study of 91 patients with chronic pancreatitis; data were collected from patients seen at academic centers in Europe from January 2011 through April 2014. We analyzed correlations between clinical, laboratory, and imaging data with number of hospital readmissions and in-hospital days over the next 12 months; the parameters with the highest degree of correlation were used to develop a 3-stage chronic pancreatitis prognosis score (COPPS). The predictive strength was validated in 129 independent subjects identified from 2 prospective databases. The mean number of hospital admissions was 1.9 (95% confidence interval [CI], 1.39-2.44) and 15.2 for hospital days (95% CI, 10.76-19.71) for the development cohort and 10.9 for the validation cohort (95% CI, 7.54-14.30) (P = .08). Based on bivariate correlations, pain (numeric rating scale), level of glycated hemoglobin A1c, level of C-reactive protein, body mass index, and platelet count were used to develop the COPPS system. The patients' median COPPS was 8.9 points (range, 5-14). The system accurately discriminated stages of disease severity (low to high): A (5-6 points), B (7-9), and C (10-15). In Pearson correlation analysis of the development cohort, the COPPS correlated with hospital admissions (0.39; P < .01) and number of hospital days (0.33; P < .01). The correlation was validated in the validation set (Pearson correlation values of 0.36 and 0.44; P < .01). COPPS did not correlate with results from the Cambridge classification system. We developed and validated an easy to use dynamic multivariate scoring system, similar to the Child-Pugh-Score for liver cirrhosis. The COPPS allows objective monitoring of patients with chronic pancreatitis, determining risk for readmission to hospital and potential length of hospital stay. 
Copyright © 2017 AGA Institute. Published by Elsevier Inc. All rights reserved.
Validation of Gujarati Version of ABILOCO-Kids Questionnaire

PubMed Central

Diwan, Jasmin; Patel, Pankaj; Bansal, Ankita B.

2015-01-01

Background ABILOCO-Kids is a measure of locomotion ability for children with cerebral palsy (CP) aged 6 to 15 years and is available in English and French. Aim To validate the Gujarati version of the ABILOCO-Kids questionnaire for use in clinical research on the Gujarati population. Materials and Methods The ABILOCO-Kids questionnaire was translated into Gujarati from English using the forward-backward-forward method. To ensure face and content validity of the Gujarati version using the group consensus method, each item was examined by a group of experts with a mean experience of 24.62 years in the field of paediatrics and paediatric physiotherapy. Each item was analysed for content, meaning, wording, format, ease of administration and scoring. Each item was scored by the expert group as either accepted, rejected or accepted with modification. The procedure was continued until 80% consensus was reached for all items. Concurrent validity was examined in 55 children with cerebral palsy (6-15 years) of all Gross Motor Function Classification System (GMFCS) levels and all clinical types by correlating the ABILOCO-Kids score with the Gross Motor Function Measure (GMFM) and GMFCS. Result In phase 1 of validation, 16 items were accepted as is, 22 items were accepted with modification and 3 items went to phase 2 validation. For concurrent validity, a highly significant positive correlation was found between the ABILOCO-Kids score and total GMFM (r=0.713, p<0.005) and a highly significant negative correlation with GMFCS (r= -0.778, p<0.005). Conclusion The Gujarati translated version of the ABILOCO-Kids questionnaire has good face and content validity as well as concurrent validity, and can be used to measure caregiver-reported locomotion ability in children with CP. PMID:26557603
The Interaction of Sexual Validation, Criminal Justice Involvement, and Sexually Transmitted Infection Risk Among Adolescent and Young Adult Males.

PubMed

Matson, Pamela A; Towe, Vivian; Ellen, Jonathan M; Chung, Shang-En; Sherman, Susan G

2018-03-01

Young men who have been involved with the criminal justice system are more likely to have concurrent sexual partners, a key driver of sexually transmitted infections. The value men place on having sexual relationships to validate themselves may play an important role in understanding this association. Data were from a household survey. Young men (N = 132), aged 16 to 24 years, self-reported whether they ever spent time in jail or juvenile detention and if they had sexual partnerships that overlapped in time. A novel scale, "Validation through Sex and Sexual Relationships" (VTSSR), assessed the importance young men place on sex and sexual relationships (α = 0.91). Weighted logistic regression accounted for the sampling design. The mean (SD) VTSSR score was 23.7 (8.8) with no differences by race. Both criminal justice involvement (CJI) (odds ratio [OR], 3.69; 95% confidence interval [CI], 1.12-12.1) and sexual validation (OR, 1.10; 95% CI, 1.04-1.16) were associated with an increased odds of concurrency; however, CJI did not remain associated with concurrency in the fully adjusted model. There was effect modification: CJI was associated with concurrency among those who scored high on sexual validation (OR, 9.18; 95% CI, 1.73-48.6); however, there was no association among those who scored low on sexual validation. Racial differences were observed between CJI and concurrency, but not between sexual validation and concurrency. Sexual validation may be an important driver of concurrency for men who have been involved with the criminal justice system. Study findings have important implications on how sexual validation may explain racial differences in rates of concurrency.
Towards a contemporary, comprehensive scoring system for determining technical outcomes of hybrid percutaneous chronic total occlusion treatment: The RECHARGE score.

PubMed

Maeremans, Joren; Spratt, James C; Knaapen, Paul; Walsh, Simon; Agostoni, Pierfrancesco; Wilson, William; Avran, Alexandre; Faurie, Benjamin; Bressollette, Erwan; Kayaert, Peter; Bagnall, Alan J; Smith, Dave; McEntegart, Margaret B; Smith, William H T; Kelly, Paul; Irving, John; Smith, Elliot J; Strange, Julian W; Dens, Jo

2018-02-01

This study sought to create a contemporary scoring tool to predict technical outcomes of chronic total occlusion (CTO) percutaneous coronary intervention (PCI) from patients treated by hybrid operators with differing experience levels. Current scoring systems need regular updating to cope with the positive evolutions regarding materials, techniques, and outcomes, while at the same time being applicable for a broad range of operators. Clinical and angiographic characteristics from 880 CTO-PCIs included in the REgistry of CrossBoss and Hybrid procedures in FrAnce, the NetheRlands, BelGium and UnitEd Kingdom (RECHARGE) were analyzed by using a derivation and validation set (2:1 ratio). Variables significantly associated with technical failure in the multivariable analysis were incorporated in the score. Subsequently, the discriminatory capacity was assessed and the validation set was used to compare with the J-CTO score and PROGRESS scores. Technical success in the derivation and validation sets was 83% and 85%, respectively. Multivariate analysis identified six parameters associated with technical failure: blunt stump (beta coefficient (b) = 1.014); calcification (b = 0.908); tortuosity ≥45° (b = 0.964); lesion length ≥20 mm (b = 0.556); diseased distal landing zone (b = 0.794), and previous bypass graft on CTO vessel (b = 0.833). Score variables remained significant after bootstrapping. The RECHARGE score showed better discriminatory capacity in both sets (area-under-the-curve (AUC) = 0.783 and 0.711), compared to the J-CTO (AUC = 0.676) and PROGRESS (AUC = 0.608) scores. The RECHARGE score is a novel, easy-to-use tool for assessing the risk for technical failure in hybrid CTO-PCI and has the potential to perform well for a broad community of operators. © 2017 Wiley Periodicals, Inc.
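As a rough illustration of how the beta coefficients reported above combine, the sketch below sums the coefficients of the factors present in a lesion. It assumes an ordinary logistic model in which each coefficient is a log-odds increment for technical failure. The intercept and the published point weights are not given in the abstract, so the result is only a relative log-odds, not a probability and not the official RECHARGE score. All identifiers are hypothetical.

```python
import math

# Beta coefficients for technical failure, copied from the abstract.
# The dictionary keys are illustrative names, not taken from the paper.
RECHARGE_BETAS = {
    "blunt_stump": 1.014,
    "calcification": 0.908,
    "tortuosity": 0.964,               # tortuosity criterion from the abstract
    "long_lesion": 0.556,              # lesion length criterion (20 mm) from the abstract
    "diseased_distal_landing_zone": 0.794,
    "previous_bypass_graft_on_cto_vessel": 0.833,
}


def relative_log_odds(present):
    """Sum the betas of the factors present (intercept unknown, so relative only)."""
    factors = set(present)
    unknown = factors - set(RECHARGE_BETAS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    return sum(RECHARGE_BETAS[name] for name in factors)


def odds_multiplier(present):
    """Factor by which the odds of failure rise versus a lesion with none of the factors."""
    return math.exp(relative_log_odds(present))
```

Under these assumptions, a lesion with a blunt stump and calcification has its odds of technical failure multiplied by exp(1.014 + 0.908) relative to a lesion with neither factor.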
Technical feasibility and validation of a coronary artery calcium scoring system using CT coronary angiography images.

PubMed

Pavitt, Christopher W; Harron, Katie; Lindsay, Alistair C; Zielke, Sayeh; Ray, Robin; Gordon, Daniel; Rubens, Michael B; Padley, Simon P; Nicol, Edward D

2016-05-01

We validate a novel CT coronary angiography (CCTA) coronary calcium scoring system. Calcium was quantified on CCTA images using a new patient-specific attenuation threshold: mean + 2SD of intra-coronary contrast density (HU). Using 335 patient data sets a conversion factor (CF) for predicting CACS from CCTA scores (CCTAS) was derived and validated in a separate cohort (n = 168). Bland-Altman analysis and weighted kappa for MESA centiles and Agatston risk groupings were calculated. Multivariable linear regression yielded a CF: CACS = (1.185 × CCTAS) + (0.002 × CCTAS × attenuation threshold). When applied to CCTA data sets there was excellent correlation (r = 0.95; p < 0.0001) and agreement (mean difference -10.4 [95% limits of agreement -258.9 to 238.1]) with traditional calcium scores. Agreement was better for calcium scores below 500; however, MESA percentile agreement was better for high risk patients. Risk stratification was excellent (Agatston groups k = 0.88 and MESA centiles k = 0.91). Eliminating the dedicated CACS scan decreased patient radiation exposure by approximately one-third. CCTA calcium scores can accurately predict CACS using a simple, individualized, semiautomated approach reducing acquisition time and radiation exposure when evaluating patients for CAD. This method is not affected by the ROI location, imaging protocol, or tube voltage strengthening its clinical applicability. • Coronary calcium scores can be reliably determined on contrast-enhanced cardiac CT • This score can accurately risk stratify patients • Elimination of a dedicated calcium scan reduces patient radiation by a third.
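The conversion factor quoted in the abstract can be written as a small function. This is a minimal sketch that only restates the two formulas given above (the patient-specific threshold and the CACS conversion); the function and parameter names are hypothetical, and nothing here reproduces the authors' image-analysis pipeline.

```python
def attenuation_threshold(mean_contrast_hu, sd_contrast_hu):
    """Patient-specific threshold: mean + 2 SD of intra-coronary contrast density (HU)."""
    return mean_contrast_hu + 2 * sd_contrast_hu


def predicted_cacs(cctas, threshold_hu):
    """CACS = (1.185 x CCTAS) + (0.002 x CCTAS x attenuation threshold), as quoted above."""
    if cctas < 0:
        raise ValueError("CCTA score must be non-negative")
    return 1.185 * cctas + 0.002 * cctas * threshold_hu
```

For example, a CCTA score of 100 measured with a threshold of 500 HU gives 118.5 + 100 = 218.5 predicted Agatston units.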
The development and validation of a questionnaire for rotator cuff disorders: The Functional Shoulder Score

PubMed Central

Ibrahim, Edward F; Petrou, Charalambos; Galanos, Antonis

2015-01-01

Background The purpose of the present study was to validate the Functional Shoulder Score (FSS), a new patient-reported outcome score specifically designed to evaluate patients with rotator cuff disorders. Methods One hundred and nineteen patients were assessed using two shoulder scoring systems [the FSS and the Constant–Murley Score (CMS)] at 3 weeks pre- and 6 months post-arthroscopic rotator cuff surgery. The reliability, validity, responsiveness and interpretability of the FSS were evaluated. Results Reliability analysis (test–retest) showed an intraclass correlation coefficient value of 0.96 [95% confidence interval (CI) = 0.92 to 0.98]. Internal consistency analysis revealed a Cronbach's alpha coefficient of 0.93. The Pearson correlation coefficient FSS-CMS was 0.782 pre-operatively and 0.737 postoperatively (p < 0.0005). There was a statistically significant increase in FSS scores postoperatively, an effect size of 3.06 and standardized response mean of 2.80. The value for minimal detectable change was ±8.38 scale points (based on a 90% CI) and the minimal clinically important difference for improvement was 24.7 ± 5.4 points. Conclusions The FSS is a patient-reported outcome measure that can easily be incorporated into clinical practice, providing a quick, reliable, valid and practical measure for rotator cuff problems. The questionnaire is highly sensitive to clinical change. PMID:27582986
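Internal consistency figures such as the Cronbach's alpha reported above are computed from the item-level responses. The sketch below uses the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores); the data layout (one list of item scores per respondent) is an assumption for illustration and not the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equal-length rows")
    item_variances = sum(variance(col) for col in zip(*rows))
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variances / total_variance)
```

Sample variances are used for both the items and the totals; using population variances throughout would give the same ratio.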
Validity of retrospective disease activity assessment in systemic lupus erythematosus.

PubMed

Arce-Salinas, A; Cardiel, M H; Guzmán, J; Alcocer-Varela, J

1996-05-01

To evaluate the validity of retrospective disease activity assessment derived from clinical charts. We prospectively evaluated 37 patients with systemic lupus erythematosus (SLE) in 90 visits using the SLE Disease Activity Index (SLEDAI), the Mexican SLEDAI (Mex-SLEDAI), and the Lupus Activity Criteria Count (LACC) indices. Routine clinical observations were written by rheumatologists blind to index scores. These notes were reviewed 2 years later to obtain retrospective index scores and their validity was assessed using prospective scores as the standard. Statistical analysis was by Spearman's rank correlation coefficient (rs), Wilcoxon matched pairs test, kappa statistic, and intraclass correlation coefficient (ri). We calculated the sensitivity and specificity of retrospective indices to detect active disease. Median retrospective scores were lower in all indices: SLEDAI (4 vs 2, p = 0.004, rs = 0.68, ri = 0.30); Mex-SLEDAI (2 vs 1, p < 0.0003, rs = 0.79, ri = 0.31); and LACC (1 vs 1, p = 0.007, rs = 0.65, ri = 0.21). Used to detect active SLE, the retrospective SLEDAI had a sensitivity of 0.68 and a specificity of 0.86; corresponding values for the Mex-SLEDAI were 0.72 and 0.91, and for the LACC, 0.77 and 0.76. Retrospective disease activity indices tended to provide lower scores than prospective evaluations. They often missed patients with mildly active disease, but when positive they were good predictors of disease activity.
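Sensitivity and specificity of a retrospective index against the prospective standard follow from a simple two-by-two count. The sketch below shows that calculation under the assumption that each visit has already been classified as active or inactive by both methods; the function name and input layout are hypothetical.

```python
def sensitivity_specificity(reference, test):
    """Return (sensitivity, specificity) of `test` against `reference` (lists of bools)."""
    if len(reference) != len(test):
        raise ValueError("inputs must have the same length")
    tp = fn = tn = fp = 0
    for ref_active, test_active in zip(reference, test):
        if ref_active and test_active:
            tp += 1          # active disease correctly detected
        elif ref_active:
            fn += 1          # active disease missed
        elif test_active:
            fp += 1          # inactive visit flagged as active
        else:
            tn += 1          # inactive visit correctly classified
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one active and one inactive reference visit")
    return tp / (tp + fn), tn / (tn + fp)
```

Here `reference` would hold the prospective classification and `test` the retrospective, chart-derived one.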
Validation of the French version of the Acceptability E-scale (AES) for mental E-health systems.

PubMed

Micoulaud-Franchi, Jean-Arthur; Sauteraud, Alain; Olive, Jérôme; Sagaspe, Patricia; Bioulac, Stéphanie; Philip, Pierre

2016-03-30

Despite the increasing use of E-health systems for mental-health organizations, there is a lack of psychometric tools to evaluate their acceptability by patients with mental disorders. Thus, this study aimed to translate and validate a French version of the Acceptability E-scale (AES), a 6-item self-reported questionnaire that evaluates the extent to which patients find E-health systems acceptable. A forward-backward translation of the AES was performed. The psychometric properties of the French AES version, with construct validity, internal structural validity and external validity (Pearson's coefficient between AES scores and depression symptoms on the Beck Depression Inventory II) were analyzed. In a sample of 178 patients (mean age=46.51 years, SD=12.91 years), the validation process revealed satisfactory psychometric properties: factor analysis revealed two factors: "Satisfaction" (3 items) and "Usability" (3 items) and Cronbach's alpha was 0.7. No significant relation was found between AES scores and depression symptoms. The French version of the AES revealed a two-factor scale that differs from the original version. In line with the importance of acceptability in mental health and with a view to E-health systems for patients with mental disorders, the use of the AES in psychiatry may provide important information on acceptability (i.e., satisfaction and usability). Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
VALIDITY OF THE REARRANGEMENT EXERCISE AS A PREDICTOR OF ESSAY WRITING ABILITY.

ERIC Educational Resources Information Center

CONRY, JULIANNE JOYCE

DATA FROM THE PARAGRAPH ORGANIZATION PORTION OF THE CEEB ENGLISH COMPOSITION TEST (ECT) WERE CONVERTED TO THE ORIGINAL RANK-ORDER AND WERE THEN RESCORED BY THREE SYSTEMS USING SPEARMAN'S RHO TO DETERMINE WHICH METHOD YIELDED SCORES THAT CORRELATED BEST WITH TOTAL ESSAY SCORES. TWO OF THE METHODS INVESTIGATED, ONE IN WHICH THE NUMBER OF SCORES WAS…
TOEFL iBT Speaking Test Scores as Indicators of Oral Communicative Language Proficiency

ERIC Educational Resources Information Center

Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela

2012-01-01

Scores assigned by trained raters and by an automated scoring system (SpeechRater[TM]) on the speaking section of the TOEFL iBT[TM] were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…

The Basilar Artery on Computed Tomography Angiography Prognostic Score for Basilar Artery Occlusion.

PubMed

Alemseged, Fana; Shah, Darshan G; Diomedi, Marina; Sallustio, Fabrizio; Bivard, Andrew; Sharma, Gagan; Mitchell, Peter J; Dowling, Richard J; Bush, Steven; Yan, Bernard; Caltagirone, Carlo; Floris, Roberto; Parsons, Mark W; Levi, Christopher R; Davis, Stephen M; Campbell, Bruce C V

2017-03-01

Basilar artery occlusion is associated with high risk of disability and mortality. This study aimed to assess the prognostic value of a new radiological score: the Basilar Artery on Computed Tomography Angiography (BATMAN) score. A retrospective analysis of consecutive stroke patients with basilar artery occlusion diagnosed on computed tomographic angiography was performed. BATMAN score is a 10-point computed tomographic angiography-based grading system which incorporates thrombus burden and the presence of collaterals. Reliability was assessed with the intraclass correlation coefficient. Good outcome was defined as modified Rankin Scale score of ≤3 at 3 months and successful reperfusion as thrombolysis in cerebral infarction 2b-3. BATMAN score was externally validated and compared with the Posterior Circulation Collateral score. The derivation cohort included 83 patients with 41 in the validation cohort. In receiver operating characteristic (ROC) analysis, BATMAN score had an area under the ROC curve of 0.81 (95% confidence interval [CI], 0.7-0.9) in the derivation cohort and 0.74 (95% CI, 0.6-0.9) in the validation cohort. In logistic regression adjusted for age and clinical severity, BATMAN score of <7 was associated with poor outcome in the derivation cohort (odds ratio, 5.5; 95% CI, 1.4-21; P = 0.01), in the validation cohort (odds ratio, 6.9; 95% CI, 1.4-33; P = 0.01), and in endovascular patients, after adjustment for recanalization and time to treatment (odds ratio, 4.8; 95% CI, 1.2-18; P = 0.01). BATMAN score of <7 was not associated with recanalization. Interrater agreement was substantial (intraclass correlation coefficient, 0.85; 95% CI, 0.8-0.9). BATMAN score had greater accuracy compared with Posterior Circulation Collateral score (P = 0.04). The addition of collateral quality to clot burden in BATMAN score seems to improve prognostic accuracy in basilar artery occlusion patients.
© 2017 American Heart Association, Inc.
French adaptation of the new Knee Society Scoring System for total knee arthroplasty.

PubMed

Debette, C; Parratte, S; Maucort-Boulch, D; Blanc, G; Pauly, V; Lustig, S; Servien, E; Neyret, P; Argenson, J N

2014-09-01

In November 2011, the Knee Society published its new KSS score to evaluate objective clinical data and also patient expectations, satisfaction and knee function during various physical activities before and after total knee arthroplasty (TKA). We undertook the French cross-cultural adaptation of this scoring system according to current recommendations. The French version of the new KSS score is a consistent, feasible, reliable and discriminating score. Eighty patients with knee osteoarthritis were recruited from two centers: one group of 40 patients had a TKA indication, while the other group of 40 patients had an indication for conservative treatment. After the new KSS score was translated and back-translated, it was compared to three other validated instruments (KOOS, AMIQUAL and SF-12) to determine construct validity, discriminating power, feasibility in terms of response rate and existence of floor or ceiling effect, internal consistency with Cronbach's alpha and reliability based on reproducibility and sensitivity to change (responsiveness). Due to missing data, two cases were eliminated. We found that the score could discriminate between groups; it had a nearly 100% response rate, a ceiling effect in the "expectations" domain, satisfactory Cronbach's alpha, excellent reproducibility and good responsiveness. These results confirm that the French version of the new KSS score is reliable, feasible, discriminating, consistent and responsive. The novelty of this scoring system resides in the "expectations" and "satisfaction" domains, its availability as a self-assessment questionnaire and the evaluation of function during various activities. Level III. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
AFSS: athlete's foot severity score. A proposal and validation.

PubMed

Cohen, A D; Wolak, A; Alkan, M; Shalev, R; Vardy, D A

2002-04-01

We developed a simple scoring system to evaluate the severity of tinea pedis (Athlete's foot severity score, AFSS). The AFSS consists of a clinical evaluation, using a three-point scale, of erythema and scaling in the plantar and interdigital spaces of the feet, and counts of interdigital spaces involved. Each foot is evaluated separately. The validity of the AFSS was assessed in 224 soldiers of the Israel Defense Force using mycological cultures as the main outcome measure and subjective assessment of pruritus as the secondary outcome measure. Mycological examinations were performed in 106 patients who had clinical evidence of tinea pedis. AFSS was significantly associated with culture results (P<0.0001), as well as with the presence of pruritus (P=0.002), and pruritus scores (P=0.025). We conclude the AFSS is valid for the clinical evaluation of tinea pedis severity in military settings. The application of AFSS to civilian morbidity should be subjected to further evaluation.
Validity and reliability of the infant breastfeeding assessment tool, the mother baby assessment tool, and the LATCH scoring system.

PubMed

Altuntas, Nilgun; Turkyilmaz, Canan; Yildiz, Havva; Kulali, Ferit; Hirfanoglu, Ibrahim; Onal, Esra; Ergenekon, Ebru; Koç, Esin; Atalay, Yıldız

2014-05-01

We aimed to evaluate the validity and reliability of the Infant Breastfeeding Assessment Tool (IBFAT), the Mother Baby Assessment (MBA) Tool, and the LATCH scoring system. Mothers who delivered healthy, full-term infants in the Obstetrics & Gynecology Service of Gazi University, Ankara, Turkey, between December 2013 and January 2014 and their infants were included in the study. Forty-six randomly selected breastfeeding sessions were monitored and scored simultaneously by three researchers (Raters 1, 2, and 3) using LATCH, IBFAT, and the MBA Tool. Researchers put the score sheets in an envelope in order to hide them from each other. The agreement of the scores given by the three researchers was assessed by statistical methods. We found positive and significant correlation coefficients ranging from 0.81 to 0.88 for the total MBA score, from 0.90 to 0.95 for the total IBFAT score, and from 0.85 to 0.91 for the total LATCH score. Correlation coefficients between the three tools ranged from 0.71 to 0.88, with the minimum value being noted for the correlation between LATCH and IBFAT scores and the maximum value being noted for the correlation between LATCH and MBA scores. We found positive and significant correlations between researchers' scores for 46 observations using the three assessment tools. This study showed that the above-mentioned tools were compatible for the assessment of the efficiency of breastfeeding.
Development and validation of an individual dietary index based on the British Food Standard Agency nutrient profiling system in a French context.

PubMed

Julia, Chantal; Touvier, Mathilde; Méjean, Caroline; Ducrot, Pauline; Péneau, Sandrine; Hercberg, Serge; Kesse-Guyot, Emmanuelle

2014-12-01

Nutrient profiling systems could be useful public health tools as a basis for front-of-package nutrition labeling, advertising regulations, or food taxes. However, their ability beyond characterization of foods to adequately characterize individual diets necessitates further investigation. The objectives of this study were 1) to calculate a score at the individual level based on the British Food Standard Agency (FSA) food-level nutrient profiling system of each food consumed, and 2) to evaluate the validity of the resulting diet-quality score against food group consumption, nutrient intake, and sociodemographic and lifestyle variables. A representative sample of the French population was selected from the NutriNet-Santé Study (n = 4225). Dietary data were collected through repeated 24-h dietary records. Sociodemographic and lifestyle data were self-reported. All foods consumed were characterized by their FSA nutrient profile, and the energy intake from each food consumed was used to compute FSA-derived aggregated scores at the individual level. A score of adherence to French nutritional recommendations [Programme National Nutrition Santé guideline score (PNNS-GS)] was computed as a comparison diet-quality score. Associations between food consumption, nutritional indicators, lifestyle and sociodemographic variables, and quartiles of aggregated scores were investigated using ANOVAs and linear regression models. Participants with more favorable scores consumed higher amounts of fruits [difference Δ = 156 g/d between quartile 1 (less favorable) and quartile 4 (most favorable), P < 0.001], vegetables (Δ = 85 g/d, P < 0.001), and fish, and lower amounts of snack foods (Δ = -72 g/d, P < 0.001 for sugary snacks); they also had higher vitamin and mineral intakes and lower intakes of saturated fat. Participants with more favorable scores also had a higher adherence to nutritional recommendations measured with the PNNS-GS (Δ = 2.13 points, P < 0.001). 
Women, older subjects, and higher-income subjects were more likely to have more favorable scores. Our results show adequate validity of the FSA nutrient profiling system to characterize individual diets in a French context. The NutriNet-Santé Study was registered in the European Clinical Trials Database (EudraCT) as 2013-000929-31. © 2014 American Society for Nutrition.
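The abstract states that the energy intake from each food was used to aggregate food-level FSA scores into an individual score, without giving the exact formula. The sketch below assumes one plausible reading, an energy-weighted mean of the food-level scores; that weighting scheme, the function name and the input layout are all assumptions for illustration.

```python
def aggregated_fsa_score(foods):
    """Energy-weighted mean of food-level FSA scores.

    `foods` is an iterable of (fsa_score, energy_kcal) pairs, one per food consumed.
    The energy-weighted mean is an assumed aggregation, not confirmed by the abstract.
    """
    foods = list(foods)
    total_energy = sum(energy for _, energy in foods)
    if total_energy <= 0:
        raise ValueError("total energy intake must be positive")
    return sum(score * energy for score, energy in foods) / total_energy
```

With this weighting, a food contributes to the individual score in proportion to its share of total energy intake, so a low-energy item has little influence even when its own score is extreme.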
Development and Validation of a Novel Scoring System for Predicting Technical Success of Chronic Total Occlusion Percutaneous Coronary Interventions: The PROGRESS CTO (Prospective Global Registry for the Study of Chronic Total Occlusion Intervention) Score.

PubMed

Christopoulos, Georgios; Kandzari, David E; Yeh, Robert W; Jaffer, Farouc A; Karmpaliotis, Dimitri; Wyman, Michael R; Alaswad, Khaldoon; Lombardi, William; Grantham, J Aaron; Moses, Jeffrey; Christakopoulos, Georgios; Tarar, Muhammad Nauman J; Rangan, Bavana V; Lembo, Nicholas; Garcia, Santiago; Cipher, Daisha; Thompson, Craig A; Banerjee, Subhash; Brilakis, Emmanouil S

2016-01-11

This study sought to develop a novel parsimonious score for predicting technical success of chronic total occlusion (CTO) percutaneous coronary intervention (PCI) performed using the hybrid approach. Predicting technical success of CTO PCI can facilitate clinical decision making and procedural planning. We analyzed clinical and angiographic parameters from 781 CTO PCIs included in PROGRESS CTO (Prospective Global Registry for the Study of Chronic Total Occlusion Intervention) using a derivation and validation cohort (2:1 sampling ratio). Variables with strong association with technical success in multivariable analysis were assigned 1 point, and a 4-point score was developed from summing all points. The PROGRESS CTO score was subsequently compared with the J-CTO (Multicenter Chronic Total Occlusion Registry in Japan) score in the validation cohort. Technical success was 92.9%. On multivariable analysis, factors associated with technical success included proximal cap ambiguity (beta coefficient [b] = 0.88), moderate/severe tortuosity (b = 1.18), circumflex artery CTO (b = 0.99), and absence of "interventional" collaterals (b = 0.88). The resulting score demonstrated good calibration and discriminatory capacity in the derivation (Hosmer-Lemeshow chi-square = 2.633; p = 0.268, and receiver-operator characteristic [ROC] area = 0.778) and validation (Hosmer-Lemeshow chi-square = 5.333; p = 0.070, and ROC area = 0.720) subset. In the validation cohort, the PROGRESS CTO and J-CTO scores performed similarly in predicting technical success (ROC area 0.720 vs. 0.746, area under the curve difference = 0.026, 95% confidence interval = -0.093 to 0.144). The PROGRESS CTO score is a novel useful tool for estimating technical success in CTO PCI performed using the hybrid approach. Copyright © 2016 American College of Cardiology Foundation. Published by Elsevier Inc. All rights reserved.
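The scoring rule described above (one point for each of four variables, summed to a score from 0 to 4) can be sketched as follows. The class and field names are hypothetical; the four variables are those listed in the abstract.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class CtoLesion:
    """The four angiographic variables named in the abstract (field names are illustrative)."""
    proximal_cap_ambiguity: bool
    moderate_or_severe_tortuosity: bool
    circumflex_artery_cto: bool
    no_interventional_collaterals: bool


def progress_cto_score(lesion):
    """One point per variable present, giving a score from 0 to 4."""
    return sum(1 for present in astuple(lesion) if present)
```

A lesion with an ambiguous proximal cap in the circumflex artery, without tortuosity and with usable collaterals, would score 2 under this rule.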
Development of Servo Motor Trainer for Basic Control System in Laboratory of Electrical Engineering Control System Faculty of Engineering Universitas Negeri Surabaya

NASA Astrophysics Data System (ADS)

Endryansyah; Wanarti Rusimamto, Puput; Ridianto, Adam; Sugiarto, Hariyadi

2018-04-01

The Department of Electrical Engineering, Faculty of Engineering (FT), Unesa offers three programs: S1 Electrical Engineering Education, S1 Electrical Engineering, and D3 Electrical Engineering. The Basic Control System course is part of the curriculum of all three programs. The team of lecturers for this course sought a learning innovation focused on developing a trainer for student practicums in the control systems laboratory. The trainer developed is a servo motor, accompanied by a lab module that covers the theory of servo motors and guides the practicum. This is development research using the Research and Development (R&D) method. The steps applied in this study were: identify the potential and existing problems, gather information and study the literature, design the product, validate the design, revise the design, and run a limited trial. The validation of the learning materials (modules and trainer) gave the following results: the validation score of the learning device was 3.64, the validation score of the Servo Motor lab module was 3.47, and the score of the student response questionnaire was 3.73. All of these values fall in the interval from >3.25 to 4, categorized as “Very Valid”, so it can be concluded that all instruments are “Very Valid” and suitable for use in further learning.
The London handicap scale: a re-evaluation of its validity using standard scoring and simple summation.

PubMed

Jenkinson, C; Mant, J; Carter, J; Wade, D; Winner, S

2000-03-01

To assess the validity of the London handicap scale (LHS) using a simple unweighted scoring system compared with traditional weighted scoring. A total of 323 patients admitted to hospital with acute stroke were followed up by interview 6 months after their stroke as part of a trial looking at the impact of a family support organiser. Outcome measures included the six item LHS, the Dartmouth COOP charts, the Frenchay activities index, the Barthel index, and the hospital anxiety and depression scale. Patients' handicap score was calculated both using the standard procedure (with weighting) for the LHS, and using a simple summation procedure without weighting (U-LHS). Construct validity of both LHS and U-LHS was assessed by testing their correlations with the other outcome measures. Cronbach's alpha for the LHS was 0.83. The U-LHS was highly correlated with the LHS (r=0.98). Correlation of U-LHS with the other outcome measures gave very similar results to correlation of LHS with these measures. Simple summation scoring of the LHS does not lead to any change in the measurement properties of the instrument compared with standard weighted scoring. Unweighted scores are easier to calculate and interpret, so it is recommended that these are used.
Validation of Computerized Automatic Calculation of the Sequential Organ Failure Assessment Score

PubMed Central

Harrison, Andrew M.; Pickering, Brian W.; Herasevich, Vitaly

2013-01-01

Purpose. To validate the use of a computer program for the automatic calculation of the sequential organ failure assessment (SOFA) score, as compared to the gold standard of manual chart review. Materials and Methods. Adult admissions (age > 18 years) to the medical ICU with a length of stay greater than 24 hours were studied in the setting of an academic tertiary referral center. A retrospective cross-sectional analysis was performed using a derivation cohort to compare automatic calculation of the SOFA score to the gold standard of manual chart review. After critical appraisal of sources of disagreement, another analysis was performed using an independent validation cohort. Then, a prospective observational analysis was performed using an implementation of this computer program in AWARE Dashboard, which is an existing real-time patient EMR system for use in the ICU. Results. Good agreement between the manual and automatic SOFA calculations was observed for both the derivation (N=94) and validation (N=268) cohorts: 0.02 ± 2.33 and 0.29 ± 1.75 points, respectively. These results were validated in AWARE (N=60). Conclusion. This EMR-based automatic tool accurately calculates SOFA scores and can facilitate ICU decisions without the need for manual data collection. This tool can also be employed in a real-time electronic environment. PMID:23936639
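Agreement figures of the form "0.02 ± 2.33 points" are typically a mean paired difference with its standard deviation. The sketch below computes that summary for paired manual and automatic scores; the reading of the reported values as mean ± SD, the sign convention (automatic minus manual) and the function name are assumptions.

```python
from statistics import mean, stdev


def paired_difference(manual, automatic):
    """Return (mean, SD) of automatic minus manual scores for paired observations."""
    if len(manual) != len(automatic):
        raise ValueError("inputs must have the same length")
    if len(manual) < 2:
        raise ValueError("need at least two pairs to estimate the SD")
    diffs = [a - m for m, a in zip(manual, automatic)]
    return mean(diffs), stdev(diffs)
```

A mean difference near zero indicates little systematic bias, while the SD shows how far individual automatic scores scatter around the manual ones.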
Face, content, and construct validity of four, inanimate training exercises using the da Vinci® Si surgical system configured with Single-Site™ instrumentation.

PubMed

Jarc, Anthony M; Curet, Myriam

2015-08-01

Validated training exercises are essential tools for surgeons as they develop technical skills to use robot-assisted minimally invasive surgical systems. The purpose of this study was to show face, content, and construct validity of four inanimate training exercises using the da Vinci® Si surgical system configured with Single-Site™ instrumentation. New (N = 21) and experienced (N = 6) surgeons participated in the study. New surgeons (11 Gynecology [GYN] and 10 General Surgery [GEN]) had not completed any da Vinci Single-Site cases but may have completed multiport cases using the da Vinci system. They participated in this study prior to attending a certification course focused on da Vinci Single-Site instrumentation. Experienced surgeons (5 GYN and 1 GEN) had completed at least 25 da Vinci Single-Site cases. The surgeons completed four inanimate training exercises and then rated them with a questionnaire. Raw metrics and overall normalized scores were computed using both video recordings and kinematic data collected from the surgical system. The experienced surgeons significantly outperformed new surgeons for many raw metrics and the overall normalized scores derived from video review (p < 0.05). Only one exercise did not achieve a significant difference between new and experienced surgeons (p = 0.08) when calculating an overall normalized score using both video and advanced metrics derived from kinematic data. Both new and experienced surgeons rated the training exercises as appearing to train and measure technical skills used during da Vinci Single-Site surgery and as actually testing the technical skills used during da Vinci Single-Site surgery. In summary, the four training exercises showed face, content, and construct validity. Improved overall scores could be developed using additional metrics not included in this study.
The results suggest that the training exercises could be used in an overall training curriculum aimed at developing proficiency in technical skills for surgeons new to da Vinci Single-Site instrumentation.
Validity and reliability of the Greek version of the xerostomia questionnaire in head and neck cancer patients.

PubMed

Memtsa, Pinelopi Theopisti; Tolia, Maria; Tzitzikas, Ioannis; Bizakis, Ioannis; Pistevou-Gombaki, Kyriaki; Charalambidou, Martha; Iliopoulou, Chrysoula; Kyrgias, George

2017-03-01

Xerostomia after radiation therapy for head and neck (H&N) cancer has serious effects on patients' quality of life. The purpose of this study was to validate the Greek version of the self-reported eight-item xerostomia questionnaire (XQ) in patients treated with radiotherapy for H&N cancer. The XQ was translated into Greek and administered to 100 XQ patients. An exploratory factor analysis was performed. Reliability measures were calculated. Several types of validity were evaluated. The observer-rated scoring system was also used. The mean XQ value was 41.92 (SD 22.71). Factor analysis revealed the unidimensional nature of the questionnaire. High reliability measures (ICC, Cronbach's α, Pearson coefficients) were obtained. Patients differed statistically significantly in terms of XQ score, depending on the RTOG/EORTC classification. The Greek version of XQ is valid and reliable. Its score is well related to observer's findings and it can be used to evaluate the impact of radiation therapy on the subjective feeling of xerostomia.
The BRICS (Bronchiectasis Radiologically Indexed CT Score): A Multicenter Study Score for Use in Idiopathic and Postinfective Bronchiectasis.

PubMed

Bedi, Pallavi; Chalmers, James D; Goeminne, Pieter C; Mai, Cindy; Saravanamuthu, Pira; Velu, Prasad Palani; Cartlidge, Manjit K; Loebinger, Michael R; Jacob, Joe; Kamal, Faisal; Schembri, Nicola; Aliberti, Stefano; Hill, Uta; Harrison, Mike; Johnson, Christopher; Screaton, Nicholas; Haworth, Charles; Polverino, Eva; Rosales, Edmundo; Torres, Antoni; Benegas, Michael N; Rossi, Adriano G; Patel, Dilip; Hill, Adam T

2018-05-01

The goal of this study was to develop a simplified radiological score that could assess clinical disease severity in bronchiectasis. The Bronchiectasis Radiologically Indexed CT Score (BRICS) was devised based on a multivariable analysis of the Bhalla score and its ability in predicting clinical parameters of severity. The score was then externally validated in six centers in 302 patients. A total of 184 high-resolution CT scans were scored for the validation cohort. In a multiple logistic regression model, disease severity markers significantly associated with the Bhalla score were percent predicted FEV1, sputum purulence, and exacerbations requiring hospital admission. Components of the Bhalla score that were significantly associated with the disease severity markers were bronchial dilatation and number of bronchopulmonary segments with emphysema. The BRICS was developed with these two parameters. The receiver operating-characteristic curve values for BRICS in the derivation cohort were 0.79 for percent predicted FEV1, 0.71 for sputum purulence, and 0.75 for hospital admissions per year; these values were 0.81, 0.70, and 0.70, respectively, in the validation cohort. Sputum free neutrophil elastase activity was significantly elevated in the group with emphysema on CT imaging. A simplified CT scoring system can be used as an adjunct to clinical parameters to predict disease severity in patients with idiopathic and postinfective bronchiectasis. Copyright © 2017 American College of Chest Physicians. Published by Elsevier Inc. All rights reserved.
Evaluating the accuracy of the Wechsler Memory Scale-Fourth Edition (WMS-IV) logical memory embedded validity index for detecting invalid test performance.

PubMed

Soble, Jason R; Bain, Kathleen M; Bailey, K Chase; Kirton, Joshua W; Marceaux, Janice C; Critchfield, Edan A; McCoy, Karin J M; O'Rourke, Justin J F

2018-01-08

Embedded performance validity tests (PVTs) allow for continuous assessment of invalid performance throughout neuropsychological test batteries. This study evaluated the utility of the Wechsler Memory Scale-Fourth Edition (WMS-IV) Logical Memory (LM) Recognition score as an embedded PVT using the Advanced Clinical Solutions (ACS) for WAIS-IV/WMS-IV Effort System. This mixed clinical sample comprised 97 total participants, 71 of whom were classified as valid and 26 as invalid based on three well-validated, freestanding criterion PVTs. Overall, the LM embedded PVT demonstrated poor concordance with the criterion PVTs and unacceptable psychometric properties using ACS validity base rates (42% sensitivity/79% specificity). Moreover, 15-39% of participants obtained an invalid ACS base rate despite having a normatively-intact age-corrected LM Recognition total score. Receiver operating characteristic curve analysis revealed that a Recognition total score cutoff of < 61% correct improved specificity (92%) while sensitivity remained weak (31%). Thus, results indicated the LM Recognition embedded PVT is not appropriate for use from an evidence-based perspective, and that clinicians may be faced with reconciling how a normatively intact cognitive performance on the Recognition subtest could simultaneously reflect invalid performance validity.
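As an illustration of how sensitivity and specificity figures such as those above are derived, the following is a minimal sketch in Python. The group sizes (26 invalid, 71 valid) come from the abstract; the individual cell counts are hypothetical values chosen only so that the ratios round to the reported 42% and 79%, and the function name is an assumption.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: invalid performers flagged / missed by the embedded index.
    tn/fp: valid performers passed / wrongly flagged by the embedded index.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical cell counts (not reported in the abstract) that respect the
# reported group sizes: 26 invalid and 71 valid participants.
sens, spec = sensitivity_specificity(tp=11, fn=15, tn=56, fp=15)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

Raising the cutoff so that fewer valid performers are flagged increases specificity at the cost of sensitivity, which is the trade-off the abstract describes.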
Assessing new terminal body and facial hair growth during pregnancy: toward developing a simplified visual scoring system for hirsutism.

PubMed

Yang, Yabo; Han, Yang; Wang, Wenjun; Du, Tao; Li, Yu; Zhang, Jianping; Yang, Dongzi; Zhao, Xiaomiao

2016-02-01

To study the distribution and progression of terminal hair growth in pregnant women and to determine the feasibility of a simplified scoring system for assessing hirsutism. Prospective follow-up observational study. Academic hospital. A total of 115 pregnant women (discovery cohort) and 1,159 women with polycystic ovary syndrome (PCOS) (validation cohort). Facial and body terminal hair growth assessed by modified Ferriman and Gallwey score system (mFG score), and total testosterone (TT) level detected by liquid chromatography with tandem mass spectrometry. Degree of facial and body terminal hair growth. The serum TT level and mFG score increased as pregnancy progressed. Both the prospective study and receiver operating characteristics curve indicated that the body areas with the greatest contribution to hirsutism (defined as an mFG score ≥5) with new terminal hair growth were the upper lip, lower back, lower abdomen, and thigh. A simplified mFG scoring system (sFG) was developed, and a cutoff value of ≥3 was defined as hirsutism. Pregnant hirsute women were distinguished from nonhirsute women with an accuracy of 95.2%, sensitivity of 96.8%, and specificity of 94.3% for detecting hirsutism. This was further validated in the PCOS population with a sensitivity, specificity, and positive predictive value of 97.6%, 96.4%, and 96.4%, respectively. This study suggests that the upper lip, lower back, lower abdomen, and thigh may be an effective simplified combination of the mFG system for the evaluation of excess hair growth in Chinese women. ChiCTR-OCH-14005012. Copyright © 2016 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
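The simplified score described above can be sketched as follows. This sketch assumes, as in the full mFG system, that each body area is graded from 0 to 4 and that the sFG is the sum over the four areas named in the abstract; the abstract itself does not spell out the arithmetic, and the dictionary keys and function names are hypothetical.

```python
# The four body areas named in the abstract; key names are hypothetical.
SFG_AREAS = ("upper_lip", "lower_back", "lower_abdomen", "thigh")
SFG_CUTOFF = 3  # sFG >= 3 was defined as hirsutism in the abstract


def sfg_score(area_scores):
    """Sum the per-area grades (assumed 0-4 each) over the four sFG areas."""
    for area in SFG_AREAS:
        if not 0 <= area_scores[area] <= 4:
            raise ValueError(f"grade for {area} must be between 0 and 4")
    return sum(area_scores[area] for area in SFG_AREAS)


def is_hirsute(area_scores):
    """Apply the reported cutoff to the simplified score."""
    return sfg_score(area_scores) >= SFG_CUTOFF


# Hypothetical patient: grades for areas outside the sFG set are ignored.
patient = {"upper_lip": 1, "lower_back": 0, "lower_abdomen": 2, "thigh": 1, "chin": 3}
print(sfg_score(patient), is_hirsute(patient))
```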
A Critical Review of Some Qualitative Research Methods Used to Explore Rater Cognition

ERIC Educational Resources Information Center

Suto, Irenka

2012-01-01

Internationally, many assessment systems rely predominantly on human raters to score examinations. Arguably, this facilitates the assessment of multiple sophisticated educational constructs, strengthening assessment validity. It can introduce subjectivity into the scoring process, however, engendering threats to accuracy. The present objectives…
Assessing normative cut points through differential item functioning analysis: an example from the adaptation of the Middlesex Elderly Assessment of Mental State (MEAMS) for use as a cognitive screening test in Turkey.

PubMed

Tennant, Alan; Küçükdeveci, Ayse A; Kutlay, Sehim; Elhan, Atilla H

2006-03-23

The Middlesex Elderly Assessment of Mental State (MEAMS) was developed as a screening test to detect cognitive impairment in the elderly. It includes 12 subtests, each having a 'pass score'. A series of tasks were undertaken to adapt the measure for use in the adult population in Turkey and to determine the validity of existing cut points for passing subtests, given the wide range of educational level in the Turkish population. This study focuses on identifying and validating the scoring system of the MEAMS for Turkish adult population. After the translation procedure, 350 normal subjects and 158 acquired brain injury patients were assessed by the Turkish version of MEAMS. Initially, appropriate pass scores for the normal population were determined through ANOVA post-hoc tests according to age, gender and education. Rasch analysis was then used to test the internal construct validity of the scale and the validity of the cut points for pass scores on the pooled data by using Differential Item Functioning (DIF) analysis within the framework of the Rasch model. Data with the initially modified pass scores were analyzed. DIF was found for certain subtests by age and education, but not for gender. Following this, pass scores were further adjusted and data re-fitted to the model. All subtests were found to fit the Rasch model (mean item fit 0.184, SD 0.319; person fit -0.224, SD 0.557) and DIF was then found to be absent. Thus the final pass scores for all subtests were determined. The MEAMS offers a valid assessment of cognitive state for the adult Turkish population, and the revised cut points accommodate for age and education. Further studies are required to ascertain the validity in different diagnostic groups.
Instrument Motion Metrics for Laparoscopic Skills Assessment in Virtual Reality and Augmented Reality.

PubMed

Fransson, Boel A; Chen, Chi-Ya; Noyes, Julie A; Ragle, Claude A

2016-11-01

To determine the construct and concurrent validity of instrument motion metrics for laparoscopic skills assessment in virtual reality and augmented reality simulators. Evaluation study. Veterinarian students (novice, n = 14) and veterinarians (experienced, n = 11) with no or variable laparoscopic experience. Participants' minimally invasive surgery (MIS) experience was determined by hospital records of MIS procedures performed in the Teaching Hospital. Basic laparoscopic skills were assessed by 5 tasks using a physical box trainer. Each participant completed 2 tasks for assessments in each type of simulator (virtual reality: bowel handling and cutting; augmented reality: object positioning and a pericardial window model). Motion metrics such as instrument path length, angle or drift, and economy of motion of each simulator were recorded. None of the motion metrics in a virtual reality simulator showed correlation with experience, or to the basic laparoscopic skills score. All metrics in augmented reality were significantly correlated with experience (time, instrument path, and economy of movement), except for the hand dominance metric. The basic laparoscopic skills score was correlated to all performance metrics in augmented reality. The augmented reality motion metrics differed between American College of Veterinary Surgeons diplomates and residents, whereas basic laparoscopic skills score and virtual reality metrics did not. Our results provide construct validity and concurrent validity for motion analysis metrics for an augmented reality system, whereas a virtual reality system was validated only for the time score. © Copyright 2016 by The American College of Veterinary Surgeons.
The validation of the visual analogue scale for patient satisfaction after total hip arthroplasty.

PubMed

Brokelman, Roy B G; Haverkamp, Daniel; van Loon, Corné; Hol, Annemiek; van Kampen, Albert; Veth, Rene

2012-06-01

INTRODUCTION: Patient satisfaction is becoming more important in our modern health care system. The assessment of satisfaction is difficult because it is a multifactorial item for which no gold standard exists. One of the potential methods of measuring satisfaction is by using the well-known visual analogue scale (VAS). In this study, we validated VAS for satisfaction. PATIENT AND METHODS: In this prospective study, we studied 147 patients (153 hips). The construct validity was measured using the Spearman correlation test that compares the satisfaction VAS with the Harris hip score, pain VAS at rest and during activity, Oxford hip score, Short Form 36 and Western Ontario McMaster Universities Osteoarthritis Index. The reliability was tested using the intra-class coefficient. RESULTS: The Pearson correlation test showed correlations in the range of 0.40-0.80. The satisfaction VAS had a high correlation with the pain VAS and Oxford hip score, which could mean that pain is one of the most important factors in patient satisfaction. The intra-class coefficient was 0.95. CONCLUSIONS: There is a moderate to marked degree of correlation between the satisfaction VAS and the currently available subjective and objective scoring systems. The intra-class coefficient of 0.95 indicates an excellent test-retest reliability. The VAS satisfaction is a simple instrument to quantify the satisfaction of a patient after total hip arthroplasty. In this study, we showed that the satisfaction VAS has a good validity and reliability.
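For readers unfamiliar with the rank correlation used for construct validity here, the following is a minimal pure-Python sketch of Spearman's rho, computed as the Pearson correlation of average ranks. The function names and the sample values are hypothetical and are not data from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical satisfaction VAS and hip-score values for six patients.
satisfaction_vas = [9, 7, 8, 4, 6, 7]
hip_score = [45, 30, 40, 22, 38, 36]
print(round(spearman_rho(satisfaction_vas, hip_score), 2))
```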
The Cord Blood Apgar: a novel scoring system to optimize selection of banked cord blood grafts for transplantation

PubMed Central

Page, Kristin M.; Zhang, Lijun; Mendizabal, Adam; Wease, Stephen; Carter, Shelly; Shoulars, Kevin; Gentry, Tracy; Balber, Andrew E.; Kurtzberg, Joanne

2012-01-01

BACKGROUND Engraftment failure and delays, likely due to diminished cord blood unit (CBU) potency, remain major barriers to the overall success of unrelated umbilical cord blood transplantation (UCBT). To address this problem, we developed and retrospectively validated a novel scoring system, the Cord Blood Apgar (CBA), which is predictive of engraftment after UCBT. STUDY DESIGN AND METHODS In a single-center retrospective study, utilizing a database of 435 consecutive single cord myeloablative UCBTs performed between January 1, 2000, to December 31, 2008, precryopreservation and postthaw graft variables (total nucleated cell, CD34+, colony-forming units, mononuclear cell content, and volume) were initially correlated with neutrophil engraftment. Subsequently, based on the magnitude of hazard ratios (HRs) in univariate analysis, a weighted scoring system to predict CBU potency was developed using a randomly selected training data set and internally validated on the remaining data set. RESULTS The CBA assigns transplanted CBUs three scores: a precryopreservation score (PCS), a postthaw score (PTS), and a composite score (CS), which incorporates the PCS and PTS values. CBA-PCS scores, which could be used for initial unit selection, were predictive of neutrophil (CBA-PCS ≥ 7.75 vs. <7.75, HR 3.5; p < 0.0001) engraftment. Likewise, CBA-PTS and CS scores were strongly predictive of Day 42 neutrophil engraftment (CBA-PTS ≥ 9.5 vs. <9.5, HR 3.16, p < 0.0001; CBA-CS ≥ 17.75 vs. <17.75, HR 4.01, p < 0.0001). CONCLUSION The CBA is strongly predictive of engraftment after UCBT and shows promise for optimizing screening of CBU donors for transplantation. In the future, a segment could be assayed for the PTS score providing data to apply the CS for final CBU selection. PMID:21810098
S007--Preliminary Evaluation of the Pattern Cutting and the Ligating Loop Virtual Laparoscopic Trainers

PubMed Central

Chellali, A.; Ahn, W.; Sankaranarayanan, G.; Flinn, J. T.; Schwaitzberg, S. D.; Jones, D.B.; De, Suvranu; Cao, C.G.L.

2014-01-01

Introduction The Fundamentals of Laparoscopic Surgery (FLS) trainer is currently the standard for training and evaluating basic laparoscopic skills. However, its manual scoring system is time-consuming and subjective. The Virtual Basic Laparoscopic Skill Trainer (VBLaST©) is the virtual version of the FLS trainer which allows automatic and real time assessment of skill performance, as well as force feedback. In this study, the VBLaST© pattern cutting (VBLaST-PC©) and ligating loop (VBLaST-LL©) tasks were evaluated as part of a validation study. We hypothesized that performance would be similar on the FLS and VBLaST© trainers, and that subjects with more experience would perform better than those with less experience on both trainers. Methods Fifty-five subjects with varying surgical experience were recruited at the Learning Center during the 2013 SAGES annual meeting and were divided into two groups: experts (PGY 5, surgical fellows and surgical attendings) and novices (PGY 1–4). They were asked to perform the pattern cutting or the ligating loop task on the FLS and the VBLaST© trainers. Their performance scores for each trainer were calculated and compared. Results There were no significant differences between the FLS and VBLaST© scores for either the pattern cutting or the ligating loop task. Experts’ scores were significantly higher than the scores for novices on both trainers. Conclusion This study showed that the subjects’ performance on the VBLaST© trainer was similar to the FLS performance for both tasks. Both the VBLaST-PC© and the VBLaST-LL© tasks permitted discrimination between the novice and expert groups. Though concurrent and discriminant validity has been established, further studies to establish convergent and predictive validity are needed. Once validated as a training system for laparoscopic skills, the system is expected to overcome the current limitations of the FLS trainer. PMID:25159626

The discriminatory capability of existing scores to predict advanced colorectal neoplasia: a prospective colonoscopy study of 5,899 screening participants.

PubMed

Wong, Martin C S; Ching, Jessica Y L; Ng, Simpson; Lam, Thomas Y T; Luk, Arthur K C; Wong, Sunny H; Ng, Siew C; Ng, Simon S M; Wu, Justin C Y; Chan, Francis K L; Sung, Joseph J Y

2016-02-03

We evaluated the performance of seven existing risk scoring systems in predicting advanced colorectal neoplasia in an asymptomatic Chinese cohort. We prospectively recruited 5,899 Chinese subjects aged 50-70 years in a colonoscopy screening programme (2008-2014). Scoring systems under evaluation included two scoring tools from the US; one each from Spain, Germany, and Poland; the Korean Colorectal Screening (KCS) scores; and the modified Asia Pacific Colorectal Screening (APCS) scores. The c-statistics, sensitivity, specificity, positive predictive values (PPVs), and negative predictive values (NPVs) of these systems were evaluated. The resources required were estimated based on the Number Needed to Screen (NNS) and the Number Needed to Refer for colonoscopy (NNR). Advanced neoplasia was detected in 364 (6.2%) subjects. The German system referred the least proportion of subjects (11.2%) for colonoscopy, whilst the KCS scoring system referred the highest (27.4%). The c-statistics of all systems ranged from 0.56-0.65, with sensitivities ranging from 0.04-0.44 and specificities from 0.74-0.99. The modified APCS scoring system had the highest c-statistics (0.65, 95% C.I. 0.58-0.72). The NNS (12-19) and NNR (5-10) were similar among the scoring systems. The existing scoring systems have variable capability to predict advanced neoplasia among asymptomatic Chinese subjects, and further external validation should be performed.
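The c-statistic reported above can be read as the probability that a randomly chosen subject with advanced neoplasia receives a higher risk score than a randomly chosen subject without it, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and the scores are hypothetical and do not come from the study.

```python
def c_statistic(scores_cases, scores_noncases):
    """Concordance probability over all case/non-case pairs (ties count 0.5)."""
    concordant = 0.0
    for case_score in scores_cases:
        for noncase_score in scores_noncases:
            if case_score > noncase_score:
                concordant += 1.0
            elif case_score == noncase_score:
                concordant += 0.5
    return concordant / (len(scores_cases) * len(scores_noncases))


# Hypothetical risk scores: three subjects with and four without neoplasia.
with_neoplasia = [3, 4, 2]
without_neoplasia = [1, 2, 3, 0]
print(round(c_statistic(with_neoplasia, without_neoplasia), 3))
```

A value of 0.5 corresponds to chance-level discrimination, which puts the reported range of 0.56-0.65 in context.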
Comparison of the goals and MISTELS scores for the evaluation of surgeons on training benches.

PubMed

Wolf, Rémi; Medici, Maud; Fiard, Gaëlle; Long, Jean-Alexandre; Moreau-Gaudry, Alexandre; Cinquin, Philippe; Voros, Sandrine

2018-01-01

Evaluation of surgical technical abilities is a major issue in minimally invasive surgery. Devices such as training benches offer specific scores to evaluate surgeons but cannot transfer in the operating room (OR). A contrario, several scores measure performance in the OR, but have not been evaluated on training benches. Our aim was to demonstrate that the GOALS score, which can effectively grade in the OR the abilities involved in laparoscopy, can be used for evaluation on a laparoscopic testbench (MISTELS). This could lead to training systems that can identify more precisely the skills that have been acquired or must still be worked on. 32 volunteers (surgeons, residents and medical students) performed the 5 tasks of the MISTELS training bench and were simultaneously video-recorded. Their performance was evaluated with the MISTELS score and with the GOALS score based on the review of the recording by two experienced, blinded laparoscopic surgeons. The concurrent validity of the GOALS score was assessed using Pearson and Spearman correlation coefficients with the MISTELS score. The construct validity of the GOALS score was assessed with k-means clustering and accuracy rates. Lastly, abilities explored by each MISTELS task were identified with multiple linear regression. GOALS and MISTELS scores are strongly correlated (Pearson correlation coefficient = 0.85 and Spearman correlation coefficient = 0.82 for the overall score). The GOALS score proves to be valid for construction for the tasks of the training bench, with a better accuracy rate between groups of level after k-means clustering, when compared to the original MISTELS score (accuracy rates, respectively, 0.75 and 0.56). GOALS score is well suited for the evaluation of the performance of surgeons of different levels during the completion of the tasks of the MISTELS training bench.
Why Lessons Learned from the Past Require Haertel's Expanded Scope for Test Validation

ERIC Educational Resources Information Center

Shepard, Lorrie A.

2013-01-01

In his article, Haertel (this issue) asks a fundamental question about how use of a test is expected to cause improvements in the educational system and in learning. He also considers how test validity should be investigated and argues for a more expansive view of validity that does not stop with scoring or generalization (the more technical and…
A scoring system to predict breast cancer mortality at 5 and 10 years.

PubMed

Paredes-Aracil, Esther; Palazón-Bru, Antonio; Folgado-de la Rosa, David Manuel; Ots-Gutiérrez, José Ramón; Compañ-Rosique, Antonio Fernando; Gil-Guillén, Vicente Francisco

2017-03-24

Although predictive models exist for mortality in breast cancer (BC) (generally all-cause mortality), they are not applicable to all patients and their statistical methodology is not the most powerful to develop a predictive model. Consequently, we developed a predictive model specific for BC mortality at 5 and 10 years resolving the above issues. This cohort study included 287 patients diagnosed with BC in a Spanish region in 2003-2016. Primary variable: time-to-BC death. Secondary variables: age, personal history of breast surgery, personal history of any cancer/BC, premenopause, postmenopause, grade, estrogen receptor, progesterone receptor, c-erbB2, TNM stage, multicentricity/multifocality, diagnosis and treatment. A points system was constructed to predict BC mortality at 5 and 10 years. The model was internally validated by bootstrapping. The points system was integrated into a mobile application for Android. Mean follow-up was 8.6 ± 3.5 years and 55 patients died of BC. The points system included age, personal history of BC, grade, TNM stage and multicentricity. Validation was satisfactory, in both discrimination and calibration. In conclusion, we constructed and internally validated a scoring system for predicting BC mortality at 5 and 10 years. External validation studies are needed for its use in other geographical areas.
External Validation and Evaluation of Reliability and Validity of the Modified Seoul National University Renal Stone Complexity Scoring System to Predict Stone-Free Status After Retrograde Intrarenal Surgery.

PubMed

Park, Juhyun; Kang, Minyong; Jeong, Chang Wook; Oh, Sohee; Lee, Jeong Woo; Lee, Seung Bae; Son, Hwancheol; Jeong, Hyeon; Cho, Sung Yong

2015-08-01

The modified Seoul National University Renal Stone Complexity scoring system (S-ReSC-R) for retrograde intrarenal surgery (RIRS) was developed as a tool to predict stone-free rate (SFR) after RIRS. We externally validated the S-ReSC-R. We retrospectively reviewed 159 patients who underwent RIRS. The S-ReSC-R was assigned from 1 to 12 according to the location and number of sites involved. The stone-free status was defined as no evidence of a stone or with clinically insignificant residual fragment stones less than 2 mm. Interobserver and test-retest reliabilities were evaluated. Statistical performance of the prediction model was assessed by its predictive accuracy, predictive probability, and clinical usefulness. Overall SFR was 73.0%. The SFRs were 86.7%, 70.2%, and 48.6% in low-score (1-2), intermediate-score (3-4), and high-score (5-12) groups, respectively (p<0.001). External validation of S-ReSC-R revealed an area under the curve (AUC) of 0.731 (95% CI 0.650-0.813). The AUC of the three-tiered S-ReSC-R was 0.701 (95% CI 0.609-0.794). The calibration plot showed that the predicted probability of SFR had a concordance comparable to that of observed frequency. The Hosmer-Lemeshow goodness of fit test revealed a p-value of 0.01 for the S-ReSC-R and 0.90 for the three-tiered S-ReSC-R. Interobserver and test-retest reliabilities revealed an almost perfect level of agreement. The present study proved the predictive value of S-ReSC-R to predict SFR following RIRS in an independent cohort. Interobserver and test-retest reliabilities confirmed that S-ReSC-R was reliable and valid.
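The grouping of S-ReSC-R scores into three risk groups can be sketched as a simple lookup. The score ranges and the observed stone-free rates are taken from the abstract; the function and constant names are hypothetical, and the rates are descriptive figures from this cohort rather than predictions for an individual patient.

```python
# Stone-free rates observed per group in the validation cohort (from the abstract).
REPORTED_SFR = {"low": 0.867, "intermediate": 0.702, "high": 0.486}


def s_resc_r_tier(score):
    """Map an S-ReSC-R score (1-12) to its risk group."""
    if not 1 <= score <= 12:
        raise ValueError("S-ReSC-R score must be between 1 and 12")
    if score <= 2:
        return "low"
    if score <= 4:
        return "intermediate"
    return "high"


for example_score in (1, 3, 7):
    tier = s_resc_r_tier(example_score)
    print(example_score, tier, REPORTED_SFR[tier])
```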
Evaluating the complementary roles of an SJT and academic assessment for entry into clinical practice.

PubMed

Cousans, Fran; Patterson, Fiona; Edwards, Helena; Walker, Kim; McLachlan, John C; Good, David

2017-05-01

Although there is extensive evidence confirming the predictive validity of situational judgement tests (SJTs) in medical education, there remains a shortage of evidence for their predictive validity for performance of postgraduate trainees in their first role in clinical practice. Moreover, to date few researchers have empirically examined the complementary roles of academic and non-academic selection methods in predicting in-role performance. This is an important area of enquiry as despite it being common practice to use both types of methods within a selection system, there is currently no evidence that this approach translates into increased predictive validity of the selection system as a whole, over that achieved by the use of a single selection method. In this preliminary study, the majority of the range of scores achieved by successful applicants to the UK Foundation Programme provided a unique opportunity to address both of these areas of enquiry. Sampling targeted high (>80th percentile) and low (<20th percentile) scorers on the SJT. Supervisors rated 391 trainees' in-role performance, and incidence of remedial action was collected. SJT and academic performance scores correlated with supervisor ratings (r = .31 and .28, respectively). The relationship was stronger between the SJT and in-role performance for the low scoring group (r = .33, high scoring group r = .11), and between academic performance and in-role performance for the high scoring group (r = .29, low scoring group r = .11). Trainees with low SJT scores were almost five times more likely to receive remedial action. Results indicate that an SJT for entry into trainee physicians' first role in clinical practice has good predictive validity of supervisor-rated performance and incidence of remedial action. In addition, an SJT and a measure of academic performance appeared to be complementary to each other. These initial findings suggest that SJTs may be more predictive at the lower end of a scoring distribution, and academic attainment more predictive at the higher end.
Continuous Monitoring of Essential Tremor Using a Portable System Based on Smartwatch.

PubMed

Zheng, Xiaochen; Vieira Campos, Alba; Ordieres-Meré, Joaquín; Balseiro, Jose; Labrador Marcos, Sergio; Aladro, Yolanda

2017-01-01

Essential tremor (ET) shows amplitude fluctuations throughout the day, presenting challenges in both clinical and treatment monitoring. Tremor severity is currently evaluated by validated rating scales, which only provide a timely and subjective assessment during a clinical visit. Motor sensors have shown favorable performances in quantifying tremor objectively. A new highly portable system was used to monitor tremor continuously during daily lives. It consists of a smartwatch with a triaxial accelerometer, a smartphone, and a remote server. An experiment was conducted involving eight ET patients. The average effective data collection time per patient was 26 (±6.05) hours. Fahn-Tolosa-Marin Tremor Rating Scale (FTMTRS) was adopted as the gold standard to classify tremor and to validate the performance of the system. Quantitative analysis of tremor severity on different time scales is validated. Significant correlations were observed between neurologist's FTMTRS and patient's FTMTRS auto-assessment scores (r = 0.84; p = 0.009), between the device quantitative measures and the scores from the standardized assessments of neurologists (r = 0.80; p = 0.005) and patient's auto-evaluation (r = 0.97; p = 0.032), and between patient's FTMTRS auto-assessment scores day-to-day (r = 0.87; p < 0.001). A graphical representation of four patients with different degrees of tremor was presented, and a representative system is proposed to summarize the tremor scoring at different time scales. This study demonstrates the feasibility of prolonged and continuous monitoring of tremor severity during daily activities by a highly portable non-restrictive system, a useful tool to analyze efficacy and effectiveness of treatment.
Evaluation of the Validity and Response Burden of Patient Self-Report Measures of the Pain Assessment Screening Tool and Outcomes Registry (PASTOR).

PubMed

Cook, Karon F; Kallen, Michael A; Buckenmaier, Chester; Flynn, Diane M; Hanling, Steven R; Collins, Teresa S; Joltes, Kristin; Kwon, Kyung; Medina-Torne, Sheila; Nahavandi, Parisa; Suen, Joshua; Gershon, Richard

2017-07-01

In 2009, the Army Pain Management Task Force was chartered. On the basis of their findings, the Department of Defense recommended a comprehensive pain management strategy that included development of a standardized pain assessment system that would collect patient-reported outcomes data to inform the patient-provider clinical encounter. The result was the Pain Assessment Screening Tool and Outcomes Registry (PASTOR). The purpose of this study was to assess the validity and response burden of the patient-reported outcome measures in PASTOR. Data for analyses were collected from 681 individuals who completed PASTOR at baseline and follow-up as part of their routine clinical care. The survey tool included self-report measures of pain severity and pain interference (measured using the National Institutes of Health Patient-Reported Outcome Measurement Information System [PROMIS] and the Defense and Veterans Pain Rating scale). PROMIS measures of pain correlates also were administered. Validation analyses included estimation of score associations among measures, comparison of scores of known groups, responsiveness, ceiling and floor effects, and response burden. Results of psychometric testing provided substantial evidence for the validity of PASTOR self-report measures in this population. Expected associations among scores largely supported the concurrent validity of the measures. Scores effectively distinguished among respondents on the basis of their self-reported impressions of general health. PROMIS measures were administered using computer adaptive testing and each, on average, required less than 1 minute to administer. Statistical and graphical analyses demonstrated the responsiveness of PASTOR measures over time. Reprint & Copyright © 2017 Association of Military Surgeons of the U.S.
Factor structure and convergent validity of the Derriford Appearance Scale-24 using standard scoring versus treating ‘not applicable’ responses as missing data: a Scleroderma Patient-centered Intervention Network (SPIN) cohort study

PubMed Central

Merz, Erin L; Kwakkenbos, Linda; Carrier, Marie-Eve; Gholizadeh, Shadi; Mills, Sarah D; Fox, Rina S; Jewett, Lisa R; Williamson, Heidi; Harcourt, Diana; Assassi, Shervin; Furst, Daniel E; Gottesman, Karen; Mayes, Maureen D; Moss, Tim P; Thombs, Brett D; Malcarne, Vanessa L

2018-01-01

Objective Valid measures of appearance concern are needed in systemic sclerosis (SSc), a rare, disfiguring autoimmune disease. The Derriford Appearance Scale-24 (DAS-24) assesses appearance-related distress related to visible differences. There is uncertainty regarding its factor structure, possibly due to its scoring method. Design Cross-sectional survey. Setting Participants with SSc were recruited from 27 centres in Canada, the USA and the UK. Participants who self-identified as having visible differences were recruited from community and clinical settings in the UK. Participants Two samples were analysed (n=950 participants with SSc; n=1265 participants with visible differences). Primary and secondary outcome measures The DAS-24 factor structure was evaluated using two scoring methods. Convergent validity was evaluated with measures of social interaction anxiety, depression, fear of negative evaluation, social discomfort and dissatisfaction with appearance. Results When items marked by respondents as ‘not applicable’ were scored as 0, per standard DAS-24 scoring, a one-factor model fit poorly; when treated as missing data, the one-factor model fit well. Convergent validity analyses revealed strong correlations that were similar across scoring methods. Conclusions Treating ‘not applicable’ responses as missing improved the measurement model, but did not substantively influence practical inferences that can be drawn from DAS-24 scores. Indications of item redundancy and poorly performing items suggest that the DAS-24 could be improved and potentially shortened. PMID:29511009
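The two treatments of 'not applicable' responses compared above can be illustrated with a small sketch. It is a deliberate simplification: the study compared factor models, whereas this code only shows how the same response pattern yields different summary values under each rule. The function names, the use of `None` for 'not applicable', and the item values are all hypothetical.

```python
def total_na_as_zero(responses):
    """Standard-style total: 'not applicable' (None) contributes 0."""
    return sum(0 if r is None else r for r in responses)


def mean_excluding_na(responses):
    """Missing-data style: average only over items the respondent answered."""
    answered = [r for r in responses if r is not None]
    if not answered:
        return None  # no applicable items, so no score can be formed
    return sum(answered) / len(answered)


# Hypothetical responses to six items; None marks 'not applicable'.
responses = [3, None, 2, 4, None, 1]
print(total_na_as_zero(responses))                   # N/A counted as zero
print(total_na_as_zero(responses) / len(responses))  # per-item mean, N/A as zero
print(mean_excluding_na(responses))                  # per-item mean, N/A dropped
```

Scoring 'not applicable' as 0 pulls the per-item average down as if the respondent had reported no distress on those items, which is one way such responses can distort the apparent item structure.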
Validation of the American Board of Orthodontics Objective Grading System for assessing the treatment outcomes of Chinese patients.

PubMed

Song, Guang-Ying; Baumrind, Sheldon; Zhao, Zhi-He; Ding, Yin; Bai, Yu-Xing; Wang, Lin; He, Hong; Shen, Gang; Li, Wei-Ran; Wu, Wei-Zi; Ren, Chong; Weng, Xuan-Rong; Geng, Zhi; Xu, Tian-Min

2013-09-01

Orthodontics in China has developed rapidly, but there is no standard index of treatment outcomes. We assessed the validity of the American Board of Orthodontics Objective Grading System (ABO-OGS) for the classification of treatment outcomes in Chinese patients. We randomly selected 108 patients who completed treatment between July 2005 and September 2008 in 6 orthodontic treatment centers across China. Sixty-nine experienced Chinese orthodontists made subjective assessments of the end-of-treatment casts for each patient. Three examiners then used the ABO-OGS to measure the casts. Pearson correlation analysis and receiver operating characteristic curve analysis were conducted to evaluate the correspondence between the ABO-OGS cast measurements and the orthodontists' subjective assessments. The average subjective grading scores were highly correlated with the ABO-OGS scores (r = 0.7042). Four of the 7 study cast components of the ABO-OGS score (occlusal relationship, overjet, interproximal contact, and alignment) were statistically significantly correlated with the judges' subjective assessments. Together, these 4 accounted for 58% of the variability in the average subjective grading scores. The ABO-OGS cutoff score for cases that the judges deemed satisfactory was 16 points; the corresponding cutoff score for cases that the judges considered acceptable was 21 points. The ABO-OGS is a valid index for the assessment of treatment outcomes in Chinese patients. By comparing the objective scores on this modification of the ABO-OGS with the mean subjective assessment of a panel of highly qualified Chinese orthodontists, a cutoff point for satisfactory treatment outcome was defined as 16 points or fewer, with scores of 16 to 21 points denoting less than satisfactory but still acceptable treatment. Cases that scored greater than 21 points were considered unacceptable. Copyright © 2013 American Association of Orthodontists. Published by Mosby, Inc. All rights reserved.
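The cutoffs reported above translate into a three-way classification, sketched here in Python. The abstract assigns a score of exactly 16 to both the satisfactory band ("16 points or fewer") and the acceptable band ("16 to 21 points"); this sketch assumes 16 counts as satisfactory. The function name and category labels are hypothetical.

```python
SATISFACTORY_MAX = 16  # "16 points or fewer" in the abstract
ACCEPTABLE_MAX = 21    # scores above this were considered unacceptable


def classify_abo_ogs(score):
    """Classify an ABO-OGS total (lower is better) using the reported cutoffs."""
    if score < 0:
        raise ValueError("ABO-OGS score cannot be negative")
    if score <= SATISFACTORY_MAX:
        return "satisfactory"  # assumption: a score of exactly 16 falls here
    if score <= ACCEPTABLE_MAX:
        return "acceptable"
    return "unacceptable"


for example_score in (12, 16, 19, 25):
    print(example_score, classify_abo_ogs(example_score))
```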
The Depressive Experiences Questionnaire: validity and psychological correlates in a clinical sample.

PubMed

Riley, W T; McCranie, E W

1990-01-01

This study sought to compare the original and revised scoring systems of the Depressive Experiences Questionnaire (DEQ) and to assess the construct validity of the Dependent and Self-Critical subscales of the DEQ in a clinically depressed sample. Subjects were 103 depressed inpatients who completed the DEQ, the Beck Depression Inventory (BDI), the Hopelessness Scale, the Automatic Thoughts Questionnaire (ATQ), the Rathus Assertiveness Schedule (RAS), and the Minnesota Multiphasic Personality Inventory (MMPI). The original and revised scoring systems of the DEQ evidenced good concurrent validity for each factor scale, but the revised system did not sufficiently discriminate dependent and self-critical dimensions. Using the original scoring system, self-criticism was significantly and positively related to severity of depression, whereas dependency was not, particularly for males. Factor analysis of the DEQ scales and the other scales used in this study supported the dependent and self-critical dimensions. For men, the correlation of the DEQ with the MMPI scales indicated that self-criticism was associated with psychotic symptoms, hostility/conflict, and a distress/exaggerated response set, whereas dependency did not correlate significantly with any MMPI scales. Females, however, did not exhibit a differential pattern of correlations between either the Dependency or the Self-Criticism scales and the MMPI. These findings suggest possible gender differences in the clinical characteristics of male and female dependent and self-critical depressive subtypes.
Validation of the Lupus Nephritis Clinical Indices in Childhood-Onset Systemic Lupus Erythematosus.

PubMed

Mina, Rina; Abulaban, Khalid; Klein-Gitelman, Marisa S; Eberhard, Barbara A; Ardoin, Stacy P; Singer, Nora; Onel, Karen; Tucker, Lori; O'neil, Kathleen; Wright, Tracey; Brooks, Elizabeth; Rouster-Stevens, Kelly; Jung, Lawrence; Imundo, Lisa; Rovin, Brad; Witte, David; Ying, Jun; Brunner, Hermine I

2016-02-01

To validate clinical indices of lupus nephritis activity and damage when used in children against the criterion standard of kidney biopsy findings. In 83 children requiring kidney biopsy, the Systemic Lupus Erythematosus Disease Activity Index renal domain (SLEDAI-R), British Isles Lupus Assessment Group index renal domain (BILAG-R), Systemic Lupus International Collaborating Clinics (SLICC) renal activity score (SLICC-RAS), and SLICC Damage Index renal domain (SDI-R) were measured. Fixed effects and logistic models were calculated to predict International Society of Nephrology/Renal Pathology Society (ISN/RPS) class; low-to-moderate versus high lupus nephritis activity (National Institutes of Health [NIH] activity index [AI]) score: ≤10 versus >10; tubulointerstitial activity index (TIAI) score: ≤5 versus >5; or the absence versus presence of lupus nephritis chronicity (NIH chronicity index) score: 0 versus ≥1. There were 10, 50, and 23 patients with ISN/RPS class I/II, III/IV, and V, respectively. Scores of the clinical indices did not differentiate among patients by ISN/RPS class. The SLEDAI-R and SLICC-RAS but not the BILAG-R differed with lupus nephritis activity status defined by NIH-AI scores, while only the SLEDAI-R scores differed between lupus nephritis activity status based on TIAI scores. The sensitivity and specificity of the SDI-R to capture lupus nephritis chronicity was 23.5% and 91.7%, respectively. Despite being designed to measure lupus nephritis activity, SLICC-RAS and SLEDAI-R scores significantly differed with lupus nephritis chronicity status. Current clinical indices of lupus nephritis fail to discriminate ISN/RPS class in children. Despite its shortcomings, the SLEDAI-R appears best for measuring lupus nephritis activity in a clinical setting. The SDI-R is a poor correlate of lupus nephritis chronicity. © 2016, American College of Rheumatology.
Derivation and validation of a novel risk score for safe discharge after acute lower gastrointestinal bleeding: a modelling study.

PubMed

Oakland, Kathryn; Jairath, Vipul; Uberoi, Raman; Guy, Richard; Ayaru, Lakshmana; Mortensen, Neil; Murphy, Mike F; Collins, Gary S

2017-09-01

Acute lower gastrointestinal bleeding is a common reason for emergency hospital admission, and identification of patients at low risk of harm, who are therefore suitable for outpatient investigation, is a clinical and research priority. We aimed to develop and externally validate a simple risk score to identify patients with lower gastrointestinal bleeding who could safely avoid hospital admission. We undertook model development with data from the National Comparative Audit of Lower Gastrointestinal Bleeding from 143 hospitals in the UK in 2015. Multivariable logistic regression modelling was used to identify predictors of safe discharge, defined as the absence of rebleeding, blood transfusion, therapeutic intervention, 28 day readmission, or death. The model was converted into a simplified risk scoring system and was externally validated in 288 patients admitted with lower gastrointestinal bleeding (184 safely discharged) from two UK hospitals (Charing Cross Hospital, London, and Hammersmith Hospital, London) that had not contributed data to the development cohort. We calculated C statistics for the new model and did a comparative assessment with six previously developed risk scores. Of 2336 prospectively identified admissions in the development cohort, 1599 (68%) were safely discharged. Age, sex, previous admission for lower gastrointestinal bleeding, rectal examination findings, heart rate, systolic blood pressure, and haemoglobin concentration strongly discriminated safe discharge in the development cohort (C statistic 0·84, 95% CI 0·82-0·86) and in the validation cohort (0·79, 0·73-0·84). Calibration plots showed the new risk score to have good calibration in the validation cohort. The score was better than the Rockall, Blatchford, Strate, BLEED, AIMS65, and NOBLADS scores in predicting safe discharge. A score of 8 or less predicts a 95% probability of safe discharge. 
We developed and validated a novel clinical prediction model with good discriminative performance to identify patients with lower gastrointestinal bleeding who are suitable for safe outpatient management, which has important economic and resource implications. Bowel Disease Research Foundation and National Health Service Blood and Transplant. Copyright © 2017 Elsevier Ltd. All rights reserved.
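The C statistic reported above measures discrimination: the probability that a randomly chosen patient with the outcome receives a higher score than a randomly chosen patient without it. The following is a minimal sketch of that calculation by pair counting, not the authors' code; the function name `c_statistic` and the convention that a higher score means the outcome is more likely are assumptions made for illustration.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Concordance (C) statistic for a binary outcome.

    scores   -- one numeric score per patient (assumed: higher = outcome more likely)
    outcomes -- 1 if the outcome occurred for that patient, 0 otherwise
    """
    with_outcome = [s for s, y in zip(scores, outcomes) if y == 1]
    without_outcome = [s for s, y in zip(scores, outcomes) if y == 0]
    if not with_outcome or not without_outcome:
        raise ValueError("need at least one patient with and one without the outcome")

    concordant = 0.0
    for p, n in product(with_outcome, without_outcome):
        if p > n:
            concordant += 1.0   # pair ranked correctly
        elif p == n:
            concordant += 0.5   # tied scores count as half
    return concordant / (len(with_outcome) * len(without_outcome))


# Hypothetical data: four patients, two with the outcome.
print(c_statistic([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. If a score is constructed so that low values predict the outcome, as with a threshold of "8 or less", the scores would need to be negated before calling this sketch.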
Validating Test Score Meaning and Defending Test Score Use: Different Aims, Different Methods

ERIC Educational Resources Information Center

Cizek, Gregory J.

2016-01-01

Advances in validity theory and alacrity in validation practice have suffered because the term "validity" has been used to refer to two incompatible concerns: (1) the degree of support for specified interpretations of test scores (i.e. intended score meaning) and (2) the degree of support for specified applications (i.e. intended test…
Chronic obstructive lung disease "expert system": validation of a predictive tool for assisting diagnosis.

PubMed

Braido, Fulvio; Santus, Pierachille; Corsico, Angelo Guido; Di Marco, Fabiano; Melioli, Giovanni; Scichilone, Nicola; Solidoro, Paolo

2018-01-01

The purposes of this study were development and validation of an expert system (ES) aimed at supporting the diagnosis of chronic obstructive lung disease (COLD). A questionnaire and a WebFlex code were developed and validated in silico. An expert panel pilot validation on 60 cases and a clinical validation on 241 cases were performed. The developed questionnaire and code validated in silico resulted in a suitable tool to support the medical diagnosis. The clinical validation of the ES was performed in an academic setting that included six different reference centers for respiratory diseases. The results of the ES expressed as a score associated with the risk of suffering from COLD were matched and compared with the final clinical diagnoses. A set of 60 patients were evaluated by a pilot expert panel validation with the aim of calculating the sample size for the clinical validation study. The concordance analysis between these preliminary ES scores and diagnoses performed by the experts indicated that the accuracy was 94.7% when both experts and the system confirmed the COLD diagnosis and 86.3% when COLD was excluded. Based on these results, the sample size of the validation set was established at 240 patients. The clinical validation, performed on 241 patients, resulted in ES accuracy of 97.5%, with confirmed COLD diagnosis in 53.6% of the cases and excluded COLD diagnosis in 32% of the cases. In 11.2% of cases, a diagnosis of COLD was made by the experts, although the imaging results showed a potential concomitant disorder. The ES presented here (COLD ES) is a safe and robust supporting tool for COLD diagnosis in primary care settings.
Building an efficient surgical team using a bench model simulation: construct validity of the Legacy Inanimate System for Endoscopic Team Training (LISETT).

PubMed

Zheng, B; Denk, P M; Martinec, D V; Gatta, P; Whiteford, M H; Swanström, L L

2008-04-01

Complex laparoscopic tasks require collaboration of surgeons as a surgical team. Conventionally, surgical teams are formed shortly before the start of the surgery, and team skills are built during the surgery. There is a need to establish a training simulation to improve surgical team skills without jeopardizing the safety of surgery. The Legacy Inanimate System for Laparoscopic Team Training (LISETT) is a bench simulation designed to enhance surgical team skills. The reported project tested the construct validity of LISETT. The research question was whether the LISETT scores show progressive improvement correlating with the level of surgical training and laparoscopic team experience or not. With LISETT, two surgeons are required to work closely to perform two laparoscopic tasks: peg transportation and suturing. A total of 44 surgical dyad teams were recruited, composed of medical students, residents, laparoscopic fellows, and experienced surgeons. The LISETT scores were calculated according to the speed and accuracy of the movements. The LISETT scores were positively correlated with surgical experience, and the results can be generalized confidently to surgical teams (Pearson's coefficient, 0.73; p = 0.001). To analyze the influences of individual skill and team dynamics on LISETT performance, team quality was rated by team members using communication and cooperation characters after each practice. The LISETT scores are positively correlated with self-rated team quality scores (Pearson's coefficient, 0.39; p = 0.008). The findings proved LISETT to be a valid system for assessing cooperative skills of a surgical team. By increasing practice time, LISETT provides an opportunity to build surgical team skills, which include effective communication and cooperation.
A comparison of two patient classification instruments in an acute care hospital.

PubMed

Seago, Jean Ann

2002-05-01

Patient classification systems are alternately praised and vilified by staff nurses, nurse managers, and nurse executives. Most nurses agree that substantial resources are used to create or find, implement, manage, and maintain the systems, and that the predictive ability of the instruments is intermittent. The purpose of this study is to compare the predictive validity of two types of patient classification instruments commonly used in acute care hospitals in California. Acute care hospitals in California are required by both the Joint Commission on Accreditation of Healthcare Organizations and California Title 22 to have a reliable and valid patient classification system (PCS). The two general types of systems commonly used are the summative task type PCS and the critical incident or criterion type PCS. There is little to assist nurse executives in deciding which type of PCS to choose. There is modest research demonstrating the validity and reliability of different PCSs but no published data comparing the predictive validity of the different types of systems. The unit of analysis is one patient shift called the study shift. The study shift is defined as the first day shift after the patient has been in the hospital for a full 24 hours. Data were collected using medical record review only. Both types, criterion and summative, of PCS data collection instruments were completed for all patients at both collection points. Each patient had a before and after score for each type of instrument. Three hundred forty-nine medical records for inpatients meeting the inclusion criteria were examined. The average patient age was 76 years, the average length of stay was 6.6 days with an average of 6.7 secondary diagnoses recorded. Fifty-five percent of the sample was female and the most common primary diagnosis was CHF, followed by COPD, CVA, and pneumonia. 
There was a difference of 1.57 points between the mean summative predictor score and the mean summative actual score, with the predictor score higher (P = .001; CI = .62-2.5). For the criterion instrument, 68.4% of the predictor criterion scores were in category 2 compared to 65.5% of the actual criterion scores. The criterion predictor agreed with the criterion actual score 45% of the time for category 1 patients, 87.3% of the time for category 2 patients, 77.1% of the time for category 3 patients and 72.7% of the time for category 4 patients, with an overall agreement between predictor and actual criterion scores of 79.9% (Kappa P < .001, indicating agreement is not by chance). The most significant finding of this study is that there are virtually no differences in the predictive ability of summative versus criterion patient classification instruments. Using the same patients, both types of instruments predicted the actual score over 78% of the time.
Concurrent Validity of the Classroom Strategies Scale-Teacher Form: A Preliminary Investigation

ERIC Educational Resources Information Center

Reddy, Linda A.; Dudek, Christopher M.; Rualo, Angelique J.; Fabiano, Gregory A.

2016-01-01

The present study investigated the concurrent validity of the Classroom Strategies Scale-Teacher Form (CSS-T), a multidimensional teacher formative assessment of instructional and behavioral management practices. The CSS-T is compared with the Classroom Assessment Scoring System (CLASS), a well-known teacher assessment of overall classroom…
A Measure of Cognition within the Context of Assertion.

ERIC Educational Resources Information Center

Golden, Morrie

1981-01-01

Described the development and evaluation of a measure of cognitive belief systems and thinking styles. Reliability and validity results were poor for junior college students. For university and nonstudent populations, cognition scores discriminated social anxiety. The Cognition Scale of Assertiveness is a reliable and valid measure of cognitive…
Validation of the American version of the CareGiver Oncology Quality of Life (CarGOQoL) questionnaire.

PubMed

Kaveney, Sarah C; Baumstarck, Karine; Minaya-Flores, Patricia; Shannon, Tarrah; Symes, Philip; Loundou, Anderson; Auquier, Pascal

2016-05-28

The CareGiver Oncology Quality of Life (CarGOQoL) questionnaire, a 29-item, multidimensional, self-administered questionnaire, was validated using a large French sample. We reported the linguistic validation process and the metric validity of the English version of CarGOQoL in the United States. The translation process consisted of 3 consecutive steps: forward-backward translation, acceptability testing, and cognitive interviews. The psychometric testing was applied to caregivers of consecutive patients with representative cancers who were recruited from the Regional Cancer Center in northwestern Pennsylvania. All individuals completed the CarGOQoL at baseline, day 30, and day 90. Internal consistency, reliability, external validity, reproducibility, and sensitivity to change were tested. The translated version was validated on a total of 87 American cancer caregivers. The dimensions of the CarGOQoL generally demonstrated a high internal consistency (Cronbach's alpha > 0.70 for all but four domain scores). External validity testing revealed that the CarGOQoL index score correlated significantly with all SF-36 dimension scores except the physical composite score (Pearson's correlation: 0.28-0.70). Reproducibility was satisfactory at day 30 (intraclass correlation coefficient: 0.46-0.94) and day 90 (0.43-0.92). Four specific dimensions of CarGOQoL showed responsiveness: the Psychological well-being, the Relationships with health care system, the Social support and the Finances. The American version of the CarGOQoL constitutes a useful instrument to measure QoL in caregivers of cancer patients in the United States.
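
The internal-consistency criterion quoted above (Cronbach's alpha > 0.70) can be illustrated with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items in a dimension. The sketch below is only an illustration under that textbook definition; the function name `cronbach_alpha` and the sample data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one questionnaire dimension.

    items -- list of per-item score lists; each inner list holds the
             answers of the same respondents, in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    if n < 2 or any(len(item) != n for item in items):
        raise ValueError("every item needs the same respondents (at least two)")

    # Total score per respondent across all items of the dimension.
    totals = [sum(answers) for answers in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance(totals)  # zero if all totals are identical
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical dimension with two perfectly consistent items.
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
print(round(alpha, 3))  # 1.0
```

The sketch uses sample variances throughout and does not guard against a total-score variance of zero, which would make alpha undefined.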

Validity of a novel computerized screening test system for mild cognitive impairment.

PubMed

Park, Jin-Hyuck; Jung, Minye; Kim, Jongbae; Park, Hae Yean; Kim, Jung-Ran; Park, Ji-Hyuk

2018-06-20

Background: The mobile screening test system for screening mild cognitive impairment (mSTS-MCI) was developed for clinical use. However, the clinical usefulness of mSTS-MCI to distinguish elderly people with MCI from those who are cognitively healthy has yet to be validated. Moreover, the comparability between this system and traditional screening tests for MCI has not been evaluated. The purpose of this study was to examine the validity and reliability of the mSTS-MCI and confirm the cut-off scores to detect MCI. The data were collected from 107 healthy elderly people and 74 elderly people with MCI. Concurrent validity was examined using the Korean version of Montreal Cognitive Assessment (MoCA-K) as a gold standard test, and test-retest reliability was investigated using 30 of the study participants at four-week intervals. The sensitivity, specificity, positive predictive value, and negative predictive value (NPV) were confirmed through Receiver Operating Characteristic (ROC) analysis, and the cut-off scores for elderly people with MCI were identified. Concurrent validity showed statistically significant correlations between the mSTS-MCI and MoCA-K and test-retest reliability indicated high correlation. As a result of screening predictability, the mSTS-MCI had a higher NPV than the MoCA-K. The mSTS-MCI was identified as a system with a high degree of validity and reliability. In addition, the mSTS-MCI showed high screening predictability, indicating it can be used in the clinical field as a screening test system for mild cognitive impairment.
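The four screening measures named above follow directly from a 2x2 table of test result against reference diagnosis. The sketch below shows the standard definitions only; the function name `screening_metrics` and the example data are hypothetical, and how a raw score is turned into "test positive" (the cut-off and its direction) is left to the caller because the abstract does not state it.

```python
def screening_metrics(test_positive, has_condition):
    """Sensitivity, specificity, PPV and NPV from paired boolean lists.

    test_positive -- True where the screening test flags the person
    has_condition -- True where the reference standard confirms the condition
    """
    tp = fp = fn = tn = 0
    for flagged, affected in zip(test_positive, has_condition):
        if flagged and affected:
            tp += 1          # true positive
        elif flagged and not affected:
            fp += 1          # false positive
        elif not flagged and affected:
            fn += 1          # false negative
        else:
            tn += 1          # true negative
    return {
        "sensitivity": tp / (tp + fn),  # affected people correctly flagged
        "specificity": tn / (tn + fp),  # healthy people correctly cleared
        "ppv": tp / (tp + fp),          # flagged people who are affected
        "npv": tn / (tn + fn),          # cleared people who are healthy
    }


# Hypothetical sample of eight people.
flags = [True, True, True, False, False, False, False, False]
truth = [True, True, False, True, False, False, False, False]
print(screening_metrics(flags, truth))
```

Each ratio is undefined when its denominator is zero (for example, PPV when nobody is flagged), a case this sketch does not handle. PPV and NPV also depend on how common the condition is in the sample, so they do not transfer directly to populations with a different prevalence.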
Developing a contributing factor classification scheme for Rasmussen's AcciMap: Reliability and validity evaluation.

PubMed

Goode, N; Salmon, P M; Taylor, N Z; Lenné, M G; Finch, C F

2017-10-01

One factor potentially limiting the uptake of Rasmussen's (1997) Accimap method by practitioners is the lack of a contributing factor classification scheme to guide accident analyses. This article evaluates the intra- and inter-rater reliability and criterion-referenced validity of a classification scheme developed to support the use of Accimap by led outdoor activity (LOA) practitioners. The classification scheme has two levels: the system level describes the actors, artefacts and activity context in terms of 14 codes; the descriptor level breaks the system level codes down into 107 specific contributing factors. The study involved 11 LOA practitioners using the scheme on two separate occasions to code a pre-determined list of contributing factors identified from four incident reports. Criterion-referenced validity was assessed by comparing the codes selected by LOA practitioners to those selected by the method creators. Mean intra-rater reliability scores at the system (M = 83.6%) and descriptor (M = 74%) levels were acceptable. Mean inter-rater reliability scores were not consistently acceptable for both coding attempts at the system level (M T1 = 68.8%; M T2 = 73.9%), and were poor at the descriptor level (M T1 = 58.5%; M T2 = 64.1%). Mean criterion referenced validity scores at the system level were acceptable (M T1 = 73.9%; M T2 = 75.3%). However, they were not consistently acceptable at the descriptor level (M T1 = 67.6%; M T2 = 70.8%). Overall, the results indicate that the classification scheme does not currently satisfy reliability and validity requirements, and that further work is required. The implications for the design and development of contributing factors classification schemes are discussed. Copyright © 2017 Elsevier Ltd. All rights reserved.
Validation of the Risk Prediction Models STATE-Score and START-Strategy to Guide TACE Treatment in Patients with Hepatocellular Carcinoma.

PubMed

Mähringer-Kunz, Aline; Kloeckner, Roman; Pitton, Michael B; Düber, Christoph; Schmidtmann, Irene; Galle, Peter R; Koch, Sandra; Weinmann, Arndt

2017-07-01

Several scoring systems that guide patients' treatment regimen for transarterial chemoembolization (TACE) of hepatocellular carcinoma (HCC) have been introduced, but none have gained widespread acceptance in clinical practice. The purpose of this study is to externally validate the Selection for TrAnsarterial chemoembolization TrEatment (STATE)-score and START-strategy [i.e., sequential use of the STATE-score and Assessment for Retreatment with TACE (ART)-score]. From January 2000 to September 2015, 933 patients with HCC underwent TACE at our institution. All variables needed to calculate the STATE-score and implement the START-strategy were determined. STATE comprised serum albumin, up-to-seven criteria, and C-reactive protein (CRP). ART comprised an increase in aspartate aminotransferase, the Child-Pugh score, and a radiological tumor response. Overall survival was calculated, and multivariate analysis performed. In addition, the STATE-score and START-strategy were validated using the Harrell's C-index and integrated Brier score (IBS). The STATE-score was calculated in 228 patients. Low and high STATE-scores corresponded to median survival of 14.3 and 20.2 months, respectively. Harrell's C was 0.558 and IBS 0.133. For the STATE-score, significant predictors of survival were up-to-seven criteria (p = 0.006) and albumin (p = 0.022). CRP values were not predictive (p = 0.367). The ART-score was calculated in 207 patients. Combining the STATE-score and ART-score led to a Harrell's C of 0.580 and IBS of 0.132. The STATE-score was unable to reliably determine the suitability for initial TACE. The START-strategy only slightly improved the predictive ability compared to the ART-score alone. Therefore, neither the STATE-score nor START-strategy alone provides sufficient certainty for clear-cut clinical decisions.
Validity and reliability of Nintendo Wii Fit balance scores.

PubMed

Wikstrom, Erik A

2012-01-01

Interactive gaming systems have the potential to help rehabilitate patients with musculoskeletal conditions. The Nintendo Wii Balance Board, which is part of the Wii Fit game, could be an effective tool to monitor progress during rehabilitation because the board and game can provide objective measures of balance. However, the validity and reliability of Wii Fit balance scores remain unknown. To determine the concurrent validity of balance scores produced by the Wii Fit game and the intrasession and intersession reliability of Wii Fit balance scores. Descriptive laboratory study. Sports medicine research laboratory. Forty-five recreationally active participants (age = 27.0 ± 9.8 years, height = 170.9 ± 9.2 cm, mass = 72.4 ± 11.8 kg) with a heterogeneous history of lower extremity injury. Participants completed a single-limb-stance task on a force plate and the Star Excursion Balance Test (SEBT) during the first test session. Twelve Wii Fit balance activities were completed during 2 test sessions separated by 1 week. Postural sway in the anteroposterior (AP) and mediolateral (ML) directions and the AP, ML, and resultant center-of-pressure (COP) excursions were calculated from the single-limb stance. The normalized reach distance was recorded for the anterior, posteromedial, and posterolateral directions of the SEBT. Wii Fit balance scores that the game software generated also were recorded. All 96 of the calculated correlation coefficients among Wii Fit activity outcomes and established balance outcomes were interpreted as poor (r < 0.50). Intrasession reliability for Wii Fit balance activity scores ranged from good (intraclass correlation coefficient [ICC] = 0.80) to poor (ICC = 0.39), with 8 activities having poor intrasession reliability. Similarly, 11 of the 12 Wii Fit balance activity scores demonstrated poor intersession reliability, with scores ranging from fair (ICC = 0.74) to poor (ICC = 0.29). 
Wii Fit balance activity scores had poor concurrent validity relative to COP outcomes and SEBT reach distances. In addition, the included Wii Fit balance activity scores generally had poor intrasession and intersession reliability.
Web-based education in systems-based practice: a randomized trial.

PubMed

Kerfoot, B Price; Conlin, Paul R; Travison, Thomas; McMahon, Graham T

2007-02-26

All accredited US residency programs are expected to offer curricula and evaluate their residents in 6 general competencies. Medical schools are now adopting similar competency frameworks. We investigated whether a Web-based program could effectively teach and assess elements of systems-based practice. We enrolled 276 medical students and 417 residents in the fields of surgery, medicine, obstetrics-gynecology, and emergency medicine in a 9-week randomized, controlled, crossover educational trial. Participants were asked to sequentially complete validated Web-based modules on patient safety and the US health care system. The primary outcome measure was performance on a 26-item validated online test administered before, between, and after the participants completed the modules. Six hundred forty (92.4%) of the 693 enrollees participated in the study; 512 (80.0%) of the participants completed all 3 tests. Participants' test scores improved significantly after completion of the first module (P<.001). Overall learning from the 9-week Web-based program, as measured by the increase in scores (posttest scores minus pretest scores), was 16 percentage points (95% confidence interval, 14-17 percentage points; P<.001) in public safety topics and 22 percentage points (95% confidence interval, 20-23 percentage points; P<.001) in US health care system topics. A Web-based educational program on systems-based practice competencies generated significant and durable learning across a broad range of medical students and residents.
Wearable Improved Vision System for Color Vision Deficiency Correction

PubMed Central

Riccio, Daniel; Di Perna, Luigi; Sanniti Di Baja, Gabriella; De Nino, Maurizio; Rossi, Settimio; Testa, Francesco; Simonelli, Francesca; Frucci, Maria

2017-01-01

Color vision deficiency (CVD) is an extremely frequent vision impairment that compromises the ability to recognize colors. In order to improve color vision in a subject with CVD, we designed and developed a wearable improved vision system based on an augmented reality device. The system was validated in a clinical pilot study on 24 subjects with CVD (18 males and 6 females, aged 37.4 ± 14.2 years). The primary outcome was the improvement in the Ishihara Vision Test score with the correction proposed by our system. The Ishihara test score significantly improved (p = 0.03) from 5.8 ± 3.0 without correction to 14.8 ± 5.0 with correction. Almost all patients showed an improvement in color vision, as shown by the increased test scores. Moreover, with our system, 12 subjects (50%) passed the vision color test as normal vision subjects. The development and preliminary validation of the proposed platform confirm that a wearable augmented-reality device could be an effective aid to improve color vision in subjects with CVD. PMID:28507827
Children's Behavior in the Postanesthesia Care Unit: The Development of the Child Behavior Coding System-PACU (CBCS-P)

PubMed Central

Tan, Edwin T.; Martin, Sarah R.; Fortier, Michelle A.; Kain, Zeev N.

2012-01-01

Objective: To develop and validate a behavioral coding measure, the Children's Behavior Coding System-PACU (CBCS-P), for children's distress and nondistress behaviors while in the postanesthesia recovery unit. Methods: A multidisciplinary team examined videotapes of children in the PACU and developed a coding scheme that subsequently underwent a refinement process (CBCS-P). To examine the reliability and validity of the coding system, 121 children and their parents were videotaped during their stay in the PACU. Participants were healthy children undergoing elective, outpatient surgery and general anesthesia. The CBCS-P was utilized and objective data from medical charts (analgesic consumption and pain scores) were extracted to establish validity. Results: Kappa values indicated good-to-excellent (κ's > .65) interrater reliability of the individual codes. The CBCS-P had good criterion validity when compared to children's analgesic consumption and pain scores. Conclusions: The CBCS-P is a reliable, observational coding method that captures children's distress and nondistress postoperative behaviors. These findings highlight the importance of considering context in both the development and application of observational coding schemes. PMID:22167123
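Kappa, as used above for interrater reliability, corrects the observed agreement between two coders for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below assumes the two-rater form (Cohen's kappa), which the abstract does not specify; the function name `cohen_kappa` and the behavior labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same observations.

    rater_a, rater_b -- equal-length sequences of category labels
    """
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must code the same, non-empty set of observations")

    # Observed proportion of observations on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical codes for four video segments.
a = ["cry", "cry", "calm", "calm"]
b = ["cry", "calm", "calm", "calm"]
print(cohen_kappa(a, b))  # 0.5
```

In the example the raters agree on 3 of 4 segments (0.75), chance agreement is 0.5, and kappa is therefore 0.5, noticeably lower than the raw agreement.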
Use of the Award Fee in Air Force System and Subsystem Acquisition

DTIC Science & Technology

1980-03-01

31 too, may have targeted profit positions for its contractors. These goals may or may not change during the transaction. But, in any event, the...discusses "averaging effects" of aggregating factor scores that impair their discriminative validity by pulling "very high and very low scores towards
Validation of the Novaco Anger Scale-Provocation Inventory (Danish) With Nonclinical, Clinical, and Offender Samples.

PubMed

Moeller, Stine Bjerrum; Novaco, Raymond W; Heinola-Nielsen, Vivian; Hougaard, Helle

2016-10-01

Anger has high prevalence in clinical and forensic settings, and it is associated with aggressive behavior and ward atmosphere on psychiatric units. Dysregulated anger is a clinical problem in Danish mental health care systems, but no anger assessment instruments have been validated in Danish. Because the Novaco Anger Scale and Provocation Inventory (NAS-PI) has been extensively validated with different clinical populations and lends itself to clinical case formulation, it was selected for translation and evaluation in the present multistudy project. Psychometric properties of the NAS-PI were investigated with samples of 477 nonclinical, 250 clinical, 167 male prisoner, and 64 male forensic participants. Anger prevalence and its relationship with other anger measures, anxiety/depression, and aggression were examined. NAS-PI was found to have high reliability, concurrent validity, and discriminant validity, and its scores discriminated the samples. High scores in the offender group demonstrated the feasibility of obtaining self-report assessments of anger with this population. Retrospective and prospective validity of the NAS were tested with the forensic patient sample regarding physically aggressive behavior in hospital. Regression analyses showed that higher scores on NAS increase the risk of having acted aggressively in the past and of acting aggressively in the future. © The Author(s) 2015.
Validation and evaluation of epistemic uncertainty in rainfall thresholds for regional scale landslide forecasting

NASA Astrophysics Data System (ADS)

Gariano, Stefano Luigi; Brunetti, Maria Teresa; Iovine, Giulio; Melillo, Massimo; Peruccacci, Silvia; Terranova, Oreste Giuseppe; Vennari, Carmela; Guzzetti, Fausto

2015-04-01

Prediction of rainfall-induced landslides can rely on empirical rainfall thresholds. These are obtained from the analysis of past rainfall events that have (or have not) resulted in slope failures. Accurate prediction requires reliable thresholds, which need to be validated before their use in operational landslide warning systems. Despite the clear relevance of validation, only a few studies have addressed the problem, and have proposed and tested robust validation procedures. We propose a validation procedure that allows for the definition of optimal thresholds for early warning purposes. The validation is based on contingency table, skill scores, and receiver operating characteristic (ROC) analysis. To establish the optimal threshold, which maximizes the correct landslide predictions and minimizes the incorrect predictions, we propose an index that results from the linear combination of three weighted skill scores. Selection of the optimal threshold depends on the scope and the operational characteristics of the early warning system. The choice is made by selecting appropriately the weights, and by searching for the optimal (maximum) value of the index. We discuss weakness in the validation procedure caused by the inherent lack of information (epistemic uncertainty) on landslide occurrence typical of large study areas. When working at the regional scale, landslides may have occurred and may have not been reported. This results in biases and variations in the contingencies and the skill scores. We introduce two parameters to represent the unknown proportion of rainfall events (above and below the threshold) for which landslides occurred and went unreported. We show that even a very small underestimation in the number of landslides can result in a significant decrease in the performance of a threshold measured by the skill scores. We show that the variations in the skill scores are different for different uncertainty of events above or below the threshold. 
This has consequences in the ROC analysis. We applied the proposed procedure to a catalogue of rainfall conditions that have resulted in landslides, and to a set of rainfall events that - presumably - have not resulted in landslides, in Sicily, in the period 2002-2012. First, we determined regional event duration-cumulated event (ED) rainfall thresholds for shallow landslide occurrence using 200 rainfall conditions that have resulted in 223 shallow landslides in Sicily in the period 2002-2011. Next, we validated the thresholds using 29 rainfall conditions that have triggered 42 shallow landslides in Sicily in 2012, and 1250 rainfall events that presumably have not resulted in landslides in the same year. We performed a back analysis simulating the use of the thresholds in a hypothetical landslide warning system operating in 2012.
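The contingency-table step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not name the three skill scores or their weights, so the choice of probability of detection (POD), probability of false detection (POFD) and the Hanssen-Kuipers score (HK), the sign convention of the index, and the function names `skill_scores`, `weighted_index` and `best_threshold` are all assumptions.

```python
def skill_scores(tp, fn, fp, tn):
    """Skill scores from a 2x2 contingency table.

    tp: rainfall events above the threshold with landslides
    fn: rainfall events below the threshold with landslides
    fp: rainfall events above the threshold without landslides
    tn: rainfall events below the threshold without landslides
    """
    pod = tp / (tp + fn)    # probability of detection (true positive rate)
    pofd = fp / (fp + tn)   # probability of false detection (false positive rate)
    hk = pod - pofd         # Hanssen-Kuipers skill score
    return {"POD": pod, "POFD": pofd, "HK": hk}


def weighted_index(scores, weights):
    """Hypothetical index: rewards POD and HK, penalizes POFD (assumed form)."""
    return (weights["POD"] * scores["POD"]
            + weights["HK"] * scores["HK"]
            - weights["POFD"] * scores["POFD"])


def best_threshold(candidates, weights):
    """Return the label of the candidate threshold that maximizes the index.

    candidates maps a threshold label to its (tp, fn, fp, tn) counts.
    """
    return max(
        candidates,
        key=lambda name: weighted_index(skill_scores(*candidates[name]), weights),
    )
```

Changing the weights shifts the preferred threshold toward fewer missed landslides or fewer false alarms, which mirrors the abstract's point that the choice depends on the scope of the warning system.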
Development and validation of self-reported line drawings of the modified Beighton score for the assessment of generalised joint hypermobility.

PubMed

Cooper, Dale J; Scammell, Brigitte E; Batt, Mark E; Palmer, Debbie

2018-01-17

The impracticalities and comparative expense of carrying out a clinical assessment are an obstacle in many large epidemiological studies. The purpose of this study was to develop and validate a series of electronic self-reported line drawing instruments based on the modified Beighton scoring system for the assessment of self-reported generalised joint hypermobility. Five sets of line drawings were created to depict the 9-point Beighton score criteria. Each instrument consisted of an explanatory question whereby participants were asked to select the line drawing which best represented their joints. Fifty participants completed the self-report online instrument on two occasions, before attending a clinical assessment. A blinded expert clinical observer then assessed participants on two occasions, using a standardised goniometry measurement protocol. Validity of the instrument was assessed by participant-observer agreement and reliability by participant repeatability and observer repeatability using unweighted Cohen's kappa (k). Validity and reliability were assessed for each item in the self-reported instrument separately, and for the sum of the total scores. An aggregate score for generalised joint hypermobility was determined based on a Beighton score of 4 or more out of 9. Observer-repeatability between the two clinical assessments demonstrated perfect agreement (k 1.00; 95% CI 1.00, 1.00). Self-reported participant-repeatability was lower but it was still excellent (k 0.91; 95% CI 0.74, 1.00). The participant-observer agreement was excellent (k 0.96; 95% CI 0.87, 1.00). Validity was excellent for the self-report instrument, with a good sensitivity of 0.87 (95% CI 0.81, 0.91) and excellent specificity of 0.99 (95% CI 0.98, 1.00). The self-reported instrument provides a valid and reliable assessment of the presence of generalised joint hypermobility and may have practical use in epidemiological studies.
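For readers unfamiliar with the agreement statistic used here, unweighted Cohen's kappa can be computed as in the sketch below. The function name `cohen_kappa` and the toy ratings in the usage note are illustrative assumptions, not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length sequences of category labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when both raters use a single category
    return (observed - expected) / (1.0 - expected)
```

For example, with hypothetical hypermobile (1) versus not hypermobile (0) classifications, `cohen_kappa([1, 1, 0, 0, 1, 0, 1, 1, 0, 0], [1, 1, 0, 0, 1, 0, 1, 0, 0, 0])` has observed agreement 0.9 and chance agreement 0.5, giving a kappa of about 0.8.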
Psychometric Evaluation of the PROMIS Fatigue-Short Form Across Diverse Populations

PubMed Central

Ameringer, Suzanne; Elswick, R. K.; Menzies, Victoria; Robins, Jo Lynne; Starkweather, Angela; Walter, Jeanne; Gentry, Amanda Elswick; Jallo, Nancy

2016-01-01

Background: The need for reliable, valid tools to measure patient-reported outcomes (PROs) is critical for both research and for evaluating treatment effects in practice. The Patient Reported Outcome Measurement Information System (PROMIS) Fatigue-Short Form v1.0 – Fatigue 7a (PROMIS F-SF) has had limited psychometric evaluation in various populations. Objectives: The aim of the study is to examine psychometric properties of PROMIS F-SF item responses across various populations. Methods: Data from five studies with common data elements were used in this secondary analysis. Samples from patients with fibromyalgia, sickle cell disease, cardiometabolic risk, pregnancy, and healthy controls were used. Reliability was estimated using Cronbach’s alpha. Dimensionality was evaluated with confirmatory factor analysis. Concurrent validity was evaluated by examining Pearson’s correlations between scores from the PROMIS F-SF, the Multidimensional Fatigue Symptom Inventory-Short Form (MFSI-SF), and the Brief Fatigue Inventory (BFI). Discriminant validity was evaluated by examining Pearson’s correlations between scores on the PROMIS F-SF and measures of stress and depressive symptoms. Known groups validity was assessed by comparing PROMIS F-SF scores in the clinical samples to healthy controls. Results: Reliability of PROMIS F-SF scores was adequate across samples, ranging from .72 in the pregnancy sample to .88 in healthy controls. Unidimensionality was supported in each sample. Concurrent validity was strong; across the groups, correlations with scores on the MFSI-SF and BFI ranged from .60–.85. Correlations of the PROMIS F-SF with measures of stress and depressive mood were moderate to strong, ranging from .37–.64. PROMIS F-SF scores were significantly higher in clinical samples, compared to healthy controls. Discussion: Reliability and validity of the PROMIS F-SF were acceptable. 
The PROMIS F-SF is a suitable measure of fatigue across the four diverse clinical populations included in the analysis. PMID:27362514
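The reliability coefficient reported in this abstract, Cronbach's alpha, follows a standard formula that can be sketched as below. The function name `cronbach_alpha` and the small response matrix in the usage note are hypothetical; the sketch uses sample variances and assumes complete data (no missing item responses).

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])  # number of items
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)
```

For the toy matrix `[[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 5, 4]]` the item variances sum to 20/3 and the total-score variance is 18, so alpha is 17/18, or about 0.94.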
Development and validation of the Pediatric Anesthesia Behavior score--an objective measure of behavior during induction of anesthesia.

PubMed

Beringer, Richard M; Greenwood, Rosemary; Kilpatrick, Nicky

2014-02-01

Measuring perioperative behavior changes requires validated objective rating scales. We developed a simple score for children's behavior during induction of anesthesia (Pediatric Anesthesia Behavior score) and assessed its reliability, concurrent validity, and predictive validity. Data were collected as part of a wider observational study of perioperative behavior changes in children undergoing general anesthesia for elective dental extractions. One-hundred and two healthy children aged 2-12 were recruited. Previously validated behavioral scales were used as follows: the modified Yale Preoperative Anxiety Scale (m-YPAS); the induction compliance checklist (ICC); the Pediatric Anesthesia Emergence Delirium scale (PAED); and the Post-Hospitalization Behavior Questionnaire (PHBQ). Pediatric Anesthesia Behavior (PAB) score was independently measured by two investigators, to allow assessment of interobserver reliability. Concurrent validity was assessed by examining the correlation between the PAB score, the m-YPAS, and the ICC. Predictive validity was assessed by examining the association between the PAB score, the PAED scale, and the PHBQ. The PAB score correlated strongly with both the m-YPAS (P < 0.001) and the ICC (P < 0.001). PAB score was significantly associated with the PAED score (P = 0.031) and with the PHBQ (P = 0.034). Two independent investigators recorded identical PAB scores for 94% of children and overall, there was close agreement between scores (Kappa coefficient of 0.886 [P < 0.001]). The PAB score is simple to use and may predict which children are at increased risk of developing postoperative behavioral disturbance. This study provides evidence for its reliability and validity. © 2013 John Wiley & Sons Ltd.
External validation of the endometriosis fertility index (EFI) staging system for predicting non-ART pregnancy after endometriosis surgery.

PubMed

Tomassetti, C; Geysenbergh, B; Meuleman, C; Timmerman, D; Fieuws, S; D'Hooghe, T

2013-05-01

Can the ability of the endometriosis fertility index (EFI) to predict non-assisted reproductive technology (ART) pregnancy after endometriosis surgery be confirmed by an external validation study? The significant relationship between the EFI score and the time to non-ART pregnancy observed in our study represents an external validation of this scoring system. The EFI was previously developed and tested prospectively in a single center, but up to now no external validation has been published. Our data provide validation of the EFI in an external fertility unit on a robust scientific basis, to identify couples with a good prognosis for spontaneous conception who can therefore defer ART treatment, regardless of their revised American Fertility Society (rAFS) endometriosis staging. Retrospective cohort study where the EFI was calculated based on history and detailed surgical findings, and related to pregnancy outcome in 233 women attempting non-ART conception immediately after surgery; all data used for EFI calculation and analysis of reproductive outcome had been collected prospectively as part of another study. The EFI score was calculated (score 0-10) for 233 women with all rAFS endometriosis stages (minimal-mild, n = 75; moderate-severe, n = 158) after endometriosis surgery (1 September 2006-30 September 2010) in a university hospital-based reproductive medicine unit with combined expertise in reproductive surgery and medically assisted reproduction. All participants attempted non-ART conception immediately after surgery by natural intercourse, ovulation induction with timed intercourse or intrauterine insemination (with or without ovulation induction or controlled ovarian stimulation). All analyses were performed for three different definitions of pregnancy [overall (any HCG >25 IU/l), clinical and ongoing >20 weeks]. Six groups were distinguished (EFI scores 1-3, 4, 5, 6, 7+8, 9+10), and Kaplan-Meier (K-M) estimates for cumulative pregnancy rate were calculated. 
Subjects were censored when they were lost to follow-up, had subsequent surgery for endometriosis, started ovarian suppression or underwent ART. As K-M estimates might overestimate the actual event rate, cumulative incidence estimates treating ART as competing event were also calculated. Cox regression analysis was used to assess the performance of EFI and constituting variables. Performance of the score (prediction, discrimination) was quantified with the following methods: mean squared error of prediction (Brier score), areas under the receiver-operating curve and global concordance index C(τ). There was a highly significant relationship between the EFI and the time to non-ART pregnancy (cumulative overall pregnancy rate, P = 0.0004), with the K-M estimate of cumulative overall pregnancy rate at 12 months after surgery equal to 45.5% [95% confidence interval (CI) 39.47-49.87]-ranging from 16.67% (95% CI 5.01-47.65) for EFI scores 0-3, to 62.55% (95% CI 55.18-69.94) for EFI scores 9-10. For each increase of 1 point in the EFI score, the relative risk of becoming pregnant increased by 31% (95% CI 16-47%; i.e. hazard ratio 1.31). The 'least function score'-which assesses the tubal/ovarian function at conclusion of surgery-was found to be the most important contributor to the total EFI score among all the other variables (age, duration of infertility, prior pregnancy, AFS endometriosis lesion and total score). The EFI score had a moderate performance in the prediction of the pregnancy rate. Indeed, the decrease in prediction error was rather small, as shown by the decrease in Brier score from 0.213 to 0.198, and low estimates for R² (13%) and C(τ) (0.629). As the EFI was validated externally in our own European population after initial testing by Adamson and Pasta (Endometriosis fertility index: the new, validated endometriosis staging system. 
Fertil Steril 2010;94:1609-1615) in an American population, it appears that the EFI can be used clinically to counsel infertile endometriosis patients receiving reproductive surgery in specialized centers about their post-operative conception options. This research was supported by funds obtained via the Clinical Research Fund of the University Hospitals Leuven, Belgium, via the Ferring Chair in Reproductive Medicine and Surgery, and the Serono Chair in Reproductive Medicine granted to the Leuven University Fertility Center. The authors have no conflicts of interest to declare.
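The Kaplan-Meier estimate of cumulative pregnancy rate used in this study can be illustrated with a short sketch. It is a generic product-limit estimator under stated assumptions, not the authors' analysis code: the function name `kaplan_meier` and the toy follow-up data in the usage note are hypothetical, and competing events such as ART are treated simply as censoring, which the abstract notes may overestimate the event rate.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the event-free probability S(t).

    times:  follow-up time for each subject
    events: 1 if the event (e.g. pregnancy) occurred at that time, 0 if censored
    Returns a list of (time, S(time)) at each distinct event time.
    The cumulative event rate at a given time is 1 - S(time).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_censored = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            else:
                n_censored += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / at_risk
            curve.append((t, surv))
        # Subjects with an event or censored at t leave the risk set afterwards.
        at_risk -= n_events + n_censored
    return curve
```

With follow-up times `[1, 2, 2, 3, 4]` and event flags `[1, 1, 0, 1, 0]`, the event-free probability steps through 0.8, 0.6 and 0.3 at times 1, 2 and 3, so the cumulative event rate at time 3 is about 0.7.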
Validation of the GILLS score for tongue-lip adhesion in Robin sequence patients.

PubMed

Abramowicz, Shelly; Bacic, Janine D; Mulliken, John B; Rogers, Gary F

2012-03-01

The GILLS score consists of gastroesophageal reflux disease, preoperative intubation, late surgical intervention, low birth weight, and syndromic diagnosis. The purpose of this study was to test the validity of the GILLS score in predicting success of tongue-lip adhesion (TLA) in managing Robin sequence. Infants with Robin sequence were included in the study if they had a TLA for airway compromise subsequent to formulation of the GILLS scoring system, that is, they were not included in the original GILLS analysis. The patients were prospectively considered based on the presence of the 5 factors that constitute the GILLS score. A score of ≤ 2 predicts success of TLA. Twenty patients met the inclusion criteria. Tongue-lip adhesion managed the compromised airway in 18 (90%) of 20 patients. Overall, the GILLS score had a sensitivity of 83%, specificity of 50%, positive predictive value of 94%, and negative predictive value of 25%. The GILLS score accurately predicts a successful outcome for TLA in infants with Robin sequence. For infants with a score of 2 or less, TLA is the procedure of choice. Infants with a GILLS score of 3 or greater were 5 times more likely to fail TLA than those with a score of 2 or less. In these patients, other methods of managing the airway should be considered.
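The reported accuracy figures can be checked against a 2x2 table. The counts in the usage note are inferred from the reported percentages and the 18 of 20 successful adhesions; they are not stated in the abstract, and the function name `diagnostic_metrics` is illustrative. A "positive" here means a predicted TLA success (GILLS score of 2 or less).

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV and NPV from 2x2 table counts.

    tp: predicted success, TLA succeeded    fn: predicted failure, TLA succeeded
    tn: predicted failure, TLA failed       fp: predicted success, TLA failed
    """
    return {
        "sensitivity": tp / (tp + fn),  # successes correctly predicted
        "specificity": tn / (tn + fp),  # failures correctly predicted
        "ppv": tp / (tp + fp),          # predicted successes that succeeded
        "npv": tn / (tn + fn),          # predicted failures that failed
    }
```

Calling `diagnostic_metrics(tp=15, fn=3, tn=1, fp=1)` gives 15/18, 1/2, 15/16 and 1/4, which round to the reported 83%, 50%, 94% and 25%. With only two failures in the sample, the specificity and negative predictive value rest on very few cases.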
Scoring Rubric Development: Validity and Reliability.

ERIC Educational Resources Information Center

Moskal, Barbara M.; Leydens, Jon A.

2000-01-01

Provides clear definitions of the terms "validity" and "reliability" in the context of developing scoring rubrics and illustrates these definitions through examples. Also clarifies how validity and reliability may be addressed in the development of scoring rubrics, defined as descriptive scoring schemes developed to guide the analysis of the…
Validation of the Chinese version of the FOUR score in the assessment of neurosurgical patients with different level of consciousness.

PubMed

Peng, Juan; Deng, Yingying; Chen, Fangyao; Zhang, Xiaomei; Wang, Xiaoyan; Zhou, Ying; Zhou, Hongzhen; Qiu, Binghui

2015-12-10

The Glasgow Coma Scale (GCS) is currently the most widely used scoring system for comatose patients. A decade ago, the Full Outline of Unresponsiveness (FOUR) score was devised to better capture four functional aspects of consciousness (eye, motor responses, brainstem reflexes, and respiration). This study aimed to validate the Chinese version of the FOUR score in patients with different levels of consciousness. The study had two phases: (1) translation of the FOUR score, and (2) assessment of its reliability and validity. The Chinese version of the FOUR score was developed according to a standardized protocol. One hundred-twenty consecutive patients with acute brain damage, admitted to Nanfang Hospital (Southern Medical University, Guangdong, China) from November 2014 to February 2015, were enrolled. The inter-rater agreement for the FOUR score and GCS was evaluated using intraclass correlation coefficient (ICC). Receiver operating characteristic (ROC) curves were established to determine the scales' abilities to predict outcome. The rater agreement was excellent both for FOUR (ICC = 0.970; p < 0.001) and GCS (ICC = 0.958; p < 0.001). The FOUR score yielded an excellent test-retest reliability (ICC = 0.930; p < 0.001). Spearman's correlation coefficients between GCS and the FOUR score were high: r = 0.932, first rating; r = 0.887, second rating (all p < 0.001). Areas under the curve (AUC) for mortality were 0.834 (95 % CI, 0.740-0.928) and 0.815 (95 % CI, 0.723-0.908) for the FOUR score and GCS, respectively. The Chinese version of the FOUR score is a reliable scale for evaluating the level of consciousness in patients with acute brain injury.
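The area under the ROC curve reported here has a simple rank interpretation: it is the probability that a randomly chosen patient with the outcome receives a higher risk score than a randomly chosen patient without it, with ties counted as one half. A minimal sketch follows; the function name `auc` and the toy numbers are assumptions. For scales such as the FOUR score or GCS, where lower values indicate worse status, the scores would be negated first so that higher means higher risk.

```python
def auc(scores_with_outcome, scores_without_outcome):
    """AUC as the fraction of (case, non-case) pairs ranked correctly; ties count 0.5."""
    if not scores_with_outcome or not scores_without_outcome:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in scores_with_outcome:
        for control in scores_without_outcome:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(scores_with_outcome) * len(scores_without_outcome))
```

For risk scores `[0.9, 0.8, 0.4]` in patients with the outcome and `[0.7, 0.3, 0.2]` in those without, 8 of the 9 pairs are ordered correctly, so the AUC is 8/9.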
Creation and validation of a novel body condition scoring method for the magellanic penguin (Spheniscus magellanicus) in the zoo setting.

PubMed

Clements, Julie; Sanchez, Jessica N

2015-11-01

This research aims to validate a novel, visual body scoring system created for the Magellanic penguin (Spheniscus magellanicus) suitable for the zoo practitioner. Magellanics go through marked seasonal fluctuations in body mass gains and losses. A standardized multi-variable visual body condition guide may provide a more sensitive and objective assessment tool compared to the previously used single variable method. Accurate body condition scores paired with seasonal weight variation measurements give veterinary and keeper staff a clearer understanding of an individual's nutritional status. San Francisco Zoo staff previously used a nine-point body condition scale based on the classic bird standard of a single point of keel palpation with the bird restrained in hand, with no standard measure of reference assigned to each scoring category. We created a novel, visual body condition scoring system that does not require restraint and assesses subcutaneous fat and muscle at seven body landmarks using illustrations and descriptive terms. The scores range from one, the least robust or under-conditioned, to five, the most robust, or over-conditioned. The ratio of body weight to wing length was used as a "gold standard" index of body condition and compared to both the novel multi-variable and previously used single-variable body condition scores. The novel multi-variable scale showed improved agreement with weight:wing ratio compared to the single-variable scale, demonstrating greater accuracy and reliability when a trained assessor uses the multi-variable body condition scoring system. Zoo staff may use this tool to manage both the colony and the individual to assist in seasonally appropriate Magellanic penguin nutrition assessment. © 2015 Wiley Periodicals, Inc.
The Italian version of the Mouth Handicap in Systemic Sclerosis scale (MHISS) is valid, reliable and useful in assessing oral health-related quality of life (OHRQoL) in systemic sclerosis (SSc) patients.

PubMed

Maddali Bongi, S; Del Rosso, A; Miniati, I; Galluccio, F; Landi, G; Tai, G; Matucci-Cerinic, M

2012-09-01

In systemic sclerosis (SSc), mouth and face involvement leads to problems in oral health-related quality of life (OHRQoL). Mouth Handicap in Systemic Sclerosis scale (MHISS) is a 12-item questionnaire specifically quantifying mouth disability in SSc, organized in 3 subscales. Our aim was to validate the Italian version of MHISS, by assessing its test-retest reliability and internal and external consistency in Italian SSc patients. Forty SSc patients (7 dSSc, 33 lSSc; age and disease duration: 57.27 ± 11.41, 9.4 ± 4.4 years; 22 with sicca syndrome) were evaluated with MHISS. MHISS was translated following a forward-backward translation procedure, with independent translations and counter-translation. Test-retest reliability was evaluated, comparing the results of two administrations, with intraclass correlation coefficient (ICC). Internal consistency was assessed by Cronbach's α and external consistency by comparison with mouth opening. MHISS has a good test-retest reliability (ICC: 0.93) and internal consistency (Cronbach's α: 0.99). A good external consistency was confirmed by correlation with mouth opening (rho: -0.3869, p: 0.0137). Total MHISS score was 17.65 ± 5.20, with scores of subscale 1 (reduced mouth opening) of 6.60 ± 2.85 and scores of subscales 2 (sicca syndrome) and 3 (aesthetic concerns) of 7.82 ± 2.59 and 3.22 ± 1.14. Total and subscale 2 scores are higher in dSSc than in lSSc. This result may be due to the higher presence of sicca syndrome in dSSc than in lSSc (p = 0.0109). Our results support validity and reliability in Italian SSc patients of MHISS, specifically measuring SSc OHRQoL.
District health information system assessment: a case study in Iran.

PubMed

Raeisi, Ahmad Reza; Saghaeiannejad, Sakineh; Karimi, Saeed; Ehteshami, Asghar; Kasaei, Mahtab

2013-03-01

Health care managers and personnel should be aware of and literate in health information systems in order to increase efficiency and effectiveness in their organization, since accurate, appropriate, precise, timely, and valid information and its interpretation are required as the basis for policy planning and decision making at various levels of the organization. This study was conducted to assess the district health information system evolution in Iran according to the WHO framework. This research is an applied, descriptive cross-sectional study, in which a total of twelve urban and eight rural facilities, and the district health center at Falavarjan region, were surveyed by using a questionnaire with 334 items. Content and construct validity and reliability of the questionnaire were confirmed with a correlation coefficient of 0.99. Obtained data were analyzed with SPSS 16 software and descriptive statistics were used to examine measures of WHO compliance. The analysis of data revealed that the mean score of compliance with the district health information system framework was 35.75 percent. The maximum score of compliance with the district health information system belonged to the data collection process (70 percent). The minimum score of compliance belonged to the information-based decision making process, with a score of 10 percent. District Health Information System Criteria in Isfahan province do not completely comply with the WHO framework. Consequently, it seems that health system managers engaged with underlying policy and decision making processes at the district health level should try to restructure and decentralize the district health information system and develop training management programs for their managers.

Comparison of the severity of lower extremity arterial disease in smokers and patients with diabetes using a novel duplex Doppler scoring system.

PubMed

Hiremath, Rudresh; Gowda, Goutham; Ibrahim, Jebin; Reddy, Harish T; Chodiboina, Haritha; Shah, Rushit

2017-07-01

The aim of this study was to validate the diagnostic feasibility of a novel scoring system of peripheral arterial disease (PAD) in smokers and patients with diabetes depending on duplex Doppler sonographic features. Patients presenting with the symptomatology of PAD were divided into three groups: diabetes only, smoking only, and smokers with diabetes. The patients were clinically examined, a clinical severity score was obtained, and the subjects were categorized into the three extrapolated categories of mild, moderate, and severe. All 106 subjects also underwent a thorough duplex Doppler examination, and various aspects of PAD were assessed and tabulated. These components were used to create a novel duplex Doppler scoring system. Depending on the scores obtained, each individual was categorized as having mild, moderate, or severe illness. The Cohen kappa value was used to assess interobserver agreement between the two scoring systems. Interobserver agreement between the traditional Rutherford clinical scoring system and the newly invented duplex Doppler scoring system showed a kappa value of 0.83, indicating significant agreement between the two scoring systems (P<0.001). Duplex Doppler imaging is an effective screening investigation for lower extremity arterial disease, as it not only helps in its diagnosis, but also in the staging and grading of the disease, providing information that can be utilized for future management and treatment planning.
Construct Validity and Scoring Methods of the World Health Organization: Health and Work Performance Questionnaire Among Workers With Arthritis and Rheumatological Conditions.

PubMed

AlHeresh, Rawan; LaValley, Michael P; Coster, Wendy; Keysor, Julie J

2017-06-01

To evaluate construct validity and scoring methods of the World Health Organization Health and Work Performance Questionnaire (HPQ) for people with arthritis. Construct validity was examined through hypothesis testing using the recommended guidelines of the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN). The HPQ using the absolute scoring method showed moderate construct validity as four of the seven hypotheses were met. The HPQ using the relative scoring method had weak construct validity as only one of the seven hypotheses was met. The absolute scoring method for the HPQ is superior in construct validity to the relative scoring method in assessing work performance among people with arthritis and related rheumatic conditions; however, more research is needed to further explore other psychometric properties of the HPQ.
Predicting death from kala-azar: construction, development, and validation of a score set and accompanying software.

PubMed

Costa, Dorcas Lamounier; Rocha, Regina Lunardi; Chaves, Eldo de Brito Ferreira; Batista, Vivianny Gonçalves de Vasconcelos; Costa, Henrique Lamounier; Costa, Carlos Henrique Nery

2016-01-01

Early identification of patients at higher risk of progressing to severe disease and death is crucial for implementing therapeutic and preventive measures; this could reduce the morbidity and mortality from kala-azar. We describe a score set composed of four scales in addition to software for quick assessment of the probability of death from kala-azar at the point of care. Data from 883 patients diagnosed between September 2005 and August 2008 were used to derive the score set, and data from 1,031 patients diagnosed between September 2008 and November 2013 were used to validate the models. Stepwise logistic regression analyses were used to derive the optimal multivariate prediction models. Model performance was assessed by its discriminatory accuracy. A computational specialist system (Kala-Cal®) was developed to speed up the calculation of the probability of death based on clinical scores. The clinical prediction score showed high discrimination (area under the curve [AUC] 0.90) for distinguishing death from survival for children ≤2 years old. Performance improved after adding laboratory variables (AUC 0.93). The clinical score showed equivalent discrimination (AUC 0.89) for older children and adults, which also improved after including laboratory data (AUC 0.92). The score set also showed a high, although lower, discrimination when applied to the validation cohort. This score set and Kala-Cal® software may help identify individuals with the greatest probability of death. The associated software may speed up the calculation of the probability of death based on clinical scores and assist physicians in decision-making.
The use of quizStar application for online examination in basic physics course

NASA Astrophysics Data System (ADS)

Kustijono, R.; Budiningarti, H.

2018-03-01

The purpose of the study is to produce an online Basic Physics exam system using the QuizStar application. This is a research and development study using the ADDIE model. The steps are: 1) analysis; 2) design; 3) development; 4) implementation; 5) evaluation. System feasibility is reviewed for its validity, practicality, and effectiveness. The research subjects are 60 Physics Department students of Universitas Negeri Surabaya. The data are analyzed using descriptive statistics. The validity, practicality, and effectiveness scores are measured using a Likert scale. The system is considered feasible if the total score of all aspects obtained is ≥ 61%. The results show that the online test system developed with QuizStar is 1) conceptually feasible to use; 2) implementable in the Basic Physics assessment process, with existing constraints that can be overcome; and 3) rated in the good category in students' responses to system usage. The results indicate that the QuizStar application is eligible to be used for an online Basic Physics exam system.
Risk Prediction of New Adjacent Vertebral Fractures After PVP for Patients with Vertebral Compression Fractures: Development of a Prediction Model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhong, Bin-Yan; He, Shi-Cheng; Zhu, Hai-Dong

Purpose: We aim to determine the predictors of new adjacent vertebral fractures (AVCFs) after percutaneous vertebroplasty (PVP) in patients with osteoporotic vertebral compression fractures (OVCFs) and to construct a risk prediction score to estimate a 2-year new AVCF risk-by-risk factor condition. Materials and Methods: Patients with OVCFs who underwent their first PVP between December 2006 and December 2013 at Hospital A (training cohort) and Hospital B (validation cohort) were included in this study. In the training cohort, we assessed the independent risk predictors and developed the probability of new adjacent OVCFs (PNAV) score system using the Cox proportional hazard regression analysis. The accuracy of this system was then validated in both training and validation cohorts by concordance (c) statistic. Results: 421 patients (training cohort: n = 256; validation cohort: n = 165) were included in this study. In the training cohort, new AVCFs after the first PVP treatment occurred in 33 (12.9%) patients. The independent risk factors were intradiscal cement leakage and preexisting old vertebral compression fracture(s). The estimated 2-year absolute risk of new AVCFs ranged from less than 4% in patients with neither independent risk factor to more than 45% in individuals with both factors. Conclusions: The PNAV score is an objective and easy approach to predict the risk of new AVCFs.
Cross-cultural adaptation and validation of Systemic Lupus Erythematosus Quality of Life questionnaire into Arabic.

PubMed

Aziz, M M; Galal, M A A; Elzohri, M H; El-Nouby, F; Leong, K P

2018-04-01

Systemic lupus erythematosus (SLE) is a chronic autoimmune disease which affects all aspects of quality of life (QoL) of the patients. Comprehensive patient assessment should include QoL measures in addition to the objective clinical measures of the disease. There is no specific Arabic instrument for assessment of QoL of SLE patients. The objective of this study was to translate and cross culturally adapt the SLEQOL questionnaire into Arabic and test its reliability and validity. The SLEQOL questionnaire was translated into Arabic based on the Guidelines for Translation and Cross-cultural Adaptation into other languages. Reliability was assessed by interviewing patients three times: two interviews on the same day by different interviewers and the third interview 14 days later by one of the first interviewers. Validity was assessed by correlating SLEQOL scores of 91 patients with 36-item Short Form Health Survey (SF-36) scores and clinical parameters of the patients. We found that the Arabic version of SLEQOL has a Cronbach's alpha of 0.936, interobserver and intraobserver correlation coefficients of 0.809 and 0.886 respectively. Strong correlations were also found between SLEQOL scores and SF-36 Physical and Mental Component summaries. In conclusion, the Arabic version of SLEQOL is a reliable and valid instrument for measuring QoL of Egyptian SLE patients.
Concurrent validity of Persian version of Wechsler Intelligence Scale for Children - Fourth Edition and Cognitive Assessment System in patients with learning disorder.

PubMed

Rostami, Reza; Sadeghi, Vahid; Zarei, Jamileh; Haddadi, Parvaneh; Mohazzab-Torabi, Saman; Salamati, Payman

2013-04-01

The aim of this study was to compare the Persian version of the Wechsler Intelligence Scale for Children - Fourth Edition (WISC-IV) and Cognitive Assessment System (CAS) tests, to determine the correlation between their scales and to evaluate the probable concurrent validity of these tests in patients with learning disorders. One hundred sixty-two children with learning disorder who presented at Atieh Comprehensive Psychiatry Center were selected in a consecutive non-randomized order. All of the patients were assessed based on WISC-IV and CAS scores questionnaires. Pearson correlation coefficient was used to analyze the correlation between the data and to assess the concurrent validity of the two tests. Linear regression was used for statistical modeling. The maximum type I error was set at 5%. There was a strong correlation between total score of WISC-IV test and total score of CAS test in the patients (r=0.75, P<0.001). The correlations among the other scales were mostly high and all of them were statistically significant (P<0.001). A linear regression model was obtained (α = 0.51, β = 0.81 and P<0.001). There is an acceptable correlation between the WISC-IV scales and CAS test in children with learning disorders. Concurrent validity is established between the two tests and their scales.
Concurrent Validity of Persian Version of Wechsler Intelligence Scale for Children - Fourth Edition and Cognitive Assessment System in Patients with Learning Disorder

PubMed Central

Rostami, Reza; Sadeghi, Vahid; Zarei, Jamileh; Haddadi, Parvaneh; Mohazzab-Torabi, Saman; Salamati, Payman

2013-01-01

Objective The aim of this study was to compare the Persian version of the Wechsler Intelligence Scale for Children - Fourth Edition (WISC-IV) and Cognitive Assessment System (CAS) tests, to determine the correlation between their scales and to evaluate the probable concurrent validity of these tests in patients with learning disorders. Methods One hundred sixty-two children with learning disorder who presented at Atieh Comprehensive Psychiatry Center were selected in a consecutive non-randomized order. All of the patients were assessed based on WISC-IV and CAS scores questionnaires. Pearson correlation coefficient was used to analyze the correlation between the data and to assess the concurrent validity of the two tests. Linear regression was used for statistical modeling. The maximum type I error was set at 5%. Findings There was a strong correlation between total score of WISC-IV test and total score of CAS test in the patients (r=0.75, P<0.001). The correlations among the other scales were mostly high and all of them were statistically significant (P<0.001). A linear regression model was obtained (α = 0.51, β = 0.81 and P<0.001). Conclusion There is an acceptable correlation between the WISC-IV scales and CAS test in children with learning disorders. Concurrent validity is established between the two tests and their scales. PMID:23724180
Development of an Itemwise Efficiency Scoring Method: Concurrent, Convergent, Discriminant, and Neuroimaging-Based Predictive Validity Assessed in a Large Community Sample

PubMed Central

Moore, Tyler M.; Reise, Steven P.; Roalf, David R.; Satterthwaite, Theodore D.; Davatzikos, Christos; Bilker, Warren B.; Port, Allison M.; Jackson, Chad T.; Ruparel, Kosha; Savitt, Adam P.; Baron, Robert B.; Gur, Raquel E.; Gur, Ruben C.

2016-01-01

Traditional “paper-and-pencil” testing is imprecise in measuring speed and hence limited in assessing performance efficiency, but computerized testing permits precision in measuring itemwise response time. We present a method of scoring performance efficiency (combining information from accuracy and speed) at the item level. Using a community sample of 9,498 youths age 8-21, we calculated item-level efficiency scores on four neurocognitive tests, and compared the concurrent, convergent, discriminant, and predictive validity of these scores to simple averaging of standardized speed and accuracy-summed scores. Concurrent validity was measured by the scores' abilities to distinguish men from women and their correlations with age; convergent and discriminant validity were measured by correlations with other scores inside and outside of their neurocognitive domains; predictive validity was measured by correlations with brain volume in regions associated with the specific neurocognitive abilities. Results provide support for the ability of itemwise efficiency scoring to detect signals as strong as those detected by standard efficiency scoring methods. We find no evidence of superior validity of the itemwise scores over traditional scores, but point out several advantages of the former. The itemwise efficiency scoring method shows promise as an alternative to standard efficiency scoring methods, with overall moderate support from tests of four different types of validity. This method allows the use of existing item analysis methods and provides the convenient ability to adjust the overall emphasis of accuracy versus speed in the efficiency score, thus adjusting the scoring to the real-world demands the test is aiming to fulfill. PMID:26866796
Reliability, Validity, and Sensitivity to Change Overtime of the Modified Melasma Area and Severity Index Score.

PubMed

Abou-Taleb, Doaa A E; Ibrahim, Ahmed K; Youssef, Eman M K; Moubasher, Alaa E A

2017-02-01

The new modified Melasma Area and Severity Index (mMASI) score, the recently used outcome measure for melasma, has not been tested to determine its sensitivity to change in melasma. To determine the reliability, validity, and sensitivity to change over time of the mMASI score in assessment of the severity of melasma. Pearson correlation, Cronbach alpha, and intraclass correlation coefficient were calculated to assess the reliability of the mMASI score. Validity of the mMASI scale was assessed using Spearman correlation between mMASI total score (before and after treatment), clinical data, and patients' responses. The mMASI score showed excellent reliability and good validity for assessment of the severity of melasma. The authors also determined that the mMASI score demonstrated sensitivity to change over time. An excellent degree of agreement between the mMASI and MASI scores was revealed. The mMASI score is reliable, valid, and responsive to change in the assessment of severity of melasma. Moreover, the mMASI score was found to be easier to learn and perform and simpler in calculation compared with the MASI score. Overall, the mMASI score can effectively replace the MASI score.
A Comparison between SRSS-IE and SSiS-PSG Scores: Examining Convergent Validity

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy Peia; Common, Eric Alan; Zorigian, Kris; Brunsting, Nelson C.; Schatschneider, Christopher

2015-01-01

We report findings of a validation study comparing two screening tools: the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE, an adapted version of the Student Risk Screening Scale) and the Social Skills Improvement System-Performance Screening Guide (SSiS-PSG). Participants included 458 kindergarten through fifth-grade…
Additional Evidence of Convergent Validity between SRSS-IE and SSiS-PSG Scores

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy Peia; Ennis, Robin Parks; Royer, David James

2015-01-01

We report findings of a validity study comparing two screening tools: the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE) and the Social Skills Improvement System-Performance Screening Guide (SSiS-PSG; Elliott & Gresham, 2007). Participants were 1,680 kindergarten through sixth-grade elementary students from three…
Development and testing of a scale to assess physician attitudes about handheld computers with decision support.

PubMed

Ray, Midge N; Houston, Thomas K; Yu, Feliciano B; Menachemi, Nir; Maisiak, Richard S; Allison, Jeroan J; Berner, Eta S

2006-01-01

The authors developed and evaluated a rating scale, the Attitudes toward Handheld Decision Support Software Scale (H-DSS), to assess physician attitudes about handheld decision support systems. The authors conducted a prospective assessment of psychometric characteristics of the H-DSS including reliability, validity, and responsiveness. Participants were 82 Internal Medicine residents. A higher score on each of the 14 five-point Likert scale items reflected a more positive attitude about handheld DSS. The H-DSS score is the mean across the fourteen items. Attitudes toward the use of the handheld DSS were assessed prior to and six months after receiving the handheld device. Cronbach's Alpha was used to assess internal consistency reliability. Pearson correlations were used to estimate and detect significant associations between scale scores and other measures (validity). Paired sample t-tests were used to test for changes in the mean attitude scale score (responsiveness) and for differences between groups. Internal consistency reliability for the scale was alpha = 0.73. In testing validity, moderate correlations were noted between the attitude scale scores and self-reported Personal Digital Assistant (PDA) usage in the hospital (correlation coefficient = 0.55) and clinic (0.48), p < 0.05 for both. The scale was responsive, in that it detected the expected increase in scores between the two administrations (3.99 (s.d. = 0.35) vs. 4.08, (s.d. = 0.34), p < 0.005). The authors' evaluation showed that the H-DSS scale was reliable, valid, and responsive. The scale can be used to guide future handheld DSS development and implementation.
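The two computations this abstract names, a scale score taken as the mean of Likert items and Cronbach's alpha for internal consistency, can be sketched as follows. This is a minimal illustration of the standard formulas, not the authors' code; the function names and the toy data are assumptions, and sample variances are used throughout.

```python
from statistics import mean, variance


def scale_score(item_responses):
    """Scale score for one respondent: the mean of the item responses.

    The abstract defines the H-DSS score as the mean across the items.
    """
    return mean(item_responses)


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in responses]) for i in range(k)]
    # Sample variance of the summed score across respondents.
    total_var = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 3 respondents, 3 items on a five-point scale.
toy = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
alpha = cronbach_alpha(toy)
```

In the toy data the items move together perfectly, so alpha reaches its upper bound; real questionnaire data, such as the 0.73 reported here, will fall below that.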
On the Validity of Useless Tests

ERIC Educational Resources Information Center

Sireci, Stephen G.

2016-01-01

A misconception exists that validity may refer only to the "interpretation" of test scores and not to the "uses" of those scores. The development and evolution of validity theory illustrate test score interpretation was a primary focus in the earliest days of modern testing, and that validating interpretations derived from test…
New Comprehensive Cytogenetic Scoring System for Primary Myelodysplastic Syndromes (MDS) and Oligoblastic Acute Myeloid Leukemia After MDS Derived From an International Database Merge

PubMed Central

Schanz, Julie; Tüchler, Heinz; Solé, Francesc; Mallo, Mar; Luño, Elisa; Cervera, José; Granada, Isabel; Hildebrandt, Barbara; Slovak, Marilyn L.; Ohyashiki, Kazuma; Steidl, Christian; Fonatsch, Christa; Pfeilstöcker, Michael; Nösslinger, Thomas; Valent, Peter; Giagounidis, Aristoteles; Aul, Carlo; Lübbert, Michael; Stauder, Reinhard; Krieger, Otto; Garcia-Manero, Guillermo; Faderl, Stefan; Pierce, Sherry; Le Beau, Michelle M.; Bennett, John M.; Greenberg, Peter; Germing, Ulrich; Haase, Detlef

2012-01-01

Purpose The karyotype is a strong independent prognostic factor in myelodysplastic syndromes (MDS). Since the implementation of the International Prognostic Scoring System (IPSS) in 1997, knowledge concerning the prognostic impact of abnormalities has increased substantially. The present study proposes a new and comprehensive cytogenetic scoring system based on an international data collection of 2,902 patients. Patients and Methods Patients were included from the German-Austrian MDS Study Group (n = 1,193), the International MDS Risk Analysis Workshop (n = 816), the Spanish Hematological Cytogenetics Working Group (n = 849), and the International Working Group on MDS Cytogenetics (n = 44) databases. Patients with primary MDS and oligoblastic acute myeloid leukemia (AML) after MDS treated with supportive care only were evaluated for overall survival (OS) and AML evolution. Internal validation by bootstrap analysis and external validation in an independent patient cohort were performed to confirm the results. Results In total, 19 cytogenetic categories were defined, providing clear prognostic classification in 91% of all patients. The abnormalities were classified into five prognostic subgroups (P < .001): very good (median OS, 61 months; hazard ratio [HR], 0.5; n = 81); good (49 months; HR, 1.0 [reference category]; n = 1,809); intermediate (26 months; HR, 1.6; n = 529); poor (16 months; HR, 2.6; n = 148); and very poor (6 months; HR, 4.2; n = 187). The internal and external validations confirmed the results of the score. Conclusion In conclusion, these data should contribute to the ongoing efforts to update the IPSS by refining the cytogenetic risk categories. PMID:22331955
The mortality risk score and the ADG score: two points-based scoring systems for the Johns Hopkins aggregated diagnosis groups to predict mortality in a general adult population cohort in Ontario, Canada.

PubMed

Austin, Peter C; Walraven, Carl van

2011-10-01

Logistic regression models that incorporated age, sex, and indicator variables for the Johns Hopkins' Aggregated Diagnosis Groups (ADGs) categories have been shown to accurately predict all-cause mortality in adults. To develop 2 different point-scoring systems using the ADGs. The Mortality Risk Score (MRS) collapses age, sex, and the ADGs to a single summary score that predicts the annual risk of all-cause death in adults. The ADG Score derives weights for the individual ADG diagnosis groups. Retrospective cohort constructed using population-based administrative data. All 10,498,413 residents of Ontario, Canada, between the age of 20 and 100 years who were alive on their birthday in 2007, participated in this study. Participants were randomly divided into derivation and validation samples. The outcome was death within 1 year. In the derivation cohort, the MRS ranged from -21 to 139 (median value 29, IQR 17 to 44). In the validation group, a logistic regression model with the MRS as the sole predictor significantly predicted the risk of 1-year mortality with a c-statistic of 0.917. A regression model with age, sex, and the ADG Score had similar performance. Both methods accurately predicted the risk of 1-year mortality across the 20 vigintiles of risk. The MRS combined values for a person's age, sex, and the Johns Hopkins ADGs to accurately predict 1-year mortality in adults. The ADG Score is a weighted score representing the presence or absence of the 32 ADG diagnosis groups. These scores will facilitate health services researchers conducting risk adjustment using administrative health care databases.
Neurocognition and community outcome in schizophrenia: long-term predictive validity.

PubMed

Fujii, Daryl E; Wylie, A Michael

2003-02-01

The present study examined the predictive validity of neuropsychological measures to functional outcome in 26 schizophrenic patients 15-plus year post-testing. Outcome measures included score on the Resource Associated Functional Level Scale (RAFLS), number of state hospital admissions, and total duration of state hospital inpatient stay. Results of several stepwise multiple regressions revealed that verbal memory significantly predicted RAFLS score, accounting for nearly half of the variance. Trails B significantly predicted duration of state hospital inpatient status. Discussion focused on the utility of these measures for clinicians and system planners. Copyright 2002 Elsevier Science B.V.
Validation study of an electronic method of condensed outcomes tools reporting in orthopaedics.

PubMed

Farr, Jack; Verma, Nikhil; Cole, Brian J

2013-12-01

Patient-reported outcomes (PRO) instruments are a vital source of data for evaluating the efficacy of medical treatments. Historically, outcomes instruments have been designed, validated, and implemented as paper-based questionnaires. The collection of paper-based outcomes information may result in patients becoming fatigued as they respond to redundant questions. This problem is exacerbated when multiple PRO measures are provided to a single patient. In addition, the management and analysis of data collected in paper format involves labor-intensive processes to score and render the data analyzable. Computer-based outcomes systems have the potential to mitigate these problems by reformatting multiple outcomes tools into a single, user-friendly tool. The study aimed to determine whether the electronic outcomes system presented produces results comparable with the test-retest correlations reported for the corresponding orthopedic paper-based outcomes instruments. The study is designed as a crossover study based on consecutive orthopaedic patients arriving at one of two designated orthopedic knee clinics. Patients were assigned to complete either a paper or a computer-administered questionnaire based on a similar set of questions (Knee injury and Osteoarthritis Outcome Score, International Knee Documentation Committee form, 36-Item Short Form survey, version 1, Lysholm Knee Scoring Scale). Each patient completed the same surveys using the other instrument, so that all patients had completed both paper and electronic versions. Correlations between the results from the two modes were studied and compared with test-retest data from the original validation studies. The original validation studies established test-retest reliability by computing correlation coefficients for two administrations of the paper instrument. Those correlation coefficients were all in the range of 0.7 to 0.9, which was deemed satisfactory. The present study computed correlation coefficients between the paper and electronic modes of administration. These correlation coefficients demonstrated similar results with an overall value of 0.86. On the basis of the correlation coefficients, the electronic application of commonly used knee outcome scores compare variably to the traditional paper variants with a high rate of test-retest correlation. This equivalence supports the use of the condensed electronic outcomes system and validates comparison of scores between electronic and paper modes. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
EUTOS CML prognostic scoring system predicts ELN-based 'event-free survival' better than Euro/Hasford and Sokal systems in CML patients receiving front-line imatinib mesylate.

PubMed

Uz, Burak; Buyukasik, Yahya; Atay, Hilmi; Kelkitli, Engin; Turgut, Mehmet; Bektas, Ozlen; Eliacik, Eylem; Isik, Ayşe; Aksu, Salih; Goker, Hakan; Sayinalp, Nilgun; Ozcebe, Osman I; Haznedaroglu, Ibrahim C

2013-09-01

The validity of the three currently used chronic myeloid leukemia (CML) scoring systems (Sokal CML prognostic scoring system, Euro/Hasford CML scoring system, and the EUTOS CML prognostic scoring system) was compared in CML patients receiving frontline imatinib mesylate. One hundred and forty-three chronic phase CML patients (71 males, 72 females) taking imatinib as frontline treatment were included in the study. The median age was 44 (16-82) years. Median total and on-imatinib follow-up durations were 29 (3.8-130) months and 25 (3-125) months, respectively. The complete hematological response (CHR) rate at 3 months was 95%. The best cumulative complete cytogenetic response (CCyR) rate at 24 months was 79.6%. Euro/Hasford scoring system was well-correlated with both Sokal and EUTOS scores (r = 0.6, P < 0.001 and r = 0.455, P < 0.001). However, there was only a weak correlation between Sokal and EUTOS scores (r = 0.2, P = 0.03). The 5-year median estimated event-free survival for low and high EUTOS risk patients were 62.6 (25.7-99.5) and 15.3 (7.4-23.2) months, respectively (P < 0.001). This performance was better than Sokal (P = 0.3) and Euro/Hasford (P = 0.04) scoring systems. Overall survival and CCyR rates were also better predicted by the EUTOS score. EUTOS CML prognostic scoring system, which is the only prognostic system developed during the imatinib era, predicts European LeukemiaNet (ELN)-based event-free survival better than Euro/Hasford and Sokal systems in CML patients receiving frontline imatinib mesylate. This observation might have important clinical implications.
Development of a clinical score system for the diagnosis of photoallergic contact dermatitis using a consensus process: item selection and reliability.

PubMed

Cazzaniga, S; Lecchi, S; Bruze, M; Chosidow, O; Diepgen, T; Gonçalo, M; Hercogova, J; Pigatto, P D; Naldi, L

2015-07-01

Photoallergic contact dermatitis (PACD) is an uncommon condition, and there is a lack of validated criteria for its diagnosis. To identify a set of relevant criteria to be considered when suspecting a diagnosis of PACD and to assess the reproducibility of these criteria. This was a diagnostic item selection and reliability study performed between July 2012 and October 2012. A panel of seven recognized experts was invited to consecutive rounds of a Delphi survey and to a conclusive face-to-face meeting with the aim of obtaining an agreement on criteria for the diagnosis of PACD. The panel was also provided with a series of 16 reports of suspected PACDs to be classified according to a five-point likelihood scale. Identified criteria with the weights attributed by experts were used to develop a score system for the diagnosis of PACD. Consensus was measured by calculating the Intraclass Correlation Coefficient (ICC). The performance of the score system was evaluated in terms of overall classification accuracy. Seven criteria were identified by experts as relevant for the diagnosis of PACD. The criteria were related to the type of skin lesions, accompanying symptoms, skin area involved, general medical history, modality of exposure to the culprit substance, history of exposure to the sun or other light sources and photopatch test results. Experts reached a moderate agreement on PACD cases classification, with ICC = 0.69 (95% Confidence Interval, CI, 0.50-0.86). The score system enabled discrimination of probable and definite PACD cases from possible and unlikely or excluded ones, with a nearly perfect agreement being observed between the score system classification and judgment by experts. A diagnostic score was proposed. The score should receive a comprehensive validation on a larger series of cases and with multiple evaluators. © 2014 European Academy of Dermatology and Venereology.

Clinical Inquiry: What's the best way to predict the success of a trial of labor after a previous C-section?

PubMed

Warren, Johanna B; Hamilton, Andrew

2015-12-01

Seven validated prospective scoring systems, and one unvalidated system, predict a successful TOLAC based on a variety of clinical factors. The systems use different outcome statistics, so their predictive accuracy can't be directly compared.
Validation of the GerdQ questionnaire for the management of gastro-oesophageal reflux disease in Japan

PubMed Central

Matsuzaki, Juntaro; Okada, Sawako; Hirata, Kenro; Fukuhara, Seiichiro; Hibi, Toshifumi

2013-01-01

Background The GerdQ scoring system may be a useful tool for managing gastro-oesophageal reflux disease. However, GerdQ has not been fully validated in Asian countries. Objective To validate the Japanese version of GerdQ and to compare this version to the Carlsson-Dent questionnaire (CDQ) in both general and hospital-based populations. Methods The questionnaires, including the Japanese versions of GerdQ and CDQ, and questions designed to collect demographic information, were sent to a general population via the web, and to a hospital-based population via conventional mail. The optimal cutoff GerdQ score and the differences in the characteristics between GerdQ and CDQ were assessed. Results The answers from 863 web-responders and 303 conventional-mail responders were analysed. When a GerdQ cutoff score was set at 8, GerdQ significantly predicted the presence of reflux oesophagitis. Although the GerdQ scores were correlated with the CDQ scores, the concordance rates were poor. Multivariate analysis results indicated that the additional use of over-the-counter medications was associated with GerdQ score ≥ 8, but not with CDQ score ≥ 6. Conclusions The GerdQ cutoff score of 8 was appropriate for the Japanese population. Compared with CDQ, GerdQ was more useful for evaluating treatment efficacy and detecting patients’ unmet medical needs. PMID:24917957
Development and validation of the irritable bowel syndrome scale under the system of quality of life instruments for chronic diseases QLICD-IBS: combinations of classical test theory and generalizability theory.

PubMed

Lei, Pingguang; Lei, Guanghe; Tian, Jianjun; Zhou, Zengfen; Zhao, Miao; Wan, Chonghua

2014-10-01

This paper aims to develop the irritable bowel syndrome (IBS) scale of the system of Quality of Life Instruments for Chronic Diseases (QLICD-IBS) by the modular approach and to validate it by both classical test theory and generalizability theory. The QLICD-IBS was developed based on programmed decision procedures with multiple nominal and focus group discussions, in-depth interviews, and quantitative statistical procedures. Data from one hundred twelve inpatients with IBS, whose QOL was measured three times before and after treatment, were used. The psychometric properties of the scale were evaluated with respect to validity, reliability, and responsiveness employing correlation analysis, factor analyses, multi-trait scaling analysis, t tests and also G studies and D studies of generalizability theory analysis. Multi-trait scaling analysis, correlation, and factor analyses confirmed good construct validity and criterion-related validity when using SF-36 as a criterion. Test-retest reliability coefficients (Pearson r and intra-class correlation (ICC)) for the overall score and all domains were higher than 0.80; the internal consistency α for all domains at two measurements were higher than 0.70 except for the social domain (0.55 and 0.67, respectively). The overall score and scores for all domains/facets had statistically significant changes after treatments with moderate or higher effect size standardized response mean (SRM) ranging from 0.72 to 1.02 at domain levels. G coefficients and index of dependability (Ф coefficients) confirmed the reliability of the scale further with more exact variance components. The QLICD-IBS has good validity, reliability, responsiveness, and some highlights and can be used as the quality of life instrument for patients with IBS.
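The responsiveness statistic reported above, the standardized response mean (SRM), is conventionally the mean of the paired change scores divided by their standard deviation. The sketch below shows that conventional definition only; the function name and the example values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def standardized_response_mean(before, after):
    """SRM = mean(change) / sample SD(change) for paired measurements."""
    changes = [a - b for b, a in zip(before, after)]
    return mean(changes) / stdev(changes)


# Hypothetical paired domain scores for three patients.
srm = standardized_response_mean([0, 0, 0], [1, 2, 3])
```

Because the denominator is the spread of the changes rather than of the baseline scores, a uniform improvement across patients produces a large SRM even when the average change is small.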
Rater Cognition: Implications for Validity

ERIC Educational Resources Information Center

Bejar, Issac I.

2012-01-01

The scoring process is critical in the validation of tests that rely on constructed responses. Documenting that readers carry out the scoring in ways consistent with the construct and measurement goals is an important aspect of score validity. In this article, rater cognition is approached as a source of support for a validity argument for scores…
Changing abilities vs. changing tasks: Examining validity degradation with test scores and college performance criteria both assessed longitudinally.

PubMed

Dahlke, Jeffrey A; Kostal, Jack W; Sackett, Paul R; Kuncel, Nathan R

2018-05-03

We explore potential explanations for validity degradation using a unique predictive validation data set containing up to four consecutive years of high school students' cognitive test scores and four complete years of those students' college grades. This data set permits analyses that disentangle the effects of predictor-score age and timing of criterion measurements on validity degradation. We investigate the extent to which validity degradation is explained by criterion dynamism versus the limited shelf-life of ability scores. We also explore whether validity degradation is attributable to fluctuations in criterion variability over time and/or GPA contamination from individual differences in course-taking patterns. Analyses of multiyear predictor data suggest that changes to the determinants of performance over time have much stronger effects on validity degradation than does the shelf-life of cognitive test scores. The age of predictor scores had only a modest relationship with criterion-related validity when the criterion measurement occasion was held constant. Practical implications and recommendations for future research are discussed. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Validation of the MARS: a combined physiological and laboratory risk prediction tool for 5- to 7-day in-hospital mortality.

PubMed

Öhman, M C; Atkins, T E H; Cooksley, T; Brabrand, M

2018-06-01

The Medical Admission Risk System (MARS) uses 11 physiological and laboratory variables and had promising results in its derivation study for predicting 5- and 7-day mortality. To perform an external independent validation of the MARS score. An unplanned secondary cohort study. Patients admitted to the medical admission unit at The Hospital of South West Jutland from 2 October 2008 until 19 February 2009 and from 23 February 2010 until 26 May 2010 were analysed. Validation of the MARS scores using 5- and 7-day mortality was the primary endpoint. A total of 5858 patients were included in the study, of whom 2923 (49.9%) were women; the median age was 65 years (15-107). The MARS score had an area under the receiver operating characteristic curve of 0.858 (95% CI: 0.831-0.884) for 5-day mortality and 0.844 (0.818-0.870) for 7-day mortality with poor calibration for both outcomes. The MARS score had excellent discriminatory power but poor calibration in predicting both 5- and 7-day mortality. The development of accurate combination physiological/laboratory data risk scores has the potential to improve the recognition of at risk patients.
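The discrimination measure quoted above, the area under the receiver operating characteristic curve (equivalently, the c-statistic for a binary outcome), can be computed as the fraction of event/non-event pairs in which the event has the higher score. The sketch below is a plain pairwise implementation for illustration; the function name and the toy inputs are assumptions and the study data are not reproduced.

```python
def c_statistic(scores, outcomes):
    """Concordance (c) statistic for a binary outcome.

    scores:   risk scores, higher meaning higher predicted risk
    outcomes: 1 if the event occurred, 0 otherwise
    Ties in score between an event and a non-event count as half concordant.
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical toy example: 3 of the 4 event/non-event pairs are ordered correctly.
auc = c_statistic([1, 2, 3, 4], [0, 1, 0, 1])
```

This statistic depends only on how patients are ranked, which is why a score can discriminate well and still be poorly calibrated: calibration concerns whether predicted probabilities match observed event rates and has to be assessed separately.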
The R.I.R.S. scoring system: An innovative scoring system for predicting stone-free rate following retrograde intrarenal surgery.

PubMed

Xiao, Yinglong; Li, Deng; Chen, Lei; Xu, Yaoting; Zhang, Dingguo; Shao, Yi; Lu, Jun

2017-11-21

To establish and internally validate an innovative R.I.R.S. scoring system that allows urologists to preoperatively estimate the stone-free rate (SFR) after retrograde intrarenal surgery (RIRS). This study included 382 eligible samples from a total 573 patients who underwent RIRS from January 2014 to December 2016. Four reproducible factors in the R.I.R.S. scoring system, including renal stone density, inferior pole stone, renal infundibular length and stone burden, were measured based on preoperative computed tomography of urography to evaluate the possibility of stone clearance after RIRS. The median cumulative diameter of the stones was 14 mm, and the interquartile range was 10 to 21. The SFR on postoperative day 1 in the present cohort was 61.5% (235 of 382), and the final SFR after 1 month was 73.6% (281 of 382). We established an innovative scoring system to evaluate SFR after RIRS using four preoperative characteristics. The range of the R.I.R.S. scoring system was 4 to 10. The overall score showed a great significance of stone-free status (p < 0.001). The area under the receiver operating characteristic curve of the R.I.R.S. scoring system was 0.904. The R.I.R.S. scoring system is associated with SFR after RIRS. This innovative scoring system can preoperatively assess treatment success after intrarenal surgery and can be used for preoperative surgical arrangement and comparisons of outcomes among different centers and within a center over time.
4H Leukodystrophy: A Brain Magnetic Resonance Imaging Scoring System.

PubMed

Vrij-van den Bos, Suzanne; Hol, Janna A; La Piana, Roberta; Harting, Inga; Vanderver, Adeline; Barkhof, Frederik; Cayami, Ferdy; van Wieringen, Wessel N; Pouwels, Petra J W; van der Knaap, Marjo S; Bernard, Geneviève; Wolf, Nicole I

2017-06-01

4H (hypomyelination, hypodontia and hypogonadotropic hypogonadism) leukodystrophy (4H) is an autosomal recessive hypomyelinating white matter (WM) disorder with neurologic, dental, and endocrine abnormalities. The aim of this study was to develop and validate a magnetic resonance imaging (MRI) scoring system for 4H. A scoring system (0-54) was developed to quantify hypomyelination and atrophy of different brain regions. Pons diameter and bicaudate ratio were included as measures of cerebral and brainstem atrophy, and reference values were determined using controls. Five independent raters completed the scoring system in 40 brain MRI scans collected from 36 patients with genetically proven 4H. Interrater reliability (IRR) and correlations between MRI scores, age, gross motor function, gender, and mutated gene were assessed. IRR for total MRI severity was found to be excellent (intraclass correlation coefficient: 0.87; 95% confidence interval: 0.80-0.92) but varied between different items with some (e.g., myelination of the cerebellar WM) showing poor IRR. Atrophy increased with age in contrast to hypomyelination scores. MRI scores (global, hypomyelination, and atrophy scores) significantly correlated with clinical handicap (p < 0.01 for all three items) and differed between the different genotypes. Our 4H MRI scoring system reliably quantifies hypomyelination and atrophy in patients with 4H, and MRI scores reflect clinical disease severity. Georg Thieme Verlag KG Stuttgart · New York.
C-GRApH: A Validated Scoring System for Early Stratification of Neurologic Outcome After Out-of-Hospital Cardiac Arrest Treated With Targeted Temperature Management.

PubMed

Kiehl, Erich L; Parker, Alex M; Matar, Ralph M; Gottbrecht, Matthew F; Johansen, Michelle C; Adams, Mark P; Griffiths, Lori A; Dunn, Steven P; Bidwell, Katherine L; Menon, Venu; Enfield, Kyle B; Gimple, Lawrence W

2017-05-20

Out-of-hospital cardiac arrest (OHCA) results in significant morbidity and mortality, primarily from neurologic injury. Predicting neurologic outcome early post-OHCA remains difficult in patients receiving targeted temperature management. Retrospective analysis was performed on consecutive OHCA patients receiving targeted temperature management (32-34°C) for 24 hours at a tertiary-care center from 2008 to 2012 (development cohort, n=122). The primary outcome was favorable neurologic outcome at hospital discharge, defined as cerebral performance category 1 to 2 (poor 3-5). Patient demographics, pre-OHCA diagnoses, and initial laboratory studies post-resuscitation were compared between favorable and poor neurologic outcomes with multivariable logistic regression used to develop a simple scoring system (C-GRApH). The C-GRApH score ranges 0 to 5 using equally weighted variables: (C): coronary artery disease, known pre-OHCA; (G): glucose ≥200 mg/dL; (R): rhythm of arrest not ventricular tachycardia/fibrillation; (A): age >45; (pH): arterial pH ≤7.0. A validation cohort (n=344) included subsequent patients from the initial site (n=72) and an external quaternary-care health system (n=272) from 2012 to 2014. The c-statistic for predicting neurologic outcome was 0.82 (0.74-0.90, P<0.001) in the development cohort and 0.81 (0.76-0.87, P<0.001) in the validation cohort. When subdivided by C-GRApH score, similar rates of favorable neurologic outcome were seen in both cohorts, 70% each for low (0-1, n=60), 22% versus 19% for medium (2-3, n=307), and 0% versus 2% for high (4-5, n=99) C-GRApH scores in the development and validation cohorts, respectively. C-GRApH stratifies neurologic outcomes following OHCA in patients receiving targeted temperature management (32-34°C) using objective data available at hospital presentation, identifying patient subsets with disproportionally favorable (C-GRApH ≤1) and poor (C-GRApH ≥4) prognoses. © 2017 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley.
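Because the C-GRApH score described above is a sum of five equally weighted yes/no criteria, its arithmetic can be sketched in a few lines of Python. The class and field names below are hypothetical; the thresholds and the low, medium, and high bands are the ones stated in the abstract. This is an illustration of the arithmetic only.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Values available at hospital presentation (field names are illustrative)."""
    known_cad: bool          # C: coronary artery disease known before the arrest
    glucose_mg_dl: float     # G: initial glucose
    shockable_rhythm: bool   # True if the arrest rhythm was VT/VF
    age_years: float         # A: age
    arterial_ph: float       # pH: initial arterial pH


def c_graph_score(p: Presentation) -> int:
    """Count how many of the five criteria from the abstract are met (0 to 5)."""
    return sum([
        p.known_cad,               # C
        p.glucose_mg_dl >= 200,    # G: glucose >= 200 mg/dL
        not p.shockable_rhythm,    # R: rhythm not VT/VF
        p.age_years > 45,          # A: age > 45
        p.arterial_ph <= 7.0,      # pH: arterial pH <= 7.0
    ])


def c_graph_stratum(score: int) -> str:
    """Map a score to the bands used in the abstract: 0-1, 2-3, 4-5."""
    if not 0 <= score <= 5:
        raise ValueError("C-GRApH score must be between 0 and 5")
    if score <= 1:
        return "low"
    if score <= 3:
        return "medium"
    return "high"


# Hypothetical patient: two criteria met (glucose and age).
patient = Presentation(False, 230.0, True, 58, 7.21)
stratum = c_graph_stratum(c_graph_score(patient))
```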
Test Score Stability and Construct Validity of the Adult Manifest Anxiety Scale-College Version Scores among College Students: A Brief Report

ERIC Educational Resources Information Center

Lowe, Patricia A.; Papanastasiou, Elena C.; DeRuyck, Kimberly A.; Reynolds, Cecil R.

2005-01-01

In this study, the authors investigated the temporal stability and construct validity of the Adult Manifest Anxiety Scale-College Version (AMAS-C; C. R. Reynolds, B. O. Richmond, & P. A. Lowe, 2003b) scores. Results indicated that the AMAS-C scores had adequate to excellent test score stability, and evidence supported the construct validity of the…
Bayesian Scoring Systems for Military Pelvic and Perineal Blast Injuries: Is it Time to Take a New Approach?

PubMed

Mossadegh, Somayyeh; He, Shan; Parker, Paul

2016-05-01

Various injury severity scores exist for trauma, but they are known not to correlate accurately with military injuries. A promising anatomical scoring system for blast pelvic and perineal injury led to the development of an improved scoring system using machine-learning techniques. An unbiased genetic algorithm selected optimal anatomical and physiological parameters from 118 military cases. A Naïve Bayesian model was built using the proposed parameters to predict the probability of survival. Ten-fold cross validation was employed to evaluate its performance. Our model significantly out-performed Injury Severity Score (ISS), Trauma ISS, New ISS, and the Revised Trauma Score in virtually all areas; positive predictive value 0.8941, specificity 0.9027, accuracy 0.9056, and area under curve 0.9059. A two-sample t test showed that the predictive performance of the proposed scoring system was significantly better than the other systems (p < 0.001). With limited resources and the simplest of Bayesian methodologies, we have demonstrated that the Naïve Bayesian model performed significantly better in virtually all areas assessed by current scoring systems used for trauma. This is encouraging and highlights that more can be done to improve trauma systems not only for our military injured, but also for civilian trauma victims. Reprint & Copyright © 2016 Association of Military Surgeons of the U.S.
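Ten-fold cross validation, as mentioned above, partitions the cases into ten folds and holds each fold out once for testing while the model is fitted on the other nine. The Python sketch below shows only that partitioning step, assuming a simple shuffled split; the function name and seed are hypothetical, and the authors' Naïve Bayesian model and parameter selection are not reproduced here.

```python
import random


def k_fold_indices(n_cases, k=10, seed=0):
    """Return a list of (train, test) index lists for k-fold cross validation."""
    if not 2 <= k <= n_cases:
        raise ValueError("k must be between 2 and the number of cases")
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)        # fixed seed gives a reproducible split
    folds = [indices[i::k] for i in range(k)]   # fold sizes differ by at most one
    splits = []
    for i, test in enumerate(folds):
        # Training set is every fold except the one held out for testing.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, test))
    return splits


# 118 cases, the cohort size given in the abstract.
splits = k_fold_indices(118, k=10)
```

Each case appears in exactly one test fold, so every prediction used to estimate performance comes from a model that did not see that case during fitting.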
A contemporary approach to validity arguments: a practical guide to Kane's framework.

PubMed

Cook, David A; Brydges, Ryan; Ginsburg, Shiphra; Hatala, Rose

2015-06-01

Assessment is central to medical education and the validation of assessments is vital to their use. Earlier validity frameworks suffer from a multiplicity of types of validity or failure to prioritise among sources of validity evidence. Kane's framework addresses both concerns by emphasising key inferences as the assessment progresses from a single observation to a final decision. Evidence evaluating these inferences is planned and presented as a validity argument. We aim to offer a practical introduction to the key concepts of Kane's framework that educators will find accessible and applicable to a wide range of assessment tools and activities. All assessments are ultimately intended to facilitate a defensible decision about the person being assessed. Validation is the process of collecting and interpreting evidence to support that decision. Rigorous validation involves articulating the claims and assumptions associated with the proposed decision (the interpretation/use argument), empirically testing these assumptions, and organising evidence into a coherent validity argument. Kane identifies four inferences in the validity argument: Scoring (translating an observation into one or more scores); Generalisation (using the score[s] as a reflection of performance in a test setting); Extrapolation (using the score[s] as a reflection of real-world performance), and Implications (applying the score[s] to inform a decision or action). Evidence should be collected to support each of these inferences and should focus on the most questionable assumptions in the chain of inference. Key assumptions (and needed evidence) vary depending on the assessment's intended use or associated decision. Kane's framework applies to quantitative and qualitative assessments, and to individual tests and programmes of assessment. Validation focuses on evaluating the key claims, assumptions and inferences that link assessment scores with their intended interpretations and uses. 
The Implications and associated decisions are the most important inferences in the validity argument. © 2015 John Wiley & Sons Ltd.
A Probabilistic Model for Cushing's Syndrome Screening in At-Risk Populations: A Prospective Multicenter Study.

PubMed

León-Justel, Antonio; Madrazo-Atutxa, Ainara; Alvarez-Rios, Ana I; Infantes-Fontán, Rocio; Garcia-Arnés, Juan A; Lillo-Muñoz, Juan A; Aulinas, Anna; Urgell-Rull, Eulàlia; Boronat, Mauro; Sánchez-de-Abajo, Ana; Fajardo-Montañana, Carmen; Ortuño-Alonso, Mario; Salinas-Vert, Isabel; Granada, Maria L; Cano, David A; Leal-Cerro, Alfonso

2016-10-01

Cushing's syndrome (CS) is challenging to diagnose. Increased prevalence of CS in specific patient populations has been reported, but routine screening for CS remains questionable. To decrease the diagnostic delay and improve disease outcomes, simple new screening methods for CS in at-risk populations are needed. To develop and validate a simple scoring system to predict CS based on clinical signs and an easy-to-use biochemical test. Observational, prospective, multicenter. Referral hospital. A cohort of 353 patients attending endocrinology units for outpatient visits. All patients were evaluated with late-night salivary cortisol (LNSC) and a low-dose dexamethasone suppression test for CS. Diagnosis or exclusion of CS. Twenty-six cases of CS were diagnosed in the cohort. A risk scoring system was developed by logistic regression analysis, and cutoff values were derived from a receiver operating characteristic curve. This risk score included clinical signs and symptoms (muscular atrophy, osteoporosis, and dorsocervical fat pad) and LNSC levels. The estimated area under the receiver operating characteristic curve was 0.93, with a sensitivity of 96.2% and specificity of 82.9%. We developed a risk score to predict CS in an at-risk population. This score may help to identify at-risk patients in non-endocrinological settings such as primary care, but external validation is warranted.
Reliability, Validity, and Responsiveness of InFLUenza Patient-Reported Outcome (FLU-PRO©) Scores in Influenza-Positive Patients.

PubMed

Powers, John H; Bacci, Elizabeth D; Guerrero, M Lourdes; Leidy, Nancy Kline; Stringer, Sonja; Kim, Katherine; Memoli, Matthew J; Han, Alison; Fairchok, Mary P; Chen, Wei-Ju; Arnold, John C; Danaher, Patrick J; Lalani, Tahaniyat; Ridoré, Michelande; Burgess, Timothy H; Millar, Eugene V; Hernández, Andrés; Rodríguez-Zulueta, Patricia; Smolskis, Mary C; Ortega-Gallegos, Hilda; Pett, Sarah; Fischer, William; Gillor, Daniel; Macias, Laura Moreno; DuVal, Anna; Rothman, Richard; Dugas, Andrea; Ruiz-Palacios, Guillermo M

2018-02-01

To assess the reliability, validity, and responsiveness of InFLUenza Patient-Reported Outcome (FLU-PRO©) scores for quantifying the presence and severity of influenza symptoms. An observational prospective cohort study of adults (≥18 years) with influenza-like illness in the United States, the United Kingdom, Mexico, and South America was conducted. Participants completed the 37-item draft FLU-PRO daily for up to 14 days. Item-level and factor analyses were used to remove items and determine factor structure. Reliability of the final tool was estimated using Cronbach α and intraclass correlation coefficients (2-day reliability). Convergent and known-groups validity and responsiveness were assessed using global assessments of influenza severity and return to usual health. Of the 536 patients enrolled, 221 influenza-positive subjects comprised the analytical sample. The mean age of the patients was 40.7 years, 60.2% were women, and 59.7% were white. The final 32-item measure has six factors/domains (nose, throat, eyes, chest/respiratory, gastrointestinal, and body/systemic), with a higher order factor representing symptom severity overall (comparative fit index = 0.92; root mean square error of approximation = 0.06). Cronbach α was high (total = 0.92; domain range = 0.71-0.87); test-retest reliability (intraclass correlation coefficient, day 1-day 2) was 0.83 for total scores and 0.57 to 0.79 for domains. Day 1 FLU-PRO domain and total scores were moderately to highly correlated (≥0.30) with Patient Global Rating of Flu Severity (except nose and throat). Consistent with known-groups validity, scores differentiated severity groups on the basis of global rating (total: F = 57.2, P < 0.001; domains: F = 8.9-67.5, P < 0.001). Subjects reporting return to usual health showed significantly greater (P < 0.05) FLU-PRO score improvement by day 7 than did those who did not, suggesting score responsiveness. 
Results suggest that FLU-PRO scores are reliable, valid, and responsive to change in influenza-positive adults. Copyright © 2018 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
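Cronbach α, used above as the internal-consistency measure, compares the sum of the item variances with the variance of the total score. A minimal Python sketch of the standard formula follows; the function name and the small response matrix are illustrative assumptions and do not reproduce FLU-PRO data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: three respondents, two items.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```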
Validation of the VISA-A questionnaire for Turkish language: the VISA-A-Tr study.

PubMed

Dogramaci, Yunus; Kalaci, Aydiner; Kücükkübas, Nigar; Inandi, Taceddin; Esen, Erdinc; Yanat, A Nedim

2011-04-01

To evaluate the validity and reliability of the Turkish version of the Victorian Institute of Sports Assessment-Achilles (VISA-A) questionnaire for patients with Achilles tendinopathy. Fifty-five patients with a diagnosis of Achilles tendinopathy and 55 healthy subjects were included in the study. VISA-A questionnaires were translated and culturally adapted into Turkish. The final Turkish version (VISA-A-Tr) was tested for reliability on healthy individuals and patients. Tests for internal consistency, validity and structure were performed on 55 patients. The VISA-A-Tr showed good test-retest reliability (Pearson's r=0.99, p<0.001). The patients with Achilles tendinopathy had a significantly lower score (p<0.001) than the healthy individuals. The VISA-A-Tr score correlated significantly with the Stanish tendon grading system (Spearman's r=-0.86; p<0.001). The VISA-A-Tr is a valid and reliable tool for evaluating the severity of Achilles tendinopathy.
Examining the validity of self-reports on scales measuring students' strategic processing.

PubMed

Samuelstuen, Marit S; Bråten, Ivar

2007-06-01

Self-report inventories trying to measure strategic processing at a global level have been much used in both basic and applied research. However, the validity of global strategy scores is open to question because such inventories assess strategy perceptions outside the context of specific task performance. The primary aim was to examine the criterion-related and construct validity of the global strategy data obtained with the Cross-Curricular Competencies (CCC) scale. Additionally, we wanted to compare the validity of these data with the validity of data obtained with a task-specific self-report inventory focusing on the same types of strategies. The sample included 269 10th-grade students from 12 different junior high schools. Global strategy use as assessed with the CCC was compared with task-specific strategy use reported in three different reading situations. Moreover, relationships between scores on the CCC and scores on measures of text comprehension were examined and compared with relationships between scores on the task-specific strategy measure and the same comprehension measures. The comparison between the CCC strategy scores and the task-specific strategy scores suggested only modest criterion-related validity for the data obtained with the global strategy inventory. The CCC strategy scores were also not related to the text comprehension measures, indicating poor construct validity. In contrast, the task-specific strategy scores were positively related to the comprehension measures, indicating good construct validity. Attempts to measure strategic processing at a global level seem to have limited validity and utility.
An investigation of the clinical use of the house-tree-person projective drawings in the psychological evaluation of child sexual abuse.

PubMed

Palmer, L; Farrar, A R; Valle, M; Ghahary, N; Panella, M; DeGraw, D

2000-05-01

Identification and evaluation of child sexual abuse is an integral task for clinicians. To aid these processes, it is necessary to have reliable and valid psychological measures. This is an investigation of the clinical validity and use of the House-Tree-Person (HTP) projective drawing, a widely used diagnostic tool, in the assessment of child sexual abuse. HTP drawings were collected archivally from a sample of sexually abused children (n = 47) and a nonabused comparison sample (n = 82). The two samples were grossly matched for gender, ethnicity, age, and socioeconomic status. The protocols were scored using a quantitative scoring system. The data were analyzed using a discriminant function analysis. Group membership could not be predicted based on a total HTP score.
The Reliability and Validity of Scores from the Children's Version of the Perception of Success Questionnaire.

ERIC Educational Resources Information Center

Liukkonen, Jarmo; Leskinen, Esko

1999-01-01

Analyzed the reliability and validity of scores of 557 14-year-old Finnish male soccer players on the children's version of the Perception of Success Questionnaire (G. Roberts and others, 1998). Internal consistency coefficients for the two subscales' scores were high, and scores on both scales had strong construct validity. (LSD)
Predicting Blunt Cerebrovascular Injury in Pediatric Trauma: Validation of the “Utah Score”

PubMed Central

Ravindra, Vijay M.; Bollo, Robert J.; Sivakumar, Walavan; Akbari, Hassan; Naftel, Robert P.; Limbrick, David D.; Jea, Andrew; Gannon, Stephen; Shannon, Chevis; Birkas, Yekaterina; Yang, George L.; Prather, Colin T.; Kestle, John R.

2017-01-01

Risk factors for blunt cerebrovascular injury (BCVI) may differ between children and adults, suggesting that children at low risk for BCVI after trauma receive unnecessary computed tomography angiography (CTA) and high-dose radiation. We previously developed a score for predicting pediatric BCVI based on retrospective cohort analysis. Our objective is to externally validate this prediction score with a retrospective multi-institutional cohort. We included patients who underwent CTA for traumatic cranial injury at four pediatric Level I trauma centers. Each patient in the validation cohort was scored using the “Utah Score” and classified as high or low risk. Before analysis, we defined a misclassification rate <25% as validating the Utah Score. Six hundred forty-five patients (mean age 8.6 ± 5.4 years; 63.4% males) underwent screening for BCVI via CTA. The validation cohort was 411 patients from three sites compared with the training cohort of 234 patients. Twenty-two BCVIs (5.4%) were identified in the validation cohort. The Utah Score was significantly associated with BCVIs in the validation cohort (odds ratio 8.1 [3.3, 19.8], p < 0.001) and discriminated well in the validation cohort (area under the curve 72%). When the Utah Score was applied to the validation cohort, the sensitivity was 59%, specificity was 85%, positive predictive value was 18%, and negative predictive value was 97%. The Utah Score misclassified 16.6% of patients in the validation cohort. The Utah Score for predicting BCVI in pediatric trauma patients was validated with a low misclassification rate using a large, independent, multicenter cohort. Its implementation in the clinical setting may reduce the use of CTA in low-risk patients. PMID:27297774
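The sensitivity, specificity, predictive values, and misclassification rate reported above all derive from one two-by-two table of predicted risk class against confirmed injury. The Python sketch below shows those definitions together with the prespecified 25% misclassification threshold; the function name and the counts are hypothetical, since the abstract does not give the underlying table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard two-by-two table summaries for a binary risk classification."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),            # injured patients flagged high risk
        "specificity": tn / (tn + fp),            # uninjured patients flagged low risk
        "ppv": tp / (tp + fp),                    # high-risk patients who were injured
        "npv": tn / (tn + fn),                    # low-risk patients who were uninjured
        "misclassification": (fp + fn) / total,   # share of patients in the wrong class
    }


# Hypothetical counts for illustration only (not the study's table).
metrics = diagnostic_metrics(tp=8, fp=2, fn=2, tn=88)

# The abstract's prespecified criterion: misclassification rate below 25%.
meets_criterion = metrics["misclassification"] < 0.25
```

A high negative predictive value combined with a low positive predictive value, as in the abstract, is typical when the condition is rare in the screened cohort.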
PROMIS GH (Patient-Reported Outcomes Measurement Information System Global Health) Scale in Stroke: A Validation Study.

PubMed

Katzan, Irene L; Lapin, Brittany

2018-01-01

The International Consortium for Health Outcomes Measurement recently included the 10-item PROMIS GH (Patient-Reported Outcomes Measurement Information System Global Health) scale as part of their recommended Standard Set of Stroke Outcome Measures. Before collection of PROMIS GH is broadly implemented, it is necessary to assess its performance in the stroke population. The objective of this study was to evaluate the psychometric properties of PROMIS GH in patients with ischemic stroke and intracerebral hemorrhage. PROMIS GH and 6 PROMIS domain scales measuring same/similar constructs were electronically collected on 1102 patients with ischemic and hemorrhagic strokes at various stages of recovery from their stroke who were seen in a cerebrovascular clinic from October 12, 2015, through June 2, 2017. Confirmatory factor analysis was performed to evaluate the adequacy of 2-factor structure of component scores. Test-retest reliability and convergent validity of PROMIS GH items and component scores were assessed. Discriminant validity and responsiveness were compared between PROMIS GH and PROMIS domain scales measuring the same or related constructs. Analyses were repeated stratified by stroke subtype and modified Rankin Scale score <2 versus ≥2. There was moderate internal reliability (ordinal α, 0.82-0.88) and marginal model fit for the 2-factor solution for component scores (root mean square error of approximation, 0.11). Convergent validity was good with significant correlations between all PROMIS GH items and PROMIS domain scales ( P <0.001 for all). There was excellent discrimination for all PROMIS GH items and component scores across modified Rankin Scale levels. Good responsiveness (effect size, >0.5) was demonstrated for 8 of the 10 PROMIS GH items. Reliability and validity remained consistent across stroke subtype and disability level (modified Rankin Scale, <2 versus ≥2). PROMIS GH exhibits acceptable performance in patients with stroke. 
Our findings support International Consortium for Health Outcomes Measurement recommendation to use PROMIS GH as part of the standard set of outcome measures in stroke. © 2017 American Heart Association, Inc.

Assessing Arthroscopic Skills Using Wireless Elbow-Worn Motion Sensors.

PubMed

Kirby, Georgina S J; Guyver, Paul; Strickland, Louise; Alvand, Abtin; Yang, Guang-Zhong; Hargrove, Caroline; Lo, Benny P L; Rees, Jonathan L

2015-07-01

Assessment of surgical skill is a critical component of surgical training. Approaches to assessment remain predominantly subjective, although more objective measures such as Global Rating Scales are in use. This study aimed to validate the use of elbow-worn, wireless, miniaturized motion sensors to assess the technical skill of trainees performing arthroscopic procedures in a simulated environment. Thirty participants were divided into three groups on the basis of their surgical experience: novices (n = 15), intermediates (n = 10), and experts (n = 5). All participants performed three standardized tasks on an arthroscopic virtual reality simulator while wearing wireless wrist and elbow motion sensors. Video output was recorded and a validated Global Rating Scale was used to assess performance; dexterity metrics were recorded from the simulator. Finally, live motion data were recorded via Bluetooth from the wireless wrist and elbow motion sensors and custom algorithms produced an arthroscopic performance score. Construct validity was demonstrated for all tasks, with Global Rating Scale scores and virtual reality output metrics showing significant differences between novices, intermediates, and experts (p < 0.001). The correlation of the virtual reality path length to the number of hand movements calculated from the wireless sensors was very high (p < 0.001). A comparison of the arthroscopic performance score levels with virtual reality output metrics also showed highly significant differences (p < 0.01). Comparisons of the arthroscopic performance score levels with the Global Rating Scale scores showed strong and highly significant correlations (p < 0.001) for both sensor locations, but those of the elbow-worn sensors were stronger and more significant (p < 0.001) than those of the wrist-worn sensors. A new wireless assessment of surgical performance system for objective assessment of surgical skills has proven valid for assessing arthroscopic skills. 
The elbow-worn sensors were shown to achieve an accurate assessment of surgical dexterity and performance. The validation of an entirely objective assessment of arthroscopic skill with wireless elbow-worn motion sensors introduces, for the first time, a feasible assessment system for the live operating theater with the added potential to be applied to other surgical and interventional specialties. Copyright © 2015 by The Journal of Bone and Joint Surgery, Incorporated.
Validity of the Family Asthma Management System Scale with an urban African-American sample.

PubMed

Celano, Marianne; Klinnert, Mary D; Holsey, Chanda Nicole; McQuaid, Elizabeth L

2011-06-01

To examine the reliability and validity of the Family Asthma Management System Scale for low-income African-American children with poor asthma control and caregivers under stress. The FAMSS assesses eight aspects of asthma management from a family systems perspective. Forty-three children, ages 8-13, and caregivers were interviewed with the FAMSS; caregivers completed measures of primary care quality, family functioning, parenting stress, and psychological distress. Children rated their relatedness with the caregiver, and demonstrated inhaler technique. Medical records were reviewed for dates of outpatient visits for asthma. The FAMSS demonstrated good internal consistency. Higher scores were associated with adequate inhaler technique, recent outpatient care, less parenting stress and better family functioning. Higher scores on the Collaborative Relationship with Provider subscale were associated with greater perceived primary care quality. The FAMSS demonstrated relevant associations with asthma management criteria and family functioning for a low-income, African-American sample.
Continued Validation of the O-SCORE (Ottawa Surgical Competency Operating Room Evaluation): Use in the Simulated Environment.

PubMed

MacEwan, Matthew J; Dudek, Nancy L; Wood, Timothy J; Gofton, Wade T

2016-01-01

CONSTRUCT: The Ottawa Surgical Competency Operating Room Evaluation (O-SCORE) is a 9-item surgical evaluation tool designed to assess technical competence in surgical trainees using behavioral anchors. The initial development of the O-SCORE produced evidence for valid results. Further work is required to determine if the use of a single surgeon or an unblinded rater introduces bias. In addition, the relationship of the O-SCORE to other currently used technical assessment tools should be explored to provide validity evidence related to the relationship to other measures. We have designed this project to provide continued validity evidence for the O-SCORE related to these two issues. Nineteen residents and 2 staff Orthopedic Surgeons from the University of Ottawa volunteered to participate in a 2-part OSCE style station. Participants completed a written questionnaire followed by a videotaped 10-minute simulated open reduction and internal fixation of a midshaft radius fracture. Videos were rated individually by 2 blinded staff orthopedic surgeons using an Objective Structured Assessment of Technical Skills (OSATS) global rating scale, an OSATS checklist, and the O-SCORE in random order. O-SCORE results appeared sensitive to surgical training level even when raters were blinded. In addition, strong agreement between two independent observers using the O-SCORE suggests that the measure captures a performance easily recognized by surgical observers. Ratings on the O-SCORE also were strongly associated with global ratings on the currently most validated technical evaluation tool (OSATS). Collectively, these results suggest that the O-SCORE generates accurate, reproducible, and meaningful results when used in a randomized and blinded fashion, providing continued validity evidence for using this tool to evaluate surgical trainee competence. 
The O-SCORE was able to differentiate surgical trainee level using blinded raters providing further evidence of validity for the O-SCORE. There was strong agreement between two independent observers using the O-SCORE. Ratings on the O-SCORE also demonstrated equivalence to scores on the most validated technical evaluation tool (OSATS). These results suggest that the O-SCORE demonstrates accurate and reproducible results when used in a randomized and blinded fashion providing continued validity evidence for this tool in the evaluation of surgical competence in the trainees.
Validation of the Seating and Mobility Script Concordance Test

ERIC Educational Resources Information Center

Cohen, Laura J.; Fitzgerald, Shirley G.; Lane, Suzanne; Boninger, Michael L.; Minkel, Jean; McCue, Michael

2009-01-01

The purpose of this study was to develop the scoring system for the Seating and Mobility Script Concordance Test (SMSCT), obtain and appraise internal and external structure evidence, and assess the validity of the SMSCT. The SMSCT purpose is to provide a method for testing knowledge of seating and mobility prescription. A sample of 106 therapists…
Systematic review of systemic sclerosis-specific instruments for the EULAR Outcome Measures Library: An evolutional database model of validated patient-reported outcomes.

PubMed

Ingegnoli, Francesca; Carmona, Loreto; Castrejon, Isabel

2017-04-01

The EULAR Outcome Measures Library (OML) is a freely available database of validated patient-reported outcomes (PROs). The aim of this study was to provide a comprehensive review of validated PROs specifically developed for systemic sclerosis (SSc) to feed the EULAR OML. A sensitive search was developed in Medline and Embase to identify all validation studies, cohort studies, reviews, or meta-analyses whose objective was the development or validation of specific PROs evaluating organ involvement, disease activity or damage in SSc. A reviewer screened titles and abstracts, selected the studies, and collected data concerning validation using ad hoc forms based on the COSMIN checklist. From 13,140 articles captured, 74 met the predefined criteria. After excluding two instruments as they were unavailable in English, the selected 23 studies provided information on seven SSc-specific PROs on different SSc domains: burden of illness (symptom burden index), functional status (Scleroderma Assessment Questionnaire), functional ability (scleroderma Functional Score), Raynaud's phenomenon (Raynaud's condition score), mouth involvement (Mouth Handicap in SSc), gastro-intestinal involvement (University of California Los Angeles-Scleroderma Clinical Trial Consortium Gastro-Intestinal tract 2.0), and skin involvement (skin self-assessment). Each of them is partially validated and has different psychometric requirements. Seven SSc-specific PROs have a minimum validation and were included in the EULAR OML. Further development in the area of disease-specific PROs in SSc is warranted. Copyright © 2017 Elsevier Inc. All rights reserved.
Algorithmic approach to patients presenting with heartburn and epigastric pain refractory to empiric proton pump inhibitor therapy.

PubMed

Roorda, Andrew K; Marcus, Samuel N; Triadafilopoulos, George

2011-10-01

Reflux-like dyspepsia (RLD), where predominant epigastric pain is associated with heartburn and/or regurgitation, is a common clinical syndrome in both primary and specialty care. Because symptom frequency and severity vary, overlap among gastroesophageal reflux disease (GERD), non-erosive reflux disease (NERD), and RLD, is quite common. The chronic and recurrent nature of RLD and its variable response to proton pump inhibitor (PPI) therapy remain problematic. To examine the prevalence of GERD, NERD, and RLD in a community setting using an algorithmic approach and to assess the potential, reproducibility, and validity of a multi-factorial scoring system in discriminating patients with RLD from those with GERD or NERD. Using a novel algorithmic approach, we evaluated an outpatient, community-based cohort referred to a gastroenterologist because of epigastric pain and heartburn that were only partially relieved by PPI. After an initial symptom evaluation (for epigastric pain, heartburn, regurgitation, dysphagia), an endoscopy and distal esophageal biopsies were performed, followed by esophageal motility and 24-h ambulatory pH monitoring to assess esophageal function and pathological acid exposure. A scoring system based on presence of symptoms and severity of findings was devised. Data was collected in two stages: subjects in the first stage were designated as the derivation cohort; subjects in the second stage were labeled the validation cohort. The total cohort comprised 159 patients (59 males, 100 females; mean age 52). On endoscopy, 30 patients (19%) had complicated esophagitis (CE) and 11 (7%) had Barrett's esophagus (BE) and were classified collectively as patients with GERD. One-hundred and eighteen (74%) patients had normal esophagus. Of these, 94 (59%) had one or more of the following: hiatal hernia, positive biopsy, abnormal pH, and/or abnormal motility studies and were classified as patients with NERD. 
The remaining 24 patients (15%) had normal functional studies and were classified as patients with RLD. Utilizing the scoring system a total score was calculated for each patient and effectively distinguished patients with GERD (mean score 9), NERD (mean score 6), and RLD (mean score 3). Receiver operating characteristic (ROC) curves confirmed the optimization of the model, particularly in RLD (P = 0.0001, 95% CI: 0.91-0.98). In a community cohort of patients presenting with heartburn and epigastric pain partly refractory to empiric PPI therapy, the prevalence of CE was 19%, BE 7%, NERD 59%, and RLD 15%. An algorithmic approach coupled with a novel scoring system, effectively distinguishes GERD from NERD and RLD and facilitates further management decisions. This novel and simple scoring system is both reproducible and validated as a diagnostic aid in evaluating patients presenting with both epigastric pain and heartburn.
POSSUM--a model for surgical outcome audit in quality care.

PubMed

Ng, K J; Yii, M K

2003-10-01

Comparative surgical audit to monitor quality of care should be performed with a risk-adjusted scoring system rather than crude morbidity and mortality rates. A validated and widely applied risk-adjusted scoring system, the P-POSSUM (Portsmouth-Physiological and Operative Severity Score for the enUmeration of Mortality) methodology, was applied to a prospective series of predominantly general surgical patients at the Sarawak General Hospital, Kuching, over a six-month period. The patients were grouped into four risk groups. The observed mortality rates were not significantly different from predicted rates, showing that the quality of surgical care was on par with typical Western series. The simplicity and advantages of this scoring system over other auditing tools are discussed. The P-POSSUM methodology could form the basis of local comparative surgical audit for assessment and maintenance of quality care.
Clinical Characteristics and Validation of Bronchiectasis Severity Score Systems for Post-Tuberculosis Bronchiectasis.

PubMed

Wang, Hong; Ji, Xiao-Bin; Li, Cheng-Wei; Lu, Hai-Wen; Mao, Bei; Liang, Shuo; Cheng, Ke-Bin; Bai, Jiu-Wu; Martinez-Garcia, Miguel Angel; Xu, Jin-Fu

2018-05-23

Lung damage related to tuberculosis is a major contributor to the etiology of bronchiectasis in China. It is unknown whether bronchiectasis severity score systems are applicable in these cases. To evaluate the clinical characteristics and validation of bronchiectasis severity score systems for post-tuberculosis bronchiectasis. The study enrolled 596 bronchiectasis patients in Shanghai Pulmonary Hospital between January 2011 and December 2012. The data for calculating FACED and bronchiectasis severity index (BSI) scores along with mortality, readmission, and exacerbation outcomes were collected and analyzed within a follow-up period with a median length of 48 months (interquartile range 43-54 months). The study enrolled 101 post-tuberculosis bronchiectasis patients and 495 non-tuberculosis bronchiectasis patients. Compared with non-tuberculosis bronchiectasis patients, post-tuberculosis bronchiectasis patients had bilateral bronchiectasis less often (P=0.004), had a higher frequency of right upper lobe involvement (P<0.001), and showed the cylindrical type more often (P<0.001). Follow-up data indicated that both scoring systems were able to predict mortality over the follow-up period (median 48 months, interquartile range 43-54 months) in post-tuberculosis patients as assessed by the area under the receiver operator characteristic curve (AUC) (FACED AUC=0.81, BSI AUC=0.70), but they did not predict readmission (FACED and BSI=0.56) or exacerbation (FACED and BSI=0.52) well. There are apparent differences in radiologic features between bronchiectasis patients with and without a history of pulmonary tuberculosis. Both FACED and BSI can predict mortality in post-tuberculosis bronchiectasis. This article is protected by copyright. All rights reserved. © 2018 John Wiley & Sons Ltd.
Reliability, Validity, and Minimal Detectable Change of Balance Evaluation Systems Test and Its Short Versions in Older Cancer Survivors: A Pilot Study.

PubMed

Huang, Min H; Miller, Kara; Smith, Kristin; Fredrickson, Kayle; Shilling, Tracy

2016-01-01

Cancer is primarily a disease of older adults. About 77% of all cancers are diagnosed in persons aged 55 years and older. Cancer and its treatment can cause diverse sequelae impacting body systems underlying balance control. No study has examined the psychometric properties of balance assessment tools in older cancer survivors, presenting a significant challenge in the selection of outcome measures for clinicians treating this fast-growing population. This study aimed to determine the reliability, validity, and minimal detectable change (MDC) of the Balance Evaluation Systems Test (BESTest), Mini-Balance Evaluation Systems Test (Mini-BESTest), and Brief-Balance Evaluation Systems Test (Brief-BESTest) in community-dwelling older cancer survivors. This study used a cross-sectional design. Twenty breast and 8 prostate cancer survivors participated [age (SD) = 68.4 (8.13) years]. The BESTest and Activity-specific Balance Confidence (ABC) Scale were administered during the first session. Scores of Mini-BESTest and Brief-BESTest were extracted on the basis of the scores of BESTest. The BESTest was repeated within 1 to 2 weeks by the same rater to determine the test-retest reliability. For the analysis of the inter-rater reliability, 21 participants were randomly selected to be evaluated by 2 raters. A primary rater administered the test. The 2 raters independently and concurrently scored the performance of the participants. Each rater recorded the ratings separately on the scoring sheet. No discussion among the raters was allowed throughout the testing. Intraclass correlation coefficients (ICCs), standard error of measurement, minimal detectable change (MDC), and Bland-Altman plots were calculated. Concurrent validity of these balance tests with the ABC Scale was examined using the Spearman correlation. 
The BESTest, Mini-BESTest, and Brief-BESTest had high test-retest (ICC = 0.90-0.94) and interrater reliability (ICC = 0.86-0.96), small standard errors of measurement (0.86-2.47 points), and small MDC values (2.39-6.86 points). The Bland-Altman plot revealed no systematic errors. The scores of the BESTest, Mini-BESTest, and Brief-BESTest were correlated significantly with those of the ABC Scale (P < .01), supporting their concurrent validity. The BESTest, Mini-BESTest, and Brief-BESTest showed high interrater and test-retest reliability, and excellent concurrent validity with the ABC Scale for community-dwelling cancer survivors aged 55 years and older who had completed cancer treatments for at least 3 months. Future studies are necessary to determine the predictive values for determining fall risks using balance assessment tools in older cancer survivors. Clinicians can utilize the BESTest and its short versions to evaluate balance problems in community-dwelling older cancer survivors and apply the established MDC to assess the intervention outcomes.
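The reported MDC range is close to what the conventional 95% confidence formula, MDC95 = 1.96 × √2 × SEM, gives for the reported standard errors of measurement. The abstract does not say which formula the authors used, so the following is only a sketch under that assumption; the function name `mdc95` is illustrative.

```python
import math


def mdc95(sem: float) -> float:
    """Minimal detectable change at 95% confidence.

    Assumes the conventional formula 1.96 * sqrt(2) * SEM; the abstract
    does not confirm that this is the formula the authors applied.
    """
    return 1.96 * math.sqrt(2) * sem


# Endpoints of the SEM range reported in the abstract (in points).
for sem in (0.86, 2.47):
    print(f"SEM = {sem:.2f} points -> MDC95 = {mdc95(sem):.2f} points")
```

Under this assumption the two endpoints come out at roughly 2.38 and 6.85 points, within rounding of the reported 2.39-6.86 points.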
Validation of reactive gases and aerosols in the MACC global analysis and forecast system

NASA Astrophysics Data System (ADS)

Eskes, H.; Huijnen, V.; Arola, A.; Benedictow, A.; Blechschmidt, A.-M.; Botek, E.; Boucher, O.; Bouarar, I.; Chabrillat, S.; Cuevas, E.; Engelen, R.; Flentje, H.; Gaudel, A.; Griesfeller, J.; Jones, L.; Kapsomenakis, J.; Katragkou, E.; Kinne, S.; Langerock, B.; Razinger, M.; Richter, A.; Schultz, M.; Schulz, M.; Sudarchikova, N.; Thouret, V.; Vrekoussis, M.; Wagner, A.; Zerefos, C.

2015-02-01

The European MACC (Monitoring Atmospheric Composition and Climate) project is preparing the operational Copernicus Atmosphere Monitoring Service (CAMS), one of the services of the European Copernicus Programme on Earth observation and environmental services. MACC uses data assimilation to combine in-situ and remote sensing observations with global and regional models of atmospheric reactive gases, aerosols and greenhouse gases, and is based on the Integrated Forecast System of the ECMWF. The global component of the MACC service has a dedicated validation activity to document the quality of the atmospheric composition products. In this paper we discuss the approach to validation that has been developed over the past three years. Topics discussed are the validation requirements, the operational aspects, the measurement data sets used, the structure of the validation reports, the models and assimilation systems validated, the procedure to introduce new upgrades, and the scoring methods. One specific target of the MACC system concerns forecasting special events with high pollution concentrations. Such events receive extra attention in the validation process. Finally, a summary is provided of the results from the validation of the latest set of daily global analysis and forecast products from the MACC system reported in November 2014.
Development and validation of an endoscopic classification of diverticular disease of the colon: the DICA classification.

PubMed

Tursi, Antonio; Brandimarte, Giovanni; Di Mario, Francesco; Andreoli, Arnaldo; Annunziata, Maria Laura; Astegiano, Marco; Bianco, Maria Antonietta; Buri, Luigi; Cammarota, Giovanni; Capezzuto, Erminio; Chilovi, Fausto; Cianci, Massimo; Conigliaro, Rita; Del Favero, Giuseppe; Di Cesare, Luigi; Di Fonzo, Michela; Elisei, Walter; Faggiani, Roberto; Farroni, Ferruccio; Forti, Giacomo; Germanà, Bastianello; Giorgetti, Gian Marco; Giovannone, Maurizio; Lecca, Piera Giuseppina; Loperfido, Silvano; Marmo, Riccardo; Morucci, Piero; Occhigrossi, Giuseppe; Penna, Antonio; Rossi, Alfredo Francesco; Spadaccini, Antonio; Zampaletta, Costantino; Zilli, Maurizio; Zullo, Angelo; Scarpignato, Carmelo; Picchio, Marcello

2015-01-01

A validated endoscopic classification of diverticular disease (DD) of the colon is lacking at present. Our aim was to develop a simple endoscopic score of DD: the Diverticular Inflammation and Complication Assessment (DICA) score. The DICA score for DD resulted in the sum of the scores for the extension of diverticulosis, the number of diverticula per region, the presence and type of inflammation, and the presence and type of complications: DICA 1 (≤ 3), DICA 2 (4-7) and DICA 3 (>7). A comparison with abdominal pain and inflammatory marker expression was also performed. A total of 50 videos of DD patients were reassessed in order to investigate the predictive role of DICA on the outcome of the disease. Overall agreement in using DICA was 0.847 (95% confidence interval, CI, 0.812-0.893): 0.878 (95% CI 0.832-0.895) for DICA 1, 0.765 (95% CI 0.735-0.786) for DICA 2 and 0.891 (95% CI 0.845-0.7923) for DICA 3. Intra-observer agreement (kappa) was 0.91 (95% CI 0.886-0.947). A significant correlation was found between the DICA score and C-reactive protein values (p = 0.0001), as well as between the median pain score and the DICA score (p = 0.0001). With respect to the 50 patients retrospectively reassessed, occurrence/recurrence of disease complications was recorded in 29 patients (58%): 10 (34.5%) were classified as DICA 1 and 19 (65.5%) as DICA 2 (p = 0.036). The DICA score is a simple, reproducible, validated and easy-to-use endoscopic scoring system for DD of the colon. © 2014 S. Karger AG, Basel.
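The mapping from the summed item scores to a DICA class can be sketched as below. The abstract gives only the cut-offs for the total (≤ 3, 4-7, > 7) and not the points assigned to each item, so the four item scores are taken as already-scored inputs; the function and parameter names are hypothetical.

```python
def dica_class(extension: int, diverticula_count: int,
               inflammation: int, complications: int) -> int:
    """Return the DICA class (1, 2 or 3) from four already-scored items.

    The per-item point values are NOT given in the abstract; callers are
    assumed to supply them. Only the cut-offs on the sum come from the text.
    """
    items = (extension, diverticula_count, inflammation, complications)
    if any(points < 0 for points in items):
        raise ValueError("item scores must be non-negative")
    total = sum(items)
    if total <= 3:
        return 1  # DICA 1: total of 3 or less
    if total <= 7:
        return 2  # DICA 2: total of 4 to 7
    return 3      # DICA 3: total above 7


# Example with made-up item scores summing to 5.
print(dica_class(extension=2, diverticula_count=1, inflammation=2, complications=0))
```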
Clinical Risk Score for Persistent Postconcussion Symptoms Among Children With Acute Concussion in the ED.

PubMed

Zemek, Roger; Barrowman, Nick; Freedman, Stephen B; Gravel, Jocelyn; Gagnon, Isabelle; McGahern, Candice; Aglipay, Mary; Sangha, Gurinder; Boutis, Kathy; Beer, Darcy; Craig, William; Burns, Emma; Farion, Ken J; Mikrogianakis, Angelo; Barlow, Karen; Dubrovsky, Alexander S; Meeuwisse, Willem; Gioia, Gerard; Meehan, William P; Beauchamp, Miriam H; Kamil, Yael; Grool, Anne M; Hoshizaki, Blaine; Anderson, Peter; Brooks, Brian L; Yeates, Keith Owen; Vassilyadi, Michael; Klassen, Terry; Keightley, Michelle; Richer, Lawrence; DeMatteo, Carol; Osmond, Martin H

2016-03-08

Approximately one-third of children experiencing acute concussion experience ongoing somatic, cognitive, and psychological or behavioral symptoms, referred to as persistent postconcussion symptoms (PPCS). However, validated and pragmatic tools enabling clinicians to identify patients at risk for PPCS do not exist. To derive and validate a clinical risk score for PPCS among children presenting to the emergency department. Prospective, multicenter cohort study (Predicting and Preventing Postconcussive Problems in Pediatrics [5P]) enrolled young patients (aged 5-<18 years) who presented within 48 hours of an acute head injury at 1 of 9 pediatric emergency departments within the Pediatric Emergency Research Canada (PERC) network from August 2013 through September 2014 (derivation cohort) and from October 2014 through June 2015 (validation cohort). Participants completed follow-up 28 days after the injury. All eligible patients had concussions consistent with the Zurich consensus diagnostic criteria. The primary outcome was PPCS risk score at 28 days, which was defined as 3 or more new or worsening symptoms using the patient-reported Postconcussion Symptom Inventory compared with recalled state of being prior to the injury. In total, 3063 patients (median age, 12.0 years [interquartile range, 9.2-14.6 years]; 1205 [39.3%] girls) were enrolled (n = 2006 in the derivation cohort; n = 1057 in the validation cohort) and 2584 of whom (n = 1701 [85%] in the derivation cohort; n = 883 [84%] in the validation cohort) completed follow-up at 28 days after the injury. Persistent postconcussion symptoms were present in 801 patients (31.0%) (n = 510 [30.0%] in the derivation cohort and n = 291 [33.0%] in the validation cohort). 
The 12-point PPCS risk score model for the derivation cohort included the variables of female sex, age of 13 years or older, physician-diagnosed migraine history, prior concussion with symptoms lasting longer than 1 week, headache, sensitivity to noise, fatigue, answering questions slowly, and 4 or more errors on the Balance Error Scoring System tandem stance. The area under the curve was 0.71 (95% CI, 0.69-0.74) for the derivation cohort and 0.68 (95% CI, 0.65-0.72) for the validation cohort. A clinical risk score developed among children presenting to the emergency department with concussion and head injury within the previous 48 hours had modest discrimination to stratify PPCS risk at 28 days. Before this score is adopted in clinical practice, further research is needed for external validation, assessment of accuracy in an office setting, and determination of clinical utility.
Evaluating the Validity and Applicability of Automated Essay Scoring in Two Massive Open Online Courses

ERIC Educational Resources Information Center

Reilly, Erin Dawna; Stafford, Rose Eleanore; Williams, Kyle Marie; Corliss, Stephanie Brooks

2014-01-01

The use of massive open online courses (MOOCs) to expand students' access to higher education has raised questions regarding the extent to which this course model can provide and assess authentic, higher level student learning. In response to this need, MOOC platforms have begun utilizing automated essay scoring (AES) systems that allow…
Expert opinion as 'validation' of risk assessment applied to calf welfare.

PubMed

Bracke, Marc B M; Edwards, Sandra A; Engel, Bas; Buist, Willem G; Algers, Bo

2008-07-14

Recently, a Risk Assessment methodology was applied to animal welfare issues in a report of the European Food Safety Authority (EFSA) on intensively housed calves. Because this is a new and potentially influential approach to derive conclusions on animal welfare issues, a so-called semantic-modelling type 'validation' study was conducted by asking expert scientists, who had been involved or quoted in the report, to give welfare scores for housing systems and for welfare hazards. Kendall's coefficient of concordance among experts (n = 24) was highly significant (P < 0.001), but low (0.29 and 0.18 for housing systems and hazards respectively). Overall correlations with EFSA scores were significant only for experts with a veterinary or mixed (veterinary and applied ethological) background. Significant differences in welfare scores were found between housing systems, between hazards, and between experts with different backgrounds. For example, veterinarians gave higher overall welfare scores for housing systems than ethologists did, probably reflecting a difference in their perception of animal welfare. Systems with the lowest scores were veal calves kept individually in so-called "baby boxes" (veal crates) or in small groups, and feedlots. A suckler herd on pasture was rated as the best for calf welfare. The main hazards were related to underfeeding, inadequate colostrum intake, poor stockperson education, insufficient space, inadequate roughage, iron deficiency, inadequate ventilation, poor floor conditions and no bedding. Points for improvement of the Risk Assessment applied to animal welfare include linking information, reporting uncertainty and transparency about underlying values. The study provides novel information on expert opinion in relation to calf welfare and shows that Risk Assessment applied to animal welfare can benefit from a semantic modelling approach.
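For readers unfamiliar with Kendall's coefficient of concordance, a minimal sketch of the statistic follows. It assumes every expert supplies a complete ranking of the same items without ties, which is a simplification: the study collected welfare scores, and tied scores would need average ranks and a tie correction that this sketch omits. The function name and the example data are illustrative only.

```python
def kendalls_w(rankings: list[list[int]]) -> float:
    """Kendall's W for m raters who each rank the same n items as 1..n.

    Assumes untied ranks; no tie correction is applied.
    W = 12 * S / (m^2 * (n^3 - n)), where S is the sum of squared
    deviations of the per-item rank sums from their mean.
    """
    m = len(rankings)
    n = len(rankings[0]) if rankings else 0
    if m < 2 or n < 2:
        raise ValueError("need at least 2 raters and 2 items")
    for ranking in rankings:
        if sorted(ranking) != list(range(1, n + 1)):
            raise ValueError("each rater must give an untied ranking 1..n")
    rank_sums = [sum(ranking[i] for ranking in rankings) for i in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Three hypothetical raters ranking three housing systems.
print(kendalls_w([[1, 2, 3], [1, 3, 2], [2, 1, 3]]))
```

W ranges from 0 (no agreement) to 1 (identical rankings), which is why values of 0.29 and 0.18 can be statistically significant with 24 experts and still indicate low concordance.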
Renal Tumor Anatomic Complexity: Clinical Implications for Urologists.

PubMed

Joshi, Shreyas S; Uzzo, Robert G

2017-05-01

Anatomic tumor complexity can be objectively measured and reported using nephrometry. Various scoring systems have been developed in an attempt to correlate tumor complexity with intraoperative and postoperative outcomes. Nephrometry may also predict tumor biology in a noninvasive, reproducible manner. Other scoring systems can help predict surgical complexity and the likelihood of complications, independent of tumor characteristics. The accumulated data in this new field provide provocative evidence that objectifying anatomic complexity can consolidate reporting mechanisms and improve metrics of comparisons. Further prospective validation is needed to understand the full descriptive and predictive ability of the various nephrometry scores. Copyright © 2017 Elsevier Inc. All rights reserved.
Validity of GRE General Test Scores and TOEFL Scores for Graduate Admission to a Technical University in Western Europe

ERIC Educational Resources Information Center

Zimmermann, Judith; von Davier, Alina A.; Buhmann, Joachim M.; Heinimann, Hans R.

2018-01-01

Graduate admission has become a critical process in tertiary education, whereby selecting valid admissions instruments is key. This study assessed the validity of Graduate Record Examination (GRE) General Test scores for admission to Master's programmes at a technical university in Europe. We investigated the indicative value of GRE scores for the…
Construct Validation of Analytic Rating Scales in a Speaking Assessment: Reporting a Score Profile and a Composite

ERIC Educational Resources Information Center

Sawaki, Yasuyo

2007-01-01

This is a construct validation study of a second language speaking assessment that reported a language profile based on analytic rating scales and a composite score. The study addressed three key issues: score dependability, convergent/discriminant validity of analytic rating scales and the weighting of analytic ratings in the composite score.…
The Role of Scoring Systems and Urine Dipstick in Prediction of Rhabdomyolysis-induced Acute Kidney Injury: a Systematic Review.

PubMed

Safari, Saeed; Yousefifard, Mahmoud; Hashemi, Behrooz; Baratloo, Alireza; Forouzanfar, Mohammad Mehdi; Rahmati, Farhad; Motamedi, Maryam; Najafi, Iraj

2016-05-01

During the past decade, using serum biomarkers and clinical decision rules for early prediction of rhabdomyolysis-induced acute kidney injury (AKI) has received much attention from researchers. This study aimed to broadly review the value of scoring systems and urine dipstick in prediction of rhabdomyolysis-induced AKI. The study was designed based on the guidelines of the Meta-analysis of Observational Studies in Epidemiology statement. The search was performed in the electronic databases MEDLINE, EMBASE, Cochrane Library, Scopus, and Google Scholar by 2 independent reviewers. Studies evaluating AKI risk factors in rhabdomyolysis patients with the aim of developing a scoring model, as well as those assessing the role of urine dipstick in these patients, were included. Of the 5997 articles found, 143 were potentially relevant studies. After studying their full texts, 6 articles were entered into the systematic review. Two studies had developed or validated scoring systems: the "rule of thumb," the AKI index, and the Mangled Extremity Severity Score. Four studies were on the predictive value of urine dipstick in risk prediction of rhabdomyolysis-induced AKI, with favorable results. The findings of this systematic review showed that, based on the available resources, prediction rules and urine dipstick could be considered valuable screening tools for detection of patients at risk for AKI following rhabdomyolysis. Yet, the external validity of the mentioned tools should be assessed before their general application in routine practice.
Derivation and Cross-Validation of Cutoff Scores for Patients With Schizophrenia Spectrum Disorders on WAIS-IV Digit Span-Based Performance Validity Measures.

PubMed

Glassmire, David M; Toofanian Ross, Parnian; Kinney, Dominique I; Nitch, Stephen R

2016-06-01

Two studies were conducted to identify and cross-validate cutoff scores on the Wechsler Adult Intelligence Scale-Fourth Edition Digit Span-based embedded performance validity (PV) measures for individuals with schizophrenia spectrum disorders. In Study 1, normative scores were identified on Digit Span-embedded PV measures among a sample of patients (n = 84) with schizophrenia spectrum diagnoses who had no known incentive to perform poorly and who put forth valid effort on external PV tests. Previously identified cutoff scores resulted in unacceptable false positive rates and lower cutoff scores were adopted to maintain specificity levels ≥90%. In Study 2, the revised cutoff scores were cross-validated within a sample of schizophrenia spectrum patients (n = 96) committed as incompetent to stand trial. Performance on Digit Span PV measures was significantly related to Full Scale IQ in both studies, indicating the need to consider the intellectual functioning of examinees with psychotic spectrum disorders when interpreting scores on Digit Span PV measures. © The Author(s) 2015.
Risk score to predict the outcome of patients with cerebral vein and dural sinus thrombosis.

PubMed

Ferro, José M; Bacelar-Nicolau, Helena; Rodrigues, Teresa; Bacelar-Nicolau, Leonor; Canhão, Patrícia; Crassard, Isabelle; Bousser, Marie-Germaine; Dutra, Aurélio Pimenta; Massaro, Ayrton; Mackowiack-Cordiolani, Marie-Anne; Leys, Didier; Fontes, João; Stam, Jan; Barinagarrementeria, Fernando

2009-01-01

Around 15% of patients die or become dependent after cerebral vein and dural sinus thrombosis (CVT). We used the International Study on Cerebral Vein and Dural Sinus Thrombosis (ISCVT) sample (624 patients, with a median follow-up time of 478 days) to develop a Cox proportional hazards regression model to predict outcome, dichotomised by a modified Rankin Scale score >2. From the model hazard ratios, a risk score was derived and a cut-off point selected. The model and the score were tested in 2 validation samples: (1) the prospective Cerebral Venous Thrombosis Portuguese Collaborative Study Group (VENOPORT) sample with 91 patients; (2) a sample of 169 consecutive CVT patients admitted to 5 ISCVT centres after the end of the ISCVT recruitment period. Sensitivity, specificity, c statistics and overall efficiency to predict outcome at 6 months were calculated. The model (hazard ratios: malignancy 4.53; coma 4.19; thrombosis of the deep venous system 3.03; mental status disturbance 2.18; male gender 1.60; intracranial haemorrhage 1.42) had overall efficiencies of 85.1, 84.4 and 90.0%, in the derivation sample and validation samples 1 and 2, respectively. Using the risk score (range from 0 to 9) with a cut-off of ≥3 points, overall efficiency was 85.4, 84.4 and 90.1% in the derivation sample and validation samples 1 and 2, respectively. Sensitivity and specificity in the combined samples were 96.1 and 13.6%, respectively. The CVT risk score has a good estimated overall rate of correct classifications in both validation samples, but its specificity is low. It can be used to avoid unnecessary or dangerous interventions in low-risk patients, and may help to identify high-risk CVT patients. © 2009 S. Karger AG, Basel.

Reliability and Validity of Oral Reading Fluency Median and Mean Scores among Middle Grade Readers When Using Equated Texts

PubMed Central

Barth, Amy E.; Stuebing, Karla K.; Fletcher, Jack M.; Cirino, Paul T.; Romain, Melissa; Francis, David; Vaughn, Sharon

2012-01-01

We evaluated the reliability and validity of two oral reading fluency scores for one-minute equated passages: median score and mean score. These scores were calculated from measures of reading fluency administered up to five times over the school year to students in grades 6–8 (n = 1,317). Both scores were highly reliable with strong convergent validity for adequately developing and struggling middle grade readers. These results support the use of either the median or mean score for oral reading fluency assessments for middle grade readers. PMID:23087532
INTERPRETING PHYSICAL AND BEHAVIORAL HEALTH SCORES FROM NEW WORK DISABILITY INSTRUMENTS

PubMed Central

Marfeo, Elizabeth E.; Ni, Pengsheng; Chan, Leighton; Rasch, Elizabeth K.; McDonough, Christine M.; Brandt, Diane E.; Bogusz, Kara; Jette, Alan M.

2015-01-01

Objective: To develop a system to guide interpretation of scores generated from 2 new instruments measuring work-related physical and behavioral health functioning (Work Disability – Physical Function (WD-PF) and WD – Behavioral Function (WD-BH)). Design: Cross-sectional, secondary data from 3 independent samples to develop and validate the functional levels for physical and behavioral health functioning. Subjects: Physical group: 999 general adult subjects, 1,017 disability applicants and 497 work-disabled subjects. Behavioral health group: 1,000 general adult subjects, 1,015 disability applicants and 476 work-disabled subjects. Methods: A three-phase analytic approach including item mapping, a modified-Delphi technique, and known-groups validation analysis was used to develop and validate cut-points for functional levels within each of the WD-PF and WD-BH instruments' scales. Results: Four and 5 functional levels were developed for each of the scales in the WD-PF and WD-BH instruments. Distribution of the comparative samples was in the expected direction: the general adult samples consistently demonstrated scores at higher functional levels compared with the claimant and work-disabled samples. Conclusion: Using an item-response theory-based methodology paired with a qualitative process appears to be a feasible and valid approach for translating the WD-BH and WD-PF scores into meaningful levels useful for interpreting a person’s work-related physical and behavioral health functioning. PMID:25729901
Developing a weighted measure of speech sound accuracy.

PubMed

Preston, Jonathan L; Ramsdell, Heather L; Oller, D Kimbrough; Edwards, Mary Louise; Tobin, Stephen J

2011-02-01

To develop a system for numerically quantifying a speaker's phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, the authors describe a system for differentially weighting speech sound errors on the basis of various levels of phonetic accuracy using a Weighted Speech Sound Accuracy (WSSA) score. The authors then evaluate the reliability and validity of this measure. Phonetic transcriptions were analyzed from several samples of child speech, including preschoolers and young adolescents with and without speech sound disorders and typically developing toddlers. The new measure of phonetic accuracy was validated against existing measures, was used to discriminate typical and disordered speech production, and was evaluated to examine sensitivity to changes in phonetic accuracy over time. Reliability between transcribers and consistency of scores among different word sets and testing points are compared. Initial psychometric data indicate that WSSA scores correlate with other measures of phonetic accuracy as well as listeners' judgments of the severity of a child's speech disorder. The measure separates children with and without speech sound disorders and captures growth in phonetic accuracy in toddlers' speech over time. The measure correlates highly across transcribers, word lists, and testing points. Results provide preliminary support for the WSSA as a valid and reliable measure of phonetic accuracy in children's speech.
Distinguishing infected from noninfected abdominal fluid collections after surgery: an imaging, clinical, and laboratory-based scoring system.

PubMed

Gnannt, Ralph; Fischer, Michael A; Baechler, Thomas; Clavien, Pierre-Alain; Karlo, Christoph; Seifert, Burkhardt; Lesurtel, Mickael; Alkadhi, Hatem

2015-01-01

Mortality from abdominal abscesses ranges from 30% in treated cases up to 80% to 100% in patients with undrained or nonoperated abscesses. Various computed tomographic (CT) imaging features have been suggested to indicate infection of postoperative abdominal fluid collections; however, features are nonspecific and substantial overlap between infected and noninfected collections exists. The purpose of this study was to develop and validate a scoring system on the basis of CT imaging findings as well as laboratory and clinical parameters for distinguishing infected from noninfected abdominal fluid collections after surgery. The score developmental cohort included 100 consecutive patients (69 men, 31 women; mean age, 58 ± 17 years) who underwent portal-venous phase CT within 24 hours before CT-guided intervention of postoperative abdominal fluid collections. Imaging features included attenuation (Hounsfield unit [HU]), volume, wall enhancement and thickness, fat stranding, as well as entrapped gas of fluid collections. Laboratory and clinical parameters included diabetes, intake of immunosuppressive drugs, body temperature, C-reactive protein, and leukocyte blood cell count. The score was validated in a separate cohort of 30 consecutive patients (17 men, 13 women; mean age, 51 ± 15 years) with postoperative abdominal fluid collections. Microbiologic analysis from fluid samples served as the standard of reference. Diabetes, body temperature, C-reactive protein, attenuation of the fluid collection (in HUs), wall enhancement and thickness of the wall, adjacent fat stranding, as well as entrapped gas within the fluid collection were significantly different between infected and noninfected collections (P < 0.001). 
Multiple logistic regression analysis revealed diabetes, C-reactive protein, attenuation of the fluid collection (in HUs), and entrapped gas as significant independent predictors of infection (P < 0.001), and these were therefore selected for constructing a scoring system from 0 to 10 (diabetes: 2 points; C-reactive protein, ≥ 100 mg/L: 1 point; attenuation of fluid collection, ≥ 20 HU: 4 points; entrapped gas: 3 points). The model was well calibrated (Hosmer-Lemeshow test, P = 0.36). In the validation cohort, scores of 2 or lower had a 90% (95% confidence interval [CI], 56%-100%) negative predictive value, scores of 3 or higher had an 80% (95% CI, 56%-94%) positive predictive value, and scores of 6 or higher had a 100% (95% CI, 74%-100%) positive predictive value for diagnosing infected fluid collections. Receiver operating characteristic analysis revealed an area under the curve of 0.96 (95% CI, 0.88-1.00) for the score. We introduce an accurate scoring system including quantitative radiologic, laboratory, and clinical parameters for distinguishing infected from noninfected fluid collections after abdominal surgery.
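The point assignment stated in the abstract can be written out as a short function. This is an illustrative sketch only: the point values and thresholds are taken from the abstract, while the function and parameter names are hypothetical and the code is not a clinical tool.

```python
def fluid_collection_score(diabetes: bool, crp_mg_per_l: float,
                           attenuation_hu: float, entrapped_gas: bool) -> int:
    """Sum the four predictors using the point values given in the abstract."""
    score = 0
    if diabetes:
        score += 2                # diabetes: 2 points
    if crp_mg_per_l >= 100:
        score += 1                # C-reactive protein >= 100 mg/L: 1 point
    if attenuation_hu >= 20:
        score += 4                # attenuation >= 20 HU: 4 points
    if entrapped_gas:
        score += 3                # entrapped gas: 3 points
    return score                  # possible range: 0 to 10


# Hypothetical patient: no diabetes, CRP 140 mg/L, 25 HU, no entrapped gas.
print(fluid_collection_score(False, 140.0, 25.0, False))
```

In the validation cohort described above, totals of 2 or lower, 3 or higher, and 6 or higher are the bands for which predictive values are reported.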
Translation and adaptation of the fatigue severity scale for use in Portugal.

PubMed

Laranjeira, Carlos António

2012-08-01

The Fatigue Severity Scale (FSS) is a widely used instrument to measure the impact of fatigue on specific types of functioning. This study aims to translate and test the reliability and validity of the Portuguese version of the FSS. The questionnaire was administered to a worker sample of 424 nurses. Reliability analysis showed satisfactory results (Cronbach's alpha coefficient = .87). The test-retest reliability was .85. The principal component analysis showed that the FSS was a measure with a one-factor structure. The construct validity of the total FSS score was assessed by correlation with Maslach Burnout Inventory (MBI) score, Depression Anxiety Stress Scale (DASS) score, and Visual Analogue Scale (VAS) score. The correlation coefficients between the total FSS score and the MBI score, DASS score, and perceived fatigue score (VAS) were .55 (p < .01), .62 (p < .01), and .68 (p < .01), respectively, which shows sufficient construct validity. To measure the discriminant validity of the FSS, we examined the differences in scores between groups in terms of the number of hours of sleep and overtime. The less nurses slept and the longer they worked, the higher their total FSS score became. This preliminary validation study of the Portuguese version of the FSS showed that it is an acceptable, reliable, and valid measure of fatigue in the working population. Copyright © 2012 Elsevier Inc. All rights reserved.
Design, implementation, and psychometric analysis of a scoring instrument for simulated pediatric resuscitation: a report from the EXPRESS pediatric investigators.

PubMed

Donoghue, Aaron; Ventre, Kathleen; Boulet, John; Brett-Fleegler, Marisa; Nishisaki, Akira; Overly, Frank; Cheng, Adam

2011-04-01

Robustly tested instruments for quantifying clinical performance during pediatric resuscitation are lacking. Examining Pediatric Resuscitation Education through Simulation and Scripting Collaborative was established to conduct multicenter trials of simulation education in pediatric resuscitation, evaluating performance with multiple instruments, one of which is the Clinical Performance Tool (CPT). We hypothesize that the CPT will measure clinical performance during simulated pediatric resuscitation in a reliable and valid manner. Using a pediatric resuscitation scenario as a basis, a scoring system was designed based on Pediatric Advanced Life Support algorithms comprising 21 tasks. Each task was scored as follows: task not performed (0 points); task performed partially, incorrectly, or late (1 point); and task performed completely, correctly, and within the recommended time frame (2 points). Study teams at 14 children's hospitals went through the scenario twice (PRE and POST) with an interposed 20-minute debriefing. Both scenarios for each of eight study teams were scored by multiple raters. A generalizability study, based on the PRE scores, was conducted to investigate the sources of measurement error in the CPT total scores. Inter-rater reliability was estimated based on the variance components. Validity was assessed by repeated measures analysis of variance comparing PRE and POST scores. Sixteen resuscitation scenarios were reviewed and scored by seven raters. Inter-rater reliability for the overall CPT score was 0.63. POST scores were found to be significantly improved compared with PRE scores when controlled for within-subject covariance (F1,15 = 4.64, P < 0.05). The variance component ascribable to rater was 2.4%. Reliable and valid measures of performance in simulated pediatric resuscitation can be obtained from the CPT. 
Future studies should examine the applicability of trichotomous scoring instruments to other clinical scenarios, as well as performance during actual resuscitations.
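The trichotomous task scoring described above lends itself to a short illustration. This Python sketch assumes that the total score is the plain sum of the 21 task scores, which the abstract does not state explicitly; the names are hypothetical.

```python
N_TASKS = 21  # number of tasks in the scoring system, per the abstract
VALID_POINTS = {0, 1, 2}  # not performed / partial, incorrect, or late / complete


def cpt_total(task_points):
    """Validate per-task points and return their sum (assumed total score)."""
    if len(task_points) != N_TASKS:
        raise ValueError(f"expected {N_TASKS} task scores, got {len(task_points)}")
    if any(p not in VALID_POINTS for p in task_points):
        raise ValueError("each task must be scored 0, 1, or 2")
    return sum(task_points)
```

Under this assumption the total ranges from 0 to 42.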
Robotic suturing on the FLS model possesses construct validity, is less physically demanding, and is favored by more surgeons compared with laparoscopy.

PubMed

Stefanidis, Dimitrios; Hope, William W; Scott, Daniel J

2011-07-01

The value of robotic assistance for intracorporeal suturing is not well defined. We compared robotic suturing with laparoscopic suturing on the FLS model with a large cohort of surgeons. Attendees (n=117) at the SAGES 2006 Learning Center robotic station placed intracorporeal sutures on the FLS box-trainer model using conventional laparoscopic instruments and the da Vinci® robot. Participant performance was recorded using a validated objective scoring system, and a questionnaire regarding demographics, task workload, and suturing modality preference was completed. Construct validity for both tasks was assessed by comparing the performance scores of subjects with various levels of experience. A validated questionnaire was used for workload measurement. Of the participants, 84% had prior laparoscopic and 10% prior robotic suturing experience. Within the allotted time, 83% of participants completed the suturing task laparoscopically and 72% with the robot. Construct validity was demonstrated for both simulated tasks according to the participants' advanced laparoscopic experience, laparoscopic suturing experience, and self-reported laparoscopic suturing ability (p<0.001 for all) and according to prior robotic experience, robotic suturing experience, and self-reported robotic suturing ability (p<0.001 for all), respectively. While participants achieved higher suturing scores with standard laparoscopy compared with the robot (84±75 vs. 56±63, respectively; p<0.001), they found the laparoscopic task more physically demanding (NASA score 13±5 vs. 10±5, respectively; p<0.001) and favored the robot as their method of choice for intracorporeal suturing (62 vs. 38%, respectively; p<0.01). Construct validity was demonstrated for robotic suturing on the FLS model. Suturing scores were higher using standard laparoscopy likely as a result of the participants' greater experience with laparoscopic suturing versus robotic suturing. 
Robotic assistance decreases the physical demand of intracorporeal suturing compared with conventional laparoscopy and, in this study, was the preferred suturing method by most surgeons. Curricula for robotic suturing training need to be developed.
A New Clinicobiological Scoring System for the Prediction of Infection-Related Mortality and Survival after Allogeneic Hematopoietic Stem Cell Transplantation.

PubMed

Forcina, Alessandra; Rancoita, Paola M V; Marcatti, Magda; Greco, Raffaella; Lupo-Stanghellini, Maria Teresa; Carrabba, Matteo; Marasco, Vincenzo; Di Serio, Clelia; Bernardi, Massimo; Peccatori, Jacopo; Corti, Consuelo; Bondanza, Attilio; Ciceri, Fabio

2017-12-01

Infection-related mortality (IRM) is a substantial component of nonrelapse mortality (NRM) after allogeneic hematopoietic stem cell transplantation (allo-HSCT). No scores have been developed to predict IRM before transplantation. Pretransplantation clinical and biochemical data were collected from a study cohort of 607 adult patients undergoing allo-HSCT between January 2009 and February 2017. In a training set of 273 patients, multivariate analysis revealed that age >60 years (P = .003), cytomegalovirus host/donor serostatus different from negative/negative (P < .001), pretransplantation IgA level <1.11 g/L (P = .004), and pretransplantation IgM level <.305 g/L (P = .028) were independent predictors of increased IRM. Based on these results, we developed and subsequently validated a 3-tiered weighted prognostic index for IRM in a retrospective set of patients (n = 219) and a prospective set of patients (n = 115). Patients were assigned to 3 different IRM risk classes based on this index score. The score significantly predicted IRM in the training set, retrospective validation set, and prospective validation set (P < .001, .044, and .011, respectively). In the training set, 100-day IRM was 5% for the low-risk group, 11% for the intermediate-risk group, and 16% for the high-risk group. In the retrospective validation set, the respective 100-day IRM values were 7%, 17%, and 28%, and in the prospective set, they were 0%, 5%, and 7%. This score also predicted overall survival (P < .001 in the training set, P < .041 in the retrospective validation set, and P < .023 in the prospective validation set). Because pretransplantation levels of IgA/IgM can be modulated by the supplementation of enriched immunoglobulins, these results suggest the possibility of prophylactic interventional studies to improve transplantation outcomes. Copyright © 2017 The American Society for Blood and Marrow Transplantation. Published by Elsevier Inc. All rights reserved.
Beyond associations: Do implicit beliefs play a role in smoking addiction?

PubMed

Tibboel, Helen; De Houwer, Jan; Dirix, Nicolas; Spruyt, Adriaan

2017-01-01

Influential dual-system models of addiction suggest that an automatic system that is associative and habitual promotes drug use, whereas a controlled system that is propositional and rational inhibits drug use. It is assumed that effects on the Implicit Association Test (IAT) reflect the automatic processes that guide drug seeking. However, results have been inconsistent, challenging: (1) the validity of addiction IATs; and (2) the assumption that the automatic system contains only simple associative information. We aimed to further test the validity of IATs that are used within this field of research using an experimental design. Second, we introduced a new procedure aimed at examining the automatic activation of complex propositional knowledge, the Relational Responding Task (RRT) and examine the validity of RRT effects in the context of smoking. In two experiments, smokers performed two different tasks: an approach/avoid IAT and a liking IAT in Experiment 1, and a smoking urges RRT and a valence IAT in Experiment 2. Smokers were tested once immediately after smoking and once after 10 hours of nicotine-deprivation. None of the IAT scores were affected by the deprivation manipulation. RRT scores revealed a stronger implicit desire for smoking in the deprivation condition compared to the satiation condition. IATs that are currently used to assess automatic processes in addiction have serious drawbacks. Furthermore, the automatic system may contain not only associations but complex drug-related beliefs as well. The RRT may be a useful and valid tool to examine these beliefs.
The Basilar Artery on Computed Tomography Angiography Score for Acute Basilar Artery Occlusion Treated with Mechanical Thrombectomy.

PubMed

Yang, Haihua; Ma, Ning; Liu, Lian; Gao, Feng; Mo, Dapeng; Miao, Zhongrong

2018-06-01

The Basilar Artery on Computed Tomography Angiography (BATMAN) score has recently been reported to predict the clinical outcome of acute basilar artery occlusion (BAO), yet there has been no extensive external validation. The purpose of this study was to validate the prognostic value of the BATMAN scoring system for the prediction of clinical outcome in patients with acute BAO treated with endovascular mechanical thrombectomy by using cerebral digital subtraction angiography (DSA). We analyzed the clinical and angiographic data of consecutive patients with acute BAO from March 2012 to November 2016. The BATMAN scoring system was used to assess the collateral status and thrombus burden. Thrombolysis in Cerebral Infarction (TICI) score 2b-3 was defined as successful recanalization. Receiver operating characteristic (ROC) curve analysis was used to determine the area under the curve (AUC) and the optimum cutoff value. Multivariate regression analysis was used to identify predictors of clinical outcome. This study included 63 patients with acute BAO who underwent mechanical thrombectomy. Of these patients, 90.5% (57/63) achieved successful recanalization (TICI, 2b-3) and 34.9% (22/63) had a favorable outcome (modified Rankin Scale score 0-2). ROC analysis indicated that the AUC of the BATMAN score was .722 (95% confidence interval [CI], .594-.827), and the optimal cutoff value was 3 (sensitivity = 72.73%, specificity = 63.41%). In multivariate logistic regression analysis, a BATMAN score higher than 3 was associated with favorable outcome (odds ratio, 5.214; 95% CI, 1.47-18.483; P = .011). The BATMAN score on DSA seems to predict the functional outcome in patients with acute BAO treated with mechanical thrombectomy. Copyright © 2018 National Stroke Association. Published by Elsevier Inc. All rights reserved.
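The sensitivity and specificity quoted at the cutoff follow from a 2×2 table of score category against outcome. The Python sketch below shows the arithmetic. The counts used (16 of 22 favorable outcomes above the cutoff, 26 of 41 unfavorable outcomes at or below it) are back-calculated assumptions that happen to reproduce the reported 72.73 and 63.41; the abstract does not give them. Youden's J is included as one common criterion for choosing a cutoff; the abstract does not say which criterion was used.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of favorable outcomes correctly flagged by the cutoff."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of unfavorable outcomes correctly left unflagged."""
    return true_neg / (true_neg + false_pos)


def youden_j(sens, spec):
    """Youden's J statistic, one common way to rank candidate cutoffs."""
    return sens + spec - 1


# Assumed counts, back-calculated from the reported percentages (not study data).
sens = sensitivity(true_pos=16, false_neg=6)    # 22 favorable outcomes
spec = specificity(true_neg=26, false_pos=15)   # 41 unfavorable outcomes
j = youden_j(sens, spec)
```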
Upper gastrointestinal bleeding risk scores: Who, when and why?

PubMed Central

Monteiro, Sara; Gonçalves, Tiago Cúrdia; Magalhães, Joana; Cotter, José

2016-01-01

Upper gastrointestinal bleeding (UGIB) remains a significant cause of hospital admission. In order to stratify patients according to the risk of complications, such as rebleeding or death, and to predict the need for clinical intervention, several risk scores have been proposed and their use consistently recommended by international guidelines. The use of risk scoring systems in early assessment of patients suffering from UGIB may be useful to distinguish high-risk patients, who may need clinical intervention and hospitalization, from low-risk patients with a lower chance of developing complications, for whom outpatient management can be considered. Although several scores have been published and validated for predicting different outcomes, the most frequently cited ones are the Rockall score and the Glasgow Blatchford score (GBS). While the Rockall score, which incorporates clinical and endoscopic variables, has been validated to predict mortality, the GBS, which is based on clinical and laboratory parameters, has been studied to predict the need for clinical intervention. Despite the advantages previously reported, their use in clinical decisions is still limited. This review describes the different risk scores used in the UGIB setting, highlights the most important research, explains why and when their use may be helpful, reflects on the problems that remain unresolved and guides future research with practical impact. PMID:26909231
A novel scoring system based on common laboratory tests predicts the efficacy of TNF-inhibitor and IL-6 targeted therapy in patients with rheumatoid arthritis: a retrospective, multicenter observational study.

PubMed

Nakagawa, Jin; Koyama, Yoshinobu; Kawakami, Atsushi; Ueki, Yukitaka; Tsukamoto, Hiroshi; Horiuchi, Takahiko; Nagano, Shuji; Uchino, Ayumi; Ota, Toshiyuki; Akahoshi, Mitsuteru; Akashi, Koichi

2017-08-11

Currently, although several categories of biological disease-modifying antirheumatic drugs (bDMARDs) are available, there are few data informing selection of initial treatment for individual patients with rheumatoid arthritis (RA). Therefore, tumor necrosis factor inhibitor (TNF-i) and tocilizumab (TCZ) are treated as equivalent treatments in the recent disease management recommendations. We focused on two anticytokine therapies, TCZ and TNF-i, and aimed to develop a scoring system that predicts a better treatment for each RA patient before starting an IL-6 or a TNF-i. The expression of IL-6 and TNF-α mRNA in peripheral blood from 45 newly diagnosed RA patients was measured by DNA microarrays to evaluate cytokine activation. Next, laboratory indices immediately before commencing treatment and disease activity score improvement ratio after 6 months in 98 patients treated with TCZ or TNF-i were retrospectively analyzed. Some indices correlated with TCZ efficacy were selected and their cutoff values were defined by receiver operating characteristic (ROC) analysis to develop a scoring system to discriminate between individuals more likely to respond to TCZ or TNF-i. The validity of the scoring system was verified in these 98 patients and an additional 228 patients. There was significant inverse correlation between the expression of IL-6 and TNF-α mRNA in newly diagnosed RA patients. The analysis of 98 patients revealed significant correlation between TCZ efficacy and platelet counts, hemoglobin, aspartate aminotransferase, and alanine aminotransferase; in contrast, there was no similar correlation in the TNF-i group. The cutoff values were defined by ROC analysis to develop a scoring system (1 point/item, maximum of 4 points). A good TCZ response was predicted if the score was ≥2; in contrast, TNF-i seemed to be preferable if the score was ≤1. Similar results were obtained in a validation study of an additional 228 patients. 
If the case scored ≥3, the good responder rates of TCZ/TNF-i were 75.0%/37.9% (p < 0.01) and the non-responder rates were 3.1%/27.6% (p < 0.01), respectively. The score is easily calculated from common laboratory results. It appears useful for identifying a better treatment at the time of selecting either an IL-6 or a TNF inhibitor.
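The decision rule described in this abstract (1 point per item, maximum of 4 points, TCZ favored at ≥2 and TNF-i at ≤1) can be sketched as follows. The abstract does not report the cutoff values or their direction, so this Python sketch takes each item as a boolean flag meaning "this laboratory index met its study-defined cutoff"; all names are hypothetical.

```python
def ra_treatment_score(platelets_met, hemoglobin_met, ast_met, alt_met):
    """Count how many of the four laboratory items met their cutoff (0 to 4).

    The cutoffs themselves are not given in the abstract, so callers must
    supply the already-evaluated flags.
    """
    return sum(bool(flag) for flag in (platelets_met, hemoglobin_met, ast_met, alt_met))


def suggested_agent(score):
    """Apply the thresholds quoted in the abstract."""
    if not 0 <= score <= 4:
        raise ValueError("score must be between 0 and 4")
    return "TCZ" if score >= 2 else "TNF-i"
```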
Arabic translation, cultural adaptation, and validation study of Knee Outcome Survey: Activities of Daily Living Scale (KOS-ADLS).

PubMed

Algarni, Abdulrahman D; Alrabai, Hamza M; Al-Ahaideb, Abdulaziz; Kachanathu, Shaji John; AlShammari, Sulaiman A

2017-09-01

Knee complaints and their accompanying functional impairments are frequent problems encountered by healthcare practitioners worldwide. Many functional scoring systems have been developed and validated to give a relative estimate of knee function. Despite the wide geographic distribution of the Arabic language in the Middle East and North Africa, it is rare to find a validated knee function scale in Arabic. The present study aimed to translate, validate, and culturally adapt the Knee Outcome Survey: Activities of Daily Living Scale (KOS-ADLS) into Arabic for future use among Arabic-speaking patients. Permission for translation was obtained from the copyright holder. Two different teams with high-level clinical and linguistic expertise conducted the translation process blindly. A forward-backward translation technique was implemented to ensure preservation of the main conceptual content. The main study consisted of 280 subjects. Reliability was examined by a test-retest pilot study. Visual Analogue Scale (VAS), Get Up and Go (GUG) Test, Ascending/Descending Stairs (A/D Stairs), and Subjective Assessment of Function (SAF) were conducted concurrently to show the validity of the Arabic KOS-ADLS statistically in relation to these scales. The final translated version showed no significant discrepancies. Minor adaptive adjustment was required to fit the Arabian cultural background. Internal consistency was favourable (Cronbach's alpha 0.90). Patients' scoring on the Arabic KOS-ADLS appeared relatively consistent with their scoring on VAS, GUG, A/D Stairs, and SAF. A significant linear relationship was demonstrated between SAF and total KOS-ADLS scores on regression analysis (adjusted R² = 0.548). The Arabic KOS-ADLS, like its English counterpart, was found to be a simple, valid, and useful instrument for knee function evaluation. The Arabic version of the KOS-ADLS represents a promising candidate for unconditional use among Arabic-speaking patients with knee complaints.
Validation of the German version of the Kujala score in patients with patellofemoral instability: a prospective multi-centre study.

PubMed

Dammerer, D; Liebensteiner, M C; Kujala, U M; Emmanuel, K; Kopf, S; Dirisamer, F; Giesinger, J M

2018-04-01

The Kujala score is the most frequently used questionnaire for patellofemoral disorders like pain, instability or osteoarthritis. Unfortunately, we are not aware of a validated German version of the Kujala score. The aim of our study was the translation and linguistic validation of the Kujala score in German-speaking patients with patella instability and the assessment of its measurement characteristics. The German Kujala score was developed in several steps of translation. In addition to healthy controls, the Kujala German was assessed in consecutive patients undergoing reconstruction of the medial patellofemoral ligament for recurrent patellar dislocations. Pre-op, 6 and 12 months postop the patients completed the Kujala German score, the KOOS, the Lysholm score, a VAS Pain, and the SF-12v2 scores. In addition, there was a Kujala German Score retest preop after a 1-week interval. We found high reliability in terms of internal consistency for the Kujala score (Cronbach's alpha = 0.87). Convergent validity with the KOOS (symptom r = 0.65, pain r = 0.78, ADL r = 0.74, sports/recreation r = 0.84, quality of life r = 0.70), the Lysholm score (r = 0.88) and the SF-12 physical component summary score (r = 0.79) and VAS pain (r = - 0.71) was also very high. Discriminant validity in terms of correlation with the SF-12 mental component summary Score was satisfactory (r = 0.14). In conclusion, the German version of the Kujala score proved to be a reliable and valid instrument in the setting of a typical patellofemoral disease treated with a standard patellofemoral procedure.
Derivation and validation of the prediabetes self-assessment screening score after acute pancreatitis (PERSEUS).

PubMed

Soo, Danielle H E; Pendharkar, Sayali A; Jivanji, Chirag J; Gillies, Nicola A; Windsor, John A; Petrov, Maxim S

2017-10-01

Approximately 40% of patients develop abnormal glucose metabolism after a single episode of acute pancreatitis. This study aimed to develop and validate a prediabetes self-assessment screening score for patients after acute pancreatitis. Data from non-overlapping training (n=82) and validation (n=80) cohorts were analysed. Univariate logistic and linear regression identified variables associated with prediabetes after acute pancreatitis. Multivariate logistic regression developed the score, ranging from 0 to 215. The area under the receiver-operating characteristic curve (AUROC), Hosmer-Lemeshow χ² statistic, and calibration plots were used to assess model discrimination and calibration. The developed score was validated using data from the validation cohort. The score had an AUROC of 0.88 (95% CI, 0.80-0.97) and Hosmer-Lemeshow χ² statistic of 5.75 (p=0.676). Patients with a score of ≥75 had a 94.1% probability of having prediabetes, and were 29 times more likely to have prediabetes than those with a score of <75. The AUROC in the validation cohort was 0.81 (95% CI, 0.70-0.92) and the Hosmer-Lemeshow χ² statistic was 5.50 (p=0.599). The score showed good calibration in both cohorts. The developed and validated score, called PERSEUS, is the first instrument to identify individuals who are at high risk of developing abnormal glucose metabolism following an episode of acute pancreatitis. Copyright © 2017 Editrice Gastroenterologica Italiana S.r.l. Published by Elsevier Ltd. All rights reserved.
A predictive scoring instrument for tuberculosis lost to follow-up outcome

PubMed Central

2012-01-01

Background Adherence to tuberculosis (TB) treatment is troublesome because of the long duration of therapy, a quick therapeutic response that allows patients to disregard the rest of their treatment, and a lack of patient motivation to improve. The objective of this study was to develop and validate a scoring system to predict the probability of a lost to follow-up outcome in TB patients as a way to identify patients suitable for directly observed treatment (DOT) and other interventions to improve adherence. Methods Two prospective cohorts were used to develop and validate a logistic regression model. A scoring system was constructed, based on the coefficients of factors associated with a lost to follow-up outcome. The probability of a lost to follow-up outcome associated with each score was calculated. Predictions in both cohorts were tested using receiver operating characteristic (ROC) curves. Results The best model to predict a lost to follow-up outcome included the following characteristics: immigration (1 point), living alone (1 point) or in an institution (2 points), previous anti-TB treatment (2 points), poor patient understanding (2 points), intravenous drug use (IDU) (4 points) or unknown IDU status (1 point). Scores of 0, 1, 2, 3, 4 and 5 points were associated with a lost to follow-up probability of 2.2%, 5.4%, 9.9%, 16.4%, 15%, and 28%, respectively. The ROC curve for the validation group demonstrated a good fit (AUC: 0.67 [95% CI, 0.65-0.70]). Conclusion This model has a good capacity to predict a lost to follow-up outcome. Its use could help TB programs to determine which patients are good candidates for DOT and other strategies to improve TB treatment adherence. PMID:22938040
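The additive point scheme in this abstract can be written out as a small lookup. The Python sketch below is illustrative: the function names and category labels ("alone", "institution", "yes", "unknown", and so on) are hypothetical, while the point values and the score-to-probability pairs for scores 0 to 5 are those quoted in the abstract. The abstract gives no probability for scores above 5, so the lookup returns None there.

```python
LIVING_POINTS = {"other": 0, "alone": 1, "institution": 2}
IDU_POINTS = {"no": 0, "unknown": 1, "yes": 4}

# Lost to follow-up probability (%) by score, as quoted in the abstract.
LTFU_PROBABILITY_PCT = {0: 2.2, 1: 5.4, 2: 9.9, 3: 16.4, 4: 15.0, 5: 28.0}


def ltfu_score(immigrant, living, previous_treatment, poor_understanding, idu):
    """Sum the points for each characteristic reported in the abstract."""
    score = 1 if immigrant else 0
    score += LIVING_POINTS[living]
    score += 2 if previous_treatment else 0
    score += 2 if poor_understanding else 0
    score += IDU_POINTS[idu]
    return score


def ltfu_probability_pct(score):
    """Look up the reported probability; None where the abstract gives none."""
    return LTFU_PROBABILITY_PCT.get(score)
```

For instance, a hypothetical immigrant patient living alone with no other risk factors scores 2 points, which the abstract associates with a 9.9% probability.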
Validation of a single summary score for the Prolapse/Incontinence Sexual Questionnaire-IUGA revised (PISQ-IR).

PubMed

Constantine, Melissa L; Pauls, Rachel N; Rogers, Rebecca R; Rockwood, Todd H

2017-12-01

The Prolapse/Incontinence Sexual Questionnaire-International Urogynecology Association (IUGA) Revised (PISQ-IR) measures sexual function in women with pelvic floor disorders (PFDs) yet is unwieldy, with six individual subscale scores for sexually active women and four for women who are not. We hypothesized that a valid and responsive summary score could be created for the PISQ-IR. Item response data from participating women who completed a revised version of the PISQ-IR at three clinical sites were used to generate item weights using magnitude estimation (ME) and Q-sort (Q) approaches. Item weights were applied to data from the original PISQ-IR validation to generate summary scores. Correlation and factor analysis methods were used to evaluate validity and responsiveness of summary scores. Weighted and nonweighted summary scores for the sexually active PISQ-IR demonstrated good criterion validity with condition-specific measures: Incontinence Severity Index = 0.12, 0.11, 0.11; Pelvic Floor Distress Inventory-20 = 0.39, 0.39, 0.12; Epidemiology of Prolapse and Incontinence Questionnaire-Q35 = 0.26, 0.25, 0.40; Female Sexual Functioning Index subscale total score = 0.72, 0.75, 0.72 for nonweighted, ME, and Q summary scores, respectively. Responsiveness evaluation showed weighted and nonweighted summary scores detected moderate effect sizes (Cohen's d > 0.5). Weighted items for women who were not sexually active (NSA) demonstrated significant floor effects and did not meet criterion validity. A PISQ-IR summary score for use with sexually active women, nonweighted or calculated with ME or Q item weights, is a valid and reliable measure for clinical use. The summary scores provide value for assessing clinical treatment of pelvic floor disorders.
Toward the Reliable Diagnosis of DSM-5 Premenstrual Dysphoric Disorder: The Carolina Premenstrual Assessment Scoring System (C-PASS)

PubMed Central

Eisenlohr-Moul, Tory A.; Girdler, Susan S.; Schmalenberger, Katja M.; Dawson, Danyelle N.; Surana, Pallavi; Johnson, Jacqueline L.; Rubinow, David R.

2016-01-01

Objective Despite evidence for the validity of premenstrual dysphoric disorder (PMDD) and its recent inclusion in DSM-5, variable diagnostic practices compromise the construct validity of the diagnosis and threaten the clarity of efforts to understand and treat its underlying pathophysiology. In an effort to hasten and streamline the translation of the new DSM-5 criteria for PMDD into terms compatible with existing research practices, we present the development and initial validation of the Carolina Premenstrual Assessment Scoring System (C-PASS). The C-PASS is a standardized scoring system for making DSM-5 PMDD diagnoses using 2 or more menstrual cycles of daily symptom ratings using the Daily Record of Severity of Problems (DRSP). Method Two hundred women recruited for retrospectively-reported premenstrual emotional symptoms provided 2–4 menstrual cycles of daily symptom ratings on the DRSP. Diagnoses were made by expert clinician and the C-PASS. Results Agreement of C-PASS diagnosis with expert clinical diagnosis was excellent; overall correct classification by the C-PASS was estimated at 98%. Consistent with previous evidence, retrospective reports of premenstrual symptom increases were a poor predictor of prospective C-PASS diagnosis. Conclusions The C-PASS (available as a worksheet, Excel macro, and SAS macro) is a reliable and valid companion protocol to the DRSP that standardizes and streamlines the complex, multilevel diagnosis of DSM-5 PMDD. Consistent use of this robust diagnostic method would result in more clearly-defined, homogeneous samples of women with PMDD, thereby improving the clarity of studies seeking to characterize or treat the underlying pathophysiology of the disorder. PMID:27523500
Are United States Medical Licensing Exam Step 1 and 2 scores valid measures for postgraduate medical residency selection decisions?

PubMed

McGaghie, William C; Cohen, Elaine R; Wayne, Diane B

2011-01-01

United States Medical Licensing Examination (USMLE) scores are frequently used by residency program directors when evaluating applicants. The objectives of this report are to study the chain of reasoning and evidence that underlies the use of USMLE Step 1 and 2 scores for postgraduate medical resident selection decisions and to evaluate the validity argument about the utility of USMLE scores for this purpose. This is a research synthesis using the critical review approach. The study first describes the chain of reasoning that underlies a validity argument about using test scores for a specific purpose. It continues by summarizing correlations of USMLE Step 1 and 2 scores and reliable measures of clinical skill acquisition drawn from nine studies involving 393 medical learners from 2005 to 2010. The integrity of the validity argument about using USMLE Step 1 and 2 scores for postgraduate residency selection decisions is tested. The research synthesis shows that USMLE Step 1 and 2 scores are not correlated with reliable measures of medical students', residents', and fellows' clinical skill acquisition. The validity argument about using USMLE Step 1 and 2 scores for postgraduate residency selection decisions is neither structured, coherent, nor evidence based. The USMLE score validity argument breaks down on grounds of extrapolation and decision/interpretation because the scores are not associated with measures of clinical skill acquisition among advanced medical students, residents, and subspecialty fellows. Continued use of USMLE Step 1 and 2 scores for postgraduate medical residency selection decisions is discouraged.
A Rasch-Based Validation of the Hooper Visual Organization Test in Chinese-Speaking Children

ERIC Educational Resources Information Center

Wuang, Yee-Pay; Wang, Li-Chen; Su, Chwen-Yng

2010-01-01

The aim of this study was to examine the validation of the Hooper Visual Organization Test (HVOT) for use in children by testing for item fit, unidimensionality, item hierarchy, reliability, and screening capacity. A modified scoring system was devised for the HVOT so that children received some credit for being able to describe the function of…

Novel risk score of contrast-induced nephropathy after percutaneous coronary intervention.

PubMed

Ji, Ling; Su, XiaoFeng; Qin, Wei; Mi, XuHua; Liu, Fei; Tang, XiaoHong; Li, Zi; Yang, LiChuan

2015-08-01

Contrast-induced nephropathy (CIN) post-percutaneous coronary intervention (PCI) is a major cause of acute kidney injury. In this study, we established a comprehensive risk score model to assess risk of CIN after PCI procedure, which could be easily used in a clinical environment. A total of 805 PCI patients, divided into analysis cohort (70%) and validation cohort (30%), were enrolled retrospectively in this study. Risk factors for CIN were identified using univariate analysis and multivariate logistic regression in the analysis cohort. Risk score model was developed based on multiple regression coefficients. Sensitivity and specificity of the new risk score system was validated in the validation cohort. Comparisons between the new risk score model and previous reported models were applied. The incidence of post-PCI CIN in the analysis cohort (n = 565) was 12%. Considerably high CIN incidence (50%) was observed in patients with chronic kidney disease (CKD). Age >75, body mass index (BMI) >25, myoglobin level, cardiac function level, hypoalbuminaemia, history of chronic kidney disease (CKD), Intra-aortic balloon pump (IABP) and peripheral vascular disease (PVD) were identified as independent risk factors of post-PCI CIN. A novel risk score model was established using multivariate regression coefficients, which showed highest sensitivity and specificity (0.917, 95%CI 0.877-0.957) compared with previous models. A new post-PCI CIN risk score model was developed based on a retrospective study of 805 patients. Application of this model might be helpful to predict CIN in patients undergoing PCI procedure. © 2015 Asian Pacific Society of Nephrology.
A PROMIS Measure of Neuropathic Pain Quality

PubMed Central

Askew, Robert L.; Cook, Karon F.; Keefe, Francis J.; Nowinski, Cindy J; Cella, David; Revicki, Dennis A.; DeWitt, Esi M. Morgan; Michaud, Kaleb; Trence, Dace L.; Amtmann, Dagmar

2016-01-01

Objectives Neuropathic pain is a consequence of many chronic conditions. This study aimed to develop a unidimensional neuropathic pain scale whose scores represent levels of neuropathic pain and distinguish between individuals with neuropathic and non-neuropathic pain conditions. Methods A candidate item pool of 42 pain quality descriptors was administered to participants with osteoarthritis, rheumatoid arthritis, diabetic neuropathy, and cancer chemotherapy-induced peripheral neuropathy. A subset of pain quality descriptors (items) that best distinguished between participants with and those without neuropathic pain conditions were identified. Dimensionality of pain descriptors was evaluated in a development sample and cross-validated in a hold-out sample. Item responses were calibrated using an item response theory model, and scores were generated on a T-score metric. Neuropathic pain scale scores were evaluated in terms of reliability, validity, and the ability to distinguish between participants with and without conditions typically associated with neuropathic pain. Results Of the 42 initial items, 5 were identified for the Patient Reported Outcome Measurement Information System (PROMIS) Neuropathic Pain Quality scale (PROMIS-PQ-Neuro). The IRT-generated T-scores exhibited good discriminatory ability based on receiver operator characteristic analysis. Score thresholds were identified that optimize sensitivity and specificity. Construct, criterion, and discriminant validity, and reliability of scale scores were supported. Conclusions The 5-item PROMIS PQ-Neuro is a short and practical measure that can be used to identify patients more likely to have neuropathic pain and to distinguish levels of neuropathic pain. The data collected will support future research that targets other unidimensional pain quality domains (e.g., nociceptive pain). PMID:27565279
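The "T-score metric" mentioned above conventionally rescales a standardized score so that the reference population has a mean of 50 and a standard deviation of 10. The Python sketch below shows that linear transformation, assuming the item response theory estimate (here called theta, a hypothetical name) is already standardized to mean 0 and standard deviation 1 in the reference population; the abstract does not describe the calibration details.

```python
def theta_to_t_score(theta):
    """Convert a standardized IRT estimate to the T-score metric (mean 50, SD 10)."""
    return 50.0 + 10.0 * theta


def t_score_to_theta(t_score):
    """Inverse transformation, back to the standardized metric."""
    return (t_score - 50.0) / 10.0
```

On this metric a T-score of 60 sits one standard deviation above the reference mean.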
CRISP: Catheterization RISk score for Pediatrics: A Report from the Congenital Cardiac Interventional Study Consortium (CCISC).

PubMed

Nykanen, David G; Forbes, Thomas J; Du, Wei; Divekar, Abhay A; Reeves, Jaxk H; Hagler, Donald J; Fagan, Thomas E; Pedra, Carlos A C; Fleming, Gregory A; Khan, Danyal M; Javois, Alexander J; Gruenstein, Daniel H; Qureshi, Shakeel A; Moore, Phillip M; Wax, David H

2016-02-01

We sought to develop a scoring system that predicts the risk of serious adverse events (SAEs) for individual pediatric patients undergoing cardiac catheterization procedures. Systematic assessment of risk of SAE in pediatric catheterization can be challenging in view of a wide variation in procedure and patient complexity as well as rapidly evolving technology. A 10-component scoring system was originally developed based on expert consensus and review of the existing literature. Data from an international multi-institutional catheterization registry (CCISC) between 2008 and 2013 were used to validate this scoring system. In addition, we used multivariate methods to further refine the original risk score to improve its predictive power for SAEs. Univariate analysis confirmed the strong correlation of each of the 10 components of the original risk score with SAE attributed to a pediatric cardiac catheterization (P < 0.001 for all variables). Multivariate analysis resulted in a modified risk score (CRISP) that corresponds to an increase in value of area under a receiver operating characteristic curve (AUC) from 0.715 to 0.741. The CRISP score predicts risk of occurrence of an SAE for individual patients undergoing pediatric cardiac catheterization procedures. © 2015 Wiley Periodicals, Inc.
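The AUC values quoted above (0.715 and 0.741) summarize how well a score ranks patients who had an SAE above those who did not. As a minimal sketch, assuming only a list of scores and a matching list of event flags, the AUC can be computed as the fraction of (event, non-event) pairs in which the event patient has the higher score, with ties counted as one half. The function name and the toy data are illustrative and are not taken from the CRISP study.

```python
def auc_from_scores(scores, events):
    """Rank-based AUC: probability that a randomly chosen event case
    has a higher score than a randomly chosen non-event case.
    Ties contribute one half. Inputs are hypothetical illustration data."""
    positives = [s for s, e in zip(scores, events) if e]
    negatives = [s for s, e in zip(scores, events) if not e]
    if not positives or not negatives:
        raise ValueError("need at least one event and one non-event")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy example: higher scores mostly, but not always, go with events.
    toy_scores = [1, 2, 3, 4, 5, 6]
    toy_events = [0, 0, 1, 0, 1, 1]
    print(auc_from_scores(toy_scores, toy_events))
```

This pairwise form is quadratic in the number of patients, which is acceptable for a sketch; registry-scale data would normally use a sorted-rank implementation from a statistics library.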
Predicting mortality in sick African children: the FEAST Paediatric Emergency Triage (PET) Score.

PubMed

George, Elizabeth C; Walker, A Sarah; Kiguli, Sarah; Olupot-Olupot, Peter; Opoka, Robert O; Engoru, Charles; Akech, Samuel O; Nyeko, Richard; Mtove, George; Reyburn, Hugh; Berkley, James A; Mpoya, Ayub; Levin, Michael; Crawley, Jane; Gibb, Diana M; Maitland, Kathryn; Babiker, Abdel G

2015-07-31

Mortality in paediatric emergency care units in Africa often occurs within the first 24 h of admission and remains high. Alongside effective triage systems, a practical clinical bedside risk score to identify those at greatest risk could contribute to reducing mortality. Data collected during the Fluid As Expansive Supportive Therapy (FEAST) trial, a multi-centre trial involving 3,170 severely ill African children, were analysed to identify clinical and laboratory prognostic factors for mortality. Multivariable Cox regression was used to build a model in this derivation dataset based on clinical parameters that could be quickly and easily assessed at the bedside. A score developed from the model coefficients was externally validated in two admissions datasets from Kilifi District Hospital, Kenya, and compared to published risk scores using Area Under the Receiver Operating Curve (AUROC) and Hosmer-Lemeshow tests. The Net Reclassification Index (NRI) was used to identify additional laboratory prognostic factors. A risk score using 8 clinical variables (temperature, heart rate, capillary refill time, conscious level, severe pallor, respiratory distress, lung crepitations, and weak pulse volume) was developed. The score ranged from 0-10 and had an AUROC of 0.82 (95 % CI, 0.77-0.87) in the FEAST trial derivation set. In the independent validation datasets, the score had an AUROC of 0.77 (95 % CI, 0.72-0.82) amongst admissions to a paediatric high dependency ward and 0.86 (95 % CI, 0.82-0.89) amongst general paediatric admissions. This discriminative ability was similar to, or better than other risk scores in the validation datasets. NRI identified lactate, blood urea nitrogen, and pH to be important prognostic laboratory variables that could add information to the clinical score. 
Eight clinical prognostic factors that could be rapidly assessed by healthcare staff for triage were combined to create the FEAST Paediatric Emergency Triage (PET) score and externally validated. The score discriminated those at highest risk of fatal outcome at the point of hospital admission and compared well to other published risk scores. Further laboratory tests were also identified as prognostic factors which could be added if resources were available or as indices of severity for comparison between centres in future research studies.
Ensemble assimilation of ARGO temperature profile, sea surface temperature, and altimetric satellite data into an eddy permitting primitive equation model of the North Atlantic Ocean

NASA Astrophysics Data System (ADS)

Yan, Y.; Barth, A.; Beckers, J. M.; Candille, G.; Brankart, J. M.; Brasseur, P.

2015-07-01

Sea surface height, sea surface temperature, and temperature profiles at depth collected between January and December 2005 are assimilated into a realistic eddy permitting primitive equation model of the North Atlantic Ocean using the Ensemble Kalman Filter. Sixty ensemble members are generated by adding realistic noise to the forcing parameters related to the temperature. The ensemble is diagnosed and validated by comparison between the ensemble spread and the model/observation difference, as well as by rank histogram before the assimilation experiments. An incremental analysis update scheme is applied in order to reduce spurious oscillations due to the model state correction. The results of the assimilation are assessed according to both deterministic and probabilistic metrics with independent/semiindependent observations. For deterministic validation, the ensemble means, together with the ensemble spreads are compared to the observations, in order to diagnose the ensemble distribution properties in a deterministic way. For probabilistic validation, the continuous ranked probability score (CRPS) is used to evaluate the ensemble forecast system according to reliability and resolution. The reliability is further decomposed into bias and dispersion by the reduced centered random variable (RCRV) score in order to investigate the reliability properties of the ensemble forecast system. The improvement of the assimilation is demonstrated using these validation metrics. Finally, the deterministic validation and the probabilistic validation are analyzed jointly. The consistency and complementarity between both validations are highlighted.
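For a single scalar observation, the CRPS of a finite ensemble is commonly estimated as the mean absolute error of the members minus half the mean absolute difference between member pairs. The sketch below assumes that estimator; the abstract does not state which CRPS estimator or which reliability/resolution decomposition the authors used, and the function name and sample values are hypothetical.

```python
from itertools import product


def ensemble_crps(members, observation):
    """Empirical CRPS of an ensemble for one scalar observation.

    Assumed estimator: mean|x_i - y| - 0.5 * mean|x_i - x_j|,
    where the second mean runs over all ordered member pairs.
    Lower values indicate a better probabilistic forecast.
    """
    n = len(members)
    if n == 0:
        raise ValueError("ensemble must have at least one member")
    accuracy = sum(abs(x - observation) for x in members) / n
    spread = sum(abs(a - b) for a, b in product(members, repeat=2)) / (n * n)
    return accuracy - 0.5 * spread


if __name__ == "__main__":
    # Hypothetical sea surface temperature ensemble (degrees C) and observation.
    ensemble = [14.8, 15.1, 15.3, 15.0, 14.9]
    print(ensemble_crps(ensemble, 15.2))
```

With a single member the spread term vanishes and the score reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of that deterministic metric.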
Scoring system to guide decision making for the use of gentamicin-impregnated collagen sponge to prevent deep sternal wound infection.

PubMed

Benedetto, Umberto; Raja, Shahzad G

2014-11-01

The effectiveness of the routine retrosternal placement of a gentamicin-impregnated collagen sponge (GICS) implant before sternotomy closure is currently a matter of some controversy. We aimed to develop a scoring system to guide decision making for the use of GICS to prevent deep sternal wound infection. Fast backward elimination on predictors, including GICS, was performed using the Lawless and Singhal method. The scoring system was reported as a partial nomogram that can be used to manually obtain predicted individual risk of deep sternal wound infection from the regression model. Bootstrapping validation of the regression models was performed. The final populations consisted of 8750 adult patients undergoing cardiac surgery through full sternotomy during the study period. A total of 329 patients (3.8%) received GICS implant. The overall incidence of deep sternal wound infection was lower among patients who received GICS implant (0.6%) than patients who did not (2.01%) (P=.02). A nomogram to predict the individual risk for deep sternal wound infection was developed that included the use of GICS. Bootstrapping validation confirmed a good discriminative power of the models. The scoring system provides an impartial assessment of the decision-making process for clinicians to establish if GICS implant is effective in reducing the risk for deep sternal wound infection in individual patients undergoing cardiac surgery through full sternotomy. Copyright © 2014 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.
[Development and validation of the Visual Analogue Scale (VAS) Spine Score].

PubMed

Knop, C; Oeser, M; Bastian, L; Lange, U; Zdichavsky, M; Blauth, M

2001-06-01

The aim of the study was the development and validation of a new subjective rating scale for assessment of outcome in patients with thoracolumbar fractures and fracture dislocations. The VAS spine score consists of 19 score items, using 100-mm visual analogue scales. The items are answered by the patients independently of rater assessment. To measure the analogue scales and calculate the score, a computer-aided system was evolved consisting of self-developed software and digitizer board. The overall score is the mean of all items answered with values between 0 and 100. The individual score loss is calculated as the difference between the preinjury score and at follow-up with values between 0 and 100. The VAS spine score was tested for reliability with a group of 136 healthy volunteers. We performed a test-retest study with an interval of 24 h. For statistical analysis of the validity, we prospectively followed a group of 53 patients with the new outcome score. We chose patients with injuries of the thoracolumbar spine, all having been operatively treated by combined posterior-anterior stabilization and fusion between 1994 and 1996. In the reference group, the average test score was 91.95 (58-100) and 92.10 (58-100) at retest. The mean individual difference between test and retest scored 1.037 (0-8). A high reliability was proved by a strong correlation with a coefficient of 0.976 (p < 0.001). A high internal consistency of the VAS spine score was shown by a Cronbach-alpha of 0.9117. The mean score for the preinjury status of the patients was comparable to the reference group, amounting to 89.60 (21-100). The mean score at the time of implant removal was significantly (p < 0.001) decreased to 58.25 (13-97). Until the time of follow-up a significant (p < 0.001) increase was noted, and the group scored 66.08 (15-100) at follow-up. This was a significant (p < 0.001) difference compared with the preinjury status. The individual score loss averaged 24.1 (0-80). 
In the patient group we also noted a Cronbach-alpha > 0.95, indicating a high internal consistency. With the VAS spine score the authors have inaugurated a new tool for outcome measurement in the treatment of patients with thoracolumbar injuries. The study has proved the score to be both reliable and valid. The application of the score is helpful in analyzing the subjective outcome, and the results can be correlated with objective measures. The score is a useful tool for comparative clinical studies, addressing the outcome after different methods of treatment.
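The scoring rule described in this abstract (overall score as the mean of all answered 0 to 100 items, score loss as the difference between preinjury and follow-up scores) can be sketched as follows. This is an illustration only, not the authors' software: the function names are hypothetical, unanswered items are represented as None, and the score loss is assumed to be preinjury minus follow-up with no clamping, since the abstract does not say how a higher follow-up score is handled.

```python
def vas_spine_score(item_values_mm):
    """Mean of the answered items, each measured on a 0-100 mm scale.

    None marks an unanswered item and is skipped (assumed representation).
    """
    answered = [v for v in item_values_mm if v is not None]
    if not answered:
        raise ValueError("at least one item must be answered")
    for value in answered:
        if not 0 <= value <= 100:
            raise ValueError("item values must lie between 0 and 100")
    return sum(answered) / len(answered)


def score_loss(preinjury_score, follow_up_score):
    """Individual score loss, assumed to be preinjury minus follow-up."""
    return preinjury_score - follow_up_score


if __name__ == "__main__":
    # Hypothetical patient: three of four items answered at each time point.
    before = vas_spine_score([95, 90, None, 85])
    after = vas_spine_score([70, 60, None, 65])
    print(before, after, score_loss(before, after))
```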
Macroscopic cartilage repair scoring of defect fill, integration and total points correlate with corresponding items in histological scoring systems - a study in adult sheep.

PubMed

Goebel, L; Orth, P; Cucchiarini, M; Pape, D; Madry, H

2017-04-01

To correlate osteochondral repair assessed by validated macroscopic scoring systems with established semiquantitative histological analyses in an ovine model and to test the hypothesis that important macroscopic individual categories correlate with their corresponding histological counterparts. In the weight-bearing portion of medial femoral condyles (n = 38) of 19 female adult Merino sheep (age 2-4 years; weight 70 ± 20 kg) full-thickness chondral defects were created (size 4 × 8 mm; International Cartilage Repair Society (ICRS) grade 3C) and treated with Pridie drilling. After sacrifice, 1520 blinded macroscopic observations from three observers at 2-3 time points including five different macroscopic scoring systems demonstrating all grades of cartilage repair were correlated with corresponding categories from 418 blinded histological sections. Categories "defect fill" and "total points" of different macroscopic scoring systems correlated well with their histological counterparts from the Wakitani and Sellers scores (all P ≤ 0.001). "Integration" was assessed in both histological scoring systems and in the macroscopic ICRS, Oswestry and Jung scores. Here, a significant relationship always existed (0.020 ≤ P ≤ 0.049), except for Wakitani and Oswestry (P = 0.054). No relationship was observed for the "surface" between histology and macroscopy (all P > 0.05). Major individual morphological categories "defect fill" and "integration", and "total points" of macroscopic scoring systems correlate with their corresponding categories in elementary and complex histological scoring systems. Thus, macroscopy allows precise prediction of key histological aspects of articular cartilage repair, underlining the specific value of macroscopic scoring for examining cartilage repair. Copyright © 2016 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
Peritumoral Artery Scoring System: a Novel Scoring System to Predict Renal Function Outcome after Laparoscopic Partial Nephrectomy.

PubMed

Zhang, Ruiyun; Wu, Guangyu; Huang, Jiwei; Shi, Oumin; Kong, Wen; Chen, Yonghui; Xu, Jianrong; Xue, Wei; Zhang, Jin; Huang, Yiran

2017-06-06

The present study aimed to assess the impact of peritumoral artery characteristics on renal function outcome prediction using a novel Peritumoral Artery Scoring System based on computed tomography arteriography. Peritumoral artery characteristics and renal function were evaluated in 220 patients who underwent laparoscopic partial nephrectomy and then validated in 51 patients with split and total glomerular filtration rate (GFR). In particular, peritumoral artery classification and diameter were measured to assign arteries into low, moderate, and high Peritumoral Artery Scoring System risk categories. Univariable and multivariable logistic regression analyses were then used to determine risk factors for major renal functional decline. The Peritumoral Artery Scoring System and four other nephrometry systems were compared using receiver operating characteristic curve analysis. The Peritumoral Artery Scoring System was significantly superior to the other systems for predicting postoperative renal function decline (p < 0.001). In receiver operating characteristic analysis, our category system was a superior independent predictor of estimated glomerular filtration rate (eGFR) decline (area-under-the-curve = 0.865, p < 0.001), total GFR decline (area-under-the-curve = 0.796, p < 0.001), and split GFR decline (area-under-the-curve = 0.841, p < 0.001). Peritumoral artery characteristics were independent predictors of renal function outcome after laparoscopic partial nephrectomy.
Separating the Signal from the Noise: An Examination of Student and Teacher Scores Based on Student Learning Objectives (SLOs) in One State

ERIC Educational Resources Information Center

Buckley, Katie Hills

2015-01-01

Despite the prevalence of student learning objectives (SLOs) in teacher evaluation systems throughout the United States, research on the validity of student and teacher SLO scores used for high-stakes decisions is lacking. For this reason, this dissertation is comprised of two chapters that examine student and teacher-level SLO performance data…
The role of trauma scoring in developing trauma clinical governance in the Defence Medical Services

PubMed Central

Russell, R. J.; Hodgetts, T. J.; McLeod, J.; Starkey, K.; Mahoney, P.; Harrison, K.; Bell, E.

2011-01-01

This paper discusses mathematical models of expressing severity of injury and probability of survival following trauma and their use in establishing clinical governance of a trauma system. There are five sections: (i) Historical overview of scoring systems—anatomical, physiological and combined systems and the advantages and disadvantages of each. (ii) Definitions used in official statistics—definitions of ‘killed in action’ and other categories and the importance of casualty reporting rates and comparison across conflicts and nationalities. (iii) Current scoring systems and clinical governance—clinical governance of the trauma system in the Defence Medical Services (DMS) by using trauma scoring models to analyse injury and clinical patterns. (iv) Unexpected outcomes—unexpected outcomes focus clinical governance tools. Unexpected survivors signify good practice to be promulgated. Unexpected deaths pick up areas of weakness to be addressed. Seventy-five clinically validated unexpected survivors were identified over 2 years during contemporary combat operations. (v) Future developments—can the trauma scoring methods be improved? Trauma scoring systems use linear approaches and have significant weaknesses. Trauma and its treatment is a complex system. Nonlinear methods need to be investigated to determine whether these will produce a better approach to the analysis of the survival from major trauma. PMID:21149354
The Physician Recommendation Coding System (PhyReCS): A Reliable and Valid Method to Quantify the Strength of Physician Recommendations During Clinical Encounters

PubMed Central

Scherr, Karen A.; Fagerlin, Angela; Williamson, Lillie D.; Davis, J. Kelly; Fridman, Ilona; Atyeo, Natalie; Ubel, Peter A.

2016-01-01

Background Physicians’ recommendations affect patients’ treatment choices. However, most research relies on physicians’ or patients’ retrospective reports of recommendations, which offer a limited perspective and have limitations such as recall bias. Objective To develop a reliable and valid method to measure the strength of physician recommendations using direct observation of clinical encounters. Methods Clinical encounters (n = 257) were recorded as part of a larger study of prostate cancer decision making. We used an iterative process to create the 5-point Physician Recommendation Coding System (PhyReCS). To determine reliability, research assistants double-coded 50 transcripts. To establish content validity, we used one-way ANOVAs to determine whether relative treatment recommendation scores differed as a function of which treatment patients received. To establish concurrent validity, we examined whether patients’ perceived treatment recommendations matched our coded recommendations. Results The PhyReCS was highly reliable (Krippendorf’s alpha = .89, 95% CI [.86, .91]). The average relative treatment recommendation score for each treatment was higher for individuals who received that particular treatment. For example, the average relative surgery recommendation score was higher for individuals who received surgery versus radiation (mean difference = .98, SE = .18, p < .001) or active surveillance (mean difference = 1.10, SE = .14, p < .001). Patients’ perceived recommendations matched coded recommendations 81% of the time. Conclusion The PhyReCS is a reliable and valid way to capture the strength of physician recommendations. We believe that the PhyReCS would be helpful for other researchers who wish to study physician recommendations, an important part of patient decision making. PMID:27343015
The Physician Recommendation Coding System (PhyReCS): A Reliable and Valid Method to Quantify the Strength of Physician Recommendations During Clinical Encounters.

PubMed

Scherr, Karen A; Fagerlin, Angela; Williamson, Lillie D; Davis, J Kelly; Fridman, Ilona; Atyeo, Natalie; Ubel, Peter A

2017-01-01

Physicians' recommendations affect patients' treatment choices. However, most research relies on physicians' or patients' retrospective reports of recommendations, which offer a limited perspective and have limitations such as recall bias. To develop a reliable and valid method to measure the strength of physician recommendations using direct observation of clinical encounters. Clinical encounters (n = 257) were recorded as part of a larger study of prostate cancer decision making. We used an iterative process to create the 5-point Physician Recommendation Coding System (PhyReCS). To determine reliability, research assistants double-coded 50 transcripts. To establish content validity, we used 1-way analyses of variance to determine whether relative treatment recommendation scores differed as a function of which treatment patients received. To establish concurrent validity, we examined whether patients' perceived treatment recommendations matched our coded recommendations. The PhyReCS was highly reliable (Krippendorf's alpha = 0.89, 95% CI [0.86, 0.91]). The average relative treatment recommendation score for each treatment was higher for individuals who received that particular treatment. For example, the average relative surgery recommendation score was higher for individuals who received surgery versus radiation (mean difference = 0.98, SE = 0.18, P < 0.001) or active surveillance (mean difference = 1.10, SE = 0.14, P < 0.001). Patients' perceived recommendations matched coded recommendations 81% of the time. The PhyReCS is a reliable and valid way to capture the strength of physician recommendations. We believe that the PhyReCS would be helpful for other researchers who wish to study physician recommendations, an important part of patient decision making. © The Author(s) 2016.
Commonly used severity scores are not good predictors of mortality in sepsis from severe leptospirosis: a series of ten patients.

PubMed

Velissaris, Dimitrios; Karanikolas, Menelaos; Flaris, Nikolaos; Fligou, Fotini; Marangos, Markos; Filos, Kriton S

2012-01-01

Introduction. Severe leptospirosis, also known as Weil's disease, can cause multiorgan failure with high mortality. Scoring systems for disease severity have not been validated for leptospirosis, and there is no documented method to predict mortality. Methods. This is a case series on 10 patients admitted to ICU for multiorgan failure from severe leptospirosis. Data were collected retrospectively, with approval from the Institution Ethics Committee. Results. Ten patients with severe leptospirosis were admitted in the Patras University Hospital ICU in a four-year period. Although, based on SOFA scores, predicted mortality was over 80%, seven of 10 patients survived and were discharged from the hospital in good condition. There was no association between SAPS II or SOFA scores and mortality, but survivors had significantly lower APACHE II scores compared to nonsurvivors. Conclusion. Commonly used severity scores do not seem to be useful in predicting mortality in severe leptospirosis. Early ICU admission and resuscitation based on a goal-directed therapy protocol are recommended and may reduce mortality. However, this study is limited by retrospective data collection and small sample size. Data from large prospective studies are needed to validate our findings.
Predicting Early Death Among Elderly Dialysis Patients: Development and Validation of a Risk Score to Assist Shared Decision Making for Dialysis Initiation.

PubMed

Thamer, Mae; Kaufman, James S; Zhang, Yi; Zhang, Qian; Cotter, Dennis J; Bang, Heejung

2015-12-01

A shared decision-making tool could help elderly patients with advanced chronic kidney disease decide about initiating dialysis therapy. Because mortality may be high in the first few months after initiating dialysis therapy, incorporating early mortality predictors in such a tool would be important for an informed decision. Our objective is to derive and validate a predictive risk score for early mortality after initiating dialysis therapy. Retrospective observational cohort, with development and validation cohorts. US Renal Data System and claims data from the Centers for Medicare & Medicaid Services for 69,441 (aged ≥67 years) patients with end-stage renal disease with a previous 2-year Medicare history who initiated dialysis therapy from January 1, 2009, to December 31, 2010. Demographics, predialysis care, laboratory data, functional limitations, and medical history. All-cause mortality in the first 3 and 6 months. Predicted mortality by logistic regression. The simple risk score (total score, 0-9) included age (0-3 points), low albumin level, assistance with daily living, nursing home residence, cancer, heart failure, and hospitalization (1 point each), and showed area under the receiver operating characteristic curve (AUROC)=0.69 in the validation sample. A comprehensive risk score with additional predictors was also developed (with AUROC=0.72, high concordance between predicted vs observed risk). Mortality probabilities were estimated from these models, with the median score of 3 indicating 12% risk in 3 months and 20% in 6 months, and the highest scores (≥8) indicating 39% risk in 3 months and 55% in 6 months. Patients who did not choose dialysis therapy and did not have a 2-year Medicare history were excluded. 
Routinely available information can be used by patients with chronic kidney disease, families, and their nephrologists to estimate the risk of early mortality after dialysis therapy initiation, which may facilitate informed decision making regarding treatment options. Copyright © 2015 National Kidney Foundation, Inc. All rights reserved.
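The simple risk score described above is additive: 0 to 3 points for age plus 1 point for each of six binary factors, for a total of 0 to 9. The sketch below illustrates only that arithmetic. The abstract does not give the age cut-offs, so age points are passed in directly; the function and parameter names are hypothetical, and the risk lookup returns only the two anchor values the abstract reports (score 3 and scores of 8 or more), with None for every other score.

```python
def simple_dialysis_risk_score(age_points, low_albumin, needs_daily_living_assistance,
                               nursing_home_resident, cancer, heart_failure,
                               hospitalization):
    """Sum of age points (0-3) and six one-point binary factors (total 0-9)."""
    if age_points not in (0, 1, 2, 3):
        raise ValueError("age_points must be 0, 1, 2, or 3")
    flags = (low_albumin, needs_daily_living_assistance, nursing_home_resident,
             cancer, heart_failure, hospitalization)
    return age_points + sum(1 for flag in flags if flag)


def reported_mortality_risk(score):
    """Return (3-month, 6-month) risk only where the abstract reports it."""
    if score == 3:
        return (0.12, 0.20)   # median score, as reported
    if score >= 8:
        return (0.39, 0.55)   # highest scores, as reported
    return None               # not stated in the abstract


if __name__ == "__main__":
    # Hypothetical patient: 2 age points, heart failure, recent hospitalization.
    s = simple_dialysis_risk_score(2, False, False, False, False, True, True)
    print(s, reported_mortality_risk(s))
```

Returning None for unreported scores is deliberate: interpolating between the two published anchors would invent risk estimates that the source does not provide.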
Validity and psychometric properties of the General Health Questionnaire-12 in young Australian adolescents.

PubMed

Tait, Robert J; French, Davina J; Hulse, Gary K

2003-06-01

The General Health Questionnaire (GHQ) is a measure of current mental wellbeing that has been extensively validated with adults. The instrument has also been used with adolescents. (i) To assess the psychometric properties of the GHQ-12 among school students in grades 7-10; (ii) to validate it against other psychological tests; and (iii) to suggest a threshold score. The survey was conducted in single sex and mixed schools from the state and private system in Perth, Western Australia. The survey contained the GHQ-12 and measures of anxiety, depression, self-esteem, stress, generalized self-efficacy, social desirability and negative affectivity. There were 336 students (female 55%) with an age range of 11-15 years (median 13). The GHQ showed good internal consistency (alpha 0.88). Girls had higher mean GHQ-12 scores than boys (F(1,326) = 15.0, p < 0.001) and scores for both genders increased with school grade (F(3,326) = 4.2, p < 0.01). Multiple linear regression showed that depression, anxiety, self-esteem and stress were significant independent predictors of GHQ scores. The model accounted for 68% of the variance (adjusted R²). Screening indices were calculated by comparison with a combined depression and/or anxiety category. Threshold scores of 13/14 for males and 18/19 for females appeared optimal. General Health Questionnaire scores were compared with two criterion groups: adolescents in hospital with alcohol or drug (AOD) related problems and those with problems not related to AOD use. Only the former group had significantly higher total scores. The GHQ-12 showed good structural characteristics and was appropriately correlated with other measures of related traits. Overall, the GHQ-12 appears to be a valid index of psychological wellbeing in this population and was considerably shorter than some of the other instruments.
Validation of the CPS + EG Staging System for Disease-Specific Survival in Breast Cancer Patients Treated with Neoadjuvant Chemotherapy.

PubMed

Abdelsattar, Jad M; Al-Hilli, Zahraa; Hoskin, Tanya L; Heins, Courtney N; Boughey, Judy C

2016-10-01

CPS + EG staging, which incorporates estrogen receptor (ER) status and tumor grade with pretreatment clinical stage (CS) and post-treatment pathologic stage (PS), has been reported to have better correlation with outcome than classic TNM staging for patients treated with neoadjuvant chemotherapy (NAC). Our goal was to evaluate the performance of CPS + EG staging system in an external cohort treated with NAC. We reviewed patients with stages I-IIIC breast cancer treated with NAC and surgery at our institution between 1988 and 2014. ER status, Nottingham grade, treatment, American Joint Committee on Cancer (AJCC) CS before NAC and PS after NAC, and follow-up data were collected. The discrimination of CPS + EG and pathologic AJCC stage were assessed using area under the curve (AUC) for survival data. A total of 769 patients were analyzed with a median follow-up of 2.6 (range 0.0-19.4) years; 103 patients died of breast cancer. Overall, the 5-year breast cancer cause-specific survival was 81.5 % [95 % confidence interval (CI) 77.6-85.5]. The 5-year, cause-specific survival by CPS + EG score was 93.8 % score 0, 89.9 % score 1, 90.7 % score 2, 84.8 % score 3, 67.7 % score 4, and 43.4 % score 5/6. CPS + EG score was significantly associated with cause-specific survival (p < 0.001) with an AUC of 0.69 (95 % CI 0.62-0.77) at 5 years. This was higher than the AUC of 0.63 (95 % CI 0.56-0.70) for AJCC PS (p = 0.10). This study validates the CPS + EG staging system using Nottingham grade in an external cohort. Addition of tumor biology and treatment response shows promise in improving survival estimates for patients treated with NAC.
Computer-Assisted Automated Scoring of Polysomnograms Using the Somnolyzer System.

PubMed

Punjabi, Naresh M; Shifa, Naima; Dorffner, Georg; Patil, Susheel; Pien, Grace; Aurora, Rashmi N

2015-10-01

Manual scoring of polysomnograms is a time-consuming and tedious process. To expedite the scoring of polysomnograms, several computerized algorithms for automated scoring have been developed. The overarching goal of this study was to determine the validity of the Somnolyzer system, an automated system for scoring polysomnograms. The analysis sample comprised 97 sleep studies. Each polysomnogram was manually scored by certified technologists from four sleep laboratories and concurrently subjected to automated scoring by the Somnolyzer system. Agreement between manual and automated scoring was examined. Sleep staging and scoring of disordered breathing events were conducted using the 2007 American Academy of Sleep Medicine criteria. Clinical sleep laboratories. A high degree of agreement was noted between manual and automated scoring of the apnea-hypopnea index (AHI). The average correlation between the manually scored AHI across the four clinical sites was 0.92 (95% confidence interval: 0.90-0.93). Similarly, the average correlation between the manual and Somnolyzer-scored AHI values was 0.93 (95% confidence interval: 0.91-0.96). Thus, interscorer correlation between the manually scored results was no different than that derived from manual and automated scoring. Substantial concordance in the arousal index, total sleep time, and sleep efficiency between manual and automated scoring was also observed. In contrast, differences were noted between manually and automatically scored percentages of sleep stages N1, N2, and N3. Automated analysis of polysomnograms using the Somnolyzer system provides results that are comparable to manual scoring for commonly used metrics in sleep medicine. Although differences exist between manual and automated scoring for specific sleep stages, the level of agreement between manual and automated scoring is not significantly different than that between any two human scorers. 
In light of the burden associated with manual scoring, automated scoring platforms provide a viable complement of tools in the diagnostic armamentarium of sleep medicine. © 2015 Associated Professional Sleep Societies, LLC.
Ventilator Dependence Risk Score for the Prediction of Prolonged Mechanical Ventilation in Patients Who Survive Sepsis/Septic Shock with Respiratory Failure.

PubMed

Chang, Ya-Chun; Huang, Kuo-Tung; Chen, Yu-Mu; Wang, Chin-Chou; Wang, Yi-Hsi; Tseng, Chia-Cheng; Lin, Meng-Chih; Fang, Wen-Feng

2018-04-04

We intended to develop a scoring system to predict mechanical ventilator dependence in patients who survive sepsis/septic shock with respiratory failure. This study evaluated 251 adult patients in medical intensive care units (ICUs) between August 2013 and October 2015, who had survived for over 21 days and received aggressive treatment. The risk factors for ventilator dependence were determined. We then constructed a ventilator dependence (VD) risk score using the identified risk factors. The ventilator dependence risk score was calculated as the sum of the following four variables after being adjusted by proportion to the beta coefficient. We assigned one point for a history of previous stroke, one point for a platelet count less than 150,000/μL, two points for a pH value less than 7.35, and two points for a fraction of inspired oxygen on admission day 7 over 39%. The area under the curve in the derivation group was 0.725 (p < 0.001). We then applied the VD risk score for validation on 175 patients. The area under the curve in the validation group was 0.658 (p = 0.001). The VD risk score could be applied to predict prolonged mechanical ventilation in patients who survive sepsis/septic shock.
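The four-variable VD risk score in this abstract is a weighted sum, so it can be written out directly. The following is a minimal sketch using the point values and thresholds given above; the function and parameter names are hypothetical, and it assumes that all four inputs are measured at the time points the authors intended (the abstract explicitly ties only the fraction of inspired oxygen to admission day 7). No cut-off for "high risk" is included because the abstract does not report one.

```python
def vd_risk_score(previous_stroke, platelets_per_ul, ph, fio2_day7_percent):
    """Ventilator dependence risk score (range 0-6) from four variables.

    Points as stated in the abstract:
      previous stroke            -> 1
      platelets < 150,000 per uL -> 1
      pH < 7.35                  -> 2
      FiO2 on day 7 > 39 %       -> 2
    """
    score = 0
    if previous_stroke:
        score += 1
    if platelets_per_ul < 150_000:
        score += 1
    if ph < 7.35:
        score += 2
    if fio2_day7_percent > 39:
        score += 2
    return score


if __name__ == "__main__":
    # Hypothetical patient: prior stroke, normal platelets, acidotic, FiO2 35 %.
    print(vd_risk_score(True, 200_000, 7.20, 35))
```

The strict inequalities follow the abstract's wording ("less than", "over"), so values exactly at a threshold earn no points.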
Development and validation of an Argentine set of facial expressions of emotion.

PubMed

Vaiman, Marcelo; Wagner, Mónica Anna; Caicedo, Estefanía; Pereno, Germán Leandro

2017-02-01

Pictures of facial expressions of emotion are used in a wide range of experiments. The last decade has seen an increase in the number of studies presenting local sets of emotion stimuli. However, only a few existing sets contain pictures of Latin Americans, despite the growing attention emotion research is receiving in this region. Here we present the development and validation of the Universidad Nacional de Cordoba, Expresiones de Emociones Faciales (UNCEEF), a Facial Action Coding System (FACS)-verified set of pictures of Argentineans expressing the six basic emotions, plus neutral expressions. FACS scores, recognition rates, Hu scores, and discrimination indices are reported. Evidence of convergent validity was obtained using the Pictures of Facial Affect in an Argentine sample. However, recognition accuracy was greater for UNCEEF. The importance of local sets of emotion pictures is discussed.

Modified TIME-H: a simplified scoring system for chronic wound management.

PubMed

Lim, K; Free, B; Sinha, S

2015-09-01

Chronic wound assessment requires a systematic approach in order to guide management and improve prognostication. Following a pilot study using the original TIME-H scoring system in chronic wound management, modifications were suggested leading to the development of the Modified TIME-H scoring system. This study investigates the feasibility and reliability of chronic wound prognostication applying the Modified TIME-H score. Patients referred to the hospital's outpatient wound clinic over a 9-month period were categorised into one of three predicted outcome categories based on their Modified TIME-H score. This study shows a higher proportion of patients in the certain healing category achieved healed wounds, with a higher rate of reduction in wound size, when compared with the other categories. The three categories defined in this study are certain healing, uncertain healing and difficult healing. The Modified TIME-H score could be a useful tool for assessment, patient-centred management and prognostication of chronic wounds in clinical practice and requires further validation from other institutions. The authors have no conflict of interest to declare.
Cross-cultural adaptation and validation of the VISA-A questionnaire for German-speaking achilles tendinopathy patients.

PubMed

Lohrer, Heinz; Nauck, Tanja

2009-10-30

Achilles tendinopathy is the predominant overuse injury in runners. To further investigate this overload injury in transverse and longitudinal studies, a valid, responsive and reliable outcome measure is needed. Most questionnaires have been developed for English-speaking populations. This is also true for the VISA-A score, so far the only valid, reliable, and disease-specific questionnaire for Achilles tendinopathy. To compare research results internationally, to perform multinational studies, or to exclude bias originating from subpopulations speaking different languages within one country, an equivalent instrument is needed in different languages. The aim of this study was therefore to cross-culturally adapt and validate the VISA-A questionnaire for German-speaking Achilles tendinopathy patients. According to the "guidelines for the process of cross-cultural adaptation of self-report measures" the VISA-A score was cross-culturally adapted into German (VISA-A-G) using six steps: translation, synthesis, back translation, expert committee review, pretesting (n = 77), and appraisal of the adaptation process by an advisory committee determining the adequacy of the cross-cultural adaptation. The resulting VISA-A-G was then subjected to an analysis of reliability, validity, and internal consistency in 30 Achilles tendinopathy patients and 79 asymptomatic people. Concurrent validity was tested against a generic tendon grading system (Percy and Conochie) and against a classification system for the effect of pain on athletic performance (Curwin and Stanish). The "advisory committee" judged the translation of the VISA-A-G questionnaire to be "acceptable". The VISA-A-G questionnaire showed moderate to excellent test-retest reliability (ICC = 0.60 to 0.97). Concurrent validity showed good coherence when correlated with the grading system of Curwin and Stanish (rho = -0.95) and with the Percy and Conochie grade of severity (rho = 0.95).
Internal consistency (Cronbach's alpha) for the total VISA-A-G scores of the patients was calculated to be 0.737. The VISA-A questionnaire was successfully cross-culturally adapted and validated for use in German-speaking populations. The psychometric properties of the VISA-A-G questionnaire are similar to those of the original English version. It can therefore be recommended as a sufficiently robust tool for future measurement of the clinical severity of Achilles tendinopathy in German-speaking patients.
Cross-cultural adaptation and validation of the VISA-A questionnaire for German-speaking Achilles tendinopathy patients

PubMed Central

Lohrer, Heinz; Nauck, Tanja

2009-01-01

Background Achilles tendinopathy is the predominant overuse injury in runners. To further investigate this overload injury in transverse and longitudinal studies, a valid, responsive and reliable outcome measure is needed. Most questionnaires have been developed for English-speaking populations. This is also true for the VISA-A score, so far the only valid, reliable, and disease-specific questionnaire for Achilles tendinopathy. To compare research results internationally, to perform multinational studies, or to exclude bias originating from subpopulations speaking different languages within one country, an equivalent instrument is needed in different languages. The aim of this study was therefore to cross-culturally adapt and validate the VISA-A questionnaire for German-speaking Achilles tendinopathy patients. Methods According to the "guidelines for the process of cross-cultural adaptation of self-report measures" the VISA-A score was cross-culturally adapted into German (VISA-A-G) using six steps: translation, synthesis, back translation, expert committee review, pretesting (n = 77), and appraisal of the adaptation process by an advisory committee determining the adequacy of the cross-cultural adaptation. The resulting VISA-A-G was then subjected to an analysis of reliability, validity, and internal consistency in 30 Achilles tendinopathy patients and 79 asymptomatic people. Concurrent validity was tested against a generic tendon grading system (Percy and Conochie) and against a classification system for the effect of pain on athletic performance (Curwin and Stanish). Results The "advisory committee" judged the translation of the VISA-A-G questionnaire to be "acceptable". The VISA-A-G questionnaire showed moderate to excellent test-retest reliability (ICC = 0.60 to 0.97). Concurrent validity showed good coherence when correlated with the grading system of Curwin and Stanish (rho = -0.95) and with the Percy and Conochie grade of severity (rho = 0.95).
Internal consistency (Cronbach's alpha) for the total VISA-A-G scores of the patients was calculated to be 0.737. Conclusion The VISA-A questionnaire was successfully cross-culturally adapted and validated for use in German-speaking populations. The psychometric properties of the VISA-A-G questionnaire are similar to those of the original English version. It can therefore be recommended as a sufficiently robust tool for future measurement of the clinical severity of Achilles tendinopathy in German-speaking patients. PMID:19878572
The Reliability and Validity of Zimbardo Time Perspective Inventory Scores in Academically Talented Adolescents

ERIC Educational Resources Information Center

Worrell, Frank C.; Mello, Zena R.

2007-01-01

In this study, the authors examined the reliability, structural validity, and concurrent validity of Zimbardo Time Perspective Inventory (ZTPI) scores in a group of 815 academically talented adolescents. Reliability estimates of the purported factors' scores were in the low to moderate range. Exploratory factor analysis supported a five-factor…
Integration and Validation of Hysteroscopy Simulation in the Surgical Training Curriculum.

PubMed

Elessawy, Mohamed; Skrzipczyk, Moritz; Eckmann-Scholz, Christel; Maass, Nicolai; Mettler, Liselotte; Guenther, Veronika; van Mackelenbergh, Marion; Bauerschlag, Dirk O; Alkatout, Ibrahim

The primary objective of our study was to test the construct validity of the HystSim hysteroscopic simulator to determine whether simulation training can improve the acquisition of hysteroscopic skills regardless of the previous levels of experience of the participants. The secondary objective was to analyze the performance of a selected task, using specially designed scoring charts to help reduce the learning curve for both novices and experienced surgeons. The teaching of hysteroscopic intervention has received only scant attention, focusing mainly on the development of physical models and box simulators. This encouraged our working group to search for a suitable hysteroscopic simulator module and to test its validation. We decided to use the HystSim hysteroscopic simulator, which is one of the few such simulators that has already completed a validation process, with high ratings for both realism and training capacity. As a testing tool for our study, we selected the myoma resection task. We analyzed the results using the multimetric score system suggested by HystSim, allowing a more precise interpretation of the results. Between June 2014 and May 2015, our group collected data on 57 participants of minimally invasive surgical training courses at the Kiel School of Gynecological Endoscopy, Department of Gynecology and Obstetrics, University Hospitals Schleswig-Holstein, Campus Kiel. The novice group consisted of 42 medical students and residents with no prior experience in hysteroscopy, whereas the expert group consisted of 15 participants with more than 2 years of experience of advanced hysteroscopy operations. The overall results demonstrated that all participants attained significant improvements between their pretest and posttests, independent of their previous levels of experience (p < 0.002). Those in the expert group demonstrated statistically significant, superior scores in the pretest and posttests (p = 0.001, p = 0.006). 
Regarding visualization and ergonomics, the novices showed a better pretest value than the experts; however, the experts were able to improve significantly during the posttest. These precise findings demonstrated that the multimetric scoring system achieved several important objectives, including clinical relevance, critical relevance, and training motivation. All participants demonstrated improvements in their hysteroscopic skills, proving an adequate construct validation of the HystSim. Using the multimetric scoring system enabled a more accurate analysis of the performance of the participants independent of their levels of experience which could be an important key for streamlining the learning curve. Future studies testing the predictive validation of the simulator and frequency of the training intervals are necessary before the introduction of the simulator into the standard surgical training curriculum. Copyright © 2016 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Systemic Inflammation-Based Biomarkers and Survival in HIV-Positive Subject With Solid Cancer in an Italian Multicenter Study.

PubMed

Raffetti, Elena; Donato, Francesco; Pezzoli, Chiara; Digiambenedetto, Simona; Bandera, Alessandra; Di Pietro, Massimo; Di Filippo, Elisa; Maggiolo, Franco; Sighinolfi, Laura; Fornabaio, Chiara; Castelnuovo, Filippo; Ladisa, Nicoletta; Castelli, Francesco; Quiros Roldan, Eugenia

2015-08-15

Recently, some systemic inflammation-based biomarkers have been demonstrated useful for predicting risk of death in patients with solid cancer independently of tumor characteristics. This study aimed to investigate the prognostic role of systemic inflammation-based biomarkers in HIV-infected patients with solid tumors and to propose a risk score for mortality in these subjects. Clinical and pathological data on solid AIDS-defining cancer (ADC) and non-AIDS-defining cancer (NADC), diagnosed between 1998 and 2012 in an Italian cohort, were analyzed. To evaluate the prognostic role of systemic inflammation- and nutrition-based markers, univariate and multivariable Cox regression models were applied. To compute the risk score equation, the patients were randomly assigned to a derivation and a validation sample. A total of 573 patients (76.3% males) with a mean age of 46.2 years (SD = 10.3) were enrolled. 178 patients died during a median of 3.2 years of follow-up. For solid NADCs, elevated Glasgow Prognostic Score, modified Glasgow Prognostic Score, neutrophil/lymphocyte ratio, platelet/lymphocyte ratio, and Prognostic Nutritional Index were independently associated with risk of death; for solid ADCs, none of these markers was associated with risk of death. For solid NADCs, we computed a mortality risk score on the basis of age at cancer diagnosis, intravenous drug use, and Prognostic Nutritional Index. The areas under the receiver operating characteristic curve were 0.67 (95% confidence interval: 0.58 to 0.75) in the derivation sample and 0.66 (95% confidence interval: 0.54 to 0.79) in the validation sample. Inflammatory biomarkers were associated with risk of death in HIV-infected patients with solid NADCs but not with ADCs.
Recommendations, evaluation and validation of a semi-automated, fluorescent-based scoring protocol for micronucleus testing in human cells.

PubMed

Seager, Anna L; Shah, Ume-Kulsoom; Brüsehafer, Katja; Wills, John; Manshian, Bella; Chapman, Katherine E; Thomas, Adam D; Scott, Andrew D; Doherty, Ann T; Doak, Shareen H; Johnson, George E; Jenkins, Gareth J S

2014-05-01

Micronucleus (MN) induction is an established cytogenetic end point for evaluating structural and numerical chromosomal alterations in genotoxicity testing. A semi-automated scoring protocol for the assessment of MN preparations from human cell lines and a 3D skin cell model has been developed and validated. Following exposure to a range of test agents, slides were stained with 4',6-diamidino-2-phenylindole (DAPI) and scanned by use of the MicroNuc module of metafer 4, after the development of a modified classifier for selecting MN in binucleate cells. A common difficulty observed with automated systems is an artefactual output of high false positives; in the case of the metafer system this is mainly due to the loss of cytoplasmic boundaries during slide preparation. Slide quality is paramount to obtain accurate results. We show here that to avoid elevated artefactual-positive MN outputs, diffuse cell density and low-intensity nuclear staining are critical. Comparisons between visual (Giemsa stained) and automated (DAPI stained) MN frequencies and dose-response curves were highly correlated (R² = 0.70 for hydrogen peroxide, R² = 0.98 for menadione, R² = 0.99 for mitomycin C, R² = 0.89 for potassium bromate and R² = 0.68 for quantum dots), indicating the system is adequate to produce biologically relevant and reliable results. Metafer offers many advantages over conventional scoring including increased output and statistical power, and reduced scoring subjectivity, labour and costs. Further, the metafer system is easily adaptable for use with a range of different cells, both suspension and adherent human cell lines. Awareness of the points raised here reduces the automatic positive errors flagged and drastically reduces slide scoring time, making metafer an ideal candidate for genotoxic biomonitoring and population studies and regulatory genotoxic testing.
Usability verification of the Emergency Trauma Score (EMTRAS) and Rapid Emergency Medicine Score (REMS) in patients with trauma: A retrospective cohort study.

PubMed

Park, Hyun Oh; Kim, Jong Woo; Kim, Sung Hwan; Moon, Seong Ho; Byun, Joung Hun; Kim, Ki Nyun; Yang, Jun Ho; Lee, Chung Eun; Jang, In Seok; Kang, Dong Hun; Kim, Seong Chun; Kang, Changwoo; Choi, Jun Young

2017-11-01

Early estimation of mortality risk in patients with trauma is essential. In this study, we evaluate the validity of the Emergency Trauma Score (EMTRAS) and Rapid Emergency Medicine Score (REMS) for predicting in-hospital mortality in patients with trauma. Furthermore, we compared the REMS and the EMTRAS with 2 other scoring systems: the Revised Trauma Score (RTS) and Injury Severity Score (ISS). We performed a retrospective chart review of 6905 patients with trauma reported between July 2011 and June 2016 at a large national university hospital in South Korea. We analyzed the associations between patient characteristics, treatment course, and injury severity scoring systems (ISS, RTS, EMTRAS, and REMS) with in-hospital mortality. Discriminating power was compared between scoring systems using the areas under the curve (AUC) of receiver operating characteristic (ROC) curves. The overall in-hospital mortality rate was 3.1%. Higher EMTRAS and REMS scores were associated with hospital mortality (P < .001). The ROC curve demonstrated adequate discrimination (AUC = 0.957 for EMTRAS and 0.9 for REMS). After performing AUC analysis followed by Bonferroni correction for multiple comparisons, EMTRAS was significantly superior to REMS and ISS in predicting in-hospital mortality (P < .001), but not significantly different from the RTS (P = .057). The other scoring systems were not significantly different from each other. The EMTRAS and the REMS are simple, accurate predictors of in-hospital mortality in patients with trauma.
Validation of the Capsule Endoscopy Crohn's Disease Activity Index (CECDAI or Niv score): a multicenter prospective study.

PubMed

Niv, Y; Ilani, S; Levi, Z; Hershkowitz, M; Niv, E; Fireman, Z; O'Donnel, S; O'Morain, C; Eliakim, R; Scapa, E; Kalantzis, N; Kalantzis, C; Apostolopoulos, P; Gal, E

2012-01-01

The Capsule Endoscopy Crohn's Disease Activity Index (CECDAI or Niv score) was devised to measure mucosal disease activity using video capsule endoscopy (VCE). The aim of the current study was to prospectively validate the use of the scoring system in daily practice. This was a multicenter, double-blind, prospective, controlled study of VCE videos from 62 consecutive patients with isolated small-bowel Crohn's disease. The CECDAI was designed to evaluate three main parameters of Crohn's disease: inflammation (A), extent of disease (B), and stricture (C), in both the proximal and distal segments of the small bowel. The final score was calculated by adding the two segmental scores: CECDAI = ([A1 × B1] + C1) + ([A2 × B2] + C2). Each examiner in every site interpreted 6 - 10 videos and calculated the CECDAI. The de-identified CD-ROMs were then coded and sent to the principal investigator for CECDAI calculation. The cecum was reached in 72 % and 86 % of examinations, and proximal small-bowel involvement was found in 56 % and 62 % of the patients, according to the site investigators and principal investigator, respectively. Significant correlation was demonstrated between the calculation of the CECDAI by the individual site investigators and that performed by the principal investigator. Overall correlation between endoscopists from the different study centers was good, with r = 0.767 (range 0.717 - 0.985; Kappa 0.66; P < 0.001). There was no correlation between the CECDAI and the Crohn's Disease Activity Index or the Inflammatory Bowel Disease Quality of Life Questionnaire or any of their components. A new scoring system of mucosal injury in Crohn's disease of the small intestine, the CECDAI, was validated. Its use in controlled trials and/or regular follow-up of these patients is advocated. © Georg Thieme Verlag KG Stuttgart · New York.
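The CECDAI formula quoted in this abstract, CECDAI = ([A1 × B1] + C1) + ([A2 × B2] + C2), can be written out as a short Python sketch. The class and function names below are hypothetical, and the abstract does not give the allowed grading range for A, B, or C, so the sketch only rejects negative values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentFindings:
    """Gradings for one small-bowel segment (names are illustrative)."""
    inflammation: int  # A
    extent: int        # B
    stricture: int     # C


def segment_score(seg: SegmentFindings) -> int:
    """Compute (A x B) + C for a single segment."""
    if min(seg.inflammation, seg.extent, seg.stricture) < 0:
        raise ValueError("gradings must be non-negative")
    return seg.inflammation * seg.extent + seg.stricture


def cecdai(proximal: SegmentFindings, distal: SegmentFindings) -> int:
    """Add the proximal and distal segment scores, as in the abstract."""
    return segment_score(proximal) + segment_score(distal)


if __name__ == "__main__":
    # Hypothetical gradings, for illustration only.
    print(cecdai(SegmentFindings(2, 1, 0), SegmentFindings(3, 2, 1)))
```

Because inflammation and extent are multiplied within a segment, a segment with no inflammation contributes only its stricture grade regardless of the extent grade.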
Validation of the Lupus Nephritis Clinical Indices in Childhood-Onset Systemic Lupus Erythematosus

PubMed Central

Mina, Rina; Abulaban, Khalid; Klein-Gitelman, Marisa; Eberhard, Anne; Ardoin, Stacy; Singer, Nora; Onel, Karen; Tucker, Lori; O’Neil, Kathleen; Wright, Tracey; Brooks, Elizabeth; Rouster-Stevens, Kelly; Jung, Lawrence; Imundo, Lisa; Rovin, Brad; Witte, David; Ying, Jun; Brunner, Hermine I.

2015-01-01

Objective To validate clinical indices of lupus nephritis (LN) activity and damage when used in children against the criterion standard of kidney biopsy findings. Methods In 83 children requiring kidney biopsy, the SLE Disease Activity Index Renal Domain (SLEDAI-R), British Isles Lupus Assessment Group index Renal Domain (BILAG-R), Systemic Lupus International Collaborating Clinics Renal Activity (SLICC-RAS) and Damage Index Renal Domain (SDI-R) were measured. Fixed effect and logistic models were fitted to predict International Society of Nephrology/Renal Pathology Society (ISN/RPS) class; low/moderate vs. high LN-activity [NIH Activity Index (NIH-AI) score: ≤ 10 vs. > 10; Tubulointerstitial Activity Index (TIAI) score: ≤ 5 vs. > 5]; or the absence vs. presence of LN chronicity [NIH Chronicity Index (NIH-CI) score: 0 vs. ≥ 1]. Results There were 10, 50 and 23 patients with class I/II, III/IV and V, respectively. Scores of the clinical indices did not differentiate among patients by ISN/RPS class. The SLEDAI-R and SLICC-RAS but not the BILAG-R differed with LN-activity status defined by NIH-AI scores, while only the SLEDAI-R scores differed between LN-activity status based on TIAI scores. The sensitivity and specificity of the SDI-R to capture LN chronicity were 23.5% and 91.7%, respectively. Despite being designed to measure LN-activity, SLICC-RAS and SLEDAI-R scores significantly differed with LN chronicity status. Conclusion Current clinical indices of LN fail to discriminate ISN/RPS Class in children. Despite its shortcomings, the SLEDAI-R appears to be best for measuring LN activity in a clinical setting. The SDI-R is a poor correlate of LN chronicity. PMID:26213987
External Validation of European System for Cardiac Operative Risk Evaluation II (EuroSCORE II) for Risk Prioritization in an Iranian Population

PubMed Central

Atashi, Alireza; Amini, Shahram; Tashnizi, Mohammad Abbasi; Moeinipour, Ali Asghar; Aazami, Mathias Hossain; Tohidnezhad, Fariba; Ghasemi, Erfan; Eslami, Saeid

2018-01-01

Introduction The European System for Cardiac Operative Risk Evaluation II (EuroSCORE II) is a prediction model which maps 18 predictors to a 30-day post-operative risk of death, concentrating on accurate stratification of candidate patients for cardiac surgery. Objective The objective of this study was to determine the performance of the EuroSCORE II risk-analysis predictions among patients who underwent heart surgeries in one area of Iran. Methods A retrospective cohort study was conducted to collect the required variables for all consecutive patients who underwent heart surgeries at Emam Reza hospital, Northeast Iran between 2014 and 2015. Univariate and multivariate analyses were performed to identify covariates which significantly contribute to higher EuroSCORE II in our population. External validation was performed by comparing the real and expected mortality using area under the receiver operating characteristic curve (AUC) for discrimination assessment. Also, the Brier score and the Hosmer-Lemeshow goodness-of-fit test were used to show the overall performance and calibration level, respectively. Results A total of 2581 patients (59.6% males) were included. The observed mortality rate was 3.3%, but EuroSCORE II had a prediction of 4.7%. Although the overall performance was acceptable (Brier score=0.047), the model showed poor discriminatory power (AUC=0.667; sensitivity=61.90, and specificity=66.24) and poor calibration (Hosmer-Lemeshow test, P<0.01). Conclusion Our study showed that the EuroSCORE II discrimination power is less than optimal for outcome prediction and less accurate for resource allocation programs. It highlights the need for recalibration of this risk stratification tool aiming to improve post cardiac surgery outcome predictions in Iran. PMID:29617500
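The Brier score mentioned in this abstract is the mean squared difference between each predicted probability and the observed binary outcome. The sketch below uses the standard definition with a hypothetical function name and made-up inputs; it is not the authors' analysis code and does not reproduce the study data.

```python
from typing import Sequence


def brier_score(predicted_risks: Sequence[float],
                outcomes: Sequence[int]) -> float:
    """Mean of (predicted probability - observed outcome)^2.

    `outcomes` holds 1 for an event (e.g. death) and 0 otherwise.
    Lower is better; 0.0 means perfect probabilistic predictions.
    """
    if len(predicted_risks) != len(outcomes):
        raise ValueError("inputs must have the same length")
    if not predicted_risks:
        raise ValueError("inputs must not be empty")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_risks, outcomes)]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    # Made-up predicted risks and outcomes, for illustration only.
    risks = [0.02, 0.10, 0.05, 0.60]
    died = [0, 0, 0, 1]
    print(round(brier_score(risks, died), 4))
```

For rare outcomes the score is small even for uninformative predictions: assigning every patient the same risk r, where r equals the observed event rate, gives a Brier score of r(1 − r). Reading a Brier score against that baseline is generally more informative than reading it in isolation.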
Multi-institutional validation of a web-based core competency assessment system.

PubMed

Tabuenca, Arnold; Welling, Richard; Sachdeva, Ajit K; Blair, Patrice G; Horvath, Karen; Tarpley, John; Savino, John A; Gray, Richard; Gulley, Julie; Arnold, Teresa; Wolfe, Kevin; Risucci, Donald A

2007-01-01

The Association of Program Directors in Surgery and the Division of Education of the American College of Surgeons developed and implemented a web-based system for end-of-rotation faculty assessment of ACGME core competencies of residents. This study assesses its reliability and validity across multiple programs. Each assessment included ratings (1-5 scale) on 23 items reflecting the 6 core competencies. A total of 4241 end-of-rotation assessments were completed for 332 general surgery residents (≥5 evaluations each) at 5 sites during the 2004-2005 and 2005-2006 academic years. The mean rating for each resident on each item was computed for each academic year. The mean rating of items representing each competency was computed for each resident. Additional data included USMLE and ABSITE scores, PGY, and status in program (categorical, designated preliminary, and undesignated preliminary). Coefficient alpha was greater than 0.90 for each competency score. Mean ratings for each competency increased significantly (p < 0.01) as a function of PGY. Mean ratings for professionalism and interpersonal/communication skills (IPC) were significantly higher than all other competencies at all PGY levels. Competency ratings of PGY 1 residents correlated significantly with USMLE Step I, ranging from (r = 0.26, p < 0.01) for Professionalism to (r = 0.41, p < 0.001) for Systems-Based Practice. Ratings of Knowledge (r = 0.31, p < 0.01), Practice-Based Learning & Improvement (PBLI; r = 0.22, p < 0.05), and Systems-Based Practice (r = 0.20, p < 0.05) correlated significantly with 2005 ABSITE Total Percentile. Ratings of all competencies correlated significantly with the 2006 ABSITE Total Percentile Score (range: r = 0.20, p < 0.05 for professionalism to r = 0.35, p < 0.001 for knowledge).
Categorical and designated preliminary residents received significantly higher ratings (p < 0.05) than nondesignated preliminaries for knowledge, patient care, PBLI, and systems-based practice only. Faculty ratings of core competencies are internally consistent. The pattern of statistically significant correlations between competency ratings and USMLE and ABSITE scores supports the postdictive and concurrent validity, respectively, of faculty perceptions of resident knowledge. The pattern of increased ratings as a function of PGY supports the construct validity of faculty ratings of resident core competencies.
A valid model for predicting responsible nerve roots in lumbar degenerative disease with diagnostic doubt.

PubMed

Li, Xiaochuan; Bai, Xuedong; Wu, Yaohong; Ruan, Dike

2016-03-15

The aim was to construct and validate a model to predict responsible nerve roots in lumbar degenerative disease with diagnostic doubt (DD). From January 2009 to January 2013, 163 patients with DD were assigned to the construction (n = 106) or validation sample (n = 57) according to their time of admission to hospital. Outcome was assessed according to the Japanese Orthopedic Association (JOA) recovery rate as excellent, good, fair, or poor. The first two results were considered an effective clinical outcome (ECO). Baseline patient and clinical characteristics were considered as secondary variables. A multivariate logistic regression model was constructed with the ECO as the dependent variable and other factors as explanatory variables. The odds ratios (ORs) of each risk factor were adjusted and transformed into a scoring system. Area under the curve (AUC) was calculated and validated in both internal and external samples. Moreover, the calibration plot and predictive ability of this scoring system were also tested for further validation. The proportion of patients with DD who achieved an ECO was around 76% in both the construction and validation samples (76.4% and 75.5%, respectively). The identified risk factors were a higher preoperative visual analog pain scale (VAS) score (OR = 1.56, p < 0.01), stenosis levels of L4/5 or L5/S1 (OR = 1.44, p = 0.04), stenosis locations with neuroforamen (OR = 1.95, p = 0.01), neurological deficit (OR = 1.62, p = 0.01), and greater VAS improvement after selective nerve root block (SNRB) (OR = 3.42, p = 0.02). The internal AUC was 0.85, and the external AUC was 0.72, with a good calibration plot of prediction accuracy. In addition, the predicted ECOs did not differ from the actual results (p = 0.532). We have constructed and validated a predictive model for confirming responsible nerve roots in patients with DD.
The associated risk factors were preoperative VAS score, stenosis levels of L4/5 or L5/S1, stenosis locations with neuroforamen, neurological deficit, and VAS improvement of SNRB. A tool such as this is beneficial in the preoperative counseling of patients, shared surgical decision making, and ultimately improving safety in spine surgery.
Zonal NePhRO scoring system: a superior renal tumor complexity classification model.

PubMed

Hakky, Tariq S; Baumgarten, Adam S; Allen, Bryan; Lin, Hui-Yi; Ercole, Cesar E; Sexton, Wade J; Spiess, Philippe E

2014-02-01

Since the advent of the first standardized renal tumor complexity system, many subsequent scoring systems have been introduced, many of which are complicated and can make it difficult to accurately measure data end points. In light of these limitations, we introduce the new zonal NePhRO scoring system. The zonal NePhRO score is based on 4 anatomical components that are assigned a score of 1, 2, or 3, and their sum is used to classify renal tumors. The zonal NePhRO scoring system is made up of the (Ne)arness to collecting system, (Ph)ysical location of the tumor in the kidney, (R)adius of the tumor, and (O)rganization of the tumor. In this retrospective study, we evaluated patients exhibiting clinical stage T1a or T1b who underwent open partial nephrectomy performed by 2 genitourinary surgeons. Each renal unit was assigned both a zonal NePhRO score and a RENAL (radius, exophytic/endophytic properties, nearness of tumor to the collecting system or sinus in millimeters, anterior/posterior, location relative to polar lines) score, and a blinded reviewer used the same preoperative imaging study to obtain both scores. Additional data points gathered included age, clamp time, complication rate, urine leak rate, intraoperative blood loss, and pathologic tumor size. One hundred sixty-six patients underwent open partial nephrectomy. There were 37 perioperative complications quantitated using the validated Clavien-Dindo system; their occurrence was predicted by the NePhRO score on both univariate and multivariate analyses (P = .0008). Clinical stage, intraoperative blood loss, and tumor diameter were all correlated with the zonal NePhRO score on univariate analysis only. The zonal NePhRO scoring system is a simpler tool that accurately predicts the surgical complexity of a renal lesion. Copyright © 2014 Elsevier Inc. All rights reserved.
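As described in this abstract, the zonal NePhRO score is the sum of four anatomical components, each graded 1, 2, or 3. A minimal Python sketch of that arithmetic follows; the function and parameter names are hypothetical, and the criteria for assigning each component's grade are not given in the abstract, so the grades are taken as inputs.

```python
def zonal_nephro_score(nearness: int,
                       physical_location: int,
                       radius: int,
                       organization: int) -> int:
    """Sum the four component grades (each 1, 2, or 3); total ranges 4 to 12.

    Parameter names are illustrative; grading criteria are not modeled.
    """
    components = {
        "nearness": nearness,                    # (Ne)arness to collecting system
        "physical_location": physical_location,  # (Ph)ysical location in the kidney
        "radius": radius,                        # (R)adius of the tumor
        "organization": organization,            # (O)rganization of the tumor
    }
    for name, grade in components.items():
        if grade not in (1, 2, 3):
            raise ValueError(f"{name} must be 1, 2, or 3, got {grade!r}")
    return sum(components.values())


if __name__ == "__main__":
    # Hypothetical component grades, for illustration only.
    print(zonal_nephro_score(2, 3, 1, 2))
```

The abstract does not report cutoffs for grouping total scores into complexity categories, so the sketch returns only the raw sum.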
Measurement of peripheral venous catheter-related phlebitis: a cross-sectional study.

PubMed

Göransson, Katarina; Förberg, Ulrika; Johansson, Eva; Unbeck, Maria

2017-09-01

Many instruments for measurement of peripheral venous catheter (PVC)-related phlebitis are available, but no consensus exists on their applicability in clinical practice. This absence of consensus affects the ability to identify and compare proportions of PVCs causing phlebitis within and across hospitals as the range varies between 2% and 62% in previous studies. We hypothesised that the instruments' ability to identify phlebitis varies. The aim of this study is to illustrate the complexity of application of phlebitis instruments to a clinical dataset. In this cross-sectional study, we applied 17 instruments for phlebitis identification (divided into three groups [instruments using definitions, severity rating systems, and scoring systems]) to PVCs in adult patients admitted to 12 inpatient units at Karolinska University Hospital in Sweden. We calculated the proportion of PVCs causing phlebitis on the basis of each instrument's minimum criterion for phlebitis. We also analysed each instrument's face validity. We compared proportions using the Z test. On the basis of data collected between Feb 2, 2009, and Feb 20, 2009, May 18, 2009, and June 5, 2009, and Feb 8, 2010, and Feb 26, 2010, we applied 17 instruments for phlebitis identification (eight instruments using definitions, seven severity rating systems, and two scoring systems) to 1175 observed PVCs in 1032 patients. The highest number of PVCs causing phlebitis generated by definitions was 137 (11·7%), by severity rating systems was 395 (33·6%), and by scoring systems was 363 (30·9%). The proportion generated by instruments using definitions was significantly different to that of both the severity rating (difference 21·9% [95% CI 18·6-25·2]; p<0·0001) and scoring (19·2% [12·0-26·4]; p<0·0001) systems. Proportions did not differ significantly between severity rating systems and scoring system (difference 2·7% [95% CI -1·1 to 6·6]; p=0·16). The proportion within instruments ranged from less than 1% to 28%. 
We identified face validity issues, such as use of indistinct or complex measurements and inconsistent measurements or definitions. Our study highlights several concerns regarding instruments to measure phlebitis published in the scientific community. From a work environment and patient safety perspective, clinical staff engaged in PVC management should be aware of the absence of adequately validated instruments for phlebitis assessment. We suggest that researchers within the field of PVC come together in a joint research programme aiming to develop valid and reliable methods that accurately identify PVC-related adverse events and that also include decision support for clinical staff concerning clinical indications for PVC removal. Such actions could lead to a revised view on what is best practice for management of PVCs. Funding: None. Copyright © 2017 Elsevier Ltd. All rights reserved.
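The comparison of proportions described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: it assumes a pooled two-sided Z test with an unpooled 95% Wald interval, and it treats the two sets of PVC counts as independent samples, which is a simplification because the same 1175 PVCs were scored by every instrument. The function name `two_proportion_z` is hypothetical.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sided Z test plus an unpooled 95% Wald interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Pooled standard error under the null hypothesis p1 == p2.
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    # Unpooled standard error for the confidence interval.
    se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    half_width = 1.96 * se_unpooled
    return diff, (diff - half_width, diff + half_width), z, p_value

# Severity rating systems (395/1175) versus scoring systems (363/1175).
diff, ci, z, p_value = two_proportion_z(395, 1175, 363, 1175)
print(f"difference = {diff:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), p = {p_value:.2f}")
```

Under these assumptions the interval spans zero, in line with the non-significant difference reported between severity rating and scoring systems.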
RENZI SCORE FOR OBSTRUCTED DEFECATION SYNDROME - VALIDATION OF THE PORTUGUESE VERSION ACCORDING TO THE COSMIN CHECKLIST.

PubMed

Caetano, Ana Celia; Dias, Sara; Santa-Cruz, André; Rolanda, Carla

2018-01-01

Recently, the Obstructed Defecation Syndrome score (ODS score) was developed and validated by Renzi to assess clinical staging and to allow evaluation and comparison of the efficacy of treatment of this disorder. Our goal is to validate the Portuguese version of the Renzi ODS score, according to the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. Following guidelines for cross-cultural validity, the Renzi ODS score was translated into Portuguese. Then, a group of patients and healthy controls were invited to fill in the Renzi ODS score at baseline, after 2 weeks, and after 3 months. We assessed internal consistency, reliability and measurement error, content and construct validity, responsiveness and interpretability. A total of 113 individuals (77 patients; 36 healthy controls) completed the questionnaire. Seventy and 30 patients repeated the Renzi ODS score after 2 weeks and 3 months, respectively. Factor analysis confirmed the unidimensionality of the scale. Cronbach's α coefficient of 0.77 supported the items' homogeneity. A quadratic weighted kappa of 0.89 established test-retest reliability. The smallest detectable change at the individual level was 2.66 and at the group level was 0.30. Renzi ODS score and the total (-0.32) and physical (-0.43) SF-36 scores correlated negatively. The patient and control groups differed significantly (11 points). The change score of Renzi ODS score between baseline and 3 months correlated negatively with the clinical evolution (-0.86). ROC analysis showed a minimal important change of 2.00 with AUC 0.97. Neither floor nor ceiling effects were observed. This work validated the Portuguese version of Renzi ODS score. We can now use this reliable, responsive, and interpretable (at the group level) tool to evaluate Portuguese ODS patients.
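The relation between the individual-level and group-level smallest detectable change (SDC) can be illustrated with the commonly used formulas SDC_individual = 1.96 × √2 × SEM and SDC_group = SDC_individual / √n. The sketch below is an assumption about how such values are typically derived, not a reproduction of the authors' calculation; in particular, n = 77 (the number of patients) is assumed, since the abstract does not state which sample size was used.

```python
import math

def sdc_individual(sem):
    """Smallest detectable change for one person, from the standard error of measurement."""
    return 1.96 * math.sqrt(2) * sem

def sdc_group(sdc_ind, n):
    """Smallest detectable change for a group mean of n people."""
    return sdc_ind / math.sqrt(n)

# Reported individual-level SDC of 2.66; n = 77 is an assumed sample size.
group_value = sdc_group(2.66, 77)
print(f"group-level SDC = {group_value:.2f}")
```

With these assumed inputs the result is close to the reported group-level value of 0.30, which suggests why the tool is described as interpretable at the group level: a change of 2 points is below the individual SDC but well above the group SDC.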
Validation of scores of use of inhalation devices: valoration of errors

PubMed Central

Zambelli-Simões, Letícia; Martins, Maria Cleusa; Possari, Juliana Carneiro da Cunha; Carvalho, Greice Borges; Coelho, Ana Carla Carvalho; Cipriano, Sonia Lucena; de Carvalho-Pinto, Regina Maria; Cukier, Alberto; Stelmach, Rafael

2015-01-01

Abstract Objective: To validate two scores quantifying the ability of patients to use metered dose inhalers (MDIs) or dry powder inhalers (DPIs); to identify the most common errors made during their use; and to identify the patients in need of an educational program for the use of these devices. Methods: This study was conducted in three phases: validation of the reliability of the inhaler technique scores; validation of the contents of the two scores using a convenience sample; and testing for criterion validation and discriminant validation of these instruments in patients who met the inclusion criteria. Results: The convenience sample comprised 16 patients. Interobserver disagreement was found in 19% and 25% of the DPI and MDI scores, respectively. After expert analysis on the subject, the scores were modified and were applied in 72 patients. The most relevant difficulty encountered during the use of both types of devices was the maintenance of total lung capacity after a deep inhalation. The degree of correlation of the scores by observer was 0.97 (p < 0.0001). There was good interobserver agreement in the classification of patients as able/not able to use a DPI (50%/50% and 52%/58%; p < 0.01) and an MDI (49%/51% and 54%/46%; p < 0.05). Conclusions: The validated scores allow the identification and correction of inhaler technique errors during consultations and, as a result, improvement in the management of inhalation devices. PMID:26398751
The first Latin-American risk stratification system for cardiac surgery: can be used as a graphic pocket-card score.

PubMed

Carosella, Victorio C; Navia, Jose L; Al-Ruzzeh, Sharif; Grancelli, Hugo; Rodriguez, Walter; Cardenas, Cesar; Bilbao, Jorge; Nojek, Carlos

2009-08-01

This study aims to develop the first Latin-American risk model that can be used as a simple, pocket-card graphic score at bedside. The risk model was developed on 2903 patients who underwent cardiac surgery at the Spanish Hospital of Buenos Aires, Argentina, between June 1994 and December 1999. Internal validation was performed on 708 patients between January 2000 and June 2001 at the same center. External validation was performed on 1087 patients between February 2000 and January 2007 at three other centers in Argentina. In the development dataset the area under receiver operating characteristics (ROC) curve was 0.73 and the Hosmer-Lemeshow (HL) test was P=0.88. In the internal validation ROC curve was 0.77. In the external validation ROC curve was 0.81, but imperfect calibration was detected because the observed in-hospital mortality (3.96%) was significantly lower than the development dataset (8.20%) (P<0.0001). Recalibration was done in 2007, showing excellent level of agreement between the observed and predicted mortality rates on all patients (P=0.92). This is the first risk model for cardiac surgery developed in a population of Latin-America with both internal and external validation. A simple graphic pocket-card score allows an easy bedside application with acceptable statistic precision.
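The area under the ROC curve reported above measures discrimination: the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor. A minimal sketch of that rank-based definition follows; the function name `roc_auc` and the predicted risks are hypothetical illustration values, not data from the study.

```python
def roc_auc(scores_positive, scores_negative):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    wins = 0.0
    for sp in scores_positive:
        for sn in scores_negative:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))

# Hypothetical predicted mortality risks.
died = [0.30, 0.12, 0.08]
survived = [0.02, 0.05, 0.08, 0.10, 0.01]
auc = roc_auc(died, survived)
print(f"AUC = {auc:.2f}")
```

Discrimination says nothing about calibration: a model can rank patients well (AUC 0.81 in the external validation) while still overestimating absolute mortality, which is why recalibration was needed.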
Validation of reactive gases and aerosols in the MACC global analysis and forecast system

NASA Astrophysics Data System (ADS)

Eskes, H.; Huijnen, V.; Arola, A.; Benedictow, A.; Blechschmidt, A.-M.; Botek, E.; Boucher, O.; Bouarar, I.; Chabrillat, S.; Cuevas, E.; Engelen, R.; Flentje, H.; Gaudel, A.; Griesfeller, J.; Jones, L.; Kapsomenakis, J.; Katragkou, E.; Kinne, S.; Langerock, B.; Razinger, M.; Richter, A.; Schultz, M.; Schulz, M.; Sudarchikova, N.; Thouret, V.; Vrekoussis, M.; Wagner, A.; Zerefos, C.

2015-11-01

The European MACC (Monitoring Atmospheric Composition and Climate) project is preparing the operational Copernicus Atmosphere Monitoring Service (CAMS), one of the services of the European Copernicus Programme on Earth observation and environmental services. MACC uses data assimilation to combine in situ and remote sensing observations with global and regional models of atmospheric reactive gases, aerosols, and greenhouse gases, and is based on the Integrated Forecasting System of the European Centre for Medium-Range Weather Forecasts (ECMWF). The global component of the MACC service has a dedicated validation activity to document the quality of the atmospheric composition products. In this paper we discuss the approach to validation that has been developed over the past 3 years. Topics discussed are the validation requirements, the operational aspects, the measurement data sets used, the structure of the validation reports, the models and assimilation systems validated, the procedure to introduce new upgrades, and the scoring methods. One specific target of the MACC system concerns forecasting special events with high-pollution concentrations. Such events receive extra attention in the validation process. Finally, a summary is provided of the results from the validation of the latest set of daily global analysis and forecast products from the MACC system reported in November 2014.
The Generalizability of Overreporting Across Self-Report Measures: An Investigation With the Minnesota Multiphasic Personality Inventory-2-Restructured Form and the Personality Assessment Inventory in a Civil Disability Sample.

PubMed

Crighton, Adam H; Tarescavage, Anthony M; Gervais, Roger O; Ben-Porath, Yossef S

2017-07-01

Elevated overreporting Validity Scale scores on the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) are associated with higher scores on collateral measures; however, measures used in prior research lacked validity scales. We sought to extend these findings by examining associations between elevated MMPI-2-RF overreporting scale scores and Personality Assessment Inventory (PAI) scale scores among 654 non-head injury civil disability claimants. Individuals were classified as overreporting psychopathology (OR-P), overreporting somatic/cognitive complaints (OR-SC), inconclusive reporting psychopathology (IR-P), inconclusive reporting somatic/cognitive complaints (IR-SC), or valid reporting (VR). Both overreporting groups had significantly and meaningfully higher scores than the VR group on the MMPI-2-RF and PAI scales. Both IR groups had significantly and meaningfully higher scores than the VR group, as well as lower scores than their overreporting counterparts. Our findings demonstrate the utility of inventories with validity scales in assessment batteries that include instruments without measures of protocol validity.

Administration of Neuropsychological Tests Using Interactive Voice Response Technology in the Elderly: Validation and Limitations

PubMed Central

Miller, Delyana Ivanova; Talbot, Vincent; Gagnon, Michèle; Messier, Claude

2013-01-01

Interactive voice response (IVR) systems are computer programs, which interact with people to provide a number of services from business to health care. We examined the ability of an IVR system to administer and score a verbal fluency task (fruits) and the digit span forward and backward in 158 community dwelling people aged between 65 and 92 years of age (full scale IQ of 68–134). Only six participants could not complete all tasks mostly due to early technical problems in the study. Participants were also administered the Wechsler Intelligence Scale fourth edition (WAIS-IV) and Wechsler Memory Scale fourth edition subtests. The IVR system correctly recognized 90% of the fruits in the verbal fluency task and 93–95% of the number sequences in the digit span. The IVR system typically underestimated the performance of participants because of voice recognition errors. In the digit span, these errors led to the erroneous discontinuation of the test: however the correlation between IVR scoring and clinical scoring was still high (93–95%). The correlation between the IVR verbal fluency and the WAIS-IV Similarities subtest was 0.31. The correlation between the IVR digit span forward and backward and the in-person administration was 0.46. We discuss how valid and useful IVR systems are for neuropsychological testing in the elderly. PMID:23950755
Risk score for first-screening of prevalent undiagnosed chronic kidney disease in Peru: the CRONICAS-CKD risk score.

PubMed

Carrillo-Larco, Rodrigo M; Miranda, J Jaime; Gilman, Robert H; Medina-Lezama, Josefina; Chirinos-Pacheco, Julio A; Muñoz-Retamozo, Paola V; Smeeth, Liam; Checkley, William; Bernabe-Ortiz, Antonio

2017-11-29

Chronic Kidney Disease (CKD) represents a great burden for the patient and the health system, particularly if diagnosed at late stages. Consequently, tools to identify patients at high risk of having CKD are needed, particularly in limited-resources settings where laboratory facilities are scarce. This study aimed to develop a risk score for prevalent undiagnosed CKD using data from four settings in Peru: a complete risk score including all associated risk factors and another excluding laboratory-based variables. Cross-sectional study. We used two population-based studies: one for developing and internal validation (CRONICAS), and another (PREVENCION) for external validation. Risk factors included clinical- and laboratory-based variables, among others: sex, age, hypertension and obesity; and lipid profile, anemia and glucose metabolism. The outcome was undiagnosed CKD: eGFR < 60 ml/min/1.73 m². We tested the performance of the risk scores using the area under the receiver operating characteristic (ROC) curve, sensitivity, specificity, positive/negative predictive values and positive/negative likelihood ratios. Participants in both studies averaged 57.7 years old, and over 50% were females. Age, hypertension and anemia were strongly associated with undiagnosed CKD. In the external validation, at a cut-off point of 2, the complete and laboratory-free risk scores performed similarly well with a ROC area of 76.2% and 76.0%, respectively (P = 0.784). The best assessment parameter of these risk scores was their negative predictive value: 99.1% and 99.0% for the complete and laboratory-free, respectively. The developed risk scores showed a moderate performance as a screening test. People with a score of ≥ 2 points should undergo further testing to rule out CKD. Using the laboratory-free risk score is a practical approach in developing countries where laboratories are not readily available and undiagnosed CKD has significant morbidity and mortality.
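The performance measures listed above all derive from a 2×2 table of score result against true disease status. The sketch below shows the standard definitions; the counts are hypothetical and chosen only to show how a low-prevalence condition yields a high negative predictive value alongside a modest positive predictive value. The function name `diagnostic_metrics` is likewise an illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard screening-test measures from a 2x2 confusion table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }

# Hypothetical counts: 1000 people screened, 40 with undiagnosed disease.
metrics = diagnostic_metrics(tp=30, fp=300, fn=10, tn=660)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a sketch like this, most of the negative predictive value comes from the low prevalence rather than from the score itself, which is worth keeping in mind when a screening tool reports an NPV near 99%.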
Validation of Patient Reported Outcomes Measurement Information System (PROMIS) Computer Adaptive Tests (CATs) in the Surgical Treatment of Lumbar Spinal Stenosis.

PubMed

Patel, Alpesh A; Dodwad, Shah-Nawaz M; Boody, Barrett S; Bhatt, Surabhi; Savage, Jason W; Hsu, Wellington K; Rothrock, Nan E

2018-03-19

Prospective, cohort study. Demonstrate validity of PROMIS physical function, pain interference, and pain behavior computer adaptive tests (CATs) in surgically treated lumbar stenosis patients. There has been increasing attention given to patient reported outcomes associated with spinal interventions. Historical patient outcome measures have inadequate validation, demonstrate floor/ceiling effects, and are infrequently used due to time constraints. PROMIS is an adaptive, responsive NIH assessment tool that measures patient-reported health status. 98 consecutive patients were surgically treated for lumbar spinal stenosis and were assessed using PROMIS CATs, ODI, ZCQ and SF-12. Patients with prior lumbar surgery or a history of scoliosis, cancer, trauma, or infection were excluded. Completion time, preoperative assessment, 6 week and 3 month postoperative scores were collected. At baseline, 49%, 79%, and 81% of patients had PROMIS PB, PI, and PF scores greater than 1 SD worse than the general population. 50.6% were categorized as severely disabled, crippled, or bed bound by ODI. PROMIS CATs demonstrated convergent validity through moderate to high correlations with legacy measures (r = 0.35-0.73). PROMIS CATs demonstrated known groups validity when stratified by ODI levels of disability. ODI improvements of at least 10 points on average had changes in PROMIS scores in the expected direction (PI = -12.98, PB = -9.74, PF = 7.53). PROMIS CATs demonstrated comparable responsiveness to change when evaluated against legacy measures. PROMIS PB and PI decreased 6.66 and 9.62 and PROMIS PF increased 6.8 points between baseline and 3-months post-op (p < 0.001). Completion time for the PROMIS CATs (2.6 minutes) compares favorably to ODI, ZCQ, and SF-12 scores (3.1, 3.6, and 3.0 minutes). PROMIS CATs demonstrate convergent validity, known groups validity, and responsiveness for surgically treated patients with lumbar stenosis to detect change over time and are more efficient than legacy instruments. Level of evidence: 2.
Outcome measurement in clinical trials for Ulcerative Colitis: towards standardisation

PubMed Central

Cooney, Rachel M; Warren, Bryan F; Altman, Douglas G; Abreu, Maria T; Travis, Simon PL

2007-01-01

Clinical trials on novel drug therapies require clear criteria for patient selection and agreed definitions of disease remission. This principle has been successfully applied in the field of rheumatology where agreed disease scoring systems have allowed multi-centre collaborations and facilitated audit across treatment centres. Unfortunately in ulcerative colitis this consensus is lacking. Thirteen scoring systems have been developed but none have been properly validated. Most trials choose different endpoints and activity indices, making comparison of results from different trials extremely difficult. International consensus on endoscopic, clinical and histological scoring systems is essential as these are the key components used to determine entry criteria and outcome measurements in clinical trials on ulcerative colitis. With multiple new therapies under development, there is a pressing need for consensus to be reached. PMID:17592647
The OMERACT Rheumatoid Arthritis Magnetic Resonance Imaging (MRI) Scoring System: Updated Recommendations by the OMERACT MRI in Arthritis Working Group.

PubMed

Østergaard, Mikkel; Peterfy, Charles G; Bird, Paul; Gandjbakhch, Frédérique; Glinatsi, Daniel; Eshed, Iris; Haavardsholm, Espen A; Lillegraven, Siri; Bøyesen, Pernille; Ejbjerg, Bo; Foltz, Violaine; Emery, Paul; Genant, Harry K; Conaghan, Philip G

2017-11-01

The Outcome Measures in Rheumatology (OMERACT) Rheumatoid Arthritis (RA) Magnetic Resonance Imaging (MRI) scoring system (RAMRIS), evaluating bone erosion, bone marrow edema/osteitis, and synovitis, was introduced in 2002, and is now the standard method of objectively quantifying inflammation and damage by MRI in RA trials. The objective of this paper was to identify subsequent advances and based on them, to provide updated recommendations for the RAMRIS. MRI studies relevant for RAMRIS and technical and scientific advances were analyzed by the OMERACT MRI in Arthritis Working Group, which used these data to provide updated considerations on image acquisition, RAMRIS definitions, and scoring systems for the original and new RA pathologies. Further, a research agenda was outlined. Since 2002, longitudinal studies and clinical trials have documented RAMRIS variables to have face, construct, and criterion validity; high reliability and sensitivity to change; and the ability to discriminate between therapies. This has enabled RAMRIS to demonstrate inhibition of structural damage progression with fewer patients and shorter followup times than has been possible with conventional radiography. Technical improvements, including higher field strengths and improved pulse sequences, allow higher image resolution and contrast-to-noise ratio. These have facilitated development and validation of scoring methods of new pathologies: joint space narrowing and tenosynovitis. These have high reproducibility and moderate sensitivity to change, and can be added to RAMRIS. Combined scores of inflammation or joint damage may increase sensitivity to change and discriminative power. However, this requires further research. Updated 2016 RAMRIS recommendations and a research agenda were developed.
The Predictive Validity of Interim Assessment Scores Based on the Full-Information Bifactor Model for the Prediction of End-of-Grade Test Performance

ERIC Educational Resources Information Center

Immekus, Jason C.; Atitya, Ben

2016-01-01

Interim tests are a central component of district-wide assessment systems, yet their technical quality to guide decisions (e.g., instructional) has been repeatedly questioned. In response, the study purpose was to investigate the validity of a series of English Language Arts (ELA) interim assessments in terms of dimensionality and prediction of…
A Note on Some Characteristics and Correlates of the Meier Art Test of Aesthetic Perception.

ERIC Educational Resources Information Center

Stallings, William M.; Anderson, Frances E.

The reliability and the predictive and concurrent validity of the MATAP were investigated with the implicit goal of improving the prediction of course grades in the College of Fine and Applied Arts. It was found that reliability and validity coefficients were low, and it was suggested that the scoring system was a source of error variance. (MS)
Validation of the Rational and Experiential Multimodal Inventory in the Italian Context.

PubMed

Monacis, Lucia; de Palo, Valeria; Di Nuovo, Santo; Sinatra, Maria

2016-08-01

The unfavorable relations of the Rational and Experiential Inventory Experiential scale with objective criterion measures and its limited content validity led Norris and Epstein to propose a more content-valid measure of the experiential thinking style, the Rational and Experiential Multimodal Inventory (REIm), in order to assess the several facets of a broader experiential system consisting of interrelated components. This study aimed to provide the Italian validation of the inventory by examining its psychometric features, its factor structure (Study 1, N = 545), and its convergent and discriminant validity (Study 2, N = 257). Study 1 supported the 2- and 4-factor solutions, and multi-group analyses confirmed the invariance measurement across age and gender for both models. Study 2 provided evidence for both the convergent validity by supporting the theoretical associations among Rational and Experiential Multimodal Inventory scores and similar and related measures, and the discriminant validity by showing associations between the two thinking styles and a different but conceptually related construct, i.e., identity formation. No associations between Rational and Experiential Multimodal Inventory scores and social desirability were found. The Italian version of the Rational and Experiential Multimodal Inventory showed satisfactory psychometric properties, thus confirming its validity. © The Author(s) 2016.
Validation of the Mayo Hip Score: construct validity, reliability and responsiveness to change.

PubMed

Singh, Jasvinder A; Schleck, Cathy; Harmsen, W Scott; Lewallen, David G

2016-01-19

Previous studies have provided the initial evidence for construct validity and test-retest reliability of the Mayo Hip Score. Instruments used for Total Hip Arthroplasty (THA) outcomes assessment should be valid, reliable and responsive to change. Our main objective was to examine the responsiveness to change, association with subsequent revision and the construct validity of the Mayo hip score. Discriminant ability was assessed by calculating effect size (ES), standardized response mean (SRM) and Guyatt's responsiveness index (GRI). Minimal clinically important improvement (MCII) and moderate improvement thresholds were calculated. We assessed construct validity by examining association of scores with preoperative patient characteristics and correlation with Harris hip score, and assessed association of scores with the risk of subsequent revision. Five thousand three hundred seven provided baseline data; of those with baseline data, 2,278 and 2,089 (39%) provided 2- and 5-year data, respectively. Large ES, SRM and GRI ranging 2.66-2.78, 2.42-2.61 and 1.67-1.88 were noted for Mayo hip scores with THA, respectively. The MCII and moderate improvement thresholds were 22.4-22.7 and 39.4-40.5 respectively. Hazard ratios of revision surgery were higher with lower final score or less improvement in Mayo hip score at 2-years and borderline significant/non-significant at 5-years, respectively: (1) score ≤55 with hazard ratios of 2.24 (95% CI, 1.45, 3.46; p = 0.0003) and 1.70 (95% CI, 1.00, 2.92; p = 0.05) of implant revision subsequently, compared to 72-80 points; (2) no improvement or worsening score with hazard ratios 3.94 (95% CI, 1.50, 10.30; p = 0.005) and 2.72 (95% CI, 0.85, 8.70; p = 0.09), compared to improvement >50-points. Mayo hip score had significant positive correlation with younger age, male gender, lower BMI, lower ASA class and lower Deyo-Charlson index (p ≤ 0.003 for each) and with Harris hip scores (p < 0.001). Mayo Hip Score is valid, sensitive to change and associated with future risk of revision surgery in patients with primary THA.
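Two of the responsiveness indices named above differ only in their denominator: the effect size (ES) divides the mean change by the standard deviation of baseline scores, while the standardized response mean (SRM) divides it by the standard deviation of the change scores. The following sketch uses these standard definitions with hypothetical pre- and post-operative scores; it is not the study's code, and Guyatt's responsiveness index is omitted because it additionally requires a group of clinically stable patients.

```python
import statistics

def effect_size(baseline, follow_up):
    """Mean change divided by the SD of baseline scores."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(baseline)

def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(changes)

# Hypothetical hip scores before surgery and at follow-up.
pre_op = [40, 35, 50, 45, 30]
post_op = [80, 70, 85, 75, 65]
es = effect_size(pre_op, post_op)
srm = standardized_response_mean(pre_op, post_op)
print(f"ES = {es:.2f}, SRM = {srm:.2f}")
```

Values above 0.8 are conventionally called large, so indices in the range reported for the Mayo hip score indicate that typical improvement after THA far exceeds between-patient variability.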
Individualism: a valid and important dimension of cultural differences between nations.

PubMed

Schimmack, Ulrich; Oishi, Shigehiro; Diener, Ed

2005-01-01

Oyserman, Coon, and Kemmelmeier's (2002) meta-analysis suggested problems in the measurement of individualism and collectivism. Studies using Hofstede's individualism scores show little convergent validity with more recent measures of individualism and collectivism. We propose that the lack of convergent validity is due to national differences in response styles. Whereas Hofstede statistically controlled for response styles, Oyserman et al.'s meta-analysis relied on uncorrected ratings. Data from an international student survey demonstrated convergent validity between Hofstede's individualism dimension and horizontal individualism when response styles were statistically controlled, whereas uncorrected scores correlated highly with the individualism scores in Oyserman et al.'s meta-analysis. Uncorrected horizontal individualism scores and meta-analytic individualism scores did not correlate significantly with nations' development, whereas corrected horizontal individualism scores and Hofstede's individualism dimension were significantly correlated with development. This pattern of results suggests that individualism is a valid construct for cross-cultural comparisons, but that the measurement of this construct needs improvement.
Validation of use of subsets of teeth when applying the total mouth periodontal score (TMPS) system in dogs.

PubMed

Harvey, Colin E; Laster, Larry; Shofer, Frances S

2012-01-01

A total mouth periodontal score (TMPS) system in dogs has been described previously. Use of buccal and palatal/lingual surfaces of all teeth requires observation and recording of 120 gingivitis scores and 120 periodontitis scores. Although the result is a reliable, repeatable assessment of the extent of periodontal disease in the mouth, observing and recording 240 data points is time-consuming. Using data from a previously reported study of periodontal disease in dogs, correlation analysis was used to determine whether use of any of seven different subsets of teeth can generate TMPS subset gingivitis and periodontitis scores that are highly correlated with TMPS all-site, all-teeth scores. Overall, gingivitis scores were less highly correlated than periodontitis scores. The minimal tooth set with a significant intra-class correlation (≥ 0.9 of means of right and left sides) for both gingivitis scores and attachment loss measurements consisted of the buccal surface of the maxillary third incisor, canine, third premolar, fourth premolar, and first molar teeth, and the mandibular canine, third premolar, fourth premolar and first molar teeth on one side (9 teeth, 15 root sites). Use of this subset of teeth, which reduces the number of data points per dog from 240 to 30 for gingivitis and periodontitis at each scoring episode, is recommended when calculating the gingivitis and periodontitis scores using the TMPS system.
MEASURING SPORT-SPECIFIC PHYSICAL ABILITIES IN MALE GYMNASTS: THE MEN'S GYMNASTICS FUNCTIONAL MEASUREMENT TOOL.

PubMed

Sleeper, Mark D; Kenyon, Lisa K; Elliott, James M; Cheng, M Samuel

2016-12-01

Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts' USA-Gymnastics competitive level to calculate the coefficient of determination (r²). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. The relationship between total MGFMT scores and subjects' current USA-Gymnastics competitive level was found to be good (r² = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level of evidence: 3.
A novel adaptive scoring system for segmentation validation with multiple reference masks

NASA Astrophysics Data System (ADS)

Moltz, Jan H.; Rühaak, Jan; Hahn, Horst K.; Peitgen, Heinz-Otto

2011-03-01

The development of segmentation algorithms for different anatomical structures and imaging protocols is an important task in medical image processing. The validation of these methods, however, is often treated as a subordinate task. Since manual delineations, which are widely used as a surrogate for the ground truth, exhibit an inherent uncertainty, it is preferable to use multiple reference segmentations for an objective validation. This requires a consistent framework that should fulfill three criteria: 1) it should treat all reference masks equally a priori and not demand consensus between the experts; 2) it should evaluate the algorithmic performance in relation to the inter-reference variability, i.e., be more tolerant where the experts disagree about the true segmentation; 3) it should produce results that are comparable for different test data. We show why current state-of-the-art frameworks, such as the one used at several MICCAI segmentation challenges, do not fulfill these criteria and propose a new validation methodology. A score is computed in an adaptive way for each individual segmentation problem, using a combination of volume- and surface-based comparison metrics. These are transformed into the score by relating them to the variability between the reference masks, which can be measured by comparing the masks with each other or with an estimated ground truth. We present examples from a study on liver tumor segmentation in CT scans where our score shows a more adequate assessment of the segmentation results than the MICCAI framework.
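The idea of judging an algorithm relative to inter-reference variability can be illustrated with a deliberately simplified sketch. This is not the scoring function proposed in the paper: it assumes a single volume-based metric (the Dice coefficient), represents each mask as a set of voxel indices, and simply divides the candidate's mean agreement with the references by the references' mean agreement with each other. The names `dice` and `relative_overlap` and the toy masks are hypothetical.

```python
from itertools import combinations

def dice(mask_a, mask_b):
    """Dice overlap between two masks given as sets of voxel indices."""
    if not mask_a and not mask_b:
        return 1.0
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))

def relative_overlap(candidate, references):
    """Candidate-to-reference agreement divided by reference-to-reference agreement."""
    candidate_agreement = sum(dice(candidate, ref) for ref in references) / len(references)
    pairs = list(combinations(references, 2))
    reference_agreement = sum(dice(a, b) for a, b in pairs) / len(pairs)
    return candidate_agreement / reference_agreement

# Toy one-dimensional "masks": three expert references and one algorithm result.
references = [{1, 2, 3, 4}, {2, 3, 4, 5}, {1, 2, 3, 4, 5}]
candidate = {2, 3, 4}
ratio = relative_overlap(candidate, references)
print(f"relative overlap = {ratio:.3f}")
```

A ratio near 1 means the algorithm agrees with the experts about as well as they agree with each other, so the same absolute Dice value is judged more leniently where the references themselves diverge.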
MARK’s Quadrant scoring system: a symptom-based targeted screening tool for gastric cancer

PubMed Central

Tata, Mahadevan D.; Gurunathan, Ramesh; Palayan, Kandasami

2014-01-01

Background Gastric cancer is notably one of the leading causes of cancer-related death in the world. In Malaysia, these patients present in the advanced stage, thus narrowing the treatment options and making successful curative resection nearly impossible. Failure to identify high-risk patients and delay in diagnostic endoscopy contributed to the delay in diagnosis. The aim of the study was to develop and validate a scoring system (MARK’s Quadrant) which can identify symptomatic patients who are at risk for gastric cancer. Methods A 3-phase approach was undertaken: Phase 1: development of the weighted scoring system; Phase 2: estimating the positive predictive value of MARK’s Quadrant; and Phase 3: a) testing the validity of MARK’s Quadrant in an open-access endoscopy system; and b) comparing its usefulness with that of the conventional referral system. Results In phases 1 and 2, MARK’s Quadrant with weighted symptoms was developed. The sensitivity of MARK’s Quadrant is 88% and the specificity is 45.5% to detect cancerous and precancerous gastric lesions. This was confirmed by the prospective data from phase 3 of this study where the diagnostic yield of MARK’s Quadrant to detect any pathological lesion was 95.2%. This score has a high accuracy efficiency of 75%; compared with the routine referral system, it has an odds ratio (95%CI) of 10.98 (4.63-26.00), 6.71 (4.46-10.09) and 0.95 (0.06-0.15) (P<0.001 respectively) for cancer, precancerous lesion and benign lesion diagnosis respectively. Conclusion MARK’s Quadrant is a useful tool to detect early gastric cancer among symptomatic patients in a low incidence region. PMID:24714557
Implicit Review Instrument to Evaluate Quality of Care Delivered by Physicians to Children in Emergency Departments.

PubMed

Marcin, James P; Romano, Patrick S; Dharmar, Madan; Chamberlain, James M; Dudley, Nanette; Macias, Charles G; Nigrovic, Lise E; Powell, Elizabeth C; Rogers, Alexander J; Sonnett, Meridith; Tzimenatos, Leah; Alpern, Elizabeth R; Andrews-Dickert, Rebecca; Borgialli, Dominic A; Sidney, Erika; Casper, Charlie; Dean, Jonathan Michael; Kuppermann, Nathan

2018-06-01

To evaluate the consistency, reliability, and validity of an implicit review instrument that measures the quality of care provided to children in the emergency department (ED). Medical records of randomly selected children from 12 EDs in the Pediatric Emergency Care Applied Research Network (PECARN). Eight pediatric emergency medicine physicians applied the instrument to 620 medical records. We determined internal consistency using Cronbach's alpha and inter-rater reliability using the intraclass correlation coefficient (ICC). We evaluated the validity of the instrument by correlating scores with four condition-specific explicit review instruments. Individual reviewers' Cronbach's alpha had a mean of 0.85 with a range of 0.76-0.97; overall Cronbach's alpha was 0.90. The ICC was 0.49 for the summary score with a range from 0.40 to 0.46. Correlations between the quality of care score and the four condition-specific explicit review scores ranged from 0.24 to 0.38. The quality of care instrument demonstrated good internal consistency, moderate inter-rater reliability, high inter-rater agreement, and evidence supporting validity. The instrument could be useful for systems' assessment and research in evaluating the care delivered to children in the ED. © Health Research and Educational Trust.
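Cronbach's alpha, the internal-consistency statistic reported above, can be computed directly from item-level scores. The following is a minimal sketch of the standard formula, not the study's analysis code; the function name and the row-per-item data layout are assumptions for illustration.

```python
from statistics import variance


def cronbach_alpha(items: list) -> float:
    """Cronbach's alpha for items[i][j] = score of record j on item i.

    Uses alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)),
    with sample variances throughout.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per record: sum down each column of the item matrix.
    totals = [sum(column) for column in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))
```

Alpha approaches 1 when the items rise and fall together across records, and drops toward 0 when they vary independently.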
Validation of a proposal for evaluating hospital infection control programs.

PubMed

Silva, Cristiane Pavanello Rodrigues; Lacerda, Rúbia Aparecida

2011-02-01

To validate the construct and discriminant properties of a hospital infection prevention and control program. The program consisted of four indicators: technical-operational structure; operational prevention and control guidelines; epidemiological surveillance system; and prevention and control activities. These indicators, with previously validated content, were applied to 50 healthcare institutions in the city of São Paulo, Southeastern Brazil, in 2009. Descriptive statistics were used to characterize the hospitals and indicator scores, and Cronbach's α coefficient was used to evaluate the internal consistency. The discriminant validity was analyzed by comparing indicator scores between groups of hospitals: with versus without quality certification. The construct validity analysis was based on exploratory factor analysis with a tetrachoric correlation matrix. The indicators for the technical-operational structure and epidemiological surveillance presented almost 100% conformity in the whole sample. The indicators for the operational prevention and control guidelines and the prevention and control activities presented internal consistency ranging from 0.67 to 0.80. The discriminant validity of these indicators indicated higher and statistically significant mean conformity scores among the group of institutions with healthcare certification or accreditation processes. In the construct validation, two dimensions were identified for the operational prevention and control guidelines: recommendations for preventing hospital infection and recommendations for standardizing prophylaxis procedures, with good correlation between the analysis units that formed the guidelines. The same was found for the prevention and control activities: interfaces with treatment units and support units were identified. 
Validation of the measurement properties of the hospital infection prevention and control program indicators made it possible to develop a tool for evaluating these programs in an ethical and scientific manner in order to obtain a quality diagnosis in this field.
Validation of the Australian Propensity for Angry Driving Scale (Aus-PADS).

PubMed

Leal, Nerida L; Pachana, Nancy A

2009-09-01

The present study used a university sample to assess the test-retest reliability and validity of the Australian Propensity for Angry Driving Scale (Aus-PADS). The scale has stability over time, and convergent validity was established, as Aus-PADS scores correlated significantly with established anger and impulsivity measures. Discriminant validity was also established, as Aus-PADS scores did not correlate with Venturesomeness scores. The Aus-PADS has demonstrated criterion validity, as scores were correlated with behavioural measures, such as yelling at other drivers, gesturing at other drivers, and feeling angry but not doing anything. Aus-PADS scores reliably predicted the frequency of these behaviours over and above other study variables. No significant relationship between aggressive driving and crash involvement was observed. It was concluded that the Aus-PADS is a reliable and valid tool appropriate for use in Australian research, and that the potential relationship between aggressive driving and crash involvement warrants further investigation with a more representative (and diverse) driver sample.
Evidence of Concurrent Validity of SII Scores for Asian American College Students

ERIC Educational Resources Information Center

Hansen, Jo-Ida C.; Lee, W. Vanessa

2007-01-01

The validity of scores on the Strong Interest Inventory (SII) for Asian American college students has not been thoroughly investigated. This study examined the evidence of validity of the SII Occupational Scale scores for predicting college major choices of Asian American women and men and White women and men. The sample included 186 female and…
Exploring Validity of Computer-Based Test Scores with Examinees' Response Behaviors and Response Times

ERIC Educational Resources Information Center

Sahin, Füsun

2017-01-01

Examining the testing processes, as well as the scores, is needed for a complete understanding of validity and fairness of computer-based assessments. Examinees' rapid-guessing and insufficient familiarity with computers have been found to be major issues that weaken the validity arguments of scores. This study has three goals: (a) improving…
Computer-Assisted Automated Scoring of Polysomnograms Using the Somnolyzer System

PubMed Central

Punjabi, Naresh M.; Shifa, Naima; Dorffner, Georg; Patil, Susheel; Pien, Grace; Aurora, Rashmi N.

2015-01-01

Study Objectives: Manual scoring of polysomnograms is a time-consuming and tedious process. To expedite the scoring of polysomnograms, several computerized algorithms for automated scoring have been developed. The overarching goal of this study was to determine the validity of the Somnolyzer system, an automated system for scoring polysomnograms. Design: The analysis sample comprised 97 sleep studies. Each polysomnogram was manually scored by certified technologists from four sleep laboratories and concurrently subjected to automated scoring by the Somnolyzer system. Agreement between manual and automated scoring was examined. Sleep staging and scoring of disordered breathing events were conducted using the 2007 American Academy of Sleep Medicine criteria. Setting: Clinical sleep laboratories. Measurements and Results: A high degree of agreement was noted between manual and automated scoring of the apnea-hypopnea index (AHI). The average correlation between the manually scored AHI across the four clinical sites was 0.92 (95% confidence interval: 0.90–0.93). Similarly, the average correlation between the manual and Somnolyzer-scored AHI values was 0.93 (95% confidence interval: 0.91–0.96). Thus, interscorer correlation between the manually scored results was no different than that derived from manual and automated scoring. Substantial concordance in the arousal index, total sleep time, and sleep efficiency between manual and automated scoring was also observed. In contrast, differences were noted between manually and automated scored percentages of sleep stages N1, N2, and N3. Conclusion: Automated analysis of polysomnograms using the Somnolyzer system provides results that are comparable to manual scoring for commonly used metrics in sleep medicine.
Although differences exist between manual versus automated scoring for specific sleep stages, the level of agreement between manual and automated scoring is not significantly different than that between any two human scorers. In light of the burden associated with manual scoring, automated scoring platforms provide a viable complement of tools in the diagnostic armamentarium of sleep medicine. Citation: Punjabi NM, Shifa N, Dorffner G, Patil S, Pien G, Aurora RN. Computer-assisted automated scoring of polysomnograms using the Somnolyzer system. SLEEP 2015;38(10):1555–1566. PMID:25902809

A proposal for a comprehensive risk scoring system for predicting postoperative complications in octogenarian patients with medically operable lung cancer: JACS1303.

PubMed

Saji, Hisashi; Ueno, Takahiko; Nakamura, Hiroshige; Okumura, Norihito; Tsuchida, Masanori; Sonobe, Makoto; Miyazaki, Takuro; Aokage, Keiju; Nakao, Masayuki; Haruki, Tomohiro; Ito, Hiroyuki; Kataoka, Kazuhiko; Okabe, Kazunori; Tomizawa, Kenji; Yoshimoto, Kentaro; Horio, Hirotoshi; Sugio, Kenji; Ode, Yasuhisa; Takao, Motoshi; Okada, Morihito; Chida, Masayuki

2018-04-01

Although some retrospective studies have reported clinicopathological scoring systems for predicting postoperative complications and survival outcomes for elderly lung cancer patients, optimized scoring systems remain controversial. The Japanese Association for Chest Surgery (JACS) conducted a nationwide multicentre prospective cohort study and enrolled a total of 1019 octogenarians with medically operable lung cancer. Details of the clinical factors, comorbidities and comprehensive geriatric assessment were recorded for 895 patients to develop a comprehensive risk scoring (RS) system capable of predicting severe complications. Operative (30 days) and hospital mortality rates were 1.0% and 1.6%, respectively. Complications were observed in 308 (34%) patients, of whom 81 (8.4%) had Grade 3-4 severe complications. Pneumonia was the most common severe complication, observed in 27 (3.0%) patients. Five factors (gender; comprehensive geriatric assessment 75: memory; Simplified Comorbidity Score (SCS): diabetes mellitus; albumin; and percentage vital capacity) were identified as independent predictive factors for severe postoperative complications (odds ratio = 2.73, 1.86, 1.54, 1.66 and 1.61, respectively) through univariate and multivariate analyses. A 5-fold cross-validation was performed as an internal validation to reconfirm these 5 predictive factors (average area under the curve 0.70). We developed a simplified RS system as follows: RS = 3 (gender: male) + 2 (comprehensive geriatric assessment 75: memory: yes) + 2 (albumin: <3.8 ng/ml) + 1 (percentage vital capacity: ≤90) + 1 (SCS: diabetes mellitus: yes). The current series shows that octogenarians can be successfully treated for lung cancer with surgical resection with an acceptable rate of severe complications and mortality. We propose a simplified RS system to predict severe complications in octogenarian patients with medically operable lung cancer. JACS1303 (UMIN000016756).
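The simplified RS formula quoted in the abstract is a plain weighted sum, which the sketch below transcribes. The function and parameter names are hypothetical, the albumin cut-off and its unit are copied as printed in the abstract, and the meaning of the "memory: yes" flag is taken at face value rather than interpreted; this is an illustration, not a clinical tool.

```python
def jacs1303_risk_score(
    male: bool,
    cga75_memory_yes: bool,   # "comprehensive geriatric assessment 75: memory: yes"
    albumin: float,           # same unit as the abstract's 3.8 cut-off
    percent_vital_capacity: float,
    diabetes_mellitus: bool,  # SCS: diabetes mellitus: yes
) -> int:
    """Sum the five weighted items of the simplified RS (range 0 to 9)."""
    score = 0
    score += 3 if male else 0
    score += 2 if cga75_memory_yes else 0
    score += 2 if albumin < 3.8 else 0
    score += 1 if percent_vital_capacity <= 90 else 0
    score += 1 if diabetes_mellitus else 0
    return score
```

The abstract does not state which RS values count as high risk, so the sketch returns the raw total and leaves any threshold to the reader.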
Development and Validation of a Model to Determine Risk of Progression of Barrett's Esophagus to Neoplasia.

PubMed

Parasa, Sravanthi; Vennalaganti, Sreekar; Gaddam, Srinivas; Vennalaganti, Prashanth; Young, Patrick; Gupta, Neil; Thota, Prashanthi; Cash, Brooks; Mathur, Sharad; Sampliner, Richard; Moawad, Fouad; Lieberman, David; Bansal, Ajay; Kennedy, Kevin F; Vargo, John; Falk, Gary; Spaander, Manon; Bruno, Marco; Sharma, Prateek

2018-04-01

A system is needed to determine the risk of patients with Barrett's esophagus for progression to high-grade dysplasia (HGD) and esophageal adenocarcinoma (EAC). We developed and validated a model to determine the risk of progression to HGD or EAC in patients with BE, based on demographic data and endoscopic and histologic findings at the time of index endoscopy. We performed a longitudinal study of patients with BE at 5 centers in the United States and 1 center in the Netherlands enrolled in the Barrett's Esophagus Study database from 1985 through 2014. Patients were excluded from the analysis if they had less than 1 year of follow-up, were diagnosed with HGD or EAC within the past year, were missing baseline histologic data, or had no intestinal metaplasia. Seventy percent of the patients were used to derive the model and 30% were used for the validation study. The primary outcome was development of HGD or EAC during the follow-up period (median, 5.9 years). Survival analysis was performed using the Kaplan-Meier method. We assigned a specific number of points to each BE risk factor, and point totals (scores) were used to create categories of low, intermediate, and high risk. We used Cox regression to compute hazard ratios and 95% confidence intervals to determine associations between risk of progression and scores. Of 4584 patients in the database, 2697 were included in our analysis (84.1% men; 87.6% Caucasian; mean age, 55.4 ± 20.1 years; mean body mass index, 27.9 ± 5.5 kg/m²; mean length of BE, 3.7 ± 3.2 cm). During the follow-up period, 154 patients (5.7%) developed HGD or EAC, with an annual rate of progression of 0.95%. Male sex, smoking, length of BE, and baseline-confirmed low-grade dysplasia were significantly associated with progression. Scores assigned identified patients with BE that progressed to HGD or EAC with a c-statistic of 0.76 (95% confidence interval, 0.72-0.80; P < .001). The calibration slope was 0.9966 (P = .99), determined from the validation cohort.
We developed a scoring system (Progression in Barrett's Esophagus score) based on male sex, smoking, length of BE, and baseline low-grade dysplasia that identified patients with BE at low, intermediate, and high risk for HGD or EAC. This scoring system might be used in management of patients. Copyright © 2018 AGA Institute. Published by Elsevier Inc. All rights reserved.
Explicating Validity

ERIC Educational Resources Information Center

Kane, Michael T.

2016-01-01

How we choose to use a term depends on what we want to do with it. If "validity" is to be used to support a score interpretation, validation would require an analysis of the plausibility of that interpretation. If validity is to be used to support score uses, validation would require an analysis of the appropriateness of the proposed…
Timely diagnosis of dairy calf respiratory disease using a standardized scoring system.

PubMed

McGuirk, Sheila M; Peek, Simon F

2014-12-01

Respiratory disease of young dairy calves is a significant cause of morbidity, mortality, economic loss, and animal welfare concern but there is no gold standard diagnostic test for antemortem diagnosis. Clinical signs typically used to make a diagnosis of respiratory disease of calves are fever, cough, ocular or nasal discharge, abnormal breathing, and auscultation of abnormal lung sounds. Unfortunately, routine screening of calves for respiratory disease on the farm is rarely performed and until more comprehensive, practical and affordable respiratory disease-screening tools such as accelerometers, pedometers, appetite monitors, feed consumption detection systems, remote temperature recording devices, radiant heat detectors, electronic stethoscopes, and thoracic ultrasound are validated, timely diagnosis of respiratory disease can be facilitated using a standardized scoring system. We have developed a scoring system that attributes severity scores to each of four clinical parameters; rectal temperature, cough, nasal discharge, ocular discharge or ear position. A total respiratory score of five points or higher (provided that at least two abnormal parameters are observed) can be used to distinguish affected from unaffected calves. This can be applied as a screening tool twice-weekly to identify pre-weaned calves with respiratory disease thereby facilitating early detection. Coupled with effective treatment protocols, this scoring system will reduce post-weaning pneumonia, chronic pneumonia, and otitis media.
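The decision rule described above (a total of five points or more, with at least two abnormal parameters) can be written as a short check. This is a sketch under stated assumptions: the abstract does not give the per-parameter severity scale, so the code assumes a score of 0 means "normal" and anything above 0 means "abnormal"; the function and key names are hypothetical.

```python
def calf_flagged(scores: dict, threshold: int = 5) -> bool:
    """Return True if a calf screens positive under the rule in the abstract.

    `scores` maps each of the four clinical parameters to its severity
    score. Assumption: 0 is normal, any positive value is abnormal.
    """
    total = sum(scores.values())
    abnormal = sum(1 for value in scores.values() if value > 0)
    return total >= threshold and abnormal >= 2


example = {
    "rectal_temperature": 3,
    "cough": 2,
    "nasal_discharge": 0,
    "ocular_discharge_or_ear_position": 0,
}
flagged = calf_flagged(example)
```

Keeping the two conditions separate mirrors the abstract's proviso that a high total driven by a single parameter is not enough on its own.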
Reliability and validity of the Microsoft Kinect for evaluating static foot posture

PubMed Central

2013-01-01

Background The evaluation of foot posture in a clinical setting is useful to screen for potential injury, however disagreement remains as to which method has the greatest clinical utility. An inexpensive and widely available imaging system, the Microsoft Kinect™, may possess the characteristics to objectively evaluate static foot posture in a clinical setting with high accuracy. The aim of this study was to assess the intra-rater reliability and validity of this system for assessing static foot posture. Methods Three measures were used to assess static foot posture; traditional visual observation using the Foot Posture Index (FPI), a 3D motion analysis (3DMA) system and software designed to collect and analyse image and depth data from the Kinect. Spearman’s rho was used to assess intra-rater reliability and concurrent validity of the Kinect to evaluate foot posture, and a linear regression was used to examine the ability of the Kinect to predict total visual FPI score. Results The Kinect demonstrated moderate to good intra-rater reliability for four FPI items of foot posture (ρ = 0.62 to 0.78) and moderate to good correlations with the 3DMA system for four items of foot posture (ρ = 0.51 to 0.85). In contrast, intra-rater reliability of visual FPI items was poor to moderate (ρ = 0.17 to 0.63), and correlations with the Kinect and 3DMA systems were poor (absolute ρ = 0.01 to 0.44). Kinect FPI items with moderate to good reliability predicted 61% of the variance in total visual FPI score. Conclusions The majority of the foot posture items derived using the Kinect were more reliable than the traditional visual assessment of FPI, and were valid when compared to a 3DMA system. Individual foot posture items recorded using the Kinect were also shown to predict a moderate degree of variance in the total visual FPI score. Combined, these results support the future potential of the Kinect to accurately evaluate static foot posture in a clinical setting. PMID:23566934
A validation study on the traditional Chinese version of Spinal Appearance Questionnaire for adolescent idiopathic scoliosis.

PubMed

Guo, Jing; Lau, Ajax Hong Yin; Chau, Jack; Ng, Bobby Kin Wah; Lee, Kwong Man; Qiu, Yong; Cheng, Jack Chun Yiu; Lam, Tsz Ping

2016-10-01

"Simplified Chinese" version of Spinal Appearance Questionnaire (SC-SAQ) for patients with adolescent idiopathic scoliosis (AIS) was available but did not fit for communities using "Traditional Chinese" as their primary language. We developed a traditional Chinese version of SAQ (TC-SAQ) and evaluated its reliability and validity. TC-SAQ was administered to 112 AIS patients, of which 101 bilingual (English and Chinese) patients completed E-SAQ and the traditional Chinese version of Scoliosis Research Society-22 questionnaire (TC-SRS-22). Internal consistency and test-retest reliability were evaluated. Concurrent validity was evaluated by comparing TC-SAQ score with E-SAQ score, and convergent validity by comparing TC-SAQ score with TC-SRS-22 self-image domain score, and discriminant validity by analyzing the relationship between TC-SAQ score and patients' characteristics. Internal consistency of individual TC-SAQ domain was high (Cronbach's α = 0.785 to 0.940), except for general (Cronbach's α = 0.665) and shoulders (Cronbach's α = 0.421) domain. Test-retest reliability of TC-SAQ was good (ICCs of each domain from 0.798 to 0.865). Concurrent validity demonstrated an excellent correlation between TC-SAQ and E-SAQ scores (r = 0.820 to 0.954, P < 0.0001 for all domains). Correlation between TC-SAQ domains and TC-SRS-22 self-image domain was weak to moderate. TC-SAQ total score and individual domain scores (except waist and chest domains) were positively correlated to major curve magnitude. TC-SAQ had good internal consistency and test-retest reliability. Concurrent validity evaluated against the original English version was excellent. TC-SAQ was both reliable and valid for clinical use for AIS patients using traditional Chinese as their primary language.
Proposing Melasma Severity Index: A New, More Practical, Office-based Scoring System for Assessing the Severity of Melasma

PubMed Central

Majid, Imran; Haq, Inaamul; Imran, Saher; Keen, Abid; Aziz, Khalid; Arif, Tasleem

2016-01-01

Background: Melasma Area and Severity Index (MASI), the scoring system in melasma, needs to be refined. Aims and Objectives: To propose a more practical scoring system, named as Melasma Severity Index (MSI), for assessing the disease severity and treatment response in melasma. Materials and Methods: Four dermatologists were trained to calculate MASI and also the proposed MSI scores. For MSI, the formula used was $\mathrm{MSI} = 0.4\,(a \times p^2)_l + 0.4\,(a \times p^2)_r + 0.2\,(a \times p^2)_n$, where “a” stands for area, “p” for pigmentation, “l” for left face, “r” for right face, and “n” for nose. On a single day, 30 enrolled patients were randomly examined by each trained dermatologist and their MASI and MSI scores were calculated. Next, each rater re-examined every 6th patient for repeat MASI and MSI scoring to assess intra- and inter-rater reliability of MASI and MSI scores. Validity was assessed by comparing the individual scores of each rater with objective data from mexameter and ImageJ software. Results: Inter-rater reliability, as assessed by intraclass correlation coefficient, was significantly higher for MSI (0.955) as compared to MASI (0.816). Correlation of scores with objective data by Spearman's correlation revealed higher rho values for MSI than for MASI for all raters. Limitations: Sample population belonged to a single ethnic group. Conclusions: MSI is a simpler and more practical scoring system for melasma. PMID:26955093
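The MSI formula above is a weighted sum over three facial regions, and a short sketch makes the arithmetic concrete. It reads the formula's pigmentation term as p squared and leaves the rating scales for area and pigmentation unspecified, since the abstract does not state them; the function name and the (area, pigmentation) tuple layout are assumptions for illustration.

```python
def melasma_severity_index(left: tuple, right: tuple, nose: tuple) -> float:
    """Compute MSI = 0.4*(a*p^2)_l + 0.4*(a*p^2)_r + 0.2*(a*p^2)_n.

    Each argument is an (area, pigmentation) pair for one region.
    """
    def region_term(area: float, pigmentation: float) -> float:
        # Pigmentation is squared, so darker lesions weigh more than larger ones.
        return area * pigmentation ** 2

    return (
        0.4 * region_term(*left)
        + 0.4 * region_term(*right)
        + 0.2 * region_term(*nose)
    )
```

For instance, ratings of (2, 3) on the left face, (0, 0) on the right, and (1, 2) on the nose give 0.4 × 18 + 0 + 0.2 × 4 = 8.0.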
The validity and reliability of the Thai version of the Kujala score for patients with patellofemoral pain syndrome.

PubMed

Apivatgaroon, Adinun; Angthong, Chayanin; Sanguanjit, Prakasit; Chernchujit, Bancha

2016-10-01

To develop a Thai version of the Kujala score and show the evaluation of the validity and reliability of the score. The Thai version of the Kujala score was developed using the forward-backward translation protocol. The 49 PFPS patients answered the Thai version of questionnaires including the Kujala score, Short Form-36 (SF-36) and International Knee Documentation Committee (IKDC) Subjective Knee Form. The validity between the scores has been tested. The reliability was assessed using test-retest reliability and internal consistency. The Thai version of the Kujala score showed a good correlation with Thai IKDC Subjective Knee Form (Pearson's correlation coefficient; r = 0.74: p < 0.01) and moderate correlation with the Thai SF-36 subscales of physical component summary, total score and role physical (r = 0.586, 0.571 and 0.524, respectively: p < 0.01). The test-retest reliability was excellent with an intra-class correlation coefficient of 0.908 (p < 0.001; 95% CI [0.842-0.947]). The internal consistency was strong with Cronbach's alpha of 0.952 (p < 0.001). No floor and ceiling effects were observed. The Thai version of the Kujala score has shown good validity and reliability. This score can be effectively used for evaluating Thai patients with patellofemoral pain syndrome. Implications for Rehabilitation The Kujala score is a self-administered questionnaire for patients with patellofemoral pain syndrome (PFPS). The validity and reliability of the Thai version of Kujala are compatible with other versions (Turkish, Chinese and Persian version). The Thai version of Kujala has been shown to have validity and reliability in Thai PFPS patients and can be used for clinical evaluation and also in the research work.
Validity of the Family Asthma Management System Scale with an Urban African-American Sample

PubMed Central

Klinnert, Mary D.; Holsey, Chanda Nicole; McQuaid, Elizabeth L.

2011-01-01

Objective To examine the reliability and validity of the Family Asthma Management System Scale for low-income African-American children with poor asthma control and caregivers under stress. The FAMSS assesses eight aspects of asthma management from a family systems perspective. Methods Forty-three children, ages 8–13, and caregivers were interviewed with the FAMSS; caregivers completed measures of primary care quality, family functioning, parenting stress, and psychological distress. Children rated their relatedness with the caregiver, and demonstrated inhaler technique. Medical records were reviewed for dates of outpatient visits for asthma. Results The FAMSS demonstrated good internal consistency. Higher scores were associated with adequate inhaler technique, recent outpatient care, less parenting stress and better family functioning. Higher scores on the Collaborative Relationship with Provider subscale were associated with greater perceived primary care quality. Conclusions The FAMSS demonstrated relevant associations with asthma management criteria and family functioning for a low-income, African-American sample. PMID:19776230
Validation of the CORB75 (confusion, oxygen saturation, respiratory rate, blood pressure, and age ≥ 75 years) as a simpler pneumonia severity rule.

PubMed

Ochoa-Gondar, O; Vila-Corcoles, A; Rodriguez-Blanco, T; Hospital, I; Salsench, E; Ansa, X; Saun, N

2014-04-01

This study compares the ability of two simpler severity rules (classical CRB65 vs. proposed CORB75) in predicting short-term mortality in elderly patients with community-acquired pneumonia (CAP). A population-based study was undertaken involving 610 patients ≥ 65 years old with radiographically confirmed CAP diagnosed between 2008 and 2011 in Tarragona, Spain (350 cases in the derivation cohort, 260 cases in the validation cohort). Severity rules were calculated at the time of diagnosis, and 30-day mortality was considered as the dependent variable. The area under the receiver operating characteristic curves (AUC) was used to compare the discriminative power of the severity rules. Eighty deaths (46 in the derivation and 34 in the validation cohorts) were observed, which gives a mortality rate of 13.1 % (15.6 % for hospitalized and 3.3 % for outpatient cases). After multivariable analyses, besides CRB (confusion, respiration rate ≥ 30/min, systolic blood pressure <90 mmHg or diastolic ≤ 60 mmHg), peripheral oxygen saturation (≤ 90 %) and age ≥ 75 years appeared to be associated with increasing 30-day mortality in the derivation cohort. The model showed adequate calibration for the derivation and validation cohorts. A modified CORB75 scoring system (similar to the classical CRB65, but adding oxygen saturation and increasing the age to 75 years) was constructed. The AUC statistics for predicting mortality in the derivation and validation cohorts were 0.79 and 0.82, respectively. In the derivation cohort, a CORB75 score ≥ 2 showed 78.3 % sensitivity and 65.5 % specificity for mortality (in the validation cohort, these were 82.4 and 71.7 %, respectively). The proposed CORB75 scoring system has good discriminative power in predicting short-term mortality among elderly people with CAP, which supports its use for severity assessment of these patients in primary care.
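The CORB75 criteria listed in the abstract translate into a simple point count. The sketch below assumes one point per criterion, by analogy with the classical CRB65 that the abstract says it resembles; the abstract itself does not spell out the weights, and the function and parameter names are hypothetical. It is an illustration, not a clinical tool.

```python
def corb75_score(
    confusion: bool,
    spo2_percent: float,
    respiratory_rate: float,
    systolic_bp: float,
    diastolic_bp: float,
    age_years: float,
) -> int:
    """Count how many CORB75 criteria are met (0 to 5), one point each (assumed)."""
    criteria = [
        confusion,
        spo2_percent <= 90,                      # peripheral oxygen saturation
        respiratory_rate >= 30,                  # breaths per minute
        systolic_bp < 90 or diastolic_bp <= 60,  # mmHg
        age_years >= 75,
    ]
    return sum(criteria)


def corb75_high_risk(score: int) -> bool:
    """Apply the score >= 2 cut-off for which the abstract reports sensitivity and specificity."""
    return score >= 2
```

The cut-off is kept in its own function so that the raw score remains available for comparison with other severity rules.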
Validation of a Simple Score to Determine Risk of Early Rejection After Pediatric Heart Transplantation.

PubMed

Butts, Ryan J; Savage, Andrew J; Atz, Andrew M; Heal, Elisabeth M; Burnette, Ali L; Kavarana, Minoo M; Bradley, Scott M; Chowdhury, Shahryar M

2015-09-01

This study aimed to develop a reliable and feasible score to assess the risk of rejection in pediatric heart transplantation recipients during the first post-transplant year. The first post-transplant year is the most likely time for rejection to occur in pediatric heart transplantation. Rejection during this period is associated with worse outcomes. The United Network for Organ Sharing database was queried for pediatric patients (age <18 years) who underwent isolated orthotopic heart transplantation from January 1, 2000 to December 31, 2012. Transplantations were divided into a derivation cohort (n = 2,686) and a validation cohort (n = 509). The validation cohort was randomly selected from 20% of transplantations from 2005 to 2012. Covariates found to be associated with rejection (p < 0.2) were included in the initial multivariable logistic regression model. The final model was derived by including only variables independently associated with rejection. A risk score was then developed using relative magnitudes of the covariates' odds ratios. The score was then tested in the validation cohort. A 9-point risk score using 3 variables (age, cardiac diagnosis, and panel reactive antibody) was developed. Mean scores in the derivation and validation cohorts were 4.5 ± 2.6 and 4.8 ± 2.7, respectively. A higher score was associated with an increased rate of rejection (score = 0, 10.6% in the validation cohort vs. score = 9, 40%; p < 0.01). In weighted regression analysis, the model-predicted risk of rejection correlated closely with the actual rates of rejection in the validation cohort (R² = 0.86; p < 0.01). The rejection score is accurate in determining the risk of early rejection in pediatric heart transplantation recipients. The score has the potential to be used in clinical practice to aid in determining the immunosuppressant regimen and the frequency of rejection surveillance in the first post-transplant year.
Copyright © 2015 American College of Cardiology Foundation. Published by Elsevier Inc. All rights reserved.
The behavioral regulation in sport questionnaire (BRSQ): instrument development and initial validity evidence.

PubMed

Lonsdale, Chris; Hodge, Ken; Rose, Elaine A

2008-06-01

The purpose of the four studies described in this article was to develop and test a new measure of competitive sport participants' intrinsic motivation, extrinsic motivation, and amotivation (self-determination theory; Deci & Ryan, 1985). The items for the new measure, named the Behavioral Regulation in Sport Questionnaire (BRSQ), were constructed using interviews, expert review, and pilot testing. Analyses supported the internal consistency, test-retest reliability, and factorial validity of the BRSQ scores. Nomological validity evidence was also supportive, as BRSQ subscale scores were correlated in the expected pattern with scores derived from measures of motivational consequences. When directly compared with scores derived from the Sport Motivation Scale (SMS; Pelletier, Fortier, Vallerand, Tuson, & Blais, 1995) and a revised version of that questionnaire (SMS-6; Mallett, Kawabata, Newcombe, Otero-Forero, & Jackson, 2007), BRSQ scores demonstrated equal or superior reliability and factorial validity as well as better nomological validity.
Predicting Hemorrhagic Transformation of Acute Ischemic Stroke: Prospective Validation of the HeRS Score.

PubMed

Marsh, Elisabeth B; Llinas, Rafael H; Schneider, Andrea L C; Hillis, Argye E; Lawrence, Erin; Dziedzic, Peter; Gottesman, Rebecca F

2016-01-01

Hemorrhagic transformation (HT) increases the morbidity and mortality of ischemic stroke. Anticoagulation is often indicated in patients with atrial fibrillation, low ejection fraction, or mechanical valves who are hospitalized with acute stroke, but increases the risk of HT. Risk quantification would be useful. Prior studies have investigated risk of systemic hemorrhage in anticoagulated patients, but none looked specifically at HT. In our previously published work, age, infarct volume, and estimated glomerular filtration rate (eGFR) significantly predicted HT. We created the hemorrhage risk stratification (HeRS) score based on regression coefficients in multivariable modeling and now determine its validity in a prospectively followed inpatient cohort. A total of 241 consecutive patients presenting to 2 academic stroke centers with acute ischemic stroke and an indication for anticoagulation over a 2.75-year period were included. Neuroimaging was evaluated for infarct volume and HT. Hemorrhages were classified as symptomatic versus asymptomatic, and by severity. HeRS scores were calculated for each patient and compared to actual hemorrhage status using receiver operating curve analysis. Area under the curve (AUC) comparing predicted odds of hemorrhage (HeRS score) to actual hemorrhage status was 0.701. Serum glucose (P < 0.001), white blood cell count (P < 0.001), and warfarin use prior to admission (P = 0.002) were also associated with HT in the validation cohort. With these variables, AUC improved to 0.854. Anticoagulation did not significantly increase HT; but with higher intensity anticoagulation, hemorrhages were more likely to be symptomatic and more severe. The HeRS score is a valid predictor of HT in patients with ischemic stroke and indication for anticoagulation.
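The AUC used above to compare predicted risk with observed hemorrhage status can be computed without any plotting, using its rank interpretation: the probability that a randomly chosen patient with the event has a higher score than a randomly chosen patient without it. The sketch below is a generic illustration with a hypothetical function name and made-up inputs, not the study's analysis.

```python
def auc(scores: list, outcomes: list) -> float:
    """Area under the ROC curve via pairwise comparison (ties count one half).

    `scores` are predicted risks; `outcomes` are 1 (event) or 0 (no event).
    """
    positives = [s for s, y in zip(scores, outcomes) if y]
    negatives = [s for s, y in zip(scores, outcomes) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which puts the reported values of 0.701 and 0.854 in context.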
Development of a Valid and Reliable Knee Articular Cartilage Condition-Specific Study Methodological Quality Score.

PubMed

Harris, Joshua D; Erickson, Brandon J; Cvetanovich, Gregory L; Abrams, Geoffrey D; McCormick, Frank M; Gupta, Anil K; Verma, Nikhil N; Bach, Bernard R; Cole, Brian J

2014-02-01

Condition-specific questionnaires are important components in evaluation of outcomes of surgical interventions. No condition-specific study methodological quality questionnaire exists for evaluation of outcomes of articular cartilage surgery in the knee. To develop a reliable and valid knee articular cartilage-specific study methodological quality questionnaire. Cross-sectional study. A stepwise, a priori-designed framework was created for development of a novel questionnaire. Relevant items to the topic were identified and extracted from a recent systematic review of 194 investigations of knee articular cartilage surgery. In addition, relevant items from existing generic study methodological quality questionnaires were identified. Items for a preliminary questionnaire were generated. Redundant and irrelevant items were eliminated, and acceptable items modified. The instrument was pretested and items weighed. The instrument, the MARK score (Methodological quality of ARticular cartilage studies of the Knee), was tested for validity (criterion validity) and reliability (inter- and intraobserver). A 19-item, 3-domain MARK score was developed. The 100-point scale score demonstrated face validity (focus group of 8 orthopaedic surgeons) and criterion validity (strong correlation to Cochrane Quality Assessment score and Modified Coleman Methodology Score). Interobserver reliability for the overall score was good (intraclass correlation coefficient [ICC], 0.842), and for all individual items of the MARK score, acceptable to perfect (ICC, 0.70-1.000). Intraobserver reliability ICC assessed over a 3-week interval was strong for 2 reviewers (≥0.90). The MARK score is a valid and reliable knee articular cartilage condition-specific study methodological quality instrument. This condition-specific questionnaire may be used to evaluate the quality of studies reporting outcomes of articular cartilage surgery in the knee.
A user-friendly risk-score for predicting in-hospital cardiac arrest among patients admitted with suspected non ST-elevation acute coronary syndrome - The SAFER-score.

PubMed

Faxén, Jonas; Hall, Marlous; Gale, Chris P; Sundström, Johan; Lindahl, Bertil; Jernberg, Tomas; Szummer, Karolina

2017-12-01

To develop a simple risk-score model for predicting in-hospital cardiac arrest (CA) among patients hospitalized with suspected non-ST elevation acute coronary syndrome (NSTE-ACS). Using the Swedish Web-system for Enhancement and Development of Evidence-based care in Heart disease Evaluated According to Recommended Therapies (SWEDEHEART), we identified patients (n=242 303) admitted with suspected NSTE-ACS between 2008 and 2014. Logistic regression was used to assess the association between 26 candidate variables and in-hospital CA. A risk-score model was developed and validated using a temporal cohort (n=126 073) comprising patients from SWEDEHEART between 2005 and 2007 and an external cohort (n=276 109) comprising patients from the Myocardial Ischaemia National Audit Project (MINAP) between 2008 and 2013. The incidence of in-hospital CA for NSTE-ACS and non-ACS was lower in the SWEDEHEART-derivation cohort than in MINAP (1.3% and 0.5% vs. 2.3% and 2.3%). A seven point, five variable risk score (age ≥60 years (1 point), ST-T abnormalities (2 points), Killip Class >1 (1 point), heart rate <50 or ≥100bpm (1 point), and systolic blood pressure <100mmHg (2 points)) was developed. Model discrimination was good in the derivation cohort (c-statistic 0.72) and temporal validation cohort (c-statistic 0.74), and calibration was reasonable with a tendency towards overestimation of risk with a higher sum of score points. External validation showed moderate discrimination (c-statistic 0.65) and calibration showed a general underestimation of predicted risk. A simple points score containing five variables readily available on admission predicts in-hospital CA for patients with suspected NSTE-ACS. Copyright © 2017 Elsevier B.V. All rights reserved.
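The point assignments listed in this abstract can be summed mechanically. The sketch below is a minimal illustration that uses only the thresholds and points quoted above; the function and parameter names are hypothetical, and the mapping from a point total to an absolute risk is not given in the abstract, so it is not attempted here.

```python
def safer_score(age_years, st_t_abnormalities, killip_class,
                heart_rate_bpm, systolic_bp_mmhg):
    """Sum the five SAFER-score components as quoted in the abstract.
    Returns an integer from 0 to 7. Parameter names are illustrative."""
    points = 0
    if age_years >= 60:                              # age >= 60 years: 1 point
        points += 1
    if st_t_abnormalities:                           # ST-T abnormalities: 2 points
        points += 2
    if killip_class > 1:                             # Killip class > 1: 1 point
        points += 1
    if heart_rate_bpm < 50 or heart_rate_bpm >= 100: # heart rate <50 or >=100 bpm: 1 point
        points += 1
    if systolic_bp_mmhg < 100:                       # systolic BP < 100 mmHg: 2 points
        points += 2
    return points


# Hypothetical patient: 72 years, ST-T changes, Killip II, 110 bpm, 90 mmHg.
example_points = safer_score(72, True, 2, 110, 90)
```

A patient meeting every criterion reaches the maximum of seven points, consistent with the "seven point, five variable" description.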
Associations between teaching effectiveness scores and characteristics of presentations in hospital medicine continuing education.

PubMed

Ratelle, John T; Wittich, Christopher M; Yu, Roger C; Newman, James S; Jenkins, Sarah M; Beckman, Thomas J

2015-09-01

There is little research regarding characteristics of effective continuing medical education (CME) presentations in hospital medicine (HM). Therefore, we sought to identify associations between validated CME teaching effectiveness scores and characteristics of CME presentations in the field of HM. This was a cross-sectional study of participants and didactic presentations from a national HM CME course in 2014. Participants provided CME teaching effectiveness (CMETE) ratings using an instrument with known validity evidence. Overall CMETE scores (5-point scale: 1 = strongly disagree; 5 = strongly agree) were averaged for each presentation, and associations between scores and presentation characteristics were determined using the Kruskal-Wallis test. The threshold for statistical significance was set at P < 0.05. A total of 277 out of 368 participants (75.3%) completed evaluations for the 32 presentations. CMETE scores (mean [standard deviation]) were significantly associated with the use of audience response (4.64 [0.16]) versus no audience response (4.49 [0.16]; P = 0.01), longer presentations (≥30 minutes: 4.67 [0.13] vs <30 minutes: 4.51 [0.18]; P = 0.02), and larger number of slides (≥50: 4.66 [0.17] vs <50: 4.55 [0.17]; P = 0.04). There were no significant associations between CMETE scores and use of clinical cases, defined goals, or summary slides. To our knowledge, this is the first study regarding associations between validated teaching effectiveness scores and characteristics of effective CME presentations in HM. Our findings, which support previous research in other fields, indicate that CME presentations may be improved by increasing interactivity through the use of audience response systems and allowing longer presentations. © 2015 Society of Hospital Medicine.
Five year experience in management of perforated peptic ulcer and validation of common mortality risk prediction models - are existing models sufficient? A retrospective cohort study.

PubMed

Anbalakan, K; Chua, D; Pandya, G J; Shelat, V G

2015-02-01

Emergency surgery for perforated peptic ulcer (PPU) is associated with significant morbidity and mortality. Accurate and early risk stratification is important. The primary aim of this study is to validate the various existing mortality risk prediction models (MRPMs), and the secondary aim is to audit our experience of managing PPU. 332 patients who underwent emergency surgery for PPU at a single institution from January 2008 to December 2012 were studied. Clinical and operative details were collected. Four MRPMs: American Society of Anesthesiology (ASA) score, Boey's score, Mannheim peritonitis index (MPI) and Peptic ulcer perforation (PULP) score were validated. Median age was 54.7 years (range 17-109 years) with male predominance (82.5%). 61.7% presented within 24 h of onset of abdominal pain. Median length of stay was 7 days (range 2-137 days). Intra-abdominal collection, leakage, re-operation and 30-day mortality rates were 8.1%, 2.1%, 1.2% and 7.2% respectively. All the four MRPMs predicted intra-abdominal collection and mortality; however, only MPI predicted leak (p = 0.01) and re-operation (p = 0.02) rates. The area under curve for predicting mortality was 75%, 72%, 77.2% and 75% for ASA score, Boey's score, MPI and PULP score respectively. Emergency surgery for PPU has low morbidity and mortality in our experience. MPI is the only scoring system which predicts all four outcomes: intra-abdominal collection, leak, reoperation and mortality. All four MRPMs had a similar and fair accuracy to predict mortality; however, due to geographic and demographic diversity and inherent weaknesses of existing MRPMs, the quest for development of an ideal model should continue. Copyright © 2015 Surgical Associates Ltd. Published by Elsevier Ltd. All rights reserved.
Evaluation of community-acquired sepsis by PIRO system in the emergency department.

PubMed

Chen, Yun-Xia; Li, Chun-Sheng

2013-09-01

The predisposition, infection/insult, response, and organ dysfunction (PIRO) staging system for septic patients allows grouping of heterogeneous patients into homogeneous subgroups. The purposes of this single-center, prospective, observational cohort study were to create a PIRO system for patients with community-acquired sepsis (CAS) presenting to the emergency department (ED) and assess its prognostic and stratification capabilities. Septic patients were enrolled and allocated to derivation (n = 831) or validation (n = 860) cohorts according to their enrollment dates. The derivation cohort was used to identify independent predictors of mortality and create a PIRO system by binary logistic regression analysis, and the prognostic performance of PIRO was investigated in the validation cohort by receiver operator characteristic (ROC) curve. Ten independent predictors of 28-day mortality were identified. The PIRO system combined the components of predisposition (age, chronic obstructive pulmonary disease, hypoalbuminemia), infection (central nervous system infection), response (temperature, procalcitonin), and organ dysfunction (brain natriuretic peptide, troponin I, mean arterial pressure, Glasgow coma scale score). The area under the ROC of PIRO was 0.833 for the derivation cohort and 0.813 for the validation cohort. There was a stepwise increase in 28-day mortality with increasing PIRO score and the differences between the low- (PIRO 0-10), intermediate- (11-20), and high- (>20) risk groups were very significant in both cohorts (p < 0.01). The present study demonstrates that this PIRO system is valuable for prognosis and risk stratification in patients with CAS in the ED.
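The three risk strata quoted in this abstract (PIRO 0-10, 11-20, >20) amount to a simple threshold rule. The sketch below encodes only those cut-offs; the weights of the ten individual predictors are not given in the abstract, so the total score is taken as an input, and the function name is hypothetical.

```python
def piro_risk_group(piro_score):
    """Map a total PIRO score to the risk strata quoted in the abstract:
    low (0-10), intermediate (11-20), high (>20)."""
    if piro_score < 0:
        raise ValueError("PIRO score cannot be negative")
    if piro_score <= 10:
        return "low"
    if piro_score <= 20:
        return "intermediate"
    return "high"


# Hypothetical totals for three patients.
example_groups = [piro_risk_group(s) for s in (4, 15, 27)]
```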
Quality-of-life in insect venom allergy: validation of the Turkish version of the "Vespid Allergy Quality of Life Questionnaire" (VQLQ-T).

PubMed

Sin, Betül Ayşe; Öztuna, Derya; Gelincik, Aslı; Gürlek, Feridun; Baysan, Abdullah; Sin, Aytül Zerrin; Aydın, Ömür; Mısırlıgil, Zeynep

2016-01-01

"Vespid Allergy Quality of Life Questionnaire (VQLQ)" has been used to assess psychological burden of disease. The aim of this study was to evaluate validity, reliability and responsiveness to interventions of the Turkish version. The Turkish language Questionnaire (VQLQ-T) was administered to 81 patients with bee allergy and 65 patients with vespid allergy from different groups to achieve cross-sectional validation. To establish longitudinal validity, the questionnaire was administered to 36 patients treated with venom immunotherapy. The cross-sectional validation in patients with vespid venom allergy showed a correlation coefficient of 0.97 (Cronbach α). Spearman's correlation coefficient of the pretreatment VQLQ-T score with Expectation of Outcome (EoO) questionnaire score was 0.55 (p < 0.001). After treatment, correlation between VQLQ-T score and EoO score was 0.64 (p = 0.003) in these patients. The cross-sectional instrument validation for non-beekeepers with bee venom allergy yielded a correlation coefficient of 0.96 (Cronbach α). Spearman's correlation coefficient between pretreatment VQLQ-T score and EoO score was 0.47 (p < 0.001) and after treatment, correlation between VQLQ-T score and EoO score was 0.78 (p = 0.008) in these patients. These findings indicate cross-sectional validity of VQLQ-T. In the longitudinal validation, there was a positive correlation between EoO and VQLQ-T with a correlation coefficient of 0.562 (p < 0.001). While mean (±SD) VQLQ-T score was 5.27 (±1.29) in pretreatment, it was 2.78 (±1.01) after treatment (p < 0.001). The correlation between the mean change in VQLQ-T score and the mean change in EoO score was 0.42 (p = 0.011). The Turkish version of VQLQ-T enables measurement of Quality of Life (QoL) in patients with either vespid or bee venom allergy. Furthermore, responsiveness of this instrument demonstrates the questionnaire's ability to detect changes over time.
Cataract surgeons outperform medical students in Eyesi virtual reality cataract surgery: evidence for construct validity.

PubMed

Selvander, Madeleine; Asman, Peter

2013-08-01

To investigate construct validity for modules hydromaneuvers and phaco on the Eyesi surgical simulator. Seven cataract surgeons and 17 medical students performed capsulorhexis, hydromaneuvers, phaco, navigation, forceps, cracking and chopping modules in a standardized manner. Three trials were performed on each module (two on phaco) in the above order. Performance parameters as calculated by the simulator for each trial were saved. Video recordings of the second trial of the modules capsulorhexis, hydromaneuvers and phaco were evaluated with the modified Objective Structured Assessment of Surgical Skill (OSATS) and Objective Structured Assessment of Cataract Surgical Skill (OSACSS) tools. Cataract surgeons outperformed medical students with regard to overall score on capsulorhexis (p < 0.001, p = 0.035, p = 0.010 for the three iterations, respectively), navigation (p = 0.024, p = 0.307, p = 0.007), forceps (p = 0.017, p = 0.03, p = 0.028). Less obvious differences in overall score were found for modules cracking and chopping (p = 0.266, p = 0.022, p = 0.324) and phaco (p = 0.011, p = 0.081 for the two iterations, respectively). No differences in overall score were found on hydromaneuvers (p = 0.588, p = 0.503, p = 0.773), but surgeons received better scores from the evaluations of the modified OSATS (p = 0.001) and OSACSS (capsulorhexis, p = 0.003; hydromaneuvers, p = 0.017; phaco, p = 0.001). Construct validity was found on several modules previously not investigated (phaco, hydromaneuvers, cracking and chopping, navigation), and our results confirm previously demonstrated construct validity for capsulorhexis and forceps modules. Interestingly, validation of the hydromaneuvers module required OSACSS video evaluation tool. A further development of the scoring system in the simulator for the hydromaneuvers module would be advantageous and make training and evaluation of progress more accessible and immediate. © 2012 The Authors. Acta Ophthalmologica © 2012 Acta Ophthalmologica Scandinavica Foundation.

Clinical audit project in undergraduate medical education curriculum: an assessment validation study

PubMed Central

Steketee, Carole; Mak, Donna

2016-01-01

Objectives To evaluate the merit of the Clinical Audit Project (CAP) in an assessment program for undergraduate medical education using a systematic assessment validation framework. Methods A cross-sectional assessment validation study at one medical school in Western Australia, with retrospective qualitative analysis of the design, development, implementation and outcomes of the CAP, and quantitative analysis of assessment data from four cohorts of medical students (2011-2014). Results The CAP is fit for purpose with clear external and internal alignment to expected medical graduate outcomes. Substantive validity in students’ and examiners’ response processes is ensured through relevant methodological and cognitive processes. Multiple validity features are built into the design, planning and implementation process of the CAP. There is evidence of high internal consistency reliability of CAP scores (Cronbach’s alpha > 0.8) and inter-examiner consistency reliability (intra-class correlation > 0.7). Aggregation of CAP scores is psychometrically sound, with high internal consistency indicating one common underlying construct. Significant but moderate correlations between CAP scores and scores from other assessment modalities indicate validity of extrapolation and alignment between the CAP and the overall target outcomes of medical graduates. Standard setting, score equating and fair decision rules justify consequential validity of CAP scores interpretation and use. Conclusions This study provides evidence demonstrating that the CAP is a meaningful and valid component in the assessment program. This systematic framework of validation can be adopted for all levels of assessment in medical education, from individual assessment modality, to the validation of an assessment program as a whole. PMID:27716612
Clinical audit project in undergraduate medical education curriculum: an assessment validation study.

PubMed

Tor, Elina; Steketee, Carole; Mak, Donna

2016-09-24

To evaluate the merit of the Clinical Audit Project (CAP) in an assessment program for undergraduate medical education using a systematic assessment validation framework. A cross-sectional assessment validation study at one medical school in Western Australia, with retrospective qualitative analysis of the design, development, implementation and outcomes of the CAP, and quantitative analysis of assessment data from four cohorts of medical students (2011-2014). The CAP is fit for purpose with clear external and internal alignment to expected medical graduate outcomes. Substantive validity in students' and examiners' response processes is ensured through relevant methodological and cognitive processes. Multiple validity features are built into the design, planning and implementation process of the CAP. There is evidence of high internal consistency reliability of CAP scores (Cronbach's alpha > 0.8) and inter-examiner consistency reliability (intra-class correlation > 0.7). Aggregation of CAP scores is psychometrically sound, with high internal consistency indicating one common underlying construct. Significant but moderate correlations between CAP scores and scores from other assessment modalities indicate validity of extrapolation and alignment between the CAP and the overall target outcomes of medical graduates. Standard setting, score equating and fair decision rules justify consequential validity of CAP scores interpretation and use. This study provides evidence demonstrating that the CAP is a meaningful and valid component in the assessment program. This systematic framework of validation can be adopted for all levels of assessment in medical education, from individual assessment modality, to the validation of an assessment program as a whole.
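Several records in this listing, including this one, report internal consistency as Cronbach's alpha. As a reminder of what that statistic measures, the sketch below implements the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), using sample variances. It is a generic illustration, not the authors' analysis; the function name and the sample data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer all items")
    item_variances = [variance(col) for col in zip(*rows)]  # one per item
    total_variance = variance([sum(r) for r in rows])       # variance of sum scores
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: 4 respondents x 3 items.
example_alpha = cronbach_alpha([[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3]])
```

Alpha approaches 1 when items rise and fall together across respondents, which is why a high value is read as evidence of a single underlying construct.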
Reliability and validity of the visual analogue scale for disability in patients with chronic musculoskeletal pain.

PubMed

Boonstra, Anne M; Schiphorst Preuper, Henrica R; Reneman, Michiel F; Posthumus, Jitze B; Stewart, Roy E

2008-06-01

The objective of the study was to determine the reliability and concurrent validity of a visual analogue scale (VAS) for disability as a single-item instrument measuring disability in chronic pain patients. A test-retest design was used for the reliability study and a cross-sectional design for the validity study. The setting was a general rehabilitation centre and a university rehabilitation centre. The study population consisted of patients over 18 years of age, suffering from chronic musculoskeletal pain; 52 patients in the reliability study, 344 patients in the validity study. Main outcome measures were as follows. Reliability study: Spearman's correlation coefficients (rho values) of the test and retest data of the VAS for disability; validity study: rho values of the VAS disability scores with the scores on four domains of the Short-Form Health Survey (SF-36) and VAS pain scores, and with Roland-Morris Disability Questionnaire scores in chronic low back pain patients. Results were as follows: in the reliability study rho values varied from 0.60 to 0.77; and in the validity study rho values of VAS disability scores with SF-36 domain scores varied from 0.16 to 0.51, with Roland-Morris Disability Questionnaire scores from 0.38 to 0.43 and with VAS pain scores from 0.76 to 0.84. The conclusion of the study was that the reliability of the VAS for disability is moderate to good. Because of a weak correlation with other disability instruments and a strong correlation with the VAS for pain, however, its validity is questionable.
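The test-retest statistic used in this record, Spearman's rho, is the Pearson correlation of the ranks of the two measurement occasions. The sketch below is a generic, dependency-free illustration that assigns average ranks to ties; it is not the authors' code, and the function names and sample VAS readings are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical VAS disability readings (mm) at test and retest.
example_rho = spearman_rho([12, 35, 50, 68, 80], [15, 30, 55, 60, 85])
```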
The stroke impairment assessment set: its internal consistency and predictive validity.

PubMed

Tsuji, T; Liu, M; Sonoda, S; Domen, K; Chino, N

2000-07-01

To study the scale quality and predictive validity of the Stroke Impairment Assessment Set (SIAS) developed for stroke outcome research. Rasch analysis of the SIAS; stepwise multiple regression analysis to predict discharge functional independence measure (FIM) raw scores from demographic data, the SIAS scores, and the admission FIM scores; cross-validation of the prediction rule. Tertiary rehabilitation center in Japan. One hundred ninety stroke inpatients for the study of the scale quality and the predictive validity; a second sample of 116 stroke inpatients for the cross-validation study. Mean square fit statistics to study the degree of fit to the unidimensional model; logits to express item difficulties; discharge FIM scores for the study of predictive validity. The degree of misfit was acceptable except for the shoulder range of motion (ROM), pain, visuospatial function, and speech items; and the SIAS items could be arranged on a common unidimensional scale. The difficulty patterns were identical at admission and at discharge except for the deep tendon reflexes, ROM, and pain items. They were also similar for the right- and left-sided brain lesion groups except for the speech and visuospatial items. For the prediction of the discharge FIM scores, the independent variables selected were age, the SIAS total scores, and the admission FIM scores; and the adjusted R² was .64 (p < .0001). Stability of the predictive equation was confirmed in the cross-validation sample (R² = .68, p < .001). The unidimensionality of the SIAS was confirmed, and the SIAS total scores proved useful for stroke outcome prediction.
Performance of the inFLUenza Patient-Reported Outcome (FLU-PRO) diary in patients with influenza-like illness (ILI)

PubMed Central

Bacci, Elizabeth D.; Leidy, Nancy K.; Poon, Jiat-Ling; Stringer, Sonja; Memoli, Matthew J.; Han, Alison; Fairchok, Mary P.; Coles, Christian; Owens, Jackie; Chen, Wei-Ju; Arnold, John C.; Danaher, Patrick J.; Lalani, Tahaniyat; Burgess, Timothy H.; Millar, Eugene V.; Ridore, Michelande; Hernández, Andrés; Rodríguez-Zulueta, Patricia; Ortega-Gallegos, Hilda; Galindo-Fraga, Arturo; Ruiz-Palacios, Guillermo M.; Pett, Sarah; Fischer, William; Gillor, Daniel; Moreno Macias, Laura; DuVal, Anna; Rothman, Richard; Dugas, Andrea; Guerrero, M. Lourdes

2018-01-01

Background The inFLUenza Patient Reported Outcome (FLU-PRO) measure is a daily diary assessing signs/symptoms of influenza across six body systems: Nose, Throat, Eyes, Chest/Respiratory, Gastrointestinal, Body/Systemic, developed and tested in adults with influenza. Objectives This study tested the reliability, validity, and responsiveness of FLU-PRO scores in adults with influenza-like illness (ILI). Methods Data from the prospective, observational study used to develop and test the FLU-PRO in influenza virus positive patients were analyzed. Adults (≥18 years) presenting with influenza symptoms in outpatient settings in the US, UK, Mexico, and South America were enrolled, tested for influenza virus, and asked to complete the 37-item draft FLU-PRO daily for up to 14-days. Analyses were performed on data from patients testing negative. Reliability of the final, 32-item FLU-PRO was estimated using Cronbach’s alpha (α; Day 1) and intraclass correlation coefficients (ICC; 2-day reproducibility). Convergent and known-groups validity were assessed using patient global assessments of influenza severity (PGA). Patient report of return to usual health was used to assess responsiveness (Day 1–7). Results The analytical sample included 220 ILI patients (mean age = 39.3, 64.1% female, 88.6% white). Sixty-one (28%) were hospitalized at some point in their illness. Internal consistency reliability (α) of FLU-PRO Total score was 0.90 and ranged from 0.72–0.86 for domain scores. Reproducibility (Day 1–2) was 0.64 for Total, ranging from 0.46–0.78 for domain scores. Day 1 FLU-PRO scores correlated (≥0.30) with the PGA (except Gastrointestinal) and were significantly different across PGA severity groups (Total: F = 81.7, p<0.001; subscales: F = 6.9–62.2; p<0.01). Mean score improvements Day 1–7 were significantly greater in patients reporting return to usual health compared with those who did not (p<0.05, Total and subscales, except Gastrointestinal and Eyes). 
Conclusions Results suggest FLU-PRO scores are reliable, valid, and responsive in adults with influenza-like illness. PMID:29566007
Validity and Reliability of Nintendo Wii Fit Balance Scores

PubMed Central

Wikstrom, Erik A.

2012-01-01

Context: Interactive gaming systems have the potential to help rehabilitate patients with musculoskeletal conditions. The Nintendo Wii Balance Board, which is part of the Wii Fit game, could be an effective tool to monitor progress during rehabilitation because the board and game can provide objective measures of balance. However, the validity and reliability of Wii Fit balance scores remain unknown. Objective: To determine the concurrent validity of balance scores produced by the Wii Fit game and the intrasession and intersession reliability of Wii Fit balance scores. Design: Descriptive laboratory study. Setting: Sports medicine research laboratory. Patients or Other Participants: Forty-five recreationally active participants (age = 27.0 ± 9.8 years, height = 170.9 ± 9.2 cm, mass = 72.4 ± 11.8 kg) with a heterogeneous history of lower extremity injury. Intervention(s): Participants completed a single-limb–stance task on a force plate and the Star Excursion Balance Test (SEBT) during the first test session. Twelve Wii Fit balance activities were completed during 2 test sessions separated by 1 week. Main Outcome Measure(s): Postural sway in the anteroposterior (AP) and mediolateral (ML) directions and the AP, ML, and resultant center-of-pressure (COP) excursions were calculated from the single-limb stance. The normalized reach distance was recorded for the anterior, posteromedial, and posterolateral directions of the SEBT. Wii Fit balance scores that the game software generated also were recorded. Results: All 96 of the calculated correlation coefficients among Wii Fit activity outcomes and established balance outcomes were interpreted as poor (r < 0.50). Intrasession reliability for Wii Fit balance activity scores ranged from good (intraclass correlation coefficient [ICC] = 0.80) to poor (ICC = 0.39), with 8 activities having poor intrasession reliability. 
Similarly, 11 of the 12 Wii Fit balance activity scores demonstrated poor intersession reliability, with scores ranging from fair (ICC = 0.74) to poor (ICC = 0.29). Conclusions: Wii Fit balance activity scores had poor concurrent validity relative to COP outcomes and SEBT reach distances. In addition, the included Wii Fit balance activity scores generally had poor intrasession and intersession reliability. PMID:22892412
Development and validation of a scale for mouth handicap in systemic sclerosis: the Mouth Handicap in Systemic Sclerosis scale

PubMed Central

Mouthon, L; Rannou, F; Bérezné, A; Pagnoux, C; Arène, J‐P; Foïs, E; Cabane, J; Guillevin, L; Revel, M; Fermanian, J; Poiraudeau, S

2007-01-01

Objective To develop and assess the reliability and construct validity of a scale assessing disability involving the mouth in systemic sclerosis (SSc). Methods We generated a 34‐item provisional scale from mailed responses of patients (n = 74), expert consensus (n = 10) and literature analysis. A total of 71 other SSc patients were recruited. The test–retest reliability was assessed using the intraclass correlation coefficient and divergent validity using the Spearman correlation coefficient. Factor analysis followed by varimax rotation was performed to assess the factorial structure of the scale. Results The item reduction process retained 12 items with 5 levels of answers (total score range 0–48). The mean total score of the scale was 20.3 (SD 9.7). The test–retest reliability was 0.96. Divergent validity was confirmed for global disability (Health Assessment Questionnaire (HAQ), r = 0.33), hand function (Cochin Hand Function Scale, r = 0.37), inter‐incisor distance (r = −0.34), handicap (McMaster‐Toronto Arthritis questionnaire (MACTAR), r = 0.24), depression (Hospital Anxiety and Depression (HAD); HADd, r = 0.26) and anxiety (HADa, r = 0.17). Factor analysis extracted 3 factors with eigenvalues of 4.26, 1.76 and 1.47, explaining 63% of the variance. These 3 factors could be clinically characterised. The first factor (5 items) represents handicap induced by the reduction in mouth opening, the second (5 items) handicap induced by sicca syndrome and the third (2 items) aesthetic concerns. Conclusion We propose a new scale, the Mouth Handicap in Systemic Sclerosis (MHISS) scale, which has excellent reliability and good construct validity, and assesses specifically disability involving the mouth in patients with SSc. PMID:17502364
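The scoring arithmetic described in this abstract (12 items, 5 answer levels, total range 0-48) implies that each item contributes 0 to 4 points. The sketch below sums a response vector under that inferred coding; the per-item coding is an assumption drawn from the stated range, the assignment of items to the three factors is not given in the abstract and is therefore not modelled, and the function name is hypothetical.

```python
def mhiss_total(item_scores):
    """Total MHISS score from 12 item responses, each assumed to be coded 0-4
    (inferred from '5 levels of answers' and 'total score range 0-48')."""
    if len(item_scores) != 12:
        raise ValueError("the scale has 12 items")
    if any(s not in (0, 1, 2, 3, 4) for s in item_scores):
        raise ValueError("each item must be an integer from 0 to 4")
    return sum(item_scores)


# Hypothetical response pattern.
example_total = mhiss_total([2, 1, 3, 0, 4, 2, 1, 1, 2, 3, 0, 1])
```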
Concurrent Validity of LibQUAL+[TM] Scores: What Do LibQUAL+[TM] Scores Measure?

ERIC Educational Resources Information Center

Thompson, Bruce; Cook, Colleen; Kyrillidou, Martha

2005-01-01

The present study investigated the validity of LibQUAL+[TM] scores, and specifically how total and subscale LibQUAL+[TM] scores are associated with self-reported, library-related satisfaction and outcomes scores. Participants included 88,664 students and faculty who completed the American English (n[AE] = 69,494) or the British English (n[BE] =…
Nursing activities score.

PubMed

Miranda, Dinis Reis; Nap, Raoul; de Rijk, Angelique; Schaufeli, Wilmar; Iapichino, Gaetano

2003-02-01

The instruments used for measuring nursing workload in the intensive care unit (e.g., Therapeutic Intervention Scoring System-28) are based on therapeutic interventions related to severity of illness. Many nursing activities are not necessarily related to severity of illness, and cost-effectiveness studies require the accurate evaluation of nursing activities. The aim of the study was to determine the nursing activities that best describe workload in the intensive care unit and to attribute weights to these activities so that the score describes average time consumption instead of severity of illness. To define by consensus a list of nursing activities, to determine the average time consumption of these activities by use of a 1-wk observational cross-sectional study, and to compare these results with those of the Therapeutic Intervention Scoring System-28. A total of 99 intensive care units in 15 countries. Consecutive admissions to the intensive care units. Daily recording of nursing activities at a patient level and random multimoment recording of these activities. A total of five new items and 14 subitems describing nursing activities in the intensive care unit (e.g., monitoring, care of relatives, administrative tasks) were added to the list of therapeutic interventions in Therapeutic Intervention Scoring System-28. Data from 2,041 patients (6,451 nursing days and 127,951 multimoment recordings) were analyzed. The new activities accounted for 60% of the average nursing time; the new scoring system (Nursing Activities Score) explained 81% of the nursing time (vs. 43% in Therapeutic Intervention Scoring System-28). The weights in the Therapeutic Intervention Scoring System-28 are not derived from the use of nursing time. Our study suggests that the Nursing Activities Score measures the consumption of nursing time in the intensive care unit. These results should be validated in independent databases.
Reliability and Validity of Self-Concept Scores in Secondary School Students in Trinidad and Tobago

ERIC Educational Resources Information Center

Worrell, Frank C.; Watkins, Marley W.; Hall, Tracey E.

2008-01-01

In this study we examined the reliability and validity of global, mathematics and English self-concept scores from the Self-Description Questionnaire II (SDQ-II, Marsh, 1990b) in a random sample of 870 secondary school students in Trinidad and Tobago. The results provided strong evidence for the structural validity of the scores and yielded…
Validity of the SAT® for Predicting First-Year Grades: 2012 SAT Validity Sample. Statistical Report 2015 2

ERIC Educational Resources Information Center

Beard, Jonathan; Marini, Jessica P.

2015-01-01

The continued accumulation of validity evidence for the intended uses of educational assessment scores is critical to ensure that inferences made using the scores are sound. To that end, the College Board has continued to collect college outcome data to evaluate the relationship between SAT® scores and college success. This report provides updated…
The Reliability and Validity of Big Five Inventory Scores with African American College Students

ERIC Educational Resources Information Center

Worrell, Frank C.; Cross, William E., Jr.

2004-01-01

This article describes a study that examined the reliability and validity of scores on the Big Five Inventory (BFI; O. P. John, E. M. Donahue, & R. L. Kentle, 1991) in a sample of 336 African American college students. Results from the study indicated moderate reliability and structural validity for BFI scores. Additionally, BFI subscales had few…
Incremental Validity of WISC-IV[superscript UK] Factor Index Scores with a Referred Irish Sample: Predicting Performance on the WIAT-II[superscript UK]

ERIC Educational Resources Information Center

Canivez, Gary L.; Watkins, Marley W.; James, Trevor; Good, Rebecca; James, Kate

2014-01-01

Background: Subtest and factor scores have typically provided little incremental predictive validity beyond the omnibus IQ score. Aims: This study examined the incremental validity of Wechsler Intelligence Scale for Children-Fourth UK Edition (WISC-IV[superscript UK]; Wechsler, 2004a, "Wechsler Intelligence Scale for Children-Fourth UK…
Disease Severity and Progression in Progressive Supranuclear Palsy and Multiple System Atrophy: Validation of the NNIPPS – PARKINSON PLUS SCALE

PubMed Central

Payan, Christine A. M.; Viallet, François; Landwehrmeyer, Bernhard G.; Bonnet, Anne-Marie; Borg, Michel; Durif, Franck; Lacomblez, Lucette; Bloch, Frédéric; Verny, Marc; Fermanian, Jacques; Agid, Yves; Ludolph, Albert C.

2011-01-01

Background The Natural History and Neuroprotection in Parkinson Plus Syndromes (NNIPPS) study was a large phase III randomized placebo-controlled trial of riluzole in Progressive Supranuclear Palsy (PSP, n = 362) and Multiple System Atrophy (MSA, n = 398). To assess disease severity and progression, we constructed and validated a new clinical rating scale as an ancillary study. Methods and Findings Patients were assessed at entry and 6-monthly for up to 3 years. Evaluation of the scale's psychometric properties included reliability (n = 116), validity (n = 760), and responsiveness (n = 642). Among the 85 items of the initial scale, factor analysis revealed 83 items contributing to 15 clinically relevant dimensions, including Activity of daily Living/Mobility, Axial bradykinesia, Limb bradykinesia, Rigidity, Oculomotor, Cerebellar, Bulbar/Pseudo-bulbar, Mental, Orthostatic, Urinary, Limb dystonia, Axial dystonia, Pyramidal, Myoclonus and Tremor. All but the Pyramidal dimension demonstrated good internal consistency (Cronbach α≥0.70). Inter-rater reliability was high for the total score (Intra-class coefficient = 0.94) and 9 dimensions (Intra-class coefficient = 0.80–0.93), and moderate (Intra-class coefficient = 0.54–0.77) for 6. Correlations of the total score with other clinical measures of severity were good (rho≥0.70). The total score was significantly and linearly related to survival (p<0.0001). Responsiveness expressed as the Standardized Response Mean was high for the total score slope of change (SRM = 1.10), though higher in PSP (SRM = 1.25) than in MSA (SRM = 1.0), indicating a more rapid progression of PSP. The slope of change was constant with increasing disease severity, demonstrating good linearity of the scale throughout disease stages. Although MSA and PSP differed quantitatively on the total score at entry and on rate of progression, the relative contribution of clinical dimensions to overall severity and progression was similar.
Conclusions The NNIPPS-PPS has suitable validity, is reliable and sensitive, and therefore is appropriate for use in clinical studies with PSP or MSA. Trial Registration ClinicalTrials.gov NCT00211224 PMID:21829612
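The Standardized Response Mean (SRM) quoted in this record is conventionally the mean of the change scores divided by their standard deviation. The following is a minimal sketch of that calculation only; the helper name and the paired scores are made up for illustration and are not NNIPPS data, and the study applied the measure to the slope of change rather than to a simple two-visit difference.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean(change) / SD(change), where change = follow_up - baseline."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be paired")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    # stdev() uses the sample (n - 1) denominator.
    return mean(changes) / stdev(changes)


# Illustrative values only (hypothetical total scores at two visits).
baseline = [40, 55, 62, 48, 70]
follow_up = [52, 60, 75, 61, 78]
srm = standardized_response_mean(baseline, follow_up)
```

A larger SRM means the average change is large relative to its between-patient variability, which is why the record reads the higher value in PSP as more rapid progression.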
Histopathological grading of breast ductal carcinoma in situ: validation of a web-based survey through intra-observer reproducibility analysis.

PubMed

Schuh, Fernando; Biazús, Jorge Villanova; Resetkova, Erika; Benfica, Camila Zanella; Ventura, Alessandra de Freitas; Uchoa, Diego; Graudenz, Márcia; Edelweiss, Maria Isabel Albano

2015-07-10

Histopathological grading of ductal carcinoma in situ (DCIS) of the breast may be very difficult even for experts, yet it is important for therapeutic decisions. The challenge may be due to the inaccurate and/or subjective application of the diagnostic criteria. The aim of this study was to investigate the intra-observer agreement between a traditional method and a newly developed web-based questionnaire for scoring breast DCIS. A cross-sectional study was carried out to evaluate the diagnostic agreement of an electronic questionnaire and its point scoring system with the subjective reading of digital images for 3 different DCIS grading systems: Holland, Van Nuys and modified Black nuclear grade system. Three pathologists analyzed the same set of digitized images from 43 DCIS cases using two different web-based programs. In the first phase, they accessed a website with a newly created questionnaire and scoring system developed to allow the determination of the histological grade of the cases. After at least 6 months, the pathologists read the same images again, but without the help of the questionnaire, giving their diagnoses subjectively. The intra-observer agreement analysis was employed to validate this innovative web-based survey. Overall, diagnostic reproducibility was similar for all histologic grading classification systems, with kappa values of 0.57 ± 0.10, 0.67 ± 0.09 and 0.67 ± 0.09 for Holland, Van Nuys classification and modified Black nuclear grade system respectively. Only two 2-step diagnostic disagreements were found, one for Holland and another for Van Nuys. Both cases were overestimated by the web-based survey. The diagnostic agreement between the web-based questionnaire and a traditional method, both using digital images, is moderate to good for Holland, Van Nuys and modified Black nuclear grade system. The use of a scoring point system does not appear to pose a major risk of presenting large (2-step) diagnostic disagreements.
These findings indicate that this point scoring system, used in the web-based survey to grade DCIS lesions objectively, is a useful diagnostic tool.
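The agreement statistic reported in this record is kappa. As a rough illustration, the sketch below computes unweighted Cohen's kappa for two sets of grades assigned to the same cases. The record does not say whether a weighted variant was used, so the unweighted form is an assumption, and the grades and function name are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: (observed - expected) / (1 - expected)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be paired and non-empty")
    n = len(ratings_a)
    # Proportion of cases on which both readings give the same grade.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the marginal grade frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Made-up grades (1-3) for ten cases; not data from the study.
questionnaire_grades = [1, 1, 2, 2, 3, 3, 2, 1, 3, 2]
subjective_grades = [1, 1, 2, 3, 3, 3, 2, 2, 3, 2]
kappa = cohen_kappa(questionnaire_grades, subjective_grades)
```

With these made-up grades the two readings agree on 8 of 10 cases and chance agreement is 0.34, so kappa is 0.46 / 0.66, roughly 0.70.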
Patient-Reported Outcomes Measurement Information System (PROMIS) instruments among individuals with symptomatic knee osteoarthritis: a cross-sectional study of floor/ceiling effects and construct validity.

PubMed

Driban, Jeffrey B; Morgan, Nani; Price, Lori Lyn; Cook, Karon F; Wang, Chenchen

2015-09-14

The psychometric properties of Patient Reported Outcomes Measurement Information System (PROMIS) instruments have been explored in a number of general and clinical samples. No study, however, has evaluated the psychometric function of these measures in individuals with symptomatic knee osteoarthritis (KOA). The aim of this project was to evaluate the construct (structural) validity and floor/ceiling effects of four PROMIS measures in this population. We conducted a secondary analysis of baseline data from a randomized trial comparing Tai Chi and physical therapy. Participants completed four PROMIS static short-form instruments (i.e., Anxiety, Depression, Physical Function, and Pain Interference) as well as six well-validated (legacy) measures that assess pain, function, and psychological health. We calculated descriptive statistics and percentages of participants scoring the minimum (floor) and maximum (ceiling) possible scores for PROMIS and legacy measures. We also estimated the association between PROMIS scores and scores on legacy measures using Spearman's rank correlation coefficients. Data from 204 participants were analyzed. Mean age of the sample was 60 years; 70% were female. The PROMIS Anxiety and Depression had floor effects with 17 and 24% of participants scoring the minimum, respectively. PROMIS Anxiety and Depression scores had the strongest associations with general mental health, including stress (Perceived Stress Scale, r ≥ 0.65) and depression (Beck Depression Index-II, r = 0.70). PROMIS Pain Interference scores correlated most strongly with measures of whole body pain (Short-Form 36 Bodily Pain, r = -0.73) and physical health (Short-Form 36 Physical-Component Summary, r = -0.73); their correlations were lower with other legacy measures, including with the WOMAC knee-specific pain (r = 0.47).
PROMIS Physical Function scores had stronger associations with scores on the Short-Form 36 Physical Function (r = 0.79) than with scores on other legacy measures. The four PROMIS static-short forms performed well among individuals with symptomatic knee osteoarthritis as evidenced in correlations with legacy measures. PROMIS Anxiety and Depression target general mental health (e.g., stress, depression), and PROMIS Pain Interference and Physical Function static-short forms target whole-body outcomes among participants with symptomatic knee osteoarthritis. Floor effects in the PROMIS Anxiety and Depression scores should be considered if needing to distinguish among patients with very low levels of these outcomes. Clinicaltrials.gov NCT01258985. Registered 10 December 2010.
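Floor and ceiling effects, as described in this record, are the percentages of participants scoring the minimum and maximum possible values. A minimal sketch of that tally follows; the 4-20 score range and the scores themselves are hypothetical and are not taken from the PROMIS instruments or the study.

```python
def floor_ceiling_percent(scores, min_possible, max_possible):
    """Return (% of scores at the floor, % of scores at the ceiling)."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    at_floor = sum(1 for s in scores if s == min_possible)
    at_ceiling = sum(1 for s in scores if s == max_possible)
    return 100 * at_floor / n, 100 * at_ceiling / n


# Made-up raw scores on a hypothetical 4-20 scale.
example_scores = [4, 4, 7, 9, 12, 4, 15, 20, 8, 10]
floor_pct, ceiling_pct = floor_ceiling_percent(example_scores, 4, 20)
```

A high floor percentage, like the 17% and 24% reported for Anxiety and Depression, signals that the instrument cannot separate respondents at the low end of the scale.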
Construct validity of the helplessness/hopelessness/haplessness scale: correlations with perfectionism and depression.

PubMed

Leenaars, Lindsey; Lester, David

2007-02-01

In a sample of 117 undergraduates, helplessness scores and the discrepancy scores on a measure of perfectionism predicted depression scores, providing evidence for construct validity for the hopelessness, helplessness, and haplessness scales.
The New York State risk score for predicting in-hospital/30-day mortality following percutaneous coronary intervention.

PubMed

Hannan, Edward L; Farrell, Louise Szypulski; Walford, Gary; Jacobs, Alice K; Berger, Peter B; Holmes, David R; Stamato, Nicholas J; Sharma, Samin; King, Spencer B

2013-06-01

This study sought to develop a percutaneous coronary intervention (PCI) risk score for in-hospital/30-day mortality. Risk scores are simplified linear scores that provide clinicians with quick estimates of patients' short-term mortality rates for informed consent and to determine the appropriate intervention. Earlier PCI risk scores were based on in-hospital mortality. However, for PCI, a substantial percentage of patients die within 30 days of the procedure after discharge. New York's Percutaneous Coronary Interventions Reporting System was used to develop an in-hospital/30-day logistic regression model for patients undergoing PCI in 2010, and this model was converted into a simple linear risk score that estimates mortality rates. The score was validated by applying it to 2009 New York PCI data. Subsequent analyses evaluated the ability of the score to predict complications and length of stay. A total of 54,223 patients were used to develop the risk score. There are 11 risk factors that make up the score, with risk factor scores ranging from 1 to 9, and the highest total score is 34. The score was validated based on patients undergoing PCI in the previous year, and accurately predicted mortality for all patients as well as patients who recently suffered a myocardial infarction (MI). The PCI risk score developed here enables clinicians to estimate in-hospital/30-day mortality very quickly and quite accurately. It accurately predicts mortality for patients undergoing PCI in the previous year and for MI patients, and is also moderately related to perioperative complications and length of stay. Copyright © 2013 American College of Cardiology Foundation. Published by Elsevier Inc. All rights reserved.
Development and validation of a patient-reported questionnaire assessing systemic therapy induced diarrhea in oncology patients.

PubMed

Lui, Michelle; Gallo-Hershberg, Daniela; DeAngelis, Carlo

2017-12-22

Systemic therapy-induced diarrhea (STID) is a common side effect experienced by more than half of cancer patients. Despite STID-associated complications and poorer quality of life (QoL), no validated assessment tools exist to accurately assess STID occurrence and severity to guide clinical management. Therefore, we developed and validated a patient-reported questionnaire (STIDAT). The STIDAT was developed using the FDA iterative process for patient-reported outcomes. A literature search uncovered potential items and questions for questionnaire construction used by oncology clinicians to develop questions for the preliminary instrument. The instrument was evaluated on its face validity and content validity by patient interviews. Repetitive, similar and different themes uncovered from patient interviews were implemented to revise the instrument to the version used for validation. Patients starting high-risk STID treatments were monitored using the STIDAT, bowel diaries and EORTC QLQ-C30. The STIDAT was evaluated for construct validity using exploratory factor analysis (EFA) using minimal residual method with Promax rotation, reliability and consistency. A weighted scoring system was developed and a receiver-operating characteristic (ROC) curve evaluated the tool's ability to detect STID occurrence. Median scores and variability were analysed to determine how well it differentiates between diarrhea severities. A post-hoc analysis determined how diarrhea severity impacted QoL of cancer patients. Patients defined diarrhea based on presence of watery stool. The STIDAT assessed patient's perception of having diarrhea, daily number of bowel movements, daily number of diarrhea episodes, antidiarrheal medication use, the presence of urgency, abdominal pain, abdominal spasms or fecal incontinence, patient's perception of diarrhea severity, and QoL. 
These dimensions were sorted into four clusters using EFA - patient's perception of diarrhea, frequency of diarrhea, fecal incontinence and abdominal symptoms. Cronbach's alpha was 0.78; kappa ranged from 0.934-0.952, except for abdominal spasms (κ = 0.0455). The positive predictive value was 96.4%, with the minimum score of 1.35 predicting a positive STID occurrence. Patients with moderate or severe diarrhea experience significant decreases in QoL compared to those with no diarrhea. This is the first patient-reported questionnaire that accurately predicts the occurrence and severity of diarrhea in oncology patients via assessing several bowel habit dimensions.
Quantification of human epidermal growth factor receptor 2 immunohistochemistry using the Ventana Image Analysis System: correlation with gene amplification by fluorescence in situ hybridization: the importance of instrument validation for achieving high (>95%) concordance rate.

PubMed

Dennis, Jake; Parsa, Rezvaneh; Chau, Donnie; Koduru, Prasad; Peng, Yan; Fang, Yisheng; Sarode, Venetia Rumnong

2015-05-01

The use of computer-based image analysis for scoring human epidermal growth factor receptor 2 (HER2) immunohistochemistry (IHC) has gained a lot of interest recently. We investigated the performance of the Ventana Image Analysis System (VIAS) in HER2 quantification by IHC and its correlation with fluorescence in situ hybridization (FISH). We specifically compared the 3+ IHC results using the manufacturer's machine score cutoffs versus laboratory-defined cutoffs with the FISH assay. Using the manufacturer's 3+ cutoff (VIAS score; 2.51 to 3.5), 181/536 (33.7%) were scored 3+, and FISH was positive in 147/181 (81.2%), 2 (1.1%) were equivocal, and 32 (17.6%) were FISH (-). Using the laboratory-defined 3+ cutoff (VIAS score 3.5), 52 (28.7%) cases were downgraded to 2+, of which 29 (55.7%) were FISH (-), and 23 (44.2%) were FISH (+). With the revised cutoff, there were improvements in the concordance rate from 89.1% to 97.0% and in the positive predictive value from 82.1% to 97.6%. The false-positive rate for 3+ decreased from 9.0% to 0.8%. Six of 175 (3.4%) IHC (-) cases were FISH (+). Three cases with a VIAS score 3.5 showed polysomy of chromosome 17. In conclusion, the VIAS may be a valuable tool for assisting pathologists in HER2 scoring; however, the positive cutoff defined by the manufacturer is associated with a high false-positive rate. This study highlights the importance of instrument validation/calibration to reduce false-positive results.
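The proportions quoted for the manufacturer's 3+ cutoff can be rechecked from the counts given in this record. The sketch below does that arithmetic; the helper name is illustrative. Two caveats: some percentages in the record appear to be truncated rather than rounded (32/181 rounds to 17.7%, not 17.6%), and reading the 82.1% positive predictive value as 147 of the 179 non-equivocal cases is an inference from the numbers, not something the record states.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts taken from the record (manufacturer's 3+ cutoff).
total_cases = 536
scored_3plus = 181
fish_positive = 147
fish_equivocal = 2
fish_negative = 32

# The three FISH outcomes should account for every 3+ case.
assert fish_positive + fish_equivocal + fish_negative == scored_3plus

share_3plus = percent(scored_3plus, total_cases)       # 33.8
fish_pos_rate = percent(fish_positive, scored_3plus)   # 81.2
fish_neg_rate = percent(fish_negative, scored_3plus)   # 17.7
# Assumption: equivocal cases excluded from the denominator.
ppv_excluding_equivocal = percent(fish_positive, scored_3plus - fish_equivocal)  # 82.1
```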

Validation of APACHE II scoring system at 24 hours after admission as a prognostic tool in urosepsis: A prospective observational study.

PubMed

VijayGanapathy, Sundaramoorthy; Karthikeyan, VIlvapathy Senguttuvan; Sreenivas, Jayaram; Mallya, Ashwin; Keshavamurthy, Ramaiah

2017-11-01

Urosepsis implies clinically evident severe infection of the urinary tract with features of systemic inflammatory response syndrome (SIRS). We validate the role of a single Acute Physiology and Chronic Health Evaluation II (APACHE II) score at 24 hours after admission in predicting mortality in urosepsis. A prospective observational study was done in 178 patients admitted with urosepsis in the Department of Urology, in a tertiary care institute from January 2015 to August 2016. Patients >18 years diagnosed with urosepsis using SIRS criteria with positive urine or blood culture for bacteria were included. At 24 hours after admission to the intensive care unit, the APACHE II score was calculated using 12 physiological variables, age and chronic health. Mean±standard deviation (SD) APACHE II score was 26.03±7.03. It was 24.31±6.48 in survivors and 32.39±5.09 in those who died (p<0.001). Among patients undergoing surgery, mean±SD score was higher (30.74±4.85) than among survivors (24.30±6.54) (p<0.001). Receiver operating characteristic (ROC) analysis revealed an area under the curve (AUC) of 0.825, with a cutoff of 25.5 being 94.7% sensitive and 56.4% specific to predict mortality. Mean±SD score in those undergoing surgery was 25.22±6.70 and was lower than in those who did not undergo surgery (28.44±7.49) (p=0.007). ROC analysis revealed an AUC of 0.760, with a cutoff of 25.5 being 94.7% sensitive and 45.6% specific to predict mortality even after surgery. A single APACHE II score assessed at 24 hours after admission was able to predict morbidity, mortality, need for surgical intervention, length of hospitalization, treatment success and outcome in urosepsis patients.
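Sensitivity and specificity at a score cutoff, as reported in this record, come from a simple two-by-two tally. The sketch below applies the record's cutoff of 25.5 to made-up scores and outcomes. Treating a score above the cutoff as a prediction of death is an assumption about how the cutoff was applied, and the function name is hypothetical.

```python
def sensitivity_specificity(scores, died, cutoff):
    """Classify score > cutoff as 'predicted death' and tally against outcome."""
    tp = sum(1 for s, d in zip(scores, died) if s > cutoff and d)
    fn = sum(1 for s, d in zip(scores, died) if s <= cutoff and d)
    tn = sum(1 for s, d in zip(scores, died) if s <= cutoff and not d)
    fp = sum(1 for s, d in zip(scores, died) if s > cutoff and not d)
    sensitivity = tp / (tp + fn)  # share of deaths flagged by the cutoff
    specificity = tn / (tn + fp)  # share of survivors not flagged
    return sensitivity, specificity


# Made-up scores and outcomes; not data from the study.
scores = [31, 28, 35, 24, 22, 27, 19, 30, 26, 18]
died = [True, True, True, True, False, False, False, False, False, False]
sens, spec = sensitivity_specificity(scores, died, cutoff=25.5)
```

A cutoff that is highly sensitive but only moderately specific, as in the record, flags nearly all deaths at the cost of also flagging many survivors.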
Assessment of postoperative outcomes of hypospadias repair with validated questionnaires.

PubMed

Liu, Mona M Y; Holland, Andrew J A; Cass, Danny T

2015-12-01

A standardized assessment for the optimal repair of hypospadias remains elusive. This study utilized validated questionnaires to assess the postoperative functional, cosmetic, and psychosocial outcomes of hypospadias repair. 172 patients who underwent hypospadias repair under the care of a single surgeon were identified. 25 agreed to follow-up using the validated questionnaires of Hypospadias Objective Scoring Evaluation (HOSE), Pediatric Penile Perception Scale (PPPS), and Pediatric Quality of Life Inventory (PedsQL™4.0). Mean follow-up was 59 months postoperatively (range 7-113 months). Techniques used included tubularized incised plate urethroplasty, meatal advancement and glanuloplasty, and a 2-stage repair. 23 of 25 patients achieved a HOSE score of 14 or more (maximum of 16). The PPPS scores correlated with severity of the hypospadias. Those with glanular hypospadias (mean score=10) scored higher than those with coronal (mean score=9) and penile/penoscrotal hypospadias (mean score=7). There was no correlation between PedsQL™4.0 scores and the severity of hypospadias or procedure used. Validated questionnaires revealed generally good functional, cosmetic, and early psychosocial outcomes after hypospadias repair. The use of validated questionnaires in routine follow-up sessions may facilitate objective assessment of both functional outcomes and patient satisfaction. Copyright © 2015 Elsevier Inc. All rights reserved.
Hypoalbuminemia, Low Base Excess Values, and Tachypnea Predict 28-Day Mortality in Severe Sepsis and Septic Shock Patients in the Emergency Department.

PubMed

Seo, Min Ho; Choa, Minhong; You, Je Sung; Lee, Hye Sun; Hong, Jung Hwa; Park, Yoo Seok; Chung, Sung Phil; Park, Incheol

2016-11-01

The objective of this study was to develop a new nomogram that can predict 28-day mortality in severe sepsis and/or septic shock patients using a combination of several biomarkers that are inexpensive and readily available in most emergency departments, with and without scoring systems. We enrolled 561 patients who were admitted to an emergency department (ED) and received early goal-directed therapy for severe sepsis or septic shock. We collected demographic data, initial vital signs, and laboratory data sampled at the time of ED admission. Patients were randomly assigned to a training set or validation set. For the training set, we generated models using independent variables associated with 28-day mortality by multivariate analysis, and developed a new nomogram for the prediction of 28-day mortality. Thereafter, the diagnostic accuracy of the nomogram was tested using the validation set. The prediction model that included albumin, base excess, and respiratory rate demonstrated the largest area under the receiver operating characteristic curve (AUC) value of 0.8173 [95% confidence interval (CI), 0.7605-0.8741]. The logistic analysis revealed that a conventional scoring system was not associated with 28-day mortality. In the validation set, the discrimination of a newly developed nomogram was also good, with an AUC value of 0.7537 (95% CI, 0.6563-0.8512). Our new nomogram is valuable in predicting the 28-day mortality of patients with severe sepsis and/or septic shock in the emergency department. Moreover, our readily available nomogram is superior to conventional scoring systems in predicting mortality.
Effects of an Intelligent Web-Based English Instruction System on Students' Academic Performance

ERIC Educational Resources Information Center

Jia, J.; Chen, Y.; Ding, Z.; Bai, Y.; Yang, B.; Li, M.; Qi, J.

2013-01-01

This research conducted quasi-experiments in four middle schools to evaluate the long-term effects of an intelligent web-based English instruction system, Computer Simulation in Educational Communication (CSIEC), on students' academic attainment. The analysis of regular examination scores and vocabulary test validates the positive impact of CSIEC,…
Multicentre validation of the bedside paediatric early warning system score: a severity of illness score to detect evolving critical illness in hospitalised children

PubMed Central

2011-01-01

Introduction The timely provision of critical care to hospitalised patients at risk for cardiopulmonary arrest is contingent upon identification and referral by frontline providers. Current approaches require improvement. In a single-centre study, we developed the Bedside Paediatric Early Warning System (Bedside PEWS) score to identify patients at risk. The objective of this study was to validate the Bedside PEWS score in a large patient population at multiple hospitals. Methods We performed an international, multicentre, case-control study of children admitted to hospital inpatient units with no limitations on care. Case patients had experienced a clinical deterioration event involving either an immediate call to a resuscitation team or urgent admission to a paediatric intensive care unit. Control patients had no events. The scores ranged from 0 to 26 and were assessed in the 24 hours prior to the clinical deterioration event. Score performance was assessed using the area under the receiver operating characteristic (AUCROC) curve by comparison with the retrospective rating of nurses and the temporal progression of scores in case patients. Results A total of 2,074 patients were evaluated at 4 participating hospitals. The median (interquartile range) maximum Bedside PEWS scores for the 12 hours ending 1 hour before the clinical deterioration event were 8 (5 to 12) in case patients and 2 (1 to 4) in control patients (P < 0.0001). The AUCROC curve (95% confidence interval) was 0.87 (0.85 to 0.89). In case patients, mean scores were 5.3 at 20 to 24 hours and 8.4 at 0 to 4 hours before the event (P < 0.0001). The AUCROC curve (95% CI) of the retrospective nurse ratings was 0.83 (0.81 to 0.86). This was significantly lower than that of the Bedside PEWS score (P < 0.0001). Conclusions The Bedside PEWS score identified children at risk for cardiopulmonary arrest. Scores were elevated and continued to increase in the 24 hours before the clinical deterioration event. 
Prospective clinical evaluation is needed to determine whether this score will improve the quality of care and patient outcomes. PMID:21812993
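The area under the ROC curve used in this record can be read as the probability that a randomly chosen case patient has a higher score than a randomly chosen control patient, with ties counted as one half. The sketch below computes it by direct pairwise comparison. This is one standard way to obtain the statistic and is not necessarily the procedure the authors used; the scores and function name are made up.

```python
def auc_roc(case_scores, control_scores):
    """AUC as the share of case/control pairs where the case scores higher."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


# Made-up maximum scores; not data from the study.
cases = [8, 5, 12, 3]
controls = [2, 1, 4, 3]
auc = auc_roc(cases, controls)
```

An AUC of 0.5 would mean the score separates cases from controls no better than chance, and 1.0 would mean perfect separation.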
How Is Testing Supposed to Improve Schooling?

ERIC Educational Resources Information Center

Haertel, Edward

2013-01-01

Validation research for educational achievement tests is often limited to an examination of intended test score interpretations. This article calls for an expansion of validation research in three dimensions. First, validation must attend to actual test use and its consequences, not just score meaning. Second, validation must attend to unintended…
Validity Semantics in Educational and Psychological Assessment

ERIC Educational Resources Information Center

Hathcoat, John D.

2013-01-01

The semantics, or meaning, of validity is a fluid concept in educational and psychological testing. Contemporary controversies surrounding this concept appear to stem from the proper location of validity. Under one view, validity is a property of score-based inferences and entailed uses of test scores. This view is challenged by the…
The Dutch-Flemish PROMIS Physical Function item bank exhibited strong psychometric properties in patients with chronic pain.

PubMed

Crins, Martine H P; Terwee, Caroline B; Klausch, Thomas; Smits, Niels; de Vet, Henrica C W; Westhovens, Rene; Cella, David; Cook, Karon F; Revicki, Dennis A; van Leeuwen, Jaap; Boers, Maarten; Dekker, Joost; Roorda, Leo D

2017-07-01

The objective of this study was to assess the psychometric properties of the Dutch-Flemish Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function item bank in Dutch patients with chronic pain. A bank of 121 items was administered to 1,247 Dutch patients with chronic pain. Unidimensionality was assessed by fitting a one-factor confirmatory factor analysis and evaluating resulting fit statistics. Items were calibrated with the graded response model and its fit was evaluated. Cross-cultural validity was assessed by testing items for differential item functioning (DIF) based on language (Dutch vs. English). Construct validity was evaluated by calculating correlations between scores on the Dutch-Flemish PROMIS Physical Function measure and scores on generic and disease-specific measures. Results supported the Dutch-Flemish PROMIS Physical Function item bank's unidimensionality (Comparative Fit Index = 0.976, Tucker Lewis Index = 0.976) and model fit. Item thresholds targeted a wide range of the physical function construct (threshold-parameters range: -4.2 to 5.6). Cross-cultural validity was good, as only four items showed DIF for language and their impact on item scores was minimal. Physical Function scores were strongly associated with scores on all other measures (all correlations ≤ -0.60 as expected). The Dutch-Flemish PROMIS Physical Function item bank exhibited good psychometric properties. Development of a computer adaptive test based on the large bank is warranted. Copyright © 2017 Elsevier Inc. All rights reserved.
[Validation of the IBS-SSS].

PubMed

Betz, C; Mannsdörfer, K; Bischoff, S C

2013-10-01

Irritable bowel syndrome (IBS) is a functional gastrointestinal disorder characterised by abdominal pain, associated with stool abnormalities and changes in stool consistency. Diagnosis of IBS is based on characteristic symptoms and exclusion of other gastrointestinal diseases. A number of questionnaires exist to assist diagnosis and assessment of severity of the disease. One of these is the irritable bowel syndrome - severity scoring system (IBS-SSS). The IBS-SSS was validated in 1997 in its English version. In the present study, the IBS-SSS has been validated in the German language. To do this, a cohort of 60 patients with IBS according to the Rome III criteria was compared with a control group of healthy individuals (n = 38). We studied sensitivity and reproducibility of the score, as well as the sensitivity to detect changes of symptom severity. The results of the German validation largely reflect the results of the English validation. The German version of the IBS-SSS is also a valid, meaningful and reproducible questionnaire with a high sensitivity to assess changes in symptom severity, especially in IBS patients with moderate symptoms. It is unclear if the IBS-SSS is also a valid questionnaire in IBS patients with severe symptoms because this group of patients was not studied. © Georg Thieme Verlag KG Stuttgart · New York.
Interpreting Quality of Life after Brain Injury Scores: Cross-Walk with the Short Form-36.

PubMed

Wilson, Lindsay; Marsden-Loftus, Isaac; Koskinen, Sanna; Bakx, Wilbert; Bullinger, Monika; Formisano, Rita; Maas, Andrew; Neugebauer, Edmund; Powell, Jane; Sarajuuri, Jaana; Sasse, Nadine; von Steinbuechel, Nicole; von Wild, Klaus; Truelle, Jean-Luc

2017-01-01

The Quality of Life after Brain Injury (QOLIBRI) instruments are traumatic brain injury (TBI)-specific assessments of health-related quality of life (HRQoL), with established validity and reliability. The purpose of the study is to help improve the interpretability of the two QOLIBRI summary scores (the QOLIBRI Total score and the QOLIBRI Overall Scale [OS] score). An analysis was conducted of 761 patients with TBI who took part in the QOLIBRI validation studies. A cross-walk between QOLIBRI scores and the SF-36 Mental Component Summary norm-based scoring system was performed using geometric mean regression analysis. The exercise supports a previous suggestion that QOLIBRI Total scores <60 indicate low or impaired HRQoL and indicates that the corresponding score on the QOLIBRI-OS is <52. The percentage of cases in the sample that fell into the "impaired HRQoL" category was 36% for the Mental Component Summary, 38% for the QOLIBRI Total, and 39% for the QOLIBRI-OS. Relationships between the QOLIBRI scales and the Glasgow Outcome Scale-Extended (GOSE), as a measure of global function, are presented in the form of means and standard deviations that allow comparison with other studies, and data on age and sex are presented for the QOLIBRI-OS. While bearing in mind the potential imprecision of the comparison, the findings provide a framework for evaluating QOLIBRI summary scores in relation to generic HRQoL that improves their interpretability.
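Geometric mean regression, the cross-walk method named in this record, fits a line whose slope is the ratio of the two standard deviations, signed by the direction of the correlation. It therefore treats both scales symmetrically. The following is a minimal sketch with made-up paired scores; the variable names, the data, and the choice of which scale goes on which axis are all illustrative assumptions rather than the study's actual setup.

```python
from statistics import mean, stdev


def geometric_mean_regression(x, y):
    """Return (slope, intercept) with slope = sign(r) * SD(y) / SD(x)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired with at least two points")
    mean_x, mean_y = mean(x), mean(y)
    # Only the sign of the co-variation is needed to orient the line.
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sign = 1.0 if co_variation >= 0 else -1.0
    slope = sign * stdev(y) / stdev(x)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up paired scores on two hypothetical scales.
scale_a = [30, 40, 50, 60]
scale_b = [35, 50, 65, 80]
slope, intercept = geometric_mean_regression(scale_a, scale_b)
```

Because the fitted line does not depend on which scale is treated as the predictor, a threshold on one scale maps to a single corresponding threshold on the other, which suits a cross-walk.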
Qualitative and quantitative assessment of degeneration of cervical intervertebral discs and facet joints.

PubMed

Walraevens, Joris; Liu, Baoge; Meersschaert, Joke; Demaerel, Philippe; Delye, Hans; Depreitere, Bart; Vander Sloten, Jos; Goffin, Jan

2009-03-01

Degeneration of intervertebral discs and facet joints is one of the most frequently encountered spinal disorders. In order to describe and quantify degeneration and evaluate a possible relationship between degeneration and biomechanical parameters, e.g., the intervertebral range of motion and intradiscal pressure, a scoring system for degeneration is mandatory. However, few scoring systems for the assessment of degeneration of the cervical spine exist. Therefore, two separate objective scoring systems to qualitatively and quantitatively assess the degree of cervical intervertebral disc and facet joint degeneration were developed and validated. The scoring system for cervical disc degeneration consists of three variables which are individually scored on neutral lateral radiographs: "height loss" (0-4 points), "anterior osteophytes" (0-3 points) and "endplate sclerosis" (0-2 points). The scoring system for facet joint degeneration consists of four variables which are individually scored on neutral computed tomography scans: "hypertrophy" (0-2 points), "osteophytes" (0-1 point), "irregularity" on the articular surface (0-1 point) and "joint space narrowing" (0-1 point). Each variable contributes with varying importance to the overall degeneration score (max 9 points for the scoring system of cervical disc degeneration and max 5 points for facet joint degeneration). Degeneration of 20 discs and facet joints of 20 patients was blindly assessed by four raters: two neurosurgeons (one senior and one junior) and two radiologists (one senior and one junior), firstly based on first subjective impression and secondly using the scoring systems. Measurement errors and inter- and intra-rater agreement were determined. The measurement error of the scoring system for cervical disc degeneration was 11.1 versus 17.9% of the subjective impression results. 
This scoring system showed excellent intra-rater agreement (ICC = 0.86, 0.75-0.93) and excellent inter-rater agreement (ICC = 0.78, 0.64-0.88). Surgeons as well as radiologists and seniors as well as juniors obtained excellent inter- and intra-rater agreement. The measurement error of the scoring system for cervical facet joint degeneration was 20.1 versus 24.2% of the subjective impression results. This scoring system showed good intra-rater agreement (ICC = 0.71, 0.42-0.89) and fair inter-rater agreement (ICC = 0.49, 0.26-0.74). Both scoring systems fulfilled the criteria for recommendation proposed by Kettler and Wilke. Our scoring systems can be reliable and objective tools for assessing cervical disc and facet joint degeneration. Moreover, the scoring system of cervical disc degeneration was shown to be experience- and discipline-independent.
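The additive structure of the two scoring systems described above can be illustrated with a short sketch. The item names and point ranges come from the abstract; the function name `total_score` and the dictionary layout are hypothetical, and the criteria for assigning each item's points are not given in the abstract, so they are not modelled here.

```python
# Maximum points per item, as listed in the abstract.
DISC_MAX = {
    "height_loss": 4,
    "anterior_osteophytes": 3,
    "endplate_sclerosis": 2,
}
FACET_MAX = {
    "hypertrophy": 2,
    "osteophytes": 1,
    "irregularity": 1,
    "joint_space_narrowing": 1,
}


def total_score(item_scores, max_points):
    """Sum the item scores after checking each one against its allowed range.

    Hypothetical helper: the abstract only states that items are scored
    individually and contribute to an overall degeneration score.
    """
    if set(item_scores) != set(max_points):
        raise ValueError("item names do not match the scoring system")
    for name, value in item_scores.items():
        if not isinstance(value, int) or not 0 <= value <= max_points[name]:
            raise ValueError(f"{name} must be an integer in 0..{max_points[name]}")
    return sum(item_scores.values())


# Example: a disc with moderate height loss and mild osteophytes.
disc = {"height_loss": 2, "anterior_osteophytes": 1, "endplate_sclerosis": 0}
print(total_score(disc, DISC_MAX))
```

The maxima sum to 9 for the disc system and 5 for the facet joint system, matching the totals stated in the abstract.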
[Drug-promoting advertisements in the Dutch Journal of Medicine and Pharmaceutical Weekly: not always evidence based].

PubMed

van Eeden, Annelies E; Roach, Rachel E J; Halbesma, Nynke; Dekker, Friedo W

2012-01-01

To determine and compare the foundation of claims in drug-promoting advertisements in a Dutch journal for physicians and a Dutch journal for pharmacists. A cross-sectional study. We included all the drug-promoting advertisements referring to a randomized controlled trial (RCT) we could find on Medline from 2 volumes of the Dutch Journal of Medicine (Nederlands Tijdschrift voor Geneeskunde; NTvG) and the (also Dutch) Pharmaceutical Weekly (Pharmaceutisch Weekblad; PW). The validity of the advertisements (n = 54) and the methodological quality of the referenced RCTs (n = 150) were independently scored by 250 medical students using 2 standardised questionnaires. The advertisements' sources were concealed from the students. Per journal, the percentage of drug-promoting advertisements having a valid claim and the percentage of high-quality RCT references were determined. Average scores on quality and validity were compared between the 2 journals. On a scale of 0-18 points, the mean quality scores of the RCTs differed 0.3 (95% CI: -0.1-0.7) between the NTvG (score: 14.8; SD: 2.2) and the PW (score: 14.5; SD: 2.6). The difference between the validity scores of drug-promoting advertisements in the NTvG (score: 5.8; SD: 3.3) and the PW (score: 5.6; SD: 3.6) was 0.3 (95% CI: -0.3-0.9) on a scale of 0-10 points. For both journals, an average of 15% of drug-promoting advertisements was valid (defined as a validity score of > 8 points); 35% of the RCTs referred to was of good methodological quality (defined as a quality score of > 16 points). The substantiation of many claims in drug-promoting advertisements in the NTvG and the PW was mediocre. There was no difference between the 2 journals.
[Validation of two indices of biological integrity (IBI) for the Angulo River subbasin in Central Mexico].

PubMed

Ramírez-Herrejón, Juan Pablo; Mercado-Silva, Norman; Medina-Nava, Martina; Domínguez-Domínguez, Omar

2012-12-01

Efforts to halt freshwater ecosystem degradation in central Mexico can benefit from using bio-monitoring tools that reflect the condition of their biotic integrity. We analyzed the applicability of two fish-based indices of biotic integrity using data from lotic and lentic systems in the Angulo River subbasin (Lerma-Chapala basin). Both independent data from our own collections during two consecutive years, and existing information detailing the ecological attributes of each species, were used to calculate indices of biological integrity for 16 sites in lotic and lentic habitats. We assessed environmental quality by combining independent evaluations of water and habitat quality for each site. We found sites with poor, regular and good biotic integrity. Our study did not find sites with good environmental quality. Fish-based IBI scores were strongly and significantly correlated with scores from independent environmental assessment techniques. IBI scores were adequate at representing environmental conditions in most study sites. These results expand the area where a lotic system fish-based IBI can be used, and constitute an initial validation of a lentic system fish-based IBI. Our results suggest that these bio-monitoring tools can be used in future conservation efforts in freshwater ecosystems in the Middle Lerma Basin.
Firefighter hearing health: an informatics approach to screening, measurement, and research.

PubMed

Hong, OiSaeng; Monsen, Karen A; Kerr, Madeleine J; Chin, Dal Lae; Lytton, Amy B; Martin, Karen S

2012-10-01

The purpose of this study was to evaluate the use of a standardized interface terminology, the Omaha System, with respect to noise-induced hearing loss (NIHL). A descriptive, correlational design was employed for this secondary analysis with the data from an ongoing hearing protection intervention study. A total of 346 firefighters were included. First, an evidence-based standardized care plan (EB-SCP) for hearing screening was developed and validated by clinical experts. Second, occupational health records were used to compute Omaha System Knowledge, Behavior, and Status outcomes. Third, research data were mapped to Omaha System rating scales. For Knowledge, the mean score was close to 'adequate' (3.7). For Behavior, the mean score was close to 'rarely appropriate' (2.2). For Status, the mean score was close to 'minimal sign/symptom' (4.4). Significant positive relationships were found between Knowledge and Behavior (Spearman's rho =.13, p =.01), and between Behavior and hearing Status (Spearman's rho =.12, p =.02). Findings support the validity of the new Knowledge, Behavior, and hearing Status. Informatics methods such as the standardized NIHL EB-SCP and outcome data sets will create opportunities for clinical decision support and data exchange across various health care settings, thus supporting population-based hearing health assessments and outcomes.
State and Local Efforts to Investigate the Validity and Reliability of Scores from Teacher Evaluation Systems

ERIC Educational Resources Information Center

Herlihy, Corinne; Karger, Ezra; Pollard, Cynthia; Hill, Heather C.; Kraft, Matthew A.; Williams, Megan; Howard, Sarah

2014-01-01

Context: In the past two years, states have implemented sweeping reforms to their teacher evaluation systems in response to Race to the Top legislation and, more recently, NCLB waivers. With these new systems, policymakers hope to make teacher evaluation both more rigorous and more grounded in specific job performance domains such as teaching…
Independent validation of the prognostic capacity of the ISUP prostate cancer grade grouping system for radiation treated patients with long-term follow-up.

PubMed

Spratt, D E; Jackson, W C; Abugharib, A; Tomlins, S A; Dess, R T; Soni, P D; Lee, J Y; Zhao, S G; Cole, A I; Zumsteg, Z S; Sandler, H; Hamstra, D; Hearn, J W; Palapattu, G; Mehra, R; Morgan, T M; Feng, F Y

2016-09-01

There has been a recent proposal to change the grading system of prostate cancer into a five-tier grade grouping system. The prognostic impact of this has been demonstrated in regards only to biochemical recurrence-free survival (bRFS) with short follow-up (3 years). Between 1990 and 2013, 847 consecutive men were treated with definitive external beam radiation therapy at a single academic center. To validate the new grade grouping system, bRFS, distant metastases-free survival (DMFS) and prostate cancer-specific survival (PCSS) were calculated. Adjusted Kaplan-Meier and multivariable Cox regression analyses were performed to assess the independent impact of the new grade grouping system. Discriminatory analyses were performed to compare the commonly used three-tier Gleason score system (6, 7 and 8-10) to the new system. The median follow-up of our cohort was 88 months. The 5-grade groups independently validated differing risks of bRFS (group 1 as reference; adjusted hazard ratio (aHR) 1.35, 2.16, 1.79 and 3.84 for groups 2-5, respectively). Furthermore, a clear stratification was demonstrated for DMFS (aHR 2.03, 3.18, 3.62 and 13.77 for groups 2-5, respectively) and PCSS (aHR 3.00, 5.32, 6.02 and 39.02 for groups 2-5, respectively). The 5-grade group system had improved prognostic discrimination for all end points compared with the commonly used three-tiered system (that is, Gleason score 6, 7 and 8-10). In a large independent radiotherapy cohort with long-term follow-up, we have validated the bRFS benefit of the proposed five-tier grade grouping system. Furthermore, we have demonstrated that the system is highly prognostic for DMFS and PCSS. Grade group 5 had markedly worse outcomes for all end points, and future work is necessary to improve outcomes in these patients.
Validity of GRE General Test scores and TOEFL scores for graduate admission to a technical university in Western Europe

NASA Astrophysics Data System (ADS)

Zimmermann, Judith; von Davier, Alina A.; Buhmann, Joachim M.; Heinimann, Hans R.

2018-01-01

Graduate admission has become a critical process in tertiary education, whereby selecting valid admissions instruments is key. This study assessed the validity of Graduate Record Examination (GRE) General Test scores for admission to Master's programmes at a technical university in Europe. We investigated the indicative value of GRE scores for the Master's programme grade point average (GGPA) with and without the addition of the undergraduate GPA (UGPA) and the TOEFL score, and of GRE scores for study completion and Master's thesis performance. GRE scores explained 20% of the variation in the GGPA, while an additional 7% was explained by the TOEFL score and 3% by the UGPA. Contrary to common belief, the GRE quantitative reasoning score showed only little explanatory power. GRE scores were also weakly related to study progress but not to thesis performance. Nevertheless, GRE and TOEFL scores were found to be sensible admissions instruments. Rigorous methodology was used to obtain highly reliable results.
Translation and validation of the new version of the Knee Society Score - The 2011 KS Score - into Brazilian Portuguese.

PubMed

Silva, Adriana Lucia Pastore E; Croci, Alberto Tesconi; Gobbi, Riccardo Gomes; Hinckel, Betina Bremer; Pecora, José Ricardo; Demange, Marco Kawamura

2017-01-01

Translation, cultural adaptation, and validation of the new version of the Knee Society Score - The 2011 KS Score - into Brazilian Portuguese and verification of its measurement properties, reproducibility, and validity. In 2012, the new version of the Knee Society Score was developed and validated. This scale comprises four separate subscales: (a) objective knee score (seven items: 100 points); (b) patient satisfaction score (five items: 40 points); (c) patient expectations score (three items: 15 points); and (d) functional activity score (19 items: 100 points). A total of 90 patients aged 55-85 years were evaluated in a clinical cross-sectional study. The pre-operative translated version was applied to patients with TKA referral, and the post-operative translated version was applied to patients who underwent TKA. Each patient answered the same questionnaire twice and was evaluated by two experts in orthopedic knee surgery. Evaluations were performed pre-operatively and three, six, or 12 months post-operatively. The reliability of the questionnaire was evaluated using the intraclass correlation coefficient (ICC) between the two applications. Internal consistency was evaluated using Cronbach's alpha. The ICC found no difference between the means of the pre-operative, three-month, and six-month post-operative evaluations between sub-scale items. The Brazilian Portuguese version of The 2011 KS Score is a valid and reliable instrument for objective and subjective evaluation of the functionality of Brazilian patients who undergo TKA and revision TKA.
A risk scoring system for prediction of haemorrhagic stroke.

PubMed

Zodpey, S P; Tiwari, R R

2005-01-01

The present pair-matched case-control study was carried out at Government Medical College Hospital, Nagpur, India, a tertiary care hospital, with the objective to devise and validate a risk scoring system for prediction of hemorrhagic stroke. The study consisted of 166 hospitalized CT scan proved cases of hemorrhagic stroke (ICD 9, 431-432), and an age and sex matched control per case. The controls were selected from patients who attended the study hospital for conditions other than stroke. On conditional multiple logistic regression, five risk factors were significant: hypertension (OR = 1.9, 95% CI = 1.5-2.5), raised serum total cholesterol (OR = 2.3, 95% CI = 1.1-4.9), use of anticoagulants and antiplatelet agents (OR = 3.4, 95% CI = 1.1-10.4), past history of transient ischaemic attack (OR = 8.4, 95% CI = 2.1-33.6) and alcohol intake (OR = 2.1, 95% CI = 1.3-3.6). These factors were ascribed statistical weights (based on regression coefficients) of 6, 8, 12, 21 and 8 respectively. The nonsignificant factors (diabetes mellitus, physical inactivity, obesity, smoking, type A personality, history of claudication, family history of stroke, history of cardiac diseases and oral contraceptive use in females) were not included in the development of the scoring system. The ROC curve suggested a total score of 21 to be the best cut-off for predicting haemorrhagic stroke. At this cut-off the sensitivity, specificity, positive predictivity and Cohen's kappa were 0.74, 0.74, 0.74 and 0.48 respectively. The overall predictive accuracy of this additive risk scoring system (area under ROC curve by Wilcoxon statistic) was 0.79 (95% CI = 0.73-0.84). Thus to conclude, if substantiated by further validation, this scoring system can be used to predict haemorrhagic stroke, thereby helping to devise effective risk factor intervention strategy.
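As a minimal sketch of how an additive risk score of this kind is applied, the weights and the cut-off of 21 below are taken from the abstract. The factor names, the function names, and the choice to treat a total equal to the cut-off as a positive prediction are assumptions made for illustration; the abstract does not state whether the cut-off is inclusive.

```python
# Weights reported in the abstract, in the order the risk factors are listed.
WEIGHTS = {
    "hypertension": 6,
    "raised_total_cholesterol": 8,
    "anticoagulant_or_antiplatelet_use": 12,
    "past_tia": 21,
    "alcohol_intake": 8,
}
CUTOFF = 21  # best cut-off suggested by the ROC curve in the abstract


def risk_score(present_factors):
    """Add up the weights of the risk factors that are present."""
    unknown = set(present_factors) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown risk factors: {sorted(unknown)}")
    return sum(WEIGHTS[name] for name in set(present_factors))


def predicts_stroke(present_factors):
    """Assumption: a total at or above the cut-off counts as a positive prediction."""
    return risk_score(present_factors) >= CUTOFF


patient = {"hypertension", "raised_total_cholesterol"}
print(risk_score(patient), predicts_stroke(patient))
```

With these weights, a past transient ischaemic attack alone reaches the cut-off, whereas hypertension and raised cholesterol together (6 + 8 = 14) do not.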
Examining the Potential for Gender Bias in the Prediction of Symptom Validity Test Failure by MMPI-2 Symptom Validity Scale Scores

ERIC Educational Resources Information Center

Lee, Tayla T. C.; Graham, John R.; Sellbom, Martin; Gervais, Roger O.

2012-01-01

Using a sample of individuals undergoing medico-legal evaluations (690 men, 519 women), the present study extended past research on potential gender biases for scores of the Symptom Validity (FBS) scale of the Minnesota Multiphasic Personality Inventory-2 by examining score- and item-level differences between men and women and determining the…

Can outcome of pancreatic pseudocysts be predicted? Proposal for a new scoring system.

PubMed

Şenol, Kazım; Akgül, Özgür; Gündoğdu, Salih Burak; Aydoğan, İhsan; Tez, Mesut; Coşkun, Faruk; Tihan, Deniz Necdet

2016-03-01

The spontaneous resolution rate of pancreatic pseudocysts (PPs) is 86%, and the serious complication rate is 3-9%. The aim of the present study was to develop a scoring system that would predict spontaneous resolution of PPs. Medical records of 70 patients were retrospectively reviewed. Two patients were excluded. Demographic data and laboratory measurements were obtained from patient records. Mean age of the 68 patients included was 56.6 years. Female:male ratio was 1.34:1. Causes of pancreatitis were stones (48.5%), alcohol consumption (26.5%), and unknown etiology (25%). Mean size of PP was 71 mm. Pseudocysts disappeared in 32 patients (47.1%). With univariate analysis, serum direct bilirubin level (>0.95 mg/dL), cyst carcinoembryonic antigen (CEA) level (>1.5), and cyst diameter (>55 mm) were found to be significantly different between patients with and without spontaneous resolution. In multivariate analysis, these variables were statistically significant. Scores were calculated with points assigned to each variable. Final scores predicted spontaneous resolution in approximately 80% of patients. The scoring system developed to predict resolution of PPs is simple and useful, but requires validation.
SPIDERplan: A tool to support decision-making in radiation therapy treatment plan assessment.

PubMed

Ventura, Tiago; Lopes, Maria do Carmo; Ferreira, Brigida Costa; Khouri, Leila

2016-01-01

In this work, a graphical method for radiotherapy treatment plan assessment and comparison, named SPIDERplan, is proposed. It aims to support plan approval allowing independent and consistent comparisons of different treatment techniques, algorithms or treatment planning systems. Optimized plans from modern radiotherapy are not easy to evaluate and compare because of their inherent multicriterial nature. The clinical decision on the best treatment plan is mostly based on subjective options. SPIDERplan combines a graphical analysis with a scoring index. Customized radar plots based on the categorization of structures into groups and on the determination of individual structures scores are generated. To each group and structure, an angular amplitude is assigned expressing the clinical importance defined by the radiation oncologist. Completing the graphical evaluation, a global plan score, based on the structures score and their clinical weights, is determined. After a necessary clinical validation of the group weights, SPIDERplan efficacy, to compare and rank different plans, was tested through a planning exercise where plans had been generated for a nasal cavity case using different treatment planning systems. SPIDERplan method was applied to the dose metrics achieved by the nasal cavity test plans. The generated diagrams and scores successfully ranked the plans according to the prescribed dose objectives and constraints and the radiation oncologist priorities, after a necessary clinical validation process. SPIDERplan enables a fast and consistent evaluation of plan quality considering all targets and organs at risk.
Cross-cultural adaptation and validation of the Korean version of the neck disability index.

PubMed

Song, Kyung-Jin; Choi, Byung-Wan; Choi, Byung-Ryeul; Seo, Gyeu-Beom

2010-09-15

Validation of a translated, culturally adapted questionnaire. The purpose of this study is to translate and culturally adapt the Neck Disability Index (NDI) and to validate the use of the derived version in Korean patients. Although several valid measures exist for measurement of neck pain and functional impairment, these measures have not yet been validated in a Korean version. The NDI was linguistically translated into Korean, and the prefinal version was assessed and modified by a pilot study. The reliability and validity of the derived Korean version was examined in 78 patients with degenerative cervical spine disease. Test-retest reliability, internal consistency, and construct validity were investigated by comparing Visual Analogue Scale (VAS) and Short Form Health Survey (SF-36) scores. Factor analysis of the Korean NDI extracted 2 factors with eigenvalues >1. The intraclass correlation coefficient of test-retest reliability was 0.93. Reliability, estimated by internal consistency, had a Cronbach alpha value of 0.82. The correlation between NDI and VAS scores was r = 0.49, and the correlation between NDI and SF-36 scores was r = -0.44. The physical health component score of SF-36 was highly correlated with NDI, and the correlation between VAS scores and the mental health component scores of SF-36 was high. The derived Korean version of the NDI was found to be a reliable and valid instrument for measuring disability in Korean patients with cervical problems. The authors recommend its use in future Korean clinical studies.
Validation of the Retinal Detachment after Open Globe Injury (RD-OGI) Score as an Effective Tool for Predicting Retinal Detachment.

PubMed

Brodowska, Katarzyna; Stryjewski, Tomasz P; Papavasileiou, Evangelia; Chee, Yewlin E; Eliott, Dean

2017-05-01

The Retinal Detachment after Open Globe Injury (RD-OGI) Score is a clinical prediction model that was developed at the Massachusetts Eye and Ear Infirmary to predict the risk of retinal detachment (RD) after open globe injury (OGI). This study sought to validate the RD-OGI Score in an independent cohort of patients. Retrospective cohort study. The predictive value of the RD-OGI Score was evaluated by comparing the original RD-OGI Scores of 893 eyes with OGI that presented between 1999 and 2011 (the derivation cohort) with 184 eyes with OGI that presented from January 1, 2012, to January 31, 2014 (the validation cohort). Three risk classes (low, moderate, and high) were created and logistic regression was undertaken to evaluate the optimal predictive value of the RD-OGI Score. A Kaplan-Meier survival analysis evaluated survival experience between the risk classes. Time to RD. At 1 year after OGI, 255 eyes (29%) in the derivation cohort and 66 eyes (36%) in the validation cohort were diagnosed with an RD. At 1 year, the low risk class (RD-OGI Scores 0-2) had a 3% detachment rate in the derivation cohort and a 0% detachment rate in the validation cohort, the moderate risk class (RD-OGI Scores 2.5-4.5) had a 29% detachment rate in the derivation cohort and a 35% detachment rate in the validation cohort, and the high risk class (RD-OGI scores 5-7.5) had a 73% detachment rate in the derivation cohort and an 86% detachment rate in the validation cohort. Regression modeling revealed the RD-OGI to be highly discriminative, especially 30 days after injury, with an area under the receiver operating characteristic curve of 0.939 in the validation cohort. Survival experience was significantly different depending upon the risk class (P < 0.0001, log-rank chi-square). The RD-OGI Score can reliably predict the future risk of developing an RD based on clinical variables that are present at the time of the initial evaluation after OGI. 
Copyright © 2017 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
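The three risk classes reported in the abstract above can be expressed as a simple lookup. This is only a sketch: the class boundaries (0-2, 2.5-4.5, 5-7.5) come from the abstract, while the function name and the assumption that scores move in half-point steps (inferred from those boundaries) are not from the source. How the RD-OGI Score itself is computed from clinical variables is not described in the abstract and is not modelled here.

```python
def rd_ogi_risk_class(score):
    """Map an RD-OGI Score to the risk class named in the abstract.

    Assumption: valid scores are multiples of 0.5 between 0 and 7.5,
    as suggested by the reported class boundaries.
    """
    if not 0 <= score <= 7.5 or (score * 2) % 1 != 0:
        raise ValueError("score must be a multiple of 0.5 between 0 and 7.5")
    if score <= 2:
        return "low"
    if score <= 4.5:
        return "moderate"
    return "high"


for s in (0, 2, 2.5, 4.5, 5, 7.5):
    print(s, rd_ogi_risk_class(s))
```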
Reliability and Validity of the Italian Version of the Protocol of Orofacial Myofunctional Evaluation with Scores (I-OMES).

PubMed

Scarponi, Letizia; de Felicio, Claudia Maria; Sforza, Chiarella; Pimenta Ferreira, Claudia Lucia; Ginocchio, Daniela; Pizzorni, Nicole; Barozzi, Stefania; Mozzanica, Francesco; Schindler, Antonio

2018-05-30

To evaluate the reliability, validity, and responsiveness of the Italian OMES (I-OMES). The study consisted of 3 phases: (1) internal consistency and reliability, (2) validity, and (3) responsiveness analysis. The recruited population included 27 patients with orofacial myofunctional disorders (OMD) and 174 healthy volunteers. Forty-seven subjects, 18 healthy and all recruited patients with OMD were assessed for inter-rater and test-retest reliability analysis. I-OMES and Nordic Orofacial Test - Screening (NOT-S) scores of the patients were correlated for concurrent validity analysis. I-OMES scores from 27 patients with OMD and 27 age- and gender-matched healthy subjects were compared to investigate construct validity. I-OMES scores before and after successful swallowing rehabilitation in patients were compared for responsiveness analysis. Adequate internal consistency (Cronbach α = 0.71) and strong inter-rater and test-retest reliability (intraclass correlation coefficient = 0.97 and 0.98, respectively) were found. I-OMES and NOT-S scores significantly and inversely correlated (r = -0.38). A statistically significant difference (p < 0.001) was found between the pathological group and the control group for the total I-OMES score. The mean I-OMES score improved from 90 (78-102) to 99 (89-103) after myofunctional rehabilitation (p < 0.001). The I-OMES is a reliable and valid tool to evaluate OMD. © 2018 S. Karger AG, Basel.
Screening for postdeployment conditions: development and cross-validation of an embedded validity scale in the neurobehavioral symptom inventory.

PubMed

Vanderploeg, Rodney D; Cooper, Douglas B; Belanger, Heather G; Donnell, Alison J; Kennedy, Jan E; Hopewell, Clifford A; Scott, Steven G

2014-01-01

To develop and cross-validate internal validity scales for the Neurobehavioral Symptom Inventory (NSI). Four existing data sets were used: (1) outpatient clinical traumatic brain injury (TBI)/neurorehabilitation database from a military site (n = 403), (2) National Department of Veterans Affairs TBI evaluation database (n = 48 175), (3) Florida National Guard nonclinical TBI survey database (n = 3098), and (4) a cross-validation outpatient clinical TBI/neurorehabilitation database combined across 2 military medical centers (n = 206). Secondary analysis of existing cohort data to develop (study 1) and cross-validate (study 2) internal validity scales for the NSI. The NSI, Mild Brain Injury Atypical Symptoms, and Personality Assessment Inventory scores. Study 1: Three NSI validity scales were developed, composed of 5 unusual items (Negative Impression Management [NIM5]), 6 low-frequency items (LOW6), and the combination of 10 nonoverlapping items (Validity-10). Cut scores maximizing sensitivity and specificity on these measures were determined, using a Mild Brain Injury Atypical Symptoms score of 8 or more as the criterion for invalidity. Study 2: The same validity scale cut scores again resulted in the highest classification accuracy and optimal balance between sensitivity and specificity in the cross-validation sample, using a Personality Assessment Inventory Negative Impression Management scale with a T score of 75 or higher as the criterion for invalidity. The NSI is widely used in the Department of Defense and Veterans Affairs as a symptom-severity assessment following TBI, but is subject to symptom overreporting or exaggeration. This study developed embedded NSI validity scales to facilitate the detection of invalid response styles. The NSI Validity-10 scale appears to hold considerable promise for validity assessment when the NSI is used as a population-screening tool.
Crosscultural Adaptation and Validation of the Korean Version of the New Knee Society Knee Scoring System.

PubMed

Kim, Seok Jin; Basur, Mohnish Singh; Park, Chang Kyu; Chong, Suri; Kang, Yeon Gwi; Kim, Moon Ju; Jeong, Jeong Seong; Kim, Tae Kyun

2017-06-01

The 2011 Knee Society Score© (2011 KS Score©) is used to characterize the expectations, symptoms, physical activity, and satisfaction of patients who undergo TKA and is widely used to assess the outcome of TKA. However, it has not been adapted or validated for use in Korea. We developed a Korean version of the 2011 KS Score and evaluated the (1) test-retest reliability, (2) convergent validity, and (3) responsiveness of the Korean version. The Korean version of the 2011 KS Score was derived by using a well-established translational procedure based on international guidelines, which include translation, synthesis, back-translation, expert committee review, pretesting, and submission for appraisal. A total of 123 patients with knee osteoarthritis who were scheduled to undergo TKA were recruited for the study. Ninety percent of the patients (111 of 123) were women, which is an exact representation of the Korean population having TKAs. To evaluate reliability, the patients were evaluated twice during a 4-week interval using the questionnaire. Reliability was assessed by using intraclass correlation coefficients (ICCs) and internal consistency by using Cronbach's alpha to determine the validity of the Korean version of the 2011 KS Score. The patients were evaluated by using the validated Korean versions of the WOMAC and SF-36 questionnaires. Spearman's correlation coefficient was used for validation. Responsiveness was determined by calculating the standardized response mean from the preoperative and postoperative test scores in the Korean version of the 2011 KS Score. To address the gender disparity in our study we identified 53 males who underwent TKA for osteoarthritis after completion of this study and generated age-matched controlled groups to evaluate construct validity and responsiveness in Korean males. 
The reliability proved good to excellent with an ICC between 0.69 and 0.85, depending on the clinical properties tested, which included the following: symptoms, satisfaction, expectation, and total functional activity consisting of functional activity, standard activity, advanced activity, and discretionary activity. All subscales showed good to excellent internal consistency indicated by Cronbach's alpha (range, 0.83-0.92). For validity, three of the four domains (the exception was expectation) of the 2011 KS Score correlated either strongly or moderately with the Korean WOMAC score (r ≥ 0.35). When compared with the SF-36, the satisfaction domain showed a weak positive correlation with all the subscales of the SF-36 except general health (r < 0.35). The activity domain showed a strong positive correlation with physical function (r = 0.62) and physical component summary (r = 0.52), moderate with physical role (r = 0.46), and weak with bodily pain (r = 0.26) and social function (r = 0.31). The symptom domain also exhibited a similar moderate positive correlation with physical function (r = 0.41) and weak positive correlation with bodily pain, social function, and physical component summary (r = 0.22, 0.20, and 0.26, respectively). For responsiveness, all the domains of the Korean version of the 2011 KS Score, except for expectation, showed large changes (> 0.8), calculated as standardized response mean. The total amount of the Korean version of the 2011 KS Score (2.03, p < 0.001) showed higher responsiveness when compared with the WOMAC total (1.88, p < 0.001) and SF-36 physical and mental component summaries (1.14, p < 0.001; and 0.68, p < 0.001, respectively). The Korean version of the 2011 KS Score was successfully developed using a process of crosscultural adaptation for the Korean-speaking population who had undergone TKA for osteoarthritis of the knee. 
The Korean version of the 2011 KS Score was shown to be a reliable, valid, and responsive tool and can be used to assess functional outcomes and expectations of Korean patients who undergo TKA. The demographic features of TKA in the Korean population should be taken into account with additional studies recommended to further investigate these psychometric properties in Korean men. Level II, diagnostic study.
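The responsiveness measure named in the abstract above, the standardized response mean (SRM), is the mean of the paired pre-to-post changes divided by the standard deviation of those changes. The sketch below illustrates that definition only; the scores are made-up illustrative numbers, not study data, and the function names are hypothetical. The 0.8 threshold for a "large" change is the one quoted in the abstract.

```python
from statistics import mean, stdev


def standardized_response_mean(pre, post):
    """SRM = mean(post - pre) / sample SD of (post - pre), for paired scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two paired observations")
    changes = [after - before for before, after in zip(pre, post)]
    return mean(changes) / stdev(changes)


def is_large_change(srm, threshold=0.8):
    """Threshold of 0.8 as quoted in the abstract for a large change."""
    return srm > threshold


# Illustrative, invented scores for four hypothetical patients.
pre_scores = [40, 50, 45, 60]
post_scores = [70, 85, 80, 88]
srm = standardized_response_mean(pre_scores, post_scores)
print(round(srm, 2), is_large_change(srm))
```

Because the denominator is the spread of the individual changes rather than the baseline spread, a consistent improvement across patients yields a high SRM even when the absolute change is modest.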
The East London glaucoma prediction score: web-based validation of glaucoma risk screening tool

PubMed Central

Stephen, Cook; Benjamin, Longo-Mbenza

2013-01-01

AIM It is difficult for Optometrists and General Practitioners to know which patients are at risk. The East London glaucoma prediction score (ELGPS) is a web based risk calculator that has been developed to determine Glaucoma risk at the time of screening. Multiple risk factors that are available in a low tech environment are assessed to provide a risk assessment. This is extremely useful in settings where access to specialist care is difficult. Use of the calculator is educational. It is a free web based service. Data capture is user specific. METHOD The scoring system is a web based questionnaire that captures and subsequently calculates the relative risk for the presence of Glaucoma at the time of screening. Three categories of patient are described: Unlikely to have Glaucoma; Glaucoma Suspect and Glaucoma. A case review methodology of patients with known diagnosis is employed to validate the calculator risk assessment. RESULTS Data from the patient records of 400 patients with an established diagnosis has been captured and used to validate the screening tool. The website reports that the calculated diagnosis correlates with the actual diagnosis 82% of the time. Biostatistics analysis showed: Sensitivity = 88%; Positive predictive value = 97%; Specificity = 75%. CONCLUSION Analysis of the first 400 patients validates the web based screening tool as being a good method of screening for the at risk population. The validation is ongoing. The web based format will allow a more widespread recruitment for different geographic, population and personnel variables. PMID:23550097
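The sensitivity, specificity, and positive predictive value quoted in the abstract above are ratios taken from a 2x2 table of predicted versus actual diagnosis. The sketch below shows the standard definitions only; the abstract does not publish the underlying counts, so the counts used here are hypothetical and the function name is an assumption made for illustration.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Return sensitivity, specificity, and positive predictive value from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients correctly flagged
        "specificity": tn / (tn + fp),   # non-diseased patients correctly cleared
        "ppv": tp / (tp + fp),           # flagged patients who truly have the disease
    }


# Hypothetical counts for illustration only; not the ELGPS study data.
metrics = diagnostic_accuracy(tp=90, fp=10, fn=10, tn=90)
```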
Cross-cultural adaptation and validation of the Japanese version of the new Knee Society Scoring System for osteoarthritic knee with total knee arthroplasty.

PubMed

Hamamoto, Yosuke; Ito, Hiromu; Furu, Moritoshi; Ishikawa, Masahiro; Azukizawa, Masayuki; Kuriyama, Shinichi; Nakamura, Shinichiro; Matsuda, Shuichi

2015-09-01

The purposes of this study were to translate the new Knee Society Score (KSS) into Japanese and to evaluate the construct and content validity, test-retest reliability, and internal consistency of the Japanese version of the new KSS. The Japanese version of the KSS was developed according to cross-cultural guidelines by using the "translation-back translation" method to ensure content validity. KSS data were then obtained from patients who had undergone total knee arthroplasty (TKA). The psychometric properties evaluated were as follows: for feasibility, response rate, and floor and ceiling effects; for construct validity, internal consistency using Cronbach's alpha, and correlations with quality of life. Construct validity was evaluated by using Spearman's correlation coefficient to quantify the correlation between the KSS and the Japanese version of the Oxford 12-item Knee Score or Short Form 36 Health Survey (SF-36) questionnaires. The Japanese version of the KSS was sent to 93 consecutive osteoarthritic patients who underwent primary TKA in our institution. Fifty-five patients completed the questionnaires and were included in this study. Neither a floor nor ceiling effect was observed. The reliability proved excellent in the majority of domains, with intraclass correlation coefficients of 0.65-0.88. Internal consistency, assessed by Cronbach's alpha, was good to excellent for all domains (0.78-0.94). All of the four domains of the KSS correlated significantly with the Oxford 12-item Knee Score. The activity and satisfaction domains of the KSS correlated significantly with all and the majority of subscales of the SF-36, respectively, whereas symptoms and expectation domains showed significant correlations only with bodily pain and vitality subscales and with the physical function, bodily pain, and vitality subscales, respectively. 
The Japanese version of the new KSS is a valid, reliable, and responsive instrument to capture subjective aspects of the functional symptoms and abilities of patients who undergo TKA.
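Internal consistency in the abstract above is summarized with Cronbach's alpha. A minimal sketch of the usual formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), follows. The data layout (one list of scores per questionnaire item, respondents in the same order) and the function name are assumptions for illustration, not the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one inner list per item)."""
    k = len(items)
    # Total score per respondent: sum across items at the same position.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: two items answered identically by four respondents.
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
```

Two perfectly concordant items are used here only because the result is easy to check by hand; real questionnaire data would give a value below 1.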
Simple Scoring System to Predict In-Hospital Mortality After Surgery for Infective Endocarditis.

PubMed

Gatti, Giuseppe; Perrotti, Andrea; Obadia, Jean-François; Duval, Xavier; Iung, Bernard; Alla, François; Chirouze, Catherine; Selton-Suty, Christine; Hoen, Bruno; Sinagra, Gianfranco; Delahaye, François; Tattevin, Pierre; Le Moing, Vincent; Pappalardo, Aniello; Chocron, Sidney

2017-07-20

Aspecific scoring systems are used to predict the risk of death postsurgery in patients with infective endocarditis (IE). The purpose of the present study was both to analyze the risk factors for in-hospital death, which complicates surgery for IE, and to create a mortality risk score based on the results of this analysis. Outcomes of 361 consecutive patients (mean age, 59.1±15.4 years) who had undergone surgery for IE in 8 European centers of cardiac surgery were recorded prospectively, and a risk factor analysis (multivariable logistic regression) for in-hospital death was performed. The discriminatory power of a new predictive scoring system was assessed with the receiver operating characteristic curve analysis. Score validation procedures were carried out. Fifty-six (15.5%) patients died postsurgery. BMI >27 kg/m² (odds ratio [OR], 1.79; P = 0.049), estimated glomerular filtration rate <50 mL/min (OR, 3.52; P < 0.0001), New York Heart Association class IV (OR, 2.11; P = 0.024), systolic pulmonary artery pressure >55 mm Hg (OR, 1.78; P = 0.032), and critical state (OR, 2.37; P = 0.017) were independent predictors of in-hospital death. A scoring system was devised to predict in-hospital death postsurgery for IE (area under the receiver operating characteristic curve, 0.780; 95% CI, 0.734-0.822). The score performed better than 5 of 6 scoring systems for in-hospital death after cardiac surgery that were considered. A simple scoring system based on risk factors for in-hospital death was specifically created to predict mortality risk postsurgery in patients with IE. © 2017 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley.
Validation of "Signs of Inflammation in Children that Kill" (SICK) score for immediate non-invasive assessment of severity of illness.

PubMed

Gupta, Manoj A; Chakrabarty, Anjan; Halstead, Ruth; Sahni, Mohit; Rangasami, Jayanti; Puliyel, Ashish; Sreenivas, Vishnubhatla; Green, David A; Puliyel, Jacob M

2010-04-26

To validate the SICK scoring system's ability to differentiate between individuals with higher and lower probabilities of death. We performed a one-year, two-centre prospective evaluation of all children aged between one month and 12 years referred to the Paediatric team at St Stephens Hospital in Delhi and admitted to the Paediatric Department at West Middlesex University Hospital in London. We calculated SICK scores at presentation and correlated them with subsequent in-hospital mortality. We used discrimination by areas under receiver operating characteristic (ROC) curves to measure performance. We prospectively evaluated 3895 children in Delhi and 1473 children in London. The areas under the ROC curves were 84.8% in Delhi, 81.0% in London and 84.1% (95% CI 77.4-90.8%) for combined data. Hosmer-Lemeshow goodness of fit for the combined data was good (Hosmer-Lemeshow chi-square = 2.13, p = 0.345). We propose the SICK score as a useful triage tool at initial presentation and highlight its particular suitability for resource poor settings.
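The area under the ROC curve used above as the discrimination measure can be read as the probability that a randomly chosen child who died had a higher score than a randomly chosen survivor, with ties counted as one half. The sketch below computes it by direct pairwise comparison. This is a generic illustration of that interpretation, not the authors' method, and the function name and scores are hypothetical.

```python
def roc_auc(scores_events, scores_non_events):
    """AUC as the fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    concordant = 0.0
    for event_score in scores_events:
        for non_event_score in scores_non_events:
            if event_score > non_event_score:
                concordant += 1.0
            elif event_score == non_event_score:
                concordant += 0.5
    return concordant / (len(scores_events) * len(scores_non_events))


# Hypothetical severity scores (illustration only).
perfect = roc_auc([3, 4], [1, 2])       # every event outranks every non-event
uninformative = roc_auc([1, 2], [1, 2])  # identical score distributions
```

The pairwise loop is quadratic in the number of patients, which is fine for a sketch; a rank-based calculation would be preferable for cohorts of several thousand children.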
Validation of "Signs of Inflammation in Children that Kill" (SICK) score for immediate non-invasive assessment of severity of illness

PubMed Central

2010-01-01

Objective To validate the SICK scoring system's ability to differentiate between individuals with higher and lower probabilities of death. Method We performed a one-year, two-centre prospective evaluation of all children aged between one month and 12 years referred to the Paediatric team at St Stephens Hospital in Delhi and admitted to the Paediatric Department at West Middlesex University Hospital in London. We calculated SICK scores at presentation and correlated them with subsequent in-hospital mortality. We used discrimination by areas under receiver operating characteristic (ROC) curves to measure performance. Results We prospectively evaluated 3895 children in Delhi and 1473 children in London. The areas under the ROC curves were 84.8% in Delhi, 81.0% in London and 84.1% (95% CI 77.4-90.8%) for combined data. Hosmer-Lemeshow goodness of fit for the combined data was good (Hosmer-Lemeshow chi-square = 2.13, p = 0.345). Conclusions We propose the SICK score as a useful triage tool at initial presentation and highlight its particular suitability for resource poor settings. PMID:20420670
The Japanese Histologic Classification and T-score in the Oxford Classification system could predict renal outcome in Japanese IgA nephropathy patients.

PubMed

Kaihan, Ahmad Baseer; Yasuda, Yoshinari; Katsuno, Takayuki; Kato, Sawako; Imaizumi, Takahiro; Ozeki, Takaya; Hishida, Manabu; Nagata, Takanobu; Ando, Masahiko; Tsuboi, Naotake; Maruyama, Shoichi

2017-12-01

The Oxford Classification is utilized globally, but has not been fully validated. In this study, we conducted a comparative analysis between the Oxford Classification and Japanese Histologic Classification (JHC) to predict renal outcome in Japanese patients with IgA nephropathy (IgAN). A retrospective cohort study including 86 adult IgAN patients was conducted. The Oxford Classification and the JHC were evaluated by 7 independent specialists. The JHC, MEST score in the Oxford Classification, and crescents were analyzed in association with renal outcome, defined as a 50% increase in serum creatinine. In multivariate analysis without the JHC, only the T score was significantly associated with renal outcome. In multivariate analysis with the JHC, by contrast, only the JHC showed a significant association. The JHC and T score in the Oxford Classification were associated with renal outcome among Japanese patients with IgAN. Superiority of the JHC as a predictive index should be validated with a larger study population and cohort studies in different ethnicities.
Children's human figure drawings do not measure intellectual ability.

PubMed

Willcock, Emma; Imuta, Kana; Hayne, Harlene

2011-11-01

Children typically follow a well-defined series of stages as they learn to draw, but the rate at which they progress through these stages varies from child to child. Some experts have argued that these individual differences in drawing development reflect individual differences in intelligence. Here we assessed the validity of a drawing test that is commonly used to assess children's intellectual abilities. In a single study, 125 5- and 6-year-olds completed the Draw-A-Person: A Quantitative Scoring System (DAP:QSS) and the Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R) or the Wechsler Abbreviated Scale of Intelligence (WASI). Although there was a statistically significant correlation between scores on the DAP:QSS and scores on the Wechsler tests, when the scores of individual children were examined, the DAP:QSS yielded a high number of false positives and false negatives for low intellectual functioning. We conclude that the DAP:QSS is not a valid measure of intellectual ability and should not be used as a screening tool. Copyright © 2011 Elsevier Inc. All rights reserved.
Vestibular evoked myogenic potentials and MRI in early multiple sclerosis: Validation of the VEMP score.

PubMed

Crnošija, Luka; Krbot Skorić, Magdalena; Gabelić, Tereza; Adamec, Ivan; Habek, Mario

2017-01-15

To validate the VEMP score as a measure of brainstem dysfunction in patients with the first symptom of multiple sclerosis (MS) (clinically isolated syndrome (CIS)) and to investigate the correlation between VEMP and brainstem MRI results. 121 consecutive CIS patients were enrolled and brainstem functional system score (BSFS) was determined. Ocular VEMP (oVEMP) and cervical VEMP (cVEMP) were analyzed for latencies, conduction block and amplitude asymmetry ratio and the VEMP score was calculated. MRI was analyzed for the presence of brainstem lesions as a whole and separately for the presence of pontine, midbrain and medulla oblongata lesions. Patients with signs of brainstem involvement during the neurological examination (with BSFS ≥1) had a higher oVEMP score compared to patients with no signs of brainstem involvement. A binary logistic regression model showed that patients with brainstem lesion on the MRI are 6.780 times more likely to have BSFS ≥1 (p=0.001); and also, a higher VEMP score is associated with BSFS ≥1 (p=0.042). Furthermore, significant correlations were found between clinical brainstem involvement and brainstem and pontine MRI lesions, and prolonged latencies and/or absent VEMP responses. The VEMP score is a valuable tool in evaluation of brainstem involvement in patients with early MS. Copyright © 2016 Elsevier B.V. All rights reserved.
Commentary on "Validating the Interpretations and Uses of Test Scores"

ERIC Educational Resources Information Center

Brennan, Robert L.

2013-01-01

Kane's paper "Validating the Interpretations and Uses of Test Scores" is the most complete and clearest discussion yet available of the argument-based approach to validation. At its most basic level, validation as formulated by Kane is fundamentally a simply-stated two-step enterprise: (1) specify the claims inherent in a particular interpretation…
Establishing inter-rater reliability scoring in a state trauma system.

PubMed

Read-Allsopp, Christine

2004-01-01

Trauma systems rely on accurate Injury Severity Scoring (ISS) to describe trauma patient populations. Twenty-seven (27) Trauma Nurse Coordinators and Data Managers across the New South Wales, Australia, state trauma network were instructed in the uses and techniques of the Abbreviated Injury Scale (AIS) from the Association for the Advancement of Automotive Medicine. The aim is to provide accurate, reliable and valid data for the state trauma network. Four (4) months after the course a coding exercise was conducted to assess inter-rater reliability. The results show that inter-rater reliability is within accepted international standards.
Development and Score Validation of a Chemistry Laboratory Anxiety Instrument (CLAI) for College Chemistry Students.

ERIC Educational Resources Information Center

Bowen, Craig W.

1999-01-01

Reports the development and score validation of an instrument for measuring anxieties students experience in college chemistry laboratories. Factor analysis of scores from 361 college students shows that the developed Chemistry Laboratory Anxiety Instrument measures five constructs. Results from a second sample of 598 students show that scores on…
Intrarater and interrater reliability and validity in the assessment of the mechanism of injury and integrity of the posterior ligamentous complex: a novel injury severity scoring system for thoracolumbar injuries. Invited submission from the Joint Section Meeting On Disorders of the Spine and Peripheral Nerves, March 2005.

PubMed

Harrop, James S; Vaccaro, Alexander R; Hurlbert, R John; Wilsey, Jared T; Baron, Eli M; Shaffrey, Christopher I; Fisher, Charles G; Dvorak, Marcel F; Oner, F C; Wood, Kirkham B; Anand, Neel; Anderson, D Greg; Lim, Moe R; Lee, Joon Y; Bono, Christopher M; Arnold, Paul M; Rampersaud, Y Raja; Fehlings, Michael G

2006-02-01

A new classification and treatment algorithm for thoracolumbar injuries was introduced by Vaccaro and colleagues in 2005. A thoracolumbar injury severity scale (TLISS) was proposed for grading and guiding treatment for these injuries. The scale is based on the following: 1) the mechanism of injury; 2) the integrity of the posterior ligamentous complex (PLC); and 3) the patient's neurological status. The reliability and validity of assessing injury mechanism and the integrity of the PLC were assessed. Forty-eight spine surgeons, consisting of neurosurgeons and orthopedic surgeons, reviewed 56 clinical thoracolumbar injury case histories. Each was classified and scored to determine treatment recommendations according to a novel classification system. After 3 months the case histories were reordered and the physicians repeated the exercise. Validity of this classification was good among reviewers; the vast majority (> 90%) agreed with the system's treatment recommendations. Surgeons were unclear as to a cogent description of PLC disruption and fracture mechanism. The TLISS demonstrated acceptable reliability in terms of intra- and interobserver agreement on the algorithm's treatment recommendations. Replacing injury mechanism with a description of injury morphology and better definition of PLC injury will improve inter- and intraobserver reliability of this injury classification system.
Turkish Version of Kolcaba's Immobilization Comfort Questionnaire: A Validity and Reliability Study.

PubMed

Tosun, Betül; Aslan, Özlem; Tunay, Servet; Akyüz, Aygül; Özkan, Hüseyin; Bek, Doğan; Açıksöz, Semra

2015-12-01

The purpose of this study was to determine the validity and reliability of the Turkish version of the Immobilization Comfort Questionnaire (ICQ). The sample used in this methodological study consisted of 121 patients undergoing lower extremity arthroscopy in a training and research hospital. The validity study of the questionnaire assessed language validity, structural validity and criterion validity. Structural validity was evaluated via exploratory factor analysis. Criterion validity was evaluated by assessing the correlation between the visual analog scale (VAS) scores (i.e., the comfort and pain VAS scores) and the ICQ scores using Spearman's correlation test. The Kaiser-Meyer-Olkin coefficient and Bartlett's test of sphericity were used to determine the suitability of the data for factor analysis. Internal consistency was evaluated to determine reliability. The data were analyzed with SPSS version 15.00 for Windows. Descriptive statistics were presented as frequencies, percentages, means and standard deviations. A p value ≤ .05 was considered statistically significant. A moderate positive correlation was found between the ICQ scores and the VAS comfort scores; a moderate negative correlation was found between the ICQ and the VAS pain measures in the criterion validity analysis. Cronbach α values of .75 and .82 were found for the first and second measurements, respectively. The findings of this study reveal that the ICQ is a valid and reliable tool for assessing the comfort of patients in Turkey who are immobilized because of lower extremity orthopedic problems. Copyright © 2015. Published by Elsevier B.V.

Clinical utility and validity of the Functional Disability Inventory (FDI) among a multicenter sample of youth with chronic pain

PubMed Central

Kashikar-Zuck, Susmita; Flowers, Stacy R.; Claar, Robyn Lewis; Guite, Jessica W.; Logan, Deirdre E.; Lynch-Jordan, Anne M; Palermo, Tonya M.; Wilson, Anna C.

2011-01-01

The Functional Disability Inventory (FDI) is a well-established and commonly used measure of physical functioning and disability in youth with chronic pain. Further validation of the measure has been called for, in particular, examination of the clinical utility and factor structure of the measure. To address this need, we utilized a large multicenter dataset of pediatric patients with chronic pain who had completed the FDI and other measures assessing pain and emotional functioning. Clinical reference points to allow for interpretation of raw scores were developed to enhance clinical utility of the measure and exploratory factor analysis was performed to examine its factor structure. Participants included 1300 youth ages 8 to 18 years (M=14.2 years; 76% female) with chronic pain. Examination of the distribution of FDI scores and validation with measures of depressive symptoms and pain intensity yielded three distinct categories of disability: No/Minimal Disability, Moderate Disability and Severe Disability. Factor analysis of FDI scores revealed a two-factor solution representing vigorous Physical Activities and non-physically strenuous Daily Activities. The three-level classification system and factor structure were further explored via comparison across the four most commonly encountered pain conditions in clinical settings (head, back, abdominal and widespread pain). Our findings provide important new information regarding the clinical utility and validity of the FDI. This will greatly enhance the interpretability of scores for research and clinical use in a wide range of pediatric pain conditions. In particular these findings will facilitate use of the FDI as an outcome measure in future clinical trials. PMID:21458162
MEASURING SPORT-SPECIFIC PHYSICAL ABILITIES IN MALE GYMNASTS: THE MEN'S GYMNASTICS FUNCTIONAL MEASUREMENT TOOL

PubMed Central

Kenyon, Lisa K.; Elliott, James M; Cheng, M. Samuel

2016-01-01

Purpose/Background Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. Methods A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts’ USA-Gymnastics competitive level to calculate the coefficient of determination (r²). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. Results The relationship between total MGFMT scores and subjects’ current USA-Gymnastics competitive level was found to be good (r² = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). Conclusions The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level of Evidence Level 3 PMID:27999723
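For a simple (one-predictor) linear regression such as the one described above, the coefficient of determination equals the squared Pearson correlation between the two variables. The sketch below computes it from sums of squares. The function name and the sample values are hypothetical and serve only to illustrate the formula.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical competitive levels (x) and total test scores (y), illustration only.
perfect_fit = r_squared([1, 2, 3], [2, 4, 6])
partial_fit = r_squared([1, 2, 3], [1, 3, 2])
```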
The PER (Preoperative Esophagectomy Risk) Score: A Simple Risk Score to Predict Short-Term and Long-Term Outcome in Patients with Surgically Treated Esophageal Cancer.

PubMed

Reeh, Matthias; Metze, Johannes; Uzunoglu, Faik G; Nentwich, Michael; Ghadban, Tarik; Wellner, Ullrich; Bockhorn, Maximilian; Kluge, Stefan; Izbicki, Jakob R; Vashist, Yogesh K

2016-02-01

Esophageal resection in patients with esophageal cancer (EC) is still associated with high mortality and morbidity rates. We aimed to develop a simple preoperative risk score for the prediction of short-term and long-term outcomes for patients with EC treated by esophageal resection. In total, 498 patients suffering from esophageal carcinoma, who underwent esophageal resection, were included in this retrospective cohort study. Three preoperative esophagectomy risk (PER) groups were defined based on preoperative functional evaluation of different organ systems by validated tools (revised cardiac risk index, model for end-stage liver disease score, and pulmonary function test). Clinicopathological parameters, morbidity, and mortality as well as disease-free survival (DFS) and overall survival (OS) were correlated to the PER score. The PER score significantly predicted the short-term outcome of patients with EC who underwent esophageal resection. PER 2 and PER 3 patients had at least double the risk of morbidity and mortality compared to PER 1 patients. Furthermore, a higher PER score was associated with shorter DFS (P < 0.001) and OS (P < 0.001). The PER score was identified as an independent predictor of tumor recurrence (hazard ratio [HR] 2.1; P < 0.001) and OS (HR 2.2; P < 0.001). The PER score allows preoperative objective allocation of patients with EC into different risk categories for morbidity, mortality, and long-term outcomes. Thus, multicenter studies are needed for independent validation of the PER score.
Development of a computed tomography-based scoring system for necrotizing soft-tissue infections.

PubMed

McGillicuddy, Edward A; Lischuk, Andrew W; Schuster, Kevin M; Kaplan, Lewis J; Maung, Adrian; Lui, Felix Y; Bokhari, S A Jamal; Davis, Kimberly A

2011-04-01

Necrotizing soft-tissue infections (NSTIs) are associated with significant morbidity and mortality, but a definitive nonsurgical diagnostic test remains elusive. Despite the widespread use of computed tomography (CT) as a diagnostic adjunct, there is little data that definitively correlate CT findings with the presence of NSTI. Our goal was the development of a CT-based scoring system to discriminate non-NSTI from NSTI. Patients older than 17 years undergoing CT for evaluation of soft-tissue infection at a tertiary care medical center over a 10-year period (2000-2009) were included. Abstracted data included comorbidities and social history, physical examination, laboratory findings, and operative and pathologic findings. NSTI was defined as soft-tissue necrosis in the dictated operative note or the accompanying pathology report. CT scans were reviewed by a radiologist blinded to clinical and laboratory data. A scoring system was developed and the area under the receiver operating characteristic curve was calculated. During the study period, 305 patients underwent CT scanning (57% men; mean age, 47.4 years). Forty-four patients (14.4%) evaluated had an NSTI. A scoring system was retrospectively developed (table). A score >6 points was 86.3% sensitive and 91.5% specific for the diagnosis of NSTI (positive predictive value, 63.3%; negative predictive value, 85.5%). The area under the receiver operating characteristic curve was 0.928 (95% confidence interval, 0.893-0.964). The mean score of the non-NSTI group was 2.74. We have developed a CT scoring system that is both sensitive and specific for the diagnosis of NSTIs. This system may allow clinicians to more accurately diagnose NSTIs. Prospective validation of this scoring system is planned.
Multiple Score Comparison: a network meta-analysis approach to comparison and external validation of prognostic scores.

PubMed

Haile, Sarah R; Guerra, Beniamino; Soriano, Joan B; Puhan, Milo A

2017-12-21

Prediction models and prognostic scores have been increasingly popular in both clinical practice and clinical research settings, for example to aid in risk-based decision making or control for confounding. In many medical fields, a large number of prognostic scores are available, but practitioners may find it difficult to choose between them due to lack of external validation as well as lack of comparisons between them. Borrowing methodology from network meta-analysis, we describe an approach to Multiple Score Comparison meta-analysis (MSC) which permits concurrent external validation and comparisons of prognostic scores using individual patient data (IPD) arising from a large-scale international collaboration. We describe the challenges in adapting network meta-analysis to the MSC setting, for instance the need to explicitly include correlations between the scores on a cohort level, and how to deal with many multi-score studies. We propose first using IPD to make cohort-level aggregate discrimination or calibration scores, comparing all to a common comparator. Then, standard network meta-analysis techniques can be applied, taking care to consider correlation structures in cohorts with multiple scores. Transitivity, consistency and heterogeneity are also examined. We provide a clinical application, comparing prognostic scores for 3-year mortality in patients with chronic obstructive pulmonary disease using data from a large-scale collaborative initiative. We focus on the discriminative properties of the prognostic scores. Our results show clear differences in performance, with ADO and eBODE showing higher discrimination with respect to mortality than other considered scores. The assumptions of transitivity and local and global consistency were not violated. Heterogeneity was small. We applied a network meta-analytic methodology to externally validate and concurrently compare the prognostic properties of clinical scores. 
Our large-scale external validation indicates that the scores with the best discriminative properties to predict 3 year mortality in patients with COPD are ADO and eBODE.
The Functional Arm Scale for Throwers (FAST)-Part II: Reliability and Validity of an Upper Extremity Region-Specific and Population-Specific Patient-Reported Outcome Scale for Throwing Athletes.

PubMed

Huxel Bliven, Kellie C; Snyder Valier, Alison R; Bay, R Curtis; Sauers, Eric L

2017-04-01

The Functional Arm Scale for Throwers (FAST) is an upper extremity (UE) region-specific and population-specific patient-reported outcome (PRO) scale developed to measure health-related quality of life in throwers with UE injuries. Stages I and II, described in a companion paper, of FAST development produced a 22-item scale and a 9-item pitcher module. Stage III of scale development, establishing reliability and validity of the FAST, is reported herein. To describe stage III of scale development: reliability and validity of the FAST. Cohort study (diagnosis); Level of evidence, 2. Data from throwing athletes collected over 5 studies were pooled to assess reliability and validity of the FAST. Reliability was estimated using FAST scores from 162 throwing athletes who were injured (n = 23) and uninjured (n = 139). Concurrent validity was estimated using FAST scores and Disabilities of the Arm, Shoulder, and Hand (DASH) and Kerlan-Jobe Orthopaedic Clinic (KJOC) scores from 106 healthy, uninjured throwing athletes. Known-groups validity was estimated using FAST scores from 557 throwing athletes who were injured (n = 142) and uninjured (n = 415). Reliability and validity were assessed using intraclass correlation coefficients (ICCs), and measurement error was assessed using standard error of measurement (SEM) and minimum detectable change (MDC). Receiver operating characteristic curves and sensitivity/specificity values were estimated for known-groups validity. Data from a separate group (n = 18) of postsurgical and nonoperative/conservative rehabilitation patients were analyzed to report responsiveness of the FAST. The FAST total, subscales, and pitcher module scores demonstrated excellent test-retest reliability (ICC, 0.91-0.98). The SEM95 and MDC95 for the FAST total score were 3.8 and 10.5 points, respectively. The SEM95 and MDC95 for the pitcher module score were 5.7 and 15.7 points, respectively. 
The FAST scores showed acceptable correlation with DASH (ICC, 0.49-0.82) and KJOC (ICC, 0.62-0.81) scores. The FAST total score classified 85.1% of players into the correct injury group. For predicting UE injury status, a FAST total cutoff score of 10.0 out of 100.0 was 91% sensitive and 75% specific, and a pitcher module score of 10.0 out of 100.0 was 87% sensitive and 78% specific. The FAST total score demonstrated responsiveness on several indices between intake and discharge time points. The FAST is a reliable, valid, and responsive UE region-specific and population-specific PRO scale for measuring patient-reported health care outcomes in throwing athletes with injury.
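The abstract above does not state how SEM and MDC were derived. A minimal sketch, assuming the conventional formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM, is given below. The function names and the standard deviation in the example are hypothetical. Under this assumption, the reported total-score value of 3.8 points gives an MDC95 of about 10.5 points, which matches the abstract, and 5.7 points gives about 15.8, close to the reported 15.7.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from score SD and reliability (assumed formula)."""
    return sd * math.sqrt(1 - icc)


def mdc95(sem):
    """Minimum detectable change at 95% confidence (assumed formula)."""
    return 1.96 * math.sqrt(2) * sem


total_score_mdc = mdc95(3.8)            # 3.8 is the SEM reported for the FAST total score
example_sem = sem_from_icc(10.0, 0.91)  # SD of 10.0 is hypothetical
```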
The Functional Arm Scale for Throwers (FAST)—Part II: Reliability and Validity of an Upper Extremity Region-Specific and Population-Specific Patient-Reported Outcome Scale for Throwing Athletes

PubMed Central

Huxel Bliven, Kellie C.; Snyder Valier, Alison R.; Bay, R. Curtis; Sauers, Eric L.

2017-01-01

Background: The Functional Arm Scale for Throwers (FAST) is an upper extremity (UE) region-specific and population-specific patient-reported outcome (PRO) scale developed to measure health-related quality of life in throwers with UE injuries. Stages I and II, described in a companion paper, of FAST development produced a 22-item scale and a 9-item pitcher module. Stage III of scale development, establishing reliability and validity of the FAST, is reported herein. Purpose: To describe stage III of scale development: reliability and validity of the FAST. Study Design: Cohort study (diagnosis); Level of evidence, 2. Methods: Data from throwing athletes collected over 5 studies were pooled to assess reliability and validity of the FAST. Reliability was estimated using FAST scores from 162 throwing athletes who were injured (n = 23) and uninjured (n = 139). Concurrent validity was estimated using FAST scores and Disabilities of the Arm, Shoulder, and Hand (DASH) and Kerlan-Jobe Orthopaedic Clinic (KJOC) scores from 106 healthy, uninjured throwing athletes. Known-groups validity was estimated using FAST scores from 557 throwing athletes who were injured (n = 142) and uninjured (n = 415). Reliability and validity were assessed using intraclass correlation coefficients (ICCs), and measurement error was assessed using standard error of measurement (SEM) and minimum detectable change (MDC). Receiver operating characteristic curves and sensitivity/specificity values were estimated for known-groups validity. Data from a separate group (n = 18) of postsurgical and nonoperative/conservative rehabilitation patients were analyzed to report responsiveness of the FAST. Results: The FAST total, subscales, and pitcher module scores demonstrated excellent test-retest reliability (ICC, 0.91-0.98). The SEM95 and MDC95 for the FAST total score were 3.8 and 10.5 points, respectively. The SEM95 and MDC95 for the pitcher module score were 5.7 and 15.7 points, respectively. 
The FAST scores showed acceptable correlation with DASH (ICC, 0.49-0.82) and KJOC (ICC, 0.62-0.81) scores. The FAST total score classified 85.1% of players into the correct injury group. For predicting UE injury status, a FAST total cutoff score of 10.0 out of 100.0 was 91% sensitive and 75% specific, and a pitcher module score of 10.0 out of 100.0 was 87% sensitive and 78% specific. The FAST total score demonstrated responsiveness on several indices between intake and discharge time points. Conclusion: The FAST is a reliable, valid, and responsive UE region-specific and population-specific PRO scale for measuring patient-reported health care outcomes in throwing athletes with injury. PMID:28451614
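The measurement-error figures above can be related with a short sketch. The abstract does not state which formula was used; the code assumes the conventional relationship MDC95 = 1.96 × √2 × SEM, and the function names `mdc95` and `sem_from_icc` are illustrative, not taken from the paper.

```python
import math


def mdc95(sem: float) -> float:
    """Minimum detectable change at 95% confidence.

    Assumes the conventional formula 1.96 * sqrt(2) * SEM; the abstract
    does not say which formula the authors applied.
    """
    return 1.96 * math.sqrt(2) * sem


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a score SD and a reliability
    coefficient (SEM = SD * sqrt(1 - ICC)), another common convention."""
    return sd * math.sqrt(1.0 - icc)


# Reported SEM values from the abstract: 3.8 (FAST total), 5.7 (pitcher module).
print(round(mdc95(3.8), 1))  # 10.5, matching the reported MDC for the total score
print(round(mdc95(5.7), 1))  # 15.8, close to the reported 15.7
```

Under this assumption the total-score value is reproduced exactly, and the small gap for the pitcher module is consistent with the reported SEM having been rounded.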
Construct Validity and Reliability of the SARA Gait and Posture Sub-scale in Early Onset Ataxia

PubMed Central

Lawerman, Tjitske F.; Brandsma, Rick; Verbeek, Renate J.; van der Hoeven, Johannes H.; Lunsing, Roelineke J.; Kremer, Hubertus P. H.; Sival, Deborah A.

2017-01-01

Aim: In children, gait and posture assessment provides a crucial marker for the early characterization, surveillance and treatment evaluation of early onset ataxia (EOA). For reliable data entry of studies targeting at gait and posture improvement, uniform quantitative biomarkers are necessary. Until now, the pediatric test construct of gait and posture scores of the Scale for Assessment and Rating of Ataxia sub-scale (SARA) is still unclear. In the present study, we aimed to validate the construct validity and reliability of the pediatric (SARAGAIT/POSTURE) sub-scale. Methods: We included 28 EOA patients [15.5 (6–34) years; median (range)]. For inter-observer reliability, we determined the ICC on EOA SARAGAIT/POSTURE sub-scores by three independent pediatric neurologists. For convergent validity, we associated SARAGAIT/POSTURE sub-scores with: (1) Ataxic gait Severity Measurement by Klockgether (ASMK; dynamic balance), (2) Pediatric Balance Scale (PBS; static balance), (3) Gross Motor Function Classification Scale -extended and revised version (GMFCS-E&R), (4) SARA-kinetic scores (SARAKINETIC; kinetic function of the upper and lower limbs), (5) Archimedes Spiral (AS; kinetic function of the upper limbs), and (6) total SARA scores (SARATOTAL; i.e., summed SARAGAIT/POSTURE, SARAKINETIC, and SARASPEECH sub-scores). For discriminant validity, we investigated whether EOA co-morbidity factors (myopathy and myoclonus) could influence SARAGAIT/POSTURE sub-scores. Results: The inter-observer agreement (ICC) on EOA SARAGAIT/POSTURE sub-scores was high (0.97). SARAGAIT/POSTURE was strongly correlated with the other ataxia and functional scales [ASMK (rs = -0.819; p < 0.001); PBS (rs = -0.943; p < 0.001); GMFCS-E&R (rs = -0.862; p < 0.001); SARAKINETIC (rs = 0.726; p < 0.001); AS (rs = 0.609; p = 0.002); and SARATOTAL (rs = 0.935; p < 0.001)]. 
Comorbid myopathy influenced SARAGAIT/POSTURE scores by concurrent muscle weakness, whereas comorbid myoclonus predominantly influenced SARAKINETIC scores. Conclusion: In young EOA patients, separate SARAGAIT/POSTURE parameters reveal a good inter-observer agreement and convergent validity, implicating the reliability of the scale. In perspective of incomplete discriminant validity, it is advisable to interpret SARAGAIT/POSTURE scores for comorbid muscle weakness. PMID:29326569
Pediatric airway study: Endoscopic grading system for quantifying tonsillar size in comparison to standard adenotonsillar grading systems.

PubMed

Patel, Neha A; Carlin, Kristen; Bernstein, Joseph M

Current grading systems may not allow clinicians to reliably document and communicate adenotonsillar size in the clinical setting. A validated endoscopic grading system may be useful for reporting tonsillar size in future clinical outcome studies. This is especially important as tonsillar enlargement is the cause of a substantial health care burden on children. To propose and validate an easy-to-use flexible fiberoptic endoscopic grading system that provides physicians with a more accurate sense of the three-dimensional relationship of the tonsillar fossa to the upper airway. Fifty consecutive pediatric patients were prospectively recruited between February 2015 and February 2016 at a pediatric otolaryngology outpatient clinic. The patients had no major craniofacial abnormalities and were aged 1 to 16 years. Data on BMI, Friedman palate position, and OSA-18 survey results were collected for each patient. For each child, digital video clips of fiberoptic nasopharyngeal, oropharyngeal and laryngeal exams were presented to 2 examiners. Examiners were asked to independently use the proposed Endoscopic tonsillar grading system, the Brodsky tonsillar grading scale, the Modified Brodsky tonsillar grading scale with a tongue depressor, and the Parikh adenoid grading system to rate adenotonsillar hypertrophy. Cohen's Kappa and weighted Kappa scores were used to assess interrater reliability for each of the four grading scales. The Spearman correlation was used to test the associations between each scale and OSA-18 scores, as well as Body Mass Index (BMI). Fifty pediatric patients were included in this study (mean age 6.1 years, range 1 to 16 years). The average BMI was 20. The average OSA-18 score was 61.7. The average Friedman palate position score was 1.34. Twelve percent of the patients had a Friedman palate position score ≥3, which made traditional Brodsky grading of their tonsils impossible without a tongue depressor. 
All four scales showed strong agreement between the two raters. The weighted Kappa was 0.83 for the Modified Brodsky scale, 0.89 for the Brodsky scale, 0.94 for the Parikh scale, and 0.98 for the Endoscopic scale (almost perfect agreement). The Endoscopic scale showed the most consistent agreement between the raters during the study. There was a moderate association between the Parikh adenoid grading system and OSA-18 scores (Spearman's ρ=0.58, p<0.001) compared to a low association of the tonsillar grading systems with OSA-18 scores. None of the scales correlated with patient BMI. The proposed Endoscopic tonsillar grading system is as reliable a method of grading tonsillar size as conventional grading systems. It offers the advantage of allowing for critical evaluation of the tonsils without any anatomic distortion which may occur with the use of a tongue blade. This new validated endoscopic grading system provides a tool for communicating the degree of airway obstruction at the level of the oropharynx regardless of Friedman palate position and may be used in future outcomes projects. Copyright © 2017 Elsevier Inc. All rights reserved.
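For readers unfamiliar with the interrater statistic reported above, here is a minimal sketch of a weighted Cohen's kappa for two raters on an ordinal scale. The abstract does not say whether linear or quadratic weights were used, so the `power` parameter (1 for linear, 2 for quadratic) is an assumption, as are the function name and the example grades.

```python
from typing import Hashable, Sequence


def weighted_kappa(rater_a: Sequence[Hashable],
                   rater_b: Sequence[Hashable],
                   categories: Sequence[Hashable],
                   power: int = 1) -> float:
    """Weighted Cohen's kappa; `categories` must be listed in ordinal order."""
    if len(rater_a) == 0 or len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same non-empty set of cases")
    index = {c: i for i, c in enumerate(categories)}
    k, n = len(categories), len(rater_a)

    # Observed joint proportions of (grade by A, grade by B).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Disagreement is weighted by the distance between the two grades.
    pairs = [(i, j) for i in range(k) for j in range(k)]
    d_obs = sum(abs(i - j) ** power * observed[i][j] for i, j in pairs)
    d_exp = sum(abs(i - j) ** power * row[i] * col[j] for i, j in pairs)
    if d_exp == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - d_obs / d_exp


# Hypothetical grades (1-4) from two examiners, for illustration only.
examiner_1 = [1, 2, 3, 4, 2, 3]
examiner_2 = [1, 2, 3, 4, 3, 3]
print(weighted_kappa(examiner_1, examiner_2, categories=[1, 2, 3, 4]))
```

Weighting penalizes a one-grade disagreement less than a three-grade disagreement, which is why it is usually preferred over unweighted kappa for ordinal grading scales.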
Validation of Patient-Reported Outcomes Measurement Information System Short Forms for Use in Childhood-Onset Systemic Lupus Erythematosus.

PubMed

Jones, Jordan T; Carle, Adam C; Wootton, Janet; Liberio, Brianna; Lee, Jiha; Schanberg, Laura E; Ying, Jun; Morgan DeWitt, Esi; Brunner, Hermine I

2017-01-01

To validate the pediatric Patient-Reported Outcomes Measurement Information System short forms (PROMIS-SFs) in childhood-onset systemic lupus erythematosus (SLE) in a clinical setting. At 3 study visits, childhood-onset SLE patients completed the PROMIS-SFs (anger, anxiety, depressive symptoms, fatigue, physical function-mobility, physical function-upper extremity, pain interference, and peer relationships) using the PROMIS assessment center, and health-related quality of life (HRQoL) legacy measures (Pediatric Quality of Life Inventory, Childhood Health Assessment Questionnaire, Simple Measure of Impact of Lupus Erythematosus in Youngsters [SMILEY], and visual analog scales [VAS] of pain and well-being). Physicians rated childhood-onset SLE activity on a VAS and completed the Systemic Lupus Erythematosus Disease Activity Index 2000. Using a global rating scale of change (GRC) between study visits, physicians rated change of childhood-onset SLE activity (GRC-MD1: better/same/worse) and change of patient overall health (GRC-MD2: better/same/worse). Questionnaire scores were compared in support of validity and responsiveness to change (external standards: GRC-MD1, GRC-MD2). In this population-based cohort (n = 100) with a mean age of 15.8 years (range 10-20 years), the PROMIS-SFs were completed in less than 5 minutes in a clinical setting. The PROMIS-SF scores correlated at least moderately (Pearson's r ≥ 0.5) with those of legacy HRQoL measures, except for the SMILEY. Measures of childhood-onset SLE activity did not correlate with the PROMIS-SFs. Responsiveness to change of the PROMIS-SFs was supported by path, mixed-model, and correlation analyses. To assess HRQoL in childhood-onset SLE, the PROMIS-SFs demonstrated feasibility, internal consistency, construct validity, and responsiveness to change in a clinical setting. © 2016, American College of Rheumatology.
Validating Emergency Department Vital Signs Using a Data Quality Engine for Data Warehouse

PubMed Central

Genes, N; Chandra, D; Ellis, S; Baumlin, K

2013-01-01

Background: Vital signs in our emergency department information system were entered into free-text fields for heart rate, respiratory rate, blood pressure, temperature and oxygen saturation. Objective: We sought to convert these text entries into a more useful form, for research and QA purposes, upon entry into a data warehouse. Methods: We derived a series of rules and assigned quality scores to the transformed values, conforming to physiologic parameters for vital signs across the age range and spectrum of illness seen in the emergency department. Results: Validating these entries revealed that 98% of free-text data had perfect quality scores, conforming to established vital sign parameters. Average vital signs varied as expected by age. Degradations in quality scores were most commonly attributed to logging temperature in Fahrenheit instead of Celsius; vital signs with this error could still be transformed for use. Errors occurred more frequently during periods of high triage, though error rates did not correlate with triage volume. Conclusions: In developing a method for importing free-text vital sign data from our emergency department information system, we now have a data warehouse with a broad array of quality-checked vital signs, permitting analysis and correlation with demographics and outcomes. PMID:24403981
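The rule-based approach described above can be illustrated with a small sketch for a single vital sign. The abstract does not publish the actual rules, ranges, or score values, so everything here is hypothetical: the function name `parse_temperature`, the plausible ranges (30 to 43 °C, 86 to 110 °F), and the quality scores 1.0, 0.5, and 0.0.

```python
import re
from typing import Optional, Tuple


def parse_temperature(text: str) -> Tuple[Optional[float], float]:
    """Return (temperature in Celsius, quality score) for a free-text entry.

    Hypothetical rules: a plausible Celsius value scores 1.0; a value that is
    only plausible as Fahrenheit is converted and scores 0.5; anything else
    is rejected with a score of 0.0.
    """
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        return None, 0.0
    value = float(match.group())
    if 30.0 <= value <= 43.0:          # assumed plausible Celsius range
        return value, 1.0
    if 86.0 <= value <= 110.0:         # assumed plausible Fahrenheit range
        return round((value - 32.0) * 5.0 / 9.0, 1), 0.5
    return None, 0.0


for entry in ["37.2", "98.6 F", "n/a", "250"]:
    print(entry, parse_temperature(entry))
```

Keeping the degraded-but-recoverable case separate from outright rejection mirrors the abstract's observation that Fahrenheit entries lowered the quality score yet could still be transformed for use.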
Validating emergency department vital signs using a data quality engine for data warehouse.

PubMed

Genes, N; Chandra, D; Ellis, S; Baumlin, K

2013-01-01

Vital signs in our emergency department information system were entered into free-text fields for heart rate, respiratory rate, blood pressure, temperature and oxygen saturation. We sought to convert these text entries into a more useful form, for research and QA purposes, upon entry into a data warehouse. We derived a series of rules and assigned quality scores to the transformed values, conforming to physiologic parameters for vital signs across the age range and spectrum of illness seen in the emergency department. Validating these entries revealed that 98% of free-text data had perfect quality scores, conforming to established vital sign parameters. Average vital signs varied as expected by age. Degradations in quality scores were most commonly attributed to logging temperature in Fahrenheit instead of Celsius; vital signs with this error could still be transformed for use. Errors occurred more frequently during periods of high triage, though error rates did not correlate with triage volume. In developing a method for importing free-text vital sign data from our emergency department information system, we now have a data warehouse with a broad array of quality-checked vital signs, permitting analysis and correlation with demographics and outcomes.
Developing a Weighted Measure of Speech Sound Accuracy

PubMed Central

Preston, Jonathan L.; Ramsdell, Heather L.; Oller, D. Kimbrough; Edwards, Mary Louise; Tobin, Stephen J.

2010-01-01

Purpose: The purpose is to develop a system for numerically quantifying a speaker’s phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, we describe a system for differentially weighting speech sound errors based on various levels of phonetic accuracy with a Weighted Speech Sound Accuracy (WSSA) score. We then evaluate the reliability and validity of this measure. Method: Phonetic transcriptions are analyzed from several samples of child speech, including preschoolers and young adolescents with and without speech sound disorders and typically developing toddlers. The new measure of phonetic accuracy is compared to existing measures, is used to discriminate typical and disordered speech production, and is evaluated to determine whether it is sensitive to changes in phonetic accuracy over time. Results: Initial psychometric data indicate that WSSA scores correlate with other measures of phonetic accuracy as well as listeners’ judgments of severity of a child’s speech disorder. The measure separates children with and without speech sound disorders. WSSA scores also capture growth in phonetic accuracy in toddlers’ speech over time. Conclusion: Results provide preliminary support for the WSSA as a valid and reliable measure of phonetic accuracy in children’s speech. PMID:20699344
Handling missing values in the MDS-UPDRS.

PubMed

Goetz, Christopher G; Luo, Sheng; Wang, Lu; Tilley, Barbara C; LaPelle, Nancy R; Stebbins, Glenn T

2015-10-01

This study was undertaken to define the number of missing values permissible to render valid total scores for each Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS) part. To handle missing values, imputation strategies serve as guidelines to reject an incomplete rating or create a surrogate score. We tested a rigorous, scale-specific, data-based approach to handling missing values for the MDS-UPDRS. From two large MDS-UPDRS datasets, we sequentially deleted item scores, either consistently (same items) or randomly (different items) across all subjects. Lin's Concordance Correlation Coefficient (CCC) compared scores calculated without missing values with prorated scores based on sequentially increasing missing values. The maximal number of missing values retaining a CCC greater than 0.95 determined the threshold for rendering a valid prorated score. A second confirmatory sample was selected from the MDS-UPDRS international translation program. To provide valid part scores applicable across all Hoehn and Yahr (H&Y) stages when the same items are consistently missing, one missing item from Part I, one from Part II, three from Part III, but none from Part IV can be allowed. To provide valid part scores applicable across all H&Y stages when random item entries are missing, one missing item from Part I, two from Part II, seven from Part III, but none from Part IV can be allowed. All cutoff values were confirmed in the validation sample. These analyses are useful for constructing valid surrogate part scores for MDS-UPDRS when missing items fall within the identified threshold and give scientific justification for rejecting partially completed ratings that fall below the threshold. © 2015 International Parkinson and Movement Disorder Society.
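The procedure above compares complete totals with prorated totals using Lin's Concordance Correlation Coefficient. The sketch below shows one common way to compute both; the abstract does not give the exact prorating rule, so scaling the answered-item sum by (items in part / items answered) is an assumption, and the function names are illustrative.

```python
from statistics import fmean
from typing import Optional, Sequence


def prorated_total(items: Sequence[Optional[int]]) -> float:
    """Scale the sum of answered items up to the full item count.

    `None` marks a missing item. This prorating rule is an assumption.
    """
    answered = [x for x in items if x is not None]
    if not answered:
        raise ValueError("no items answered")
    return sum(answered) * len(items) / len(answered)


def lin_ccc(x: Sequence[float], y: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient (population moments)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical complete ratings for a 4-item part, then one item deleted.
complete = [[2, 1, 1, 3], [0, 1, 0, 1], [4, 3, 4, 4], [1, 2, 2, 1]]
full_totals = [float(sum(r)) for r in complete]
prorated = [prorated_total([r[0], None, r[2], r[3]]) for r in complete]
ccc = lin_ccc(full_totals, prorated)
print(ccc, ccc > 0.95)  # 0.95 is the validity threshold named in the abstract
```

Unlike a Pearson correlation, the CCC also penalizes systematic shifts between the two sets of scores, which is why it suits a check that prorated totals reproduce the complete totals rather than merely track them.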
A biomarker-based risk score to predict death in patients with atrial fibrillation: the ABC (age, biomarkers, clinical history) death risk score

PubMed Central

Hijazi, Ziad; Oldgren, Jonas; Lindbäck, Johan; Alexander, John H; Connolly, Stuart J; Eikelboom, John W; Ezekowitz, Michael D; Held, Claes; Hylek, Elaine M; Lopes, Renato D; Yusuf, Salim; Granger, Christopher B; Siegbahn, Agneta; Wallentin, Lars

2018-01-01

Abstract Aims In atrial fibrillation (AF), mortality remains high despite effective anticoagulation. A model predicting the risk of death in these patients is currently not available. We developed and validated a risk score for death in anticoagulated patients with AF including both clinical information and biomarkers. Methods and results The new risk score was developed and internally validated in 14 611 patients with AF randomized to apixaban vs. warfarin for a median of 1.9 years. External validation was performed in 8548 patients with AF randomized to dabigatran vs. warfarin for 2.0 years. Biomarker samples were obtained at study entry. Variables significantly contributing to the prediction of all-cause mortality were assessed by Cox-regression. Each variable obtained a weight proportional to the model coefficients. There were 1047 all-cause deaths in the derivation and 594 in the validation cohort. The most important predictors of death were N-terminal pro B-type natriuretic peptide, troponin-T, growth differentiation factor-15, age, and heart failure, and these were included in the ABC (Age, Biomarkers, Clinical history)-death risk score. The score was well-calibrated and yielded higher c-indices than a model based on all clinical variables in both the derivation (0.74 vs. 0.68) and validation cohorts (0.74 vs. 0.67). The reduction in mortality with apixaban was most pronounced in patients with a high ABC-death score. Conclusion A new biomarker-based score for predicting risk of death in anticoagulated AF patients was developed, internally and externally validated, and well-calibrated in two large cohorts. The ABC-death risk score performed well and may contribute to overall risk assessment in AF. ClinicalTrials.gov identifier NCT00412984 and NCT00262600 PMID:29069359
The medial tibial stress syndrome score: a new patient-reported outcome measure.

PubMed

Winters, Marinus; Moen, Maarten H; Zimmermann, Wessel O; Lindeboom, Robert; Weir, Adam; Backx, Frank Jg; Bakker, Eric Wp

2016-10-01

At present, there is no validated patient-reported outcome measure (PROM) for patients with medial tibial stress syndrome (MTSS). Our aim was to select and validate previously generated items and create a valid, reliable and responsive PROM for patients with MTSS: the MTSS score. A prospective cohort study was performed in multiple sports medicine, physiotherapy and military facilities in the Netherlands. Participants with MTSS filled out the previously generated items for the MTSS score on 3 occasions. From previously generated items, we selected the best items. We assessed the MTSS score for its validity, reliability and responsiveness. The MTSS score was filled out by 133 participants with MTSS. Factor analysis showed the MTSS score to exhibit a single-factor structure with acceptable internal consistency (α=0.58) and good test-retest reliability (intraclass correlation coefficient=0.81). The MTSS score ranges from 0 to 10 points. The smallest detectable change in our sample was 0.69 at the group level and 4.80 at the individual level. Construct validity analysis showed significant moderate-to-large correlations (r=0.34-0.52, p<0.01). Responsiveness of the MTSS score was confirmed by a significant relation with the global perceived effect scale (β=-0.288, R²=0.21, p<0.001). The MTSS score is a valid, reliable and responsive PROM to measure the severity of MTSS. It is designed to evaluate treatment outcomes in clinical studies. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
Development of a risk index for prediction of abnormal pap test results in Serbia.

PubMed

Vukovic, Dejana; Antic, Ljiljana; Vasiljevic, Mladenko; Antic, Dragan; Matejic, Bojana

2015-01-01

Serbia is one of the countries with the highest incidence and mortality rates for cervical cancer in Central and South Eastern Europe. Introducing a risk index could provide a powerful means for targeting groups at high likelihood of having an abnormal cervical smear and increase efficiency of screening. The aim of the present study was to create and assess the validity of an index for prediction of an abnormal Pap test result. The study population was drawn from patients attending Departments for Women's Health in two primary health care centers in Serbia. Out of 525 respondents, 350 were randomly selected and data obtained from them were used as the index-creation dataset. Data obtained from the remaining 175 were used as an index-validation dataset. Age at first intercourse under 18, more than 4 sexual partners, history of STD and multiparity were attributed statistical weights 16, 15, 14 and 13, respectively. The distribution of index scores in the index-creation dataset showed that most respondents had a score of 0 (54.9%). In the index-creation dataset the mean index score was 10.3 (SD=13.8), and in the validation dataset the mean was 9.1 (SD=13.2). The advantage of such a scoring system is that it is simple, consisting of only four elements, so it could be applied to identify women with high risk for cervical cancer who would be referred for further examination.
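As a sketch of how the four-element index above might be scored, the code below sums the reported weights for the risk factors a respondent has. The weights 16, 15, 14, and 13 come from the abstract; the additive rule, the dictionary keys, and the function name are assumptions, and no referral cutoff is implied because the abstract does not report one.

```python
from typing import Mapping

# Weights as reported in the abstract; key names are illustrative.
WEIGHTS = {
    "first_intercourse_under_18": 16,
    "more_than_4_partners": 15,
    "history_of_std": 14,
    "multiparity": 13,
}


def risk_index(factors: Mapping[str, bool]) -> int:
    """Sum the weights of the risk factors present (assumed additive scoring)."""
    return sum(weight for name, weight in WEIGHTS.items() if factors.get(name, False))


respondent = {"first_intercourse_under_18": True, "multiparity": True}
print(risk_index(respondent))  # 29
print(risk_index({}))          # 0
```

Under this additive assumption the index ranges from 0 (no factors) to 58 (all four), which is consistent with the abstract's note that a score of 0 was the most common value.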
Evaluation of Non-Laboratory and Laboratory Prediction Models for Current and Future Diabetes Mellitus: A Cross-Sectional and Retrospective Cohort Study

PubMed Central

Hahn, Seokyung; Moon, Min Kyong; Park, Kyong Soo; Cho, Young Min

2016-01-01

Background Various diabetes risk scores composed of non-laboratory parameters have been developed, but only a few studies performed cross-validation of these scores and a comparison with laboratory parameters. We evaluated the performance of diabetes risk scores composed of non-laboratory parameters, including a recently published Korean risk score (KRS), and compared them with laboratory parameters. Methods The data of 26,675 individuals who visited the Seoul National University Hospital Healthcare System Gangnam Center for a health screening program were reviewed for cross-sectional validation. The data of 3,029 individuals with a mean of 6.2 years of follow-up were reviewed for longitudinal validation. The KRS and 16 other risk scores were evaluated and compared with a laboratory prediction model developed by logistic regression analysis. Results For the screening of undiagnosed diabetes, the KRS exhibited a sensitivity of 81%, a specificity of 58%, and an area under the receiver operating characteristic curve (AROC) of 0.754. Other scores showed AROCs that ranged from 0.697 to 0.782. For the prediction of future diabetes, the KRS exhibited a sensitivity of 74%, a specificity of 54%, and an AROC of 0.696. Other scores had AROCs ranging from 0.630 to 0.721. The laboratory prediction model composed of fasting plasma glucose and hemoglobin A1c levels showed a significantly higher AROC (0.838, P < 0.001) than the KRS. The addition of the KRS to the laboratory prediction model increased the AROC (0.849, P = 0.016) without a significant improvement in the risk classification (net reclassification index: 4.6%, P = 0.264). Conclusions The non-laboratory risk scores, including KRS, are useful to estimate the risk of undiagnosed diabetes but are inferior to the laboratory parameters for predicting future diabetes. PMID:27214034
Understanding latent structures of clinical information logistics: A bottom-up approach for model building and validating the workflow composite score.

PubMed

Esdar, Moritz; Hübner, Ursula; Liebe, Jan-David; Hüsers, Jens; Thye, Johannes

2017-01-01

Clinical information logistics is a construct that aims to describe and explain various phenomena of information provision to drive clinical processes. It can be measured by the workflow composite score, an aggregated indicator of the degree of IT support in clinical processes. This study primarily aimed to investigate the yet unknown empirical patterns constituting this construct. The second goal was to derive a data-driven weighting scheme for the constituents of the workflow composite score and to contrast this scheme with a literature based, top-down procedure. This approach should finally test the validity and robustness of the workflow composite score. Based on secondary data from 183 German hospitals, a tiered factor analytic approach (confirmatory and subsequent exploratory factor analysis) was pursued. A weighting scheme, which was based on factor loadings obtained in the analyses, was put into practice. We were able to identify five statistically significant factors of clinical information logistics that accounted for 63% of the overall variance. These factors were "flow of data and information", "mobility", "clinical decision support and patient safety", "electronic patient record" and "integration and distribution". The system of weights derived from the factor loadings resulted in values for the workflow composite score that differed only slightly from the score values that had been previously published based on a top-down approach. Our findings give insight into the internal composition of clinical information logistics both in terms of factors and weights. They also allowed us to propose a coherent model of clinical information logistics from a technical perspective that joins empirical findings with theoretical knowledge. Despite the new scheme of weights applied to the calculation of the workflow composite score, the score behaved robustly, which is yet another hint of its validity and therefore its usefulness. 
Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Development and Evaluation of a Questionnaire for Measuring Suboptimal Health Status in Urban Chinese

PubMed Central

Yan, Yu-Xiang; Liu, You-Qin; Li, Man; Hu, Pei-Feng; Guo, Ai-Min; Yang, Xing-Hua; Qiu, Jing-Jun; Yang, Shan-Shan; Shen, Jian; Zhang, Li-Ping; Wang, Wei

2009-01-01

Background Suboptimal health status (SHS) is characterized by ambiguous health complaints, general weakness, and lack of vitality, and has become a new public health challenge in China. It is believed to be a subclinical, reversible stage of chronic disease. Studies of intervention and prognosis for SHS are expected to become increasingly important. Consequently, a reliable and valid instrument to assess SHS is essential. We developed and evaluated a questionnaire for measuring SHS in urban Chinese. Methods Focus group discussions and a literature review provided the basis for the development of the questionnaire. Questionnaire validity and reliability were evaluated in a small pilot study and in a larger cross-sectional study of 3000 individuals. Analyses included tests for reliability and internal consistency, exploratory and confirmatory factor analysis, and tests for discriminative ability and convergent validity. Results The final questionnaire included 25 items on SHS (SHSQ-25), and encompassed 5 subscales: fatigue, the cardiovascular system, the digestive tract, the immune system, and mental status. Overall, 2799 of 3000 participants completed the questionnaire (93.3%). Test-retest reliability coefficients of individual items ranged from 0.89 to 0.98. Item-subscale correlations ranged from 0.51 to 0.72, and Cronbach’s α was 0.70 or higher for all subscales. Factor analysis established 5 distinct domains, as conceptualized in our model. One-way ANOVA showed statistically significant differences in scale scores between 3 occupation groups; these included total scores and subscores (P < 0.01). The correlation between the SHS scores and experienced stress was statistically significant (r = 0.57, P < 0.001). Conclusions The SHSQ-25 is a reliable and valid instrument for measuring sub-health status in urban Chinese. PMID:19749497

Development and face validation of a Virtual Reality Epley Maneuver System (VREMS) for home Epley treatment of benign paroxysmal positional vertigo: A randomized, controlled trial.

PubMed

Tabanfar, Reza; Chan, Harley H L; Lin, Vincent; Le, Trung; Irish, Jonathan C

To develop and validate a smartphone-based Virtual Reality Epley Maneuver System (VREMS) for home use. A smartphone application was designed to produce stereoscopic views of a Virtual Reality (VR) environment, which, when viewed after placing a smartphone in a virtual reality headset, allowed the user to be guided step-by-step through the Epley maneuver in a VR environment. Twenty healthy participants were recruited and randomized to undergo either assisted Epleys or self-administered Epleys following reading instructions from an Instructional Handout (IH). All participants were filmed and two expert Otologists reviewed the videos, assigning each participant a score (out of 10) for performance on each step. Participants rated their perceived workload by completing a validated task-load questionnaire (NASA Task Load Index) and averages for both groups were calculated. Twenty participants were evaluated, with an average age of 26.4±7.12 years in the VREMS group and 26.1±7.72 years in the IH group. The VR assisted group achieved an average score of 7.78±0.99 compared to 6.65±1.72 in the IH group. This result was statistically significant with p=0.0001 and side dominance did not appear to play a factor. Analyzing each step of the Epley maneuver demonstrated that assisted Epleys were done more accurately, with statistically significant results in steps 2-4. Results of the NASA-TLX scores were variable with no significant findings. We have developed and demonstrated face validity for VREMS through our randomized controlled trial. The VREMS platform is promising technology, which may improve the accuracy and effectiveness of home Epley treatments. N/A. Copyright © 2017 Elsevier Inc. All rights reserved.
Cigar Box Arthroscopy: A Randomized Controlled Trial Validates Nonanatomic Simulation Training of Novice Arthroscopy Skills.

PubMed

Sandberg, Rory P; Sherman, Nathan C; Latt, L Daniel; Hardy, Jolene C

2017-11-01

The goal of this study was to validate the cigar box arthroscopy trainer (CBAT) as a training tool and then compare its effectiveness to didactic training and to another previously validated low-fidelity but anatomic model, the anatomic knee arthroscopy trainer (AKAT). A nonanatomic knee arthroscopy training module was developed at our institution. Twenty-four medical students with no prior arthroscopic or laparoscopic experience were enrolled as subjects. Eight subjects served as controls. The remaining 16 subjects were randomized to participate in 4 hours of either the CBAT or a previously validated AKAT. Subjects' skills were assessed by 1 of 2 faculty members through repeated attempts at performing a diagnostic knee arthroscopy on a cadaveric specimen. Objective scores were given using a minimally adapted version of the Basic Arthroscopic Knee Skill Scoring System. Total cost differences were calculated. Seventy-five percent of subjects in the CBAT and AKAT groups succeeded in reaching minimum proficiency in the allotted time compared with 25% in the control group (P < .05). There was no significant difference in the number of attempts to reach proficiency between the CBAT and AKAT groups. The cost to build the CBAT was $44.12, whereas the cost was $324.33 for the AKAT. This pilot study suggests the CBAT is an effective knee arthroscopy trainer that may decrease the learning curve of residents without significant cost to a residency program. This study demonstrates the need for an agreed-upon objective scoring system to properly evaluate residents and compare the effectiveness of different training tools. Copyright © 2017 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
Factor Structure, Factorial Invariance, and Validity of the Multidimensional Shame-Related Response Inventory-21 (MSRI-21)

PubMed Central

Garcia, Antonio F.; Acosta, Melina; Pirani, Saifa; Edwards, Daniel; Osman, Augustine

2017-01-01

We describe 2 studies designed to evaluate scores on the Multidimensional Shame-related Response Inventory-21 (MSRI-21), a recently developed instrument that measures affective and behavioral responses to shame. The inventory assesses shame-related responses in 3 categories: negative self-evaluation, fear of social consequences, and maladaptive behavior tendency. In Study 1, undergraduates (N = 743) completed the MSRI-21. Confirmatory factor analysis supported the validity of the MSRI-21 3-factor structure. Latent variable modeling of coefficient-α provided strong evidence for the internal consistency of scores on each scale. In Study 2, undergraduates (N = 540) completed the instrument along with 5 concurrent measures chosen for clinical significance. Achievement of factorial invariance supported the use of MSRI-21 scale scores to make valid mean comparisons across gender. In addition, MSRI-21 scale scores were associated as expected with scores on measures of self-harm, suicide, and other risk factors. Taken together, results of 2 studies support the internal consistency reliability, factorial validity, factorial invariance, and convergent validity of scores on the MSRI-21. Further work is needed to assess the temporal stability of the MSRI-21 scale scores, invariance across clinical status and other groupings, item-level measurement properties, and viability in highly symptomatic samples. PMID:28182490
A Study on Critical Thinking Assessment System of College English Writing

ERIC Educational Resources Information Center

Dong, Tian; Yue, Lu

2015-01-01

This research attempts to discuss the validity of introducing the evaluation of students' critical thinking skills (CTS) into the assessment system of college English writing through an empirical study. In this paper, 30 College English Test Band 4 (CET-4) writing samples were collected and analyzed. Students' CTS and the final scores of collected…
Comparing Validity Evidence of Two ECERS-R Scoring Systems

ERIC Educational Resources Information Center

Zeng, Songtian

2017-01-01

Over 30 states have adopted the Early Childhood Environmental Rating Scale-Revised (ECERS-R) as a component of their program quality assessment systems, but the use of ECERS-R on such a large scale has raised important questions about implementation. One of the most pressing questions centers upon decisions users must make between two scoring…
The UK-PBC risk scores: Derivation and validation of a scoring system for long-term prediction of end-stage liver disease in primary biliary cholangitis.

PubMed

Carbone, Marco; Sharp, Stephen J; Flack, Steve; Paximadas, Dimitrios; Spiess, Kelly; Adgey, Carolyn; Griffiths, Laura; Lim, Reyna; Trembling, Paul; Williamson, Kate; Wareham, Nick J; Aldersley, Mark; Bathgate, Andrew; Burroughs, Andrew K; Heneghan, Michael A; Neuberger, James M; Thorburn, Douglas; Hirschfield, Gideon M; Cordell, Heather J; Alexander, Graeme J; Jones, David E J; Sandford, Richard N; Mells, George F

2016-03-01

The biochemical response to ursodeoxycholic acid (UDCA)--so-called "treatment response"--strongly predicts long-term outcome in primary biliary cholangitis (PBC). Several long-term prognostic models based solely on the treatment response have been developed that are widely used to risk stratify PBC patients and guide their management. However, they do not take other prognostic variables into account, such as the stage of the liver disease. We sought to improve existing long-term prognostic models of PBC using data from the UK-PBC Research Cohort. We performed Cox's proportional hazards regression analysis of diverse explanatory variables in a derivation cohort of 1,916 UDCA-treated participants. We used nonautomatic backward selection to derive the best-fitting Cox model, from which we derived a multivariable fractional polynomial model. We combined linear predictors and baseline survivor functions in equations to score the risk of a liver transplant or liver-related death occurring within 5, 10, or 15 years. We validated these risk scores in an independent cohort of 1,249 UDCA-treated participants. The best-fitting model consisted of the baseline albumin and platelet count, as well as the bilirubin, transaminases, and alkaline phosphatase, after 12 months of UDCA. In the validation cohort, the 5-, 10-, and 15-year risk scores were highly accurate (areas under the curve: >0.90). The prognosis of PBC patients can be accurately evaluated using the UK-PBC risk scores. They may be used to identify high-risk patients for closer monitoring and second-line therapies, as well as low-risk patients who could potentially be followed up in primary care. © 2015 by the American Association for the Study of Liver Diseases.
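The step of combining a linear predictor with a baseline survivor function can be illustrated with the standard Cox-model identity, risk(t) = 1 - S0(t)^exp(LP). The sketch below is only an illustration of that identity: the abstract does not report the fitted coefficients or baseline values, so the function name and every number used here are hypothetical placeholders, not the UK-PBC equations.

```python
import math


def absolute_risk(linear_predictor: float, baseline_survival: float) -> float:
    """Absolute event risk by a fixed horizon under a Cox model.

    linear_predictor  -- sum of coefficient * covariate terms (hypothetical here)
    baseline_survival -- S0(t), baseline survivor function at the horizon
    """
    if not 0.0 < baseline_survival <= 1.0:
        raise ValueError("baseline_survival must lie in (0, 1]")
    # The hazard ratio exp(LP) scales the baseline cumulative hazard,
    # which is the same as raising S0(t) to the power exp(LP).
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


# Hypothetical inputs: S0(5 years) = 0.96 and a patient whose hazard is doubled.
reference_risk = absolute_risk(0.0, 0.96)
doubled_hazard_risk = absolute_risk(math.log(2.0), 0.96)
```

With these placeholder inputs, a linear predictor of zero returns the baseline risk, and doubling the hazard slightly less than doubles the absolute risk.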
Risk assessment using a novel score to predict anastomotic leak and major complications after oesophageal resection.

PubMed

Noble, Fergus; Curtis, Nathan; Harris, Scott; Kelly, Jamie J; Bailey, Ian S; Byrne, James P; Underwood, Timothy J

2012-06-01

Oesophagectomy is associated with significant morbidity and mortality. A simple score to define a patient's risk of developing major complications would be beneficial. Patients who underwent upper gastrointestinal resections with an oesophageal anastomosis between 2005 and 2010 were reviewed and formed the development dataset with resections performed in 2011 forming a prospective validation dataset. The association between post-operative C-reactive protein (CRP), white cell count (WCC) and albumin levels with anastomotic leak (AL) or major complication including death using the Clavien-Dindo (CD) classification were analysed by receiver operating characteristic curves. After multivariate analysis, from the development dataset, these factors were combined to create a novel score which was subsequently tested on the validation dataset. Two hundred fifty-eight patients were assessed to develop the score. Sixty-three patients (25%) developed a major complication, and there were seven (2.7%) in-patient deaths. Twenty-six (10%) patients were diagnosed with AL at median post-operative day 7 (range: 5-15). CRP (p = 0.002), WCC (p < 0.0001) and albumin (p = 0.001) were predictors of AL. Combining these markers improved prediction of AL (NUn score > 10: sensitivity 95%, specificity 49%, diagnostic accuracy 0.801 (95% confidence interval: 0.692-0.909, p < 0.0001)). The validation dataset confirmed these findings (NUn score > 10: sensitivity 100%, specificity 57%, diagnostic accuracy 0.879 (95% CI 0.763-0.994, p = 0.014)) and a major complication or death (NUn > 10: sensitivity 89%, specificity 63%, diagnostic accuracy 0.856 (95% CI 0.709-1, p = 0.001)). Blood-borne markers of the systemic inflammatory response are predictors of AL and major complications after oesophageal resection. When combined they may categorise a patient's risk of developing a serious complication with higher sensitivity and specificity.
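How sensitivity and specificity follow from a score cutoff such as "NUn score > 10" can be sketched as below. The abstract does not give the formula that combines CRP, WCC and albumin, so the sketch starts from already-computed scores; the function name and the sample values are invented for illustration only.

```python
def sensitivity_specificity(scores, outcomes, threshold):
    """Return (sensitivity, specificity) when 'score > threshold' flags a patient.

    scores   -- numeric risk scores, one per patient
    outcomes -- truthy if the complication actually occurred
    """
    tp = fp = tn = fn = 0
    for score, outcome in zip(scores, outcomes):
        flagged = score > threshold
        if flagged and outcome:
            tp += 1
        elif flagged and not outcome:
            fp += 1
        elif not flagged and outcome:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative outcome")
    return tp / (tp + fn), tn / (tn + fp)


# Invented scores and outcomes, not data from the study.
example_scores = [11.2, 10.5, 9.8, 10.1, 9.0, 12.0]
example_outcomes = [1, 1, 1, 0, 0, 0]
sens, spec = sensitivity_specificity(example_scores, example_outcomes, 10)
```

Lowering the cutoff raises sensitivity at the expense of specificity, which is the trade-off a receiver operating characteristic curve summarises.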
Overcoming barriers to population-based injury research: development and validation of an ICD-10–to–AIS algorithm

PubMed Central

Haas, Barbara; Xiong, Wei; Brennan-Barnes, Maureen; Gomez, David; Nathens, Avery B.

2012-01-01

Background: Hospital administrative databases are a useful source of population-level data on injured patients; however, these databases use the International Classification of Diseases (ICD) system, which does not provide a direct means of estimating injury severity. We created and validated a crosswalk to derive Abbreviated Injury Scale (AIS) scores from injury-related diagnostic codes in the tenth revision of the ICD (ICD-10). Methods: We assessed the validity of the crosswalk using data from the Ontario Trauma Registry Comprehensive Data Set (OTR-CDS). The AIS and Injury Severity Scores (ISS) derived using the algorithm were compared with those assigned by expert abstractors. We evaluated the ability of the algorithm to identify patients with AIS scores of 3 or greater. We used κ and intraclass correlation coefficients (ICC) as measures of concordance. Results: In total, 10 431 patients were identified in the OTR-CDS. The algorithm accurately identified patients with at least 1 AIS score of 3 or greater (κ 0.65), as well as patients with a head AIS score of 3 or greater (κ 0.78). Mapped and abstracted ISS were similar; ICC across the entire cohort was 0.83 (95% confidence interval 0.81–0.84), indicating good agreement. When comparing mapped and abstracted ISS, the difference between scores was 10 or less in 87% of patients. Concordance between mapped and abstracted ISS was similar across strata of age, mechanism of injury and mortality. Conclusion: Our ICD-10–to–AIS algorithm produces reliable estimates of injury severity from data available in administrative databases. This algorithm can facilitate the use of administrative data for population-based injury research in jurisdictions using ICD-10. PMID:22269308
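The concordance measure κ can be illustrated with a short sketch. The abstract does not say which κ variant was used; the code assumes unweighted Cohen's κ on a binary flag (AIS of 3 or greater, yes or no), and the labels below are invented rather than taken from the registry.

```python
def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases where both sources give the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    labels = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Invented flags: 1 = "AIS >= 3" according to each source.
algorithm_flags = [1, 1, 0, 0]
abstractor_flags = [1, 0, 0, 0]
kappa = cohens_kappa(algorithm_flags, abstractor_flags)
```

Because κ subtracts chance agreement, it is lower than raw percent agreement whenever the two sources would often agree by chance alone.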
Overcoming barriers to population-based injury research: development and validation of an ICD10-to-AIS algorithm.

PubMed

Haas, Barbara; Xiong, Wei; Brennan-Barnes, Maureen; Gomez, David; Nathens, Avery B

2012-02-01

Hospital administrative databases are a useful source of population-level data on injured patients; however, these databases use the International Classification of Diseases (ICD) system, which does not provide a direct means of estimating injury severity. We created and validated a crosswalk to derive Abbreviated Injury Scale (AIS) scores from injury-related diagnostic codes in the tenth revision of the ICD (ICD-10). We assessed the validity of the crosswalk using data from the Ontario Trauma Registry Comprehensive Data Set (OTRCDS). The AIS and Injury Severity Scores (ISS) derived using the algorithm were compared with those assigned by expert abstractors. We evaluated the ability of the algorithm to identify patients with AIS scores of 3 or greater. We used κ and intraclass correlation coefficients (ICC) as measures of concordance. In total, 10 431 patients were identified in the OTRCDS. The algorithm accurately identified patients with at least 1 AIS score of 3 or greater (κ 0.65), as well as patients with a head AIS score of 3 or greater (κ 0.78). Mapped and abstracted ISS were similar; ICC across the entire cohort was 0.83 (95% confidence interval 0.81-0.84), indicating good agreement. When comparing mapped and abstracted ISS, the difference between scores was 10 or less in 87% of patients. Concordance between mapped and abstracted ISS was similar across strata of age, mechanism of injury and mortality. Our ICD-10-to-AIS algorithm produces reliable estimates of injury severity from data available in administrative databases. This algorithm can facilitate the use of administrative data for population-based injury research in jurisdictions using ICD-10.
Identification of patients at high risk for Clostridium difficile infection: development and validation of a risk prediction model in hospitalized patients treated with antibiotics.

PubMed

van Werkhoven, C H; van der Tempel, J; Jajou, R; Thijsen, S F T; Diepersloot, R J A; Bonten, M J M; Postma, D F; Oosterheert, J J

2015-08-01

To develop and validate a prediction model for Clostridium difficile infection (CDI) in hospitalized patients treated with systemic antibiotics, we performed a case-cohort study in a tertiary (derivation) and secondary care hospital (validation). Cases had a positive Clostridium test and were treated with systemic antibiotics before suspicion of CDI. Controls were randomly selected from hospitalized patients treated with systemic antibiotics. Potential predictors were selected from the literature. Logistic regression was used to derive the model. Discrimination and calibration of the model were tested in internal and external validation. A total of 180 cases and 330 controls were included for derivation. Age >65 years, recent hospitalization, CDI history, malignancy, chronic renal failure, use of immunosuppressants, receipt of antibiotics before admission, nonsurgical admission, admission to the intensive care unit, gastric tube feeding, treatment with cephalosporins and presence of an underlying infection were independent predictors of CDI. The area under the receiver operating characteristic curve of the model in the derivation cohort was 0.84 (95% confidence interval 0.80-0.87), and was reduced to 0.81 after internal validation. In external validation, consisting of 97 cases and 417 controls, the model area under the curve was 0.81 (95% confidence interval 0.77-0.85) and model calibration was adequate (Brier score 0.004). A simplified risk score was derived. Using a cutoff of 7 points, the positive predictive value, sensitivity and specificity were 1.0%, 72% and 73%, respectively. In conclusion, a risk prediction model was developed and validated, with good discrimination and calibration, that can be used to target preventive interventions in patients with increased risk of CDI. Copyright © 2015 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
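Discrimination and calibration, as reported here through the area under the curve and the Brier score, can be computed from predicted probabilities as sketched below. This is a generic illustration, not the authors' analysis code; the function names and the four sample predictions are invented.

```python
def auc(probabilities, outcomes):
    """Area under the ROC curve as the share of case/control pairs ranked correctly.

    Ties between a case and a control count as half a correct ranking.
    """
    cases = [p for p, y in zip(probabilities, outcomes) if y]
    controls = [p for p, y in zip(probabilities, outcomes) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and observed 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


# Invented predictions and outcomes for illustration.
predicted = [0.9, 0.8, 0.3, 0.2]
observed = [1, 0, 1, 0]
example_auc = auc(predicted, observed)
example_brier = brier_score(predicted, observed)
```

The pairwise loop is quadratic in sample size, which is acceptable for a sketch; a rank-based formula gives the same value for large cohorts.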
Evidence for the Psychometric Validity, Internal Consistency and Measurement Invariance of Warwick Edinburgh Mental Well-being Scale Scores in Scottish and Irish Adolescents.

PubMed

McKay, Michael T; Andretta, James R

2017-09-01

Mental well-being is an important indicator of both the current and the future health of adolescents. The 14-item Warwick Edinburgh Mental Well-being Scale (WEMWBS) has been well validated in adults world-wide, but less work has been undertaken to examine the psychometric validity and internal consistency of WEMWBS scores in adolescents. In particular, little research has examined scores on the short 7-item version of the WEMWBS. The present study used two large samples of school children in Scotland and Northern Ireland and found that for both forms of the WEMWBS, scores were psychometrically valid, internally consistent, factor saturated, and measurement invariant by country. Using the WEMWBS full form, males reported significantly higher scores than females, and Northern Irish adolescents reported significantly higher scores than their Scottish counterparts. Last, the lowest overall levels of well-being were observed among Scottish females. Copyright © 2017. Published by Elsevier B.V.
Visual reproduction subtest of the Wechsler Memory Scale-Revised: analysis of construct validity.

PubMed

Williams, M A; Rich, M A; Reed, L K; Jackson, W T; LaMarche, J A; Boll, T J

1998-11-01

This study assessed the construct validity of Visual Reproduction (VR) Cards A (Flags) and B (Boxes) from the original Wechsler Memory Scale (WMS) compared to Flags and Boxes from the revised edition of the WMS (WMS-R). Independent raters scored Flags and Boxes using both the original and revised scoring criteria and correlations were obtained with age, education, IQ, and four separate criterion memory measures. Results show that for Flags, there is a tendency for the revised scoring criteria to produce improved construct validity. For Boxes, however, there was a trend in the opposite direction, with the revised scoring criteria demonstrating worse construct validity. Factor analysis suggests that Flags are a more distinct measure of visual memory, whereas Boxes are more complex and significantly associated with conceptual reasoning abilities. Using the revised scoring criteria, Boxes were found to be more strongly related to IQ than Flags. This difference was not found using the original scoring criteria.
Towards Virtual FLS: Development of a Peg Transfer Simulator

PubMed Central

Arikatla, Venkata S; Ahn, Woojin; Sankaranarayanan, Ganesh; De, Suvranu

2014-01-01

Background: Peg transfer is one of five tasks in the Fundamentals of Laparoscopic Surgery (FLS) program. We report the development and validation of a Virtual Basic Laparoscopic Skill Trainer-Peg Transfer (VBLaST-PT©) simulator for automatic real-time scoring and objective quantification of performance. Methods: We have introduced new techniques in order to allow bi-manual manipulation of pegs and automatic scoring/evaluation while maintaining high quality of simulation. We performed a preliminary face and construct validation study with 22 subjects divided into two groups: experts (PGY 4–5, fellow and practicing surgeons) and novice (PGY 1–3). Results: Face validation shows high scores for all the aspects of the simulation. A two-tailed Mann-Whitney U-test showed significant differences between the two groups in completion time (p=0.003), FLS score (p=0.002) and the VBLaST-PT© score (p=0.006). Conclusions: VBLaST-PT© is a high quality virtual simulator that showed both face and construct validity. PMID:24030904
Reliability and Validity of Inferences about Teachers Based on Student Scores. William H. Angoff Memorial Lecture Series

ERIC Educational Resources Information Center

Haertel, Edward H.

2013-01-01

Policymakers and school administrators have embraced value-added models of teacher effectiveness as tools for educational improvement. Teacher value-added estimates may be viewed as complicated scores of a certain kind. This suggests using a test validation model to examine their reliability and validity. Validation begins with an interpretive…
Construct validity of the individual work performance questionnaire.

PubMed

Koopmans, Linda; Bernaards, Claire M; Hildebrandt, Vincent H; de Vet, Henrica C W; van der Beek, Allard J

2014-03-01

To examine the construct validity of the Individual Work Performance Questionnaire (IWPQ). A total of 1424 Dutch workers from three occupational sectors (blue, pink, and white collar) participated in the study. First, IWPQ scores were correlated with related constructs (convergent validity). Second, differences between known groups were tested (discriminative validity). First, IWPQ scores correlated weakly to moderately with absolute and relative presenteeism, and work engagement. Second, significant differences in IWPQ scores were observed for workers differing in job satisfaction, and workers differing in health. Overall, the results indicate acceptable construct validity of the IWPQ. Researchers are provided with a reliable and valid instrument to measure individual work performance comprehensively and generically, among workers from different occupational sectors, with and without health problems.
Design and validation of a portable, inexpensive and multi-beam timing light system using the Nintendo Wii hand controllers.

PubMed

Clark, Ross A; Paterson, Kade; Ritchie, Callan; Blundell, Simon; Bryant, Adam L

2011-03-01

Commercial timing light systems (CTLS) provide precise measurement of athletes' running velocity; however, they are often expensive and difficult to transport. In this study an inexpensive, wireless and portable timing light system was created using the infrared camera in Nintendo Wii hand controllers (NWHC). System creation with gold-standard validation. A Windows-based software program using NWHC to replicate a dual-beam timing gate was created. Firstly, data collected during 2m walking and running trials were validated against a 3D kinematic system. Secondly, data recorded during 5m running trials at various intensities from standing or flying starts were compared to a single beam CTLS and the independent and average scores of three handheld stopwatch (HS) operators. Intraclass correlation coefficient and Bland-Altman plots were used to assess validity. Absolute error quartiles and percentage of trials in absolute error threshold ranges were used to determine accuracy. The NWHC system was valid when compared against the 3D kinematic system (ICC=0.99, median absolute error (MAR)=2.95%). For the flying 5m trials the NWHC system possessed excellent validity and precision (ICC=0.97, MAR<3%) when compared with the CTLS. In contrast, the NWHC system and the HS values during standing start trials possessed only modest validity (ICC<0.75) and accuracy (MAR>8%). A NWHC timing light system is inexpensive, portable and valid for assessing running velocity. Errors in the 5m standing start trials may have been due to erroneous event detection by either the commercial or NWHC-based timing light systems. Copyright © 2010 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
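The quantities behind a Bland-Altman plot (the mean difference between two methods and the 95% limits of agreement) can be sketched as follows. The conventional multiplier of 1.96 standard deviations is assumed, since the abstract does not state the limits used, and the paired timings are invented rather than measured values.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias   -- mean of the pairwise differences (a - b)
    limits -- bias +/- 1.96 sample standard deviations of the differences
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    half_width = 1.96 * stdev(differences)
    return bias, bias - half_width, bias + half_width


# Invented sprint times in seconds from two hypothetical timing systems.
system_one = [1.0, 2.0, 3.0]
system_two = [0.9, 2.1, 2.9]
bias, lower_limit, upper_limit = bland_altman(system_one, system_two)
```

A bias near zero with narrow limits indicates that the two systems can be used interchangeably within the tolerance the application requires.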
Validation of a motivation-based typology of angry aggression among antisocial youths in Norway.

PubMed

Bjørnebekk, Gunnar; Howard, Rick

2012-01-01

This article describes the validation of the Angry Aggression Scales (AAS), the Behavior Inhibition System and the Behavior Activation System (BIS/BAS) scales, the reactive aggression and proactive power scales in relation to a Norwegian sample of 101 antisocial youths with conduct problems (64 boys, 37 girls, mean age 15 ± 1.3 years) and 101 prosocial controls matched on age, gender, education, ethnicity, and school district. Maximum likelihood exploratory factor analyses with oblique rotation were performed on AAS, BIS/BAS, reactive aggression and proactive power scales as well as computation of Cronbach's alpha and McDonald's omega. Tests for normality and homogeneity of variance were acceptable. Factor analyses of AAS and the proactive/reactive aggression scales suggested a hierarchical structure comprising a single higher-order angry aggression (AA) factor and four and two lower-order factors, respectively. Moreover, results suggested one BIS factor and a single higher-order BAS factor with three lower-order factors related to drive, fun-seeking and reward responsiveness. To compare scores of antisocial youths with controls, t-tests on the mean scale scores were computed. Results confirmed that antisocial youths were different from controls on the above-mentioned scales. Consistent with the idea that anger is associated with approach motivation, AAS scores correlated with behavioral activation, but only explosive/reactive and vengeful/ruminative AA correlated with behavioral inhibition. Results generally validated the quadruple typology of aggression and violence proposed by Howard (2009). Copyright © 2012 John Wiley & Sons, Ltd.
Cross-cultural adaptation, reliability and validity of the Turkish version of the Hospital for Special Surgery (HSS) Knee Score.

PubMed

Narin, Selnur; Unver, Bayram; Bakırhan, Serkan; Bozan, Ozgür; Karatosun, Vasfi

2014-01-01

The purpose of this study was to adapt the English version of the Hospital for Special Surgery (HSS) knee score for use in a Turkish population and to evaluate its validity, reliability and cultural adaptation. Standard forward-back translation of the HSS knee score was performed and the Turkish version was applied in 73 patients. The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), Mini-Mental State Examination and sit-to-stand test were also performed and analyzed. Internal consistency reliability was tested using Cronbach's alpha. The intraclass correlation coefficient (ICC) was used to calculate the test-retest reliability at one-week intervals. Validity was assessed by calculating the Pearson correlation between the HSS, WOMAC and sit-to-stand test scores. The ICC ranged from 0.98 to 0.99 with high internal consistency (Cronbach's alpha: 0.87). The WOMAC score correlated with total HSS score (r: -0.80, p<0.001) and sit-to-stand score (r: 0.12, p: 0.312). The Turkish version of the HSS knee score is reliable and valid in evaluating the total knee arthroplasty in Turkish patients.
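Cronbach's alpha, the internal consistency statistic reported here, follows from the item variances and the variance of the total score: alpha = k/(k-1) * (1 - sum of item variances / variance of totals). The sketch below uses that textbook formula with sample variances; the function name and the tiny response matrix are invented and are not HSS knee score data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Invented two-item questionnaire answered by three respondents.
sample_responses = [[1, 2], [2, 3], [3, 4]]
alpha = cronbach_alpha(sample_responses)
```

In this toy matrix the two items move in lockstep, so alpha reaches its maximum of 1; real instruments fall below that.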
Validation of a method for assessing resident physicians' quality improvement proposals.

PubMed

Leenstra, James L; Beckman, Thomas J; Reed, Darcy A; Mundell, William C; Thomas, Kris G; Krajicek, Bryan J; Cha, Stephen S; Kolars, Joseph C; McDonald, Furman S

2007-09-01

Residency programs involve trainees in quality improvement (QI) projects to evaluate competency in systems-based practice and practice-based learning and improvement. Valid approaches to assess QI proposals are lacking. We developed an instrument for assessing resident QI proposals--the Quality Improvement Proposal Assessment Tool (QIPAT-7)--and determined its validity and reliability. QIPAT-7 content was initially obtained from a national panel of QI experts. Through an iterative process, the instrument was refined, pilot-tested, and revised. Seven raters used the instrument to assess 45 resident QI proposals. Principal factor analysis was used to explore the dimensionality of instrument scores. Cronbach's alpha and intraclass correlations were calculated to determine internal consistency and interrater reliability, respectively. QIPAT-7 items comprised a single factor (eigenvalue = 3.4) suggesting a single assessment dimension. Interrater reliability for each item (range 0.79 to 0.93) and internal consistency reliability among the items (Cronbach's alpha = 0.87) were high. This method for assessing resident physician QI proposals is supported by content and internal structure validity evidence. QIPAT-7 is a useful tool for assessing resident QI proposals. Future research should determine the reliability of QIPAT-7 scores in other residency and fellowship training programs. Correlations should also be made between assessment scores and criteria for QI proposal success such as implementation of QI proposals, resident scholarly productivity, and improved patient outcomes.
Ensemble assimilation of ARGO temperature profile, sea surface temperature and Altimetric satellite data into an eddy permitting primitive equation model of the North Atlantic ocean

NASA Astrophysics Data System (ADS)

Yan, Yajing; Barth, Alexander; Beckers, Jean-Marie; Candille, Guillem; Brankart, Jean-Michel; Brasseur, Pierre

2015-04-01

Sea surface height, sea surface temperature and temperature profiles at depth collected between January and December 2005 are assimilated into a realistic eddy permitting primitive equation model of the North Atlantic Ocean using the Ensemble Kalman Filter. 60 ensemble members are generated by adding realistic noise to the forcing parameters related to the temperature. The ensemble is diagnosed and validated by comparison between the ensemble spread and the model/observation difference, as well as by rank histogram before the assimilation experiments. An incremental analysis update scheme is applied in order to reduce spurious oscillations due to the model state correction. The results of the assimilation are assessed according to both deterministic and probabilistic metrics with observations used in the assimilation experiments and independent observations, which goes further than most previous studies and constitutes one of the original points of this paper. Regarding the deterministic validation, the ensemble means, together with the ensemble spreads, are compared to the observations in order to diagnose the ensemble distribution properties in a deterministic way. Regarding the probabilistic validation, the continuous ranked probability score (CRPS) is used to evaluate the ensemble forecast system according to reliability and resolution. The reliability is further decomposed into bias and dispersion by the reduced centred random variable (RCRV) score in order to investigate the reliability properties of the ensemble forecast system. The improvement of the assimilation is demonstrated using these validation metrics. Finally, the deterministic validation and the probabilistic validation are analysed jointly. The consistency and complementarity between both validations are highlighted. Highly reliable situations, in which the RMS error and the CRPS give the same information, are identified for the first time in this paper.
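For a single observation, the CRPS of a finite ensemble can be evaluated with the common empirical form CRPS = mean|x_i - y| - 0.5 * mean|x_i - x_j|. The sketch below implements only that form, as an assumption about how such a score may be computed; it does not reproduce the reliability/resolution decomposition or the RCRV score used in the paper, and the three-member ensemble is invented.

```python
def ensemble_crps(members, observation):
    """Empirical CRPS of an ensemble forecast against one scalar observation.

    First term: mean absolute error of the members against the observation.
    Second term: half the mean absolute difference between all member pairs,
    which rewards ensembles that are no wider than they need to be.
    """
    if not members:
        raise ValueError("ensemble must contain at least one member")
    m = len(members)
    error_term = sum(abs(x - observation) for x in members) / m
    spread_term = sum(abs(xi - xj) for xi in members for xj in members) / (m * m)
    return error_term - 0.5 * spread_term


# Invented three-member ensemble and observation (arbitrary units).
crps_value = ensemble_crps([1.0, 2.0, 3.0], 2.0)
single_member = ensemble_crps([2.5], 2.0)
```

With one member the spread term vanishes and the score reduces to the absolute error, which is why CRPS is often described as a probabilistic generalisation of that deterministic measure.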

Single-joint outcome measures: preliminary validation of patient-reported outcomes and physical examination.

PubMed

Heald, Alison E; Fudman, Edward J; Anklesaria, Pervin; Mease, Philip J

2010-05-01

To assess the validity, responsiveness, and reliability of single-joint outcome measures for determining target joint (TJ) response in patients with inflammatory arthritis. Patient-reported outcomes (PRO), consisting of responses to single questions about TJ global status on a 100-mm visual analog scale (VAS; TJ global score), function on a 100-mm VAS (TJ function score), and pain on a 5-point Likert scale (TJ pain score) were piloted in 66 inflammatory arthritis subjects in a phase 1/2 clinical study of an intraarticular gene transfer agent and compared to physical examination measures (TJ swelling, TJ tenderness) and validated function questionnaires (Disabilities of the Arm, Shoulder and Hand scale, Rheumatoid Arthritis Outcome Score, and the Health Assessment Questionnaire). Construct validity was assessed by evaluating the correlation between the single-joint outcome measures and validated function questionnaires using Spearman's rank correlation. Responsiveness or sensitivity to change was assessed through calculating effect size and standardized response means (SRM). Reliability of physical examination measures was assessed by determining interobserver agreement. The single-joint PRO were highly correlated with each other and correlated well with validated functional measures. The TJ global score exhibited modest effect size and modest SRM that correlated well with the patient's assessment of response on a 100-mm VAS. Physical examination measures exhibited high interrater reliability, but correlated less well with validated functional measures and the patient's assessment of response. Single-joint PRO, particularly the TJ global score, are simple to administer and demonstrate construct validity and responsiveness in patients with inflammatory arthritis. (ClinicalTrials.gov identifier NCT00126724).
Construct Validity of Fresh Frozen Human Cadaver as a Training Model in Minimal Access Surgery

PubMed Central

Macafee, David; Pranesh, Nagarajan; Horgan, Alan F.

2012-01-01

Background: The construct validity of fresh human cadaver as a training tool has not been established previously. The aims of this study were to investigate the construct validity of fresh frozen human cadaver as a method of training in minimal access surgery and determine if novices can be rapidly trained using this model to a safe level of performance. Methods: Junior surgical trainees, novices (<3 laparoscopic procedures performed) in laparoscopic surgery, performed 10 repetitions of a set of structured laparoscopic tasks on fresh frozen cadavers. Expert laparoscopists (>100 laparoscopic procedures) performed 3 repetitions of identical tasks. Performances were scored using a validated, objective Global Operative Assessment of Laparoscopic Skills scale. Scores for 3 consecutive repetitions were compared between experts and novices to determine construct validity. Furthermore, to determine if the novices reached a safe level, a trimmed mean of the experts' scores was used to define a benchmark. Mann-Whitney U test was used for construct validity analysis and 1-sample t test to compare performances of the novice group with the benchmark safe score. Results: Ten novices and 2 experts were recruited. Four out of 5 tasks (nondominant to dominant hand transfer; simulated appendicectomy; intracorporeal and extracorporeal knot tying) showed construct validity. Novices’ scores became comparable to benchmark scores between the eighth and tenth repetition. Conclusion: Minimal access surgical training using fresh frozen human cadavers appears to have construct validity. The laparoscopic skills of novices can be accelerated through to a safe level within 8 to 10 repetitions. PMID:23318058
Establishing the Validity of TOEIC Bridge™ Test Scores for Students in Colombia, Chile, and Ecuador. Research Report. ETS RR-08-58

ERIC Educational Resources Information Center

Sinharay, Sandip; Feng, Ying; Saldivia, Luis; Powers, Donald E.; Ginuta, Anthony; Simpson, Annabelle; Weng, Vincent

2008-01-01

The validity of TOEIC Bridge™ scores as a measure of English language skill was examined from the standpoint of a unified concept of test validity. In this study, more than 6,000 test takers in 3 Latin American countries (Chile, Colombia, and Ecuador) took 1 form of the TOEIC Bridge test, and their scores were compared to additional information…
[Support of the nursing process through electronic nursing documentation systems (UEPD) – Initial validation of an instrument].

PubMed

Hediger, Hannele; Müller-Staub, Maria; Petry, Heidi

2016-01-01

Electronic nursing documentation systems, with standardized nursing terminology, are IT-based systems for recording the nursing processes. These systems have the potential to improve the documentation of the nursing process and to support nurses in care delivery. This article describes the development and initial validation of an instrument (known by its German acronym UEPD) to measure the subjectively-perceived benefits of an electronic nursing documentation system in care delivery. The validity of the UEPD was examined by means of an evaluation study carried out in an acute care hospital (n = 94 nurses) in German-speaking Switzerland. Construct validity was analyzed by principal components analysis. Initial evidence for the validity of the UEPD was obtained. The analysis showed a stable four-factor model (FS = 0.89) comprising 25 items. All factors loaded ≥ 0.50 and the scales demonstrated high internal consistency (Cronbach's α = 0.73 – 0.90). Principal component analysis revealed four dimensions of support: establishing nursing diagnosis and goals; recording a case history/an assessment and documenting the nursing process; implementation and evaluation; and information exchange. Further testing with larger control samples and with different electronic documentation systems is needed. Another potential direction would be to employ the UEPD in a comparison of various electronic documentation systems.
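Internal consistency in the record above is reported as Cronbach's α. As a minimal sketch of how that coefficient is computed from item-level responses, the function below uses the standard formula α = k/(k-1) × (1 - Σ item variances / variance of total scores). The response data are hypothetical and unrelated to the UEPD study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` is a list of k lists; each inner list holds one item's scores,
    in the same respondent order. Sample variances (n - 1) are used.
    """
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]
    sum_item_variances = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Hypothetical responses of five people to a three-item scale.
example_items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
alpha = cronbach_alpha(example_items)
```

Values closer to 1 indicate that the items vary together; items that are perfectly correlated with equal variances give α = 1.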
Validation of the one pass measure for motivational interviewing competence.

PubMed

McMaster, Fiona; Resnicow, Ken

2015-04-01

This paper examines the psychometric properties of the OnePass coding system: a new, user-friendly tool for evaluating practitioner competence in motivational interviewing (MI). We provide data on reliability and validity with the current gold-standard: Motivational Interviewing Treatment Integrity tool (MITI). We compared scores from 27 videotaped MI sessions performed by student counselors trained in MI and simulated patients using both OnePass and MITI, with three different raters for each tool. Reliability was estimated using intra-class coefficients (ICCs), and validity was assessed using Pearson's r. OnePass had high levels of inter-rater reliability with 19/23 items found from substantial to almost perfect agreement. Taking the pair of scores with the highest inter-rater reliability on the MITI, the concurrent validity between the two measures ranged from moderate to high. Validity was highest for evocation, autonomy, direction and empathy. OnePass appears to have good inter-rater reliability while capturing similar dimensions of MI as the MITI. Despite the moderate concurrent validity with the MITI, the OnePass shows promise in evaluating both traditional and novel interpretations of MI. OnePass may be a useful tool for developing and improving practitioner competence in MI where access to MITI coders is limited. Copyright © 2015. Published by Elsevier Ireland Ltd.
Validity and reliability of portfolio assessment of student competence in two dental school populations: a four-year study.

PubMed

Gadbury-Amyot, Cynthia C; McCracken, Michael S; Woldt, Janet L; Brennan, Robert L

2014-05-01

The purpose of this study was to empirically investigate the validity and reliability of portfolio assessment in two U.S. dental schools using a unified framework for validity. In the process of validation, it is not the test that is validated but rather the claims (interpretations and uses) about test scores that are validated. Kane's argument-based validation framework provided the structure for reporting results where validity claims are followed by evidence to support the argument. This multivariate generalizability theory study found that the greatest source of variance was attributable to faculty raters, suggesting that portfolio assessment would benefit from two raters' evaluating each portfolio independently. The results are generally supportive of holistic scoring, but analytical scoring deserves further research. Correlational analyses between student portfolios and traditional measures of student competence and readiness for licensure resulted in significant correlations between portfolios and National Board Dental Examination Part I (r=0.323, p<0.01) and Part II scores (r=0.268, p<0.05) and small and non-significant correlations with grade point average and scores on the Western Regional Examining Board (WREB) exam. It is incumbent upon the users of portfolio assessment to determine if the claims and evidence arguments set forth in this study support the proposed claims for and decisions about portfolio assessment in their respective institutions.
Differences in MMPI-2 FBS and RBS scores in brain injury, probable malingering, and conversion disorder groups: a preliminary study.

PubMed

Peck, C P; Schroeder, R W; Heinrichs, R J; Vondran, E J; Brockman, C J; Webster, B K; Baade, L E

2013-01-01

This study examined differences in raw scores on the Symptom Validity Scale and Response Bias Scale (RBS) from the Minnesota Multiphasic Personality Inventory-2 in three criterion groups: (i) valid traumatic brain injured, (ii) invalid traumatic brain injured, and (iii) psychogenic non-epileptic seizure disorders. Results indicate that a >30 raw score cutoff for the Symptom Validity Scale accurately identified 50% of the invalid traumatic brain injured group, while misclassifying none of the valid traumatic brain injured group and 6% of the psychogenic non-epileptic seizure disorder group. Using a >15 RBS raw cutoff score accurately classified 50% of the invalid traumatic brain injured group and misclassified fewer than 10% of the valid traumatic brain injured and psychogenic non-epileptic seizure disorder groups. These cutoff scores used conjunctively did not misclassify any members of the psychogenic non-epileptic seizure disorder or valid traumatic brain injured groups, while accurately classifying 44% of the invalid traumatic brain injured individuals. Findings from this preliminary study suggest that the conjunctive use of the Symptom Validity Scale and the RBS from the Minnesota Multiphasic Personality Inventory-2 may be useful in differentiating probable malingering from individuals with brain injuries and conversion disorders.
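The conjunctive rule described above flags a profile only when both raw scores exceed their cutoffs (>30 on the Symptom Validity Scale and >15 on the RBS). A minimal sketch, with hypothetical function names and hypothetical score pairs:

```python
def flagged_as_invalid(svs_raw, rbs_raw, svs_cutoff=30, rbs_cutoff=15):
    """Conjunctive rule: flag only if BOTH raw scores exceed their cutoffs."""
    return svs_raw > svs_cutoff and rbs_raw > rbs_cutoff


def flag_rate(records):
    """Fraction of (svs_raw, rbs_raw) pairs flagged by the conjunctive rule."""
    flagged = sum(flagged_as_invalid(svs, rbs) for svs, rbs in records)
    return flagged / len(records)


# Hypothetical (Symptom Validity Scale, RBS) raw-score pairs.
hypothetical_group = [(32, 17), (31, 12), (25, 18), (35, 16)]
rate = flag_rate(hypothetical_group)
```

Requiring both cutoffs can only lower the flag rate relative to either cutoff alone, which is consistent with the abstract's pattern of fewer misclassifications at the cost of somewhat lower detection (44% versus 50%).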
Validation of a Spanish version of the Leicester Cough Questionnaire in non-cystic fibrosis bronchiectasis.

PubMed

Muñoz, Gerard; Buxó, Maria; de Gracia, Javier; Olveira, Casilda; Martinez-Garcia, Miguel Angel; Giron, Rosa; Polverino, Eva; Alvarez, Antonio; Birring, Surinder S; Vendrell, Montserrat

2016-05-01

The Leicester Cough Questionnaire (LCQ) has been validated in non-cystic fibrosis bronchiectasis (NCFBC). The present study aimed to create and validate a Spanish version of the LCQ (LCQ-Sp) in NCFBC. The LCQ-Sp was developed following a standardized protocol. For reliability, we assessed internal consistency and the change in score over a 15-day period in stable state. For responsiveness, we assessed the change in scores between visit 1 and the first exacerbation. For validity, we evaluated convergent validity through correlation with the Saint George's Respiratory Questionnaire (SGRQ) and discriminant validity. Two hundred fifty-nine patients (118 mild bronchiectasis, 90 moderate bronchiectasis and 47 severe bronchiectasis) were included. Internal consistency was high for the total scoring and good for the different domains (Cronbach's α: 0.86-0.91). The test-retest reliability shows an intraclass correlation coefficient of 0.87 for the total score. The mean LCQ-Sp score at visit 1 decreased at the beginning of an exacerbation (15.13 ± 4.06 vs. 12.24 ± 4.64; p < 0.001). The correlation between LCQ-Sp and SGRQ scores was -0.66 (p < 0.01). The differences in the LCQ-Sp total score between the different groups of severity were significant (p < 0.001). The LCQ-Sp discriminates disease severity, is responsive to change when faced with exacerbations and is reliable for use in bronchiectasis. © The Author(s) 2016.
Validation of a Spanish version of the Leicester Cough Questionnaire in non-cystic fibrosis bronchiectasis

PubMed Central

Muñoz, Gerard; Buxó, Maria; de Gracia, Javier; Olveira, Casilda; Martinez-Garcia, Miguel Angel; Giron, Rosa; Polverino, Eva; Alvarez, Antonio; Birring, Surinder S

2016-01-01

The Leicester Cough Questionnaire (LCQ) has been validated in non-cystic fibrosis bronchiectasis (NCFBC). The present study aimed to create and validate a Spanish version of the LCQ (LCQ-Sp) in NCFBC. The LCQ-Sp was developed following a standardized protocol. For reliability, we assessed internal consistency and the change in score over a 15-day period in stable state. For responsiveness, we assessed the change in scores between visit 1 and the first exacerbation. For validity, we evaluated convergent validity through correlation with the Saint George’s Respiratory Questionnaire (SGRQ) and discriminant validity. Two hundred fifty-nine patients (118 mild bronchiectasis, 90 moderate bronchiectasis and 47 severe bronchiectasis) were included. Internal consistency was high for the total scoring and good for the different domains (Cronbach’s α: 0.86–0.91). The test–retest reliability shows an intraclass correlation coefficient of 0.87 for the total score. The mean LCQ-Sp score at visit 1 decreased at the beginning of an exacerbation (15.13 ± 4.06 vs. 12.24 ± 4.64; p < 0.001). The correlation between LCQ-Sp and SGRQ scores was −0.66 (p < 0.01). The differences in the LCQ-Sp total score between the different groups of severity were significant (p < 0.001). The LCQ-Sp discriminates disease severity, is responsive to change when faced with exacerbations and is reliable for use in bronchiectasis. PMID:26902541
Preoperative risk factors for conversion from laparoscopic to open cholecystectomy: a validated risk score derived from a prospective U.K. database of 8820 patients.

PubMed

Sutcliffe, Robert P; Hollyman, Marianne; Hodson, James; Bonney, Glenn; Vohra, Ravi S; Griffiths, Ewen A

2016-11-01

Laparoscopic cholecystectomy is commonly performed, and several factors increase the risk of open conversion, prolonging operating time and hospital stay. Preoperative stratification would improve consent, scheduling and identify appropriate training cases. The aim of this study was to develop a validated risk score for conversion for use in clinical practice. Preoperative patient and disease-related variables were identified from a prospective cholecystectomy database (CholeS) of 8820 patients, divided into main and validation sets. Preoperative predictors of conversion were identified by multivariable binary logistic regression. A risk score was developed and validated using a forward stepwise approach. Some 297 procedures (3.4%) were converted. The risk score was derived from six significant predictors: age (p = 0.005), sex (p < 0.001), indication for surgery (p < 0.001), ASA (p < 0.001), thick-walled gallbladder (p = 0.040) and CBD diameter (p = 0.004). Testing the score on the validation set yielded an AUROC = 0.766 (p < 0.001), and a score >6 identified patients at high risk of conversion (7.1% vs. 1.2%). This validated risk score allows preoperative identification of patients at six-fold increased risk of conversion to open cholecystectomy. Copyright © 2016 International Hepato-Pancreato-Biliary Association Inc. Published by Elsevier Ltd. All rights reserved.
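Two figures in the abstract above lend themselves to a short worked example: the "six-fold" risk is the ratio of the two conversion rates (7.1% vs. 1.2%), and the AUROC is the probability that a randomly chosen converted case has a higher score than a randomly chosen non-converted case. The sketch below assumes hypothetical risk scores; only the two conversion rates come from the abstract.

```python
def relative_risk(rate_high, rate_low):
    """Ratio of event rates between the high-risk and low-risk groups."""
    return rate_high / rate_low


def auroc(scores_event, scores_no_event):
    """AUROC as the share of (event, no-event) pairs ranked correctly.

    Tied pairs count as 0.5.
    """
    wins = 0.0
    for e in scores_event:
        for n in scores_no_event:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(scores_event) * len(scores_no_event))


# Conversion rates reported in the abstract: 7.1% (score > 6) vs. 1.2%.
fold_increase = relative_risk(7.1, 1.2)

# Hypothetical risk scores for converted and non-converted procedures.
converted = [7, 8, 5]
not_converted = [3, 5, 6, 2]
area = auroc(converted, not_converted)
```

The ratio 7.1 / 1.2 is about 5.9, which the abstract rounds to six-fold. An AUROC of 0.5 would mean the score ranks cases no better than chance.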
The use of children's drawings in the evaluation and treatment of child sexual, emotional, and physical abuse.

PubMed

Peterson, L W; Hardin, M; Nitsch, M J

1995-05-01

Primary care physicians can be instrumental in the initial identification of potential sexual, emotional, and physical abuse of children. We reviewed the use of children's artwork as a method of communicating individual and family functioning. A quantitative method of analyzing children's artwork provides more reliability and validity than some methods used previously. A new scoring system was developed that uses individual human figure drawings and kinetic family drawings. This scoring system was based on research with 842 children (341 positively identified as sexually molested, 252 positively not sexually molested but having emotional or behavioral problems, and 249 "normal" public school children). This system is more comprehensive than previous systems of assessment of potential abuse.
Demonstrating the validity of three general scores of PET in predicting higher education achievement in Israel.

PubMed

Oren, Carmel; Kennet-Cohen, Tamar; Turvall, Elliot; Allalouf, Avi

2014-01-01

The Psychometric Entrance Test (PET), used for admission to higher education in Israel together with the Matriculation (Bagrut), had in the past one general (total) score in which the weights for its domains (Verbal, Quantitative and English) were 2:2:1, respectively. In 2011, two additional total scores were introduced, with different weights for the Verbal and the Quantitative domains. This study compares the predictive validity of the three general scores of PET, and demonstrates validity in terms of utility. The sample comprised 100,863 freshman students from all Israeli universities in the classes of 2005-2009. Regression weights and correlations of the predictors with FYGPA were computed. Simulations based on these results supplied the utility estimates. On average, PET is slightly more predictive than the Bagrut; using them both yields a better tool than either of them alone. Assigning differential weights to the components in the respective schools further improves the validity. The introduction of the new general scores of PET is validated by gathering and analyzing evidence based on relations of test scores to other variables. The utility of using the test can be demonstrated in ways different from correlations.
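A general score with 2:2:1 weights is a weighted average of the three domain scores. The sketch below shows that arithmetic under the assumption that domain scores share a common scale; the domain scores are hypothetical, and the weights of the two additional scores introduced in 2011 are not stated in the abstract, so they are not reproduced here.

```python
def general_score(verbal, quantitative, english, weights=(2, 2, 1)):
    """Weighted average of domain scores; weights are Verbal:Quantitative:English."""
    w_v, w_q, w_e = weights
    weighted_sum = w_v * verbal + w_q * quantitative + w_e * english
    return weighted_sum / (w_v + w_q + w_e)


# Hypothetical domain scores on a common scale.
score = general_score(verbal=120, quantitative=130, english=110)
```

Here the result is (2 × 120 + 2 × 130 + 1 × 110) / 5 = 122. Passing a different `weights` tuple shifts the score toward whichever domain is weighted more heavily.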
Application of validity theory and methodology to patient-reported outcome measures (PROMs): building an argument for validity.

PubMed

Hawkins, Melanie; Elsworth, Gerald R; Osborne, Richard H

2018-07-01

Data from subjective patient-reported outcome measures (PROMs) are now being used in the health sector to make or support decisions about individuals, groups and populations. Contemporary validity theorists define validity not as a statistical property of the test but as the extent to which empirical evidence supports the interpretation of test scores for an intended use. However, validity testing theory and methodology are rarely evident in the PROM validation literature. Application of this theory and methodology would provide structure for comprehensive validation planning to support improved PROM development and sound arguments for the validity of PROM score interpretation and use in each new context. This paper proposes the application of contemporary validity theory and methodology to PROM validity testing. The validity testing principles will be applied to a hypothetical case study with a focus on the interpretation and use of scores from a translated PROM that measures health literacy (the Health Literacy Questionnaire or HLQ). Although robust psychometric properties of a PROM are a pre-condition to its use, a PROM's validity lies in the sound argument that a network of empirical evidence supports the intended interpretation and use of PROM scores for decision making in a particular context. The health sector is yet to apply contemporary theory and methodology to PROM development and validation. The theoretical and methodological processes in this paper are offered as an advancement of the theory and practice of PROM validity testing in the health sector.
The cross-cultural adaptation, reliability, and validity of the Copenhagen Neck Functional Disability Scale in patients with chronic neck pain: Turkish version study.

PubMed

Yapali, Gökmen; Günel, Mintaze Kerem; Karahan, Sevilay

2012-05-15

The study design was cross-cultural adaptation and investigation of reliability and validity of the Copenhagen Neck Functional Disability Scale (CNFDS). The aim of this study was to translate the CNFDS into Turkish and assess its reliability and validity among patients with neck pain in the Turkish population. The CNFDS is a reliable and valid evaluation instrument for disability, but there is no published Turkish version of the CNFDS. One hundred one subjects who had chronic neck pain were included in this study. The CNFDS, Neck Pain and Disability Scale, and visual analogue scale were administered to all subjects. For test-retest reliability, with the CNFDS applied at a 1-week interval, the intraclass correlation coefficient was 0.86 (95% confidence interval = 0.679-0.935). There was no difference between test-retest scores (P < 0.001). For investigating concurrent validity, correlation between total score of the CNFDS and the mean visual analogue scale was r = 0.73 (P < 0.001). Concurrent validity of the CNFDS was very good. For investigating construct validity, correlation between total score of the CNFDS and the Neck Pain and Disability Scale was r = 0.78 (P < 0.001). Construct validity of the CNFDS was also very good. Our results suggest that the Turkish version of the CNFDS is a reliable and valid instrument for Turkish people.
Individualizing Risk of Multidrug-Resistant Pathogens in Community-Onset Pneumonia

PubMed Central

Falcone, Marco; Russo, Alessandro; Giannella, Maddalena; Cangemi, Roberto; Scarpellini, Maria Gabriella; Bertazzoni, Giuliano; Alarcón, José Martínez; Taliani, Gloria; Palange, Paolo; Farcomeni, Alessio; Vestri, Annarita; Bouza, Emilio; Violi, Francesco; Venditti, Mario

2015-01-01

Introduction The diffusion of multidrug-resistant (MDR) bacteria has created the need to identify risk factors for acquiring resistant pathogens in patients living in the community. Objective To analyze clinical features of patients with community-onset pneumonia due to MDR pathogens, to evaluate performance of existing scoring tools and to develop a bedside risk score for an early identification of these patients in the Emergency Department. Patients and Methods This was an open, observational, prospective study of consecutive patients with pneumonia, coming from the community, from January 2011 to January 2013. The new score was validated on an external cohort of 929 patients with pneumonia admitted in internal medicine departments participating at a multicenter prospective study in Spain. Results A total of 900 patients were included in the study. The final logistic regression model consisted of four variables: 1) one risk factor for HCAP, 2) bilateral pulmonary infiltration, 3) the presence of pleural effusion, and 4) the severity of respiratory impairment calculated by use of PaO2/FiO2 ratio. A new risk score, the ARUC score, was developed; compared to Aliberti, Shorr, and Shindo scores, this point score system has a good discrimination performance (AUC 0.76, 95% CI 0.71-0.82) and calibration (Hosmer-Lemeshow, χ2 = 7.64; p = 0.469). The new score outperformed HCAP definition in predicting etiology due to MDR organism. The performance of this bedside score was confirmed in the validation cohort (AUC 0.68, 95% CI 0.60-0.77). Conclusion Physicians working in ED should adopt simple risk scores, like ARUC score, to select the most appropriate antibiotic regimens. This individualized approach may help clinicians to identify those patients who need an empirical broad-spectrum antibiotic therapy. PMID:25860142
An illustrative overview of semi-quantitative MRI scoring of knee osteoarthritis: lessons learned from longitudinal observational studies.

PubMed

Roemer, F W; Hunter, D J; Crema, M D; Kwoh, C K; Ochoa-Albiztegui, E; Guermazi, A

2016-02-01

To introduce the most popular magnetic resonance imaging (MRI) osteoarthritis (OA) semi-quantitative (SQ) scoring systems to a broader audience with a focus on the most commonly applied scores, i.e., the MOAKS and WORMS system and illustrate similarities and differences. While the main structure and methodology of each scoring system are publicly available, the core of this overview will be an illustrative imaging atlas section including image examples from multiple OA studies applying MRI in regard to different features assessed, show specific examples of different grades and point out pitfalls and specifics of SQ assessment including artifacts, blinding to time point of acquisition and within-grade evaluation. Similarities and differences between different scoring systems are presented. Technical considerations are followed by a brief description of the most commonly utilized SQ scoring systems including their responsiveness and reliability. The second part is comprised of the atlas section presenting illustrative image examples. Evidence suggests that SQ assessment of OA by expert MRI readers is valid, reliable and responsive, which helps investigators to understand the natural history of this complex disease and to evaluate potential new drugs in OA clinical trials. Researchers have to be aware of the differences and specifics of the different systems to be able to engage in imaging assessment and interpretation of imaging-based data. SQ scoring has enabled us to explain associations of structural tissue damage with clinical manifestations of the disease and with morphological alterations thought to represent disease progression. Copyright © 2015 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
An illustrative overview of semi-quantitative MRI scoring of knee osteoarthritis: Lessons learned from longitudinal observational studies

PubMed Central

Roemer, Frank W.; Hunter, David J.; Crema, Michel D.; Kwoh, C. Kent; Ochoa-Albiztegui, Elena; Guermazi, Ali

2015-01-01

Objective To introduce the most popular magnetic resonance imaging (MRI) osteoarthritis (OA) semi-quantitative (SQ) scoring systems to a broader audience with a focus on the most commonly applied scores, i.e. the MOAKS and WORMS system and illustrate similarities and differences. Design While the main structure and methodology of each scoring system are publicly available, the core of this overview will be an illustrative imaging atlas section including image examples from multiple osteoarthritis studies applying MRI in regard to different features assessed, show specific examples of different grades and point out pitfalls and specifics of SQ assessment including artifacts, blinding to time point of acquisition and within-grade evaluation. Results Similarities and differences between different scoring systems are presented. Technical considerations are followed by a brief description of the most commonly utilized SQ scoring systems including their responsiveness and reliability. The second part is comprised of the atlas section presenting illustrative image examples. Conclusions Evidence suggests that SQ assessment of OA by expert MRI readers is valid, reliable and responsive, which helps investigators to understand the natural history of this complex disease and to evaluate potential new drugs in OA clinical trials. Researchers have to be aware of the differences and specifics of the different systems to be able to engage in imaging assessment and interpretation of imaging-based data. SQ scoring has enabled us to explain associations of structural tissue damage with clinical manifestations of the disease and with morphological alterations thought to represent disease progression. PMID:26318656
The feasibility of sharing simulation-based evaluation scenarios in anesthesiology.

PubMed

Berkenstadt, Haim; Kantor, Gareth S; Yusim, Yakov; Gafni, Naomi; Perel, Azriel; Ezri, Tiberiu; Ziv, Amitai

2005-10-01

We prospectively assessed the feasibility of international sharing of simulation-based evaluation tools despite differences in language, education, and anesthesia practice, in an Israeli study, using validated scenarios from a multi-institutional United States (US) study. Thirty-one Israeli junior anesthesia residents performed four simulation scenarios. Training sessions were videotaped and performance was assessed using two validated scoring systems (Long and Short Forms) by two independent raters. Subjects scored from 37 to 95 (70 +/- 12) of 108 possible points with the "Long Form" and "Short Form" scores ranging from 18 to 35 (28.2 +/- 4.5) of 40 possible points. Scores >70% of the maximal score were achieved by 61% of participants in comparison to only 5% in the original US study. The scenarios were rated as very realistic by 80% of the participants (grade 4 on a 1-4 scale). Reliability of the original assessment tools was demonstrated by internal consistencies of 0.66 for the Long and 0.75 for the Short Form (Cronbach alpha statistic). Values in the original study were 0.72-0.76 for the Long and 0.71-0.75 for the Short Form. The reliability did not change when a revised Israeli version of the scoring was used. Interrater reliability measured by Pearson correlation was 0.91 for the Long and 0.96 for the Short Form (P < 0.01). The high scores for plausibility given to the scenarios and the similar reliability of the original assessment tool support the feasibility of using simulation-based evaluation tools, developed in the US, in Israel. The higher scores achieved by Israeli residents may be related to the fact that most Israeli residents are immigrants with previous training in anesthesia. Simulation-based assessment tools developed in a multi-institutional study in the United States can be used in Israel despite the differences in language, education, and medical system.
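The abstract above reports interrater reliability as a Pearson correlation between two raters and expresses scores relative to the maximum possible. A minimal sketch of both calculations follows; the rater scores are hypothetical, while the means (70 of 108 and 28.2 of 40) are taken from the abstract.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def percent_of_maximum(score, maximum):
    """Score expressed as a percentage of the maximum possible points."""
    return 100.0 * score / maximum


# Hypothetical Long Form scores given by two raters to the same five sessions.
rater_a = [62, 70, 81, 55, 90]
rater_b = [60, 73, 79, 58, 88]
agreement = pearson_r(rater_a, rater_b)

# Mean scores reported in the abstract, as a share of the maximum.
long_form_share = percent_of_maximum(70, 108)
short_form_share = percent_of_maximum(28.2, 40)
```

The mean Long Form score corresponds to roughly 64.8% of the maximum and the mean Short Form score to roughly 70.5%. Note that Pearson correlation measures whether raters rank sessions similarly; it does not detect a constant offset between raters.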
The reliability and validity of qualitative scores for the Controlled Oral Word Association Test.

PubMed

Ross, Thomas P; Calhoun, Emily; Cox, Tara; Wenner, Carolyn; Kono, Whitney; Pleasant, Morgan

2007-05-01

The reliability and validity of two qualitative scoring systems for the Controlled Oral Word Association Test [Benton, A. L., Hamsher, de S. K., & Sivan, A. B. (1983). Multilingual aphasia examination (2nd ed.). Iowa City, IA: AJA Associates] were examined in 108 healthy young adults. The scoring systems developed by Troyer et al. [Troyer, A. K., Moscovitch, M., & Winocur, G. (1997). Clustering and switching as two components of verbal fluency: Evidence from younger and older healthy adults. Neuropsychology, 11, 138-146] and by Abwender et al. [Abwender, D. A., Swan, J. G., Bowerman, J. T., & Connolly, S. W. (2001a). Qualitative analysis of verbal fluency output: Review and comparison of several scoring methods. Assessment, 8, 323-336] each demonstrated excellent interrater reliability (all indices at or above r(icc)=.9). Consistent with previous research [e.g., Ross, T. P. (2003). The reliability of cluster and switch scores for the COWAT. Archives of Clinical Neuropsychology, 18, 153-164], test-retest reliability coefficients (N=53; M interval 44.6 days) for the qualitative scores were modest to poor (r(icc)=.6 to .4 range). Correlations among COWAT scores, measures of executive functioning, verbal learning, working memory, and vocabulary were examined. The idea that qualitative scores represent distinct executive functions such as cognitive flexibility or strategy utilization was not supported. We offer the interpretation that COWAT performance may require the ability to retrieve words in a non-routine manner while suppressing habitual responses and associated processing interference, presumably due to a spread of activation across semantic or lexical networks. This interpretation, though speculative at present, implies that clustering and switching on the COWAT may not be entirely deliberate, but rather an artifact of a passive (i.e., state-dependent) process. Ideas for future research, most noticeably experimental studies using cognitive methods (e.g., priming), are discussed.
Validity and sensitivity to change of the semi-quantitative OMERACT ultrasound scoring system for tenosynovitis in patients with rheumatoid arthritis.

PubMed

Ammitzbøll-Danielsen, Mads; Østergaard, Mikkel; Naredo, Esperanza; Terslev, Lene

2016-12-01

The aim was to evaluate the metric properties of the semi-quantitative OMERACT US scoring system vs a novel quantitative US scoring system for tenosynovitis, by testing its intra- and inter-reader reliability, sensitivity to change and comparison with clinical tenosynovitis scoring in a 6-month follow-up study. US and clinical assessments of the tendon sheaths of the clinically most affected hand and foot were performed at baseline, 3 and 6 months in 51 patients with RA. Tenosynovitis was assessed using the semi-quantitative scoring system (0-3) proposed by the OMERACT US group and a new quantitative US evaluation (0-100). A sum for US grey scale (GS), colour Doppler (CD) and pixel index (PI), respectively, was calculated for each patient. In 20 patients, intra- and inter-observer agreement was established between two independent investigators. A binary clinical tenosynovitis score was performed, calculating a sum score per patient. The intra- and inter-observer agreements for US tenosynovitis assessments were very good at baseline and for change for GS and CD, but less good for PI. The smallest detectable change was 0.97 for GS, 0.93 for CD and 30.1 for PI. The sensitivity to change from month 0 to 6 was high for GS and CD, and slightly higher than for clinical tenosynovitis score and PI. This study demonstrated an excellent intra- and inter-reader agreement between two investigators for the OMERACT US scoring system for tenosynovitis and a high ability to detect changes over time. Quantitative assessment by PI did not add further information. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
