validity predictive validity: Topics by Science.gov

Sample records for validity predictive validity

Overview of Heat Addition and Efficiency Predictions for an Advanced Stirling Convertor

NASA Technical Reports Server (NTRS)

Wilson, Scott D.; Reid, Terry; Schifer, Nicholas; Briggs, Maxwell

2011-01-01

Past methods of predicting net heat input needed to be validated. The validation effort pursued several paths, including improving model inputs, using test hardware to provide validation data, and validating high-fidelity models. Validation test hardware provided direct measurement of net heat input for comparison with predicted values. The predicted value of net heat input was 1.7 percent less than the measured value, and initial calculations of measurement uncertainty were 2.1 percent (under review). Lessons learned during the validation effort were incorporated into the convertor modeling approach, which improved predictions of convertor efficiency.
Assessing the stability of human locomotion: a review of current measures

PubMed Central

Bruijn, S. M.; Meijer, O. G.; Beek, P. J.; van Dieën, J. H.

2013-01-01

Falling poses a major threat to the steadily growing population of the elderly in modern-day society. A major challenge in the prevention of falls is the identification of individuals who are at risk of falling owing to an unstable gait. At present, several methods are available for estimating gait stability, each with its own advantages and disadvantages. In this paper, we review the currently available measures: the maximum Lyapunov exponent (λS and λL), the maximum Floquet multiplier, variability measures, long-range correlations, extrapolated centre of mass, stabilizing and destabilizing forces, foot placement estimator, gait sensitivity norm and maximum allowable perturbation. We explain what these measures represent and how they are calculated, and we assess their validity, divided up into construct validity, predictive validity in simple models, convergent validity in experimental studies, and predictive validity in observational studies. We conclude that (i) the validity of variability measures and λS is best supported across all levels, (ii) the maximum Floquet multiplier and λL have good construct validity, but negative predictive validity in models, negative convergent validity and (for λL) negative predictive validity in observational studies, (iii) long-range correlations lack construct validity and predictive validity in models and have negative convergent validity, and (iv) measures derived from perturbation experiments have good construct validity, but data are lacking on convergent validity in experimental studies and predictive validity in observational studies. In closing, directions for future research on dynamic gait stability are discussed. PMID:23516062
Design Characteristics Influence Performance of Clinical Prediction Rules in Validation: A Meta-Epidemiological Study

PubMed Central

Ban, Jong-Wook; Emparanza, José Ignacio; Urreta, Iratxe; Burls, Amanda

2016-01-01

Background: Many new clinical prediction rules are derived and validated, but the design and reporting quality of clinical prediction research has been less than optimal. We aimed to assess whether design characteristics of validation studies were associated with the overestimation of clinical prediction rules’ performance. We also aimed to evaluate whether validation studies clearly reported important methodological characteristics. Methods: Electronic databases were searched for systematic reviews of clinical prediction rule studies published between 2006 and 2010. Data were extracted from the eligible validation studies included in the systematic reviews. A meta-analytic meta-epidemiological approach was used to assess the influence of design characteristics on predictive performance. For each validation study, it was assessed whether 7 design and 7 reporting characteristics were properly described. Results: A total of 287 validation studies of clinical prediction rules were collected from 15 systematic reviews (31 meta-analyses). Validation studies using case-control design produced a summary diagnostic odds ratio (DOR) 2.2 times (95% CI: 1.2–4.3) larger than validation studies using cohort design and unclear design. When differential verification was used, the summary DOR was overestimated by twofold (95% CI: 1.2–3.1) compared to complete, partial and unclear verification. The summary relative DOR (RDOR) of validation studies with inadequate sample size was 1.9 (95% CI: 1.2–3.1) compared to studies with adequate sample size. Study site, reliability, and clinical prediction rule were adequately described in 10.1%, 9.4%, and 7.0% of validation studies, respectively. Conclusion: Validation studies with design shortcomings may overestimate the performance of clinical prediction rules. The quality of reporting among studies validating clinical prediction rules needs to be improved. PMID:26730980
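The diagnostic odds ratio (DOR) and relative DOR (RDOR) reported above can be illustrated with a minimal sketch. The 2x2 counts below are hypothetical and are not taken from the study, and the function names are chosen only for illustration. The study's summary estimates come from a meta-analytic model, which this simple ratio does not reproduce.

```python
def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP * TN) / (FP * FN) from the cells of a 2x2 table."""
    if fp == 0 or fn == 0:
        raise ValueError("DOR is undefined when FP or FN is zero")
    return (tp * tn) / (fp * fn)


def relative_dor(dor_a, dor_b):
    """Ratio of two DORs; values above 1 mean group A shows the larger DOR."""
    return dor_a / dor_b


# Hypothetical counts for two validation designs (illustration only).
dor_case_control = diagnostic_odds_ratio(tp=80, fp=20, fn=20, tn=80)
dor_cohort = diagnostic_odds_ratio(tp=60, fp=30, fn=20, tn=80)
rdor = relative_dor(dor_case_control, dor_cohort)
print(dor_case_control, dor_cohort, rdor)
```

An RDOR above 1 in a comparison like this would be read as the first design yielding a more favourable accuracy estimate than the second.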
How to test validity in orthodontic research: a mixed dentition analysis example.

PubMed

Donatelli, Richard E; Lee, Shin-Jae

2015-02-01

The data used to test the validity of a prediction method should be different from the data used to generate the prediction model. In this study, we explored whether an independent data set is mandatory for testing the validity of a new prediction method and how validity can be tested without independent new data. Several validation methods were compared in an example using the data from a mixed dentition analysis with a regression model. The validation errors of real mixed dentition analysis data and simulation data were analyzed for increasingly large data sets. The validation results of both the real and the simulation studies demonstrated that the leave-1-out cross-validation method had the smallest errors. The largest errors occurred in the traditional simple validation method. The differences between the validation methods diminished as the sample size increased. The leave-1-out cross-validation method seems to be an optimal validation method for improving the prediction accuracy in a data set with limited sample sizes. Copyright © 2015 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.
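Leave-1-out cross-validation, as described above, can be sketched for a one-predictor linear regression. This is an illustrative sketch only: the function names and the toy data are assumptions, and it is not the regression model or data used in the mixed dentition analysis.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_mean_abs_error(xs, ys):
    """Hold out each observation once, refit on the rest, and average
    the absolute prediction error on the held-out observation."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(abs(ys[i] - (slope * xs[i] + intercept)))
    return sum(errors) / len(errors)


# Toy data (hypothetical): a noisy linear relationship.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
print(loo_mean_abs_error(xs, ys))
```

Every observation is used for testing exactly once and for fitting in all other rounds, which is why the approach suits small samples where a separate test set is costly.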
Disentangling the Predictive Validity of High School Grades for Academic Success in University

ERIC Educational Resources Information Center

Vulperhorst, Jonne; Lutz, Christel; de Kleijn, Renske; van Tartwijk, Jan

2018-01-01

To refine selective admission models, we investigate which measure of prior achievement has the best predictive validity for academic success in university. We compare the predictive validity of three core high school subjects to the predictive validity of high school grade point average (GPA) for academic achievement in a liberal arts university…
Prospective validation of pathologic complete response models in rectal cancer: Transferability and reproducibility.

PubMed

van Soest, Johan; Meldolesi, Elisa; van Stiphout, Ruud; Gatta, Roberto; Damiani, Andrea; Valentini, Vincenzo; Lambin, Philippe; Dekker, Andre

2017-09-01

Multiple models have been developed to predict pathologic complete response (pCR) in locally advanced rectal cancer patients. Unfortunately, validation of these models normally omits the implications of cohort differences on prediction model performance. In this work, we perform a prospective validation of three pCR models, including information on whether this validation targets transferability or reproducibility (cohort differences) of the given models. We applied a novel methodology, the cohort differences model, to predict whether a patient belongs to the training or to the validation cohort. If the cohort differences model performs well, it suggests a large difference in cohort characteristics, meaning we would validate the transferability of the model rather than its reproducibility. We tested our method in a prospective validation of three existing models for pCR prediction in 154 patients. Our results showed a large difference between training and validation cohort for one of the three tested models [area under the receiver operating characteristic curve (AUC) of the cohort differences model: 0.85], signaling that the validation leans towards transferability. Two out of three models had a lower AUC for validation (0.66 and 0.58); one model showed a higher AUC in the validation cohort (0.70). We have successfully applied a new methodology in the validation of three prediction models, which allows us to indicate whether a validation targeted transferability (large differences between training/validation cohort) or reproducibility (small cohort differences). © 2017 American Association of Physicists in Medicine.
A new framework to enhance the interpretation of external validation studies of clinical prediction models.

PubMed

Debray, Thomas P A; Vergouwe, Yvonne; Koffijberg, Hendrik; Nieboer, Daan; Steyerberg, Ewout W; Moons, Karel G M

2015-03-01

It is widely acknowledged that the performance of diagnostic and prognostic prediction models should be assessed in external validation studies with independent data from "different but related" samples as compared with that of the development sample. We developed a framework of methodological steps and statistical methods for analyzing and enhancing the interpretation of results from external validation studies of prediction models. We propose to quantify the degree of relatedness between development and validation samples on a scale ranging from reproducibility to transportability by evaluating their corresponding case-mix differences. We subsequently assess the models' performance in the validation sample and interpret the performance in view of the case-mix differences. Finally, we may adjust the model to the validation setting. We illustrate this three-step framework with a prediction model for diagnosing deep venous thrombosis using three validation samples with varying case mix. While one external validation sample merely assessed the model's reproducibility, two other samples rather assessed model transportability. The performance in all validation samples was adequate, and the model did not require extensive updating to correct for miscalibration or poor fit to the validation settings. The proposed framework enhances the interpretation of findings at external validation of prediction models. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Validity of the MCAT in Predicting Performance in the First Two Years of Medical School.

ERIC Educational Resources Information Center

Jones, Robert F.; Thomae-Forgues, Maria

1984-01-01

The first systematic summary of predictive validity research on the new Medical College Admission Test (MCAT) is presented. The results show that MCAT scores have significant predictive validity with respect to first- and second-year medical school course grades. Further directions for MCAT validity research are described. (Author/MLW)
Direct Validation of Differential Prediction.

ERIC Educational Resources Information Center

Lunneborg, Clifford E.

Using academic achievement data for 655 University students, direct validation of differential predictions based on a battery of aptitude/achievement measures selected for their differential prediction efficiency was attempted. In the cross-validation of the prediction of actual differences among five academic area GPA's, this set of differential…
Criteria of validity for animal models of psychiatric disorders: focus on anxiety disorders and depression

PubMed Central

2011-01-01

Animal models of psychiatric disorders are usually discussed with regard to three criteria first elaborated by Willner: face, predictive and construct validity. Here, we trace the history of these concepts and then try to redraw and refine these criteria, using the framework of the diathesis model of depression that has been proposed by several authors. We thus propose a set of five major criteria (with sub-categories for some of them): homological validity (including species validity and strain validity), pathogenic validity (including ontopathogenic validity and triggering validity), mechanistic validity, face validity (including ethological and biomarker validity) and predictive validity (including induction and remission validity). Homological validity requires that an adequate species and strain be chosen: considering species validity, primates will be considered to have a higher score than drosophila, and considering strains, a high stress reactivity in a strain scores higher than a low stress reactivity in another strain. Pathogenic validity corresponds to the fact that, in order to shape pathological characteristics, the organism has been manipulated both during the developmental period (for example, maternal separation: ontopathogenic validity) and during adulthood (for example, stress: triggering validity). Mechanistic validity corresponds to the fact that the cognitive (for example, cognitive bias) or biological mechanisms (such as dysfunction of the hormonal stress axis regulation) underlying the disorder are identical in both humans and animals. Face validity corresponds to the observable behavioral (ethological validity) or biological (biomarker validity) outcomes: for example anhedonic behavior (ethological validity) or elevated corticosterone (biomarker validity). Finally, predictive validity corresponds to the identity of the relationship between the triggering factor and the outcome (induction validity) and between the effects of the treatments on the two organisms (remission validity). The relevance of this framework is then discussed regarding various animal models of depression. PMID:22738250
Development of Decision Support Formulas for the Prediction of Bladder Outlet Obstruction and Prostatic Surgery in Patients With Lower Urinary Tract Symptom/Benign Prostatic Hyperplasia: Part II, External Validation and Usability Testing of a Smartphone App.

PubMed

Choo, Min Soo; Jeong, Seong Jin; Cho, Sung Yong; Yoo, Changwon; Jeong, Chang Wook; Ku, Ja Hyeon; Oh, Seung-June

2017-04-01

We aimed to externally validate the prediction model we developed for having bladder outlet obstruction (BOO) and requiring prostatic surgery using 2 independent data sets from tertiary referral centers, and also aimed to validate a mobile app for using this model through usability testing. Formulas and nomograms predicting whether a subject has BOO and needs prostatic surgery were validated with an external validation cohort from Seoul National University Bundang Hospital and Seoul Metropolitan Government-Seoul National University Boramae Medical Center between January 2004 and April 2015. A smartphone-based app was developed, and 8 young urologists were enrolled for usability testing to identify any human factor issues of the app. A total of 642 patients were included in the external validation cohort. No significant differences were found in the baseline characteristics of major parameters between the original (n=1,179) and the external validation cohort, except for the maximal flow rate. Predictions of requiring prostatic surgery in the validation cohort showed a sensitivity of 80.6%, a specificity of 73.2%, a positive predictive value of 49.7%, a negative predictive value of 92.0%, and an area under the receiver operating characteristic curve of 0.84. The calibration plot indicated that the predictions have good correspondence. The decision curve also showed a high net benefit. Similar evaluation results using the external validation cohort were seen in the predictions of having BOO. Overall results of the usability test demonstrated that the app was user-friendly with no major human factor issues. External validation of this newly developed prediction model demonstrated a moderate level of discrimination, adequate calibration, and high net benefit gains for predicting both having BOO and requiring prostatic surgery. In addition, a smartphone app implementing the prediction model was user-friendly with no major human factor issues.
Independent external validation of predictive models for urinary dysfunction following external beam radiotherapy of the prostate: Issues in model development and reporting.

PubMed

Yahya, Noorazrul; Ebert, Martin A; Bulsara, Max; Kennedy, Angel; Joseph, David J; Denham, James W

2016-08-01

Most predictive models are not sufficiently validated for prospective use. We performed independent external validation of published predictive models for urinary dysfunctions following radiotherapy of the prostate. Multivariable models developed to predict atomised and generalised urinary symptoms, both acute and late, were considered for validation using a dataset representing 754 participants from the TROG 03.04-RADAR trial. Endpoints and features were harmonised to match the predictive models. The overall performance, calibration and discrimination were assessed. 14 models from four publications were validated. The discrimination of the predictive models in an independent external validation cohort, measured using the area under the receiver operating characteristic (ROC) curve, ranged from 0.473 to 0.695, generally lower than in internal validation. 4 models had ROC >0.6. Shrinkage was required for all predictive models' coefficients ranging from -0.309 (prediction probability was inverse to observed proportion) to 0.823. Predictive models which include baseline symptoms as a feature produced the highest discrimination. Two models produced a predicted probability of 0 and 1 for all patients. Predictive models vary in performance and transferability illustrating the need for improvements in model development and reporting. Several models showed reasonable potential but efforts should be increased to improve performance. Baseline symptoms should always be considered as potential features for predictive models. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Predicting Blunt Cerebrovascular Injury in Pediatric Trauma: Validation of the “Utah Score”

PubMed Central

Ravindra, Vijay M.; Bollo, Robert J.; Sivakumar, Walavan; Akbari, Hassan; Naftel, Robert P.; Limbrick, David D.; Jea, Andrew; Gannon, Stephen; Shannon, Chevis; Birkas, Yekaterina; Yang, George L.; Prather, Colin T.; Kestle, John R.

2017-01-01

Risk factors for blunt cerebrovascular injury (BCVI) may differ between children and adults, suggesting that children at low risk for BCVI after trauma receive unnecessary computed tomography angiography (CTA) and high-dose radiation. We previously developed a score for predicting pediatric BCVI based on retrospective cohort analysis. Our objective is to externally validate this prediction score with a retrospective multi-institutional cohort. We included patients who underwent CTA for traumatic cranial injury at four pediatric Level I trauma centers. Each patient in the validation cohort was scored using the “Utah Score” and classified as high or low risk. Before analysis, we defined a misclassification rate <25% as validating the Utah Score. Six hundred forty-five patients (mean age 8.6 ± 5.4 years; 63.4% males) underwent screening for BCVI via CTA. The validation cohort was 411 patients from three sites compared with the training cohort of 234 patients. Twenty-two BCVIs (5.4%) were identified in the validation cohort. The Utah Score was significantly associated with BCVIs in the validation cohort (odds ratio 8.1 [3.3, 19.8], p < 0.001) and discriminated well in the validation cohort (area under the curve 72%). When the Utah Score was applied to the validation cohort, the sensitivity was 59%, specificity was 85%, positive predictive value was 18%, and negative predictive value was 97%. The Utah Score misclassified 16.6% of patients in the validation cohort. The Utah Score for predicting BCVI in pediatric trauma patients was validated with a low misclassification rate using a large, independent, multicenter cohort. Its implementation in the clinical setting may reduce the use of CTA in low-risk patients. PMID:27297774
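The accuracy measures quoted above (sensitivity, specificity, positive and negative predictive value, misclassification rate) all derive from a 2x2 table of predicted versus observed status. The sketch below shows the arithmetic using hypothetical counts that are not taken from the study; the function name is an assumption for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from the four cells of a 2x2 table."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all with the condition
        "specificity": tn / (tn + fp),   # true negatives among all without it
        "ppv": tp / (tp + fp),           # condition present among predicted positives
        "npv": tn / (tn + fn),           # condition absent among predicted negatives
        "misclassification": (fp + fn) / total,
    }


# Hypothetical counts (illustration only): 200 patients, 50 with the condition.
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=140)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When the condition is rare, the positive predictive value can be low even with reasonable sensitivity and specificity, because false positives from the large unaffected group outnumber true positives.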
Validity and validation of expert (Q)SAR systems.

PubMed

Hulzebos, E; Sijm, D; Traas, T; Posthumus, R; Maslankiewicz, L

2005-08-01

At a recent workshop in Setubal (Portugal) principles were drafted to assess the suitability of (quantitative) structure-activity relationships ((Q)SARs) for assessing the hazards and risks of chemicals. In the present study we applied some of the Setubal principles to test the validity of three (Q)SAR expert systems and validate the results. These principles include a mechanistic basis, the availability of a training set and validation. ECOSAR, BIOWIN and DEREK for Windows have a mechanistic or empirical basis. ECOSAR has a training set for each QSAR. For half of the structural fragments the number of chemicals in the training set is >4. Based on structural fragments and log Kow, ECOSAR uses linear regression to predict ecotoxicity. Validating ECOSAR for three 'valid' classes results in predictivity of ≥ 64%. BIOWIN uses (non-)linear regressions to predict the probability of biodegradability based on fragments and molecular weight. It has a large training set and predicts non-ready biodegradability well. DEREK for Windows predictions are supported by a mechanistic rationale and literature references. The structural alerts in this program have been developed with a training set of positive and negative toxicity data. However, to support the prediction only a limited number of chemicals in the training set is presented to the user. DEREK for Windows predicts effects by 'if-then' reasoning. The program predicts best for mutagenicity and carcinogenicity. Each structural fragment in ECOSAR and DEREK for Windows needs to be evaluated and validated separately.
Assessing Discriminative Performance at External Validation of Clinical Prediction Models

PubMed Central

Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W.

2016-01-01

Introduction: External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. Methods: We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. Results: The permutation test indicated that the validation and development set were homogeneous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. Conclusion: The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients. PMID:26881753
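For a binary outcome, the c-statistic discussed above equals the proportion of all (event, non-event) patient pairs in which the patient with the event received the higher predicted risk, with ties counted as one half. The sketch below is a direct pairwise implementation on hypothetical predictions; the function name and the data are assumptions, and it does not implement the permutation test or the benchmark values evaluated in the study.

```python
def c_statistic(predicted_risks, outcomes):
    """Concordance for a binary outcome: the fraction of (event, non-event)
    pairs in which the event has the higher predicted risk; ties count 0.5."""
    events = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and observed outcomes (illustration only).
risks = [0.9, 0.8, 0.3, 0.1]
observed = [1, 0, 1, 0]
print(c_statistic(risks, observed))  # 3 of 4 pairs are concordant -> 0.75
```

Because the value depends on how far apart the predicted risks of events and non-events lie, a more homogeneous case-mix can lower the c-statistic even when the regression coefficients are correct, which is the distinction the abstract stresses.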
The Predictive Validity of the Minnesota Reading Assessment for Students in Postsecondary Vocational Education Programs.

ERIC Educational Resources Information Center

Brown, James M.; Chang, Gerald

1982-01-01

The predictive validity of the Minnesota Reading Assessment (MRA) when used to project potential performance of postsecondary vocational-technical education students was examined. Findings confirmed the MRA to be a valid predictor, although the error in prediction varied between the criterion variables. (Author/GK)
Comparative Predictive Validity of the New MCAT Using Different Admissions Criteria.

ERIC Educational Resources Information Center

Golmon, Melton E.; Berry, Charles A.

1981-01-01

New Medical College Admission Test (MCAT) scores and undergraduate academic achievement were examined for their validity in predicting the performance of two select student populations at Northwestern University Medical School. The data support the hypothesis that New MCAT scores possess substantial predictive validity. (Author/MLW)
Prediction of adult height in girls: the Beunen-Malina-Freitas method.

PubMed

Beunen, Gaston P; Malina, Robert M; Freitas, Duarte L; Thomis, Martine A; Maia, José A; Claessens, Albrecht L; Gouveia, Elvio R; Maes, Hermine H; Lefevre, Johan

2011-12-01

The purpose of this study was to validate and cross-validate the Beunen-Malina-Freitas method for non-invasive prediction of adult height in girls. A sample of 420 girls aged 10-15 years from the Madeira Growth Study were measured at yearly intervals and then 8 years later. Anthropometric dimensions (lengths, breadths, circumferences, and skinfolds) were measured; skeletal age was assessed using the Tanner-Whitehouse 3 method and menarcheal status (present or absent) was recorded. Adult height was measured and predicted using stepwise, forward, and maximum R² regression techniques. Multiple correlations, mean differences, standard errors of prediction, and error boundaries were calculated. A sample of the Leuven Longitudinal Twin Study was used to cross-validate the regressions. Age-specific coefficients of determination (R²) between predicted and measured adult height varied between 0.57 and 0.96, while standard errors of prediction varied between 1.1 and 3.9 cm. The cross-validation confirmed the validity of the Beunen-Malina-Freitas method in girls aged 12-15 years, but at lower ages the cross-validation was less consistent. We conclude that the Beunen-Malina-Freitas method is valid for the prediction of adult height in girls aged 12-15 years. It is applicable to European populations or populations of European ancestry.
Early Prediction of Intensive Care Unit-Acquired Weakness: A Multicenter External Validation Study.

PubMed

Witteveen, Esther; Wieske, Luuk; Sommers, Juultje; Spijkstra, Jan-Jaap; de Waard, Monique C; Endeman, Henrik; Rijkenberg, Saskia; de Ruijter, Wouter; Sleeswijk, Mengalvio; Verhamme, Camiel; Schultz, Marcus J; van Schaik, Ivo N; Horn, Janneke

2018-01-01

An early diagnosis of intensive care unit-acquired weakness (ICU-AW) is often not possible due to impaired consciousness. To avoid a diagnostic delay, we previously developed a prediction model, based on single-center data from 212 patients (development cohort), to predict ICU-AW at 2 days after ICU admission. The objective of this study was to investigate the external validity of the original prediction model in a new, multicenter cohort and, if necessary, to update the model. Newly admitted ICU patients who were mechanically ventilated at 48 hours after ICU admission were included. Predictors were prospectively recorded, and the outcome ICU-AW was defined by an average Medical Research Council score <4. In the validation cohort, consisting of 349 patients, we analyzed performance of the original prediction model by assessment of calibration and discrimination. Additionally, we updated the model in this validation cohort. Finally, we evaluated a new prediction model based on all patients of the development and validation cohort. Of 349 analyzed patients in the validation cohort, 190 (54%) developed ICU-AW. Both model calibration and discrimination of the original model were poor in the validation cohort. The area under the receiver operating characteristics curve (AUC-ROC) was 0.60 (95% confidence interval [CI]: 0.54-0.66). Model updating methods improved calibration but not discrimination. The new prediction model, based on all patients of the development and validation cohort (total of 536 patients) had a fair discrimination, AUC-ROC: 0.70 (95% CI: 0.66-0.75). The previously developed prediction model for ICU-AW showed poor performance in a new independent multicenter validation cohort. Model updating methods improved calibration but not discrimination. The newly derived prediction model showed fair discrimination. This indicates that early prediction of ICU-AW is still challenging and needs further attention.
Teachers' Grade Assignment and the Predictive Validity of Criterion-Referenced Grades

ERIC Educational Resources Information Center

Thorsen, Cecilia; Cliffordson, Christina

2012-01-01

Research has found that grades are the most valid instruments for predicting educational success. Why grades have better predictive validity than, for example, standardized tests is not yet fully understood. One possible explanation is that grades reflect not only subject-specific knowledge and skills but also individual differences in other…
External validity of two nomograms for predicting distant brain failure after radiosurgery for brain metastases in a bi-institutional independent patient cohort.

PubMed

Prabhu, Roshan S; Press, Robert H; Boselli, Danielle M; Miller, Katherine R; Lankford, Scott P; McCammon, Robert J; Moeller, Benjamin J; Heinzerling, John H; Fasola, Carolina E; Patel, Kirtesh R; Asher, Anthony L; Sumrall, Ashley L; Curran, Walter J; Shu, Hui-Kuo G; Burri, Stuart H

2018-03-01

Patients treated with stereotactic radiosurgery (SRS) for brain metastases (BM) are at increased risk of distant brain failure (DBF). Two nomograms have been recently published to predict individualized risk of DBF after SRS. The goal of this study was to assess the external validity of these nomograms in an independent patient cohort. The records of consecutive patients with BM treated with SRS at Levine Cancer Institute and Emory University between 2005 and 2013 were reviewed. Three validation cohorts were generated based on the specific nomogram or recursive partitioning analysis (RPA) entry criteria: Wake Forest nomogram (n = 281), Canadian nomogram (n = 282), and Canadian RPA (n = 303) validation cohorts. Freedom from DBF at 1-year in the Wake Forest study was 30% compared with 50% in the validation cohort. The validation c-index for both the 6-month and 9-month freedom from DBF Wake Forest nomograms was 0.55, indicating poor discrimination ability, and the goodness-of-fit test for both nomograms was highly significant (p < 0.001), indicating poor calibration. The 1-year actuarial DBF in the Canadian nomogram study was 43.9% compared with 50.9% in the validation cohort. The validation c-index for the Canadian 1-year DBF nomogram was 0.56, and the goodness-of-fit test was also highly significant (p < 0.001). The validation accuracy and c-index of the Canadian RPA classification was 53% and 0.61, respectively. The Wake Forest and Canadian nomograms for predicting risk of DBF after SRS were found to have limited predictive ability in an independent bi-institutional validation cohort. These results reinforce the importance of validating predictive models in independent patient cohorts.
The predictive validity of three versions of the MCAT in relation to performance in medical school, residency, and licensing examinations: a longitudinal study of 36 classes of Jefferson Medical College.

PubMed

Callahan, Clara A; Hojat, Mohammadreza; Veloski, Jon; Erdmann, James B; Gonnella, Joseph S

2010-06-01

The Medical College Admission Test (MCAT) has undergone several revisions for content and validity since its inception. With another comprehensive review pending, this study examines changes in the predictive validity of the MCAT's three recent versions. Study participants were 7,859 matriculants in 36 classes entering Jefferson Medical College between 1970 and 2005; 1,728 took the pre-1978 version of the MCAT; 3,032 took the 1978-1991 version, and 3,099 took the post-1991 version. MCAT subtest scores were the predictors, and performance in medical school, attrition, scores on the medical licensing examinations, and ratings of clinical competence in the first year of residency were the criterion measures. No significant improvement in validity coefficients was observed for performance in medical school or residency. Validity coefficients for all three versions of the MCAT in predicting Part I/Step 1 remained stable (in the mid-0.40s, P < .01). A systematic decline was observed in the validity coefficients of the MCAT versions in predicting Part II/Step 2. It started at 0.47 for the pre-1978 version, decreased to between 0.42 and 0.40 for the 1978-1991 versions, and to 0.37 for the post-1991 version. Validity coefficients for the MCAT versions in predicting Part III/Step 3 remained near 0.30. These were generally larger for women than men. Although the findings support the short- and long-term predictive validity of the MCAT, opportunities to strengthen it remain. Subsequent revisions should increase the test's ability to predict performance on United States Medical Licensing Examination Step 2 and must minimize the differential validity for gender.
Parent- and Self-Reported Dimensions of Oppositionality in Youth: Construct Validity, Concurrent Validity, and the Prediction of Criminal Outcomes in Adulthood

ERIC Educational Resources Information Center

Aebi, Marcel; Plattner, Belinda; Metzke, Christa Winkler; Bessler, Cornelia; Steinhausen, Hans-Christoph

2013-01-01

Background: Different dimensions of oppositional defiant disorder (ODD) have been found as valid predictors of further mental health problems and antisocial behaviors in youth. The present study aimed at testing the construct, concurrent, and predictive validity of ODD dimensions derived from parent- and self-report measures. Method: Confirmatory…
The Predictive Validity of Teacher Candidate Letters of Reference

ERIC Educational Resources Information Center

Mason, Richard W.; Schroeder, Mark P.

2014-01-01

Letters of reference are widely used as an essential part of the hiring process of newly licensed teachers. While the predictive validity of these letters of reference has been called into question, it has never been empirically studied. The current study examined the predictive validity of the quality of letters of reference for forty-one student…
Predictive Validity of a Student Self-Report Screener of Behavioral and Emotional Risk in an Urban High School

ERIC Educational Resources Information Center

Dowdy, Erin; Harrell-Williams, Leigh; Dever, Bridget V.; Furlong, Michael J.; Moore, Stephanie; Raines, Tara; Kamphaus, Randy W.

2016-01-01

Increasingly, schools are implementing school-based screening for risk of behavioral and emotional problems; hence, foundational evidence supporting the predictive validity of screening instruments is important to assess. This study examined the predictive validity of the Behavior Assessment System for Children-2 Behavioral and Emotional Screening…
Predictive Validity of Curriculum-Based Measures for English Learners at Varying English Proficiency Levels

ERIC Educational Resources Information Center

Kim, Jennifer Sun; Vanderwood, Michael L.; Lee, Catherine Y.

2016-01-01

This study examined the predictive validity of curriculum-based measures in reading for Spanish-speaking English learners (ELs) at various levels of English proficiency. Third-grade Spanish-speaking EL students were screened during the fall using DIBELS Oral Reading Fluency (DORF) and Daze. Predictive validity was examined in relation to spring…
Applicability of Monte Carlo cross validation technique for model development and validation using generalised least squares regression

NASA Astrophysics Data System (ADS)

Haddad, Khaled; Rahman, Ataur; A Zaman, Mohammad; Shrestha, Surendra

2013-03-01

Summary In regional hydrologic regression analysis, model selection and validation are regarded as important steps. Here, the model selection is usually based on some measurements of goodness-of-fit between the model prediction and observed data. In Regional Flood Frequency Analysis (RFFA), leave-one-out (LOO) validation or a fixed percentage leave out validation (e.g., 10%) is commonly adopted to assess the predictive ability of regression-based prediction equations. This paper develops a Monte Carlo Cross Validation (MCCV) technique (which has widely been adopted in Chemometrics and Econometrics) in RFFA using Generalised Least Squares Regression (GLSR) and compares it with the most commonly adopted LOO validation approach. The study uses simulated and regional flood data from the state of New South Wales in Australia. It is found that when developing hydrologic regression models, application of the MCCV is likely to result in a more parsimonious model than the LOO. It has also been found that the MCCV can provide a more realistic estimate of a model's predictive ability when compared with the LOO.
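Illustrative note: the contrast between leave-one-out (LOO) validation and Monte Carlo cross-validation (MCCV) described above can be sketched as follows. This is a simplified illustration that swaps in ordinary least squares with a single predictor in place of the generalised least squares regression used in the paper; the function names, the 30% test fraction, and the number of random splits are assumptions made here, not values from the study.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def mse_on(train_idx, test_idx, xs, ys):
    """Fit on the training indices, return mean squared error on the test indices."""
    a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
    return sum((ys[i] - (a + b * xs[i])) ** 2 for i in test_idx) / len(test_idx)


def loo_mse(xs, ys):
    """Leave-one-out: each observation is held out exactly once."""
    n = len(xs)
    return sum(
        mse_on([j for j in range(n) if j != i], [i], xs, ys) for i in range(n)
    ) / n


def mccv_mse(xs, ys, n_splits=200, test_fraction=0.3, seed=0):
    """Monte Carlo CV: repeated random splits with a larger held-out share."""
    rng = random.Random(seed)
    n = len(xs)
    n_test = max(1, round(test_fraction * n))
    total = 0.0
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        total += mse_on(idx[n_test:], idx[:n_test], xs, ys)
    return total / n_splits


# Hypothetical noisy data for demonstration only.
demo_rng = random.Random(1)
xs = [float(i) for i in range(20)]
ys = [1.0 + 2.0 * x + demo_rng.gauss(0.0, 1.0) for x in xs]
print(loo_mse(xs, ys), mccv_mse(xs, ys))
```

Because MCCV trains on fewer observations per split than LOO, it tends to penalise over-fitted models more heavily, which is consistent with the abstract's finding that it favours more parsimonious models.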
Risk assessment for juvenile justice: a meta-analysis.

PubMed

Schwalbe, Craig S

2007-10-01

Risk assessment instruments are increasingly employed by juvenile justice settings to estimate the likelihood of recidivism among delinquent juveniles. In concert with their increased use, validation studies documenting their predictive validity have increased in number. The purpose of this study was to assess the average predictive validity of juvenile justice risk assessment instruments and to identify risk assessment characteristics that are associated with higher predictive validity. A search of the published and grey literature yielded 28 studies that estimated the predictive validity of 28 risk assessment instruments. Findings of the meta-analysis were consistent with effect sizes obtained in larger meta-analyses of criminal justice risk assessment instruments and showed that brief risk assessment instruments had smaller effect sizes than other types of instruments. However, this finding is tentative owing to limitations of the literature.
Analysis of model development strategies: predicting ventral hernia recurrence.

PubMed

Holihan, Julie L; Li, Linda T; Askenasy, Erik P; Greenberg, Jacob A; Keith, Jerrod N; Martindale, Robert G; Roth, J Scott; Liang, Mike K

2016-11-01

There have been many attempts to identify variables associated with ventral hernia recurrence; however, it is unclear which statistical modeling approach results in models with greatest internal and external validity. We aim to assess the predictive accuracy of models developed using five common variable selection strategies to determine variables associated with hernia recurrence. Two multicenter ventral hernia databases were used. Database 1 was randomly split into "development" and "internal validation" cohorts. Database 2 was designated "external validation". The dependent variable for model development was hernia recurrence. Five variable selection strategies were used: (1) "clinical"-variables considered clinically relevant, (2) "selective stepwise"-all variables with a P value <0.20 were assessed in a step-backward model, (3) "liberal stepwise"-all variables were included and step-backward regression was performed, (4) "restrictive internal resampling," and (5) "liberal internal resampling." Variables were included with P < 0.05 for the Restrictive model and P < 0.10 for the Liberal model. A time-to-event analysis using Cox regression was performed using these strategies. The predictive accuracy of the developed models was tested on the internal and external validation cohorts using Harrell's C-statistic where C > 0.70 was considered "reasonable". The recurrence rate was 32.9% (n = 173/526; median/range follow-up, 20/1-58 mo) for the development cohort, 36.0% (n = 95/264, median/range follow-up 20/1-61 mo) for the internal validation cohort, and 12.7% (n = 155/1224, median/range follow-up 9/1-50 mo) for the external validation cohort. Internal validation demonstrated reasonable predictive accuracy (C-statistics = 0.772, 0.760, 0.767, 0.757, 0.763), while on external validation, predictive accuracy dipped precipitously (C-statistic = 0.561, 0.557, 0.562, 0.553, 0.560). 
Predictive accuracy was equally adequate on internal validation among models; however, on external validation, all five models failed to demonstrate utility. Future studies should report multiple variable selection techniques and demonstrate predictive accuracy on external data sets for model validation. Copyright © 2016 Elsevier Inc. All rights reserved.
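Illustrative note: Harrell's C-statistic, used above to compare internal and external validation, can be sketched for right-censored follow-up data as below. This is a minimal illustration under simplifying assumptions (pairs with tied follow-up times are skipped); the function name `harrell_c` and the toy data are hypothetical and not taken from the study.

```python
def harrell_c(times, events, risks):
    """Harrell's concordance index.

    times  : follow-up time for each subject
    events : 1 if recurrence was observed, 0 if censored
    risks  : model-predicted risk score (higher = earlier expected event)

    A pair (i, j) is comparable when subject i had an observed event before
    subject j's follow-up time. It is concordant when the model assigned
    subject i the higher risk; tied risks count one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: months of follow-up, event flags, predicted risks.
months = [2, 4, 6, 8]
recurred = [1, 1, 0, 1]
predicted = [0.9, 0.7, 0.5, 0.1]
c = harrell_c(months, recurred, predicted)
print(c, "reasonable" if c > 0.70 else "not reasonable")
```

The 0.70 cutoff in the last line mirrors the threshold the abstract uses for "reasonable" accuracy; a value near 0.5, as seen on external validation, means the ranking is close to chance.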
The stroke impairment assessment set: its internal consistency and predictive validity.

PubMed

Tsuji, T; Liu, M; Sonoda, S; Domen, K; Chino, N

2000-07-01

To study the scale quality and predictive validity of the Stroke Impairment Assessment Set (SIAS) developed for stroke outcome research. Rasch analysis of the SIAS; stepwise multiple regression analysis to predict discharge functional independence measure (FIM) raw scores from demographic data, the SIAS scores, and the admission FIM scores; cross-validation of the prediction rule. Tertiary rehabilitation center in Japan. One hundred ninety stroke inpatients for the study of the scale quality and the predictive validity; a second sample of 116 stroke inpatients for the cross-validation study. Mean square fit statistics to study the degree of fit to the unidimensional model; logits to express item difficulties; discharge FIM scores for the study of predictive validity. The degree of misfit was acceptable except for the shoulder range of motion (ROM), pain, visuospatial function, and speech items; and the SIAS items could be arranged on a common unidimensional scale. The difficulty patterns were identical at admission and at discharge except for the deep tendon reflexes, ROM, and pain items. They were also similar for the right- and left-sided brain lesion groups except for the speech and visuospatial items. For the prediction of the discharge FIM scores, the independent variables selected were age, the SIAS total scores, and the admission FIM scores; and the adjusted R2 was .64 (p < .0001). Stability of the predictive equation was confirmed in the cross-validation sample (R2 = .68, p < .001). The unidimensionality of the SIAS was confirmed, and the SIAS total scores proved useful for stroke outcome prediction.
Validation of the Economic and Health Outcomes Model of Type 2 Diabetes Mellitus (ECHO-T2DM).

PubMed

Willis, Michael; Johansen, Pierre; Nilsson, Andreas; Asseburg, Christian

2017-03-01

The Economic and Health Outcomes Model of Type 2 Diabetes Mellitus (ECHO-T2DM) was developed to address study questions pertaining to the cost-effectiveness of treatment alternatives in the care of patients with type 2 diabetes mellitus (T2DM). Naturally, the usefulness of a model is determined by the accuracy of its predictions. A previous version of ECHO-T2DM was validated against actual trial outcomes and the model predictions were generally accurate. However, there have been recent upgrades to the model, which modify model predictions and necessitate an update of the validation exercises. The objectives of this study were to extend the methods available for evaluating model validity, to conduct a formal model validation of ECHO-T2DM (version 2.3.0) in accordance with the principles espoused by the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) and the Society for Medical Decision Making (SMDM), and secondarily to evaluate the relative accuracy of four sets of macrovascular risk equations included in ECHO-T2DM. We followed the ISPOR/SMDM guidelines on model validation, evaluating face validity, verification, cross-validation, and external validation. Model verification involved 297 'stress tests', in which specific model inputs were modified systematically to ascertain correct model implementation. Cross-validation consisted of a comparison between ECHO-T2DM predictions and those of the seminal National Institutes of Health model. In external validation, study characteristics were entered into ECHO-T2DM to replicate the clinical results of 12 studies (including 17 patient populations), and model predictions were compared to observed values using established statistical techniques as well as measures of average prediction error, separately for the four sets of macrovascular risk equations supported in ECHO-T2DM. Sub-group analyses were conducted for dependent vs. independent outcomes and for microvascular vs. macrovascular vs. mortality endpoints. All stress tests were passed. ECHO-T2DM replicated the National Institutes of Health cost-effectiveness application with numerically similar results. In external validation of ECHO-T2DM, model predictions agreed well with observed clinical outcomes. For all sets of macrovascular risk equations, the results were close to the intercept and slope coefficients corresponding to a perfect match, resulting in high R2 and failure to reject concordance using an F test. The results were similar for sub-groups of dependent and independent validation, with some degree of under-prediction of macrovascular events. ECHO-T2DM continues to match health outcomes in clinical trials in T2DM, with prediction accuracy similar to other leading models of T2DM.
Criterion for evaluating the predictive ability of nonlinear regression models without cross-validation.

PubMed

Kaneko, Hiromasa; Funatsu, Kimito

2013-09-23

We propose predictive performance criteria for nonlinear regression models without cross-validation. The proposed criteria are the determination coefficient and the root-mean-square error for the midpoints between k-nearest-neighbor data points. These criteria can be used to evaluate predictive ability after the regression models are updated, whereas cross-validation cannot be performed in such a situation. The proposed method is effective and helpful in handling big data when cross-validation cannot be applied. By analyzing data from numerical simulations and quantitative structural relationships, we confirm that the proposed criteria enable the predictive ability of the nonlinear regression models to be appropriately quantified.
Validity and reliability of a self-report instrument to assess social support and physical environmental correlates of physical activity in adolescents

PubMed Central

2012-01-01

Background The purpose of this study was to examine the internal consistency, test-retest reliability, construct validity and predictive validity of a new German self-report instrument to assess the influence of social support and the physical environment on physical activity in adolescents. Methods Based on theoretical consideration, the short scales on social support and physical environment were developed and cross-validated in two independent study samples of 9 to 17 year-old girls and boys. The longitudinal sample of Study I (n = 196) was recruited from a German comprehensive school, and subjects in this study completed the questionnaire twice with a between-test interval of seven days. Cronbach’s alphas were computed to determine the internal consistency of the factors. Test-retest reliability of the latent factors was assessed using intra-class coefficients. Factorial validity of the scales was assessed using principal components analysis. Construct validity was determined using a cross-validation technique by performing confirmatory factor analysis with the independent nationwide cross-sectional sample of Study II (n = 430). Correlations between factors and three measures of physical activity (objectively measured moderate-to-vigorous physical activity (MVPA), self-reported habitual MVPA and self-reported recent MVPA) were calculated to determine the predictive validity of the instrument. Results Construct validity of the social support scale (two factors: parental support and peer support) and the physical environment scale (four factors: convenience, public recreation facilities, safety and private sport providers) was shown. Both scales had moderate test-retest reliability. The factors of the social support scale also had good internal consistency and predictive validity. Internal consistency and predictive validity of the physical environment scale were low to acceptable. Conclusions The results of this study indicate moderate to good reliability and construct validity of the social support scale and physical environment scale. Predictive validity was only confirmed for the social support scale but not for the physical environment scale. Hence, it remains unclear if a person’s physical environment has a direct or an indirect effect on physical activity behavior or a moderation function. PMID:22928865
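Illustrative note: the internal consistency statistic named in the abstract above, Cronbach's alpha, follows from the item variances and the variance of the summed scale. The sketch below is a minimal illustration with hypothetical item scores; the function name `cronbach_alpha` and the data layout (one list of respondent scores per item) are assumptions made here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of k lists, one per questionnaire item, each holding the
           scores of the same n respondents in the same order.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if any(len(item) != n for item in items):
        raise ValueError("every item needs one score per respondent")
    totals = [sum(item[r] for item in items) for r in range(n)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    item_var_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical scores for a three-item scale answered by five respondents.
scale = [
    [3, 4, 2, 5, 4],
    [2, 4, 3, 5, 3],
    [3, 5, 2, 4, 4],
]
print(round(cronbach_alpha(scale), 3))
```

Alpha approaches 1 when items move together across respondents and falls toward 0 when they are unrelated.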
Validity and reliability of a self-report instrument to assess social support and physical environmental correlates of physical activity in adolescents.

PubMed

Reimers, Anne K; Jekauc, Darko; Mess, Filip; Mewes, Nadine; Woll, Alexander

2012-08-29

The purpose of this study was to examine the internal consistency, test-retest reliability, construct validity and predictive validity of a new German self-report instrument to assess the influence of social support and the physical environment on physical activity in adolescents. Based on theoretical consideration, the short scales on social support and physical environment were developed and cross-validated in two independent study samples of 9 to 17 year-old girls and boys. The longitudinal sample of Study I (n = 196) was recruited from a German comprehensive school, and subjects in this study completed the questionnaire twice with a between-test interval of seven days. Cronbach's alphas were computed to determine the internal consistency of the factors. Test-retest reliability of the latent factors was assessed using intra-class coefficients. Factorial validity of the scales was assessed using principal components analysis. Construct validity was determined using a cross-validation technique by performing confirmatory factor analysis with the independent nationwide cross-sectional sample of Study II (n = 430). Correlations between factors and three measures of physical activity (objectively measured moderate-to-vigorous physical activity (MVPA), self-reported habitual MVPA and self-reported recent MVPA) were calculated to determine the predictive validity of the instrument. Construct validity of the social support scale (two factors: parental support and peer support) and the physical environment scale (four factors: convenience, public recreation facilities, safety and private sport providers) was shown. Both scales had moderate test-retest reliability. The factors of the social support scale also had good internal consistency and predictive validity. Internal consistency and predictive validity of the physical environment scale were low to acceptable. The results of this study indicate moderate to good reliability and construct validity of the social support scale and physical environment scale. Predictive validity was only confirmed for the social support scale but not for the physical environment scale. Hence, it remains unclear if a person's physical environment has a direct or an indirect effect on physical activity behavior or a moderation function.
Predictive Validity Study of the APS Writing and Reading Tests [and] Validating Placement Rules for the APS Writing Test.

ERIC Educational Resources Information Center

College of the Canyons, Valencia, CA. Office of Institutional Development.

California's College of the Canyons has used the College Board Assessment and Placement Services (APS) test to assess students' abilities in basic and college English since spring 1993. These two reports summarize data from a May 1994 study of the predictive validity of the APS writing and reading tests and a June 1994 effort to validate the cut…
Optimal test selection for prediction uncertainty reduction

DOE PAGES

Mullins, Joshua; Mahadevan, Sankaran; Urbina, Angel

2016-12-02

Economic factors and experimental limitations often lead to sparse and/or imprecise data used for the calibration and validation of computational models. This paper addresses resource allocation for calibration and validation experiments, in order to maximize their effectiveness within given resource constraints. When observation data are used for model calibration, the quality of the inferred parameter descriptions is directly affected by the quality and quantity of the data. This paper characterizes parameter uncertainty within a probabilistic framework, which enables the uncertainty to be systematically reduced with additional data. The validation assessment is also uncertain in the presence of sparse and imprecise data; therefore, this paper proposes an approach for quantifying the resulting validation uncertainty. Since calibration and validation uncertainty affect the prediction of interest, the proposed framework explores the decision of cost versus importance of data in terms of the impact on the prediction uncertainty. Often, calibration and validation tests may be performed for different input scenarios, and this paper shows how the calibration and validation results from different conditions may be integrated into the prediction. Then, a constrained discrete optimization formulation that selects the number of tests of each type (calibration or validation at given input conditions) is proposed. Furthermore, the proposed test selection methodology is demonstrated on a microelectromechanical system (MEMS) example.
Geographic and temporal validity of prediction models: Different approaches were useful to examine model performance

PubMed Central

Austin, Peter C.; van Klaveren, David; Vergouwe, Yvonne; Nieboer, Daan; Lee, Douglas S.; Steyerberg, Ewout W.

2017-01-01

Objective Validation of clinical prediction models traditionally refers to the assessment of model performance in new patients. We studied different approaches to geographic and temporal validation in the setting of multicenter data from two time periods. Study Design and Setting We illustrated different analytic methods for validation using a sample of 14,857 patients hospitalized with heart failure at 90 hospitals in two distinct time periods. Bootstrap resampling was used to assess internal validity. Meta-analytic methods were used to assess geographic transportability. Each hospital was used once as a validation sample, with the remaining hospitals used for model derivation. Hospital-specific estimates of discrimination (c-statistic) and calibration (calibration intercepts and slopes) were pooled using random effects meta-analysis methods. I2 statistics and prediction interval width quantified geographic transportability. Temporal transportability was assessed using patients from the earlier period for model derivation and patients from the later period for model validation. Results Estimates of reproducibility, pooled hospital-specific performance, and temporal transportability were on average very similar, with c-statistics of 0.75. Between-hospital variation was moderate according to I2 statistics and prediction intervals for c-statistics. Conclusion This study illustrates how performance of prediction models can be assessed in settings with multicenter data at different time periods. PMID:27262237
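Illustrative note: the abstract above pools hospital-specific performance estimates with random-effects meta-analysis and summarises between-hospital variation with the I2 statistic. The sketch below shows one standard way to do this, the DerSimonian-Laird estimator. The abstract does not say which estimator the authors used, so this choice is an assumption, as are the function name `pool_random_effects` and the toy numbers. In practice c-statistics are often pooled on a transformed (for example logit) scale, which this sketch omits.

```python
def pool_random_effects(estimates, variances):
    """DerSimonian-Laird random-effects pooling.

    estimates : one performance estimate per hospital (e.g. a c-statistic)
    variances : the within-hospital sampling variance of each estimate
    Returns (pooled estimate, between-hospital variance tau^2, I^2 in percent).
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need matching lists with at least two hospitals")
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each hospital's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    return pooled, tau2, i2


# Hypothetical hospital-specific c-statistics and their variances.
c_stats = [0.72, 0.78, 0.74, 0.77]
c_vars = [0.0004, 0.0006, 0.0005, 0.0004]
print(pool_random_effects(c_stats, c_vars))
```

I2 expresses the share of total variation attributable to true between-hospital differences rather than sampling error, so a higher value indicates poorer geographic transportability.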
Risk prediction models for graft failure in kidney transplantation: a systematic review.

PubMed

Kaboré, Rémi; Haller, Maria C; Harambat, Jérôme; Heinze, Georg; Leffondré, Karen

2017-04-01

Risk prediction models are useful for identifying kidney recipients at high risk of graft failure, thus optimizing clinical care. Our objective was to systematically review the models that have been recently developed and validated to predict graft failure in kidney transplantation recipients. We used PubMed and Scopus to search for English, German and French language articles published in 2005-15. We selected studies that developed and validated a new risk prediction model for graft failure after kidney transplantation, or validated an existing model with or without updating the model. Data on recipient characteristics and predictors, as well as modelling and validation methods were extracted. In total, 39 articles met the inclusion criteria. Of these, 34 developed and validated a new risk prediction model and 5 validated an existing one with or without updating the model. The most frequently predicted outcome was graft failure, defined as dialysis, re-transplantation or death with functioning graft. Most studies used the Cox model. There was substantial variability in predictors used. In total, 25 studies used predictors measured at transplantation only, and 14 studies used predictors also measured after transplantation. Discrimination performance was reported in 87% of studies, while calibration was reported in 56%. Performance indicators were estimated using both internal and external validation in 13 studies, and using external validation only in 6 studies. Several prediction models for kidney graft failure in adults have been published. Our study highlights the need to better account for competing risks when applicable in such studies, and to adequately account for post-transplant measures of predictors in studies aiming at improving monitoring of kidney transplant recipients. © The Author 2017. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.

Latency-Based and Psychophysiological Measures of Sexual Interest Show Convergent and Concurrent Validity.

PubMed

Ó Ciardha, Caoilte; Attard-Johnson, Janice; Bindemann, Markus

2018-04-01

Latency-based measures of sexual interest require additional evidence of validity, as do newer pupil dilation approaches. A total of 102 community men completed six latency-based measures of sexual interest. Pupillary responses were recorded during three of these tasks and in an additional task where no participant response was required. For adult stimuli, there was a high degree of intercorrelation between measures, suggesting that tasks may be measuring the same underlying construct (convergent validity). In addition to being correlated with one another, measures also predicted participants' self-reported sexual interest, demonstrating concurrent validity (i.e., the ability of a task to predict a more validated, simultaneously recorded, measure). Latency-based and pupillometric approaches also showed preliminary evidence of concurrent validity in predicting both self-reported interest in child molestation and viewing pornographic material containing children. Taken together, the study findings build on the evidence base for the validity of latency-based and pupillometric measures of sexual interest.
Prediction models for successful external cephalic version: a systematic review.

PubMed

Velzel, Joost; de Hundt, Marcella; Mulder, Frederique M; Molkenboer, Jan F M; Van der Post, Joris A M; Mol, Ben W; Kok, Marjolein

2015-12-01

To provide an overview of existing prediction models for successful ECV, and to assess their quality, development and performance. We searched MEDLINE, EMBASE and the Cochrane Library to identify all articles reporting on prediction models for successful ECV published from inception to January 2015. We extracted information on study design, sample size, model-building strategies and validation. We evaluated the phases of model development and summarized their performance in terms of discrimination, calibration and clinical usefulness. We collected different predictor variables together with their defined significance, in order to identify important predictor variables for successful ECV. We identified eight articles reporting on seven prediction models. All models were subjected to internal validation. Only one model was also validated in an external cohort. Two prediction models had a low overall risk of bias, of which only one showed promising predictive performance at internal validation. This model also completed the phase of external validation. The impact on clinical practice was not evaluated for any of the models. The most important predictor variables for successful ECV described in the selected articles were parity, placental location, breech engagement and the fetal head being palpable. One model was assessed for discrimination and calibration in both internal (AUC 0.71) and external validation (AUC 0.64), while two other models were assessed with discrimination and calibration, respectively. We found one prediction model for breech presentation that was validated in an external cohort and had acceptable predictive performance. This model should be used to counsel women considering ECV. Copyright © 2015. Published by Elsevier Ireland Ltd.
Systematic review of the concurrent and predictive validity of MRI biomarkers in OA

PubMed Central

Hunter, D.J.; Zhang, W.; Conaghan, Philip G.; Hirko, K.; Menashe, L.; Li, L.; Reichmann, W.M.; Losina, E.

2012-01-01

SUMMARY Objective To summarize literature on the concurrent and predictive validity of MRI-based measures of osteoarthritis (OA) structural change. Methods An online literature search was conducted of the OVID, EMBASE, CINAHL, PsychInfo and Cochrane databases of articles published up to the time of the search, April 2009. 1338 abstracts obtained with this search were preliminarily screened for relevance by two reviewers. Of these, 243 were selected for data extraction for this analysis on validity as well as separate reviews on discriminate validity and diagnostic performance. Of these 142 manuscripts included data pertinent to concurrent validity and 61 manuscripts for the predictive validity review. For this analysis we extracted data on criterion (concurrent and predictive) validity from both longitudinal and cross-sectional studies for all synovial joint tissues as it relates to MRI measurement in OA. Results Concurrent validity of MRI in OA has been examined compared to symptoms, radiography, histology/pathology, arthroscopy, CT, and alignment. The relation of bone marrow lesions, synovitis and effusion to pain was moderate to strong. There was a weak or no relation of cartilage morphology or meniscal tears to pain. The relation of cartilage morphology to radiographic OA and radiographic joint space was inconsistent. There was a higher frequency of meniscal tears, synovitis and other features in persons with radiographic OA. The relation of cartilage to other constructs including histology and arthroscopy was stronger. Predictive validity of MRI in OA has been examined for ability to predict total knee replacement (TKR), change in symptoms, radiographic progression as well as MRI progression. Quantitative cartilage volume change and presence of cartilage defects or bone marrow lesions are potential predictors of TKR. 
Conclusion MRI has inherent strengths and unique advantages in its ability to visualize multiple individual tissue pathologies relating to pain and also predict clinical outcome. The complex disease of OA which involves an array of tissue abnormalities is best imaged using this imaging tool. PMID:21396463
The Predictive Validity of the Tilburg Frailty Indicator: Disability, Health Care Utilization, and Quality of Life in a Population at Risk

ERIC Educational Resources Information Center

Gobbens, Robbert J. J.; van Assen, Marcel A. L. M.; Luijkx, Katrien G.; Schols, Jos M. G. A.

2012-01-01

Purpose: To assess the predictive validity of frailty and its domains (physical, psychological, and social), as measured by the Tilburg Frailty Indicator (TFI), for the adverse outcomes disability, health care utilization, and quality of life. Design and Methods: The predictive validity of the TFI was tested in a representative sample of 484…
Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models

ERIC Educational Resources Information Center

Shieh, Gwowen

2009-01-01

In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…
Validity of Measured Interest for Decided and Undecided Students.

ERIC Educational Resources Information Center

Bartling, Herbert C.; Hood, Albert B.

The usefulness of vocational interest measures has been questioned by those who have studied the predictive validity of expressed choice. The predictive validities of measured interest for decided and undecided students, expressed choice and measured interest, and expressed choice and measured interest when they are congruent and incongruent were…
A simplified approach to the pooled analysis of calibration of clinical prediction rules for systematic reviews of validation studies

PubMed Central

Dimitrov, Borislav D; Motterlini, Nicola; Fahey, Tom

2015-01-01

Objective Estimating calibration performance of clinical prediction rules (CPRs) in systematic reviews of validation studies is not possible when predicted values are neither published nor accessible or sufficient or no individual participant or patient data are available. Our aims were to describe a simplified approach for outcomes prediction and calibration assessment and evaluate its functionality and validity. Study design and methods: Methodological study of systematic reviews of validation studies of CPRs: a) ABCD2 rule for prediction of 7 day stroke; and b) CRB-65 rule for prediction of 30 day mortality. Predicted outcomes in a sample validation study were computed by CPR distribution patterns (“derivation model”). As confirmation, a logistic regression model (with derivation study coefficients) was applied to CPR-based dummy variables in the validation study. Meta-analysis of validation studies provided pooled estimates of “predicted:observed” risk ratios (RRs), 95% confidence intervals (CIs), and indexes of heterogeneity (I2) on forest plots (fixed and random effects models), with and without adjustment of intercepts. The above approach was also applied to the CRB-65 rule. Results Our simplified method, applied to ABCD2 rule in three risk strata (low, 0–3; intermediate, 4–5; high, 6–7 points), indicated that predictions are identical to those computed by univariate, CPR-based logistic regression model. Discrimination was good (c-statistics =0.61–0.82), however, calibration in some studies was low. In such cases with miscalibration, the under-prediction (RRs =0.73–0.91, 95% CIs 0.41–1.48) could be further corrected by intercept adjustment to account for incidence differences. An improvement of both heterogeneities and P-values (Hosmer-Lemeshow goodness-of-fit test) was observed. Better calibration and improved pooled RRs (0.90–1.06), with narrower 95% CIs (0.57–1.41) were achieved. 
Conclusion Our results have an immediate clinical implication in situations when predicted outcomes in CPR validation studies are lacking or deficient by describing how such predictions can be obtained by everyone using the derivation study alone, without any need for highly specialized knowledge or sophisticated statistics. PMID:25931829
Implementation and Initial Validation of the APS English Test [and] The APS English-Writing Test at Golden West College: Evidence for Predictive Validity.

ERIC Educational Resources Information Center

Isonio, Steven

In May 1991, Golden West College (California) conducted a validation study of the English portion of the Assessment and Placement Services for Community Colleges (APS), followed by a predictive validity study in July 1991. The initial study was designed to aid in the implementation of the new test at GWC by comparing data on APS use at other…
Beware of external validation! - A Comparative Study of Several Validation Techniques used in QSAR Modelling.

PubMed

Majumdar, Subhabrata; Basak, Subhash C

2018-04-26

Proper validation is an important aspect of QSAR modelling. External validation is one of the widely used validation methods in QSAR where the model is built on a subset of the data and validated on the rest of the samples. However, its effectiveness for datasets with a small number of samples but large number of predictors remains suspect. Calculating hundreds or thousands of molecular descriptors using currently available software has become the norm in QSAR research, owing to computational advances in the past few decades. Thus, for n chemical compounds and p descriptors calculated for each molecule, the typical chemometric dataset today has high value of p but small n (i.e. n < p). Motivated by the evidence of inadequacies of external validation in estimating the true predictive capability of a statistical model in recent literature, this paper performs an extensive and comparative study of this method with several other validation techniques. We compared four validation methods: leave-one-out, K-fold, external and multi-split validation, using statistical models built using the LASSO regression, which simultaneously performs variable selection and modelling. We used 300 simulated datasets and one real dataset of 95 congeneric amine mutagens for this evaluation. External validation metrics have high variation among different random splits of the data, hence are not recommended for predictive QSAR models. LOO has the overall best performance among all validation methods applied in our scenario. Results from external validation are too unstable for the datasets we analyzed. Based on our findings, we recommend using the LOO procedure for validating QSAR predictive models built on high-dimensional small-sample data. Copyright© Bentham Science Publishers.
Family-Based Benchmarking of Copy Number Variation Detection Software.

PubMed

Nutsua, Marcel Elie; Fischer, Annegret; Nebel, Almut; Hofmann, Sylvia; Schreiber, Stefan; Krawczak, Michael; Nothnagel, Michael

2015-01-01

The analysis of structural variants, in particular of copy-number variations (CNVs), has proven valuable in unraveling the genetic basis of human diseases. Hence, a large number of algorithms have been developed for the detection of CNVs in SNP array signal intensity data. Using the European and African HapMap trio data, we undertook a comparative evaluation of six commonly used CNV detection software tools, namely Affymetrix Power Tools (APT), QuantiSNP, PennCNV, GLAD, R-gada and VEGA, and assessed their level of pair-wise prediction concordance. The tool-specific CNV prediction accuracy was assessed in silico by way of intra-familial validation. Software tools differed greatly in terms of the number and length of the CNVs predicted as well as the number of markers included in a CNV. All software tools predicted substantially more deletions than duplications. Intra-familial validation revealed consistently low levels of prediction accuracy as measured by the proportion of validated CNVs (34-60%). Moreover, up to 20% of apparent family-based validations were found to be due to chance alone. Software using Hidden Markov models (HMM) showed a trend to predict fewer CNVs than segmentation-based algorithms albeit with greater validity. PennCNV yielded the highest prediction accuracy (60.9%). Finally, the pairwise concordance of CNV prediction was found to vary widely with the software tools involved. We recommend HMM-based software, in particular PennCNV, rather than segmentation-based algorithms when validity is the primary concern of CNV detection. QuantiSNP may be used as an additional tool to detect sets of CNVs not detectable by the other tools. Our study also reemphasizes the need for laboratory-based validation, such as qPCR, of CNVs predicted in silico.
A Public-Private Partnership Develops and Externally Validates a 30-Day Hospital Readmission Risk Prediction Model

PubMed Central

Choudhry, Shahid A.; Li, Jing; Davis, Darcy; Erdmann, Cole; Sikka, Rishi; Sutariya, Bharat

2013-01-01

Introduction: Preventing the occurrence of hospital readmissions is needed to improve quality of care and foster population health across the care continuum. Hospitals are being held accountable for improving transitions of care to avert unnecessary readmissions. Advocate Health Care in Chicago and Cerner (ACC) collaborated to develop all-cause, 30-day hospital readmission risk prediction models to identify patients that need interventional resources. Ideally, prediction models should encompass several qualities: they should have high predictive ability; use reliable and clinically relevant data; use vigorous performance metrics to assess the models; be validated in populations where they are applied; and be scalable in heterogeneous populations. However, a systematic review of prediction models for hospital readmission risk determined that most performed poorly (average C-statistic of 0.66) and efforts to improve their performance are needed for widespread usage. Methods: The ACC team incorporated electronic health record data, utilized a mixed-method approach to evaluate risk factors, and externally validated their prediction models for generalizability. Inclusion and exclusion criteria were applied on the patient cohort and then split for derivation and internal validation. Stepwise logistic regression was performed to develop two predictive models: one for admission and one for discharge. The prediction models were assessed for discrimination ability, calibration, overall performance, and then externally validated. Results: The ACC Admission and Discharge Models demonstrated modest discrimination ability during derivation, internal and external validation post-recalibration (C-statistic of 0.76 and 0.78, respectively), and reasonable model fit during external validation for utility in heterogeneous populations. Conclusions: The ACC Admission and Discharge Models embody the design qualities of ideal prediction models. 
The ACC plans to continue its partnership to further improve and develop valuable clinical models. PMID:24224068
Validated Questionnaire of Maternal Attitude and Knowledge for Predicting Caries Risk in Children: Epidemiological Study in North Jakarta, Indonesia.

PubMed

Laksmiastuti, Sri Ratna; Budiardjo, Sarworini Bagio; Sutadi, Heriandi

2017-06-01

Predicting caries risk in children can be done by identifying caries risk factors. It is an important measure which contributes to a better understanding of the cariogenic profile of the patient. Identification can be done by clinical examination and by answering a questionnaire. We conducted this study to validate a questionnaire for predicting caries risk in children. The study was conducted on 62 pairs of mothers and their children, aged between 3 and 5 years. The questionnaire consists of 10 questions concerning mothers' attitude and knowledge about oral health. The reliability and validity tests are based on Cronbach's alpha and the correlation coefficient value. All questions are reliable (Cronbach's alpha = 0.873) and valid (corrected item-total correlation >0.4). The five questions on mothers' attitude about oral health and the five questions on mothers' knowledge about oral health are reliable and valid for predicting caries risk in children.
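As an illustration of the reliability statistic reported in this abstract, the following minimal sketch computes Cronbach's alpha with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy scores are hypothetical; they are not data from the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: list of respondents, each a list of k item scores."""
    k = len(scores[0])
    items = list(zip(*scores))  # regroup as one tuple of scores per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical toy data: 4 respondents, 3 items that move in lockstep.
toy_scores = [
    [1, 2, 1],
    [2, 3, 2],
    [3, 4, 3],
    [4, 5, 4],
]
alpha = cronbach_alpha(toy_scores)
```

Because the three toy items rise and fall together, alpha reaches its upper bound of 1; items that disagree with one another pull the value down.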
Predictive validity of the Braden Scale, Norton Scale, and Waterlow Scale in the Czech Republic.

PubMed

Šateková, Lenka; Žiaková, Katarína; Zeleníková, Renáta

2017-02-01

The aim of this study was to determine the predictive validity of the Braden, Norton, and Waterlow scales in 2 long-term care departments in the Czech Republic. Assessing the risk for developing pressure ulcers is the first step in their prevention. At present, many scales are used in clinical practice, but most of them have not been properly validated yet (for example, the Modified Norton Scale in the Czech Republic). In the Czech Republic, only the Braden Scale has been validated so far. This is a prospective comparative instrument testing study. A random sample of 123 patients was recruited. The predictive validity of the pressure ulcer risk assessment scales was evaluated based on sensitivity, specificity, positive and negative predictive values, and the area under the receiver operating characteristic curve. The data were collected from April to August 2014. In the present study, the best predictive validity values were observed for the Norton Scale, followed by the Braden Scale and the Waterlow Scale, in that order. We recommended that the above 3 pressure ulcer risk assessment scales continue to be evaluated in the Czech clinical setting. © 2016 John Wiley & Sons Australia, Ltd.
A simple risk scoring system for prediction of relapse after inpatient alcohol treatment.

PubMed

Pedersen, Mads Uffe; Hesse, Morten

2009-01-01

Predicting relapse after alcoholism treatment can be useful in targeting patients for aftercare services. However, a valid and practical instrument for predicting relapse risk does not exist. Based on a prospective study of alcoholism treatment, we developed the Risk of Alcoholic Relapse Scale (RARS) using items taken from the Addiction Severity Index and some basic demographic information. The RARS was cross-validated using two non-overlapping samples, and tested for its ability to predict relapse across different models of treatment. The RARS predicted relapse to drinking within 6 months after alcoholism treatment in both the original and the validation sample, and in a second validation sample it predicted admission to new treatment 3 years after treatment. The RARS can identify patients at high risk of relapse who need extra aftercare and support after treatment.
Validation of a dye stain assay for vaginally inserted HEC-filled microbicide applicators

PubMed Central

Katzen, Lauren L.; Fernández-Romero, José A.; Sarna, Avina; Murugavel, Kailapuri G.; Gawarecki, Daniel; Zydowsky, Thomas M.; Mensch, Barbara S.

2011-01-01

Background The reliability and validity of self-reports of vaginal microbicide use are questionable given the explicit understanding that participants are expected to comply with study protocols. Our objective was to optimize the Population Council's previously validated dye stain assay (DSA) and related procedures, and establish predictive values for the DSA's ability to identify vaginally inserted single-use, low-density polyethylene microbicide applicators filled with hydroxyethylcellulose gel. Methods Applicators, inserted by 252 female sex workers enrolled in a microbicide feasibility study in Southern India, served as positive controls for optimization and validation experiments. Prior to validation, optimal dye concentration and staining time were ascertained. Three validation experiments were conducted to determine sensitivity, specificity, negative predictive values and positive predictive values. Results The dye concentration of 0.05% (w/v) FD&C Blue No. 1 Granular Food Dye and staining time of five seconds were determined to be optimal and were used for the three validation experiments. There were a total of 1,848 possible applicator readings across validation experiments; 1,703 (92.2%) applicator readings were correct. On average, the DSA performed with 90.6% sensitivity, 93.9% specificity, and had a negative predictive value of 93.8% and a positive predictive value of 91.0%. No statistically significant differences between experiments were noted. Conclusions The DSA was optimized and successfully validated for use with single-use, low-density polyethylene applicators filled with hydroxyethylcellulose (HEC) gel. We recommend including the DSA in future microbicide trials involving vaginal gels in order to identify participants who have low adherence to dosing regimens. In doing so, we can develop strategies to improve adherence as well as investigate the association between product use and efficacy. PMID:21992983
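The four measures named in this abstract all derive from a 2x2 table of true/false positives and negatives. A minimal sketch follows. The abstract gives only the overall counts (1,703 correct of 1,848 readings), so the per-cell counts passed to the hypothetical helper `diagnostic_metrics` are illustrative and not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return standard 2x2 diagnostic accuracy measures as fractions."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all actual positives
        "specificity": tn / (tn + fp),   # true negatives among all actual negatives
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Overall proportion of correct readings, using the counts in the abstract.
overall_correct = 1703 / 1848

# Hypothetical cell counts, for illustration only (not from the study).
example = diagnostic_metrics(tp=90, fp=10, fn=10, tn=90)
```

Rounding `overall_correct` to one decimal place as a percentage reproduces the 92.2% quoted in the abstract.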
Propeller aircraft interior noise model utilization study and validation

NASA Technical Reports Server (NTRS)

Pope, L. D.

1984-01-01

Utilization and validation of a computer program designed for aircraft interior noise prediction is considered. The program, entitled PAIN (an acronym for Propeller Aircraft Interior Noise), permits (in theory) predictions of sound levels inside propeller driven aircraft arising from sidewall transmission. The objective of the work reported was to determine the practicality of making predictions for various airplanes and the extent of the program's capabilities. The ultimate purpose was to discern the quality of predictions for tonal levels inside an aircraft occurring at the propeller blade passage frequency and its harmonics. The effort involved three tasks: (1) program validation through comparisons of predictions with scale-model test results; (2) development of utilization schemes for large (full scale) fuselages; and (3) validation through comparisons of predictions with measurements taken in flight tests on a turboprop aircraft. Findings should enable future users of the program to efficiently undertake and correctly interpret predictions.
Dynamic Assessment of Reading Difficulties: Predictive and Incremental Validity on Attitude toward Reading and the Use of Dialogue/Participation Strategies in Classroom Activities

PubMed Central

Navarro, Juan-José; Lara, Laura

2017-01-01

Dynamic Assessment (DA) has been shown to have more predictive value than conventional tests for academic performance. However, in relation to reading difficulties, further research is needed to determine the predictive validity of DA for specific aspects of the different processes involved in reading and the differential validity of DA for different subgroups of students with an academic disadvantage. This paper analyzes the implementation of a DA device that evaluates processes involved in reading (EDPL) among 60 students with reading comprehension difficulties between 9 and 16 years of age, of whom 20 have intellectual disabilities, 24 have reading-related learning disabilities, and 16 have socio-cultural disadvantages. We specifically analyze the predictive validity of the EDPL device over attitude toward reading, and the use of dialogue/participation strategies in reading activities in the classroom during the implementation stage. We also analyze if the EDPL device provides additional information to that obtained with a conventionally applied personal-social adjustment scale (APSL). Results showed that dynamic scores, obtained from the implementation of the EDPL device, significantly predict the studied variables. Moreover, dynamic scores showed a significant incremental validity in relation to predictions based on an APSL scale. In relation to differential validity, the results indicated the superior predictive validity for DA for students with intellectual disabilities and reading disabilities than for students with socio-cultural disadvantages. Furthermore, the role of metacognition and its relation to the processes of personal-social adjustment in explaining the results is discussed. PMID:28243215
Modern modeling techniques had limited external validity in predicting mortality from traumatic brain injury.

PubMed

van der Ploeg, Tjeerd; Nieboer, Daan; Steyerberg, Ewout W

2016-10-01

Prediction of medical outcomes may potentially benefit from using modern statistical modeling techniques. We aimed to externally validate modeling strategies for prediction of 6-month mortality of patients suffering from traumatic brain injury (TBI) with predictor sets of increasing complexity. We analyzed individual patient data from 15 different studies including 11,026 TBI patients. We consecutively considered a core set of predictors (age, motor score, and pupillary reactivity), an extended set with computed tomography scan characteristics, and a further extension with two laboratory measurements (glucose and hemoglobin). With each of these sets, we predicted 6-month mortality using default settings with five statistical modeling techniques: logistic regression (LR), classification and regression trees, random forests (RFs), support vector machines (SVM) and neural nets. For external validation, a model developed on one of the 15 data sets was applied to each of the 14 remaining sets. This process was repeated 15 times for a total of 630 validations. The area under the receiver operating characteristic curve (AUC) was used to assess the discriminative ability of the models. For the most complex predictor set, the LR models performed best (median validated AUC value, 0.757), followed by RF and support vector machine models (median validated AUC value, 0.735 and 0.732, respectively). With each predictor set, the classification and regression trees models showed poor performance (median validated AUC value, <0.7). The variability in performance across the studies was smallest for the RF- and LR-based models (inter quartile range for validated AUC values from 0.07 to 0.10). In the area of predicting mortality from TBI, nonlinear and nonadditive effects are not pronounced enough to make modern prediction methods beneficial. Copyright © 2016 Elsevier Inc. All rights reserved.
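To make the evaluation design concrete, here is a minimal sketch of the two ingredients the abstract describes: the AUC as a measure of discrimination, and the enumeration of development/validation pairings across 15 data sets. The function `auc` and the toy inputs are hypothetical. The sketch uses the rank interpretation of the AUC (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case), not the authors' code.

```python
from itertools import permutations

def auc(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Each of the 15 data sets is used once for development and validated on
# each of the other 14, giving ordered (develop, validate) pairs.
pairs = list(permutations(range(15), 2))

# Hypothetical predicted risks and observed outcomes (1 = died, 0 = survived).
toy_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

The enumeration yields 210 ordered pairs. The abstract's total of 630 would be consistent with counting each pairing once per predictor set (210 x 3), although the abstract does not spell out that breakdown.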
Taking the Next Step: Combining Incrementally Valid Indicators to Improve Recidivism Prediction

ERIC Educational Resources Information Center

Walters, Glenn D.

2011-01-01

The possibility of combining indicators to improve recidivism prediction was evaluated in a sample of released federal prisoners randomly divided into a derivation subsample (n = 550) and a cross-validation subsample (n = 551). Five incrementally valid indicators were selected from five domains: demographic (age), historical (prior convictions),…

Investigating Postgraduate College Admission Interviews: Generalizability Theory Reliability and Incremental Predictive Validity

ERIC Educational Resources Information Center

Arce-Ferrer, Alvaro J.; Castillo, Irene Borges

2007-01-01

The use of face-to-face interviews is controversial for college admissions decisions in light of the lack of availability of validity and reliability evidence for most college admission processes. This study investigated reliability and incremental predictive validity of a face-to-face postgraduate college admission interview with a sample of…
Further Validation of the Coach Identity Prominence Scale

ERIC Educational Resources Information Center

Pope, J. Paige; Hall, Craig R.

2014-01-01

This study was designed to examine select psychometric properties of the Coach Identity Prominence Scale (CIPS), including the reliability, factorial validity, convergent validity, discriminant validity, and predictive validity. Coaches (N = 338) who averaged 37 (SD = 12.27) years of age, had a mean of 13 (SD = 9.90) years of coaching experience,…
Review and evaluation of performance measures for survival prediction models in external validation settings.

PubMed

Rahman, M Shafiqur; Ambler, Gareth; Choodari-Oskooei, Babak; Omar, Rumana Z

2017-04-18

When developing a prediction model for survival data it is essential to validate its performance in external validation settings using appropriate performance measures. Although a number of such measures have been proposed, there is only limited guidance regarding their use in the context of model validation. This paper reviewed and evaluated a wide range of performance measures to provide some guidelines for their use in practice. An extensive simulation study based on two clinical datasets was conducted to investigate the performance of the measures in external validation settings. Measures were selected from categories that assess the overall performance, discrimination and calibration of a survival prediction model. Some of these have been modified to allow their use with validation data, and a case study is provided to describe how these measures can be estimated in practice. The measures were evaluated with respect to their robustness to censoring and ease of interpretation. All measures are implemented, or are straightforward to implement, in statistical software. Most of the performance measures were reasonably robust to moderate levels of censoring. One exception was Harrell's concordance measure which tended to increase as censoring increased. We recommend that Uno's concordance measure is used to quantify concordance when there are moderate levels of censoring. Alternatively, Gönen and Heller's measure could be considered, especially if censoring is very high, but we suggest that the prediction model is re-calibrated first. We also recommend that Royston's D is routinely reported to assess discrimination since it has an appealing interpretation. The calibration slope is useful for both internal and external validation settings and recommended to report routinely. Our recommendation would be to use any of the predictive accuracy measures and provide the corresponding predictive accuracy curves. 
In addition, we recommend to investigate the characteristics of the validation data such as the level of censoring and the distribution of the prognostic index derived in the validation setting before choosing the performance measures.
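For readers unfamiliar with concordance measures, the sketch below implements Harrell's concordance index, the measure this abstract reports as tending to increase with censoring. The function name `harrell_c` and the toy data are hypothetical. Pairs with tied observed times are simply skipped here, which is a simplifying assumption. Uno's measure additionally reweights pairs by the inverse probability of censoring and is not shown.

```python
def harrell_c(times, events, risk):
    """Harrell's concordance index.

    times:  observed follow-up times
    events: 1 if the event was observed, 0 if censored
    risk:   predicted risk score (higher = earlier event expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when subject i has an observed event
            # strictly before subject j's observed time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count half
    return concordant / comparable

# Hypothetical toy data: the third subject is censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
c_perfect = harrell_c(times, events, [0.9, 0.7, 0.5, 0.1])
c_mixed = harrell_c(times, events, [0.9, 0.1, 0.5, 0.7])
```

Only pairs anchored on an observed event enter the denominator, so the set of usable pairs changes with the censoring pattern. This is one intuition for why the measure is sensitive to the level of censoring.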
Validation of Groundwater Models: Meaningful or Meaningless?

NASA Astrophysics Data System (ADS)

Konikow, L. F.

2003-12-01

Although numerical simulation models are valuable tools for analyzing groundwater systems, their predictive accuracy is limited. People who apply groundwater flow or solute-transport models, as well as those who make decisions based on model results, naturally want assurance that a model is "valid." To many people, model validation implies some authentication of the truth or accuracy of the model. History matching is often presented as the basis for model validation. Although such model calibration is a necessary modeling step, it is simply insufficient for model validation. Because of parameter uncertainty and solution non-uniqueness, declarations of validation (or verification) of a model are not meaningful. Post-audits represent a useful means to assess the predictive accuracy of a site-specific model, but they require the existence of long-term monitoring data. Model testing may yield invalidation, but that is an opportunity to learn and to improve the conceptual and numerical models. Examples of post-audits and of the application of a solute-transport model to a radioactive waste disposal site illustrate deficiencies in model calibration, prediction, and validation.
Efficient strategies for leave-one-out cross validation for genomic best linear unbiased prediction.

PubMed

Cheng, Hao; Garrick, Dorian J; Fernando, Rohan L

2017-01-01

A random multiple-regression model that simultaneously fits all allele substitution effects for additive markers or haplotypes as uncorrelated random effects was proposed for Best Linear Unbiased Prediction, using whole-genome data. Leave-one-out cross validation can be used to quantify the predictive ability of a statistical model. Naive application of leave-one-out cross validation is computationally intensive because the training and validation analyses need to be repeated n times, once for each observation. Efficient leave-one-out cross validation strategies are presented here, requiring little more effort than a single analysis. The efficient strategies are 786 times faster than the naive application for a simulated dataset with 1,000 observations and 10,000 markers and 99 times faster with 1,000 observations and 100 markers. These efficiencies relative to the naive approach using the same model will increase with increases in the number of observations.
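The abstract does not give its formulas, so the following is only an analogous sketch of why leave-one-out errors can be obtained from a single fit. It uses the well-known leverage identity for ordinary least squares with one predictor (leave-one-out residual = ordinary residual / (1 - leverage)), not the genomic BLUP strategies of the paper. All function names and the toy data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b

def loo_naive(x, y):
    """Refit n times, each time leaving one observation out."""
    errors = []
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        errors.append(y[i] - (a + b * x[i]))
    return errors

def loo_shortcut(x, y):
    """Single fit; each LOO error is the residual divided by (1 - leverage)."""
    n = len(x)
    a, b = fit_line(x, y)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    errors = []
    for xi, yi in zip(x, y):
        leverage = 1 / n + (xi - mx) ** 2 / sxx
        errors.append((yi - (a + b * xi)) / (1 - leverage))
    return errors

# Hypothetical toy data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.8, 5.3]
naive_errors = loo_naive(x, y)
shortcut_errors = loo_shortcut(x, y)
```

Both functions return the same prediction errors up to floating-point rounding, but the shortcut fits the model once instead of n times, which is the kind of saving the abstract describes.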
Validity of Integrity Tests for Predicting Drug and Alcohol Abuse

DTIC Science & Technology

1993-08-31

Winkler and Sheridan (1989) found that employees who entered employee assistance programs for treating drug addiction were more likely to be absent... Final report, August 31, 1993; funding number N00014-92-J... This research used psychometric meta-analysis (Hunter & Schmidt, 1990b) to examine the validity of integrity tests for predicting drug and
Performance of genomic prediction within and across generations in maritime pine.

PubMed

Bartholomé, Jérôme; Van Heerwaarden, Joost; Isik, Fikret; Boury, Christophe; Vidal, Marjorie; Plomion, Christophe; Bouffier, Laurent

2016-08-11

Genomic selection (GS) is a promising approach for decreasing breeding cycle length in forest trees. Assessment of progeny performance and of the prediction accuracy of GS models over generations is therefore a key issue. A reference population of maritime pine (Pinus pinaster) with an estimated effective inbreeding population size (status number) of 25 was first selected with simulated data. This reference population (n = 818) covered three generations (G0, G1 and G2) and was genotyped with 4436 single-nucleotide polymorphism (SNP) markers. We evaluated the effects on prediction accuracy of both the relatedness between the calibration and validation sets and validation on the basis of progeny performance. Pedigree-based (best linear unbiased prediction, ABLUP) and marker-based (genomic BLUP and Bayesian LASSO) models were used to predict breeding values for three different traits: circumference, height and stem straightness. On average, the ABLUP model outperformed genomic prediction models, with a maximum difference in prediction accuracies of 0.12, depending on the trait and the validation method. A mean difference in prediction accuracy of 0.17 was found between validation methods differing in terms of relatedness. Including the progenitors in the calibration set reduced this difference in prediction accuracy to 0.03. When only genotypes from the G0 and G1 generations were used in the calibration set and genotypes from G2 were used in the validation set (progeny validation), prediction accuracies ranged from 0.70 to 0.85. This study suggests that the training of prediction models on parental populations can predict the genetic merit of the progeny with high accuracy: an encouraging result for the implementation of GS in the maritime pine breeding program.
Individualized prediction of perineural invasion in colorectal cancer: development and validation of a radiomics prediction model.

PubMed

Huang, Yanqi; He, Lan; Dong, Di; Yang, Caiyun; Liang, Cuishan; Chen, Xin; Ma, Zelan; Huang, Xiaomei; Yao, Su; Liang, Changhong; Tian, Jie; Liu, Zaiyi

2018-02-01

To develop and validate a radiomics prediction model for individualized prediction of perineural invasion (PNI) in colorectal cancer (CRC). After computed tomography (CT) radiomics features extraction, a radiomics signature was constructed in derivation cohort (346 CRC patients). A prediction model was developed to integrate the radiomics signature and clinical candidate predictors [age, sex, tumor location, and carcinoembryonic antigen (CEA) level]. Apparent prediction performance was assessed. After internal validation, independent temporal validation (separate from the cohort used to build the model) was then conducted in 217 CRC patients. The final model was converted to an easy-to-use nomogram. The developed radiomics nomogram that integrated the radiomics signature and CEA level showed good calibration and discrimination performance [Harrell's concordance index (c-index): 0.817; 95% confidence interval (95% CI): 0.811-0.823]. Application of the nomogram in validation cohort gave a comparable calibration and discrimination (c-index: 0.803; 95% CI: 0.794-0.812). Integrating the radiomics signature and CEA level into a radiomics prediction model enables easy and effective risk assessment of PNI in CRC. This stratification of patients according to their PNI status may provide a basis for individualized auxiliary treatment.
A systematic review of validated sinus surgery simulators.

PubMed

Stew, B; Kao, S S-T; Dharmawardana, N; Ooi, E H

2018-06-01

Simulation provides a safe and effective opportunity to develop surgical skills. A variety of endoscopic sinus surgery (ESS) simulators has been described in the literature. Validation of these simulators allows for effective utilisation in training. To conduct a systematic review of the published literature to analyse the evidence for validated ESS simulation. Pubmed, Embase, Cochrane and Cinahl were searched from inception of the databases to 11 January 2017. Twelve thousand five hundred and sixteen articles were retrieved of which 10 112 were screened following the removal of duplicates. Thirty-eight full-text articles were reviewed after meeting search criteria. Evidence of face, content, construct, discriminant and predictive validity was extracted. Twenty articles were included in the analysis describing 12 ESS simulators. Eleven of these simulators had undergone validation: 3 virtual reality, 7 physical bench models and 1 cadaveric simulator. Seven of the simulators were shown to have face validity, 7 had construct validity and 1 had predictive validity. None of the simulators demonstrated discriminant validity. This systematic review demonstrates that a number of ESS simulators have been comprehensively validated. Many of the validation processes, however, lack standardisation in outcome reporting, thus limiting a meta-analysis comparison between simulators. © 2017 John Wiley & Sons Ltd.
The Strengths and Difficulties Questionnaire: psychometric properties of the parent and teacher version in children aged 4-7.

PubMed

Stone, Lisanne L; Janssens, Jan M A M; Vermulst, Ad A; Van Der Maten, Marloes; Engels, Rutger C M E; Otten, Roy

2015-01-01

The Strengths and Difficulties Questionnaire is one of the most employed screening instruments. Although there is a large research body investigating its psychometric properties, reliability and validity are not yet fully tested using modern techniques. Therefore, we investigate reliability, construct validity, measurement invariance, and predictive validity of the parent and teacher version in children aged 4-7. Besides, we intend to replicate previous studies by investigating test-retest reliability and criterion validity. In a Dutch community sample 2,238 teachers and 1,513 parents filled out questionnaires regarding problem behaviors and parenting, while 1,831 children reported on sociometric measures at T1. These children were followed-up during three consecutive years. Reliability was examined using Cronbach's alpha and McDonald's omega, construct validity was examined by Confirmatory Factor Analysis, and predictive validity was examined by calculating developmental profiles and linking these to measures of inadequate parenting, parenting stress and social preference. Further, mean scores and percentiles were examined in order to establish norms. Omega was consistently higher than alpha regarding reliability. The original five-factor structure was replicated, and measurement invariance was established on a configural level. Further, higher SDQ scores were associated with future indices of higher inadequate parenting, higher parenting stress and lower social preference. Finally, previous results on test-retest reliability and criterion validity were replicated. This study is the first to show SDQ scores are predictively valid, attesting to the feasibility of the SDQ as a screening instrument. Future research into predictive validity of the SDQ is warranted.
Differential Predictive Validity of a Preschool Battery Across Race and Sex.

ERIC Educational Resources Information Center

Reynolds, Cecil R.

Determination of the fairness of preschool tests for use with children of varying cultural backgrounds is the major objective of this study. The predictive validity of a battery of preschool tests, chosen to represent the core areas of preschool assessment, across race and sex, was evaluated. Validity of the battery was examined over a 12-month…
Comparison of the Incremental Validity of the Old and New MCAT.

ERIC Educational Resources Information Center

Wolf, Fredric M.; And Others

The predictive and incremental validity of both the Old and New Medical College Admission Test (MCAT) was examined and compared with a sample of over 300 medical students. Results of zero order and incremental validity coefficients, as well as prediction models resulting from all possible subsets regression analyses using Mallow's Cp criterion,…
Validity of the Student Risk Screening Scale: Evidence of Predictive Validity in a Diverse, Suburban Elementary Setting

ERIC Educational Resources Information Center

Menzies, Holly M.; Lane, Kathleen Lynne

2012-01-01

In this study the authors examined the psychometric properties of the "Student Risk Screening Scale" (SRSS), including predictive validity in terms of student outcomes in behavioral and academic domains. The school, a diverse, suburban school in Southern California, administered the SRSS at three time points as part of regular school…
Genomic selection across multiple breeding cycles in applied bread wheat breeding.

PubMed

Michel, Sebastian; Ametz, Christian; Gungor, Huseyin; Epure, Doru; Grausgruber, Heinrich; Löschenberger, Franziska; Buerstmayr, Hermann

2016-06-01

We evaluated genomic selection across five breeding cycles of bread wheat breeding. Bias of within-cycle cross-validation and methods for improving the prediction accuracy were assessed. The prospect of genomic selection has been frequently shown by cross-validation studies using the same genetic material across multiple environments, but studies investigating genomic selection across multiple breeding cycles in applied bread wheat breeding are lacking. We estimated the prediction accuracy of grain yield, protein content and protein yield of 659 inbred lines across five independent breeding cycles and assessed the bias of within-cycle cross-validation. We investigated the influence of outliers on the prediction accuracy and predicted protein yield by its component traits. A high average heritability was estimated for protein content, followed by grain yield and protein yield. The bias of the prediction accuracy using populations from individual cycles using fivefold cross-validation was accordingly substantial for protein yield (17-712 %) and less pronounced for protein content (8-86 %). Cross-validation using the cycles as folds aimed to avoid this bias and reached a maximum prediction accuracy of [Formula: see text] = 0.51 for protein content, [Formula: see text] = 0.38 for grain yield and [Formula: see text] = 0.16 for protein yield. Dropping outlier cycles increased the prediction accuracy of grain yield to [Formula: see text] = 0.41 as estimated by cross-validation, while dropping outlier environments did not have a significant effect on the prediction accuracy. Independent validation suggests, on the other hand, that careful consideration is necessary before an outlier correction is undertaken, which removes lines from the training population. Predicting protein yield by multiplying genomic estimated breeding values of grain yield and protein content raised the prediction accuracy to [Formula: see text] = 0.19 for this derived trait.
Validation of the 4P's Plus screen for substance use in pregnancy validation of the 4P's Plus.

PubMed

Chasnoff, I J; Wells, A M; McGourty, R F; Bailey, L K

2007-12-01

The purpose of this study is to validate the 4P's Plus screen for substance use in pregnancy. A total of 228 pregnant women enrolled in prenatal care underwent screening with the 4P's Plus and received a follow-up clinical assessment for substance use. Statistical analyses regarding reliability, sensitivity, specificity, and positive and negative predictive validity of the 4P's Plus were conducted. The overall reliability for the five-item measure was 0.62. Seventy-four (32.5%) of the women had a positive screen. Sensitivity and specificity were very good, at 87 and 76%, respectively. Positive predictive validity was low (36%), but negative predictive validity was quite high (97%). Of the 31 women who had a positive clinical assessment, 45% were using less than 1 day per week. The 4P's Plus reliably and effectively screens pregnant women for risk of substance use, including those women typically missed by other perinatal screening methodologies.
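The four screening figures in this abstract all follow from a single 2x2 table, and a short worked example shows how they relate. The totals below (228 women screened, 74 positive screens, 31 positive clinical assessments) are taken from the abstract. The count of 27 true positives is an assumption, chosen because it is the whole number consistent with the reported 87% sensitivity; the abstract does not state it. The function name is hypothetical.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard accuracy measures for a screen, from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly flagged
        "specificity": tn / (tn + fp),  # negatives correctly cleared
        "ppv": tp / (tp + fp),          # flagged women who are truly positive
        "npv": tn / (tn + fn),          # cleared women who are truly negative
    }


total, screen_pos, assessed_pos = 228, 74, 31  # reported in the abstract
tp = 27                                        # ASSUMED, not reported
fp = screen_pos - tp                           # positive screen, negative assessment
fn = assessed_pos - tp                         # negative screen, positive assessment
tn = total - tp - fp - fn                      # negative on both

m = screening_metrics(tp, fp, fn, tn)
for name, value in m.items():
    print(f"{name}: {value:.1%}")
```

Under that assumption the computed values round to the reported 87%, 76%, 36% and 97%. The low positive predictive value alongside a high negative predictive value reflects the fact that only 31 of the 228 women had a positive clinical assessment.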
Validation of NASA Thermal Ice Protection Computer Codes. Part 1; Program Overview

NASA Technical Reports Server (NTRS)

Miller, Dean; Bond, Thomas; Sheldon, David; Wright, William; Langhals, Tammy; Al-Khalil, Kamel; Broughton, Howard

1996-01-01

The Icing Technology Branch at NASA Lewis has been involved in an effort to validate two thermal ice protection codes developed at the NASA Lewis Research Center: LEWICE/Thermal (electrothermal deicing & anti-icing) and ANTICE (hot-gas & electrothermal anti-icing). The Thermal Code Validation effort was designated as a priority during a 1994 'peer review' of the NASA Lewis Icing program, and was implemented as a cooperative effort with industry. During April 1996, the first of a series of experimental validation tests was conducted in the NASA Lewis Icing Research Tunnel (IRT). The purpose of the April 96 test was to validate the electrothermal predictive capabilities of both LEWICE/Thermal and ANTICE. A heavily instrumented test article was designed and fabricated for this test, with the capability of simulating electrothermal de-icing and anti-icing modes of operation. Thermal measurements were then obtained over a range of test conditions, for comparison with analytical predictions. This paper will present an overview of the test, including a detailed description of: (1) the validation process; (2) test article design; (3) test matrix development; and (4) test procedures. Selected experimental results will be presented for de-icing and anti-icing modes of operation. Finally, the status of the validation effort at this point will be summarized. Detailed comparisons between analytical predictions and experimental results are contained in the following two papers: 'Validation of NASA Thermal Ice Protection Computer Codes: Part 2- The Validation of LEWICE/Thermal' and 'Validation of NASA Thermal Ice Protection Computer Codes: Part 3-The Validation of ANTICE'
Screening for potential child maltreatment in parents of a newborn baby: The predictive validity of an Instrument for early identification of Parents At Risk for child Abuse and Neglect (IPARAN).

PubMed

van der Put, Claudia E; Bouwmeester-Landweer, Merian B R; Landsmeer-Beker, Eleonore A; Wit, Jan M; Dekker, Friedo W; Kousemaker, N Pieter J; Baartman, Herman E M

2017-08-01

For preventive purposes it is important to be able to identify families with a high risk of child maltreatment at an early stage. Therefore we developed an actuarial instrument for screening families with a newborn baby, the Instrument for identification of Parents At Risk for child Abuse and Neglect (IPARAN). The aim of this study was to assess the predictive validity of the IPARAN and to examine whether combining actuarial and clinical methods leads to an improvement of the predictive validity. We examined the predictive validity by calculating several performance indicators (i.e., sensitivity, specificity and the Area Under the receiver operating characteristic Curve [AUC]) in a sample of 4692 Dutch families with newborns. The outcome measure was a report of child maltreatment at Child Protection Services during a follow-up of 3 years. For 17 children (.4%) a report of maltreatment was registered. The predictive validity of the IPARAN was significantly better than chance (AUC=.700, 95% CI [.567-.832]), in contrast to a low value for clinical judgement of nurses of the Youth Health Care Centers (AUC=.591, 95% CI [.422-.759]). The combination of the IPARAN and clinical judgement resulted in the highest predictive validity (AUC=.720, 95% CI [.593-.847]), however, the difference between the methods did not reach statistical significance. The good predictive validity of the IPARAN in combination with clinical judgment of the nurse enables professionals to assess risks at an early stage and to make referrals to early intervention programs. Copyright © 2017 Elsevier Ltd. All rights reserved.
Validation and Inter-comparison Against Observations of GODAE Ocean View Ocean Prediction Systems

NASA Astrophysics Data System (ADS)

Xu, J.; Davidson, F. J. M.; Smith, G. C.; Lu, Y.; Hernandez, F.; Regnier, C.; Drevillon, M.; Ryan, A.; Martin, M.; Spindler, T. D.; Brassington, G. B.; Oke, P. R.

2016-02-01

For weather forecasts, validation of forecast performance is done at the end user level as well as by the meteorological forecast centers. In the development of Ocean Prediction Capacity, the same level of care for ocean forecast performance and validation is needed. Herein we present results from a validation against observations of 6 Global Ocean Forecast Systems under the GODAE OceanView International Collaboration Network. These systems include the Global Ocean Ice Forecast System (GIOPS) developed by the Government of Canada, two systems PSY3 and PSY4 from the French Mercator-Ocean Ocean Forecasting Group, the FOAM system from the UK Met Office, HYCOM-RTOFS from NOAA/NCEP/NWA of USA, and the Australian Bluelink-OceanMAPS system from the CSIRO, the Australian Meteorological Bureau and the Australian Navy. The observation data used in the comparison are sea surface temperature, sub-surface temperature, sub-surface salinity, sea level anomaly, and sea ice total concentration data. Results of the inter-comparison demonstrate forecast performance limits, strengths and weaknesses of each of the six systems. This work establishes validation protocols and routines by which all new prediction systems developed under the CONCEPTS Collaborative Network will be benchmarked prior to approval for operations. This includes anticipated delivery of CONCEPTS regional prediction systems over the next two years including a pan Canadian 1/12th degree resolution ice ocean prediction system and limited area 1/36th degree resolution prediction systems. The validation approach of comparing forecasts to observations at the time and location of the observation is called Class 4 metrics. It has been adopted by major international ocean prediction centers, and will be recommended to JCOMM-WMO as routine validation approach for operational oceanography worldwide.
Building and validating a prediction model for paediatric type 1 diabetes risk using next generation targeted sequencing of class II HLA genes.

PubMed

Zhao, Lue Ping; Carlsson, Annelie; Larsson, Helena Elding; Forsander, Gun; Ivarsson, Sten A; Kockum, Ingrid; Ludvigsson, Johnny; Marcus, Claude; Persson, Martina; Samuelsson, Ulf; Örtqvist, Eva; Pyo, Chul-Woo; Bolouri, Hamid; Zhao, Michael; Nelson, Wyatt C; Geraghty, Daniel E; Lernmark, Åke

2017-11-01

It is of interest to predict possible lifetime risk of type 1 diabetes (T1D) in young children for recruiting high-risk subjects into longitudinal studies of effective prevention strategies. Utilizing a case-control study in Sweden, we applied a recently developed next generation targeted sequencing technology to genotype class II genes and applied an object-oriented regression to build and validate a prediction model for T1D. In the training set, estimated risk scores were significantly different between patients and controls (P = 8.12 × 10^-92), and the area under the curve (AUC) from the receiver operating characteristic (ROC) analysis was 0.917. Using the validation data set, we validated the result with AUC of 0.886. Combining both training and validation data resulted in a predictive model with AUC of 0.903. Further, we performed a "biological validation" by correlating risk scores with 6 islet autoantibodies, and found that the risk score was significantly correlated with IA-2A (Z-score = 3.628, P < 0.001). When applying this prediction model to the Swedish population, where the lifetime T1D risk ranges from 0.5% to 2%, we anticipate identifying approximately 20 000 high-risk subjects after testing all newborns, and this calculation would identify approximately 80% of all patients expected to develop T1D in their lifetime. Through both empirical and biological validation, we have established a prediction model for estimating lifetime T1D risk, using class II HLA. This prediction model should prove useful for future investigations to identify high-risk subjects for prevention research in high-risk populations. Copyright © 2017 John Wiley & Sons, Ltd.
Incremental Validity of Biographical Data in the Prediction of En Route Air Traffic Control Specialist Technical Skills

DTIC Science & Technology

2012-07-01

Dana Broach, Civil Aerospace...Medical Institute, Federal Aviation Administration, Oklahoma City, OK 73125. Final Report DOT/FAA/AM-12/8, July 2012. Office of Aerospace Medicine...

Predictive Validity of the HKT-R Risk Assessment Tool: Two and 5-Year Violent Recidivism in a Nationwide Sample of Dutch Forensic Psychiatric Patients.

PubMed

Bogaerts, Stefan; Spreen, Marinus; Ter Horst, Paul; Gerlsma, Coby

2018-06-01

This study has examined the predictive validity of the Historical Clinical Future [Historisch Klinisch Toekomst] Revised risk assessment scheme in a cohort of 347 forensic psychiatric patients, who were discharged between 2004 and 2008 from any of 12 highly secure forensic centers in the Netherlands. Predictive validity was measured 2 and 5 years after release. Official reconviction data obtained from the Dutch Ministry of Security and Justice were used as outcome measures. Violent reoffending within 2 and 5 years after discharge was assessed. With regard to violent reoffending, results indicated that the predictive validity of the Historical domain was modest for 2 (area under the curve [AUC] = .75) and 5 (AUC = .74) years. The predictive validity of the Clinical domain was marginal for 2 (admission: AUC = .62; discharge: AUC = .63) and 5 (admission: AUC = .69; discharge: AUC = .62) years after release. The predictive validity of the Future domain was modest (AUC = .71) for 2 years and low for 5 (AUC = .58) years. The total score of the instrument was modest for 2 years (AUC = .78) and marginal for 5 (AUC = .68) years. Finally, the Final Risk Judgment was modest for 2 years (AUC = .78) and marginal for 5 (AUC = .63) years of time at risk. It is concluded that this risk assessment instrument appears to be a satisfactory instrument for risk assessment.
Predictive value and construct validity of the work functioning screener-healthcare (WFS-H).

PubMed

Boezeman, Edwin J; Nieuwenhuijsen, Karen; Sluiter, Judith K

2016-05-25

To test the predictive value and convergent construct validity of a 6-item work functioning screener (WFS-H). Healthcare workers (249 nurses) completed a questionnaire containing the work functioning screener (WFS-H) and a work functioning instrument (NWFQ) measuring the following: cognitive aspects of task execution and general incidents, avoidance behavior, conflicts and irritation with colleagues, impaired contact with patients and their family, and level of energy and motivation. Productivity and mental health were also measured. Negative and positive predictive values, AUC values, and sensitivity and specificity were calculated to examine the predictive value of the screener. Correlation analysis was used to examine the construct validity. The screener had good predictive value, since the results showed that a negative screener score is a strong indicator of work functioning not hindered by mental health problems (negative predictive values: 94%-98%; positive predictive values: 21%-36%; AUC: .64-.82; sensitivity: 42%-76%; and specificity: 85%-87%). The screener has good construct validity due to moderate, but significant (p<.001), associations with productivity (r=.51), mental health (r=.48), and distress (r=.47). The screener (WFS-H) had good predictive value and good construct validity. Its score offers occupational health professionals a helpful preliminary insight into the work functioning of healthcare workers.
Experimental and statistical post-validation of positive example EST sequences carrying peroxisome targeting signals type 1 (PTS1)

PubMed Central

Lingner, Thomas; Kataya, Amr R. A.; Reumann, Sigrun

2012-01-01

We recently developed the first algorithms specifically for plants to predict proteins carrying peroxisome targeting signals type 1 (PTS1) from genome sequences. As validated experimentally, the prediction methods are able to correctly predict unknown peroxisomal Arabidopsis proteins and to infer novel PTS1 tripeptides. The high prediction performance is primarily determined by the large number and sequence diversity of the underlying positive example sequences, which mainly derived from EST databases. However, a few constructs remained cytosolic in experimental validation studies, indicating sequencing errors in some ESTs. To identify erroneous sequences, we validated subcellular targeting of additional positive example sequences in the present study. Moreover, we analyzed the distribution of prediction scores separately for each orthologous group of PTS1 proteins, which generally resembled normal distributions with group-specific mean values. The cytosolic sequences commonly represented outliers of low prediction scores and were located at the very tail of a fitted normal distribution. Three statistical methods for identifying outliers were compared in terms of sensitivity and specificity. Their combined application allows elimination of erroneous ESTs from positive example data sets. This new post-validation method will further improve the prediction accuracy of both PTS1 and PTS2 protein prediction models for plants, fungi, and mammals. PMID:22415050
Experimental and statistical post-validation of positive example EST sequences carrying peroxisome targeting signals type 1 (PTS1).

PubMed

Lingner, Thomas; Kataya, Amr R A; Reumann, Sigrun

2012-02-01

We recently developed the first algorithms specifically for plants to predict proteins carrying peroxisome targeting signals type 1 (PTS1) from genome sequences. As validated experimentally, the prediction methods are able to correctly predict unknown peroxisomal Arabidopsis proteins and to infer novel PTS1 tripeptides. The high prediction performance is primarily determined by the large number and sequence diversity of the underlying positive example sequences, which mainly derived from EST databases. However, a few constructs remained cytosolic in experimental validation studies, indicating sequencing errors in some ESTs. To identify erroneous sequences, we validated subcellular targeting of additional positive example sequences in the present study. Moreover, we analyzed the distribution of prediction scores separately for each orthologous group of PTS1 proteins, which generally resembled normal distributions with group-specific mean values. The cytosolic sequences commonly represented outliers of low prediction scores and were located at the very tail of a fitted normal distribution. Three statistical methods for identifying outliers were compared in terms of sensitivity and specificity. Their combined application allows elimination of erroneous ESTs from positive example data sets. This new post-validation method will further improve the prediction accuracy of both PTS1 and PTS2 protein prediction models for plants, fungi, and mammals.
Temporal and external validation of a prediction model for adverse outcomes among inpatients with diabetes.

PubMed

Adderley, N J; Mallett, S; Marshall, T; Ghosh, S; Rayman, G; Bellary, S; Coleman, J; Akiboye, F; Toulis, K A; Nirantharakumar, K

2018-06-01

To temporally and externally validate our previously developed prediction model, which used data from University Hospitals Birmingham to identify inpatients with diabetes at high risk of adverse outcome (mortality or excessive length of stay), in order to demonstrate its applicability to other hospital populations within the UK. Temporal validation was performed using data from University Hospitals Birmingham and external validation was performed using data from both the Heart of England NHS Foundation Trust and Ipswich Hospital. All adult inpatients with diabetes were included. Variables included in the model were age, gender, ethnicity, admission type, intensive therapy unit admission, insulin therapy, albumin, sodium, potassium, haemoglobin, C-reactive protein, estimated GFR and neutrophil count. Adverse outcome was defined as excessive length of stay or death. Model discrimination in the temporal and external validation datasets was good. In temporal validation using data from University Hospitals Birmingham, the area under the curve was 0.797 (95% CI 0.785-0.810), sensitivity was 70% (95% CI 67-72) and specificity was 75% (95% CI 74-76). In external validation using data from Heart of England NHS Foundation Trust, the area under the curve was 0.758 (95% CI 0.747-0.768), sensitivity was 73% (95% CI 71-74) and specificity was 66% (95% CI 65-67). In external validation using data from Ipswich, the area under the curve was 0.736 (95% CI 0.711-0.761), sensitivity was 63% (95% CI 59-68) and specificity was 69% (95% CI 67-72). These results were similar to those for the internally validated model derived from University Hospitals Birmingham. The prediction model to identify patients with diabetes at high risk of developing an adverse event while in hospital performed well in temporal and external validation. The externally validated prediction model is a novel tool that can be used to improve care pathways for inpatients with diabetes. Further research to assess clinical utility is needed. © 2018 Diabetes UK.
External validation of the Cairns Prediction Model (CPM) to predict conversion from laparoscopic to open cholecystectomy.

PubMed

Hu, Alan Shiun Yew; Donohue, Peter O'; Gunnarsson, Ronny K; de Costa, Alan

2018-03-14

Valid and user-friendly prediction models for conversion to open cholecystectomy allow for proper planning prior to surgery. The Cairns Prediction Model (CPM) has been in use clinically in the original study site for the past three years, but has not been tested at other sites. A retrospective, single-centred study collected ultrasonic measurements and clinical variables alongside conversion status from consecutive patients who underwent laparoscopic cholecystectomy from 2013 to 2016 in The Townsville Hospital, North Queensland, Australia. An area under the curve (AUC) was calculated to externally validate the CPM. Conversion was necessary in 43 (4.2%) out of 1035 patients. External validation showed an area under the curve of 0.87 (95% CI 0.82-0.93, p = 1.1 × 10^-14). In comparison with most previously published models, which have an AUC of approximately 0.80 or less, the CPM has the highest AUC of all published prediction models both for internal and external validation. Crown Copyright © 2018. Published by Elsevier Inc. All rights reserved.
Validating proposed migration equation and parameters' values as a tool to reproduce and predict 137Cs vertical migration activity in Spanish soils.

PubMed

Olondo, C; Legarda, F; Herranz, M; Idoeta, R

2017-04-01

This paper shows the procedure performed to validate the migration equation and the migration parameters' values presented in a previous paper (Legarda et al., 2011) regarding the migration of 137Cs in Spanish mainland soils. In this paper, this model validation has been carried out by checking experimentally obtained activity concentration values against those predicted by the model. These experimental data come from the measured vertical activity profiles of 8 new sampling points located in northern Spain. Before testing predicted values of the model, the uncertainty of those values has been assessed with the appropriate uncertainty analysis. Once the uncertainty of the model was established, the experimental and model-predicted activity concentration values were compared. Model validation has been performed by analyzing its accuracy, studying it as a whole and also at different depth intervals. As a result, this model has been validated as a tool to predict 137Cs behaviour in a Mediterranean environment. Copyright © 2017 Elsevier Ltd. All rights reserved.
Responsiveness and predictive validity of the tablet-based symbol digit modalities test in patients with stroke.

PubMed

Hsiao, Pei-Chi; Yu, Wan-Hui; Lee, Shih-Chieh; Chen, Mei-Hsiang; Hsieh, Ching-Lin

2018-06-14

The responsiveness and predictive validity of the Tablet-based Symbol Digit Modalities Test (T-SDMT) are unknown, which limits the utility of the T-SDMT in both clinical and research settings. The purpose of this study was to examine the responsiveness and predictive validity of the T-SDMT in inpatients with stroke. A follow-up, repeated-assessments design. One rehabilitation unit at a local medical center. A total of 50 inpatients receiving rehabilitation completed T-SDMT assessments at admission to and discharge from a rehabilitation ward. The median follow-up period was 14 days. The Barthel index (BI) was assessed at discharge and was used as the criterion of the predictive validity. The mean changes in the T-SDMT scores between admission and discharge were statistically significant (paired t-test = 3.46, p = 0.001). The T-SDMT scores showed a nearly moderate standardized response mean (0.49). A moderate association (Pearson's r = 0.47) was found between the scores of the T-SDMT at admission and those of the BI at discharge, indicating good predictive validity of the T-SDMT. Our results support the responsiveness and predictive validity of the T-SDMT in patients with stroke receiving rehabilitation in hospitals. This study provides empirical evidence supporting the use of the T-SDMT as an outcome measure for assessing processing speed in inpatients with stroke. The scores of the T-SDMT could be used to predict basic activities of daily living function in inpatients with stroke.
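The standardized response mean (SRM) in the abstract above is conventionally the mean change divided by the standard deviation of the change, and the paired t statistic equals the SRM multiplied by the square root of the sample size. The sketch below illustrates that relation under the assumption that the study used the conventional definitions; the admission and discharge scores are hypothetical, and only the figures 0.49 and n = 50 come from the abstract.

```python
import math
import statistics


def standardized_response_mean(before, after):
    """Mean of the paired changes divided by their sample standard deviation."""
    changes = [a - b for b, a in zip(before, after)]
    return statistics.mean(changes) / statistics.stdev(changes)


def paired_t(before, after):
    """Paired t statistic, written as SRM * sqrt(n)."""
    return standardized_response_mean(before, after) * math.sqrt(len(before))


# Hypothetical scores for six patients (illustration only).
admission = [20, 25, 31, 18, 27, 22]
discharge = [24, 26, 30, 23, 31, 25]

srm = standardized_response_mean(admission, discharge)
t_stat = paired_t(admission, discharge)

# Applying the same relation to the reported SRM and sample size.
t_from_abstract = 0.49 * math.sqrt(50)
print(round(t_from_abstract, 2))
```

Under these assumptions the reported SRM of 0.49 with 50 patients implies a paired t of about 3.46, which agrees with the value given in the abstract.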
Incremental Validity of WISC-IV[superscript UK] Factor Index Scores with a Referred Irish Sample: Predicting Performance on the WIAT-II[superscript UK]

ERIC Educational Resources Information Center

Canivez, Gary L.; Watkins, Marley W.; James, Trevor; Good, Rebecca; James, Kate

2014-01-01

Background: Subtest and factor scores have typically provided little incremental predictive validity beyond the omnibus IQ score. Aims: This study examined the incremental validity of Wechsler Intelligence Scale for Children-Fourth UK Edition (WISC-IV[superscript UK]; Wechsler, 2004a, "Wechsler Intelligence Scale for Children-Fourth UK…
Predictive Validity of Measures of the Pathfinder Scaling Algorithm on Programming Performance: Alternative Assessment Strategy for Programming Education

ERIC Educational Resources Information Center

Lau, Wilfred W. F.; Yuen, Allan H. K.

2009-01-01

Recent years have seen a shift in focus from assessment of learning to assessment for learning and the emergence of alternative assessment methods. However, the reliability and validity of these methods as assessment tools are still questionable. In this article, we investigated the predictive validity of measures of the Pathfinder Scaling…
Validity of the SAT® for Predicting First-Year Grades: 2009 SAT Validity Sample. Statistical Report No. 2012-2

ERIC Educational Resources Information Center

Patterson, Brian F.; Mattern, Krista D.

2009-01-01

In an effort to continuously monitor the validity of the SAT for predicting first-year college grades, the College Board has continued its multi-year effort to recruit four-year colleges and universities (henceforth, "institutions") to provide data on the cohorts of first-time, first-year students entering in the fall semester beginning…
Effectively Coping With Task Stress: A Study of the Validity of the Trait Emotional Intelligence Questionnaire-Short Form (TEIQue-SF).

PubMed

O'Connor, Peter; Nguyen, Jessica; Anglim, Jeromy

2017-01-01

In this study, we investigated the validity of the Trait Emotional Intelligence Questionnaire-Short Form (TEIQue-SF; Petrides, 2009) in the context of task-induced stress. We used a total sample of 225 volunteers to investigate (a) the incremental validity of the TEIQue-SF over other predictors of coping with task-induced stress, and (b) the construct validity of the TEIQue-SF by examining the mechanisms via which scores from the TEIQue-SF predict coping outcomes. Results demonstrated that the TEIQue-SF possessed incremental validity over the Big Five personality traits in the prediction of emotion-focused coping. Results also provided support for the construct validity of the TEIQue-SF by demonstrating that this measure predicted adaptive coping via emotion-focused channels. Specifically, results showed that, following a task stressor, the TEIQue-SF predicted low negative affect and high task performance via high levels of emotion-focused coping. Consistent with the purported theoretical nature of the trait emotional intelligence (EI) construct, trait EI as assessed by the TEIQue-SF primarily enhances affect and performance in stressful situations by regulating negative emotions.
External validation of the diffuse intrinsic pontine glioma survival prediction model: a collaborative report from the International DIPG Registry and the SIOPE DIPG Registry.

PubMed

Veldhuijzen van Zanten, Sophie E M; Lane, Adam; Heymans, Martijn W; Baugh, Joshua; Chaney, Brooklyn; Hoffman, Lindsey M; Doughman, Renee; Jansen, Marc H A; Sanchez, Esther; Vandertop, William P; Kaspers, Gertjan J L; van Vuurden, Dannis G; Fouladi, Maryam; Jones, Blaise V; Leach, James

2017-08-01

We aimed to perform external validation of the recently developed survival prediction model for diffuse intrinsic pontine glioma (DIPG), and discuss its utility. The DIPG survival prediction model was developed in a cohort of patients from the Netherlands, United Kingdom and Germany, registered in the SIOPE DIPG Registry, and includes age <3 years, longer symptom duration and receipt of chemotherapy as favorable predictors, and presence of ring-enhancement on MRI as unfavorable predictor. Model performance was evaluated by analyzing the discrimination and calibration abilities. External validation was performed using an unselected cohort from the International DIPG Registry, including patients from United States, Canada, Australia and New Zealand. Basic comparison with the results of the original study was performed using descriptive statistics, and univariate- and multivariable regression analyses in the validation cohort. External validation was assessed following a variety of analyses described previously. Baseline patient characteristics and results from the regression analyses were largely comparable. Kaplan-Meier curves of the validation cohort reproduced separated groups of standard (n = 39), intermediate (n = 125), and high-risk (n = 78) patients. This discriminative ability was confirmed by similar values for the hazard ratios across these risk groups. The calibration curve in the validation cohort showed a symmetric underestimation of the predicted survival probabilities. In this external validation study, we demonstrate that the DIPG survival prediction model has acceptable cross-cohort calibration and is able to discriminate patients with short, average, and increased survival. We discuss how this clinico-radiological model may serve a useful role in current clinical practice.
Examining construct and predictive validity of the Health-IT Usability Evaluation Scale: confirmatory factor analysis and structural equation modeling results

PubMed Central

Yen, Po-Yin; Sousa, Karen H; Bakken, Suzanne

2014-01-01

Background: In a previous study, we developed the Health Information Technology Usability Evaluation Scale (Health-ITUES), which is designed to support customization at the item level. Such customization matches the specific tasks/expectations of a health IT system while retaining comparability at the construct level, and provides evidence of its factorial validity and internal consistency reliability through exploratory factor analysis. Objective: In this study, we advanced the development of Health-ITUES to examine its construct validity and predictive validity. Methods: The health IT system studied was a web-based communication system that supported nurse staffing and scheduling. Using Health-ITUES, we conducted a cross-sectional study to evaluate users’ perception toward the web-based communication system after system implementation. We examined Health-ITUES's construct validity through first and second order confirmatory factor analysis (CFA), and its predictive validity via structural equation modeling (SEM). Results: The sample comprised 541 staff nurses in two healthcare organizations. The CFA (n=165) showed that a general usability factor accounted for 78.1%, 93.4%, 51.0%, and 39.9% of the explained variance in ‘Quality of Work Life’, ‘Perceived Usefulness’, ‘Perceived Ease of Use’, and ‘User Control’, respectively. The SEM (n=541) supported the predictive validity of Health-ITUES, explaining 64% of the variance in intention for system use. Conclusions: The results of CFA and SEM provide additional evidence for the construct and predictive validity of Health-ITUES. The customizability of Health-ITUES has the potential to support comparisons at the construct level, while allowing variation at the item level. We also illustrate application of Health-ITUES across stages of system development. PMID:24567081
Testing the Predictive Validity of the Hendrich II Fall Risk Model.

PubMed

Jung, Hyesil; Park, Hyeoun-Ae

2018-03-01

Cumulative data on patient fall risk have been compiled in electronic medical records systems, and it is possible to test the validity of fall-risk assessment tools using these data between the times of admission and occurrence of a fall. The Hendrich II Fall Risk Model scores assessed at three time points during hospital stays were extracted and used for testing the predictive validity: (a) upon admission, (b) when the fall-risk score reached its maximum between admission and falling or discharge, and (c) immediately before falling or discharge. Predictive validity was examined using seven predictive indicators. In addition, logistic regression analysis was used to identify factors that significantly affect the occurrence of a fall. Among the different time points, the maximum fall-risk score assessed between admission and falling or discharge showed the best predictive performance. Confusion or disorientation and having a poor ability to rise from a sitting position were significant risk factors for a fall.
Assessing the validity of sales self-efficacy: a cautionary tale.

PubMed

Gupta, Nina; Ganster, Daniel C; Kepes, Sven

2013-07-01

We developed a focused, context-specific measure of sales self-efficacy and assessed its incremental validity against the broad Big 5 personality traits with department store salespersons, using (a) both a concurrent and a predictive design and (b) both objective sales measures and supervisory ratings of performance. We found that in the concurrent study, sales self-efficacy predicted objective and subjective measures of job performance more than did the Big 5 measures. Significant differences between the predictability of subjective and objective measures of performance were not observed. Predictive validity coefficients were generally lower than concurrent validity coefficients. The results suggest that there are different dynamics operating in concurrent and predictive designs and between broad and contextualized measures; they highlight the importance of distinguishing between these designs and measures in meta-analyses. The results also point to the value of focused, context-specific personality predictors in selection research. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Project on the Good Physician: Further Evidence for the Validity of a Moral Intuitionist Model of Virtuous Caring.

PubMed

Leffel, G Michael; Oakes Mueller, Ross A; Ham, Sandra A; Karches, Kyle E; Curlin, Farr A; Yoon, John D

2018-01-19

In the Project on the Good Physician, the authors propose a moral intuitionist model of virtuous caring that places the virtues of Mindfulness, Empathic Compassion, and Generosity at the heart of medical character education. Hypothesis 1a: The virtues of Mindfulness, Empathic Compassion, and Generosity will be positively associated with one another (convergent validity). Hypothesis 1b: The virtues of Mindfulness and Empathic Compassion will explain variance in the action-related virtue of Generosity beyond that predicted by Big Five personality traits alone (discriminant validity). Hypothesis 1c: Virtuous students will experience greater well-being ("flourishing"), as measured by four indices of well-being: life meaning, life satisfaction, vocational identity, and vocational calling (predictive validity). Hypothesis 1d: Students who self-report higher levels of the virtues will be nominated by their peers for the Gold Humanism Award (predictive validity). Hypothesis 2a-2c: Neuroticism and Burnout will be positively associated with each other and inversely associated with measures of virtue and well-being. The authors used data from a 2011 nationally representative sample of U.S. medical students (n = 499) in which medical virtues (Mindfulness, Empathic Compassion, and Generosity) were measured using scales adapted from existing instruments with validity evidence. Supporting the predictive validity of the model, virtuous students were recognized by their peers to be exemplary doctors, and they were more likely to have higher ratings on measures of student well-being. Supporting the discriminant validity of the model, virtues predicted prosocial behavior (Generosity) more than personality traits alone, and students higher in the virtue of Mindfulness were less likely to be high in Neuroticism and Burnout. Data from this descriptive-correlational study offered additional support for the validity of the moral intuitionist model of virtuous caring. 
Applied to medical character education, medical school programs should consider designing educational experiences that intentionally emphasize the cultivation of virtue.
Prediction and validation of residual feed intake and dry matter intake in Danish lactating dairy cows using mid-infrared spectroscopy of milk.

PubMed

Shetty, N; Løvendahl, P; Lund, M S; Buitenhuis, A J

2017-01-01

The present study explored the effectiveness of Fourier transform mid-infrared (FT-IR) spectral profiles as a predictor for dry matter intake (DMI) and residual feed intake (RFI). The partial least squares regression method was used to develop the prediction models. The models were validated using different external test sets, one randomly leaving out 20% of the records (validation A), the second randomly leaving out 20% of cows (validation B), and a third (for DMI prediction models) randomly leaving out one cow (validation C). The data included 1,044 records from 140 cows; 97 were Danish Holstein and 43 Danish Jersey. Results showed better accuracies for validation A compared with other validation methods. Milk yield (MY) contributed largely to DMI prediction; MY explained 59% of the variation and the validated model error root mean square error of prediction (RMSEP) was 2.24kg. The model was improved by adding live weight (LW) as an additional predictor trait, where the accuracy R 2 increased from 0.59 to 0.72 and error RMSEP decreased from 2.24 to 1.83kg. When only the milk FT-IR spectral profile was used in DMI prediction, a lower prediction ability was obtained, with R 2 =0.30 and RMSEP=2.91kg. However, once the spectral information was added, along with MY and LW as predictors, model accuracy improved and R 2 increased to 0.81 and RMSEP decreased to 1.49kg. Prediction accuracies of RFI changed throughout lactation. The RFI prediction model for the early-lactation stage was better compared with across lactation or mid- and late-lactation stages, with R 2 =0.46 and RMSEP=1.70. The most important spectral wavenumbers that contributed to DMI and RFI prediction models included fat, protein, and lactose peaks. Comparable prediction results were obtained when using infrared-predicted fat, protein, and lactose instead of full spectra, indicating that FT-IR spectral data do not add significant new information to improve DMI and RFI prediction models. 
Therefore, in practice, if full FT-IR spectral data are not stored, it is possible to achieve similar DMI or RFI prediction results based on standard milk control data. For DMI, the milk fat region was responsible for the major variation in milk spectra; for RFI, the major variation in milk spectra was within the milk protein region. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
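The two accuracy figures quoted throughout this abstract, RMSEP and R 2, can be computed from observed and predicted values as sketched below. This is an illustrative sketch only: the abstract does not state which definition of R 2 was used, so the coefficient-of-determination form shown here is an assumption, and the four intake values are hypothetical.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction over a validation set."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical dry matter intake values in kg (illustration only).
observed_dmi = [18.0, 20.0, 22.0, 24.0]
predicted_dmi = [19.0, 19.0, 23.0, 23.0]

error = rmsep(observed_dmi, predicted_dmi)
fit = r_squared(observed_dmi, predicted_dmi)
print(error, round(fit, 2))
```

RMSEP is expressed in the units of the trait (kg for DMI), which is why the abstract reports it alongside the unitless R 2.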
Derivation and validation of in-hospital mortality prediction models in ischaemic stroke patients using administrative data.

PubMed

Lee, Jason; Morishima, Toshitaka; Kunisawa, Susumu; Sasaki, Noriko; Otsubo, Tetsuya; Ikai, Hiroshi; Imanaka, Yuichi

2013-01-01

Stroke and other cerebrovascular diseases are a major cause of death and disability. Predicting in-hospital mortality in ischaemic stroke patients can help to identify high-risk patients and guide treatment approaches. Chart reviews provide important clinical information for mortality prediction, but are laborious and limiting in sample sizes. Administrative data allow for large-scale multi-institutional analyses but lack the necessary clinical information for outcome research. However, administrative claims data in Japan has seen the recent inclusion of patient consciousness and disability information, which may allow more accurate mortality prediction using administrative data alone. The aim of this study was to derive and validate models to predict in-hospital mortality in patients admitted for ischaemic stroke using administrative data. The sample consisted of 21,445 patients from 176 Japanese hospitals, who were randomly divided into derivation and validation subgroups. Multivariable logistic regression models were developed using 7- and 30-day and overall in-hospital mortality as dependent variables. Independent variables included patient age, sex, comorbidities upon admission, Japan Coma Scale (JCS) score, Barthel Index score, modified Rankin Scale (mRS) score, and admissions after hours and on weekends/public holidays. Models were developed in the derivation subgroup, and coefficients from these models were applied to the validation subgroup. Predictive ability was analysed using C-statistics; calibration was evaluated with Hosmer-Lemeshow χ(2) tests. All three models showed predictive abilities similar or surpassing that of chart review-based models. The C-statistics were highest in the 7-day in-hospital mortality prediction model, at 0.906 and 0.901 in the derivation and validation subgroups, respectively. 
For the 30-day in-hospital mortality prediction models, the C-statistics for the derivation and validation subgroups were 0.893 and 0.872, respectively; in overall in-hospital mortality prediction these values were 0.883 and 0.876. In this study, we have derived and validated in-hospital mortality prediction models for three different time spans using a large population of ischaemic stroke patients in a multi-institutional analysis. The recent inclusion of JCS, Barthel Index, and mRS scores in Japanese administrative data has allowed the prediction of in-hospital mortality with accuracy comparable to that of chart review analyses. The models developed using administrative data had consistently high predictive abilities for all models in both the derivation and validation subgroups. These results have implications in the role of administrative data in future mortality prediction analyses. Copyright © 2013 S. Karger AG, Basel.
Automated Pressure Injury Risk Assessment System Incorporated Into an Electronic Health Record System.

PubMed

Jin, Yinji; Jin, Taixian; Lee, Sun-Mi

Pressure injury risk assessment is the first step toward preventing pressure injuries, but traditional assessment tools are time-consuming, resulting in work overload and fatigue for nurses. The objectives of the study were to build an automated pressure injury risk assessment system (Auto-PIRAS) that can assess pressure injury risk using data, without requiring nurses to collect or input additional data, and to evaluate the validity of this assessment tool. A retrospective case-control study and a system development study were conducted in a 1,355-bed university hospital in Seoul, South Korea. A total of 1,305 pressure injury patients and 5,220 nonpressure injury patients participated for the development of a risk scoring algorithm: 687 and 2,748 for the validation of the algorithm and 237 and 994 for validation after clinical implementation, respectively. A total of 4,211 pressure injury-related clinical variables were extracted from the electronic health record (EHR) systems to develop a risk scoring algorithm, which was validated and incorporated into the EHR. That program was further evaluated for predictive and concurrent validity. Auto-PIRAS, incorporated into the EHR system, assigned a risk assessment score of high, moderate, or low and displayed this on the Kardex nursing record screen. Risk scores were updated nightly according to 10 predetermined risk factors. The predictive validity measures of the algorithm validation stage were as follows: sensitivity = .87, specificity = .90, positive predictive value = .68, negative predictive value = .97, Youden index = .77, and the area under the receiver operating characteristic curve = .95. The predictive validity measures of the Braden Scale were as follows: sensitivity = .77, specificity = .93, positive predictive value = .72, negative predictive value = .95, Youden index = .70, and the area under the receiver operating characteristic curve = .85. 
The kappa of the Auto-PIRAS and Braden Scale risk classification result was .73. The predictive performance of the Auto-PIRAS was similar to Braden Scale assessments conducted by nurses. Auto-PIRAS is expected to be used as a system that assesses pressure injury risk automatically without additional data collection by nurses.
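The predictive validity measures listed above all derive from a 2 × 2 table of predicted against observed outcomes, and the Youden index is sensitivity plus specificity minus one. The sketch below shows the standard formulas; the cell counts are hypothetical, chosen only to reproduce the reported sensitivity (.87) and specificity (.90) at a 1:4 case-to-control ratio, and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from the four cells of a 2 x 2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "youden": sensitivity + specificity - 1.0,
    }


# Hypothetical counts: 100 cases and 400 controls (illustration only).
metrics = diagnostic_metrics(tp=87, fp=40, fn=13, tn=360)
for name, value in metrics.items():
    print(name, round(value, 3))
```

With these assumed counts the Youden index comes to .77, as reported for the algorithm validation stage; the same formula applied to the Braden Scale figures gives .77 + .93 - 1 = .70. Positive and negative predictive values depend on the proportion of cases in the sample, so they would differ in a population with a different pressure injury rate.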

Differential validity of the Defense Mechanism Manual for the TAT between Asian Americans and Whites. Thematic Apperception Test.

PubMed

Hibbard, S; Tang, P C; Latko, R; Park, J H; Munn, S; Bolz, S; Somerville, A

2000-12-01

Thematic Apperception Test (Murray, 1943) responses of 69 Asian American (hereafter, Asian) and 83 White students were coded for defenses according to the Defense Mechanism Manual (Cramer, 1991b) and studied for differential validity in predicting paper-and-pencil measures of relevant constructs. Three tests for differential validity were used: (a) differences between validity coefficients, (b) interactions between predictor and ethnicity in criterion prediction, and (c) differences between groups in mean prediction errors using a common regression equation. Modest differential validity was found. It was surprising that the DMM scales were slightly stronger predictors of their criteria among Asians than among Whites and when a common predictor was used, desirable criteria were overpredicted for Asians, whereas undesirable ones were overpredicted for Whites. The results were not affected by acculturation level or English vocabulary among the Asians.
Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates.

PubMed

LeDell, Erin; Petersen, Maya; van der Laan, Mark

In binary classification problems, the area under the ROC curve (AUC) is commonly used to evaluate the performance of a prediction model. Often, it is combined with cross-validation in order to assess how the results will generalize to an independent data set. In order to evaluate the quality of an estimate for cross-validated AUC, we obtain an estimate of its variance. For massive data sets, the process of generating a single performance estimate can be computationally expensive. Additionally, when using a complex prediction method, the process of cross-validating a predictive model on even a relatively small data set can still require a large amount of computation time. Thus, in many practical settings, the bootstrap is a computationally intractable approach to variance estimation. As an alternative to the bootstrap, we demonstrate a computationally efficient influence curve based approach to obtaining a variance estimate for cross-validated AUC.
Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates

PubMed Central

Petersen, Maya; van der Laan, Mark

2015-01-01

In binary classification problems, the area under the ROC curve (AUC) is commonly used to evaluate the performance of a prediction model. Often, it is combined with cross-validation in order to assess how the results will generalize to an independent data set. In order to evaluate the quality of an estimate for cross-validated AUC, we obtain an estimate of its variance. For massive data sets, the process of generating a single performance estimate can be computationally expensive. Additionally, when using a complex prediction method, the process of cross-validating a predictive model on even a relatively small data set can still require a large amount of computation time. Thus, in many practical settings, the bootstrap is a computationally intractable approach to variance estimation. As an alternative to the bootstrap, we demonstrate a computationally efficient influence curve based approach to obtaining a variance estimate for cross-validated AUC. PMID:26279737
A cross-validation package driving Netica with python

USGS Publications Warehouse

Fienen, Michael N.; Plant, Nathaniel G.

2014-01-01

Bayesian networks (BNs) are powerful tools for probabilistically simulating natural systems and emulating process models. Cross validation is a technique to avoid overfitting resulting from overly complex BNs. Overfitting reduces predictive skill. Cross-validation for BNs is known but rarely implemented due partly to a lack of software tools designed to work with available BN packages. CVNetica is open-source, written in Python, and extends the Netica software package to perform cross-validation and read, rebuild, and learn BNs from data. Insights gained from cross-validation and implications on prediction versus description are illustrated with: a data-driven oceanographic application; and a model-emulation application. These examples show that overfitting occurs when BNs become more complex than allowed by supporting data and overfitting incurs computational costs as well as causing a reduction in prediction skill. CVNetica evaluates overfitting using several complexity metrics (we used level of discretization) and its impact on performance metrics (we used skill).
Adolescent Domain Screening Inventory-Short Form: Development and Initial Validation

ERIC Educational Resources Information Center

Corrigan, Matthew J.

2017-01-01

This study sought to develop a short version of the ADSI, and investigate its psychometric properties. Methods: This is a secondary analysis. Analysis to determine the Cronbach's Alpha, correlations to determine concurrent criterion validity and known instrument validity and a logistic regression to determine predictive validity were conducted.…
Initial Reliability and Validity of the Perceived Social Competence Scale

ERIC Educational Resources Information Center

Anderson-Butcher, Dawn; Iachini, Aidyn L.; Amorose, Anthony J.

2008-01-01

Objective: This study describes the development and validation of a perceived social competence scale that social workers can easily use to assess children's and youth's social competence. Method: Exploratory and confirmatory factor analyses were conducted on a calibration and a cross-validation sample of youth. Predictive validity was also…
Development and validation of a predictive model for excessive postpartum blood loss: A retrospective, cohort study.

PubMed

Rubio-Álvarez, Ana; Molina-Alarcón, Milagros; Arias-Arias, Ángel; Hernández-Martínez, Antonio

2018-03-01

Postpartum haemorrhage is one of the leading causes of maternal morbidity and mortality worldwide. Despite the use of uterotonic agents as a preventive measure, it remains a challenge to identify those women who are at increased risk of postpartum bleeding. To develop and validate a predictive model to assess the risk of excessive bleeding in women with vaginal birth. Retrospective cohort study. "Mancha-Centro Hospital" (Spain). The elaboration of the predictive model was based on a derivation cohort consisting of 2336 women between 2009 and 2011. For validation purposes, a prospective cohort of 953 women between 2013 and 2014 was employed. Women with antenatal fetal demise, multiple pregnancies and gestations under 35 weeks were excluded. Methods: We used a multivariate analysis with binary logistic regression, Ridge Regression and areas under the Receiver Operating Characteristic curves to determine the predictive ability of the proposed model. There were 197 (8.43%) women with excessive bleeding in the derivation cohort and 63 (6.61%) women in the validation cohort. Predictive factors in the final model were: maternal age, primiparity, duration of the first and second stages of labour, neonatal birth weight and antepartum haemoglobin levels. Accordingly, the predictive ability of this model in the derivation cohort was 0.90 (95% CI: 0.85-0.93), while it remained 0.83 (95% CI: 0.74-0.92) in the validation cohort. This predictive model proved to have an excellent predictive ability in the derivation cohort, and its validation in a later population equally shows a good ability for prediction. This model can be employed to identify women with a higher risk of postpartum haemorrhage. Copyright © 2017 Elsevier Ltd. All rights reserved.
Evaluating the Predictive Validity of the Computerized Comprehension Task: Comprehension Predicts Production

PubMed Central

Friend, Margaret; Schmitt, Sara A.; Simpson, Adrianne M.

2017-01-01

Until recently, the challenges inherent in measuring comprehension have impeded our ability to predict the course of language acquisition. The present research reports on a longitudinal assessment of the convergent and predictive validity of the CDI: Words and Gestures and the Computerized Comprehension Task (CCT). The CDI: WG and the CCT evinced good convergent validity; however, the CCT better predicted subsequent parent reports of language production. Language sample data in the third year confirm this finding: the CCT accounted for 24% of the variance in unique word use. These studies provide evidence for the utility of a behavior-based approach to predicting the course of language acquisition into production. PMID:21928878
A novel QSAR model of Salmonella mutagenicity and its application in the safety assessment of drug impurities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Valencia, Antoni; Prous, Josep; Mora, Oscar

As indicated in ICH M7 draft guidance, in silico predictive tools including statistically-based QSARs and expert analysis may be used as a computational assessment for bacterial mutagenicity for the qualification of impurities in pharmaceuticals. To address this need, we developed and validated a QSAR model to predict Salmonella t. mutagenicity (Ames assay outcome) of pharmaceutical impurities using Prous Institute's Symmetry℠, a new in silico solution for drug discovery and toxicity screening, and the Mold2 molecular descriptor package (FDA/NCTR). Data was sourced from public benchmark databases with known Ames assay mutagenicity outcomes for 7300 chemicals (57% mutagens). Of these data, 90% was used to train the model and the remaining 10% was set aside as a holdout set for validation. The model's applicability to drug impurities was tested using a FDA/CDER database of 951 structures, of which 94% were found within the model's applicability domain. The predictive performance of the model is acceptable for supporting regulatory decision-making with 84 ± 1% sensitivity, 81 ± 1% specificity, 83 ± 1% concordance and 79 ± 1% negative predictivity based on internal cross-validation, while the holdout dataset yielded 83% sensitivity, 77% specificity, 80% concordance and 78% negative predictivity. Given the importance of having confidence in negative predictions, an additional external validation of the model was also carried out, using marketed drugs known to be Ames-negative, and obtained 98% coverage and 81% specificity. Additionally, Ames mutagenicity data from FDA/CFSAN was used to create another data set of 1535 chemicals for external validation of the model, yielding 98% coverage, 73% sensitivity, 86% specificity, 81% concordance and 84% negative predictivity. - Highlights: • A new in silico QSAR model to predict Ames mutagenicity is described. • The model is extensively validated with chemicals from the FDA and the public domain. 
• Validation tests show desirable high sensitivity and high negative predictivity. • The model predicted 14 reportedly difficult to predict drug impurities with accuracy. • The model is suitable to support risk evaluation of potentially mutagenic compounds.
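The sensitivity, specificity, concordance and negative predictivity figures quoted in the record above are all ratios taken from a 2×2 confusion matrix. The following is a minimal Python sketch under the usual textbook definitions; the function name `ames_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def ames_metrics(tp, fn, tn, fp):
    """Compute binary-classification metrics from confusion-matrix counts.

    tp/fn: mutagens predicted positive/negative
    tn/fp: non-mutagens predicted negative/positive
    """
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),            # mutagens correctly flagged
        "specificity": tn / (tn + fp),            # non-mutagens correctly cleared
        "concordance": (tp + tn) / total,         # overall agreement (accuracy)
        "negative_predictivity": tn / (tn + fn),  # negative calls that are truly negative
    }


# Hypothetical counts, for illustration only.
metrics = ames_metrics(tp=80, fn=20, tn=90, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Negative predictivity depends on the proportion of mutagens in the test set, so values from data sets with different prevalence are not directly comparable.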
Survey of statistical techniques used in validation studies of air pollution prediction models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bornstein, R D; Anderson, S F

1979-03-01

Statistical techniques used by meteorologists to validate predictions made by air pollution models are surveyed. Techniques are divided into the following three groups: graphical, tabular, and summary statistics. Some of the practical problems associated with verification are also discussed. Characteristics desired in any validation program are listed and a suggested combination of techniques that possesses many of these characteristics is presented.
Observations on CFD Verification and Validation from the AIAA Drag Prediction Workshops

NASA Technical Reports Server (NTRS)

Morrison, Joseph H.; Kleb, Bil; Vassberg, John C.

2014-01-01

The authors provide observations from the AIAA Drag Prediction Workshops that have spanned over a decade and from a recent validation experiment at NASA Langley. These workshops provide an assessment of the predictive capability of forces and moments, focused on drag, for transonic transports. It is very difficult to manage the consistency of results in a workshop setting to perform verification and validation at the scientific level, but it may be sufficient to assess it at the level of practice. Observations thus far: 1) due to simplifications in the workshop test cases, wind tunnel data are not necessarily the “correct” results that CFD should match, 2) an average of core CFD data are not necessarily a better estimate of the true solution as it is merely an average of other solutions and has many coupled sources of variation, 3) outlier solutions should be investigated and understood, and 4) the DPW series does not have the systematic build up and definition on both the computational and experimental side that is required for detailed verification and validation. Several observations regarding the importance of the grid, effects of physical modeling, benefits of open forums, and guidance for validation experiments are discussed. The increased variation in results when predicting regions of flow separation and increased variation due to interaction effects, e.g., fuselage and horizontal tail, point out the need for validation data sets for these important flow phenomena. Experiences with a recent validation experiment at NASA Langley are included to provide guidance on validation experiments.
Independent data validation of an in vitro method for ...

EPA Pesticide Factsheets

In vitro bioaccessibility assays (IVBA) estimate arsenic (As) relative bioavailability (RBA) in contaminated soils to improve the accuracy of site-specific human exposure assessments and risk calculations. For an IVBA assay to gain acceptance for use in risk assessment, it must be shown to reliably predict in vivo RBA that is determined in an established animal model. Previous studies correlating soil As IVBA with RBA have been limited by the use of few soil types as the source of As. Furthermore, the predictive value of As IVBA assays has not been validated using an independent set of As-contaminated soils. Therefore, the current study was undertaken to develop a robust linear model to predict As RBA in mice using an IVBA assay and to independently validate the predictive capability of this assay using a unique set of As-contaminated soils. Thirty-six As-contaminated soils varying in soil type, As contaminant source, and As concentration were included in this study, with 27 soils used for initial model development and nine soils used for independent model validation. The initial model reliably predicted As RBA values in the independent data set, with a mean As RBA prediction error of 5.3% (range 2.4 to 8.4%). Following validation, all 36 soils were used for final model development, resulting in a linear model with the equation: RBA = 0.59 * IVBA + 9.8 and R2 of 0.78. The in vivo-in vitro correlation and independent data validation presented here provide
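The final linear model quoted in the record above, RBA = 0.59 * IVBA + 9.8, can be applied directly. This is a minimal Python sketch that assumes both quantities are expressed in percent; the function names and the example values are hypothetical, and only the slope and intercept come from the abstract.

```python
SLOPE = 0.59      # from the abstract
INTERCEPT = 9.8   # from the abstract


def predict_rba(ivba_percent):
    """Predict arsenic relative bioavailability (%) from an IVBA result (%)."""
    return SLOPE * ivba_percent + INTERCEPT


def mean_abs_error(predicted, measured):
    """Mean absolute prediction error across paired predicted/measured values."""
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)


# Hypothetical IVBA results and in vivo measurements, for illustration only.
ivba_values = [20.0, 50.0, 70.0]
measured_rba = [23.0, 37.0, 55.0]
predicted_rba = [predict_rba(v) for v in ivba_values]
error = mean_abs_error(predicted_rba, measured_rba)
print(predicted_rba, error)
```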
Extending the validity of the Feeding Practices and Structure Questionnaire.

PubMed

Jansen, Elena; Mallan, Kimberley M; Daniels, Lynne A

2015-06-30

Feeding practices are commonly examined as potentially modifiable determinants of children's eating behaviours and weight status. Although a variety of questionnaires exist to assess different feeding aspects, many lack thorough reliability and validity testing. The Feeding Practices and Structure Questionnaire (FPSQ) is a tool designed to measure early feeding practices related to non-responsive feeding and structure of the meal environment. Face validity, factorial validity, internal reliability and cross-sectional correlations with children's eating behaviours have been established in mothers with 2-year-old children. The aim of the present study was to further extend the validity of the FPSQ by examining factorial, construct and predictive validity, and stability. Participants were from the NOURISH randomised controlled trial which evaluated an intervention with first-time mothers designed to promote protective feeding practices. Maternal feeding practices (FP) and child eating behaviours were assessed when children were aged 2 years and 3.7 years (n = 388). Confirmatory factor analysis, group differences, predictive relationships, and stability were tested. The original 9-factor structure was confirmed when children were aged 3.7 ± 0.3 years. Cronbach's alpha was above the recommended 0.70 cut-off for all factors except Structured Meal Timing, Over Restriction and Distrust in Appetite which were 0.58, 0.67 and 0.66 respectively. Allocated group differences reflected behaviour consistent with intervention content and all feeding practices were stable across both time points (range of r = 0.45-0.70). There was some evidence for the predictive validity of factors with 2 FP showing expected relationships, 2 FP showing expected and unexpected relationships and 5 FP showing no relationship. Reliability and validity were demonstrated for most subscales of the FPSQ. Future validation is warranted with culturally diverse samples and with fathers and other caregivers. The use of additional outcomes to further explore predictive validity is recommended as well as testing test-retest reliability of the questionnaire.
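The record above reports Cronbach's alpha against a 0.70 cut-off. This is a minimal Python sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the response data are hypothetical and unrelated to the FPSQ data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one subscale.

    items: list of k lists; each inner list holds one item's scores,
    ordered by respondent (all inner lists the same length).
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent total score
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 3-item subscale answered by 5 respondents.
responses = [
    [2, 3, 4, 5, 4],
    [3, 3, 5, 5, 4],
    [2, 4, 4, 5, 3],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```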
Nomogram predicting response after chemoradiotherapy in rectal cancer using sequential PETCT imaging: a multicentric prospective study with external validation.

PubMed

van Stiphout, Ruud G P M; Valentini, Vincenzo; Buijsen, Jeroen; Lammering, Guido; Meldolesi, Elisa; van Soest, Johan; Leccisotti, Lucia; Giordano, Alessandro; Gambacorta, Maria A; Dekker, Andre; Lambin, Philippe

2014-11-01

To develop and externally validate a predictive model for pathologic complete response (pCR) for locally advanced rectal cancer (LARC) based on clinical features and early sequential (18)F-FDG PETCT imaging. Prospective data (i.a. THUNDER trial) were used to train (N=112, MAASTRO Clinic) and validate (N=78, Università Cattolica del S. Cuore) the model for pCR (ypT0N0). All patients received long-course chemoradiotherapy (CRT) and surgery. Clinical parameters were age, gender, clinical tumour (cT) stage and clinical nodal (cN) stage. PET parameters were SUVmax, SUVmean, metabolic tumour volume (MTV) and maximal tumour diameter, for which response indices between pre-treatment and intermediate scan were calculated. Using multivariate logistic regression, three probability groups for pCR were defined. The pCR rates were 21.4% (training) and 23.1% (validation). The selected predictive features for pCR were cT-stage, cN-stage, response index of SUVmean and maximal tumour diameter during treatment. The models' performances (AUC) were 0.78 (training) and 0.70 (validation). The high probability group for pCR resulted in 100% correct predictions for training and 67% for validation. The model is available on the website www.predictcancer.org. The developed predictive model for pCR is accurate and externally validated. This model may assist in treatment decisions during CRT to select complete responders for a wait-and-see policy, good responders for extra RT boost and bad responders for additional chemotherapy. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd.. All rights reserved.
Examining the Predictive Validity of NIH Peer Review Scores

PubMed Central

Lindner, Mark D.; Nakamura, Richard K.

2015-01-01

The predictive validity of peer review at the National Institutes of Health (NIH) has not yet been demonstrated empirically. It might be assumed that the most efficient and expedient test of the predictive validity of NIH peer review would be an examination of the correlation between percentile scores from peer review and bibliometric indices of the publications produced from funded projects. The present study used a large dataset to examine the rationale for such a study, to determine if it would satisfy the requirements for a test of predictive validity. The results show significant restriction of range in the applications selected for funding. Furthermore, those few applications that are funded with slightly worse peer review scores are not selected at random or representative of other applications in the same range. The funding institutes also negotiate with applicants to address issues identified during peer review. Therefore, the peer review scores assigned to the submitted applications, especially for those few funded applications with slightly worse peer review scores, do not reflect the changed and improved projects that are eventually funded. In addition, citation metrics by themselves are not valid or appropriate measures of scientific impact. The use of bibliometric indices on their own to measure scientific impact would likely increase the inefficiencies and problems with replicability already largely attributed to the current over-emphasis on bibliometric indices. Therefore, retrospective analyses of the correlation between percentile scores from peer review and bibliometric indices of the publications resulting from funded grant applications are not valid tests of the predictive validity of peer review at the NIH. PMID:26039440
Multivariate statistical assessment of predictors of firefighters' muscular and aerobic work capacity.

PubMed

Lindberg, Ann-Sofie; Oksa, Juha; Antti, Henrik; Malm, Christer

2015-01-01

Physical capacity has previously been deemed important for firefighters' physical work capacity, and aerobic fitness, muscular strength, and muscular endurance are the most frequently investigated parameters of importance. Traditionally, bivariate and multivariate linear regression statistics have been used to study relationships between physical capacities and work capacities among firefighters. An alternative way to handle datasets consisting of numerous correlated variables is to use multivariate projection analyses, such as Orthogonal Projection to Latent Structures. The first aim of the present study was to evaluate the prediction and predictive power of field and laboratory tests, respectively, on firefighters' physical work capacity on selected work tasks, and to study whether valid predictions could be achieved without anthropometric data. The second aim was to externally validate selected models. The third aim was to validate selected models on firefighters and on civilians. A total of 38 (26 men and 12 women) + 90 (38 men and 52 women) subjects were included in the models and the external validation, respectively. The best prediction (R2) and predictive power (Q2) of Stairs, Pulling, Demolition, Terrain, and Rescue work capacities included field tests (R2 = 0.73 to 0.84, Q2 = 0.68 to 0.82). The best external validation was for Stairs work capacity (R2 = 0.80) and worst for Demolition work capacity (R2 = 0.40). In conclusion, field and laboratory tests could equally well predict physical work capacities for firefighting work tasks, and models excluding anthropometric data were valid. The predictive power was satisfactory for all included work tasks except Demolition.
Self-perceived Coparenting of Nonresident Fathers: Scale Development and Validation.

PubMed

Dyer, W Justin; Fagan, Jay; Kaufman, Rebecca; Pearson, Jessica; Cabrera, Natasha

2017-11-16

This study reports on the development and validation of the Fatherhood Research and Practice Network coparenting perceptions scale for nonresident fathers. Although other measures of coparenting have been developed, this is the first measure developed specifically for low-income, nonresident fathers. Focus groups were conducted to determine various aspects of coparenting. Based on this, a scale was created and administered to 542 nonresident fathers. Participants also responded to items used to examine convergent and predictive validity (i.e., parental responsibility, contact with the mother, father self-efficacy and satisfaction, child behavior problems, and contact and engagement with the child). Factor analyses and reliability tests revealed three distinct and reliable perceived coparenting factors: undermining, alliance, and gatekeeping. Validity tests suggest substantial overlap between the undermining and alliance factors, though undermining was uniquely related to child behavior problems. The alliance and gatekeeping factors showed strong convergent validity and evidence for predictive validity. Taken together, results suggest this relatively short measure (11 items) taps into three coparenting dimensions significantly predictive of aspects of individual and family life. © 2017 Family Process Institute.
Improving the Validity of Activity of Daily Living Dependency Risk Assessment

PubMed Central

Clark, Daniel O.; Stump, Timothy E.; Tu, Wanzhu; Miller, Douglas K.

2015-01-01

Objectives: Efforts to prevent activity of daily living (ADL) dependency may be improved through models that assess older adults’ dependency risk. We evaluated whether cognition and gait speed measures improve the predictive validity of interview-based models. Method: Participants were 8,095 self-respondents in the 2006 Health and Retirement Survey who were aged 65 years or over and independent in five ADLs. Incident ADL dependency was determined from the 2008 interview. Models were developed using random 2/3rd cohorts and validated in the remaining 1/3rd. Results: Compared to a c-statistic of 0.79 in the best interview model, the model including cognitive measures had c-statistics of 0.82 and 0.80 while the best fitting gait speed model had c-statistics of 0.83 and 0.79 in the development and validation cohorts, respectively. Conclusion: Two relatively brief models, one that requires an in-person assessment and one that does not, had excellent validity for predicting incident ADL dependency but did not significantly improve the predictive validity of the best fitting interview-based models. PMID:24652867
Validity of the MMPI Personality Disorder scales (MMPI-PD).

PubMed

Schuler, C E; Snibbe, J R; Buckwalter, J G

1994-03-01

The MMPI Personality Disorder scales, developed by Morey, Waugh, and Blashfield (1985), were validated on an inpatient population by comparing 104 patients' MMPI-PD scores with the MCMI and with DSM-III-R diagnosis. Conservative significance levels were used to ensure more valid conclusions. Schizoid, Avoidant, Dependent, Histrionic, and Narcissistic scales were correlated significantly. Passive-Aggressive, Schizotypal, and Borderline scales did not correlate with corresponding MCMI scales. The MMPI-PD nonoverlapping scales were most effective in predicting diagnosis, specifically the Personality Disorder NOS, Eccentric and Borderline groups. The overlapping scales were not as effective in predicting diagnosis, but best predicted the Eccentric and Borderline groups. This study provides support for the validity of specific scales and circumscribed diagnostic utility for both measures.
External validation of preexisting first trimester preeclampsia prediction models.

PubMed

Allen, Rebecca E; Zamora, Javier; Arroyo-Manzano, David; Velauthar, Luxmilar; Allotey, John; Thangaratinam, Shakila; Aquilina, Joseph

2017-10-01

To validate the increasing number of prognostic models being developed for preeclampsia using our own prospective study. A systematic review of literature that assessed biomarkers, uterine artery Doppler and maternal characteristics in the first trimester for the prediction of preeclampsia was performed and models selected based on predefined criteria. Validation was performed by applying the regression coefficients that were published in the different derivation studies to our cohort. We assessed the models' discrimination ability and calibration. Twenty models were identified for validation. The discrimination ability observed in derivation studies (area under the curve, AUC) ranged from 0.70 to 0.96. When these models were validated against the validation cohort, the AUCs varied considerably, ranging from 0.504 to 0.833. Comparing the AUCs obtained in the derivation studies with those in the validation cohort, we found statistically significant differences in several studies. There currently isn't a definitive prediction model with adequate ability to discriminate for preeclampsia, which performs as well when applied to a different population and can differentiate well between the highest and lowest risk groups within the tested population. The pre-existing large number of models limits the value of further model development and future research should be focussed on further attempts to validate existing models and assessing whether implementation of these improves patient care. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.

On various metrics used for validation of predictive QSAR models with applications in virtual screening and focused library design.

PubMed

Roy, Kunal; Mitra, Indrani

2011-07-01

Quantitative structure-activity relationships (QSARs) have important applications in drug discovery research, environmental fate modeling, property prediction, etc. Validation has been recognized as a very important step for QSAR model development. As one of the important objectives of QSAR modeling is to predict activity/property/toxicity of new chemicals falling within the domain of applicability of the developed models and QSARs are being used for regulatory decisions, checking reliability of the models and confidence of their predictions is a very important aspect, which can be judged during the validation process. One prime application of a statistically significant QSAR model is virtual screening for molecules with improved potency based on the pharmacophoric features and the descriptors appearing in the QSAR model. Validated QSAR models may also be utilized for design of focused libraries which may be subsequently screened for the selection of hits. The present review focuses on various metrics used for validation of predictive QSAR models together with an overview of the application of QSAR models in the fields of virtual screening and focused library design for diverse series of compounds with citation of some recent examples.
Validity of Bioelectrical Impedance Analysis to Estimation Fat-Free Mass in the Army Cadets.

PubMed

Langer, Raquel D; Borges, Juliano H; Pascoa, Mauro A; Cirolini, Vagner X; Guerra-Júnior, Gil; Gonçalves, Ezequiel M

2016-03-11

Bioelectrical Impedance Analysis (BIA) is a fast, practical, non-invasive, and frequently used method for fat-free mass (FFM) estimation. The aims of this study were to validate predictive equations of BIA to FFM estimation in Army cadets and to develop and validate a specific BIA equation for this population. A total of 396 males, Brazilian Army cadets, aged 17-24 years were included. The study used eight published predictive BIA equations, a specific equation in FFM estimation, and dual-energy X-ray absorptiometry (DXA) as a reference method. Student's t-test (for paired sample), linear regression analysis, and Bland-Altman method were used to test the validity of the BIA equations. Predictive BIA equations showed significant differences in FFM compared to DXA (p < 0.05) and large limits of agreement by Bland-Altman. Predictive BIA equations explained 68% to 88% of FFM variance. Specific BIA equations showed no significant differences in FFM, compared to DXA values. Published BIA predictive equations showed poor accuracy in this sample. The specific BIA equations, developed in this study, demonstrated validity for this sample, although should be used with caution in samples with a large range of FFM.
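The Bland-Altman analysis mentioned in the record above summarises agreement between two methods as a mean difference (bias) with 95% limits of agreement. This is a minimal Python sketch assuming the conventional bias ± 1.96 SD definition; the function name and the fat-free mass values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - half_width, bias + half_width


# Hypothetical fat-free mass estimates in kg: BIA equation vs. DXA reference.
bia_ffm = [60.0, 62.0, 64.0, 66.0]
dxa_ffm = [59.0, 62.0, 63.0, 66.0]
bias, lower, upper = bland_altman(bia_ffm, dxa_ffm)
print(f"bias = {bias:.2f} kg, limits of agreement = [{lower:.2f}, {upper:.2f}] kg")
```

Wide limits of agreement can coexist with a high explained variance, which is why the abstract reports both.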
Validation of the procedures. [integrated multidisciplinary optimization of rotorcraft

NASA Technical Reports Server (NTRS)

Mantay, Wayne R.

1989-01-01

Validation strategies are described for procedures aimed at improving the rotor blade design process through a multidisciplinary optimization approach. Validation of the basic rotor environment prediction tools and the overall rotor design are discussed.
Patient-Reported Outcomes After Radiation Therapy in Men With Prostate Cancer: A Systematic Review of Prognostic Tool Accuracy and Validity

DOE Office of Scientific and Technical Information (OSTI.GOV)

O'Callaghan, Michael E., E-mail: elspeth.raymond@health.sa.gov.au; Freemasons Foundation Centre for Men's Health, University of Adelaide; Urology Unit, Repatriation General Hospital, SA Health, Flinders Centre for Innovation in Cancer

Purpose: To identify, through a systematic review, all validated tools used for the prediction of patient-reported outcome measures (PROMs) in patients being treated with radiation therapy for prostate cancer, and provide a comparative summary of accuracy and generalizability. Methods and Materials: PubMed and EMBASE were searched from July 2007. Title/abstract screening, full text review, and critical appraisal were undertaken by 2 reviewers, whereas data extraction was performed by a single reviewer. Eligible articles had to provide a summary measure of accuracy and undertake internal or external validation. Tools were recommended for clinical implementation if they had been externally validated and found to have accuracy ≥70%. Results: The search strategy identified 3839 potential studies, of which 236 progressed to full text review and 22 were included. From these studies, 50 tools predicted gastrointestinal/rectal symptoms, 29 tools predicted genitourinary symptoms, 4 tools predicted erectile dysfunction, and no tools predicted quality of life. For patients treated with external beam radiation therapy, 3 tools could be recommended for the prediction of rectal toxicity, gastrointestinal toxicity, and erectile dysfunction. For patients treated with brachytherapy, 2 tools could be recommended for the prediction of urinary retention and erectile dysfunction. Conclusions: A large number of tools for the prediction of PROMs in prostate cancer patients treated with radiation therapy have been developed. Only a small minority are accurate and have been shown to be generalizable through external validation. This review provides an accessible catalogue of tools that are ready for clinical implementation as well as which should be prioritized for validation.
A Formal Approach to Empirical Dynamic Model Optimization and Validation

NASA Technical Reports Server (NTRS)

Crespo, Luis G; Morelli, Eugene A.; Kenny, Sean P.; Giesy, Daniel P.

2014-01-01

A framework was developed for the optimization and validation of empirical dynamic models subject to an arbitrary set of validation criteria. The validation requirements imposed upon the model, which may involve several sets of input-output data and arbitrary specifications in time and frequency domains, are used to determine if model predictions are within admissible error limits. The parameters of the empirical model are estimated by finding the parameter realization for which the smallest of the margins of requirement compliance is as large as possible. The uncertainty in the value of this estimate is characterized by studying the set of model parameters yielding predictions that comply with all the requirements. Strategies are presented for bounding this set, studying its dependence on admissible prediction error set by the analyst, and evaluating the sensitivity of the model predictions to parameter variations. This information is instrumental in characterizing uncertainty models used for evaluating the dynamic model at operating conditions differing from those used for its identification and validation. A practical example based on the short period dynamics of the F-16 is used for illustration.
External model validation of binary clinical risk prediction models in cardiovascular and thoracic surgery.

PubMed

Hickey, Graeme L; Blackstone, Eugene H

2016-08-01

Clinical risk-prediction models serve an important role in healthcare. They are used for clinical decision-making and measuring the performance of healthcare providers. To establish confidence in a model, external model validation is imperative. When designing such an external model validation study, thought must be given to patient selection, risk factor and outcome definitions, missing data, and the transparent reporting of the analysis. In addition, there are a number of statistical methods available for external model validation. Execution of a rigorous external validation study rests in proper study design, application of suitable statistical methods, and transparent reporting. Copyright © 2016 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.
Meta-Analysis of Integrity Tests: A Critical Examination of Validity Generalization and Moderator Variables.

DTIC Science & Technology

1992-06-01

predicting both job performance and counterproductive behaviors on the job such as theft, disciplinary problems, and absenteeism. Validities were found to ... [performing organization report number: 92-1; performing organization: University of Iowa] ... be generalizable. The estimated mean operational predictive validity of integrity tests for supervisory ratings of job performance is .41. For the
Understanding Interrater Reliability and Validity of Risk Assessment Tools Used to Predict Adverse Clinical Events.

PubMed

Siedlecki, Sandra L; Albert, Nancy M

This article will describe how to assess interrater reliability and validity of risk assessment tools, using easy-to-follow formulas, and provide calculations that demonstrate principles discussed. Clinical nurse specialists should be able to identify risk assessment tools that provide high-quality interrater reliability and the highest validity for predicting true events of importance to clinical settings. Making best practice recommendations for assessment tool use is critical to high-quality patient care and safe practices that impact patient outcomes and nursing resources. Optimal risk assessment tool selection requires knowledge about interrater reliability and tool validity. The clinical nurse specialist will understand the reliability and validity issues associated with risk assessment tools, and be able to evaluate tools using basic calculations. Risk assessment tools are developed to objectively predict quality and safety events and ultimately reduce the risk of event occurrence through preventive interventions. To ensure high-quality tool use, clinical nurse specialists must critically assess tool properties. The better the tool's ability to predict adverse events, the more likely that event risk is mediated. Interrater reliability and validity assessment is a relatively easy skill to master and will result in better decisions when selecting or making recommendations for risk assessment tool use.
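The record above does not say which interrater statistic the article uses. As one common choice, this minimal Python sketch computes Cohen's kappa for two raters, kappa = (observed agreement - chance agreement) / (1 - chance agreement). The function name and the ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same cases.

    Undefined (division by zero) when chance agreement equals 1,
    i.e. both raters use a single identical category throughout.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions per category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical fall-risk ratings of four patients by two nurses.
nurse_1 = ["high", "high", "low", "low"]
nurse_2 = ["high", "low", "low", "low"]
kappa = cohen_kappa(nurse_1, nurse_2)
print(f"kappa = {kappa:.2f}")
```

Kappa is lower than raw percent agreement because it discounts the agreement the two raters would reach by chance.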
Validation of finite element and boundary element methods for predicting structural vibration and radiated noise

NASA Technical Reports Server (NTRS)

Seybert, A. F.; Wu, X. F.; Oswald, Fred B.

1992-01-01

Analytical and experimental validation of methods to predict structural vibration and radiated noise are presented. A rectangular box excited by a mechanical shaker was used as a vibrating structure. Combined finite element method (FEM) and boundary element method (BEM) models of the apparatus were used to predict the noise radiated from the box. The FEM was used to predict the vibration, and the surface vibration was used as input to the BEM to predict the sound intensity and sound power. Vibration predicted by the FEM model was validated by experimental modal analysis. Noise predicted by the BEM was validated by sound intensity measurements. Three types of results are presented for the total radiated sound power: (1) sound power predicted by the BEM modeling using vibration data measured on the surface of the box; (2) sound power predicted by the FEM/BEM model; and (3) sound power measured by a sound intensity scan. The sound power predicted from the BEM model using measured vibration data yields an excellent prediction of radiated noise. The sound power predicted by the combined FEM/BEM model also gives a good prediction of radiated noise except for a shift of the natural frequencies that are due to limitations in the FEM model.
Status and plans for the ANOPP/HSR prediction system

NASA Technical Reports Server (NTRS)

Nolan, Sandra K.

1992-01-01

ANOPP is a comprehensive prediction system which was developed and validated by NASA. Because ANOPP is a system prediction program, it allows aerospace industry researchers to create trade-off studies with a variety of aircraft noise problems. The extensive validation of ANOPP allows the program results to be used as a benchmark for testing other prediction codes.
How Nonrecidivism Affects Predictive Accuracy: Evidence from a Cross-Validation of the Ontario Domestic Assault Risk Assessment (ODARA)

ERIC Educational Resources Information Center

Hilton, N. Zoe; Harris, Grant T.

2009-01-01

Prediction effect sizes such as ROC area are important for demonstrating a risk assessment's generalizability and utility. How a study defines recidivism might affect predictive accuracy. Nonrecidivism is problematic when predicting specialized violence (e.g., domestic violence). The present study cross-validates the ability of the Ontario…
Does Rational Selection of Training and Test Sets Improve the Outcome of QSAR Modeling?

EPA Science Inventory

Prior to using a quantitative structure activity relationship (QSAR) model for external predictions, its predictive power should be established and validated. In the absence of a true external dataset, the best way to validate the predictive ability of a model is to perform its s...
Teachers' Interpersonal Self-Efficacy: Evaluation and Predictive Capacity of Teacher Burnout

ERIC Educational Resources Information Center

García-Ros, Rafael; Fuentes, María C.; Fernández, Basilio

2015-01-01

Introduction: This study analyzed the predictive capacity and incremental validity of teachers' interpersonal self-efficacy on their levels of burnout. First, it presents the validation process of a Spanish adaptation of the Teacher Interpersonal Self-Efficacy Scale--TISES--(Brouwers & Tomic, 1999, 2001). Second, the predictive capacity of…
The Kuder Occupational Interest Inventory as a Moderator of Its Predictive Validity.

ERIC Educational Resources Information Center

Hansen, Chris J.; Zytowski, Donald G.

1979-01-01

A measure of the extent to which the Kuder Occupational Interest Survey (KOIS) was predictive of occupational membership for an individual was correlated with KOIS item and scale scores. Results indicated that the KOIS was a moderator of its own predictive validity. (Author/JKS)
Multisite external validation of a risk prediction model for the diagnosis of blood stream infections in febrile pediatric oncology patients without severe neutropenia.

PubMed

Esbenshade, Adam J; Zhao, Zhiguo; Aftandilian, Catherine; Saab, Raya; Wattier, Rachel L; Beauchemin, Melissa; Miller, Tamara P; Wilkes, Jennifer J; Kelly, Michael J; Fernbach, Alison; Jeng, Michael; Schwartz, Cindy L; Dvorak, Christopher C; Shyr, Yu; Moons, Karl G M; Sulis, Maria-Luisa; Friedman, Debra L

2017-10-01

Pediatric oncology patients are at an increased risk of invasive bacterial infection due to immunosuppression. The risk of such infection in the absence of severe neutropenia (absolute neutrophil count ≥ 500/μL) is not well established and a validated prediction model for blood stream infection (BSI) risk offers clinical usefulness. A 6-site retrospective external validation was conducted using a previously published risk prediction model for BSI in febrile pediatric oncology patients without severe neutropenia: the Esbenshade/Vanderbilt (EsVan) model. A reduced model (EsVan2) excluding 2 less clinically reliable variables also was created using the initial EsVan model derivative cohort, and was validated using all 5 external validation cohorts. One data set was used only in sensitivity analyses due to missing some variables. From the 5 primary data sets, there were a total of 1197 febrile episodes and 76 episodes of bacteremia. The overall C statistic for predicting bacteremia was 0.695, with a calibration slope of 0.50 for the original model and a calibration slope of 1.0 when recalibration was applied to the model. The model performed better in predicting high-risk bacteremia (gram-negative or Staphylococcus aureus infection) versus BSI alone, with a C statistic of 0.801 and a calibration slope of 0.65. The EsVan2 model outperformed the EsVan model across data sets with a C statistic of 0.733 for predicting BSI and a C statistic of 0.841 for high-risk BSI. The results of this external validation demonstrated that the EsVan and EsVan2 models are able to predict BSI across multiple performance sites and, once validated and implemented prospectively, could assist in decision making in clinical practice. Cancer 2017;123:3781-3790. © 2017 American Cancer Society.
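The C statistic reported in the record above is a measure of discrimination. As a minimal sketch of how such a statistic can be computed for a binary outcome, the following Python function counts concordant (event, non-event) pairs. The function name `c_statistic` and the example risks and outcomes are hypothetical illustrations, not the authors' code or data.

```python
def c_statistic(scores, outcomes):
    """Concordance (C) statistic for a binary outcome.

    Returns the proportion of (event, non-event) pairs in which the event
    received the higher predicted score; tied scores count as half.
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event ranked above non-event
            elif e == n:
                concordant += 0.5   # tie counts as half a concordant pair
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and observed outcomes (1 = event occurred).
risks = [0.10, 0.20, 0.35, 0.40, 0.80]
observed = [0, 0, 1, 0, 1]
print(c_statistic(risks, observed))
```

For a binary outcome this quantity equals the area under the ROC curve. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, which is acceptable for an illustration but not for large cohorts.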
A diagnostic model for the detection of sensitization to wheat allergens was developed and validated in bakery workers.

PubMed

Suarthana, Eva; Vergouwe, Yvonne; Moons, Karel G; de Monchy, Jan; Grobbee, Diederick; Heederik, Dick; Meijer, Evert

2010-09-01

To develop and validate a prediction model to detect sensitization to wheat allergens in bakery workers. The prediction model was developed in 867 Dutch bakery workers (development set, prevalence of sensitization 13%) and included questionnaire items (candidate predictors). First, principal component analysis was used to reduce the number of candidate predictors. Then, multivariable logistic regression analysis was used to develop the model. Internal validation and extent of optimism was assessed with bootstrapping. External validation was studied in 390 independent Dutch bakery workers (validation set, prevalence of sensitization 20%). The prediction model contained the predictors nasoconjunctival symptoms, asthma symptoms, shortness of breath and wheeze, work-related upper and lower respiratory symptoms, and traditional bakery. The model showed good discrimination with an area under the receiver operating characteristic (ROC) curve of 0.76 (and 0.75 after internal validation). Application of the model in the validation set gave a reasonable discrimination (ROC area=0.69) and good calibration after a small adjustment of the model intercept. A simple model with questionnaire items only can be used to stratify bakers according to their risk of sensitization to wheat allergens. Its use may increase the cost-effectiveness of (subsequent) medical surveillance.
Can species distribution models really predict the expansion of invasive species?

PubMed

Barbet-Massin, Morgane; Rome, Quentin; Villemant, Claire; Courchamp, Franck

2018-01-01

Predictive studies are of paramount importance for biological invasions, one of the biggest threats for biodiversity. To help and better prioritize management strategies, species distribution models (SDMs) are often used to predict the potential invasive range of introduced species. Yet, SDMs have been regularly criticized, due to several strong limitations, such as violating the equilibrium assumption during the invasion process. Unfortunately, validation studies, with independent data, are too scarce to assess the predictive accuracy of SDMs in invasion biology. Yet, biological invasions allow to test SDMs usefulness, by retrospectively assessing whether they would have accurately predicted the latest ranges of invasion. Here, we assess the predictive accuracy of SDMs in predicting the expansion of invasive species. We used temporal occurrence data for the Asian hornet Vespa velutina nigrithorax, a species native to China that is invading Europe with a very fast rate. Specifically, we compared occurrence data from the last stage of invasion (independent validation points) to the climate suitability distribution predicted from models calibrated with data from the early stage of invasion. Despite the invasive species not being at equilibrium yet, the predicted climate suitability of validation points was high. SDMs can thus adequately predict the spread of V. v. nigrithorax, which appears to be, at least partially, climatically driven. In the case of V. v. nigrithorax, SDMs predictive accuracy was slightly but significantly better when models were calibrated with invasive data only, excluding native data. Although more validation studies for other invasion cases are needed to generalize our results, our findings are an important step towards validating the use of SDMs in invasion biology.
Can species distribution models really predict the expansion of invasive species?

PubMed Central

Rome, Quentin; Villemant, Claire; Courchamp, Franck

2018-01-01

Predictive studies are of paramount importance for biological invasions, one of the biggest threats for biodiversity. To help and better prioritize management strategies, species distribution models (SDMs) are often used to predict the potential invasive range of introduced species. Yet, SDMs have been regularly criticized, due to several strong limitations, such as violating the equilibrium assumption during the invasion process. Unfortunately, validation studies–with independent data–are too scarce to assess the predictive accuracy of SDMs in invasion biology. Yet, biological invasions allow to test SDMs usefulness, by retrospectively assessing whether they would have accurately predicted the latest ranges of invasion. Here, we assess the predictive accuracy of SDMs in predicting the expansion of invasive species. We used temporal occurrence data for the Asian hornet Vespa velutina nigrithorax, a species native to China that is invading Europe with a very fast rate. Specifically, we compared occurrence data from the last stage of invasion (independent validation points) to the climate suitability distribution predicted from models calibrated with data from the early stage of invasion. Despite the invasive species not being at equilibrium yet, the predicted climate suitability of validation points was high. SDMs can thus adequately predict the spread of V. v. nigrithorax, which appears to be—at least partially–climatically driven. In the case of V. v. nigrithorax, SDMs predictive accuracy was slightly but significantly better when models were calibrated with invasive data only, excluding native data. Although more validation studies for other invasion cases are needed to generalize our results, our findings are an important step towards validating the use of SDMs in invasion biology. PMID:29509789
Genome-based prediction of test cross performance in two subsequent breeding cycles.

PubMed

Hofheinz, Nina; Borchardt, Dietrich; Weissleder, Knuth; Frisch, Matthias

2012-12-01

Genome-based prediction of genetic values is expected to overcome shortcomings that limit the application of QTL mapping and marker-assisted selection in plant breeding. Our goal was to study the genome-based prediction of test cross performance with genetic effects that were estimated using genotypes from the preceding breeding cycle. In particular, our objectives were to employ a ridge regression approach that approximates best linear unbiased prediction of genetic effects, compare cross validation with validation using genetic material of the subsequent breeding cycle, and investigate the prospects of genome-based prediction in sugar beet breeding. We focused on the traits sugar content and standard molasses loss (ML) and used a set of 310 sugar beet lines to estimate genetic effects at 384 SNP markers. In cross validation, correlations >0.8 between observed and predicted test cross performance were observed for both traits. However, in validation with 56 lines from the next breeding cycle, a correlation of 0.8 could only be observed for sugar content; for standard ML, the correlation dropped to 0.4. We found that ridge regression based on preliminary estimates of the heritability provided a very good approximation of best linear unbiased prediction and was not accompanied by a loss in prediction accuracy. We conclude that prediction accuracy assessed with cross validation within one cycle of a breeding program cannot be used as an indicator for the accuracy of predicting lines of the next cycle. Prediction of lines of the next cycle seems promising for traits with high heritabilities.
Measurement of fatigue: Comparison of the reliability and validity of single-item and short measures to a comprehensive measure.

PubMed

Kim, Hee-Ju; Abraham, Ivo

2017-01-01

Evidence is needed on the clinicometric properties of single-item or short measures as alternatives to comprehensive measures. We examined whether two single-item fatigue measures (i.e., Likert scale, numeric rating scale) or a short fatigue measure were comparable to a comprehensive measure in reliability (i.e., internal consistency and test-retest reliability) and validity (i.e., convergent, concurrent, and predictive validity) in Korean young adults. For this quantitative study, we selected the Functional Assessment of Chronic Illness Therapy-Fatigue for the comprehensive measure and the Profile of Mood States-Brief, Fatigue subscale for the short measure; and constructed two single-item measures. A total of 368 students from four nursing colleges in South Korea participated. We used Cronbach's alpha and item-total correlation for internal consistency reliability and intraclass correlation coefficient for test-retest reliability. We assessed Pearson's correlation with a comprehensive measure for convergent validity, with perceived stress level and sleep quality for concurrent validity and the receiver operating characteristic curve for predictive validity. The short measure was comparable to the comprehensive measure in internal consistency reliability (Cronbach's alpha=0.81 vs. 0.88); test-retest reliability (intraclass correlation coefficient=0.66 vs. 0.61); convergent validity (r with comprehensive measure=0.79); concurrent validity (r with perceived stress=0.55, r with sleep quality=0.39) and predictive validity (area under curve=0.88). Single-item measures were not comparable to the comprehensive measure. A short fatigue measure exhibited similar levels of reliability and validity to the comprehensive measure in Korean young adults. Copyright © 2016 Elsevier Ltd. All rights reserved.
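The record above uses Cronbach's alpha as its internal consistency statistic. A minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), is given below. The function name `cronbach_alpha` and the response matrix are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")

    # Transpose so that each tuple holds one item's scores across respondents.
    items = list(zip(*item_scores))
    item_variance_sum = sum(variance(col) for col in items)

    # Variance of each respondent's total score across all items.
    total_variance = variance([sum(row) for row in item_scores])

    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 4 respondents x 3 items that move in lockstep.
responses = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [4, 5, 6],
]
print(cronbach_alpha(responses))
```

The sketch uses sample variances throughout. Because alpha depends only on the ratio of variances, using population variances consistently would give the same result.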

Validation and Use of a Predictive Modeling Tool: Employing Scientific Findings to Improve Responsible Conduct of Research Education.

PubMed

Mulhearn, Tyler J; Watts, Logan L; Todd, E Michelle; Medeiros, Kelsey E; Connelly, Shane; Mumford, Michael D

2017-01-01

Although recent evidence suggests ethics education can be effective, the nature of specific training programs, and their effectiveness, varies considerably. Building on a recent path modeling effort, the present study developed and validated a predictive modeling tool for responsible conduct of research education. The predictive modeling tool allows users to enter ratings in relation to a given ethics training program and receive instantaneous evaluative information for course refinement. Validation work suggests the tool's predicted outcomes correlate strongly (r = 0.46) with objective course outcomes. Implications for training program development and refinement are discussed.
Development and External Validation of a Melanoma Risk Prediction Model Based on Self-assessed Risk Factors.

PubMed

Vuong, Kylie; Armstrong, Bruce K; Weiderpass, Elisabete; Lund, Eiliv; Adami, Hans-Olov; Veierod, Marit B; Barrett, Jennifer H; Davies, John R; Bishop, D Timothy; Whiteman, David C; Olsen, Catherine M; Hopper, John L; Mann, Graham J; Cust, Anne E; McGeechan, Kevin

2016-08-01

Identifying individuals at high risk of melanoma can optimize primary and secondary prevention strategies. To develop and externally validate a risk prediction model for incident first-primary cutaneous melanoma using self-assessed risk factors. We used unconditional logistic regression to develop a multivariable risk prediction model. Relative risk estimates from the model were combined with Australian melanoma incidence and competing mortality rates to obtain absolute risk estimates. A risk prediction model was developed using the Australian Melanoma Family Study (629 cases and 535 controls) and externally validated using 4 independent population-based studies: the Western Australia Melanoma Study (511 case-control pairs), Leeds Melanoma Case-Control Study (960 cases and 513 controls), Epigene-QSkin Study (44 544, of which 766 had melanoma), and Swedish Women's Lifestyle and Health Cohort Study (49 259 women, of which 273 had melanoma). We validated model performance internally and externally by assessing discrimination using the area under the receiver operating characteristic curve (AUC). Additionally, using the Swedish Women's Lifestyle and Health Cohort Study, we assessed model calibration and clinical usefulness. The risk prediction model included hair color, nevus density, first-degree family history of melanoma, previous nonmelanoma skin cancer, and lifetime sunbed use. On internal validation, the AUC was 0.70 (95% CI, 0.67-0.73). On external validation, the AUC was 0.66 (95% CI, 0.63-0.69) in the Western Australia Melanoma Study, 0.67 (95% CI, 0.65-0.70) in the Leeds Melanoma Case-Control Study, 0.64 (95% CI, 0.62-0.66) in the Epigene-QSkin Study, and 0.63 (95% CI, 0.60-0.67) in the Swedish Women's Lifestyle and Health Cohort Study. Model calibration showed close agreement between predicted and observed numbers of incident melanomas across all deciles of predicted risk. In the external validation setting, there was higher net benefit when using the risk prediction model to classify individuals as high risk compared with classifying all individuals as high risk. The melanoma risk prediction model performs well and may be useful in prevention interventions reliant on a risk assessment using self-assessed risk factors.
Predictive and concurrent validity of the Braden scale in long-term care: a meta-analysis.

PubMed

Wilchesky, Machelle; Lungu, Ovidiu

2015-01-01

Pressure ulcer prevention is an important long-term care (LTC) quality indicator. While the Braden Scale is a recommended risk assessment tool, there is a paucity of information specifically pertaining to its validity within the LTC setting. We, therefore, undertook a systematic review and meta-analysis comparing Braden Scale predictive and concurrent validity within this context. We searched the Medline, EMBASE, PsycINFO and PubMed databases from 1985-2014 for studies containing the requisite information to analyze tool validity. Our initial search yielded 3,773 articles. Eleven datasets emanating from nine published studies describing 40,361 residents met all meta-analysis inclusion criteria and were analyzed using random effects models. Pooled sensitivity, specificity, positive predictive value (PPV), and negative predictive values were 86%, 38%, 28%, and 93%, respectively. Specificity was poorer in concurrent samples as compared with predictive samples (38% vs. 72%), while PPV was low in both sample types (25 and 37%). Though random effects model results showed that the Scale had good overall predictive ability [RR, 4.33; 95% CI, 3.28-5.72], none of the concurrent samples were found to have "optimal" sensitivity and specificity. In conclusion, the appropriateness of the Braden Scale in LTC is questionable given its low specificity and PPV, in particular in concurrent validity studies. Future studies should further explore the extent to which the apparent low validity of the Scale in LTC is due to the choice of cutoff point and/or preventive strategies implemented by LTC staff as a matter of course. © 2015 by the Wound Healing Society.
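The four accuracy measures named in the record above (sensitivity, specificity, PPV, and negative predictive value) all derive from a 2x2 table of test result against outcome. The sketch below shows the definitions for a single table. The function name `diagnostic_metrics` and the counts are hypothetical; they are not the pooled meta-analytic estimates, which cannot be reproduced from one table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table of test result versus outcome.

    tp: test positive, outcome present    fp: test positive, outcome absent
    fn: test negative, outcome present    tn: test negative, outcome absent
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases flagged
        "specificity": tn / (tn + fp),  # share of non-cases cleared
        "ppv": tp / (tp + fp),          # share of positives that are cases
        "npv": tn / (tn + fn),          # share of negatives that are non-cases
    }


# Hypothetical counts for 200 residents, 50 of whom developed the outcome.
metrics = diagnostic_metrics(tp=40, fp=60, fn=10, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are properties of the test at a given cutoff, whereas PPV and NPV also depend on how common the outcome is in the sample, which is one reason a scale can show high sensitivity alongside a low PPV.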
Pre-launch Optical Characteristics of the Oculus-ASR Nanosatellite for Attitude and Shape Recognition Experiments

DTIC Science & Technology

2011-12-02

construction and validation of predictive computer models such as those used in Time-domain Analysis Simulation for Advanced Tracking (TASAT), a...characterization data, successful construction and validation of predictive computer models was accomplished. And an investigation in pose determination from...
Development and validation of an automated delirium risk assessment system (Auto-DelRAS) implemented in the electronic health record system.

PubMed

Moon, Kyoung-Ja; Jin, Yinji; Jin, Taixian; Lee, Sun-Mi

2018-01-01

A key component of delirium management is prevention and early detection. To develop an automated delirium risk assessment system (Auto-DelRAS) that automatically alerts health care providers of an intensive care unit (ICU) patient's delirium risk based only on data collected in an electronic health record (EHR) system, and to evaluate the clinical validity of this system. Cohort and system development designs were used. Medical and surgical ICUs in two university hospitals in Seoul, Korea. A total of 3284 patients for the development of Auto-DelRAS, 325 for external validation, 694 for validation after clinical applications. The 4211 data items were extracted from the EHR system and delirium was measured using CAM-ICU (Confusion Assessment Method for Intensive Care Unit). The potential predictors were selected and a logistic regression model was established to create a delirium risk scoring algorithm to construct the Auto-DelRAS. The Auto-DelRAS was evaluated at three months and one year after its application to clinical practice to establish the predictive validity of the system. Eleven predictors were finally included in the logistic regression model. The results of the Auto-DelRAS risk assessment were shown as high/moderate/low risk on a Kardex screen. The predictive validity, analyzed one year after the clinical application of Auto-DelRAS, showed a sensitivity of 0.88, specificity of 0.72, positive predictive value of 0.53, negative predictive value of 0.94, and a Youden index of 0.59. A relatively high level of predictive validity was maintained with the Auto-DelRAS system, even one year after it was applied to clinical practice. Copyright © 2017. Published by Elsevier Ltd.
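The Youden index cited in the record above is defined as sensitivity + specificity - 1. The sketch below applies that definition to the rounded sensitivity and specificity quoted in the abstract. The function name `youden_index` is a hypothetical helper, not part of the Auto-DelRAS system.

```python
def youden_index(sensitivity, specificity):
    """Youden's J: 0 for a test no better than chance, 1 for a perfect test."""
    return sensitivity + specificity - 1.0


# Rounded values as quoted in the abstract.
j = youden_index(0.88, 0.72)
print(round(j, 2))
```

With the rounded inputs the result is about 0.60, close to the reported 0.59. The small gap is presumably because the published index was computed from unrounded sensitivity and specificity, though the abstract does not say so.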
Testing the Predictive Validity and Construct of Pathological Video Game Use

PubMed Central

Groves, Christopher L.; Gentile, Douglas; Tapscott, Ryan L.; Lynch, Paul J.

2015-01-01

Three studies assessed the construct of pathological video game use and tested its predictive validity. Replicating previous research, Study 1 produced evidence of convergent validity in 8th and 9th graders (N = 607) classified as pathological gamers. Study 2 replicated and extended the findings of Study 1 with college undergraduates (N = 504). Predictive validity was established in Study 3 by measuring cue reactivity to video games in college undergraduates (N = 254), such that pathological gamers were more emotionally reactive to and provided higher subjective appraisals of video games than non-pathological gamers and non-gamers. The three studies converged to show that pathological video game use seems similar to other addictions in its patterns of correlations with other constructs. Conceptual and definitional aspects of Internet Gaming Disorder are discussed. PMID:26694472
Base Flow Model Validation

NASA Technical Reports Server (NTRS)

Sinha, Neeraj; Brinckman, Kevin; Jansen, Bernard; Seiner, John

2011-01-01

A method was developed of obtaining propulsive base flow data in both hot and cold jet environments, at Mach numbers and altitude of relevance to NASA launcher designs. The base flow data was used to perform computational fluid dynamics (CFD) turbulence model assessments of base flow predictive capabilities in order to provide increased confidence in base thermal and pressure load predictions obtained from computational modeling efforts. Predictive CFD analyses were used in the design of the experiments, available propulsive models were used to reduce program costs and increase success, and a wind tunnel facility was used. The data obtained allowed assessment of CFD/turbulence models in a complex flow environment, working within a building-block procedure to validation, where cold, non-reacting test data was first used for validation, followed by more complex reacting base flow validation.
Examining construct and predictive validity of the Health-IT Usability Evaluation Scale: confirmatory factor analysis and structural equation modeling results.

PubMed

Yen, Po-Yin; Sousa, Karen H; Bakken, Suzanne

2014-10-01

In a previous study, we developed the Health Information Technology Usability Evaluation Scale (Health-ITUES), which is designed to support customization at the item level. Such customization matches the specific tasks/expectations of a health IT system while retaining comparability at the construct level, and provides evidence of its factorial validity and internal consistency reliability through exploratory factor analysis. In this study, we advanced the development of Health-ITUES to examine its construct validity and predictive validity. The health IT system studied was a web-based communication system that supported nurse staffing and scheduling. Using Health-ITUES, we conducted a cross-sectional study to evaluate users' perception toward the web-based communication system after system implementation. We examined Health-ITUES's construct validity through first and second order confirmatory factor analysis (CFA), and its predictive validity via structural equation modeling (SEM). The sample comprised 541 staff nurses in two healthcare organizations. The CFA (n=165) showed that a general usability factor accounted for 78.1%, 93.4%, 51.0%, and 39.9% of the explained variance in 'Quality of Work Life', 'Perceived Usefulness', 'Perceived Ease of Use', and 'User Control', respectively. The SEM (n=541) supported the predictive validity of Health-ITUES, explaining 64% of the variance in intention for system use. The results of CFA and SEM provide additional evidence for the construct and predictive validity of Health-ITUES. The customizability of Health-ITUES has the potential to support comparisons at the construct level, while allowing variation at the item level. We also illustrate application of Health-ITUES across stages of system development. Published by the BMJ Publishing Group Limited.
Development and validation of immune dysfunction score to predict 28-day mortality of sepsis patients

PubMed Central

Fang, Wen-Feng; Douglas, Ivor S.; Chen, Yu-Mu; Lin, Chiung-Yu; Kao, Hsu-Ching; Fang, Ying-Tang; Huang, Chi-Han; Chang, Ya-Ting; Huang, Kuo-Tung; Wang, Yi-His; Wang, Chin-Chou

2017-01-01

Background: Sepsis-induced immune dysfunction, ranging from cytokine storm to immunoparalysis, impacts outcomes. Monitoring immune dysfunction enables better risk stratification and mortality prediction and is mandatory before wide application of immunoadjuvant therapies. We aimed to develop and validate a scoring system according to patients’ immune dysfunction status for 28-day mortality prediction. Methods: A prospective observational study was conducted in a cohort of adult sepsis patients admitted to the ICU between August 2013 and June 2016 at Kaohsiung Chang Gung Memorial Hospital in Taiwan. We evaluated immune dysfunction status through measurement of baseline plasma cytokine levels, monocyte human leukocyte-DR expression by flow cytometry, and stimulated immune response using the post-LPS-stimulation cytokine elevation ratio. An immune dysfunction score was created for 28-day mortality prediction and was validated. Results: A total of 151 patients were enrolled. Data from the first 106 consecutive septic patients comprised the training cohort, and data from the other 45 patients comprised the validation cohort. Among the 106 patients, 21 died and 85 were still alive on day 28 after ICU admission (mortality rate, 19.8%). Independent predictive factors revealed via multivariate logistic regression analysis included segmented neutrophil-to-monocyte ratio, granulocyte-colony stimulating factor, interleukin-10, and monocyte human leukocyte antigen-antigen D–related levels, all of which were selected to construct the score, which predicted 28-day mortality with area under the curve of 0.853 and 0.789 in the training and validation cohorts, respectively. Conclusions: The immune dysfunction scoring system developed here, which included plasma granulocyte-colony stimulating factor level, interleukin-10 level, serum segmented neutrophil-to-monocyte ratio, and monocyte human leukocyte antigen-antigen D–related expression, appears valid and reproducible for predicting 28-day mortality. PMID:29073262
Predicting functional outcomes among college drinkers: reliability and predictive validity of the Young Adult Alcohol Consequences Questionnaire.

PubMed

Read, Jennifer P; Merrill, Jennifer E; Kahler, Christopher W; Strong, David R

2007-11-01

Heavy drinking and associated consequences are widespread among U.S. college students. Recently, Read et al. (Read, J. P., Kahler, C. W., Strong, D., & Colder, C. R. (2006). Development and preliminary validation of the Young Adult Alcohol Consequences Questionnaire. Journal of Studies on Alcohol, 67, 169-178) developed the Young Adult Alcohol Consequences Questionnaire (YAACQ) to assess the broad range of consequences that may result from heavy drinking in the college milieu. In the present study, we sought to add to the psychometric validation of this measure by employing a prospective design to examine the test-retest reliability, concurrent validity, and predictive validity of the YAACQ. We also sought to examine the utility of the YAACQ administered early in the semester in the prediction of functional outcomes later in the semester, including the persistence of heavy drinking, and academic functioning. Ninety-two college students (48 females) completed a self-report assessment battery during the first weeks of the Fall semester, and approximately one week later. Additionally, 64 subjects (37 females) participated at an optional third time point at the end of the semester. Overall, the YAACQ demonstrated strong internal consistency, test-retest reliability, and concurrent and predictive validity. YAACQ scores also were predictive of both drinking frequency, and "binge" drinking frequency. YAACQ total scores at baseline were an early indicator of academic performance later in the semester, with greater number of total consequences experienced being negatively associated with end-of-semester grade point average. Specific YAACQ subscale scores (Impaired Control, Dependence Symptoms, Blackout Drinking) showed unique prediction of persistent drinking and academic outcomes.
TESTING BALANCE AND FALL RISK IN PERSONS WITH PARKINSON DISEASE, AN ARGUMENT FOR ECOLOGICALLY VALID TESTING

PubMed Central

Foreman, K. Bo; Addison, Odessa; Kim, Han S.; Dibble, Leland E.

2010-01-01

Introduction: Despite clear deficits in postural control, most clinical examination tools lack accuracy in identifying persons with Parkinson disease (PD) who have fallen or are at risk for falls. We assert that this is in part due to the lack of ecological validity of the testing. Methods: To test this assertion, we examined the responsiveness and predictive validity of the Functional Gait Assessment (FGA), the Pull test, and the Timed Up and Go (TUG) during clinically defined ON and OFF medication states. To address responsiveness, ON/OFF medication performance was compared. To address predictive validity, areas under the curve (AUC) of receiver operating characteristic (ROC) curves were compared. Comparisons were made using separate non-parametric tests. Results: Thirty-six persons (24 male, 12 female) with PD (22 fallers, 14 non-fallers) participated. Only the FGA was able to detect differences between fallers and non-fallers for both ON/OFF medication testing. The predictive validity of the FGA and the TUG for fall identification was higher during OFF medication compared to ON medication testing. The predictive validity of the FGA was higher than the TUG and the Pull test during ON and OFF medication testing. Discussion: In order to most accurately identify fallers, clinicians should test persons with PD in ecologically relevant conditions and tasks. In this study, interpretation of the OFF medication performance and use of the FGA provided more accurate prediction of those who would fall. PMID:21215674
Validation of the Beck Hopelessness Scale in patients with suicide risk.

PubMed

Rueda-Jaimes, German Eduardo; Castro-Rueda, Vanessa Alexandra; Rangel-Martínez-Villalba, Andrés Mauricio; Moreno-Quijano, Catalina; Martinez-Salazar, Gustavo Adolfo; Camacho, Paul Anthony

Only a few scales have been validated in Spanish for the assessment of suicide risk, and none of them have achieved predictive validity. To determine the validity and reliability of the Beck Hopelessness Scale in patients with suicide risk attending the specialist clinic. The Beck Hopelessness Scale, reasons for living inventory, and the suicide behaviour questionnaire were applied in patients with suicide risk attending the psychiatric clinic and the emergency department. A new assessment was made 30 days later to determine the predictive validity of suicide or suicide attempt. The evaluation included a total of 244 patients, with a mean age of 30.7±13.2 years, and the majority were women. The internal consistency was .9 (Kuder-Richardson formula 20). Four dimensions were found which accounted for 50% of the variance. It was positively correlated with the suicidal behaviour questionnaire (Spearman .48, P<.001), number of suicide attempts (Spearman .25, P<.001), severity of suicide risk (Spearman .23, P<.001). The correlation with the reasons for living inventory was negative (Spearman -.52, P<.001). With a cut-off ≥12, the negative predictive value was 98.4% (95% CI: 94.2-99.8), and the positive predictive value was 14.8% (95% CI: 6.6-27.1). The Beck Hopelessness Scale in Colombian patients with suicidality shows results similar to the original version, with adequate reliability and moderate concurrent and predictive validity. Copyright © 2016 SEP y SEPB. Publicado por Elsevier España, S.L.U. All rights reserved.
Cross-cultural validation of the St. Louis Inventory of Community Living Skills for Chinese patients with schizophrenia in Hong Kong.

PubMed

Au, Raymond Wing Cheong; Tam, Peter Wai Chung; Tam, Gladys Wai Chi; Ungvari, Gabor Sander

2005-01-01

The study validated a culturally sensitive community living skills rating scale for Chinese patients by adapting the St. Louis Inventory of Community Living Skills (SLICLS). The Chinese version (SLICLS-C) was produced by forward and backward translation. An expert panel evaluated its content validity. Its internal consistency, inter-rater reliability, construct and concurrent validity were tested on 80 DSM-IV schizophrenia inpatients in a long-term facility. For predictive validity, the above sample was extended to ensure at least 20 subjects discharged to each of three levels of community care were included in the study sample. The SLICLS-C was psychometrically sound and could be used for predicting level of community care, program evaluation and measuring outcome.
A design of experiments approach to validation sampling for logistic regression modeling with error-prone medical records.

PubMed

Ouyang, Liwen; Apley, Daniel W; Mehrotra, Sanjay

2016-04-01

Electronic medical record (EMR) databases offer significant potential for developing clinical hypotheses and identifying disease risk associations by fitting statistical models that capture the relationship between a binary response variable and a set of predictor variables that represent clinical, phenotypical, and demographic data for the patient. However, EMR response data may be error prone for a variety of reasons. Performing a manual chart review to validate data accuracy is time consuming, which limits the number of chart reviews in a large database. The authors' objective is to develop a new design-of-experiments-based systematic chart validation and review (DSCVR) approach that is more powerful than the random validation sampling used in existing approaches. The DSCVR approach judiciously and efficiently selects the cases to validate (i.e., validate whether the response values are correct for those cases) for maximum information content, based only on their predictor variable values. The final predictive model will be fit using only the validation sample, ignoring the remainder of the unvalidated and unreliable error-prone data. A Fisher information based D-optimality criterion is used, and an algorithm for optimizing it is developed. The authors' method is tested in a simulation comparison that is based on a sudden cardiac arrest case study with 23 041 patients' records. This DSCVR approach, using the Fisher information based D-optimality criterion, results in a fitted model with much better predictive performance, as measured by the receiver operating characteristic curve and the accuracy in predicting whether a patient will experience the event, than a model fitted using a random validation sample. The simulation comparisons demonstrate that this DSCVR approach can produce predictive models that are significantly better than those produced from random validation sampling, especially when the event rate is low. © The Author 2015. 
Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
The reliability and validity of ultrasound to quantify muscles in older adults: a systematic review

PubMed Central

Scafoglieri, Aldo; Jager‐Wittenaar, Harriët; Hobbelen, Johannes S.M.; van der Schans, Cees P.

2017-01-01

Abstract This review evaluates the reliability and validity of ultrasound to quantify muscles in older adults. The databases PubMed, Cochrane, and Cumulative Index to Nursing and Allied Health Literature were systematically searched for studies. In 17 studies, the reliability (n = 13) and validity (n = 8) of ultrasound to quantify muscles in community‐dwelling older adults (≥60 years) or a clinical population were evaluated. Four out of 13 reliability studies investigated both intra‐rater and inter‐rater reliability. Intraclass correlation coefficient (ICC) scores for reliability ranged from −0.26 to 1.00. The highest ICC scores were found for the vastus lateralis, rectus femoris, upper arm anterior, and the trunk (ICC = 0.72 to 1.000). All included validity studies found ICC scores ranging from 0.92 to 0.999. Two studies describing the validity of ultrasound to predict lean body mass showed good validity as compared with dual‐energy X‐ray absorptiometry (r² = 0.92 to 0.96). This systematic review shows that ultrasound is a reliable and valid tool for the assessment of muscle size in older adults. More high‐quality research is required to confirm these findings in both clinical and healthy populations. Furthermore, ultrasound assessment of small muscles needs further evaluation. Ultrasound to predict lean body mass is feasible; however, future research is required to validate prediction equations in older adults with varying function and health. PMID:28703496
Test-Retest Reliability and Predictive Validity of the Implicit Association Test in Children

ERIC Educational Resources Information Center

Rae, James R.; Olson, Kristina R.

2018-01-01

The Implicit Association Test (IAT) is increasingly used in developmental research despite minimal evidence of whether children's IAT scores are reliable across time or predictive of behavior. When test-retest reliability and predictive validity have been assessed, the results have been mixed, and because these studies have differed on many…
Progress Towards a Microgravity CFD Validation Study Using the ISS SPHERES-SLOSH Experiment

NASA Technical Reports Server (NTRS)

Storey, Jedediah M.; Kirk, Daniel; Marsell, Brandon (Editor); Schallhorn, Paul (Editor)

2017-01-01

Understanding, predicting, and controlling fluid slosh dynamics is critical to safety and improving performance of space missions when a significant percentage of the spacecraft's mass is a liquid. Computational fluid dynamics simulations can be used to predict the dynamics of slosh, but these programs require extensive validation. Many CFD programs have been validated by slosh experiments using various fluids in Earth gravity, but prior to the ISS SPHERES-Slosh experiment, little experimental data for long-duration, zero-gravity slosh existed. This paper presents the current status of an ongoing CFD validation study using the ISS SPHERES-Slosh experimental data.
Progress Towards a Microgravity CFD Validation Study Using the ISS SPHERES-SLOSH Experiment

NASA Technical Reports Server (NTRS)

Storey, Jed; Kirk, Daniel (Editor); Marsell, Brandon (Editor); Schallhorn, Paul (Editor)

2017-01-01

Understanding, predicting, and controlling fluid slosh dynamics is critical to safety and improving performance of space missions when a significant percentage of the spacecraft's mass is a liquid. Computational fluid dynamics simulations can be used to predict the dynamics of slosh, but these programs require extensive validation. Many CFD programs have been validated by slosh experiments using various fluids in Earth gravity, but prior to the ISS SPHERES-Slosh experiment, little experimental data for long-duration, zero-gravity slosh existed. This paper presents the current status of an ongoing CFD validation study using the ISS SPHERES-Slosh experimental data.
External validation of a 5-year survival prediction model after elective abdominal aortic aneurysm repair.

PubMed

DeMartino, Randall R; Huang, Ying; Mandrekar, Jay; Goodney, Philip P; Oderich, Gustavo S; Kalra, Manju; Bower, Thomas C; Cronenwett, Jack L; Gloviczki, Peter

2018-01-01

The benefit of prophylactic repair of abdominal aortic aneurysms (AAAs) is based on the risk of rupture exceeding the risk of death from other comorbidities. The purpose of this study was to validate a 5-year survival prediction model for patients undergoing elective repair of asymptomatic AAA <6.5 cm to assist in optimal selection of patients. All patients undergoing elective repair for asymptomatic AAA <6.5 cm (open or endovascular) from 2002 to 2011 were identified from a single institutional database (validation group). We assessed the ability of a prior published Vascular Study Group of New England (VSGNE) model (derivation group) to predict survival in our cohort. The model was assessed for discrimination (concordance index), calibration (calibration slope and calibration in the large), and goodness of fit (score test). The VSGNE derivation group consisted of 2367 patients (70% endovascular). Major factors associated with survival in the derivation group were age, coronary disease, chronic obstructive pulmonary disease, renal function, and antiplatelet and statin medication use. Our validation group consisted of 1038 patients (59% endovascular). The validation group was slightly older (74 vs 72 years; P < .01) and had a higher proportion of men (76% vs 68%; P < .01). In addition, the derivation group had higher rates of advanced cardiac disease, chronic obstructive pulmonary disease, and baseline creatinine concentration (1.2 vs 1.1 mg/dL; P < .01). Despite slight differences in preoperative patient factors, 5-year survival was similar between validation and derivation groups (75% vs 77%; P = .33). The concordance index was identical in the derivation and validation groups at 0.659 (95% confidence interval, 0.63-0.69). In our validation group, the calibration-in-the-large value was 1.02 (P = .62; closer to 1 indicates better calibration), the calibration slope was 0.84 (95% confidence interval, 0.71-0.97), and the score test gave P = .57 (>.05 indicates goodness of fit). Across different populations of patients, assessment of age and level of cardiac, pulmonary, and renal disease can accurately predict 5-year survival in patients with AAA <6.5 cm undergoing repair. This risk prediction model is a valid method to assess mortality risk in determining potential overall survival benefit from elective AAA repair. Copyright © 2017 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.
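Discrimination in a survival setting, as summarized by the concordance index above, can be sketched with Harrell's C, one common estimator. The abstract does not state which estimator or software was used, so this is an illustration under assumptions: the function name and the four-patient data set are hypothetical, and a higher risk score is taken to mean shorter expected survival.

```python
def concordance_index(times, events, risks):
    """Harrell's C for right-censored data.

    A pair (i, j) is comparable when patient i has an observed event
    (events[i] == 1) strictly before patient j's follow-up time. The pair
    is concordant when the patient who died earlier has the higher risk;
    tied risks count one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up in years, event flags (1 = died), predicted risks.
c = concordance_index(times=[2, 4, 6, 8],
                      events=[1, 1, 0, 1],
                      risks=[0.9, 0.3, 0.5, 0.1])
print(round(c, 3))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by survival time.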
Do Implicit Attitudes Predict Actual Voting Behavior Particularly for Undecided Voters?

PubMed Central

Friese, Malte; Smith, Colin Tucker; Plischke, Thomas; Bluemke, Matthias; Nosek, Brian A.

2012-01-01

The prediction of voting behavior of undecided voters poses a challenge to psychologists and pollsters. Recently, researchers argued that implicit attitudes would predict voting behavior particularly for undecided voters whereas explicit attitudes would predict voting behavior particularly for decided voters. We tested this assumption in two studies in two countries with distinct political systems in the context of real political elections. Results revealed that (a) explicit attitudes predicted voting behavior better than implicit attitudes for both decided and undecided voters, and (b) implicit attitudes predicted voting behavior better for decided than undecided voters. We propose that greater elaboration of attitudes produces stronger convergence between implicit and explicit attitudes resulting in better predictive validity of both, and less incremental validity of implicit over explicit attitudes for the prediction of voting behavior. However, greater incremental predictive validity of implicit over explicit attitudes may be associated with less elaboration. PMID:22952898

Challenges in Rotorcraft Acoustic Flight Prediction and Validation

NASA Technical Reports Server (NTRS)

Boyd, D. Douglas, Jr.

2003-01-01

Challenges associated with rotorcraft acoustic flight prediction and validation are examined. First, an outline of a state-of-the-art rotorcraft aeroacoustic prediction methodology is presented. Components including rotorcraft aeromechanics, high resolution reconstruction, and rotorcraft acoustic prediction are discussed. Next, to illustrate challenges and issues involved, a case study is presented in which an analysis of flight data from a specific XV-15 tiltrotor acoustic flight test is discussed in detail. Issues related to validation of methodologies using flight test data are discussed. Primary flight parameters such as velocity, altitude, and attitude are discussed and compared for repeated flight conditions. Other measured steady state flight conditions are examined for consistency and steadiness. A representative example prediction is presented and suggestions are made for future research.
Cross-validation pitfalls when selecting and assessing regression and classification models.

PubMed

Krstajic, Damjan; Buturovic, Ljubomir J; Leahy, David E; Thomas, Simon

2014-03-29

We address the problem of selecting and assessing classification and regression models using cross-validation. Current state-of-the-art methods can yield models with high variance, rendering them unsuitable for a number of practical applications including QSAR. In this paper we describe and evaluate best practices which improve reliability and increase confidence in selected models. A key operational component of the proposed methods is cloud computing which enables routine use of previously infeasible approaches. We describe in detail an algorithm for repeated grid-search V-fold cross-validation for parameter tuning in classification and regression, and we define a repeated nested cross-validation algorithm for model assessment. As regards variable selection and parameter tuning we define two algorithms (repeated grid-search cross-validation and double cross-validation), and provide arguments for using the repeated grid-search in the general case. We show results of our algorithms on seven QSAR datasets. The variation of the prediction performance, which is the result of choosing different splits of the dataset in V-fold cross-validation, needs to be taken into account when selecting and assessing classification and regression models. We demonstrate the importance of repeating cross-validation when selecting an optimal model, as well as the importance of repeating nested cross-validation when assessing a prediction error.
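The point that a V-fold cross-validation estimate depends on the particular split can be seen by repeating the procedure with different random partitions. The sketch below is not the authors' repeated grid-search or nested algorithm: it has no tuning step and only shows the split-to-split variation, using a one-variable least-squares model and synthetic data that are assumptions made for illustration.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def vfold_mse(xs, ys, v, rng):
    """One V-fold cross-validation run; returns the pooled mean squared error."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)                        # a new random partition on each call
    folds = [idx[k::v] for k in range(v)]
    sq_errors = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_errors.extend((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test)
    return mean(sq_errors)


def repeated_vfold(xs, ys, v=5, repeats=20, seed=0):
    """Repeat V-fold cross-validation with different splits."""
    rng = random.Random(seed)
    return [vfold_mse(xs, ys, v, rng) for _ in range(repeats)]


# Synthetic data: a noisy straight line (illustrative only).
noise = random.Random(1)
xs = [i / 10 for i in range(40)]
ys = [2 * x + 1 + noise.gauss(0, 0.5) for x in xs]
estimates = repeated_vfold(xs, ys)
print(f"mean MSE {mean(estimates):.3f}, "
      f"range {min(estimates):.3f} to {max(estimates):.3f}")
```

Reporting the spread of the repeated estimates, rather than a single run, is one simple way to take the split dependence into account.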
Validation of the Retinal Detachment after Open Globe Injury (RD-OGI) Score as an Effective Tool for Predicting Retinal Detachment.

PubMed

Brodowska, Katarzyna; Stryjewski, Tomasz P; Papavasileiou, Evangelia; Chee, Yewlin E; Eliott, Dean

2017-05-01

The Retinal Detachment after Open Globe Injury (RD-OGI) Score is a clinical prediction model that was developed at the Massachusetts Eye and Ear Infirmary to predict the risk of retinal detachment (RD) after open globe injury (OGI). This study sought to validate the RD-OGI Score in an independent cohort of patients. Retrospective cohort study. The predictive value of the RD-OGI Score was evaluated by comparing the original RD-OGI Scores of 893 eyes with OGI that presented between 1999 and 2011 (the derivation cohort) with 184 eyes with OGI that presented from January 1, 2012, to January 31, 2014 (the validation cohort). Three risk classes (low, moderate, and high) were created and logistic regression was undertaken to evaluate the optimal predictive value of the RD-OGI Score. A Kaplan-Meier survival analysis evaluated survival experience between the risk classes. Time to RD. At 1 year after OGI, 255 eyes (29%) in the derivation cohort and 66 eyes (36%) in the validation cohort were diagnosed with an RD. At 1 year, the low risk class (RD-OGI Scores 0-2) had a 3% detachment rate in the derivation cohort and a 0% detachment rate in the validation cohort, the moderate risk class (RD-OGI Scores 2.5-4.5) had a 29% detachment rate in the derivation cohort and a 35% detachment rate in the validation cohort, and the high risk class (RD-OGI scores 5-7.5) had a 73% detachment rate in the derivation cohort and an 86% detachment rate in the validation cohort. Regression modeling revealed the RD-OGI to be highly discriminative, especially 30 days after injury, with an area under the receiver operating characteristic curve of 0.939 in the validation cohort. Survival experience was significantly different depending upon the risk class (P < 0.0001, log-rank chi-square). The RD-OGI Score can reliably predict the future risk of developing an RD based on clinical variables that are present at the time of the initial evaluation after OGI. 
Copyright © 2017 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
Supervisor support in the work place: legitimacy and positive affectivity.

PubMed

Yoon, J; Thye, S

2000-06-01

The authors tested 3 hypotheses regarding supervisor support in the work place. The validation hypothesis predicts that when employees are supported by their coworkers and the larger organization, they also receive more support from their supervisors. The positive affectivity hypothesis predicts that employees with positive dispositions receive more supervisor support because they are more socially oriented and likable. The moderation hypothesis predicts a joint multiplicative effect between validation and positive affectivity. An assessment of the hypotheses among a sample of 1,882 hospital employees in Korea provided strong support for the validation and moderation hypotheses.
The Validity and Incremental Validity of Knowledge Tests, Low-Fidelity Simulations, and High-Fidelity Simulations for Predicting Job Performance in Advanced-Level High-Stakes Selection

ERIC Educational Resources Information Center

Lievens, Filip; Patterson, Fiona

2011-01-01

In high-stakes selection among candidates with considerable domain-specific knowledge and experience, investigations of whether high-fidelity simulations (assessment centers; ACs) have incremental validity over low-fidelity simulations (situational judgment tests; SJTs) are lacking. Therefore, this article integrates research on the validity of…
Development and validation of a predictive equation for lean body mass in children and adolescents.

PubMed

Foster, Bethany J; Platt, Robert W; Zemel, Babette S

2012-05-01

Lean body mass (LBM) is not easy to measure directly in the field or clinical setting. Equations to predict LBM from simple anthropometric measures, which account for the differing contributions of fat and lean to body weight at different ages and levels of adiposity, would be useful to both human biologists and clinicians. To develop and validate equations to predict LBM in children and adolescents across the entire range of the adiposity spectrum. Dual energy X-ray absorptiometry was used to measure LBM in 836 healthy children (437 females) and linear regression was used to develop sex-specific equations to estimate LBM from height, weight, age, body mass index (BMI) for age z-score and population ancestry. Equations were validated using bootstrapping methods and in a local independent sample of 332 children and in national data collected by NHANES. The mean difference between measured and predicted LBM was −0.12% (95% limits of agreement −11.3% to 8.5%) for males and −0.14% (−11.9% to 10.9%) for females. Equations performed equally well across the entire adiposity spectrum, as estimated by BMI z-score. Validation indicated no over-fitting. LBM was predicted within 5% of measured LBM in the validation sample. The equations estimate LBM accurately from simple anthropometric measures.
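The mean difference and 95% limits of agreement quoted above are the standard summary of a Bland-Altman comparison. The sketch below uses the textbook form (mean difference plus or minus 1.96 standard deviations) on percentage differences. The abstract does not give the exact computation or sign convention, so the direction (predicted minus measured) and the data are assumptions made for illustration.

```python
from statistics import mean, stdev


def limits_of_agreement(measured, predicted):
    """Return (bias, lower, upper) for percentage differences.

    Each difference is 100 * (predicted - measured) / measured; the limits
    are bias +/- 1.96 * SD of those differences.
    """
    diffs = [100.0 * (p - m) / m for m, p in zip(measured, predicted)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical measured and predicted lean body mass values (kg).
bias, lower, upper = limits_of_agreement(
    measured=[20.0, 30.0, 40.0, 50.0],
    predicted=[21.0, 29.4, 40.0, 49.0],
)
print(f"bias {bias:.2f}%, limits {lower:.2f}% to {upper:.2f}%")
```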
The development and testing of a skin tear risk assessment tool.

PubMed

Newall, Nelly; Lewin, Gill F; Bulsara, Max K; Carville, Keryln J; Leslie, Gavin D; Roberts, Pam A

2017-02-01

The aim of the present study is to develop a reliable and valid skin tear risk assessment tool. The six characteristics identified in a previous case control study as constituting the best risk model for skin tear development were used to construct a risk assessment tool. The ability of the tool to predict skin tear development was then tested in a prospective study. Between August 2012 and September 2013, 1466 tertiary hospital patients were assessed at admission and followed up for 10 days to see if they developed a skin tear. The predictive validity of the tool was assessed using receiver operating characteristic (ROC) analysis. When the tool was found not to have performed as well as hoped, secondary analyses were performed to determine whether a potentially better performing risk model could be identified. The tool was found to have high sensitivity but low specificity and therefore have inadequate predictive validity. Secondary analysis of the combined data from this and the previous case control study identified an alternative better performing risk model. The tool developed and tested in this study was found to have inadequate predictive validity. The predictive validity of an alternative, more parsimonious model now needs to be tested. © 2015 Medicalhelplines.com Inc and John Wiley & Sons Ltd.
A calibration hierarchy for risk models was defined: from utopia to empirical data.

PubMed

Van Calster, Ben; Nieboer, Daan; Vergouwe, Yvonne; De Cock, Bavo; Pencina, Michael J; Steyerberg, Ewout W

2016-06-01

Calibrated risk models are vital for valid decision support. We define four levels of calibration and describe implications for model development and external validation of predictions. We present results based on simulated data sets. A common definition of calibration is "having an event rate of R% among patients with a predicted risk of R%," which we refer to as "moderate calibration." Weaker forms of calibration only require the average predicted risk (mean calibration) or the average prediction effects (weak calibration) to be correct. "Strong calibration" requires that the event rate equals the predicted risk for every covariate pattern. This implies that the model is fully correct for the validation setting. We argue that this is unrealistic: the model type may be incorrect, the linear predictor is only asymptotically unbiased, and all nonlinear and interaction effects should be correctly modeled. In addition, we prove that moderate calibration guarantees nonharmful decision making. Finally, results indicate that a flexible assessment of calibration in small validation data sets is problematic. Strong calibration is desirable for individualized decision support but unrealistic and counter productive by stimulating the development of overly complex models. Model development and external validation should focus on moderate calibration. Copyright © 2016 Elsevier Inc. All rights reserved.
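Two of the calibration levels defined above can be checked with very little code. In this minimal sketch, mean calibration compares the average predicted risk with the overall event rate, and a grouped comparison of predicted risk against observed event rate gives a crude view of moderate calibration. The function names and the eight-patient data set are hypothetical, and grouping is only one of several possible ways to assess moderate calibration.

```python
def mean_calibration(risks, outcomes):
    """Observed event rate minus average predicted risk (0 = mean calibration)."""
    return sum(outcomes) / len(outcomes) - sum(risks) / len(risks)


def grouped_calibration(risks, outcomes, n_groups=4):
    """Sort patients by predicted risk, cut them into equal-sized groups, and
    return (mean predicted risk, observed event rate) for each group."""
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    bounds = [len(order) * k // n_groups for k in range(n_groups + 1)]
    table = []
    for lo, hi in zip(bounds, bounds[1:]):
        group = order[lo:hi]
        if group:
            table.append((sum(risks[i] for i in group) / len(group),
                          sum(outcomes[i] for i in group) / len(group)))
    return table


# Hypothetical predicted risks and observed outcomes (1 = event).
risks = [0.1, 0.1, 0.2, 0.2, 0.6, 0.6, 0.8, 0.8]
outcomes = [0, 0, 0, 1, 1, 0, 1, 1]
print(round(mean_calibration(risks, outcomes), 3))
for predicted, observed in grouped_calibration(risks, outcomes, n_groups=2):
    print(f"predicted {predicted:.2f}  observed {observed:.2f}")
```

With small validation sets each group contains few events, so the observed rates are noisy, which is consistent with the abstract's caution about flexible calibration assessment in small data sets.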
Validation of nomograms for overall survival, cancer-specific survival, and recurrence in carcinoma of the major salivary glands.

PubMed

Hay, Ashley; Migliacci, Jocelyn; Zanoni, Daniella Karassawa; Patel, Snehal; Yu, Changhong; Kattan, Michael W; Ganly, Ian

2018-05-01

The purpose of this study was to investigate the performance of the Memorial Sloan Kettering Cancer Center salivary carcinoma nomograms predicting overall survival, cancer-specific survival, and recurrence with an external validation dataset. The validation dataset comprised 123 patients treated between 2010 and 2015 at our institution. The nomograms were evaluated by assessing discrimination (concordance index [C-index]) and calibration (plotting predicted vs actual probabilities for quintiles). The validation cohort (n = 123) differed in some respects from the original cohort (n = 301). The validation cohort had fewer high-grade cancers (P = .006), less lymphovascular invasion (LVI; P < .001), and a shorter follow-up (19 months versus 45.6 months). Validation showed a C-index of 0.833 (95% confidence interval [CI] 0.758-0.908), 0.807 (95% CI 0.717-0.898), and 0.844 (95% CI 0.768-0.920) for overall survival, cancer-specific survival, and recurrence, respectively. The 3 salivary gland nomograms performed well using a contemporary validation dataset, despite limitations related to sample size, follow-up, and differences in clinical and pathology characteristics between the original and validation cohorts. © 2018 Wiley Periodicals, Inc.
Sharing reference data and including cows in the reference population improve genomic predictions in Danish Jersey.

PubMed

Su, G; Ma, P; Nielsen, U S; Aamand, G P; Wiggans, G; Guldbrandtsen, B; Lund, M S

2016-06-01

Small reference populations limit the accuracy of genomic prediction in numerically small breeds, such as Danish Jersey. The objective of this study was to investigate two approaches to improve genomic prediction by increasing the size of the reference population in Danish Jersey. The first approach was to include North American Jersey bulls in the Danish Jersey reference population. The second was to genotype cows and use them as reference animals. The validation of genomic prediction was carried out on bulls and cows, respectively. In validation on bulls, about 300 Danish bulls (depending on traits) born in 2005 and later were used as validation data, and the reference populations were: (1) about 1050 Danish bulls, (2) about 1050 Danish bulls and about 1150 US bulls. In validation on cows, about 3000 Danish cows from 87 young half-sib families were used as validation data, and the reference populations were: (1) about 1250 Danish bulls, (2) about 1250 Danish bulls and about 1150 US bulls, (3) about 1250 Danish bulls and about 4800 cows, (4) about 1250 Danish bulls, 1150 US bulls and 4800 Danish cows. A genomic best linear unbiased prediction model was used to predict breeding values. De-regressed proofs were used as response variables. In the validation on bulls for eight traits, the joint DK-US bull reference population led to higher reliability of genomic prediction than the DK bull reference population for six traits, but not for fertility and longevity. Averaged over the eight traits, the gain was 3 percentage points. In the validation on cows for six traits (fertility and longevity were not available), the gain from inclusion of US bulls in the reference population was 6.6 percentage points on average over the six traits, and the gain from inclusion of cows was 8.2 percentage points. However, the gains from cows and US bulls were not additive: the total gain from including both US bulls and Danish cows was 10.5 percentage points. The results indicate that sharing reference data and including cows in the reference population are efficient approaches to increase the reliability of genomic prediction. Therefore, genomic selection is promising for numerically small populations.
Joint use of over- and under-sampling techniques and cross-validation for the development and assessment of prediction models.

PubMed

Blagus, Rok; Lusa, Lara

2015-11-04

Prediction models are used in clinical research to develop rules that can be used to accurately predict the outcome of the patients based on some of their characteristics. They represent a valuable tool in the decision making process of clinicians and health policy makers, as they enable them to estimate the probability that patients have or will develop a disease, will respond to a treatment, or that their disease will recur. The interest devoted to prediction models in the biomedical community has been growing in the last few years. Often the data used to develop the prediction models are class-imbalanced as only few patients experience the event (and therefore belong to minority class). Prediction models developed using class-imbalanced data tend to achieve sub-optimal predictive accuracy in the minority class. This problem can be diminished by using sampling techniques aimed at balancing the class distribution. These techniques include under- and oversampling, where a fraction of the majority class samples are retained in the analysis or new samples from the minority class are generated. The correct assessment of how the prediction model is likely to perform on independent data is of crucial importance; in the absence of an independent data set, cross-validation is normally used. While the importance of correct cross-validation is well documented in the biomedical literature, the challenges posed by the joint use of sampling techniques and cross-validation have not been addressed. We show that care must be taken to ensure that cross-validation is performed correctly on sampled data, and that the risk of overestimating the predictive accuracy is greater when oversampling techniques are used. Examples based on the re-analysis of real datasets and simulation studies are provided. 
We identify some results from the biomedical literature where cross-validation was performed incorrectly and where we expect that the performance of oversampling techniques was heavily overestimated.
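One way the joint use of oversampling and cross-validation can go wrong is sketched below. If minority-class cases are duplicated before the data are split into folds, copies of the same patient can land in both the training and the test part of a split. If each training part is oversampled after splitting, this cannot happen. The code only counts such leaks and fits no model. Simple random duplication stands in for oversampling, label 1 is assumed to be the minority class, and all names and data are hypothetical.

```python
import random


def oversample_minority(rows, rng):
    """Duplicate random minority rows (label 1) until both classes are equal
    in size. Each row is a (patient_id, label) pair."""
    minority = [r for r in rows if r[1] == 1]
    majority = [r for r in rows if r[1] == 0]
    if not minority:
        return list(rows)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(rows) + extra


def split_folds(rows, v, rng):
    """Shuffle the rows and deal them into v folds."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return [shuffled[k::v] for k in range(v)]


def count_leaks(folds, resample_training, rng):
    """Count test rows whose patient id also appears in the training part."""
    leaks = 0
    for k, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != k for r in fold]
        if resample_training:
            train = oversample_minority(train, rng)  # training part only
        train_ids = {r[0] for r in train}
        leaks += sum(1 for r in test if r[0] in train_ids)
    return leaks


rng = random.Random(0)
data = [(i, 1 if i < 10 else 0) for i in range(100)]  # 10% minority class

# Incorrect order: oversample first, then split.
leaky = count_leaks(split_folds(oversample_minority(data, rng), 5, rng), False, rng)
# Correct order: split first, then oversample each training part.
clean = count_leaks(split_folds(data, 5, rng), True, rng)
print("leaked test rows:", leaky, "(oversample first) vs", clean, "(split first)")
```

Leaked duplicates let a model be tested on cases it has already seen, which is one mechanism by which predictive accuracy can be overestimated.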
Prediction of Primary Care Depression Outcomes at Six Months: Validation of DOC-6 ©.

PubMed

Angstman, Kurt B; Garrison, Gregory M; Gonzalez, Cesar A; Cozine, Daniel W; Cozine, Elizabeth W; Katzelnick, David J

2017-01-01

The goal of this study was to develop and validate an assessment tool for adult primary care patients diagnosed with depression to determine predictive probability of clinical outcomes at 6 months. We retrospectively reviewed 3096 adult patients enrolled in collaborative care management (CCM) for depression. Patients enrolled on or before December 31, 2013, served as the training set (n = 2525), whereas those enrolled after that date served as the preliminary validation set (n = 571). Six variables (2 demographic and 4 clinical) were statistically significant in determining clinical outcomes. Using the validation data set, the remission classifier produced a receiver operating characteristic (ROC) curve with a c-statistic or area under the curve (AUC) of 0.62 and predicted probabilities that ranged from 14.5% to 79.1%, with a median of 50.6%. The persistent depressive symptoms (PDS) classifier produced an ROC curve with a c-statistic or AUC of 0.67 and predicted probabilities that ranged from 5.5% to 73.1%, with a median of 23.5%. We were able to identify readily available variables and then validated these in the prediction of depression remission and PDS at 6 months. The DOC-6 tool may be used to predict which patients may be at risk for worse outcomes. © Copyright 2017 by the American Board of Family Medicine.
Development and validation of a predictive score for perioperative transfusion in patients with hepatocellular carcinoma undergoing liver resection.

PubMed

Wang, Hai-Qing; Yang, Jian; Yang, Jia-Yin; Wang, Wen-Tao; Yan, Lu-Nan

2015-08-01

Liver resection is a major surgery requiring perioperative blood transfusion. Predicting the need for blood transfusion for patients undergoing liver resection is of great importance. The present study aimed to develop and validate a model for predicting transfusion requirement in HBV-related hepatocellular carcinoma patients undergoing liver resection. A total of 1543 consecutive liver resections were included in the study. A randomly selected sample of 1080 cases (70% of the study cohort) was used to develop a predictive score for transfusion requirement, and the remaining 30% (n=463) was used to validate the score. Based on the preoperative and predictable intraoperative parameters, logistic regression was used to identify risk factors and to create an integer score for the prediction of transfusion requirement. Extrahepatic procedure, major liver resection, hemoglobin level and platelet count were identified as independent predictors for transfusion requirement by logistic regression analysis. A scoring system integrating these 4 factors was stratified into three groups that predicted the risk of transfusion, with rates of 11.4%, 24.7% and 57.4% for low, moderate and high risk, respectively. The prediction model appeared accurate with good discriminatory abilities, generating an area under the receiver operating characteristic curve of 0.736 in the development set and 0.709 in the validation set. We have developed and validated an integer-based risk score to predict perioperative transfusion for patients undergoing liver resection in a high-volume surgical center. This score allows identifying patients at a high risk and may alter transfusion practices.
Reliability and Validity of the Work and Well-Being Inventory (WBI) for Employees.

PubMed

Vendrig, A A; Schaafsma, F G

2018-06-01

Purpose The purpose of this study is to measure the psychometric properties of the Work and Wellbeing Inventory (WBI) (in Dutch: VAR-2), a screening tool that is used within occupational health care and rehabilitation. Our research question focused on the reliability and validity of this inventory. Methods Over the years, seven different samples of workers, patients and sick-listed workers, varying in size between 89 and 912 participants (total: 2514), were used to measure the test-retest reliability, the internal consistency, the construct and concurrent validity, and the criterion and predictive validity. Results The 13 scales displayed good internal consistency and test-retest reliability. The construct validity of the WBI could clearly be demonstrated in both patients and healthy workers. Confirmatory factor analyses revealed a CFI >.90 for all scales. The depression scale predicted future work absenteeism (>6 weeks) because of a common mental disorder in healthy workers. The job strain scale and the illness behavior scale predicted long-term absenteeism (>3 months) in workers with short-term absenteeism. The illness behavior scale moderately predicted return to work in rehab patients attending an intensive multidisciplinary program. Conclusions The WBI is a valid and reliable tool for occupational health practitioners to screen for risk factors for prolonged or future sickness absence. With this tool they will have reliable indications for further advice and interventions to restore work ability.
Real external predictivity of QSAR models: how to evaluate it? Comparison of different validation criteria and proposal of using the concordance correlation coefficient.

PubMed

Chirico, Nicola; Gramatica, Paola

2011-09-26

The main utility of QSAR models is their ability to predict activities/properties for new chemicals, and this external prediction ability is evaluated by means of various validation criteria. As a measure for such evaluation the OECD guidelines have proposed the predictive squared correlation coefficient Q²_F1 (Shi et al.). However, other validation criteria have been proposed by other authors: the Golbraikh-Tropsha method, r²_m (Roy), Q²_F2 (Schüürmann et al.), Q²_F3 (Consonni et al.). In QSAR studies these measures are usually in accordance, though this is not always the case, thus doubts can arise when contradictory results are obtained. It is likely that none of the aforementioned criteria is the best in every situation, so a comparative study using simulated data sets is proposed here, using threshold values suggested by the proponents or those widely used in QSAR modeling. In addition, a different and simple external validation measure, the concordance correlation coefficient (CCC), is proposed and compared with other criteria. Huge data sets were used to study the general behavior of validation measures, and the concordance correlation coefficient was shown to be the most restrictive. On using simulated data sets of a more realistic size, it was found that CCC was broadly in agreement, about 96% of the time, with other validation measures in accepting models as predictive, and in almost all the examples it was the most precautionary. The proposed concordance correlation coefficient also works well on real data sets, where it seems to be more stable, and helps in making decisions when the validation measures are in conflict. Since it is conceptually simple, and given its stability and restrictiveness, we propose the concordance correlation coefficient as a complementary, or alternative, more prudent measure of a QSAR model to be externally predictive.
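For readers unfamiliar with the concordance correlation coefficient, a minimal sketch follows. It uses Lin's standard definition, CCC = 2·S_xy / (S_xx + S_yy + n·(mean_x − mean_y)²), where S denotes sums of squared or cross deviations. The function name `concordance_correlation` and the toy values are assumptions for illustration and are not taken from the paper.

```python
def concordance_correlation(observed, predicted):
    """Lin's concordance correlation coefficient between two equal-length
    sequences. Unlike Pearson's r, it penalizes systematic shifts and
    scale differences between predictions and observations."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    s_xy = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    s_xx = sum((o - mean_obs) ** 2 for o in observed)
    s_yy = sum((p - mean_pred) ** 2 for p in predicted)
    return 2.0 * s_xy / (s_xx + s_yy + n * (mean_obs - mean_pred) ** 2)


# Toy values: predictions that are perfectly correlated with the
# observations but shifted by +1 have Pearson r = 1 and CCC = 4/7.
print(concordance_correlation([1, 2, 3], [1, 2, 3]))
print(concordance_correlation([1, 2, 3], [2, 3, 4]))
```

The second call shows why the measure is described as restrictive: a constant offset that leaves the ordinary correlation untouched still lowers the CCC.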
Determination of the criterion-related validity of hip joint angle test for estimating hamstring flexibility using a contemporary statistical approach.

PubMed

Sainz de Baranda, Pilar; Rodríguez-Iniesta, María; Ayala, Francisco; Santonja, Fernando; Cejudo, Antonio

2014-07-01

To examine the criterion-related validity of the horizontal hip joint angle (H-HJA) test and vertical hip joint angle (V-HJA) test for estimating hamstring flexibility measured through the passive straight-leg raise (PSLR) test using contemporary statistical measures. Validity study. Controlled laboratory environment. One hundred thirty-eight professional trampoline gymnasts (61 women and 77 men). Hamstring flexibility. Each participant performed 2 trials of H-HJA, V-HJA, and PSLR tests in a randomized order. The criterion-related validity of H-HJA and V-HJA tests was measured through the estimation equation, typical error of the estimate (TEEST), validity correlation (β), and their respective confidence limits. The findings from this study suggest that although H-HJA and V-HJA tests showed moderate to high validity scores for estimating hamstring flexibility (standardized TEEST = 0.63; β = 0.80), the TEEST statistic reported for both tests was not narrow enough for clinical purposes (H-HJA = 10.3 degrees; V-HJA = 9.5 degrees). Subsequently, the predicted likely thresholds for the true values that were generated were too wide (H-HJA = predicted value ± 13.2 degrees; V-HJA = predicted value ± 12.2 degrees). The results suggest that although the HJA test showed moderate to high validity scores for estimating hamstring flexibility, the prediction intervals between the HJA and PSLR tests are not strong enough to suggest that clinicians and sport medicine practitioners should use the HJA and PSLR tests interchangeably as gold standard measurement tools to evaluate and detect short hamstring muscle flexibility.
The Optimal Screening for Prediction of Referral and Outcome (OSPRO) in patients with musculoskeletal pain conditions: a longitudinal validation cohort from the USA

PubMed Central

George, Steven Z; Beneciuk, Jason M; Lentz, Trevor A; Wu, Samuel S

2017-01-01

Purpose There is an increased need for determining which patients with musculoskeletal pain benefit from additional diagnostic testing or psychologically informed intervention. The Optimal Screening for Prediction of Referral and Outcome (OSPRO) cohort studies were designed to develop and validate standard assessment tools for review of systems and yellow flags. This cohort profile paper provides a description of and future plans for the validation cohort. Participants Patients (n=440) with primary complaint of spine, shoulder or knee pain were recruited into the OSPRO validation cohort via a national Orthopaedic Physical Therapy-Investigative Network. Patients were followed up at 4 weeks, 6 months and 12 months for pain, functional status and quality of life outcomes. Healthcare utilisation outcomes were also collected at 6 and 12 months. Findings to date There are no longitudinal findings reported to date from the ongoing OSPRO validation cohort. The previously completed cross-sectional OSPRO development cohort yielded two assessment tools that were investigated in the validation cohort. Future plans Follow-up data collection was completed in January 2017. Primary analyses will investigate how accurately the OSPRO review of systems and yellow flag tools predict 12-month pain, functional status, quality of life and healthcare utilisation outcomes. Planned secondary analyses include prediction of pain interference and/or development of chronic pain, investigation of treatment expectation on patient outcomes and analysis of patient satisfaction following an episode of physical therapy. Trial registration number The OSPRO validation cohort was not registered. PMID:28600371
Analytic Validation of Immunohistochemical Assays: A Comparison of Laboratory Practices Before and After Introduction of an Evidence-Based Guideline.

PubMed

Fitzgibbons, Patrick L; Goldsmith, Jeffrey D; Souers, Rhona J; Fatheree, Lisa A; Volmar, Keith E; Stuart, Lauren N; Nowak, Jan A; Astles, J Rex; Nakhleh, Raouf E

2017-09-01

Laboratories must demonstrate analytic validity before any test can be used clinically, but studies have shown inconsistent practices in immunohistochemical assay validation. To assess changes in immunohistochemistry analytic validation practices after publication of an evidence-based laboratory practice guideline. A survey on current immunohistochemistry assay validation practices and on the awareness and adoption of a recently published guideline was sent to subscribers enrolled in one of 3 relevant College of American Pathologists proficiency testing programs and to additional nonsubscribing laboratories that perform immunohistochemical testing. The results were compared with an earlier survey of validation practices. Analysis was based on responses from 1085 laboratories that perform immunohistochemical staining. Of 1057 responses, 65.4% (691) were aware of the guideline recommendations before this survey was sent and 79.9% (550 of 688) of those have already adopted some or all of the recommendations. Compared with the 2010 survey, a significant number of laboratories now have written validation procedures for both predictive and nonpredictive marker assays and specifications for the minimum numbers of cases needed for validation. There was also significant improvement in compliance with validation requirements, with 99% (100 of 102) having validated their most recently introduced predictive marker assay, compared with 74.9% (326 of 435) in 2010. The difficulty in finding validation cases for rare antigens and resource limitations were cited as the biggest challenges in implementing the guideline. Dissemination of the 2014 evidence-based guideline validation practices had a positive impact on laboratory performance; some or all of the recommendations have been adopted by nearly 80% of respondents.
Development of estrogen receptor beta binding prediction model using large sets of chemicals.

PubMed

Sakkiah, Sugunadevi; Selvaraj, Chandrabose; Gong, Ping; Zhang, Chaoyang; Tong, Weida; Hong, Huixiao

2017-11-03

We developed an ERβ binding prediction model that, together with our previously developed ERα binding model, facilitates identification of chemicals that specifically bind ERβ or ERα. Decision Forest was used to train the ERβ binding prediction model based on a large set of compounds obtained from EADB. Model performance was estimated through 1000 iterations of 5-fold cross validation. Prediction confidence was analyzed using predictions from the cross validations. Informative chemical features for ERβ binding were identified through analysis of the frequency data of chemical descriptors used in the models in the 5-fold cross validations. 1000 permutations were conducted to assess the chance correlation. The average accuracy of the 5-fold cross validations was 93.14% with a standard deviation of 0.64%. Prediction confidence analysis indicated that the higher the prediction confidence, the more accurate the predictions. Permutation testing results revealed that the prediction model is unlikely to have been generated by chance. Eighteen informative descriptors were identified as important to ERβ binding prediction. Application of the prediction model to the data from the ToxCast project yielded a very high sensitivity of 90-92%. Our results demonstrated that ERβ binding of chemicals could be accurately predicted using the developed model. Coupled with our previously developed ERα prediction model, this model could be expected to facilitate drug development through identification of chemicals that specifically bind ERβ or ERα.
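The evaluation scheme described above (repeated 5-fold cross validation followed by a label-permutation test) can be illustrated with a small sketch. Everything here is an assumption for illustration: the one-feature threshold classifier is a toy stand-in for Decision Forest, the data are synthetic, and the repeat and permutation counts are far smaller than the 1000 used in the study.

```python
import random
import statistics


def fit_threshold_classifier(train_x, train_y):
    """Toy stand-in model: cut halfway between the two class means.
    Assumes both classes are present in the training data."""
    mean1 = statistics.mean(x for x, y in zip(train_x, train_y) if y == 1)
    mean0 = statistics.mean(x for x, y in zip(train_x, train_y) if y == 0)
    cut = (mean0 + mean1) / 2
    sign = 1 if mean1 >= mean0 else -1
    return lambda x: 1 if sign * (x - cut) > 0 else 0


def repeated_cv_accuracy(xs, ys, k=5, repeats=20, seed=0):
    """Mean and standard deviation of accuracy over repeated k-fold CV."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        order = list(range(len(xs)))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]
        correct = 0
        for fold in folds:
            held_out = set(fold)
            train = [i for i in order if i not in held_out]
            predict = fit_threshold_classifier(
                [xs[i] for i in train], [ys[i] for i in train]
            )
            correct += sum(predict(xs[i]) == ys[i] for i in fold)
        accuracies.append(correct / len(xs))
    return statistics.mean(accuracies), statistics.stdev(accuracies)


def permutation_p_value(xs, ys, observed_accuracy, permutations=100, seed=1):
    """Fraction of label shufflings whose CV accuracy reaches the observed one."""
    rng = random.Random(seed)
    at_least_as_good = 0
    for _ in range(permutations):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        accuracy, _sd = repeated_cv_accuracy(
            xs, shuffled, repeats=2, seed=rng.randrange(10**6)
        )
        if accuracy >= observed_accuracy:
            at_least_as_good += 1
    return (at_least_as_good + 1) / (permutations + 1)


# Synthetic, well-separated data: 20 inactive and 20 active "compounds".
xs = [0.1 * i for i in range(20)] + [3.0 + 0.1 * i for i in range(20)]
ys = [0] * 20 + [1] * 20
mean_accuracy, sd_accuracy = repeated_cv_accuracy(xs, ys)
p_value = permutation_p_value(xs, ys, mean_accuracy)
print(mean_accuracy, sd_accuracy, p_value)
```

A small p-value from the permutation step is what supports a statement that the model's accuracy is unlikely to be a chance correlation.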
Validation of the Social Appearance Anxiety Scale: factor, convergent, and divergent validity.

PubMed

Levinson, Cheri A; Rodebaugh, Thomas L

2011-09-01

The Social Appearance Anxiety Scale (SAAS) was created to assess fear of overall appearance evaluation. Initial psychometric work indicated that the measure had a single-factor structure and exhibited excellent internal consistency, test-retest reliability, and convergent validity. In the current study, the authors further examined the factor, convergent, and divergent validity of the SAAS in two samples of undergraduates. In Study 1 (N = 323), the authors tested the factor structure, convergent, and divergent validity of the SAAS with measures of the Big Five personality traits, negative affect, fear of negative evaluation, and social interaction anxiety. In Study 2 (N = 118), participants completed a body evaluation that included measurements of height, weight, and body fat content. The SAAS exhibited excellent convergent and divergent validity with self-report measures (i.e., self-esteem, trait anxiety, ethnic identity, and sympathy), predicted state anxiety experienced during the body evaluation, and predicted body fat content. In both studies, results confirmed a single-factor structure as the best fit to the data. These results lend additional support for the use of the SAAS as a valid measure of social appearance anxiety.

VDA, a Method of Choosing a Better Algorithm with Fewer Validations

PubMed Central

Kluger, Yuval

2011-01-01

The multitude of bioinformatics algorithms designed for performing a particular computational task presents end-users with the problem of selecting the most appropriate computational tool for analyzing their biological data. The choice of the best available method is often based on expensive experimental validation of the results. We propose an approach to design validation sets for method comparison and performance assessment that are effective in terms of cost and discrimination power. Validation Discriminant Analysis (VDA) is a method for designing a minimal validation dataset to allow reliable comparisons between the performances of different algorithms. Implementation of our VDA approach achieves this reduction by selecting predictions that maximize the minimum Hamming distance between algorithmic predictions in the validation set. We show that VDA can be used to correctly rank algorithms according to their performances. These results are further supported by simulations and by realistic algorithmic comparisons in silico. VDA is a novel, cost-efficient method for minimizing the number of validation experiments necessary for reliable performance estimation and fair comparison between algorithms. Our VDA software is available at http://sourceforge.net/projects/klugerlab/files/VDA/ PMID:22046256
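The max-min Hamming criterion mentioned above can be made concrete with a small sketch. This is a greedy heuristic written for illustration and is not the published VDA implementation: the function names, the tie-breaking rule, and the toy prediction vectors are all assumptions.

```python
from itertools import combinations


def min_pairwise_hamming(predictions, selected):
    """Smallest Hamming distance between any two algorithms' predictions,
    restricted to the selected candidate items."""
    return min(
        sum(a[i] != b[i] for i in selected)
        for a, b in combinations(predictions, 2)
    )


def greedy_validation_set(predictions, size):
    """Greedily pick `size` items so that the minimum pairwise Hamming
    distance between algorithms stays as large as possible. Ties are
    broken by total pairwise distance, then by lowest item index."""
    remaining = set(range(len(predictions[0])))
    selected = []
    for _ in range(size):
        def gain(item):
            trial = selected + [item]
            distances = [
                sum(a[i] != b[i] for i in trial)
                for a, b in combinations(predictions, 2)
            ]
            return (min(distances), sum(distances))

        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy binary predictions from three hypothetical algorithms on six items.
predictions = [
    [1, 1, 0, 0, 1, 0],  # algorithm A
    [1, 0, 0, 1, 1, 0],  # algorithm B
    [1, 1, 1, 0, 0, 0],  # algorithm C
]
chosen = greedy_validation_set(predictions, 2)
print(chosen, min_pairwise_hamming(predictions, chosen))
```

Items on which all algorithms agree (indices 0 and 5 here) carry no discriminating information, so a selection of this kind leaves them out of the validation set.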
Implementing Lumberjacks and Black Swans Into Model-Based Tools to Support Human-Automation Interaction.

PubMed

Sebok, Angelia; Wickens, Christopher D

2017-03-01

The objectives were to (a) implement theoretical perspectives regarding human-automation interaction (HAI) into model-based tools to assist designers in developing systems that support effective performance and (b) conduct validations to assess the ability of the models to predict operator performance. Two key concepts in HAI, the lumberjack analogy and black swan events, have been studied extensively. The lumberjack analogy describes the effects of imperfect automation on operator performance. In routine operations, an increased degree of automation supports performance, but in failure conditions, increased automation results in more significantly impaired performance. Black swans are the rare and unexpected failures of imperfect automation. The lumberjack analogy and black swan concepts have been implemented into three model-based tools that predict operator performance in different systems. These tools include a flight management system, a remotely controlled robotic arm, and an environmental process control system. Each modeling effort included a corresponding validation. In one validation, the software tool was used to compare three flight management system designs, which were ranked in the same order as predicted by subject matter experts. The second validation compared model-predicted operator complacency with empirical performance in the same conditions. The third validation compared model-predicted and empirically determined time to detect and repair faults in four automation conditions. The three model-based tools offer useful ways to predict operator performance in complex systems. The three tools offer ways to predict the effects of different automation designs on operator performance.
Derivation and Validation of a Biomarker-Based Clinical Algorithm to Rule Out Sepsis From Noninfectious Systemic Inflammatory Response Syndrome at Emergency Department Admission: A Multicenter Prospective Study.

PubMed

Mearelli, Filippo; Fiotti, Nicola; Giansante, Carlo; Casarsa, Chiara; Orso, Daniele; De Helmersen, Marco; Altamura, Nicola; Ruscio, Maurizio; Castello, Luigi Mario; Colonetti, Efrem; Marino, Rossella; Barbati, Giulia; Bregnocchi, Andrea; Ronco, Claudio; Lupia, Enrico; Montrucchio, Giuseppe; Muiesan, Maria Lorenza; Di Somma, Salvatore; Avanzi, Gian Carlo; Biolo, Gianni

2018-05-07

To derive and validate a predictive algorithm integrating a nomogram-based prediction of the pretest probability of infection with a panel of serum biomarkers, which could robustly differentiate sepsis/septic shock from noninfectious systemic inflammatory response syndrome. Multicenter prospective study. At emergency department admission in five University hospitals. Nine-hundred forty-seven adults in inception cohort and 185 adults in validation cohort. None. A nomogram, including age, Sequential Organ Failure Assessment score, recent antimicrobial therapy, hyperthermia, leukocytosis, and high C-reactive protein values, was built in order to take data from 716 infected patients and 120 patients with noninfectious systemic inflammatory response syndrome to predict pretest probability of infection. Then, the best combination of procalcitonin, soluble phospholipase A2 group IIA, presepsin, soluble interleukin-2 receptor α, and soluble triggering receptor expressed on myeloid cell-1 was applied in order to categorize patients as "likely" or "unlikely" to be infected. The predictive algorithm required only procalcitonin backed up with soluble phospholipase A2 group IIA determined in 29% of the patients to rule out sepsis/septic shock with a negative predictive value of 93%. In a validation cohort of 158 patients, predictive algorithm reached 100% of negative predictive value requiring biomarker measurements in 18% of the population. We have developed and validated a high-performing, reproducible, and parsimonious algorithm to assist emergency department physicians in distinguishing sepsis/septic shock from noninfectious systemic inflammatory response syndrome.
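Negative predictive value, the headline metric of a rule-out algorithm such as the one above, is computed from a 2x2 table of test result against true status. The sketch below is illustrative only: the function name `diagnostic_metrics` is hypothetical and the counts are invented to show the arithmetic, not taken from the study.

```python
def diagnostic_metrics(true_pos, false_pos, true_neg, false_neg):
    """Standard accuracy measures derived from a 2x2 confusion table."""
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "ppv": true_pos / (true_pos + false_pos),
        # NPV: among patients classified "unlikely", the share truly uninfected.
        "npv": true_neg / (true_neg + false_neg),
    }


# Invented counts: 100 patients classified "unlikely", of whom 7 were infected.
metrics = diagnostic_metrics(true_pos=80, false_pos=30, true_neg=93, false_neg=7)
print(round(metrics["npv"], 2))
```

Unlike sensitivity and specificity, the predictive values depend on disease prevalence in the tested population, which is one reason they can differ between derivation and validation cohorts.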
Development and validation of an ICD-10-based disability predictive index for patients admitted to hospitals with trauma.

PubMed

Wada, Tomoki; Yasunaga, Hideo; Yamana, Hayato; Matsui, Hiroki; Fushimi, Kiyohide; Morimura, Naoto

2018-03-01

There was no established disability predictive measurement for patients with trauma that could be used in administrative claims databases. The aim of the present study was to develop and validate a diagnosis-based disability predictive index for severe physical disability at discharge using the International Classification of Diseases, 10th revision (ICD-10) coding. This retrospective observational study used the Diagnosis Procedure Combination database in Japan. Patients who were admitted to hospitals with trauma and discharged alive from 01 April 2010 to 31 March 2015 were included. Pediatric patients under 15 years old were excluded. Data for patients admitted to hospitals from 01 April 2010 to 31 March 2013 was used for development of a disability predictive index (derivation cohort), while data for patients admitted to hospitals from 01 April 2013 to 31 March 2015 was used for the internal validation (validation cohort). The outcome of interest was severe physical disability defined as the Barthel Index score of <60 at discharge. Trauma-related ICD-10 codes were categorized into 36 injury groups with reference to the categorization used in the Global Burden of Diseases study 2013. A multivariable logistic regression analysis was performed for the outcome using the injury groups and patient baseline characteristics including patient age, sex, and Charlson Comorbidity Index (CCI) score in the derivation cohort. A score corresponding to a regression coefficient was assigned to each injury group. The disability predictive index for each patient was defined as the sum of the scores. The predictive performance of the index was validated using the receiver operating characteristic curve analysis in the validation cohort. The derivation cohort included 1,475,158 patients, while the validation cohort included 939,659 patients. Of the 939,659 patients, 235,382 (25.0%) were discharged with severe physical disability. 
The c-statistic of the disability predictive index was 0.795 (95% confidence interval [CI] 0.794-0.795), while that of a model using the disability predictive index and patient baseline characteristics was 0.856 (95% CI 0.855-0.857). Severe physical disability at discharge may be well predicted with patient age, sex, CCI score, and the diagnosis-based disability predictive index in patients admitted to hospitals with trauma. Copyright © 2018 Elsevier Ltd. All rights reserved.
External validation of a prediction model for surgical site infection after thoracolumbar spine surgery in a Western European cohort.

PubMed

Janssen, Daniël M C; van Kuijk, Sander M J; d'Aumerie, Boudewijn B; Willems, Paul C

2018-05-16

A prediction model for surgical site infection (SSI) after spine surgery was developed in 2014 by Lee et al. This model was developed to compute an individual estimate of the probability of SSI after spine surgery based on the patient's comorbidity profile and invasiveness of surgery. Before any prediction model can be validly implemented in daily medical practice, it should be externally validated to assess how the prediction model performs in patients sampled independently from the derivation cohort. We included 898 consecutive patients who underwent instrumented thoracolumbar spine surgery. Overall performance was quantified using Nagelkerke's R² statistic, and discriminative ability was quantified as the area under the receiver operating characteristic curve (AUC). We computed the calibration slope of the calibration plot to judge prediction accuracy. Sixty patients developed an SSI. The overall performance of the prediction model in our population was poor: Nagelkerke's R² was 0.01. The AUC was 0.61 (95% confidence interval (CI) 0.54-0.68). The estimated slope of the calibration plot was 0.52. The previously published prediction model showed poor performance in our academic external validation cohort. To predict SSI after instrumented thoracolumbar spine surgery for the present population, a better fitting prediction model should be developed.
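As a rough illustration of the overall-performance measure named above, the following sketch computes Nagelkerke's R² for a set of predicted probabilities, using the usual definition: the Cox-Snell R² divided by its maximum attainable value, with an intercept-only model as the reference. The function name and the toy inputs are assumptions and do not reproduce the study's analysis.

```python
import math


def nagelkerke_r2(probabilities, outcomes):
    """Nagelkerke's R-squared of predicted probabilities against binary
    outcomes, relative to a null model that predicts the observed prevalence.
    Probabilities must lie strictly between 0 and 1."""
    n = len(outcomes)
    prevalence = sum(outcomes) / n
    if prevalence in (0.0, 1.0):
        raise ValueError("outcomes must contain both events and non-events")

    def log_likelihood(ps):
        return sum(
            math.log(p) if y == 1 else math.log(1.0 - p)
            for p, y in zip(ps, outcomes)
        )

    ll_null = log_likelihood([prevalence] * n)
    ll_model = log_likelihood(probabilities)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Toy inputs: uninformative predictions give 0; sharper correct ones approach 1.
print(nagelkerke_r2([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))
print(nagelkerke_r2([0.9, 0.1, 0.9, 0.1], [1, 0, 1, 0]))
```

When a previously published model is applied unchanged to new patients, this quantity can fall close to zero or even below it, which is consistent with a model that transfers poorly to the validation population.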
Resolving Contradictions of Predictive Validity of University Matriculation Examinations in Nigeria: A Meta-Analysis Approach

ERIC Educational Resources Information Center

Modupe, Ale Veronica; Babafemi, Kolawole Emmanuel

2015-01-01

The study examined the various means of solving contradictions of predictive studies of University Matriculation Examination in Nigeria. The study used a sample size of 35 studies on predictive validity of University Matriculation Examination in Nigeria, which was purposively selected to have met the criteria for meta-analysis. Two null hypotheses…
The Predictive Validity of the University Student Selection Examination

ERIC Educational Resources Information Center

Karakaya, Ismail; Tavsancil, Ezel

2008-01-01

The main purpose of this study is to investigate the predictive validity of the 2003 University Student Selection Examination (OSS). For this purpose, freshman grade point average (FGPA) in higher education was predicted by raw scores, standard scores, and placement scores (YEP). This study has been conducted on a research group. In this study,…
Beyond Correlations: Usefulness of High School GPA and Test Scores in Making College Admissions Decisions

ERIC Educational Resources Information Center

Sawyer, Richard

2013-01-01

Correlational evidence suggests that high school GPA is better than admission test scores in predicting first-year college GPA, although test scores have incremental predictive validity. The usefulness of a selection variable in making admission decisions depends in part on its predictive validity, but also on institutions' selectivity and…
Validation of Alternative In Vitro Methods to Animal Testing: Concepts, Challenges, Processes and Tools.

PubMed

Griesinger, Claudius; Desprez, Bertrand; Coecke, Sandra; Casey, Warren; Zuang, Valérie

This chapter explores the concepts, processes, tools and challenges relating to the validation of alternative methods for toxicity and safety testing. In general terms, validation is the process of assessing the appropriateness and usefulness of a tool for its intended purpose. Validation is routinely used in various contexts in science, technology, the manufacturing and services sectors. It serves to assess the fitness-for-purpose of devices, systems, software up to entire methodologies. In the area of toxicity testing, validation plays an indispensable role: "alternative approaches" are increasingly replacing animal models as predictive tools and it needs to be demonstrated that these novel methods are fit for purpose. Alternative approaches include in vitro test methods, non-testing approaches such as predictive computer models up to entire testing and assessment strategies composed of method suites, data sources and decision-aiding tools. Data generated with alternative approaches are ultimately used for decision-making on public health and the protection of the environment. It is therefore essential that the underlying methods and methodologies are thoroughly characterised, assessed and transparently documented through validation studies involving impartial actors. Importantly, validation serves as a filter to ensure that only test methods able to produce data that help to address legislative requirements (e.g. EU's REACH legislation) are accepted as official testing tools and, owing to the globalisation of markets, recognised on international level (e.g. through inclusion in OECD test guidelines). Since validation creates a credible and transparent evidence base on test methods, it provides a quality stamp, supporting companies developing and marketing alternative methods and creating considerable business opportunities. 
Validation of alternative methods is conducted through scientific studies assessing two key hypotheses, reliability and relevance of the test method for a given purpose. Relevance encapsulates the scientific basis of the test method, its capacity to predict adverse effects in the "target system" (i.e. human health or the environment) as well as its applicability for the intended purpose. In this chapter we focus on the validation of non-animal in vitro alternative testing methods and review the concepts, challenges, processes and tools fundamental to the validation of in vitro methods intended for hazard testing of chemicals. We explore major challenges and peculiarities of validation in this area. Based on the notion that validation per se is a scientific endeavour that needs to adhere to key scientific principles, namely objectivity and appropriate choice of methodology, we examine basic aspects of study design and management, and provide illustrations of statistical approaches to describe predictive performance of validated test methods as well as their reliability.
Assessing Attachment Security With the Attachment Q Sort: Meta-Analytic Evidence for the Validity of the Observer AQS

ERIC Educational Resources Information Center

van IJzendoorn, Marinus H.; Vereijken, Carolus M.J.L.; Bakermans-Kranenburg, Marian J.; Riksen-Walraven, Marianne J.

2004-01-01

The reliability and validity of the Attachment Q Sort (AQS; Waters & Deane, 1985) were tested in a series of meta-analyses on 139 studies with 13,835 children. The observer AQS security score showed convergent validity with Strange Situation procedure (SSP) security (r=.31) and excellent predictive validity with sensitivity measures (r=.39). Its…
Validity of the SAT® for Predicting First-Year Grades: 2010 SAT Validity Sample. Statistical Report 2013-2

ERIC Educational Resources Information Center

Patterson, Brian F.; Mattern, Krista D.

2013-01-01

The continued accumulation of validity evidence for the core uses of educational assessments is critical to ensure that proper inferences will be made for those core purposes. To that end, the College Board has continued to follow previous cohorts of college students and this report provides updated validity evidence for using the SAT to predict…
Evaluating the predictive accuracy and the clinical benefit of a nomogram aimed to predict survival in node-positive prostate cancer patients: External validation on a multi-institutional database.

PubMed

Bianchi, Lorenzo; Schiavina, Riccardo; Borghesi, Marco; Bianchi, Federico Mineo; Briganti, Alberto; Carini, Marco; Terrone, Carlo; Mottrie, Alex; Gacci, Mauro; Gontero, Paolo; Imbimbo, Ciro; Marchioro, Giansilvio; Milanese, Giulio; Mirone, Vincenzo; Montorsi, Francesco; Morgia, Giuseppe; Novara, Giacomo; Porreca, Angelo; Volpe, Alessandro; Brunocilla, Eugenio

2018-04-06

To assess the predictive accuracy and the clinical value of a recent nomogram predicting cancer-specific mortality-free survival after surgery in pN1 prostate cancer patients through an external validation. We evaluated 518 prostate cancer patients treated with radical prostatectomy and pelvic lymph node dissection with evidence of nodal metastases at final pathology, at 10 tertiary centers. External validation was carried out using regression coefficients of the previously published nomogram. The performance characteristics of the model were assessed by quantifying predictive accuracy, according to the area under the curve in the receiver operating characteristic curve and model calibration. Furthermore, we systematically analyzed the specificity, sensitivity, positive predictive value and negative predictive value for each nomogram-derived probability cut-off. Finally, we implemented decision curve analysis, in order to quantify the nomogram's clinical value in routine practice. External validation showed lower predictive accuracy than that reported in the internal validation (65.8% vs 83.3%, respectively). The discrimination (area under the curve) of the multivariable model was 66.7% (95% CI 60.1-73.0%) by testing with receiver operating characteristic curve analysis. The calibration plot showed an overestimation throughout the range of predicted cancer-specific mortality-free survival probabilities. However, in decision curve analysis, the nomogram's use showed a net benefit when compared with the scenarios of treating all patients or none. In an external setting, the nomogram showed inferior predictive accuracy and suboptimal calibration characteristics as compared to those reported in the original population. However, decision curve analysis showed a clinical net benefit, suggesting that the nomogram may help in correctly managing pN1 prostate cancer patients after surgery. © 2018 The Japanese Urological Association.
Development and validation of a novel predictive scoring model for microvascular invasion in patients with hepatocellular carcinoma.

PubMed

Zhao, Hui; Hua, Ye; Dai, Tu; He, Jian; Tang, Min; Fu, Xu; Mao, Liang; Jin, Huihan; Qiu, Yudong

2017-03-01

Microvascular invasion (MVI) in patients with hepatocellular carcinoma (HCC) cannot be accurately predicted preoperatively. This study aimed to establish a predictive scoring model of MVI in solitary HCC patients without macroscopic vascular invasion. A total of 309 consecutive HCC patients who underwent curative hepatectomy were divided into the derivation (n=206) and validation (n=103) cohorts. A predictive scoring model of MVI was established according to the valuable predictors in the derivation cohort based on multivariate logistic regression analysis. The performance of the predictive model was evaluated in the derivation and validation cohorts. Preoperative imaging features on CECT, such as intratumoral arteries, non-nodular type of HCC and absence of radiological tumor capsule, were independent predictors for MVI. The predictive scoring model was established according to the β coefficients of the 3 predictors. The area under the receiver operating characteristic curve (AUROC) of the predictive scoring model was 0.872 (95% CI, 0.817-0.928) and 0.856 (95% CI, 0.771-0.940) in the derivation and validation cohorts, respectively. The positive and negative predictive values were 76.5% and 88.0% in the derivation cohort and 74.4% and 88.3% in the validation cohort. The performance of the model was similar between the patients with tumor size ≤5cm and >5cm in AUROC (P=0.910). The predictive scoring model based on intratumoral arteries, non-nodular type of HCC, and absence of the radiological tumor capsule on preoperative CECT is of great value in the prediction of MVI regardless of tumor size. Copyright © 2017 Elsevier B.V. All rights reserved.
Methods to compute reliabilities for genomic predictions of feed intake

USDA-ARS's Scientific Manuscript database

For new traits without historical reference data, cross-validation is often the preferred method to validate reliability (REL). Time truncation is less useful because few animals gain substantial REL after the truncation point. Accurate cross-validation requires separating genomic gain from pedigree...
Predicting Pilot Error in Nextgen: Pilot Performance Modeling and Validation Efforts

NASA Technical Reports Server (NTRS)

Wickens, Christopher; Sebok, Angelia; Gore, Brian; Hooey, Becky

2012-01-01

We review 25 articles presenting 5 general classes of computational models to predict pilot error. This more targeted review is placed within the context of the broader review of computational models of pilot cognition and performance, including such aspects as models of situation awareness or pilot-automation interaction. Particular emphasis is placed on the degree of validation of such models against empirical pilot data, and the relevance of the modeling and validation efforts to Next Gen technology and procedures.
The Johns Hopkins Fall Risk Assessment Tool: A Study of Reliability and Validity.

PubMed

Poe, Stephanie S; Dawson, Patricia B; Cvach, Maria; Burnett, Margaret; Kumble, Sowmya; Lewis, Maureen; Thompson, Carol B; Hill, Elizabeth E

Patient falls and fall-related injury remain a safety concern. The Johns Hopkins Fall Risk Assessment Tool (JHFRAT) was developed to facilitate early detection of risk for anticipated physiologic falls in adult inpatients. Psychometric properties in acute care settings have not yet been fully established; this study sought to fill that gap. Results indicate that the JHFRAT is reliable, with high sensitivity and negative predictive validity. Specificity and positive predictive validity were lower than expected.
Development and Validation of an Empiric Tool to Predict Favorable Neurologic Outcomes Among PICU Patients.

PubMed

Gupta, Punkaj; Rettiganti, Mallikarjuna; Gossett, Jeffrey M; Daufeldt, Jennifer; Rice, Tom B; Wetzel, Randall C

2018-01-01

To create a novel tool to predict favorable neurologic outcomes during ICU stay among children with critical illness. Logistic regression models using adaptive lasso methodology were used to identify independent factors associated with favorable neurologic outcomes. A mixed effects logistic regression model was used to create the final prediction model including all predictors selected from the lasso model. Model validation was performed using a 10-fold internal cross-validation approach. Virtual Pediatric Systems (VPS, LLC, Los Angeles, CA) database. Patients less than 18 years old admitted to one of the participating ICUs in the Virtual Pediatric Systems database were included (2009-2015). None. A total of 160,570 patients from 90 hospitals qualified for inclusion. Of these, 1,675 patients (1.04%) were associated with a decline in Pediatric Cerebral Performance Category scale by at least 2 between ICU admission and ICU discharge (unfavorable neurologic outcome). The independent factors associated with unfavorable neurologic outcome included higher weight at ICU admission, higher Pediatric Index of Mortality-2 score at ICU admission, cardiac arrest, stroke, seizures, head/nonhead trauma, use of conventional mechanical ventilation and high-frequency oscillatory ventilation, prolonged hospital length of ICU stay, and prolonged use of mechanical ventilation. The presence of chromosomal anomaly, cardiac surgery, and utilization of nitric oxide were associated with favorable neurologic outcome. The final online prediction tool can be accessed at https://soipredictiontool.shinyapps.io/GNOScore/. Our model predicted 139,688 patients with favorable neurologic outcomes in an internal validation sample, whereas the observed number of patients with favorable neurologic outcomes was 139,591. The area under the receiver operating curve for the validation model was 0.90. This proposed prediction tool encompasses 20 risk factors into one probability to predict favorable neurologic outcome during ICU stay among children with critical illness. Future studies should seek external validation and improved discrimination of this prediction tool.
[Design and validation of an instrument to assess families at risk for health problems].

PubMed

Puschel, Klaus; Repetto, Paula; Solar, María Olga; Soto, Gabriela; González, Karla

2012-04-01

There is a paucity of screening instruments with a high clinical predictive value to identify families at risk and therefore, develop focused interventions in primary care. To develop an easy to apply screening instrument with a high clinical predictive value to identify families with a higher health vulnerability. In the first stage of the study an instrument with a high content validity was designed through a review of existent instruments, qualitative interviews with families and expert opinions following a Delphi approach of three rounds. In the second stage, concurrent validity was tested through a comparative analysis between the pilot instrument and a family clinical interview conducted with 300 families randomly selected from a population registered at a primary care clinic in Santiago. The sampling was blocked based on the presence of diabetes, depression, child asthma, behavioral disorders, presence of an older person or the lack of previous conditions among family members. The third stage tested the clinical predictive validity of the instrument by comparing the baseline vulnerability obtained by the instrument with the change in clinical status and health related quality of life perceptions of the family members after nine months of follow-up. The final SALUFAM instrument included 13 items and had a high internal consistency (Cronbach's alpha: 0.821), high test re-test reproducibility (Pearson correlation: 0.84) and a high clinical predictive value for clinical deterioration (Odds ratio: 1.826; 95% confidence intervals: 1.101-3.029). The SALUFAM instrument is applicable, replicable, and has a high content validity, concurrent validity and clinical predictive value.
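The internal consistency statistic reported above, Cronbach's alpha, can be computed as in the sketch below. It uses the usual definition, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores); the function name and the tiny response matrix are hypothetical and are not SALUFAM data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Transpose so that each tuple holds one item's scores across respondents.
    items = list(zip(*item_scores))
    item_variance_sum = sum(variance(column) for column in items)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1.0 - item_variance_sum / total_variance)


# Hypothetical responses: 4 respondents x 3 items, for illustration only.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
]
alpha = cronbach_alpha(responses)
```

Values near 1 indicate that the items vary together; the sample variance (n − 1 denominator) is used for both the items and the total score.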
Assessing genomic selection prediction accuracy in a dynamic barley breeding

USDA-ARS's Scientific Manuscript database

Genomic selection is a method to improve quantitative traits in crops and livestock by estimating breeding values of selection candidates using phenotype and genome-wide marker data sets. Prediction accuracy has been evaluated through simulation and cross-validation, however validation based on prog...
A nearest neighbor approach for automated transporter prediction and categorization from protein sequences.

PubMed

Li, Haiquan; Dai, Xinbin; Zhao, Xuechun

2008-05-01

Membrane transport proteins play a crucial role in the import and export of ions, small molecules or macromolecules across biological membranes. Currently, there are a limited number of published computational tools which enable the systematic discovery and categorization of transporters prior to costly experimental validation. To approach this problem, we utilized a nearest neighbor method which seamlessly integrates homologous search and topological analysis into a machine-learning framework. Our approach satisfactorily distinguished 484 transporter families in the Transporter Classification Database, a curated and representative database for transporters. A five-fold cross-validation on the database achieved a positive classification rate of 72.3% on average. Furthermore, this method successfully detected transporters in seven model and four non-model organisms, ranging from archaean to mammalian species. A preliminary literature-based validation has cross-validated 65.8% of our predictions on the 11 organisms, including 55.9% of our predictions overlapping with 83.6% of the predicted transporters in TransportDB.

Parametric convergence sensitivity and validation of a finite element model of the human lumbar spine.

PubMed

Ayturk, Ugur M; Puttlitz, Christian M

2011-08-01

The primary objective of this study was to generate a finite element model of the human lumbar spine (L1-L5), verify mesh convergence for each tissue constituent and perform an extensive validation using both kinematic/kinetic and stress/strain data. Mesh refinement was accomplished via convergence of strain energy density (SED) predictions for each spinal tissue. The converged model was validated based on range of motion, intradiscal pressure, facet force transmission, anterolateral cortical bone strain and anterior longitudinal ligament deformation predictions. Changes in mesh resolution had the biggest impact on SED predictions under axial rotation loading. Nonlinearity of the moment-rotation curves was accurately simulated and the model predictions on the aforementioned parameters were in good agreement with experimental data. The validated and converged model will be utilised to study the effects of degeneration on the lumbar spine biomechanics, as well as to investigate the mechanical underpinning of the contemporary treatment strategies.
A New Approach of Juvenile Age Estimation using Measurements of the Ilium and Multivariate Adaptive Regression Splines (MARS) Models for Better Age Prediction.

PubMed

Corron, Louise; Marchal, François; Condemi, Silvana; Chaumoître, Kathia; Adalian, Pascal

2017-01-01

Juvenile age estimation methods used in forensic anthropology generally lack methodological consistency and/or statistical validity. Considering this, a standard approach using nonparametric Multivariate Adaptive Regression Splines (MARS) models was tested to predict age from iliac biometric variables of male and female juveniles from Marseilles, France, aged 0-12 years. Models using unidimensional (length and width) and bidimensional iliac data (module and surface) were constructed on a training sample of 176 individuals and validated on an independent test sample of 68 individuals. Results show that MARS prediction models using iliac width, module and area give overall better and statistically valid age estimates. These models integrate punctual nonlinearities of the relationship between age and osteometric variables. By constructing valid prediction intervals whose size increases with age, MARS models take into account the normal increase of individual variability. MARS models can qualify as a practical and standardized approach for juvenile age estimation. © 2016 American Academy of Forensic Sciences.
Predictive validity of the Biomedical Admissions Test: an evaluation and case study.

PubMed

McManus, I C; Ferguson, Eamonn; Wakeford, Richard; Powis, David; James, David

2011-01-01

There has been an increase in the use of pre-admission selection tests for medicine. Such tests need to show good psychometric properties. Here, we use a paper by Emery and Bell [2009. The predictive validity of the Biomedical Admissions Test for pre-clinical examination performance. Med Educ 43:557-564] as a case study to evaluate and comment on the reporting of psychometric data in the field of medical student selection (and the comments apply to many papers in the field). We highlight pitfalls when reliability data are not presented, how simple zero-order associations can lead to inaccurate conclusions about the predictive validity of a test, and how biases need to be explored and reported. We show with BMAT that it is the knowledge part of the test which does all the predictive work. We show that without evidence of incremental validity it is difficult to assess the value of any selection tests for medicine.
Validation of intensive care unit-acquired infection surveillance in the Italian SPIN-UTI network.

PubMed

Masia, M D; Barchitta, M; Liperi, G; Cantù, A P; Alliata, E; Auxilia, F; Torregrossa, V; Mura, I; Agodi, A

2010-10-01

Validity is one of the most critical factors concerning surveillance of nosocomial infections (NIs). This article describes the first validation study of the Italian Nosocomial Infections Surveillance in Intensive Care Units (ICUs) project (SPIN-UTI) surveillance data. The objective was to validate infection data and thus to determine the sensitivity, specificity, and positive and negative predictive values of NI data reported on patients in the ICUs participating in the SPIN-UTI network. A validation study was performed at the end of the surveillance period. All medical records including all clinical and laboratory data were reviewed retrospectively by the trained physicians of the validation team and a positive predictive value (PPV), a negative predictive value (NPV), sensitivity and specificity were calculated. Eight ICUs (16.3%) were randomly chosen from all 49 SPIN-UTI ICUs for the validation study. In total, the validation team reviewed 832 patient charts (27.3% of the SPIN-UTI patients). The PPV was 83.5% and the NPV was 97.3%. The overall sensitivity was 82.3% and overall specificity was 97.2%. Over- and under-reporting of NIs were related to misinterpretation of the case definitions and deviations from the protocol despite previous training and instructions. The results of this study are useful to identify methodological problems within a surveillance system and have been used to plan retraining for surveillance personnel and to design and implement the second phase of the SPIN-UTI project. Copyright 2010 The Hospital Infection Society. Published by Elsevier Ltd. All rights reserved.
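The four accuracy measures named in this abstract follow directly from a 2×2 table comparing the surveillance report with the chart review. The sketch below shows the arithmetic; the function name and the counts are hypothetical and are not the SPIN-UTI data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from the cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # reported infections among true infections
        "specificity": tn / (tn + fp),  # reported non-infections among true non-infections
        "ppv": tp / (tp + fp),          # true infections among reported infections
        "npv": tn / (tn + fn),          # true non-infections among reported non-infections
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=80, fp=20, fn=20, tn=880)
```

Sensitivity and specificity condition on the reference standard, whereas PPV and NPV condition on the reported result, so the latter two shift with the prevalence of infection in the reviewed charts.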
Reliability and validity of Arabic Rapid Estimate of Adult Literacy in Dentistry (AREALD-30) in Saudi Arabia.

PubMed

Tadakamadla, Santosh Kumar; Quadri, Mir Faeq Ali; Pakpour, Amir H; Zailai, Abdulaziz M; Sayed, Mohammed E; Mashyakhy, Mohammed; Inamdar, Aadil S; Tadakamadla, Jyothi

2014-09-29

To evaluate the reliability and validity of Arabic Rapid Estimate of Adult Literacy in Dentistry (AREALD-30) in Saudi Arabia. A convenience sample of 200 subjects was approached, of whom 177 agreed to participate, giving a response rate of 88.5%. Rapid Estimate of Adult Literacy in Dentistry (REALD-99) was translated into Arabic to prepare the longer and shorter versions of Arabic Rapid Estimate of Adult Literacy in Dentistry (AREALD-99 and AREALD-30). Each participant was provided with AREALD-99, which also includes words from AREALD-30. A questionnaire containing socio-behavioral information and Arabic Oral Health Impact Profile (A-OHIP-14) was also administered. Reliability of the AREALD-30 was assessed by re-administering it to 20 subjects after two weeks. Convergent and predictive validity of AREALD-30 was evaluated by its correlations with AREALD-99 and self-perceived oral health status, dental visiting habits and A-OHIP-14 respectively. Discriminant validity was assessed in relation to the educational level while construct validity was evaluated by confirmatory factor analysis (CFA). Reliability of AREALD-30 was excellent with intraclass correlation coefficient of 0.99. It exhibited good convergent and discriminant validity but poor predictive validity. CFA showed presence of two factors and infit mean-square statistics for AREALD-30 were all within the desired range of 0.50 - 2.0 in Rasch analysis. AREALD-30 showed excellent reliability, good convergent and concurrent validity, but failed to predict the differences between the subjects categorized based on their oral health outcomes.
Development and validation of the Pediatric Anesthesia Behavior score--an objective measure of behavior during induction of anesthesia.

PubMed

Beringer, Richard M; Greenwood, Rosemary; Kilpatrick, Nicky

2014-02-01

Measuring perioperative behavior changes requires validated objective rating scales. We developed a simple score for children's behavior during induction of anesthesia (Pediatric Anesthesia Behavior score) and assessed its reliability, concurrent validity, and predictive validity. Data were collected as part of a wider observational study of perioperative behavior changes in children undergoing general anesthesia for elective dental extractions. One-hundred and two healthy children aged 2-12 were recruited. Previously validated behavioral scales were used as follows: the modified Yale Preoperative Anxiety Scale (m-YPAS); the induction compliance checklist (ICC); the Pediatric Anesthesia Emergence Delirium scale (PAED); and the Post-Hospitalization Behavior Questionnaire (PHBQ). Pediatric Anesthesia Behavior (PAB) score was independently measured by two investigators, to allow assessment of interobserver reliability. Concurrent validity was assessed by examining the correlation between the PAB score, the m-YPAS, and the ICC. Predictive validity was assessed by examining the association between the PAB score, the PAED scale, and the PHBQ. The PAB score correlated strongly with both the m-YPAS (P < 0.001) and the ICC (P < 0.001). PAB score was significantly associated with the PAED score (P = 0.031) and with the PHBQ (P = 0.034). Two independent investigators recorded identical PAB scores for 94% of children and overall, there was close agreement between scores (Kappa coefficient of 0.886 [P < 0.001]). The PAB score is simple to use and may predict which children are at increased risk of developing postoperative behavioral disturbance. This study provides evidence for its reliability and validity. © 2013 John Wiley & Sons Ltd.
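Interobserver agreement of the kind reported here is commonly summarized with Cohen's kappa, which corrects the raw agreement rate for agreement expected by chance. The sketch below computes the unweighted statistic; whether the study used a weighted or unweighted kappa is not stated in the abstract, and the function name and scores are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same subjects."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores from two observers, for illustration only.
observer_1 = [1, 1, 2, 2]
observer_2 = [1, 1, 2, 1]
kappa = cohen_kappa(observer_1, observer_2)
```

Here the observers agree on 3 of 4 children (75%), but half of that agreement is expected by chance, so kappa is lower than the raw agreement rate.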
The Predictive Validity of the Short-Term Assessment of Risk and Treatability (START) for Multiple Adverse Outcomes in a Secure Psychiatric Inpatient Setting.

PubMed

O'Shea, Laura E; Picchioni, Marco M; Dickens, Geoffrey L

2016-04-01

The Short-Term Assessment of Risk and Treatability (START) aims to assist mental health practitioners to estimate an individual's short-term risk for a range of adverse outcomes via structured consideration of their risk ("Vulnerabilities") and protective factors ("Strengths") in 20 areas. It has demonstrated predictive validity for aggression but this is less established for other outcomes. We collated START assessments for N = 200 adults in a secure mental health hospital and ascertained 3-month risk event incidence using the START Outcomes Scale. The specific risk estimates, which are the tool developers' suggested method of overall assessment, predicted aggression, self-harm/suicidality, and victimization, and had incremental validity over the Strength and Vulnerability scales for these outcomes. The Strength scale had incremental validity over the Vulnerability scale for aggressive outcomes; therefore, consideration of protective factors had demonstrable value in their prediction. Further evidence is required to support use of the START for the full range of outcomes it aims to predict. © The Author(s) 2015.
QSPR for predicting chloroform formation in drinking water disinfection.

PubMed

Luilo, G B; Cabaniss, S E

2011-01-01

Chlorination is the most widely used technique for water disinfection, but may lead to the formation of chloroform (trichloromethane; TCM) and other by-products. This article reports the first quantitative structure-property relationship (QSPR) for predicting the formation of TCM in chlorinated drinking water. Model compounds (n = 117) drawn from 10 literature sources were divided into training data (n = 90, analysed by five-way leave-many-out internal cross-validation) and external validation data (n = 27). QSPR internal cross-validation had Q² = 0.94 and root mean square error (RMSE) of 0.09 moles TCM per mole compound, consistent with external validation Q² of 0.94 and RMSE of 0.08 moles TCM per mole compound, and met criteria for high predictive power and robustness. In contrast, log TCM QSPR performed poorly and did not meet the criteria for predictive power. The QSPR predictions were consistent with experimental values for TCM formation from tannic acid and for model fulvic acid structures. The descriptors used are consistent with a relatively small number of important TCM precursor structures based upon 1,3-dicarbonyls or 1,3-diphenols.
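The two validation statistics quoted above can be sketched as follows. Q² is taken here as 1 − PRESS/SS, where PRESS is the sum of squared prediction errors on held-out compounds; several variants of the external Q² exist that differ in which mean is used as the reference, so the optional `reference_mean` argument is an assumption of this sketch, as are the function names and the sample values.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def q_squared(observed, predicted, reference_mean=None):
    """Q^2 = 1 - PRESS / SS for predictions made on held-out data."""
    if reference_mean is None:
        # Default reference: mean of the held-out observations themselves.
        reference_mean = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - reference_mean) ** 2 for o in observed)
    return 1.0 - press / ss_total


# Hypothetical held-out yields and predictions, for illustration only.
observed = [0.10, 0.40, 0.70, 1.00]
predicted = [0.15, 0.35, 0.75, 0.95]
q2 = q_squared(observed, predicted)
error = rmse(observed, predicted)
```

A model that only ever predicts the reference mean scores Q² = 0, so values close to 1 indicate predictions that track the held-out measurements.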
The predictive validity of a situational judgement test, a clinical problem solving test and the core medical training selection methods for performance in specialty training.

PubMed

Patterson, Fiona; Lopes, Safiatu; Harding, Stephen; Vaux, Emma; Berkin, Liz; Black, David

2017-02-01

The aim of this study was to follow up a sample of physicians who began core medical training (CMT) in 2009. This paper examines the long-term validity of CMT and GP selection methods in predicting performance in the Membership of Royal College of Physicians (MRCP(UK)) examinations. We performed a longitudinal study, examining the extent to which the GP and CMT selection methods (T1) predict performance in the MRCP(UK) examinations (T2). A total of 2,569 applicants from 2008-09 who completed CMT and GP selection methods were included in the study. Looking at MRCP(UK) part 1, part 2 written and PACES scores, both CMT and GP selection methods show evidence of predictive validity for the outcome variables, and hierarchical regressions show the GP methods add significant value to the CMT selection process. CMT selection methods predict performance in important outcomes and have good evidence of validity; the GP methods may have an additional role alongside the CMT selection methods. © Royal College of Physicians 2017. All rights reserved.
Implicit Sex Guilt Predicts Sexual Behaviors: Evidence for the Validity of the Sex Guilt Implicit Association Test.

PubMed

Totonchi, Delaram A; Derlega, Valerian J; Janda, Louis H

2018-05-14

Self-report measures of sexuality may be influenced by people's conscious concerns about confidentiality and social desirability. Alternatively, non-conscious measures (e.g., implicit association tests; IATs) are designed to minimize these validity concerns. We constructed an IAT measure of sex guilt using 154 male and female university students. The sex guilt IAT demonstrated convergent validity as it correlated with various sexual behaviors and incremental validity as it improved the prediction of several sexual behaviors beyond that provided by the Mosher sex guilt scale. We conclude that a non-conscious measure of sex guilt may complement the use of self-reports in studying sexual behaviors.
Pitfalls in Prediction Modeling for Normal Tissue Toxicity in Radiation Therapy: An Illustration With the Individual Radiation Sensitivity and Mammary Carcinoma Risk Factor Investigation Cohorts

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mbah, Chamberlain, E-mail: chamberlain.mbah@ugent.be; Department of Mathematical Modeling, Statistics, and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, Ghent; Thierens, Hubert

Purpose: To identify the main causes underlying the failure of prediction models for radiation therapy toxicity to replicate. Methods and Materials: Data were used from two German cohorts, Individual Radiation Sensitivity (ISE) (n=418) and Mammary Carcinoma Risk Factor Investigation (MARIE) (n=409), of breast cancer patients with similar characteristics and radiation therapy treatments. The toxicity endpoint chosen was telangiectasia. The LASSO (least absolute shrinkage and selection operator) logistic regression method was used to build a predictive model for a dichotomized endpoint (Radiation Therapy Oncology Group/European Organization for the Research and Treatment of Cancer score 0, 1, or ≥2). Internal areas under the receiver operating characteristic curve (inAUCs) were calculated by a naïve approach whereby the training data (ISE) were also used for calculating the AUC. Cross-validation was also applied to calculate the AUC within the same cohort, a second type of inAUC. Internal AUCs from cross-validation were calculated within ISE and MARIE separately. Models trained on one dataset (ISE) were applied to a test dataset (MARIE) and AUCs calculated (exAUCs). Results: Internal AUCs from the naïve approach were generally larger than inAUCs from cross-validation owing to overfitting the training data. Internal AUCs from cross-validation were also generally larger than the exAUCs, reflecting heterogeneity in the predictors between cohorts. The best models with largest inAUCs from cross-validation within both cohorts had a number of common predictors: hypertension, normalized total boost, and presence of estrogen receptors. Surprisingly, the effect (coefficient in the prediction model) of hypertension on telangiectasia incidence was positive in ISE and negative in MARIE. Other predictors were also not common between the 2 cohorts, illustrating that overcoming overfitting does not solve the problem of replication failure of prediction models completely.
Conclusions: Overfitting and cohort heterogeneity are the 2 main causes of replication failure of prediction models across cohorts. Cross-validation and similar techniques (eg, bootstrapping) cope with overfitting, but the development of validated predictive models for radiation therapy toxicity requires strategies that deal with cohort heterogeneity.
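The gap between a naïve internal AUC and a cross-validated one can be reproduced on pure noise. The sketch below deliberately swaps the study's LASSO logistic regression for a 1-nearest-neighbour classifier, because that model memorizes its training data and makes the overfitting effect obvious; all names and the simulated data are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def one_nn_predict(train_x, train_y, x):
    """Label of the training point closest to x (squared Euclidean distance)."""
    nearest = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[nearest]


def naive_auc(xs, ys):
    """AUC computed on the same data the model was fitted to."""
    return auc(ys, [one_nn_predict(xs, ys, x) for x in xs])


def cross_validated_auc(xs, ys, k=5):
    """AUC from k-fold cross-validation: each point is scored by a model that never saw it."""
    preds = [None] * len(xs)
    for fold in range(k):
        train_idx = [i for i in range(len(xs)) if i % k != fold]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        for i in range(fold, len(xs), k):
            preds[i] = one_nn_predict(train_x, train_y, xs[i])
    return auc(ys, preds)


# Simulated data: 5 predictors that carry no information about the outcome.
rng = random.Random(0)
xs = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
ys = [rng.randint(0, 1) for _ in range(200)]

in_auc_naive = naive_auc(xs, ys)
in_auc_cv = cross_validated_auc(xs, ys)
```

On this noise data the naïve AUC is perfect because every point is its own nearest neighbour, while the cross-validated AUC stays near chance level. Cross-validation removes that optimism but, as the abstract stresses, says nothing about differences between cohorts.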
Statistical validation of predictive TRANSP simulations of baseline discharges in preparation for extrapolation to JET D-T

NASA Astrophysics Data System (ADS)

Kim, Hyun-Tae; Romanelli, M.; Yuan, X.; Kaye, S.; Sips, A. C. C.; Frassinetti, L.; Buchanan, J.; Contributors, JET

2017-06-01

This paper presents for the first time a statistical validation of predictive TRANSP simulations of plasma temperature using two transport models, GLF23 and TGLF, over a database of 80 baseline H-mode discharges in JET-ILW. While the accuracy of the predicted T_e with TRANSP-GLF23 is affected by plasma collisionality, the dependency of predictions on collisionality is less significant when using TRANSP-TGLF, indicating that the latter model has a broader applicability across plasma regimes. TRANSP-TGLF also shows a good matching of predicted T_i with experimental measurements allowing for a more accurate prediction of the neutron yields. The impact of input data and assumptions prescribed in the simulations are also investigated in this paper. The statistical validation and the assessment of uncertainty level in predictive TRANSP simulations for JET-ILW-DD will constitute the basis for the extrapolation to JET-ILW-DT experiments.
The Predictive Validity of Four Intelligence Tests for School Grades: A Small Sample Longitudinal Study

PubMed Central

Gygi, Jasmin T.; Hagmann-von Arx, Priska; Schweizer, Florine; Grob, Alexander

2017-01-01

Intelligence is considered the strongest single predictor of scholastic achievement. However, little is known regarding the predictive validity of well-established intelligence tests for school grades. We analyzed the predictive validity of four widely used intelligence tests in German-speaking countries: The Intelligence and Development Scales (IDS), the Reynolds Intellectual Assessment Scales (RIAS), the Snijders-Oomen Nonverbal Intelligence Test (SON-R 6-40), and the Wechsler Intelligence Scale for Children (WISC-IV), which were individually administered to 103 children (M_age = 9.17 years) enrolled in regular school. School grades were collected longitudinally after 3 years (averaged school grades, mathematics, and language) and were available for 54 children (M_age = 11.77 years). All four tests significantly predicted averaged school grades. Furthermore, the IDS and the RIAS predicted both mathematics and language, while the SON-R 6-40 predicted mathematics. The WISC-IV showed no significant association with longitudinal scholastic achievement when mathematics and language were analyzed separately. The results revealed the predictive validity of currently used intelligence tests for longitudinal scholastic achievement in German-speaking countries and support their use in psychological practice, in particular for predicting averaged school grades. However, this conclusion has to be considered as preliminary due to the small sample of children observed. PMID:28348543
A Quantitative Structure Activity Relationship for acute oral toxicity of pesticides on rats: Validation, domain of application and prediction.

PubMed

Hamadache, Mabrouk; Benkortbi, Othmane; Hanini, Salah; Amrane, Abdeltif; Khaouane, Latifa; Si Moussa, Cherif

2016-02-13

Quantitative Structure Activity Relationship (QSAR) models are expected to play an important role in the risk assessment of chemicals on humans and the environment. In this study, we developed a validated QSAR model to predict acute oral toxicity of 329 pesticides to rats because few QSAR models have been devoted to predicting the Lethal Dose 50 (LD50) of pesticides on rats. This QSAR model is based on 17 molecular descriptors, and is robust, externally predictive and characterized by a good applicability domain. The best results were obtained with a 17/9/1 Artificial Neural Network model trained with the Quasi Newton back propagation (BFGS) algorithm. The prediction accuracy for the external validation set was estimated by the Q²ext and the root mean square error (RMS), which are equal to 0.948 and 0.201, respectively. 98.6% of the external validation set is correctly predicted and the present model proved to be superior to models previously published. Accordingly, the model developed in this study provides excellent predictions and can be used to predict the acute oral toxicity of pesticides, particularly for those that have not been tested as well as new pesticides. Copyright © 2015 Elsevier B.V. All rights reserved.
Validity and reliability of bioelectrical impedance analysis and skinfold thickness in predicting body fat in military personnel.

PubMed

Aandstad, Anders; Holtberget, Kristian; Hageberg, Rune; Holme, Ingar; Anderssen, Sigmund A

2014-02-01

Previous studies show that body composition is related to injury risk and physical performance in soldiers. Thus, valid methods for measuring body composition in military personnel are needed. The frequently used body mass index method is not a valid measure of body composition in soldiers, but reliability and validity of alternative field methods are less investigated in military personnel. Thus, we carried out test and retest of skinfold (SKF), single frequency bioelectrical impedance analysis (SF-BIA), and multifrequency bioelectrical impedance analysis measurements in 65 male and female soldiers. Several validated equations were used to predict percent body fat from these methods. Dual-energy X-ray absorptiometry was also measured, and acted as the criterion method. Results showed that SF-BIA was the most reliable method in both genders. In women, SF-BIA was also the most valid method, whereas SKF or a combination of SKF and SF-BIA produced the highest validity in men. Reliability and validity varied substantially among the equations examined. The best methods and equations produced test-retest 95% limits of agreement below ±1% points, whereas the corresponding validity figures were ±3.5% points. Each investigator and practitioner must consider whether such measurement errors are acceptable for its specific use. Reprint & Copyright © 2014 Association of Military Surgeons of the U.S.
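The 95% limits of agreement quoted above are conventionally obtained with the Bland-Altman approach: the mean of the paired differences plus or minus 1.96 standard deviations of those differences. The sketch below assumes that convention; the function name and the body fat values are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Return (lower limit, bias, upper limit) for paired measurements."""
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    # 1.96 sample standard deviations cover about 95% of differences
    # if the differences are approximately normally distributed.
    half_width = 1.96 * stdev(differences)
    return bias - half_width, bias, bias + half_width


# Hypothetical percent body fat from a field method and a criterion method.
field_method = [20.0, 25.0, 30.0, 35.0]
criterion_method = [19.0, 26.0, 29.0, 36.0]
lower, bias, upper = limits_of_agreement(field_method, criterion_method)
```

The same calculation serves for reliability (test versus retest of one method) and for validity (field method versus criterion), which is why the abstract can report both on the same percentage-point scale.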
Comparison between genetic parameters of cheese yield and nutrient recovery or whey loss traits measured from individual model cheese-making methods or predicted from unprocessed bovine milk samples using Fourier-transform infrared spectroscopy.

PubMed

Bittante, G; Ferragina, A; Cipolat-Gotet, C; Cecchinato, A

2014-10-01

Cheese yield is an important technological trait in the dairy industry. The aim of this study was to infer the genetic parameters of some cheese yield-related traits predicted using Fourier-transform infrared (FTIR) spectral analysis and compare the results with those obtained using an individual model cheese-producing procedure. A total of 1,264 model cheeses were produced using 1,500-mL milk samples collected from individual Brown Swiss cows, and individual measurements were taken for 10 traits: 3 cheese yield traits (fresh curd, curd total solids, and curd water as a percent of the weight of the processed milk), 4 milk nutrient recovery traits (fat, protein, total solids, and energy of the curd as a percent of the same nutrient in the processed milk), and 3 daily cheese production traits per cow (fresh curd, total solids, and water weight of the curd). Each unprocessed milk sample was analyzed using a MilkoScan FT6000 (Foss, Hillerød, Denmark) over the spectral range from 5,000 to 900 cm⁻¹. The FTIR spectrum-based prediction models for the previously mentioned traits were developed using modified partial least-square regression. Cross-validation of the whole data set yielded coefficients of determination between the predicted and measured values in cross-validation of 0.65 to 0.95 for all traits, except for the recovery of fat (0.41). A 3-fold external validation was also used, in which the available data were partitioned into 2 subsets: a training set (one-third of the herds) and a testing set (two-thirds). The training set was used to develop calibration equations, whereas the testing subsets were used for external validation of the calibration equations and to estimate the heritabilities and genetic correlations of the measured and FTIR-predicted phenotypes.
The coefficients of determination between the predicted and measured values in cross-validation results obtained from the training sets were very similar to those obtained from the whole data set, but the coefficient of determination of validation values for the external validation sets were much lower for all traits (0.30 to 0.73), and particularly for fat recovery (0.05 to 0.18), for the training sets compared with the full data set. For each testing subset, the (co)variance components for the measured and FTIR-predicted phenotypes were estimated using bivariate Bayesian analyses and linear models. The intraherd heritabilities for the predicted traits obtained from our internal cross-validation using the whole data set ranged from 0.085 for daily yield of curd solids to 0.576 for protein recovery, and were similar to those obtained from the measured traits (0.079 to 0.586, respectively). The heritabilities estimated from the testing data set used for external validation were more variable but similar (on average) to the corresponding values obtained from the whole data set. Moreover, the genetic correlations between the predicted and measured traits were high in general (0.791 to 0.996), and they were always higher than the corresponding phenotypic correlations (0.383 to 0.995), especially for the external validation subset. In conclusion, we herein report that application of the cross-validation technique to the whole data set tended to overestimate the predictive ability of FTIR spectra, give more precise phenotypic predictions than the calibrations obtained using smaller data sets, and yield genetic correlations similar to those obtained from the measured traits. Collectively, our findings indicate that FTIR predictions have the potential to be used as indicator traits for the rapid and inexpensive selection of dairy populations for improvement of cheese yield, milk nutrient recovery in curd, and daily cheese production per cow. 
Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Development, calibration, and validation of performance prediction models for the Texas M-E flexible pavement design system.

DOT National Transportation Integrated Search

2010-08-01

This study was intended to recommend future directions for the development of TxDOT's Mechanistic-Empirical (TexME) design system. For stress predictions, a multi-layer linear elastic system was evaluated and its validity was verified by compar...
Predictive and Incremental Validity of Global and Domain-Based Adolescent Life Satisfaction Reports

ERIC Educational Resources Information Center

Haranin, Emily C.; Huebner, E. Scott; Suldo, Shannon M.

2007-01-01

Concurrent, predictive, and incremental validity of global and domain-based adolescent life satisfaction reports are examined with respect to internalizing and externalizing behavior problems. The Students' Life Satisfaction Scale (SLSS), Multidimensional Students' Life Satisfaction Scale (MSLSS), and measures of internalizing and externalizing…
An appraisal of the psychometric properties of the Clinician version of the Apathy Evaluation Scale (AES-C).

PubMed

Clarke, Diana E; Van Reekum, Robert; Patel, Jigisha; Simard, Martine; Gomez, Everlyne; Streiner, David L

2007-01-01

This article examines the psychometric properties of the clinician version of the Apathy Evaluation Scale (AES-C) to determine its ability to characterize, quantify and differentiate apathy. Critical appraisals of the item-reduction processes, effectiveness of the administration, coding and scoring procedures, and the reliability and validity of the scale were carried out. For training, administration and rating of the AES-C, clearer guidelines, including a more standardized list of verbal and non-verbal apathetic cues, are needed. There is evidence of high internal consistency for the scale across studies. In addition, the original study reported good test-retest and inter-rater reliability coefficients. However, there is a lack of replication on these more stable and informative measures of reliability and as such they warrant further investigation. The research evidence confirms that the AES-C shows good discriminant, convergent and criterion validity. However, evidence of its predictive validity is limited. As this aspect of validity refers to the scale's ability to predict future outcomes, which is important for treatment and rehabilitation planning, further assessment of the predictive validity of the AES-C is needed. In conclusion, the AES-C is a reliable and valid measure for the characterization and quantification of apathy. Copyright (c) 2007 John Wiley & Sons, Ltd.
Validation and clinical utility of the executive function performance test in persons with traumatic brain injury.

PubMed

Baum, C M; Wolf, T J; Wong, A W K; Chen, C H; Walker, K; Young, A C; Carlozzi, N E; Tulsky, D S; Heaton, R K; Heinemann, A W

2017-07-01

This study examined the relationships between the Executive Function Performance Test (EFPT), the NIH Toolbox Cognitive Function tests, and neuropsychological executive function measures in 182 persons with traumatic brain injury (TBI) and 46 controls to evaluate construct, discriminant, and predictive validity. Construct validity: There were moderate correlations between the EFPT and the NIH Toolbox Crystallized (r = -.479), Fluid Tests (r = -.420), and Total Composite Scores (r = -.496). Discriminant validity: Significant differences were found in the EFPT total and sequence scores across control, complicated mild/moderate, and severe TBI groups. We found differences in the organisation score between control and severe, and between mild and severe TBI groups. Both TBI groups had significantly lower scores in safety and judgement than controls. Compared to the controls, the severe TBI group demonstrated significantly lower performance on all instrumental activities of daily living (IADL) tasks. Compared to the mild TBI group, the controls performed better on the medication task, the severe TBI group performed worse in the cooking and telephone tasks. Predictive validity: The EFPT predicted the self-perception of independence measured by the TBI-QOL (beta = -0.49, p < .001) for the severe TBI group. Overall, these data support the validity of the EFPT for use in individuals with TBI.

Symbolic control of visual attention: semantic constraints on the spatial distribution of attention.

PubMed

Gibson, Bradley S; Scheutz, Matthias; Davis, Gregory J

2009-02-01

Humans routinely use spatial language to control the spatial distribution of attention. In so doing, spatial information may be communicated from one individual to another across opposing frames of reference, which in turn can lead to inconsistent mappings between symbols and directions (or locations). These inconsistencies may have important implications for the symbolic control of attention because they can be translated into differences in cue validity, a manipulation that is known to influence the focus of attention. This differential validity hypothesis was tested in Experiment 1 by comparing spatial word cues that were predicted to have high learned spatial validity ("above/below") and low learned spatial validity ("left/right"). Consistent with this prediction, when two measures of selective attention were used, the results indicated that attention was less focused in response to "left/right" cues than in response to "above/below" cues, even when the actual validity of each of the cues was equal. In addition, Experiment 2 predicted that spatial words such as "left/right" would have lower spatial validity than would other directional symbols that specify direction along the horizontal axis, such as "<--/-->" cues. The results were also consistent with this hypothesis. Altogether, the present findings demonstrate important semantic-based constraints on the spatial distribution of attention.
Predictive Validity of Curriculum-Embedded Measures on Outcomes of Kindergarteners Identified as At Risk for Reading Difficulty

ERIC Educational Resources Information Center

Oslund, Eric L.; Hagan-Burke, Shanna; Simmons, Deborah C.; Clemens, Nathan H.; Simmons, Leslie E.; Taylor, Aaron B.; Kwok, Oi-man; Coyne, Michael D.

2017-01-01

This study examined the predictive validity of formative assessments embedded in a Tier 2 intervention curriculum for kindergarten students identified as at risk for reading difficulty. We examined when (i.e., months during the school year) measures could predict reading outcomes gathered at the end of kindergarten and whether the predictive…
Aptitude Tests and Successful College Students: The Predictive Validity of the General Aptitude Test (GAT) in Saudi Arabia

ERIC Educational Resources Information Center

Alnahdi, Ghaleb Hamad

2015-01-01

Aptitude tests should predict student success at the university level. This study examined the predictive validity of the General Aptitude Test (GAT) in Saudi Arabia. Data for 27420 students enrolled at Prince Sattam bin Abdulaziz University were analyzed. Of these students, 17565 were male students, and 9855 were female students. Multiple…
Testing a Multi-Stage Screening System: Predicting Performance on Australia's National Achievement Test Using Teachers' Ratings of Academic and Social Behaviors

ERIC Educational Resources Information Center

Kettler, Ryan J.; Elliott, Stephen N.; Davies, Michael; Griffin, Patrick

2012-01-01

This study addresses the predictive validity of results from a screening system of academic enablers, with a sample of Australian elementary school students, when the criterion variable is end-of-year achievement. The investigation included (a) comparing the predictive validity of a brief criterion-referenced nomination system with more…
A Long-Term Predictive Validity Study: Can the CDI Short Form be Used to Predict Language and Early Literacy Skills Four Years Later?

ERIC Educational Resources Information Center

Can, Dilara Deniz; Ginsburg-Block, Marika; Golinkoff, Roberta Michnick; Hirsh-Pasek, Kathryn

2013-01-01

This longitudinal study examined the predictive validity of the MacArthur Communicative Developmental Inventories-Short Form (CDI-SF), a parent report questionnaire about children's language development (Fenson, Pethick, Renda, Cox, Dale & Reznick, 2000). Data were first gathered from parents on the CDI-SF vocabulary scores for seventy-six…
A Case for Transforming the Criterion of a Predictive Validity Study

ERIC Educational Resources Information Center

Patterson, Brian F.; Kobrin, Jennifer L.

2011-01-01

This study presents a case for applying a transformation (Box and Cox, 1964) of the criterion used in predictive validity studies. The goals of the transformation were to better meet the assumptions of the linear regression model and to reduce the residual variance of fitted (i.e., predicted) values. Using data for the 2008 cohort of first-time,…
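
As an illustration of the transformation this record names, the following is a minimal sketch of the Box-Cox power transform, y(λ) = (y^λ − 1)/λ for λ ≠ 0 and ln(y) for λ = 0. The function name `box_cox`, the value of λ, and the sample scores are assumptions made for illustration; they are not taken from the study.

```python
import math


def box_cox(values, lam):
    """Apply the Box-Cox power transform to strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The limiting case of (y**lam - 1) / lam as lam approaches 0.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


# Hypothetical criterion scores (for example grade point averages); not study data.
gpa = [1.8, 2.4, 3.0, 3.6, 4.0]
transformed = box_cox(gpa, 2.0)
```

In practice λ is usually chosen by maximum likelihood rather than fixed by hand; the fixed value here only keeps the sketch short.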
Construct and Predictive Validity of the Core Phonics Survey: A Diagnostic Assessment for Students with Specific Learning Disabilities

ERIC Educational Resources Information Center

Park, Yujeong; Benedict, Amber E.; Brownell, Mary T.

2014-01-01

The factor structure of the CORE Phonics Survey was analyzed using a sample of 165 students in upper elementary school with specific learning disabilities. Confirmatory factor analysis was used to identify the hypothesized constructs of the CORE Phonics Survey and predictive validity of the CORE Phonics Survey to predict students' success in word…
Cross-validation of the Beunen-Malina method to predict adult height.

PubMed

Beunen, Gaston P; Malina, Robert M; Freitas, Duarte I; Maia, José A; Claessens, Albrecht L; Gouveia, Elvio R; Lefevre, Johan

2010-08-01

The purpose of this study was to cross-validate the Beunen-Malina method for non-invasive prediction of adult height. Three hundred and eight boys aged 13, 14, 15 and 16 years from the Madeira Growth Study were observed at annual intervals in 1996, 1997 and 1998 and re-measured 7-8 years later. Height, sitting height and the triceps and subscapular skinfolds were measured; skeletal age was assessed using the Tanner-Whitehouse 2 method. Adult height was measured and predicted using the Beunen-Malina method. Maturity groups were classified using relative skeletal age (skeletal age minus chronological age). Pearson correlations, mean differences and standard errors of estimate (SEE) were calculated. Age-specific correlations between predicted and measured adult height vary between 0.70 and 0.85, while age-specific SEE varies between 3.3 and 4.7 cm. The correlations and SEE are similar to those obtained in the development of the original Beunen-Malina method. The Beunen-Malina method is a valid method to predict adult height in adolescent boys and can be used in European populations or populations of European ancestry. Percentage of predicted adult height is a non-invasive valid method to assess biological maturity.
Development and external validation of a prediction rule for an unfavorable course of late-life depression: A multicenter cohort study.

PubMed

Maarsingh, O R; Heymans, M W; Verhaak, P F; Penninx, B W J H; Comijs, H C

2018-08-01

Given the poor prognosis of late-life depression, it is crucial to identify those at risk. Our objective was to construct and validate a prediction rule for an unfavourable course of late-life depression. For development and internal validation of the model, we used The Netherlands Study of Depression in Older Persons (NESDO) data. We included participants with a major depressive disorder (MDD) at baseline (n = 270; 60-90 years), assessed with the Composite International Diagnostic Interview (CIDI). For external validation of the model, we used The Netherlands Study of Depression and Anxiety (NESDA) data (n = 197; 50-66 years). The outcome was MDD after 2 years of follow-up, assessed with the CIDI. Candidate predictors concerned sociodemographics, psychopathology, physical symptoms, medication, psychological determinants, and healthcare setting. Model performance was assessed by calculating calibration and discrimination. 111 subjects (41.1%) had MDD after 2 years of follow-up. Independent predictors of MDD after 2 years were (older) age, (early) onset of depression, severity of depression, anxiety symptoms, comorbid anxiety disorder, fatigue, and loneliness. The final model showed good calibration and reasonable discrimination (AUC of 0.75; 0.70 after external validation). The strongest individual predictor was severity of depression (AUC of 0.69; 0.68 after external validation). The model was developed and validated in The Netherlands, which could affect the cross-country generalizability. Based on rather simple clinical indicators, it is possible to predict the 2-year course of MDD. The prediction rule can be used for monitoring MDD patients and identifying those at risk of an unfavourable outcome. Copyright © 2018 Elsevier B.V. All rights reserved.
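
The discrimination statistic reported in this record (AUC) can be read as the probability that a randomly chosen case with the outcome receives a higher predicted risk than a randomly chosen case without it. The sketch below computes it by direct pairwise comparison; the function name `auc` and the example risks are hypothetical and are not the study's data or method.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve by pairwise comparison; ties count as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted risks for cases with and without the outcome.
risk_with_outcome = [0.9, 0.7, 0.4]
risk_without_outcome = [0.6, 0.3, 0.2, 0.4]
example_auc = auc(risk_with_outcome, risk_without_outcome)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.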
The predictive validity of quality of evidence grades for the stability of effect estimates was low: a meta-epidemiological study.

PubMed

Gartlehner, Gerald; Dobrescu, Andreea; Evans, Tammeka Swinson; Bann, Carla; Robinson, Karen A; Reston, James; Thaler, Kylie; Skelly, Andrea; Glechner, Anna; Peterson, Kimberly; Kien, Christina; Lohr, Kathleen N

2016-02-01

To determine the predictive validity of the U.S. Evidence-based Practice Center (EPC) approach to GRADE (Grading of Recommendations Assessment, Development and Evaluation). Based on Cochrane reports with outcomes graded as high quality of evidence (QOE), we prepared 160 documents which represented different levels of QOE. Professional systematic reviewers dually graded the QOE. For each document, we determined whether estimates were concordant with high QOE estimates of the Cochrane reports. We compared the observed proportion of concordant estimates with the expected proportion from an international survey. To determine the predictive validity, we used the Hosmer-Lemeshow test to assess calibration and the C (concordance) index to assess discrimination. The predictive validity of the EPC approach to GRADE was limited. Estimates graded as high QOE were less likely, estimates graded as low or insufficient QOE more likely to remain stable than expected. The EPC approach to GRADE could not reliably predict the likelihood that individual bodies of evidence remain stable as new evidence becomes available. C-indices ranged between 0.56 (95% CI, 0.47 to 0.66) and 0.58 (95% CI, 0.50 to 0.67) indicating a low discriminatory ability. The limited predictive validity of the EPC approach to GRADE seems to reflect a mismatch between expected and observed changes in treatment effects as bodies of evidence advance from insufficient to high QOE. Copyright © 2016 Elsevier Inc. All rights reserved.
Cross Cultural Adaptation, Validity, and Reliability of the Farsi Breastfeeding Attrition Prediction Tools in Iranian Pregnant Women

PubMed Central

Mortazavi, Forough; Mousavi, Seyed Abbas; Chaman, Reza; Khosravi, Ahmad; Janke, Jill R.

2015-01-01

Background: The rate of exclusive breastfeeding in Iran is decreasing. The breastfeeding attrition prediction tools (BAPT) have been validated and used in predicting premature weaning. Objectives: We aimed to translate the BAPT into Farsi, assess its content validity, and examine its reliability and validity to identify exclusive breastfeeding discontinuation in Iran. Materials and Methods: The BAPT was translated into Farsi and the content validity of the Farsi version of the BAPT was assessed. It was administered to 356 pregnant women in the third trimester of pregnancy, who were residents of a city in northeast of Iran. The structural integrity of the four-factor model was assessed in confirmatory factor analysis (CFA) and exploratory factor analysis (EFA). Reliability was assessed using Cronbach’s alpha coefficient and item-subscale correlations. Validity was assessed using the known-group comparison (128 with vs. 228 without breastfeeding experience) and predictive validity (80 successes vs. 265 failures in exclusive breastfeeding). Results: The internal consistency of the whole instrument (49 items) was 0.775. CFA provided an acceptable fit to the a priori four-factor model (Chi-square/df = 1.8, Root Mean Square Error of Approximation (RMSEA) = 0.049, Standardized Root Mean Square Residual (SRMR) = 0.064, Comparative Fit Index (CFI) = 0.911). The difference in means of breastfeeding control (BFC) between the participants with and without breastfeeding experience was significant (P < 0.001). In addition, the total score of BAPT and the score of Breast Feeding Control (BFC) subscale were higher in women who were on exclusive breastfeeding than women who were not, at four months postpartum (P < 0.05). Conclusions: This study validated the Farsi version of BAPT. It is useful for researchers who want to use it in Iran to identify women at higher risks of Exclusive Breast Feeding (EBF) discontinuation. PMID:26019910
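
For readers unfamiliar with the reliability coefficient reported in this record, the sketch below computes Cronbach's alpha as α = k/(k − 1) · (1 − Σ item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response matrix are illustrative assumptions, not data from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one equal-length list of item scores per respondent."""
    k = len(rows[0])
    # Transpose so that each tuple holds every respondent's score on one item.
    items = list(zip(*rows))
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)


# Hypothetical responses: four respondents answering two items.
responses = [[1, 2], [2, 1], [3, 3], [4, 4]]
alpha = cronbach_alpha(responses)
```

The sketch uses sample variances throughout; using population variances for both terms gives the same ratio and therefore the same alpha.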
[The Amsterdam wrist rules: the multicenter prospective derivation and external validation of a clinical decision rule for the use of radiography in acute wrist trauma].

PubMed

Walenkamp, Monique M J; Bentohami, Abdelali; Slaar, Annelie; Beerekamp, M S H Suzan; Maas, Mario; Jager, L C Cara; Sosef, Nico L; van Velde, Romuald; Ultee, Jan M; Steyerberg, Ewout W; Goslings, J C Carel; Schep, Niels W L

2016-01-01

Although only 39% of patients with wrist trauma have sustained a fracture, the majority of patients are routinely referred for radiography. The purpose of this study was to derive and externally validate a clinical decision rule that selects patients with acute wrist trauma in the Emergency Department (ED) for radiography. This multicenter prospective study consisted of three components: (1) derivation of a clinical prediction model for detecting wrist fractures in patients following wrist trauma; (2) external validation of this model; and (3) design of a clinical decision rule. The study was conducted in the EDs of five Dutch hospitals: one academic hospital (derivation cohort) and four regional hospitals (external validation cohort). We included all adult patients with acute wrist trauma. The main outcome was fracture of the wrist (distal radius, distal ulna or carpal bones) diagnosed on conventional X-rays. A total of 882 patients were analyzed; 487 in the derivation cohort and 395 in the validation cohort. We derived a clinical prediction model with eight variables: age; sex; swelling of the wrist; swelling of the anatomical snuffbox; visible deformation; distal radius tender to palpation; pain on radial deviation; and painful axial compression of the thumb. The Area Under the Curve at external validation of this model was 0.81 (95% CI: 0.77-0.85). The sensitivity and specificity of the Amsterdam Wrist Rules (AWR) in the external validation cohort were 98% (95% CI: 95-99%) and 21% (95% CI: 15-28%). The negative predictive value was 90% (95% CI: 81-99%). The Amsterdam Wrist Rules is a clinical prediction rule with a high sensitivity and negative predictive value for fractures of the wrist. Although external validation showed low specificity and 100% sensitivity could not be achieved, the Amsterdam Wrist Rules can provide physicians in the Emergency Department with a useful screening tool to select patients with acute wrist trauma for radiography. The upcoming implementation study will further reveal the impact of the Amsterdam Wrist Rules on the anticipated reduction of X-rays requested, missed fractures, Emergency Department waiting times and health care costs.
14 CFR 60.13 - FSTD objective data requirements.

Code of Federal Regulations, 2014 CFR

2014-01-01

..., the data made available to the NSPM (the validation data package) must include the aircraft...) The validation data package may contain flight test data from a source in addition to or independent..., as described in the applicable QPS. (c) The validation data package may also contain predicted data...
14 CFR 60.13 - FSTD objective data requirements.

Code of Federal Regulations, 2011 CFR

2011-01-01

..., the data made available to the NSPM (the validation data package) must include the aircraft...) The validation data package may contain flight test data from a source in addition to or independent..., as described in the applicable QPS. (c) The validation data package may also contain predicted data...
14 CFR 60.13 - FSTD objective data requirements.

Code of Federal Regulations, 2010 CFR

2010-01-01

..., the data made available to the NSPM (the validation data package) must include the aircraft...) The validation data package may contain flight test data from a source in addition to or independent..., as described in the applicable QPS. (c) The validation data package may also contain predicted data...
14 CFR 60.13 - FSTD objective data requirements.

Code of Federal Regulations, 2012 CFR

2012-01-01

..., the data made available to the NSPM (the validation data package) must include the aircraft...) The validation data package may contain flight test data from a source in addition to or independent..., as described in the applicable QPS. (c) The validation data package may also contain predicted data...
14 CFR 60.13 - FSTD objective data requirements.

Code of Federal Regulations, 2013 CFR

2013-01-01

..., the data made available to the NSPM (the validation data package) must include the aircraft...) The validation data package may contain flight test data from a source in addition to or independent..., as described in the applicable QPS. (c) The validation data package may also contain predicted data...
Clinical utility of the Neurobehavioral Symptom Inventory validity scales to screen for symptom exaggeration following traumatic brain injury.

PubMed

Lange, Rael T; Brickell, Tracey A; Lippa, Sara M; French, Louis M

2015-01-01

The purpose of this study was to examine the clinical utility of three recently developed validity scales (Validity-10, NIM5, and LOW6) designed to screen for symptom exaggeration using the Neurobehavioral Symptom Inventory (NSI). Participants were 272 U.S. military service members who sustained a mild, moderate, severe, or penetrating traumatic brain injury (TBI) and who were evaluated by the neuropsychology service at Walter Reed Army Medical Center within 199 weeks post injury. Participants were divided into two groups based on the Negative Impression Management scale of the Personality Assessment Inventory: (a) those who failed symptom validity testing (SVT-fail; n = 27) and (b) those who passed symptom validity testing (SVT-pass; n = 245). Participants in the SVT-fail group had significantly higher scores (p < .001) on the Validity-10, NIM5, LOW6, NSI total, and Personality Assessment Inventory (PAI) clinical scales (range: d = 0.76 to 2.34). Similarly high sensitivity, specificity, positive predictive power (PPP), and negative predictive power (NPP) values were found when using all three validity scales to differentiate SVT-fail versus SVT-pass groups. However, the Validity-10 scale consistently had the highest overall values. The optimal cutoff score for the Validity-10 scale to identify possible symptom exaggeration was ≥19 (sensitivity = .59, specificity = .89, PPP = .74, NPP = .80). For the majority of people, these findings provide support for the use of the Validity-10 scale as a screening tool for possible symptom exaggeration. When scores on the Validity-10 exceed the cutoff score, it is recommended that (a) researchers and clinicians do not interpret responses on the NSI, and (b) clinicians follow up with a more detailed evaluation, using well-validated symptom validity measures (e.g., Minnesota Multiphasic Personality Inventory-2 Restructured Form, MMPI-2-RF, validity scales), to seek confirmatory evidence to support a hypothesis of symptom exaggeration.
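
Positive and negative predictive power depend on the base rate of the condition as well as on sensitivity and specificity. The sketch below shows that relationship through Bayes' rule. The function name `predictive_values` and the 10% base rate are assumptions chosen for illustration; the abstract does not state which base rate the authors used, so the output is not expected to reproduce the reported PPP and NPP.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (positive predictive value, negative predictive value)."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported for the Validity-10 cutoff;
# the 10% base rate is a hypothetical value, not one taken from the study.
ppv, npv = predictive_values(0.59, 0.89, 0.10)
```

Rerunning the function with a higher assumed base rate raises the positive predictive value and lowers the negative predictive value, which is why screening cutoffs are usually evaluated across several plausible base rates.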
Impact of External Cue Validity on Driving Performance in Parkinson's Disease

PubMed Central

Scally, Karen; Charlton, Judith L.; Iansek, Robert; Bradshaw, John L.; Moss, Simon; Georgiou-Karistianis, Nellie

2011-01-01

This study sought to investigate the impact of external cue validity on simulated driving performance in 19 Parkinson's disease (PD) patients and 19 healthy age-matched controls. Braking points and distance between deceleration point and braking point were analysed for red traffic signals preceded either by Valid Cues (correctly predicting signal), Invalid Cues (incorrectly predicting signal), and No Cues. Results showed that PD drivers braked significantly later and travelled significantly further between deceleration and braking points compared with controls for Invalid and No-Cue conditions. No significant group differences were observed for driving performance in response to Valid Cues. The benefit of Valid Cues relative to Invalid Cues and No Cues was significantly greater for PD drivers compared with controls. Trail Making Test (B-A) scores correlated with driving performance for PDs only. These results highlight the importance of external cues and higher cognitive functioning for driving performance in mild to moderate PD. PMID:21789275
Quantitative validation of carbon-fiber laminate low velocity impact simulations

DOE PAGES

English, Shawn A.; Briggs, Timothy M.; Nelson, Stacy M.

2015-09-26

Simulations of low velocity impact with a flat cylindrical indenter upon a carbon fiber fabric reinforced polymer laminate are rigorously validated. Comparison of the impact energy absorption between the model and experiment is used as the validation metric. Additionally, non-destructive evaluation, including ultrasonic scans and three-dimensional computed tomography, provide qualitative validation of the models. The simulations include delamination, matrix cracks and fiber breaks. An orthotropic damage and failure constitutive model, capable of predicting progressive damage and failure, is developed in conjunction and described. An ensemble of simulations incorporating model parameter uncertainties is used to predict a response distribution which is then compared to experimental output using appropriate statistical methods. Lastly, the model form errors are exposed and corrected for use in an additional blind validation analysis. The result is a quantifiable confidence in material characterization and model physics when simulating low velocity impact in structures of interest.

Validation of biomarkers to predict response to immunotherapy in cancer: Volume II - clinical validation and regulatory considerations.

PubMed

Dobbin, Kevin K; Cesano, Alessandra; Alvarez, John; Hawtin, Rachael; Janetzki, Sylvia; Kirsch, Ilan; Masucci, Giuseppe V; Robbins, Paul B; Selvan, Senthamil R; Streicher, Howard Z; Zhang, Jenny; Butterfield, Lisa H; Thurin, Magdalena

2016-01-01

There is growing recognition that immunotherapy is likely to significantly improve health outcomes for cancer patients in the coming years. Currently, while a subset of patients experience substantial clinical benefit in response to different immunotherapeutic approaches, the majority of patients do not but are still exposed to the significant drug toxicities. Therefore, a growing need for the development and clinical use of predictive biomarkers exists in the field of cancer immunotherapy. Predictive cancer biomarkers can be used to identify the patients who are or who are not likely to derive benefit from specific therapeutic approaches. In order to be applicable in a clinical setting, predictive biomarkers must be carefully shepherded through a step-wise, highly regulated developmental process. Volume I of this two-volume document focused on the pre-analytical and analytical phases of the biomarker development process, by providing background, examples and "good practice" recommendations. In the current Volume II, the focus is on the clinical validation, validation of clinical utility and regulatory considerations for biomarker development. Together, this two volume series is meant to provide guidance on the entire biomarker development process, with a particular focus on the unique aspects of developing immune-based biomarkers. Specifically, knowledge about the challenges to clinical validation of predictive biomarkers, which has been gained from numerous successes and failures in other contexts, will be reviewed together with statistical methodological issues related to bias and overfitting. The different trial designs used for the clinical validation of biomarkers will also be discussed, as the selection of clinical metrics and endpoints becomes critical to establish the clinical utility of the biomarker during the clinical validation phase of the biomarker development. Finally, the regulatory aspects of submission of biomarker assays to the U.S. Food and Drug Administration as well as regulatory considerations in the European Union will be covered.
Recidivism in female offenders: PCL-R lifestyle factor and VRAG show predictive validity in a German sample.

PubMed

Eisenbarth, Hedwig; Osterheider, Michael; Nedopil, Norbert; Stadtland, Cornelis

2012-01-01

A clear and structured approach to evidence-based and gender-specific risk assessment of violence in female offenders is high on political and mental health agendas. However, most data on the factors involved in risk-assessment instruments are based on data of male offenders. The aim of the present study was to validate the use of the Psychopathy Checklist Revised (PCL-R), the HCR-20 and the Violence Risk Appraisal Guide (VRAG) for the prediction of recidivism in German female offenders. This study is part of the Munich Prognosis Project (MPP). It focuses on a subsample of female delinquents (n = 80) who had been referred for forensic-psychiatric evaluation prior to sentencing. The mean time at risk was 8 years (SD = 5 years; range: 1-18 years). During this time, 31% (n = 25) of the female offenders were reconvicted, 5% (n = 4) for violent and 26% (n = 21) for non-violent re-offenses. The predictive validity of the PCL-R for general recidivism was calculated. Analysis with receiver-operating characteristics revealed that the PCL-R total score, the PCL-R antisocial lifestyle factor, the PCL-R lifestyle factor and the PCL-R impulsive and irresponsible behavioral style factor had a moderate predictive validity for general recidivism (area under the curve, AUC = 0.66, p = 0.02). The VRAG has also demonstrated predictive validity (AUC = 0.72, p = 0.02), whereas the HCR-20 showed no predictive validity. These results appear to provide the first evidence that the PCL-R total score and the antisocial lifestyle factor are predictive for general female recidivism, as has been shown consistently for male recidivists. The implications of these findings for crime prevention, prognosis in women, and future research are discussed. Copyright © 2012 John Wiley & Sons, Ltd.
Application of Multivariable Analysis and FTIR-ATR Spectroscopy to the Prediction of Properties in Campeche Honey

PubMed Central

Pat, Lucio; Ali, Bassam; Guerrero, Armando; Córdova, Atl V.; Garduza, José P.

2016-01-01

Attenuated total reflectance-Fourier transform infrared spectrometry combined with a chemometric model was used to determine physicochemical properties (pH, redox potential, free acidity, electrical conductivity, moisture, total soluble solids (TSS), ash, and HMF) in honey samples. The reference values of 189 honey samples of different botanical origin were determined using the methods of the Association of Official Analytical Chemists (AOAC, 1990), the Codex Alimentarius (2001), and the International Honey Commission (2002). Multivariate calibration models were built using partial least squares (PLS) for the measurands studied. The developed models were validated using cross-validation and external validation; several statistical parameters were obtained to determine the robustness of the calibration models: the optimum number of principal components (PCs), the standard error of cross-validation (SECV), the coefficient of determination of cross-validation (R² cal), the standard error of validation (SEP), the coefficient of determination for external validation (R² val), and the coefficient of variation (CV). The prediction accuracy for pH, redox potential, electrical conductivity, moisture, TSS, and ash was good, while for free acidity and HMF it was poor. The results demonstrate that attenuated total reflectance-Fourier transform infrared spectrometry is a valuable, rapid, and nondestructive tool for the quantification of physicochemical properties of honey. PMID:28070445
Definition and Demonstration of a Methodology for Validating Aircraft Trajectory Predictors

NASA Technical Reports Server (NTRS)

Vivona, Robert A.; Paglione, Mike M.; Cate, Karen T.; Enea, Gabriele

2010-01-01

This paper presents a new methodology for validating an aircraft trajectory predictor, inspired by the lessons learned from a number of field trials, flight tests and simulation experiments for the development of trajectory-predictor-based automation. The methodology introduces new techniques and a new multi-staged approach to reduce the effort in identifying and resolving validation failures, avoiding the potentially large costs associated with failures during a single-stage, pass/fail approach. As a case study, the validation effort performed by the Federal Aviation Administration for its En Route Automation Modernization (ERAM) system is analyzed to illustrate the real-world applicability of this methodology. During this validation effort, ERAM initially failed to achieve six of its eight requirements associated with trajectory prediction and conflict probe. The ERAM validation issues have since been addressed, but to illustrate how the methodology could have benefited the FAA effort, additional techniques are presented that could have been used to resolve some of these issues. Using data from the ERAM validation effort, it is demonstrated that these new techniques could have identified trajectory prediction error sources that contributed to several of the unmet ERAM requirements.
Display format, highlight validity, and highlight method: Their effects on search performance

NASA Technical Reports Server (NTRS)

Donner, Kimberly A.; Mckay, Tim D.; Obrien, Kevin M.; Rudisill, Marianne

1991-01-01

Display format and highlight validity were shown to affect visual display search performance; however, these studies were conducted on small, artificial displays of alphanumeric stimuli. A study manipulating these variables was conducted using realistic, complex Space Shuttle information displays. A 2x2x3 within-subjects analysis of variance found that search times were faster for items in reformatted displays than for current displays. Responses to valid applications of highlight were significantly faster than responses to non or invalidly highlighted applications. The significant format by highlight validity interaction showed that there was little difference in response time to both current and reformatted displays when the highlight validity was applied; however, under the non or invalid highlight conditions, search times were faster with reformatted displays. A separate within-subject analysis of variance of display format, highlight validity, and several highlight methods did not reveal a main effect of highlight method. In addition, observed display search times were compared to search time predicted by Tullis' Display Analysis Program. Benefits of highlighting and reformatting displays to enhance search and the necessity to consider highlight validity and format characteristics in tandem for predicting search performance are discussed.
Spatial and temporal predictions of agricultural land prices using DSM techniques.

NASA Astrophysics Data System (ADS)

Carré, F.; Grandgirard, D.; Diafas, I.; Reuter, H. I.; Julien, V.; Lemercier, B.

2009-04-01

Agricultural land prices strongly affect farmers' access to land and, consequently, the evolution of agricultural landscapes (crop changes, land conversion to urban infrastructures…), which can lead to irreversible soil degradation. The economic value of agricultural land has been studied spatially, in every one of the 374 French Agricultural Counties, and temporally, from 1995 to 2007, using data from the SAFER Institute. To this aim, agricultural land price was considered as a digital soil property. The spatial and temporal predictions were made using Digital Soil Mapping techniques combined with tools mainly used for studying temporal financial behaviors. For both predictions, a first classification of the Agricultural Counties was done for the 1995-2006 period (2007 was excluded and served as the date of prediction) using fuzzy k-means clustering. The Agricultural Counties were then aggregated according to land price at the different times. The clustering allows the counties to be characterized by their memberships to each class centroid. The memberships were used for the spatial prediction, whereas the centroids were used for the temporal prediction. For the spatial prediction, three fourths of the 374 Agricultural Counties were used for modeling and one fourth for validation. Random sampling was done by class to ensure that all classes are represented by at least one county in the modeling and validation datasets.
The prediction was done for each class by testing the relationships between the memberships and the following factors: (i) a soil variable (organic matter from the French BDAT database), (ii) soil covariates (land use classes from CORINE LANDCOVER, bioclimatic zones from the WorldClim Database, landform attributes and landform classes from the SRTM, major roads and hydrographic densities from EUROSTAT, average field sizes estimated by automatic classification of remotely sensed images) and (iii) socio-economic factors (population density, gross domestic product and its combination with the population density obtained from EUROSTAT). Linear (Generalized Linear Models) and non-linear models (neural network) were used for building the relationships. For the validation, the relationships were applied to the validation datasets. The RMSE and the coefficient of determination (from a linear regression) between predicted and actual memberships, and the contingency table between the predicted and actual allocation classes were used as validation criteria. The temporal prediction was done on the year 2007 from the centroid land prices characterizing the 1995-2006 period. For each class, the land prices of the time-series 1995-2006 were modeled using an Auto-Regressive Moving Average approach. For the validation, the models were applied to the year 2007. The RMSE between predicted and actual prices is used as the validation criterion. We then discussed the methods and the results of the spatial and temporal validation. Based on this methodology, an extrapolation will be tested on another European country with a land price market similar to that of France (to be determined).
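The two numeric validation criteria named above, the RMSE and the coefficient of determination from a linear regression of actual on predicted values, can be sketched as follows. This is an illustrative sketch only: the function names and the sample values are assumptions, not the authors' code or data. For a simple linear regression, the coefficient of determination equals the squared Pearson correlation, which is what `r_squared` computes.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predicted and actual values."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


def r_squared(predicted, actual):
    """R² of a simple linear regression of actual on predicted.

    Equal to the squared Pearson correlation between the two series.
    """
    n = len(actual)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    s_pa = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_aa = sum((a - mean_a) ** 2 for a in actual)
    return s_pa ** 2 / (s_pp * s_aa)


# Hypothetical class memberships (illustration only)
predicted = [0.20, 0.50, 0.70, 0.90]
actual = [0.25, 0.45, 0.80, 0.85]
print(rmse(predicted, actual), r_squared(predicted, actual))
```

Note that a high R² from such a regression does not rule out systematic bias, which is one reason to report the RMSE alongside it.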
In Infants' Hands: Identification of Preverbal Infants at Risk for Primary Language Delay

ERIC Educational Resources Information Center

Lüke, Carina; Grimminger, Angela; Rohlfing, Katharina J.; Liszkowski, Ulf; Ritterfeld, Ute

2017-01-01

Early identification of primary language delay is crucial to implement effective prevention programs. Available screening instruments are based on parents' reports and have only insufficient predictive validity. This study employed observational measures of preverbal infants' gestural communication to test its predictive validity for identifying…
Predicting Academic Performance at a Predominantly Black Medical School.

ERIC Educational Resources Information Center

Johnson, Davis G.; And Others

1986-01-01

The validity of the Medical College Admission Test (MCAT), undergraduate grade-point average (GPA), and "competitiveness" of undergraduate college in predicting the performance of students at a predominantly black college of medicine was examined. No differences between men and women were found in the validity of MCAT scores and GPA.…
A Comparative Study of Adolescent Risk Assessment Instruments: Predictive and Incremental Validity

ERIC Educational Resources Information Center

Welsh, Jennifer L.; Schmidt, Fred; McKinnon, Lauren; Chattha, H. K.; Meyers, Joanna R.

2008-01-01

Promising new adolescent risk assessment tools are being incorporated into clinical practice but currently possess limited evidence of predictive validity regarding their individual and/or combined use in risk assessments. The current study compares three structured adolescent risk instruments, Youth Level of Service/Case Management Inventory…
The Predictive Validity of the Metropolitan Readiness Tests, 1976 Edition.

ERIC Educational Resources Information Center

Nagle, Richard J.

1979-01-01

A sample of 176 first-grade children was tested on the Metropolitan Readiness Tests, 1976 Edition (MRT), during the initial month of school and was retested eight months later on the Stanford Achievement Test. Results demonstrated substantial validity of the MRT for predicting first-grade achievement. (Author/CTM)
Predicting Intent to Get a College Degree.

ERIC Educational Resources Information Center

Staats, Sara; Partlo, Christie

1990-01-01

Examined reliability and validity of Perceived Quality of Academic Life (PQAL) instrument with data collected from 218 midwestern commuter college students. Extended existing research by studying PQAL scores as predictors of intent to remain in college. Findings showed that the PQAL was reliable, valid, and predictive of future intent to obtain a…
The Predictive Validity of Dynamic Assessment: A Review

ERIC Educational Resources Information Center

Caffrey, Erin; Fuchs, Douglas; Fuchs, Lynn S.

2008-01-01

The authors report on a mixed-methods review of 24 studies that explores the predictive validity of dynamic assessment (DA). For 15 of the studies, they conducted quantitative analyses using Pearson's correlation coefficients. They descriptively examined the remaining studies to determine if their results were consistent with findings from the…
The Predictive Validity of Projective Measures.

ERIC Educational Resources Information Center

Suinn, Richard M.; Oskamp, Stuart

Written for use by clinical practitioners as well as psychological researchers, this book surveys recent literature (1950-1965) on projective test validity by reviewing and critically evaluating studies which shed light on what may reliably be predicted from projective test results. Two major instruments are covered: the Rorschach and the Thematic…
Examination of the Mild Brain Injury Atypical Symptom Scale and the Validity-10 Scale to detect symptom exaggeration in US military service members.

PubMed

Lange, Rael T; Brickell, Tracey A; French, Louis M

2015-01-01

The purpose of this study was to examine the clinical utility of two validity scales designed for use with the Neurobehavioral Symptom Inventory (NSI) and the PTSD Checklist-Civilian Version (PCL-C); the Mild Brain Injury Atypical Symptoms Scale (mBIAS) and Validity-10 scale. Participants were 63 U.S. military service members (age: M = 31.9 years, SD = 12.5; 90.5% male) who sustained a mild traumatic brain injury (MTBI) and were prospectively enrolled from Walter Reed National Military Medical Center. Participants were divided into two groups based on the validity scales of the Minnesota Multiphasic Personality Inventory-2 Restructured Form (MMPI-2-RF): (a) symptom validity test (SVT)-Fail (n = 24) and (b) SVT-Pass (n = 39). Participants were evaluated on average 19.4 months postinjury (SD = 27.6). Participants in the SVT-Fail group had significantly higher scores (p < .05) on the mBIAS (d = 0.85), Validity-10 (d = 1.89), NSI (d = 2.23), and PCL-C (d = 2.47), and the vast majority of the MMPI-2-RF scales (d = 0.69 to d = 2.47). Sensitivity, specificity, and predictive power values were calculated across the range of mBIAS and Validity-10 scores to determine the optimal cutoff to detect symptom exaggeration. For the mBIAS, a cutoff score of ≥8 was considered optimal, which resulted in low sensitivity (.17), high specificity (1.0), high positive predictive power (1.0), and moderate negative predictive power (.69). For the Validity-10 scale, a cutoff score of ≥13 was considered optimal, which resulted in moderate-high sensitivity (.63), high specificity (.97), and high positive (.93) and negative predictive power (.83). These findings provide strong support for the use of the Validity-10 as a tool to screen for symptom exaggeration when administering the NSI and PCL-C. The mBIAS, however, was not a reliable tool for this purpose and failed to identify the vast majority of people who exaggerated symptoms.
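Sensitivity, specificity, and positive and negative predictive power, as used above, all derive from the four cells of a 2×2 classification table at a given cutoff. A minimal sketch follows; the function name and the counts are hypothetical and do not reproduce the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Classification accuracy statistics from a 2x2 table.

    tp/fn: cases with the condition flagged / missed by the scale
    fp/tn: cases without the condition flagged / correctly passed
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly passed
        "ppv": tp / (tp + fp),          # positive predictive power
        "npv": tn / (tn + fn),          # negative predictive power
    }


# Hypothetical counts at a single cutoff (illustration only)
metrics = diagnostic_metrics(tp=30, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Unlike sensitivity and specificity, the two predictive power values depend on the base rate of the condition in the sample, so they do not transfer directly to settings with a different prevalence.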
Measuring acuity of the approximate number system reliably and validly: the evaluation of an adaptive test procedure

PubMed Central

Lindskog, Marcus; Winman, Anders; Juslin, Peter; Poom, Leo

2013-01-01

Two studies investigated the reliability and predictive validity of commonly used measures and models of Approximate Number System acuity (ANS). Study 1 investigated reliability by both an empirical approach and a simulation of maximum obtainable reliability under ideal conditions. Results showed that common measures of the Weber fraction (w) are reliable only when using a substantial number of trials, even under ideal conditions. Study 2 compared different purported measures of ANS acuity as for convergent and predictive validity in a within-subjects design and evaluated an adaptive test using the ZEST algorithm. Results showed that the adaptive measure can reduce the number of trials needed to reach acceptable reliability. Only direct tests with non-symbolic numerosity discriminations of stimuli presented simultaneously were related to arithmetic fluency. This correlation remained when controlling for general cognitive ability and perceptual speed. Further, the purported indirect measure of ANS acuity in terms of the Numeric Distance Effect (NDE) was not reliable and showed no sign of predictive validity. The non-symbolic NDE for reaction time was significantly related to direct w estimates in a direction contrary to the expected. Easier stimuli were found to be more reliable, but only harder (7:8 ratio) stimuli contributed to predictive validity. PMID:23964256
Assessment of predictive performance in incomplete data by combining internal validation and multiple imputation.

PubMed

Wahl, Simone; Boulesteix, Anne-Laure; Zierer, Astrid; Thorand, Barbara; van de Wiel, Mark A

2016-10-26

Missing values are a frequent issue in human studies. In many situations, multiple imputation (MI) is an appropriate missing data handling strategy, whereby missing values are imputed multiple times, the analysis is performed in every imputed data set, and the obtained estimates are pooled. If the aim is to estimate (added) predictive performance measures, such as (change in) the area under the receiver-operating characteristic curve (AUC), internal validation strategies become desirable in order to correct for optimism. It is not fully understood how internal validation should be combined with multiple imputation. In a comprehensive simulation study and in a real data set based on blood markers as predictors for mortality, we compare three combination strategies: Val-MI (internal validation followed by MI on the training and test parts separately), MI-Val (MI on the full data set followed by internal validation), and MI(-y)-Val (MI on the full data set omitting the outcome, followed by internal validation). Different validation strategies, including bootstrap and cross-validation, different (added) performance measures, and various data characteristics are considered, and the strategies are evaluated with regard to bias and mean squared error of the obtained performance estimates. In addition, we elaborate on the number of resamples and imputations to be used, and adopt a strategy for confidence interval construction to incomplete data. Internal validation is essential in order to avoid optimism, with the bootstrap 0.632+ estimate representing a reliable method to correct for optimism. While estimates obtained by MI-Val are optimistically biased, those obtained by MI(-y)-Val tend to be pessimistic in the presence of a true underlying effect. Val-MI provides largely unbiased estimates, with a slight pessimistic bias with increasing true effect size, number of covariates and decreasing sample size.
In Val-MI, accuracy of the estimate is more strongly improved by increasing the number of bootstrap draws rather than the number of imputations. With a simple integrated approach, valid confidence intervals for performance estimates can be obtained. When prognostic models are developed on incomplete data, Val-MI represents a valid strategy to obtain estimates of predictive performance measures.
Derivation and validation of simple anthropometric equations to predict adipose tissue mass and total fat mass with MRI as the reference method

PubMed Central

Al-Gindan, Yasmin Y.; Hankey, Catherine R.; Govan, Lindsay; Gallagher, Dympna; Heymsfield, Steven B.; Lean, Michael E. J.

2017-01-01

The reference organ-level body composition measurement method is MRI. Practical estimations of total adipose tissue mass (TATM), total adipose tissue fat mass (TATFM) and total body fat are valuable for epidemiology, but validated prediction equations based on MRI are not currently available. We aimed to derive and validate new anthropometric equations to estimate MRI-measured TATM/TATFM/total body fat and compare them with existing prediction equations using older methods. The derivation sample included 416 participants (222 women), aged between 18 and 88 years with BMI between 15·9 and 40·8 (kg/m²). The validation sample included 204 participants (110 women), aged between 18 and 86 years with BMI between 15·7 and 36·4 (kg/m²). Both samples included mixed ethnic/racial groups. All the participants underwent whole-body MRI to quantify TATM (dependent variable) and anthropometry (independent variables). Prediction equations developed using stepwise multiple regression were further investigated for agreement and bias before validation in separate data sets. Simplest equations with optimal R² and Bland–Altman plots demonstrated good agreement without bias in the validation analyses: men: TATM (kg) = 0·198 weight (kg) + 0·478 waist (cm) − 0·147 height (cm) − 12·8 (validation: R² 0·79, CV = 20 %, standard error of the estimate (SEE) = 3·8 kg) and women: TATM (kg) = 0·789 weight (kg) + 0·0786 age (years) − 0·342 height (cm) + 24·5 (validation: R² 0·84, CV = 13 %, SEE = 3·0 kg). Published anthropometric prediction equations, based on MRI and computed tomographic scans, correlated strongly with MRI-measured TATM (R² 0·70–0·82). Estimated TATFM correlated well with published prediction equations for total body fat based on underwater weighing (R² 0·70–0·80), with mean bias of 2·5–4·9 kg, correctable with log-transformation in most equations.
In conclusion, new equations, using simple anthropometric measurements, estimated MRI-measured TATM with correlations and agreements suitable for use in groups and populations across a wide range of fatness. PMID:26435103
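The two TATM equations quoted in the abstract above can be written out directly. The sketch below transcribes the published coefficients; the function names and the example inputs are hypothetical, and the abstract presents the equations as suitable for groups and populations rather than for individual diagnosis.

```python
def tatm_men_kg(weight_kg, waist_cm, height_cm):
    """Total adipose tissue mass (kg) for men, coefficients from the abstract."""
    return 0.198 * weight_kg + 0.478 * waist_cm - 0.147 * height_cm - 12.8


def tatm_women_kg(weight_kg, age_years, height_cm):
    """Total adipose tissue mass (kg) for women, coefficients from the abstract."""
    return 0.789 * weight_kg + 0.0786 * age_years - 0.342 * height_cm + 24.5


# Hypothetical example inputs (illustration only)
print(round(tatm_men_kg(weight_kg=80, waist_cm=90, height_cm=180), 1))
print(round(tatm_women_kg(weight_kg=65, age_years=40, height_cm=165), 1))
```

The men's equation uses waist circumference while the women's uses age, so the two functions take different arguments by design.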
Testing Pearl Model In Three European Sites

NASA Astrophysics Data System (ADS)

Bouraoui, F.; Bidoglio, G.

The Plant Protection Product Directive (91/414/EEC) stresses the need for validated models to calculate predicted environmental concentrations. The use of models has become an unavoidable step before pesticide registration. In this context, the European Commission, and in particular DGVI, set up a FOrum for the Co-ordination of pesticide fate models and their USe (FOCUS). In a complementary effort, DG Research supported the APECOP project, with one of its objectives being the validation and improvement of existing pesticide fate models. The main topic of research presented here is the validation of the PEARL model for different sites in Europe. The PEARL model, currently used in the Dutch pesticide registration procedure, was validated in three well-instrumented sites: Vredepeel (the Netherlands), Brimstone (UK), and Lanna (Sweden). A step-wise procedure was used for the validation of the PEARL model. First the water transport module was calibrated, and then the solute transport module, using tracer measurements and keeping the water transport parameters unchanged. The Vredepeel site is characterised by a sandy soil. Fourteen months of measurements were used for the calibration. Two pesticides were applied on the site: bentazone and ethoprophos. PEARL predictions were very satisfactory for both soil moisture content and pesticide concentration in the soil profile. The Brimstone site is characterised by a cracking clay soil. The calibration was conducted on a 7-year time series of measurements. The validation consisted in comparing predictions and measurements of soil moisture at different soil depths, and in comparing the predicted and measured concentration of isoproturon in the drainage water. The results, even if in good agreement with the measurements, highlighted the limitation of the model when preferential flow becomes a dominant process.
PEARL did not reproduce the soil moisture profile well during summer months, and also under-predicted the arrival of isoproturon at the drains. The Lanna site is characterised by a structured clay soil. PEARL was successful in predicting soil moisture profiles and the draining water. PEARL performed well in predicting the soil concentration of bentazone at different depths. However, since PEARL does not consider cracks in the soil, it did not predict well the peak concentrations of bentazone in the drainage water. Along with the validation results for the three sites, a sensitivity analysis of the model is presented.
The development and validation of different decision-making tools to predict urine culture growth out of urine flow cytometry parameter.

PubMed

Müller, Martin; Seidenberg, Ruth; Schuh, Sabine K; Exadaktylos, Aristomenis K; Schechter, Clyde B; Leichtle, Alexander B; Hautz, Wolf E

2018-01-01

Patients presenting with suspected urinary tract infection are common in every day emergency practice. Urine flow cytometry has replaced microscopic urine evaluation in many emergency departments, but interpretation of the results remains challenging. The aim of this study was to develop and validate tools that predict urine culture growth out of urine flow cytometry parameter. This retrospective study included all adult patients that presented in a large emergency department between January and July 2017 with a suspected urinary tract infection and had a urine flow cytometry as well as a urine culture obtained. The objective was to identify urine flow cytometry parameters that reliably predict urine culture growth and mixed flora growth. The data set was split into a training (70%) and a validation set (30%) and different decision-making approaches were developed and validated. Relevant urine culture growth (respectively mixed flora growth) was found in 40.2% (7.2% respectively) of the 613 patients included. The number of leukocytes and bacteria in flow cytometry were highly associated with urine culture growth, but mixed flora growth could not be sufficiently predicted from the urine flow cytometry parameters. A decision tree, predictive value figures, a nomogram, and a cut-off table to predict urine culture growth from bacteria and leukocyte count were developed, validated and compared. Urine flow cytometry parameters are insufficient to predict mixed flora growth. However, the prediction of urine culture growth based on bacteria and leukocyte count is highly accurate and the developed tools should be used as part of the decision-making process of ordering a urine culture or starting an antibiotic therapy if a urogenital infection is suspected.
The development and validation of different decision-making tools to predict urine culture growth out of urine flow cytometry parameter

PubMed Central

Seidenberg, Ruth; Schuh, Sabine K.; Exadaktylos, Aristomenis K.; Schechter, Clyde B.; Leichtle, Alexander B.; Hautz, Wolf E.

2018-01-01

Objective: Patients presenting with suspected urinary tract infection are common in every day emergency practice. Urine flow cytometry has replaced microscopic urine evaluation in many emergency departments, but interpretation of the results remains challenging. The aim of this study was to develop and validate tools that predict urine culture growth out of urine flow cytometry parameter. Methods: This retrospective study included all adult patients that presented in a large emergency department between January and July 2017 with a suspected urinary tract infection and had a urine flow cytometry as well as a urine culture obtained. The objective was to identify urine flow cytometry parameters that reliably predict urine culture growth and mixed flora growth. The data set was split into a training (70%) and a validation set (30%) and different decision-making approaches were developed and validated. Results: Relevant urine culture growth (respectively mixed flora growth) was found in 40.2% (7.2% respectively) of the 613 patients included. The number of leukocytes and bacteria in flow cytometry were highly associated with urine culture growth, but mixed flora growth could not be sufficiently predicted from the urine flow cytometry parameters. A decision tree, predictive value figures, a nomogram, and a cut-off table to predict urine culture growth from bacteria and leukocyte count were developed, validated and compared. Conclusions: Urine flow cytometry parameters are insufficient to predict mixed flora growth. However, the prediction of urine culture growth based on bacteria and leukocyte count is highly accurate and the developed tools should be used as part of the decision-making process of ordering a urine culture or starting an antibiotic therapy if a urogenital infection is suspected. PMID:29474463

Validation study of the SCREENIVF: an instrument to screen women or men on risk for emotional maladjustment before the start of a fertility treatment.

PubMed

Ockhuijsen, Henrietta D L; van Smeden, Maarten; van den Hoogen, Agnes; Boivin, Jacky

2017-06-01

To examine construct and criterion validity of the Dutch SCREENIVF among women and men undergoing a fertility treatment. A prospective longitudinal study nested in a randomized controlled trial. University hospital. Couples, 468 women and 383 men, undergoing an IVF/intracytoplasmic sperm injection (ICSI) treatment in a fertility clinic, completed the SCREENIVF. Construct and criterion validity of the SCREENIVF. The comparative fit index and root mean square error of approximation for women and men show a good fit of the factor model. Across time, the sensitivity for the Hospital Anxiety and Depression Scale subscale in women ranged from 61%-98%, specificity 53%-65%, predictive value of a positive test (PVP) 13%-56%, predictive value of a negative test (PVN) 70%-99%. The sensitivity scores for men ranged from 38%-100%, specificity 71%-75%, PVP 9%-27%, PVN 92%-100%. A prediction model revealed that for women 68.7% of the variance in the Hospital Anxiety and Depression Scale at time 1, 42.5% at time 2 and 38.9% at time 3 was explained by the predictors, the sum score scales of the SCREENIVF. For men, 58.1% of the variance in the Hospital Anxiety and Depression Scale at time 1, 46.5% at time 2 and 37.3% at time 3 was explained by the predictors, the sum score scales of the SCREENIVF. The SCREENIVF has good construct validity but the concurrent validity is better than the predictive validity. SCREENIVF will be most effectively used in fertility clinics at the start of treatment and should not be used as a predictive tool. Copyright © 2017 American Society for Reproductive Medicine. All rights reserved.
Risk prediction models of breast cancer: a systematic review of model performances.

PubMed

Anothaisintawee, Thunyarat; Teerawattananon, Yot; Wiratkapun, Chollathip; Kasamesup, Vijj; Thakkinstian, Ammarin

2012-05-01

An increasing number of risk prediction models have been developed for estimating breast cancer risk in individual women. However, the performance of those models is questionable. We therefore conducted a study to systematically review previous risk prediction models. The results from this review help to identify the most reliable model and indicate the strengths and weaknesses of each model for guiding future model development. We searched MEDLINE (PubMed) from 1949 and EMBASE (Ovid) from 1974 until October 2010. Observational studies which constructed models using regression methods were selected. Information about model development and performance was extracted. Twenty-five out of 453 studies were eligible. Of these, 18 developed prediction models and 7 validated existing prediction models. Up to 13 variables were included in the models and sample sizes for each study ranged from 550 to 2,404,636. Internal validation was performed in four models, while five models had external validation. The Gail and the Rosner and Colditz models were the significant models which were subsequently modified by other scholars. Calibration performance of most models was fair to good (expected/observed ratio: 0.87-1.12), but discriminatory accuracy was poor to fair both in internal validation (concordance statistics: 0.53-0.66) and in external validation (concordance statistics: 0.56-0.63). Most models yielded relatively poor discrimination in both internal and external validation. This poor discriminatory accuracy of existing models might be because of a lack of knowledge about risk factors, heterogeneous subtypes of breast cancer, and different distributions of risk factors across populations. In addition, the concordance statistic itself is insensitive for measuring improvement in discrimination. Therefore, newer methods such as the net reclassification index should be considered to evaluate the improvement in performance of a newly developed model.
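The calibration measure cited above, the expected/observed ratio, compares the number of events a model predicts with the number that actually occurred. A minimal sketch, assuming each person has a predicted risk between 0 and 1 and a 0/1 outcome; the function name and the sample values are hypothetical.

```python
def expected_observed_ratio(predicted_risks, outcomes):
    """Expected/observed ratio: sum of predicted risks over count of events.

    A value near 1 indicates good calibration-in-the-large; values above 1
    mean the model over-predicts, values below 1 mean it under-predicts.
    """
    observed = sum(outcomes)
    if observed == 0:
        raise ValueError("ratio is undefined when no events were observed")
    return sum(predicted_risks) / observed


# Hypothetical predicted risks and 0/1 outcomes (illustration only)
risks = [0.10, 0.20, 0.30, 0.40]
events = [0, 0, 1, 0]
print(expected_observed_ratio(risks, events))
```

A model can be well calibrated on this measure and still discriminate poorly, which matches the pattern the review reports.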
Prediction of Outcome after Moderate and Severe Traumatic Brain Injury: External Validation of the IMPACT and CRASH Prognostic Models

PubMed Central

Roozenbeek, Bob; Lingsma, Hester F.; Lecky, Fiona E.; Lu, Juan; Weir, James; Butcher, Isabella; McHugh, Gillian S.; Murray, Gordon D.; Perel, Pablo; Maas, Andrew I.R.; Steyerberg, Ewout W.

2012-01-01

Objective: The International Mission on Prognosis and Analysis of Clinical Trials (IMPACT) and Corticoid Randomisation After Significant Head injury (CRASH) prognostic models predict outcome after traumatic brain injury (TBI) but have not been compared in large datasets. The objective of this study is to validate externally and compare the IMPACT and CRASH prognostic models for prediction of outcome after moderate or severe TBI. Design: External validation study. Patients: We considered 5 new datasets with a total of 9036 patients, comprising three randomized trials and two observational series, containing prospectively collected individual TBI patient data. Measurements: Outcomes were mortality and unfavourable outcome, based on the Glasgow Outcome Score (GOS) at six months after injury. To assess performance, we studied the discrimination of the models (by AUCs), and calibration (by comparison of the mean observed to predicted outcomes and calibration slopes). Main Results: The highest discrimination was found in the TARN trauma registry (AUCs between 0.83 and 0.87), and the lowest discrimination in the Pharmos trial (AUCs between 0.65 and 0.71). Although differences in predictor effects between development and validation populations were found (calibration slopes varying between 0.58 and 1.53), the differences in discrimination were largely explained by differences in case-mix in the validation studies. Calibration was good: the fraction of observed outcomes generally agreed well with the mean predicted outcome. No meaningful differences were noted in performance between the IMPACT and CRASH models. More complex models discriminated slightly better than simpler variants. Conclusions: Since both the IMPACT and the CRASH prognostic models show good generalizability to more recent data, they are valid instruments to quantify prognosis in TBI. PMID:22511138
Validation of bioelectrical impedance analysis for total body water assessment against the deuterium dilution technique in Asian children.

PubMed

Liu, A; Byrne, N M; Ma, G; Nasreddine, L; Trinidad, T P; Kijboonchoo, K; Ismail, M N; Kagawa, M; Poh, B K; Hills, A P

2011-12-01

To develop and cross-validate bioelectrical impedance analysis (BIA) prediction equations of total body water (TBW) and fat-free mass (FFM) for Asian pre-pubertal children from China, Lebanon, Malaysia, Philippines and Thailand. Height, weight, age, gender, resistance and reactance measured by BIA were collected from 948 Asian children (492 boys and 456 girls) aged 8-10 years from the five countries. The deuterium dilution technique was used as the criterion method for the estimation of TBW and FFM. The BIA equations were developed using stepwise multiple regression analysis and cross-validated using the Bland-Altman approach. The BIA prediction equation for the estimation of TBW was as follows: TBW = 0.231 × height²/resistance + 0.066 × height + 0.188 × weight + 0.128 × age + 0.500 × sex - 0.316 × ethnicity (Thai ethnicity = 1, others = 0) - 4.574 (R² = 88.0%, root mean square error (RMSE) = 1.3 kg), and for the estimation of FFM was as follows: FFM = 0.299 × height²/resistance + 0.086 × height + 0.245 × weight + 0.260 × age + 0.901 × sex - 0.415 × ethnicity (Thai ethnicity = 1, others = 0) - 6.952 (R² = 88.3%, RMSE = 1.7 kg). No significant difference between measured and predicted values for the whole cross-validation sample was found. However, the prediction equations tended to overestimate TBW/FFM at lower levels and to underestimate it at higher levels. Accuracy of the general equation for TBW and FFM was also valid at each body mass index category. Ethnicity influences the relationship between BIA and body composition in Asian pre-pubertal children. The newly developed BIA prediction equations are valid for use in Asian pre-pubertal children.
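The two prediction equations quoted in this abstract can be written as small functions. This is a sketch under stated assumptions: the abstract does not give units or the coding of the sex variable, so height in cm, weight in kg, resistance in ohms, age in years and sex coded 1 for boys and 0 for girls are assumptions here, and the function and parameter names are hypothetical.

```python
def predict_tbw(height_cm, resistance_ohm, weight_kg, age_yr, sex, thai):
    """Total body water (kg) from the TBW equation in the abstract.
    sex: assumed 1 = boy, 0 = girl; thai: 1 = Thai ethnicity, 0 = other."""
    return (0.231 * height_cm ** 2 / resistance_ohm + 0.066 * height_cm
            + 0.188 * weight_kg + 0.128 * age_yr + 0.500 * sex
            - 0.316 * thai - 4.574)


def predict_ffm(height_cm, resistance_ohm, weight_kg, age_yr, sex, thai):
    """Fat-free mass (kg) from the FFM equation in the abstract."""
    return (0.299 * height_cm ** 2 / resistance_ohm + 0.086 * height_cm
            + 0.245 * weight_kg + 0.260 * age_yr + 0.901 * sex
            - 0.415 * thai - 6.952)


# Hypothetical child, for illustration only: 130 cm, 700 ohm, 28 kg, 9 years.
print(round(predict_tbw(130, 700, 28, 9, sex=1, thai=0), 2))
print(round(predict_ffm(130, 700, 28, 9, sex=1, thai=0), 2))
```

The ethnicity term only shifts the intercept: setting `thai=1` lowers predicted TBW by 0.316 kg and predicted FFM by 0.415 kg, all else equal.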
Risk score to predict gastrointestinal bleeding after acute ischemic stroke.

PubMed

Ji, Ruijun; Shen, Haipeng; Pan, Yuesong; Wang, Penglian; Liu, Gaifen; Wang, Yilong; Li, Hao; Singhal, Aneesh B; Wang, Yongjun

2014-07-25

Gastrointestinal bleeding (GIB) is a common and often serious complication after stroke. Although several risk factors for post-stroke GIB have been identified, no reliable or validated scoring system is currently available to predict GIB after acute stroke in routine clinical practice or clinical trials. In the present study, we aimed to develop and validate a risk model (acute ischemic stroke associated gastrointestinal bleeding score, the AIS-GIB score) to predict in-hospital GIB after acute ischemic stroke. The AIS-GIB score was developed from data in the China National Stroke Registry (CNSR). Eligible patients in the CNSR were randomly divided into derivation (60%) and internal validation (40%) cohorts. External validation was performed using data from the prospective Chinese Intracranial Atherosclerosis Study (CICAS). Independent predictors of in-hospital GIB were obtained using multivariable logistic regression in the derivation cohort, and β-coefficients were used to generate a point scoring system for the AIS-GIB. The area under the receiver operating characteristic curve (AUROC) and the Hosmer-Lemeshow goodness-of-fit test were used to assess model discrimination and calibration, respectively. A total of 8,820, 5,882, and 2,938 patients were enrolled in the derivation, internal validation and external validation cohorts, respectively. The overall rate of in-hospital GIB after AIS was 2.6%, 2.3%, and 1.5% in the derivation, internal validation, and external validation cohorts, respectively. An 18-point AIS-GIB score was developed from the set of independent predictors of GIB including age, gender, history of hypertension, hepatic cirrhosis, peptic ulcer or previous GIB, pre-stroke dependence, admission National Institutes of Health stroke scale score, Glasgow Coma Scale score and stroke subtype (Oxfordshire). The AIS-GIB score showed good discrimination in the derivation (0.79; 95% CI, 0.764-0.825), internal (0.78; 95% CI, 0.74-0.82) and external (0.76; 95% CI, 0.71-0.82) validation cohorts. The AIS-GIB score was well calibrated in the derivation (P = 0.42), internal (P = 0.45) and external (P = 0.86) validation cohorts. The AIS-GIB score is a valid clinical grading scale to predict in-hospital GIB after AIS. Further studies on the effect of the AIS-GIB score on reducing GIB and improving outcome after AIS are warranted.
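The calibration check named in this abstract, the Hosmer-Lemeshow goodness-of-fit test, compares observed and expected event counts within groups of patients ordered by predicted risk. The sketch below computes only the test statistic; the function name, the grouping rule (equal-sized groups after sorting) and the assumption that predicted risks lie strictly between 0 and 1 are choices made here, not details reported by the study. The P value would come from a chi-square distribution, conventionally with (groups - 2) degrees of freedom, and a large P value is read as no evidence of miscalibration.

```python
def hosmer_lemeshow_statistic(y_true, y_prob, groups=10):
    """Sum over risk-ordered groups of (observed - expected)^2 / expected,
    counted separately for events and non-events."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        exp_events = sum(p for p, _ in chunk)
        obs_events = sum(y for _, y in chunk)
        exp_non = len(chunk) - exp_events
        obs_non = len(chunk) - obs_events
        stat += (obs_events - exp_events) ** 2 / exp_events
        stat += (obs_non - exp_non) ** 2 / exp_non
    return stat


# Hypothetical data: predicted risks of 0.2 and 0.8, ten patients each,
# with 2 and 8 observed events, so observed and expected counts agree.
risks = [0.2] * 10 + [0.8] * 10
events = [0] * 8 + [1] * 2 + [0] * 2 + [1] * 8
print(hosmer_lemeshow_statistic(events, risks, groups=2))
```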
A Study of the Predictive Validity of the Children's Depression Inventory for Major Depression Disorder in Puerto Rican Adolescents

ERIC Educational Resources Information Center

Rivera-Medina, Carmen L.; Bernal, Guillermo; Rossello, Jeannette; Cumba-Aviles, Eduardo

2010-01-01

This study aims to evaluate the predictive validity of the Children's Depression Inventory items for major depression disorder (MDD) in an outpatient clinic sample of Puerto Rican adolescents. The sample consisted of 130 adolescents, 13 to 18 years old. The five most frequent symptoms of the Children's Depression Inventory that best predict the…
The Predictive Validity of Interim Assessment Scores Based on the Full-Information Bifactor Model for the Prediction of End-of-Grade Test Performance

ERIC Educational Resources Information Center

Immekus, Jason C.; Atitya, Ben

2016-01-01

Interim tests are a central component of district-wide assessment systems, yet their technical quality to guide decisions (e.g., instructional) has been repeatedly questioned. In response, the study purpose was to investigate the validity of a series of English Language Arts (ELA) interim assessments in terms of dimensionality and prediction of…
Future Performance Trend Indicators: A Current Value Approach to Human Resources Accounting. Report III. Multivariate Predictions of Organizational Performance Across Time.

ERIC Educational Resources Information Center

Pecorella, Patricia A.; Bowers, David G.

Multiple regression in a double cross-validated design was used to predict two performance measures (total variable expense and absence rate) by multi-month period in five industrial firms. The regressions do cross-validate, and produce multiple coefficients which display both concurrent and predictive effects, peaking 18 months to two years…
Validation of BEHAVE fire behavior predictions in oak savannas using five fuel models

Treesearch

Keith Grabner; John Dwyer; Bruce Cutter

1997-01-01

Prescribed fire is a valuable tool in the restoration and management of oak savannas. BEHAVE, a fire behavior prediction system developed by the United States Forest Service, can be a useful tool when managing oak savannas with prescribed fire. BEHAVE predictions of fire rate-of-spread and flame length were validated using four standardized fuel models: Fuel Model 1 (...
Development and evaluation of an automated fall risk assessment system.

PubMed

Lee, Ju Young; Jin, Yinji; Piao, Jinshi; Lee, Sun-Mi

2016-04-01

Fall risk assessment is the first step toward prevention, and a risk assessment tool with high validity should be used. This study aimed to develop and validate an automated fall risk assessment system (Auto-FallRAS) to assess fall risks based on electronic medical records (EMRs) without additional data collected or entered by nurses. This study was conducted in a 1335-bed university hospital in Seoul, South Korea. The Auto-FallRAS was developed using 4211 fall-related clinical data extracted from EMRs. Participants included fall patients and non-fall patients (868 and 3472 for the development study; 752 and 3008 for the validation study; and 58 and 232 for validation after clinical application, respectively). The system was evaluated for predictive validity and concurrent validity. The final 10 predictors were included in the logistic regression model for the risk-scoring algorithm. The results of the Auto-FallRAS were shown as high/moderate/low risk on the EMR screen. The predictive validity analyzed after clinical application of the Auto-FallRAS was as follows: sensitivity = 0.95, NPV = 0.97 and Youden index = 0.44. The validity of the Morse Fall Scale assessed by nurses was as follows: sensitivity = 0.68, NPV = 0.88 and Youden index = 0.28. This study found that the Auto-FallRAS results were better than were the nurses' predictions. The advantage of the Auto-FallRAS is that it automatically analyzes information and shows patients' fall risk assessment results without requiring additional time from nurses. © The Author 2016. Published by Oxford University Press in association with the International Society for Quality in Health Care; all rights reserved.
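The accuracy measures reported in this abstract all derive from a 2×2 table of predicted versus actual falls. Because the Youden index is sensitivity + specificity - 1, the reported figures imply specificities of roughly 0.49 for the Auto-FallRAS and 0.60 for the Morse Fall Scale. The sketch below shows the calculations; the function name and the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table of counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "youden": sensitivity + specificity - 1,
    }


# Hypothetical counts, for illustration only.
print(diagnostic_metrics(tp=45, fp=60, fn=5, tn=90))
```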
Initial Validation of a Comprehensive Assessment Instrument for Bereavement-Related Grief Symptoms and Risk of Complications: The Indicator of Bereavement Adaptation—Cruse Scotland (IBACS)

PubMed Central

Schut, Henk; Stroebe, Margaret S.; Wilson, Stewart; Birrell, John

2016-01-01

Objective This study assessed the validity of the Indicator of Bereavement Adaptation Cruse Scotland (IBACS). Designed for use in clinical and non-clinical settings, the IBACS measures severity of grief symptoms and risk of developing complications. Method N = 196 (44 male, 152 female) help-seeking, bereaved Scottish adults participated at two timepoints: T1 (baseline) and T2 (after 18 months). Four validated assessment instruments were administered: CORE-R, ICG-R, IES-R, SCL-90-R. Discriminative ability was assessed using ROC curve analysis. Concurrent validity was tested through correlation analysis at T1. Predictive validity was assessed using correlation analyses and ROC curve analysis. Optimal IBACS cutoff values were obtained by calculating a maximal Youden index J in ROC curve analysis. Clinical implications were compared across instruments. Results ROC curve analysis results (AUC = .84, p < .01, 95% CI between .77 and .90) indicated the IBACS is a good diagnostic instrument for assessing complicated grief. Positive correlations (p < .01, 2-tailed) with all four instruments at T1 demonstrated the IBACS' concurrent validity, strongest with complicated grief measures (r = .82). Predictive validity was shown to be fair in T2 ROC curve analysis results (n = 67, AUC = .78, 95% CI between .65 and .92; p < .01). Predictive validity was also supported by stable positive correlations between IBACS and other instruments at T2. Clinical indications were found not to differ across instruments. Conclusions The IBACS offers effective grief symptom and risk assessment for use by non-clinicians. Indications are sufficient to support intake assessment for a stepped model of bereavement intervention. PMID:27741246
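Choosing a cutoff by maximizing the Youden index J, as described in this abstract, amounts to scanning candidate thresholds and keeping the one with the largest sensitivity + specificity - 1. The sketch below is a minimal illustration; the function name, the convention that a score at or above the cutoff counts as positive, and the toy data are assumptions, not details of the IBACS analysis.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1,
    where a score >= cutoff is classified as positive."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = (None, -1.0)
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = tp / n_pos + tn / n_neg - 1
        if j > best[1]:  # ties keep the lowest cutoff
            best = (cutoff, j)
    return best


# Hypothetical instrument scores and case labels (1 = case).
scores = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 0, 1, 1, 0, 1, 1]
print(best_cutoff(scores, labels))
```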
CheS-Mapper 2.0 for visual validation of (Q)SAR models

PubMed Central

2014-01-01

Background Sound statistical validation is important to evaluate and compare the overall performance of (Q)SAR models. However, classical validation does not support the user in better understanding the properties of the model or the underlying data. Even though a number of visualization tools for analyzing (Q)SAR information in small molecule datasets exist, integrated visualization methods that allow the investigation of model validation results are still lacking. Results We propose visual validation as an approach for the graphical inspection of (Q)SAR model validation results. The approach applies the 3D viewer CheS-Mapper, an open-source application for the exploration of small molecules in virtual 3D space. The present work describes the new functionalities in CheS-Mapper 2.0 that facilitate the analysis of (Q)SAR information and allow the visual validation of (Q)SAR models. The tool enables the comparison of model predictions to the actual activity in feature space. The approach is generic: it is model-independent and can handle physico-chemical and structural input features as well as quantitative and qualitative endpoints. Conclusions Visual validation with CheS-Mapper enables analyzing (Q)SAR information in the data and indicates how this information is employed by the (Q)SAR model. It reveals whether the endpoint is modeled too specifically or too generically and highlights common properties of misclassified compounds. Moreover, the researcher can use CheS-Mapper to inspect how the (Q)SAR model predicts activity cliffs. The CheS-Mapper software is freely available at http://ches-mapper.org. Graphical abstract Comparing actual and predicted activity values with CheS-Mapper.
Developing Enhanced Blood–Brain Barrier Permeability Models: Integrating External Bio-Assay Data in QSAR Modeling

PubMed Central

Wang, Wenyi; Kim, Marlene T.; Sedykh, Alexander

2015-01-01

Purpose Experimental Blood–Brain Barrier (BBB) permeability models for drug molecules are expensive and time-consuming. As alternative methods, several traditional Quantitative Structure-Activity Relationship (QSAR) models have been developed previously. In this study, we aimed to improve the predictivity of traditional QSAR BBB permeability models by employing relevant public bio-assay data in the modeling process. Methods We compiled a BBB permeability database consisting of 439 unique compounds from various resources. The database was split into a modeling set of 341 compounds and a validation set of 98 compounds. A consensus QSAR modeling workflow was employed on the modeling set to develop various QSAR models. A five-fold cross-validation approach was used to validate the developed models, and the resulting models were used to predict the external validation set compounds. Furthermore, we used previously published membrane transporter models to generate relevant transporter profiles for target compounds. The transporter profiles were used as additional biological descriptors to develop hybrid QSAR BBB models. Results The consensus QSAR models have R² = 0.638 for five-fold cross-validation and R² = 0.504 for external validation. The consensus model developed by pooling chemical and transporter descriptors showed better predictivity (R² = 0.646 for five-fold cross-validation and R² = 0.526 for external validation). Moreover, several external bio-assays that correlate with BBB permeability were identified using our automatic profiling tool. Conclusions The BBB permeability models developed in this study can be useful for early evaluation of new compounds (e.g., new drug candidates). The combination of chemical and biological descriptors shows a promising direction to improve the current traditional QSAR models. PMID:25862462
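Five-fold cross-validation, as used in this abstract, repeatedly holds out one fifth of the modeling set, fits on the rest, and scores the pooled held-out predictions. The sketch below illustrates the idea with a one-descriptor least-squares line standing in for a QSAR model. The function names, the deterministic fold assignment by index, and the definition R² = 1 - SS_res/SS_tot are assumptions; the abstract does not say how its R² values were computed, and real workflows usually shuffle before splitting.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single descriptor."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cross_validated_r2(xs, ys, k=5):
    """R^2 of pooled held-out predictions over k folds."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in test:
            preds[i] = slope * xs[i] + intercept
    mean_y = sum(ys) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical descriptor values and noisy responses, for illustration only.
xs = list(range(20))
ys = [2 * x + 1 + (1 if x % 2 else -1) for x in xs]
print(cross_validated_r2(xs, ys))
```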
Validation of a new measure of availability and accommodation of health care that is valid for rural and urban contexts.

PubMed

Haggerty, Jeannie L; Levesque, Jean-Frédéric

2017-04-01

Patients are the most valid source for evaluating the accessibility of services, but a previous study observed differential psychometric performance of instruments in rural and urban respondents. To validate a measure of organizational accessibility free of differential rural-urban performance that predicts consequences of difficult access for patient-initiated care. Sequential qualitative-quantitative study. Qualitative findings used to adapt or develop evaluative and reporting items. Quantitative validation study. Primary data by telephone from 750 urban, rural and remote respondents in Quebec, Canada; follow-up mailed questionnaire to a subset of 316. Items were developed for barriers along the care trajectory. We used common factor and confirmatory factor analysis to identify constructs and compare models. We used item response theory analysis to test for differential rural-urban performance; examine individual item performance; adjust response options; and exclude redundant or non-discriminatory items. We used logistic regression to examine predictive validity of the subscale on access difficulty (outcome). Initial factor resolution suggested geographic and organizational dimensions, plus consequences of access difficulty. After second administration, organizational accommodation and geographic indicators were integrated into a 6-item subscale of Effective Availability and Accommodation, which demonstrates good variability and internal consistency (α = 0.84) and no differential functioning by geographic area. Each unit increase predicts decreased likelihood of consequences of access difficulties (unmet need and problem aggravation). The new subscale is a practical, valid and reliable measure for patients to evaluate first-contact health services accessibility, yielding valid comparisons between urban and rural contexts. © 2016 The Authors. Health Expectations published by John Wiley & Sons Ltd.
Effectiveness of genomic prediction of maize hybrid performance in different breeding populations and environments.

PubMed

Windhausen, Vanessa S; Atlin, Gary N; Hickey, John M; Crossa, Jose; Jannink, Jean-Luc; Sorrells, Mark E; Raman, Babu; Cairns, Jill E; Tarekegne, Amsal; Semagn, Kassa; Beyene, Yoseph; Grudloyma, Pichet; Technow, Frank; Riedelsheimer, Christian; Melchinger, Albrecht E

2012-11-01

Genomic prediction is expected to considerably increase genetic gains by increasing selection intensity and accelerating the breeding cycle. In this study, marker effects estimated in 255 diverse maize (Zea mays L.) hybrids were used to predict grain yield, anthesis date, and anthesis-silking interval within the diversity panel and testcross progenies of 30 F(2)-derived lines from each of five populations. Although up to 25% of the genetic variance could be explained by cross validation within the diversity panel, the prediction of testcross performance of F(2)-derived lines using marker effects estimated in the diversity panel was on average zero. Hybrids in the diversity panel could be grouped into eight breeding populations differing in mean performance. When performance was predicted separately for each breeding population on the basis of marker effects estimated in the other populations, predictive ability was low (i.e., 0.12 for grain yield). These results suggest that prediction resulted mostly from differences in mean performance of the breeding populations and less from the relationship between the training and validation sets or linkage disequilibrium with causal variants underlying the predicted traits. Potential uses for genomic prediction in maize hybrid breeding are discussed emphasizing the need of (1) a clear definition of the breeding scenario in which genomic prediction should be applied (i.e., prediction among or within populations), (2) a detailed analysis of the population structure before performing cross validation, and (3) larger training sets with strong genetic relationship to the validation set.
INCLEN Diagnostic Tool for Autism Spectrum Disorder (INDT-ASD): development and validation.

PubMed

Juneja, Monica; Mishra, Devendra; Russell, Paul S S; Gulati, Sheffali; Deshmukh, Vaishali; Tudu, Poma; Sagar, Rajesh; Silberberg, Donald; Bhutani, Vinod K; Pinto, Jennifer M; Durkin, Maureen; Pandey, Ravindra M; Nair, M K C; Arora, Narendra K

2014-05-01

To develop and validate INCLEN Diagnostic Tool for Autism Spectrum Disorder (INDT-ASD). Diagnostic test evaluation by cross sectional design. Four tertiary pediatric neurology centers in Delhi and Thiruvananthapuram, India. Children aged 2-9 years were enrolled in the study. INDT-ASD and Childhood Autism Rating Scale (CARS) were administered in a randomly decided sequence by a trained psychologist, followed by an expert evaluation by DSM-IV TR diagnostic criteria (gold standard). Psychometric parameters of diagnostic accuracy, validity (construct, criterion and convergent) and internal consistency. 154 children (110 boys, mean age 64.2 mo) were enrolled. The overall diagnostic accuracy (AUC=0.97, 95% CI 0.93, 0.99; P<0.001) and validity (sensitivity 98%, specificity 95%, positive predictive value 91%, negative predictive value 99%) of INDT-ASD for autism spectrum disorder were high, taking expert diagnosis using DSM-IV-TR as gold standard. The concordance rate between the INDT-ASD and expert diagnosis for 'ASD group' was 82.52% [Cohen's κ=0.89; 95% CI (0.82, 0.97); P=0.001]. The internal consistency of INDT-ASD was 0.96. The convergent validity with CARS (r = 0.73, P= 0.001) and divergent validity with Binet-Kamat Test of intelligence (r = -0.37; P=0.004) were significantly high. INDT-ASD has a 4-factor structure explaining 85.3% of the variance. INDT-ASD has high diagnostic accuracy, adequate content validity, good internal consistency, high criterion validity, high to moderate convergent validity and 4-factor construct validity for diagnosis of autism spectrum disorder.
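Cohen's κ, reported in this abstract for agreement between the tool and expert diagnosis, corrects raw agreement for the agreement expected by chance. A minimal sketch for a 2×2 table follows; the function name and the counts are hypothetical and are not the study's data.

```python
def cohen_kappa(both_pos, tool_only, expert_only, both_neg):
    """Cohen's kappa for two raters making a yes/no call on the same cases."""
    n = both_pos + tool_only + expert_only + both_neg
    observed = (both_pos + both_neg) / n
    # Chance agreement from each rater's marginal rate of positive calls.
    tool_pos = (both_pos + tool_only) / n
    expert_pos = (both_pos + expert_only) / n
    expected = tool_pos * expert_pos + (1 - tool_pos) * (1 - expert_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical counts: raw agreement 0.70, chance agreement 0.50.
print(cohen_kappa(both_pos=20, tool_only=5, expert_only=10, both_neg=15))
```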
Validation of elk resource selection models with spatially independent data

Treesearch

Priscilla K. Coe; Bruce K. Johnson; Michael J. Wisdom; John G. Cook; Marty Vavra; Ryan M. Nielson

2011-01-01

Knowledge of how landscape features affect wildlife resource use is essential for informed management. Resource selection functions often are used to make and validate predictions about landscape use; however, resource selection functions are rarely validated with data from landscapes independent of those from which the models were built. This problem has severely...
The Utility and Comparative Incremental Validity of the MMPI-2 and Trauma Symptom Inventory Validity Scales in the Detection of Feigned PTSD

ERIC Educational Resources Information Center

Efendov, Adele A.; Sellbom, Martin; Bagby, R. Michael

2008-01-01

The authors examined the comparative predictive capacity of the Trauma Symptom Inventory (TSI) Atypical Response Scale (ATR) and the standard set of Minnesota Multiphasic Personality Inventory-2 (MMPI-2) fake-bad validity scales (i.e., F, F[subscript B[prime
Factor Structure and Validation of a Set of Readiness Measures.

ERIC Educational Resources Information Center

Kaufman, Maurice; Lynch, Mervin

A study was undertaken to identify the factor structure of a battery of readiness measures and to demonstrate the concurrent and predictive validity of one instrument in that battery--the Pre-Reading Screening Procedures (PSP). Concurrent validity was determined by examining the correlation of the PSP with the Metropolitan Readiness Test (MRT),…
The Adaptation and Validation of the Emotion Matching Task for Preschool Children in Spain

ERIC Educational Resources Information Center

Alonso-Alberca, Natalia; Vergara, Ana I.; Fernandez-Berrocal, Pablo; Johnson, Stacy R.; Izard, Carroll E.

2012-01-01

The Emotion Matching Task (EMT; Izard, Haskins, Schultz, Trentacosta, & King, 2003) was developed to assess emotion knowledge in preschoolers and was demonstrated to show adequate convergent and predictive validity in an American sample (Morgan, Izard, & King, 2010). In light of the need for valid measures for assessing emotion…

Double Cross-Validation in Multiple Regression: A Method of Estimating the Stability of Results.

ERIC Educational Resources Information Center

Rowell, R. Kevin

In multiple regression analysis, where resulting predictive equation effectiveness is subject to shrinkage, it is especially important to evaluate result replicability. Double cross-validation is an empirical method by which an estimate of invariance or stability can be obtained from research data. A procedure for double cross-validation is…
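The abstract is truncated before it describes the procedure, so the sketch below follows the common textbook form of double cross-validation rather than this author's exact steps: split the sample in two, fit a regression in each half, apply each equation to the other half, and correlate predicted with observed scores. All function names, the even/odd split and the toy data are assumptions.

```python
def ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations
    solved by Gauss-Jordan elimination."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(p):
            if r != c:
                A[r] = [vr - A[r][c] * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][p] for i in range(p)]


def predict(beta, X):
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in X]


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / (sum((x - ma) ** 2 for x in a)
                  * sum((y - mb) ** 2 for y in b)) ** 0.5


def double_cross_validate(X, y):
    """Return the two cross-validity correlations (A applied to B, B to A)."""
    Xa, ya, Xb, yb = X[0::2], y[0::2], X[1::2], y[1::2]
    r_a_on_b = pearson(predict(ols(Xa, ya), Xb), yb)
    r_b_on_a = pearson(predict(ols(Xb, yb), Xa), ya)
    return r_a_on_b, r_b_on_a


# Hypothetical noise-free data: y = 1 + 2*x1 - x2.
X = [(i, (i * i) % 7) for i in range(12)]
y = [1 + 2 * a - b for a, b in X]
print(double_cross_validate(X, y))
```

Cross-validity correlations that fall well below the within-sample multiple R in either direction indicate shrinkage, that is, an equation that does not replicate.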
Analyzing the Validity of the Adult-Adolescent Parenting Inventory for Low-Income Populations

ERIC Educational Resources Information Center

Lawson, Michael A.; Alameda-Lawson, Tania; Byrnes, Edward

2017-01-01

Objectives: The purpose of this study was to examine the construct and predictive validity of the Adult-Adolescent Parenting Inventory (AAPI-2). Methods: The validity of the AAPI-2 was evaluated using multiple statistical methods, including exploratory factor analysis, confirmatory factor analysis, and latent class analysis. These analyses were…
Validation of asthma recording in electronic health records: a systematic review

PubMed Central

Nissen, Francis; Quint, Jennifer K; Wilkinson, Samantha; Mullerova, Hana; Smeeth, Liam; Douglas, Ian J

2017-01-01

Objective To describe the methods used to validate asthma diagnoses in electronic health records and summarize the results of the validation studies. Background Electronic health records are increasingly being used for research on asthma to inform health services and health policy. Validation of the recording of asthma diagnoses in electronic health records is essential to use these databases for credible epidemiological asthma research. Methods We searched EMBASE and MEDLINE databases for studies that validated asthma diagnoses detected in electronic health records up to October 2016. Two reviewers independently assessed the full text against the predetermined inclusion criteria. Key data including author, year, data source, case definitions, reference standard, and validation statistics (including sensitivity, specificity, positive predictive value [PPV], and negative predictive value [NPV]) were summarized in two tables. Results Thirteen studies met the inclusion criteria. Most studies demonstrated a high validity using at least one case definition (PPV >80%). Ten studies used a manual validation as the reference standard; each had at least one case definition with a PPV of at least 63%, up to 100%. We also found two studies using a second independent database to validate asthma diagnoses. The PPVs of the best performing case definitions ranged from 46% to 58%. We found one study which used a questionnaire as the reference standard to validate a database case definition; the PPV of the case definition algorithm in this study was 89%. Conclusion Attaining high PPVs (>80%) is possible using each of the discussed validation methods. Identifying asthma cases in electronic health records is possible with high sensitivity, specificity or PPV, by combining multiple data sources, or by focusing on specific test measures. 
Studies testing a range of case definitions show wide variation in the validity of each definition, suggesting this may be important for obtaining asthma definitions with optimal validity. PMID:29238227
On the incremental validity of irrational beliefs to predict subjective well-being while controlling for personality factors.

PubMed

Spörrle, Matthias; Strobel, Maria; Tumasjan, Andranik

2010-11-01

This research examines the incremental validity of irrational thinking as conceptualized by Albert Ellis to predict diverse aspects of subjective well-being while controlling for the influence of personality factors. Rational-emotive behavior therapy (REBT) argues that irrational beliefs result in maladaptive emotions leading to reduced well-being. Although there is some early scientific evidence for this relation, it has never been investigated whether this connection would still persist when statistically controlling for the Big Five personality factors, which were consistently found to be important determinants of well-being. Regression analyses revealed significant incremental validity of irrationality over personality factors when predicting life satisfaction, but not when predicting subjective happiness. Results are discussed with respect to conceptual differences between these two aspects of subjective well-being.
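Incremental validity of the kind tested in this abstract is usually quantified as the gain in explained variance (ΔR²) when the new predictor is added to a regression that already contains the control variables. The sketch below is a simplified illustration with a single control variable, for which R² of the two-predictor model has a closed form in the pairwise correlations; the study itself controlled for five personality factors. The function names and data are hypothetical.

```python
def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / (sum((x - ma) ** 2 for x in a)
                  * sum((y - mb) ** 2 for y in b)) ** 0.5


def delta_r2(outcome, control, new_predictor):
    """Gain in R^2 from adding new_predictor to a model with control only."""
    r_yc = pearson(outcome, control)
    r_yn = pearson(outcome, new_predictor)
    r_cn = pearson(control, new_predictor)
    r2_full = (r_yc ** 2 + r_yn ** 2 - 2 * r_yc * r_yn * r_cn) / (1 - r_cn ** 2)
    return r2_full - r_yc ** 2


# Hypothetical data: two uncorrelated predictors that jointly determine y.
control = [-1, -1, 1, 1]
new = [-1, 1, -1, 1]
outcome = [c + n for c, n in zip(control, new)]
print(delta_r2(outcome, control, new))
```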
Development and validation of a tool to evaluate the quality of medical education websites in pathology.

PubMed

Alyusuf, Raja H; Prasad, Kameshwar; Abdel Satir, Ali M; Abalkhail, Ali A; Arora, Roopa K

2013-01-01

The exponential growth in the use of the internet as a learning resource, coupled with the varied quality of many websites, has led to a need to identify suitable websites for teaching purposes. The aim of this study is to develop and validate a tool that evaluates the quality of undergraduate medical educational websites, and to apply it to the field of pathology. A tool was devised through several steps of item generation, reduction, weightage, pilot testing, post-pilot modification of the tool and validating the tool. Tool validation included measurement of inter-observer reliability and generation of criterion related, construct related and content related validity. The validated tool was subsequently tested by applying it to a population of pathology websites. Reliability testing showed a high internal consistency reliability (Cronbach's alpha = 0.92), high inter-observer reliability (Pearson's correlation r = 0.88), intraclass correlation coefficient = 0.85 and κ = 0.75. It showed high criterion related, construct related and content related validity. The tool showed moderately high concordance with the gold standard (κ = 0.61); 92.2% sensitivity, 67.8% specificity, 75.6% positive predictive value and 88.9% negative predictive value. The validated tool was applied to 278 websites; 29.9% were rated as recommended, 41.0% as recommended with caution and 29.1% as not recommended. A systematic tool was devised to evaluate the quality of websites for medical educational purposes. The tool was shown to yield reliable and valid inferences through its application to pathology websites.
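Cronbach's alpha, the internal-consistency statistic reported in this abstract, compares the sum of the individual item variances with the variance of the total score. A minimal sketch follows; the function name and the item scores are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, aligned by respondent (or website).
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical ratings of five websites on three items.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
print(round(cronbach_alpha(items), 3))
```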
Validation workflow for a clinical Bayesian network model in multidisciplinary decision making in head and neck oncology treatment.

PubMed

Cypko, Mario A; Stoehr, Matthaeus; Kozniewski, Marcin; Druzdzel, Marek J; Dietz, Andreas; Berliner, Leonard; Lemke, Heinz U

2017-11-01

Oncological treatment is becoming increasingly complex, and therefore, decision making in multidisciplinary teams is becoming the key activity in the clinical pathways. The increased complexity is related to the number and variability of possible treatment decisions that may be relevant to a patient. In this paper, we describe validation of a multidisciplinary cancer treatment decision in the clinical domain of head and neck oncology. Probabilistic graphical models and corresponding inference algorithms, in the form of Bayesian networks, can support complex decision-making processes by providing mathematically reproducible and transparent advice. The quality of BN-based advice depends on the quality of the model. Therefore, it is vital to validate the model before it is applied in practice. For an example BN subnetwork of laryngeal cancer with 303 variables, we evaluated 66 patient records. To validate the model on this dataset, a validation workflow was applied in combination with quantitative and qualitative analyses. In the subsequent analyses, we observed four sources of imprecise predictions: incorrect data, incomplete patient data, outvoting relevant observations, and incorrect model. Finally, the four problems were solved by modifying the data and the model. The presented validation effort is related to the model complexity. For simpler models, the validation workflow is the same, although it may require fewer validation methods. The validation success is related to the model's well-founded knowledge base. The remaining laryngeal cancer model may disclose additional sources of imprecise predictions.
Experimental validation of finite element and boundary element methods for predicting structural vibration and radiated noise

NASA Technical Reports Server (NTRS)

Seybert, A. F.; Wu, T. W.; Wu, X. F.

1994-01-01

This research report is presented in three parts. In the first part, acoustical analyses were performed on modes of vibration of the housing of a transmission of a gear test rig developed by NASA. The modes of vibration of the transmission housing were measured using experimental modal analysis. The boundary element method (BEM) was used to calculate the sound pressure and sound intensity on the surface of the housing and the radiation efficiency of each mode. The radiation efficiency of each of the transmission housing modes was then compared to theoretical results for a finite baffled plate. In the second part, analytical and experimental validation of methods to predict structural vibration and radiated noise are presented. A rectangular box excited by a mechanical shaker was used as a vibrating structure. Combined finite element method (FEM) and boundary element method (BEM) models of the apparatus were used to predict the noise level radiated from the box. The FEM was used to predict the vibration, while the BEM was used to predict the sound intensity and total radiated sound power using surface vibration as the input data. Vibration predicted by the FEM model was validated by experimental modal analysis; noise predicted by the BEM was validated by measurements of sound intensity. Three types of results are presented for the total radiated sound power: sound power predicted by the BEM model using vibration data measured on the surface of the box; sound power predicted by the FEM/BEM model; and sound power measured by an acoustic intensity scan. In the third part, the structure used in part two was modified. A rib was attached to the top plate of the structure. The FEM and BEM were then used to predict structural vibration and radiated noise respectively. The predicted vibration and radiated noise were then validated through experimentation.
A microRNA-based prediction model for lymph node metastasis in hepatocellular carcinoma.

PubMed

Zhang, Li; Xiang, Zuo-Lin; Zeng, Zhao-Chong; Fan, Jia; Tang, Zhao-You; Zhao, Xiao-Mei

2016-01-19

We developed an efficient microRNA (miRNA) model that could predict the risk of lymph node metastasis (LNM) in hepatocellular carcinoma (HCC). We first evaluated a training cohort of 192 HCC patients after hepatectomy and found five LNM-associated predictive factors: vascular invasion, Barcelona Clinic Liver Cancer stage, miR-145, miR-31, and miR-92a. The five statistically independent factors were used to develop a predictive model. The predictive value of the miRNA-based model was confirmed in a validation cohort of 209 consecutive HCC patients. The prediction model scored LNM risk from 0 to 8, and a cutoff value of 4 was used to distinguish high-risk and low-risk groups. The 5-year sensitivity and specificity of the model in the validation cohort were 69.6% and 80.2%, respectively, and the area under the curve (AUC) for the miRNA-based prognostic model was 0.860. The 5-year positive and negative predictive values of the model in the validation cohort were 30.3% and 95.5%, respectively. Cox regression analysis revealed that the LNM hazard ratio of the high-risk versus low-risk groups was 11.751 (95% CI, 5.110-27.021; P < 0.001) in the validation cohort. In conclusion, the miRNA-based model is reliable and accurate for the early prediction of LNM in patients with HCC.
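The sensitivity, specificity, and predictive values quoted in this abstract all derive from a 2x2 table of predicted risk group against observed outcome. A minimal Python sketch of those definitions follows. The function names and all numbers are hypothetical and are not the study's data, and the study's 5-year figures may come from time-dependent estimators that this sketch does not reproduce.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all with the outcome
        "specificity": tn / (tn + fp),  # true negatives among all without it
        "ppv": tp / (tp + fp),          # outcome present among predicted high-risk
        "npv": tn / (tn + fn),          # outcome absent among predicted low-risk
    }


def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV via Bayes' rule for a given outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)
```

With illustrative inputs, predictive_values(0.7, 0.8, 0.1) gives a PPV of about 0.28 and an NPV of about 0.96, which shows how a rare outcome can pair a modest PPV with a high NPV.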
Role of learning potential in cognitive remediation: Construct and predictive validity.

PubMed

Davidson, Charlie A; Johannesen, Jason K; Fiszdon, Joanna M

2016-03-01

The construct, convergent, discriminant, and predictive validity of Learning Potential (LP) was evaluated in a trial of cognitive remediation for adults with schizophrenia-spectrum disorders. LP utilizes a dynamic assessment approach to prospectively estimate an individual's learning capacity if provided the opportunity for specific related learning. LP was assessed in 75 participants at study entry, of whom 41 completed an eight-week cognitive remediation (CR) intervention, and 22 received treatment-as-usual (TAU). LP was assessed in a "test-train-test" verbal learning paradigm. Incremental predictive validity was assessed as the degree to which LP predicted memory skill acquisition above and beyond prediction by static verbal learning ability. Examination of construct validity confirmed that LP scores reflected use of trained semantic clustering strategy. LP scores correlated with executive functioning and education history, but not other demographics or symptom severity. Following the eight-week active phase, TAU evidenced little substantial change in skill acquisition outcomes, which related to static baseline verbal learning ability but not LP. For the CR group, LP significantly predicted skill acquisition in domains of verbal and visuospatial memory, but not auditory working memory. Furthermore, LP predicted skill acquisition incrementally beyond relevant background characteristics, symptoms, and neurocognitive abilities. Results suggest that LP assessment can significantly improve prediction of specific skill acquisition with cognitive training, particularly for the domain assessed, and thereby may prove useful in individualization of treatment. Published by Elsevier B.V.
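Incremental predictive validity of this kind is commonly quantified as the gain in explained variance (delta R-squared) when the new predictor is added to a baseline model. The sketch below assumes a single baseline predictor and uses the standard two-predictor R-squared identity built from the three pairwise Pearson correlations. The names and toy data are hypothetical, and this is not the analysis the authors ran.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def incremental_r2(outcome, baseline, added):
    """R^2 gained by adding `added` to a model that already contains `baseline`."""
    r_y1 = pearson(outcome, baseline)
    r_y2 = pearson(outcome, added)
    r_12 = pearson(baseline, added)
    if abs(r_12) >= 1.0:
        raise ValueError("predictors are perfectly collinear")
    r2_full = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return r2_full - r_y1**2


# Toy data: two uncorrelated predictors that each explain half the variance.
static_ability = [1, -1, 1, -1]
learning_potential = [1, 1, -1, -1]
skill_gain = [2, 0, 0, -2]
delta = incremental_r2(skill_gain, static_ability, learning_potential)
```

In the toy data the added predictor contributes half of the outcome variance beyond the baseline, so delta is 0.5 up to floating-point rounding.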
The Smoking Consequences Questionnaire: Factor structure and predictive validity among Spanish-speaking Latino smokers in the United States.

PubMed

Vidrine, Jennifer Irvin; Vidrine, Damon J; Costello, Tracy J; Mazas, Carlos; Cofta-Woerpel, Ludmila; Mejia, Luz Maria; Wetter, David W

2009-11-01

Much of the existing research on smoking outcome expectancies has been guided by the Smoking Consequences Questionnaire (SCQ). Although the original version of the SCQ has been modified over time for use in different populations, none of the existing versions have been evaluated for use among Spanish-speaking Latino smokers in the United States. The present study evaluated the factor structure and predictive validity of the 3 previously validated versions of the SCQ--the original, the SCQ-Adult, and the SCQ-Spanish, which was developed with Spanish-speaking smokers in Spain--among Spanish-speaking Latino smokers in Texas. The SCQ-Spanish represented the least complex solution. Each of the SCQ-Spanish scales had good internal consistency, and the predictive validity of the SCQ-Spanish was partially supported. Nearly all the SCQ-Spanish scales predicted withdrawal severity even after controlling for demographics and dependence. Boredom Reduction predicted smoking relapse across the 5- and 12-week follow-up assessments in a multivariate model that also controlled for demographics and dependence. Our results support use of the SCQ-Spanish with Spanish-speaking Latino smokers in the United States.
The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models

EPA Science Inventory

The second phase of the MicroArray Quality Control (MAQC-II) project evaluated common practices for developing and validating microarray-based models aimed at predicting toxicological and clinical endpoints. Thirty-six teams developed classifiers for 13 endpoints - some easy, som...
Predictive Validity of DSM-IV and ICD-10 Criteria for ADHD and Hyperkinetic Disorder

ERIC Educational Resources Information Center

Lee, Soyoung I.; Schachar, Russell J.; Chen, Shirley X.; Ornstein, Tisha J.; Charach, Alice; Barr, Cathy; Ickowicz, Abel

2008-01-01

Background: The goal of this study was to compare the predictive validity of the two main diagnostic schemata for childhood hyperactivity--attention-deficit hyperactivity disorder (ADHD; "Diagnostic and Statistical Manual"-IV) and hyperkinetic disorder (HKD; "International Classification of Diseases"-10th Edition). Methods: Diagnostic criteria for…
Predictive Validity of Early Literacy Measures for Korean English Language Learners in the United States

ERIC Educational Resources Information Center

Han, Jeanie Nam; Vanderwood, Michael L.; Lee, Catherine Y.

2015-01-01

This study examined the predictive validity of early literacy measures with first-grade Korean English language learners (ELLs) in the United States at varying levels of English proficiency. Participants were screened using Dynamic Indicators of Basic Early Literacy Skills (DIBELS) Phoneme Segmentation Fluency (PSF), DIBELS Nonsense Word Fluency…
The Validity of College Grade Prediction Equations Over Time.

ERIC Educational Resources Information Center

Sawyer, Richard L.; Maxey, James

A sample of 260 colleges was surveyed during the years 1972-1976 to determine the validity of predicting college freshmen grades from standardized test scores and high school grades using the American College Testing (ACT) Assessment Program, an evaluative and placement service for students and educators involved in the transition from high school…
Validity Evidence for Games as Assessment Environments. CRESST Report 773

ERIC Educational Resources Information Center

Delacruz, Girlie C.; Chung, Gregory K. W. K.; Baker, Eva L.

2010-01-01

This study provides empirical evidence of a highly specific use of games in education--the assessment of the learner. Linear regressions were used to examine the predictive and convergent validity of a math game as assessment of mathematical understanding. Results indicate that prior knowledge significantly predicts game performance. Results also…
A Model for Investigating Predictive Validity at Highly Selective Institutions.

ERIC Educational Resources Information Center

Gross, Alan L.; And Others

A statistical model for investigating predictive validity at highly selective institutions is described. When the selection ratio is small, one must typically deal with a data set containing relatively large amounts of missing data on both criterion and predictor variables. Standard statistical approaches are based on the strong assumption that…
Evaluating the Complementary Roles of an SJT and Academic Assessment for Entry into Clinical Practice

ERIC Educational Resources Information Center

Cousans, Fran; Patterson, Fiona; Edwards, Helena; Walker, Kim; McLachlan, John C.; Good, David

2017-01-01

Although there is extensive evidence confirming the predictive validity of situational judgement tests (SJTs) in medical education, there remains a shortage of evidence for their predictive validity for performance of postgraduate trainees in their first role in clinical practice. Moreover, to date few researchers have empirically examined the…
The Validity of the Three-Component Model of Organizational Commitment in a Chinese Context.

ERIC Educational Resources Information Center

Cheng, Yuqiu; Stockdale, Margaret S.

2003-01-01

The construct validity of a three-component model of organizational commitment was tested with 226 Chinese employees. Affective and normative commitment significantly predicted job satisfaction; all three components predicted turnover intention. Compared with Canadian (n=603) and South Korean (n=227) samples, normative and affective commitment…
The Predictive Validity of CBM Writing Indices for Eighth-Grade Students

ERIC Educational Resources Information Center

Amato, Janelle M.; Watkins, Marley W.

2011-01-01

Curriculum-based measurement (CBM) is an alternative to traditional assessment techniques. Technical work has begun to identify CBM writing indices that are psychometrically sound for monitoring older students' writing proficiency. This study examined the predictive validity of CBM writing indices in a sample of 447 eighth-grade students.…
Thirty-Year Stability and Predictive Validity of Vocational Interests

ERIC Educational Resources Information Center

Rottinghaus, Patrick J.; Coon, Kristin L.; Gaffey, Abigail R.; Zytowski, Donald G.

2007-01-01

This study reports a 30-year follow-up of 107 former high school juniors and seniors from a rural Midwestern community who completed the Kuder Occupational Interest Survey (KOIS) in 1975 and 2005. Absolute, intra-individual, and test-retest stability of interests, and predictive validity of occupations were examined. Results showed minor absolute…

A Longitudinal Study of the Predictive Validity of a Kindergarten Screening Battery.

ERIC Educational Resources Information Center

Kilgallon, Mary K.; Mueller, Richard J.

Test validity was studied in nine subtests of a kindergarten screening battery used to predict reading comprehension for children up to five years after entering kindergarten. The independent variables were kindergarteners' scores on the: (1) Otis-Lennon Mental Ability Test; (2) Bender Visual Motor Gestalt Test; (3) Detroit Tests of Learning…
Journal Reviewer Ratings: Issues of Particularistic Bias, Agreement, and Predictive Validity within the Manuscript Review Process

ERIC Educational Resources Information Center

Vecchio, Robert P.

2006-01-01

Reviewer evaluations and recommendations for 853 manuscript submissions, over a span of 4 years, are analyzed for evidence of particularistic bias, reviewer agreement, and predictive validity for forecasting a published manuscript's citation impact. Attributes of the submitters, their affiliated institutions, and the reviewers have little…
Validity of the MicroDYN Approach: Complex Problem Solving Predicts School Grades beyond Working Memory Capacity

ERIC Educational Resources Information Center

Schweizer, Fabian; Wustenberg, Sascha; Greiff, Samuel

2013-01-01

This study examines the validity of the complex problem solving (CPS) test MicroDYN by investigating a) the relation between its dimensions--rule identification (exploration strategy), rule knowledge (acquired knowledge), rule application (control performance)--and working memory capacity (WMC), and b) whether CPS predicts school grades in…
Field Validity of the Psychopathy Checklist--Revised in Sex Offender Risk Assessment

ERIC Educational Resources Information Center

Murrie, Daniel C.; Boccaccini, Marcus T.; Caperton, Jennifer; Rufino, Katrina

2012-01-01

Several studies have concluded that scores from Hare's (2003) Psychopathy Checklist--Revised (PCL-R) predict reoffense among sexual offenders, but most of those studies examined the predictive validity of scores from trained research staff, not clinicians in the field scoring the measure as part of actual forensic assessments. Therefore, we…
Concurrent and Predictive Validity of the Phelps Kindergarten Readiness Scale-II

ERIC Educational Resources Information Center

Duncan, Jennifer; Rafter, Erin M.

2005-01-01

The purpose of this research was to establish the concurrent and predictive validity of the Phelps Kindergarten Readiness Scale, Second Edition (PKRS-II; L. Phelps, 2003). Seventy-four kindergarten students of diverse ethnic backgrounds enrolled in a northeastern suburban school participated in the study. The concurrent administration of the…
Predictive Validity and Accuracy of Oral Reading Fluency for English Learners

ERIC Educational Resources Information Center

Vanderwood, Michael L.; Tung, Catherine Y.; Checca, C. Jason

2014-01-01

The predictive validity and accuracy of an oral reading fluency (ORF) measure for a statewide assessment in English language arts was examined for second-grade native English speakers (NESs) and English learners (ELs) with varying levels of English proficiency. In addition to comparing ELs with native English speakers, the impact of English…
Incremental Validity of Thinking Styles in Predicting Academic Achievements: An Experimental Study in Hypermedia Learning Environments

ERIC Educational Resources Information Center

Fan, Weiqiao; Zhang, Li-Fang; Watkins, David

2010-01-01

The study examined the incremental validity of thinking styles in predicting academic achievement after controlling for personality and achievement motivation in the hypermedia-based learning environment. Seventy-two Chinese college students from Shanghai, the People's Republic of China, took part in this instructional experiment. The…
Assessing the reliability, predictive and construct validity of historical, clinical and risk management-20 (HCR-20) in Mexican psychiatric inpatients.

PubMed

Sada, Andrea; Robles-García, Rebeca; Martínez-López, Nicolás; Hernández-Ramírez, Rafael; Tovilla-Zarate, Carlos-Alfonso; López-Munguía, Fernando; Suárez-Alvarez, Enrique; Ayala, Xochitl; Fresán, Ana

2016-08-01

Assessing dangerousness to gauge the likelihood of future violent behaviour has become an integral part of clinical mental health practice in forensic and non-forensic psychiatric settings, and one of the most effective instruments for this purpose is the Historical, Clinical and Risk Management-20 (HCR-20). This study aimed to examine the HCR-20 factor structure in Mexican psychiatric inpatients and to obtain its predictive validity and reliability for use in this population. In total, 225 patients diagnosed with psychotic, affective or personality disorders were included. The HCR-20 was applied at hospital admission and violent behaviours were assessed during psychiatric hospitalization using the Overt Aggression Scale (OAS). Construct validity, predictive validity and internal consistency were determined. Violent behaviour during hospitalization was more severe in patients classified in the high-risk group. Fifteen items displayed adequate communalities in the original designated domains of the HCR-20 and internal consistency of the instrument was high. The HCR-20 is a suitable instrument for predicting violence risk in Mexican psychiatric inpatients.
Validation metrics for turbulent plasma transport

DOE PAGES

Holland, C.

2016-06-22

Developing accurate models of plasma dynamics is essential for confident predictive modeling of current and future fusion devices. In modern computer science and engineering, formal verification and validation processes are used to assess model accuracy and establish confidence in the predictive capabilities of a given model. This paper provides an overview of the key guiding principles and best practices for the development of validation metrics, illustrated using examples from investigations of turbulent transport in magnetically confined plasmas. Particular emphasis is given to the importance of uncertainty quantification and its inclusion within the metrics, and the need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between simulation and experiment. As a starting point, the structure of commonly used global transport model metrics and their limitations is reviewed. An alternate approach is then presented, which focuses upon comparisons of predicted local fluxes, fluctuations, and equilibrium gradients against observation. Furthermore, the utility of metrics based upon these comparisons is demonstrated by applying them to gyrokinetic predictions of turbulent transport in a variety of discharges performed on the DIII-D tokamak, as part of a multi-year transport model validation activity.
One-year temporal stability and predictive and incremental validity of the body, eating, and exercise comparison orientation measure (BEECOM) among college women.

PubMed

Fitzsimmons-Craft, Ellen E; Bardone-Cone, Anna M

2014-01-01

This study examined the one-year temporal stability and the predictive and incremental validity of the Body, Eating, and Exercise Comparison Measure (BEECOM) in a sample of 237 college women who completed study measures at two time points about one year apart. One-year temporal stability was high for the BEECOM total and subscale (i.e., Body, Eating, and Exercise Comparison Orientation) scores. Additionally, the BEECOM exhibited predictive validity in that it accounted for variance in body dissatisfaction and eating disorder symptomatology one year later. These findings held even after controlling for body mass index and existing measures of social comparison orientation. However, results regarding the incremental validity of the BEECOM, or its ability to predict change in these constructs over time, were more mixed. Overall, this study demonstrated additional psychometric properties of the BEECOM among college women, further establishing the usefulness of this measure for more comprehensively assessing eating disorder-related social comparison. Copyright © 2013 Elsevier Ltd. All rights reserved.
Establishment and validation of the scoring system for preoperative prediction of central lymph node metastasis in papillary thyroid carcinoma.

PubMed

Liu, Wen; Cheng, Ruochuan; Ma, Yunhai; Wang, Dan; Su, Yanjun; Diao, Chang; Zhang, Jianming; Qian, Jun; Liu, Jin

2018-05-03

Early preoperative diagnosis of central lymph node metastasis (CNM) is crucial to improve survival rates among patients with papillary thyroid carcinoma (PTC). Here, we analyzed clinical data from 2862 PTC patients, developed a scoring system using multivariable logistic regression, and tested it in a validation group. The predictive diagnostic effectiveness of the scoring system was evaluated based on consistency, discrimination ability, and accuracy. The scoring system considered seven variables: gender, age, tumor size, microcalcification, resistance index >0.7, multiple nodular lesions, and extrathyroid extension. The area under the receiver operating characteristic curve (AUC) was 0.742, indicating good discrimination. Using 5 points as a diagnostic threshold, the validation group had an AUC of 0.758, indicating good discrimination and consistency in the scoring system. The sensitivity of this predictive model for preoperative diagnosis of CNM was 4 times higher than that of direct ultrasound diagnosis. These data indicate that the CNM prediction model would improve preoperative diagnostic sensitivity for CNM in patients with papillary thyroid carcinoma.
Validation metrics for turbulent plasma transport

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holland, C.

Developing accurate models of plasma dynamics is essential for confident predictive modeling of current and future fusion devices. In modern computer science and engineering, formal verification and validation processes are used to assess model accuracy and establish confidence in the predictive capabilities of a given model. This paper provides an overview of the key guiding principles and best practices for the development of validation metrics, illustrated using examples from investigations of turbulent transport in magnetically confined plasmas. Particular emphasis is given to the importance of uncertainty quantification and its inclusion within the metrics, and the need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between simulation and experiment. As a starting point, the structure of commonly used global transport model metrics and their limitations is reviewed. An alternate approach is then presented, which focuses upon comparisons of predicted local fluxes, fluctuations, and equilibrium gradients against observation. Furthermore, the utility of metrics based upon these comparisons is demonstrated by applying them to gyrokinetic predictions of turbulent transport in a variety of discharges performed on the DIII-D tokamak, as part of a multi-year transport model validation activity.
Agility performance in high-level junior basketball players: the predictive value of anthropometrics and power qualities.

PubMed

Sisic, Nedim; Jelicic, Mario; Pehar, Miran; Spasic, Miodrag; Sekulic, Damir

2016-01-01

In basketball, anthropometric status is an important factor when identifying and selecting talents, while agility is one of the most vital motor performances. The aim of this investigation was to evaluate the influence of anthropometric variables and power capacities on different preplanned agility performances. The participants were 92 high-level, junior-age basketball players (16-17 years of age; 187.6±8.72 cm in body height, 78.40±12.26 kg in body mass), randomly divided into a validation and a cross-validation subsample. The predictor set consisted of 16 anthropometric variables and three tests of power capacities (Sargent-jump, broad-jump and medicine-ball-throw). The criteria were three tests of agility: a T-Shape-Test, a Zig-Zag-Test, and a test of running with a 180-degree turn (T180). Forward stepwise multiple regressions were calculated for the validation subsample and then cross-validated. Cross-validation included correlations between observed and predicted scores, dependent-samples t-tests between predicted and observed scores, and Bland-Altman plots. Analysis of variance showed that centres were advanced in most of the anthropometric indices and in the medicine-ball-throw (all at P<0.05), with no significant between-position differences for the other studied motor performances. Multiple regression models originally calculated for the validation subsample were then cross-validated, and confirmed for the Zig-Zag-Test (R of 0.71 and 0.72 for the validation and cross-validation subsample, respectively). Anthropometrics were not strongly related to agility performance, but leg length was negatively associated with performance in basketball-specific agility. Power capacities were confirmed to be an important factor in agility. The results highlight the importance of sport-specific tests when studying pre-planned agility performance in basketball. The improvement in power capacities will probably result in an improvement in agility in basketball athletes, while anthropometric indices should be used in order to identify those athletes who can achieve superior agility performance.
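The cross-validation step in this abstract compares observed and predicted scores, among other ways with a Bland-Altman analysis. A minimal Python sketch of the usual Bland-Altman summary is given here: the mean difference (bias) and the 95% limits of agreement at 1.96 standard deviations. The function name and the toy times are hypothetical and are not the study's data.

```python
from statistics import mean, stdev


def bland_altman(observed, predicted):
    """Return (bias, lower limit, upper limit) for observed minus predicted."""
    differences = [o - p for o, p in zip(observed, predicted)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)  # sample SD of the differences
    return bias, bias - spread, bias + spread


# Toy agility times in seconds (hypothetical).
observed_times = [10.0, 12.0, 11.0, 13.0]
predicted_times = [9.0, 12.0, 12.0, 12.0]
bias, lower, upper = bland_altman(observed_times, predicted_times)
```

A bias near zero with narrow limits suggests the regression equation transfers well to the held-out subsample, while wide limits flag poor individual-level agreement even when the correlation is high.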
Derivation and external validation of a case mix model for the standardized reporting of 30-day stroke mortality rates.

PubMed

Bray, Benjamin D; Campbell, James; Cloud, Geoffrey C; Hoffman, Alex; James, Martin; Tyrrell, Pippa J; Wolfe, Charles D A; Rudd, Anthony G

2014-11-01

Case mix adjustment is required to allow valid comparison of outcomes across care providers. However, there is a lack of externally validated models suitable for use in unselected stroke admissions. We therefore aimed to develop and externally validate prediction models to enable comparison of 30-day post-stroke mortality outcomes using routine clinical data. Models were derived (n=9000 patients) and internally validated (n=18 169 patients) using data from the Sentinel Stroke National Audit Program, the national register of acute stroke in England and Wales. External validation (n=1470 patients) was performed in the South London Stroke Register, a population-based longitudinal study. Models were fitted using general estimating equations. Discrimination and calibration were assessed using receiver operating characteristic curve analysis and correlation plots. Two final models were derived. Model A included age (<60, 60-69, 70-79, 80-89, and ≥90 years), National Institutes of Health Stroke Severity Score (NIHSS) on admission, presence of atrial fibrillation on admission, and stroke type (ischemic versus primary intracerebral hemorrhage). Model B was similar but included only the consciousness component of the NIHSS in place of the full NIHSS. Both models showed excellent discrimination and calibration in internal and external validation. The c-statistics in external validation were 0.87 (95% confidence interval, 0.84-0.89) and 0.86 (95% confidence interval, 0.83-0.89) for models A and B, respectively. We have derived and externally validated 2 models to predict mortality in unselected patients with acute stroke using commonly collected clinical variables. In settings where the ability to record the full NIHSS on admission is limited, the level of consciousness component of the NIHSS provides a good approximation of the full NIHSS for mortality prediction. © 2014 American Heart Association, Inc.
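The c-statistic reported for discrimination can be read as the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor. A minimal Python sketch of that pairwise definition follows, with ties counted as one half. The function name and toy values are hypothetical and unrelated to the registry data.

```python
def c_statistic(predicted_risks, outcomes):
    """Concordance over all (event, non-event) pairs; ties count as 0.5."""
    event_risks = [r for r, y in zip(predicted_risks, outcomes) if y == 1]
    nonevent_risks = [r for r, y in zip(predicted_risks, outcomes) if y == 0]
    if not event_risks or not nonevent_risks:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in event_risks:
        for n in nonevent_risks:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(event_risks) * len(nonevent_risks))


# Toy example: three of the four event/non-event pairs are ranked correctly.
c = c_statistic([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The double loop is quadratic in sample size, so a rank-based formula would be a better fit for large cohorts.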
Predicting Overall Survival After Stereotactic Ablative Radiation Therapy in Early-Stage Lung Cancer: Development and External Validation of the Amsterdam Prognostic Model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Louie, Alexander V., E-mail: Dr.alexlouie@gmail.com; Department of Radiation Oncology, London Regional Cancer Program, University of Western Ontario, London, Ontario; Department of Epidemiology, Harvard School of Public Health, Harvard University, Boston, Massachusetts

Purpose: A prognostic model for 5-year overall survival (OS), consisting of recursive partitioning analysis (RPA) and a nomogram, was developed for patients with early-stage non-small cell lung cancer (ES-NSCLC) treated with stereotactic ablative radiation therapy (SABR). Methods and Materials: A primary dataset of 703 ES-NSCLC SABR patients was randomly divided into a training (67%) and an internal validation (33%) dataset. In the former group, 21 unique parameters consisting of patient, treatment, and tumor factors were entered into an RPA model to predict OS. Univariate and multivariate models were constructed for RPA-selected factors to evaluate their relationship with OS. A nomogram for OS was constructed based on factors significant in multivariate modeling and validated with calibration plots. Both the RPA and the nomogram were externally validated in independent surgical (n=193) and SABR (n=543) datasets. Results: RPA identified 2 distinct risk classes based on tumor diameter, age, World Health Organization performance status (PS) and Charlson comorbidity index. This RPA had moderate discrimination in SABR datasets (c-index range: 0.52-0.60) but was of limited value in the surgical validation cohort. The nomogram predicting OS included smoking history in addition to RPA-identified factors. In contrast to RPA, validation of the nomogram performed well in internal validation (r²=0.97) and external SABR (r²=0.79) and surgical cohorts (r²=0.91). Conclusions: The Amsterdam prognostic model is the first externally validated prognostication tool for OS in ES-NSCLC treated with SABR available to individualize patient decision making. The nomogram retained strong performance across surgical and SABR external validation datasets. RPA performance was poor in surgical patients, suggesting that 2 different distinct patient populations are being treated with these 2 effective modalities.
The validity of DSM-5 severity specifiers for anorexia nervosa, bulimia nervosa, and binge-eating disorder.

PubMed

Smith, Kathryn E; Ellison, Jo M; Crosby, Ross D; Engel, Scott G; Mitchell, James E; Crow, Scott J; Peterson, Carol B; Le Grange, Daniel; Wonderlich, Stephen A

2017-09-01

The DSM-5 includes severity specifiers (i.e., mild, moderate, severe, extreme) for anorexia nervosa (AN), bulimia nervosa (BN), and binge-eating disorder (BED), which are determined by weight status (AN) and frequencies of binge-eating episodes (BED) or inappropriate compensatory behaviors (BN). Given limited data regarding the validity of eating disorder (ED) severity specifiers, this study examined the concurrent and predictive validity of severity specifiers in AN, BN, and BED. Adults with AN (n = 109), BN (n = 76), and BED (n = 216) were identified from previous datasets. Concurrent validity was assessed by measures of ED psychopathology, depression, anxiety, quality of life, and physical health. Predictive validity was assessed by ED symptoms at the end of the treatment in BN and BED. Severity categories did not differ in baseline validators, though the mild AN group evidenced greater ED symptoms compared to the severe group. In BN, greater severity was related to greater end of treatment binge-eating and compensatory behaviors, and lower likelihood of abstinence; however, in BED, greater severity was related to lower ED symptoms at the end of the treatment. Results demonstrated limited support for the validity of DSM-5 severity specifiers. Future research is warranted to explore additional validators and possible alternative indicators of severity in EDs. © 2017 Wiley Periodicals, Inc.
Developing and validating risk prediction models in an individual participant data meta-analysis

PubMed Central

2014-01-01

Background: Risk prediction models estimate the risk of developing future outcomes for individuals based on one or more underlying characteristics (predictors). We review how researchers develop and validate risk prediction models within an individual participant data (IPD) meta-analysis, in order to assess the feasibility and conduct of the approach. Methods: A qualitative review of the aims, methodology, and reporting in 15 articles that developed a risk prediction model using IPD from multiple studies. Results: The IPD approach offers many opportunities but methodological challenges exist, including: unavailability of requested IPD, missing patient data and predictors, and between-study heterogeneity in methods of measurement, outcome definitions and predictor effects. Most articles develop their model using IPD from all available studies and perform only an internal validation (on the same set of data). Ten of the 15 articles did not allow for any study differences in baseline risk (intercepts), potentially limiting their model’s applicability and performance in some populations. Only two articles used external validation (on different data), including a novel method which develops the model on all but one of the IPD studies, tests performance in the excluded study, and repeats by rotating the omitted study. Conclusions: An IPD meta-analysis offers unique opportunities for risk prediction research. Researchers can make more of this by allowing separate model intercept terms for each study (population) to improve generalisability, and by using ‘internal-external cross-validation’ to simultaneously develop and validate their model. Methodological challenges can be reduced by prospectively planned collaborations that share IPD for risk prediction. PMID:24397587
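The internal-external cross-validation described in this abstract develops the model on all but one study, tests it in the excluded study, and rotates the omitted study. A minimal Python sketch of that loop follows. The `fit` and `evaluate` callables, the pooled event rate used as a stand-in "model", and the toy studies are all hypothetical placeholders for a real prediction model and performance measure.

```python
def internal_external_cv(studies, fit, evaluate):
    """Hold out each study in turn: fit on the rest, evaluate on the one left out."""
    results = {}
    for held_out, test_rows in studies.items():
        train_rows = [
            row
            for name, rows in studies.items()
            if name != held_out
            for row in rows
        ]
        model = fit(train_rows)
        results[held_out] = evaluate(model, test_rows)
    return results


def fit_event_rate(outcomes):
    """Stand-in model: the pooled event rate of the training studies."""
    return sum(outcomes) / len(outcomes)


def calibration_gap(predicted_rate, outcomes):
    """Absolute gap between predicted and observed event rate in the held-out study."""
    return abs(predicted_rate - sum(outcomes) / len(outcomes))


# Toy binary outcomes per study (hypothetical).
studies = {"A": [1, 0, 0, 0], "B": [1, 1, 0, 0], "C": [0, 0, 0, 0]}
gaps = internal_external_cv(studies, fit_event_rate, calibration_gap)
```

Each entry of `gaps` is one external-validation result, so heterogeneity in performance across studies becomes visible instead of being averaged away by a single internal validation.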
Predicting chemically-induced skin reactions. Part I: QSAR models of skin sensitization and their application to identify potentially hazardous compounds

PubMed Central

Alves, Vinicius M.; Muratov, Eugene; Fourches, Denis; Strickland, Judy; Kleinstreuer, Nicole; Andrade, Carolina H.; Tropsha, Alexander

2015-01-01

Repetitive exposure to a chemical agent can induce an immune reaction in inherently susceptible individuals that leads to skin sensitization. Although many chemicals have been reported as skin sensitizers, there have been very few rigorously validated QSAR models with defined applicability domains (AD) that were developed using a large group of chemically diverse compounds. In this study, we have aimed to compile, curate, and integrate the largest publicly available dataset related to chemically-induced skin sensitization, use this data to generate rigorously validated QSAR models for skin sensitization, and employ these models as a virtual screening tool for identifying putative sensitizers among environmental chemicals. We followed best practices for model building and validation implemented with our predictive QSAR workflow using random forest modeling technique in combination with SiRMS and Dragon descriptors. The Correct Classification Rate (CCR) for QSAR models discriminating sensitizers from non-sensitizers was 71–88% when evaluated on several external validation sets, within a broad AD, with positive (for sensitizers) and negative (for non-sensitizers) predicted rates of 85% and 79% respectively. When compared to the skin sensitization module included in the OECD QSAR toolbox as well as to the skin sensitization model in publicly available VEGA software, our models showed a significantly higher prediction accuracy for the same sets of external compounds as evaluated by Positive Predicted Rate, Negative Predicted Rate, and CCR. These models were applied to identify putative chemical hazards in the ScoreCard database of possible skin or sense organ toxicants as primary candidates for experimental validation. PMID:25560674
Construction, internal validation and implementation in a mobile application of a scoring system to predict nonadherence to proton pump inhibitors.

PubMed

Mares-García, Emma; Palazón-Bru, Antonio; Folgado-de la Rosa, David Manuel; Pereira-Expósito, Avelino; Martínez-Martín, Álvaro; Cortés-Castell, Ernesto; Gil-Guillén, Vicente Francisco

2017-01-01

Other studies have assessed nonadherence to proton pump inhibitors (PPIs), but none has developed a screening test for its detection. To construct and internally validate a predictive model for nonadherence to PPIs. This prospective observational study with a one-month follow-up was carried out in 2013 in Spain, and included 302 patients with a prescription for PPIs. The primary variable was nonadherence to PPIs (pill count). Secondary variables were gender, age, antidepressants, type of PPI, non-guideline-recommended prescription (NGRP) of PPIs, and total number of drugs. With the secondary variables, a binary logistic regression model to predict nonadherence was constructed and adapted to a points system. The ROC curve, with its area (AUC), was calculated and the optimal cut-off point was established. The points system was internally validated through 1,000 bootstrap samples and implemented in a mobile application (Android). The points system had three prognostic variables: total number of drugs, NGRP of PPIs, and antidepressants. The AUC was 0.87 (95% CI [0.83-0.91], p < 0.001). The test yielded a sensitivity of 0.80 (95% CI [0.70-0.87]) and a specificity of 0.82 (95% CI [0.76-0.87]). The three parameters were very similar in the bootstrap validation. A points system to predict nonadherence to PPIs has been constructed, internally validated and implemented in a mobile application. Provided similar results are obtained in external validation studies, we will have a screening tool to detect nonadherence to PPIs.
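The abstract reports an AUC with a confidence interval and a validation based on 1,000 bootstrap samples. The exact bootstrap procedure is not described, so the following is only a sketch of one common variant, a percentile bootstrap of the AUC. The scores and labels are made up, and the function names are hypothetical.

```python
import random

def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_interval(scores, labels, n_boot=1000, seed=0):
    """Percentile 95% interval for the AUC from n_boot resamples with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for the AUC to exist
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]

# Made-up points-system scores (higher = more likely nonadherent) and outcomes.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0, 0]
point = auc(scores, labels)
low, high = bootstrap_auc_interval(scores, labels)
print(point, low, high)
```

An optimism-corrected bootstrap, which refits the model in every resample, is a stricter form of internal validation; the version above only quantifies sampling variability of the AUC for a fixed score.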
Comparing current definitions of return to work: a measurement approach.

PubMed

Steenstra, I A; Lee, H; de Vroome, E M M; Busse, J W; Hogg-Johnson, S J

2012-09-01

Return-to-work (RTW) status is an often used outcome in work and health research. In low back pain, work is regarded as a normal activity a worker should return to in order to fully recover. Comparing outcomes across studies and even jurisdictions using different definitions of RTW can be challenging for readers in general and when performing a systematic review in particular. In this study, the measurement properties of previously defined RTW outcomes were examined with data from two studies from two countries. Data on RTW in low back pain (LBP) from the Canadian Early Claimant Cohort (ECC), a workers' compensation based study, and the Dutch Amsterdam Sherbrooke Evaluation (ASE) study were analyzed. Correlations between outcomes, differences in predictive validity when using different outcomes and construct validity when comparing outcomes to a functional status outcome were analyzed. In the ECC all definitions were highly correlated and performed similarly in predictive validity. When compared to functional status, RTW definitions in the ECC study performed fair to good on all time points. In the ASE study all definitions were highly correlated and performed similarly in predictive validity. The RTW definitions, however, failed to compare or compared poorly with functional status. Only one definition compared fairly on one time point. Differently defined outcomes are highly correlated, give similar results in prediction, but seem to differ in construct validity when compared to functional status depending on societal context or possibly birth cohort. Comparison of studies using different RTW definitions appears valid as long as RTW status is not considered as a measure of functional status.

Diabetic retinopathy risk prediction for fundus examination using sparse learning: a cross-sectional study.

PubMed

Oh, Ein; Yoo, Tae Keun; Park, Eun-Cheol

2013-09-13

Blindness due to diabetic retinopathy (DR) is the major disability in diabetic patients. Although early management has been shown to prevent vision loss, diabetic patients have a low rate of routine ophthalmologic examination. Hence, we developed and validated sparse learning models with the aim of identifying the risk of DR in diabetic patients. Health records from the Korea National Health and Nutrition Examination Surveys (KNHANES) V-1 were used. The prediction models for DR were constructed using data from 327 diabetic patients, and were validated internally on 163 patients in the KNHANES V-1. External validation was performed using 562 diabetic patients in the KNHANES V-2. The learning models, including ridge, elastic net, and LASSO, were compared to the traditional indicators of DR. Considering the Bayesian information criterion, LASSO predicted DR most efficiently. In the internal and external validation, LASSO was significantly superior to the traditional indicators by calculating the area under the curve (AUC) of the receiver operating characteristic. LASSO showed an AUC of 0.81 and an accuracy of 73.6% in the internal validation, and an AUC of 0.82 and an accuracy of 75.2% in the external validation. The sparse learning model using LASSO was effective in analyzing the epidemiological underlying patterns of DR. This is the first study to develop a machine learning model to predict DR risk using health records. LASSO can be an excellent choice when both discriminative power and variable selection are important in the analysis of high-dimensional electronic health records.
Validity of a novel computerized screening test system for mild cognitive impairment.

PubMed

Park, Jin-Hyuck; Jung, Minye; Kim, Jongbae; Park, Hae Yean; Kim, Jung-Ran; Park, Ji-Hyuk

2018-06-20

Background: The mobile screening test system for screening mild cognitive impairment (mSTS-MCI) was developed for clinical use. However, the clinical usefulness of mSTS-MCI to detect elderly with MCI from those who are cognitively healthy has yet to be validated. Moreover, the comparability between this system and traditional screening tests for MCI has not been evaluated. The purpose of this study was to examine the validity and reliability of the mSTS-MCI and confirm the cut-off scores to detect MCI. The data were collected from 107 healthy elderly people and 74 elderly people with MCI. Concurrent validity was examined using the Korean version of Montreal Cognitive Assessment (MoCA-K) as a gold standard test, and test-retest reliability was investigated using 30 of the study participants at four-week intervals. The sensitivity, specificity, positive predictive value, and negative predictive value (NPV) were confirmed through Receiver Operating Characteristic (ROC) analysis, and the cut-off scores for elderly people with MCI were identified. Concurrent validity showed statistically significant correlations between the mSTS-MCI and MoCA-K and test-retest reliability indicated high correlation. As a result of screening predictability, the mSTS-MCI had a higher NPV than the MoCA-K. The mSTS-MCI was identified as a system with a high degree of validity and reliability. In addition, the mSTS-MCI showed high screening predictability, indicating it can be used in the clinical field as a screening test system for mild cognitive impairment.
Translation, adaptation, and validation of the Sunderland Scale and the Cubbin & Jackson Revised Scale in Portuguese

PubMed Central

Sousa, Bruno

2013-01-01

Objective To translate into Portuguese and evaluate the measuring properties of the Sunderland Scale and the Cubbin & Jackson Revised Scale, which are instruments for evaluating the risk of developing pressure ulcers during intensive care. Methods This study included the process of translation and adaptation of the scales to the Portuguese language, as well as the validation of these tools. To assess the reliability, Cronbach alpha values of 0.702 and 0.708 were identified for the Sunderland Scale and the Cubbin & Jackson Revised Scale, respectively. The validation criteria (predictive) were performed comparatively with the Braden Scale (gold standard), and the main measurements evaluated were sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve, which were calculated based on cutoff points. Results The Sunderland Scale exhibited 60% sensitivity, 86.7% specificity, 47.4% positive predictive value, 91.5% negative predictive value, and 0.86 for the area under the curve. The Cubbin & Jackson Revised Scale exhibited 73.3% sensitivity, 86.7% specificity, 52.4% positive predictive value, 94.2% negative predictive value, and 0.91 for the area under the curve. The Braden scale exhibited 100% sensitivity, 5.3% specificity, 17.4% positive predictive value, 100% negative predictive value, and 0.72 for the area under the curve. Conclusions Both tools demonstrated reliability and validity for this sample. The Cubbin & Jackson Revised Scale yielded better predictive values for the development of pressure ulcers during intensive care. PMID:23917975
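The four cutoff-based measures reported in this abstract all derive from a single 2x2 table. The sketch below shows the arithmetic. The cell counts are an assumption: the abstract does not report them, and they were chosen only because they reproduce the percentages quoted for the Sunderland Scale.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from the cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all with the outcome
        "specificity": tn / (tn + fp),  # true negatives among all without it
        "ppv": tp / (tp + fp),          # outcome present among positive tests
        "npv": tn / (tn + fn),          # outcome absent among negative tests
    }

# Hypothetical counts (not from the abstract): 15 patients with a pressure
# ulcer and 75 without, classified at one cutoff of the scale.
sunderland = diagnostic_metrics(tp=9, fp=10, fn=6, tn=65)
for name, value in sunderland.items():
    print(f"{name}: {100 * value:.1f}%")
```

With a low outcome prevalence (15 of 90 in this illustration), a test can combine a modest positive predictive value with a high negative predictive value, which is the pattern the abstract reports for all three scales.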
Assessing working memory in children with ADHD: Minor administration and scoring changes may improve digit span backward's construct validity.

PubMed

Wells, Erica L; Kofler, Michael J; Soto, Elia F; Schaefer, Hillary S; Sarver, Dustin E

2018-01-01

Pediatric ADHD is associated with impairments in working memory, but these deficits often go undetected when using clinic-based tests such as digit span backward. The current study pilot-tested minor administration/scoring modifications to improve digit span backward's construct and predictive validities in a well-characterized sample of children with ADHD. WISC-IV digit span was modified to administer all trials (i.e., ignore discontinue rule) and count digits rather than trials correct. Traditional and modified scores were compared to a battery of criterion working memory (construct validity) and academic achievement tests (predictive validity) for 34 children with ADHD ages 8-13 (M=10.41; 11 girls). Traditional digit span backward scores failed to predict working memory or KTEA-2 achievement (all ns). Alternate administration/scoring of digit span backward significantly improved its associations with working memory reordering (r=.58), working memory dual-processing (r=.53), working memory updating (r=.28), and KTEA-2 achievement (r=.49). Consistent with prior work, these findings urge caution when interpreting digit span performance. Minor test modifications may address test validity concerns, and should be considered in future test revisions. Digit span backward becomes a valid measure of working memory at exactly the point that testing is traditionally discontinued. Copyright © 2017 Elsevier Ltd. All rights reserved.
Concurrent validity and clinical usefulness of several individually administered tests of children's social-emotional cognition.

PubMed

McKown, Clark

2007-03-01

In this study, the validity of 5 tests of children's social-emotional cognition, defined as their encoding, memory, and interpretation of social information, was tested. Participants were 126 clinic-referred children between the ages of 5 and 17. All 5 tests were evaluated in terms of their (a) concurrent validity, (b) incremental validity, and (c) clinical usefulness in predicting social functioning. Tests included measures of nonverbal sensitivity, social language, and social problem solving. Criterion measures included parent and teacher report of social functioning. Analyses support the concurrent validity of all measures, and the incremental validity and clinical usefulness of tests of pragmatic language and problem solving.
Demonstrating the validity of three general scores of PET in predicting higher education achievement in Israel.

PubMed

Oren, Carmel; Kennet-Cohen, Tamar; Turvall, Elliot; Allalouf, Avi

2014-01-01

The Psychometric Entrance Test (PET), used for admission to higher education in Israel together with the Matriculation (Bagrut), had in the past one general (total) score in which the weights for its domains: Verbal, Quantitative and English, were 2:2:1, respectively. In 2011, two additional total scores were introduced, with different weights for the Verbal and the Quantitative domains. This study compares the predictive validity of the three general scores of PET, and demonstrates validity in terms of utility. The sample comprised 100,863 freshmen students of all Israeli universities over the classes of 2005-2009. Regression weights and correlations of the predictors with first-year grade point average (FYGPA) were computed. Simulations based on these results supplied the utility estimates. On average, PET is slightly more predictive than the Bagrut; using them both yields a better tool than either of them alone. Assigning differential weights to the components in the respective schools further improves the validity. The introduction of the new general scores of PET is validated by gathering and analyzing evidence based on relations of test scores to other variables. The utility of using the test can be demonstrated in ways different from correlations.
Prediction of functional aerobic capacity without exercise testing

NASA Technical Reports Server (NTRS)

Jackson, A. S.; Blair, S. N.; Mahar, M. T.; Wier, L. T.; Ross, R. M.; Stuteville, J. E.

1990-01-01

The purpose of this study was to develop functional aerobic capacity prediction models without using exercise tests (N-Ex) and to compare the accuracy with Astrand single-stage submaximal prediction methods. The data of 2,009 subjects (9.7% female) were randomly divided into validation (N = 1,543) and cross-validation (N = 466) samples. The validation sample was used to develop two N-Ex models to estimate VO2peak. Gender, age, body composition, and self-report activity were used to develop two N-Ex prediction models. One model estimated percent fat from skinfolds (N-Ex %fat) and the other used body mass index (N-Ex BMI) to represent body composition. The multiple correlations for the developed models were R = 0.81 (SE = 5.3 ml·kg⁻¹·min⁻¹) and R = 0.78 (SE = 5.6 ml·kg⁻¹·min⁻¹). This accuracy was confirmed when applied to the cross-validation sample. The N-Ex models were more accurate than what was obtained from VO2peak estimated from the Astrand prediction models. The SEs of the Astrand models ranged from 5.5-9.7 ml·kg⁻¹·min⁻¹. The N-Ex models were cross-validated on 59 men on hypertensive medication and 71 men who were found to have a positive exercise ECG. The SEs of the N-Ex models ranged from 4.6-5.4 ml·kg⁻¹·min⁻¹ with these subjects. (ABSTRACT TRUNCATED AT 250 WORDS)
Environmental fate model for ultra-low-volume insecticide applications used for adult mosquito management

USGS Publications Warehouse

Schleier, Jerome J.; Peterson, Robert K.D.; Irvine, Kathryn M.; Marshall, Lucy M.; Weaver, David K.; Preftakes, Collin J.

2012-01-01

One of the more effective ways of managing high densities of adult mosquitoes that vector human and animal pathogens is ultra-low-volume (ULV) aerosol applications of insecticides. The U.S. Environmental Protection Agency uses models that are not validated for ULV insecticide applications and exposure assumptions to perform their human and ecological risk assessments. Currently, there is no validated model that can accurately predict deposition of insecticides applied using ULV technology for adult mosquito management. In addition, little is known about the deposition and drift of small droplets like those used under conditions encountered during ULV applications. The objective of this study was to perform field studies to measure environmental concentrations of insecticides and to develop a validated model to predict the deposition of ULV insecticides. The final regression model was selected by minimizing the Bayesian Information Criterion and its prediction performance was evaluated using k-fold cross validation. Density of the formulation and the density and CMD interaction coefficients were the largest in the model. The results showed that as density of the formulation decreases, deposition increases. The interaction of density and CMD showed that higher density formulations and larger droplets resulted in greater deposition. These results are supported by the aerosol physics literature. A k-fold cross validation demonstrated that the mean square error of the selected regression model is not biased, and the mean square error and mean square prediction error indicated good predictive ability.
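The abstract evaluates prediction performance with k-fold cross validation and the mean square prediction error. A minimal sketch of that procedure is given below. It uses a one-predictor least-squares line and made-up data as stand-ins, since the study's actual regression terms and measurements are not given here. The helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def kfold_mspe(xs, ys, k=5):
    """Mean square prediction error, each point predicted by a model that never saw it."""
    n = len(xs)
    sq_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # interleaved folds: fold, fold+k, ...
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys))
                 if i not in test_idx]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        sq_errors += [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx]
    return sum(sq_errors) / n

# Made-up data: an exactly linear response, then the same response with noise.
xs = list(range(10))
ys_exact = [2 * x + 1 for x in xs]
ys_noisy = [y + e for y, e in zip(ys_exact, [0.3, -0.2, 0.1, 0.4, -0.5,
                                             0.2, -0.1, 0.0, 0.3, -0.4])]
print(kfold_mspe(xs, ys_exact), kfold_mspe(xs, ys_noisy))
```

Comparing the cross-validated error with the in-sample mean square error is one way to check for optimism: a large gap suggests the model fits noise in the training data.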
Validation of Skeletal Muscle cis-Regulatory Module Predictions Reveals Nucleotide Composition Bias in Functional Enhancers

PubMed Central

Kwon, Andrew T.; Chou, Alice Yi; Arenillas, David J.; Wasserman, Wyeth W.

2011-01-01

We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences identifies that nucleotide sequence composition can be an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions. PMID:22144875
A scoring system to predict breast cancer mortality at 5 and 10 years.

PubMed

Paredes-Aracil, Esther; Palazón-Bru, Antonio; Folgado-de la Rosa, David Manuel; Ots-Gutiérrez, José Ramón; Compañ-Rosique, Antonio Fernando; Gil-Guillén, Vicente Francisco

2017-03-24

Although predictive models exist for mortality in breast cancer (BC) (generally all-cause mortality), they are not applicable to all patients and their statistical methodology is not the most powerful to develop a predictive model. Consequently, we developed a predictive model specific for BC mortality at 5 and 10 years resolving the above issues. This cohort study included 287 patients diagnosed with BC in a Spanish region in 2003-2016. Primary variable: time-to-BC death. Secondary variables: age, personal history of breast surgery, personal history of any cancer/BC, premenopause, postmenopause, grade, estrogen receptor, progesterone receptor, c-erbB2, TNM stage, multicentricity/multifocality, diagnosis and treatment. A points system was constructed to predict BC mortality at 5 and 10 years. The model was internally validated by bootstrapping. The points system was integrated into a mobile application for Android. Mean follow-up was 8.6 ± 3.5 years and 55 patients died of BC. The points system included age, personal history of BC, grade, TNM stage and multicentricity. Validation was satisfactory, in both discrimination and calibration. In conclusion, we constructed and internally validated a scoring system for predicting BC mortality at 5 and 10 years. External validation studies are needed for its use in other geographical areas.
Multifactorial risk index for prediction of intraoperative blood transfusion in endovascular aneurysm repair.

PubMed

Mahmood, Eitezaz; Matyal, Robina; Mueller, Ariel; Mahmood, Feroze; Tung, Avery; Montealegre-Gallegos, Mario; Schermerhorn, Marc; Shahul, Sajid

2018-03-01

In some institutions, the current blood ordering practice does not discriminate minimally invasive endovascular aneurysm repair (EVAR) from open procedures, with consequent increasing costs and likelihood of blood product wastage for EVARs. This limitation in practice can possibly be addressed with the development of a reliable prediction model for transfusion risk in EVAR patients. We used the American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP) database to create a model for prediction of intraoperative blood transfusion occurrence in patients undergoing EVAR. Afterward, we tested our predictive model on the Vascular Study Group of New England (VSGNE) database. We used the ACS NSQIP database for patients who underwent EVAR from 2011 to 2013 (N = 4709) as our derivation set for identifying a risk index for predicting intraoperative blood transfusion. We then developed a clinical risk score and validated this model using patients who underwent EVAR from 2003 to 2014 in the VSGNE database (N = 4478). The transfusion rates were 8.4% and 6.1% for the ACS NSQIP (derivation set) and VSGNE (validation) databases, respectively. Hemoglobin concentration, American Society of Anesthesiologists class, age, and aneurysm diameter predicted blood transfusion in the derivation set. When it was applied on the validation set, our risk index demonstrated good discrimination in both the derivation and validation set (C statistic = 0.73 and 0.70, respectively) and calibration using the Hosmer-Lemeshow test (P = .27 and 0.31) for both data sets. We developed and validated a risk index for predicting the likelihood of intraoperative blood transfusion in EVAR patients. Implementation of this index may facilitate the blood management strategies specific for EVAR. Copyright © 2017 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.
Evaluation of the DAVROS (Development And Validation of Risk-adjusted Outcomes for Systems of emergency care) risk-adjustment model as a quality indicator for healthcare

PubMed Central

Wilson, Richard; Goodacre, Steve W; Klingbajl, Marcin; Kelly, Anne-Maree; Rainer, Tim; Coats, Tim; Holloway, Vikki; Townend, Will; Crane, Steve

2014-01-01

Background and objective Risk-adjusted mortality rates can be used as a quality indicator if it is assumed that the discrepancy between predicted and actual mortality can be attributed to the quality of healthcare (ie, the model has attributional validity). The Development And Validation of Risk-adjusted Outcomes for Systems of emergency care (DAVROS) model predicts 7-day mortality in emergency medical admissions. We aimed to test this assumption by evaluating the attributional validity of the DAVROS risk-adjustment model. Methods We selected cases that had the greatest discrepancy between observed mortality and predicted probability of mortality from seven hospitals involved in validation of the DAVROS risk-adjustment model. Reviewers at each hospital assessed hospital records to determine whether the discrepancy between predicted and actual mortality could be explained by the healthcare provided. Results We received 232/280 (83%) completed review forms relating to 179 unexpected deaths and 53 unexpected survivors. The healthcare system was judged to have potentially contributed to 10/179 (8%) of the unexpected deaths and 26/53 (49%) of the unexpected survivors. Failure of the model to appropriately predict risk was judged to be responsible for 135/179 (75%) of the unexpected deaths and 2/53 (4%) of the unexpected survivors. Some 10/53 (19%) of the unexpected survivors died within a few months of the 7-day period of model prediction. Conclusions We found little evidence that deaths occurring in patients with a low predicted mortality from risk-adjustment could be attributed to the quality of healthcare provided. PMID:23605036
Systematic review of prediction models for delirium in the older adult inpatient.

PubMed

Lindroth, Heidi; Bratzke, Lisa; Purvis, Suzanne; Brown, Roger; Coburn, Mark; Mrkobrada, Marko; Chan, Matthew T V; Davis, Daniel H J; Pandharipande, Pratik; Carlsson, Cynthia M; Sanders, Robert D

2018-04-28

To identify existing prognostic delirium prediction models and evaluate their validity and statistical methodology in the older adult (≥60 years) acute hospital population. Systematic review. PubMed, CINAHL, PsychINFO, SocINFO, Cochrane, Web of Science and Embase were searched from 1 January 1990 to 31 December 2016. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses and CHARMS Statement guided protocol development. Inclusion criteria: age >60 years, inpatient, developed/validated a prognostic delirium prediction model. Exclusion criteria: alcohol-related delirium, sample size ≤50. The primary performance measures were calibration and discrimination statistics. Two authors independently conducted search and extracted data. The synthesis of data was done by the first author. Disagreement was resolved by the mentoring author. The initial search resulted in 7,502 studies. Following full-text review of 192 studies, 33 were excluded based on age criteria (<60 years) and 27 met the defined criteria. Twenty-three delirium prediction models were identified, 14 were externally validated and 3 were internally validated. The following populations were represented: 11 medical, 3 medical/surgical and 13 surgical. The assessment of delirium was often non-systematic, resulting in varied incidence. Fourteen models were externally validated with an area under the receiver operating curve range from 0.52 to 0.94. Limitations in design, data collection methods and model metric reporting statistics were identified. Delirium prediction models for older adults show variable and typically inadequate predictive capabilities. Our review highlights the need for development of robust models to predict delirium in older inpatients. We provide recommendations for the development of such models. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
An Empiric HIV Risk Scoring Tool to Predict HIV-1 Acquisition in African Women.

PubMed

Balkus, Jennifer E; Brown, Elizabeth; Palanee, Thesla; Nair, Gonasagrie; Gafoor, Zakir; Zhang, Jingyang; Richardson, Barbra A; Chirenje, Zvavahera M; Marrazzo, Jeanne M; Baeten, Jared M

2016-07-01

To develop and validate an HIV risk assessment tool to predict HIV acquisition among African women. Data were analyzed from 3 randomized trials of biomedical HIV prevention interventions among African women (VOICE, HPTN 035, and FEM-PrEP). We implemented standard methods for the development of clinical prediction rules to generate a risk-scoring tool to predict HIV acquisition over the course of 1 year. Performance of the score was assessed through internal and external validations. The final risk score resulting from multivariable modeling included age, married/living with a partner, partner provides financial or material support, partner has other partners, alcohol use, detection of a curable sexually transmitted infection, and herpes simplex virus 2 serostatus. Point values for each factor ranged from 0 to 2, with a maximum possible total score of 11. Scores ≥5 were associated with HIV incidence >5 per 100 person-years and identified 91% of incident HIV infections from among only 64% of women. The area under the curve (AUC) for predictive ability of the score was 0.71 (95% confidence interval [CI]: 0.68 to 0.74), indicating good predictive ability. Risk score performance was generally similar with internal cross-validation (AUC = 0.69; 95% CI: 0.66 to 0.73) and external validation in HPTN 035 (AUC = 0.70; 95% CI: 0.65 to 0.75) and FEM-PrEP (AUC = 0.58; 95% CI: 0.51 to 0.65). A discrete set of characteristics that can be easily assessed in clinical and research settings was predictive of HIV acquisition over 1 year. The use of a validated risk score could improve efficiency of recruitment into HIV prevention research and inform scale-up of HIV prevention strategies in women at highest risk.
External Validation of a Tool Predicting 7-Year Risk of Developing Cardiovascular Disease, Type 2 Diabetes or Chronic Kidney Disease.

PubMed

Rauh, Simone P; Rutters, Femke; van der Heijden, Amber A W A; Luimes, Thomas; Alssema, Marjan; Heymans, Martijn W; Magliano, Dianna J; Shaw, Jonathan E; Beulens, Joline W; Dekker, Jacqueline M

2018-02-01

Chronic cardiometabolic diseases, including cardiovascular disease (CVD), type 2 diabetes (T2D) and chronic kidney disease (CKD), share many modifiable risk factors and can be prevented using combined prevention programs. Valid risk prediction tools are needed to accurately identify individuals at risk. We aimed to validate a previously developed non-invasive risk prediction tool for predicting the combined 7-year-risk for chronic cardiometabolic diseases. The previously developed tool is stratified for sex and contains the predictors age, BMI, waist circumference, use of antihypertensives, smoking, family history of myocardial infarction/stroke, and family history of diabetes. This tool was externally validated, evaluating model performance using area under the receiver operating characteristic curve (AUC) to assess discrimination and Hosmer-Lemeshow goodness-of-fit (HL) statistics to assess calibration. The intercept was recalibrated to improve calibration performance. The risk prediction tool was validated in 3544 participants from the Australian Diabetes, Obesity and Lifestyle Study (AusDiab). Discrimination was acceptable, with an AUC of 0.78 (95% CI 0.75-0.81) in men and 0.78 (95% CI 0.74-0.81) in women. Calibration was poor (HL statistic: p < 0.001), but improved considerably after intercept recalibration. Examination of individual outcomes showed that in men, AUC was highest for CKD (0.85 [95% CI 0.78-0.91]) and lowest for T2D (0.69 [95% CI 0.65-0.74]). In women, AUC was highest for CVD (0.88 [95% CI 0.83-0.94]) and lowest for T2D (0.71 [95% CI 0.66-0.75]). Validation of our previously developed tool showed robust discriminative performance across populations. Model recalibration is recommended to account for different disease rates. Our risk prediction tool can be useful in large-scale prevention programs for identifying those in need of further risk profiling because of their increased risk for chronic cardiometabolic diseases.
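Intercept recalibration, as mentioned in this abstract, keeps the original predictor effects and shifts only the intercept of a logistic model so that the average predicted risk matches the event rate in the new population. The sketch below assumes a logistic model and uses made-up linear predictors and outcomes. The function name and the bisection approach are illustrative choices, not the authors' method.

```python
import math

def expit(z):
    """Inverse logit: converts a linear predictor into a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def recalibrate_intercept(linear_predictors, outcomes, tol=1e-10):
    """Return the intercept shift that makes expected events equal observed events.

    Solves sum(expit(lp + shift)) = sum(outcomes) by bisection. This is the
    maximum-likelihood intercept update with the original linear predictor
    held fixed as an offset. It needs at least one event and one non-event.
    """
    lo, hi = -20.0, 20.0
    observed = sum(outcomes)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        expected = sum(expit(lp + mid) for lp in linear_predictors)
        if expected < observed:
            lo = mid  # predictions still too low: shift upward
        else:
            hi = mid
    return (lo + hi) / 2

# Made-up validation data: the original model over-predicts in this population.
lps = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0]
outcomes = [0, 0, 0, 0, 1, 0]
shift = recalibrate_intercept(lps, outcomes)
recalibrated = [expit(lp + shift) for lp in lps]
print(shift, sum(recalibrated))
```

Because every linear predictor moves by the same amount, the ranking of individuals is unchanged, so discrimination (AUC) stays the same while calibration-in-the-large improves.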
ERP evidence for selective drop in attentional costs in uncertain environments: challenging a purely premotor account of covert orienting of attention.

PubMed

Lasaponara, Stefano; Chica, Ana B; Lecce, Francesca; Lupianez, Juan; Doricchi, Fabrizio

2011-07-01

Several studies have proved that the reliability of endogenous spatial cues linearly modulates the reaction time advantage in the processing of targets at validly cued vs. invalidly cued locations, i.e. the "validity effect". This would imply that with non-predictive cues, no "validity effect" should be observed. However, contrary to this prediction, one could hypothesize that attentional benefits by valid cuing (i.e. the RT advantage for validly vs. neutrally cued targets) can still be maintained with non-predictive cues, if the brain were endowed with mechanisms allowing the selective reduction in costs of reorienting from invalidly cued locations (i.e. the reduction of the RT disadvantage for invalidly vs. neutrally cued targets). This separated modulation of attentional benefits and costs would be adaptive in uncertain contexts where cues predict at chance level the location of targets. Through the joint recording of manual reaction times and event-related cerebral potentials (ERPs), we have found that this is the case and that relying on non-predictive endogenous cues results in abatement of attentional costs and the difference in the amplitude of the P1 brain responses evoked by invalidly vs. neutrally cued targets. In contrast, the use of non-predictive cues leaves unaffected attentional benefits and the difference in the amplitude of the N1 responses evoked by validly vs. neutrally cued targets. At the individual level, the drop in costs with non-predictive cues was matched with equivalent lateral biases in RTs to neutrally and invalidly cued targets presented in the left and right visual field. During the cue period, the drop in costs with non-predictive cues was preceded by reduction of the Early Directing Attention Negativity (EDAN) on posterior occipital sites and by enhancement of the frontal Anterior Directing Attention Negativity (ADAN) correlated to preparatory voluntary orienting. 
These findings demonstrate, for the first time, that the segregation of mechanisms regulating attentional benefits and costs helps efficiency of orienting in "uncertain" visual spatial contexts characterized by poor probabilistic association between cues and targets.
Systematic review of prognostic prediction models for acute kidney injury (AKI) in general hospital populations.

PubMed

Hodgson, Luke Eliot; Sarnowski, Alexander; Roderick, Paul J; Dimitrov, Borislav D; Venn, Richard M; Forni, Lui G

2017-09-27

Critically appraise prediction models for hospital-acquired acute kidney injury (HA-AKI) in general populations. Systematic review. Medline, Embase and Web of Science until November 2016. Studies describing development of a multivariable model for predicting HA-AKI in non-specialised adult hospital populations. Published guidance followed for data extraction, reporting and appraisal. 14 046 references were screened. Of 53 HA-AKI prediction models, 11 met inclusion criteria (general medicine and/or surgery populations, 474 478 patient episodes) and five were externally validated. The most common predictors were age (n=9 models), diabetes (5), admission serum creatinine (SCr) (5), chronic kidney disease (CKD) (4), drugs (diuretics (4) and/or ACE inhibitors/angiotensin-receptor blockers (3)), bicarbonate and heart failure (4 models each). Heterogeneity was identified for outcome definition. Deficiencies in reporting included handling of predictors, missing data and sample size. Admission SCr was frequently taken to represent baseline renal function. Most models were considered at high risk of bias. Area under the receiver operating characteristic curves to predict HA-AKI ranged 0.71-0.80 in derivation (reported in 8/11 studies), 0.66-0.80 for internal validation studies (n=7) and 0.65-0.71 in five external validations. For calibration, the Hosmer-Lemeshow test or a calibration plot was provided in 4/11 derivations, 3/11 internal and 3/5 external validations. A minority of the models allow easy bedside calculation and potential electronic automation. No impact analysis studies were found. AKI prediction models may help address shortcomings in risk assessment; however, in general hospital populations, few have external validation. Similar predictors reflect an elderly demographic with chronic comorbidities. Reporting deficiencies mirror prediction research more broadly, with handling of SCr (baseline function and use as a predictor) a concern.
Future research should focus on validation, exploration of electronic linkage and impact analysis. The latter could combine a prediction model with AKI alerting to address prevention and early recognition of evolving AKI.
The reliability, validity, sensitivity, specificity and predictive values of the Chinese version of the Rowland Universal Dementia Assessment Scale.

PubMed

Chen, Chia-Wei; Chu, Hsin; Tsai, Chia-Fen; Yang, Hui-Ling; Tsai, Jui-Chen; Chung, Min-Huey; Liao, Yuan-Mei; Chi, Mei-Ju; Chou, Kuei-Ru

2015-11-01

The purpose of this study was to translate the Rowland Universal Dementia Assessment Scale into Chinese and to evaluate the psychometric properties (reliability and validity) and the diagnostic properties (sensitivity, specificity and predictive values) of the Chinese version of the Rowland Universal Dementia Assessment Scale. The accurate detection of early dementia requires screening tools with favourable cross-cultural linguistic and appropriate sensitivity, specificity, and predictive values, particularly for Chinese-speaking populations. This was a cross-sectional, descriptive study. Overall, 130 participants suspected to have cognitive impairment were enrolled in the study. A test-retest for determining reliability was scheduled four weeks after the initial test. Content validity was determined by five experts, whereas construct validity was established by using contrasted group technique. The participants' clinical diagnoses were used as the standard in calculating the sensitivity, specificity, positive predictive value and negative predictive value. The study revealed that the Chinese version of the Rowland Universal Dementia Assessment Scale exhibited a test-retest reliability of 0.90, an internal consistency reliability of 0.71, an inter-rater reliability (kappa value) of 0.88 and a content validity index of 0.97. Both the patients and healthy contrast group exhibited significant differences in their cognitive ability. The optimal cut-off points for the Chinese version of the Rowland Universal Dementia Assessment Scale in the test for mild cognitive impairment and dementia were 24 and 22, respectively; moreover, for these two conditions, the sensitivities of the scale were 0.79 and 0.76, the specificities were 0.91 and 0.81, the areas under the curve were 0.85 and 0.78, the positive predictive values were 0.99 and 0.83 and the negative predictive values were 0.96 and 0.91 respectively. 
The Chinese version of the Rowland Universal Dementia Assessment Scale exhibited sound reliability, validity, sensitivity, specificity and predictive values. This scale can help clinical staff members to quickly and accurately diagnose cognitive impairment and provide appropriate treatment as early as possible.
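The diagnostic properties named in the abstract above (sensitivity, specificity, positive and negative predictive value) all derive from a 2x2 table of test result against reference diagnosis. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from 2x2 counts.

    tp: test positive, condition present   fp: test positive, condition absent
    fn: test negative, condition present   tn: test negative, condition absent
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases the test detects
        "specificity": tn / (tn + fp),  # share of non-cases the test clears
        "ppv": tp / (tp + fp),          # probability of disease given a positive test
        "npv": tn / (tn + fn),          # probability of no disease given a negative test
    }

# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=40, fp=10, fn=10, tn=40)
```

Unlike sensitivity and specificity, the two predictive values depend on how common the condition is in the sample, so they do not transfer directly to populations with a different prevalence.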
The Diagnostic Accuracy of the Berg Balance Scale in Predicting Falls.

PubMed

Park, Seong-Hi; Lee, Young-Shin

2017-11-01

This study aimed to evaluate the predictive validity of the Berg Balance Scale (BBS) as a screening tool for fall risks among those with varied levels of balance. A total of 21 studies reporting predictive validity of the BBS of fall risk were meta-analyzed. With regard to the overall predictive validity of the BBS, the pooled sensitivity and specificity were 0.72 and 0.73, respectively; the accuracy curve area was 0.84. The findings showed statistical heterogeneity among studies. Among the sub-groups, the age group of those younger than 65 years, those with neuromuscular disease, those with 2+ falls, and those with a cutoff point of 45 to 49 showed better sensitivity with statistically less heterogeneity. The empirical evidence indicates that the BBS is a suitable tool to screen for the risk of falls and shows good predictability when used with the appropriate criteria and applied to those with neuromuscular disease.
Translation and validation of the Canadian diabetes risk assessment questionnaire in China.

PubMed

Guo, Jia; Shi, Zhengkun; Chen, Jyu-Lin; Dixon, Jane K; Wiley, James; Parry, Monica

2018-01-01

To adapt the Canadian Diabetes Risk Assessment Questionnaire for the Chinese population and to evaluate its psychometric properties. A cross-sectional study was conducted with a convenience sample of 194 individuals aged 35-74 years from October 2014 to April 2015. The Canadian Diabetes Risk Assessment Questionnaire was adapted and translated for the Chinese population. Test-retest reliability was conducted to measure stability. Criterion and convergent validity of the adapted questionnaire were assessed using 2-hr 75 g oral glucose tolerance tests and the Finnish Diabetes Risk Scores, respectively. Sensitivity and specificity were evaluated to establish its predictive validity. The test-retest reliability was 0.988. Adequate validity of the adapted questionnaire was demonstrated by positive correlations found between the scores and 2-hr 75 g oral glucose tolerance tests (r = .343, p < .001) and with the Finnish Diabetes Risk Scores (r = .738, p < .001). The area under receiver operating characteristic curve was 0.705 (95% CI .632, .778), demonstrating moderate diagnostic value at a cutoff score of 30. The sensitivity was 73%, with a positive predictive value of 57% and negative predictive value of 78%. Our results provided evidence supporting the translation consistency, content validity, convergent validity, criterion validity, sensitivity, and specificity of the translated Canadian Diabetes Risk Assessment Questionnaire with minor modifications. This paper provides clinical, practical, and methodological information on how to adapt a diabetes risk calculator between cultures for public health nurses.
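The area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen case scores higher than a randomly chosen non-case. A minimal sketch of that rank-based (Mann-Whitney) calculation follows; the function name and the scores are hypothetical, and the study's own software and confidence-interval method are not stated in the abstract.

```python
def auc_mann_whitney(case_scores, control_scores):
    """AUC as the fraction of case/control pairs in which the case
    has the higher score; tied pairs count as one half."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical risk scores for illustration only.
perfect = auc_mann_whitney([3, 4], [1, 2])   # every case outranks every control
chance = auc_mann_whitney([1, 3], [2, 2])    # half of the pairs are won
```

The pairwise loop is quadratic in sample size, which is fine for a sketch; a rank-sum formulation gives the same value more efficiently for large samples.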

Applicability Analysis of Validation Evidence for Biomedical Computational Models

DOE PAGES

Pathmanathan, Pras; Gray, Richard A.; Romero, Vicente J.; ...

2017-09-07

Computational modeling has the potential to revolutionize medicine the way it transformed engineering. However, despite decades of work, there has only been limited progress to successfully translate modeling research to patient care. One major difficulty which often occurs with biomedical computational models is an inability to perform validation in a setting that closely resembles how the model will be used. For example, for a biomedical model that makes in vivo clinically relevant predictions, direct validation of predictions may be impossible for ethical, technological, or financial reasons. Unavoidable limitations inherent to the validation process lead to challenges in evaluating the credibility of biomedical model predictions. Therefore, when evaluating biomedical models, it is critical to rigorously assess applicability, that is, the relevance of the computational model, and its validation evidence to the proposed context of use (COU). However, there are no well-established methods for assessing applicability. In this paper, we present a novel framework for performing applicability analysis and demonstrate its use with a medical device computational model. The framework provides a systematic, step-by-step method for breaking down the broad question of applicability into a series of focused questions, which may be addressed using supporting evidence and subject matter expertise. The framework can be used for model justification, model assessment, and validation planning. While motivated by biomedical models, it is relevant to a broad range of disciplines and underlying physics. Finally, the proposed applicability framework could help overcome some of the barriers inherent to validation of, and aid clinical implementation of, biomedical models.
A Novel Biclustering Approach to Association Rule Mining for Predicting HIV-1–Human Protein Interactions

PubMed Central

Mukhopadhyay, Anirban; Maulik, Ujjwal; Bandyopadhyay, Sanghamitra

2012-01-01

Identification of potential viral-host protein interactions is a vital and useful approach towards development of new drugs targeting those interactions. In recent days, computational tools are being utilized for predicting viral-host interactions. Recently a database containing records of experimentally validated interactions between a set of HIV-1 proteins and a set of human proteins has been published. The problem of predicting new interactions based on this database is usually posed as a classification problem. However, posing the problem as a classification one suffers from the lack of biologically validated negative interactions. Therefore it will be beneficial to use the existing database for predicting new viral-host interactions without the need of negative samples. Motivated by this, in this article, the HIV-1–human protein interaction database has been analyzed using association rule mining. The main objective is to identify a set of association rules both among the HIV-1 proteins and among the human proteins, and use these rules for predicting new interactions. In this regard, a novel association rule mining technique based on biclustering has been proposed for discovering frequent closed itemsets followed by the association rules from the adjacency matrix of the HIV-1–human interaction network. Novel HIV-1–human interactions have been predicted based on the discovered association rules and tested for biological significance. For validation of the predicted new interactions, gene ontology-based and pathway-based studies have been performed. These studies show that the human proteins which are predicted to interact with a particular viral protein share many common biological activities. Moreover, literature survey has been used for validation purpose to identify some predicted interactions that are already validated experimentally but not present in the database. Comparison with other prediction methods is also discussed. PMID:22539940
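The association rules discussed in the abstract above are conventionally scored by support and confidence. The sketch below shows only those two textbook definitions on a toy transaction list; it does not implement the authors' biclustering-based mining of frequent closed itemsets, and the item labels are hypothetical placeholders rather than real protein names.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for transaction in transactions if itemset <= set(transaction))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent:
    support(antecedent and consequent) / support(antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Hypothetical data: each transaction is the set of partners observed
# to interact with one protein (labels are placeholders).
transactions = [{"A", "B"}, {"A", "B", "C"}, {"A"}, {"B", "C"}]
support_ab = support(transactions, {"A", "B"})
confidence_a_to_b = confidence(transactions, {"A"}, {"B"})
```

A rule such as A -> B with high confidence suggests that a protein known to interact with A is a candidate for interacting with B, which is how rules of this kind can propose new interactions without negative examples.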
Reading the Road Signs: The Utility of the MMPI-2 Restructured Form Validity Scales in Prediction of Premature Termination.

PubMed

Anestis, Joye C; Finn, Jacob A; Gottfried, Emily; Arbisi, Paul A; Joiner, Thomas E

2015-06-01

This study examined the utility of the Minnesota Multiphasic Personality Inventory-2 Restructured Form (MMPI-2-RF) Validity Scales in prediction of premature termination in a sample of 511 individuals seeking services from a university-based psychology clinic. Higher scores on True Response Inconsistency-Revised and Infrequent Psychopathology Responses increased the risk of premature termination, whereas higher scores on Adjustment Validity lowered the risk of premature termination. Additionally, when compared with individuals who did not prematurely terminate, individuals who prematurely terminated treatment had lower Global Assessment of Functioning scores at both intake and termination and made fewer improvements. Implications of these findings for the use of the MMPI-2-RF Validity Scales in promoting treatment compliance are discussed.
Isokinetic knee strength qualities as predictors of jumping performance in high-level volleyball athletes: multiple regression approach.

PubMed

Sattler, Tine; Sekulic, Damir; Spasic, Miodrag; Osmankac, Nedzad; Vicente João, Paulo; Dervisevic, Edvin; Hadzic, Vedran

2016-01-01

Previous investigations noted potential importance of isokinetic strength in rapid muscular performances, such as jumping. This study aimed to identify the influence of isokinetic-knee-strength on specific jumping performance in volleyball. The secondary aim of the study was to evaluate reliability and validity of the two volleyball-specific jumping tests. The sample comprised 67 female (21.96±3.79 years; 68.26±8.52 kg; 174.43±6.85 cm) and 99 male (23.62±5.27 years; 84.83±10.37 kg; 189.01±7.21 cm) high-level volleyball players who competed in 1st and 2nd National Division. Subjects were randomly divided into validation (N.=55 and 33 for males and females, respectively) and cross-validation subsamples (N.=54 and 34 for males and females, respectively). Set of predictors included isokinetic tests, to evaluate the eccentric and concentric strength capacities of the knee extensors, and flexors for dominant and non-dominant leg. The main outcome measure for the isokinetic testing was peak torque (PT) which was later normalized for body mass and expressed as PT/Kg. Block-jump and spike-jump performances were measured over three trials, and observed as criteria. Forward stepwise multiple regressions were calculated for validation subsamples and then cross-validated. Cross validation included correlations between, and t-test differences between, observed and predicted scores, and Bland Altman graphics. Jumping tests were found to be reliable (spike jump: ICC of 0.79 and 0.86; block-jump: ICC of 0.86 and 0.90; for males and females, respectively), and their validity was confirmed by significant t-test differences between 1st vs. 2nd division players. Isokinetic variables were found to be significant predictors of jumping performance in females, but not among males.
In females, the isokinetic-knee measures were shown to be stronger and more valid predictors of the block-jump (42% and 64% of the explained variance for validation and cross-validation subsample, respectively) than that of the spike-jump (39% and 34% of the explained variance for validation and cross-validation subsample, respectively). Differences between prediction models calculated for males and females are mostly explained by gender-specific biomechanics of jumping. Study defined importance of knee-isokinetic-strength in volleyball jumping performance in female athletes. Further studies should evaluate association between ankle-isokinetic-strength and volleyball-specific jumping performances. Results reinforce the need for the cross-validation of the prediction-models in sport and exercise sciences.
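The Bland Altman comparison of observed and predicted scores used in the cross-validation above reduces to a mean difference (bias) and 95% limits of agreement. A minimal sketch of the conventional calculation follows; the function name and the jump heights are hypothetical and are not the study's data.

```python
from statistics import mean, stdev

def bland_altman(observed, predicted):
    """Return (bias, lower limit, upper limit) of agreement.

    bias   = mean of (observed - predicted)
    limits = bias -/+ 1.96 * sample standard deviation of the differences
    """
    differences = [o - p for o, p in zip(observed, predicted)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread

# Hypothetical observed vs. model-predicted scores.
bias, lower, upper = bland_altman([10, 12, 14], [9, 12, 15])
```

A bias near zero with narrow limits indicates that the regression model transfers well to the cross-validation subsample; a high correlation alone would not show a systematic offset.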
A Multimethod Multitrait Validity Assessment of Self-Construal in Japan, Korea, and the United States

ERIC Educational Resources Information Center

Bresnahan, Mary J.; Levine, Timothy R.; Shearman, Sachiyo Morinaga; Lee, Sun Young; Park, Cheong-Yi; Kiyomiya, Toru

2005-01-01

A large number of previous studies have used self-construal to predict communication outcomes. Recent evidence, however, suggests that validity problems may exist in self-construal measurement. The current study conducted a multimethod multitrait (Campbell & Fiske, 1959) validation study of self-construal measures with data (total N = 578)…
Evidence of Concurrent Validity of SII Scores for Asian American College Students

ERIC Educational Resources Information Center

Hansen, Jo-Ida C.; Lee, W. Vanessa

2007-01-01

The validity of scores on the Strong Interest Inventory (SII) for Asian American college students has not been thoroughly investigated. This study examined the evidence of validity of the SII Occupational Scale scores for predicting college major choices of Asian American women and men and White women and men. The sample included 186 female and…
Validity Evidence for the Security Scale as a Measure of Perceived Attachment Security in Adolescence

ERIC Educational Resources Information Center

Van Ryzin, Mark J.; Leve, Leslie D.

2012-01-01

In this study, the validity of a self-report measure of children's perceived attachment security (the Kerns Security Scale) was tested using adolescents. With regards to predictive validity, the Security Scale was significantly associated with (1) observed mother-adolescent interactions during conflict and (2) parent- and teacher-rated social…
Validation of a Computerized Cognitive Assessment System for Persons with Stroke: A Pilot Study

ERIC Educational Resources Information Center

Yip, Chi Kwong; Man, David W. K.

2009-01-01

This study investigates the validity of a newly developed computerized cognitive assessment system (CCAS) that is equipped with rich multimedia to generate simulated testing situations and considers both test item difficulty and the test taker's ability. It is also hypothesized that better predictive validity of the CCAS in self-care of persons…
Improved Diagnostic Validity of the ADOS Revised Algorithms: A Replication Study in an Independent Sample

ERIC Educational Resources Information Center

Oosterling, Iris; Roos, Sascha; de Bildt, Annelies; Rommelse, Nanda; de Jonge, Maretha; Visser, Janne; Lappenschaar, Martijn; Swinkels, Sophie; van der Gaag, Rutger Jan; Buitelaar, Jan

2010-01-01

Recently, Gotham et al. (2007) proposed revised algorithms for the Autism Diagnostic Observation Schedule (ADOS) with improved diagnostic validity. The aim of the current study was to replicate predictive validity, factor structure, and correlations with age and verbal and nonverbal IQ of the ADOS revised algorithms for Modules 1 and 2…
Predicting implementation from organizational readiness for change: a study protocol

PubMed Central

2011-01-01

Background There is widespread interest in measuring organizational readiness to implement evidence-based practices in clinical care. However, there are a number of challenges to validating organizational measures, including inferential bias arising from the halo effect and method bias - two threats to validity that, while well-documented by organizational scholars, are often ignored in health services research. We describe a protocol to comprehensively assess the psychometric properties of a previously developed survey, the Organizational Readiness to Change Assessment. Objectives Our objective is to conduct a comprehensive assessment of the psychometric properties of the Organizational Readiness to Change Assessment incorporating methods specifically to address threats from halo effect and method bias. Methods and Design We will conduct three sets of analyses using longitudinal, secondary data from four partner projects, each testing interventions to improve the implementation of an evidence-based clinical practice. Partner projects field the Organizational Readiness to Change Assessment at baseline (n = 208 respondents; 53 facilities), and prospectively assess the degree to which the evidence-based practice is implemented. We will assess predictive and concurrent validity using hierarchical linear modeling and multivariate regression, respectively. For predictive validity, the outcome is the change from baseline to follow-up in the use of the evidence-based practice. We will use intra-class correlations derived from hierarchical linear models to assess inter-rater reliability. Two partner projects will also field measures of job satisfaction for convergent and discriminant validity analyses, and will field Organizational Readiness to Change Assessment measures at follow-up for concurrent validity (n = 158 respondents; 33 facilities).
Convergent and discriminant validities will test associations between organizational readiness and different aspects of job satisfaction: satisfaction with leadership, which should be highly correlated with readiness, versus satisfaction with salary, which should be less correlated with readiness. Content validity will be assessed using an expert panel and modified Delphi technique. Discussion We propose a comprehensive protocol for validating a survey instrument for assessing organizational readiness to change that specifically addresses key threats of bias related to halo effect, method bias and questions of construct validity that often go unexplored in research using measures of organizational constructs. PMID:21777479
Are loss of control while eating and overeating valid constructs? A critical review of the literature

PubMed Central

Goldschmidt, Andrea B.

2017-01-01

Background Binge eating is a marker of weight gain and obesity, and a hallmark feature of eating disorders. Yet, its component constructs—overeating and loss of control (LOC) while eating—are poorly understood and difficult to measure. Objective To critically review the human literature concerning the validity of LOC and overeating across the age and weight spectrum. Data sources English-language articles addressing the face, convergent, discriminant, and predictive validity of LOC and overeating were included. Results LOC and overeating appear to have adequate face validity. Emerging evidence supports the convergent and predictive validity of the LOC construct, given its unique cross-sectional and prospective associations with numerous anthropometric, psychosocial, and eating behavior-related factors. Overeating may be best conceptualized as a marker of excess weight status. Limitations Binge eating constructs, particularly in the context of subjectively large episodes, are challenging to measure reliably. Few studies addressed overeating in the absence of LOC, thereby limiting conclusions about the validity of the overeating construct independent of LOC. Additional studies addressing the discriminant validity of both constructs are warranted. Discussion Suggestions for future weight-related research and for appropriately defining binge eating in the eating disorders diagnostic scheme are presented. PMID:28165655
Reliability and validity of a treatment adherence measure for child psychiatric rehabilitation.

PubMed

Williams, Nathaniel J; Green, Philip

2012-09-01

Treatment adherence, defined as the degree to which practitioners implemented prescribed program principles and activities and avoided proscribed activities, has been an area of growing interest in mental health services for children with severe emotional and behavioral disorders. This study evaluated the reliability and validity of a treatment adherence measure for child psychiatric rehabilitation (CPSR). Parents of children receiving CPSR (n = 79) or psychotherapy (n = 27) completed the Children's Psychosocial Rehabilitation Treatment Adherence Measure (CTAM) and a measure of 2-week session impact. Psychiatric rehabilitation (PSR) supervisors identified PSR practitioners with reputations for high or low adherence to the model. The CTAM's discriminant validity was assessed by using known-groups procedures and predictive validity by examining its relationship to 2-week session impact. The CTAM demonstrated excellent internal consistency (α = .92), discriminant validity (p = .002, d = .72; p = .021, d = .59), and predictive validity (B = 2.24, SE = .31, p < .001), accounting for 28% of the child-level variance in 2-week session impact. Findings suggest the CTAM is a reliable and valid measure of treatment adherence for CPSR programs with a skill-teaching focus. Providers and agencies should take steps to enhance treatment adherence because it may be an important predictor of children's short-term response to CPSR.
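The internal consistency coefficient (α) reported above is conventionally Cronbach's alpha. The sketch below shows the standard formula, assuming item scores are arranged as one list per item across the same respondents; the function name and the scores are hypothetical and unrelated to the CTAM data.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: one list per item, each holding that item's scores
    for the same respondents in the same order.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(item_scores)
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1.0 - item_variance_sum / variance(totals))

# Hypothetical scale: three items answered identically by four respondents,
# i.e. perfectly consistent items.
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
```

Alpha approaches 1 as items covary strongly relative to their individual variances, which is why it is read as evidence that the items measure a common construct.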
Rational selection of training and test sets for the development of validated QSAR models

NASA Astrophysics Data System (ADS)

Golbraikh, Alexander; Shen, Min; Xiao, Zhiyan; Xiao, Yun-De; Lee, Kuo-Hsiung; Tropsha, Alexander

2003-02-01

Quantitative Structure-Activity Relationship (QSAR) models are used increasingly to screen chemical databases and/or virtual chemical libraries for potentially bioactive molecules. These developments emphasize the importance of rigorous model validation to ensure that the models have acceptable predictive power. Using k nearest neighbors (kNN) variable selection QSAR method for the analysis of several datasets, we have demonstrated recently that the widely accepted leave-one-out (LOO) cross-validated R2 (q2) is an inadequate characteristic to assess the predictive ability of the models [Golbraikh, A., Tropsha, A. Beware of q2! J. Mol. Graphics Mod. 20, 269-276, (2002)]. Herein, we provide additional evidence that there exists no correlation between the values of q2 for the training set and accuracy of prediction (R2) for the test set and argue that this observation is a general property of any QSAR model developed with LOO cross-validation. We suggest that external validation using rationally selected training and test sets provides a means to establish a reliable QSAR model. We propose several approaches to the division of experimental datasets into training and test sets and apply them in QSAR studies of 48 functionalized amino acid anticonvulsants and a series of 157 epipodophyllotoxin derivatives with antitumor activity. We formulate a set of general criteria for the evaluation of predictive power of QSAR models.
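The distinction drawn above between leave-one-out q2 on the training set and R2 on an external test set can be made concrete with a small sketch. It substitutes a one-descriptor least-squares line for the authors' kNN variable selection method, and it assumes one common definition of each statistic (1 minus a residual sum of squares over a total sum of squares); all names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_q2(xs, ys):
    """Leave-one-out cross-validated R2: refit without each compound,
    predict it, and compare the summed squared errors (PRESS) with the
    total sum of squares about the training mean."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / sum((y - mean_y) ** 2 for y in ys)

def external_r2(train_xs, train_ys, test_xs, test_ys):
    """R2 of a model fitted on the training set and scored on compounds
    it never saw (total sum of squares about the test-set mean)."""
    slope, intercept = fit_line(train_xs, train_ys)
    mean_test = sum(test_ys) / len(test_ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(test_xs, test_ys))
    return 1.0 - ss_res / sum((y - mean_test) ** 2 for y in test_ys)

# Hypothetical, noise-free data: both statistics are 1 here, but on real
# data a high q2 does not guarantee a high external R2.
train_x, train_y = [0.0, 1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 5.0, 7.0, 9.0]
q2 = loo_q2(train_x, train_y)
r2_ext = external_r2(train_x, train_y, [5.0, 6.0], [11.0, 13.0])
```

Keeping the two functions separate mirrors the abstract's argument: q2 only ever sees training compounds, whereas the external statistic is computed on a test set held out before any model fitting.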
Job Embeddedness Demonstrates Incremental Validity When Predicting Turnover Intentions for Australian University Employees

PubMed Central

Heritage, Brody; Gilbert, Jessica M.; Roberts, Lynne D.

2016-01-01

Job embeddedness is a construct that describes the manner in which employees can be enmeshed in their jobs, reducing their turnover intentions. Recent questions regarding the properties of quantitative job embeddedness measures, and their predictive utility, have been raised. Our study compared two competing reflective measures of job embeddedness, examining their convergent, criterion, and incremental validity, as a means of addressing these questions. Cross-sectional quantitative data from 246 Australian university employees (146 academic; 100 professional) was gathered. Our findings indicated that the two compared measures of job embeddedness were convergent when total scale scores were examined. Additionally, job embeddedness was capable of demonstrating criterion and incremental validity, predicting unique variance in turnover intention. However, this finding was not readily apparent with one of the compared job embeddedness measures, which demonstrated comparatively weaker evidence of validity. We discuss the theoretical and applied implications of these findings, noting that job embeddedness has a complementary place among established determinants of turnover intention. PMID:27199817
The Structured Assessment of Violence Risk in Adults with Intellectual Disability: A Systematic Review.

PubMed

Hounsome, J; Whittington, R; Brown, A; Greenhill, B; McGuire, J

2018-01-01

While structured professional judgement approaches to assessing and managing the risk of violence have been extensively examined in mental health/forensic settings, the application of the findings to people with an intellectual disability is less extensively researched and reviewed. This review aimed to assess whether risk assessment tools have adequate predictive validity for violence in adults with an intellectual disability. Standard systematic review methodology was used to identify and synthesize appropriate studies. A total of 14 studies were identified as meeting the inclusion criteria. These studies assessed the predictive validity of 18 different risk assessment tools, mainly in forensic settings. All studies concluded that the tools assessed were successful in predicting violence. Studies were generally of a high quality. There is good quality evidence that risk assessment tools are valid for people with intellectual disability who offend but further research is required to validate tools for use with people with intellectual disability who offend. © 2016 John Wiley & Sons Ltd.
Two-Speed Gearbox Dynamic Simulation Predictions and Test Validation

NASA Technical Reports Server (NTRS)

Lewicki, David G.; DeSmidt, Hans; Smith, Edward C.; Bauman, Steven W.

2010-01-01

Dynamic simulations and experimental validation tests were performed on a two-stage, two-speed gearbox as part of the drive system research activities of the NASA Fundamental Aeronautics Subsonics Rotary Wing Project. The gearbox was driven by two electromagnetic motors and had two electromagnetic, multi-disk clutches to control output speed. A dynamic model of the system was created which included a direct current electric motor with proportional-integral-derivative (PID) speed control, a two-speed gearbox with dual electromagnetically actuated clutches, and an eddy current dynamometer. A six degree-of-freedom model of the gearbox accounted for the system torsional dynamics and included gear, clutch, shaft, and load inertias as well as shaft flexibilities and a dry clutch stick-slip friction model. Experimental validation tests were performed on the gearbox in the NASA Glenn gear noise test facility. Gearbox output speed and torque as well as drive motor speed and current were compared to those from the analytical predictions. The experiments correlate very well with the predictions, thus validating the dynamic simulation methodologies.
Two-Tiered Violence Risk Estimates: a validation study of an integrated-actuarial risk assessment instrument.

PubMed

Mills, Jeremy F; Gray, Andrew L

2013-11-01

This study is an initial validation study of the Two-Tiered Violence Risk Estimates instrument (TTV), a violence risk appraisal instrument designed to support an integrated-actuarial approach to violence risk assessment. The TTV was scored retrospectively from file information on a sample of violent offenders. Construct validity was examined by comparing the TTV with instruments that have shown utility to predict violence that were prospectively scored: The Historical-Clinical-Risk Management-20 (HCR-20) and Lifestyle Criminality Screening Form (LCSF). Predictive validity was examined through a long-term follow-up of 12.4 years with a sample of 78 incarcerated offenders. Results show the TTV to be highly correlated with the HCR-20 and LCSF. The base rate for violence over the follow-up period was 47.4%, and the TTV was equally predictive of violent recidivism relative to the HCR-20 and LCSF. Discussion centers on the advantages of an integrated-actuarial approach to the assessment of violence risk.
Development and validation of a preoperative prediction model for colorectal cancer T-staging based on MDCT images and clinical information.

PubMed

Sa, Sha; Li, Jing; Li, Xiaodong; Li, Yongrui; Liu, Xiaoming; Wang, Defeng; Zhang, Huimao; Fu, Yu

2017-08-15

This study aimed to establish and evaluate the efficacy of a prediction model for colorectal cancer T-staging. T-staging was positively correlated with the level of carcinoembryonic antigen (CEA), expression of carbohydrate antigen 19-9 (CA19-9), wall deformity, blurred outer edges, fat infiltration, infiltration into the surrounding tissue, tumor size and wall thickness. Age, location, enhancement rate and enhancement homogeneity were negatively correlated with T-staging. The predictive results of the model were consistent with the pathological gold standard, and the kappa value was 0.805. The total accuracy of staging improved from 51.04% to 86.98% with the proposed model. The clinical, imaging and pathological data of 611 patients with colorectal cancer (419 patients in the training group and 192 patients in the validation group) were collected. A Spearman correlation analysis was used to validate the relationship among these factors and pathological T-staging. A prediction model was trained with the random forest algorithm. T-staging of the patients in the validation group was predicted by both the prediction model and the traditional method. The consistency, accuracy, sensitivity, specificity and area under the curve (AUC) were used to compare the efficacy of the two methods. The newly established comprehensive model can improve the predictive efficiency of preoperative colorectal cancer T-staging.
Calibration power of the Braden scale in predicting pressure ulcer development.

PubMed

Chen, Hong-Lin; Cao, Ying-Juan; Wang, Jing; Huai, Bao-Sha

2016-11-02

Calibration is the degree of correspondence between the estimated probability produced by a model and the actual observed probability. The aim of this study was to investigate the calibration power of the Braden scale in predicting pressure ulcer (PU) development. A retrospective analysis was performed among consecutive patients in 2013. The patients were separated into a training group and a validation group. The predicted incidence was calculated using a logistic regression model in the training group and the Hosmer-Lemeshow test was used for assessing the goodness of fit. In the validation cohort, the observed and the predicted incidence were compared by the Chi-square (χ2) goodness of fit test for calibration power. We included 2585 patients in the study; of these, 78 patients (3.0%) developed a PU. Differences in patient characteristics between the training and validation groups were non-significant (p>0.05). In the training group, the logistic regression model for predicting pressure ulcer was Logit(P) = -0.433*Braden score+2.616. The Hosmer-Lemeshow test showed poor goodness of fit (χ2=13.472; p=0.019). In the validation group, the predicted pressure ulcer incidence also did not fit well with the observed incidence (χ2=42.154, p=0.000 by Braden scores; and χ2=17.223, p=0.001 by Braden scale risk classification). The Braden scale has low calibration power in predicting PU formation.
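As a worked illustration of the logistic model quoted in the abstract, Logit(P) = -0.433*Braden score+2.616 can be converted to a predicted probability with the inverse logit. The sketch below is for illustration only; the function name is hypothetical, and the abstract itself reports that this model was poorly calibrated, so the outputs should not be read as usable risk estimates.

```python
import math


def braden_pu_probability(braden_score):
    """Predicted PU probability from the abstract's training-group model.

    Coefficients (-0.433, 2.616) are copied from the abstract; the function
    name and its use here are illustrative assumptions.
    """
    logit = -0.433 * braden_score + 2.616
    return 1.0 / (1.0 + math.exp(-logit))


# Lower Braden scores (higher risk) give higher predicted probabilities.
for score in (6, 12, 18, 23):
    print(score, round(braden_pu_probability(score), 4))
```

Calibration, as defined in the abstract, asks whether probabilities like these match the incidence actually observed in each score band.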

Assessing youth who sexually offended: the predictive validity of the ERASOR, J-SOAP-II, and YLS/CMI in a non-Western context.

PubMed

Chu, Chi Meng; Ng, Kynaston; Fong, June; Teoh, Jennifer

2012-04-01

Recent research suggested that the predictive validity of adult sexual offender risk assessment measures can be affected when used cross-culturally, but there is no published study on the predictive validity of risk assessment measures for youth who sexually offended in a non-Western context. This study compared the predictive validity of three youth risk assessment measures (i.e., the Estimate of Risk of Adolescent Sexual Offense Recidivism [ERASOR], the Juvenile Sex Offender Assessment Protocol-II [J-SOAP-II], and the Youth Level of Service/Case Management Inventory [YLS/CMI]) for sexual and nonviolent recidivism in a sample of 104 male youth who sexually offended within a Singaporean context (M (follow-up) = 1,637 days; SD (follow-up) = 491). Results showed that the ERASOR overall clinical rating and total score significantly predicted sexual recidivism but only the former significantly predicted time to sexual reoffense. All of the measures (i.e., the ERASOR overall clinical rating and total score, the J-SOAP-II total score, as well as the YLS/CMI) significantly predicted nonsexual recidivism and time to nonsexual reoffense for this sample of youth who sexually offended. Overall, the results suggest that the ERASOR appears to be suited for assessing youth who sexually offended in a non-Western context, but the J-SOAP-II and the YLS/CMI have limited utility for such a purpose.
Risk prediction model: Statistical and artificial neural network approach

NASA Astrophysics Data System (ADS)

Paiman, Nuur Azreen; Hariri, Azian; Masood, Ibrahim

2017-04-01

Prediction models are increasingly gaining popularity and have been used in numerous areas of study to complement and support clinical reasoning and decision making. The adoption of such models assists physicians' decision making and individuals' behavior, and consequently improves individual outcomes and the cost-effectiveness of care. The objective of this paper is to review articles related to risk prediction models in order to understand suitable approaches to, and the development and validation process of, risk prediction models. A qualitative review was done of the aims, methods and significant main outcomes of nineteen published articles that developed risk prediction models in numerous fields. This paper also reviewed how researchers develop and validate risk prediction models based on statistical and artificial neural network approaches. From the review, some methodological recommendations for developing and validating prediction models are highlighted. According to the studies reviewed, the artificial neural network approach to developing prediction models was more accurate than the statistical approach. However, only limited published literature currently discusses which approach is more accurate for risk prediction model development.
Evaluation of the Predictive Validity of Thermography in Identifying Extravasation With Intravenous Chemotherapy Infusions.

PubMed

Matsui, Yuko; Murayama, Ryoko; Tanabe, Hidenori; Oe, Makoto; Motoo, Yoshiharu; Wagatsuma, Takanori; Michibuchi, Michiko; Kinoshita, Sachiko; Sakai, Keiko; Konya, Chizuko; Sugama, Junko; Sanada, Hiromi

Early detection of extravasation is important, but conventional methods of detection lack objectivity and reliability. This study evaluated the predictive validity of thermography for identifying extravasation during intravenous antineoplastic therapy. Of 257 patients who received chemotherapy through peripheral veins, extravasation was identified in 26. Thermography was performed every 15 to 30 minutes during the infusions. Sensitivity, specificity, positive predictive value, and negative predictive value using thermography were 84.6%, 94.8%, 64.7%, and 98.2%, respectively. This study showed that thermography offers an accurate prediction of extravasation.
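The four reported percentages follow from a 2x2 table of thermography result against confirmed extravasation. The abstract gives only the totals (257 patients, 26 with extravasation), so the cell counts below are an inference: they are the integer counts that reproduce all four reported figures, not numbers taken from the paper. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard 2x2 diagnostic accuracy measures."""
    return {
        "sensitivity": tp / (tp + fn),  # detected among true extravasations
        "specificity": tn / (tn + fp),  # cleared among non-extravasations
        "ppv": tp / (tp + fp),          # true among positive thermography
        "npv": tn / (tn + fn),          # true among negative thermography
    }


# Inferred counts (assumption): 22 + 4 = 26 extravasations, 12 + 219 = 231 without.
metrics = diagnostic_metrics(tp=22, fn=4, fp=12, tn=219)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
# sensitivity: 84.6%
# specificity: 94.8%
# ppv: 64.7%
# npv: 98.2%
```

The gap between the high negative predictive value and the moderate positive predictive value reflects the low prevalence (26 of 257), which is worth keeping in mind when reading the abstract's conclusion.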
Development and Validation of a Measure of Quality of Life for the Young Elderly in Sri Lanka.

PubMed

de Silva, Sudirikku Hennadige Padmal; Jayasuriya, Anura Rohan; Rajapaksa, Lalini Chandika; de Silva, Ambepitiyawaduge Pubudu; Barraclough, Simon

2016-01-01

Sri Lanka has one of the fastest aging populations in the world. Measurement of quality of life (QoL) in the elderly needs instruments developed that encompass the sociocultural settings. An instrument was developed to measure QoL in the young elderly in Sri Lanka (QLI-YES), using accepted methods to generate and reduce items. The measure was validated using a community sample. Construct, criterion and predictive validity and reliability were tested. A first-order model of 24 items with 6 domains was found to have good fit indices (CMIN/df = 1.567, RMR = 0.05, CFI = 0.95, and RMSEA = 0.053). Both criterion and predictive validity were demonstrated. Good internal consistency reliability (Cronbach's α = 0.93) was shown. The development of the QLI-YES using a societal perspective relevant to the social and cultural beliefs has resulted in a robust and valid instrument to measure QoL for the young elderly in Sri Lanka. © 2015 APJPH.
Further Validation of a CFD Code for Calculating the Performance of Two-Stage Light Gas Guns

NASA Technical Reports Server (NTRS)

Bogdanoff, David W.

2017-01-01

Earlier validations of a higher-order Godunov code for modeling the performance of two-stage light gas guns are reviewed. These validation comparisons were made between code predictions and experimental data from the NASA Ames 1.5" and 0.28" guns and covered muzzle velocities of 6.5 to 7.2 km/s. In the present report, five more series of code validation comparisons involving experimental data from the Ames 0.22" (1.28" pump tube diameter), 0.28", 0.50", 1.00" and 1.50" guns are presented. The total muzzle velocity range of the validation data presented herein is 3 to 11.3 km/s. The agreement between the experimental data and CFD results is judged to be very good. Muzzle velocities were predicted within 0.35 km/s for 74% of the cases studied with maximum differences being 0.5 km/s and for 4 out of 50 cases, 0.5 - 0.7 km/s.
Validating Quantitative Measurement Using Qualitative Data: Combining Rasch Scaling and Latent Semantic Analysis in Psychiatry

NASA Astrophysics Data System (ADS)

Lange, Rense

2015-02-01

An extension of concurrent validity is proposed that uses qualitative data for the purpose of validating quantitative measures. The approach relies on Latent Semantic Analysis (LSA) which places verbal (written) statements in a high dimensional semantic space. Using data from a medical / psychiatric domain as a case study - Near Death Experiences, or NDE - we established concurrent validity by connecting NDErs' qualitative (written) experiential accounts with their locations on a Rasch scalable measure of NDE intensity. Concurrent validity received strong empirical support since the variance in the Rasch measures could be predicted reliably from the coordinates of their accounts in the LSA derived semantic space (R2 = 0.33). These coordinates also predicted NDErs' age with considerable precision (R2 = 0.25). Both estimates are probably artificially low due to the small available data samples (n = 588). It appears that Rasch scalability of NDE intensity is a prerequisite for these findings, as each intensity level is associated (at least probabilistically) with a well-defined pattern of item endorsements.
The validity of the Health-Relevant Personality Inventory (HP5i) and the Junior Temperament and Character Inventory (JTCI) among adolescents referred for a substance misuse problem.

PubMed

Hemphälä, Malin; Gustavsson, J Petter; Tengström, Anders

2013-01-01

The aim was to study the validity of 2 personality instruments, the Health-Relevant Personality Inventory (HP5i) and the Junior Temperament and Character Inventory (JTCI), among adolescents with a substance use problem. Clinical interviews were completed with 180 adolescents and followed up after 12 months. Discriminant validity was demonstrated in the lack of correlation to intelligence in both instruments' scales. Two findings were in support of convergent validity: Negative affectivity (HP5i) and harm avoidance (JTCI) were correlated to internalizing symptoms, and impulsivity (HP5i) and novelty seeking (JTCI) were correlated to externalizing symptoms. The predictive validity of JTCI was partly supported. When psychiatric symptoms at baseline were controlled for, cooperativeness predicted conduct disorder after 12 months. Summarizing, both instruments can be used in adolescent clinical samples to tailor treatment efforts, although some scales need further investigation. It is important to include personality assessment when evaluating psychiatric problems in adolescents.
Development and Validation of a Measure of Quality of Life for the Young Elderly in Sri Lanka

PubMed Central

de Silva, Sudirikku Hennadige Padmal; Jayasuriya, Anura Rohan; Rajapaksa, Lalini Chandika; de Silva, Ambepitiyawaduge Pubudu; Barraclough, Simon

2016-01-01

Sri Lanka has one of the fastest aging populations in the world. Measurement of quality of life (QoL) in the elderly needs instruments developed that encompass the sociocultural settings. An instrument was developed to measure QoL in the young elderly in Sri Lanka (QLI-YES), using accepted methods to generate and reduce items. The measure was validated using a community sample. Construct, criterion and predictive validity and reliability were tested. A first-order model of 24 items with 6 domains was found to have good fit indices (CMIN/df = 1.567, RMR = 0.05, CFI = 0.95, and RMSEA = 0.053). Both criterion and predictive validity were demonstrated. Good internal consistency reliability (Cronbach’s α = 0.93) was shown. The development of the QLI-YES using a societal perspective relevant to the social and cultural beliefs has resulted in a robust and valid instrument to measure QoL for the young elderly in Sri Lanka. PMID:26712893
Validation of the Korean Version of the Mini-Sleep Questionnaire-Insomnia in Korean College Students.

PubMed

Kim, Hee-Ju

2017-03-01

This study aimed to evaluate the reliability and validity of the Korean version of the Mini-Sleep Questionnaire-Insomnia in Korean college students. A total of 470 students from six nursing colleges in South Korea participated in the study. The translation and linguistic validation of the Mini-Sleep Questionnaire-Insomnia was performed based on guidelines. The Pittsburgh Sleep Quality Index and the Perceived Stress Scale were used to validate the measure. Cronbach α, item-total correlation for internal consistency reliability and intraclass correlation coefficient for test-retest reliability were evaluated. Exploratory factor analysis for construct validity, Pearson's correlation with the Pittsburgh Sleep Quality Index and the Perceived Stress Scale for concurrent validity, and the receiver operating characteristic curve for predictive validity were assessed. The 4-item Mini-Sleep Questionnaire-Insomnia had a Cronbach α of .69 and the item-total correlations were higher than .30. Cronbach α increased to .73 if the item assessing the use of sleeping pills and tranquilizers was deleted. This item had marked skewness and kurtosis issues. Factor analysis indicated unidimensionality, explaining 53.0% of the total variance. The measure showed high test-retest reliability (i.e., intraclass correlation coefficient = .84), acceptable concurrent validity (r with the Pittsburgh Sleep Quality Index = .69; r with the Perceived Stress Scale = .31) and predictive validity [area under curve = .85; 95% confidence interval (0.81, 0.90)]. The Mini-Sleep Questionnaire-Insomnia showed acceptable reliability and validity. Yet, the limited distribution in sleep medications warrants further evaluations in the clinical population. Copyright © 2017. Published by Elsevier B.V.
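The internal-consistency figure quoted in the abstract, Cronbach α, is computed from the item variances and the variance of the summed score. The sketch below uses the usual formula α = k/(k-1) * (1 - Σ item variances / variance of totals); the function name and the small data set are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of respondent scores."""
    k = len(item_scores)
    # Total score per respondent: sum across items at the same position.
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1.0 - item_var_sum / variance(totals))


# Hypothetical responses: 4 items (rows) answered by 6 respondents (columns).
items = [
    [1, 2, 3, 4, 5, 3],
    [2, 2, 3, 5, 4, 3],
    [1, 3, 3, 4, 5, 2],
    [1, 1, 2, 1, 2, 1],  # a low-variance item, loosely analogous to a rarely endorsed one
]
print(round(cronbach_alpha(items), 3))
print(round(cronbach_alpha(items[:3]), 3))  # alpha with the last item dropped
```

Comparing α with and without one item mirrors the "alpha if item deleted" check the abstract describes for the sleeping-pill item.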
Development, Testing, and Validation of a Model-Based Tool to Predict Operator Responses in Unexpected Workload Transitions

NASA Technical Reports Server (NTRS)

Sebok, Angelia; Wickens, Christopher; Sargent, Robert

2015-01-01

One human factors challenge is predicting operator performance in novel situations. Approaches such as drawing on relevant previous experience, and developing computational models to predict operator performance in complex situations, offer potential methods to address this challenge. A few concerns with modeling operator performance are that models need to be realistic, and they need to be tested empirically and validated. In addition, many existing human performance modeling tools are complex and require that an analyst gain significant experience to be able to develop models for meaningful data collection. This paper describes an effort to address these challenges by developing an easy to use model-based tool, using models that were developed from a review of existing human performance literature and targeted experimental studies, and performing an empirical validation of key model predictions.
Three DIBELS Tasks vs. Three Informal Reading/Spelling Tasks: A Comparison of Predictive Validity

ERIC Educational Resources Information Center

Morris, Darrell; Trathen, Woodrow; Perney, Jan; Gill, Tom; Schlagal, Robert; Ward, Devery; Frye, Elizabeth M.

2017-01-01

Within a developmental framework, this study compared the predictive validity of three DIBELS tasks (phoneme segmentation fluency [PSF], nonsense word fluency [NWF], and oral reading fluency [ORF]) with that of three alternative tasks drawn from the field of reading (phonemic spelling [phSPEL], word recognition-timed [WR-t], and graded passage…
A Parsimonious Instrument for Predicting Students' Intent to Pursue a Sales Career: Scale Development and Validation

ERIC Educational Resources Information Center

Peltier, James W.; Cummins, Shannon; Pomirleanu, Nadia; Cross, James; Simon, Rob

2014-01-01

Students' desire and intention to pursue a career in sales continue to lag behind industry demand for sales professionals. This article develops and validates a reliable and parsimonious scale for measuring and predicting student intention to pursue a selling career. The instrument advances previous scales in three ways. The instrument is…
The Study Skills Questionnaire (SSQUES): Preliminary Validation of a Measure for Assessing Students' Perceived Areas of Weakness.

ERIC Educational Resources Information Center

McCombs, Barbara L.; Dobrovolny, Jacqueline L.

The potential reliability and construct and predictive validity of a 30-item Study Skills Questionnaire (SSQUES) was evaluated for its ability to: (1) predict student performance in a self-paced, individualized, or computer-managed instructional environment, and (2) identify students needing some type of study skills remediation. The study was…
Predictive Validity and Impact of CAEP Standard 3.2: Results from One Master's-Level Teacher Preparation Program

ERIC Educational Resources Information Center

Evans, Carla M.

2017-01-01

This study investigates the predictive validity and policy impact of Council for Accreditation of Educator Preparation minimum admission requirements in Standard 3.2 on teacher preparation programs (TPPs), their applicants, and the broader field of educator preparation. Undergraduate grade point average (GPA) and Graduate Record Examination (GRE)…
Validation of a New Skinfold Prediction Equation Based on Dual-Energy X-Ray Absorptiometry

ERIC Educational Resources Information Center

Ball, Stephen; Cowan, Celsi; Thyfault, John; LaFontaine, Tom

2014-01-01

Skinfold prediction equations recommended by the American College of Sports Medicine underestimate body fat percentage. The purpose of this research was to validate an alternative equation for men created from dual energy x-ray absorptiometry. Two hundred ninety-seven males, aged 18-65, completed a skinfold assessment and dual energy x-ray…
The Predictive Validity of a Computer-Adaptive Assessment of Kindergarten and First-Grade Reading Skills

ERIC Educational Resources Information Center

Clemens, Nathan H.; Hagan-Burke, Shanna; Luo, Wen; Cerda, Carissa; Blakely, Alane; Frosch, Jennifer; Gamez-Patience, Brenda; Jones, Meredith

2015-01-01

This study examined the predictive validity of a computer-adaptive assessment for measuring kindergarten reading skills using the STAR Early Literacy (SEL) test. The findings showed that the results of SEL assessments administered during the fall, winter, and spring of kindergarten were moderate and statistically significant predictors of year-end…
Predictive modeling of infrared radiative heating in tomato dry-peeling process: Part II. Model validation and sensitivity analysis

USDA-ARS's Scientific Manuscript database

A predictive mathematical model was developed to simulate heat transfer in a tomato undergoing double sided infrared (IR) heating in a dry-peeling process. The aims of this study were to validate the developed model using experimental data and to investigate different engineering parameters that mos...
Predicting Curriculum and Test Performance at Age 7 Years from Pupil Background, Baseline Skills and Phonological Awareness at Age 5

ERIC Educational Resources Information Center

Savage, R.; Carless, S.

2004-01-01

Background: Phonological awareness tests are known to be amongst the best predictors of literacy; however their predictive validity alongside current school screening practice (baseline assessment, pupil background data) and to National Curricular outcome measures is unknown. Aim: We explored the validity of phonological awareness and orthographic…
Predictive and Treatment Validity of Life Satisfaction and the Quality of Life Inventory

ERIC Educational Resources Information Center

Frisch, Michael B.; Clark, Michelle P.; Rouse, Steven V.; Rudd, M. David; Paweleck, Jennifer K.; Greenstone, Andrew; Kopplin, David A.

2005-01-01

The clinical and positive psychology usefulness of quality of life, well-being, and life satisfaction assessments depends on their ability to predict important outcomes and to detect intervention-related change. These issues were explored in the context of a program of instrument validation for the Quality of Life Inventory (QOLI) involving 3,927…
Measuring Life Stress: A Comparison of the Predictive Validity of Different Scoring Systems for the Social Readjustment Rating Scale.

ERIC Educational Resources Information Center

McGrath, Robert E. V.; Burkhart, Barry R.

1983-01-01

Assessed whether accounting for variables in the scoring of the Social Readjustment Rating Scale (SRRS) would improve the predictive validity of the inventory. Results from 107 sets of questionnaires showed that income and level of education are significant predictors of the capacity to cope with stress. (JAC)

On the Validity of Validity Scales: The Importance of Defensive Responding in the Prediction of Institutional Misconduct

ERIC Educational Resources Information Center

Edens, John F.; Ruiz, Mark A.

2006-01-01

This study examined the effects of defensive responding on the prediction of institutional misconduct among male inmates (N = 349) who completed the Personality Assessment Inventory (L. C. Morey, 1991). Hierarchical logistic regression analyses demonstrated significant main effects for the Antisocial Features (ANT) scale as well as main effects…
Predictive Validity of a National Examination for Medical Graduates in the People's Republic of China.

ERIC Educational Resources Information Center

Bingxiun, Liu; And Others

1990-01-01

To estimate the predictive validity of the Chinese National Medical Examination, scores of a sample (n=1,717) of participating examinees were compared with program directors' ratings on nine aspects of clinical competence. Test scores were consistent with competence measures and overall, correlated significantly with ratings, while varying for…
Validity of the Optometry Admission Test in Predicting Performance in Schools and Colleges of Optometry.

ERIC Educational Resources Information Center

Kramer, Gene A.; Johnston, JoElle

1997-01-01

A study examined the relationship between Optometry Admission Test scores and pre-optometry or undergraduate grade point average (GPA) with first and second year performance in optometry schools. The test's predictive validity was limited but significant, and comparable to those reported for other admission tests. In addition, the scores…
Validating Inertial Confinement Fusion (ICF) predictive capability using perturbed capsules

NASA Astrophysics Data System (ADS)

Schmitt, Mark; Magelssen, Glenn; Tregillis, Ian; Hsu, Scott; Bradley, Paul; Dodd, Evan; Cobble, James; Flippo, Kirk; Offerman, Dustin; Obrey, Kimberly; Wang, Yi-Ming; Watt, Robert; Wilke, Mark; Wysocki, Frederick; Batha, Steven

2009-11-01

Achieving ignition on NIF is a monumental step on the path toward utilizing fusion as a controlled energy source. Obtaining robust ignition requires accurate ICF models to predict the degradation of ignition caused by heterogeneities in capsule construction and irradiation. LANL has embarked on a project to induce controlled defects in capsules to validate our ability to predict their effects on fusion burn. These efforts include the validation of feature-driven hydrodynamics and mix in a convergent geometry. This capability is needed to determine the performance of capsules imploded under less-than-optimum conditions on future IFE facilities. LANL's recently initiated Defect Implosion Experiments (DIME) conducted at Rochester's Omega facility are providing input for these efforts. Recent simulation and experimental results will be shown.
Simulation for Prediction of Entry Article Demise (SPEAD): An Analysis Tool for Spacecraft Safety Analysis and Ascent/Reentry Risk Assessment

NASA Technical Reports Server (NTRS)

Ling, Lisa

2014-01-01

For the purpose of performing safety analysis and risk assessment for a potential off-nominal atmospheric reentry resulting in vehicle breakup, a synthesis of trajectory propagation coupled with thermal analysis and the evaluation of node failure is required to predict the sequence of events, the timeline, and the progressive demise of spacecraft components. To provide this capability, the Simulation for Prediction of Entry Article Demise (SPEAD) analysis tool was developed. The software and methodology have been validated against actual flights, telemetry data, and validated software, and safety/risk analyses were performed for various programs using SPEAD. This report discusses the capabilities, modeling, validation, and application of the SPEAD analysis tool.
NNvPDB: Neural Network based Protein Secondary Structure Prediction with PDB Validation.

PubMed

Sakthivel, Seethalakshmi; S K M, Habeeb

2015-01-01

The predicted secondary structural states are not cross validated by any of the existing servers. Hence, information on the level of accuracy for every sequence is not reported by the existing servers. This was overcome by NNvPDB, which not only reported greater Q3 but also validates every prediction with the homologous PDB entries. NNvPDB is based on the concept of Neural Network, with a new and different approach of training the network every time with five PDB structures that are similar to query sequence. The average accuracy for helix is 76%, beta sheet is 71% and overall (helix, sheet and coil) is 66%. http://bit.srmuniv.ac.in/cgi-bin/bit/cfpdb/nnsecstruct.pl.
Prediction of individual milk proteins including free amino acids in bovine milk using mid-infrared spectroscopy and their correlations with milk processing characteristics.

PubMed

McDermott, A; Visentin, G; De Marchi, M; Berry, D P; Fenelon, M A; O'Connor, P M; Kenny, O A; McParland, S

2016-04-01

The aim of this study was to evaluate the effectiveness of mid-infrared spectroscopy in predicting milk protein and free amino acid (FAA) composition in bovine milk. Milk samples were collected from 7 Irish research herds and represented cows from a range of breeds, parities, and stages of lactation. Mid-infrared spectral data in the range of 900 to 5,000 cm⁻¹ were available for 730 milk samples; gold standard methods were used to quantify individual protein fractions and FAA of these samples with a view to predicting these gold standard protein fractions and FAA levels with available mid-infrared spectroscopy data. Separate prediction equations were developed for each trait using partial least squares regression; accuracy of prediction was assessed using both cross validation on a calibration data set (n=400 to 591 samples) and external validation on an independent data set (n=143 to 294 samples). The accuracy of prediction in external validation was the same irrespective of whether undertaken on the entire external validation data set or just within the Holstein-Friesian breed. The strongest coefficient of correlation obtained for protein fractions in external validation was 0.74, 0.69, and 0.67 for total casein, total β-lactoglobulin, and β-casein, respectively. Total proteins (i.e., total casein, total whey, and total lactoglobulin) were predicted with greater accuracy than their respective component traits; prediction accuracy using the infrared spectrum was superior to prediction using just milk protein concentration. Weak to moderate prediction accuracies were observed for FAA. The greatest coefficient of correlation in both cross validation and external validation was for Gly (0.75), indicating a moderate accuracy of prediction. Overall, the FAA prediction models overpredicted the gold standard values.
Near-unity correlations existed between total casein and β-casein irrespective of whether the traits were based on the gold standard (0.92) or mid-infrared spectroscopy predictions (0.95). Weaker correlations among FAA were observed than the correlations among the protein fractions. Pearson correlations between gold standard protein fractions and the milk processing characteristics of rennet coagulation time, curd firming time, curd firmness, heat coagulating time, pH, and casein micelle size were weak to moderate and ranged from -0.48 (protein and pH) to 0.50 (total casein and a30). Pearson correlations between gold standard FAA and these milk processing characteristics were also weak to moderate and ranged from -0.60 (Val and pH) to 0.49 (Val and K20). Results from this study indicate that mid-infrared spectroscopy has the potential to predict protein fractions and some FAA in milk at a population level. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
The Amsterdam wrist rules: the multicenter prospective derivation and external validation of a clinical decision rule for the use of radiography in acute wrist trauma.

PubMed

Walenkamp, Monique M J; Bentohami, Abdelali; Slaar, Annelie; Beerekamp, M Suzan H; Maas, Mario; Jager, L Cara; Sosef, Nico L; van Velde, Romuald; Ultee, Jan M; Steyerberg, Ewout W; Goslings, J Carel; Schep, Niels W L

2015-12-18

Although only 39 % of patients with wrist trauma have sustained a fracture, the majority of patients is routinely referred for radiography. The purpose of this study was to derive and externally validate a clinical decision rule that selects patients with acute wrist trauma in the Emergency Department (ED) for radiography. This multicenter prospective study consisted of three components: (1) derivation of a clinical prediction model for detecting wrist fractures in patients following wrist trauma; (2) external validation of this model; and (3) design of a clinical decision rule. The study was conducted in the EDs of five Dutch hospitals: one academic hospital (derivation cohort) and four regional hospitals (external validation cohort). We included all adult patients with acute wrist trauma. The main outcome was fracture of the wrist (distal radius, distal ulna or carpal bones) diagnosed on conventional X-rays. A total of 882 patients were analyzed; 487 in the derivation cohort and 395 in the validation cohort. We derived a clinical prediction model with eight variables: age; sex; swelling of the wrist; swelling of the anatomical snuffbox; visible deformation; distal radius tender to palpation; pain on radial deviation; and painful axial compression of the thumb. The Area Under the Curve at external validation of this model was 0.81 (95 % CI: 0.77-0.85). The sensitivity and specificity of the Amsterdam Wrist Rules (AWR) in the external validation cohort were 98 % (95 % CI: 95-99 %) and 21 % (95 % CI: 15-28 %). The negative predictive value was 90 % (95 % CI: 81-99 %). The Amsterdam Wrist Rules is a clinical prediction rule with a high sensitivity and negative predictive value for fractures of the wrist. Although external validation showed low specificity and 100 % sensitivity could not be achieved, the Amsterdam Wrist Rules can provide physicians in the Emergency Department with a useful screening tool to select patients with acute wrist trauma for radiography.
The upcoming implementation study will further reveal the impact of the Amsterdam Wrist Rules on the anticipated reduction of X-rays requested, missed fractures, Emergency Department waiting times and health care costs. This study was registered in the Dutch Trial Registry, reference number NTR2544, on October 1, 2010.
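The sensitivity, specificity and negative predictive value quoted in this record all derive from a 2x2 table of rule result against radiographic outcome. The following is a minimal Python sketch of that arithmetic; the function name and the counts are hypothetical and chosen only for illustration, since the abstract does not report the underlying table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard accuracy measures from 2x2 counts.

    tp: rule positive, fracture present    fp: rule positive, no fracture
    fn: rule negative, fracture present    tn: rule negative, no fracture
    """
    return {
        "sensitivity": tp / (tp + fn),  # fractures correctly flagged
        "specificity": tn / (tn + fp),  # non-fractures correctly cleared
        "npv": tn / (tn + fn),          # rule-negative patients without fracture
        "ppv": tp / (tp + fp),          # rule-positive patients with fracture
    }


# Hypothetical counts, NOT the study's data.
m = diagnostic_metrics(tp=49, fp=40, fn=1, tn=10)
print(m)
```

A rule tuned this way trades specificity for sensitivity: few fractures are missed, but many patients without a fracture are still sent for radiography.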
Effectiveness of Genomic Prediction of Maize Hybrid Performance in Different Breeding Populations and Environments

PubMed Central

Windhausen, Vanessa S.; Atlin, Gary N.; Hickey, John M.; Crossa, Jose; Jannink, Jean-Luc; Sorrells, Mark E.; Raman, Babu; Cairns, Jill E.; Tarekegne, Amsal; Semagn, Kassa; Beyene, Yoseph; Grudloyma, Pichet; Technow, Frank; Riedelsheimer, Christian; Melchinger, Albrecht E.

2012-01-01

Genomic prediction is expected to considerably increase genetic gains by increasing selection intensity and accelerating the breeding cycle. In this study, marker effects estimated in 255 diverse maize (Zea mays L.) hybrids were used to predict grain yield, anthesis date, and anthesis-silking interval within the diversity panel and testcross progenies of 30 F2-derived lines from each of five populations. Although up to 25% of the genetic variance could be explained by cross validation within the diversity panel, the prediction of testcross performance of F2-derived lines using marker effects estimated in the diversity panel was on average zero. Hybrids in the diversity panel could be grouped into eight breeding populations differing in mean performance. When performance was predicted separately for each breeding population on the basis of marker effects estimated in the other populations, predictive ability was low (i.e., 0.12 for grain yield). These results suggest that prediction resulted mostly from differences in mean performance of the breeding populations and less from the relationship between the training and validation sets or linkage disequilibrium with causal variants underlying the predicted traits. Potential uses for genomic prediction in maize hybrid breeding are discussed emphasizing the need of (1) a clear definition of the breeding scenario in which genomic prediction should be applied (i.e., prediction among or within populations), (2) a detailed analysis of the population structure before performing cross validation, and (3) larger training sets with strong genetic relationship to the validation set. PMID:23173094
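Predicting each breeding population from marker effects estimated in the other populations amounts to a leave-one-population-out split. The sketch below shows only the splitting step; the function name and the population labels are assumptions made for illustration, and the model fitting itself is left out.

```python
def leave_one_population_out(populations):
    """Yield (held_out, train_idx, test_idx) for each distinct population.

    populations: one label per hybrid, giving its breeding population.
    The held-out population forms the validation set; all others train.
    """
    for held_out in sorted(set(populations)):
        test_idx = [i for i, p in enumerate(populations) if p == held_out]
        train_idx = [i for i, p in enumerate(populations) if p != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical labels for five hybrids from three populations.
pops = ["A", "A", "B", "B", "C"]
folds = list(leave_one_population_out(pops))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Because no individual from the held-out population contributes to training, differences in population means cannot inflate the predictive ability measured inside that fold, unlike random cross validation across the pooled panel.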
Simplified Mortality Score for the Intensive Care Unit (SMS-ICU): protocol for the development and validation of a bedside clinical prediction rule.

PubMed

Granholm, Anders; Perner, Anders; Krag, Mette; Hjortrup, Peter Buhl; Haase, Nicolai; Holst, Lars Broksø; Marker, Søren; Collet, Marie Oxenbøll; Jensen, Aksel Karl Georg; Møller, Morten Hylander

2017-03-09

Mortality prediction scores are widely used in intensive care units (ICUs) and in research, but their predictive value deteriorates as scores age. Existing mortality prediction scores are imprecise and complex, which increases the risk of missing data and decreases the applicability bedside in daily clinical practice. We propose the development and validation of a new, simple and updated clinical prediction rule: the Simplified Mortality Score for use in the Intensive Care Unit (SMS-ICU). During the first phase of the study, we will develop and internally validate a clinical prediction rule that predicts 90-day mortality on ICU admission. The development sample will comprise 4247 adult critically ill patients acutely admitted to the ICU, enrolled in 5 contemporary high-quality ICU studies/trials. The score will be developed using binary logistic regression analysis with backward stepwise elimination of candidate variables, and subsequently be converted into a point-based clinical prediction rule. The general performance, discrimination and calibration of the score will be evaluated, and the score will be internally validated using bootstrapping. During the second phase of the study, the score will be externally validated in a fully independent sample consisting of 3350 patients included in the ongoing Stress Ulcer Prophylaxis in the Intensive Care Unit trial. We will compare the performance of the SMS-ICU to that of existing scores. We will use data from patients enrolled in studies/trials already approved by the relevant ethical committees and this study requires no further permissions. The results will be reported in accordance with the Transparent Reporting of multivariate prediction models for Individual Prognosis Or Diagnosis (TRIPOD) statement, and submitted to a peer-reviewed journal. Published by the BMJ Publishing Group Limited. 
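Converting a logistic regression model into a point-based rule is often done by scaling the coefficients and rounding them to integers. The protocol does not state which conversion it will use, so the sketch below shows one common approach (dividing by the smallest absolute coefficient) as an assumption; the predictor names and coefficient values are hypothetical.

```python
def coefficients_to_points(coefs):
    """Map regression coefficients to integer points.

    Each coefficient is divided by the smallest non-zero absolute
    coefficient and rounded, so the weakest predictor scores 1 point.
    Note that Python's round() rounds exact halves to the nearest even
    integer.
    """
    base = min(abs(b) for b in coefs.values() if b != 0)
    return {name: round(b / base) for name, b in coefs.items()}


# Hypothetical log-odds coefficients, for illustration only.
coefs = {"predictor_a": 0.9, "predictor_b": 0.6, "predictor_c": 0.3}
points = coefficients_to_points(coefs)
print(points)
```

Rounding loses information, which is one reason a point-based score should be re-evaluated for discrimination and calibration rather than assumed to inherit the performance of the full model.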
The derivation and validation of a simple model for predicting in-hospital mortality of acutely admitted patients to internal medicine wards.

PubMed

Sakhnini, Ali; Saliba, Walid; Schwartz, Naama; Bisharat, Naiel

2017-06-01

Limited information is available about clinical predictors of in-hospital mortality in acute unselected medical admissions. Such information could assist medical decision-making. To develop a clinical model for predicting in-hospital mortality in unselected acute medical admissions and to test the impact of secondary conditions on hospital mortality. This is an analysis of the medical records of patients admitted to internal medicine wards at one university-affiliated hospital. Data obtained from the years 2013 to 2014 were used as a derivation dataset for creating a prediction model, while data from 2015 was used as a validation dataset to test the performance of the model. For each admission, a set of clinical and epidemiological variables was obtained. The main diagnosis at hospitalization was recorded, and all additional or secondary conditions that coexisted at hospital admission or that developed during hospital stay were considered secondary conditions. The derivation and validation datasets included 7268 and 7843 patients, respectively. The in-hospital mortality rate averaged 7.2%. The following variables entered the final model: age, body mass index, mean arterial pressure on admission, prior admission within 3 months, background morbidity of heart failure and active malignancy, and chronic use of statins and antiplatelet agents. The c-statistic (ROC-AUC) of the prediction model was 80.5% without adjustment for main or secondary conditions, 84.5% with adjustment for the main diagnosis, and 89.5% with adjustment for the main diagnosis and secondary conditions. The accuracy of the predictive model reached 81% on the validation dataset. A prediction model based on clinical data with adjustment for secondary conditions exhibited a high degree of prediction accuracy. We provide a proof of concept that there is an added value for incorporating secondary conditions while predicting probabilities of in-hospital mortality.
Further improvement of the model performance and validation in other cohorts are needed to aid hospitalists in predicting health outcomes.
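For a binary outcome such as in-hospital death, the c-statistic (ROC-AUC) is the proportion of death/survivor pairs in which the patient who died received the higher predicted probability. A minimal sketch of that definition follows; the function name and the four example patients are hypothetical.

```python
def c_statistic(probs, outcomes):
    """Concordance (ROC-AUC) for binary outcomes coded 1 (event) / 0."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0  # event ranked above non-event
            elif p == q:
                concordant += 0.5  # ties count as half
    return concordant / (len(pos) * len(neg))


# Hypothetical predicted probabilities and outcomes.
auc = c_statistic([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
print(auc)
```

This pairwise form is quadratic in sample size, so it suits illustration more than large cohorts; rank-based implementations give the same value faster.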
Parameter Selection Methods in Inverse Problem Formulation

DTIC Science & Technology

2010-11-03

clinical data and used for prediction and a model for the reaction of the cardiovascular system to an ergometric workload. Key Words: Parameter selection...model for HIV dynamics which has been successfully validated with clinical data and used for prediction and a model for the reaction of the...recently developed in-host model for HIV dynamics which has been successfully validated with clinical data and used for prediction [4, 8]; b) a global
Perioperative Respiratory Adverse Events in Pediatric Ambulatory Anesthesia: Development and Validation of a Risk Prediction Tool.

PubMed

Subramanyam, Rajeev; Yeramaneni, Samrat; Hossain, Mohamed Monir; Anneken, Amy M; Varughese, Anna M

2016-05-01

Perioperative respiratory adverse events (PRAEs) are the most common cause of serious adverse events in children receiving anesthesia. Our primary aim of this study was to develop and validate a risk prediction tool for the occurrence of PRAE from the onset of anesthesia induction until discharge from the postanesthesia care unit in children younger than 18 years undergoing elective ambulatory anesthesia for surgery and radiology. The incidence of PRAE was studied. We analyzed data from 19,059 patients from our department's quality improvement database. The predictor variables were age, sex, ASA physical status, morbid obesity, preexisting pulmonary disorder, preexisting neurologic disorder, and location of ambulatory anesthesia (surgery or radiology). Composite PRAE was defined as the presence of any 1 of the following events: intraoperative bronchospasm, intraoperative laryngospasm, postoperative apnea, postoperative laryngospasm, postoperative bronchospasm, or postoperative prolonged oxygen requirement. Development and validation of the risk prediction tool for PRAE were performed using a split sampling technique to split the database into 2 independent cohorts based on the year when the patient received ambulatory anesthesia for surgery and radiology using logistic regression. A risk score was developed based on the regression coefficients from the validation tool. The performance of the risk prediction tool was assessed by using tests of discrimination and calibration. The overall incidence of composite PRAE was 2.8%. The derivation cohort included 8904 patients, and the validation cohort included 10,155 patients. The risk of PRAE was 3.9% in the development cohort and 1.8% in the validation cohort. 
Age ≤ 3 years (versus >3 years), ASA physical status II or III (versus ASA physical status I), morbid obesity, preexisting pulmonary disorder, and surgery (versus radiology) significantly predicted the occurrence of PRAE in a multivariable logistic regression model. A risk score in the range of 0 to 3 was assigned to each significant variable in the logistic regression model, and final score for all risk factors ranged from 0 to 11. A cutoff score of 4 was derived from a receiver operating characteristic curve to determine the high-risk category. The model C-statistic and the corresponding SE for the derivation and validation cohort was 0.64 ± 0.01 and 0.63 ± 0.02, respectively. Sensitivity and SE of the risk prediction tool to identify children at risk for PRAE was 77.6 ± 0.02 in the derivation cohort and 76.2 ± 0.03 in the validation cohort. The risk tool developed and validated from our study cohort identified 5 risk factors: age ≤ 3 years (versus >3 years), ASA physical status II and III (versus ASA physical status I), morbid obesity, preexisting pulmonary disorder, and surgery (versus radiology) for PRAE. This tool can be used to provide an individual risk score for each patient to predict the risk of PRAE in the preoperative period.
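The abstract says the cutoff was derived from a receiver operating characteristic curve but does not name the criterion. One widely used choice is Youden's J (sensitivity + specificity - 1), sketched below as an assumption rather than as the authors' method; the scores and outcomes are invented for illustration.

```python
def best_cutoff(scores, outcomes):
    """Return (cutoff, J) maximising Youden's J.

    A patient is classed high-risk when score >= cutoff.
    outcomes are coded 1 (event) / 0 (no event).
    """
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, outcomes) if s < cutoff and y == 0)
        j = tp / n_pos + tn / n_neg - 1
        if best is None or j > best[1]:
            best = (cutoff, j)  # first maximum wins on ties
    return best


# Hypothetical risk scores and outcomes.
scores = [0, 1, 2, 3, 4, 5, 6, 7]
outcomes = [0, 0, 0, 1, 1, 0, 1, 1]
cutoff, j = best_cutoff(scores, outcomes)
print(cutoff, j)
```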
Anxiety measures validated in perinatal populations: a systematic review.

PubMed

Meades, Rose; Ayers, Susan

2011-09-01

Research and screening of anxiety in the perinatal period is hampered by a lack of psychometric data on self-report anxiety measures used in perinatal populations. This paper aimed to review self-report measures that have been validated with perinatal women. A systematic search was carried out of four electronic databases. Additional papers were obtained through searching identified articles. Thirty studies were identified that reported validation of an anxiety measure with perinatal women. Most commonly validated self-report measures were the General Health Questionnaire (GHQ), State-Trait Anxiety Inventory (STAI), and Hospital Anxiety and Depression Scales (HADS). Of the 30 studies included, 11 used a clinical interview to provide criterion validity. Remaining studies reported one or more other forms of validity (factorial, discriminant, concurrent and predictive) or reliability. The STAI shows criterion, discriminant and predictive validity and may be most useful for research purposes as a specific measure of anxiety. The Kessler 10 (K-10) may be the best short screening measure due to its ability to differentiate anxiety disorders. The Depression Anxiety Stress Scales 21 (DASS-21) measures multiple types of distress, shows appropriate content, and remains to be validated against clinical interview in perinatal populations. Nineteen studies did not report sensitivity or specificity data. The early stages of research into perinatal anxiety, the multitude of measures in use, and methodological differences restrict comparison of measures across studies. There is a need for further validation of self-report measures of anxiety in the perinatal period to enable accurate screening and detection of anxiety symptoms and disorders. Copyright © 2010 Elsevier B.V. All rights reserved.
Development and validation of a tool to evaluate the quality of medical education websites in pathology

PubMed Central

Alyusuf, Raja H.; Prasad, Kameshwar; Abdel Satir, Ali M.; Abalkhail, Ali A.; Arora, Roopa K.

2013-01-01

Background: The exponential use of the internet as a learning resource, coupled with the varied quality of many websites, has led to a need to identify suitable websites for teaching purposes. Aim: The aim of this study is to develop and to validate a tool, which evaluates the quality of undergraduate medical educational websites; and apply it to the field of pathology. Methods: A tool was devised through several steps of item generation, reduction, weightage, pilot testing, post-pilot modification of the tool and validating the tool. Tool validation included measurement of inter-observer reliability; and generation of criterion related, construct related and content related validity. The validated tool was subsequently tested by applying it to a population of pathology websites. Results and Discussion: Reliability testing showed a high internal consistency reliability (Cronbach's alpha = 0.92), high inter-observer reliability (Pearson's correlation r = 0.88), intraclass correlation coefficient = 0.85 and κ = 0.75. It showed high criterion related, construct related and content related validity. The tool showed moderately high concordance with the gold standard (κ = 0.61); 92.2% sensitivity, 67.8% specificity, 75.6% positive predictive value and 88.9% negative predictive value. The validated tool was applied to 278 websites; 29.9% were rated as recommended, 41.0% as recommended with caution and 29.1% as not recommended. Conclusion: A systematic tool was devised to evaluate the quality of websites for medical educational purposes. The tool was shown to yield reliable and valid inferences through its application to pathology websites. PMID:24392243
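The κ values in this record are chance-corrected agreement statistics. Assuming they refer to Cohen's kappa for two raters (the abstract does not say which variant was used), the calculation can be sketched as follows; the function name and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("need two equal-length, non-empty rating lists")
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of four websites by two observers.
a = ["rec", "rec", "not", "not"]
b = ["rec", "not", "not", "not"]
kappa = cohens_kappa(a, b)
print(kappa)
```

Here the raters agree on 3 of 4 websites (0.75), chance agreement is 0.5, and kappa is therefore 0.5, which shows how raw agreement overstates reliability.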
Validation of biomarkers to predict response to immunotherapy in cancer: Volume I - pre-analytical and analytical validation.

PubMed

Masucci, Giuseppe V; Cesano, Alessandra; Hawtin, Rachael; Janetzki, Sylvia; Zhang, Jenny; Kirsch, Ilan; Dobbin, Kevin K; Alvarez, John; Robbins, Paul B; Selvan, Senthamil R; Streicher, Howard Z; Butterfield, Lisa H; Thurin, Magdalena

2016-01-01

Immunotherapies have emerged as one of the most promising approaches to treat patients with cancer. Recently, there have been many clinical successes using checkpoint receptor blockade, including T cell inhibitory receptors such as cytotoxic T-lymphocyte-associated antigen 4 (CTLA-4) and programmed cell death-1 (PD-1). Despite demonstrated successes in a variety of malignancies, responses only typically occur in a minority of patients in any given histology. Additionally, treatment is associated with inflammatory toxicity and high cost. Therefore, determining which patients would derive clinical benefit from immunotherapy is a compelling clinical question. Although numerous candidate biomarkers have been described, there are currently three FDA-approved assays based on PD-1 ligand expression (PD-L1) that have been clinically validated to identify patients who are more likely to benefit from a single-agent anti-PD-1/PD-L1 therapy. Because of the complexity of the immune response and tumor biology, it is unlikely that a single biomarker will be sufficient to predict clinical outcomes in response to immune-targeted therapy. Rather, the integration of multiple tumor and immune response parameters, such as protein expression, genomics, and transcriptomics, may be necessary for accurate prediction of clinical benefit. Before a candidate biomarker and/or new technology can be used in a clinical setting, several steps are necessary to demonstrate its clinical validity. Although regulatory guidelines provide general roadmaps for the validation process, their applicability to biomarkers in the cancer immunotherapy field is somewhat limited. Thus, Working Group 1 (WG1) of the Society for Immunotherapy of Cancer (SITC) Immune Biomarkers Task Force convened to address this need. 
In this two volume series, we discuss pre-analytical and analytical (Volume I) as well as clinical and regulatory (Volume II) aspects of the validation process as applied to predictive biomarkers for cancer immunotherapy. To illustrate the requirements for validation, we discuss examples of biomarker assays that have shown preliminary evidence of an association with clinical benefit from immunotherapeutic interventions. The scope includes only those assays and technologies that have established a certain level of validation for clinical use (fit-for-purpose). Recommendations to meet challenges and strategies to guide the choice of analytical and clinical validation design for specific assays are also provided.
Identification of patients at high risk for Clostridium difficile infection: development and validation of a risk prediction model in hospitalized patients treated with antibiotics.

PubMed

van Werkhoven, C H; van der Tempel, J; Jajou, R; Thijsen, S F T; Diepersloot, R J A; Bonten, M J M; Postma, D F; Oosterheert, J J

2015-08-01

To develop and validate a prediction model for Clostridium difficile infection (CDI) in hospitalized patients treated with systemic antibiotics, we performed a case-cohort study in a tertiary (derivation) and secondary care hospital (validation). Cases had a positive Clostridium test and were treated with systemic antibiotics before suspicion of CDI. Controls were randomly selected from hospitalized patients treated with systemic antibiotics. Potential predictors were selected from the literature. Logistic regression was used to derive the model. Discrimination and calibration of the model were tested in internal and external validation. A total of 180 cases and 330 controls were included for derivation. Age >65 years, recent hospitalization, CDI history, malignancy, chronic renal failure, use of immunosuppressants, receipt of antibiotics before admission, nonsurgical admission, admission to the intensive care unit, gastric tube feeding, treatment with cephalosporins and presence of an underlying infection were independent predictors of CDI. The area under the receiver operating characteristic curve of the model in the derivation cohort was 0.84 (95% confidence interval 0.80-0.87), and was reduced to 0.81 after internal validation. In external validation, consisting of 97 cases and 417 controls, the model area under the curve was 0.81 (95% confidence interval 0.77-0.85) and model calibration was adequate (Brier score 0.004). A simplified risk score was derived. Using a cutoff of 7 points, the positive predictive value, sensitivity and specificity were 1.0%, 72% and 73%, respectively. In conclusion, a risk prediction model was developed and validated, with good discrimination and calibration, that can be used to target preventive interventions in patients with increased risk of CDI. Copyright © 2015 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
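A positive predictive value of 1.0% alongside 72% sensitivity and 73% specificity is what Bayes' rule gives when the outcome is rare. The sketch below makes that relationship explicit. The abstract does not report the prevalence of CDI, so the value of 0.38% used here is a back-calculated illustration, not a study result.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from test characteristics and outcome prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Sensitivity and specificity as reported; prevalence is an assumption.
ppv = positive_predictive_value(0.72, 0.73, 0.0038)
print(round(ppv, 4))
```

With these inputs false positives outnumber true positives by roughly 100 to 1, so the score is better suited to targeting low-cost preventive measures than to confirming infection.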
Measurement of COPD Severity Using a Survey-Based Score

PubMed Central

Omachi, Theodore A.; Katz, Patricia P.; Yelin, Edward H.; Iribarren, Carlos; Blanc, Paul D.

2010-01-01

Background: A comprehensive survey-based COPD severity score has usefulness for epidemiologic and health outcomes research. We previously developed and validated the survey-based COPD Severity Score without using lung function or other physiologic measurements. In this study, we aimed to further validate the severity score in a different COPD cohort and using a combination of patient-reported and objective physiologic measurements. Methods: Using data from the Function, Living, Outcomes, and Work cohort study of COPD, we evaluated the concurrent and predictive validity of the COPD Severity Score among 1,202 subjects. The survey instrument is a 35-point score based on symptoms, medication and oxygen use, and prior hospitalization or intubation for COPD. Subjects were systemically assessed using structured telephone survey, spirometry, and 6-min walk testing. Results: We found evidence to support concurrent validity of the score. Higher COPD Severity Score values were associated with poorer FEV1 (r = −0.38), FEV1% predicted (r = −0.40), Body mass, Obstruction, Dyspnea, Exercise Index (r = 0.57), and distance walked in 6 min (r = −0.43) (P < .0001 in all cases). Greater COPD severity was also related to poorer generic physical health status (r = −0.49) and disease-specific health-related quality of life (r = 0.57) (P < .0001). The score also demonstrated predictive validity. It was also associated with a greater prospective risk of acute exacerbation of COPD defined as ED visits (hazard ratio [HR], 1.31; 95% CI, 1.24-1.39), hospitalizations (HR, 1.59; 95% CI, 1.44-1.75), and either measure of hospital-based care for COPD (HR, 1.34; 95% CI, 1.26-1.41) (P < .0001 in all cases). Conclusion: The COPD Severity Score is a valid survey-based measure of disease-specific severity, both in terms of concurrent and predictive validity. The score is a psychometrically sound instrument for use in epidemiologic and outcomes research in COPD. PMID:20040611
Towards personalized therapy for multiple sclerosis: prediction of individual treatment response.

PubMed

Kalincik, Tomas; Manouchehrinia, Ali; Sobisek, Lukas; Jokubaitis, Vilija; Spelman, Tim; Horakova, Dana; Havrdova, Eva; Trojano, Maria; Izquierdo, Guillermo; Lugaresi, Alessandra; Girard, Marc; Prat, Alexandre; Duquette, Pierre; Grammond, Pierre; Sola, Patrizia; Hupperts, Raymond; Grand'Maison, Francois; Pucci, Eugenio; Boz, Cavit; Alroughani, Raed; Van Pesch, Vincent; Lechner-Scott, Jeannette; Terzi, Murat; Bergamaschi, Roberto; Iuliano, Gerardo; Granella, Franco; Spitaleri, Daniele; Shaygannejad, Vahid; Oreja-Guevara, Celia; Slee, Mark; Ampapa, Radek; Verheul, Freek; McCombe, Pamela; Olascoaga, Javier; Amato, Maria Pia; Vucic, Steve; Hodgkinson, Suzanne; Ramo-Tello, Cristina; Flechter, Shlomo; Cristiano, Edgardo; Rozsa, Csilla; Moore, Fraser; Luis Sanchez-Menoyo, Jose; Laura Saladino, Maria; Barnett, Michael; Hillert, Jan; Butzkueven, Helmut

2017-09-01

Timely initiation of effective therapy is crucial for preventing disability in multiple sclerosis; however, treatment response varies greatly among patients. Comprehensive predictive models of individual treatment response are lacking. Our aims were: (i) to develop predictive algorithms for individual treatment response using demographic, clinical and paraclinical predictors in patients with multiple sclerosis; and (ii) to evaluate accuracy, and internal and external validity of these algorithms. This study evaluated 27 demographic, clinical and paraclinical predictors of individual response to seven disease-modifying therapies in MSBase, a large global cohort study. Treatment response was analysed separately for disability progression, disability regression, relapse frequency, conversion to secondary progressive disease, change in the cumulative disease burden, and the probability of treatment discontinuation. Multivariable survival and generalized linear models were used, together with the principal component analysis to reduce model dimensionality and prevent overparameterization. Accuracy of the individual prediction was tested and its internal validity was evaluated in a separate, non-overlapping cohort. External validity was evaluated in a geographically distinct cohort, the Swedish Multiple Sclerosis Registry. In the training cohort (n = 8513), the most prominent modifiers of treatment response comprised age, disease duration, disease course, previous relapse activity, disability, predominant relapse phenotype and previous therapy. Importantly, the magnitude and direction of the associations varied among therapies and disease outcomes. Higher probability of disability progression during treatment with injectable therapies was predominantly associated with a greater disability at treatment start and the previous therapy. For fingolimod, natalizumab or mitoxantrone, it was mainly associated with lower pretreatment relapse activity. 
The probability of disability regression was predominantly associated with pre-baseline disability, therapy and relapse activity. Relapse incidence was associated with pretreatment relapse activity, age and relapsing disease course, with the strength of these associations varying among therapies. Accuracy and internal validity (n = 1196) of the resulting predictive models was high (>80%) for relapse incidence during the first year and for disability outcomes, moderate for relapse incidence in Years 2-4 and for the change in the cumulative disease burden, and low for conversion to secondary progressive disease and treatment discontinuation. External validation showed similar results, demonstrating high external validity for disability and relapse outcomes, moderate external validity for cumulative disease burden and low external validity for conversion to secondary progressive disease and treatment discontinuation. We conclude that demographic, clinical and paraclinical information helps predict individual response to disease-modifying therapies at the time of their commencement. © The Author (2017). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Validation of the Registry to Evaluate Early and Long-Term Pulmonary Arterial Hypertension Disease Management (REVEAL) pulmonary hypertension prediction model in a unique population and utility in the prediction of long-term survival.

PubMed

Cogswell, Rebecca; Kobashigawa, Erin; McGlothlin, Dana; Shaw, Robin; De Marco, Teresa

2012-11-01

The Registry to Evaluate Early and Long-Term Pulmonary Arterial Hypertension (PAH) Disease Management (REVEAL) model was designed to predict 1-year survival in patients with PAH. Multivariate prediction models need to be evaluated in cohorts distinct from the derivation set to determine external validity. In addition, limited data exist on the utility of this model in the prediction of long-term survival. REVEAL model performance was assessed to predict 1-year and 5-year outcomes, defined as survival or composite survival or freedom from lung transplant, in 140 patients with PAH. The validation cohort had a higher proportion of human immunodeficiency virus (7.9% vs 1.9%, p < 0.0001), methamphetamine use (19.3% vs 4.9%, p < 0.0001), and portal hypertension PAH (16.4% vs 5.1%, p < 0.0001) compared with the development cohort. The C-index of the model to predict survival was 0.765 at 1 year and 0.712 at 5 years of follow-up. The C-index of the model to predict composite survival or freedom from lung transplant was 0.805 and 0.724 at 1 and 5 years of follow-up, respectively. Prediction by the model, however, was weakest among patients with intermediate-risk predicted survival. The REVEAL model had adequate discrimination to predict 1-year survival in this small but clinically distinct validation cohort. Although the model also had predictive ability out to 5 years, prediction was limited among patients of intermediate risk, suggesting our prediction methods can still be improved. Copyright © 2012. Published by Elsevier Inc.
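For survival outcomes the C-index has to account for censoring: a pair of patients is informative only if the one with the shorter follow-up had an observed event. The following is a simplified sketch of Harrell's C under that rule. It ignores tied event times, and the function name and the four example patients are hypothetical.

```python
def harrell_c(times, events, risk):
    """Simplified Harrell's C-index.

    times:  follow-up time per patient
    events: 1 if the event was observed at that time, 0 if censored
    risk:   model risk score (higher means earlier event expected)
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Usable pair: patient i had an observed event strictly
            # before patient j's follow-up ended.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical follow-up data.
c = harrell_c(times=[2, 4, 6, 8], events=[1, 1, 0, 1],
              risk=[0.9, 0.4, 0.5, 0.1])
print(c)
```

In this example 4 of the 5 usable pairs are ordered correctly, giving C = 0.8; the censored patient at time 6 contributes only as the longer-surviving member of a pair.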

Validation of BEHAVE fire behavior predictions in oak savannas

USGS Publications Warehouse

Grabner, Keith W.; Dwyer, John; Cutter, Bruce E.

1997-01-01

Prescribed fire is a valuable tool in the restoration and management of oak savannas. BEHAVE, a fire behavior prediction system developed by the United States Forest Service, can be a useful tool when managing oak savannas with prescribed fire. BEHAVE predictions of fire rate-of-spread and flame length were validated using four standardized fuel models: Fuel Model 1 (short grass), Fuel Model 2 (timber and grass), Fuel Model 3 (tall grass), and Fuel Model 9 (hardwood litter). Also, a customized oak savanna fuel model (COSFM) was created and validated. Results indicate that standardized fuel model 2 and the COSFM reliably estimate mean rate-of-spread (MROS). The COSFM did not appreciably reduce MROS variation when compared to fuel model 2. Fuel models 1, 3, and 9 did not reliably predict MROS. Neither the standardized fuel models nor the COSFM adequately predicted flame lengths. We concluded that standardized fuel model 2 should be used with BEHAVE when predicting fire rates-of-spread in established oak savannas.
Genomic prediction in animals and plants: simulation of data, validation, reporting, and benchmarking.

PubMed

Daetwyler, Hans D; Calus, Mario P L; Pong-Wong, Ricardo; de Los Campos, Gustavo; Hickey, John M

2013-02-01

The genomic prediction of phenotypes and breeding values in animals and plants has developed rapidly into its own research field. Results of genomic prediction studies are often difficult to compare because data simulation varies, real or simulated data are not fully described, and not all relevant results are reported. In addition, some new methods have been compared only in limited genetic architectures, leading to potentially misleading conclusions. In this article we review simulation procedures, discuss validation and reporting of results, and apply benchmark procedures for a variety of genomic prediction methods in simulated and real example data. Plant and animal breeding programs are being transformed by the use of genomic data, which are becoming widely available and cost-effective to predict genetic merit. A large number of genomic prediction studies have been published using both simulated and real data. The relative novelty of this area of research has made the development of scientific conventions difficult with regard to description of the real data, simulation of genomes, validation and reporting of results, and forward in time methods. In this review article we discuss the generation of simulated genotype and phenotype data, using approaches such as the coalescent and forward in time simulation. We outline ways to validate simulated data and genomic prediction results, including cross-validation. The accuracy and bias of genomic prediction are highlighted as performance indicators that should be reported. We suggest that a measure of relatedness between the reference and validation individuals be reported, as its impact on the accuracy of genomic prediction is substantial. A large number of methods were compared in example simulated and real (pine and wheat) data sets, all of which are publicly available. 
In our limited simulations, most methods performed similarly in traits with a large number of quantitative trait loci (QTL), whereas in traits with fewer QTL variable selection did have some advantages. In the real data sets examined here all methods had very similar accuracies. We conclude that no single method can serve as a benchmark for genomic prediction. We recommend comparing accuracy and bias of new methods to results from genomic best linear prediction and a variable selection approach (e.g., BayesB), because, together, these methods are appropriate for a range of genetic architectures. An accompanying article in this issue provides a comprehensive review of genomic prediction methods and discusses a selection of topics related to application of genomic prediction in plants and animals.
Genomic Prediction in Animals and Plants: Simulation of Data, Validation, Reporting, and Benchmarking

PubMed Central

Daetwyler, Hans D.; Calus, Mario P. L.; Pong-Wong, Ricardo; de los Campos, Gustavo; Hickey, John M.

2013-01-01

The genomic prediction of phenotypes and breeding values in animals and plants has developed rapidly into its own research field. Results of genomic prediction studies are often difficult to compare because data simulation varies, real or simulated data are not fully described, and not all relevant results are reported. In addition, some new methods have been compared only in limited genetic architectures, leading to potentially misleading conclusions. In this article we review simulation procedures, discuss validation and reporting of results, and apply benchmark procedures for a variety of genomic prediction methods in simulated and real example data. Plant and animal breeding programs are being transformed by the use of genomic data, which are becoming widely available and cost-effective to predict genetic merit. A large number of genomic prediction studies have been published using both simulated and real data. The relative novelty of this area of research has made the development of scientific conventions difficult with regard to description of the real data, simulation of genomes, validation and reporting of results, and forward in time methods. In this review article we discuss the generation of simulated genotype and phenotype data, using approaches such as the coalescent and forward in time simulation. We outline ways to validate simulated data and genomic prediction results, including cross-validation. The accuracy and bias of genomic prediction are highlighted as performance indicators that should be reported. We suggest that a measure of relatedness between the reference and validation individuals be reported, as its impact on the accuracy of genomic prediction is substantial. A large number of methods were compared in example simulated and real (pine and wheat) data sets, all of which are publicly available. 
In our limited simulations, most methods performed similarly in traits with a large number of quantitative trait loci (QTL), whereas in traits with fewer QTL variable selection did have some advantages. In the real data sets examined here all methods had very similar accuracies. We conclude that no single method can serve as a benchmark for genomic prediction. We recommend comparing accuracy and bias of new methods to results from genomic best linear prediction and a variable selection approach (e.g., BayesB), because, together, these methods are appropriate for a range of genetic architectures. An accompanying article in this issue provides a comprehensive review of genomic prediction methods and discusses a selection of topics related to application of genomic prediction in plants and animals. PMID:23222650
Reliability and validity in a nutshell.

PubMed

Bannigan, Katrina; Watson, Roger

2009-12-01

To explore and explain the different concepts of reliability and validity as they are related to measurement instruments in social science and health care. There are different concepts contained in the terms reliability and validity and these are often explained poorly and there is often confusion between them. To develop some clarity about reliability and validity a conceptual framework was built based on the existing literature. The concepts of reliability, validity and utility are explored and explained. Reliability contains the concepts of internal consistency and stability and equivalence. Validity contains the concepts of content, face, criterion, concurrent, predictive, construct, convergent (and divergent), factorial and discriminant. In addition, for clinical practice and research, it is essential to establish the utility of a measurement instrument. To use measurement instruments appropriately in clinical practice, the extent to which they are reliable, valid and usable must be established.
Developing and Validating a Survival Prediction Model for NSCLC Patients Through Distributed Learning Across 3 Countries.

PubMed

Jochems, Arthur; Deist, Timo M; El Naqa, Issam; Kessler, Marc; Mayo, Chuck; Reeves, Jackson; Jolly, Shruti; Matuszak, Martha; Ten Haken, Randall; van Soest, Johan; Oberije, Cary; Faivre-Finn, Corinne; Price, Gareth; de Ruysscher, Dirk; Lambin, Philippe; Dekker, Andre

2017-10-01

Tools for survival prediction for non-small cell lung cancer (NSCLC) patients treated with chemoradiation or radiation therapy are of limited quality. In this work, we developed a predictive model of survival at 2 years. The model is based on a large volume of historical patient data and serves as a proof of concept to demonstrate the distributed learning approach. Clinical data from 698 lung cancer patients, treated with curative intent with chemoradiation or radiation therapy alone, were collected and stored at 2 different cancer institutes (559 patients at Maastro clinic (Netherlands) and 139 at Michigan university [United States]). The model was further validated on 196 patients originating from The Christie (United Kingdom). A Bayesian network model was adapted for distributed learning (the animation can be viewed at https://www.youtube.com/watch?v=ZDJFOxpwqEA). Two-year posttreatment survival was chosen as the endpoint. The Maastro clinic cohort data are publicly available at https://www.cancerdata.org/publication/developing-and-validating-survival-prediction-model-nsclc-patients-through-distributed, and the developed models can be found at www.predictcancer.org. Variables included in the final model were T and N category, age, performance status, and total tumor dose. The model has an area under the curve (AUC) of 0.66 on the external validation set and an AUC of 0.62 on a 5-fold cross validation. A model based on the T and N category performed with an AUC of 0.47 on the validation set, significantly worse than our model (P<.001). Learning the model in a centralized or distributed fashion yields a minor difference on the probabilities of the conditional probability tables (0.6%); the discriminative performance of the models on the validation set is similar (P=.26). Distributed learning from federated databases allows learning of predictive models on data originating from multiple institutions while avoiding many of the data-sharing barriers. 
We believe that distributed learning is the future of sharing data in health care. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
Prediction of liver disease in patients whose liver function tests have been checked in primary care: model development and validation using population-based observational cohorts.

PubMed

McLernon, David J; Donnan, Peter T; Sullivan, Frank M; Roderick, Paul; Rosenberg, William M; Ryder, Steve D; Dillon, John F

2014-06-02

To derive and validate a clinical prediction model to estimate the risk of liver disease diagnosis following liver function tests (LFTs) and to convert the model to a simplified scoring tool for use in primary care. Population-based observational cohort study of patients in Tayside Scotland identified as having their LFTs performed in primary care and followed for 2 years. Biochemistry data were linked to secondary care, prescriptions and mortality data to ascertain baseline characteristics of the derivation cohort. A separate validation cohort was obtained from 19 general practices across the rest of Scotland to externally validate the final model. Primary care, Tayside, Scotland. Derivation cohort: LFT results from 310 511 patients. After exclusions (including: patients under 16 years, patients having initial LFTs measured in secondary care, bilirubin >35 μmol/L, liver complications within 6 weeks and history of a liver condition), the derivation cohort contained 95 977 patients with no clinically apparent liver condition. Validation cohort: after exclusions, this cohort contained 11 653 patients. Diagnosis of a liver condition within 2 years. From the derivation cohort (n=95 977), 481 (0.5%) were diagnosed with a liver disease. The model showed good discrimination (C-statistic=0.78). Given the low prevalence of liver disease, the negative predictive values were high. Positive predictive values were low but rose to 20-30% for high-risk patients. This study successfully developed and validated a clinical prediction model and subsequent scoring tool, the Algorithm for Liver Function Investigations (ALFI), which can predict liver disease risk in patients with no clinically obvious liver disease who had their initial LFTs taken in primary care. ALFI can help general practitioners focus referral on a small subset of patients with higher predicted risk while continuing to address modifiable liver disease risk factors in those at lower risk. 
Published by the BMJ Publishing Group Limited.
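The remark in the record above that low prevalence gives high negative predictive values and low positive predictive values follows directly from Bayes' rule. A minimal sketch is given here. The prevalence of 481 in 95 977 is taken from the abstract, while the sensitivity and specificity values are invented for illustration and are not ALFI results. The function name is also an assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Prevalence from the abstract; 0.70 and 0.80 are made-up illustration values.
ppv, npv = predictive_values(0.70, 0.80, 481 / 95977)
```

With these assumed inputs the PPV stays below 2% while the NPV exceeds 99%, which shows why a rule can look weak on PPV in a whole primary care population and still be useful once attention is restricted to a higher-risk subgroup.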
Validity of the SAT® for Predicting First-Year Grades: 2011 SAT Validity Sample. Statistical Report 2013-3

ERIC Educational Resources Information Center

Patterson, Brian F.; Mattern, Krista D.

2013-01-01

The continued accumulation of validity evidence for the intended uses of educational assessments is critical to ensure that proper inferences will be made for those purposes. To that end, the College Board has continued to collect college outcome data to evaluate the relationship between SAT® scores and college success. This report provides…
Validity of the SAT® for Predicting First-Year Grades: 2012 SAT Validity Sample. Statistical Report 2015-2

ERIC Educational Resources Information Center

Beard, Jonathan; Marini, Jessica P.

2015-01-01

The continued accumulation of validity evidence for the intended uses of educational assessment scores is critical to ensure that inferences made using the scores are sound. To that end, the College Board has continued to collect college outcome data to evaluate the relationship between SAT® scores and college success. This report provides updated…
Development and Validation of Criterion-Referenced Clinically Relevant Fitness Standards for Maintaining Physical Independence in Later Years

ERIC Educational Resources Information Center

Rikli, Roberta E.; Jones, C. Jessie

2013-01-01

Purpose: To develop and validate criterion-referenced fitness standards for older adults that predict the level of capacity needed for maintaining physical independence into later life. The proposed standards were developed for use with a previously validated test battery for older adults--the Senior Fitness Test (Rikli, R. E., & Jones, C. J.…
Examining the Reliability and Validity of ADEPT and CELDT: Comparing Two Assessments of Oral Language Proficiency for English Language Learners

ERIC Educational Resources Information Center

Chavez, Gina

2013-01-01

Few classroom measures of English language proficiency have been evaluated for reliability and validity. This research examined the concurrent and predictive validity of an oral language test, titled A Developmental English Language Proficiency Test (ADEPT), and the relationship to the California English Language Development Test (CELDT) in the…
Effects of Coaching on the Validity of the SAT: A Simulation Study.

ERIC Educational Resources Information Center

Baydar, Nazli

The effects of student coaching in preparation for the College Board Scholastic Aptitude Test (SAT) on the predictive validity of this test for freshman year performance were studied using data on 1985 freshman year students from four colleges. After the validity of the SAT was estimated for each school, a given proportion of students was picked,…
Validation of the Spanish Version of the Mammography-Specific Self-Efficacy Scale.

PubMed

Jerome-D'Emilia, Bonnie; Suplee, Patricia; Akincigil, Ayse

2015-05-01

To consider psychometric estimates of the validity and reliability of the Spanish translation of a mammography-specific self-efficacy scale. A cross-sectional study. Three primarily Hispanic churches and a Hispanic community center in a low-income urban area of New Jersey. 153 low-income Hispanic women aged 40-85 years. The translated scale was administered to participants during a six-month period. Internal consistency, reliability, and construct and predictive validity were assessed. Demographic variables included income and insurance status. Outcome variables included total mammography-specific self-efficacy and having had a mammogram within the past two years. Preliminary evidence of reliability and validity was found, and predictive validity was demonstrated. The health needs of specific populations can be addressed only when research instruments have been appropriately validated and all relevant factors are considered. Diverse groups of low-income women face similar challenges and barriers in their efforts to get screened. Nurses are in an ideal position to help women with preventive care decision making (e.g., screening for breast cancer). Understanding how a woman's level of self-efficacy affects her decision making should be considered when counseling a client.
Reconsidering vocational interests for personnel selection: the validity of an interest-based selection test in relation to job knowledge, job performance, and continuance intentions.

PubMed

Van Iddekinge, Chad H; Putka, Dan J; Campbell, John P

2011-01-01

Although vocational interests have a long history in vocational psychology, they have received extremely limited attention within the recent personnel selection literature. We reconsider some widely held beliefs concerning the (low) validity of interests for predicting criteria important to selection researchers, and we review theory and empirical evidence that challenge such beliefs. We then describe the development and validation of an interests-based selection measure. Results of a large validation study (N = 418) reveal that interests predicted a diverse set of criteria—including measures of job knowledge, job performance, and continuance intentions—with corrected, cross-validated Rs that ranged from .25 to .46 across the criteria (mean R = .31). Interests also provided incremental validity beyond measures of general cognitive aptitude and facets of the Big Five personality dimensions in relation to each criterion. Furthermore, with a couple exceptions, the interest scales were associated with small to medium subgroup differences, which in most cases favored women and racial minorities. Taken as a whole, these results appear to call into question the prevailing thought that vocational interests have limited usefulness for selection.
CADASTER QSPR Models for Predictions of Melting and Boiling Points of Perfluorinated Chemicals.

PubMed

Bhhatarai, Barun; Teetz, Wolfram; Liu, Tao; Öberg, Tomas; Jeliazkova, Nina; Kochev, Nikolay; Pukalov, Ognyan; Tetko, Igor V; Kovarich, Simona; Papa, Ester; Gramatica, Paola

2011-03-14

Quantitative structure property relationship (QSPR) studies on per- and polyfluorinated chemicals (PFCs) on melting point (MP) and boiling point (BP) are presented. The training and prediction chemicals used for developing and validating the models were selected from the Syracuse PhysProp database and the literature. The available experimental data sets were split in two different ways: a) random selection on response value, and b) structural similarity verified by self-organizing-map (SOM), in order to propose reliable predictive models, developed only on the training sets and externally verified on the prediction sets. Individual models based on linear and non-linear approaches, developed by different CADASTER partners on 0D-2D Dragon descriptors, E-state descriptors and fragment-based descriptors, are presented together with a consensus model and their predictions. In addition, the predictive performance of the developed models was verified on a blind external validation set (EV-set) prepared using the PERFORCE database on 15 MP and 25 BP data respectively. This database contains only long chain perfluoro-alkylated chemicals, particularly monitored by regulatory agencies like US-EPA and EU-REACH. QSPR models with internal and external validation on two different external prediction/validation sets and study of applicability-domain highlighting the robustness and high accuracy of the models are discussed. Finally, MPs for an additional 303 PFCs and BPs for 271 PFCs were predicted for which experimental measurements are unknown. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Adaptation of clinical prediction models for application in local settings.

PubMed

Kappen, Teus H; Vergouwe, Yvonne; van Klei, Wilton A; van Wolfswinkel, Leo; Kalkman, Cor J; Moons, Karel G M

2012-01-01

When planning to use a validated prediction model in new patients, adequate performance is not guaranteed. For example, changes in clinical practice over time or a different case mix than the original validation population may result in inaccurate risk predictions. To demonstrate how clinical information can direct updating a prediction model and development of a strategy for handling missing predictor values in clinical practice. A previously derived and validated prediction model for postoperative nausea and vomiting was updated using a data set of 1847 patients. The update consisted of 1) changing the definition of an existing predictor, 2) reestimating the regression coefficient of a predictor, and 3) adding a new predictor to the model. The updated model was then validated in a new series of 3822 patients. Furthermore, several imputation models were considered to handle real-time missing values, so that possible missing predictor values could be anticipated during actual model use. Differences in clinical practice between our local population and the original derivation population guided the update strategy of the prediction model. The predictive accuracy of the updated model was better (c statistic, 0.68; calibration slope, 1.0) than the original model (c statistic, 0.62; calibration slope, 0.57). Inclusion of logistical variables in the imputation models, besides observed patient characteristics, contributed to a strategy to deal with missing predictor values at the time of risk calculation. Extensive knowledge of local, clinical processes provides crucial information to guide the process of adapting a prediction model to new clinical practices.
The predictive validity of ideal partner preferences: a review and meta-analysis.

PubMed

Eastwick, Paul W; Luchies, Laura B; Finkel, Eli J; Hunt, Lucy L

2014-05-01

A central element of interdependence theory is that people have standards against which they compare their current outcomes, and one ubiquitous standard in the mating domain is the preference for particular attributes in a partner (ideal partner preferences). This article reviews research on the predictive validity of ideal partner preferences and presents a new integrative model that highlights when and why ideals succeed or fail to predict relational outcomes. Section 1 examines predictive validity by reviewing research on sex differences in the preference for physical attractiveness and earning prospects. Men and women reliably differ in the extent to which these qualities affect their romantic evaluations of hypothetical targets. Yet a new meta-analysis spanning the attraction and relationships literatures (k = 97) revealed that physical attractiveness predicted romantic evaluations with a moderate-to-strong effect size (r = ∼.40) for both sexes, and earning prospects predicted romantic evaluations with a small effect size (r = ∼.10) for both sexes. Sex differences in the correlations were small (r difference = .03) and uniformly nonsignificant. Section 2 reviews research on individual differences in ideal partner preferences, drawing from several theoretical traditions to explain why ideals predict relational evaluations at different relationship stages. Furthermore, this literature also identifies alternative measures of ideal partner preferences that have stronger predictive validity in certain theoretically sensible contexts. Finally, a discussion highlights a new framework for conceptualizing the appeal of traits, the difference between live and hypothetical interactions, and the productive interplay between mating research and broader psychological theories.
Development and Validation of a Practical Two-Step Prediction Model and Clinical Risk Score for Post-Thrombotic Syndrome.

PubMed

Amin, Elham E; van Kuijk, Sander M J; Joore, Manuela A; Prandoni, Paolo; Cate, Hugo Ten; Cate-Hoek, Arina J Ten

2018-06-04

Post-thrombotic syndrome (PTS) is a common chronic consequence of deep vein thrombosis that affects the quality of life and is associated with substantial costs. In clinical practice, it is not possible to predict the individual patient risk. We develop and validate a practical two-step prediction tool for PTS in the acute and sub-acute phase of deep vein thrombosis. Multivariable regression modelling with data from two prospective cohorts in which 479 (derivation) and 1,107 (validation) consecutive patients with objectively confirmed deep vein thrombosis of the leg, from thrombosis outpatient clinic of Maastricht University Medical Centre, the Netherlands (derivation) and Padua University hospital in Italy (validation), were included. PTS was defined as a Villalta score of ≥ 5 at least 6 months after acute thrombosis. Variables in the baseline model in the acute phase were: age, body mass index, sex, varicose veins, history of venous thrombosis, smoking status, provoked thrombosis and thrombus location. For the secondary model, the additional variable was residual vein obstruction. Optimism-corrected area under the receiver operating characteristic curves (AUCs) were 0.71 for the baseline model and 0.60 for the secondary model. Calibration plots showed well-calibrated predictions. External validation of the derived clinical risk scores was successful: AUC, 0.66 (95% confidence interval [CI], 0.63-0.70) and 0.64 (95% CI, 0.60-0.69). Individual risk for PTS in the acute phase of deep vein thrombosis can be predicted based on readily accessible baseline clinical and demographic characteristics. The individual risk in the sub-acute phase can be predicted with limited additional clinical characteristics. Schattauer GmbH Stuttgart.
External validation and clinical utility of a prediction model for 6-month mortality in patients undergoing hemodialysis for end-stage kidney disease.

PubMed

Forzley, Brian; Er, Lee; Chiu, Helen Hl; Djurdjev, Ognjenka; Martinusen, Dan; Carson, Rachel C; Hargrove, Gaylene; Levin, Adeera; Karim, Mohamud

2018-02-01

End-stage kidney disease is associated with poor prognosis. Health care professionals must be prepared to address end-of-life issues and identify those at high risk for dying. A 6-month mortality prediction model for patients on dialysis derived in the United States is used but has not been externally validated. We aimed to assess the external validity and clinical utility in an independent cohort in Canada. We examined the performance of the published 6-month mortality prediction model, using discrimination, calibration, and decision curve analyses. Data were derived from a cohort of 374 prevalent dialysis patients in two regions of British Columbia, Canada, which included serum albumin, age, peripheral vascular disease, dementia, and answers to the "surprise question" ("Would I be surprised if this patient died within the next year?"). The observed mortality in the validation cohort was 11.5% at 6 months. The prediction model had reasonable discrimination (c-stat = 0.70) but poor calibration (calibration-in-the-large = -0.53 (95% confidence interval: -0.88, -0.18); calibration slope = 0.57 (95% confidence interval: 0.31, 0.83)) in our data. Decision curve analysis showed the model only has added value in guiding clinical decision in a small range of threshold probabilities: 8%-20%. Despite reasonable discrimination, the prediction model has poor calibration in this external study cohort; thus, it may have limited clinical utility in settings outside of where it was derived. Decision curve analysis clarifies limitations in clinical utility not apparent by receiver operating characteristic curve analysis. This study highlights the importance of external validation of prediction models prior to routine use in clinical practice.
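The two calibration figures in the record above can be read as follows. The calibration slope is the coefficient obtained when the observed outcome is regressed on the model's linear predictor (the log-odds of the predicted risk). Calibration-in-the-large is the intercept obtained when that slope is fixed at 1. A slope below 1 suggests predictions that are too extreme, and a negative intercept suggests risk is overestimated on average. The sketch below fits both by Newton-Raphson in plain Python. It is an illustration under stated assumptions, not the study's analysis code. The function names are hypothetical and the inputs are expected to be synthetic or the reader's own data.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def calibration_slope(linear_predictor, outcomes, iterations=50):
    """Fit logit P(y = 1) = a + b * lp by Newton-Raphson; return (a, b)."""
    a, b = 0.0, 1.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(linear_predictor, outcomes):
            p = _sigmoid(a + b * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient with respect to a
            g1 += (y - p) * x      # gradient with respect to b
            h00 += w               # entries of the (negated) Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def calibration_in_the_large(linear_predictor, outcomes, iterations=50):
    """Fit logit P(y = 1) = a + lp (slope fixed at 1); return a."""
    a = 0.0
    for _ in range(iterations):
        g = h = 0.0
        for x, y in zip(linear_predictor, outcomes):
            p = _sigmoid(a + x)
            g += y - p
            h += p * (1.0 - p)
        a += g / h
    return a
```

The fixed iteration count keeps the sketch short; a production version would test for convergence and guard against separable data, where the estimates diverge.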
A Systematic Review of the Reliability and Validity of Behavioural Tests Used to Assess Behavioural Characteristics Important in Working Dogs.

PubMed

Brady, Karen; Cracknell, Nina; Zulch, Helen; Mills, Daniel Simon

2018-01-01

Working dogs are selected based on predictions from tests that they will be able to perform specific tasks in often challenging environments. However, withdrawal from service in working dogs is still a big problem, bringing into question the reliability of the selection tests used to make these predictions. A systematic review was undertaken aimed at bringing together available information on the reliability and predictive validity of the assessment of behavioural characteristics used with working dogs to establish the quality of selection tests currently available for use to predict success in working dogs. The search procedures resulted in 16 papers meeting the criteria for inclusion. A large range of behaviour tests and parameters were used in the identified papers, and so behaviour tests and their underpinning constructs were grouped on the basis of their relationship with positive core affect (willingness to work, human-directed social behaviour, object-directed play tendencies) and negative core affect (human-directed aggression, approach withdrawal tendencies, sensitivity to aversives). We then examined the papers for reports of inter-rater reliability, within-session intra-rater reliability, test-retest validity and predictive validity. The review revealed a widespread lack of information relating to the reliability and validity of measures to assess behaviour and inconsistencies in terminologies, study parameters and indices of success. There is a need to standardise the reporting of these aspects of behavioural tests in order to improve the knowledge base of what characteristics are predictive of optimal performance in working dog roles, improving selection processes and reducing working dog redundancy. We suggest the use of a framework based on explaining the direct or indirect relationship of the test with core affect.
Use of genetic programming, logistic regression, and artificial neural nets to predict readmission after coronary artery bypass surgery.

PubMed

Engoren, Milo; Habib, Robert H; Dooner, John J; Schwann, Thomas A

2013-08-01

As many as 14% of patients undergoing coronary artery bypass surgery are readmitted within 30 days. Readmission is usually the result of morbidity and may lead to death. The purpose of this study is to develop and compare statistical and genetic programming models to predict readmission. Patients were divided into separate Construction and Validation populations. Using 88 variables, logistic regression, genetic programs, and artificial neural nets were used to develop predictive models. Models were first constructed and tested on the Construction populations, then validated on the Validation population. Areas under the receiver operator characteristic curves (AU ROC) were used to compare the models. Two hundred and two patients (7.6%) in the 2,644 patient Construction group and 216 (8.0%) of the 2,711 patient Validation group were re-admitted within 30 days of CABG surgery. Logistic regression predicted readmission with AU ROC = .675 ± .021 in the Construction group. Genetic programs significantly improved the accuracy (AU ROC = .767 ± .001, p < .001). Artificial neural nets were less accurate with AU ROC = 0.597 ± .001 in the Construction group. Predictive accuracy of all three techniques fell in the Validation group. However, the accuracy of genetic programming (AU ROC = .654 ± .001) was still slightly, but not statistically significantly, better than that of the logistic regression (AU ROC = .644 ± .020, p = .61). Genetic programming and logistic regression provide alternative methods to predict readmission that are similarly accurate.

An evidence-based decision assistance model for predicting training outcome in juvenile guide dogs.

PubMed

Harvey, Naomi D; Craigon, Peter J; Blythe, Simon A; England, Gary C W; Asher, Lucy

2017-01-01

Working dog organisations, such as Guide Dogs, need to regularly assess the behaviour of the dogs they train. In this study we developed a questionnaire-style behaviour assessment completed by training supervisors of juvenile guide dogs aged 5, 8 and 12 months old (n = 1,401), and evaluated aspects of its reliability and validity. Specifically, internal reliability, temporal consistency, construct validity, predictive criterion validity (comparing against later training outcome) and concurrent criterion validity (comparing against a standardised behaviour test) were evaluated. Thirty-nine questions were sourced either from previously published literature or created to meet requirements identified via Guide Dogs staff surveys and staff feedback. Internal reliability analyses revealed seven reliable and interpretable trait scales named according to the questions within them as: Adaptability; Body Sensitivity; Distractibility; Excitability; General Anxiety; Trainability and Stair Anxiety. Intra-individual temporal consistency of the scale scores between 5-8, 8-12 and 5-12 months was high. All scales excepting Body Sensitivity showed some degree of concurrent criterion validity. Predictive criterion validity was supported for all seven scales, since associations were found with training outcome at at least one age. Thresholds of z-scores on the scales were identified that were able to distinguish later training outcome by identifying 8.4% of all dogs withdrawn for behaviour and 8.5% of all qualified dogs, with 84% and 85% specificity. The questionnaire assessment was reliable and could detect traits that are consistent within individuals over time, despite juvenile dogs undergoing development during the study period. By applying thresholds to scores produced from the questionnaire this assessment could prove to be a highly valuable decision-making tool for Guide Dogs. 
This is the first questionnaire-style assessment of juvenile dogs that has shown value in predicting the training outcome of individual working dogs.
A biomarker-based risk score to predict death in patients with atrial fibrillation: the ABC (age, biomarkers, clinical history) death risk score

PubMed Central

Hijazi, Ziad; Oldgren, Jonas; Lindbäck, Johan; Alexander, John H; Connolly, Stuart J; Eikelboom, John W; Ezekowitz, Michael D; Held, Claes; Hylek, Elaine M; Lopes, Renato D; Yusuf, Salim; Granger, Christopher B; Siegbahn, Agneta; Wallentin, Lars

2018-01-01

Aims: In atrial fibrillation (AF), mortality remains high despite effective anticoagulation. A model predicting the risk of death in these patients is currently not available. We developed and validated a risk score for death in anticoagulated patients with AF including both clinical information and biomarkers. Methods and results: The new risk score was developed and internally validated in 14 611 patients with AF randomized to apixaban vs. warfarin for a median of 1.9 years. External validation was performed in 8548 patients with AF randomized to dabigatran vs. warfarin for 2.0 years. Biomarker samples were obtained at study entry. Variables significantly contributing to the prediction of all-cause mortality were assessed by Cox-regression. Each variable obtained a weight proportional to the model coefficients. There were 1047 all-cause deaths in the derivation and 594 in the validation cohort. The most important predictors of death were N-terminal pro B-type natriuretic peptide, troponin-T, growth differentiation factor-15, age, and heart failure, and these were included in the ABC (Age, Biomarkers, Clinical history)-death risk score. The score was well-calibrated and yielded higher c-indices than a model based on all clinical variables in both the derivation (0.74 vs. 0.68) and validation cohorts (0.74 vs. 0.67). The reduction in mortality with apixaban was most pronounced in patients with a high ABC-death score. Conclusion: A new biomarker-based score for predicting risk of death in anticoagulated AF patients was developed, internally and externally validated, and well-calibrated in two large cohorts. The ABC-death risk score performed well and may contribute to overall risk assessment in AF. ClinicalTrials.gov identifier NCT00412984 and NCT00262600. PMID:29069359
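As an illustration of the c-index reported above, the following is a minimal sketch of a concordance index for right-censored survival data. It assumes Harrell's pairwise definition; the record does not state which variant the authors used, and the function name, argument names, and toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style concordance index (assumed variant, not confirmed by the record).

    times: follow-up time per patient
    events: 1 if death was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died before patient j's last follow-up.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier death had the higher risk score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # ties in score count as half
    return concordant / comparable


# Toy data (invented for illustration only).
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.5, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.74 and 0.68 should be read.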
Development and validation of a prognostic nomogram for terminally ill cancer patients.

PubMed

Feliu, Jaime; Jiménez-Gordo, Ana María; Madero, Rosario; Rodríguez-Aizcorbe, José Ramón; Espinosa, Enrique; Castro, Javier; Acedo, Jesús Domingo; Martínez, Beatriz; Alonso-Babarro, Alberto; Molina, Raquel; Cámara, Juan Carlos; García-Paredes, María Luisa; González-Barón, Manuel

2011-11-02

Determining life expectancy in terminally ill cancer patients is a difficult task. We aimed to develop and validate a nomogram to predict the length of survival in patients with terminal disease. From February 1, 2003, to December 31, 2005, 406 consecutive terminally ill patients were entered into the study. We analyzed 38 features prognostic of life expectancy among terminally ill patients by multivariable Cox regression and identified the most accurate and parsimonious model by backward variable elimination according to the Akaike information criterion. Five clinical and laboratory variables were built into a nomogram to estimate the probability of patient survival at 15, 30, and 60 days. We validated and calibrated the nomogram with an external validation cohort of 474 patients who were treated from June 1, 2006, through December 31, 2007. The median overall survival was 29.1 days for the training set and 18.3 days for the validation set. Eastern Cooperative Oncology Group performance status, lactate dehydrogenase levels, lymphocyte levels, albumin levels, and time from initial diagnosis to diagnosis of terminal disease were retained in the multivariable Cox proportional hazards model as independent prognostic factors of survival and formed the basis of the nomogram. The nomogram had high predictive performance, with a bootstrapped corrected concordance index of 0.70, and it showed good calibration. External independent validation revealed 68% predictive accuracy. We developed a highly accurate tool that uses basic clinical and analytical information to predict the probability of survival at 15, 30, and 60 days in terminally ill cancer patients. This tool can help physicians making decisions on clinical care at the end of life.
Personalized Prediction of Psychosis: External validation of the NAPLS2 Psychosis Risk Calculator with the EDIPPP project

PubMed Central

Carrión, Ricardo E.; Cornblatt, Barbara A.; Burton, Cynthia Z.; Tso, Ivy F; Auther, Andrea; Adelsheim, Steven; Calkins, Roderick; Carter, Cameron S.; Niendam, Tara; Taylor, Stephan F.; McFarlane, William R.

2016-01-01

Objective: In the current issue, Cannon and colleagues, as part of the second phase of the North American Prodrome Longitudinal Study (NAPLS2), report on a risk calculator for the individualized prediction of developing a psychotic disorder in a 2-year period. The present study represents an external validation of the NAPLS2 psychosis risk calculator using an independent sample of subjects at clinical high risk for psychosis collected as part of the Early Detection, Intervention, and Prevention of Psychosis Program (EDIPPP). Methods: 176 subjects with follow-up (from the total EDIPPP sample of 210) rated as clinical high-risk (CHR) based on the Structured Interview for Prodromal Syndromes were used to construct a new prediction model with the 6 significant predictor variables in the NAPLS2 psychosis risk calculator (unusual thoughts, suspiciousness, Symbol Coding, verbal learning, social functioning decline, baseline age, and family history). Discrimination performance was assessed with the area under the receiver operating curve (AUC). The NAPLS2 risk calculator was then used to generate a psychosis risk estimate for each case in the external validation sample. Results: The external validation model showed good discrimination, with an AUC of 0.79 (95% CI 0.644–0.937). In addition, the personalized risk generated by the NAPLS calculator provided a solid estimation of the actual conversion outcome in the validation sample. Conclusions: In the companion papers in this issue, two independent samples of CHR subjects converge to validate the NAPLS2 psychosis risk calculator. This prediction calculator represents a meaningful step towards early intervention and personalized treatment of psychotic disorders. PMID:27363511
Effect of response format on cognitive reflection: Validating a two- and four-option multiple choice question version of the Cognitive Reflection Test.

PubMed

Sirota, Miroslav; Juanchich, Marie

2018-03-27

The Cognitive Reflection Test, measuring intuition inhibition and cognitive reflection, has become extremely popular because it reliably predicts reasoning performance, decision-making, and beliefs. Across studies, the response format of CRT items sometimes differs, based on the assumed construct equivalence of tests with open-ended versus multiple-choice items (the equivalence hypothesis). Evidence and theoretical reasons, however, suggest that the cognitive processes measured by these response formats and their associated performances might differ (the nonequivalence hypothesis). We tested the two hypotheses experimentally by assessing the performance in tests with different response formats and by comparing their predictive and construct validity. In a between-subjects experiment (n = 452), participants answered stem-equivalent CRT items in an open-ended, a two-option, or a four-option response format and then completed tasks on belief bias, denominator neglect, and paranormal beliefs (benchmark indicators of predictive validity), as well as on actively open-minded thinking and numeracy (benchmark indicators of construct validity). We found no significant differences between the three response formats in the numbers of correct responses, the numbers of intuitive responses (with the exception of the two-option version, which had a higher number than the other tests), and the correlational patterns of the indicators of predictive and construct validity. All three test versions were similarly reliable, but the multiple-choice formats were completed more quickly. We speculate that the specific nature of the CRT items helps build construct equivalence among the different response formats. We recommend using the validated multiple-choice version of the CRT presented here, particularly the four-option CRT, for practical and methodological reasons. Supplementary materials and data are available at https://osf.io/mzhyc/ .
Validation of a new mortality risk prediction model for people 65 years and older in northwest Russia: The Crystal risk score.

PubMed

Turusheva, Anna; Frolova, Elena; Bert, Vaes; Hegendoerfer, Eralda; Degryse, Jean-Marie

2017-07-01

Prediction models help to make decisions about further management in clinical practice. This study aims to develop a mortality risk score based on previously identified risk predictors and to perform internal and external validations. In a population-based prospective cohort study of 611 community-dwelling individuals aged 65+ in St. Petersburg (Russia), all-cause mortality risks over 2.5 years follow-up were determined based on the results obtained from anthropometry, medical history, physical performance tests, spirometry and laboratory tests. C-statistic, risk reclassification analysis, integrated discrimination improvement analysis, decision curves analysis, internal validation and external validation were performed. Older adults were at higher risk for mortality [HR (95%CI)=4.54 (3.73-5.52)] when two or more of the following components were present: poor physical performance, low muscle mass, poor lung function, and anemia. If anemia was combined with high C-reactive protein (CRP) and high B-type natriuretic peptide (BNP) was added the HR (95%CI) was slightly higher (5.81 (4.73-7.14)) even after adjusting for age, sex and comorbidities. Our models were validated in an external population of adults 80+. The extended model had a better predictive capacity for cardiovascular mortality [HR (95%CI)=5.05 (2.23-11.44)] compared to the baseline model [HR (95%CI)=2.17 (1.18-4.00)] in the external population. We developed and validated a new risk prediction score that may be used to identify older adults at higher risk for mortality in Russia. Additional studies need to determine which targeted interventions improve the outcomes of these at-risk individuals. Copyright © 2017 Elsevier B.V. All rights reserved.
Independent validation of a new reirradiation risk score (RRRS) for glioma patients predicting post-recurrence survival: A multicenter DKTK/ROG analysis.

PubMed

Niyazi, Maximilian; Adeberg, Sebastian; Kaul, David; Boulesteix, Anne-Laure; Bougatf, Nina; Fleischmann, Daniel F; Grün, Arne; Krämer, Anna; Rödel, Claus; Eckert, Franziska; Paulsen, Frank; Kessel, Kerstin A; Combs, Stephanie E; Oehlke, Oliver; Grosu, Anca-Ligia; Seidlitz, Annekatrin; Lattermann, Annika; Krause, Mechthild; Baumann, Michael; Guberina, Maja; Stuschke, Martin; Budach, Volker; Belka, Claus; Debus, Jürgen

2018-04-01

Reirradiation (reRT) is a valid option with considerable efficacy in patients with recurrent high-grade glioma, but it is still not known which patients might be optimal candidates for a second course of irradiation. This study validated a newly developed prognostic score independently in an external patient cohort. The reRT risk score (RRRS) is based on a linear combination of initial histology, clinical performance status, and age derived from a multivariable model of 353 patients. This score can predict post-recurrence survival (PRS) after reRT. The validation dataset consisted of 212 patients. The RRRS differentiates three prognostic groups. Discrimination and calibration were maintained in the validation group. Median PRS times in the development cohort for the good/intermediate/poor risk categories were 14.2, 9.1, and 5.3 months, respectively. The respective groups within the validation cohort displayed median PRS times of 13.8, 8.8, and 3.8 months, respectively. Uno's C for development data was 0.64 (CI: 0.60-0.69) and for validation data 0.63 (CI: 0.58-0.68). The RRRS has been successfully validated in an independent patient cohort. This linear combination of three easily determined clinicopathological factors allows for a reliable classification of patients and may be used as stratification factor for future trials. Copyright © 2018 Elsevier B.V. All rights reserved.
[The Basel Screening Instrument for Psychosis (BSIP): development, structure, reliability and validity].

PubMed

Riecher-Rössler, A; Aston, J; Ventura, J; Merlo, M; Borgwardt, S; Gschwandtner, U; Stieglitz, R-D

2008-04-01

Early detection of psychosis is of growing clinical importance. So far there is, however, no screening instrument with sufficient validity for detecting individuals with beginning psychosis in the atypical early stages of the disease. We have therefore developed the Basel Screening Instrument for Psychosis (BSIP) and tested its feasibility, interrater reliability and validity. The aim of this paper is to describe the development and structure of the instrument, as well as to report the results of the studies on reliability and validity. The instrument was developed based on a comprehensive search of literature on the most important risk factors and early signs of schizophrenic psychoses. The interrater reliability study was conducted on 24 psychiatric cases. Validity was tested based on 206 individuals referred to our early detection clinic from 3/1/2000 until 2/28/2003. We identified seven categories of relevance for early detection of psychosis and used them to construct a semistructured interview. Interrater reliability for high-risk individuals was high (kappa = .87). Predictive validity was comparable to other, more comprehensive instruments: 16 (32%) of 50 individuals classified as being at risk for psychosis by the BSIP did in fact develop frank psychosis within a follow-up period of two to five years. The BSIP is the first screening instrument for the early detection of psychosis which has been validated based on transition to psychosis. The BSIP is easy to use by experienced psychiatrists and has very good interrater reliability and predictive validity.
[Comparison of the Wechsler Memory Scale-III and the Spain-Complutense Verbal Learning Test in acquired brain injury: construct validity and ecological validity].

PubMed

Luna-Lario, P; Pena, J; Ojeda, N

2017-04-16

To perform an in-depth examination of the construct validity and the ecological validity of the Wechsler Memory Scale-III (WMS-III) and the Spain-Complutense Verbal Learning Test (TAVEC). The sample consists of 106 adults with acquired brain injury who were treated in the Area of Neuropsychology and Neuropsychiatry of the Complejo Hospitalario de Navarra and displayed memory deficit as the main sequela, measured by means of specific memory tests. The construct validity is determined by examining the tasks required in each test over the basic theoretical models, comparing the performance according to the parameters offered by the tests, contrasting the severity indices of each test and analysing their convergence. The external validity is explored through the correlation between the tests and by using regression models. According to the results obtained, both the WMS-III and the TAVEC have construct validity. The TAVEC is more sensitive and captures not only the deficits in mnemonic consolidation, but also in the executive functions involved in memory. The working memory index of the WMS-III is useful for predicting the return to work at two years after the acquired brain injury, but none of the instruments anticipates the disability and dependence at least six months after the injury. We reflect upon the construct validity of the tests and their insufficient capacity to predict functionality when the sequelae become chronic.
Cross-cultural adaptation and psychometric evaluation of oral health impact profile among school teacher community

PubMed Central

Vyas, Shaleen; Nagarajappa, Sandesh; Dasar, Pralhad L.; Mishra, Prashant

2018-01-01

AIM: To translate OHIP-14 into Hindi and test its psychometric properties among school teacher community. METHODS: The OHIP-14 was translated to OHIP-14-H using WHO recommended translation protocol. During pre-testing, an expert panel assessed content validity of the questionnaire. Face validity was assessed on a sample of 10 individuals. The OHIP-14-H was administered on a random sample of 170 primary school teachers. Internal consistency and test-retest reliability were assessed using Cronbach's alpha and Intra-class correlation coefficient (ICC) respectively, with 2 weeks interval. Predictive validity was tested by comparing OHIP-14-H scores with clinical parameters. The concurrent validity was assessed using self-reported oral health and discriminant validity was ascertained through negative association with sociodemographic variables. RESULTS: The mean OHIP-14-H score was 9.57 (S.D = 4.58). ICC and Cronbach's alpha for OHIP-14-H was 0.96 and 0.92 respectively. Concurrent validity using binomial regression model indicated that good (OR = 0.56, 95% CI = 0.55 – 4.47) and moderate (OR = 0.25, 95% CI = 0.17 – 1.87) OHIP-14-H scores were negative but significant risk indicators of poor self reported oral health (P < 0.009). Significant predictive validity was observed between OHIP-14-H scores and clinical parameters (P < 0.000). CONCLUSION: Translated and culturally adapted OHIP-14-H indicates good reliability and validity among primary school teachers. PMID:29417064
Comment on Hall et al. (2017), "How to Choose Between Measures of Tinnitus Loudness for Clinical Research? A Report on the Reliability and Validity of an Investigator-Administered Test and a Patient-Reported Measure Using Baseline Data Collected in a Phase IIa Drug Trial".

PubMed

Sabour, Siamak

2018-03-08

The purpose of this letter, in response to Hall, Mehta, and Fackrell (2017), is to provide important knowledge about methodology and statistical issues in assessing the reliability and validity of an audiologist-administered tinnitus loudness matching test and a patient-reported tinnitus loudness rating. The author uses reference textbooks and published articles regarding scientific assessment of the validity and reliability of a clinical test to discuss the statistical test and the methodological approach in assessing validity and reliability in clinical research. Depending on the type of the variable (qualitative or quantitative), well-known statistical tests can be applied to assess reliability and validity. The qualitative variables of sensitivity, specificity, positive predictive value, negative predictive value, false positive and false negative rates, likelihood ratio positive and likelihood ratio negative, as well as odds ratio (i.e., ratio of true to false results), are the most appropriate estimates to evaluate validity of a test compared to a gold standard. In the case of quantitative variables, depending on distribution of the variable, Pearson r or Spearman rho can be applied. Diagnostic accuracy (validity) and diagnostic precision (reliability or agreement) are two completely different methodological issues. Depending on the type of the variable (qualitative or quantitative), well-known statistical tests can be applied to assess validity.
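The estimates named in the letter above can be computed from a 2x2 table of test results against a gold standard. The following is a minimal sketch using the standard textbook definitions; the function name and the example counts are invented for illustration and do not come from the letter or from Hall et al.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy estimates for a binary test compared with a gold standard.

    tp, fp, fn, tn: counts of true positives, false positives,
    false negatives and true negatives.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                            # positive predictive value
        "npv": tn / (tn + fn),                            # negative predictive value
        "false_positive_rate": 1 - specificity,
        "false_negative_rate": 1 - sensitivity,
        "lr_positive": sensitivity / (1 - specificity),   # likelihood ratio positive
        "lr_negative": (1 - sensitivity) / specificity,   # likelihood ratio negative
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = diagnostic_metrics(tp=40, fp=10, fn=20, tn=130)
```

The sketch does not guard against zero cells; a table with no false positives or no false negatives would need a continuity correction or separate handling.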
Number of organ dysfunctions predicts mortality in emergency department patients with suspected infection: a multicenter validation study.

PubMed

Jessen, Marie K; Skibsted, Simon; Shapiro, Nathan I

2017-06-01

The aim of this study was to validate the association between number of organ dysfunctions and mortality in emergency department (ED) patients with suspected infection. This study was conducted at two medical care center EDs. The internal validation set was a prospective cohort study conducted in Boston, USA. The external validation set was a retrospective case-control study conducted in Aarhus, Denmark. The study included adult patients (>18 years) with clinically suspected infection. Laboratory results and clinical data were used to assess organ dysfunctions. Inhospital mortality was the outcome measure. Multivariate logistic regression was used to determine the independent mortality odds for number and types of organ dysfunctions. We enrolled 4952 (internal) and 483 (external) patients. The mortality rate significantly increased with increasing number of organ dysfunctions: internal validation: 0 organ dysfunctions: 0.5% mortality, 1: 3.6%, 2: 9.5%, 3: 17%, and 4 or more: 37%; external validation: 2.2, 6.7, 17, 41, and 57% mortality (both P<0.001 for trend). Age-adjusted and comorbidity-adjusted number of organ dysfunctions remained an independent predictor. The effect of specific types of organ dysfunction on mortality was most pronounced for hematologic [odds ratio (OR) 3.3 (95% confidence interval (CI) 2.0-5.4)], metabolic [OR 3.3 (95% CI 2.4-4.6); internal validation], and cardiovascular dysfunctions [OR 14 (95% CI 3.7-50); external validation]. The number of organ dysfunctions predicts sepsis mortality.
Personalized prediction of chronic wound healing: an exponential mixed effects model using stereophotogrammetric measurement.

PubMed

Xu, Yifan; Sun, Jiayang; Carter, Rebecca R; Bogie, Kath M

2014-05-01

Stereophotogrammetric digital imaging enables rapid and accurate detailed 3D wound monitoring. This rich data source was used to develop a statistically validated model to provide personalized predictive healing information for chronic wounds. 147 valid wound images were obtained from a sample of 13 category III/IV pressure ulcers from 10 individuals with spinal cord injury. Statistical comparison of several models indicated the best fit for the clinical data was a personalized mixed-effects exponential model (pMEE), with initial wound size and time as predictors and observed wound size as the response variable. Random effects capture personalized differences. Other models are only valid when wound size constantly decreases. This is often not achieved for clinical wounds. Our model accommodates this reality. Two criteria to determine effective healing time outcomes are proposed: r-fold wound size reduction time, t(r-fold), is defined as the time when wound size reduces to 1/r of initial size. t(δ) is defined as the time when the rate of the wound healing/size change reduces to a predetermined threshold δ < 0. Healing rate differs from patient to patient. Model development and validation indicates that accurate monitoring of wound geometry can adaptively predict healing progression and that larger wounds heal more rapidly. Accuracy of the prediction curve in the current model improves with each additional evaluation. Routine assessment of wounds using detailed stereophotogrammetric imaging can provide personalized predictions of wound healing time. Application of a valid model will help the clinical team to determine wound management care pathways. Published by Elsevier Ltd.
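The two healing-time criteria defined above have closed forms under a simple exponential decay. The sketch below assumes a single-patient curve S(t) = S0 * exp(-k t) with a positive rate constant k; this is a simplification introduced here for illustration, not the authors' personalized mixed-effects model, and every name in it is hypothetical.

```python
import math


def r_fold_time(k, r):
    """Time at which wound size falls to 1/r of its initial size.

    Assumes S(t) = S0 * exp(-k * t), so S(t) = S0 / r gives t = ln(r) / k.
    """
    return math.log(r) / k


def rate_threshold_time(s0, k, delta):
    """Time at which the rate of size change reaches the threshold delta < 0.

    Under the same assumption dS/dt = -k * S0 * exp(-k * t); setting this
    equal to delta gives t = ln(k * S0 / -delta) / k.
    """
    if delta >= 0:
        raise ValueError("delta must be negative")
    # If the initial rate is already slower than the threshold, report time zero.
    return max(0.0, math.log(k * s0 / -delta) / k)


# Hypothetical parameters: initial size 10 (arbitrary units), k = 0.1 per week.
t_half = r_fold_time(k=0.1, r=2)
t_delta = rate_threshold_time(s0=10.0, k=0.1, delta=-0.5)
```

Because the abstract notes that clinical wound size often does not decrease monotonically, these closed forms should be read only as the idealized case of the two definitions.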
An Optimized Transient Dual Luciferase Assay for Quantifying MicroRNA Directed Repression of Targeted Sequences

PubMed Central

Moyle, Richard L.; Carvalhais, Lilia C.; Pretorius, Lara-Simone; Nowak, Ekaterina; Subramaniam, Gayathery; Dalton-Morgan, Jessica; Schenk, Peer M.

2017-01-01

Studies investigating the action of small RNAs on computationally predicted target genes require some form of experimental validation. Classical molecular methods of validating microRNA action on target genes are laborious, while approaches that tag predicted target sequences to qualitative reporter genes encounter technical limitations. The aim of this study was to address the challenge of experimentally validating large numbers of computationally predicted microRNA-target transcript interactions using an optimized, quantitative, cost-effective, and scalable approach. The presented method combines transient expression via agroinfiltration of Nicotiana benthamiana leaves with a quantitative dual luciferase reporter system, where firefly luciferase is used to report the microRNA-target sequence interaction and Renilla luciferase is used as an internal standard to normalize expression between replicates. We report the appropriate concentration of N. benthamiana leaf extracts and dilution factor to apply in order to avoid inhibition of firefly LUC activity. Furthermore, the optimal ratio of microRNA precursor expression construct to reporter construct and duration of the incubation period post-agroinfiltration were determined. The optimized dual luciferase assay provides an efficient, repeatable and scalable method to validate and quantify microRNA action on predicted target sequences. The optimized assay was used to validate five predicted targets of rice microRNA miR529b, with as few as six technical replicates. The assay can be extended to assess other small RNA-target sequence interactions, including assessing the functionality of an artificial miRNA or an RNAi construct on a targeted sequence. PMID:28979287
Incremental Validity of the Trait Emotional Intelligence Questionnaire-Short Form (TEIQue-SF).

PubMed

Siegling, A B; Vesely, Ashley K; Petrides, K V; Saklofske, Donald H

2015-01-01

This study examined the incremental validity of the adult short form of the Trait Emotional Intelligence Questionnaire (TEIQue-SF) in predicting 7 construct-relevant criteria beyond the variance explained by the Five-factor model and coping strategies. Additionally, the relative contributions of the questionnaire's 4 subscales were assessed. Two samples of Canadian university students completed the TEIQue-SF, along with measures of the Big Five, coping strategies (Sample 1 only), and emotion-laden criteria. The TEIQue-SF showed consistent incremental effects beyond the Big Five or the Big Five and coping strategies, predicting all 7 criteria examined across the 2 samples. Furthermore, 2 of the 4 TEIQue-SF subscales accounted for the measure's incremental validity. Although the findings provide good support for the validity and utility of the TEIQue-SF, directions for further research are emphasized.
Validity of the Miller forensic assessment of symptoms test in psychiatric inpatients.

PubMed

Veazey, Connie H; Wagner, Alisha L; Hays, J Ray; Miller, Holly A

2005-06-01

This study investigated the validity of the Miller Forensic Assessment of Symptoms Test (M-FAST), a brief measure of malingering, in an inpatient psychiatric sample of 70. Among those patients who also completed the Personality Assessment Inventory (N=44), Total M-FAST score was related in the expected directions to the Personality Assessment Inventory validity scales and indexes, providing evidence for concurrent validity of the M-FAST. With the PAI malingering index used as a criterion, we examined the diagnostic efficiency of the M-FAST and found a cut score of 8 represented the best balance of sensitivity, specificity, positive predictive power, and negative predictive power. Based on this cut-score of 8, 16% of the population was classified as malingering. The M-FAST appears to be an excellent rapid screen for symptom exaggeration in this population and setting.
Validation of a 4-item Negative Symptom Assessment (NSA-4): a short, practical clinical tool for the assessment of negative symptoms in schizophrenia.

PubMed

Alphs, Larry; Morlock, Robert; Coon, Cheryl; Cazorla, Pilar; Szegedi, Armin; Panagides, John

2011-06-01

The 16-item Negative Symptom Assessment (NSA-16) scale is a validated tool for evaluating negative symptoms of schizophrenia. The psychometric properties and predictive power of a four-item version (NSA-4) were compared with the NSA-16. Baseline data from 561 patients with predominant negative symptoms of schizophrenia who participated in two identically designed clinical trials were evaluated. Ordered logistic regression analysis of ratings using NSA-4 and NSA-16 were compared with ratings using several other standard tools to determine predictive validity and construct validity. Internal consistency and test-retest reliability were also analyzed. NSA-16 and NSA-4 scores were both predictive of scores on the NSA global rating (odds ratio = 0.83-0.86) and the Clinical Global Impressions-Severity scale (odds ratio = 0.91-0.93). NSA-16 and NSA-4 showed high correlation with each other (Pearson r = 0.85), similar high correlation with other measures of negative symptoms (demonstrating convergent validity), and lesser correlations with measures of other forms of psychopathology (demonstrating divergent validity). NSA-16 and NSA-4 both showed acceptable internal consistency (Cronbach α, 0.85 and 0.64, respectively) and test-retest reliability (intraclass correlation coefficient, 0.87 and 0.82). This study demonstrates that NSA-4 offers accuracy comparable to the NSA-16 in rating negative symptoms in patients with schizophrenia. Copyright © 2011 John Wiley & Sons, Ltd.
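For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of Cronbach's α from raw item scores, using the standard formula α = k/(k-1) * (1 - Σ item variances / variance of total scores). The function name and the toy scores are invented; no NSA data are reproduced here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of scores per item, all in the same respondent order.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # total score per respondent
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Toy four-item scale rated for five respondents (invented numbers).
toy_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
    [2, 1, 4, 4, 5],
]
toy_alpha = cronbach_alpha(toy_items)
```

With fewer items, α tends to be lower for the same inter-item correlation, which is one common reading of the drop from 0.85 for the 16-item scale to 0.64 for the four-item version.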
Validity and reliability of three definitions of hip osteoarthritis: cross sectional and longitudinal approach.

PubMed

Reijman, M; Hazes, J M W; Pols, H A P; Bernsen, R M D; Koes, B W; Bierma-Zeinstra, S M A

2004-11-01

To compare the reliability and validity in a large open population of three frequently used radiological definitions of hip osteoarthritis (OA): Kellgren and Lawrence grade, minimal joint space (MJS), and Croft grade; and to investigate whether the validity of the three definitions of hip OA is sex dependent. Subjects from the Rotterdam study (aged ≥ 55 years, n = 3585) were evaluated. The inter-rater reliability was tested in a random set of 148 x-rays. The validity was expressed as the ability to identify patients who show clinical symptoms of hip OA (construct validity) and as the ability to predict total hip replacement (THR) at follow up (predictive validity). Inter-rater reliability was similar for the Kellgren and Lawrence grade and MJS (kappa statistics 0.68 and 0.62, respectively) but lower for Croft's grade (kappa statistic, 0.51). The Kellgren and Lawrence grade and MJS showed the strongest associations with clinical symptoms of hip OA. Sex appeared to be an effect modifier for Kellgren and Lawrence and MJS definitions, women showing a stronger association between grading and symptoms than men. However, the sex dependency was attributed to differences in height between women and men. The Kellgren and Lawrence grade showed the highest predictive value for THR at follow up. Based on these findings, Kellgren and Lawrence still appears to be a useful OA definition for epidemiological studies focusing on the presence of hip OA.
Exploring discrepancies between quantitative validation results and the geomorphic plausibility of statistical landslide susceptibility maps

NASA Astrophysics Data System (ADS)

Steger, Stefan; Brenning, Alexander; Bell, Rainer; Petschko, Helene; Glade, Thomas

2016-06-01

Empirical models are frequently applied to produce landslide susceptibility maps for large areas. Subsequent quantitative validation results are routinely used as the primary criteria to infer the validity and applicability of the final maps or to select one of several models. This study hypothesizes that such direct deductions can be misleading. The main objective was to explore discrepancies between the predictive performance of a landslide susceptibility model and the geomorphic plausibility of subsequent landslide susceptibility maps while a particular emphasis was placed on the influence of incomplete landslide inventories on modelling and validation results. The study was conducted within the Flysch Zone of Lower Austria (1,354 km2) which is known to be highly susceptible to landslides of the slide-type movement. Sixteen susceptibility models were generated by applying two statistical classifiers (logistic regression and generalized additive model) and two machine learning techniques (random forest and support vector machine) separately for two landslide inventories of differing completeness and two predictor sets. The results were validated quantitatively by estimating the area under the receiver operating characteristic curve (AUROC) with single holdout and spatial cross-validation technique. The heuristic evaluation of the geomorphic plausibility of the final results was supported by findings of an exploratory data analysis, an estimation of odds ratios and an evaluation of the spatial structure of the final maps. The results showed that maps generated by different inventories, classifiers and predictors appeared differently while holdout validation revealed similar high predictive performances. Spatial cross-validation proved useful to expose spatially varying inconsistencies of the modelling results while additionally providing evidence for slightly overfitted machine learning-based models. 
However, the highest predictive performances were obtained for maps that explicitly expressed geomorphically implausible relationships indicating that the predictive performance of a model might be misleading in the case a predictor systematically relates to a spatially consistent bias of the inventory. Furthermore, we observed that random forest-based maps displayed spatial artifacts. The most plausible susceptibility map of the study area showed smooth prediction surfaces while the underlying model revealed a high predictive capability and was generated with an accurate landslide inventory and predictors that did not directly describe a bias. However, none of the presented models was found to be completely unbiased. This study showed that high predictive performances cannot be equated with a high plausibility and applicability of subsequent landslide susceptibility maps. We suggest that greater emphasis should be placed on identifying confounding factors and biases in landslide inventories. A joint discussion between modelers and decision makers of the spatial pattern of the final susceptibility maps in the field might increase their acceptance and applicability.
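The contrast between holdout and spatial cross-validation described in this abstract can be illustrated with a minimal sketch. It is not the authors' workflow: the square-block partitioning, the `fit` callable (assumed to take training features and labels and return a scoring function) and all names below are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def spatial_blocks(coords, block_size):
    """Group sample indices by the square grid block containing each (x, y)."""
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        blocks[(int(x // block_size), int(y // block_size))].append(i)
    return list(blocks.values())


def spatial_cv_auroc(coords, features, labels, fit, block_size):
    """Hold out one spatial block at a time; return the AUROC of each test block."""
    results = []
    for test_idx in spatial_blocks(coords, block_size):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        # `fit` is a hypothetical user-supplied trainer returning a scoring function
        score = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        test_labels = [labels[i] for i in test_idx]
        if len(set(test_labels)) < 2:
            continue  # AUROC is undefined when a block holds only one class
        results.append(auroc([score(features[i]) for i in test_idx], test_labels))
    return results
```

Because every test block is spatially separated from its training data, a spread of per-block AUROC values can hint at the spatially varying inconsistencies the abstract mentions, which a single random holdout would tend to average away.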
Validity Assessment of 5 Day Repeated Forced-Swim Stress to Model Human Depression in Young-Adult C57BL/6J and BALB/cJ Mice

PubMed Central

Mul, Joram D.; Zheng, Jia; Goodyear, Laurie J.

2016-01-01

The development of animal models with construct, face, and predictive validity to accurately model human depression has been a major challenge. One proposed rodent model is the 5 d repeated forced swim stress (5d-RFSS) paradigm, which progressively increases floating during individual swim sessions. The onset and persistence of this floating behavior has been anthropomorphically characterized as a measure of depression. This interpretation has been under debate because a progressive increase in floating over time may reflect an adaptive learned behavioral response promoting survival, and not depression (Molendijk and de Kloet, 2015). To assess construct and face validity, we applied 5d-RFSS to C57BL/6J and BALB/cJ mice, two mouse strains commonly used in neuropsychiatric research, and measured a combination of emotional, homeostatic, and psychomotor symptoms indicative of a depressive-like state. We also compared the efficacy of 5d-RFSS and chronic social defeat stress (CSDS), a validated depression model, to induce a depressive-like state in C57BL/6J mice. In both strains, 5d-RFSS progressively increased floating behavior that persisted for at least 4 weeks. 5d-RFSS did not alter sucrose preference, body weight, appetite, locomotor activity, anxiety-like behavior, or immobility behavior during a tail-suspension test compared with nonstressed controls. In contrast, CSDS altered several of these parameters, suggesting a depressive-like state. Finally, predictive validity was assessed using voluntary wheel running (VWR), a known antidepressant intervention. Four weeks of VWR after 5d-RFSS normalized floating behavior toward nonstressed levels. These observations suggest that 5d-RFSS has no construct or face validity but might have predictive validity to model human depression. PMID:28058270

Validity Assessment of 5 Day Repeated Forced-Swim Stress to Model Human Depression in Young-Adult C57BL/6J and BALB/cJ Mice.

PubMed

Mul, Joram D; Zheng, Jia; Goodyear, Laurie J

2016-01-01

The development of animal models with construct, face, and predictive validity to accurately model human depression has been a major challenge. One proposed rodent model is the 5 d repeated forced swim stress (5d-RFSS) paradigm, which progressively increases floating during individual swim sessions. The onset and persistence of this floating behavior has been anthropomorphically characterized as a measure of depression. This interpretation has been under debate because a progressive increase in floating over time may reflect an adaptive learned behavioral response promoting survival, and not depression (Molendijk and de Kloet, 2015). To assess construct and face validity, we applied 5d-RFSS to C57BL/6J and BALB/cJ mice, two mouse strains commonly used in neuropsychiatric research, and measured a combination of emotional, homeostatic, and psychomotor symptoms indicative of a depressive-like state. We also compared the efficacy of 5d-RFSS and chronic social defeat stress (CSDS), a validated depression model, to induce a depressive-like state in C57BL/6J mice. In both strains, 5d-RFSS progressively increased floating behavior that persisted for at least 4 weeks. 5d-RFSS did not alter sucrose preference, body weight, appetite, locomotor activity, anxiety-like behavior, or immobility behavior during a tail-suspension test compared with nonstressed controls. In contrast, CSDS altered several of these parameters, suggesting a depressive-like state. Finally, predictive validity was assessed using voluntary wheel running (VWR), a known antidepressant intervention. Four weeks of VWR after 5d-RFSS normalized floating behavior toward nonstressed levels. These observations suggest that 5d-RFSS has no construct or face validity but might have predictive validity to model human depression.
Development of an Itemwise Efficiency Scoring Method: Concurrent, Convergent, Discriminant, and Neuroimaging-Based Predictive Validity Assessed in a Large Community Sample

PubMed Central

Moore, Tyler M.; Reise, Steven P.; Roalf, David R.; Satterthwaite, Theodore D.; Davatzikos, Christos; Bilker, Warren B.; Port, Allison M.; Jackson, Chad T.; Ruparel, Kosha; Savitt, Adam P.; Baron, Robert B.; Gur, Raquel E.; Gur, Ruben C.

2016-01-01

Traditional “paper-and-pencil” testing is imprecise in measuring speed and hence limited in assessing performance efficiency, but computerized testing permits precision in measuring itemwise response time. We present a method of scoring performance efficiency (combining information from accuracy and speed) at the item level. Using a community sample of 9,498 youths age 8-21, we calculated item-level efficiency scores on four neurocognitive tests, and compared the concurrent, convergent, discriminant, and predictive validity of these scores to simple averaging of standardized speed and accuracy-summed scores. Concurrent validity was measured by the scores' abilities to distinguish men from women and their correlations with age; convergent and discriminant validity were measured by correlations with other scores inside and outside of their neurocognitive domains; predictive validity was measured by correlations with brain volume in regions associated with the specific neurocognitive abilities. Results provide support for the ability of itemwise efficiency scoring to detect signals as strong as those detected by standard efficiency scoring methods. We find no evidence of superior validity of the itemwise scores over traditional scores, but point out several advantages of the former. The itemwise efficiency scoring method shows promise as an alternative to standard efficiency scoring methods, with overall moderate support from tests of four different types of validity. This method allows the use of existing item analysis methods and provides the convenient ability to adjust the overall emphasis of accuracy versus speed in the efficiency score, thus adjusting the scoring to the real-world demands the test is aiming to fulfill. PMID:26866796
Validating Pseudo-dynamic Source Models against Observed Ground Motion Data at the SCEC Broadband Platform, Ver 16.5

NASA Astrophysics Data System (ADS)

Song, S. G.

2016-12-01

Simulation-based ground motion prediction approaches have several benefits over empirical ground motion prediction equations (GMPEs). For instance, full 3-component waveforms can be produced and site-specific hazard analysis is also possible. However, it is important to validate them against observed ground motion data to confirm their efficiency and validity before practical use. There have been community efforts for these purposes, which are supported by the Broadband Platform (BBP) project at the Southern California Earthquake Center (SCEC). In simulation-based ground motion prediction approaches, preparing a possible range of scenario rupture models is a critical element. I developed a pseudo-dynamic source model for Mw 6.5-7.0 by analyzing a number of dynamic rupture models, based on 1-point and 2-point statistics of earthquake source parameters (Song et al. 2014; Song 2016). In this study, the developed pseudo-dynamic source models were tested against observed ground motion data at the SCEC BBP, Ver 16.5. The validation was performed in two stages. In the first stage, simulated ground motions were validated against observed ground motion data for past events such as the 1992 Landers and 1994 Northridge, California, earthquakes. In the second stage, they were validated against the latest version of empirical GMPEs, i.e., NGA-West2. The validation results show that the simulated ground motions produce ground motion intensities compatible with observed ground motion data at both stages. The compatibility of the pseudo-dynamic source models with the omega-square spectral decay and the standard deviation of the simulated ground motion intensities are also discussed in the study.
Prediction of overall survival for metastatic pancreatic cancer: Development and validation of a prognostic nomogram with data from open clinical trial and real-world study.

PubMed

Hang, Junjie; Wu, Lixia; Zhu, Lina; Sun, Zhiqiang; Wang, Ge; Pan, Jingjing; Zheng, Suhua; Xu, Kequn; Du, Jiadi; Jiang, Hua

2018-06-01

It is necessary to develop prognostic tools for metastatic pancreatic cancer (MPC) to optimize therapeutic strategies. Thus, we tried to develop and validate a prognostic nomogram of MPC. Data from 3 clinical trials (NCT00844649, NCT01124786, and NCT00574275) and 133 Chinese MPC patients were used for analysis. The former 2 trials were taken as the training cohort while NCT00574275 was used as the validation cohort. In addition, 133 MPC patients treated in China were taken as the testing cohort. A Cox regression model was used to investigate prognostic factors in the training cohort. With these factors, we established a nomogram and verified it by Harrell's concordance index (C-index) and calibration plots. Furthermore, the nomogram was externally validated in the validation cohort and testing cohort. In the training cohort (n = 445), performance status, liver metastasis, carbohydrate antigen 19-9 (CA19-9) log-value, absolute neutrophil count (ANC), and albumin were independent prognostic factors for overall survival (OS). A nomogram was established with these factors to predict OS and survival probabilities. The nomogram showed an acceptable discrimination ability (C-index: .683) and good calibration, and was further externally validated in the validation cohort (n = 273, C-index: .699) and testing cohort (n = 133, C-index: .653). The nomogram total points (NTP) had the potential to stratify patients into 3 risk groups with median OS of 11.7, 7.0 and 3.7 months (P < .001), respectively. In conclusion, the prognostic nomogram with NTP can predict OS for patients with MPC with considerable accuracy. © 2018 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
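For readers unfamiliar with Harrell's C-index cited in this abstract, the following is a minimal sketch of how the statistic is commonly computed from survival times, event indicators and predicted risk scores. It is a simplified illustration, not the authors' code: pairs with tied survival times are simply treated as not comparable, and the function and argument names are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the subject with the
    higher predicted risk has the shorter observed survival time."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i had an observed event before j's time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported values (.683, .699, .653) should be read.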
Analytic Validation of Immunohistochemistry Assays: New Benchmark Data From a Survey of 1085 Laboratories.

PubMed

Stuart, Lauren N; Volmar, Keith E; Nowak, Jan A; Fatheree, Lisa A; Souers, Rhona J; Fitzgibbons, Patrick L; Goldsmith, Jeffrey D; Astles, J Rex; Nakhleh, Raouf E

2017-09-01

Context: A cooperative agreement between the College of American Pathologists (CAP) and the United States Centers for Disease Control and Prevention was undertaken to measure laboratories' awareness and implementation of an evidence-based laboratory practice guideline (LPG) on immunohistochemical (IHC) validation practices published in 2014. Objective: To establish new benchmark data on IHC laboratory practices. Design: A 2015 survey on IHC assay validation practices was sent to laboratories subscribed to specific CAP proficiency testing programs and to additional nonsubscribing laboratories that perform IHC testing. Specific questions were designed to capture laboratory practices not addressed in a 2010 survey. Results: The analysis was based on responses from 1085 laboratories that perform IHC staining. Ninety-six percent (809 of 844) always documented validation of IHC assays. Sixty percent (648 of 1078) had separate procedures for predictive and nonpredictive markers, 42.7% (220 of 515) had procedures for laboratory-developed tests, 50% (349 of 697) had procedures for testing cytologic specimens, and 46.2% (363 of 785) had procedures for testing decalcified specimens. Minimum case numbers were specified by 85.9% (720 of 838) of laboratories for nonpredictive markers and 76% (584 of 768) for predictive markers. Median concordance requirements were 95% for both types. For initial validation, 75.4% (538 of 714) of laboratories adopted the 20-case minimum for nonpredictive markers and 45.9% (266 of 579) adopted the 40-case minimum for predictive markers as outlined in the 2014 LPG. The most common method for validation was correlation with morphology and expected results. Laboratories also reported which assay changes necessitated revalidation and their minimum case requirements. Conclusions: Benchmark data on current IHC validation practices and procedures may help laboratories understand the issues and influence further refinement of LPG recommendations.
Development and validation of a prediction model for functional decline in older medical inpatients.

PubMed

Takada, Toshihiko; Fukuma, Shingo; Yamamoto, Yosuke; Tsugihashi, Yukio; Nagano, Hiroyuki; Hayashi, Michio; Miyashita, Jun; Azuma, Teruhisa; Fukuhara, Shunichi

2018-05-17

To prevent functional decline in older inpatients, identification of high-risk patients is crucial. The aim of this study was to develop and validate a prediction model to assess the risk of functional decline in older medical inpatients. In this retrospective cohort study, patients ≥65 years admitted acutely to medical wards were included. The healthcare database of 246 acute care hospitals (n = 229,913) was used for derivation, and two acute care hospitals (n = 1767 and 5443, respectively) were used for validation. Data were collected using a national administrative claims and discharge database. Functional decline was defined as a decline of the Katz score at discharge compared with on admission. About 6% of patients in the derivation cohort and 9% and 2% in each validation cohort developed functional decline. A model with 7 items, age, body mass index, living in a nursing home, ambulance use, need for assistance in walking, dementia, and bedsore, was developed. On internal validation, it demonstrated a c-statistic of 0.77 (95% confidence interval (CI) = 0.767-0.771) and good fit on the calibration plot. On external validation, the c-statistics were 0.79 (95% CI = 0.77-0.81) and 0.75 (95% CI = 0.73-0.77) for each cohort, respectively. Calibration plots showed good fit in one cohort and overestimation in the other one. A prediction model for functional decline in older medical inpatients was derived and validated. It is expected that use of the model would lead to early identification of high-risk patients and introducing early intervention. Copyright © 2018 Elsevier B.V. All rights reserved.
On Nomological Validity and Auxiliary Assumptions: The Importance of Simultaneously Testing Effects in Social Cognitive Theories Applied to Health Behavior and Some Guidelines

PubMed Central

Hagger, Martin S.; Gucciardi, Daniel F.; Chatzisarantis, Nikos L. D.

2017-01-01

Tests of social cognitive theories provide informative data on the factors that relate to health behavior, and the processes and mechanisms involved. In the present article, we contend that tests of social cognitive theories should adhere to the principles of nomological validity, defined as the degree to which predictions in a formal theoretical network are confirmed. We highlight the importance of nomological validity tests to ensure theory predictions can be disconfirmed through observation. We argue that researchers should be explicit on the conditions that lead to theory disconfirmation, and identify any auxiliary assumptions on which theory effects may be conditional. We contend that few researchers formally test the nomological validity of theories, or outline conditions that lead to model rejection and the auxiliary assumptions that may explain findings that run counter to hypotheses, raising potential for ‘falsification evasion.’ We present a brief analysis of studies (k = 122) testing four key social cognitive theories in health behavior to illustrate deficiencies in reporting theory tests and evaluations of nomological validity. Our analysis revealed that few articles report explicit statements suggesting that their findings support or reject the hypotheses of the theories tested, even when findings point to rejection. We illustrate the importance of explicit a priori specification of fundamental theory hypotheses and associated auxiliary assumptions, and identification of the conditions which would lead to rejection of theory predictions. We also demonstrate the value of confirmatory analytic techniques, meta-analytic structural equation modeling, and Bayesian analyses in providing robust converging evidence for nomological validity. We provide a set of guidelines for researchers on how to adopt and apply the nomological validity approach to testing health behavior models. PMID:29163307
A valid model for predicting responsible nerve roots in lumbar degenerative disease with diagnostic doubt.

PubMed

Li, Xiaochuan; Bai, Xuedong; Wu, Yaohong; Ruan, Dike

2016-03-15

To construct and validate a model to predict responsible nerve roots in lumbar degenerative disease with diagnostic doubt (DD). From January 2009 to January 2013, 163 patients with DD were assigned to the construction (n = 106) or validation sample (n = 57) according to their time of admission to hospital. Outcome was assessed according to the Japanese Orthopedic Association (JOA) recovery rate as excellent, good, fair, and poor. The first two results were considered as effective clinical outcome (ECO). Baseline patient and clinical characteristics were considered as secondary variables. A multivariate logistic regression model was used to construct a model with the ECO as a dependent variable and other factors as explanatory variables. The odds ratios (ORs) of each risk factor were adjusted and transformed into a scoring system. Area under the curve (AUC) was calculated and validated in both internal and external samples. Moreover, calibration plot and predictive ability of this scoring system were also tested for further validation. About 76% of patients with DD had ECOs in both the construction and validation samples (76.4% and 75.5%, respectively). The risk factors retained in the model were a higher preoperative visual analog pain scale (VAS) score (OR = 1.56, p < 0.01), stenosis levels of L4/5 or L5/S1 (OR = 1.44, p = 0.04), stenosis locations with neuroforamen (OR = 1.95, p = 0.01), neurological deficit (OR = 1.62, p = 0.01), and greater VAS improvement after selective nerve root block (SNRB) (OR = 3.42, p = 0.02). The internal AUC was 0.85 and the external AUC was 0.72, with a good calibration plot of prediction accuracy. In addition, the predicted ECOs did not differ from the actual results (p = 0.532). We have constructed and validated a predictive model for confirming responsible nerve roots in patients with DD.
The associated risk factors were preoperative VAS score, stenosis levels of L4/5 or L5/S1, stenosis locations with neuroforamen, neurological deficit, and VAS improvement of SNRB. A tool such as this is beneficial in the preoperative counseling of patients, shared surgical decision making, and ultimately improving safety in spine surgery.
The Predictive Validity of the Assessment of Basic Learning Abilities versus Parents' Predictions with Children with Autism

ERIC Educational Resources Information Center

Murphy, Colleen; Martin, Garry L.; Yu, C. T.

2014-01-01

The Assessment of Basic Learning Abilities (ABLA) is an empirically validated clinical tool for assessing the learning ability of persons with intellectual disabilities and children with autism. An ABLA tester uses standardized prompting and reinforcement procedures to attempt to teach, individually, each of six tasks, called levels, to a testee,…
Validity of the Medical College Admission Test for Predicting MD-PhD Student Outcomes

ERIC Educational Resources Information Center

Bills, James L.; VanHouten, Jacob; Grundy, Michelle M.; Chalkley, Roger; Dermody, Terence S.

2016-01-01

The Medical College Admission Test (MCAT) is a quantitative metric used by MD and MD-PhD programs to evaluate applicants for admission. This study assessed the validity of the MCAT in predicting training performance measures and career outcomes for MD-PhD students at a single institution. The study population consisted of 153 graduates of the…
Multilevel Assessment of the Predictive Validity of Teacher Made Tests in the Zimbabwean Primary Education Sector

ERIC Educational Resources Information Center

Machingambi, Zadzisai

2017-01-01

The principal focus of this study was to undertake a multilevel assessment of the predictive validity of teacher made tests in the Zimbabwean primary education sector. A correlational research design was adopted for the study, mainly to allow for statistical treatment of data and subsequent classical hypotheses testing using the spearman's rho.…
A Cross-Cultural Test of Sex Bias in the Predictive Validity of Scholastic Aptitude Examinations: Some Israeli Findings.

ERIC Educational Resources Information Center

Zeidner, Moshe

1987-01-01

This study examined the cross-cultural validity of the sex bias contention with respect to standardized aptitude testing, used for academic prediction purposes in Israel. Analyses were based on the grade point average and scores of 1778 Jewish and 1017 Arab students who were administered standardized college entrance test batteries. (Author/LMO)
Concurrent and Predictive Validity of the Raven Progressive Matrices and the Naglieri Nonverbal Ability Test

ERIC Educational Resources Information Center

Balboni, Giulia; Naglieri, Jack A.; Cubelli, Roberto

2010-01-01

The concurrent and predictive validities of the Naglieri Nonverbal Ability Test (NNAT) and Raven's Colored Progressive Matrices (CPM) were investigated in a large group of Italian third-and fifth-grade students with different sociocultural levels evaluated at the beginning and end of the school year. CPM and NNAT scores were related to math and…
Predictive Validity of ICD-10 Hyperkinetic Disorder Relative to DSM-IV Attention-Deficit/Hyperactivity Disorder among Younger Children

ERIC Educational Resources Information Center

Lahey, Benjamin B.; Pelham, William E.; Chronis, Andrea; Massetti, Greta; Kipp, Heidi; Ehrhardt, Ashley; Lee, Steve S.

2006-01-01

Background: Little is known about the predictive validity of hyperkinetic disorder (HKD) as defined by the Diagnostic Criteria for Research for mental and behavioral disorders of the tenth edition of the International Classification of Diseases (ICD-10; World Health Organization, 1993), particularly when the diagnosis is given to younger children.…
CFD validation experiments at McDonnell Aircraft Company

NASA Technical Reports Server (NTRS)

Verhoff, August

1987-01-01

Information is given in viewgraph form on computational fluid dynamics (CFD) validation experiments at McDonnell Aircraft Company. Topics covered include a high speed research model, a supersonic persistence fighter model, a generic fighter wing model, surface grids, force and moment predictions, surface pressure predictions, forebody models with 65 degree clipped delta wings, and the low aspect ratio wing/body experiment.
Project Evaluation: Validation of a Scale and Analysis of Its Predictive Capacity

ERIC Educational Resources Information Center

Fernandes Malaquias, Rodrigo; de Oliveira Malaquias, Fernanda Francielle

2014-01-01

The objective of this study was to validate a scale for assessment of academic projects. As a complement, we examined its predictive ability by comparing the scores of advised/corrected projects based on the model and the final scores awarded to the work by an examining panel (approximately 10 months after the project design). Results of…
Incremental Criterion Validity of the WJ-III COG Clinical Clusters: Marginal Predictive Effects beyond the General Factor

ERIC Educational Resources Information Center

McGill, Ryan J.

2015-01-01

The current study examined the incremental validity of the clinical clusters from the Woodcock-Johnson III Tests of Cognitive Abilities (WJ-III COG) for predicting scores on the Woodcock-Johnson III Tests of Achievement (WJ-III ACH). All participants were children and adolescents (N = 4,722) drawn from the nationally representative WJ-III…
Predicting return to work after low back injury using the Psychosocial Risk for Occupational Disability Instrument: a validation study.

PubMed

Schultz, I Z; Crook, J; Berkowitz, J; Milner, R; Meloche, G R

2005-09-01

This paper reports on the predictive validity of a Psychosocial Risk for Occupational Disability Scale in the workers' compensation environment using a paper and pencil version of a previously validated multimethod instrument on a new, subacute sample of workers with low back pain. A cohort longitudinal study design with a randomly selected cohort off work for 4-6 weeks was applied. The questionnaire was completed by 111 eligible workers at 4-6 weeks following injury. Return to work status data at three months was obtained from 100 workers. Sixty-four workers had returned to work (RTW) and 36 had not (NRTW). Stepwise backward elimination resulted in a model with these predictors: Expectations of Recovery, SF-36 Vitality, SF-36 Mental Health, and Waddell Symptoms. The correct classification of RTW/NRTW was 79%, with sensitivity (NRTW) of 61% and specificity (RTW) of 89%. The area under the ROC curve was 84%. New evidence for predictive validity for the Psychosocial Risk-for-Disability Instrument was provided. The instrument can be useful and practical for prediction of return to work outcomes in the subacute stage after low back injury in the workers' compensation context.
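The reported accuracy, sensitivity and specificity can be cross-checked against the group sizes given in the abstract (36 NRTW, 64 RTW). The sketch below assumes cell counts back-calculated from the rounded percentages (22 of 36 NRTW and 57 of 64 RTW correctly classified); these counts are an inference for illustration and do not appear in the abstract itself.

```python
def classification_summary(tp, fn, tn, fp):
    """Sensitivity, specificity and overall accuracy from a 2x2 confusion matrix."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),   # NRTW correctly flagged
        "specificity": tn / (tn + fp),   # RTW correctly identified
        "accuracy": (tp + tn) / total,   # overall correct classification
    }


# Counts inferred from the rounded percentages in the abstract (an assumption):
# 36 NRTW workers (positives) and 64 RTW workers (negatives).
summary = classification_summary(tp=22, fn=14, tn=57, fp=7)
```

Under these assumed counts the three figures are mutually consistent: 22/36 rounds to 61%, 57/64 rounds to 89%, and (22 + 57)/100 gives the 79% correct classification.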
Aeroacoustic Validation of Installed Low Noise Propulsion for NASA's N+2 Supersonic Airliner

NASA Technical Reports Server (NTRS)

Bridges, James

2018-01-01

An aeroacoustic test was conducted at NASA Glenn Research Center on an integrated propulsion system designed to meet noise regulations of ICAO Chapter 4 with 10EPNdB cumulative margin. The test had two objectives: to demonstrate that the aircraft design did meet the noise goal, and to validate the acoustic design tools used in the design. Variations in the propulsion system design and its installation were tested and the results compared against predictions. Far-field arrays of microphones measured the acoustic spectral directivity, which was transformed to full scale as noise certification levels. Phased array measurements confirmed that the shielding of the installation model adequately simulated the full aircraft and provided data for validating RANS-based noise prediction tools. Particle image velocimetry confirmed that the flow field around the nozzle on the jet rig mimicked that of the full aircraft and produced flow data to validate the RANS solutions used in the noise predictions. The far-field acoustic measurements confirmed the empirical predictions for the noise. Results provided here detail the steps taken to ensure accuracy of the measurements and give insights into the physics of exhaust noise from installed propulsion systems in future supersonic vehicles.
Validation Assessment of a Glass-to-Metal Seal Finite-Element Model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jamison, Ryan Dale; Buchheit, Thomas E.; Emery, John M

Sealing glasses are ubiquitous in high pressure and temperature engineering applications, such as hermetic feed-through electrical connectors. A common connector technology is the glass-to-metal seal, in which a metal shell compresses a sealing glass to create a hermetic seal. Though finite-element analysis has been used to understand and design glass-to-metal seals for many years, there has been little validation of these models. An indentation technique was employed to measure the residual stress on the surface of a simple glass-to-metal seal. Recently developed rate-dependent material models of both Schott 8061 and 304L VAR stainless steel have been applied to a finite-element model of the simple glass-to-metal seal. Model predictions of residual stress based on the evolution of material models are shown. These model predictions are compared to measured data. Validity of the finite-element predictions is discussed. It will be shown that the finite-element model of the glass-to-metal seal accurately predicts the mean residual stress in the glass near the glass-to-metal interface and is valid for this quantity of interest.

Validation metrics for turbulent plasma transport

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holland, C., E-mail: chholland@ucsd.edu

Developing accurate models of plasma dynamics is essential for confident predictive modeling of current and future fusion devices. In modern computer science and engineering, formal verification and validation processes are used to assess model accuracy and establish confidence in the predictive capabilities of a given model. This paper provides an overview of the key guiding principles and best practices for the development of validation metrics, illustrated using examples from investigations of turbulent transport in magnetically confined plasmas. Particular emphasis is given to the importance of uncertainty quantification and its inclusion within the metrics, and the need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between simulation and experiment. As a starting point, the structure of commonly used global transport model metrics and their limitations is reviewed. An alternate approach is then presented, which focuses upon comparisons of predicted local fluxes, fluctuations, and equilibrium gradients against observation. The utility of metrics based upon these comparisons is demonstrated by applying them to gyrokinetic predictions of turbulent transport in a variety of discharges performed on the DIII-D tokamak [J. L. Luxon, Nucl. Fusion 42, 614 (2002)], as part of a multi-year transport model validation activity.
Using cluster analysis to identify phenotypes and validation of mortality in men with COPD.

PubMed

Chen, Chiung-Zuei; Wang, Liang-Yi; Ou, Chih-Ying; Lee, Cheng-Hung; Lin, Chien-Chung; Hsiue, Tzuen-Ren

2014-12-01

Cluster analysis has been proposed to examine phenotypic heterogeneity in chronic obstructive pulmonary disease (COPD). The aim of this study was to use cluster analysis to define COPD phenotypes and validate them by assessing their relationship with mortality. Male subjects with COPD were recruited to identify and validate COPD phenotypes. Seven variables were assessed for their relevance to COPD: age, FEV1 % predicted, BMI, history of severe exacerbations, mMRC, SpO2, and Charlson index. COPD groups were identified by cluster analysis and validated prospectively against mortality during a 4-year follow-up. Analysis of 332 COPD subjects identified five clusters, labeled cluster A to cluster E. Assessment of the predictive validity of these clusters of COPD showed that cluster E patients had higher all-cause mortality (HR 18.3, p < 0.0001) and respiratory-cause mortality (HR 21.5, p < 0.0001) than those in the other four groups. Cluster E patients also had higher all-cause mortality (HR 14.3, p = 0.0002) and respiratory-cause mortality (HR 10.1, p = 0.0013) than patients in cluster D alone. COPD with severe airflow limitation, many symptoms, and a history of frequent severe exacerbations was a novel and distinct clinical phenotype predicting mortality in men with COPD.
Multidimensional assessment of self-regulated learning with middle school math students.

PubMed

Callan, Gregory L; Cleary, Timothy J

2018-03-01

This study examined the convergent and predictive validity of self-regulated learning (SRL) measures situated in mathematics. The sample included 100 eighth graders from a diverse, urban school district. Four measurement formats were examined, including 2 broad-based (i.e., self-report questionnaire and teacher ratings) and 2 task-specific measures (i.e., SRL microanalysis and behavioral traces). Convergent validity was examined across task difficulty, and predictive validity was examined across 3 mathematics outcomes: 2 measures of mathematical problem solving skill (i.e., practice session math problems, posttest math problems) and a global measure of mathematical skill (i.e., standardized math test). Correlation analyses were used to examine convergent validity and revealed medium correlations between measures within the same category (i.e., broad-based or task-specific). Relations between measurement classes were not statistically significant. Separate regressions examined the predictive validity of the SRL measures. While controlling for all other predictors, an SRL microanalysis metacognitive-monitoring measure emerged as a significant predictor of all 3 outcomes, and teacher ratings accounted for unique variance on 2 of the outcomes (i.e., posttest math problems and standardized math test). Results suggest that a multidimensional assessment approach should be considered by school psychologists interested in measuring SRL. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Assessment of EchoMRI-AH versus dual-energy X-ray absorptiometry by iDXA to measure human body composition.

PubMed

Marlatt, K L; Greenway, F L; Ravussin, E

2017-04-01

Comparison of percent fat mass across different body composition analysis devices is important given variation in technology accuracy and precision, as well as the growing need for cross-validation of devices often applied across longitudinal studies. We compared EchoMRI-AH and Lunar iDXA quantification of percent body fat (PBF) in 84 adults (43M, 41F), with the mean age 39.7±15.9 years and body mass index (BMI) 26.2±5.3 kg/m². PBF correlated strongly between devices (r>0.95, P<0.0001). A prediction equation was derived in half of the subjects, and the other half were used to cross-validate the proposed equation (EchoMRI-AH PBF = (0.94 × iDXA PBF) + (0.14 × Age) + (3.3 × Female) - 8.83). The mean PBF difference (predicted-measured) in the validation group was not different from 0 (diff=0.27%, 95% confidence interval: -0.42 to 0.96, P=0.430). Bland-Altman plots showed a bias with higher measured PBF on EchoMRI-AH versus iDXA in all 84 subjects (β=0.13, P<0.0001). The proposed prediction equation was valid in our cross-validation sample, and it has the potential to be applied across multicenter studies.
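As a rough illustration, the prediction equation quoted in the abstract above can be written as a small function. This is a minimal sketch, not the authors' implementation: the function name is hypothetical, and it assumes that Age is in years and that Female is coded 1 for women and 0 for men, which the abstract does not state explicitly.

```python
def predict_echomri_pbf(idxa_pbf: float, age_years: float, is_female: bool) -> float:
    """Estimate EchoMRI-AH percent body fat from an iDXA percent body fat value.

    Coefficients are those quoted in the abstract. The 0/1 coding of the
    Female term and the use of age in years are assumptions.
    """
    female = 1.0 if is_female else 0.0
    return 0.94 * idxa_pbf + 0.14 * age_years + 3.3 * female - 8.83


# Hypothetical subject: iDXA PBF of 30 %, 40 years old, female.
example = predict_echomri_pbf(idxa_pbf=30.0, age_years=40.0, is_female=True)
print(f"Predicted EchoMRI-AH PBF: {example:.2f} %")
```

The input values are made up for the example and do not come from the study data.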
Subarachnoid hemorrhage admissions retrospectively identified using a prediction model

PubMed Central

McIntyre, Lauralyn; Fergusson, Dean; Turgeon, Alexis; dos Santos, Marlise P.; Lum, Cheemun; Chassé, Michaël; Sinclair, John; Forster, Alan; van Walraven, Carl

2016-01-01

Objective: To create an accurate prediction model using variables collected in widely available health administrative data records to identify hospitalizations for primary subarachnoid hemorrhage (SAH). Methods: A previously established complete cohort of consecutive primary SAH patients was combined with a random sample of control hospitalizations. Chi-square recursive partitioning was used to derive and internally validate a model to predict the probability that a patient had primary SAH (due to aneurysm or arteriovenous malformation) using health administrative data. Results: A total of 10,322 hospitalizations with 631 having primary SAH (6.1%) were included in the study (5,122 derivation, 5,200 validation). In the validation patients, our recursive partitioning algorithm had a sensitivity of 96.5% (95% confidence interval [CI] 93.9–98.0), a specificity of 99.8% (95% CI 99.6–99.9), and a positive likelihood ratio of 483 (95% CI 254–879). In this population, patients meeting criteria for the algorithm had a probability of 45% of truly having primary SAH. Conclusions: Routinely collected health administrative data can be used to accurately identify hospitalized patients with a high probability of having a primary SAH. This algorithm may allow, upon validation, an easy and accurate method to create validated cohorts of primary SAH from either ruptured aneurysm or arteriovenous malformation. PMID:27629096
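The relationship between the sensitivity, specificity, and positive likelihood ratio reported above, and the way a likelihood ratio converts a pre-test probability into a post-test probability, can be sketched as follows. The function names are hypothetical, and the pre-test probability in the example is an invented input, not a figure from the study.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Rounded sensitivity and specificity from the abstract give an LR+ near the
# reported 483 (the small difference comes from rounding).
lr_plus = positive_likelihood_ratio(sensitivity=0.965, specificity=0.998)

# Hypothetical pre-test probability of 0.2 %, chosen only for illustration.
p_post = post_test_probability(pre_test_probability=0.002, likelihood_ratio=lr_plus)
print(f"LR+ = {lr_plus:.1f}, post-test probability = {p_post:.3f}")
```

The sketch shows why even a very high likelihood ratio can yield a moderate post-test probability when the condition is rare in the population being screened.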
Evaluation of a Computational Model of Situational Awareness

NASA Technical Reports Server (NTRS)

Burdick, Mark D.; Shively, R. Jay; Rutkewski, Michael (Technical Monitor)

2000-01-01

Although the use of the psychological construct of situational awareness (SA) assists researchers in creating a flight environment that is safer and more predictable, its true potential remains untapped until a valid means of predicting SA a priori becomes available. Previous work proposed a computational model of SA (CSA) that sought to fill that void. The current line of research is aimed at validating that model. The results show that the model accurately predicted SA in a piloted simulation.
Incremental Validity of Useful Field of View Subtests for the Prediction of Instrumental Activities of Daily Living

PubMed Central

Aust, Frederik; Edwards, Jerri D.

2015-01-01

Introduction The Useful Field of View Test (UFOV®) is a cognitive measure that predicts older adults’ ability to perform a range of everyday activities. However, little is known about the individual contribution of each subtest to these predictions and the underlying constructs of UFOV performance remain a topic of debate. Method We investigated the incremental validity of UFOV subtests for the prediction of Instrumental Activities of Daily Living (IADL) performance in two independent datasets, the SKILL (n = 828) and ACTIVE (n = 2426) studies. We, then, explored the cognitive and visual abilities assessed by UFOV using a range of neuropsychological and vision tests administered in the SKILL study. Results In the four subtest variant of UFOV, only subtests 2 and 3 consistently made independent contributions to the prediction of IADL performance across three different behavioral measures. In all cases, the incremental validity of UFOV subtests 1 and 4 was negligible. Furthermore, we found that UFOV was related to processing speed, general non-speeded cognition, and visual function; the omission of subtests 1 and 4 from the test score did not affect these associations. Conclusions UFOV subtests 1 and 4 appear to be of limited use to predict IADL and possibly other everyday activities. Future experimental research should investigate if shortening the UFOV by omitting these subtests is a reliable and valid assessment approach. PMID:26782018
A prospectively validated nomogram for predicting the risk of chemotherapy-induced febrile neutropenia: a multicenter study.

PubMed

Bozcuk, H; Yıldız, M; Artaç, M; Kocer, M; Kaya, Ç; Ulukal, E; Ay, S; Kılıç, M P; Şimşek, E H; Kılıçkaya, P; Uçar, S; Coskun, H S; Savas, B

2015-06-01

There is clinical need to predict risk of febrile neutropenia before a specific cycle of chemotherapy in cancer patients. Data on 3882 chemotherapy cycles in 1089 consecutive patients with lung, breast, and colon cancer from four teaching hospitals were used to construct a predictive model for febrile neutropenia. A final nomogram derived from the multivariate predictive model was prospectively confirmed in a second cohort of 960 consecutive cases and 1444 cycles. The following factors were used to construct the nomogram: previous history of febrile neutropenia, pre-cycle lymphocyte count, type of cancer, cycle of current chemotherapy, and patient age. The predictive model had a concordance index of 0.95 (95 % confidence interval (CI) = 0.91-0.99) in the derivation cohort and 0.85 (95 % CI = 0.80-0.91) in the external validation cohort. A threshold of 15 % for the risk of febrile neutropenia in the derivation cohort was associated with a sensitivity of 0.76 and specificity of 0.98. These figures were 1.00 and 0.49 in the validation cohort if a risk threshold of 50 % was chosen. This nomogram is helpful in the prediction of febrile neutropenia after chemotherapy in patients with lung, breast, and colon cancer. Usage of this nomogram may help decrease the morbidity and mortality associated with febrile neutropenia and deserves further validation.
Validation of the FACT-B+4-UL questionnaire and exploration of its predictive value in women submitted to surgery for breast cancer.

PubMed

Andrade Ortega, Juan Alfonso; Millán Gómez, Ana Pilar; Ribeiro González, Marisa; Martínez Piró, Pilar; Jiménez Anula, Juan; Sánchez Andújar, María Belén

2017-06-21

The early detection of upper limb complications is important in women operated on for breast cancer. The "FACT-B+4-UL" questionnaire, a specific variant of the Functional Assessment of Cancer Therapy-Breast (FACT-B), is one of the instruments available to measure upper limb function. The Spanish version of the upper limb subscale of the FACT-B+4 was validated in a prospective cohort of 201 women operated on for breast cancer (factor analysis, internal consistency, test-retest reliability, construct validity and sensitivity to change were determined). Its capacity to predict subsequent lymphoedema and other complications in the upper limb was explored using logistic regression. This subscale is unifactorial and has high internal consistency (Cronbach's alpha: 0.87); its test-retest reliability and construct validity are strong (intraclass correlation coefficient: 0.986; Pearson's R with "Quick DASH": 0.81), as is its sensitivity to change. It did not predict the onset of lymphoedema. Its predictive capacity for other upper limb complications is low. FACT-B+4-UL is useful in measuring upper limb disability in women surgically treated for breast cancer, but it does not predict the onset of lymphoedema and its predictive capacity for other complications in the upper limb is low. Copyright © 2017 Elsevier España, S.L.U. All rights reserved.
External Validation and Evaluation of Reliability and Validity of the Modified Seoul National University Renal Stone Complexity Scoring System to Predict Stone-Free Status After Retrograde Intrarenal Surgery.

PubMed

Park, Juhyun; Kang, Minyong; Jeong, Chang Wook; Oh, Sohee; Lee, Jeong Woo; Lee, Seung Bae; Son, Hwancheol; Jeong, Hyeon; Cho, Sung Yong

2015-08-01

The modified Seoul National University Renal Stone Complexity scoring system (S-ReSC-R) for retrograde intrarenal surgery (RIRS) was developed as a tool to predict stone-free rate (SFR) after RIRS. We externally validated the S-ReSC-R. We retrospectively reviewed 159 patients who underwent RIRS. The S-ReSC-R was assigned from 1 to 12 according to the location and number of sites involved. The stone-free status was defined as no evidence of a stone or with clinically insignificant residual fragment stones less than 2 mm. Interobserver and test-retest reliabilities were evaluated. Statistical performance of the prediction model was assessed by its predictive accuracy, predictive probability, and clinical usefulness. Overall SFR was 73.0%. The SFRs were 86.7%, 70.2%, and 48.6% in low-score (1-2), intermediate-score (3-4), and high-score (5-12) groups, respectively (p<0.001). External validation of S-ReSC-R revealed an area under the curve (AUC) of 0.731 (95% CI 0.650-0.813). The AUC of the three-tiered S-ReSC-R was 0.701 (95% CI 0.609-0.794). The calibration plot showed that the predicted probability of SFR had a concordance comparable to that of observed frequency. The Hosmer-Lemeshow goodness of fit test revealed a p-value of 0.01 for the S-ReSC-R and 0.90 for the three-tiered S-ReSC-R. Interobserver and test-retest reliabilities revealed an almost perfect level of agreement. The present study proved the predictive value of S-ReSC-R to predict SFR following RIRS in an independent cohort. Interobserver and test-retest reliabilities confirmed that S-ReSC-R was reliable and valid.
Computational-experimental approach to drug-target interaction mapping: A case study on kinase inhibitors

PubMed Central

Ravikumar, Balaguru; Parri, Elina; Timonen, Sanna; Airola, Antti; Wennerberg, Krister

2017-01-01

Due to relatively high costs and labor required for experimental profiling of the full target space of chemical compounds, various machine learning models have been proposed as cost-effective means to advance this process in terms of predicting the most potent compound-target interactions for subsequent verification. However, most of the model predictions lack direct experimental validation in the laboratory, making their practical benefits for drug discovery or repurposing applications largely unknown. Here, we therefore introduce and carefully test a systematic computational-experimental framework for the prediction and pre-clinical verification of drug-target interactions using a well-established kernel-based regression algorithm as the prediction model. To evaluate its performance, we first predicted unmeasured binding affinities in a large-scale kinase inhibitor profiling study, and then experimentally tested 100 compound-kinase pairs. The relatively high correlation of 0.77 (p < 0.0001) between the predicted and measured bioactivities supports the potential of the model for filling the experimental gaps in existing compound-target interaction maps. Further, we subjected the model to the more challenging task of predicting target interactions for a new candidate drug compound that lacks prior binding profile information. As a specific case study, we used tivozanib, an investigational VEGF receptor inhibitor with currently unknown off-target profile. Among 7 kinases with high predicted affinity, we experimentally validated 4 new off-targets of tivozanib, namely the Src-family kinases FRK and FYN A, the non-receptor tyrosine kinase ABL1, and the serine/threonine kinase SLK. Our subsequent experimental validation protocol effectively avoids any possible information leakage between the training and validation data, and therefore enables rigorous model validation for practical applications.
These results demonstrate that the kernel-based modeling approach offers practical benefits for probing novel insights into the mode of action of investigational compounds, and for the identification of new target selectivities for drug repurposing applications. PMID:28787438
An Interpretable Machine Learning Model for Accurate Prediction of Sepsis in the ICU.

PubMed

Nemati, Shamim; Holder, Andre; Razmi, Fereshteh; Stanley, Matthew D; Clifford, Gari D; Buchman, Timothy G

2018-04-01

Sepsis is among the leading causes of morbidity, mortality, and cost overruns in critically ill patients. Early intervention with antibiotics improves survival in septic patients. However, no clinically validated system exists for real-time prediction of sepsis onset. We aimed to develop and validate an Artificial Intelligence Sepsis Expert algorithm for early prediction of sepsis. Observational cohort study. Academic medical center from January 2013 to December 2015. Over 31,000 admissions to the ICUs at two Emory University hospitals (development cohort), in addition to over 52,000 ICU patients from the publicly available Medical Information Mart for Intensive Care-III ICU database (validation cohort). Patients who met the Third International Consensus Definitions for Sepsis (Sepsis-3) prior to or within 4 hours of their ICU admission were excluded, resulting in roughly 27,000 and 42,000 patients within our development and validation cohorts, respectively. None. High-resolution vital signs time series and electronic medical record data were extracted. A set of 65 features (variables) were calculated on hourly basis and passed to the Artificial Intelligence Sepsis Expert algorithm to predict onset of sepsis in the proceeding T hours (where T = 12, 8, 6, or 4). Artificial Intelligence Sepsis Expert was used to predict onset of sepsis in the proceeding T hours and to produce a list of the most significant contributing factors. For the 12-, 8-, 6-, and 4-hour ahead prediction of sepsis, Artificial Intelligence Sepsis Expert achieved area under the receiver operating characteristic in the range of 0.83-0.85. Performance of the Artificial Intelligence Sepsis Expert on the development and validation cohorts was indistinguishable. Using data available in the ICU in real-time, Artificial Intelligence Sepsis Expert can accurately predict the onset of sepsis in an ICU patient 4-12 hours prior to clinical recognition. 
A prospective study is necessary to determine the clinical utility of the proposed sepsis prediction model.
Predicting chemically-induced skin reactions. Part I: QSAR models of skin sensitization and their application to identify potentially hazardous compounds

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alves, Vinicius M.; Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC 27599; Muratov, Eugene

Repetitive exposure to a chemical agent can induce an immune reaction in inherently susceptible individuals that leads to skin sensitization. Although many chemicals have been reported as skin sensitizers, there have been very few rigorously validated QSAR models with defined applicability domains (AD) that were developed using a large group of chemically diverse compounds. In this study, we have aimed to compile, curate, and integrate the largest publicly available dataset related to chemically-induced skin sensitization, use this data to generate rigorously validated QSAR models for skin sensitization, and employ these models as a virtual screening tool for identifying putative sensitizers among environmental chemicals. We followed best practices for model building and validation implemented with our predictive QSAR workflow using Random Forest modeling technique in combination with SiRMS and Dragon descriptors. The Correct Classification Rate (CCR) for QSAR models discriminating sensitizers from non-sensitizers was 71–88% when evaluated on several external validation sets, within a broad AD, with positive (for sensitizers) and negative (for non-sensitizers) predicted rates of 85% and 79% respectively. When compared to the skin sensitization module included in the OECD QSAR Toolbox as well as to the skin sensitization model in publicly available VEGA software, our models showed a significantly higher prediction accuracy for the same sets of external compounds as evaluated by Positive Predicted Rate, Negative Predicted Rate, and CCR. These models were applied to identify putative chemical hazards in the Scorecard database of possible skin or sense organ toxicants as primary candidates for experimental validation. - Highlights: • The largest publicly available skin sensitization dataset was compiled. • Predictive QSAR models were developed for skin sensitization. • Developed models have higher prediction accuracy than OECD QSAR Toolbox.
• Putative chemical hazards in the Scorecard database were found using our models.
Characterization and validation of an in silico toxicology model to predict the mutagenic potential of drug impurities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Valerio, Luis G., E-mail: luis.valerio@fda.hhs.gov; Cross, Kevin P.

Control and minimization of human exposure to potential genotoxic impurities found in drug substances and products is an important part of preclinical safety assessments of new drug products. The FDA's 2008 draft guidance on genotoxic and carcinogenic impurities in drug substances and products allows use of computational quantitative structure–activity relationships (QSAR) to identify structural alerts for known and expected impurities present at levels below qualified thresholds. This study provides the information necessary to establish the practical use of a new in silico toxicology model for predicting Salmonella t. mutagenicity (Ames assay outcome) of drug impurities and other chemicals. We describe the model's chemical content and toxicity fingerprint in terms of compound space, molecular and structural toxicophores, and have rigorously tested its predictive power using both cross-validation and external validation experiments, as well as case studies. Consistent with desired regulatory use, the model performs with high sensitivity (81%) and high negative predictivity (81%) based on external validation with 2368 compounds foreign to the model and having known mutagenicity. A database of drug impurities was created from proprietary FDA submissions and the public literature which found significant overlap between the structural features of drug impurities and training set chemicals in the QSAR model. Overall, the model's predictive performance was found to be acceptable for screening drug impurities for Salmonella mutagenicity. -- Highlights: ► We characterize a new in silico model to predict mutagenicity of drug impurities. ► The model predicts Salmonella mutagenicity and will be useful for safety assessment. ► We examine toxicity fingerprints and toxicophores of this Ames assay model. ► We compare these attributes to those found in drug impurities known to FDA/CDER. ► We validate the model and find it has a desired predictive performance.
The performance of seven QPrediction risk scores in an independent external sample of patients from general practice: a validation study

PubMed Central

Hippisley-Cox, Julia; Coupland, Carol; Brindle, Peter

2014-01-01

Objectives To validate the performance of a set of risk prediction algorithms developed using the QResearch database, in an independent sample from general practices contributing to the Clinical Practice Research Datalink (CPRD). Setting Prospective open cohort study using practices contributing to the CPRD database and practices contributing to the QResearch database. Participants The CPRD validation cohort consisted of 3.3 million patients, aged 25–99 years registered at 357 general practices between 1 Jan 1998 and 31 July 2012. The validation statistics for QResearch were obtained from the original published papers which used a one-third sample of practices separate to those used to derive the score. A cohort from QResearch was used to compare incidence rates and baseline characteristics and consisted of 6.8 million patients from 753 practices registered between 1 Jan 1998 and 31 July 2013. Outcome measures Incident events relating to seven different risk prediction scores: QRISK2 (cardiovascular disease); QStroke (ischaemic stroke); QDiabetes (type 2 diabetes); QFracture (osteoporotic fracture and hip fracture); QKidney (moderate and severe kidney failure); QThrombosis (venous thromboembolism); QBleed (intracranial bleed and upper gastrointestinal haemorrhage). Measures of discrimination and calibration were calculated. Results Overall, the baseline characteristics of the CPRD and QResearch cohorts were similar though QResearch had higher recording levels for ethnicity and family history. The validation statistics for each of the risk prediction scores were very similar in the CPRD cohort compared with the published results from QResearch validation cohorts. For example, in women, the QDiabetes algorithm explained 50% of the variation within CPRD compared with 51% on QResearch and the receiver operator curve value was 0.85 on both databases. The scores were well calibrated in CPRD.
Conclusions Each of the algorithms performed practically as well in the external independent CPRD validation cohorts as they had in the original published QResearch validation cohorts. PMID:25168040
The Reliability and Validity of the Thoracolumbar Injury Classification System in Pediatric Spine Trauma.

PubMed

Savage, Jason W; Moore, Timothy A; Arnold, Paul M; Thakur, Nikhil; Hsu, Wellington K; Patel, Alpesh A; McCarthy, Kathryn; Schroeder, Gregory D; Vaccaro, Alexander R; Dimar, John R; Anderson, Paul A

2015-09-15

The thoracolumbar injury classification system (TLICS) was evaluated in 20 consecutive pediatric spine trauma cases. The purpose of this study was to determine the reliability and validity of the TLICS in pediatric spine trauma. The TLICS was developed to improve the categorization and management of thoracolumbar trauma. TLICS has been shown to have good reliability and validity in the adult population. The clinical and radiographical findings of 20 pediatric thoracolumbar fractures were prospectively presented to 20 surgeons with disparate levels of training and experience with spinal trauma. These injuries were consecutively scored using the TLICS. Cohen unweighted κ coefficients and Spearman rank order correlation values were calculated for the key parameters (injury morphology, status of posterior ligamentous complex, neurological status, TLICS total score, and proposed management) to assess the inter-rater reliabilities. Five surgeons scored the same cases 3 months later to assess the intra-rater reliability. The actual management of each case was then compared with the treatment recommended by the TLICS algorithm to assess validity. The inter-rater κ statistics of all subgroups (injury morphology, status of the posterior ligamentous complex, neurological status, TLICS total score, and proposed treatment) were within the range of moderate to substantial reproducibility (0.524-0.958). All subgroups had excellent intra-rater reliability (0.748-1.000). The various indices for validity were calculated (80.3% correct, 0.836 sensitivity, 0.785 specificity, 0.676 positive predictive value, 0.899 negative predictive value). Overall, TLICS demonstrated good validity. The TLICS has good reliability and validity when used in the pediatric population. 
The inter-rater reliability of predicting management and indices for validity are lower than those in adults with thoracolumbar fractures, which is likely due to differences in the way children are treated for certain types of injuries. TLICS can be used to reliably categorize thoracolumbar injuries in the pediatric population; however, modifications may be needed to better guide treatment in this specific patient population.
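The validity indices listed in the abstract above (percent correct, sensitivity, specificity, positive and negative predictive value) are all derived from a 2×2 table of recommended versus actual management. A minimal sketch of that calculation follows; the function name is hypothetical and the counts are invented for illustration, since the abstract does not report the underlying table.

```python
def validity_indices(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard validity indices from 2x2 confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives relative to the reference standard.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, not taken from the study.
indices = validity_indices(tp=40, fp=20, fn=10, tn=80)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Treating agreement between the recommended and actual treatment as a binary classification is an assumption of this sketch; the study may have defined the reference standard differently.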
The fecal hemoglobin concentration, age and sex test score: Development and external validation of a simple prediction tool for colorectal cancer detection in symptomatic patients.

PubMed

Cubiella, Joaquín; Digby, Jayne; Rodríguez-Alonso, Lorena; Vega, Pablo; Salve, María; Díaz-Ondina, Marta; Strachan, Judith A; Mowat, Craig; McDonald, Paula J; Carey, Francis A; Godber, Ian M; Younes, Hakim Ben; Rodriguez-Moranta, Francisco; Quintero, Enrique; Álvarez-Sánchez, Victoria; Fernández-Bañares, Fernando; Boadas, Jaume; Campo, Rafel; Bujanda, Luis; Garayoa, Ana; Ferrandez, Ángel; Piñol, Virginia; Rodríguez-Alcalde, Daniel; Guardiola, Jordi; Steele, Robert J C; Fraser, Callum G

2017-05-15

Prediction models for colorectal cancer (CRC) detection in symptomatic patients, based on easily obtainable variables such as fecal haemoglobin concentration (f-Hb), age and sex, may simplify CRC diagnosis. We developed, and then externally validated, a multivariable prediction model, the FAST Score, with data from five diagnostic test accuracy studies that evaluated quantitative fecal immunochemical tests in symptomatic patients referred for colonoscopy. The diagnostic accuracy of the Score in derivation and validation cohorts was compared statistically with the area under the curve (AUC) and the Chi-square test. 1,572 and 3,976 patients were examined in these cohorts, respectively. For CRC, the odds ratio (OR) of the variables included in the Score were: age (years): 1.03 (95% confidence intervals (CI): 1.02-1.05), male sex: 1.6 (95% CI: 1.1-2.3) and f-Hb (0-<20 µg Hb/g feces): 2.0 (95% CI: 0.7-5.5), (20-<200 µg Hb/g): 16.8 (95% CI: 6.6-42.0), ≥200 µg Hb/g: 65.7 (95% CI: 26.3-164.1). The AUC for CRC detection was 0.88 (95% CI: 0.85-0.90) in the derivation and 0.91 (95% CI: 0.90-0.93; p = 0.005) in the validation cohort. At the two Score thresholds with 90% (4.50) and 99% (2.12) sensitivity for CRC, the Score had equivalent sensitivity, although the specificity was higher in the validation cohort (p < 0.001). Accordingly, the validation cohort was divided into three groups: high (21.4% of the cohort, positive predictive value-PPV: 21.7%), intermediate (59.8%, PPV: 0.9%) and low (18.8%, PPV: 0.0%) risk for CRC. The FAST Score is an easy to calculate prediction tool, highly accurate for CRC detection in symptomatic patients. © 2017 UICC.
Development of the Galaxy Chronic Obstructive Pulmonary Disease (COPD) Model Using Data from ECLIPSE: Internal Validation of a Linked-Equations Cohort Model.

PubMed

Briggs, Andrew H; Baker, Timothy; Risebrough, Nancy A; Chambers, Mike; Gonzalez-McQuire, Sebastian; Ismaila, Afisi S; Exuzides, Alex; Colby, Chris; Tabberer, Maggie; Muellerova, Hana; Locantore, Nicholas; Rutten van Mölken, Maureen P M H; Lomas, David A

2017-05-01

The recent joint International Society for Pharmacoeconomics and Outcomes Research / Society for Medical Decision Making Modeling Good Research Practices Task Force emphasized the importance of conceptualizing and validating models. We report a new model of chronic obstructive pulmonary disease (COPD) (part of the Galaxy project) founded on a conceptual model, implemented using a novel linked-equation approach, and internally validated. An expert panel developed a conceptual model including causal relationships between disease attributes, progression, and final outcomes. Risk equations describing these relationships were estimated using data from the Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints (ECLIPSE) study, with costs estimated from the TOwards a Revolution in COPD Health (TORCH) study. Implementation as a linked-equation model enabled direct estimation of health service costs and quality-adjusted life years (QALYs) for COPD patients over their lifetimes. Internal validation compared 3 years of predicted cohort experience with ECLIPSE results. At 3 years, the Galaxy COPD model predictions of annual exacerbation rate and annual decline in forced expiratory volume in 1 second fell within the ECLIPSE data confidence limits, although 3-year overall survival was outside the observed confidence limits. Projections of the risk equations over time permitted extrapolation to patient lifetimes. Averaging the predicted cost/QALY outcomes for the different patients within the ECLIPSE cohort gives an estimated lifetime cost of £25,214 (undiscounted)/£20,318 (discounted) and lifetime QALYs of 6.45 (undiscounted)/5.24 (discounted) per ECLIPSE patient. A new form of model for COPD was conceptualized, implemented, and internally validated, based on a series of linked equations using epidemiological data (ECLIPSE) and cost data (TORCH).
This Galaxy model predicts COPD outcomes from treatment effects on disease attributes such as lung function, exacerbations, symptoms, or exercise capacity; further external validation is required.
A whole blood gene expression-based signature for smoking status

PubMed Central

2012-01-01

Background Smoking is the leading cause of preventable death worldwide and has been shown to increase the risk of multiple diseases including coronary artery disease (CAD). We sought to identify genes whose levels of expression in whole blood correlate with self-reported smoking status. Methods Microarrays were used to identify gene expression changes in whole blood which correlated with self-reported smoking status; a set of significant genes from the microarray analysis were validated by qRT-PCR in an independent set of subjects. Stepwise forward logistic regression was performed using the qRT-PCR data to create a predictive model whose performance was validated in an independent set of subjects and compared to cotinine, a nicotine metabolite. Results Microarray analysis of whole blood RNA from 209 PREDICT subjects (41 current smokers, 4 quit ≤ 2 months, 64 quit > 2 months, 100 never smoked; NCT00500617) identified 4214 genes significantly correlated with self-reported smoking status. qRT-PCR was performed on 1,071 PREDICT subjects across 256 microarray genes significantly correlated with smoking or CAD. A five gene (CLDND1, LRRN3, MUC1, GOPC, LEF1) predictive model, derived from the qRT-PCR data using stepwise forward logistic regression, had a cross-validated mean AUC of 0.93 (sensitivity=0.78; specificity=0.95), and was validated using 180 independent PREDICT subjects (AUC=0.82, CI 0.69-0.94; sensitivity=0.63; specificity=0.94). Plasma from the 180 validation subjects was used to assess levels of cotinine; a model using a threshold of 10 ng/ml cotinine resulted in an AUC of 0.89 (CI 0.81-0.97; sensitivity=0.81; specificity=0.97; kappa with expression model = 0.53). Conclusion We have constructed and validated a whole blood gene expression score for the evaluation of smoking status, demonstrating that clinical and environmental factors contributing to cardiovascular disease risk can be assessed by gene expression. PMID:23210427
Academic performance, career potential, creativity, and job performance: can one construct predict them all?

PubMed

Kuncel, Nathan R; Hezlett, Sarah A; Ones, Deniz S

2004-01-01

This meta-analysis addresses the question of whether 1 general cognitive ability measure developed for predicting academic performance is valid for predicting performance in both educational and work domains. The validity of the Miller Analogies Test (MAT; W. S. Miller, 1960) for predicting 18 academic and work-related criteria was examined. MAT correlations with other cognitive tests (e.g., Raven's Matrices [J. C. Raven, 1965]; Graduate Record Examinations) also were meta-analyzed. The results indicate that the abilities measured by the MAT are shared with other cognitive ability instruments and that these abilities are generalizably valid predictors of academic and vocational criteria, as well as evaluations of career potential and creativity. These findings contradict the notion that intelligence at work is wholly different from intelligence at school, extending the voluminous literature that supports the broad importance of general cognitive ability (g).

Validation of Predictors of Fall Events in Hospitalized Patients With Cancer.

PubMed

Weed-Pfaff, Samantha H; Nutter, Benjamin; Bena, James F; Forney, Jennifer; Field, Rosemary; Szoka, Lynn; Karius, Diana; Akins, Patti; Colvin, Christina M; Albert, Nancy M

2016-10-01

A seven-item cancer-specific fall risk tool (Cleveland Clinic Capone-Albert [CC-CA] Fall Risk Score) was shown to have a strong concordance index for predicting falls; however, validation of the model is needed. The aims of this study were to validate that the CC-CA Fall Risk Score, made up of six factors, predicts falls in patients with cancer and to determine if the CC-CA Fall Risk Score performs better than the Morse Fall Tool. Using a prospective, comparative methodology, data were collected from electronic health records of patients hospitalized for cancer care in four hospitals. Risk factors from each tool were recorded, when applicable. Multivariable models were created to predict the probability of a fall. A concordance index for each fall tool was calculated. The CC-CA Fall Risk Score provided higher discrimination than the Morse Fall Tool in predicting fall events in patients hospitalized for cancer management.
Hierarchical multi-scale approach to validation and uncertainty quantification of hyper-spectral image modeling

NASA Astrophysics Data System (ADS)

Engel, Dave W.; Reichardt, Thomas A.; Kulp, Thomas J.; Graff, David L.; Thompson, Sandra E.

2016-05-01

Validating predictive models and quantifying uncertainties inherent in the modeling process is a critical component of the HARD Solids Venture program [1]. Our current research focuses on validating physics-based models predicting the optical properties of solid materials for arbitrary surface morphologies and characterizing the uncertainties in these models. We employ a systematic and hierarchical approach by designing physical experiments and comparing the experimental results with the outputs of computational predictive models. We illustrate this approach through an example comparing a micro-scale forward model to an idealized solid-material system and then propagating the results through a system model to the sensor level. Our efforts should enhance detection reliability of the hyper-spectral imaging technique and the confidence in model utilization and model outputs by users and stakeholders.
Brazilian validation of the Alberta Infant Motor Scale.

PubMed

Valentini, Nadia Cristina; Saccani, Raquel

2012-03-01

The Alberta Infant Motor Scale (AIMS) is a well-known motor assessment tool used to identify potential delays in infants' motor development. Although Brazilian researchers and practitioners have used the AIMS in laboratories and clinical settings, its translation to Portuguese and validation for the Brazilian population is yet to be investigated. This study aimed to translate and validate all AIMS items with respect to internal consistency and content, criterion, and construct validity. A cross-sectional and longitudinal design was used. A cross-cultural translation was used to generate a Brazilian-Portuguese version of the AIMS. In addition, a validation process was conducted involving 22 professionals and 766 Brazilian infants (aged 0-18 months). The results demonstrated language clarity and internal consistency for the motor criteria (motor development score, α=.90; prone, α=.85; supine, α=.92; sitting, α=.84; and standing, α=.86). The analysis also revealed high discriminative power to identify typical and atypical development (motor development score, P<.001; percentile, P=.04; classification criterion, χ²=6.03; P=.05). Temporal stability (P=.07) (rho=.85, P<.001) was observed, and predictive power (P<.001) was limited to the group of infants aged from 3 months to 9 months. Limited predictive validity was observed, which may have been due to the restricted time that the groups were followed longitudinally. In sum, the translated version of AIMS presented adequate validity and reliability.
Assessment of generalizability, applicability and predictability (GAP) for evaluating external validity in studies of universal family-based prevention of alcohol misuse in young people: systematic methodological review of randomized controlled trials.

PubMed

Fernandez-Hermida, Jose Ramon; Calafat, Amador; Becoña, Elisardo; Tsertsvadze, Alexander; Foxcroft, David R

2012-09-01

To assess external validity characteristics of studies from two Cochrane Systematic Reviews of the effectiveness of universal family-based prevention of alcohol misuse in young people. Two reviewers used an a priori developed external validity rating form and independently assessed three external validity dimensions of generalizability, applicability and predictability (GAP) in randomized controlled trials. The majority (69%) of the included 29 studies were rated 'unclear' on the reporting of sufficient information for judging generalizability from sample to study population. Ten studies (35%) were rated 'unclear' on the reporting of sufficient information for judging applicability to other populations and settings. No study provided an assessment of the validity of the trial end-point measures for subsequent mortality, morbidity, quality of life or other economic or social outcomes. Similarly, no study reported on the validity of surrogate measures using established criteria for assessing surrogate end-points. Studies evaluating the benefits of family-based prevention of alcohol misuse in young people are generally inadequate at reporting information relevant to generalizability of the findings or implications for health or social outcomes. Researchers, study authors, peer reviewers, journal editors and scientific societies should take steps to improve the reporting of information relevant to external validity in prevention trials. © 2012 The Authors. Addiction © 2012 Society for the Study of Addiction.
The Validity of Conscientiousness Is Overestimated in the Prediction of Job Performance.

PubMed

Kepes, Sven; McDaniel, Michael A

2015-01-01

Sensitivity analyses refer to investigations of the degree to which the results of a meta-analysis remain stable when conditions of the data or the analysis change. To the extent that results remain stable, one can refer to them as robust. Sensitivity analyses are rarely conducted in the organizational science literature. Despite conscientiousness being a valued predictor in employment selection, sensitivity analyses have not been conducted with respect to meta-analytic estimates of the correlation (i.e., validity) between conscientiousness and job performance. To address this deficiency, we reanalyzed the largest collection of conscientiousness validity data in the personnel selection literature and conducted a variety of sensitivity analyses. Publication bias analyses demonstrated that the validity of conscientiousness is moderately overestimated (by around 30%; a correlation difference of about .06). The misestimation of the validity appears to be due primarily to suppression of small effect sizes in the journal literature. These inflated validity estimates result in an overestimate of the dollar utility of personnel selection by millions of dollars and should be of considerable concern for organizations. The fields of management and applied psychology seldom conduct sensitivity analyses. Through the use of sensitivity analyses, this paper documents that the existing literature overestimates the validity of conscientiousness in the prediction of job performance. Our data show that effect sizes from journal articles are largely responsible for this overestimation.
The Validity of Conscientiousness Is Overestimated in the Prediction of Job Performance

PubMed Central

2015-01-01

Introduction Sensitivity analyses refer to investigations of the degree to which the results of a meta-analysis remain stable when conditions of the data or the analysis change. To the extent that results remain stable, one can refer to them as robust. Sensitivity analyses are rarely conducted in the organizational science literature. Despite conscientiousness being a valued predictor in employment selection, sensitivity analyses have not been conducted with respect to meta-analytic estimates of the correlation (i.e., validity) between conscientiousness and job performance. Methods To address this deficiency, we reanalyzed the largest collection of conscientiousness validity data in the personnel selection literature and conducted a variety of sensitivity analyses. Results Publication bias analyses demonstrated that the validity of conscientiousness is moderately overestimated (by around 30%; a correlation difference of about .06). The misestimation of the validity appears to be due primarily to suppression of small effect sizes in the journal literature. These inflated validity estimates result in an overestimate of the dollar utility of personnel selection by millions of dollars and should be of considerable concern for organizations. Conclusion The fields of management and applied psychology seldom conduct sensitivity analyses. Through the use of sensitivity analyses, this paper documents that the existing literature overestimates the validity of conscientiousness in the prediction of job performance. Our data show that effect sizes from journal articles are largely responsible for this overestimation. PMID:26517553
Validation of Accelerometer Prediction Equations in Children with Chronic Disease.

PubMed

Stephens, Samantha; Takken, Tim; Esliger, Dale W; Pullenayegum, Eleanor; Beyene, Joseph; Tremblay, Mark; Schneiderman, Jane; Biggar, Doug; Longmuir, Pat; McCrindle, Brian; Abad, Audrey; Ignas, Dan; Van Der Net, Janjaap; Feldman, Brian

2016-02-01

The purpose of this study was to assess the criterion validity of existing accelerometer-based energy expenditure (EE) prediction equations among children with chronic conditions, and to develop new prediction equations. Children with congenital heart disease (CHD), cystic fibrosis (CF), dermatomyositis (JDM), juvenile arthritis (JA), inherited muscle disease (IMD), and hemophilia (HE) completed 7 tasks while EE was measured using indirect calorimetry with counts determined by accelerometer. Agreement between predicted EE and measured EE was assessed. Disease-specific equations and cut points were developed and cross-validated. In total, 196 subjects participated. One participant dropped out before testing due to time constraints, while 15 CHD, 32 CF, 31 JDM, 31 JA, 30 IMD, 28 HE, and 29 healthy controls completed the study. Agreement between predicted and measured EE varied across disease group and ranged from (ICC) .13-.46. Disease-specific prediction equations exhibited a range of results (ICC .62-.88) (SE 0.45-0.78). In conclusion, poor agreement was demonstrated using current prediction equations in children with chronic conditions. Disease-specific equations and cut points were developed.
Initial Retrieval Validation from the Joint Airborne IASI Validation Experiment (JAIVEx)

NASA Technical Reports Server (NTRS)

Zhou, Daniel K.; Liu, Xu; Smith, William L.; Larar, Allen M.; Taylor, Jonathan P.; Revercomb, Henry E.; Mango, Stephen A.; Schluessel, Peter; Calbet, Xavier

2007-01-01

The Joint Airborne IASI Validation Experiment (JAIVEx) was conducted during April 2007 mainly for validation of the Infrared Atmospheric Sounding Interferometer (IASI) on the MetOp satellite, but also included a strong component focusing on validation of the Atmospheric InfraRed Sounder (AIRS) aboard the AQUA satellite. The cross validation of IASI and AIRS is important for the joint use of their data in the global Numerical Weather Prediction process. Initial inter-comparisons of geophysical products have been conducted from different aspects, such as using different measurements from airborne ultraspectral Fourier transform spectrometers (specifically, the NPOESS Airborne Sounder Testbed Interferometer (NAST-I) and the Scanning-High resolution Interferometer Sounder (S-HIS) aboard the NASA WB-57 aircraft), UK Facility for Airborne Atmospheric Measurements (FAAM) BAe146-301 aircraft in situ instruments, dedicated dropsondes, radiosondes, and ground-based Raman Lidar. An overview of the JAIVEx retrieval validation plan and some initial results of this field campaign are presented.
[ETAP: A smoking scale for Primary Health Care].

PubMed

González Romero, Pilar María; Cuevas Fernández, Francisco Javier; Marcelino Rodríguez, Itahisa; Rodríguez Pérez, María Del Cristo; Cabrera de León, Antonio; Aguirre-Jaime, Armando

2016-05-01

To obtain a scale of tobacco exposure to address smoking cessation. Follow-up of a cohort. Scale validation. Primary Care Research Unit. Tenerife. A total of 6729 participants from the "CDC de Canarias" cohort. A scale was constructed under the assumption that the time of exposure to tobacco is the key factor to express accumulated risk. Discriminant validity was tested on prevalent cases of acute myocardial infarction (AMI; n=171), and its best cut-off for preventive screening was obtained. Its predictive validity was tested with incident cases of AMI (n=46), comparing the predictive power with markers (age, sex) and classic risk factors of AMI (hypertension, diabetes, dyslipidaemia), including the pack-years index (PYI). The scale obtained was the sum of three times the years that they had smoked plus years exposed to smoking at home and at work. The frequency of AMI increased with the values of the scale, with the value 20 years of exposure being the most appropriate cut-off for preventive action, as it provided adequate predictive values for incident AMI. The scale surpassed PYI in predicting AMI, and competed with the known markers and risk factors. The proposed scale allows a valid measurement of exposure to smoking and provides a useful and simple approach that can help promote a willingness to change, as well as prevention. It still needs to demonstrate its validity, taking as reference other problems associated with smoking. Copyright © 2015 Elsevier España, S.L.U. All rights reserved.
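The scale described in this abstract is simple arithmetic, so a short sketch may make it concrete. It assumes that "years exposed to smoking at home and at work" means two separate year counts that are added together, and that scores at or above the reported cut-off of 20 are the ones flagged; the function and parameter names are hypothetical and do not come from the study.

```python
def etap_score(years_smoked, years_exposed_home, years_exposed_work):
    """Exposure score as worded in the abstract: three times the years
    smoked, plus years exposed at home, plus years exposed at work.
    Names are hypothetical; treating home and work as separate addends
    is an assumption."""
    return 3 * years_smoked + years_exposed_home + years_exposed_work


CUTOFF_YEARS = 20  # cut-off value reported in the abstract


def flag_for_preventive_action(score, cutoff=CUTOFF_YEARS):
    # Assumption: the cut-off is inclusive (score >= 20 is flagged).
    return score >= cutoff


# Hypothetical person: smoked 5 years, 4 years exposed at home, 2 at work.
example_score = etap_score(5, 4, 2)
example_flag = flag_for_preventive_action(example_score)
```

Weighting the years of active smoking by three reflects the abstract's premise that time of exposure is the key factor; the weight itself is taken from the abstract, not derived here.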
Hybrid optimal descriptors as a tool to predict skin sensitization in accordance to OECD principles.

PubMed

Toropova, Alla P; Toropov, Andrey A

2017-06-05

Skin sensitization (allergic contact dermatitis) is a widespread problem arising from the contact of chemicals with the skin. The detection of molecular features with undesired effects on the skin is a complex task owing to unclear biochemical mechanisms and the poorly understood conditions under which chemicals act on the skin. The development of computational methods for estimation of this endpoint in order to reduce animal testing is recommended (Cosmetics Directive EC regulation 1907/2006; EU Regulation 1223/2009). The CORAL software (http://www.insilico.eu/coral) gives good predictive models for skin sensitization. The simplified molecular input-line entry system (SMILES) together with the molecular graph are used to represent the molecular structure for these models. So-called hybrid optimal descriptors are used to establish quantitative structure-activity relationships (QSARs). The aim of this study is the estimation of the predictive potential of the hybrid descriptors. Three different distributions into the training (≈70%), calibration (≈15%), and validation (≈15%) sets are studied. QSARs for these three distributions are built using the Monte Carlo technique. The statistical characteristics of these models for the external validation set are used as a measure of their predictive potential. The best model, according to the above criterion, is characterized by n = 29, r² = 0.8596, and RMSE = 0.489 for the validation set. Mechanistic interpretation and domain of applicability for these models are defined. Copyright © 2017 Elsevier B.V. All rights reserved.
Validation of a new formula for predicting body weight in a Mexican population with overweight and obesity.

PubMed

Quiroz-Olguín, Gabriela; Serralde-Zúñiga, Aurora Elizabeth; Saldaña-Morales, Vianey; Guevara-Cruz, Martha

2013-01-01

Body weight measurement is of critical importance when evaluating the nutritional status of patients entering a hospital. In some situations, such as the case of patients who are bedridden or in wheelchairs, these measurements cannot be obtained using standardized methods. We have designed and validated a formula for predicting body weight. To design and validate a formula for predicting body weight using circumference-based equations. The following anthropometric measurements were taken for a sample of 76 patients: weight (kg), calf circumference, average arm circumference, waist circumference, hip circumference, wrist circumference and demispan. All circumferences were taken in centimetres (cm), and gender and age were taken into account. This equation was validated in 85 individuals from a different population. The correlation with the new equation was analyzed and compared to a previously validated method. The equation for weight prediction was the following: Weight = 0.524 (WC) - 0.176 (age) + 0.484 (HC) + 0.613 (DS) + 0.704 (CC) + 2.75 (WrC) - 3.330 (if female) - 140.87. The correlation coefficient was 0.96 for the total group of patients, 0.971 for men and 0.961 for women (p < 0.0001 for all measurements). The equation we developed is accurate and can be used to estimate body weight in overweight and/or obese patients with mobility problems, such as bedridden patients or patients in wheelchairs. Copyright © AULA MEDICA EDICIONES 2013. Published by AULA MEDICA. All rights reserved.
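The regression equation quoted in this abstract can be written out as a small function. This is only a sketch: the abstract does not expand its abbreviations, so reading WC, HC, DS, CC and WrC as waist circumference, hip circumference, demispan, calf circumference and wrist circumference is an assumption based on the list of measurements, as are the function and parameter names and the example values.

```python
def predicted_weight_kg(wc_cm, hc_cm, ds_cm, cc_cm, wrc_cm, age_years, female):
    """Body weight (kg) from the equation quoted in the abstract.

    Assumed abbreviation mapping (not spelled out in the abstract):
    WC = waist, HC = hip, DS = demispan, CC = calf, WrC = wrist,
    all in centimetres.
    """
    weight = (
        0.524 * wc_cm
        - 0.176 * age_years
        + 0.484 * hc_cm
        + 0.613 * ds_cm
        + 0.704 * cc_cm
        + 2.75 * wrc_cm
        - 140.87
    )
    if female:
        weight -= 3.330  # sex term applies only to women
    return weight


# Hypothetical measurements, chosen only to exercise the formula.
male_estimate = predicted_weight_kg(110, 115, 80, 40, 18, 50, female=False)
female_estimate = predicted_weight_kg(110, 115, 80, 40, 18, 50, female=True)
```

With these hypothetical inputs the estimate for a man is about 90.3 kg, and the estimate for a woman with the same measurements is 3.33 kg lower.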
Does my patient have chronic Chagas disease? Development and temporal validation of a diagnostic risk score.

PubMed

Brasil, Pedro Emmanuel Alvarenga Americano do; Xavier, Sergio Salles; Holanda, Marcelo Teixeira; Hasslocher-Moreno, Alejandro Marcel; Braga, José Ueleres

2016-01-01

With the globalization of Chagas disease, inexperienced health care providers may have difficulties in identifying which patients should be examined for this condition. This study aimed to develop and validate a diagnostic clinical prediction model for chronic Chagas disease. This diagnostic cohort study included consecutive volunteers suspected to have chronic Chagas disease. The clinical information was blindly compared to serological tests results, and a logistic regression model was fit and validated. The development cohort included 602 patients, and the validation cohort included 138 patients. The Chagas disease prevalence was 19.9%. Sex, age, referral from blood bank, history of living in a rural area, recognizing the kissing bug, systemic hypertension, number of siblings with Chagas disease, number of relatives with a history of stroke, ECG with low voltage, anterosuperior divisional block, pathologic Q wave, right bundle branch block, and any kind of extrasystole were included in the final model. Calibration and discrimination in the development and validation cohorts (ROC AUC 0.904 and 0.912, respectively) were good. Sensitivity and specificity analyses showed that specificity reaches at least 95% above the predicted 43% risk, while sensitivity is at least 95% below the predicted 7% risk. Net benefit decision curves favor the model across all thresholds. A nomogram and an online calculator (available at http://shiny.ipec.fiocruz.br:3838/pedrobrasil/chronic_chagas_disease_prediction/) were developed to aid in individual risk estimation.
Evaluation of the Predictive Validity of Thermography in Identifying Extravasation With Intravenous Chemotherapy Infusions

PubMed Central

Murayama, Ryoko; Tanabe, Hidenori; Oe, Makoto; Motoo, Yoshiharu; Wagatsuma, Takanori; Michibuchi, Michiko; Kinoshita, Sachiko; Sakai, Keiko; Konya, Chizuko; Sugama, Junko; Sanada, Hiromi

2017-01-01

Early detection of extravasation is important, but conventional methods of detection lack objectivity and reliability. This study evaluated the predictive validity of thermography for identifying extravasation during intravenous antineoplastic therapy. Of 257 patients who received chemotherapy through peripheral veins, extravasation was identified in 26. Thermography was performed every 15 to 30 minutes during the infusions. Sensitivity, specificity, positive predictive value, and negative predictive value using thermography were 84.6%, 94.8%, 64.7%, and 98.2%, respectively. This study showed that thermography offers an accurate prediction of extravasation. PMID:29112585
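The four accuracy figures in this abstract all derive from a single 2x2 table, which the sketch below illustrates. The cell counts are not reported in the abstract; they are back-calculated here from the 257 patients, the 26 extravasations and the stated percentages, and should be read as an assumption that happens to reproduce those percentages, not as the study's data. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from the cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # detected among true cases
        "specificity": tn / (tn + fp),  # cleared among non-cases
        "ppv": tp / (tp + fp),          # true cases among positive tests
        "npv": tn / (tn + fn),          # non-cases among negative tests
    }


# Assumed counts (inferred, not reported): 26 cases and 231 non-cases
# among 257 patients.
metrics = diagnostic_metrics(tp=22, fn=4, fp=12, tn=219)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
```

Under these assumed counts the positive predictive value is much lower than the specificity because extravasation was uncommon (26 of 257 patients), so even a small false-positive rate yields a sizeable share of false alarms.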
Examining the Potential for Gender Bias in the Prediction of Symptom Validity Test Failure by MMPI-2 Symptom Validity Scale Scores

ERIC Educational Resources Information Center

Lee, Tayla T. C.; Graham, John R.; Sellbom, Martin; Gervais, Roger O.

2012-01-01

Using a sample of individuals undergoing medico-legal evaluations (690 men, 519 women), the present study extended past research on potential gender biases for scores of the Symptom Validity (FBS) scale of the Minnesota Multiphasic Personality Inventory-2 by examining score- and item-level differences between men and women and determining the…
Additional Evidence for the Reliability and Validity of the Student Risk Screening Scale at the High School Level: A Replication and Extension

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy P.; Ennis, Robin Parks; Cox, Meredith Lucille; Schatschneider, Christopher; Lambert, Warren

2013-01-01

This study reports findings from a validation study of the Student Risk Screening Scale for use with 9th- through 12th-grade students (N = 1854) attending a rural fringe school. Results indicated high internal consistency, test-retest stability, and inter-rater reliability. Predictive validity was established across two academic years, with Spring…
The Cognitive Abilities Scale--Second Edition Preschool Form: Studies of Concurrent Criterion-Related, Construct, and Predictive Criterion-Related Validity

ERIC Educational Resources Information Center

Swanson, Jennifer R.; Bradley-Johnson, Sharon; Johnson, C. Merle; O'Dell, Anna Rubenaker

2009-01-01

Three studies examine the validity of the Preschool Form of the Cognitive Abilities Scale--Second Edition (CAS-2). Significant high concurrent criterion-related validity correlations, corrected for restricted range, are found between the CAS-2 and the Detroit Test of Learning Ability--Primary: Third Edition for 26 three-year-olds (r[subscript c] =…
Validity of the SAT® for Predicting Fourth-Year Grades: 2006 SAT Validity Sample. Statistical Report 2011-7

ERIC Educational Resources Information Center

Mattern, Krista D.; Patterson, Brian F.

2006-01-01

The College Board formed a research consortium with four-year colleges and universities to build a national higher education database with the primary goal of validating the SAT®, which is used in college admission and consists of three sections: critical reading (SAT-CR), mathematics (SAT-M) and writing (SAT-W). This report builds on a body of…
Connectome-based predictive modeling of attention: Comparing different functional connectivity features and prediction methods across datasets.

PubMed

Yoo, Kwangsun; Rosenberg, Monica D; Hsu, Wei-Ting; Zhang, Sheng; Li, Chiang-Shan R; Scheinost, Dustin; Constable, R Todd; Chun, Marvin M

2018-02-15

Connectome-based predictive modeling (CPM; Finn et al., 2015; Shen et al., 2017) was recently developed to predict individual differences in traits and behaviors, including fluid intelligence (Finn et al., 2015) and sustained attention (Rosenberg et al., 2016a), from functional brain connectivity (FC) measured with fMRI. Here, using the CPM framework, we compared the predictive power of three different measures of FC (Pearson's correlation, accordance, and discordance) and two different prediction algorithms (linear and partial least square [PLS] regression) for attention function. Accordance and discordance are recently proposed FC measures that respectively track in-phase synchronization and out-of-phase anti-correlation (Meskaldji et al., 2015). We defined connectome-based models using task-based or resting-state FC data, and tested the effects of (1) functional connectivity measure and (2) feature-selection/prediction algorithm on individualized attention predictions. Models were internally validated in a training dataset using leave-one-subject-out cross-validation, and externally validated with three independent datasets. The training dataset included fMRI data collected while participants performed a sustained attention task and rested (N = 25; Rosenberg et al., 2016a). The validation datasets included: 1) data collected during performance of a stop-signal task and at rest (N = 83, including 19 participants who were administered methylphenidate prior to scanning; Farr et al., 2014a; Rosenberg et al., 2016b), 2) data collected during Attention Network Task performance and rest (N = 41, Rosenberg et al., in press), and 3) resting-state data and ADHD symptom severity from the ADHD-200 Consortium (N = 113; Rosenberg et al., 2016a). 
Models defined using all combinations of functional connectivity measure (Pearson's correlation, accordance, and discordance) and prediction algorithm (linear and PLS regression) predicted attentional abilities, with correlations between predicted and observed measures of attention as high as 0.9 for internal validation, and 0.6 for external validation (all p's < 0.05). Models trained on task data outperformed models trained on rest data. Pearson's correlation and accordance features generally showed a small numerical advantage over discordance features, while PLS regression models were usually better than linear regression models. Overall, in addition to correlation features combined with linear models (Rosenberg et al., 2016a), it is useful to consider accordance features and PLS regression for CPM. Copyright © 2017 Elsevier Inc. All rights reserved.
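The internal validation named in this abstract, leave-one-subject-out cross-validation followed by a correlation between predicted and observed scores, can be sketched in a few lines. This is not the CPM pipeline: it assumes a single summary feature per subject and ordinary least squares, omits the edge-selection step, and uses synthetic numbers; all names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_predictions(xs, ys):
    """Predict each subject from a model fit on all other subjects."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return predictions


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Synthetic data: one connectivity summary value and one attention score
# per subject (illustrative only).
feature = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5]
score = [1.0, 1.9, 2.1, 2.8, 3.5, 4.1]
predicted = loocv_predictions(feature, score)
r_internal = pearson_r(predicted, score)
```

Because each prediction comes from a model that never saw the held-out subject, the resulting correlation is a less optimistic estimate than one computed on the training fit, which is consistent with the abstract reporting separate figures for internal and external validation.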
Global Precipitation Measurement (GPM) Ground Validation (GV) Science Implementation Plan

NASA Technical Reports Server (NTRS)

Petersen, Walter A.; Hou, Arthur Y.

2008-01-01

For pre-launch algorithm development and post-launch product evaluation, Global Precipitation Measurement (GPM) Ground Validation (GV) goes beyond direct comparisons of surface rain rates between ground and satellite measurements to provide the means for improving retrieval algorithms and model applications. Three approaches to GPM GV include direct statistical validation (at the surface), precipitation physics validation (in vertical columns), and integrated science validation (4-dimensional). These three approaches support five themes: core satellite error characterization; constellation satellites validation; development of physical models of snow, cloud water, and mixed phase; development of cloud-resolving model (CRM) and land-surface models to bridge observations and algorithms; and development of coupled CRM-land surface modeling for basin-scale water budget studies and natural hazard prediction. This presentation describes the implementation of these approaches.
Validity of the Brazilian version of the Godin-Shephard Leisure-Time Physical Activity Questionnaire.

PubMed

João, Thaís Moreira São; Rodrigues, Roberta Cunha Matheus; Gallani, Maria Cecília Bueno Jayme; Miura, Cinthya Tamie Passos; Domingues, Gabriela de Barros Leite; Amireault, Steve; Godin, Gaston

2015-09-01

This study provides evidence of construct validity for the Brazilian version of the Godin-Shephard Leisure-Time Physical Activity Questionnaire (GSLTPAQ), a 1-item instrument used among 236 participants referred for cardiopulmonary exercise testing. The Baecke Habitual Physical Activity Questionnaire (Baecke-HPA) was used to evaluate convergent and divergent validity. The self-reported measure of walking (QCAF) evaluated the convergent validity. Cardiorespiratory fitness assessed convergent validity by the Veterans Specific Activity Questionnaire (VSAQ), peak measured (VO2peak) and maximum predicted (VO2pred) oxygen uptake. Partial adjusted correlation coefficients between the GSLTPAQ, Baecke-HPA, QCAF, VO2pred and VSAQ provided evidence for convergent validity; while divergent validity was supported by the absence of correlations between the GSLTPAQ and the Occupational Physical Activity domain (Baecke-HPA). The GSLTPAQ presents level 3 of evidence of construct validity and may be useful to assess leisure-time physical activity among patients with cardiovascular disease and healthy individuals.
