predictive validation assessments: Topics by Science.gov

Sample records for predictive validation assessments

Risk assessment for juvenile justice: a meta-analysis.

PubMed

Schwalbe, Craig S

2007-10-01

Risk assessment instruments are increasingly employed by juvenile justice settings to estimate the likelihood of recidivism among delinquent juveniles. In concert with their increased use, validation studies documenting their predictive validity have increased in number. The purpose of this study was to assess the average predictive validity of juvenile justice risk assessment instruments and to identify risk assessment characteristics that are associated with higher predictive validity. A search of the published and grey literature yielded 28 studies that estimated the predictive validity of 28 risk assessment instruments. Findings of the meta-analysis were consistent with effect sizes obtained in larger meta-analyses of criminal justice risk assessment instruments and showed that brief risk assessment instruments had smaller effect sizes than other types of instruments. However, this finding is tentative owing to limitations of the literature.
The Predictive Validity of the Minnesota Reading Assessment for Students in Postsecondary Vocational Education Programs.

ERIC Educational Resources Information Center

Brown, James M.; Chang, Gerald

1982-01-01

The predictive validity of the Minnesota Reading Assessment (MRA) when used to project potential performance of postsecondary vocational-technical education students was examined. Findings confirmed the MRA to be a valid predictor, although the error in prediction varied between the criterion variables. (Author/GK)
Predictive Validity of Measures of the Pathfinder Scaling Algorithm on Programming Performance: Alternative Assessment Strategy for Programming Education

ERIC Educational Resources Information Center

Lau, Wilfred W. F.; Yuen, Allan H. K.

2009-01-01

Recent years have seen a shift in focus from assessment of learning to assessment for learning and the emergence of alternative assessment methods. However, the reliability and validity of these methods as assessment tools are still questionable. In this article, we investigated the predictive validity of measures of the Pathfinder Scaling…
A new framework to enhance the interpretation of external validation studies of clinical prediction models.

PubMed

Debray, Thomas P A; Vergouwe, Yvonne; Koffijberg, Hendrik; Nieboer, Daan; Steyerberg, Ewout W; Moons, Karel G M

2015-03-01

It is widely acknowledged that the performance of diagnostic and prognostic prediction models should be assessed in external validation studies with independent data from "different but related" samples as compared with that of the development sample. We developed a framework of methodological steps and statistical methods for analyzing and enhancing the interpretation of results from external validation studies of prediction models. We propose to quantify the degree of relatedness between development and validation samples on a scale ranging from reproducibility to transportability by evaluating their corresponding case-mix differences. We subsequently assess the models' performance in the validation sample and interpret the performance in view of the case-mix differences. Finally, we may adjust the model to the validation setting. We illustrate this three-step framework with a prediction model for diagnosing deep venous thrombosis using three validation samples with varying case mix. While one external validation sample merely assessed the model's reproducibility, two other samples rather assessed model transportability. The performance in all validation samples was adequate, and the model did not require extensive updating to correct for miscalibration or poor fit to the validation settings. The proposed framework enhances the interpretation of findings at external validation of prediction models. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Design Characteristics Influence Performance of Clinical Prediction Rules in Validation: A Meta-Epidemiological Study

PubMed Central

Ban, Jong-Wook; Emparanza, José Ignacio; Urreta, Iratxe; Burls, Amanda

2016-01-01

Background: Many new clinical prediction rules are derived and validated, but the design and reporting quality of clinical prediction research has been less than optimal. We aimed to assess whether design characteristics of validation studies were associated with the overestimation of clinical prediction rules’ performance. We also aimed to evaluate whether validation studies clearly reported important methodological characteristics. Methods: Electronic databases were searched for systematic reviews of clinical prediction rule studies published between 2006 and 2010. Data were extracted from the eligible validation studies included in the systematic reviews. A meta-analytic meta-epidemiological approach was used to assess the influence of design characteristics on predictive performance. From each validation study, it was assessed whether 7 design and 7 reporting characteristics were properly described. Results: A total of 287 validation studies of clinical prediction rules were collected from 15 systematic reviews (31 meta-analyses). Validation studies using case-control design produced a summary diagnostic odds ratio (DOR) 2.2 times (95% CI: 1.2–4.3) larger than validation studies using cohort design and unclear design. When differential verification was used, the summary DOR was overestimated by twofold (95% CI: 1.2–3.1) compared to complete, partial and unclear verification. The summary relative diagnostic odds ratio (RDOR) of validation studies with inadequate sample size was 1.9 (95% CI: 1.2–3.1) compared to studies with adequate sample size. Study site, reliability, and clinical prediction rule were adequately described in 10.1%, 9.4%, and 7.0% of validation studies, respectively. Conclusion: Validation studies with design shortcomings may overestimate the performance of clinical prediction rules. The quality of reporting among studies validating clinical prediction rules needs to be improved. PMID:26730980
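For readers unfamiliar with the diagnostic odds ratio, the following Python sketch shows only its definition from a 2×2 table, DOR = (TP × TN) / (FP × FN), and a relative DOR as the ratio of two such values. The study itself estimates summary values with a meta-analytic model, which this sketch does not reproduce; the function names and all counts are hypothetical.

```python
def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP * TN) / (FP * FN) for a 2x2 diagnostic table."""
    if fp == 0 or fn == 0:
        raise ValueError("DOR is undefined when FP or FN is zero")
    return (tp * tn) / (fp * fn)


def relative_dor(dor_a, dor_b):
    """Ratio of two DORs, e.g. studies with vs. without a design feature."""
    return dor_a / dor_b


# Hypothetical counts for illustration only (not taken from the study).
case_control = diagnostic_odds_ratio(tp=80, fp=10, fn=20, tn=90)
cohort = diagnostic_odds_ratio(tp=70, fp=15, fn=30, tn=85)
rdor = relative_dor(case_control, cohort)
print(round(case_control, 1), round(cohort, 1), round(rdor, 2))
```

A relative DOR above 1 in this toy setup means the first design yields a more optimistic accuracy estimate than the second, which is the direction of bias the abstract reports for case-control designs.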
Predictive Validity of a Student Self-Report Screener of Behavioral and Emotional Risk in an Urban High School

ERIC Educational Resources Information Center

Dowdy, Erin; Harrell-Williams, Leigh; Dever, Bridget V.; Furlong, Michael J.; Moore, Stephanie; Raines, Tara; Kamphaus, Randy W.

2016-01-01

Increasingly, schools are implementing school-based screening for risk of behavioral and emotional problems; hence, foundational evidence supporting the predictive validity of screening instruments is important to assess. This study examined the predictive validity of the Behavior Assessment System for Children-2 Behavioral and Emotional Screening…
Understanding Interrater Reliability and Validity of Risk Assessment Tools Used to Predict Adverse Clinical Events.

PubMed

Siedlecki, Sandra L; Albert, Nancy M

This article describes how to assess the interrater reliability and validity of risk assessment tools, using easy-to-follow formulas, and provides calculations that demonstrate the principles discussed. Clinical nurse specialists should be able to identify risk assessment tools that provide high-quality interrater reliability and the highest validity for predicting true events of importance to clinical settings. Making best practice recommendations for assessment tool use is critical to high-quality patient care and safe practices that impact patient outcomes and nursing resources. Optimal risk assessment tool selection requires knowledge about interrater reliability and tool validity. The clinical nurse specialist will understand the reliability and validity issues associated with risk assessment tools, and be able to evaluate tools using basic calculations. Risk assessment tools are developed to objectively predict quality and safety events and ultimately reduce the risk of event occurrence through preventive interventions. To ensure high-quality tool use, clinical nurse specialists must critically assess tool properties. The better the tool's ability to predict adverse events, the more likely that event risk is mediated. Interrater reliability and validity assessment is a relatively easy skill to master and will result in better decisions when selecting or making recommendations for risk assessment tool use.
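The abstract does not list its formulas. As one common example of an interrater reliability calculation, the Python sketch below computes Cohen's kappa from an agreement table; the choice of kappa is an assumption rather than the article's stated method, and the ratings are hypothetical.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] is the number of subjects that rater A placed in
    category i and rater B placed in category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the two raters' marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: two nurses classify 100 patients as high/low risk.
ratings = [[40, 10],
           [5, 45]]
kappa = cohens_kappa(ratings)
print(round(kappa, 2))
```

Kappa corrects raw percent agreement (here 85%) for the agreement two raters would reach by chance alone.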
Geographic and temporal validity of prediction models: Different approaches were useful to examine model performance

PubMed Central

Austin, Peter C.; van Klaveren, David; Vergouwe, Yvonne; Nieboer, Daan; Lee, Douglas S.; Steyerberg, Ewout W.

2017-01-01

Objective: Validation of clinical prediction models traditionally refers to the assessment of model performance in new patients. We studied different approaches to geographic and temporal validation in the setting of multicenter data from two time periods. Study Design and Setting: We illustrated different analytic methods for validation using a sample of 14,857 patients hospitalized with heart failure at 90 hospitals in two distinct time periods. Bootstrap resampling was used to assess internal validity. Meta-analytic methods were used to assess geographic transportability. Each hospital was used once as a validation sample, with the remaining hospitals used for model derivation. Hospital-specific estimates of discrimination (c-statistic) and calibration (calibration intercepts and slopes) were pooled using random effects meta-analysis methods. I² statistics and prediction interval width quantified geographic transportability. Temporal transportability was assessed using patients from the earlier period for model derivation and patients from the later period for model validation. Results: Estimates of reproducibility, pooled hospital-specific performance, and temporal transportability were on average very similar, with c-statistics of 0.75. Between-hospital variation was moderate according to I² statistics and prediction intervals for c-statistics. Conclusion: This study illustrates how performance of prediction models can be assessed in settings with multicenter data at different time periods. PMID:27262237
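The pooling step described above can be sketched in Python as follows. This is a minimal illustration that assumes the DerSimonian-Laird estimator and pools on the raw c-statistic scale; the abstract names neither choice, and the hospital-level estimates and variances below are hypothetical.

```python
def random_effects_pool(estimates, variances):
    """DerSimonian-Laird random-effects pooling.

    Returns (pooled estimate, standard error, tau^2, I^2 in percent).
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, se, tau2, i2


# Hypothetical hospital-specific c-statistics and their variances.
c_stats = [0.72, 0.78, 0.74, 0.77]
c_vars = [0.0004, 0.0006, 0.0005, 0.0004]
pooled, se, tau2, i2 = random_effects_pool(c_stats, c_vars)
print(round(pooled, 3), round(se, 3), round(tau2, 5), round(i2, 1))
```

I² expresses the share of total variation across hospitals that exceeds what sampling error alone would produce, so larger values indicate weaker geographic transportability.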
Automated Pressure Injury Risk Assessment System Incorporated Into an Electronic Health Record System.

PubMed

Jin, Yinji; Jin, Taixian; Lee, Sun-Mi

Pressure injury risk assessment is the first step toward preventing pressure injuries, but traditional assessment tools are time-consuming, resulting in work overload and fatigue for nurses. The objectives of the study were to build an automated pressure injury risk assessment system (Auto-PIRAS) that can assess pressure injury risk using data, without requiring nurses to collect or input additional data, and to evaluate the validity of this assessment tool. A retrospective case-control study and a system development study were conducted in a 1,355-bed university hospital in Seoul, South Korea. A total of 1,305 pressure injury patients and 5,220 nonpressure injury patients participated for the development of a risk scoring algorithm: 687 and 2,748 for the validation of the algorithm and 237 and 994 for validation after clinical implementation, respectively. A total of 4,211 pressure injury-related clinical variables were extracted from the electronic health record (EHR) systems to develop a risk scoring algorithm, which was validated and incorporated into the EHR. That program was further evaluated for predictive and concurrent validity. Auto-PIRAS, incorporated into the EHR system, assigned a risk assessment score of high, moderate, or low and displayed this on the Kardex nursing record screen. Risk scores were updated nightly according to 10 predetermined risk factors. The predictive validity measures of the algorithm validation stage were as follows: sensitivity = .87, specificity = .90, positive predictive value = .68, negative predictive value = .97, Youden index = .77, and the area under the receiver operating characteristic curve = .95. The predictive validity measures of the Braden Scale were as follows: sensitivity = .77, specificity = .93, positive predictive value = .72, negative predictive value = .95, Youden index = .70, and the area under the receiver operating characteristic curve = .85. 
The kappa of the Auto-PIRAS and Braden Scale risk classification result was .73. The predictive performance of the Auto-PIRAS was similar to Braden Scale assessments conducted by nurses. Auto-PIRAS is expected to be used as a system that assesses pressure injury risk automatically without additional data collection by nurses.
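The predictive validity measures listed above are linked by simple arithmetic. The Python sketch below derives positive and negative predictive values and the Youden index from a sensitivity, a specificity, and group sizes. It assumes that the 687 and 2,748 patients of the algorithm validation stage are the pressure injury and non-pressure-injury groups; the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, n_pos, n_neg):
    """Derive PPV, NPV and the Youden index from sensitivity and specificity."""
    tp = sensitivity * n_pos      # expected true positives
    fn = n_pos - tp               # expected false negatives
    tn = specificity * n_neg      # expected true negatives
    fp = n_neg - tn               # expected false positives
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "youden": sensitivity + specificity - 1,
    }


# Reported Auto-PIRAS validation-stage values: sensitivity .87, specificity .90.
auto_piras = predictive_values(0.87, 0.90, n_pos=687, n_neg=2748)
print(auto_piras)
```

Under that assumption the results come out close to the reported .68, .97, and .77; small differences are expected because the published sensitivity and specificity are themselves rounded.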
Predicting child maltreatment: A meta-analysis of the predictive validity of risk assessment instruments.

PubMed

van der Put, Claudia E; Assink, Mark; Boekhout van Solinge, Noëlle F

2017-11-01

Risk assessment is crucial in preventing child maltreatment since it can identify high-risk cases in need of child protection intervention. Despite widespread use of risk assessment instruments in child welfare, it is unknown how well these instruments predict maltreatment and what instrument characteristics are associated with higher levels of predictive validity. Therefore, a multilevel meta-analysis was conducted to examine the predictive accuracy of (characteristics of) risk assessment instruments. A literature search yielded 30 independent studies (N=87,329) examining the predictive validity of 27 different risk assessment instruments. From these studies, 67 effect sizes could be extracted. Overall, a medium significant effect was found (AUC=0.681), indicating a moderate predictive accuracy. Moderator analyses revealed that onset of maltreatment can be better predicted than recurrence of maltreatment, which is a promising finding for early detection and prevention of child maltreatment. In addition, actuarial instruments were found to outperform clinical instruments. To bring risk and needs assessment in child welfare to a higher level, actuarial instruments should be further developed and strengthened by distinguishing risk assessment from needs assessment and by integrating risk assessment with case management. Copyright © 2017 Elsevier Ltd. All rights reserved.
Predictive Validity of the HKT-R Risk Assessment Tool: Two and 5-Year Violent Recidivism in a Nationwide Sample of Dutch Forensic Psychiatric Patients.

PubMed

Bogaerts, Stefan; Spreen, Marinus; Ter Horst, Paul; Gerlsma, Coby

2018-06-01

This study examined the predictive validity of the Historical Clinical Future [Historisch Klinisch Toekomst] Revised risk assessment scheme in a cohort of 347 forensic psychiatric patients who were discharged between 2004 and 2008 from any of 12 highly secure forensic centers in the Netherlands. Predictive validity was measured 2 and 5 years after release. Official reconviction data obtained from the Dutch Ministry of Security and Justice were used as outcome measures. Violent reoffending within 2 and 5 years after discharge was assessed. With regard to violent reoffending, results indicated that the predictive validity of the Historical domain was modest for 2 (area under the curve [AUC] = .75) and 5 (AUC = .74) years. The predictive validity of the Clinical domain was marginal for 2 (admission: AUC = .62; discharge: AUC = .63) and 5 (admission: AUC = .69; discharge: AUC = .62) years after release. The predictive validity of the Future domain was modest (AUC = .71) for 2 years and low for 5 (AUC = .58) years. The total score of the instrument was modest for 2 years (AUC = .78) and marginal for 5 (AUC = .68) years. Finally, the Final Risk Judgment was modest for 2 years (AUC = .78) and marginal for 5 (AUC = .63) years' time at risk. It is concluded that this risk assessment instrument appears to be a satisfactory instrument for risk assessment.
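The AUC values above can be read as the probability that a randomly chosen reoffender received a higher risk score than a randomly chosen non-reoffender. A minimal Python sketch of that pairwise definition follows, with hypothetical scores; the study's own software and procedure are not described in the abstract.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical risk scores (not study data).
reoffenders = [3, 4, 5]
non_reoffenders = [1, 2, 3]
print(round(auc(reoffenders, non_reoffenders), 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for values between .58 and .78.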
How Nonrecidivism Affects Predictive Accuracy: Evidence from a Cross-Validation of the Ontario Domestic Assault Risk Assessment (ODARA)

ERIC Educational Resources Information Center

Hilton, N. Zoe; Harris, Grant T.

2009-01-01

Prediction effect sizes such as ROC area are important for demonstrating a risk assessment's generalizability and utility. How a study defines recidivism might affect predictive accuracy. Nonrecidivism is problematic when predicting specialized violence (e.g., domestic violence). The present study cross-validates the ability of the Ontario…
The Predictive Validity of Interim Assessment Scores Based on the Full-Information Bifactor Model for the Prediction of End-of-Grade Test Performance

ERIC Educational Resources Information Center

Immekus, Jason C.; Atitya, Ben

2016-01-01

Interim tests are a central component of district-wide assessment systems, yet their technical quality to guide decisions (e.g., instructional) has been repeatedly questioned. In response, the study purpose was to investigate the validity of a series of English Language Arts (ELA) interim assessments in terms of dimensionality and prediction of…
Role of learning potential in cognitive remediation: Construct and predictive validity.

PubMed

Davidson, Charlie A; Johannesen, Jason K; Fiszdon, Joanna M

2016-03-01

The construct, convergent, discriminant, and predictive validity of Learning Potential (LP) was evaluated in a trial of cognitive remediation for adults with schizophrenia-spectrum disorders. LP utilizes a dynamic assessment approach to prospectively estimate an individual's learning capacity if provided the opportunity for specific related learning. LP was assessed in 75 participants at study entry, of whom 41 completed an eight-week cognitive remediation (CR) intervention, and 22 received treatment-as-usual (TAU). LP was assessed in a "test-train-test" verbal learning paradigm. Incremental predictive validity was assessed as the degree to which LP predicted memory skill acquisition above and beyond prediction by static verbal learning ability. Examination of construct validity confirmed that LP scores reflected use of trained semantic clustering strategy. LP scores correlated with executive functioning and education history, but not other demographics or symptom severity. Following the eight-week active phase, TAU evidenced little substantial change in skill acquisition outcomes, which related to static baseline verbal learning ability but not LP. For the CR group, LP significantly predicted skill acquisition in domains of verbal and visuospatial memory, but not auditory working memory. Furthermore, LP predicted skill acquisition incrementally beyond relevant background characteristics, symptoms, and neurocognitive abilities. Results suggest that LP assessment can significantly improve prediction of specific skill acquisition with cognitive training, particularly for the domain assessed, and thereby may prove useful in individualization of treatment. Published by Elsevier B.V.
A Comparative Study of Adolescent Risk Assessment Instruments: Predictive and Incremental Validity

ERIC Educational Resources Information Center

Welsh, Jennifer L.; Schmidt, Fred; McKinnon, Lauren; Chattha, H. K.; Meyers, Joanna R.

2008-01-01

Promising new adolescent risk assessment tools are being incorporated into clinical practice but currently possess limited evidence of predictive validity regarding their individual and/or combined use in risk assessments. The current study compares three structured adolescent risk instruments, Youth Level of Service/Case Management Inventory…
Testing the Predictive Validity of the Hendrich II Fall Risk Model.

PubMed

Jung, Hyesil; Park, Hyeoun-Ae

2018-03-01

Cumulative data on patient fall risk have been compiled in electronic medical records systems, and it is possible to test the validity of fall-risk assessment tools using these data between the times of admission and occurrence of a fall. The Hendrich II Fall Risk Model scores assessed at three time points during hospital stays were extracted and used for testing the predictive validity: (a) upon admission, (b) at the time of the maximum fall-risk score between admission and falling or discharge, and (c) immediately before falling or discharge. Predictive validity was examined using seven predictive indicators. In addition, logistic regression analysis was used to identify factors that significantly affect the occurrence of a fall. Among the different time points, the maximum fall-risk score assessed between admission and falling or discharge showed the best predictive performance. Confusion or disorientation and having a poor ability to rise from a sitting position were significant risk factors for a fall.
Predictive Validity Study of the APS Writing and Reading Tests [and] Validating Placement Rules for the APS Writing Test.

ERIC Educational Resources Information Center

College of the Canyons, Valencia, CA. Office of Institutional Development.

California's College of the Canyons has used the College Board Assessment and Placement Services (APS) test to assess students' abilities in basic and college English since spring 1993. These two reports summarize data from a May 1994 study of the predictive validity of the APS writing and reading tests and a June 1994 effort to validate the cut…
Responsiveness and predictive validity of the tablet-based symbol digit modalities test in patients with stroke.

PubMed

Hsiao, Pei-Chi; Yu, Wan-Hui; Lee, Shih-Chieh; Chen, Mei-Hsiang; Hsieh, Ching-Lin

2018-06-14

The responsiveness and predictive validity of the Tablet-based Symbol Digit Modalities Test (T-SDMT) are unknown, which limits the utility of the T-SDMT in both clinical and research settings. The purpose of this study was to examine the responsiveness and predictive validity of the T-SDMT in inpatients with stroke. The study used a follow-up, repeated-assessments design and was conducted in one rehabilitation unit at a local medical center. A total of 50 inpatients receiving rehabilitation completed T-SDMT assessments at admission to and discharge from a rehabilitation ward. The median follow-up period was 14 days. The Barthel index (BI) was assessed at discharge and was used as the criterion of the predictive validity. The mean changes in the T-SDMT scores between admission and discharge were statistically significant (paired t-test = 3.46, p = 0.001). The T-SDMT scores showed a nearly moderate standardized response mean (0.49). A moderate association (Pearson's r = 0.47) was found between the scores of the T-SDMT at admission and those of the BI at discharge, indicating good predictive validity of the T-SDMT. Our results support the responsiveness and predictive validity of the T-SDMT in patients with stroke receiving rehabilitation in hospitals. This study provides empirical evidence supporting the use of the T-SDMT as an outcome measure for assessing processing speed in inpatients with stroke. The scores of the T-SDMT could be used to predict basic activities of daily living function in inpatients with stroke.
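The standardized response mean (SRM) is the mean of the paired change scores divided by their standard deviation, so for paired data the paired t statistic equals SRM × √n. The Python sketch below illustrates both points; the patient scores are hypothetical, and only the values 0.49 and n = 50 come from the abstract.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean of paired changes / sample SD of those changes."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical admission and discharge scores (not study data).
admission = [10, 12, 14, 16]
discharge = [12, 13, 17, 18]
srm = standardized_response_mean(admission, discharge)

# Consistency check on the reported figures: t = SRM * sqrt(n).
t_from_reported = 0.49 * sqrt(50)
print(round(srm, 2), round(t_from_reported, 2))
```

The reported SRM of 0.49 with 50 patients implies a paired t of about 3.46, which matches the value given in the abstract.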
The Structured Assessment of Violence Risk in Adults with Intellectual Disability: A Systematic Review.

PubMed

Hounsome, J; Whittington, R; Brown, A; Greenhill, B; McGuire, J

2018-01-01

While structured professional judgement approaches to assessing and managing the risk of violence have been extensively examined in mental health/forensic settings, the application of the findings to people with an intellectual disability is less extensively researched and reviewed. This review aimed to assess whether risk assessment tools have adequate predictive validity for violence in adults with an intellectual disability. Standard systematic review methodology was used to identify and synthesize appropriate studies. A total of 14 studies were identified as meeting the inclusion criteria. These studies assessed the predictive validity of 18 different risk assessment tools, mainly in forensic settings. All studies concluded that the tools assessed were successful in predicting violence. Studies were generally of a high quality. There is good quality evidence that risk assessment tools are valid for people with intellectual disability who offend but further research is required to validate tools for use with people with intellectual disability who offend. © 2016 John Wiley & Sons Ltd.

The development and testing of a skin tear risk assessment tool.

PubMed

Newall, Nelly; Lewin, Gill F; Bulsara, Max K; Carville, Keryln J; Leslie, Gavin D; Roberts, Pam A

2017-02-01

The aim of the present study is to develop a reliable and valid skin tear risk assessment tool. The six characteristics identified in a previous case control study as constituting the best risk model for skin tear development were used to construct a risk assessment tool. The ability of the tool to predict skin tear development was then tested in a prospective study. Between August 2012 and September 2013, 1466 tertiary hospital patients were assessed at admission and followed up for 10 days to see if they developed a skin tear. The predictive validity of the tool was assessed using receiver operating characteristic (ROC) analysis. When the tool was found not to have performed as well as hoped, secondary analyses were performed to determine whether a potentially better performing risk model could be identified. The tool was found to have high sensitivity but low specificity and therefore have inadequate predictive validity. Secondary analysis of the combined data from this and the previous case control study identified an alternative better performing risk model. The tool developed and tested in this study was found to have inadequate predictive validity. The predictive validity of an alternative, more parsimonious model now needs to be tested. © 2015 Medicalhelplines.com Inc and John Wiley & Sons Ltd.
Assessing youth who sexually offended: the predictive validity of the ERASOR, J-SOAP-II, and YLS/CMI in a non-Western context.

PubMed

Chu, Chi Meng; Ng, Kynaston; Fong, June; Teoh, Jennifer

2012-04-01

Recent research suggested that the predictive validity of adult sexual offender risk assessment measures can be affected when used cross-culturally, but there is no published study on the predictive validity of risk assessment measures for youth who sexually offended in a non-Western context. This study compared the predictive validity of three youth risk assessment measures (i.e., the Estimate of Risk of Adolescent Sexual Offense Recidivism [ERASOR], the Juvenile Sex Offender Assessment Protocol-II [J-SOAP-II], and the Youth Level of Service/Case Management Inventory [YLS/CMI]) for sexual and nonsexual recidivism in a sample of 104 male youth who sexually offended within a Singaporean context (M (follow-up) = 1,637 days; SD (follow-up) = 491). Results showed that the ERASOR overall clinical rating and total score significantly predicted sexual recidivism but only the former significantly predicted time to sexual reoffense. All of the measures (i.e., the ERASOR overall clinical rating and total score, the J-SOAP-II total score, as well as the YLS/CMI) significantly predicted nonsexual recidivism and time to nonsexual reoffense for this sample of youth who sexually offended. Overall, the results suggest that the ERASOR appears to be suited for assessing youth who sexually offended in a non-Western context, but the J-SOAP-II and the YLS/CMI have limited utility for such a purpose.
Development and External Validation of a Melanoma Risk Prediction Model Based on Self-assessed Risk Factors.

PubMed

Vuong, Kylie; Armstrong, Bruce K; Weiderpass, Elisabete; Lund, Eiliv; Adami, Hans-Olov; Veierod, Marit B; Barrett, Jennifer H; Davies, John R; Bishop, D Timothy; Whiteman, David C; Olsen, Catherine M; Hopper, John L; Mann, Graham J; Cust, Anne E; McGeechan, Kevin

2016-08-01

Identifying individuals at high risk of melanoma can optimize primary and secondary prevention strategies. To develop and externally validate a risk prediction model for incident first-primary cutaneous melanoma using self-assessed risk factors. We used unconditional logistic regression to develop a multivariable risk prediction model. Relative risk estimates from the model were combined with Australian melanoma incidence and competing mortality rates to obtain absolute risk estimates. A risk prediction model was developed using the Australian Melanoma Family Study (629 cases and 535 controls) and externally validated using 4 independent population-based studies: the Western Australia Melanoma Study (511 case-control pairs), Leeds Melanoma Case-Control Study (960 cases and 513 controls), Epigene-QSkin Study (44 544, of which 766 with melanoma), and Swedish Women's Lifestyle and Health Cohort Study (49 259 women, of which 273 had melanoma). We validated model performance internally and externally by assessing discrimination using the area under the receiver operating curve (AUC). Additionally, using the Swedish Women's Lifestyle and Health Cohort Study, we assessed model calibration and clinical usefulness. The risk prediction model included hair color, nevus density, first-degree family history of melanoma, previous nonmelanoma skin cancer, and lifetime sunbed use. On internal validation, the AUC was 0.70 (95% CI, 0.67-0.73). On external validation, the AUC was 0.66 (95% CI, 0.63-0.69) in the Western Australia Melanoma Study, 0.67 (95% CI, 0.65-0.70) in the Leeds Melanoma Case-Control Study, 0.64 (95% CI, 0.62-0.66) in the Epigene-QSkin Study, and 0.63 (95% CI, 0.60-0.67) in the Swedish Women's Lifestyle and Health Cohort Study. Model calibration showed close agreement between predicted and observed numbers of incident melanomas across all deciles of predicted risk. 
In the external validation setting, there was higher net benefit when using the risk prediction model to classify individuals as high risk compared with classifying all individuals as high risk. The melanoma risk prediction model performs well and may be useful in prevention interventions reliant on a risk assessment using self-assessed risk factors.
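The net benefit comparison mentioned above follows the standard decision-curve formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the risk threshold. The Python sketch below applies it to a model-based strategy and to the "classify everyone as high risk" strategy; the counts and the threshold are hypothetical and are not taken from the study.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of a classification strategy at a given risk threshold."""
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(n_events, n, threshold):
    """Net benefit when every individual is classified as high risk."""
    return net_benefit(n_events, n - n_events, n, threshold)


# Hypothetical cohort: 1,000 people, 50 of whom develop the outcome.
# At a 10% threshold the model flags 130 people, 30 of them true cases.
model_nb = net_benefit(tp=30, fp=100, n=1000, threshold=0.10)
all_nb = net_benefit_treat_all(n_events=50, n=1000, threshold=0.10)
print(round(model_nb, 4), round(all_nb, 4))
```

In this toy example the model has the higher net benefit, which is the same direction of result the abstract reports for the external validation cohort.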
Validity Evidence for Games as Assessment Environments. CRESST Report 773

ERIC Educational Resources Information Center

Delacruz, Girlie C.; Chung, Gregory K. W. K.; Baker, Eva L.

2010-01-01

This study provides empirical evidence of a highly specific use of games in education--the assessment of the learner. Linear regressions were used to examine the predictive and convergent validity of a math game as assessment of mathematical understanding. Results indicate that prior knowledge significantly predicts game performance. Results also…
Assessing the stability of human locomotion: a review of current measures

PubMed Central

Bruijn, S. M.; Meijer, O. G.; Beek, P. J.; van Dieën, J. H.

2013-01-01

Falling poses a major threat to the steadily growing population of the elderly in modern-day society. A major challenge in the prevention of falls is the identification of individuals who are at risk of falling owing to an unstable gait. At present, several methods are available for estimating gait stability, each with its own advantages and disadvantages. In this paper, we review the currently available measures: the maximum Lyapunov exponent (λS and λL), the maximum Floquet multiplier, variability measures, long-range correlations, extrapolated centre of mass, stabilizing and destabilizing forces, foot placement estimator, gait sensitivity norm and maximum allowable perturbation. We explain what these measures represent and how they are calculated, and we assess their validity, divided up into construct validity, predictive validity in simple models, convergent validity in experimental studies, and predictive validity in observational studies. We conclude that (i) the validity of variability measures and λS is best supported across all levels, (ii) the maximum Floquet multiplier and λL have good construct validity, but negative predictive validity in models, negative convergent validity and (for λL) negative predictive validity in observational studies, (iii) long-range correlations lack construct validity and predictive validity in models and have negative convergent validity, and (iv) measures derived from perturbation experiments have good construct validity, but data are lacking on convergent validity in experimental studies and predictive validity in observational studies. In closing, directions for future research on dynamic gait stability are discussed. PMID:23516062
Predictive validity of the Braden Scale, Norton Scale, and Waterlow Scale in the Czech Republic.

PubMed

Šateková, Lenka; Žiaková, Katarína; Zeleníková, Renáta

2017-02-01

The aim of this study was to determine the predictive validity of the Braden, Norton, and Waterlow scales in 2 long-term care departments in the Czech Republic. Assessing the risk for developing pressure ulcers is the first step in their prevention. At present, many scales are used in clinical practice, but most of them have not been properly validated yet (for example, the Modified Norton Scale in the Czech Republic). In the Czech Republic, only the Braden Scale has been validated so far. This is a prospective comparative instrument testing study. A random sample of 123 patients was recruited. The predictive validity of the pressure ulcer risk assessment scales was evaluated based on sensitivity, specificity, positive and negative predictive values, and the area under the receiver operating characteristic curve. The data were collected from April to August 2014. In the present study, the best predictive validity values were observed for the Norton Scale, followed by the Braden Scale and the Waterlow Scale, in that order. We recommended that the above 3 pressure ulcer risk assessment scales continue to be evaluated in the Czech clinical setting. © 2016 John Wiley & Sons Australia, Ltd.
The Predictive Validity of the Short-Term Assessment of Risk and Treatability (START) for Multiple Adverse Outcomes in a Secure Psychiatric Inpatient Setting.

PubMed

O'Shea, Laura E; Picchioni, Marco M; Dickens, Geoffrey L

2016-04-01

The Short-Term Assessment of Risk and Treatability (START) aims to assist mental health practitioners to estimate an individual's short-term risk for a range of adverse outcomes via structured consideration of their risk ("Vulnerabilities") and protective factors ("Strengths") in 20 areas. It has demonstrated predictive validity for aggression but this is less established for other outcomes. We collated START assessments for N = 200 adults in a secure mental health hospital and ascertained 3-month risk event incidence using the START Outcomes Scale. The specific risk estimates, which are the tool developers' suggested method of overall assessment, predicted aggression, self-harm/suicidality, and victimization, and had incremental validity over the Strength and Vulnerability scales for these outcomes. The Strength scale had incremental validity over the Vulnerability scale for aggressive outcomes; therefore, consideration of protective factors had demonstrable value in their prediction. Further evidence is required to support use of the START for the full range of outcomes it aims to predict. © The Author(s) 2015.
The Predictive Validity of a Computer-Adaptive Assessment of Kindergarten and First-Grade Reading Skills

ERIC Educational Resources Information Center

Clemens, Nathan H.; Hagan-Burke, Shanna; Luo, Wen; Cerda, Carissa; Blakely, Alane; Frosch, Jennifer; Gamez-Patience, Brenda; Jones, Meredith

2015-01-01

This study examined the predictive validity of a computer-adaptive assessment for measuring kindergarten reading skills using the STAR Early Literacy (SEL) test. The findings showed that the results of SEL assessments administered during the fall, winter, and spring of kindergarten were moderate and statistically significant predictors of year-end…
Prediction models for successful external cephalic version: a systematic review.

PubMed

Velzel, Joost; de Hundt, Marcella; Mulder, Frederique M; Molkenboer, Jan F M; Van der Post, Joris A M; Mol, Ben W; Kok, Marjolein

2015-12-01

To provide an overview of existing prediction models for successful ECV, and to assess their quality, development and performance. We searched MEDLINE, EMBASE and the Cochrane Library to identify all articles reporting on prediction models for successful ECV published from inception to January 2015. We extracted information on study design, sample size, model-building strategies and validation. We evaluated the phases of model development and summarized their performance in terms of discrimination, calibration and clinical usefulness. We collected different predictor variables together with their defined significance, in order to identify important predictor variables for successful ECV. We identified eight articles reporting on seven prediction models. All models were subjected to internal validation. Only one model was also validated in an external cohort. Two prediction models had a low overall risk of bias, of which only one showed promising predictive performance at internal validation. This model also completed the phase of external validation. The impact on clinical practice was not evaluated for any of the models. The most important predictor variables for successful ECV described in the selected articles were parity, placental location, breech engagement and the fetal head being palpable. One model was assessed for discrimination and calibration in both internal (AUC 0.71) and external validation (AUC 0.64), while two other models were assessed with discrimination and calibration, respectively. We found one prediction model for breech presentation that was validated in an external cohort and had acceptable predictive performance. This model should be used to counsel women considering ECV. Copyright © 2015. Published by Elsevier Ireland Ltd.
The Predictive Validity of the Assessment of Basic Learning Abilities versus Parents' Predictions with Children with Autism

ERIC Educational Resources Information Center

Murphy, Colleen; Martin, Garry L.; Yu, C. T.

2014-01-01

The Assessment of Basic Learning Abilities (ABLA) is an empirically validated clinical tool for assessing the learning ability of persons with intellectual disabilities and children with autism. An ABLA tester uses standardized prompting and reinforcement procedures to attempt to teach, individually, each of six tasks, called levels, to a testee,…
The Predictive Validity of Dynamic Assessment: A Review

ERIC Educational Resources Information Center

Caffrey, Erin; Fuchs, Douglas; Fuchs, Lynn S.

2008-01-01

The authors report on a mixed-methods review of 24 studies that explores the predictive validity of dynamic assessment (DA). For 15 of the studies, they conducted quantitative analyses using Pearson's correlation coefficients. They descriptively examined the remaining studies to determine if their results were consistent with findings from the…
Can species distribution models really predict the expansion of invasive species?

PubMed

Barbet-Massin, Morgane; Rome, Quentin; Villemant, Claire; Courchamp, Franck

2018-01-01

Predictive studies are of paramount importance for biological invasions, one of the biggest threats to biodiversity. To help better prioritize management strategies, species distribution models (SDMs) are often used to predict the potential invasive range of introduced species. Yet SDMs have been regularly criticized because of several strong limitations, such as violating the equilibrium assumption during the invasion process. Unfortunately, validation studies with independent data are too scarce to assess the predictive accuracy of SDMs in invasion biology. Biological invasions nonetheless make it possible to test the usefulness of SDMs by retrospectively assessing whether they would have accurately predicted the latest ranges of invasion. Here, we assess the predictive accuracy of SDMs in predicting the expansion of invasive species. We used temporal occurrence data for the Asian hornet Vespa velutina nigrithorax, a species native to China that is invading Europe at a very fast rate. Specifically, we compared occurrence data from the last stage of invasion (independent validation points) to the climate suitability distribution predicted from models calibrated with data from the early stage of invasion. Despite the invasive species not yet being at equilibrium, the predicted climate suitability of validation points was high. SDMs can thus adequately predict the spread of V. v. nigrithorax, which appears to be at least partially climatically driven. In the case of V. v. nigrithorax, the predictive accuracy of SDMs was slightly but significantly better when models were calibrated with invasive data only, excluding native data. Although more validation studies for other invasion cases are needed to generalize our results, our findings are an important step towards validating the use of SDMs in invasion biology.
Can species distribution models really predict the expansion of invasive species?

PubMed Central

Barbet-Massin, Morgane; Rome, Quentin; Villemant, Claire; Courchamp, Franck

2018-01-01

Predictive studies are of paramount importance for biological invasions, one of the biggest threats to biodiversity. To help better prioritize management strategies, species distribution models (SDMs) are often used to predict the potential invasive range of introduced species. Yet SDMs have been regularly criticized because of several strong limitations, such as violating the equilibrium assumption during the invasion process. Unfortunately, validation studies with independent data are too scarce to assess the predictive accuracy of SDMs in invasion biology. Biological invasions nonetheless make it possible to test the usefulness of SDMs by retrospectively assessing whether they would have accurately predicted the latest ranges of invasion. Here, we assess the predictive accuracy of SDMs in predicting the expansion of invasive species. We used temporal occurrence data for the Asian hornet Vespa velutina nigrithorax, a species native to China that is invading Europe at a very fast rate. Specifically, we compared occurrence data from the last stage of invasion (independent validation points) to the climate suitability distribution predicted from models calibrated with data from the early stage of invasion. Despite the invasive species not yet being at equilibrium, the predicted climate suitability of validation points was high. SDMs can thus adequately predict the spread of V. v. nigrithorax, which appears to be at least partially climatically driven. In the case of V. v. nigrithorax, the predictive accuracy of SDMs was slightly but significantly better when models were calibrated with invasive data only, excluding native data. Although more validation studies for other invasion cases are needed to generalize our results, our findings are an important step towards validating the use of SDMs in invasion biology. PMID:29509789
The Construct and Predictive Validity of a Dynamic Assessment of Young Children Learning to Read: Implications for RTI Frameworks

ERIC Educational Resources Information Center

Fuchs, Douglas; Compton, Donald L.; Fuchs, Lynn S.; Bouton, Bobette; Caffrey, Erin

2011-01-01

The purpose of this study was to examine the construct and predictive validity of a dynamic assessment (DA) of decoding learning. Students (N = 318) were assessed in the fall of first grade on an array of instruments that were given in hopes of forecasting responsiveness to reading instruction. These instruments included DA as well as…
Validity and validation of expert (Q)SAR systems.

PubMed

Hulzebos, E; Sijm, D; Traas, T; Posthumus, R; Maslankiewicz, L

2005-08-01

At a recent workshop in Setubal (Portugal) principles were drafted to assess the suitability of (quantitative) structure-activity relationships ((Q)SARs) for assessing the hazards and risks of chemicals. In the present study we applied some of the Setubal principles to test the validity of three (Q)SAR expert systems and validate the results. These principles include a mechanistic basis, the availability of a training set and validation. ECOSAR, BIOWIN and DEREK for Windows have a mechanistic or empirical basis. ECOSAR has a training set for each QSAR. For half of the structural fragments the number of chemicals in the training set is >4. Based on structural fragments and log Kow, ECOSAR uses linear regression to predict ecotoxicity. Validating ECOSAR for three 'valid' classes results in predictivity of ≥ 64%. BIOWIN uses (non-)linear regressions to predict the probability of biodegradability based on fragments and molecular weight. It has a large training set and predicts non-ready biodegradability well. DEREK for Windows predictions are supported by a mechanistic rationale and literature references. The structural alerts in this program have been developed with a training set of positive and negative toxicity data. However, to support the prediction only a limited number of chemicals in the training set is presented to the user. DEREK for Windows predicts effects by 'if-then' reasoning. The program predicts best for mutagenicity and carcinogenicity. Each structural fragment in ECOSAR and DEREK for Windows needs to be evaluated and validated separately.
Using Dynamic Risk and Protective Factors to Predict Inpatient Aggression: Reliability and Validity of START Assessments

PubMed Central

Desmarais, Sarah L.; Nicholls, Tonia L.; Wilson, Catherine M.; Brink, Johann

2012-01-01

The Short-Term Assessment of Risk and Treatability (START) is a relatively new structured professional judgment guide for the assessment and management of short-term risks associated with mental, substance use, and personality disorders. The scheme may be distinguished from other violence risk instruments because of its inclusion of 20 dynamic factors that are rated in terms of both vulnerability and strength. This study examined the reliability and validity of START assessments in predicting inpatient aggression. Research assistants completed START assessments for 120 male forensic psychiatric patients through review of hospital files. They additionally completed Historical-Clinical-Risk Management – 20 (HCR-20) and the Hare Psychopathy Checklist: Screening Version (PCL:SV) assessments. Outcome data was coded from hospital files for a 12-month follow-up period using the Overt Aggression Scale (OAS). START assessments evidenced excellent interrater reliability and demonstrated both predictive and incremental validity over the HCR-20 Historical subscale scores and PCL:SV total scores. Overall, results support the reliability and validity of START assessments, and use of the structured professional judgment approach more broadly, as well as the value of using dynamic risk and protective factors to assess violence risk. PMID:22250595
Field Validity of the Psychopathy Checklist--Revised in Sex Offender Risk Assessment

ERIC Educational Resources Information Center

Murrie, Daniel C.; Boccaccini, Marcus T.; Caperton, Jennifer; Rufino, Katrina

2012-01-01

Several studies have concluded that scores from Hare's (2003) Psychopathy Checklist--Revised (PCL-R) predict reoffense among sexual offenders, but most of those studies examined the predictive validity of scores from trained research staff, not clinicians in the field scoring the measure as part of actual forensic assessments. Therefore, we…
The reliability, validity, sensitivity, specificity and predictive values of the Chinese version of the Rowland Universal Dementia Assessment Scale.

PubMed

Chen, Chia-Wei; Chu, Hsin; Tsai, Chia-Fen; Yang, Hui-Ling; Tsai, Jui-Chen; Chung, Min-Huey; Liao, Yuan-Mei; Chi, Mei-Ju; Chou, Kuei-Ru

2015-11-01

The purpose of this study was to translate the Rowland Universal Dementia Assessment Scale into Chinese and to evaluate the psychometric properties (reliability and validity) and the diagnostic properties (sensitivity, specificity and predictive values) of the Chinese version of the Rowland Universal Dementia Assessment Scale. The accurate detection of early dementia requires screening tools with favourable cross-cultural and linguistic properties and appropriate sensitivity, specificity, and predictive values, particularly for Chinese-speaking populations. This was a cross-sectional, descriptive study. Overall, 130 participants suspected to have cognitive impairment were enrolled in the study. A test-retest for determining reliability was scheduled four weeks after the initial test. Content validity was determined by five experts, whereas construct validity was established by using the contrasted groups technique. The participants' clinical diagnoses were used as the standard in calculating the sensitivity, specificity, positive predictive value and negative predictive value. The study revealed that the Chinese version of the Rowland Universal Dementia Assessment Scale exhibited a test-retest reliability of 0.90, an internal consistency reliability of 0.71, an inter-rater reliability (kappa value) of 0.88 and a content validity index of 0.97. The patient group and the healthy contrast group differed significantly in cognitive ability. The optimal cut-off points for the Chinese version of the Rowland Universal Dementia Assessment Scale in the test for mild cognitive impairment and dementia were 24 and 22, respectively; moreover, for these two conditions, the sensitivities of the scale were 0.79 and 0.76, the specificities were 0.91 and 0.81, the areas under the curve were 0.85 and 0.78, the positive predictive values were 0.99 and 0.83 and the negative predictive values were 0.96 and 0.91, respectively. The Chinese version of the Rowland Universal Dementia Assessment Scale exhibited sound reliability, validity, sensitivity, specificity and predictive values. This scale can help clinical staff members to quickly and accurately diagnose cognitive impairment and provide appropriate treatment as early as possible. © 2015 John Wiley & Sons Ltd.
Assessing the validity of sales self-efficacy: a cautionary tale.

PubMed

Gupta, Nina; Ganster, Daniel C; Kepes, Sven

2013-07-01

We developed a focused, context-specific measure of sales self-efficacy and assessed its incremental validity against the broad Big 5 personality traits with department store salespersons, using (a) both a concurrent and a predictive design and (b) both objective sales measures and supervisory ratings of performance. We found that in the concurrent study, sales self-efficacy predicted objective and subjective measures of job performance more than did the Big 5 measures. Significant differences between the predictability of subjective and objective measures of performance were not observed. Predictive validity coefficients were generally lower than concurrent validity coefficients. The results suggest that there are different dynamics operating in concurrent and predictive designs and between broad and contextualized measures; they highlight the importance of distinguishing between these designs and measures in meta-analyses. The results also point to the value of focused, context-specific personality predictors in selection research. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Initial Validation of a Comprehensive Assessment Instrument for Bereavement-Related Grief Symptoms and Risk of Complications: The Indicator of Bereavement Adaptation—Cruse Scotland (IBACS)

PubMed Central

Schut, Henk; Stroebe, Margaret S.; Wilson, Stewart; Birrell, John

2016-01-01

Objective: This study assessed the validity of the Indicator of Bereavement Adaptation Cruse Scotland (IBACS). Designed for use in clinical and non-clinical settings, the IBACS measures severity of grief symptoms and risk of developing complications. Method: N = 196 (44 male, 152 female) help-seeking, bereaved Scottish adults participated at two timepoints: T1 (baseline) and T2 (after 18 months). Four validated assessment instruments were administered: CORE-R, ICG-R, IES-R, SCL-90-R. Discriminative ability was assessed using ROC curve analysis. Concurrent validity was tested through correlation analysis at T1. Predictive validity was assessed using correlation analyses and ROC curve analysis. Optimal IBACS cutoff values were obtained by calculating a maximal Youden index J in ROC curve analysis. Clinical implications were compared across instruments. Results: ROC curve analysis results (AUC = .84, p < .01, 95% CI between .77 and .90) indicated the IBACS is a good diagnostic instrument for assessing complicated grief. Positive correlations (p < .01, 2-tailed) with all four instruments at T1 demonstrated the IBACS' concurrent validity, strongest with complicated grief measures (r = .82). Predictive validity was shown to be fair in T2 ROC curve analysis results (n = 67, AUC = .78, 95% CI between .65 and .92; p < .01). Predictive validity was also supported by stable positive correlations between IBACS and other instruments at T2. Clinical indications were found not to differ across instruments. Conclusions: The IBACS offers effective grief symptom and risk assessment for use by non-clinicians. Indications are sufficient to support intake assessment for a stepped model of bereavement intervention. PMID:27741246
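The record above selects cutoff values by maximizing the Youden index J in ROC curve analysis. The following is a minimal sketch of that general idea, assuming that a score at or above the cutoff is classified as positive and that labels are coded 1 (condition present) and 0 (absent). The function name `youden_optimal_cutoff` and the data are hypothetical and are not the authors' procedure or results.

```python
def youden_optimal_cutoff(scores, labels):
    """Return (cutoff, J) for the cutoff that maximizes
    J = sensitivity + specificity - 1.
    Scores >= cutoff are classified as positive.
    On ties in J, the lowest cutoff is kept."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical instrument scores and outcomes, for illustration only.
scores = [10, 20, 30, 40, 50, 60, 70, 80]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_optimal_cutoff(scores, labels)
print(cutoff, j)
```

Because J weights sensitivity and specificity equally, a cutoff chosen this way may not suit settings where missed cases and false alarms carry different costs.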

Validity and reliability of a self-report instrument to assess social support and physical environmental correlates of physical activity in adolescents

PubMed Central

2012-01-01

Background: The purpose of this study was to examine the internal consistency, test-retest reliability, construct validity and predictive validity of a new German self-report instrument to assess the influence of social support and the physical environment on physical activity in adolescents. Methods: Based on theoretical considerations, the short scales on social support and physical environment were developed and cross-validated in two independent study samples of 9- to 17-year-old girls and boys. The longitudinal sample of Study I (n = 196) was recruited from a German comprehensive school, and subjects in this study completed the questionnaire twice with a between-test interval of seven days. Cronbach’s alphas were computed to determine the internal consistency of the factors. Test-retest reliability of the latent factors was assessed using intra-class correlation coefficients. Factorial validity of the scales was assessed using principal components analysis. Construct validity was determined using a cross-validation technique by performing confirmatory factor analysis with the independent nationwide cross-sectional sample of Study II (n = 430). Correlations between factors and three measures of physical activity (objectively measured moderate-to-vigorous physical activity (MVPA), self-reported habitual MVPA and self-reported recent MVPA) were calculated to determine the predictive validity of the instrument. Results: Construct validity of the social support scale (two factors: parental support and peer support) and the physical environment scale (four factors: convenience, public recreation facilities, safety and private sport providers) was shown. Both scales had moderate test-retest reliability. The factors of the social support scale also had good internal consistency and predictive validity. Internal consistency and predictive validity of the physical environment scale were low to acceptable. Conclusions: The results of this study indicate moderate to good reliability and construct validity of the social support scale and physical environment scale. Predictive validity was confirmed for the social support scale but not for the physical environment scale. Hence, it remains unclear whether a person’s physical environment has a direct or an indirect effect on physical activity behavior or a moderation function. PMID:22928865
Validity and reliability of a self-report instrument to assess social support and physical environmental correlates of physical activity in adolescents.

PubMed

Reimers, Anne K; Jekauc, Darko; Mess, Filip; Mewes, Nadine; Woll, Alexander

2012-08-29

The purpose of this study was to examine the internal consistency, test-retest reliability, construct validity and predictive validity of a new German self-report instrument to assess the influence of social support and the physical environment on physical activity in adolescents. Based on theoretical considerations, the short scales on social support and physical environment were developed and cross-validated in two independent study samples of 9- to 17-year-old girls and boys. The longitudinal sample of Study I (n = 196) was recruited from a German comprehensive school, and subjects in this study completed the questionnaire twice with a between-test interval of seven days. Cronbach's alphas were computed to determine the internal consistency of the factors. Test-retest reliability of the latent factors was assessed using intra-class correlation coefficients. Factorial validity of the scales was assessed using principal components analysis. Construct validity was determined using a cross-validation technique by performing confirmatory factor analysis with the independent nationwide cross-sectional sample of Study II (n = 430). Correlations between factors and three measures of physical activity (objectively measured moderate-to-vigorous physical activity (MVPA), self-reported habitual MVPA and self-reported recent MVPA) were calculated to determine the predictive validity of the instrument. Construct validity of the social support scale (two factors: parental support and peer support) and the physical environment scale (four factors: convenience, public recreation facilities, safety and private sport providers) was shown. Both scales had moderate test-retest reliability. The factors of the social support scale also had good internal consistency and predictive validity. Internal consistency and predictive validity of the physical environment scale were low to acceptable. The results of this study indicate moderate to good reliability and construct validity of the social support scale and physical environment scale. Predictive validity was confirmed for the social support scale but not for the physical environment scale. Hence, it remains unclear whether a person's physical environment has a direct or an indirect effect on physical activity behavior or a moderation function.
Individualized prediction of perineural invasion in colorectal cancer: development and validation of a radiomics prediction model.

PubMed

Huang, Yanqi; He, Lan; Dong, Di; Yang, Caiyun; Liang, Cuishan; Chen, Xin; Ma, Zelan; Huang, Xiaomei; Yao, Su; Liang, Changhong; Tian, Jie; Liu, Zaiyi

2018-02-01

To develop and validate a radiomics prediction model for individualized prediction of perineural invasion (PNI) in colorectal cancer (CRC). After computed tomography (CT) radiomics feature extraction, a radiomics signature was constructed in a derivation cohort (346 CRC patients). A prediction model was developed to integrate the radiomics signature and clinical candidate predictors [age, sex, tumor location, and carcinoembryonic antigen (CEA) level]. Apparent prediction performance was assessed. After internal validation, independent temporal validation (separate from the cohort used to build the model) was then conducted in 217 CRC patients. The final model was converted to an easy-to-use nomogram. The developed radiomics nomogram that integrated the radiomics signature and CEA level showed good calibration and discrimination performance [Harrell's concordance index (c-index): 0.817; 95% confidence interval (95% CI): 0.811-0.823]. Application of the nomogram in the validation cohort gave comparable calibration and discrimination (c-index: 0.803; 95% CI: 0.794-0.812). Integrating the radiomics signature and CEA level into a radiomics prediction model enables easy and effective risk assessment of PNI in CRC. This stratification of patients according to their PNI status may provide a basis for individualized auxiliary treatment.
The Predictive Validity of the Tilburg Frailty Indicator: Disability, Health Care Utilization, and Quality of Life in a Population at Risk

ERIC Educational Resources Information Center

Gobbens, Robbert J. J.; van Assen, Marcel A. L. M.; Luijkx, Katrien G.; Schols, Jos M. G. A.

2012-01-01

Purpose: To assess the predictive validity of frailty and its domains (physical, psychological, and social), as measured by the Tilburg Frailty Indicator (TFI), for the adverse outcomes disability, health care utilization, and quality of life. Design and Methods: The predictive validity of the TFI was tested in a representative sample of 484…
Improving the Validity of Activity of Daily Living Dependency Risk Assessment

PubMed Central

Clark, Daniel O.; Stump, Timothy E.; Tu, Wanzhu; Miller, Douglas K.

2015-01-01

Objectives: Efforts to prevent activity of daily living (ADL) dependency may be improved through models that assess older adults’ dependency risk. We evaluated whether cognition and gait speed measures improve the predictive validity of interview-based models. Method: Participants were 8,095 self-respondents in the 2006 Health and Retirement Survey who were aged 65 years or over and independent in five ADLs. Incident ADL dependency was determined from the 2008 interview. Models were developed using random 2/3rd cohorts and validated in the remaining 1/3rd. Results: Compared to a c-statistic of 0.79 in the best interview model, the model including cognitive measures had c-statistics of 0.82 and 0.80 while the best fitting gait speed model had c-statistics of 0.83 and 0.79 in the development and validation cohorts, respectively. Conclusion: Two relatively brief models, one that requires an in-person assessment and one that does not, had excellent validity for predicting incident ADL dependency but did not significantly improve the predictive validity of the best fitting interview-based models. PMID:24652867
Development and validation of an automated delirium risk assessment system (Auto-DelRAS) implemented in the electronic health record system.

PubMed

Moon, Kyoung-Ja; Jin, Yinji; Jin, Taixian; Lee, Sun-Mi

2018-01-01

A key component of delirium management is prevention and early detection. This study aimed to develop an automated delirium risk assessment system (Auto-DelRAS) that automatically alerts health care providers of an intensive care unit (ICU) patient's delirium risk based only on data collected in an electronic health record (EHR) system, and to evaluate the clinical validity of this system. Cohort and system development designs were used. The setting was medical and surgical ICUs in two university hospitals in Seoul, Korea. A total of 3284 patients were included for the development of Auto-DelRAS, 325 for external validation, and 694 for validation after clinical application. The 4211 data items were extracted from the EHR system and delirium was measured using CAM-ICU (Confusion Assessment Method for Intensive Care Unit). The potential predictors were selected and a logistic regression model was established to create a delirium risk scoring algorithm to construct the Auto-DelRAS. The Auto-DelRAS was evaluated at three months and one year after its application to clinical practice to establish the predictive validity of the system. Eleven predictors were finally included in the logistic regression model. The results of the Auto-DelRAS risk assessment were shown as high/moderate/low risk on a Kardex screen. The predictive validity, analyzed one year after the clinical application of Auto-DelRAS, showed a sensitivity of 0.88, specificity of 0.72, positive predictive value of 0.53, negative predictive value of 0.94, and a Youden index of 0.59. A relatively high level of predictive validity was maintained with the Auto-DelRAS system, even one year after it was applied to clinical practice. Copyright © 2017. Published by Elsevier Ltd.
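The Youden index quoted in this abstract is defined as sensitivity + specificity - 1. The short sketch below recomputes it from the rounded sensitivity and specificity given above; the function name is hypothetical, and the small gap between the recomputed 0.60 and the reported 0.59 is presumably due to rounding of the published inputs (an assumption, since the unrounded values are not given).

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Rounded values as reported in the abstract.
j = youden_index(sensitivity=0.88, specificity=0.72)
print(f"{j:.2f}")
```

J ranges from 0 (no better than chance) to 1 (perfect classification), so it summarizes both error types in a single number without depending on prevalence.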
Predictive validity and correlates of self-assessed resilience among U.S. Army soldiers.

PubMed

Campbell-Sills, Laura; Kessler, Ronald C; Ursano, Robert J; Sun, Xiaoying; Taylor, Charles T; Heeringa, Steven G; Nock, Matthew K; Sampson, Nancy A; Jain, Sonia; Stein, Murray B

2018-02-01

Self-assessment of resilience could prove valuable to military and other organizations whose personnel confront foreseen stressors. We evaluated the validity of self-assessed resilience among U.S. Army soldiers, including whether predeployment perceived resilience predicted postdeployment emotional disorder. Resilience was assessed via self-administered questionnaire among new soldiers reporting for basic training (N = 35,807) and experienced soldiers preparing to deploy to Afghanistan (N = 8,558). Concurrent validity of self-assessed resilience was evaluated among recruits by estimating its association with past-month emotional disorder. Predictive validity was examined among 3,526 experienced soldiers with no lifetime emotional disorder predeployment. Predictive models estimated associations of predeployment resilience with incidence of emotional disorder through 9 months postdeployment and with marked improvement in coping at 3 months postdeployment. Weights-adjusted regression models incorporated stringent controls for risk factors. Soldiers characterized themselves as very resilient on average [M = 14.34, SD = 4.20 (recruits); M = 14.75, SD = 4.31 (experienced soldiers); theoretical range = 0-20]. Demographic characteristics exhibited only modest associations with resilience, while severity of childhood maltreatment was negatively associated with resilience in both samples. Among recruits, resilience was inversely associated with past-month emotional disorder [adjusted odds ratio (AOR) = 0.65, 95% CI = 0.62-0.68, P < .0005 (per standard score increase)]. Among deployed soldiers, greater predeployment resilience was associated with decreased incidence of emotional disorder (AOR = 0.91; 95% CI = 0.84-0.98; P = .016) and increased odds of improved coping (AOR = 1.36; 95% CI = 1.24-1.49; P < .0005) postdeployment. Findings supported validity of self-assessed resilience among soldiers, although its predictive effect on incidence of emotional disorder was modest. In conjunction with assessment of known risk factors, measurement of resilience could help predict adaptation to foreseen stressors like deployment. © 2017 Wiley Periodicals, Inc.
Different type 2 diabetes risk assessments predict dissimilar numbers at ‘high risk’: a retrospective analysis of diabetes risk-assessment tools

PubMed Central

Gray, Benjamin J; Bracken, Richard M; Turner, Daniel; Morgan, Kerry; Thomas, Michael; Williams, Sally P; Williams, Meurig; Rice, Sam; Stephens, Jeffrey W

2015-01-01

Background: Use of a validated risk-assessment tool to identify individuals at high risk of developing type 2 diabetes is currently recommended. It is under-reported, however, whether a different risk tool alters the predicted risk of an individual. Aim: This study explored any differences between commonly used validated risk-assessment tools for type 2 diabetes. Design and setting: Cross-sectional analysis of individuals who participated in a workplace-based risk assessment in Carmarthenshire, South Wales. Method: Retrospective analysis of 676 individuals (389 females and 287 males) who participated in a workplace-based diabetes risk-assessment initiative. Ten-year risk of type 2 diabetes was predicted using the validated QDiabetes®, Leicester Risk Assessment (LRA), FINDRISC, and Cambridge Risk Score (CRS) algorithms. Results: Differences between the risk-assessment tools were apparent following retrospective analysis of individuals. CRS categorised the highest proportion (13.6%) of individuals at ‘high risk’ followed by FINDRISC (6.6%), QDiabetes (6.1%), and, finally, the LRA was the most conservative risk tool (3.1%). Following further analysis by sex, over one-quarter of males were categorised at high risk using CRS (25.4%), whereas a greater percentage of females were categorised as high risk using FINDRISC (7.8%). Conclusion: The adoption of a different valid risk-assessment tool can alter the predicted risk of an individual and caution should be used to identify those individuals who really are at high risk of type 2 diabetes. PMID:26541180
Different type 2 diabetes risk assessments predict dissimilar numbers at 'high risk': a retrospective analysis of diabetes risk-assessment tools.

PubMed

Gray, Benjamin J; Bracken, Richard M; Turner, Daniel; Morgan, Kerry; Thomas, Michael; Williams, Sally P; Williams, Meurig; Rice, Sam; Stephens, Jeffrey W

2015-12-01

Use of a validated risk-assessment tool to identify individuals at high risk of developing type 2 diabetes is currently recommended. It is under-reported, however, whether a different risk tool alters the predicted risk of an individual. This study explored any differences between commonly used validated risk-assessment tools for type 2 diabetes. Cross-sectional analysis of individuals who participated in a workplace-based risk assessment in Carmarthenshire, South Wales. Retrospective analysis of 676 individuals (389 females and 287 males) who participated in a workplace-based diabetes risk-assessment initiative. Ten-year risk of type 2 diabetes was predicted using the validated QDiabetes®, Leicester Risk Assessment (LRA), FINDRISC, and Cambridge Risk Score (CRS) algorithms. Differences between the risk-assessment tools were apparent following retrospective analysis of individuals. CRS categorised the highest proportion (13.6%) of individuals at 'high risk' followed by FINDRISC (6.6%), QDiabetes (6.1%), and, finally, the LRA was the most conservative risk tool (3.1%). Following further analysis by sex, over one-quarter of males were categorised at high risk using CRS (25.4%), whereas a greater percentage of females were categorised as high risk using FINDRISC (7.8%). The adoption of a different valid risk-assessment tool can alter the predicted risk of an individual and caution should be used to identify those individuals who really are at high risk of type 2 diabetes. © British Journal of General Practice 2015.
The Johns Hopkins Fall Risk Assessment Tool: A Study of Reliability and Validity.

PubMed

Poe, Stephanie S; Dawson, Patricia B; Cvach, Maria; Burnett, Margaret; Kumble, Sowmya; Lewis, Maureen; Thompson, Carol B; Hill, Elizabeth E

Patient falls and fall-related injury remain a safety concern. The Johns Hopkins Fall Risk Assessment Tool (JHFRAT) was developed to facilitate early detection of risk for anticipated physiologic falls in adult inpatients. Psychometric properties in acute care settings have not yet been fully established; this study sought to fill that gap. Results indicate that the JHFRAT is reliable, with high sensitivity and negative predictive validity. Specificity and positive predictive validity were lower than expected.
Development and evaluation of an automated fall risk assessment system.

PubMed

Lee, Ju Young; Jin, Yinji; Piao, Jinshi; Lee, Sun-Mi

2016-04-01

Fall risk assessment is the first step toward prevention, and a risk assessment tool with high validity should be used. This study aimed to develop and validate an automated fall risk assessment system (Auto-FallRAS) to assess fall risks based on electronic medical records (EMRs) without additional data collected or entered by nurses. This study was conducted in a 1335-bed university hospital in Seoul, South Korea. The Auto-FallRAS was developed using 4211 fall-related clinical data extracted from EMRs. Participants included fall patients and non-fall patients (868 and 3472 for the development study; 752 and 3008 for the validation study; and 58 and 232 for validation after clinical application, respectively). The system was evaluated for predictive validity and concurrent validity. The final 10 predictors were included in the logistic regression model for the risk-scoring algorithm. The results of the Auto-FallRAS were shown as high/moderate/low risk on the EMR screen. The predictive validity analyzed after clinical application of the Auto-FallRAS was as follows: sensitivity = 0.95, NPV = 0.97 and Youden index = 0.44. The validity of the Morse Fall Scale assessed by nurses was as follows: sensitivity = 0.68, NPV = 0.88 and Youden index = 0.28. This study found that the Auto-FallRAS results were better than were the nurses' predictions. The advantage of the Auto-FallRAS is that it automatically analyzes information and shows patients' fall risk assessment results without requiring additional time from nurses. © The Author 2016. Published by Oxford University Press in association with the International Society for Quality in Health Care; all rights reserved.
The stroke impairment assessment set: its internal consistency and predictive validity.

PubMed

Tsuji, T; Liu, M; Sonoda, S; Domen, K; Chino, N

2000-07-01

To study the scale quality and predictive validity of the Stroke Impairment Assessment Set (SIAS) developed for stroke outcome research. Rasch analysis of the SIAS; stepwise multiple regression analysis to predict discharge functional independence measure (FIM) raw scores from demographic data, the SIAS scores, and the admission FIM scores; cross-validation of the prediction rule. Tertiary rehabilitation center in Japan. One hundred ninety stroke inpatients for the study of the scale quality and the predictive validity; a second sample of 116 stroke inpatients for the cross-validation study. Mean square fit statistics to study the degree of fit to the unidimensional model; logits to express item difficulties; discharge FIM scores for the study of predictive validity. The degree of misfit was acceptable except for the shoulder range of motion (ROM), pain, visuospatial function, and speech items; and the SIAS items could be arranged on a common unidimensional scale. The difficulty patterns were identical at admission and at discharge except for the deep tendon reflexes, ROM, and pain items. They were also similar for the right- and left-sided brain lesion groups except for the speech and visuospatial items. For the prediction of the discharge FIM scores, the independent variables selected were age, the SIAS total scores, and the admission FIM scores; and the adjusted R² was .64 (p < .0001). Stability of the predictive equation was confirmed in the cross-validation sample (R² = .68, p < .001). The unidimensionality of the SIAS was confirmed, and the SIAS total scores proved useful for stroke outcome prediction.
Multilevel Assessment of the Predictive Validity of Teacher Made Tests in the Zimbabwean Primary Education Sector

ERIC Educational Resources Information Center

Machingambi, Zadzisai

2017-01-01

The principal focus of this study was to undertake a multilevel assessment of the predictive validity of teacher made tests in the Zimbabwean primary education sector. A correlational research design was adopted for the study, mainly to allow for statistical treatment of data and subsequent classical hypothesis testing using Spearman's rho.…
Predicting implementation from organizational readiness for change: a study protocol

PubMed Central

2011-01-01

Background: There is widespread interest in measuring organizational readiness to implement evidence-based practices in clinical care. However, there are a number of challenges to validating organizational measures, including inferential bias arising from the halo effect and method bias - two threats to validity that, while well-documented by organizational scholars, are often ignored in health services research. We describe a protocol to comprehensively assess the psychometric properties of a previously developed survey, the Organizational Readiness to Change Assessment. Objectives: Our objective is to conduct a comprehensive assessment of the psychometric properties of the Organizational Readiness to Change Assessment incorporating methods specifically to address threats from halo effect and method bias. Methods and Design: We will conduct three sets of analyses using longitudinal, secondary data from four partner projects, each testing interventions to improve the implementation of an evidence-based clinical practice. Partner projects field the Organizational Readiness to Change Assessment at baseline (n = 208 respondents; 53 facilities), and prospectively assess the degree to which the evidence-based practice is implemented. We will assess predictive and concurrent validity using hierarchical linear modeling and multivariate regression, respectively. For predictive validity, the outcome is the change from baseline to follow-up in the use of the evidence-based practice. We will use intra-class correlations derived from hierarchical linear models to assess inter-rater reliability. Two partner projects will also field measures of job satisfaction for convergent and discriminant validity analyses, and will field Organizational Readiness to Change Assessment measures at follow-up for concurrent validity (n = 158 respondents; 33 facilities). Convergent and discriminant validities will test associations between organizational readiness and different aspects of job satisfaction: satisfaction with leadership, which should be highly correlated with readiness, versus satisfaction with salary, which should be less correlated with readiness. Content validity will be assessed using an expert panel and modified Delphi technique. Discussion: We propose a comprehensive protocol for validating a survey instrument for assessing organizational readiness to change that specifically addresses key threats of bias related to halo effect, method bias and questions of construct validity that often go unexplored in research using measures of organizational constructs. PMID:21777479
Assessing the reliability, predictive and construct validity of historical, clinical and risk management-20 (HCR-20) in Mexican psychiatric inpatients.

PubMed

Sada, Andrea; Robles-García, Rebeca; Martínez-López, Nicolás; Hernández-Ramírez, Rafael; Tovilla-Zarate, Carlos-Alfonso; López-Munguía, Fernando; Suárez-Alvarez, Enrique; Ayala, Xochitl; Fresán, Ana

2016-08-01

Assessing dangerousness to gauge the likelihood of future violent behaviour has become an integral part of clinical mental health practice in forensic and non-forensic psychiatric settings, one of the most effective instruments for this being the Historical, Clinical and Risk Management-20 (HCR-20). To examine the HCR-20 factor structure in Mexican psychiatric inpatients and to obtain its predictive validity and reliability for use in this population. In total, 225 patients diagnosed with psychotic, affective or personality disorders were included. The HCR-20 was applied at hospital admission and violent behaviours were assessed during psychiatric hospitalization using the Overt Aggression Scale (OAS). Construct validity, predictive validity and internal consistency were determined. Violent behaviour remains more severe in patients classified in the high-risk group during hospitalization. Fifteen items displayed adequate communalities in the original designated domains of the HCR-20 and internal consistency of the instruments was high. The HCR-20 is a suitable instrument for predicting violence risk in Mexican psychiatric inpatients.
Cross-validation pitfalls when selecting and assessing regression and classification models.

PubMed

Krstajic, Damjan; Buturovic, Ljubomir J; Leahy, David E; Thomas, Simon

2014-03-29

We address the problem of selecting and assessing classification and regression models using cross-validation. Current state-of-the-art methods can yield models with high variance, rendering them unsuitable for a number of practical applications including QSAR. In this paper we describe and evaluate best practices which improve reliability and increase confidence in selected models. A key operational component of the proposed methods is cloud computing which enables routine use of previously infeasible approaches. We describe in detail an algorithm for repeated grid-search V-fold cross-validation for parameter tuning in classification and regression, and we define a repeated nested cross-validation algorithm for model assessment. As regards variable selection and parameter tuning we define two algorithms (repeated grid-search cross-validation and double cross-validation), and provide arguments for using the repeated grid-search in the general case. We show results of our algorithms on seven QSAR datasets. The variation of the prediction performance, which is the result of choosing different splits of the dataset in V-fold cross-validation, needs to be taken into account when selecting and assessing classification and regression models. We demonstrate the importance of repeating cross-validation when selecting an optimal model, as well as the importance of repeating nested cross-validation when assessing a prediction error.
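The abstract's point that the choice of split in V-fold cross-validation changes the estimated prediction error can be illustrated with a small sketch. The code below is a simplified stand-in, not the authors' algorithm: it repeats plain V-fold cross-validation (no grid search, no nesting) on synthetic data with a one-variable least-squares line, and all function names, data, and parameter defaults are assumptions made for this example.

```python
import random
import statistics


def v_fold_splits(n, v, rng):
    """Shuffle indices 0..n-1 and deal them into v roughly equal folds."""
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i::v] for i in range(v)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for any model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cv_mse(xs, ys, v, rng):
    """Mean squared prediction error from one V-fold cross-validation."""
    folds = v_fold_splits(len(xs), v, rng)
    sq_err = []
    for k, test in enumerate(folds):
        # Train on every fold except fold k, then predict fold k.
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_err.extend((ys[i] - (a + b * xs[i])) ** 2 for i in test)
    return statistics.fmean(sq_err)


def repeated_cv_mse(xs, ys, v=5, repeats=20, seed=0):
    """Repeat V-fold CV with fresh random splits; return one MSE per repeat."""
    rng = random.Random(seed)
    return [cv_mse(xs, ys, v, rng) for _ in range(repeats)]


# Synthetic data (hypothetical): a noisy straight line.
data_rng = random.Random(1)
xs = [data_rng.uniform(0, 10) for _ in range(40)]
ys = [2.0 + 0.5 * x + data_rng.gauss(0, 1) for x in xs]

estimates = repeated_cv_mse(xs, ys)
print("min:", min(estimates))
print("max:", max(estimates))
print("mean over repeats:", statistics.fmean(estimates))
```

The spread between the minimum and maximum is the split-to-split variation the abstract refers to; averaging over repeats gives a more stable figure than any single V-fold run. Extending this to parameter tuning would wrap a grid search around `cv_mse`, and nested cross-validation would add an outer loop for the final error estimate.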
Base Flow Model Validation

NASA Technical Reports Server (NTRS)

Sinha, Neeraj; Brinckman, Kevin; Jansen, Bernard; Seiner, John

2011-01-01

A method was developed for obtaining propulsive base flow data in both hot and cold jet environments, at Mach numbers and altitudes relevant to NASA launcher designs. The base flow data were used to perform computational fluid dynamics (CFD) turbulence model assessments of base flow predictive capabilities in order to provide increased confidence in base thermal and pressure load predictions obtained from computational modeling efforts. Predictive CFD analyses were used in the design of the experiments, available propulsive models were used to reduce program costs and increase success, and a wind tunnel facility was used. The data obtained allowed assessment of CFD/turbulence models in a complex flow environment, working within a building-block approach to validation, in which cold, non-reacting test data were first used for validation, followed by more complex reacting base flow validation.
Validity of the Miller forensic assessment of symptoms test in psychiatric inpatients.

PubMed

Veazey, Connie H; Wagner, Alisha L; Hays, J Ray; Miller, Holly A

2005-06-01

This study investigated the validity of the Miller Forensic Assessment of Symptoms Test (M-FAST), a brief measure of malingering, in an inpatient psychiatric sample of 70. Among those patients who also completed the Personality Assessment Inventory (N=44), Total M-FAST score was related in the expected directions to the Personality Assessment Inventory validity scales and indexes, providing evidence for concurrent validity of the M-FAST. With the PAI malingering index used as a criterion, we examined the diagnostic efficiency of the M-FAST and found a cut score of 8 represented the best balance of sensitivity, specificity, positive predictive power, and negative predictive power. Based on this cut-score of 8, 16% of the population was classified as malingering. The M-FAST appears to be an excellent rapid screen for symptom exaggeration in this population and setting.
Test-Retest Reliability and Predictive Validity of the Implicit Association Test in Children

ERIC Educational Resources Information Center

Rae, James R.; Olson, Kristina R.

2018-01-01

The Implicit Association Test (IAT) is increasingly used in developmental research despite minimal evidence of whether children's IAT scores are reliable across time or predictive of behavior. When test-retest reliability and predictive validity have been assessed, the results have been mixed, and because these studies have differed on many…
Construct measurement quality improves predictive accuracy in violence risk assessment: an illustration using the personality assessment inventory.

PubMed

Hendry, Melissa C; Douglas, Kevin S; Winter, Elizabeth A; Edens, John F

2013-01-01

Much of the risk assessment literature has focused on the predictive validity of risk assessment tools. However, these tools often comprise a list of risk factors that are themselves complex constructs, and focusing on the quality of measurement of individual risk factors may improve the predictive validity of the tools. The present study illustrates this concern using the Antisocial Features and Aggression scales of the Personality Assessment Inventory (Morey, 1991). In a sample of 1,545 prison inmates and offenders undergoing treatment for substance abuse (85% male), we evaluated (a) the factorial validity of the ANT and AGG scales, (b) the utility of original ANT and AGG scales and newly derived ANT and AGG scales for predicting antisocial outcomes (recidivism and institutional infractions), and (c) whether items with a stronger relationship to the underlying constructs (higher factor loadings) were in turn more strongly related to antisocial outcomes. Confirmatory factor analyses (CFAs) indicated that ANT and AGG items were not structured optimally in these data in terms of correspondence to the subscale structure identified in the PAI manual. Exploratory factor analyses were conducted on a random split-half of the sample to derive optimized alternative factor structures, and cross-validated in the second split-half using CFA. Four-factor models emerged for both the ANT and AGG scales, and, as predicted, the size of item factor loadings was associated with the strength with which items were associated with institutional infractions and community recidivism. This suggests that the quality by which a construct is measured is associated with its predictive strength. Implications for risk assessment are discussed. Copyright © 2013 John Wiley & Sons, Ltd.
The predictive validity of quality of evidence grades for the stability of effect estimates was low: a meta-epidemiological study.

PubMed

Gartlehner, Gerald; Dobrescu, Andreea; Evans, Tammeka Swinson; Bann, Carla; Robinson, Karen A; Reston, James; Thaler, Kylie; Skelly, Andrea; Glechner, Anna; Peterson, Kimberly; Kien, Christina; Lohr, Kathleen N

2016-02-01

To determine the predictive validity of the U.S. Evidence-based Practice Center (EPC) approach to GRADE (Grading of Recommendations Assessment, Development and Evaluation). Based on Cochrane reports with outcomes graded as high quality of evidence (QOE), we prepared 160 documents which represented different levels of QOE. Professional systematic reviewers dually graded the QOE. For each document, we determined whether estimates were concordant with high QOE estimates of the Cochrane reports. We compared the observed proportion of concordant estimates with the expected proportion from an international survey. To determine the predictive validity, we used the Hosmer-Lemeshow test to assess calibration and the C (concordance) index to assess discrimination. The predictive validity of the EPC approach to GRADE was limited. Estimates graded as high QOE were less likely, estimates graded as low or insufficient QOE more likely to remain stable than expected. The EPC approach to GRADE could not reliably predict the likelihood that individual bodies of evidence remain stable as new evidence becomes available. C-indices ranged between 0.56 (95% CI, 0.47 to 0.66) and 0.58 (95% CI, 0.50 to 0.67) indicating a low discriminatory ability. The limited predictive validity of the EPC approach to GRADE seems to reflect a mismatch between expected and observed changes in treatment effects as bodies of evidence advance from insufficient to high QOE. Copyright © 2016 Elsevier Inc. All rights reserved.
Differential Predictive Validity of a Preschool Battery Across Race and Sex.

ERIC Educational Resources Information Center

Reynolds, Cecil R.

Determination of the fairness of preschool tests for use with children of varying cultural backgrounds is the major objective of this study. The predictive validity of a battery of preschool tests, chosen to represent the core areas of preschool assessment, across race and sex, was evaluated. Validity of the battery was examined over a 12-month…
Family-Based Benchmarking of Copy Number Variation Detection Software.

PubMed

Nutsua, Marcel Elie; Fischer, Annegret; Nebel, Almut; Hofmann, Sylvia; Schreiber, Stefan; Krawczak, Michael; Nothnagel, Michael

2015-01-01

The analysis of structural variants, in particular of copy-number variations (CNVs), has proven valuable in unraveling the genetic basis of human diseases. Hence, a large number of algorithms have been developed for the detection of CNVs in SNP array signal intensity data. Using the European and African HapMap trio data, we undertook a comparative evaluation of six commonly used CNV detection software tools, namely Affymetrix Power Tools (APT), QuantiSNP, PennCNV, GLAD, R-gada and VEGA, and assessed their level of pair-wise prediction concordance. The tool-specific CNV prediction accuracy was assessed in silico by way of intra-familial validation. Software tools differed greatly in terms of the number and length of the CNVs predicted as well as the number of markers included in a CNV. All software tools predicted substantially more deletions than duplications. Intra-familial validation revealed consistently low levels of prediction accuracy as measured by the proportion of validated CNVs (34-60%). Moreover, up to 20% of apparent family-based validations were found to be due to chance alone. Software using Hidden Markov models (HMM) showed a trend to predict fewer CNVs than segmentation-based algorithms albeit with greater validity. PennCNV yielded the highest prediction accuracy (60.9%). Finally, the pairwise concordance of CNV prediction was found to vary widely with the software tools involved. We recommend HMM-based software, in particular PennCNV, rather than segmentation-based algorithms when validity is the primary concern of CNV detection. QuantiSNP may be used as an additional tool to detect sets of CNVs not detectable by the other tools. Our study also reemphasizes the need for laboratory-based validation, such as qPCR, of CNVs predicted in silico.
A Public-Private Partnership Develops and Externally Validates a 30-Day Hospital Readmission Risk Prediction Model

PubMed Central

Choudhry, Shahid A.; Li, Jing; Davis, Darcy; Erdmann, Cole; Sikka, Rishi; Sutariya, Bharat

2013-01-01

Introduction: Preventing the occurrence of hospital readmissions is needed to improve quality of care and foster population health across the care continuum. Hospitals are being held accountable for improving transitions of care to avert unnecessary readmissions. Advocate Health Care in Chicago and Cerner (ACC) collaborated to develop all-cause, 30-day hospital readmission risk prediction models to identify patients that need interventional resources. Ideally, prediction models should encompass several qualities: they should have high predictive ability; use reliable and clinically relevant data; use vigorous performance metrics to assess the models; be validated in populations where they are applied; and be scalable in heterogeneous populations. However, a systematic review of prediction models for hospital readmission risk determined that most performed poorly (average C-statistic of 0.66) and efforts to improve their performance are needed for widespread usage. Methods: The ACC team incorporated electronic health record data, utilized a mixed-method approach to evaluate risk factors, and externally validated their prediction models for generalizability. Inclusion and exclusion criteria were applied to the patient cohort, which was then split for derivation and internal validation. Stepwise logistic regression was performed to develop two predictive models: one for admission and one for discharge. The prediction models were assessed for discrimination ability, calibration, overall performance, and then externally validated. Results: The ACC Admission and Discharge Models demonstrated modest discrimination ability during derivation, internal and external validation post-recalibration (C-statistic of 0.76 and 0.78, respectively), and reasonable model fit during external validation for utility in heterogeneous populations. Conclusions: The ACC Admission and Discharge Models embody the design qualities of ideal prediction models. The ACC plans to continue its partnership to further improve and develop valuable clinical models. PMID:24224068
Validation of the 4P's Plus screen for substance use in pregnancy validation of the 4P's Plus.

PubMed

Chasnoff, I J; Wells, A M; McGourty, R F; Bailey, L K

2007-12-01

The purpose of this study is to validate the 4P's Plus screen for substance use in pregnancy. A total of 228 pregnant women enrolled in prenatal care underwent screening with the 4P's Plus and received a follow-up clinical assessment for substance use. Statistical analyses regarding reliability, sensitivity, specificity, and positive and negative predictive validity of the 4P's Plus were conducted. The overall reliability for the five-item measure was 0.62. Seventy-four (32.5%) of the women had a positive screen. Sensitivity and specificity were very good, at 87 and 76%, respectively. Positive predictive validity was low (36%), but negative predictive validity was quite high (97%). Of the 31 women who had a positive clinical assessment, 45% were using less than 1 day per week. The 4P's Plus reliably and effectively screens pregnant women for risk of substance use, including those women typically missed by other perinatal screening methodologies.
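The four screening statistics in this abstract all derive from a single 2x2 table of screen result against clinical assessment. The sketch below reconstructs such a table from the reported totals (228 women, 74 positive screens, 31 positive clinical assessments). The true-positive count of 27 is an assumption chosen to match the reported 87% sensitivity; it is not stated in the abstract, and the function name is hypothetical.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly flagged
        "specificity": tn / (tn + fp),  # negatives correctly cleared
        "ppv": tp / (tp + fp),          # flagged cases that are true positives
        "npv": tn / (tn + fn),          # cleared cases that are true negatives
    }


# Reported in the abstract.
total, screen_pos, clinical_pos = 228, 74, 31

# ASSUMPTION: 27 true positives, inferred from 87% of 31 (not reported directly).
tp = 27
fn = clinical_pos - tp          # clinically positive but screened negative
fp = screen_pos - tp            # screened positive but clinically negative
tn = total - tp - fn - fp       # everyone else

metrics = screening_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Under that assumption the reconstructed table reproduces the rounded figures in the abstract, and it shows why a low positive predictive value can coexist with good sensitivity: with only 31 of 228 women clinically positive, false positives outnumber true positives.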
Validity of Principal Diagnoses in Discharge Summaries and ICD-10 Coding Assessments Based on National Health Data of Thailand.

PubMed

Sukanya, Chongthawonsatid

2017-10-01

This study examined the validity of the principal diagnoses on discharge summaries and coding assessments. Data were collected from the National Health Security Office (NHSO) of Thailand in 2015. In total, 118,971 medical records were audited. The sample was drawn from government hospitals and private hospitals covered by the Universal Coverage Scheme in Thailand. Hospitals and cases were selected using NHSO criteria. The validity of the principal diagnoses listed in the "Summary and Coding Assessment" forms was established by comparing data from the discharge summaries with data obtained from medical record reviews, and additionally, by comparing data from the coding assessments with data in the computerized ICD (the database used for reimbursement purposes). The summary assessments had low sensitivities (7.3%-37.9%), high specificities (97.2%-99.8%), low positive predictive values (9.2%-60.7%), and high negative predictive values (95.9%-99.3%). The coding assessments had low sensitivities (31.1%-69.4%), high specificities (99.0%-99.9%), moderate positive predictive values (43.8%-89.0%), and high negative predictive values (97.3%-99.5%). The discharge summaries and codings often contained mistakes, particularly in the categories "Endocrine, nutritional, and metabolic diseases", "Symptoms, signs, and abnormal clinical and laboratory findings not elsewhere classified", "Factors influencing health status and contact with health services", and "Injury, poisoning, and certain other consequences of external causes". The validity of the principal diagnoses on the summary and coding assessment forms was found to be low. The training of physicians and coders must be strengthened to improve the validity of discharge summaries and codings.
Assessing the validity of health impact assessment predictions regarding a Japanese city's transition to core city status: a monitoring review.

PubMed

Hoshiko, M; Hara, K; Ishitake, T

2012-02-01

The validity of health impact assessment (HIA) predictions has not been accurately assessed to date. In recent years, legislative attempts to promote decentralization have been progressing in Japan, and Kurume was designated as a core city in April 2008. An HIA into the transition of Kurume to a core city was conducted before the event, but the recommendations were not accepted by city officials. The aim of this study was to examine the validity of predictions made in the HIA on Kurume by conducting a monitoring review into the accuracy of the predictions. Before Kurume was designated as a core city, the residents completed an online questionnaire and city officials were interviewed. The findings and recommendations were presented to the city administration. One year after the transition, a monitoring review was performed to clarify the accuracy of the HIA predictions by evaluating the correlation between the predictions and reality. Many of the HIA predictions were found to conflict with reality in Kurume. Prediction validity was evaluated for two groups: residents of Kurume and city officials. For the residents, 17% (2/12 items) of the predictions were found to be compatible, 58% (7/12) were incompatible and 25% (3/12) were difficult to evaluate. For city officials, the analysis was divided into those whose department was directly involved in tasks transferred to them (transfer tasks) and those whose department was not. For the city officials in departments responsible for conducting core city transfer tasks, 33% (3/9 items) of the predictions were found to be compatible, 33% (3/9) were incompatible and 33% (3/9) were difficult to evaluate. However, for the city officials whose responsibilities were unrelated to core city transfer tasks, 11% (1/9) of predictions were found to be compatible, 78% (7/9) were incompatible and 11% (1/9) were difficult to evaluate. 
Although it was possible to validate some of the HIA predictions, the results of this monitoring review found substantial discrepancies between the predictions and reality 1 year after the transition of Kurume to a core city. This suggests that the accuracy of HIA predictions may be called into question. However, it should be noted that the review was conducted very soon after the transition and the steering group was very small, which may explain why the HIA predictions were inaccurate. Further, long-term studies may be needed to assess the accuracy of HIA predictions in similar contexts. Copyright © 2011 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.
Cross Cultural Adaptation, Validity, and Reliability of the Farsi Breastfeeding Attrition Prediction Tools in Iranian Pregnant Women

PubMed Central

Mortazavi, Forough; Mousavi, Seyed Abbas; Chaman, Reza; Khosravi, Ahmad; Janke, Jill R.

2015-01-01

Background: The rate of exclusive breastfeeding in Iran is decreasing. The breastfeeding attrition prediction tools (BAPT) have been validated and used in predicting premature weaning. Objectives: We aimed to translate the BAPT into Farsi, assess its content validity, and examine its reliability and validity to identify exclusive breastfeeding discontinuation in Iran. Materials and Methods: The BAPT was translated into Farsi and the content validity of the Farsi version of the BAPT was assessed. It was administered to 356 pregnant women in the third trimester of pregnancy, who were residents of a city in northeast of Iran. The structural integrity of the four-factor model was assessed in confirmatory factor analysis (CFA) and exploratory factor analysis (EFA). Reliability was assessed using Cronbach’s alpha coefficient and item-subscale correlations. Validity was assessed using the known-group comparison (128 with vs. 228 without breastfeeding experience) and predictive validity (80 successes vs. 265 failures in exclusive breastfeeding). Results: The internal consistency of the whole instrument (49 items) was 0.775. CFA provided an acceptable fit to the a priori four-factor model (Chi-square/df = 1.8, Root Mean Square Error of Approximation (RMSEA) = 0.049, Standardized Root Mean Square Residual (SRMR) = 0.064, Comparative Fit Index (CFI) = 0.911). The difference in means of breastfeeding control (BFC) between the participants with and without breastfeeding experience was significant (P < 0.001). In addition, the total score of BAPT and the score of Breast Feeding Control (BFC) subscale were higher in women who were on exclusive breastfeeding than women who were not, at four months postpartum (P < 0.05). Conclusions: This study validated the Farsi version of BAPT. It is useful for researchers who want to use it in Iran to identify women at higher risks of Exclusive Breast Feeding (EBF) discontinuation. PMID:26019910
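The internal consistency figure above is a Cronbach's alpha. As a minimal sketch of how that coefficient is computed, the function below applies the standard formula to made-up item scores; the function name and the data are illustrative assumptions, not the study's data or code.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    items: one inner list of scores per item, with respondents in the
    same order in every inner list.
    """
    k = len(items)
    # Total score per respondent, summed across items
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_var = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))

# Hypothetical scores: 3 items answered by 5 respondents
example_items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
print(f"alpha = {cronbach_alpha(example_items):.3f}")
```

Alpha approaches 1 when items rise and fall together across respondents and approaches 0 when they vary independently.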
Two-Tiered Violence Risk Estimates: a validation study of an integrated-actuarial risk assessment instrument.

PubMed

Mills, Jeremy F; Gray, Andrew L

2013-11-01

This study is an initial validation study of the Two-Tiered Violence Risk Estimates instrument (TTV), a violence risk appraisal instrument designed to support an integrated-actuarial approach to violence risk assessment. The TTV was scored retrospectively from file information on a sample of violent offenders. Construct validity was examined by comparing the TTV with instruments that have shown utility to predict violence that were prospectively scored: The Historical-Clinical-Risk Management-20 (HCR-20) and Lifestyle Criminality Screening Form (LCSF). Predictive validity was examined through a long-term follow-up of 12.4 years with a sample of 78 incarcerated offenders. Results show the TTV to be highly correlated with the HCR-20 and LCSF. The base rate for violence over the follow-up period was 47.4%, and the TTV was equally predictive of violent recidivism relative to the HCR-20 and LCSF. Discussion centers on the advantages of an integrated-actuarial approach to the assessment of violence risk.
An evidence-based decision assistance model for predicting training outcome in juvenile guide dogs.

PubMed

Harvey, Naomi D; Craigon, Peter J; Blythe, Simon A; England, Gary C W; Asher, Lucy

2017-01-01

Working dog organisations, such as Guide Dogs, need to regularly assess the behaviour of the dogs they train. In this study we developed a questionnaire-style behaviour assessment completed by training supervisors of juvenile guide dogs aged 5, 8 and 12 months old (n = 1,401), and evaluated aspects of its reliability and validity. Specifically, internal reliability, temporal consistency, construct validity, predictive criterion validity (comparing against later training outcome) and concurrent criterion validity (comparing against a standardised behaviour test) were evaluated. Thirty-nine questions were sourced either from previously published literature or created to meet requirements identified via Guide Dogs staff surveys and staff feedback. Internal reliability analyses revealed seven reliable and interpretable trait scales named according to the questions within them as: Adaptability; Body Sensitivity; Distractibility; Excitability; General Anxiety; Trainability and Stair Anxiety. Intra-individual temporal consistency of the scale scores between 5-8, 8-12 and 5-12 months was high. All scales except Body Sensitivity showed some degree of concurrent criterion validity. Predictive criterion validity was supported for all seven scales, since associations were found with training outcome at at least one age. Thresholds of z-scores on the scales were identified that were able to distinguish later training outcome by identifying 8.4% of all dogs withdrawn for behaviour and 8.5% of all qualified dogs, with 84% and 85% specificity. The questionnaire assessment was reliable and could detect traits that are consistent within individuals over time, despite juvenile dogs undergoing development during the study period. By applying thresholds to scores produced from the questionnaire this assessment could prove to be a highly valuable decision-making tool for Guide Dogs.
This is the first questionnaire-style assessment of juvenile dogs that has shown value in predicting the training outcome of individual working dogs.
Falls screening and assessment tools used in acute mental health settings: a review of policies in England and Wales

PubMed Central

Narayanan, V.; Dickinson, A.; Victor, C.; Griffiths, C.; Humphrey, D.

2016-01-01

Objectives: There is an urgent need to improve the care of older people at risk of falls or who experience falls in mental health settings. The aims of this study were to evaluate the individual falls risk assessment tools adopted by National Health Service (NHS) mental health trusts in England and healthcare boards in Wales, to evaluate the comprehensiveness of these tools and to review their predictive validity. Methods: All NHS mental health trusts in England (n = 56) and healthcare boards in Wales (n = 6) were invited to supply their falls policies and other relevant documentation (e.g. local falls audits). In order to check the comprehensiveness of tools listed in policy documents, the risk variables of the tools adopted by the mental health trusts’ policies were compared with the 2004 National Institute for Health and Care Excellence (NICE) falls prevention guidelines. A comprehensive analytical literature review was undertaken to evaluate the predictive validity of the tools used in these settings. Results: Falls policies were obtained from 46 mental health trusts. Thirty-five policies met the study inclusion criteria and were included in the analysis. The main falls assessment tools used were the St. Thomas’ Risk Assessment Tool in Falling Elderly Inpatients (STRATIFY), Falls Risk Assessment Scale for the Elderly, Morse Falls Scale (MFS) and Falls Risk Assessment Tool (FRAT). On detailed examination, a number of different versions of the FRAT were evident; validated tools had inconsistent predictive validity and none of them had been validated in mental health settings. Conclusions: Falls risk assessment is the most commonly used component of risk prevention strategies, but most policies included unvalidated tools, and even well-validated tools such as the STRATIFY and the MFS are reported to have inconsistent predictive accuracy. This raises questions about operational usefulness, as none of these tools have been tested in acute mental health settings.
The falls risk assessment tools from only four mental health trusts met all the recommendations of the NICE falls guidelines on multifactorial assessment for prevention of falls. The recent NICE (2013) guidance states that tools predicting risk using numeric scales should no longer be used; however, multifactorial risk assessment and interventions tailored to patient needs are recommended. Trusts will need to update their policies in response to this guidance. PMID:26395210
Dynamic Assessment of Reading Difficulties: Predictive and Incremental Validity on Attitude toward Reading and the Use of Dialogue/Participation Strategies in Classroom Activities.

PubMed

Navarro, Juan-José; Lara, Laura

2017-01-01

Dynamic Assessment (DA) has been shown to have more predictive value than conventional tests for academic performance. However, in relation to reading difficulties, further research is needed to determine the predictive validity of DA for specific aspects of the different processes involved in reading and the differential validity of DA for different subgroups of students with an academic disadvantage. This paper analyzes the implementation of a DA device that evaluates processes involved in reading (EDPL) among 60 students with reading comprehension difficulties between 9 and 16 years of age, of whom 20 have intellectual disabilities, 24 have reading-related learning disabilities, and 16 have socio-cultural disadvantages. We specifically analyze the predictive validity of the EDPL device over attitude toward reading, and the use of dialogue/participation strategies in reading activities in the classroom during the implementation stage. We also analyze if the EDPL device provides additional information to that obtained with a conventionally applied personal-social adjustment scale (APSL). Results showed that dynamic scores, obtained from the implementation of the EDPL device, significantly predict the studied variables. Moreover, dynamic scores showed a significant incremental validity in relation to predictions based on an APSL scale. In relation to differential validity, the results indicated the superior predictive validity for DA for students with intellectual disabilities and reading disabilities than for students with socio-cultural disadvantages. Furthermore, the role of metacognition and its relation to the processes of personal-social adjustment in explaining the results is discussed.
Dynamic Assessment of Reading Difficulties: Predictive and Incremental Validity on Attitude toward Reading and the Use of Dialogue/Participation Strategies in Classroom Activities

PubMed Central

Navarro, Juan-José; Lara, Laura

2017-01-01

Dynamic Assessment (DA) has been shown to have more predictive value than conventional tests for academic performance. However, in relation to reading difficulties, further research is needed to determine the predictive validity of DA for specific aspects of the different processes involved in reading and the differential validity of DA for different subgroups of students with an academic disadvantage. This paper analyzes the implementation of a DA device that evaluates processes involved in reading (EDPL) among 60 students with reading comprehension difficulties between 9 and 16 years of age, of whom 20 have intellectual disabilities, 24 have reading-related learning disabilities, and 16 have socio-cultural disadvantages. We specifically analyze the predictive validity of the EDPL device over attitude toward reading, and the use of dialogue/participation strategies in reading activities in the classroom during the implementation stage. We also analyze if the EDPL device provides additional information to that obtained with a conventionally applied personal-social adjustment scale (APSL). Results showed that dynamic scores, obtained from the implementation of the EDPL device, significantly predict the studied variables. Moreover, dynamic scores showed a significant incremental validity in relation to predictions based on an APSL scale. In relation to differential validity, the results indicated the superior predictive validity for DA for students with intellectual disabilities and reading disabilities than for students with socio-cultural disadvantages. Furthermore, the role of metacognition and its relation to the processes of personal-social adjustment in explaining the results is discussed. PMID:28243215
A contextual approach to social skills assessment in the peer group: who is the best judge?

PubMed

Kwon, Kyongboon; Kim, Elizabeth Moorman; Sheridan, Susan M

2012-09-01

Using a contextual approach to social skills assessment in the peer group, this study examined the criterion-related validity of contextually relevant social skills and the incremental validity of peers and teachers as judges of children's social skills. Study participants included 342 (180 male and 162 female) students and their classroom teachers (N = 22) from rural communities. As expected, contextually relevant social skills were significantly related to a variety of social status indicators (i.e., likability, peer- and teacher-assessed popularity, reciprocated friendships, clique centrality) and positive school functioning (i.e., school liking and academic competence). Peer-assessed social skills, not teacher-assessed social skills, demonstrated consistent incremental validity in predicting various indicators of social status outcomes; peer- and teacher-assessed social skills alike showed incremental validity in predicting positive school functioning. The relation between contextually relevant social skills and study outcomes did not vary by child gender. Findings are discussed in terms of the significance of peers in the assessment of children's social skills in the peer group as well as the usefulness of a contextual approach to social skills assessment.
The Adaptation and Validation of the Emotion Matching Task for Preschool Children in Spain

ERIC Educational Resources Information Center

Alonso-Alberca, Natalia; Vergara, Ana I.; Fernandez-Berrocal, Pablo; Johnson, Stacy R.; Izard, Carroll E.

2012-01-01

The Emotion Matching Task (EMT; Izard, Haskins, Schultz, Trentacosta, & King, 2003) was developed to assess emotion knowledge in preschoolers and was demonstrated to show adequate convergent and predictive validity in an American sample (Morgan, Izard, & King, 2010). In light of the need for valid measures for assessing emotion…
Translation and validation of the Canadian diabetes risk assessment questionnaire in China.

PubMed

Guo, Jia; Shi, Zhengkun; Chen, Jyu-Lin; Dixon, Jane K; Wiley, James; Parry, Monica

2018-01-01

To adapt the Canadian Diabetes Risk Assessment Questionnaire for the Chinese population and to evaluate its psychometric properties. A cross-sectional study was conducted with a convenience sample of 194 individuals aged 35-74 years from October 2014 to April 2015. The Canadian Diabetes Risk Assessment Questionnaire was adapted and translated for the Chinese population. Test-retest reliability was conducted to measure stability. Criterion and convergent validity of the adapted questionnaire were assessed using 2-hr 75 g oral glucose tolerance tests and the Finnish Diabetes Risk Scores, respectively. Sensitivity and specificity were evaluated to establish its predictive validity. The test-retest reliability was 0.988. Adequate validity of the adapted questionnaire was demonstrated by positive correlations found between the scores and 2-hr 75 g oral glucose tolerance tests (r = .343, p < .001) and with the Finnish Diabetes Risk Scores (r = .738, p < .001). The area under the receiver operating characteristic curve was 0.705 (95% CI .632, .778), demonstrating moderate diagnostic value at a cutoff score of 30. The sensitivity was 73%, with a positive predictive value of 57% and a negative predictive value of 78%. Our results provided evidence supporting the translation consistency, content validity, convergent validity, criterion validity, sensitivity, and specificity of the translated Canadian Diabetes Risk Assessment Questionnaire with minor modifications. This paper provides clinical, practical, and methodological information on how to adapt a diabetes risk calculator between cultures for public health nurses. © 2017 Wiley Periodicals, Inc.
Implementation and Initial Validation of the APS English Test [and] The APS English-Writing Test at Golden West College: Evidence for Predictive Validity.

ERIC Educational Resources Information Center

Isonio, Steven

In May 1991, Golden West College (California) conducted a validation study of the English portion of the Assessment and Placement Services for Community Colleges (APS), followed by a predictive validity study in July 1991. The initial study was designed to aid in the implementation of the new test at GWC by comparing data on APS use at other…
External validation of preexisting first trimester preeclampsia prediction models.

PubMed

Allen, Rebecca E; Zamora, Javier; Arroyo-Manzano, David; Velauthar, Luxmilar; Allotey, John; Thangaratinam, Shakila; Aquilina, Joseph

2017-10-01

To validate the increasing number of prognostic models being developed for preeclampsia using our own prospective study. A systematic review of literature that assessed biomarkers, uterine artery Doppler and maternal characteristics in the first trimester for the prediction of preeclampsia was performed, and models were selected based on predefined criteria. Validation was performed by applying the regression coefficients that were published in the different derivation studies to our cohort. We assessed the models' discrimination ability and calibration. Twenty models were identified for validation. The discrimination ability observed in the derivation studies (area under the curve, AUC) ranged from 0.70 to 0.96; when these models were applied to the validation cohort, the AUCs varied considerably, ranging from 0.504 to 0.833. Comparing the AUCs obtained in the derivation studies with those in the validation cohort, we found statistically significant differences in several studies. There is currently no definitive prediction model with adequate ability to discriminate for preeclampsia that performs as well when applied to a different population and can differentiate well between the highest and lowest risk groups within the tested population. The pre-existing large number of models limits the value of further model development and future research should be focussed on further attempts to validate existing models and assessing whether implementation of these improves patient care. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
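The validation approach described above (applying published regression coefficients to a new cohort, then measuring discrimination) can be sketched as follows. Everything concrete here is hypothetical: the intercept, the two coefficients, the predictor names and the six-person cohort are invented for illustration and do not correspond to any of the twenty models. The sketch covers discrimination only, not calibration.

```python
import math

def predict_risk(intercept, coefs, features):
    """Logistic model: risk = 1 / (1 + exp(-(intercept + sum(b_i * x_i))))."""
    linear = intercept + sum(b * x for b, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-linear))

def auc(risks, outcomes):
    """Area under the ROC curve, computed as the probability that a random
    case has a higher predicted risk than a random non-case (ties count half)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical "published" model: intercept plus coefficients for
# mean arterial pressure (mmHg) and prior preeclampsia (0/1)
intercept, coefs = -3.0, [0.04, 0.8]

# Hypothetical validation cohort: (features, observed outcome)
cohort = [([85, 0], 0), ([95, 0], 0), ([100, 1], 1),
          ([90, 0], 1), ([110, 1], 1), ([88, 0], 0)]

risks = [predict_risk(intercept, coefs, x) for x, _ in cohort]
outcomes = [y for _, y in cohort]
print(f"validation AUC = {auc(risks, outcomes):.3f}")
```

Because the coefficients are fixed in advance rather than refitted, a drop in AUC between the derivation and validation cohorts reflects how well the original model transports to a new population.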
Relapse Risk Assessment for Schizophrenia Patients (RASP): A New Self-Report Screening Tool.

PubMed

Velligan, Dawn; Carpenter, William; Waters, Heidi C; Gerlanc, Nicole M; Legacy, Susan N; Ruetsch, Charles

2018-01-01

The Relapse Assessment for Schizophrenia Patients (RASP) was developed as a six-question self-report screener that measures indicators of Increased Anxiety and Social Isolation to assess patient stability and predict imminent relapse. This paper describes the development and psychometric characteristics of the RASP. The RASP and Positive and Negative Syndrome Scale (PANSS) were administered to patients with schizophrenia (n=166) three separate times. Chart data were collected on a subsample of patients (n=81). Psychometric analyses of RASP included tests of reliability, construct validity, and concurrent validity of items. Factors from RASP were correlated with subscales from PANSS (sensitivity to change and criterion validity [agreement between RASP and evidence of relapse]). Test-retest reliability returned modest to strong agreement at the item level and strong agreement at the questionnaire level. RASP showed good item response curves and internal consistency for the total instrument and within each of the two subscales (Increased Anxiety and Social Isolation). RASP Total Score and subscales showed good concurrent validity when correlated with PANSS Total Score, Positive, Excitement, and Anxiety subscales. RASP correctly predicted relapse in 67% of cases, with good specificity and negative predictive power and acceptable positive predictive power and sensitivity. The reliability and validity data presented support the use of RASP in settings where addition of a brief self-report assessment of relapse risk among patients with schizophrenia may be of benefit. Ease of use and scoring, and the ability to administer without clinical supervision allows for routine administration and assessment of relapse risk.
Assessing genomic selection prediction accuracy in a dynamic barley breeding

USDA-ARS's Scientific Manuscript database

Genomic selection is a method to improve quantitative traits in crops and livestock by estimating breeding values of selection candidates using phenotype and genome-wide marker data sets. Prediction accuracy has been evaluated through simulation and cross-validation, however validation based on prog...

The Measurement of Negative Creativity: Metrics and Relationships

ERIC Educational Resources Information Center

Kapoor, Hansika; Khan, Azizuddin

2016-01-01

Although the dark side of creativity and negative creativity are shaping into legitimate subconstructs, measures to assess the same remain to be validated. To meet this goal, two studies assessed the convergent, predictive, and criterion-related validities of two valence-inclusive creativity measures. One measure assessed the self-report…
Predicting fitness-to-drive following stroke using the Occupational Therapy - Driver Off Road Assessment Battery.

PubMed

Unsworth, Carolyn A; Baker, Anne; Lannin, Natasha; Harries, Priscilla; Strahan, Janene; Browne, Matthew

2018-02-28

It is difficult to determine if, or when, individuals with stroke are ready to undergo on-road fitness-to-drive assessment. The Occupational Therapy - Driver Off Road Assessment Battery was developed to determine client suitability to resume driving. The predictive validity of the Battery needs to be verified for people with stroke. Examine the predictive validity of the Occupational Therapy - Driver Off Road Assessment Battery for on-road performance among people with stroke. Off-road data were collected from 148 people post stroke on the Battery and the outcome of their on-road assessment was recorded as: fit-to-drive or not fit-to-drive. The majority of participants (76%) were able to resume driving. A classification and regression tree (CART) analysis using four subtests (three cognitive and one physical) from the Battery demonstrated an area under the curve (AUC) of 0.8311. Using a threshold of 0.5, the model correctly predicted 98/112 fit-to-drive (87.5%) and 26/36 people not fit-to-drive (72.2%). The three cognitive subtests from the Occupational Therapy - Driver Off Road Assessment Battery and potentially one of the physical tests have good predictive validity for client fitness-to-drive. These tests can be used to screen client suitability for proceeding to an on-road test following stroke. Implications for Rehabilitation: Following stroke, drivers should be counseled (including consideration of local legislation) concerning return to driving. The Occupational Therapy - Driver Off Road Assessment Battery can be used in the clinic to screen people for suitability to undertake on road assessment. Scores on four of the Occupational Therapy - Driver Off Road Assessment Battery subtests are predictive of resumption of driving following stroke.
Simulation for Prediction of Entry Article Demise (SPEAD): An Analysis Tool for Spacecraft Safety Analysis and Ascent/Reentry Risk Assessment

NASA Technical Reports Server (NTRS)

Ling, Lisa

2014-01-01

For the purpose of performing safety analysis and risk assessment for a potential off-nominal atmospheric reentry resulting in vehicle breakup, a synthesis of trajectory propagation coupled with thermal analysis and the evaluation of node failure is required to predict the sequence of events, the timeline, and the progressive demise of spacecraft components. To provide this capability, the Simulation for Prediction of Entry Article Demise (SPEAD) analysis tool was developed. The software and methodology have been validated against actual flights, telemetry data, and validated software, and safety/risk analyses were performed for various programs using SPEAD. This report discusses the capabilities, modeling, validation, and application of the SPEAD analysis tool.
Development and validation of the Pediatric Anesthesia Behavior score--an objective measure of behavior during induction of anesthesia.

PubMed

Beringer, Richard M; Greenwood, Rosemary; Kilpatrick, Nicky

2014-02-01

Measuring perioperative behavior changes requires validated objective rating scales. We developed a simple score for children's behavior during induction of anesthesia (Pediatric Anesthesia Behavior score) and assessed its reliability, concurrent validity, and predictive validity. Data were collected as part of a wider observational study of perioperative behavior changes in children undergoing general anesthesia for elective dental extractions. One-hundred and two healthy children aged 2-12 were recruited. Previously validated behavioral scales were used as follows: the modified Yale Preoperative Anxiety Scale (m-YPAS); the induction compliance checklist (ICC); the Pediatric Anesthesia Emergence Delirium scale (PAED); and the Post-Hospitalization Behavior Questionnaire (PHBQ). Pediatric Anesthesia Behavior (PAB) score was independently measured by two investigators, to allow assessment of interobserver reliability. Concurrent validity was assessed by examining the correlation between the PAB score, the m-YPAS, and the ICC. Predictive validity was assessed by examining the association between the PAB score, the PAED scale, and the PHBQ. The PAB score correlated strongly with both the m-YPAS (P < 0.001) and the ICC (P < 0.001). PAB score was significantly associated with the PAED score (P = 0.031) and with the PHBQ (P = 0.034). Two independent investigators recorded identical PAB scores for 94% of children and overall, there was close agreement between scores (Kappa coefficient of 0.886 [P < 0.001]). The PAB score is simple to use and may predict which children are at increased risk of developing postoperative behavioral disturbance. This study provides evidence for its reliability and validity. © 2013 John Wiley & Sons Ltd.
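Interobserver agreement in the abstract above is summarised with a Kappa coefficient. A minimal sketch of Cohen's kappa for two raters follows; the category labels (1 to 3) and the ten paired ratings are made up for illustration and are not the study's data.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    # Proportion of subjects on which the raters gave the same category
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical scores from two investigators for ten children
investigator_1 = [1, 1, 2, 2, 3, 3, 1, 2, 3, 1]
investigator_2 = [1, 1, 2, 2, 3, 3, 1, 2, 3, 2]
print(f"kappa = {cohens_kappa(investigator_1, investigator_2):.3f}")
```

Kappa is lower than raw percentage agreement because it discounts the matches two raters would produce by chance given how often each uses each category.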
Predictive Validity of Curriculum-Embedded Measures on Outcomes of Kindergarteners Identified as At Risk for Reading Difficulty

ERIC Educational Resources Information Center

Oslund, Eric L.; Hagan-Burke, Shanna; Simmons, Deborah C.; Clemens, Nathan H.; Simmons, Leslie E.; Taylor, Aaron B.; Kwok, Oi-man; Coyne, Michael D.

2017-01-01

This study examined the predictive validity of formative assessments embedded in a Tier 2 intervention curriculum for kindergarten students identified as at risk for reading difficulty. We examined when (i.e., months during the school year) measures could predict reading outcomes gathered at the end of kindergarten and whether the predictive…
Construct and Predictive Validity of the Core Phonics Survey: A Diagnostic Assessment for Students with Specific Learning Disabilities

ERIC Educational Resources Information Center

Park, Yujeong; Benedict, Amber E.; Brownell, Mary T.

2014-01-01

The factor structure of the CORE Phonics Survey was analyzed using a sample of 165 students in upper elementary school with specific learning disabilities. Confirmatory factor analysis was used to identify the hypothesized constructs of the CORE Phonics Survey and predictive validity of the CORE Phonics Survey to predict students' success in word…
Cross-validation of the Beunen-Malina method to predict adult height.

PubMed

Beunen, Gaston P; Malina, Robert M; Freitas, Duarte I; Maia, José A; Claessens, Albrecht L; Gouveia, Elvio R; Lefevre, Johan

2010-08-01

The purpose of this study was to cross-validate the Beunen-Malina method for non-invasive prediction of adult height. Three hundred and eight boys aged 13, 14, 15 and 16 years from the Madeira Growth Study were observed at annual intervals in 1996, 1997 and 1998 and re-measured 7-8 years later. Height, sitting height and the triceps and subscapular skinfolds were measured; skeletal age was assessed using the Tanner-Whitehouse 2 method. Adult height was measured and predicted using the Beunen-Malina method. Maturity groups were classified using relative skeletal age (skeletal age minus chronological age). Pearson correlations, mean differences and standard errors of estimate (SEE) were calculated. Age-specific correlations between predicted and measured adult height vary between 0.70 and 0.85, while age-specific SEE varies between 3.3 and 4.7 cm. The correlations and SEE are similar to those obtained in the development of the original Beunen-Malina method. The Beunen-Malina method is a valid method to predict adult height in adolescent boys and can be used in European populations or populations of European ancestry. Percentage of predicted adult height is a non-invasive valid method to assess biological maturity.
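The two agreement statistics named above, the Pearson correlation and the standard error of estimate, can be sketched on paired predicted and measured heights. The abstract does not state which SEE formula was used; the version below assumes the usual regression definition (residual standard deviation with n - 2 degrees of freedom), and the heights are made up.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def standard_error_of_estimate(x, y):
    """Residual SD from regressing y on x (assumed definition, n - 2 df)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return math.sqrt(ss_res / (n - 2))

# Hypothetical predicted vs. measured adult heights in cm
predicted = [172.0, 168.5, 180.2, 175.1, 169.9, 177.4]
measured = [174.3, 166.0, 182.5, 173.0, 172.8, 176.1]
print(f"r = {pearson_r(predicted, measured):.2f}")
print(f"SEE = {standard_error_of_estimate(predicted, measured):.1f} cm")
```

The correlation describes how well predictions rank individuals, while the SEE expresses the typical size of the prediction error in centimetres, which is why the abstract reports both.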
An Empiric HIV Risk Scoring Tool to Predict HIV-1 Acquisition in African Women.

PubMed

Balkus, Jennifer E; Brown, Elizabeth; Palanee, Thesla; Nair, Gonasagrie; Gafoor, Zakir; Zhang, Jingyang; Richardson, Barbra A; Chirenje, Zvavahera M; Marrazzo, Jeanne M; Baeten, Jared M

2016-07-01

To develop and validate an HIV risk assessment tool to predict HIV acquisition among African women. Data were analyzed from 3 randomized trials of biomedical HIV prevention interventions among African women (VOICE, HPTN 035, and FEM-PrEP). We implemented standard methods for the development of clinical prediction rules to generate a risk-scoring tool to predict HIV acquisition over the course of 1 year. Performance of the score was assessed through internal and external validations. The final risk score resulting from multivariable modeling included age, married/living with a partner, partner provides financial or material support, partner has other partners, alcohol use, detection of a curable sexually transmitted infection, and herpes simplex virus 2 serostatus. Point values for each factor ranged from 0 to 2, with a maximum possible total score of 11. Scores ≥5 were associated with HIV incidence >5 per 100 person-years and identified 91% of incident HIV infections from among only 64% of women. The area under the curve (AUC) for predictive ability of the score was 0.71 (95% confidence interval [CI]: 0.68 to 0.74), indicating good predictive ability. Risk score performance was generally similar with internal cross-validation (AUC = 0.69; 95% CI: 0.66 to 0.73) and external validation in HPTN 035 (AUC = 0.70; 95% CI: 0.65 to 0.75) and FEM-PrEP (AUC = 0.58; 95% CI: 0.51 to 0.65). A discrete set of characteristics that can be easily assessed in clinical and research settings was predictive of HIV acquisition over 1 year. The use of a validated risk score could improve efficiency of recruitment into HIV prevention research and inform scale-up of HIV prevention strategies in women at highest risk.
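The AUC reported for the score can be read as the probability that a randomly chosen woman who acquired HIV has a higher score than a randomly chosen woman who did not, with ties counted as one half. A minimal sketch of that calculation, using a hypothetical function name and made-up scores rather than trial data:

```python
def auc_from_scores(case_scores, noncase_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (case, non-case) pairs in which the case has
    the higher score; tied pairs contribute 0.5.
    """
    wins = 0.0
    for c in case_scores:
        for n in noncase_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


if __name__ == "__main__":
    # Hypothetical risk scores on the 0-11 scale described above.
    cases = [6, 7, 5, 8, 4]
    noncases = [3, 5, 2, 6, 4, 1]
    print(auc_from_scores(cases, noncases))
```

The pairwise loop is quadratic in sample size; for large cohorts a rank-based formulation gives the same value more cheaply.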
Evaluating the Predictive Validity of the Computerized Comprehension Task: Comprehension Predicts Production

PubMed Central

Friend, Margaret; Schmitt, Sara A.; Simpson, Adrianne M.

2017-01-01

Until recently, the challenges inherent in measuring comprehension have impeded our ability to predict the course of language acquisition. The present research reports on a longitudinal assessment of the convergent and predictive validity of the CDI: Words and Gestures and the Computerized Comprehension Task (CCT). The CDI: WG and the CCT evinced good convergent validity; however, the CCT better predicted subsequent parent reports of language production. Language sample data in the third year confirm this finding: the CCT accounted for 24% of the variance in unique word use. These studies provide evidence for the utility of a behavior-based approach to predicting the course of language acquisition into production. PMID:21928878
Independent external validation of predictive models for urinary dysfunction following external beam radiotherapy of the prostate: Issues in model development and reporting.

PubMed

Yahya, Noorazrul; Ebert, Martin A; Bulsara, Max; Kennedy, Angel; Joseph, David J; Denham, James W

2016-08-01

Most predictive models are not sufficiently validated for prospective use. We performed independent external validation of published predictive models for urinary dysfunctions following radiotherapy of the prostate. Multivariable models developed to predict atomised and generalised urinary symptoms, both acute and late, were considered for validation using a dataset representing 754 participants from the TROG 03.04-RADAR trial. Endpoints and features were harmonised to match the predictive models. The overall performance, calibration and discrimination were assessed. 14 models from four publications were validated. The discrimination of the predictive models in an independent external validation cohort, measured using the area under the receiver operating characteristic (ROC) curve, ranged from 0.473 to 0.695, generally lower than in internal validation. 4 models had ROC >0.6. Shrinkage was required for all predictive models' coefficients ranging from -0.309 (prediction probability was inverse to observed proportion) to 0.823. Predictive models which include baseline symptoms as a feature produced the highest discrimination. Two models produced a predicted probability of 0 and 1 for all patients. Predictive models vary in performance and transferability illustrating the need for improvements in model development and reporting. Several models showed reasonable potential but efforts should be increased to improve performance. Baseline symptoms should always be considered as potential features for predictive models. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Genomic selection across multiple breeding cycles in applied bread wheat breeding.

PubMed

Michel, Sebastian; Ametz, Christian; Gungor, Huseyin; Epure, Doru; Grausgruber, Heinrich; Löschenberger, Franziska; Buerstmayr, Hermann

2016-06-01

We evaluated genomic selection across five breeding cycles of bread wheat breeding. Bias of within-cycle cross-validation and methods for improving the prediction accuracy were assessed. The prospect of genomic selection has been frequently shown by cross-validation studies using the same genetic material across multiple environments, but studies investigating genomic selection across multiple breeding cycles in applied bread wheat breeding are lacking. We estimated the prediction accuracy of grain yield, protein content and protein yield of 659 inbred lines across five independent breeding cycles and assessed the bias of within-cycle cross-validation. We investigated the influence of outliers on the prediction accuracy and predicted protein yield by its components traits. A high average heritability was estimated for protein content, followed by grain yield and protein yield. The bias of the prediction accuracy using populations from individual cycles using fivefold cross-validation was accordingly substantial for protein yield (17-712 %) and less pronounced for protein content (8-86 %). Cross-validation using the cycles as folds aimed to avoid this bias and reached a maximum prediction accuracy of [Formula: see text] = 0.51 for protein content, [Formula: see text] = 0.38 for grain yield and [Formula: see text] = 0.16 for protein yield. Dropping outlier cycles increased the prediction accuracy of grain yield to [Formula: see text] = 0.41 as estimated by cross-validation, while dropping outlier environments did not have a significant effect on the prediction accuracy. Independent validation suggests, on the other hand, that careful consideration is necessary before an outlier correction is undertaken, which removes lines from the training population. Predicting protein yield by multiplying genomic estimated breeding values of grain yield and protein content raised the prediction accuracy to [Formula: see text] = 0.19 for this derived trait.
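The contrast drawn here between random fivefold cross-validation within a cycle and cross-validation "using the cycles as folds" comes down to how the lines are partitioned. The sketch below shows only the partitioning step for the cycle-as-fold scheme; the function name and the cycle labels are hypothetical, and no genomic prediction model is implied.

```python
from collections import defaultdict


def leave_one_cycle_out(cycle_labels):
    """Yield (held_out_cycle, train_indices, test_indices).

    Each breeding cycle serves as the test fold exactly once, so no
    line from the held-out cycle is ever seen during training.
    """
    by_cycle = defaultdict(list)
    for index, cycle in enumerate(cycle_labels):
        by_cycle[cycle].append(index)
    for cycle in sorted(by_cycle):
        test = by_cycle[cycle]
        train = [i for i, c in enumerate(cycle_labels) if c != cycle]
        yield cycle, train, test


if __name__ == "__main__":
    # Hypothetical cycle label for each inbred line.
    labels = [2010, 2010, 2011, 2012, 2011, 2012, 2010]
    for cycle, train, test in leave_one_cycle_out(labels):
        print(cycle, train, test)
```

Holding out whole cycles keeps related lines from the same cycle out of the training set, which is the source of the optimistic bias the abstract attributes to within-cycle cross-validation.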
Validation of a Computerized Cognitive Assessment System for Persons with Stroke: A Pilot Study

ERIC Educational Resources Information Center

Yip, Chi Kwong; Man, David W. K.

2009-01-01

This study investigates the validity of a newly developed computerized cognitive assessment system (CCAS) that is equipped with rich multimedia to generate simulated testing situations and considers both test item difficulty and the test taker's ability. It is also hypothesized that better predictive validity of the CCAS in self-care of persons…
Screening for potential child maltreatment in parents of a newborn baby: The predictive validity of an Instrument for early identification of Parents At Risk for child Abuse and Neglect (IPARAN).

PubMed

van der Put, Claudia E; Bouwmeester-Landweer, Merian B R; Landsmeer-Beker, Eleonore A; Wit, Jan M; Dekker, Friedo W; Kousemaker, N Pieter J; Baartman, Herman E M

2017-08-01

For preventive purposes it is important to be able to identify families with a high risk of child maltreatment at an early stage. Therefore we developed an actuarial instrument for screening families with a newborn baby, the Instrument for identification of Parents At Risk for child Abuse and Neglect (IPARAN). The aim of this study was to assess the predictive validity of the IPARAN and to examine whether combining actuarial and clinical methods leads to an improvement of the predictive validity. We examined the predictive validity by calculating several performance indicators (i.e., sensitivity, specificity and the Area Under the receiver operating characteristic Curve [AUC]) in a sample of 4692 Dutch families with newborns. The outcome measure was a report of child maltreatment at Child Protection Services during a follow-up of 3 years. For 17 children (.4%) a report of maltreatment was registered. The predictive validity of the IPARAN was significantly better than chance (AUC=.700, 95% CI [.567-.832]), in contrast to a low value for clinical judgement of nurses of the Youth Health Care Centers (AUC=.591, 95% CI [.422-.759]). The combination of the IPARAN and clinical judgement resulted in the highest predictive validity (AUC=.720, 95% CI [.593-.847]), however, the difference between the methods did not reach statistical significance. The good predictive validity of the IPARAN in combination with clinical judgment of the nurse enables professionals to assess risks at an early stage and to make referrals to early intervention programs. Copyright © 2017 Elsevier Ltd. All rights reserved.
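Sensitivity and specificity, two of the performance indicators listed in this abstract, depend on the cut-off applied to the screening score. A minimal sketch follows; the function name, the scores, and the cut-off are hypothetical, and only the 17-in-4692 base rate is taken from the abstract.

```python
def sensitivity_specificity(scores, outcomes, cutoff):
    """Return (sensitivity, specificity) for the rule `score >= cutoff`.

    `outcomes` holds 1 for a registered maltreatment report and 0 otherwise.
    """
    tp = sum(1 for s, o in zip(scores, outcomes) if o and s >= cutoff)
    fn = sum(1 for s, o in zip(scores, outcomes) if o and s < cutoff)
    tn = sum(1 for s, o in zip(scores, outcomes) if not o and s < cutoff)
    fp = sum(1 for s, o in zip(scores, outcomes) if not o and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Base rate reported in the abstract: 17 reports among 4692 families.
    print(round(100 * 17 / 4692, 2), "percent")

    # Hypothetical screening scores and outcomes.
    scores = [1, 2, 3, 4, 5, 6]
    outcomes = [0, 0, 0, 1, 0, 1]
    print(sensitivity_specificity(scores, outcomes, cutoff=4))
```

With only 17 events, each indicator rests on very few cases, which is consistent with the wide confidence intervals quoted for the AUC values.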
Independent data validation of an in vitro method for ...

EPA Pesticide Factsheets

In vitro bioaccessibility assays (IVBA) estimate arsenic (As) relative bioavailability (RBA) in contaminated soils to improve the accuracy of site-specific human exposure assessments and risk calculations. For an IVBA assay to gain acceptance for use in risk assessment, it must be shown to reliably predict in vivo RBA that is determined in an established animal model. Previous studies correlating soil As IVBA with RBA have been limited by the use of few soil types as the source of As. Furthermore, the predictive value of As IVBA assays has not been validated using an independent set of As-contaminated soils. Therefore, the current study was undertaken to develop a robust linear model to predict As RBA in mice using an IVBA assay and to independently validate the predictive capability of this assay using a unique set of As-contaminated soils. Thirty-six As-contaminated soils varying in soil type, As contaminant source, and As concentration were included in this study, with 27 soils used for initial model development and nine soils used for independent model validation. The initial model reliably predicted As RBA values in the independent data set, with a mean As RBA prediction error of 5.3% (range 2.4 to 8.4%). Following validation, all 36 soils were used for final model development, resulting in a linear model with the equation: RBA = 0.59 * IVBA + 9.8 and R² of 0.78. The in vivo-in vitro correlation and independent data validation presented here provide
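The final linear model quoted in this record can be applied directly. The sketch below assumes that both IVBA and RBA are expressed in percent, which the record does not state explicitly; the function name and the example input are hypothetical.

```python
def predict_rba(ivba_percent):
    """Predict arsenic relative bioavailability from an IVBA result.

    Uses the equation quoted in the record: RBA = 0.59 * IVBA + 9.8.
    Assumption: inputs and outputs are percentages.
    """
    return 0.59 * ivba_percent + 9.8


if __name__ == "__main__":
    # Hypothetical soil with an IVBA result of 50%.
    print(predict_rba(50.0))
```

Because the reported R² is 0.78, a point prediction from this line carries appreciable scatter and is best read alongside the prediction error reported for the independent soils.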
Validation of Accelerometer Prediction Equations in Children with Chronic Disease.

PubMed

Stephens, Samantha; Takken, Tim; Esliger, Dale W; Pullenayegum, Eleanor; Beyene, Joseph; Tremblay, Mark; Schneiderman, Jane; Biggar, Doug; Longmuir, Pat; McCrindle, Brian; Abad, Audrey; Ignas, Dan; Van Der Net, Janjaap; Feldman, Brian

2016-02-01

The purpose of this study was to assess the criterion validity of existing accelerometer-based energy expenditure (EE) prediction equations among children with chronic conditions, and to develop new prediction equations. Children with congenital heart disease (CHD), cystic fibrosis (CF), dermatomyositis (JDM), juvenile arthritis (JA), inherited muscle disease (IMD), and hemophilia (HE) completed 7 tasks while EE was measured using indirect calorimetry with counts determined by accelerometer. Agreement between predicted EE and measured EE was assessed. Disease-specific equations and cut points were developed and cross-validated. In total, 196 subjects participated. One participant dropped out before testing due to time constraints, while 15 CHD, 32 CF, 31 JDM, 31 JA, 30 IMD, 28 HE, and 29 healthy controls completed the study. Agreement between predicted and measured EE varied across disease group and ranged from (ICC) .13-.46. Disease-specific prediction equations exhibited a range of results (ICC .62-.88) (SE 0.45-0.78). In conclusion, poor agreement was demonstrated using current prediction equations in children with chronic conditions. Disease-specific equations and cut points were developed.
External validation of a 5-year survival prediction model after elective abdominal aortic aneurysm repair.

PubMed

DeMartino, Randall R; Huang, Ying; Mandrekar, Jay; Goodney, Philip P; Oderich, Gustavo S; Kalra, Manju; Bower, Thomas C; Cronenwett, Jack L; Gloviczki, Peter

2018-01-01

The benefit of prophylactic repair of abdominal aortic aneurysms (AAAs) is based on the risk of rupture exceeding the risk of death from other comorbidities. The purpose of this study was to validate a 5-year survival prediction model for patients undergoing elective repair of asymptomatic AAA <6.5 cm to assist in optimal selection of patients. All patients undergoing elective repair for asymptomatic AAA <6.5 cm (open or endovascular) from 2002 to 2011 were identified from a single institutional database (validation group). We assessed the ability of a prior published Vascular Study Group of New England (VSGNE) model (derivation group) to predict survival in our cohort. The model was assessed for discrimination (concordance index), calibration (calibration slope and calibration in the large), and goodness of fit (score test). The VSGNE derivation group consisted of 2367 patients (70% endovascular). Major factors associated with survival in the derivation group were age, coronary disease, chronic obstructive pulmonary disease, renal function, and antiplatelet and statin medication use. Our validation group consisted of 1038 patients (59% endovascular). The validation group was slightly older (74 vs 72 years; P < .01) and had a higher proportion of men (76% vs 68%; P < .01). In addition, the derivation group had higher rates of advanced cardiac disease, chronic obstructive pulmonary disease, and baseline creatinine concentration (1.2 vs 1.1 mg/dL; P < .01). Despite slight differences in preoperative patient factors, 5-year survival was similar between validation and derivation groups (75% vs 77%; P = .33). The concordance index was identical between derivation and validation groups at 0.659 (95% confidence interval, 0.63-0.69). In the validation group, the calibration in the large value was 1.02 (P = .62, closer to 1 indicating better calibration), the calibration slope was 0.84 (95% confidence interval, 0.71-0.97), and the score test gave P = .57 (>.05 indicating goodness of fit). Across different populations of patients, assessment of age and level of cardiac, pulmonary, and renal disease can accurately predict 5-year survival in patients with AAA <6.5 cm undergoing repair. This risk prediction model is a valid method to assess mortality risk in determining potential overall survival benefit from elective AAA repair. Copyright © 2017 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.
Prediction models for intracranial hemorrhage or major bleeding in patients on antiplatelet therapy: a systematic review and external validation study.

PubMed

Hilkens, N A; Algra, A; Greving, J P

2016-01-01

ESSENTIALS: Prediction models may help to identify patients at high risk of bleeding on antiplatelet therapy. We identified existing prediction models for bleeding and validated them in patients with cerebral ischemia. Five prediction models were identified, all of which had some methodological shortcomings. Performance in patients with cerebral ischemia was poor. Background Antiplatelet therapy is widely used in secondary prevention after a transient ischemic attack (TIA) or ischemic stroke. Bleeding is the main adverse effect of antiplatelet therapy and is potentially life threatening. Identification of patients at increased risk of bleeding may help target antiplatelet therapy. This study sought to identify existing prediction models for intracranial hemorrhage or major bleeding in patients on antiplatelet therapy and evaluate their performance in patients with cerebral ischemia. We systematically searched PubMed and Embase for existing prediction models up to December 2014. The methodological quality of the included studies was assessed with the CHARMS checklist. Prediction models were externally validated in the European Stroke Prevention Study 2, comprising 6602 patients with a TIA or ischemic stroke. We assessed discrimination and calibration of included prediction models. Five prediction models were identified, of which two were developed in patients with previous cerebral ischemia. Three studies assessed major bleeding, one studied intracerebral hemorrhage and one gastrointestinal bleeding. None of the studies met all criteria of good quality. External validation showed poor discriminative performance, with c-statistics ranging from 0.53 to 0.64 and poor calibration. A limited number of prediction models is available that predict intracranial hemorrhage or major bleeding in patients on antiplatelet therapy. The methodological quality of the models varied, but was generally low. Predictive performance in patients with cerebral ischemia was poor. In order to reliably predict the risk of bleeding in patients with cerebral ischemia, development of a prediction model according to current methodological standards is needed. © 2015 International Society on Thrombosis and Haemostasis.
Is It Safe? Reliability and Validity of Structured versus Unstructured Child Safety Judgments

ERIC Educational Resources Information Center

Bartelink, Cora; de Kwaadsteniet, Leontien; ten Berge, Ingrid J.; Witteman, Cilia L. M.

2017-01-01

Background: The LIRIK, an instrument for the assessment of child safety and risk, is designed to improve assessments by guiding professionals through a structured evaluation of relevant signs, risk factors, and protective factors. Objective: We aimed to assess the interrater agreement and the predictive validity of professionals' judgments made…
Harnessing the power of personality assessment: subjective assessment predicts behaviour in horses.

PubMed

Ijichi, Carrie; Collins, Lisa M; Creighton, Emma; Elwood, Robert W

2013-06-01

Objective assessment of animal personality is typically time consuming, requiring the repeated measure of behavioural responses. By contrast, subjective assessment of personality allows information to be collected quickly by experienced caregivers. However, subjective assessment must predict behaviour to be valid. Comparisons of subjective assessments and behaviour have been made but often with methodological weaknesses and thus, limited success. Here we test the validity of a subjective assessment against a battery of behaviour tests in 146 horses (Equus caballus). Our first aim was to determine if subjective personality assessment could predict behaviour during behaviour testing. We made specific a priori predictions for how subjectively measured personality should relate to behaviour testing. We found that Extroversion predicted time to complete a handling test and refusal behaviour during this test. It also predicted minimum distance to a novel object. Neuroticism predicted how reactive an individual was to a sudden visual stimulus but not how quickly it recovered from this. Agreeableness did not predict any behaviour during testing. There were several unpredicted correlations between subjective measures and behaviour tests which we explore further. Our second aim was to combine data from the subjective assessment and behaviour tests to gain a more comprehensive understanding of personality. We found that the combination of methods provides new insights into horse behaviour. Furthermore, our data are consistent with the idea of horses showing different coping styles, a novel finding for this species. Copyright © 2013 Elsevier B.V. All rights reserved.
The Predictive Validity of Savry Ratings for Assessing Youth Offenders in Singapore

PubMed Central

Chu, Chi Meng; Goh, Mui Leng; Chong, Dominic

2015-01-01

Empirical support for the usage of the SAVRY has been reported in studies conducted in many Western contexts, but not in a Singaporean context. This study compared the predictive validity of the SAVRY ratings for violent and general recidivism against the Youth Level of Service/Case Management Inventory (YLS/CMI) ratings within the Singaporean context. Using a sample of 165 male young offenders (mean follow-up = 4.54 years), results showed that the SAVRY Total Score and Summary Risk Rating, as well as YLS/CMI Total Score and Overall Risk Rating, predicted violent and general recidivism. SAVRY Protective Total Score was only significantly predictive of desistance from general recidivism, and did not show incremental predictive validity for violent and general recidivism over the SAVRY Total Score. Overall, the results suggest that the SAVRY is suited (to varying degrees) for assessing the risk of violent and general recidivism in young offenders within the Singaporean context, but might not be better than the YLS/CMI. PMID:27231403

The Irvine, Beatties, and Bresnahan (IBB) Forelimb Recovery Scale: An Assessment of Reliability and Validity

PubMed Central

Irvine, Karen-Amanda; Ferguson, Adam R.; Mitchell, Kathleen D.; Beattie, Stephanie B.; Lin, Amity; Stuck, Ellen D.; Huie, J. Russell; Nielson, Jessica L.; Talbott, Jason F.; Inoue, Tomoo; Beattie, Michael S.; Bresnahan, Jacqueline C.

2014-01-01

The IBB scale is a recently developed forelimb scale for the assessment of fine control of the forelimb and digits after cervical spinal cord injury [SCI; (1)]. The present paper describes the assessment of inter-rater reliability and face, concurrent and construct validity of this scale following SCI. It demonstrates that the IBB is a reliable and valid scale that is sensitive to severity of SCI and to recovery over time. In addition, the IBB correlates with other outcome measures and is highly predictive of biological measures of tissue pathology. Multivariate analysis using principal component analysis (PCA) demonstrates that the IBB is highly predictive of the syndromic outcome after SCI (2), and is among the best predictors of bio-behavioral function, based on strong construct validity. Altogether, the data suggest that the IBB, especially in concert with other measures, is a reliable and valid tool for assessing neurological deficits in fine motor control of the distal forelimb, and represents a powerful addition to multivariate outcome batteries aimed at documenting recovery of function after cervical SCI in rats. PMID:25071704
Assessing Anger Expression: Construct Validity of Three Emotion Expression-Related Measures

PubMed Central

Jasinski, Matthew J.; Lumley, Mark A.; Latsch, Deborah V.; Schuster, Erik; Kinner, Ellen; Burns, John W.

2016-01-01

Self-report measures of emotional expression are common, but their validity to predict objective emotional expression, particularly of anger, is unclear. We tested the validity of the Anger Expression Inventory (AEI; Spielberger et al., 1985), Emotional Approach Coping Scale (EAC; Stanton, Kirk, Cameron & Danoff-Burg, 2000), and Toronto Alexithymia Scale-20 (TAS-20; Bagby, Taylor, & Parker, 1994) to predict objective anger expression in 95 adults with chronic back pain. Participants attempted to solve a difficult computer maze by following the directions of a confederate who treated them rudely and unjustly. Participants then expressed their feelings for 4 minutes. Blinded raters coded the videos for anger expression, and a software program analyzed expression transcripts for anger-related words. Analyses related each questionnaire to anger expression. The AEI anger-out scale predicted greater anger expression, as expected, but AEI anger-in did not. The EAC emotional processing scale predicted less anger expression, but the EAC emotional expression scale was unrelated to anger expression. Finally, the TAS-20 predicted greater anger expression. Findings support the validity of the AEI anger-out scale but raise questions about the other measures. The assessment of emotional expression by self-report is complex and perhaps confounded by general emotional experience, the specificity or generality of the emotion(s) assessed, and self-awareness limitations. Performance-based or clinician-rated measures of emotion expression are needed. PMID:27248355
Predicting Reading Ability for Bilingual Latino Children Using Dynamic Assessment

ERIC Educational Resources Information Center

Petersen, Douglas B.; Gillam, Ronald B.

2015-01-01

This study investigated the predictive validity of a dynamic assessment designed to evaluate later risk for reading difficulty in bilingual Latino children at risk for language impairment. During kindergarten, 63 bilingual Latino children completed a dynamic assessment nonsense-word recoding task that yielded pretest to posttest gain scores,…
Analysis of model development strategies: predicting ventral hernia recurrence.

PubMed

Holihan, Julie L; Li, Linda T; Askenasy, Erik P; Greenberg, Jacob A; Keith, Jerrod N; Martindale, Robert G; Roth, J Scott; Liang, Mike K

2016-11-01

There have been many attempts to identify variables associated with ventral hernia recurrence; however, it is unclear which statistical modeling approach results in models with greatest internal and external validity. We aim to assess the predictive accuracy of models developed using five common variable selection strategies to determine variables associated with hernia recurrence. Two multicenter ventral hernia databases were used. Database 1 was randomly split into "development" and "internal validation" cohorts. Database 2 was designated "external validation". The dependent variable for model development was hernia recurrence. Five variable selection strategies were used: (1) "clinical"-variables considered clinically relevant, (2) "selective stepwise"-all variables with a P value <0.20 were assessed in a step-backward model, (3) "liberal stepwise"-all variables were included and step-backward regression was performed, (4) "restrictive internal resampling," and (5) "liberal internal resampling." Variables were included with P < 0.05 for the Restrictive model and P < 0.10 for the Liberal model. A time-to-event analysis using Cox regression was performed using these strategies. The predictive accuracy of the developed models was tested on the internal and external validation cohorts using Harrell's C-statistic where C > 0.70 was considered "reasonable". The recurrence rate was 32.9% (n = 173/526; median/range follow-up, 20/1-58 mo) for the development cohort, 36.0% (n = 95/264, median/range follow-up 20/1-61 mo) for the internal validation cohort, and 12.7% (n = 155/1224, median/range follow-up 9/1-50 mo) for the external validation cohort. Internal validation demonstrated reasonable predictive accuracy (C-statistics = 0.772, 0.760, 0.767, 0.757, 0.763), while on external validation, predictive accuracy dipped precipitously (C-statistic = 0.561, 0.557, 0.562, 0.553, 0.560). Predictive accuracy was equally adequate on internal validation among models; however, on external validation, all five models failed to demonstrate utility. Future studies should report multiple variable selection techniques and demonstrate predictive accuracy on external data sets for model validation. Copyright © 2016 Elsevier Inc. All rights reserved.
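Harrell's C-statistic, the accuracy measure used in this abstract, extends the AUC idea to time-to-event data with censoring. The sketch below is a textbook-style version under simplifying assumptions (tied event times are not treated specially); the function name and the example data are hypothetical and not drawn from the hernia databases.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] == 1) strictly before time j. The pair is concordant when
    the subject who failed earlier also has the higher risk score; pairs
    with equal risk scores count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical follow-up in months, recurrence flags, and model risk scores.
    times = [6, 12, 20, 24, 36]
    events = [1, 1, 0, 1, 0]
    risks = [0.9, 0.4, 0.5, 0.6, 0.1]
    print(harrell_c(times, events, risks))
```

A value of 0.5 corresponds to chance-level ranking, which puts the external validation results of roughly 0.55 to 0.56 only slightly above an uninformative model.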
Violence risk prediction. Clinical and actuarial measures and the role of the Psychopathy Checklist.

PubMed

Dolan, M; Doyle, M

2000-10-01

Violence risk prediction is a priority issue for clinicians working with mentally disordered offenders. To review the current status of violence risk prediction research. Literature search (Medline). Key words: violence, risk prediction, mental disorder. Systematic/structured risk assessment approaches may enhance the accuracy of clinical prediction of violent outcomes. Data on the predictive validity of available clinical risk assessment tools are based largely on American and North American studies and further validation is required in British samples. The Psychopathy Checklist appears to be a key predictor of violent recidivism in a variety of settings. Violence risk prediction is an inexact science and as such will continue to provoke debate. Clinicians clearly need to be able to demonstrate the rationale behind their decisions on violence risk and much can be learned from recent developments in research on violence risk prediction.
A Systematic Review of the Reliability and Validity of Behavioural Tests Used to Assess Behavioural Characteristics Important in Working Dogs.

PubMed

Brady, Karen; Cracknell, Nina; Zulch, Helen; Mills, Daniel Simon

2018-01-01

Working dogs are selected based on predictions from tests that they will be able to perform specific tasks in often challenging environments. However, withdrawal from service in working dogs is still a big problem, bringing into question the reliability of the selection tests used to make these predictions. A systematic review was undertaken aimed at bringing together available information on the reliability and predictive validity of the assessment of behavioural characteristics used with working dogs to establish the quality of selection tests currently available for use to predict success in working dogs. The search procedures resulted in 16 papers meeting the criteria for inclusion. A large range of behaviour tests and parameters were used in the identified papers, and so behaviour tests and their underpinning constructs were grouped on the basis of their relationship with positive core affect (willingness to work, human-directed social behaviour, object-directed play tendencies) and negative core affect (human-directed aggression, approach withdrawal tendencies, sensitivity to aversives). We then examined the papers for reports of inter-rater reliability, within-session intra-rater reliability, test-retest validity and predictive validity. The review revealed a widespread lack of information relating to the reliability and validity of measures to assess behaviour and inconsistencies in terminologies, study parameters and indices of success. There is a need to standardise the reporting of these aspects of behavioural tests in order to improve the knowledge base of what characteristics are predictive of optimal performance in working dog roles, improving selection processes and reducing working dog redundancy. We suggest the use of a framework based on explaining the direct or indirect relationship of the test with core affect.
Driving and Low Vision: Validity of Assessments for Predicting Performance of Drivers

ERIC Educational Resources Information Center

Strong, J. Graham; Jutai, Jeffrey W.; Russell-Minda, Elizabeth; Evans, Mal

2008-01-01

The authors conducted a systematic review to examine whether vision-related assessments can predict the driving performance of individuals who have low vision. The results indicate that measures of visual field, contrast sensitivity, cognitive and attention-based tests, and driver screening tools have variable utility for predicting real-world…
The Influence of Age and Sexual Drive on the Predictive Validity of the Juvenile Sex Offender Assessment Protocol-Revised.

PubMed

Wijetunga, Charity; Martinez, Ricardo; Rosenfeld, Barry; Cruise, Keith

2018-01-01

The Juvenile Sex Offender Assessment Protocol-Revised (J-SOAP-II) is the most commonly used measure in the assessment of recidivism risk among juveniles who have committed sexual offenses (JSOs), but mixed support exists for its predictive validity. This study compared the predictive validity of the J-SOAP-II across two offender characteristics, age and sexual drive, in a sample of 156 JSOs who had been discharged from a correctional facility or a residential treatment program. The J-SOAP-II appeared to be a better predictor of sexual recidivism for younger JSOs (14-16 years old) than for older ones (17-19 years old), with significant differences found for the Dynamic Summary Scale and Scale III (Intervention). In addition, several of the measure's scales significantly predicted sexual recidivism for JSOs with a clear pattern of sexualized behavior but not for those without such a pattern, indicating that the J-SOAP-II may have greater clinical utility for JSOs with heightened sexual drive. The implications of these findings are discussed.
Assessing performance and validating finite element simulations using probabilistic knowledge

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dolin, Ronald M.; Rodriguez, E. A.

Two probabilistic approaches for assessing performance are presented. The first approach assesses probability of failure by simultaneously modeling all likely events. The probability each event causes failure along with the event's likelihood of occurrence contribute to the overall probability of failure. The second assessment method is based on stochastic sampling using an influence diagram. Latin-hypercube sampling is used to stochastically assess events. The overall probability of failure is taken as the maximum probability of failure of all the events. The Likelihood of Occurrence simulation suggests failure does not occur while the Stochastic Sampling approach predicts failure. The Likelihood of Occurrence results are used to validate finite element predictions.
Evaluating the predictive accuracy and the clinical benefit of a nomogram aimed to predict survival in node-positive prostate cancer patients: External validation on a multi-institutional database.

PubMed

Bianchi, Lorenzo; Schiavina, Riccardo; Borghesi, Marco; Bianchi, Federico Mineo; Briganti, Alberto; Carini, Marco; Terrone, Carlo; Mottrie, Alex; Gacci, Mauro; Gontero, Paolo; Imbimbo, Ciro; Marchioro, Giansilvio; Milanese, Giulio; Mirone, Vincenzo; Montorsi, Francesco; Morgia, Giuseppe; Novara, Giacomo; Porreca, Angelo; Volpe, Alessandro; Brunocilla, Eugenio

2018-04-06

To assess the predictive accuracy and the clinical value of a recent nomogram predicting cancer-specific mortality-free survival after surgery in pN1 prostate cancer patients through an external validation. We evaluated 518 prostate cancer patients treated with radical prostatectomy and pelvic lymph node dissection with evidence of nodal metastases at final pathology, at 10 tertiary centers. External validation was carried out using regression coefficients of the previously published nomogram. The performance characteristics of the model were assessed by quantifying predictive accuracy, according to the area under the curve in the receiver operating characteristic curve and model calibration. Furthermore, we systematically analyzed the specificity, sensitivity, positive predictive value and negative predictive value for each nomogram-derived probability cut-off. Finally, we implemented decision curve analysis, in order to quantify the nomogram's clinical value in routine practice. External validation showed inferior predictive accuracy as referred to in the internal validation (65.8% vs 83.3%, respectively). The discrimination (area under the curve) of the multivariable model was 66.7% (95% CI 60.1-73.0%) by testing with receiver operating characteristic curve analysis. The calibration plot showed an overestimation throughout the range of predicted cancer-specific mortality-free survival rates probabilities. However, in decision curve analysis, the nomogram's use showed a net benefit when compared with the scenarios of treating all patients or none. In an external setting, the nomogram showed inferior predictive accuracy and suboptimal calibration characteristics as compared to that reported in the original population. However, decision curve analysis showed a clinical net benefit, suggesting a clinical implication to correctly manage pN1 prostate cancer patients after surgery. © 2018 The Japanese Urological Association.
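The net-benefit quantity behind decision curve analysis can be illustrated with a minimal sketch. It assumes a simplified binary outcome with no censoring, which is not the survival setting of the study above, and the function names `net_benefit` and `net_benefit_treat_all` are hypothetical. Net benefit at a threshold probability pt is TP/n - (FP/n) * pt/(1 - pt), compared against the "treat all" and "treat none" (net benefit 0) strategies.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    y_true: list of 0/1 outcomes; y_prob: predicted probabilities.
    Assumes a binary outcome without censoring (illustrative only).
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # Odds of the threshold: the harm of one false positive
    # relative to the benefit of one true positive.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data, invented for illustration.
outcomes = [1, 1, 0, 0, 1, 0, 0, 0]
risks = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.8]
model_nb = net_benefit(outcomes, risks, 0.2)
all_nb = net_benefit_treat_all(outcomes, 0.2)
```

A model shows clinical net benefit at a given threshold when its value exceeds both the treat-all value and zero; a decision curve repeats this comparison over a range of thresholds.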
Assessment of generalizability, applicability and predictability (GAP) for evaluating external validity in studies of universal family-based prevention of alcohol misuse in young people: systematic methodological review of randomized controlled trials.

PubMed

Fernandez-Hermida, Jose Ramon; Calafat, Amador; Becoña, Elisardo; Tsertsvadze, Alexander; Foxcroft, David R

2012-09-01

To assess external validity characteristics of studies from two Cochrane Systematic Reviews of the effectiveness of universal family-based prevention of alcohol misuse in young people. Two reviewers used an a priori developed external validity rating form and independently assessed three external validity dimensions of generalizability, applicability and predictability (GAP) in randomized controlled trials. The majority (69%) of the included 29 studies were rated 'unclear' on the reporting of sufficient information for judging generalizability from sample to study population. Ten studies (35%) were rated 'unclear' on the reporting of sufficient information for judging applicability to other populations and settings. No study provided an assessment of the validity of the trial end-point measures for subsequent mortality, morbidity, quality of life or other economic or social outcomes. Similarly, no study reported on the validity of surrogate measures using established criteria for assessing surrogate end-points. Studies evaluating the benefits of family-based prevention of alcohol misuse in young people are generally inadequate at reporting information relevant to generalizability of the findings or implications for health or social outcomes. Researchers, study authors, peer reviewers, journal editors and scientific societies should take steps to improve the reporting of information relevant to external validity in prevention trials. © 2012 The Authors. Addiction © 2012 Society for the Study of Addiction.
The Validity of College Grade Prediction Equations Over Time.

ERIC Educational Resources Information Center

Sawyer, Richard L.; Maxey, James

A sample of 260 colleges was surveyed during the years 1972-1976 to determine the validity of predicting college freshmen grades from standardized test scores and high school grades using the American College Testing (ACT) Assessment Program, an evaluative and placement service for students and educators involved in the transition from high school…
Evaluating the Complementary Roles of an SJT and Academic Assessment for Entry into Clinical Practice

ERIC Educational Resources Information Center

Cousans, Fran; Patterson, Fiona; Edwards, Helena; Walker, Kim; McLachlan, John C.; Good, David

2017-01-01

Although there is extensive evidence confirming the predictive validity of situational judgement tests (SJTs) in medical education, there remains a shortage of evidence for their predictive validity for performance of postgraduate trainees in their first role in clinical practice. Moreover, to date few researchers have empirically examined the…
The Predictive Validity of CBM Writing Indices for Eighth-Grade Students

ERIC Educational Resources Information Center

Amato, Janelle M.; Watkins, Marley W.

2011-01-01

Curriculum-based measurement (CBM) is an alternative to traditional assessment techniques. Technical work has begun to identify CBM writing indices that are psychometrically sound for monitoring older students' writing proficiency. This study examined the predictive validity of CBM writing indices in a sample of 447 eighth-grade students.…
Predictive Validity and Accuracy of Oral Reading Fluency for English Learners

ERIC Educational Resources Information Center

Vanderwood, Michael L.; Tung, Catherine Y.; Checca, C. Jason

2014-01-01

The predictive validity and accuracy of an oral reading fluency (ORF) measure for a statewide assessment in English language arts was examined for second-grade native English speakers (NESs) and English learners (ELs) with varying levels of English proficiency. In addition to comparing ELs with native English speakers, the impact of English…
An evidence-based decision assistance model for predicting training outcome in juvenile guide dogs

PubMed Central

Craigon, Peter J.; Blythe, Simon A.; England, Gary C. W.; Asher, Lucy

2017-01-01

Working dog organisations, such as Guide Dogs, need to regularly assess the behaviour of the dogs they train. In this study we developed a questionnaire-style behaviour assessment completed by training supervisors of juvenile guide dogs aged 5, 8 and 12 months old (n = 1,401), and evaluated aspects of its reliability and validity. Specifically, internal reliability, temporal consistency, construct validity, predictive criterion validity (comparing against later training outcome) and concurrent criterion validity (comparing against a standardised behaviour test) were evaluated. Thirty-nine questions were sourced either from previously published literature or created to meet requirements identified via Guide Dogs staff surveys and staff feedback. Internal reliability analyses revealed seven reliable and interpretable trait scales named according to the questions within them as: Adaptability; Body Sensitivity; Distractibility; Excitability; General Anxiety; Trainability and Stair Anxiety. Intra-individual temporal consistency of the scale scores between 5–8, 8–12 and 5–12 months was high. All scales excepting Body Sensitivity showed some degree of concurrent criterion validity. Predictive criterion validity was supported for all seven scales, since associations were found with training outcome at at least one age. Thresholds of z-scores on the scales were identified that were able to distinguish later training outcome by identifying 8.4% of all dogs withdrawn for behaviour and 8.5% of all qualified dogs, with 84% and 85% specificity. The questionnaire assessment was reliable and could detect traits that are consistent within individuals over time, despite juvenile dogs undergoing development during the study period. By applying thresholds to scores produced from the questionnaire this assessment could prove to be a highly valuable decision-making tool for Guide Dogs. This is the first questionnaire-style assessment of juvenile dogs that has shown value in predicting the training outcome of individual working dogs. PMID:28614347
Validating a measure to assess factors that affect assistive technology use by students with disabilities in elementary and secondary education.

PubMed

Zapf, Susan A; Scherer, Marcia J; Baxter, Mary F; H Rintala, Diana

2016-01-01

The purpose of this study was to measure the predictive validity, internal consistency and clinical utility of the Matching Assistive Technology to Child & Augmentative Communication Evaluation Simplified (MATCH-ACES) assessment. Twenty-three assistive technology team evaluators assessed 35 children using the MATCH-ACES assessment. This quasi-experimental study examined the internal consistency, predictive validity and clinical utility of the MATCH-ACES assessment. The MATCH-ACES assessment predisposition scales had good internal consistency across all three scales. A significant relationship was found between (a) high student perseverance and need for assistive technology and (b) high teacher comfort and interest in technology use (p = 0.002). Study results indicate that the MATCH-ACES assessment has good internal consistency and validity. Predisposition characteristics of student and teacher combined can influence the level of assistive technology use; therefore, assistive technology teams should assess predisposition factors of the user when recommending assistive technology. Implications for Rehabilitation: Educational and medical professionals should be educated on evidence-based assistive technology assessments. Personal experience and psychosocial factors can influence the outcome use of assistive technology. Assistive technology assessments must include an intervention plan for assistive technology service delivery to measure effective outcome use.
Development and external validation of a prediction rule for an unfavorable course of late-life depression: A multicenter cohort study.

PubMed

Maarsingh, O R; Heymans, M W; Verhaak, P F; Penninx, B W J H; Comijs, H C

2018-08-01

Given the poor prognosis of late-life depression, it is crucial to identify those at risk. Our objective was to construct and validate a prediction rule for an unfavourable course of late-life depression. For development and internal validation of the model, we used The Netherlands Study of Depression in Older Persons (NESDO) data. We included participants with a major depressive disorder (MDD) at baseline (n = 270; 60-90 years), assessed with the Composite International Diagnostic Interview (CIDI). For external validation of the model, we used The Netherlands Study of Depression and Anxiety (NESDA) data (n = 197; 50-66 years). The outcome was MDD after 2 years of follow-up, assessed with the CIDI. Candidate predictors concerned sociodemographics, psychopathology, physical symptoms, medication, psychological determinants, and healthcare setting. Model performance was assessed by calculating calibration and discrimination. 111 subjects (41.1%) had MDD after 2 years of follow-up. Independent predictors of MDD after 2 years were (older) age, (early) onset of depression, severity of depression, anxiety symptoms, comorbid anxiety disorder, fatigue, and loneliness. The final model showed good calibration and reasonable discrimination (AUC of 0.75; 0.70 after external validation). The strongest individual predictor was severity of depression (AUC of 0.69; 0.68 after external validation). The model was developed and validated in The Netherlands, which could affect the cross-country generalizability. Based on rather simple clinical indicators, it is possible to predict the 2-year course of MDD. The prediction rule can be used for monitoring MDD patients and identifying those at risk of an unfavourable outcome. Copyright © 2018 Elsevier B.V. All rights reserved.
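Discrimination as summarised by the AUC has a direct pairwise interpretation: it is the probability that a randomly chosen subject with the outcome receives a higher predicted risk than a randomly chosen subject without it. The sketch below computes it that way on invented toy data; the function name `auc_pairwise` is hypothetical and the code is not the authors' analysis.

```python
def auc_pairwise(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (case, non-case) pairs, how often the case has the
    higher score; ties count as half. Equivalent to the Mann-Whitney
    U statistic divided by the number of pairs.
    """
    cases = [s for y, s in zip(y_true, scores) if y == 1]
    controls = [s for y, s in zip(y_true, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy data, invented for illustration: 2 cases, 2 non-cases.
example_auc = auc_pairwise([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. Calibration is a separate property and needs its own check, for example comparing mean predicted risk with the observed event rate.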
A novel QSAR model of Salmonella mutagenicity and its application in the safety assessment of drug impurities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Valencia, Antoni; Prous, Josep; Mora, Oscar

As indicated in ICH M7 draft guidance, in silico predictive tools including statistically-based QSARs and expert analysis may be used as a computational assessment for bacterial mutagenicity for the qualification of impurities in pharmaceuticals. To address this need, we developed and validated a QSAR model to predict Salmonella t. mutagenicity (Ames assay outcome) of pharmaceutical impurities using Prous Institute's Symmetry℠, a new in silico solution for drug discovery and toxicity screening, and the Mold2 molecular descriptor package (FDA/NCTR). Data was sourced from public benchmark databases with known Ames assay mutagenicity outcomes for 7300 chemicals (57% mutagens). Of these data, 90% was used to train the model and the remaining 10% was set aside as a holdout set for validation. The model's applicability to drug impurities was tested using a FDA/CDER database of 951 structures, of which 94% were found within the model's applicability domain. The predictive performance of the model is acceptable for supporting regulatory decision-making with 84 ± 1% sensitivity, 81 ± 1% specificity, 83 ± 1% concordance and 79 ± 1% negative predictivity based on internal cross-validation, while the holdout dataset yielded 83% sensitivity, 77% specificity, 80% concordance and 78% negative predictivity. Given the importance of having confidence in negative predictions, an additional external validation of the model was also carried out, using marketed drugs known to be Ames-negative, and obtained 98% coverage and 81% specificity. Additionally, Ames mutagenicity data from FDA/CFSAN was used to create another data set of 1535 chemicals for external validation of the model, yielding 98% coverage, 73% sensitivity, 86% specificity, 81% concordance and 84% negative predictivity.
Highlights:
• A new in silico QSAR model to predict Ames mutagenicity is described.
• The model is extensively validated with chemicals from the FDA and the public domain.
• Validation tests show desirable high sensitivity and high negative predictivity.
• The model predicted 14 reportedly difficult to predict drug impurities with accuracy.
• The model is suitable to support risk evaluation of potentially mutagenic compounds.
Comment on Hall et al. (2017), "How to Choose Between Measures of Tinnitus Loudness for Clinical Research? A Report on the Reliability and Validity of an Investigator-Administered Test and a Patient-Reported Measure Using Baseline Data Collected in a Phase IIa Drug Trial".

PubMed

Sabour, Siamak

2018-03-08

The purpose of this letter, in response to Hall, Mehta, and Fackrell (2017), is to provide important knowledge about methodology and statistical issues in assessing the reliability and validity of an audiologist-administered tinnitus loudness matching test and a patient-reported tinnitus loudness rating. The author uses reference textbooks and published articles regarding scientific assessment of the validity and reliability of a clinical test to discuss the statistical test and the methodological approach in assessing validity and reliability in clinical research. Depending on the type of the variable (qualitative or quantitative), well-known statistical tests can be applied to assess reliability and validity. The qualitative variables of sensitivity, specificity, positive predictive value, negative predictive value, false positive and false negative rates, likelihood ratio positive and likelihood ratio negative, as well as odds ratio (i.e., ratio of true to false results), are the most appropriate estimates to evaluate validity of a test compared to a gold standard. In the case of quantitative variables, depending on distribution of the variable, Pearson r or Spearman rho can be applied. Diagnostic accuracy (validity) and diagnostic precision (reliability or agreement) are two completely different methodological issues. Depending on the type of the variable (qualitative or quantitative), well-known statistical tests can be applied to assess validity.
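The validity estimates listed in this letter for a binary test against a gold standard all follow from the four cells of a 2x2 table. The sketch below is illustrative only: the function name `diagnostic_metrics` and the counts are assumptions, and it presumes every cell is non-zero so that no division by zero occurs.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Validity estimates for a binary test versus a gold standard.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives (all assumed non-zero).
    """
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                 # positive predictive value
        "npv": tn / (tn + fn),                 # negative predictive value
        "false_positive_rate": 1.0 - specificity,
        "false_negative_rate": 1.0 - sensitivity,
        "lr_positive": sensitivity / (1.0 - specificity),
        "lr_negative": (1.0 - sensitivity) / specificity,
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }


# Invented counts for illustration.
metrics = diagnostic_metrics(tp=40, fp=10, fn=20, tn=30)
```

Sensitivity and specificity are properties of the test itself, whereas the predictive values also depend on how common the condition is in the sample, which is why they are reported separately.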

Predicting functional outcomes among college drinkers: reliability and predictive validity of the Young Adult Alcohol Consequences Questionnaire.

PubMed

Read, Jennifer P; Merrill, Jennifer E; Kahler, Christopher W; Strong, David R

2007-11-01

Heavy drinking and associated consequences are widespread among U.S. college students. Recently, Read et al. (Read, J. P., Kahler, C. W., Strong, D., & Colder, C. R. (2006). Development and preliminary validation of the Young Adult Alcohol Consequences Questionnaire. Journal of Studies on Alcohol, 67, 169-178) developed the Young Adult Alcohol Consequences Questionnaire (YAACQ) to assess the broad range of consequences that may result from heavy drinking in the college milieu. In the present study, we sought to add to the psychometric validation of this measure by employing a prospective design to examine the test-retest reliability, concurrent validity, and predictive validity of the YAACQ. We also sought to examine the utility of the YAACQ administered early in the semester in the prediction of functional outcomes later in the semester, including the persistence of heavy drinking, and academic functioning. Ninety-two college students (48 females) completed a self-report assessment battery during the first weeks of the Fall semester, and approximately one week later. Additionally, 64 subjects (37 females) participated at an optional third time point at the end of the semester. Overall, the YAACQ demonstrated strong internal consistency, test-retest reliability, and concurrent and predictive validity. YAACQ scores also were predictive of both drinking frequency, and "binge" drinking frequency. YAACQ total scores at baseline were an early indicator of academic performance later in the semester, with greater number of total consequences experienced being negatively associated with end-of-semester grade point average. Specific YAACQ subscale scores (Impaired Control, Dependence Symptoms, Blackout Drinking) showed unique prediction of persistent drinking and academic outcomes.
Review and evaluation of performance measures for survival prediction models in external validation settings.

PubMed

Rahman, M Shafiqur; Ambler, Gareth; Choodari-Oskooei, Babak; Omar, Rumana Z

2017-04-18

When developing a prediction model for survival data it is essential to validate its performance in external validation settings using appropriate performance measures. Although a number of such measures have been proposed, there is only limited guidance regarding their use in the context of model validation. This paper reviewed and evaluated a wide range of performance measures to provide some guidelines for their use in practice. An extensive simulation study based on two clinical datasets was conducted to investigate the performance of the measures in external validation settings. Measures were selected from categories that assess the overall performance, discrimination and calibration of a survival prediction model. Some of these have been modified to allow their use with validation data, and a case study is provided to describe how these measures can be estimated in practice. The measures were evaluated with respect to their robustness to censoring and ease of interpretation. All measures are implemented, or are straightforward to implement, in statistical software. Most of the performance measures were reasonably robust to moderate levels of censoring. One exception was Harrell's concordance measure which tended to increase as censoring increased. We recommend that Uno's concordance measure is used to quantify concordance when there are moderate levels of censoring. Alternatively, Gönen and Heller's measure could be considered, especially if censoring is very high, but we suggest that the prediction model is re-calibrated first. We also recommend that Royston's D is routinely reported to assess discrimination since it has an appealing interpretation. The calibration slope is useful for both internal and external validation settings and recommended to report routinely. Our recommendation would be to use any of the predictive accuracy measures and provide the corresponding predictive accuracy curves. In addition, we recommend to investigate the characteristics of the validation data such as the level of censoring and the distribution of the prognostic index derived in the validation setting before choosing the performance measures.
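To make the role of censoring concrete, here is a minimal sketch of Harrell's concordance measure on invented data. The function name `harrell_c` is hypothetical, and pairs with tied survival times are simply skipped, which is a simplifying assumption rather than the paper's procedure. A pair is usable only when the subject with the shorter time had an observed event, so heavier censoring changes which pairs enter the estimate.

```python
def harrell_c(times, events, risk):
    """Harrell's concordance index for right-censored survival data.

    times: follow-up times; events: 1 if the event was observed,
    0 if censored; risk: predicted risk score (higher = worse prognosis).
    Pairs with tied times are ignored in this simplified version.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Usable pair: subject i had an observed event strictly
            # before subject j's event or censoring time.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data, invented for illustration; the third subject is censored.
c_index = harrell_c(times=[2, 4, 6, 8], events=[1, 1, 0, 1],
                    risk=[0.9, 0.3, 0.5, 0.1])
```

Uno's measure, which the abstract recommends under moderate censoring, reweights the usable pairs by the inverse probability of censoring; that weighting is not implemented here.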
Prediction of adult height in girls: the Beunen-Malina-Freitas method.

PubMed

Beunen, Gaston P; Malina, Robert M; Freitas, Duarte L; Thomis, Martine A; Maia, José A; Claessens, Albrecht L; Gouveia, Elvio R; Maes, Hermine H; Lefevre, Johan

2011-12-01

The purpose of this study was to validate and cross-validate the Beunen-Malina-Freitas method for non-invasive prediction of adult height in girls. A sample of 420 girls aged 10-15 years from the Madeira Growth Study were measured at yearly intervals and then 8 years later. Anthropometric dimensions (lengths, breadths, circumferences, and skinfolds) were measured; skeletal age was assessed using the Tanner-Whitehouse 3 method and menarcheal status (present or absent) was recorded. Adult height was measured and predicted using stepwise, forward, and maximum R² regression techniques. Multiple correlations, mean differences, standard errors of prediction, and error boundaries were calculated. A sample of the Leuven Longitudinal Twin Study was used to cross-validate the regressions. Age-specific coefficients of determination (R²) between predicted and measured adult height varied between 0.57 and 0.96, while standard errors of prediction varied between 1.1 and 3.9 cm. The cross-validation confirmed the validity of the Beunen-Malina-Freitas method in girls aged 12-15 years, but at lower ages the cross-validation was less consistent. We conclude that the Beunen-Malina-Freitas method is valid for the prediction of adult height in girls aged 12-15 years. It is applicable to European populations or populations of European ancestry.
Acquaintance Rape: Applying Crime Scene Analysis to the Prediction of Sexual Recidivism.

PubMed

Lehmann, Robert J B; Goodwill, Alasdair M; Hanson, R Karl; Dahle, Klaus-Peter

2016-10-01

The aim of the current study was to enhance the assessment and predictive accuracy of risk assessments for sexual offenders by utilizing detailed crime scene analysis (CSA). CSA was conducted on a sample of 247 male acquaintance rapists from Berlin (Germany) using a nonmetric, multidimensional scaling (MDS) Behavioral Thematic Analysis (BTA) approach. The age of the offenders at the time of the index offense ranged from 14 to 64 years (M = 32.3; SD = 11.4). The BTA procedure revealed three behavioral themes of hostility, criminality, and pseudo-intimacy, consistent with previous CSA research on stranger rape. The construct validity of the three themes was demonstrated through correlational analyses with known sexual offending measures and criminal histories. The themes of hostility and pseudo-intimacy were significant predictors of sexual recidivism. In addition, the pseudo-intimacy theme led to a significant increase in the incremental validity of the Static-99 actuarial risk assessment instrument for the prediction of sexual recidivism. The results indicate the potential utility and validity of crime scene behaviors in the applied risk assessment of sexual offenders. © The Author(s) 2015.
Initial Reliability and Validity of the Perceived Social Competence Scale

ERIC Educational Resources Information Center

Anderson-Butcher, Dawn; Iachini, Aidyn L.; Amorose, Anthony J.

2008-01-01

Objective: This study describes the development and validation of a perceived social competence scale that social workers can easily use to assess children's and youth's social competence. Method: Exploratory and confirmatory factor analyses were conducted on a calibration and a cross-validation sample of youth. Predictive validity was also…
ProTSAV: A protein tertiary structure analysis and validation server.

PubMed

Singh, Ankita; Kaushik, Rahul; Mishra, Avinash; Shanker, Asheesh; Jayaram, B

2016-01-01

Quality assessment of predicted model structures of proteins is as important as the protein tertiary structure prediction. A highly efficient quality assessment of predicted model structures directs further research on function. Here we present a new server ProTSAV, capable of evaluating predicted model structures based on some popular online servers and standalone tools. ProTSAV furnishes the user with a single quality score in case of individual protein structure along with a graphical representation and ranking in case of multiple protein structure assessment. The server is validated on ~64,446 protein structures including experimental structures from RCSB and predicted model structures for CASP targets and from public decoy sets. ProTSAV succeeds in predicting quality of protein structures with a specificity of 100% and a sensitivity of 98% on experimentally solved structures and achieves a specificity of 88% and a sensitivity of 91% on predicted protein structures of CASP11 targets under 2Å. The server overcomes the limitations of any single server/method and is seen to be robust in helping in quality assessment. ProTSAV is freely available at http://www.scfbio-iitd.res.in/software/proteomics/protsav.jsp. Copyright © 2015 Elsevier B.V. All rights reserved.
Validation of the Beck Hopelessness Scale in patients with suicide risk.

PubMed

Rueda-Jaimes, German Eduardo; Castro-Rueda, Vanessa Alexandra; Rangel-Martínez-Villalba, Andrés Mauricio; Moreno-Quijano, Catalina; Martinez-Salazar, Gustavo Adolfo; Camacho, Paul Anthony

Only a few scales have been validated in Spanish for the assessment of suicide risk, and none of them have achieved predictive validity. To determine the validity and reliability of the Beck Hopelessness Scale in patients with suicide risk attending the specialist clinic. The Beck Hopelessness Scale, reasons for living inventory, and the suicide behaviour questionnaire were applied in patients with suicide risk attending the psychiatric clinic and the emergency department. A new assessment was made 30 days later to determine the predictive validity of suicide or suicide attempt. The evaluation included a total of 244 patients, with a mean age of 30.7±13.2 years, and the majority were women. The internal consistency was .9 (Kuder-Richardson formula 20). Four dimensions were found which accounted for 50% of the variance. It was positively correlated with the suicidal behaviour questionnaire (Spearman .48, P<.001), number of suicide attempts (Spearman .25, P<.001), severity of suicide risk (Spearman .23, P<.001). The correlation with the reasons for living inventory was negative (Spearman -.52, P<.001). With a cut-off ≥12, the negative predictive value was 98.4% (95% CI: 94.2-99.8), and the positive predictive value was 14.8% (95% CI: 6.6-27.1). The Beck Hopelessness Scale in Colombian patients with suicidality shows results similar to the original version, with adequate reliability and moderate concurrent and predictive validity. Copyright © 2016 SEP y SEPB. Publicado por Elsevier España, S.L.U. All rights reserved.
Using cluster analysis to identify phenotypes and validation of mortality in men with COPD.

PubMed

Chen, Chiung-Zuei; Wang, Liang-Yi; Ou, Chih-Ying; Lee, Cheng-Hung; Lin, Chien-Chung; Hsiue, Tzuen-Ren

2014-12-01

Cluster analysis has been proposed to examine phenotypic heterogeneity in chronic obstructive pulmonary disease (COPD). The aim of this study was to use cluster analysis to define COPD phenotypes and validate them by assessing their relationship with mortality. Male subjects with COPD were recruited to identify and validate COPD phenotypes. Seven variables were assessed for their relevance to COPD: age, FEV1 % predicted, BMI, history of severe exacerbations, mMRC, SpO2, and Charlson index. COPD groups were identified by cluster analysis and validated prospectively against mortality during a 4-year follow-up. Analysis of 332 COPD subjects identified five clusters from cluster A to cluster E. Assessment of the predictive validity of these clusters of COPD showed that cluster E patients had higher all cause mortality (HR 18.3, p < 0.0001), and respiratory cause mortality (HR 21.5, p < 0.0001) than those in the other four groups. Cluster E patients also had higher all cause mortality (HR 14.3, p = 0.0002) and respiratory cause mortality (HR 10.1, p = 0.0013) than patients in cluster D alone. COPD patients with severe airflow limitation, many symptoms, and a history of frequent severe exacerbations represented a novel and distinct clinical phenotype predicting mortality in men with COPD.
The Reliability and Predictive Validity of the Stalking Risk Profile.

PubMed

McEwan, Troy E; Shea, Daniel E; Daffern, Michael; MacKenzie, Rachel D; Ogloff, James R P; Mullen, Paul E

2018-03-01

This study assessed the reliability and validity of the Stalking Risk Profile (SRP), a structured measure for assessing stalking risks. The SRP was administered at the point of assessment or retrospectively from file review for 241 adult stalkers (91% male) referred to a community-based forensic mental health service. Interrater reliability was high for stalker type, and moderate-to-substantial for risk judgments and domain scores. Evidence for predictive validity and discrimination between stalking recidivists and nonrecidivists for risk judgments depended on follow-up duration. Discrimination was moderate (area under the curve = 0.66-0.68) and positive and negative predictive values good over the full follow-up period (Mdn = 170.43 weeks). At 6 months, discrimination was better than chance only for judgments related to stalking of new victims (area under the curve = 0.75); however, high-risk stalkers still reoffended against their original victim(s) 2 to 4 times as often as low-risk stalkers. Implications for the clinical utility and refinement of the SRP are discussed.
Assessment of Biopsychosocial Complexity and Health Care Needs: Measurement Properties of the INTERMED Self-Assessment Version.

PubMed

van Reedt Dortland, Arianne K B; Peters, Lilian L; Boenink, Annette D; Smit, Jan H; Slaets, Joris P J; Hoogendoorn, Adriaan W; Joos, Andreas; Latour, Corine H M; Stiefel, Friedrich; Burrus, Cyrille; Guitteny-Collas, Marie; Ferrari, Silvia

2017-05-01

The INTERMED Self-Assessment questionnaire (IMSA) was developed as an alternative to the observer-rated INTERMED (IM) to assess biopsychosocial complexity and health care needs. We studied feasibility, reliability, and validity of the IMSA within a large and heterogeneous international sample of adult hospital inpatients and outpatients as well as its predictive value for health care use (HCU) and quality of life (QoL). A total of 850 participants aged 17 to 90 years from five countries completed the IMSA and were evaluated with the IM. The following measurement properties were determined: feasibility by percentages of missing values; reliability by Cronbach α; interrater agreement by intraclass correlation coefficients; convergent validity of IMSA scores with mental health (Short Form 36 emotional well-being subscale and Hospital Anxiety and Depression Scale), medical health (Cumulative Illness Rating Scale) and QoL (Euroqol-5D) by Spearman rank correlations; and predictive validity of IMSA scores with HCU and QoL by (generalized) linear mixed models. Feasibility, face validity, and reliability (Cronbach α = 0.80) were satisfactory. Intraclass correlation coefficient between IMSA and IM total scores was .78 (95% CI = .75-.81). Correlations of the IMSA with the Short Form 36, Hospital Anxiety and Depression Scale, Cumulative Illness Rating Scale, and Euroqol-5D (convergent validity) were -.65, .15, .28, and -.59, respectively. The IMSA significantly predicted QoL and also HCU (emergency department visits, hospitalization, outpatient visits, and diagnostic examinations) after 3- and 6-month follow-up. Results were comparable between hospital sites, inpatients and outpatients, as well as age groups. The IMSA is a generic and time-efficient method to assess biopsychosocial complexity and to provide guidance for multidisciplinary care trajectories in adult patients, with good reliability and validity across different cultures.
Predicting free-living energy expenditure using a miniaturized ear-worn sensor: an evaluation against doubly labeled water.

PubMed

Bouarfa, Loubna; Atallah, Louis; Kwasnicki, Richard Mark; Pettitt, Claire; Frost, Gary; Yang, Guang-Zhong

2014-02-01

Accurate estimation of daily total energy expenditure (EE) is a prerequisite for assisted weight management and assessing certain health conditions. The use of wearable sensors for predicting free-living EE is challenged by consistent sensor placement, user compliance, and estimation methods used. This paper examines whether a single ear-worn accelerometer can be used for EE estimation under free-living conditions. An EE prediction model was first derived and validated in a controlled setting using healthy subjects involving different physical activities. Ten different activities were assessed showing a tenfold cross validation error of 0.24. Furthermore, the EE prediction model shows a mean absolute deviation (MAD) below 1.2 metabolic equivalent of tasks. The same model was applied to a free-living setting with a different population for further validation. The results were compared against those derived from doubly labeled water. In free-living settings, the predicted daily EE has a correlation of 0.74, p 0.008, and a MAD of 272 kcal/day. These results demonstrate that laboratory-derived prediction models can be used to predict EE under free-living conditions [corrected].
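The MAD figure reported above can be read as the average absolute gap between the sensor-predicted EE and the reference value. The following is a minimal sketch under that assumption; the function name and the sample values are hypothetical and are not data from the study.

```python
def mean_absolute_deviation(predicted, reference):
    """Mean of |predicted - reference| over paired observations."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(p - r) for p, r in zip(predicted, reference)]
    return sum(errors) / len(errors)


# Hypothetical daily EE values in kcal/day (illustrative only).
predicted_ee = [2400, 2650, 2100, 2900]   # e.g. from a sensor-based model
reference_ee = [2500, 2500, 2300, 2800]   # e.g. from a reference method
mad = mean_absolute_deviation(predicted_ee, reference_ee)
print(f"MAD = {mad} kcal/day")
```

A MAD expressed in the outcome's own units is easier to interpret clinically than a correlation, which is presumably why the abstract reports both.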
Validation of Alternative In Vitro Methods to Animal Testing: Concepts, Challenges, Processes and Tools.

PubMed

Griesinger, Claudius; Desprez, Bertrand; Coecke, Sandra; Casey, Warren; Zuang, Valérie

This chapter explores the concepts, processes, tools and challenges relating to the validation of alternative methods for toxicity and safety testing. In general terms, validation is the process of assessing the appropriateness and usefulness of a tool for its intended purpose. Validation is routinely used in various contexts in science, technology, the manufacturing and services sectors. It serves to assess the fitness-for-purpose of devices, systems, software up to entire methodologies. In the area of toxicity testing, validation plays an indispensable role: "alternative approaches" are increasingly replacing animal models as predictive tools and it needs to be demonstrated that these novel methods are fit for purpose. Alternative approaches include in vitro test methods, non-testing approaches such as predictive computer models up to entire testing and assessment strategies composed of method suites, data sources and decision-aiding tools. Data generated with alternative approaches are ultimately used for decision-making on public health and the protection of the environment. It is therefore essential that the underlying methods and methodologies are thoroughly characterised, assessed and transparently documented through validation studies involving impartial actors. Importantly, validation serves as a filter to ensure that only test methods able to produce data that help to address legislative requirements (e.g. EU's REACH legislation) are accepted as official testing tools and, owing to the globalisation of markets, recognised on international level (e.g. through inclusion in OECD test guidelines). Since validation creates a credible and transparent evidence base on test methods, it provides a quality stamp, supporting companies developing and marketing alternative methods and creating considerable business opportunities. 
Validation of alternative methods is conducted through scientific studies assessing two key hypotheses, reliability and relevance of the test method for a given purpose. Relevance encapsulates the scientific basis of the test method, its capacity to predict adverse effects in the "target system" (i.e. human health or the environment) as well as its applicability for the intended purpose. In this chapter we focus on the validation of non-animal in vitro alternative testing methods and review the concepts, challenges, processes and tools fundamental to the validation of in vitro methods intended for hazard testing of chemicals. We explore major challenges and peculiarities of validation in this area. Based on the notion that validation per se is a scientific endeavour that needs to adhere to key scientific principles, namely objectivity and appropriate choice of methodology, we examine basic aspects of study design and management, and provide illustrations of statistical approaches to describe predictive performance of validated test methods as well as their reliability.
Uncertainty Assessment of Hypersonic Aerothermodynamics Prediction Capability

NASA Technical Reports Server (NTRS)

Bose, Deepak; Brown, James L.; Prabhu, Dinesh K.; Gnoffo, Peter; Johnston, Christopher O.; Hollis, Brian

2011-01-01

The present paper provides the background of a focused effort to assess uncertainties in predictions of heat flux and pressure in hypersonic flight (airbreathing or atmospheric entry) using state-of-the-art aerothermodynamics codes. The assessment is performed for four mission relevant problems: (1) shock turbulent boundary layer interaction on a compression corner, (2) shock turbulent boundary layer interaction due to an impinging shock, (3) high-mass Mars entry and aerocapture, and (4) high speed return to Earth. A validation based uncertainty assessment approach with reliance on subject matter expertise is used. A code verification exercise with code-to-code comparisons and comparisons against well established correlations is also included in this effort. A thorough review of the literature in search of validation experiments is performed, which identified a scarcity of ground based validation experiments at hypersonic conditions. In particular, a shortage of useable experimental data at flight like enthalpies and Reynolds numbers is found. The uncertainty was quantified using metrics that measured discrepancy between model predictions and experimental data. The discrepancy data is statistically analyzed and investigated for physics based trends in order to define a meaningful quantified uncertainty. The detailed uncertainty assessment of each mission relevant problem is found in the four companion papers.
The Study Skills Questionnaire (SSQUES): Preliminary Validation of a Measure for Assessing Students' Perceived Areas of Weakness.

ERIC Educational Resources Information Center

McCombs, Barbara L.; Dobrovolny, Jacqueline L.

The potential reliability and construct and predictive validity of a 30-item Study Skills Questionnaire (SSQUES) was evaluated for its ability to: (1) predict student performance in a self-paced, individualized, or computer-managed instructional environment, and (2) identify students needing some type of study skills remediation. The study was…
Validation of a New Skinfold Prediction Equation Based on Dual-Energy X-Ray Absorptiometry

ERIC Educational Resources Information Center

Ball, Stephen; Cowan, Celsi; Thyfault, John; LaFontaine, Tom

2014-01-01

Skinfold prediction equations recommended by the American College of Sports Medicine underestimate body fat percentage. The purpose of this research was to validate an alternative equation for men created from dual energy x-ray absorptiometry. Two hundred ninety-seven males, aged 18-65, completed a skinfold assessment and dual energy x-ray…
Predicting Curriculum and Test Performance at Age 7 Years from Pupil Background, Baseline Skills and Phonological Awareness at Age 5

ERIC Educational Resources Information Center

Savage, R.; Carless, S.

2004-01-01

Background: Phonological awareness tests are known to be amongst the best predictors of literacy; however their predictive validity alongside current school screening practice (baseline assessment, pupil background data) and to National Curricular outcome measures is unknown. Aim: We explored the validity of phonological awareness and orthographic…
Predictive and Treatment Validity of Life Satisfaction and the Quality of Life Inventory

ERIC Educational Resources Information Center

Frisch, Michael B.; Clark, Michelle P.; Rouse, Steven V.; Rudd, M. David; Paweleck, Jennifer K.; Greenstone, Andrew; Kopplin, David A.

2005-01-01

The clinical and positive psychology usefulness of quality of life, well-being, and life satisfaction assessments depends on their ability to predict important outcomes and to detect intervention-related change. These issues were explored in the context of a program of instrument validation for the Quality of Life Inventory (QOLI) involving 3,927…
Measuring Life Stress: A Comparison of the Predictive Validity of Different Scoring Systems for the Social Readjustment Rating Scale.

ERIC Educational Resources Information Center

McGrath, Robert E. V.; Burkhart, Barry R.

1983-01-01

Assessed whether accounting for variables in the scoring of the Social Readjustment Rating Scale (SRRS) would improve the predictive validity of the inventory. Results from 107 sets of questionnaires showed that income and level of education are significant predictors of the capacity to cope with stress. (JAC)
On the Validity of Validity Scales: The Importance of Defensive Responding in the Prediction of Institutional Misconduct

ERIC Educational Resources Information Center

Edens, John F.; Ruiz, Mark A.

2006-01-01

This study examined the effects of defensive responding on the prediction of institutional misconduct among male inmates (N = 349) who completed the Personality Assessment Inventory (L. C. Morey, 1991). Hierarchical logistic regression analyses demonstrated significant main effects for the Antisocial Features (ANT) scale as well as main effects…
Supervisor support in the work place: legitimacy and positive affectivity.

PubMed

Yoon, J; Thye, S

2000-06-01

The authors tested 3 hypotheses regarding supervisor support in the work place. The validation hypothesis predicts that when employees are supported by their coworkers and the larger organization, they also receive more support from their supervisors. The positive affectivity hypothesis predicts that employees with positive dispositions receive more supervisor support because they are more socially oriented and likable. The moderation hypothesis predicts a joint multiplicative effect between validation and positive affectivity. An assessment of the hypotheses among a sample of 1,882 hospital employees in Korea provided strong support for the validation and moderation hypotheses.

Joint use of over- and under-sampling techniques and cross-validation for the development and assessment of prediction models.

PubMed

Blagus, Rok; Lusa, Lara

2015-11-04

Prediction models are used in clinical research to develop rules that can be used to accurately predict the outcome of the patients based on some of their characteristics. They represent a valuable tool in the decision making process of clinicians and health policy makers, as they enable them to estimate the probability that patients have or will develop a disease, will respond to a treatment, or that their disease will recur. The interest devoted to prediction models in the biomedical community has been growing in the last few years. Often the data used to develop the prediction models are class-imbalanced as only few patients experience the event (and therefore belong to minority class). Prediction models developed using class-imbalanced data tend to achieve sub-optimal predictive accuracy in the minority class. This problem can be diminished by using sampling techniques aimed at balancing the class distribution. These techniques include under- and oversampling, where a fraction of the majority class samples are retained in the analysis or new samples from the minority class are generated. The correct assessment of how the prediction model is likely to perform on independent data is of crucial importance; in the absence of an independent data set, cross-validation is normally used. While the importance of correct cross-validation is well documented in the biomedical literature, the challenges posed by the joint use of sampling techniques and cross-validation have not been addressed. We show that care must be taken to ensure that cross-validation is performed correctly on sampled data, and that the risk of overestimating the predictive accuracy is greater when oversampling techniques are used. Examples based on the re-analysis of real datasets and simulation studies are provided. 
We identify some results from the biomedical literature where the incorrect cross-validation was performed, where we expect that the performance of oversampling techniques was heavily overestimated.
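The abstract does not spell out the procedure, but one common way to avoid the overestimation it warns about is to oversample inside each cross-validation fold, using the training part only, so that duplicated minority samples never reach the held-out part. The sketch below illustrates that ordering with plain random duplication; all function names are hypothetical, and this is not the authors' code.

```python
import random


def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def oversample_minority(indices, labels, rng):
    """Duplicate randomly chosen minority-class indices until classes balance."""
    pos = [i for i in indices if labels[i] == 1]
    neg = [i for i in indices if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        return list(indices)  # nothing to duplicate
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(indices) + extra


def cv_splits_with_oversampling(labels, k=5, seed=0):
    """Yield (train, test) index lists; only the training part is oversampled."""
    rng = random.Random(seed)
    folds = kfold_indices(len(labels), k, seed)
    for f, test_idx in enumerate(folds):
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        yield oversample_minority(train_idx, labels, rng), test_idx


# Hypothetical imbalanced outcome: 10 events among 50 patients.
labels = [1] * 10 + [0] * 40
for train, test in cv_splits_with_oversampling(labels, k=5):
    # The held-out fold shares no sample with the (oversampled) training fold.
    assert not set(train) & set(test)
```

Oversampling the full data set first and splitting afterwards would place copies of the same patient on both sides of the split, which is the kind of leakage that inflates accuracy estimates.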
The Personality Assessment Inventory as a Proxy for the Psychopathy Checklist-Revised: Testing the Incremental Validity and Cross-Sample Robustness of the Antisocial Features Scale

ERIC Educational Resources Information Center

Douglas, Kevin S.; Guy, Laura S.; Edens, John F.; Boer, Douglas P.; Hamilton, Jennine

2007-01-01

The Personality Assessment Inventory's (PAI's) ability to predict psychopathic personality features, as assessed by the Psychopathy Checklist-Revised (PCL-R), was examined. To investigate whether the PAI Antisocial Features (ANT) Scale and subscales possessed incremental validity beyond other theoretically relevant PAI scales, optimized regression…
Automated assessment of imaging biomarkers for the PanCan lung cancer risk prediction model with validation on NLST data

NASA Astrophysics Data System (ADS)

Wiemker, Rafael; Sevenster, Merlijn; MacMahon, Heber; Li, Feng; Dalal, Sandeep; Tahmasebi, Amir; Klinder, Tobias

2017-03-01

The imaging biomarkers EmphysemaPresence and NoduleSpiculation are crucial inputs for most models aiming to predict the risk of indeterminate pulmonary nodules detected at CT screening. To increase reproducibility and to accelerate screening workflow it is desirable to assess these biomarkers automatically. Validation on NLST images indicates that standard histogram measures are not sufficient to assess EmphysemaPresence in screenees. However, automatic scoring of bulla-resembling low attenuation areas can achieve agreement with experts with close to 80% sensitivity and specificity. NoduleSpiculation can be automatically assessed with similar accuracy. We find a dedicated spiculi tracing score to slightly outperform generic combinations of texture features with classifiers.
Assessing personal talent determinants in young racquet sport players: a systematic review.

PubMed

Faber, Irene R; Bustin, Paul M J; Oosterveld, Frits G J; Elferink-Gemser, Marije T; Nijhuis-Van der Sanden, Maria W G

2016-01-01

Since junior performances have little predictive value for future success, other solutions are sought to assess a young player's potential. The objectives of this systematic review are (1) to provide an overview of instruments measuring personal talent determinants of young players in racquet sports, and (2) to evaluate these instruments regarding their validity for talent development. Electronic searches were conducted in PubMed, PsychINFO, Web of Knowledge, ScienceDirect and SPORTDiscus (1990 to 31 March 2014). Search terms represented tennis, table tennis, badminton and squash, the concept of talent, methods of testing and children. Thirty articles with information regarding over 100 instruments were included. Validity evaluation showed that instruments focusing on intellectual and perceptual abilities, and coordinative skills discriminate elite from non-elite players and/or are related to current performance, but their predictive validity is not confirmed. There is moderate evidence that the assessments of mental and goal management skills predict future performance. Data on instruments measuring physical characteristics prohibit a conclusion due to conflicting findings. This systematic review yielded an ambiguous end point. The lack of longitudinal studies precludes verification of the instrument's capacity to forecast future performance. Future research should focus on instruments assessing multidimensional talent determinants and their predictive value in longitudinal designs.
Development and validation of the ORACLE score to predict risk of osteoporosis.

PubMed

Richy, Florent; Deceulaer, Fréderic; Ethgen, Olivier; Bruyère, Olivier; Reginster, Jean-Yves

2004-11-01

To develop and validate a composite index, the Osteoporosis Risk Assessment by Composite Linear Estimate (ORACLE), that includes risk factors and ultrasonometric outcomes to screen for osteoporosis. Two cohorts of postmenopausal women aged 45 years and older participated in the development (n = 407) and the validation (n = 202) of ORACLE. Their bone mineral density was determined by dual energy x-ray absorptiometry and quantitative ultrasonometry (QUS), and their historical and clinical risk factors were assessed (January to June 2003). Logistic regression analysis was used to select significant predictors of bone mineral density, whereas receiver operating characteristic (ROC) analysis was used to assess the discriminatory performance of ORACLE. The final logistic regression model retained 4 biometric or historical variables and 1 ultrasonometric outcome. The ROC areas under the curves (AUCs) for ORACLE were 84% for the prediction of osteoporosis and 78% for low bone mass. A sensitivity of 90% corresponded to a specificity of 50% for identification of women at risk of developing osteoporosis. The corresponding positive and negative predictive values were 86% and 54%, respectively, in the development cohort. In the validation cohort, the AUCs for identification of osteoporosis and low bone mass were 81% and 76% for ORACLE, 69% and 64% for QUS T score, 71% and 68% for QUS ultrasonometric bone profile index, and 76% and 75% for Osteoporosis Self-assessment Tool, respectively. ORACLE had the best discriminatory performance in identifying osteoporosis compared with the other approaches (P < .05). ORACLE exhibited the highest discriminatory properties compared with ultrasonography alone or other previously validated risk indices. It may be helpful to enhance the predictive value of QUS.
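Sensitivity, specificity, and the positive and negative predictive values quoted above all derive from the same 2x2 table of test result against reference diagnosis. The sketch below shows the arithmetic; the function name and the counts are hypothetical, chosen only to reproduce a 90% sensitivity and 50% specificity, and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return the four standard accuracy measures from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased subjects correctly flagged
        "specificity": tn / (tn + fp),  # healthy subjects correctly cleared
        "ppv": tp / (tp + fp),          # flagged subjects who are diseased
        "npv": tn / (tn + fn),          # cleared subjects who are healthy
    }


# Hypothetical counts (illustrative only).
metrics = diagnostic_metrics(tp=90, fp=50, fn=10, tn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Unlike sensitivity and specificity, the predictive values depend on how common the condition is in the cohort, so they do not transfer directly between populations with different prevalence.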
Predictive validity of a selection centre testing non-technical skills for recruitment to training in anaesthesia.

PubMed

Gale, T C E; Roberts, M J; Sice, P J; Langton, J A; Patterson, F C; Carr, A S; Anderson, I R; Lam, W H; Davies, P R F

2010-11-01

Assessment centres are an accepted method of recruitment in industry and are gaining popularity within medicine. We describe the development and validation of a selection centre for recruitment to speciality training in anaesthesia based on an assessment centre model incorporating the rating of candidates' non-technical skills. Expert consensus identified non-technical skills suitable for assessment at the point of selection. Four stations (structured interview, portfolio review, presentation, and simulation) were developed, the latter two being realistic scenarios of work-related tasks. Evaluation of the selection centre focused on applicant and assessor feedback ratings, inter-rater agreement, and internal consistency reliability coefficients. Predictive validity was sought via correlations of selection centre scores with subsequent workplace-based ratings of appointed trainees. Two hundred and twenty-four candidates were assessed over two consecutive annual recruitment rounds; 68 were appointed and followed up during training. Candidates and assessors demonstrated strong approval of the selection centre with more than 70% of ratings 'good' or 'excellent'. Mean inter-rater agreement coefficients ranged from 0.62 to 0.77 and internal consistency reliability of the selection centre score was high (Cronbach's α=0.88-0.91). The overall selection centre score was a good predictor of workplace performance during the first year of appointment. An assessment centre model based on the rating of non-technical skills can produce a reliable and valid selection tool for recruitment to speciality training in anaesthesia. Early results on predictive validity are encouraging and justify further development and evaluation.
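Cronbach's α, the internal consistency coefficient cited above, compares the summed variance of the individual components with the variance of the total score. The sketch below uses the standard formula α = k/(k-1) · (1 - Σ item variances / total variance); the station scores are hypothetical and treating each station as one "item" is an assumption made for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: list of k items, each a list of scores for the same n people."""
    k = len(item_scores)
    n = len(item_scores[0]) if k else 0
    if k < 2 or n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("need >= 2 items, each with the same >= 2 respondents")
    totals = [sum(scores) for scores in zip(*item_scores)]  # per-person totals
    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Hypothetical scores of five candidates on four stations (illustrative only).
stations = [
    [3, 4, 5, 2, 4],
    [3, 5, 4, 2, 5],
    [2, 4, 5, 3, 4],
    [3, 4, 4, 2, 5],
]
print(f"alpha = {cronbach_alpha(stations):.2f}")
```

Values approach 1 when components rank candidates consistently, which is the sense in which the reported 0.88-0.91 is described as high.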
Modelling personality, plasticity and predictability in shelter dogs

PubMed Central

2017-01-01

Behavioural assessments of shelter dogs (Canis lupus familiaris) typically comprise standardized test batteries conducted at one time point, but test batteries have shown inconsistent predictive validity. Longitudinal behavioural assessments offer an alternative. We modelled longitudinal observational data on shelter dog behaviour using the framework of behavioural reaction norms, partitioning variance into personality (i.e. inter-individual differences in behaviour), plasticity (i.e. inter-individual differences in average behaviour) and predictability (i.e. individual differences in residual intra-individual variation). We analysed data on interactions of 3263 dogs (n = 19 281) with unfamiliar people during their first month after arrival at the shelter. Accounting for personality, plasticity (linear and quadratic trends) and predictability improved the predictive accuracy of the analyses compared to models quantifying personality and/or plasticity only. While dogs were, on average, highly sociable with unfamiliar people and sociability increased over days since arrival, group averages were unrepresentative of all dogs and predictions made at the individual level entailed considerable uncertainty. Effects of demographic variables (e.g. age) on personality, plasticity and predictability were observed. Behavioural repeatability was higher one week after arrival compared to arrival day. Our results highlight the value of longitudinal assessments on shelter dogs and identify measures that could improve the predictive validity of behavioural assessments in shelters. PMID:28989764
Reliability and concurrent validity of the computer workstation checklist.

PubMed

Baker, Nancy A; Livengood, Heather; Jacobs, Karen

2013-01-01

Self-report checklists are used to assess computer workstation set up, typically by workers not trained in ergonomic assessment or checklist interpretation. Though many checklists exist, few have been evaluated for reliability and validity. This study examined reliability and validity of the Computer Workstation Checklist (CWC) to identify mismatches between workers' self-reported workstation problems. The CWC was completed at baseline and at 1 month to establish reliability. Validity was determined with CWC baseline data compared to an onsite workstation evaluation conducted by an expert in computer workstation assessment. Reliability ranged from fair to near perfect (prevalence-adjusted bias-adjusted kappa, 0.38-0.93); items with the strongest agreement were related to the input device, monitor, computer table, and document holder. The CWC had greater specificity (11 of 16 items) than sensitivity (3 of 16 items). The positive predictive value was greater than the negative predictive value for all questions. The CWC has strong reliability. Sensitivity and specificity suggested workers often indicated no problems with workstation setup when problems existed. The evidence suggests that while the CWC may not be valid when used alone, it may be a suitable adjunct to an ergonomic assessment completed by professionals.
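For a yes/no checklist item, the prevalence-adjusted bias-adjusted kappa (PABAK) mentioned above reduces to 2 × (observed agreement) - 1. The sketch below applies that formula to two administrations of a single binary item; the function name and responses are hypothetical and not taken from the study.

```python
def pabak_binary(first, second):
    """PABAK for two sets of binary (0/1) ratings of the same subjects."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    agreements = sum(1 for a, b in zip(first, second) if a == b)
    observed_agreement = agreements / len(first)
    return 2 * observed_agreement - 1


# Hypothetical answers of ten workers to one item at baseline and at 1 month.
baseline = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
one_month = [1, 1, 0, 1, 1, 0, 1, 0, 0, 1]
print(f"PABAK = {pabak_binary(baseline, one_month):.2f}")
```

Because it depends only on observed agreement, PABAK avoids the very low kappa values that can arise when nearly all respondents give the same answer.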
QCT/FEA predictions of femoral stiffness are strongly affected by boundary condition modeling

PubMed Central

Rossman, Timothy; Kushvaha, Vinod; Dragomir-Daescu, Dan

2015-01-01

Quantitative computed tomography-based finite element models of proximal femora must be validated with cadaveric experiments before using them to assess fracture risk in osteoporotic patients. During validation it is essential to carefully assess whether the boundary condition modeling matches the experimental conditions. This study evaluated proximal femur stiffness results predicted by six different boundary condition methods on a sample of 30 cadaveric femora and compared the predictions with experimental data. The average stiffness varied by 280% among the six boundary conditions. Compared with experimental data the predictions ranged from overestimating the average stiffness by 65% to underestimating it by 41%. In addition we found that the boundary condition that distributed the load to the contact surfaces similar to the expected contact mechanics predictions had the best agreement with experimental stiffness. We concluded that boundary conditions modeling introduced large variations in proximal femora stiffness predictions. PMID:25804260
Assessing the risk of imminent aggression in institutionalized youth offenders using the dynamic appraisal of situational aggression

PubMed Central

Chu, Chi Meng; Hoo, Eric; Daffern, Michael; Tan, Jolie

2012-01-01

Aggressive behavior in incarcerated youth presents a significant problem for staff, co-residents and the functioning of the institution. This study aimed to examine the predictive validity of an empirically validated measure, designed to appraise the risk of imminent aggression within institutionalized adult psychiatric patients (Dynamic Appraisal of Situational Aggression; DASA), in adolescent male and female offenders. The supervising staff members on the residential units rated the DASA daily for 49 youth (29 males and 20 females) over two months. The results showed that DASA total scores significantly predicted institutional aggression in the following 24 and 48 hrs; however, the predictive validity of the DASA for institutional aggression was, at best, modest. Further analyses on male and female subsamples revealed that the DASA total scores only predicted imminent institutional aggression in the male subsample. Item analyses showed that negative attitudes, anger when requests are denied, and unwillingness to follow instructions predicted institutional aggression more strongly as compared with other behavioral manifestations of an irritable and unstable mental state as assessed by the DASA. PMID:25999797
Observations on CFD Verification and Validation from the AIAA Drag Prediction Workshops

NASA Technical Reports Server (NTRS)

Morrison, Joseph H.; Kleb, Bil; Vassberg, John C.

2014-01-01

The authors provide observations from the AIAA Drag Prediction Workshops that have spanned over a decade and from a recent validation experiment at NASA Langley. These workshops provide an assessment of the predictive capability of forces and moments, focused on drag, for transonic transports. It is very difficult to manage the consistency of results in a workshop setting to perform verification and validation at the scientific level, but it may be sufficient to assess it at the level of practice. Observations thus far: 1) due to simplifications in the workshop test cases, wind tunnel data are not necessarily the “correct” results that CFD should match, 2) an average of core CFD data are not necessarily a better estimate of the true solution as it is merely an average of other solutions and has many coupled sources of variation, 3) outlier solutions should be investigated and understood, and 4) the DPW series does not have the systematic build up and definition on both the computational and experimental side that is required for detailed verification and validation. Several observations regarding the importance of the grid, effects of physical modeling, benefits of open forums, and guidance for validation experiments are discussed. The increased variation in results when predicting regions of flow separation and increased variation due to interaction effects, e.g., fuselage and horizontal tail, point out the need for validation data sets for these important flow phenomena. Experiences with a recent validation experiment at NASA Langley are included to provide guidance on validation experiments.
Assessing risk for violence among male and female civil psychiatric patients: the HCR-20, PCL:SV, and VSC.

PubMed

Nicholls, Tonia L; Ogloff, James R P; Douglas, Kevin S

2004-01-01

This study evaluated the predictive validity of violence risk assessments conducted using the HCR-20, the Psychopathy Checklist: Screening Version (PCL:SV), and by the Violence Screening Checklist (VSC) in a sample of 268 involuntarily hospitalized male and female psychiatric patients. Information pertaining to violence and crime was coded from medical charts and correctional records. The HCR-20/PCL:SV evidenced modest non-significant associations in postdictive assessments of inpatient violence among men. Moderate to strong significant associations were found between the HCR-20/PCL:SV and inpatient violence among women. Pseudo-prospective assessments using the HCR-20 and PCL:SV resulted in moderate to large relationships with violence and crime in men and women following community discharge. It is concluded that the VSC is a promising tool for assessing acute inpatient violence risk with men. Findings offer preliminary validation of the predictive validity of the HCR-20 and PCL:SV with female civil psychiatric patients. Copyright 2004 John Wiley & Sons, Ltd.
Validity of the stroke rehabilitation assessment of movement scale in acute rehabilitation: a comparison with the functional independence measure and stroke impact scale-16.

PubMed

Ward, Irene; Pivko, Susan; Brooks, Gary; Parkin, Kate

2011-11-01

To demonstrate sensitivity to change of the Stroke Rehabilitation Assessment of Movement (STREAM) as well as the concurrent and predictive validity of the STREAM in an acute rehabilitation setting. Prospective cohort study. Acute, in-patient rehabilitation department within a tertiary-care teaching hospital in the United States. Thirty adults with a newly diagnosed, first ischemic stroke. Clinical assessments were conducted on admission and then again on discharge from the rehabilitation hospital with the STREAM (total STREAM and upper extremity, lower extremity, and mobility subscales), Functional Independence Measure (FIM), and Stroke Impact Scale-16 (SIS-16). Sensitivity to change was determined with the Wilcoxon signed rank test and by the calculation of standardized response means. Spearman correlations were used to assess concurrent validity of the total STREAM and STREAM subscales with the FIM and SIS-16 on admission and discharge. We determined predictive validity for all instruments by correlating admission scores with actual and predicted length of stay and by testing associations between admission scores and discharge destination (home vs subacute facility). For all instruments, there was statistically significant improvement from admission to discharge. The standardized response means for the total STREAM and STREAM subscales were large. Spearman correlations between the total STREAM and STREAM subscales and the FIM and SIS-16 were moderate to excellent, both on admission and discharge. Among change scores, only the SIS-16 correlated with the total STREAM. All 3 instruments were significantly associated with discharge destination; however, the associations were strongest for the total STREAM and STREAM subscales. All instruments showed moderate-to-excellent correlations with predicted and actual length of stay. The STREAM is sensitive to change and demonstrates good concurrent and predictive validity as compared with the FIM and SIS-16 in the acute inpatient rehabilitation population. Copyright © 2011 American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. All rights reserved.
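The standardized response mean (SRM) mentioned in the abstract above is usually computed as the mean of the paired change scores divided by their standard deviation. The following is a minimal sketch of that calculation, not the authors' analysis: the function name and the five admission/discharge scores are hypothetical, and the 0.8 cut-off for a "large" SRM is a commonly used convention that the abstract itself does not state.

```python
from statistics import mean, stdev

def standardized_response_mean(admission, discharge):
    """SRM = mean of paired change scores / sample SD of those change scores."""
    changes = [d - a for a, d in zip(admission, discharge)]
    return mean(changes) / stdev(changes)

# Hypothetical admission and discharge scores for five patients (not study data).
admission = [40, 55, 32, 60, 48]
discharge = [58, 70, 50, 71, 69]

srm = standardized_response_mean(admission, discharge)
# Assumed convention: an SRM of about 0.8 or more is often described as large.
print(f"SRM = {srm:.2f}")
```

Because the SRM divides by the variability of the change itself, an instrument whose scores improve by a similar amount in every patient yields a high value even when the average change is modest.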
Transfer of skills on LapSim virtual reality laparoscopic simulator into the operating room in urology.

PubMed

Alwaal, Amjad; Al-Qaoud, Talal M; Haddad, Richard L; Alzahrani, Tarek M; Delisle, Josee; Anidjar, Maurice

2015-01-01

Assessing the predictive validity of the LapSim simulator within a urology residency program. Twelve urology residents at McGill University were enrolled in the study between June 2008 and December 2011. The residents had weekly training on the LapSim that consisted of 3 tasks (cutting, clip-applying, and lifting and grasping). They underwent monthly assessment of their LapSim performance using total time, tissue damage, and path length, among other parameters, as surrogates for their economy of movement and respect for tissue. The residents' last LapSim performance was compared with their first performance of radical nephrectomy on anesthetized porcine models in their 4th year of training. Two independent urologic surgeons rated the residents' performance on the porcine models, and a kappa test with a standardized weight function was used to assess for inter-observer bias. A nonparametric Spearman correlation test was used to compare each rater's cumulative score with the cumulative score obtained on the porcine models in order to test the predictive validity of the LapSim simulator. The kappa results demonstrated acceptable agreement between the two observers among all domains of the rating scale of performance except for confidence of movement and efficiency. In addition, poor predictive validity of the LapSim simulator was demonstrated. Predictive validity was not demonstrated for the LapSim simulator in the context of a urology residency training program.
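The Spearman correlation used in the abstract above is the Pearson correlation of the ranks of the two score lists, with tied values sharing their average rank. The sketch below illustrates that definition only; the function names are hypothetical, and no data from the study are reproduced.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A coefficient near zero between simulator scores and operative ratings would correspond to the "poor predictive validity" the authors report, whereas values near 1 would indicate that simulator rank order carries over to the operating room.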
Measurement of predictive validity in violence risk assessment studies: a second-order systematic review.

PubMed

Singh, Jay P; Desmarais, Sarah L; Van Dorn, Richard A

2013-01-01

The objective of the present review was to examine how predictive validity is analyzed and reported in studies of instruments used to assess violence risk. We reviewed 47 predictive validity studies published between 1990 and 2011 of 25 instruments that were included in two recent systematic reviews. Although all studies reported receiver operating characteristic curve analyses and the area under the curve (AUC) performance indicator, this methodology was defined inconsistently and findings often were misinterpreted. In addition, there was between-study variation in benchmarks used to determine whether AUCs were small, moderate, or large in magnitude. Though virtually all of the included instruments were designed to produce categorical estimates of risk - through the use of either actuarial risk bins or structured professional judgments - only a minority of studies calculated performance indicators for these categorical estimates. In addition to AUCs, other performance indicators, such as correlation coefficients, were reported in 60% of studies, but were infrequently defined or interpreted. An investigation of sources of heterogeneity did not reveal significant variation in reporting practices as a function of risk assessment approach (actuarial vs. structured professional judgment), study authorship, geographic location, type of journal (general vs. specialized audience), sample size, or year of publication. Findings suggest a need for standardization of predictive validity reporting to improve comparison across studies and instruments. Copyright © 2013 John Wiley & Sons, Ltd.
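Since the review above found that the AUC was defined inconsistently, it may help to state one standard reading: the AUC is the probability that a randomly chosen person who went on to be violent received a higher risk score than a randomly chosen person who did not, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the six scores are hypothetical illustrations, not data from any reviewed study.

```python
def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores:   risk scores, higher meaning higher predicted risk
    outcomes: 1 if the event occurred, 0 otherwise
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # event case ranked above non-event case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical risk scores and outcomes for six people.
example_auc = auc([2, 5, 7, 3, 8, 5], [0, 0, 1, 0, 1, 1])
print(f"AUC = {example_auc:.3f}")
```

Read this way, the AUC describes discrimination only: it says nothing about whether the predicted probabilities are well calibrated, which is one reason the review calls for other performance indicators to be reported and defined alongside it.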
Incremental Validity of Useful Field of View Subtests for the Prediction of Instrumental Activities of Daily Living

PubMed Central

Aust, Frederik; Edwards, Jerri D.

2015-01-01

Introduction The Useful Field of View Test (UFOV®) is a cognitive measure that predicts older adults’ ability to perform a range of everyday activities. However, little is known about the individual contribution of each subtest to these predictions and the underlying constructs of UFOV performance remain a topic of debate. Method We investigated the incremental validity of UFOV subtests for the prediction of Instrumental Activities of Daily Living (IADL) performance in two independent datasets, the SKILL (n = 828) and ACTIVE (n = 2426) studies. We, then, explored the cognitive and visual abilities assessed by UFOV using a range of neuropsychological and vision tests administered in the SKILL study. Results In the four subtest variant of UFOV, only subtests 2 and 3 consistently made independent contributions to the prediction of IADL performance across three different behavioral measures. In all cases, the incremental validity of UFOV subtests 1 and 4 was negligible. Furthermore, we found that UFOV was related to processing speed, general non-speeded cognition, and visual function; the omission of subtests 1 and 4 from the test score did not affect these associations. Conclusions UFOV subtests 1 and 4 appear to be of limited use to predict IADL and possibly other everyday activities. Future experimental research should investigate if shortening the UFOV by omitting these subtests is a reliable and valid assessment approach. PMID:26782018
Applicability of Monte Carlo cross validation technique for model development and validation using generalised least squares regression

NASA Astrophysics Data System (ADS)

Haddad, Khaled; Rahman, Ataur; A Zaman, Mohammad; Shrestha, Surendra

2013-03-01

Summary: In regional hydrologic regression analysis, model selection and validation are regarded as important steps. Here, the model selection is usually based on some measurements of goodness-of-fit between the model prediction and observed data. In Regional Flood Frequency Analysis (RFFA), leave-one-out (LOO) validation or a fixed percentage leave out validation (e.g., 10%) is commonly adopted to assess the predictive ability of regression-based prediction equations. This paper develops a Monte Carlo Cross Validation (MCCV) technique (which has widely been adopted in Chemometrics and Econometrics) in RFFA using Generalised Least Squares Regression (GLSR) and compares it with the most commonly adopted LOO validation approach. The study uses simulated and regional flood data from the state of New South Wales in Australia. It is found that when developing hydrologic regression models, application of the MCCV is likely to result in a more parsimonious model than the LOO. It has also been found that the MCCV can provide a more realistic estimate of a model's predictive ability when compared with the LOO.
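The difference between the two validation schemes compared above can be sketched in a few lines. LOO holds out each observation once, while MCCV repeatedly holds out a random subset and averages the prediction error over the repeats. The sketch makes several assumptions that depart from the paper: it fits an ordinary least squares line with one predictor rather than GLSR, it uses synthetic data rather than flood records, and the 30% hold-out fraction, 200 repeats, and all function names are illustrative choices.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def holdout_mse(test_idx, xs, ys):
    """Fit on every point not in test_idx; return mean squared error on test_idx."""
    held_out = set(test_idx)
    train = [i for i in range(len(xs)) if i not in held_out]
    intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return sum((ys[i] - (intercept + slope * xs[i])) ** 2 for i in test_idx) / len(test_idx)

def loo_cv(xs, ys):
    """Leave-one-out: each observation is the test set exactly once."""
    return sum(holdout_mse([i], xs, ys) for i in range(len(xs))) / len(xs)

def mccv(xs, ys, test_fraction=0.3, repeats=200, seed=0):
    """Monte Carlo CV: average error over random hold-out subsets."""
    rng = random.Random(seed)
    n = len(xs)
    k = max(1, round(test_fraction * n))
    total = 0.0
    for _ in range(repeats):
        total += holdout_mse(rng.sample(range(n), k), xs, ys)
    return total / repeats

# Synthetic data: a straight line plus Gaussian noise (not flood data).
noise = random.Random(1)
xs = [float(i) for i in range(20)]
ys = [2.0 + 0.5 * x + noise.gauss(0.0, 1.0) for x in xs]
print(f"LOO MSE  = {loo_cv(xs, ys):.3f}")
print(f"MCCV MSE = {mccv(xs, ys):.3f}")
```

Because MCCV trains on a smaller share of the data in each repeat, it tends to penalise over-parameterised models more than LOO does, which is in line with the paper's finding that MCCV favours more parsimonious models.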
Validity of the Medical College Admission Test for Predicting MD-PhD Student Outcomes

ERIC Educational Resources Information Center

Bills, James L.; VanHouten, Jacob; Grundy, Michelle M.; Chalkley, Roger; Dermody, Terence S.

2016-01-01

The Medical College Admission Test (MCAT) is a quantitative metric used by MD and MD-PhD programs to evaluate applicants for admission. This study assessed the validity of the MCAT in predicting training performance measures and career outcomes for MD-PhD students at a single institution. The study population consisted of 153 graduates of the…
Project Evaluation: Validation of a Scale and Analysis of Its Predictive Capacity

ERIC Educational Resources Information Center

Fernandes Malaquias, Rodrigo; de Oliveira Malaquias, Fernanda Francielle

2014-01-01

The objective of this study was to validate a scale for assessment of academic projects. As a complement, we examined its predictive ability by comparing the scores of advised/corrected projects based on the model and the final scores awarded to the work by an examining panel (approximately 10 months after the project design). Results of…
Cross-cultural adaptation and psychometric evaluation of oral health impact profile among school teacher community

PubMed Central

Vyas, Shaleen; Nagarajappa, Sandesh; Dasar, Pralhad L.; Mishra, Prashant

2018-01-01

AIM: To translate OHIP-14 into Hindi and test its psychometric properties among school teacher community. METHODS: The OHIP-14 was translated to OHIP-14-H using WHO recommended translation protocol. During pre-testing, an expert panel assessed content validity of the questionnaire. Face validity was assessed on a sample of 10 individuals. The OHIP-14-H was administered on a random sample of 170 primary school teachers. Internal consistency and test-retest reliability were assessed using Cronbach's alpha and Intra-class correlation coefficient (ICC) respectively, with 2 weeks interval. Predictive validity was tested by comparing OHIP-14-H scores with clinical parameters. The concurrent validity was assessed using self-reported oral health and discriminant validity was ascertained through negative association with sociodemographic variables. RESULTS: The mean OHIP-14-H score was 9.57 (S.D = 4.58). ICC and Cronbach's alpha for OHIP-14-H was 0.96 and 0.92 respectively. Concurrent validity using binomial regression model indicated that good (OR = 0.56, 95% CI = 0.55 – 4.47) and moderate (OR = 0.25, 95% CI = 0.17 – 1.87) OHIP-14-H scores were negative but significant risk indicators of poor self reported oral health (P < 0.009). Significant predictive validity was observed between OHIP-14-H scores and clinical parameters (P < 0.000). CONCLUSION: Translated and culturally adapted OHIP-14-H indicates good reliability and validity among primary school teachers. PMID:29417064
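The internal-consistency figure reported above, Cronbach's alpha, compares the sum of the item variances with the variance of the total score: alpha = k/(k-1) × (1 − Σ item variances / variance of totals) for k items. The following sketch applies that formula to a small, hypothetical response matrix (five respondents, three items); it does not use OHIP-14-H data, and the function name is an assumption.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    items = list(zip(*responses))                      # one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical responses: 5 respondents x 3 items (not study data).
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
    [5, 4, 5],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Alpha approaches 1 when items rise and fall together across respondents, so a value such as the 0.92 reported for the OHIP-14-H indicates that the items behave as a coherent scale.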

Statistical validation of predictive TRANSP simulations of baseline discharges in preparation for extrapolation to JET D-T

NASA Astrophysics Data System (ADS)

Kim, Hyun-Tae; Romanelli, M.; Yuan, X.; Kaye, S.; Sips, A. C. C.; Frassinetti, L.; Buchanan, J.; Contributors, JET

2017-06-01

This paper presents for the first time a statistical validation of predictive TRANSP simulations of plasma temperature using two transport models, GLF23 and TGLF, over a database of 80 baseline H-mode discharges in JET-ILW. While the accuracy of the predicted T_e with TRANSP-GLF23 is affected by plasma collisionality, the dependency of predictions on collisionality is less significant when using TRANSP-TGLF, indicating that the latter model has a broader applicability across plasma regimes. TRANSP-TGLF also shows a good matching of predicted T_i with experimental measurements allowing for a more accurate prediction of the neutron yields. The impact of input data and assumptions prescribed in the simulations are also investigated in this paper. The statistical validation and the assessment of uncertainty level in predictive TRANSP simulations for JET-ILW-DD will constitute the basis for the extrapolation to JET-ILW-DT experiments.
Validation of a 4-item Negative Symptom Assessment (NSA-4): a short, practical clinical tool for the assessment of negative symptoms in schizophrenia.

PubMed

Alphs, Larry; Morlock, Robert; Coon, Cheryl; Cazorla, Pilar; Szegedi, Armin; Panagides, John

2011-06-01

The 16-item Negative Symptom Assessment (NSA-16) scale is a validated tool for evaluating negative symptoms of schizophrenia. The psychometric properties and predictive power of a four-item version (NSA-4) were compared with the NSA-16. Baseline data from 561 patients with predominant negative symptoms of schizophrenia who participated in two identically designed clinical trials were evaluated. Ordered logistic regression analysis of ratings using NSA-4 and NSA-16 were compared with ratings using several other standard tools to determine predictive validity and construct validity. Internal consistency and test-retest reliability were also analyzed. NSA-16 and NSA-4 scores were both predictive of scores on the NSA global rating (odds ratio = 0.83-0.86) and the Clinical Global Impressions-Severity scale (odds ratio = 0.91-0.93). NSA-16 and NSA-4 showed high correlation with each other (Pearson r = 0.85), similar high correlation with other measures of negative symptoms (demonstrating convergent validity), and lesser correlations with measures of other forms of psychopathology (demonstrating divergent validity). NSA-16 and NSA-4 both showed acceptable internal consistency (Cronbach α, 0.85 and 0.64, respectively) and test-retest reliability (intraclass correlation coefficient, 0.87 and 0.82). This study demonstrates that NSA-4 offers accuracy comparable to the NSA-16 in rating negative symptoms in patients with schizophrenia. Copyright © 2011 John Wiley & Sons, Ltd.
Development and validation of a risk assessment tool for gastric cancer in a general Japanese population.

PubMed

Iida, Masahiro; Ikeda, Fumie; Hata, Jun; Hirakawa, Yoichiro; Ohara, Tomoyuki; Mukai, Naoko; Yoshida, Daigo; Yonemoto, Koji; Esaki, Motohiro; Kitazono, Takanari; Kiyohara, Yutaka; Ninomiya, Toshiharu

2018-05-01

There have been very few reports of risk score models for the development of gastric cancer. The aim of this study was to develop and validate a risk assessment tool for discerning future gastric cancer risk in Japanese. A total of 2444 subjects aged 40 years or over were followed up for 14 years from 1988 (derivation cohort), and 3204 subjects of the same age group were followed up for 5 years from 2002 (validation cohort). The weighting (risk score) of each risk factor for predicting future gastric cancer in the risk assessment tool was determined based on the coefficients of a Cox proportional hazards model in the derivation cohort. The goodness of fit of the established risk assessment tool was assessed using the c-statistic and the Hosmer-Lemeshow test in the validation cohort. During the follow-up, gastric cancer developed in 90 subjects in the derivation cohort and 35 subjects in the validation cohort. In the derivation cohort, the risk prediction model for gastric cancer was established using significant risk factors: age, sex, the combination of Helicobacter pylori antibody and pepsinogen status, hemoglobin A1c level, and smoking status. The incidence of gastric cancer increased significantly as the sum of risk scores increased (P trend < 0.001). The risk assessment tool was validated internally and showed good discrimination (c-statistic = 0.76) and calibration (Hosmer-Lemeshow test P = 0.43) in the validation cohort. We developed a risk assessment tool for gastric cancer that provides a useful guide for stratifying an individual's risk of future gastric cancer.
Applicability Analysis of Validation Evidence for Biomedical Computational Models

DOE PAGES

Pathmanathan, Pras; Gray, Richard A.; Romero, Vicente J.; ...

2017-09-07

Computational modeling has the potential to revolutionize medicine the way it transformed engineering. However, despite decades of work, there has only been limited progress to successfully translate modeling research to patient care. One major difficulty which often occurs with biomedical computational models is an inability to perform validation in a setting that closely resembles how the model will be used. For example, for a biomedical model that makes in vivo clinically relevant predictions, direct validation of predictions may be impossible for ethical, technological, or financial reasons. Unavoidable limitations inherent to the validation process lead to challenges in evaluating the credibility of biomedical model predictions. Therefore, when evaluating biomedical models, it is critical to rigorously assess applicability, that is, the relevance of the computational model, and its validation evidence to the proposed context of use (COU). However, there are no well-established methods for assessing applicability. In this paper, we present a novel framework for performing applicability analysis and demonstrate its use with a medical device computational model. The framework provides a systematic, step-by-step method for breaking down the broad question of applicability into a series of focused questions, which may be addressed using supporting evidence and subject matter expertise. The framework can be used for model justification, model assessment, and validation planning. While motivated by biomedical models, it is relevant to a broad range of disciplines and underlying physics. Finally, the proposed applicability framework could help overcome some of the barriers inherent to validation of, and aid clinical implementation of, biomedical models.
Predicting the ungauged basin: model validation and realism assessment

NASA Astrophysics Data System (ADS)

van Emmerik, Tim; Mulder, Gert; Eilander, Dirk; Piet, Marijn; Savenije, Hubert

2016-04-01

The hydrological decade on Predictions in Ungauged Basins (PUB) [1] led to many new insights in model development, calibration strategies, data acquisition and uncertainty analysis. Due to a limited amount of published studies on genuinely ungauged basins, model validation and realism assessment of model outcome has not been discussed to a great extent. With this study [2] we aim to contribute to the discussion on how one can determine the value and validity of a hydrological model developed for an ungauged basin. As in many cases no local, or even regional, data are available, alternative methods should be applied. Using a PUB case study in a genuinely ungauged basin in southern Cambodia, we give several examples of how one can use different types of soft data to improve model design, calibrate and validate the model, and assess the realism of the model output. A rainfall-runoff model was coupled to an irrigation reservoir, allowing the use of additional and unconventional data. The model was mainly forced with remote sensing data, and local knowledge was used to constrain the parameters. Model realism assessment was done using data from surveys. This resulted in a successful reconstruction of the reservoir dynamics, and revealed the different hydrological characteristics of the two topographical classes. We do not present a generic approach that can be transferred to other ungauged catchments, but we aim to show how clever model design and alternative data acquisition can result in a valuable hydrological model for ungauged catchments. [1] Sivapalan, M., Takeuchi, K., Franks, S., Gupta, V., Karambiri, H., Lakshmi, V., et al. (2003). IAHS decade on predictions in ungauged basins (PUB), 2003-2012: shaping an exciting future for the hydrological sciences. Hydrol. Sci. J. 48, 857-880. doi: 10.1623/hysj.48.6.857.51421 [2] van Emmerik, T., Mulder, G., Eilander, D., Piet, M. and Savenije, H. (2015). Predicting the ungauged basin: model validation and realism assessment. Front. Earth Sci. 3:62. doi: 10.3389/feart.2015.00062
Prediction of prostate cancer in unscreened men: external validation of a risk calculator.

PubMed

van Vugt, Heidi A; Roobol, Monique J; Kranse, Ries; Määttänen, Liisa; Finne, Patrik; Hugosson, Jonas; Bangma, Chris H; Schröder, Fritz H; Steyerberg, Ewout W

2011-04-01

Prediction models need external validation to assess their value beyond the setting where the model was derived from. To assess the external validity of the European Randomized study of Screening for Prostate Cancer (ERSPC) risk calculator (www.prostatecancer-riskcalculator.com) for the probability of having a positive prostate biopsy (P(posb)). The ERSPC risk calculator was based on data of the initial screening round of the ERSPC section Rotterdam and validated in 1825 and 531 men biopsied at the initial screening round in the Finnish and Swedish sections of the ERSPC respectively. P(posb) was calculated using serum prostate specific antigen (PSA), outcome of digital rectal examination (DRE), transrectal ultrasound and ultrasound assessed prostate volume. The external validity was assessed for the presence of cancer at biopsy by calibration (agreement between observed and predicted outcomes), discrimination (separation of those with and without cancer), and decision curves (for clinical usefulness). Prostate cancer was detected in 469 men (26%) of the Finnish cohort and in 124 men (23%) of the Swedish cohort. Systematic miscalibration was present in both cohorts (mean predicted probability 34% versus 26% observed, and 29% versus 23% observed, both p<0.001). The areas under the curves were 0.76 and 0.78, and substantially lower for the model with PSA only (0.64 and 0.68 respectively). The model proved clinically useful for any decision threshold compared with a model with PSA only, PSA and DRE, or biopsying all men. A limitation is that the model is based on sextant biopsies results. The ERSPC risk calculator discriminated well between those with and without prostate cancer among initially screened men, but overestimated the risk of a positive biopsy. Further research is necessary to assess the performance and applicability of the ERSPC risk calculator when a clinical setting is considered rather than a screening setting. Copyright © 2010 Elsevier Ltd. All rights reserved.
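The miscalibration reported above can be checked with simple arithmetic: the observed cancer rate in each cohort is the number of cancers divided by the number of men biopsied, and it is compared with the mean predicted probability. The sketch below uses only the counts and percentages quoted in the abstract; the dictionary layout and function name are hypothetical, and the predicted/observed ratio is an added summary that the abstract does not report.

```python
# Counts and mean predicted probabilities as quoted in the abstract above.
COHORTS = {
    "Finnish": {"biopsied": 1825, "cancers": 469, "mean_predicted": 0.34},
    "Swedish": {"biopsied": 531, "cancers": 124, "mean_predicted": 0.29},
}

def calibration_in_the_large(cohort):
    """Return (observed rate, predicted minus observed, predicted / observed)."""
    observed = cohort["cancers"] / cohort["biopsied"]
    predicted = cohort["mean_predicted"]
    return observed, predicted - observed, predicted / observed

for name, cohort in COHORTS.items():
    observed, gap, ratio = calibration_in_the_large(cohort)
    print(f"{name}: observed {observed:.1%}, predicted {cohort['mean_predicted']:.0%}, "
          f"gap {gap:+.1%}, predicted/observed {ratio:.2f}")
```

In both cohorts the mean prediction exceeds the observed rate, which is the overestimation the authors describe; discrimination (the AUCs of 0.76 and 0.78) is unaffected by such a uniform shift.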
Predicting aggressive behaviour in acute forensic mental health units: A re-examination of the dynamic appraisal of situational aggression's predictive validity.

PubMed

Maguire, Tessa; Daffern, Michael; Bowe, Steven J; McKenna, Brian

2017-10-01

In the present study, we explored the predictive validity of the Dynamic Appraisal of Situational Aggression (DASA) assessment tool in male (n = 30) and female (n = 30) patients admitted to the acute units of a forensic mental health hospital. We also tested the psychometric properties of the original DASA bands and novel risk bands. The first 60 days of each patient's file was reviewed to identify daily DASA scores and subsequent risk-related nursing interventions and aggressive behaviour within the following 24 hours. Risk assessments, followed by documented nursing interventions, were removed to preserve the integrity of the risk-assessment analysis. Receiver-operator characteristics were used to test the predictive accuracy of the DASA, and generalized estimating equations (GEE) were used to account for repeated risk assessments, which occurs when analysing short-term risk-assessment data. The results revealed modest predictive validity for males and females. GEE analyses suggested the need to adjust the DASA risk bands to the following (with associated odds ratios (OR) for aggressive behaviour): 0 = low risk; 1, 2, 3 = moderate-risk OR, 4.70 (95% confidence interval (CI): 2.84-7.80); and 4, 5, 6, 7 = high-risk OR, 16.13 (95% CI: 9.71-26.78). The adjusted DASA risk bands could assist nurses by prompting violence-prevention interventions when the level of risk is elevated. © 2017 Australian College of Mental Health Nurses Inc.
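The adjusted risk bands proposed above amount to a simple lookup from the daily DASA total (0 to 7) to a band. A minimal sketch follows; the function and constant names are hypothetical, and the assumption that the quoted odds ratios are relative to the low-risk band is an inference from the abstract rather than something it states.

```python
# Odds ratios for aggressive behaviour as quoted in the abstract above.
# Assumption: they are expressed relative to the low-risk band (score 0).
BAND_ODDS_RATIO = {"low": 1.0, "moderate": 4.70, "high": 16.13}

def adjusted_dasa_band(score):
    """Map a DASA total score (integer 0-7) to the adjusted risk band."""
    if not isinstance(score, int) or not 0 <= score <= 7:
        raise ValueError("DASA total score must be an integer from 0 to 7")
    if score == 0:
        return "low"
    if score <= 3:
        return "moderate"
    return "high"

for score in range(8):
    band = adjusted_dasa_band(score)
    print(score, band, BAND_ODDS_RATIO[band])
```

In practice a ward system might use the band to prompt violence-prevention interventions, which is the use the authors suggest for the adjusted bands.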
Development and Validation of the Narrative Quality Assessment Tool.

PubMed

Kim, Wonsun Sunny; Shin, Cha-Nam; Kathryn Larkey, Linda; Roe, Denise J

2017-04-01

The use of storytelling in health promotion has grown over the past 2 decades, showing promise for moving people to initiate healthy behavior change. Given the increasingly prevalent role of storytelling in health promotion research and the need to more clearly identify what storytelling elements and mediators may better predict behavior change, there is a need to develop measures to specifically assess these factors in a cultural community context. The purpose of this study is to develop and preliminarily validate a narrative quality assessment tool for measuring elements of storytelling that are predicted to affect attitude and behavior change (i.e., narrative characteristics, identification, and transportation) within a cultural community setting using a culture-centric model. Reliability and validity of these scales were assessed with repeated administrations among 74 Latino men and women with a mean age of 39.6 years (SD = 11.47 years). The confirmatory factor analysis in addition to internal consistency tests revealed preliminary evidence for reliability and validity of the narrative characteristics, identification, and transportation scales. Cronbach's alpha ranged from .92 to .94. Items revealed adequate factor loadings (.85-.98) and good model fit. The new scales provide the first step in moving the assessment of narrative quality into a culturally relevant context for evaluation of story use in health promotion. The results present valuable information for nurse researchers to guide the development and testing of culturally grounded storytelling interventions' potential to predict attitude and behavior change for patients.
Development and Validation of a New Methodology to Assess the Vineyard Water Status by On-the-Go Near Infrared Spectroscopy

PubMed Central

Diago, Maria P.; Fernández-Novales, Juan; Gutiérrez, Salvador; Marañón, Miguel; Tardaguila, Javier

2018-01-01

Assessing water status and optimizing irrigation is of utmost importance in most winegrowing countries, as the grapevine vegetative growth, yield, and grape quality can be impaired under certain water stress situations. Conventional plant-based methods for water status monitoring are either destructive or time and labor demanding, and are therefore unsuited to detecting the spatial variation of moisture content within a vineyard plot. In this context, this work aims at the development and comprehensive validation of a novel, non-destructive methodology to assess the vineyard water status distribution using on-the-go, contactless, near infrared (NIR) spectroscopy. Likewise, plant water status prediction models were built and intensely validated using the stem water potential (ψs) as the gold standard. Predictive models were developed making use of a vast number of measurements, acquired on 15 dates with diverse environmental conditions, at two different spatial scales, on both sides of vertical shoot positioned canopies, over two consecutive seasons. Different cross-validation strategies were also tested and compared. Predictive models built from east-acquired spectra yielded the best performance indicators in both seasons, with determination coefficient of prediction (R²P) ranging from 0.68 to 0.85, and sensitivity (expressed as prediction root mean square error) between 0.131 and 0.190 MPa, regardless of the spatial scale. These predictive models were implemented to map the spatial variability of the vineyard water status at two different dates, and provided useful, practical information to help delineate specific irrigation schedules. The performance and the large amount of data that this on-the-go spectral solution provides facilitate the exploitation of this non-destructive technology to monitor and map the vineyard water status variability with high spatial and temporal resolution, in the context of precision and sustainable viticulture. PMID:29441086
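The two performance indicators quoted above, the coefficient of determination of prediction and the prediction root mean square error, can be computed from paired observed and predicted stem water potentials. The sketch below shows one common formulation, with R² taken as 1 minus the ratio of residual to total sum of squares; the abstract does not say which formulation the authors used, and the five ψs values are hypothetical.

```python
from math import sqrt

def rmse(observed, predicted):
    """Root mean square error of prediction, in the units of the observations."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Hypothetical stem water potentials in MPa (not study data).
observed = [-0.60, -0.85, -1.10, -1.30, -0.95]
predicted = [-0.70, -0.80, -1.00, -1.40, -1.05]

print(f"RMSEP = {rmse(observed, predicted):.3f} MPa")
print(f"R2 of prediction = {r_squared(observed, predicted):.2f}")
```

The RMSE is expressed in MPa and so can be compared directly with the 0.131 to 0.190 MPa range in the abstract, whereas R² is unitless and depends on how widely the observed values vary.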
Validating a Predictive Model of Acute Advanced Imaging Biomarkers in Ischemic Stroke.

PubMed

Bivard, Andrew; Levi, Christopher; Lin, Longting; Cheng, Xin; Aviv, Richard; Spratt, Neil J; Lou, Min; Kleinig, Tim; O'Brien, Billy; Butcher, Kenneth; Zhang, Jingfen; Jannes, Jim; Dong, Qiang; Parsons, Mark

2017-03-01

Advanced imaging to identify tissue pathophysiology may provide more accurate prognostication than the clinical measures used currently in stroke. This study aimed to derive and validate a predictive model for functional outcome based on acute clinical and advanced imaging measures. A database of prospectively collected sub-4.5 hour patients with ischemic stroke being assessed for thrombolysis from 5 centers who had computed tomographic perfusion and computed tomographic angiography before a treatment decision was assessed. Individual variable cut points were derived from a classification and regression tree analysis. The optimal cut points for each assessment variable were then used in a backward logic regression to predict modified Rankin scale (mRS) score of 0 to 1 and 5 to 6. The variables remaining in the models were then assessed using a receiver operating characteristic curve analysis. Overall, 1519 patients were included in the study, 635 in the derivation cohort and 884 in the validation cohort. The model was highly accurate at predicting mRS score of 0 to 1 in all patients considered for thrombolysis therapy (area under the curve [AUC] 0.91), those who were treated (AUC 0.88) and those with recanalization (AUC 0.89). Next, the model was highly accurate at predicting mRS score of 5 to 6 in all patients considered for thrombolysis therapy (AUC 0.91), those who were treated (0.89) and those with recanalization (AUC 0.91). The odds ratio of thrombolysed patients who met the model criteria achieving mRS score of 0 to 1 was 17.89 (4.59-36.35, P <0.001) and for mRS score of 5 to 6 was 8.23 (2.57-26.97, P <0.001). This study has derived and validated a highly accurate model at predicting patient outcome after ischemic stroke. © 2017 American Heart Association, Inc.
Model Verification and Validation Concepts for a Probabilistic Fracture Assessment Model to Predict Cracking of Knife Edge Seals in the Space Shuttle Main Engine High Pressure Oxidizer

NASA Technical Reports Server (NTRS)

Pai, Shantaram S.; Riha, David S.

2013-01-01

Physics-based models are routinely used to predict the performance of engineered systems to make decisions such as when to retire system components, how to extend the life of an aging system, or if a new design will be safe or available. Model verification and validation (V&V) is a process to establish credibility in model predictions. Ideally, carefully controlled validation experiments will be designed and performed to validate models or submodels. In reality, time and cost constraints limit experiments and even model development. This paper describes elements of model V&V during the development and application of a probabilistic fracture assessment model to predict cracking in space shuttle main engine high-pressure oxidizer turbopump knife-edge seals. The objective of this effort was to assess the probability of initiating and growing a crack to a specified failure length in specific flight units for different usage and inspection scenarios. The probabilistic fracture assessment model developed in this investigation combined a series of submodels describing the usage, temperature history, flutter tendencies, tooth stresses and numbers of cycles, fatigue cracking, nondestructive inspection, and finally the probability of failure. The analysis accounted for unit-to-unit variations in temperature, flutter limit state, flutter stress magnitude, and fatigue life properties. The investigation focused on the calculation of relative risk rather than absolute risk between the usage scenarios. Verification predictions were first performed for three units with known usage and cracking histories to establish credibility in the model predictions. Then, numerous predictions were performed for an assortment of operating units that had flown recently or that were projected for future flights. 
Calculations were performed using two NASA-developed software tools: NESSUS® for the probabilistic analysis, and NASGRO® for the fracture mechanics analysis. The goal of these predictions was to provide additional information to guide decisions on the potential of reusing existing and installed units prior to the new design certification.
The Validity of Interpersonal Skills Assessment via Situational Judgment Tests for Predicting Academic Success and Job Performance

ERIC Educational Resources Information Center

Lievens, Filip; Sackett, Paul R.

2012-01-01

This study provides conceptual and empirical arguments why an assessment of applicants' procedural knowledge about interpersonal behavior via a video-based situational judgment test might be valid for academic and postacademic success criteria. Four cohorts of medical students (N = 723) were followed from admission to employment. Procedural…
Assessing Risk for Sexual Offenders in New Zealand: Development and Validation of a Computer-Scored Risk Measure

ERIC Educational Resources Information Center

Skelton, Alexander; Riley, David; Wales, David; Vess, James

2006-01-01

A growing research base supports the predictive validity of actuarial methods of risk assessment with sexual offenders. These methods use clearly defined variables with demonstrated empirical association with re-offending. The advantages of actuarial measures for screening large numbers of offenders quickly and economically are further enhanced…
The Optimal Screening for Prediction of Referral and Outcome (OSPRO) in patients with musculoskeletal pain conditions: a longitudinal validation cohort from the USA

PubMed Central

George, Steven Z; Beneciuk, Jason M; Lentz, Trevor A; Wu, Samuel S

2017-01-01

Purpose: There is an increased need for determining which patients with musculoskeletal pain benefit from additional diagnostic testing or psychologically informed intervention. The Optimal Screening for Prediction of Referral and Outcome (OSPRO) cohort studies were designed to develop and validate standard assessment tools for review of systems and yellow flags. This cohort profile paper provides a description of and future plans for the validation cohort. Participants: Patients (n=440) with primary complaint of spine, shoulder or knee pain were recruited into the OSPRO validation cohort via a national Orthopaedic Physical Therapy-Investigative Network. Patients were followed up at 4 weeks, 6 months and 12 months for pain, functional status and quality of life outcomes. Healthcare utilisation outcomes were also collected at 6 and 12 months. Findings to date: There are no longitudinal findings reported to date from the ongoing OSPRO validation cohort. The previously completed cross-sectional OSPRO development cohort yielded two assessment tools that were investigated in the validation cohort. Future plans: Follow-up data collection was completed in January 2017. Primary analyses will investigate how accurately the OSPRO review of systems and yellow flag tools predict 12-month pain, functional status, quality of life and healthcare utilisation outcomes. Planned secondary analyses include prediction of pain interference and/or development of chronic pain, investigation of treatment expectation on patient outcomes and analysis of patient satisfaction following an episode of physical therapy. Trial registration number: The OSPRO validation cohort was not registered. PMID:28600371
Validating proposed migration equation and parameters' values as a tool to reproduce and predict 137Cs vertical migration activity in Spanish soils.

PubMed

Olondo, C; Legarda, F; Herranz, M; Idoeta, R

2017-04-01

This paper describes the procedure used to validate the migration equation and the migration parameter values presented in a previous paper (Legarda et al., 2011) regarding the migration of 137Cs in Spanish mainland soils. The model was validated by checking experimentally obtained activity concentration values against those predicted by the model. The experimental data come from the measured vertical activity profiles of 8 new sampling points located in northern Spain. Before the model's predicted values were tested, their uncertainty was assessed with an appropriate uncertainty analysis. Once the uncertainty of the model had been established, the experimental and model-predicted activity concentration values were compared. Model validation was performed by analyzing its accuracy, both as a whole and at different depth intervals. As a result, the model has been validated as a tool to predict 137Cs behaviour in a Mediterranean environment. Copyright © 2017 Elsevier Ltd. All rights reserved.
Are There Sex Differences in the Predictive Validity of DSM-IV ADHD among Younger Children?

ERIC Educational Resources Information Center

Lahey, Benjamin B.; Hartung, Cynthia M.; Loney, Jan; Pelham, William E.; Chronis, Andrea M.; Lee, Steve S.

2007-01-01

We assessed the predictive validity of attention-deficit/hyperactivity disorder (ADHD) in 20 girls and 98 boys who met the Diagnostic and Statistical Manual for Mental Disorders (4th ed., American Psychiatric Association, 1994) criteria for ADHD at 4 to 6 years of age compared to 24 female and 102 male comparison children. Over the next 8 years,…
The Predictive Validity of Using Admissions Testing and Multiple Mini-Interviews in Undergraduate University Admissions

ERIC Educational Resources Information Center

Makransky, Guido; Havmose, Philip; Vang, Maria Louison; Andersen, Tonny Elmose; Nielsen, Tine

2017-01-01

The aim of this study was to evaluate the predictive validity of a two-step admissions procedure that included a cognitive ability test followed by multiple mini-interviews (MMIs) used to assess non-cognitive skills, compared to grade-based admissions relative to subsequent drop-out rates and academic achievement after one and two years of study.…
Early Identification of Children at Risk for Academic Difficulties Using Standardized Assessment: Stability and Predictive Validity of Preschool Math and Language Scores

ERIC Educational Resources Information Center

Frans, Niek; Post, Wendy J.; Huisman, Mark; Oenema-Mostert, Ineke C. E.; Keegstra, Anne L.; Minnaert, Alexander E. M. G.

2017-01-01

Despite the claim by several researchers that variability in performance may complicate the identification of "at-risk" children, variability in the academic performance of young children remains an undervalued area of research. The goal of this study is to examine the predictive validity for future scores and the score stability of two…

The Street Level Built Environment and Physical Activity and Walking: Results of a Predictive Validity Study for the Irvine Minnesota Inventory

ERIC Educational Resources Information Center

Boarnet, Marlon G.; Forsyth, Ann; Day, Kristen; Oakes, J. Michael

2011-01-01

The Irvine Minnesota Inventory (IMI) was designed to measure environmental features that may be associated with physical activity and particularly walking. This study assesses how well the IMI predicts physical activity and walking behavior and develops shortened, validated audit tools. A version of the IMI was used in the Twin Cities Walking…
Using Structural Equation Modeling to Validate the Theory of Planned Behavior as a Model for Predicting Student Cheating

ERIC Educational Resources Information Center

Mayhew, Matthew J.; Hubbard, Steven M.; Finelli, Cynthia J.; Harding, Trevor S.; Carpenter, Donald D.

2009-01-01

The purpose of this paper is to validate the use of a modified Theory of Planned Behavior (TPB) for predicting undergraduate student cheating. Specifically, we administered a survey assessing how the TPB relates to cheating along with a measure of moral reasoning (DIT-2) to 527 undergraduate students across three institutions; and analyzed the…
The Wisconsin Predicting Patients' Relapse questionnaire

PubMed Central

Bolt, Daniel M.; McCarthy, Danielle E.; Japuntich, Sandra J.; Fiore, Michael C.; Smith, Stevens S.; Baker, Timothy B.

2009-01-01

Introduction: Relapse is the most common smoking cessation outcome. Accurate prediction of relapse likelihood could be an important clinical tool used to influence treatment selection or duration. The aim of this research was to develop a brief clinical relapse proneness questionnaire to be used with smokers interested in quitting in a clinical setting where time is at a premium. Methods: Diverse items assessing constructs shown in previous research to be related to relapse risk, such as nicotine dependence and self-efficacy, were evaluated to determine their independent contributions to relapse prediction. In an exploratory dataset, candidate items were assessed among smokers motivated to quit smoking who enrolled in one of three randomized controlled smoking cessation trials. A cross-validation dataset was used to compare the relative predictive power of the new instrument against the Fagerström Test for Nicotine Dependence (FTND) at 1-week, 8-week, and 6-month postquit assessments. Results: We selected seven items with relatively nonoverlapping content for the Wisconsin Predicting Patient's Relapse (WI-PREPARE) measure, a brief, seven-item questionnaire that taps physical dependence, environmental factors, and individual difference characteristics. Cross-validation analyses suggested that the WI-PREPARE demonstrated a stronger prediction of relapse at 1-week and 8-week postquit assessments than the FTND and comparable prediction to the FTND at a 6-month postquit assessment. Discussion: The WI-PREPARE is easy to score, suggests the nature of a patient's relapse risk, and predicts short- and medium-term relapse better than the FTND. PMID:19372573
Development and validation of risk models and molecular diagnostics to permit personalized management of cancer.

PubMed

Pu, Xia; Ye, Yuanqing; Wu, Xifeng

2014-01-01

Despite the advances made in cancer management over the past few decades, improvements in cancer diagnosis and prognosis are still poor, highlighting the need for individualized strategies. Toward this goal, risk prediction models and molecular diagnostic tools have been developed, tailoring each step of risk assessment from diagnosis to treatment and clinical outcomes based on the individual's clinical, epidemiological, and molecular profiles. These approaches hold increasing promise for delivering a new paradigm to maximize the efficiency of cancer surveillance and efficacy of treatment. However, they require stringent study design, methodology development, comprehensive assessment of biomarkers and risk factors, and extensive validation to ensure their overall usefulness for clinical translation. In the current study, the authors conducted a systematic review using breast cancer as an example and provide general guidelines for risk prediction models and molecular diagnostic tools, including development, assessment, and validation. © 2013 American Cancer Society.
Reliability, validity, sensitivity and specificity of Guajarati version of the Roland-Morris Disability Questionnaire.

PubMed

Nambi, S Gopal

2013-01-01

The most common instrument developed to assess the functional status of patients with non-specific low back pain is the Roland-Morris Disability Questionnaire (RMDQ). Clinical and epidemiological research related to low back pain in the Gujarati population would be facilitated by the availability of well-established outcome measures. The aim was to determine the reliability, validity, sensitivity and specificity of the Gujarati version of the RMDQ for use in non-specific chronic low back pain. This was a reliability, validity, sensitivity and specificity study of the Gujarati version of the RMDQ. Thirty outpatients with non-specific chronic low back pain were assessed with the RMDQ. Reliability was assessed using internal consistency and the intra-class correlation coefficient (ICC). Internal construct validity was assessed by Rasch analysis and external construct validity by association with pain and spinal movement. A clinical calculator was used to determine the sensitivity and specificity. Internal consistency of the RMDQ was found to be adequate (> 0.65) at both time points, with high ICCs also at both time points. Internal construct validity of the scale was good, indicating a single underlying construct. Expected associations with pain and spinal movement confirmed external construct validity. At a cut-off point of 0.5, sensitivity and specificity were 80% and 84%, respectively, with a positive predictive value (PPV) of 83.33% and a negative predictive value (NPV) of 80.76%. The questionnaire is at the ordinal level. The RMDQ is a one-dimensional, ordinal measure that works well in the Gujarati population.
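The four diagnostic accuracy figures quoted in the record above are all ratios taken from a 2×2 table of test result against reference status. The abstract does not report that table, so the sketch below uses illustrative counts (an assumption, not the study's data) chosen only so that the resulting percentages land close to the quoted ones; the function name `diagnostic_metrics` is likewise hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV as fractions.

    tp/fp/fn/tn are the cells of a 2x2 table (test result vs. reference).
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all with the condition
        "specificity": tn / (tn + fp),  # true negatives among all without it
        "ppv": tp / (tp + fp),          # true positives among all positive tests
        "npv": tn / (tn + fn),          # true negatives among all negative tests
    }


# Illustrative counts only (assumption): not taken from the study.
m = diagnostic_metrics(tp=20, fp=4, fn=5, tn=21)
for name, value in m.items():
    print(f"{name}: {value:.2%}")
```

Unlike sensitivity and specificity, PPV and NPV depend on how common the condition is in the sample, so they would generally shift if the same questionnaire were applied to a population with a different prevalence.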
Automatic personality assessment through social media language.

PubMed

Park, Gregory; Schwartz, H Andrew; Eichstaedt, Johannes C; Kern, Margaret L; Kosinski, Michal; Stillwell, David J; Ungar, Lyle H; Seligman, Martin E P

2015-06-01

Language use is a psychologically rich, stable individual difference with well-established correlations to personality. We describe a method for assessing personality using an open-vocabulary analysis of language from social media. We compiled the written language from 66,732 Facebook users and their questionnaire-based self-reported Big Five personality traits, and then we built a predictive model of personality based on their language. We used this model to predict the 5 personality factors in a separate sample of 4,824 Facebook users, examining (a) convergence with self-reports of personality at the domain- and facet-level; (b) discriminant validity between predictions of distinct traits; (c) agreement with informant reports of personality; (d) patterns of correlations with external criteria (e.g., number of friends, political attitudes, impulsiveness); and (e) test-retest reliability over 6-month intervals. Results indicated that language-based assessments can constitute valid personality measures: they agreed with self-reports and informant reports of personality, added incremental validity over informant reports, adequately discriminated between traits, exhibited patterns of correlations with external criteria similar to those found with self-reported personality, and were stable over 6-month intervals. Analysis of predictive language can provide rich portraits of the mental life associated with traits. This approach can complement and extend traditional methods, providing researchers with an additional measure that can quickly and cheaply assess large groups of participants with minimal burden. (c) 2015 APA, all rights reserved.
Prediction models for the risk of spontaneous preterm birth based on maternal characteristics: a systematic review and independent external validation.

PubMed

Meertens, Linda J E; van Montfort, Pim; Scheepers, Hubertina C J; van Kuijk, Sander M J; Aardenburg, Robert; Langenveld, Josje; van Dooren, Ivo M A; Zwaan, Iris M; Spaanderman, Marc E A; Smits, Luc J M

2018-04-17

Prediction models may contribute to personalized risk-based management of women at high risk of spontaneous preterm delivery. Although prediction models are published frequently, often with promising results, external validation generally is lacking. We performed a systematic review of prediction models for the risk of spontaneous preterm birth based on routine clinical parameters. Additionally, we externally validated and evaluated the clinical potential of the models. Prediction models based on routinely collected maternal parameters obtainable during first 16 weeks of gestation were eligible for selection. Risk of bias was assessed according to the CHARMS guidelines. We validated the selected models in a Dutch multicenter prospective cohort study comprising 2614 unselected pregnant women. Information on predictors was obtained by a web-based questionnaire. Predictive performance of the models was quantified by the area under the receiver operating characteristic curve (AUC) and calibration plots for the outcomes spontaneous preterm birth <37 weeks and <34 weeks of gestation. Clinical value was evaluated by means of decision curve analysis and calculating classification accuracy for different risk thresholds. Four studies describing five prediction models fulfilled the eligibility criteria. Risk of bias assessment revealed a moderate to high risk of bias in three studies. The AUC of the models ranged from 0.54 to 0.67 and from 0.56 to 0.70 for the outcomes spontaneous preterm birth <37 weeks and <34 weeks of gestation, respectively. A subanalysis showed that the models discriminated poorly (AUC 0.51-0.56) for nulliparous women. Although we recalibrated the models, two models retained evidence of overfitting. The decision curve analysis showed low clinical benefit for the best performing models. This review revealed several reporting and methodological shortcomings of published prediction models for spontaneous preterm birth. 
Our external validation study indicated that none of the models had the ability to predict spontaneous preterm birth adequately in our population. Further improvement of prediction models, using recent knowledge about both model development and potential risk factors, is necessary to provide an added value in personalized risk assessment of spontaneous preterm birth. © 2018 The Authors Acta Obstetricia et Gynecologica Scandinavica published by John Wiley & Sons Ltd on behalf of Nordic Federation of Societies of Obstetrics and Gynecology (NFOG).
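The decision curve analysis mentioned in the record above compares strategies by their net benefit at a chosen risk threshold. A minimal sketch of the standard net benefit formula follows, assuming hypothetical counts (none are from the study): true positives are credited, and false positives are penalised by the odds of the threshold probability.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold.

    net benefit = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


# Hypothetical cohort (assumption): 1000 women, 50 with the outcome.
n = 1000
threshold = 0.10

# A model flagging 230 women, 30 of whom have the outcome.
nb_model = net_benefit(tp=30, fp=200, n=n, threshold=threshold)

# Reference strategy "treat all": every case is a TP, every non-case an FP.
nb_treat_all = net_benefit(tp=50, fp=950, n=n, threshold=threshold)

# Reference strategy "treat none" has net benefit 0 by definition.
print(round(nb_model, 4), round(nb_treat_all, 4))
```

A model offers clinical value at a given threshold only if its net benefit exceeds both reference strategies; repeating the calculation over a range of thresholds yields the decision curve.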
Recidivism in female offenders: PCL-R lifestyle factor and VRAG show predictive validity in a German sample.

PubMed

Eisenbarth, Hedwig; Osterheider, Michael; Nedopil, Norbert; Stadtland, Cornelis

2012-01-01

A clear and structured approach to evidence-based and gender-specific risk assessment of violence in female offenders is high on political and mental health agendas. However, most data on the factors involved in risk-assessment instruments are based on data of male offenders. The aim of the present study was to validate the use of the Psychopathy Checklist Revised (PCL-R), the HCR-20 and the Violence Risk Appraisal Guide (VRAG) for the prediction of recidivism in German female offenders. This study is part of the Munich Prognosis Project (MPP). It focuses on a subsample of female delinquents (n = 80) who had been referred for forensic-psychiatric evaluation prior to sentencing. The mean time at risk was 8 years (SD = 5 years; range: 1-18 years). During this time, 31% (n = 25) of the female offenders were reconvicted, 5% (n = 4) for violent and 26% (n = 21) for non-violent re-offenses. The predictive validity of the PCL-R for general recidivism was calculated. Analysis with receiver-operating characteristics revealed that the PCL-R total score, the PCL-R antisocial lifestyle factor, the PCL-R lifestyle factor and the PCL-R impulsive and irresponsible behavioral style factor had a moderate predictive validity for general recidivism (area under the curve, AUC = 0.66, p = 0.02). The VRAG has also demonstrated predictive validity (AUC = 0.72, p = 0.02), whereas the HCR-20 showed no predictive validity. These results appear to provide the first evidence that the PCL-R total score and the antisocial lifestyle factor are predictive for general female recidivism, as has been shown consistently for male recidivists. The implications of these findings for crime prevention, prognosis in women, and future research are discussed. Copyright © 2012 John Wiley & Sons, Ltd.
Food for Thought ... Mechanistic Validation

PubMed Central

Hartung, Thomas; Hoffmann, Sebastian; Stephens, Martin

2013-01-01

Summary: Validation of new approaches in regulatory toxicology is commonly defined as the independent assessment of the reproducibility and relevance (the scientific basis and predictive capacity) of a test for a particular purpose. In large ring trials, the emphasis to date has been mainly on reproducibility and predictive capacity (comparison to the traditional test) with less attention given to the scientific or mechanistic basis. Assessing predictive capacity is difficult for novel approaches (which are based on mechanism), such as pathways of toxicity or the complex networks within the organism (systems toxicology). This is highly relevant for implementing Toxicology for the 21st Century, either by high-throughput testing in the ToxCast/Tox21 project or omics-based testing in the Human Toxome Project. This article explores the mostly neglected assessment of a test's scientific basis, which moves mechanism and causality to the foreground when validating/qualifying tests. Such mechanistic validation faces the problem of establishing causality in complex systems. However, pragmatic adaptations of the Bradford Hill criteria, as well as bioinformatic tools, are emerging. As critical infrastructures of the organism are perturbed by a toxic mechanism we argue that by focusing on the target of toxicity and its vulnerability, in addition to the way it is perturbed, we can anchor the identification of the mechanism and its verification. PMID:23665802
Does a Measure of Support Needs Predict Funding Need Better Than a Measure of Adaptive and Maladaptive Behavior?

PubMed

Arnold, Samuel R C; Riches, Vivienne C; Stancliffe, Roger J

2015-09-01

Internationally, various approaches are used for the allocation of individualized funding. When using a data-based approach, a key question is the predictive validity of adaptive behavior versus support needs assessment. This article reports on a subset of data from a larger project that allowed for a comparison of support needs and adaptive behavior assessments when predicting person-centered funding allocation. The first phase of the project involved a trial of the Inventory for Client and Agency Planning (ICAP) adaptive behavior and Instrument for the Classification and Assessment of Support Needs (I-CAN)-Brief Research version support needs assessments. Participants were in receipt of an individual support package allocated using a person-centered planning process, and were stable in their support arrangements. Regression analysis showed that the most useful items in predicting funding allocation came from the I-CAN-Brief Research. No additional variance could be explained by adding the ICAP, or using the ICAP alone. A further unique approach of including only items from the I-CAN-Brief Research marked as funded supports showed high predictive validity. It appears support need is more effective at determining resource need than adaptive behavior.
National IQs Predict Educational Attainment in Math, Reading and Science across 56 Nations

ERIC Educational Resources Information Center

Lynn, Richard; Mikk, Jaan

2009-01-01

The results of the 2006 PISA (Program for International Student Assessment) study of reading comprehension, mathematical ability, and science understanding administered to 15 year olds in 56 countries [OECD (2007). PISA 2006: Science Competencies for Tomorrow's World. Paris: OECD.] are examined to assess the predictive validity of the national IQs…
An Evaluation of the Predictive Validity of Confidence Ratings in Identifying Functional Behavioral Assessment Hypothesis Statements

ERIC Educational Resources Information Center

Borgmeier, Chris; Horner, Robert H.

2006-01-01

Faced with limited resources, schools require tools that increase the accuracy and efficiency of functional behavioral assessment. Yarbrough and Carr (2000) provided evidence that informant confidence ratings of the likelihood of problem behavior in specific situations offered a promising tool for predicting the accuracy of function-based…
Early Prediction of Intensive Care Unit-Acquired Weakness: A Multicenter External Validation Study.

PubMed

Witteveen, Esther; Wieske, Luuk; Sommers, Juultje; Spijkstra, Jan-Jaap; de Waard, Monique C; Endeman, Henrik; Rijkenberg, Saskia; de Ruijter, Wouter; Sleeswijk, Mengalvio; Verhamme, Camiel; Schultz, Marcus J; van Schaik, Ivo N; Horn, Janneke

2018-01-01

An early diagnosis of intensive care unit-acquired weakness (ICU-AW) is often not possible due to impaired consciousness. To avoid a diagnostic delay, we previously developed a prediction model, based on single-center data from 212 patients (development cohort), to predict ICU-AW at 2 days after ICU admission. The objective of this study was to investigate the external validity of the original prediction model in a new, multicenter cohort and, if necessary, to update the model. Newly admitted ICU patients who were mechanically ventilated at 48 hours after ICU admission were included. Predictors were prospectively recorded, and the outcome ICU-AW was defined by an average Medical Research Council score <4. In the validation cohort, consisting of 349 patients, we analyzed performance of the original prediction model by assessment of calibration and discrimination. Additionally, we updated the model in this validation cohort. Finally, we evaluated a new prediction model based on all patients of the development and validation cohort. Of 349 analyzed patients in the validation cohort, 190 (54%) developed ICU-AW. Both model calibration and discrimination of the original model were poor in the validation cohort. The area under the receiver operating characteristics curve (AUC-ROC) was 0.60 (95% confidence interval [CI]: 0.54-0.66). Model updating methods improved calibration but not discrimination. The new prediction model, based on all patients of the development and validation cohort (total of 536 patients) had a fair discrimination, AUC-ROC: 0.70 (95% CI: 0.66-0.75). The previously developed prediction model for ICU-AW showed poor performance in a new independent multicenter validation cohort. Model updating methods improved calibration but not discrimination. The newly derived prediction model showed fair discrimination. This indicates that early prediction of ICU-AW is still challenging and needs further attention.
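Discrimination in the record above is summarised by the AUC-ROC, which can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The sketch below computes it by direct pairwise comparison; the scores and labels are made-up illustrative values, and `auc_rank` is a hypothetical helper name.

```python
def auc_rank(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative predicted risks and outcomes (assumption, not study data).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
auc = auc_rank(scores, labels)
print(auc)
```

An AUC of 0.5 corresponds to chance-level ranking. Because the AUC depends only on the ordering of predictions, recalibrating a model's intercept or slope typically leaves it unchanged, which is in line with the abstract's finding that updating improved calibration but not discrimination.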
Mapping the EORTC QLQ-C30 onto the EQ-5D-3L: assessing the external validity of existing mapping algorithms.

PubMed

Doble, Brett; Lorgelly, Paula

2016-04-01

To determine the external validity of existing mapping algorithms for predicting EQ-5D-3L utility values from EORTC QLQ-C30 responses and to establish their generalizability in different types of cancer. A main analysis (pooled) sample of 3560 observations (1727 patients) and two disease severity patient samples (496 and 93 patients) with repeated observations over time from Cancer 2015 were used to validate the existing algorithms. Errors were calculated between observed and predicted EQ-5D-3L utility values using a single pooled sample and ten pooled tumour type-specific samples. Predictive accuracy was assessed using mean absolute error (MAE) and standardized root-mean-squared error (RMSE). The association between observed and predicted EQ-5D utility values and other covariates across the distribution was tested using quantile regression. Quality-adjusted life years (QALYs) were calculated using observed and predicted values to test responsiveness. Ten 'preferred' mapping algorithms were identified. Two algorithms estimated via response mapping and ordinary least-squares regression using dummy variables performed well on a number of validation criteria, including accurate prediction of the best and worst QLQ-C30 health states, predicted values within the EQ-5D tariff range, relatively small MAEs and RMSEs, and minimal differences between estimated QALYs. Comparison of predictive accuracy across ten tumour type-specific samples highlighted that algorithms are relatively insensitive to grouping by tumour type and affected more by differences in disease severity. Two of the 'preferred' mapping algorithms suggest more accurate predictions, but limitations exist. We recommend extensive scenario analyses if mapped utilities are used in cost-utility analyses.
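The two error measures named in the record above can be sketched in a few lines. The utility values below are invented for illustration, and the abstract does not say how its RMSE was standardized, so this sketch shows only the plain (unstandardized) MAE and RMSE.

```python
import math


def mae(observed, predicted):
    """Mean absolute error between paired observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root-mean-squared error; penalises large errors more than MAE does."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


# Illustrative utility values (assumption): not taken from the study.
observed = [0.80, 0.60, 1.00, 0.40]
predicted = [0.70, 0.65, 0.90, 0.50]

mae_value = mae(observed, predicted)
rmse_value = rmse(observed, predicted)
print(round(mae_value, 4), round(rmse_value, 4))
```

RMSE is never smaller than MAE on the same data, and a wide gap between the two suggests that a few large prediction errors dominate.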
Acute Brain Dysfunction: Development and Validation of a Daily Prediction Model.

PubMed

Marra, Annachiara; Pandharipande, Pratik P; Shotwell, Matthew S; Chandrasekhar, Rameela; Girard, Timothy D; Shintani, Ayumi K; Peelen, Linda M; Moons, Karl G M; Dittus, Robert S; Ely, E Wesley; Vasilevskis, Eduard E

2018-03-24

The goal of this study was to develop and validate a dynamic risk model to predict daily changes in acute brain dysfunction (ie, delirium and coma), discharge, and mortality in ICU patients. Using data from a multicenter prospective ICU cohort, a daily acute brain dysfunction-prediction model (ABD-pm) was developed by using multinomial logistic regression that estimated 15 transition probabilities (from one of three brain function states [normal, delirious, or comatose] to one of five possible outcomes [normal, delirious, comatose, ICU discharge, or died]) using baseline and daily risk factors. Model discrimination was assessed by using predictive characteristics such as negative predictive value (NPV). Calibration was assessed by plotting empirical vs model-estimated probabilities. Internal validation was performed by using a bootstrap procedure. Data were analyzed from 810 patients (6,711 daily transitions). The ABD-pm included individual risk factors: mental status, age, preexisting cognitive impairment, baseline and daily severity of illness, and daily administration of sedatives. The model yielded very high NPVs for "next day" delirium (NPV: 0.823), coma (NPV: 0.892), normal cognitive state (NPV: 0.875), ICU discharge (NPV: 0.905), and mortality (NPV: 0.981). The model demonstrated outstanding calibration when predicting the total number of patients expected to be in any given state across predicted risk. We developed and internally validated a dynamic risk model that predicts the daily risk for one of three cognitive states, ICU discharge, or mortality. The ABD-pm may be useful for predicting the proportion of patients for each outcome state across entire ICU populations to guide quality, safety, and care delivery activities. Copyright © 2018 American College of Chest Physicians. Published by Elsevier Inc. All rights reserved.
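The 15 transition probabilities described in the record above arise from 3 current states times 5 possible next-day outcomes, with each row of probabilities produced by a multinomial logistic (softmax) model. The sketch below illustrates only that structure: the logits are fixed hypothetical numbers, whereas in a fitted model they would be functions of baseline and daily risk factors, and none of the values come from the ABD-pm.

```python
import math

STATES = ["normal", "delirious", "comatose"]
OUTCOMES = ["normal", "delirious", "comatose", "ICU discharge", "died"]


def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1."""
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits per current state (assumption, for illustration only).
LOGITS = {
    "normal": [2.0, 0.5, -1.0, 1.0, -3.0],
    "delirious": [0.5, 1.5, 0.0, -0.5, -2.0],
    "comatose": [-1.0, 0.5, 1.5, -2.0, -1.0],
}

# One row of next-day outcome probabilities for each current state.
transition = {s: dict(zip(OUTCOMES, softmax(LOGITS[s]))) for s in STATES}
n_transitions = sum(len(row) for row in transition.values())

for state, row in transition.items():
    print(state, {k: round(v, 3) for k, v in row.items()})
```

Each row sums to 1 because a patient must end up in exactly one of the five outcomes on the following day.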
The Validity and Incremental Validity of Knowledge Tests, Low-Fidelity Simulations, and High-Fidelity Simulations for Predicting Job Performance in Advanced-Level High-Stakes Selection

ERIC Educational Resources Information Center

Lievens, Filip; Patterson, Fiona

2011-01-01

In high-stakes selection among candidates with considerable domain-specific knowledge and experience, investigations of whether high-fidelity simulations (assessment centers; ACs) have incremental validity over low-fidelity simulations (situational judgment tests; SJTs) are lacking. Therefore, this article integrates research on the validity of…
Testing the Predictive Validity and Construct of Pathological Video Game Use

PubMed Central

Groves, Christopher L.; Gentile, Douglas; Tapscott, Ryan L.; Lynch, Paul J.

2015-01-01

Three studies assessed the construct of pathological video game use and tested its predictive validity. Replicating previous research, Study 1 produced evidence of convergent validity in 8th and 9th graders (N = 607) classified as pathological gamers. Study 2 replicated and extended the findings of Study 1 with college undergraduates (N = 504). Predictive validity was established in Study 3 by measuring cue reactivity to video games in college undergraduates (N = 254), such that pathological gamers were more emotionally reactive to and provided higher subjective appraisals of video games than non-pathological gamers and non-gamers. The three studies converged to show that pathological video game use seems similar to other addictions in its patterns of correlations with other constructs. Conceptual and definitional aspects of Internet Gaming Disorder are discussed. PMID:26694472
The reliability and validity of ultrasound to quantify muscles in older adults: a systematic review

PubMed Central

Scafoglieri, Aldo; Jager‐Wittenaar, Harriët; Hobbelen, Johannes S.M.; van der Schans, Cees P.

2017-01-01

This review evaluates the reliability and validity of ultrasound to quantify muscles in older adults. The databases PubMed, Cochrane, and Cumulative Index to Nursing and Allied Health Literature were systematically searched for studies. In 17 studies, the reliability (n = 13) and validity (n = 8) of ultrasound to quantify muscles in community‐dwelling older adults (≥60 years) or a clinical population were evaluated. Four out of 13 reliability studies investigated both intra‐rater and inter‐rater reliability. Intraclass correlation coefficient (ICC) scores for reliability ranged from −0.26 to 1.00. The highest ICC scores were found for the vastus lateralis, rectus femoris, upper arm anterior, and the trunk (ICC = 0.72 to 1.000). All included validity studies found ICC scores ranging from 0.92 to 0.999. Two studies describing the validity of ultrasound to predict lean body mass showed good validity as compared with dual‐energy X‐ray absorptiometry (r² = 0.92 to 0.96). This systematic review shows that ultrasound is a reliable and valid tool for the assessment of muscle size in older adults. More high‐quality research is required to confirm these findings in both clinical and healthy populations. Furthermore, ultrasound assessment of small muscles needs further evaluation. Ultrasound to predict lean body mass is feasible; however, future research is required to validate prediction equations in older adults with varying function and health. PMID:28703496
Predictors of early growth in academic achievement: the head-toes-knees-shoulders task

PubMed Central

McClelland, Megan M.; Cameron, Claire E.; Duncan, Robert; Bowles, Ryan P.; Acock, Alan C.; Miao, Alicia; Pratt, Megan E.

2014-01-01

Children's behavioral self-regulation and executive function (EF; including attentional or cognitive flexibility, working memory, and inhibitory control) are strong predictors of academic achievement. The present study examined the psychometric properties of a measure of behavioral self-regulation called the Head-Toes-Knees-Shoulders (HTKS) by assessing construct validity, including relations to EF measures, and predictive validity to academic achievement growth between prekindergarten and kindergarten. In the fall and spring of prekindergarten and kindergarten, 208 children (51% enrolled in Head Start) were assessed on the HTKS, measures of cognitive flexibility, working memory (WM), and inhibitory control, and measures of emergent literacy, mathematics, and vocabulary. For construct validity, the HTKS was significantly related to cognitive flexibility, working memory, and inhibitory control in prekindergarten and kindergarten. For predictive validity in prekindergarten, a random effects model indicated that the HTKS significantly predicted growth in mathematics, whereas a cognitive flexibility task significantly predicted growth in mathematics and vocabulary. In kindergarten, the HTKS was the only measure to significantly predict growth in all academic outcomes. An alternative conservative analytical approach, a fixed effects analysis (FEA) model, also indicated that growth in both the HTKS and measures of EF significantly predicted growth in mathematics over four time points between prekindergarten and kindergarten. Results demonstrate that the HTKS involves cognitive flexibility, working memory, and inhibitory control, and is substantively implicated in early achievement, with the strongest relations found for growth in achievement during kindergarten and associations with emergent mathematics. PMID:25071619
Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates.

PubMed

LeDell, Erin; Petersen, Maya; van der Laan, Mark

In binary classification problems, the area under the ROC curve (AUC) is commonly used to evaluate the performance of a prediction model. Often, it is combined with cross-validation in order to assess how the results will generalize to an independent data set. In order to evaluate the quality of an estimate for cross-validated AUC, we obtain an estimate of its variance. For massive data sets, the process of generating a single performance estimate can be computationally expensive. Additionally, when using a complex prediction method, the process of cross-validating a predictive model on even a relatively small data set can still require a large amount of computation time. Thus, in many practical settings, the bootstrap is a computationally intractable approach to variance estimation. As an alternative to the bootstrap, we demonstrate a computationally efficient influence curve based approach to obtaining a variance estimate for cross-validated AUC.

Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates

PubMed Central

LeDell, Erin; Petersen, Maya; van der Laan, Mark

2015-01-01

In binary classification problems, the area under the ROC curve (AUC) is commonly used to evaluate the performance of a prediction model. Often, it is combined with cross-validation in order to assess how the results will generalize to an independent data set. In order to evaluate the quality of an estimate for cross-validated AUC, we obtain an estimate of its variance. For massive data sets, the process of generating a single performance estimate can be computationally expensive. Additionally, when using a complex prediction method, the process of cross-validating a predictive model on even a relatively small data set can still require a large amount of computation time. Thus, in many practical settings, the bootstrap is a computationally intractable approach to variance estimation. As an alternative to the bootstrap, we demonstrate a computationally efficient influence curve based approach to obtaining a variance estimate for cross-validated AUC. PMID:26279737
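As a reminder of the quantity being estimated, the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (the Mann-Whitney form), and a cross-validated AUC is commonly taken as the mean of the per-fold AUCs. The Python sketch below shows only that point estimate, under the assumption of simple fold averaging; it does not implement the influence-curve variance estimator the abstract describes, and all names are illustrative.

```python
def auc(scores, labels) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def cross_validated_auc(fold_scores, fold_labels) -> float:
    """Mean of per-fold AUCs, each computed on that fold's held-out predictions."""
    fold_aucs = [auc(s, y) for s, y in zip(fold_scores, fold_labels)]
    return sum(fold_aucs) / len(fold_aucs)
```

The pairwise loop is quadratic in the fold size, which is acceptable for a sketch but not for the massive data sets the abstract has in mind; a rank-based computation would be the usual choice there.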
Validity Assessment of 5 Day Repeated Forced-Swim Stress to Model Human Depression in Young-Adult C57BL/6J and BALB/cJ Mice

PubMed Central

Mul, Joram D.; Zheng, Jia; Goodyear, Laurie J.

2016-01-01

The development of animal models with construct, face, and predictive validity to accurately model human depression has been a major challenge. One proposed rodent model is the 5 d repeated forced swim stress (5d-RFSS) paradigm, which progressively increases floating during individual swim sessions. The onset and persistence of this floating behavior has been anthropomorphically characterized as a measure of depression. This interpretation has been under debate because a progressive increase in floating over time may reflect an adaptive learned behavioral response promoting survival, and not depression (Molendijk and de Kloet, 2015). To assess construct and face validity, we applied 5d-RFSS to C57BL/6J and BALB/cJ mice, two mouse strains commonly used in neuropsychiatric research, and measured a combination of emotional, homeostatic, and psychomotor symptoms indicative of a depressive-like state. We also compared the efficacy of 5d-RFSS and chronic social defeat stress (CSDS), a validated depression model, to induce a depressive-like state in C57BL/6J mice. In both strains, 5d-RFSS progressively increased floating behavior that persisted for at least 4 weeks. 5d-RFSS did not alter sucrose preference, body weight, appetite, locomotor activity, anxiety-like behavior, or immobility behavior during a tail-suspension test compared with nonstressed controls. In contrast, CSDS altered several of these parameters, suggesting a depressive-like state. Finally, predictive validity was assessed using voluntary wheel running (VWR), a known antidepressant intervention. Four weeks of VWR after 5d-RFSS normalized floating behavior toward nonstressed levels. These observations suggest that 5d-RFSS has no construct or face validity but might have predictive validity to model human depression. PMID:28058270
Validity Assessment of 5 Day Repeated Forced-Swim Stress to Model Human Depression in Young-Adult C57BL/6J and BALB/cJ Mice.

PubMed

Mul, Joram D; Zheng, Jia; Goodyear, Laurie J

2016-01-01

The development of animal models with construct, face, and predictive validity to accurately model human depression has been a major challenge. One proposed rodent model is the 5 d repeated forced swim stress (5d-RFSS) paradigm, which progressively increases floating during individual swim sessions. The onset and persistence of this floating behavior has been anthropomorphically characterized as a measure of depression. This interpretation has been under debate because a progressive increase in floating over time may reflect an adaptive learned behavioral response promoting survival, and not depression (Molendijk and de Kloet, 2015). To assess construct and face validity, we applied 5d-RFSS to C57BL/6J and BALB/cJ mice, two mouse strains commonly used in neuropsychiatric research, and measured a combination of emotional, homeostatic, and psychomotor symptoms indicative of a depressive-like state. We also compared the efficacy of 5d-RFSS and chronic social defeat stress (CSDS), a validated depression model, to induce a depressive-like state in C57BL/6J mice. In both strains, 5d-RFSS progressively increased floating behavior that persisted for at least 4 weeks. 5d-RFSS did not alter sucrose preference, body weight, appetite, locomotor activity, anxiety-like behavior, or immobility behavior during a tail-suspension test compared with nonstressed controls. In contrast, CSDS altered several of these parameters, suggesting a depressive-like state. Finally, predictive validity was assessed using voluntary wheel running (VWR), a known antidepressant intervention. Four weeks of VWR after 5d-RFSS normalized floating behavior toward nonstressed levels. These observations suggest that 5d-RFSS has no construct or face validity but might have predictive validity to model human depression.
Ecological validity of the screening module and the Daily Living tests of the Neuropsychological Assessment Battery using the Mayo-Portland Adaptability Inventory-4 in postacute brain injury rehabilitation.

PubMed

Zgaljardic, Dennis J; Yancy, Sybil; Temple, Richard O; Watford, Monica F; Miller, Rebekah

2011-11-01

The assessment of ecological validity of neuropsychological measures is an area of growing interest, particularly in the postacute brain injury rehabilitation (PABIR) setting, as there is an increasing demand for clinicians to address functional and real-world outcomes. In the current study, we assessed the predictive value of the Screening module and the Daily Living tests of the Neuropsychological Assessment Battery (NAB) using clinician ratings from the Mayo-Portland Adaptability Inventory-4 (MPAI-4) in patients with moderate to severe traumatic brain injury. Forty-seven individuals were each administered the NAB Screening module (NAB-SM) and the NAB Daily Living (NAB-DL) tests following admission to a residential PABIR program. MPAI-4 ratings were also obtained at admission. Linear regression analysis was used to examine the association between these functional and neuropsychological assessment measures. We replicated prior work (Temple et al., 2009) and expanded evidence for the ecological validity of the NAB-SM. Furthermore, our findings support the ecological validity of the NAB-DL Bill Payment, Judgment, and Map Reading tests with regard to functional skills and real-world activities. The current study supports prior work from our lab assessing the predictive value of the NAB-SM, and provides evidence for the ecological validity of select NAB-DL tests in patients with moderate to severe traumatic brain injury admitted to a residential PABIR program.
How best to measure implementation of school health curricula: a comparison of three measures.

PubMed

Resnicow, K; Davis, M; Smith, M; Lazarus-Yaroch, A; Baranowski, T; Baranowski, J; Doyle, C; Wang, D T

1998-06-01

The impact of school health education programs is often attenuated by inadequate teacher implementation. Using data from a school-based nutrition education program delivered in a sample of fifth graders, this study examines the discriminant and predictive validity of three measures of curriculum implementation: classroom observation of fidelity, and two measures of completeness, teacher self-report questionnaire and post-implementation interview. A fourth measure, obtained during teacher observations, that assessed student and teacher interaction and student receptivity to the curriculum (labeled Rapport) was also obtained. Predictive validity was determined by examining the association of implementation measures with three study outcomes: health knowledge, asking behaviors related to fruit and vegetables, and fruit and vegetable intake, assessed by 7-day diary. Of the 37 teachers observed, 21 were observed for two sessions and 16 were observed once. Implementation measures were moderately correlated, an indication of discriminant validity. Predictive validity analyses indicated that the observed fidelity, Rapport and interview measures were significantly correlated with post-test student knowledge. The association between health knowledge and observed fidelity (based on dual observation only), Rapport and interview measures remained significant after adjustment for pre-test knowledge values. None of the implementation variables were significantly associated with student fruit and vegetable intake or asking behaviors controlling for pre-test values. These results indicate that the teacher self-report questionnaire was not a valid measure of implementation completeness in this study. Post-implementation completeness interviews and dual observations of fidelity and Rapport appear to be more valid, and largely independent methods of implementation assessment.
Concordance and predictive value of two adverse drug event data sets.

PubMed

Cami, Aurel; Reis, Ben Y

2014-08-22

Accurate prediction of adverse drug events (ADEs) is an important means of controlling and reducing drug-related morbidity and mortality. Since no single "gold standard" ADE data set exists, a range of different drug safety data sets are currently used for developing ADE prediction models. There is a critical need to assess the degree of concordance between these various ADE data sets and to validate ADE prediction models against multiple reference standards. We systematically evaluated the concordance of two widely used ADE data sets - Lexi-comp from 2010 and SIDER from 2012. The strength of the association between ADE (drug) counts in Lexi-comp and SIDER was assessed using Spearman rank correlation, while the differences between the two data sets were characterized in terms of drug categories, ADE categories and ADE frequencies. We also performed a comparative validation of the Predictive Pharmacosafety Networks (PPN) model using both ADE data sets. The predictive power of PPN using each of the two validation sets was assessed using the area under Receiver Operating Characteristic curve (AUROC). The correlations between the counts of ADEs and drugs in the two data sets were 0.84 (95% CI: 0.82-0.86) and 0.92 (95% CI: 0.91-0.93), respectively. Relative to an earlier snapshot of Lexi-comp from 2005, Lexi-comp 2010 and SIDER 2012 introduced a mean of 1,973 and 4,810 new drug-ADE associations per year, respectively. The difference between these two data sets was most pronounced for Nervous System and Anti-infective drugs, Gastrointestinal and Nervous System ADEs, and postmarketing ADEs. A minor difference of 1.1% was found in the AUROC of PPN when SIDER 2012 was used for validation instead of Lexi-comp 2010. In conclusion, the ADE and drug counts in Lexi-comp and SIDER data sets were highly correlated and the choice of validation set did not greatly affect the overall prediction performance of PPN. Our results also suggest that it is important to be aware of the differences that exist among ADE data sets, especially in modeling applications focused on specific drug and ADE categories.
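The concordance figures in this abstract are Spearman rank correlations, that is, Pearson correlations computed on ranks, with tied values sharing their average rank. The following Python sketch illustrates that computation on made-up counts; the helper names are hypothetical and the study's confidence intervals are not reproduced.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A rank-based measure suits count data like these because it is insensitive to the heavy right skew typical of per-drug ADE counts.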
The Validity and reliability of the Comprehensive Home Environment Survey (CHES).

PubMed

Pinard, Courtney A; Yaroch, Amy L; Hart, Michael H; Serrano, Elena L; McFerren, Mary M; Estabrooks, Paul A

2014-01-01

Few comprehensive measures exist to assess contributors to childhood obesity within the home, specifically among low-income populations. The current study describes the modification and psychometric testing of the Comprehensive Home Environment Survey (CHES), an inclusive measure of the home food, physical activity, and media environment related to childhood obesity. The items were tested for content relevance by an expert panel and piloted in the priority population. The CHES was administered to low-income parents of children 5 to 17 years (N = 150), including a subsample of parents a second time and additional caregivers to establish test-retest and interrater reliabilities. Children older than 9 years (n = 95), as well as parents (N = 150) completed concurrent assessments of diet and physical activity behaviors (predictive validity). Analyses and item trimming resulted in 18 subscales and a total score, which displayed adequate internal consistency (α = .74-.92) and high test-retest reliability (r ≥ .73, ps < .01) and interrater reliability (r ≥ .42, ps < .01). The CHES score and a validated screener for the home environment were correlated (r = .37, p < .01; concurrent validity). CHES subscales were significantly correlated with behavioral measures (r = -.20-.55, p < .05; predictive validity). The CHES shows promise as a valid/reliable assessment of the home environment related to childhood obesity, including healthy diet and physical activity.
Observed Emotional and Behavioral Indicators of Motivation Predict School Readiness in Head Start Graduates

PubMed Central

Berhenke, Amanda; Miller, Alison L.; Brown, Eleanor; Seifer, Ronald; Dickstein, Susan

2011-01-01

Emotions and behaviors observed during challenging tasks are hypothesized to be valuable indicators of young children's motivation, the assessment of which may be particularly important for children at risk for school failure. The current study demonstrated reliability and concurrent validity of a new observational assessment of motivation in young children. Head Start graduates completed challenging puzzle and trivia tasks during their kindergarten year. Children's emotion expression and task engagement were assessed based on their observed facial and verbal expressions and behavioral cues. Hierarchical regression analyses revealed that observed persistence and shame predicted teacher ratings of children's academic achievement, whereas interest, anxiety, pride, shame, and persistence predicted children's social skills and learning-related behaviors. Children's emotional and behavioral responses to challenge thus appeared to be important indicators of school success. Observation of such responses may be a useful and valid alternative to self-report measures of motivation at this age. PMID:21949599
Observed Emotional and Behavioral Indicators of Motivation Predict School Readiness in Head Start Graduates.

PubMed

Berhenke, Amanda; Miller, Alison L; Brown, Eleanor; Seifer, Ronald; Dickstein, Susan

2011-01-01

Emotions and behaviors observed during challenging tasks are hypothesized to be valuable indicators of young children's motivation, the assessment of which may be particularly important for children at risk for school failure. The current study demonstrated reliability and concurrent validity of a new observational assessment of motivation in young children. Head Start graduates completed challenging puzzle and trivia tasks during their kindergarten year. Children's emotion expression and task engagement were assessed based on their observed facial and verbal expressions and behavioral cues. Hierarchical regression analyses revealed that observed persistence and shame predicted teacher ratings of children's academic achievement, whereas interest, anxiety, pride, shame, and persistence predicted children's social skills and learning-related behaviors. Children's emotional and behavioral responses to challenge thus appeared to be important indicators of school success. Observation of such responses may be a useful and valid alternative to self-report measures of motivation at this age.
Literature Review: Cognitive Abilities--Theory, History, and Validity

DTIC Science & Technology

1991-02-01

Note 88-13. (AD A193 558) Literature Review: Utility of Temperament, Biodata, and Interest Assessment for Predicting Job Performance by Leaetta M. Hough...predicting soldiers’ job performance, and then to develop new measures for those attributes. These Research Notes, however, have usefulness beyond that...organization or taxonomy of the constructs in each area, and the validities of the various measures for different types of job performance criteria. Second
Performance of genomic prediction within and across generations in maritime pine.

PubMed

Bartholomé, Jérôme; Van Heerwaarden, Joost; Isik, Fikret; Boury, Christophe; Vidal, Marjorie; Plomion, Christophe; Bouffier, Laurent

2016-08-11

Genomic selection (GS) is a promising approach for decreasing breeding cycle length in forest trees. Assessment of progeny performance and of the prediction accuracy of GS models over generations is therefore a key issue. A reference population of maritime pine (Pinus pinaster) with an estimated effective inbreeding population size (status number) of 25 was first selected with simulated data. This reference population (n = 818) covered three generations (G0, G1 and G2) and was genotyped with 4436 single-nucleotide polymorphism (SNP) markers. We evaluated the effects on prediction accuracy of both the relatedness between the calibration and validation sets and validation on the basis of progeny performance. Pedigree-based (best linear unbiased prediction, ABLUP) and marker-based (genomic BLUP and Bayesian LASSO) models were used to predict breeding values for three different traits: circumference, height and stem straightness. On average, the ABLUP model outperformed genomic prediction models, with a maximum difference in prediction accuracies of 0.12, depending on the trait and the validation method. A mean difference in prediction accuracy of 0.17 was found between validation methods differing in terms of relatedness. Including the progenitors in the calibration set reduced this difference in prediction accuracy to 0.03. When only genotypes from the G0 and G1 generations were used in the calibration set and genotypes from G2 were used in the validation set (progeny validation), prediction accuracies ranged from 0.70 to 0.85. This study suggests that the training of prediction models on parental populations can predict the genetic merit of the progeny with high accuracy: an encouraging result for the implementation of GS in the maritime pine breeding program.
Extending the validity of the Feeding Practices and Structure Questionnaire.

PubMed

Jansen, Elena; Mallan, Kimberley M; Daniels, Lynne A

2015-06-30

Feeding practices are commonly examined as potentially modifiable determinants of children's eating behaviours and weight status. Although a variety of questionnaires exist to assess different feeding aspects, many lack thorough reliability and validity testing. The Feeding Practices and Structure Questionnaire (FPSQ) is a tool designed to measure early feeding practices related to non-responsive feeding and structure of the meal environment. Face validity, factorial validity, internal reliability and cross-sectional correlations with children's eating behaviours have been established in mothers with 2-year-old children. The aim of the present study was to further extend the validity of the FPSQ by examining factorial, construct and predictive validity, and stability. Participants were from the NOURISH randomised controlled trial which evaluated an intervention with first-time mothers designed to promote protective feeding practices. Maternal feeding practices (FP) and child eating behaviours were assessed when children were aged 2 years and 3.7 years (n = 388). Confirmatory factor analysis, group differences, predictive relationships, and stability were tested. The original 9-factor structure was confirmed when children were aged 3.7 ± 0.3 years. Cronbach's alpha was above the recommended 0.70 cut-off for all factors except Structured Meal Timing, Over Restriction and Distrust in Appetite, which were 0.58, 0.67 and 0.66 respectively. Allocated group differences reflected behaviour consistent with intervention content and all feeding practices were stable across both time points (range of r = 0.45-0.70). There was some evidence for the predictive validity of factors with 2 FP showing expected relationships, 2 FP showing expected and unexpected relationships and 5 FP showing no relationship. Reliability and validity were demonstrated for most subscales of the FPSQ. Future validation is warranted with culturally diverse samples and with fathers and other caregivers. The use of additional outcomes to further explore predictive validity is recommended, as is testing the test-retest reliability of the questionnaire.
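Cronbach's alpha, which this abstract compares against a 0.70 cut-off, is conventionally computed as alpha = k/(k-1) × (1 − Σ item variances / variance of the total score) for a scale of k items. The Python sketch below applies that textbook formula to a small made-up response matrix; the function name and data are illustrative and the study's own item data are not available here.

```python
from statistics import variance


def cronbach_alpha(responses) -> float:
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose so each tuple holds one item's scores across respondents.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical two-item scale answered by three respondents.
toy_alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

Alpha tends to rise with the number of items, so short subscales can fall below 0.70 even when their items are reasonably related.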
Validation of the MEASURE automobile emissions model: a statistical analysis

DOT National Transportation Integrated Search

2000-09-01

The Mobile Emissions Assessment System for Urban and Regional Evaluation (MEASURE) model provides an external validation capability for hot stabilized option; the model is one of several new modal emissions models designed to predict hot stabilized e...
External validation of Vascular Study Group of New England risk predictive model of mortality after elective abdominal aorta aneurysm repair in the Vascular Quality Initiative and comparison against established models.

PubMed

Eslami, Mohammad H; Rybin, Denis V; Doros, Gheorghe; Siracuse, Jeffrey J; Farber, Alik

2018-01-01

The purpose of this study is to externally validate a recently reported Vascular Study Group of New England (VSGNE) risk predictive model of postoperative mortality after elective abdominal aortic aneurysm (AAA) repair and to compare its predictive ability across different patients' risk categories and against the established risk predictive models using the Vascular Quality Initiative (VQI) AAA sample. The VQI AAA database (2010-2015) was queried for patients who underwent elective AAA repair. The VSGNE cases were excluded from the VQI sample. The external validation of a recently published VSGNE AAA risk predictive model, which includes only preoperative variables (age, gender, history of coronary artery disease, chronic obstructive pulmonary disease, cerebrovascular disease, creatinine levels, and aneurysm size) and planned type of repair, was performed using the VQI elective AAA repair sample. The predictive value of the model was assessed via the C-statistic. Hosmer-Lemeshow method was used to assess calibration and goodness of fit. This model was then compared with the Medicare, Vascular Governance Northwest model, and Glasgow Aneurysm Score for predicting mortality in VQI sample. The Vuong test was performed to compare the model fit between the models. Model discrimination was assessed in different risk group VQI quintiles. Data from 4431 cases from the VSGNE sample with the overall mortality rate of 1.4% was used to develop the model. The internally validated VSGNE model showed a very high discriminating ability in predicting mortality (C = 0.822) and good model fit (Hosmer-Lemeshow P = .309) among the VSGNE elective AAA repair sample. External validation on 16,989 VQI cases with an overall 0.9% mortality rate showed very robust predictive ability of mortality (C = 0.802). Vuong tests yielded a significant fit difference favoring the VSGNE over the Medicare model (C = 0.780), Vascular Governance Northwest (0.774), and Glasgow Aneurysm Score (0.639). Across the 5 risk quintiles, the VSGNE model predicted observed mortality significantly with great accuracy. This simple VSGNE AAA risk predictive model showed very high discriminative ability in predicting mortality after elective AAA repair among a large external independent sample of AAA cases performed by a diverse array of physicians nationwide. The risk score based on this simple VSGNE model can reliably stratify patients according to their risk of mortality after elective AAA repair better than other established models. Copyright © 2017 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.
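The Hosmer-Lemeshow method mentioned in this abstract assesses calibration by sorting cases by predicted risk, splitting them into groups, and summing (observed − expected)² / (expected × (1 − expected / group size)) over the groups. The Python sketch below computes only that statistic on hypothetical data; the equal-size grouping rule is an assumption, the p-value (a chi-square tail probability, conventionally with groups − 2 degrees of freedom) is omitted, and nothing here reproduces the study's analysis.

```python
def hosmer_lemeshow_statistic(probabilities, outcomes, groups: int = 10) -> float:
    """Hosmer-Lemeshow chi-square statistic over risk-ordered groups.

    probabilities: predicted event probabilities; outcomes: 0/1 observed events.
    """
    # Sort by predicted risk only, so ties keep their original order.
    pairs = sorted(zip(probabilities, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        denominator = expected * (1 - expected / len(chunk))
        if denominator == 0:
            # Every prediction in the group is exactly 0 or 1; the term is undefined, so skip it.
            continue
        statistic += (observed - expected) ** 2 / denominator
    return statistic
```

A small statistic (large P value, such as the P = .309 reported above) indicates no detectable departure between predicted and observed event counts.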
Evidence on existing caries risk assessment systems: are they predictive of future caries?

PubMed

Tellez, M; Gomez, J; Pretty, I; Ellwood, R; Ismail, A I

2013-02-01

To critically appraise evidence for the prediction of caries using four caries risk assessment (CRA) systems/guidelines (Cariogram, Caries Management by Risk Assessment (CAMBRA), American Dental Association (ADA), and American Academy of Pediatric Dentistry (AAPD)). This review focused on prospective cohort studies or randomized controlled trials. A systematic search strategy was developed to locate papers published in Medline Ovid and Cochrane databases. The search identified 539 scientific reports, and after title and abstract review, 137 were selected for full review and 14 met the following inclusion criteria: (i) used as validating criterion caries incidence/increment, (ii) involved human subjects and natural carious lesions, and (iii) published in peer-reviewed journals. In addition, papers were excluded if they met one or more of the following criteria: (i) incomplete description of sample selection, outcomes, or small sample size and (ii) not meeting the criteria for best evidence under the prognosis category of the Oxford Centre for Evidence-Based Medicine. There are wide variations among the systems in terms of definitions of caries risk categories, type and number of risk factors/markers, and disease indicators. The Cariogram combined sensitivity and specificity for predicting caries in permanent dentition ranges from 110 to 139 and is the only system for which prospective studies have been conducted to assess its validity. The Cariogram had limited prediction utility in preschool children, and a moderate to good performance for sorting out elderly individuals into caries risk groups. One retrospective analysis on CAMBRA's CRA reported higher incidence of cavitated lesions among those assessed as extreme-risk patients when compared with those at low risk. The evidence on the validity for existing systems for CRA is limited. It is unknown if the identification of high-risk individuals can lead to more effective long-term patient management that prevents caries initiation and arrests or reverses the progression of lesions. There is an urgent need to develop valid and reliable methods for caries risk assessment that are based on best evidence for prediction and disease management rather than opinions of experts.
Selecting postoperative adjuvant systemic therapy for early stage breast cancer: A critical assessment of commercially available gene expression assays

PubMed Central

Schuur, Eric; Angel Aristizabal, Javier; Bargallo Rocha, Juan Enrique; Cabello, Cesar; Elizalde, Roberto; García‐Estévez, Laura; Gomez, Henry L.; Katz, Artur; Nuñez De Pierro, Aníbal

2017-01-01

Risk stratification of patients with early stage breast cancer may support adjuvant chemotherapy decision‐making. This review details the development and validation of six multi‐gene classifiers, each of which claims to provide useful prognostic and possibly predictive information for early stage breast cancer patients. A careful assessment is presented of each test's analytical validity, clinical validity, and clinical utility, as well as the quality of evidence supporting its use. PMID:28211064
Strategies for soil quality assessment using VNIR hyperspectral spectroscopy in a western Kenya Chronosequence

USGS Publications Warehouse

Kinoshita, Rintaro; Moebius-Clune, Bianca N.; van Es, Harold M.; Hively, W. Dean; Bilgilis, A. Volkan

2012-01-01

Visible and near-infrared reflectance spectroscopy (VNIRS) is a rapid and nondestructive method that can predict multiple soil properties simultaneously, but its application in multidimensional soil quality (SQ) assessment in the tropics still needs to be further assessed. In this study, VNIRS (350–2500 nm) was employed to analyze 227 air-dried soil samples of Ultisols from a soil chronosequence in western Kenya and assess 16 SQ indicators. Partial least squares regression (PLSR) was validated using the full-site cross-validation method by grouping samples from each farm or forest site. Most suitable models successfully predicted SQ indicators (R2 ≥ 0.80; ratio of performance to deviation [RPD] ≥ 2.00) including soil organic matter (OMLOI), active C, Ca, cation exchange capacity (CEC), and clay. Moderately-well predicted indicators (0.50 ≤ R2 < 0.80; 1.40 ≤ RPD < 2.00) included […] permanent wilting point (Θpwp), and field capacity (Θfc). Poorly predicted indicators (R2 < 0.50; RPD < 1.40) were EC, S, P, available water capacity (AWC), K, Zn, and penetration resistance. Combining VNIRS with selected field- and laboratory-measured SQ indicator values increased predictability. Furthermore, VNIRS showed moderate to substantial agreement in predicting interpretive SQ scores and a composite soil quality index (CSQI) especially when combined with directly measured SQ indicator values. In conclusion, VNIRS has good potential for low cost, rapid assessment of physical and biological SQ indicators but conventional soil chemical tests may need to be retained to provide comprehensive SQ assessments.
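The R2 and RPD thresholds quoted in this abstract can be illustrated with a minimal Python sketch. It assumes R2 is the coefficient of determination and RPD is the standard deviation of the reference values divided by the prediction RMSE; the abstract does not say how the two criteria are combined, so requiring both for "good" and both for "poor" is an assumption, and the sample values are hypothetical.

```python
import math

def r2_and_rpd(observed, predicted):
    """Return (R2, RPD) for paired reference and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)          # root mean squared error of prediction
    sd = math.sqrt(ss_tot / (n - 1))      # sample SD of the reference values
    return r2, sd / rmse

def prediction_class(r2, rpd):
    """Classify with the abstract's thresholds (combination rule is assumed)."""
    if r2 >= 0.80 and rpd >= 2.00:
        return "good"
    if r2 < 0.50 and rpd < 1.40:
        return "poor"
    return "moderate"

# Hypothetical laboratory values and spectral predictions, not study data.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.1]
r2, rpd = r2_and_rpd(observed, predicted)
print(prediction_class(r2, rpd))
```

An indicator that meets only one of the two "good" criteria falls into the "moderate" class under this assumed rule.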
Screening Magnetic Resonance Imaging-Based Prediction Model for Assessing Immediate Therapeutic Response to Magnetic Resonance Imaging-Guided High-Intensity Focused Ultrasound Ablation of Uterine Fibroids.

PubMed

Kim, Young-sun; Lim, Hyo Keun; Park, Min Jung; Rhim, Hyunchul; Jung, Sin-Ho; Sohn, Insuk; Kim, Tae-Joong; Keserci, Bilgin

2016-01-01

The aim of this study was to fit and validate screening magnetic resonance imaging (MRI)-based prediction models for assessing immediate therapeutic responses of uterine fibroids to MRI-guided high-intensity focused ultrasound (MR-HIFU) ablation. Informed consent from all subjects was obtained for our institutional review board-approved study. A total of 240 symptomatic uterine fibroids (mean diameter, 6.9 cm) in 152 women (mean age, 43.3 years) treated with MR-HIFU ablation were retrospectively analyzed (160 fibroids for training, 80 fibroids for validation). Screening MRI parameters (subcutaneous fat thickness [mm], x1; relative peak enhancement [%] in semiquantitative perfusion MRI, x2; T2 signal intensity ratio of fibroid to skeletal muscle, x3) were used to fit prediction models with regard to ablation efficiency (nonperfused volume/treatment cell volume, y1) and ablation quality (grade 1-5, poor to excellent, y2), respectively, using the generalized estimating equation method. Cutoff values for achievement of treatment intent (efficiency >1.0; quality grade 4/5) were determined based on receiver operating characteristic curve analysis. Prediction performances were validated by calculating positive and negative predictive values. Generalized estimating equation analyses yielded models of y1 = 2.2637 - 0.0415x1 - 0.0011x2 - 0.0772x3 and y2 = 6.8148 - 0.1070x1 - 0.0050x2 - 0.2163x3. Cutoff values were 1.312 for ablation efficiency (area under the curve, 0.7236; sensitivity, 0.6882; specificity, 0.6866) and 4.019 for ablation quality (0.8794; 0.7156; 0.9020). Positive and negative predictive values were 0.917 and 0.500 for ablation efficiency and 0.978 and 0.600 for ablation quality, respectively. Screening MRI-based prediction models for assessing immediate therapeutic responses of uterine fibroids to MR-HIFU ablation were fitted and validated, which may reduce the risk of unsuccessful treatment.
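The two fitted equations and their cutoffs in this abstract can be written out as a short Python sketch. The coefficients and cutoffs are taken from the abstract; the function name, the input values, and the rule that a prediction at or above the cutoff counts as "treatment intent achieved" are assumptions made for illustration, and the sketch is not intended for clinical use.

```python
EFFICIENCY_CUTOFF = 1.312   # cutoff for predicted ablation efficiency (from the abstract)
QUALITY_CUTOFF = 4.019      # cutoff for predicted ablation quality (from the abstract)

def predict_ablation(fat_mm, peak_enhancement_pct, t2_ratio):
    """Evaluate the reported linear models for one fibroid.

    fat_mm: subcutaneous fat thickness in mm (x1)
    peak_enhancement_pct: relative peak enhancement in % (x2)
    t2_ratio: T2 signal intensity ratio, fibroid to skeletal muscle (x3)
    """
    y1 = 2.2637 - 0.0415 * fat_mm - 0.0011 * peak_enhancement_pct - 0.0772 * t2_ratio
    y2 = 6.8148 - 0.1070 * fat_mm - 0.0050 * peak_enhancement_pct - 0.2163 * t2_ratio
    return {
        "efficiency": y1,
        "quality": y2,
        # Assumed decision rule: at or above the cutoff predicts success.
        "efficiency_ok": y1 >= EFFICIENCY_CUTOFF,
        "quality_ok": y2 >= QUALITY_CUTOFF,
    }

# Hypothetical screening MRI measurements, not study data.
result = predict_ablation(fat_mm=15.0, peak_enhancement_pct=100.0, t2_ratio=2.0)
print(result)
```

All three coefficients are negative in both models, so thicker subcutaneous fat, stronger enhancement, and a higher T2 ratio each lower the predicted efficiency and quality.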
The East London glaucoma prediction score: web-based validation of glaucoma risk screening tool

PubMed Central

Stephen, Cook; Benjamin, Longo-Mbenza

2013-01-01

AIM It is difficult for Optometrists and General Practitioners to know which patients are at risk. The East London glaucoma prediction score (ELGPS) is a web based risk calculator that has been developed to determine Glaucoma risk at the time of screening. Multiple risk factors that are available in a low tech environment are assessed to provide a risk assessment. This is extremely useful in settings where access to specialist care is difficult. Use of the calculator is educational. It is a free web based service. Data capture is user specific. METHOD The scoring system is a web based questionnaire that captures and subsequently calculates the relative risk for the presence of Glaucoma at the time of screening. Three categories of patient are described: Unlikely to have Glaucoma; Glaucoma Suspect and Glaucoma. A case review methodology of patients with known diagnosis is employed to validate the calculator risk assessment. RESULTS Data from the patient records of 400 patients with an established diagnosis has been captured and used to validate the screening tool. The website reports that the calculated diagnosis correlates with the actual diagnosis 82% of the time. Biostatistics analysis showed: Sensitivity = 88%; Positive predictive value = 97%; Specificity = 75%. CONCLUSION Analysis of the first 400 patients validates the web based screening tool as being a good method of screening for the at risk population. The validation is ongoing. The web based format will allow a more widespread recruitment for different geographic, population and personnel variables. PMID:23550097
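Sensitivity, specificity, and positive predictive value, as reported in this abstract, all derive from a 2x2 table of calculated versus actual diagnosis. A minimal Python sketch follows; the abstract gives only percentages, so the counts below are hypothetical and are not meant to reproduce the study's figures.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all with disease
        "specificity": tn / (tn + fp),   # true negatives among all without disease
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on how common the condition is in the sample, so a case review of patients with known diagnoses can yield a higher PPV than population screening would.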
An Optimized Transient Dual Luciferase Assay for Quantifying MicroRNA Directed Repression of Targeted Sequences

PubMed Central

Moyle, Richard L.; Carvalhais, Lilia C.; Pretorius, Lara-Simone; Nowak, Ekaterina; Subramaniam, Gayathery; Dalton-Morgan, Jessica; Schenk, Peer M.

2017-01-01

Studies investigating the action of small RNAs on computationally predicted target genes require some form of experimental validation. Classical molecular methods of validating microRNA action on target genes are laborious, while approaches that tag predicted target sequences to qualitative reporter genes encounter technical limitations. The aim of this study was to address the challenge of experimentally validating large numbers of computationally predicted microRNA-target transcript interactions using an optimized, quantitative, cost-effective, and scalable approach. The presented method combines transient expression via agroinfiltration of Nicotiana benthamiana leaves with a quantitative dual luciferase reporter system, where firefly luciferase is used to report the microRNA-target sequence interaction and Renilla luciferase is used as an internal standard to normalize expression between replicates. We report the appropriate concentration of N. benthamiana leaf extracts and dilution factor to apply in order to avoid inhibition of firefly LUC activity. Furthermore, the optimal ratio of microRNA precursor expression construct to reporter construct and duration of the incubation period post-agroinfiltration were determined. The optimized dual luciferase assay provides an efficient, repeatable and scalable method to validate and quantify microRNA action on predicted target sequences. The optimized assay was used to validate five predicted targets of rice microRNA miR529b, with as few as six technical replicates. The assay can be extended to assess other small RNA-target sequence interactions, including assessing the functionality of an artificial miRNA or an RNAi construct on a targeted sequence. PMID:28979287

Development of estrogen receptor beta binding prediction model using large sets of chemicals.

PubMed

Sakkiah, Sugunadevi; Selvaraj, Chandrabose; Gong, Ping; Zhang, Chaoyang; Tong, Weida; Hong, Huixiao

2017-11-03

We developed an ER β binding prediction model to facilitate identification of chemicals that specifically bind ER β or ER α, together with our previously developed ER α binding model. Decision Forest was used to train the ER β binding prediction model based on a large set of compounds obtained from EADB. Model performance was estimated through 1000 iterations of 5-fold cross validations. Prediction confidence was analyzed using predictions from the cross validations. Informative chemical features for ER β binding were identified through analysis of the frequency data of chemical descriptors used in the models in the 5-fold cross validations. 1000 permutations were conducted to assess the chance correlation. The average accuracy of 5-fold cross validations was 93.14% with a standard deviation of 0.64%. Prediction confidence analysis indicated that the higher the prediction confidence the more accurate the predictions. Permutation testing results revealed that the prediction model is unlikely to have been generated by chance. Eighteen informative descriptors were identified to be important to ER β binding prediction. Application of the prediction model to the data from ToxCast project yielded very high sensitivity of 90-92%. Our results demonstrated ER β binding of chemicals could be accurately predicted using the developed model. Coupling with our previously developed ER α prediction model, this model could be expected to facilitate drug development through identification of chemicals that specifically bind ER β or ER α.
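The "1000 iterations of 5-fold cross validations" procedure, summarized by a mean accuracy and its standard deviation, can be sketched in plain Python. This is an assumption-laden illustration: a majority-class baseline stands in for Decision Forest, accuracy is pooled over the five folds within each iteration (the abstract does not say how it was aggregated), and the labels are hypothetical.

```python
import random
import statistics

def k_fold_indices(n_samples, k, rng):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]

def majority_label(labels):
    """Placeholder 'model': always predict the most common training label."""
    return max(set(labels), key=labels.count)

def repeated_cv_accuracy(labels, k=5, repeats=1000, seed=0):
    """Return (mean, SD) of accuracy over repeated k-fold cross-validation."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        correct = 0
        for held_out in k_fold_indices(len(labels), k, rng):
            held = set(held_out)
            train = [labels[i] for i in range(len(labels)) if i not in held]
            guess = majority_label(train)            # fit on the other k-1 folds
            correct += sum(1 for i in held_out if labels[i] == guess)
        accuracies.append(correct / len(labels))     # pooled accuracy for this iteration
    return statistics.mean(accuracies), statistics.stdev(accuracies)

# Hypothetical labels: 70 binders (1) and 30 non-binders (0).
labels = [1] * 70 + [0] * 30
mean_acc, sd_acc = repeated_cv_accuracy(labels, k=5, repeats=20)
print(f"accuracy = {mean_acc:.4f} +/- {sd_acc:.4f}")
```

Reshuffling before each iteration is what makes the repeats informative: the spread of the per-iteration accuracies indicates how much the estimate depends on the particular fold assignment.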
Psychometrics of a new questionnaire to assess glaucoma adherence: the Glaucoma Treatment Compliance Assessment Tool (an American Ophthalmological Society thesis).

PubMed

Mansberger, Steven L; Sheppler, Christina R; McClure, Tina M; Vanalstine, Cory L; Swanson, Ingrid L; Stoumbos, Zoey; Lambert, William E

2013-09-01

To report the psychometrics of the Glaucoma Treatment Compliance Assessment Tool (GTCAT), a new questionnaire designed to assess adherence with glaucoma therapy. We developed the questionnaire according to the constructs of the Health Belief Model. We evaluated the questionnaire using data from a cross-sectional study with focus groups (n = 20) and a prospective observational case series (n = 58). Principal components analysis provided assessment of construct validity. We repeated the questionnaire after 3 months for test-retest reliability. We evaluated predictive validity using an electronic dosing monitor as an objective measure of adherence. Focus group participants provided 931 statements related to adherence, of which 88.7% (826/931) could be categorized into the constructs of the Health Belief Model. Perceived barriers accounted for 31% (288/931) of statements, cues-to-action 14% (131/931), susceptibility 12% (116/931), benefits 12% (115/931), severity 10% (91/931), and self-efficacy 9% (85/931). The principal components analysis explained 77% of the variance with five components representing Health Belief Model constructs. Reliability analyses showed acceptable Cronbach's alphas (>.70) for four of the seven components (severity, susceptibility, barriers [eye drop administration], and barriers [discomfort]). Predictive validity was high, with several Health Belief Model questions significantly associated (P < .05) with adherence and a correlation coefficient (R²) of .40. Test-retest reliability was 90%. The GTCAT shows excellent repeatability, content, construct, and predictive validity for glaucoma adherence. A multisite trial is needed to determine whether the results can be generalized and whether the questionnaire accurately measures the effect of interventions to increase adherence.
The Academic Diligence Task (ADT): Assessing Individual Differences in Effort on Tedious but Important Schoolwork

PubMed Central

Galla, Brian M.; Plummer, Benjamin D.; White, Rachel E.; Meketon, David; D’Mello, Sidney K.; Duckworth, Angela L.

2014-01-01

The current study reports on the development and validation of the Academic Diligence Task (ADT), designed to assess the tendency to expend effort on academic tasks which are tedious in the moment but valued in the long-term. In this novel online task, students allocate their time between solving simple math problems (framed as beneficial for problem solving skills) and, alternatively, playing Tetris or watching entertaining videos. Using a large sample of high school seniors (N = 921), the ADT demonstrated convergent validity with self-report ratings of Big Five conscientiousness and its facets, self-control and grit, as well as discriminant validity from theoretically unrelated constructs, such as Big Five extraversion, openness, and emotional stability, test anxiety, life satisfaction, and positive and negative affect. The ADT also demonstrated incremental predictive validity for objectively measured GPA, standardized math and reading achievement test scores, high school graduation, and college enrollment, over and beyond demographics and intelligence. Collectively, findings suggest the feasibility of online behavioral measures to assess noncognitive individual differences that predict academic outcomes. PMID:25258470
The Academic Diligence Task (ADT): Assessing Individual Differences in Effort on Tedious but Important Schoolwork.

PubMed

Galla, Brian M; Plummer, Benjamin D; White, Rachel E; Meketon, David; D'Mello, Sidney K; Duckworth, Angela L

2014-10-01

The current study reports on the development and validation of the Academic Diligence Task (ADT), designed to assess the tendency to expend effort on academic tasks which are tedious in the moment but valued in the long-term. In this novel online task, students allocate their time between solving simple math problems (framed as beneficial for problem solving skills) and, alternatively, playing Tetris or watching entertaining videos. Using a large sample of high school seniors ( N = 921), the ADT demonstrated convergent validity with self-report ratings of Big Five conscientiousness and its facets, self-control and grit, as well as discriminant validity from theoretically unrelated constructs, such as Big Five extraversion, openness, and emotional stability, test anxiety, life satisfaction, and positive and negative affect. The ADT also demonstrated incremental predictive validity for objectively measured GPA, standardized math and reading achievement test scores, high school graduation, and college enrollment, over and beyond demographics and intelligence. Collectively, findings suggest the feasibility of online behavioral measures to assess noncognitive individual differences that predict academic outcomes.
Characterization and validation of an in silico toxicology model to predict the mutagenic potential of drug impurities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Valerio, Luis G.; Cross, Kevin P.

Control and minimization of human exposure to potential genotoxic impurities found in drug substances and products is an important part of preclinical safety assessments of new drug products. The FDA's 2008 draft guidance on genotoxic and carcinogenic impurities in drug substances and products allows use of computational quantitative structure–activity relationships (QSAR) to identify structural alerts for known and expected impurities present at levels below qualified thresholds. This study provides the information necessary to establish the practical use of a new in silico toxicology model for predicting Salmonella t. mutagenicity (Ames assay outcome) of drug impurities and other chemicals. We describe the model's chemical content and toxicity fingerprint in terms of compound space, molecular and structural toxicophores, and have rigorously tested its predictive power using both cross-validation and external validation experiments, as well as case studies. Consistent with desired regulatory use, the model performs with high sensitivity (81%) and high negative predictivity (81%) based on external validation with 2368 compounds foreign to the model and having known mutagenicity. A database of drug impurities was created from proprietary FDA submissions and the public literature which found significant overlap between the structural features of drug impurities and training set chemicals in the QSAR model. Overall, the model's predictive performance was found to be acceptable for screening drug impurities for Salmonella mutagenicity. -- Highlights: ► We characterize a new in silico model to predict mutagenicity of drug impurities. ► The model predicts Salmonella mutagenicity and will be useful for safety assessment. ► We examine toxicity fingerprints and toxicophores of this Ames assay model. ► We compare these attributes to those found in drug impurities known to FDA/CDER. ► We validate the model and find it has a desired predictive performance.
Predictive validity and reliability of the Braden scale for risk assessment of pressure ulcers in an intensive care unit.

PubMed

Lima-Serrano, M; González-Méndez, M I; Martín-Castaño, C; Alonso-Araujo, I; Lima-Rodríguez, J S

2018-03-01

Contribution to validation of the Braden scale in patients admitted to the ICU, based on an analysis of its reliability and predictive validity. An analytical, observational, longitudinal prospective study was carried out. Intensive Care Unit, Hospital Virgen del Rocío, Seville (Spain). Patients aged 18 years or older and admitted for over 24 hours to the ICU were included. Patients with pressure ulcers upon admission were excluded. A total of 335 patients were enrolled in two study periods of one month each. None. The presence of grade I-IV pressure ulcers was regarded as the main or dependent variable. Three categories were considered (demographic, clinical and prognostic) for the remaining variables. The incidence of patients who developed pressure ulcers was 8.1%. The proportion of grade I and II pressure ulcers was 40.6% and 59.4% respectively, highlighting the sacrum as the most frequently affected location. Cronbach's alpha coefficient in the assessments considered indicated good to moderate reliability. In the three evaluations made, a cutoff point of 12 was presented as optimal in the assessment of the first and second days of admission. In relation to the assessment of the day with minimum score, the optimal cutoff point was 10. The Braden scale shows insufficient predictive validity and poor precision for cutoff points of both 18 and 16, which are those accepted in the different clinical scenarios. Copyright © 2017 Elsevier España, S.L.U. y SEMNIM. All rights reserved.
Optimal test selection for prediction uncertainty reduction

DOE PAGES

Mullins, Joshua; Mahadevan, Sankaran; Urbina, Angel

2016-12-02

Economic factors and experimental limitations often lead to sparse and/or imprecise data used for the calibration and validation of computational models. This paper addresses resource allocation for calibration and validation experiments, in order to maximize their effectiveness within given resource constraints. When observation data are used for model calibration, the quality of the inferred parameter descriptions is directly affected by the quality and quantity of the data. This paper characterizes parameter uncertainty within a probabilistic framework, which enables the uncertainty to be systematically reduced with additional data. The validation assessment is also uncertain in the presence of sparse and imprecise data; therefore, this paper proposes an approach for quantifying the resulting validation uncertainty. Since calibration and validation uncertainty affect the prediction of interest, the proposed framework explores the decision of cost versus importance of data in terms of the impact on the prediction uncertainty. Often, calibration and validation tests may be performed for different input scenarios, and this paper shows how the calibration and validation results from different conditions may be integrated into the prediction. Then, a constrained discrete optimization formulation that selects the number of tests of each type (calibration or validation at given input conditions) is proposed. Furthermore, the proposed test selection methodology is demonstrated on a microelectromechanical system (MEMS) example.
Evaluating the real-world predictive validity of the Body Image Quality of Life Inventory using Ecological Momentary Assessment.

PubMed

Heron, Kristin E; Mason, Tyler B; Sutton, Tiphanie G; Myers, Taryn A

2015-09-01

Perceptions of physical appearance, or body image, can affect psychosocial functioning and quality of life (QOL). The present study evaluated the real-world predictive validity of the Body Image Quality of Life Inventory (BIQLI) using Ecological Momentary Assessment (EMA). College women reporting subclinical disordered eating/body dissatisfaction (N=131) completed the BIQLI and related measures. For one week they then completed five daily EMA surveys of mood, social interactions, stress, and eating behaviors on palmtop computers. Results showed better body image QOL was associated with less negative affect, less overwhelming emotions, more positive affect, more pleasant social interactions, and higher self-efficacy for handling stress. Lower body image QOL was marginally related to less overeating and lower loss of control over eating in daily life. To our knowledge, this is the first study to support the real-world predictive validity of the BIQLI by identifying social, affective, and behavioral correlates in everyday life using EMA. Copyright © 2015 Elsevier Ltd. All rights reserved.
External validation of a simple clinical tool used to predict falls in people with Parkinson disease

PubMed Central

Duncan, Ryan P.; Cavanaugh, James T.; Earhart, Gammon M.; Ellis, Terry D.; Ford, Matthew P.; Foreman, K. Bo; Leddy, Abigail L.; Paul, Serene S.; Canning, Colleen G.; Thackeray, Anne; Dibble, Leland E.

2015-01-01

Background Assessment of fall risk in an individual with Parkinson disease (PD) is a critical yet often time consuming component of patient care. Recently a simple clinical prediction tool based only on fall history in the previous year, freezing of gait in the past month, and gait velocity <1.1 m/s was developed and accurately predicted future falls in a sample of individuals with PD. METHODS We sought to externally validate the utility of the tool by administering it to a different cohort of 171 individuals with PD. Falls were monitored prospectively for 6 months following predictor assessment. RESULTS The tool accurately discriminated future fallers from non-fallers (area under the curve [AUC] = 0.83; 95% CI 0.76–0.89), comparable to the developmental study. CONCLUSION The results validated the utility of the tool for allowing clinicians to quickly and accurately identify an individual’s risk of an impending fall. PMID:26003412
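The discrimination statistic reported here, the area under the ROC curve, equals the probability that a randomly chosen faller scores higher than a randomly chosen non-faller (ties counting one half). The Python sketch below computes it that way. The abstract does not give the tool's scoring weights, so a plain count of the three predictors serves as a stand-in score, and the scores listed are hypothetical.

```python
def risk_factor_count(fell_last_year, freezing_last_month, gait_velocity_m_s):
    """Stand-in score: number of the three predictors present (not the published weights)."""
    return int(fell_last_year) + int(freezing_last_month) + int(gait_velocity_m_s < 1.1)

def auc(scores_fallers, scores_non_fallers):
    """AUC as the fraction of (faller, non-faller) pairs ranked correctly."""
    wins = 0.0
    for f in scores_fallers:
        for n in scores_non_fallers:
            if f > n:
                wins += 1.0
            elif f == n:
                wins += 0.5          # ties count as half a correct ranking
    return wins / (len(scores_fallers) * len(scores_non_fallers))

# Hypothetical scores for four fallers and four non-fallers.
fallers = [3, 2, 2, 1]
non_fallers = [0, 1, 1, 2]
print(auc(fallers, non_fallers))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.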
External validation of a simple clinical tool used to predict falls in people with Parkinson disease.

PubMed

Duncan, Ryan P; Cavanaugh, James T; Earhart, Gammon M; Ellis, Terry D; Ford, Matthew P; Foreman, K Bo; Leddy, Abigail L; Paul, Serene S; Canning, Colleen G; Thackeray, Anne; Dibble, Leland E

2015-08-01

Assessment of fall risk in an individual with Parkinson disease (PD) is a critical yet often time consuming component of patient care. Recently a simple clinical prediction tool based only on fall history in the previous year, freezing of gait in the past month, and gait velocity <1.1 m/s was developed and accurately predicted future falls in a sample of individuals with PD. We sought to externally validate the utility of the tool by administering it to a different cohort of 171 individuals with PD. Falls were monitored prospectively for 6 months following predictor assessment. The tool accurately discriminated future fallers from non-fallers (area under the curve [AUC] = 0.83; 95% CI 0.76-0.89), comparable to the developmental study. The results validated the utility of the tool for allowing clinicians to quickly and accurately identify an individual's risk of an impending fall. Copyright © 2015 Elsevier Ltd. All rights reserved.
Assessing Attachment Security With the Attachment Q Sort: Meta-Analytic Evidence for the Validity of the Observer AQS

ERIC Educational Resources Information Center

van IJzendoorn, Marinus H.; Vereijken, Carolus M.J.L.; Bakermans-Kranenburg, Marian J.; Riksen-Walraven, Marianne J.

2004-01-01

The reliability and validity of the Attachment Q Sort (AQS; Waters & Deane, 1985) was tested in a series of meta-analyses on 139 studies with 13,835 children. The observer AQS security score showed convergent validity with Strange Situation procedure (SSP) security (r = .31) and excellent predictive validity with sensitivity measures (r = .39). Its…
Validity of the SAT® for Predicting First-Year Grades: 2010 SAT Validity Sample. Statistical Report 2013-2

ERIC Educational Resources Information Center

Patterson, Brian F.; Mattern, Krista D.

2013-01-01

The continued accumulation of validity evidence for the core uses of educational assessments is critical to ensure that proper inferences will be made for those core purposes. To that end, the College Board has continued to follow previous cohorts of college students and this report provides updated validity evidence for using the SAT to predict…
An integrated approach to evaluating alternative risk prediction strategies: a case study comparing alternative approaches for preventing invasive fungal disease.

PubMed

Sadique, Z; Grieve, R; Harrison, D A; Jit, M; Allen, E; Rowan, K M

2013-12-01

This article proposes an integrated approach to the development, validation, and evaluation of new risk prediction models illustrated with the Fungal Infection Risk Evaluation study, which developed risk models to identify non-neutropenic, critically ill adult patients at high risk of invasive fungal disease (IFD). Our decision-analytical model compared alternative strategies for preventing IFD at up to three clinical decision time points (critical care admission, after 24 hours, and end of day 3), followed with antifungal prophylaxis for those judged "high" risk versus "no formal risk assessment." We developed prognostic models to predict the risk of IFD before critical care unit discharge, with data from 35,455 admissions to 70 UK adult, critical care units, and validated the models externally. The decision model was populated with positive predictive values and negative predictive values from the best-fitting risk models. We projected lifetime cost-effectiveness and expected value of partial perfect information for groups of parameters. The risk prediction models performed well in internal and external validation. Risk assessment and prophylaxis at the end of day 3 was the most cost-effective strategy at the 2% and 1% risk threshold. Risk assessment at each time point was the most cost-effective strategy at a 0.5% risk threshold. Expected values of partial perfect information were high for positive predictive values or negative predictive values (£11 million-£13 million) and quality-adjusted life-years (£11 million). It is cost-effective to formally assess the risk of IFD for non-neutropenic, critically ill adult patients. This integrated approach to developing and evaluating risk models is useful for informing clinical practice and future research investment. © 2013 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by ISPOR. All rights reserved.
Development of a Korean Fracture Risk Score (KFRS) for Predicting Osteoporotic Fracture Risk: Analysis of Data from the Korean National Health Insurance Service

PubMed Central

Jang, Eun Jin; Park, ByeongJu; Kim, Tae-Young; Shin, Soon-Ae

2016-01-01

Background Asian-specific prediction models for estimating individual risk of osteoporotic fractures are rare. We developed a Korean fracture risk prediction model using clinical risk factors and assessed validity of the final model. Methods A total of 718,306 Korean men and women aged 50–90 years were followed for 7 years in a national system-based cohort study. In total, 50% of the subjects were assigned randomly to the development dataset and 50% were assigned to the validation dataset. Clinical risk factors for osteoporotic fracture were assessed at the biennial health check. Data on osteoporotic fractures during the follow-up period were identified by ICD-10 codes and the nationwide database of the National Health Insurance Service (NHIS). Results During the follow-up period, 19,840 osteoporotic fractures were reported (4,889 in men and 14,951 in women) in the development dataset. The assessment tool called the Korean Fracture Risk Score (KFRS) is comprised of a set of nine variables, including age, body mass index, recent fragility fracture, current smoking, high alcohol intake, lack of regular exercise, recent use of oral glucocorticoid, rheumatoid arthritis, and other causes of secondary osteoporosis. The KFRS predicted osteoporotic fractures over the 7 years. This score was validated using an independent dataset. A close relationship with overall fracture rate was observed when we compared the mean predicted scores after applying the KFRS with the observed risks after 7 years within each 10th of predicted risk. Conclusion We developed a Korean specific prediction model for osteoporotic fractures. The KFRS was able to predict risk of fracture in the primary population without bone mineral density testing and is therefore suitable for use in both clinical setting and self-assessment. The website is available at http://www.nhis.or.kr. PMID:27399597
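The validation step described here, comparing mean predicted scores with observed risks "within each 10th of predicted risk", is a grouped calibration check. A minimal Python sketch follows, assuming predicted risks are probabilities and outcomes are coded 0/1; the function name and the data are hypothetical and unrelated to the KFRS cohort.

```python
def calibration_by_tenths(predicted, observed, groups=10):
    """Sort by predicted risk, split into equal-sized groups, and compare
    the mean predicted risk with the observed event rate in each group."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    n = len(order)
    table = []
    for g in range(groups):
        members = order[g * n // groups:(g + 1) * n // groups]
        if not members:
            continue                      # fewer samples than groups
        mean_pred = sum(predicted[i] for i in members) / len(members)
        obs_rate = sum(observed[i] for i in members) / len(members)
        table.append((g + 1, mean_pred, obs_rate))
    return table

# Hypothetical data: 100 subjects with evenly spread predicted risks.
predicted = [i / 100 for i in range(100)]
observed = [0] * 50 + [1] * 50
table = calibration_by_tenths(predicted, observed)
for tenth, mean_pred, obs_rate in table:
    print(f"{tenth:2d}  predicted={mean_pred:.3f}  observed={obs_rate:.3f}")
```

In a well-calibrated model the two columns track each other closely across all ten groups; the hypothetical data above are deliberately not well calibrated, which shows up as a step in the observed column.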
Development of a Korean Fracture Risk Score (KFRS) for Predicting Osteoporotic Fracture Risk: Analysis of Data from the Korean National Health Insurance Service.

PubMed

Kim, Ha Young; Jang, Eun Jin; Park, ByeongJu; Kim, Tae-Young; Shin, Soon-Ae; Ha, Yong-Chan; Jang, Sunmee

2016-01-01

Asian-specific prediction models for estimating individual risk of osteoporotic fractures are rare. We developed a Korean fracture risk prediction model using clinical risk factors and assessed validity of the final model. A total of 718,306 Korean men and women aged 50-90 years were followed for 7 years in a national system-based cohort study. In total, 50% of the subjects were assigned randomly to the development dataset and 50% were assigned to the validation dataset. Clinical risk factors for osteoporotic fracture were assessed at the biennial health check. Data on osteoporotic fractures during the follow-up period were identified by ICD-10 codes and the nationwide database of the National Health Insurance Service (NHIS). During the follow-up period, 19,840 osteoporotic fractures were reported (4,889 in men and 14,951 in women) in the development dataset. The assessment tool called the Korean Fracture Risk Score (KFRS) is comprised of a set of nine variables, including age, body mass index, recent fragility fracture, current smoking, high alcohol intake, lack of regular exercise, recent use of oral glucocorticoid, rheumatoid arthritis, and other causes of secondary osteoporosis. The KFRS predicted osteoporotic fractures over the 7 years. This score was validated using an independent dataset. A close relationship with overall fracture rate was observed when we compared the mean predicted scores after applying the KFRS with the observed risks after 7 years within each 10th of predicted risk. We developed a Korean specific prediction model for osteoporotic fractures. The KFRS was able to predict risk of fracture in the primary population without bone mineral density testing and is therefore suitable for use in both clinical setting and self-assessment. The website is available at http://www.nhis.or.kr.
Validation of urban freeway models. [supporting datasets]

DOT National Transportation Integrated Search

2015-01-01

The goal of the SHRP 2 Project L33 Validation of Urban Freeway Models was to assess and enhance the predictive travel time reliability models developed in the SHRP 2 Project L03, Analytic Procedures for Determining the Impacts of Reliability Mitigati...
Predictive validity of the HCR-20 for violent and non-violent sexual behaviour in a secure mental health service.

PubMed

O'Shea, Laura E; Thaker, Dev-Kishan; Picchioni, Marco M; Mason, Fiona L; Knight, Caroline; Dickens, Geoffrey L

2016-12-01

Violent and non-violent sexual behaviour is a fairly common problem among secure mental health service patients, but specialist sexual violence risk assessment is time-consuming and so performed infrequently. We aimed to establish whether a commonly used violence risk assessment tool, the Historical Clinical Risk Management-20 (HCR-20), has predictive validity specifically for inappropriate sexual behaviour. A pseudo-prospective cohort design was used for a study in the adult wards of a large provider of specialist secure mental health services. Routine clinical team HCR-20 assessments were extracted from records, and incidents involving inappropriate sexual behaviour were recorded for the 3 months following assessment. Of 613 patients, 104 (17%) had engaged in at least one inappropriate sexual behaviour; in 65 (10.6%), the sexual act was violent. The HCR-20 total score and the clinical and risk management subscales predicted violent and non-violent sexual behaviour. The negative predictive value of the HCR-20 for inappropriate sexual behaviour was over 90%. Prediction of violent sexual behaviour may be regarded as well within the scope of the HCR-20 as a structured professional judgement tool to aid violence risk prediction, but we found that it also predicts behaviours that may be of concern but fall below the violence threshold. High negative predictive values suggest that HCR-20 scores may have some utility for screening out patients who do not require more specialist assessment for inappropriate sexual behaviour. Copyright © 2015 John Wiley & Sons, Ltd.
Multidimensional assessment of self-regulated learning with middle school math students.

PubMed

Callan, Gregory L; Cleary, Timothy J

2018-03-01

This study examined the convergent and predictive validity of self-regulated learning (SRL) measures situated in mathematics. The sample included 100 eighth graders from a diverse, urban school district. Four measurement formats were examined, including 2 broad-based measures (i.e., self-report questionnaire and teacher ratings) and 2 task-specific measures (i.e., SRL microanalysis and behavioral traces). Convergent validity was examined across task difficulty, and predictive validity was examined across 3 mathematics outcomes: 2 measures of mathematical problem solving skill (i.e., practice session math problems, posttest math problems) and a global measure of mathematical skill (i.e., standardized math test). Correlation analyses were used to examine convergent validity and revealed medium correlations between measures within the same category (i.e., broad-based or task-specific). Relations between measurement classes were not statistically significant. Separate regressions examined the predictive validity of the SRL measures. While controlling for all other predictors, an SRL microanalysis metacognitive-monitoring measure emerged as a significant predictor of all 3 outcomes, and teacher ratings accounted for unique variance on 2 of the outcomes (i.e., posttest math problems and standardized math test). Results suggest that a multidimensional assessment approach should be considered by school psychologists interested in measuring SRL. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
External validation of a prediction model for surgical site infection after thoracolumbar spine surgery in a Western European cohort.

PubMed

Janssen, Daniël M C; van Kuijk, Sander M J; d'Aumerie, Boudewijn B; Willems, Paul C

2018-05-16

A prediction model for surgical site infection (SSI) after spine surgery was developed in 2014 by Lee et al. This model was developed to compute an individual estimate of the probability of SSI after spine surgery based on the patient's comorbidity profile and invasiveness of surgery. Before any prediction model can be validly implemented in daily medical practice, it should be externally validated to assess how the prediction model performs in patients sampled independently from the derivation cohort. We included 898 consecutive patients who underwent instrumented thoracolumbar spine surgery. Overall performance was quantified using Nagelkerke's R² statistic, and discriminative ability was quantified as the area under the receiver operating characteristic curve (AUC). We computed the slope of the calibration plot to judge prediction accuracy. Sixty patients developed an SSI. The overall performance of the prediction model in our population was poor: Nagelkerke's R² was 0.01. The AUC was 0.61 (95% confidence interval (CI) 0.54-0.68). The estimated slope of the calibration plot was 0.52. The previously published prediction model showed poor performance in our academic external validation cohort. To predict SSI after instrumented thoracolumbar spine surgery for the present population, a better fitting prediction model should be developed.
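As a reading aid for the discrimination measure used above, the AUC can be computed as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it, counting ties as one half. The sketch below is illustrative only: the function name and the toy data are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise concordance.

    `scores` are predicted risks, `labels` are 0/1 outcomes.
    Ties between a case and a non-case count as half a concordant pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return concordant / (len(cases) * len(controls))

# Toy data (hypothetical): 3 of the 4 case/non-case pairs are ordered correctly.
toy_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 0.75
```

On this scale 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so the reported 0.61 sits only modestly above chance.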
External Validation of a Tool Predicting 7-Year Risk of Developing Cardiovascular Disease, Type 2 Diabetes or Chronic Kidney Disease.

PubMed

Rauh, Simone P; Rutters, Femke; van der Heijden, Amber A W A; Luimes, Thomas; Alssema, Marjan; Heymans, Martijn W; Magliano, Dianna J; Shaw, Jonathan E; Beulens, Joline W; Dekker, Jacqueline M

2018-02-01

Chronic cardiometabolic diseases, including cardiovascular disease (CVD), type 2 diabetes (T2D) and chronic kidney disease (CKD), share many modifiable risk factors and can be prevented using combined prevention programs. Valid risk prediction tools are needed to accurately identify individuals at risk. We aimed to validate a previously developed non-invasive risk prediction tool for predicting the combined 7-year-risk for chronic cardiometabolic diseases. The previously developed tool is stratified for sex and contains the predictors age, BMI, waist circumference, use of antihypertensives, smoking, family history of myocardial infarction/stroke, and family history of diabetes. This tool was externally validated, evaluating model performance using the area under the receiver operating characteristic curve (AUC) to assess discrimination and Hosmer-Lemeshow goodness-of-fit (HL) statistics to assess calibration. The intercept was recalibrated to improve calibration performance. The risk prediction tool was validated in 3544 participants from the Australian Diabetes, Obesity and Lifestyle Study (AusDiab). Discrimination was acceptable, with an AUC of 0.78 (95% CI 0.75-0.81) in men and 0.78 (95% CI 0.74-0.81) in women. Calibration was poor (HL statistic: p < 0.001), but improved considerably after intercept recalibration. Examination of individual outcomes showed that in men, AUC was highest for CKD (0.85 [95% CI 0.78-0.91]) and lowest for T2D (0.69 [95% CI 0.65-0.74]). In women, AUC was highest for CVD (0.88 [95% CI 0.83-0.94]) and lowest for T2D (0.71 [95% CI 0.66-0.75]). Validation of our previously developed tool showed robust discriminative performance across populations. Model recalibration is recommended to account for different disease rates. Our risk prediction tool can be useful in large-scale prevention programs for identifying those in need of further risk profiling because of their increased risk for chronic cardiometabolic diseases.
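Intercept recalibration, as mentioned above, shifts every predicted risk by a single constant on the log-odds scale so that the average predicted risk matches the event rate in the new population. The sketch below shows one way this could be done; it is an assumption-laden illustration, not the authors' procedure. The function names, the bisection search and the toy numbers are all hypothetical. For a logistic model with the original linear predictor as an offset, the maximum-likelihood intercept satisfies exactly this "mean predicted equals observed rate" condition.

```python
import math

def _logit(p):
    return math.log(p / (1.0 - p))

def _expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def recalibrate_intercept(predicted, observed_rate, lo=-10.0, hi=10.0, tol=1e-10):
    """Find the log-odds shift that makes mean predicted risk equal observed_rate.

    Returns (shift, recalibrated_risks). Predicted risks must lie strictly
    between 0 and 1. The search interval [lo, hi] is an arbitrary assumption.
    """
    logits = [_logit(p) for p in predicted]

    def mean_risk(shift):
        return sum(_expit(z + shift) for z in logits) / len(logits)

    # mean_risk is increasing in the shift, so bisection converges.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean_risk(mid) < observed_rate:
            lo = mid
        else:
            hi = mid
    shift = (lo + hi) / 2.0
    return shift, [_expit(z + shift) for z in logits]

# Toy data (hypothetical): the model under-predicts, so the shift is positive.
toy_shift, toy_risks = recalibrate_intercept([0.1, 0.2, 0.3], observed_rate=0.4)
```

Because every subject receives the same shift, the ranking of subjects, and therefore the AUC, is unchanged; only calibration moves.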

The development and validation of the Youth Actuarial Care Needs Assessment Tool for Non-Offenders (Y-ACNAT-NO).

PubMed

Assink, Mark; van der Put, Claudia E; Oort, Frans J; Stams, Geert Jan J M

2015-03-04

In The Netherlands, police officers not only come into contact with juvenile offenders, but also with a large number of juveniles who were involved in a criminal offense, but not in the role of a suspect (i.e., juvenile non-offenders). Until now, no valid and reliable instrument was available that can be used by Dutch police officers for estimating the risk for future care needs of juvenile non-offenders. In the present study, the Youth Actuarial Care Needs Assessment Tool for Non-Offenders (Y-ACNAT-NO) was developed for predicting the risk for future care needs that consisted of (1) a future supervision order as imposed by a juvenile court judge and (2) future worrisome incidents involving child abuse, domestic violence/strife, and/or sexual offensive behavior at the juvenile's living address (i.e., problems in the child-rearing environment). Police records of 3,200 juveniles were retrieved from the Dutch police registration system after which the sample was randomly split in a construction (n = 1,549) and validation sample (n = 1,651). The Y-ACNAT-NO was developed by performing an Exhaustive CHAID analysis using the construction sample. The predictive validity of the instrument was examined in the validation sample by calculating several performance indicators that assess discrimination and calibration. The CHAID output yielded an instrument that consisted of six variables and eleven different risk groups. The risk for future care needs ranged from 0.06 in the lowest risk group to 0.83 in the highest risk group. The AUC value in the validation sample was .764 (95% CI [.743, .784]) and Sander's calibration score indicated an average assessment error of 3.74% in risk estimates per risk category. The Y-ACNAT-NO is the first instrument that can be used by Dutch police officers for estimating the risk for future care needs of juvenile non-offenders. 
The predictive validity of the Y-ACNAT-NO in terms of discrimination and calibration was sufficient to justify its use as an initial screening instrument when a decision is needed about referring a juvenile for further assessment of care needs.
Linear and nonlinear models for predicting fish bioconcentration factors for pesticides.

PubMed

Yuan, Jintao; Xie, Chun; Zhang, Ting; Sun, Jinfang; Yuan, Xuejie; Yu, Shuling; Zhang, Yingbiao; Cao, Yunyuan; Yu, Xingchen; Yang, Xuan; Yao, Wu

2016-08-01

This work is devoted to the applications of the multiple linear regression (MLR), multilayer perceptron neural network (MLP NN) and projection pursuit regression (PPR) to quantitative structure-property relationship analysis of bioconcentration factors (BCFs) of pesticides tested on Bluegill (Lepomis macrochirus). Molecular descriptors of a total of 107 pesticides were calculated with the DRAGON Software and selected by inverse enhanced replacement method. Based on the selected DRAGON descriptors, a linear model was built by MLR, nonlinear models were developed using MLP NN and PPR. The robustness of the obtained models was assessed by cross-validation and external validation using test set. Outliers were also examined and deleted to improve predictive power. Comparative results revealed that PPR achieved the most accurate predictions. This study offers useful models and information for BCF prediction, risk assessment, and pesticide formulation. Copyright © 2016 Elsevier Ltd. All rights reserved.
Measurement of fatigue: Comparison of the reliability and validity of single-item and short measures to a comprehensive measure.

PubMed

Kim, Hee-Ju; Abraham, Ivo

2017-01-01

Evidence is needed on the clinicometric properties of single-item or short measures as alternatives to comprehensive measures. We examined whether two single-item fatigue measures (i.e., Likert scale, numeric rating scale) or a short fatigue measure were comparable to a comprehensive measure in reliability (i.e., internal consistency and test-retest reliability) and validity (i.e., convergent, concurrent, and predictive validity) in Korean young adults. For this quantitative study, we selected the Functional Assessment of Chronic Illness Therapy-Fatigue for the comprehensive measure and the Profile of Mood States-Brief, Fatigue subscale for the short measure; and constructed two single-item measures. A total of 368 students from four nursing colleges in South Korea participated. We used Cronbach's alpha and item-total correlation for internal consistency reliability and intraclass correlation coefficient for test-retest reliability. We assessed Pearson's correlation with a comprehensive measure for convergent validity, with perceived stress level and sleep quality for concurrent validity and the receiver operating characteristic curve for predictive validity. The short measure was comparable to the comprehensive measure in internal consistency reliability (Cronbach's alpha=0.81 vs. 0.88); test-retest reliability (intraclass correlation coefficient=0.66 vs. 0.61); convergent validity (r with comprehensive measure=0.79); concurrent validity (r with perceived stress=0.55, r with sleep quality=0.39) and predictive validity (area under curve=0.88). Single-item measures were not comparable to the comprehensive measure. A short fatigue measure exhibited similar levels of reliability and validity to the comprehensive measure in Korean young adults. Copyright © 2016 Elsevier Ltd. All rights reserved.
ACSNSQIP Risk Calculator in Indian Patients Undergoing Surgery for Head and Neck Cancers: Is It Valid?

PubMed

Subramaniam, Narayana; Balasubramanian, Deepak; Rka, Pradeep; Murthy, Samskruthi; Rathod, Priyank; Vidhyadharan, Sivakumar; Thankappan, Krishnakumar; Iyer, Subramania

2018-06-01

Pre-operative assessment is vital to determine patient-specific risks and minimize them in order to optimize surgical outcomes. The American College of Surgeons National Surgical Quality Improvement Program (ACSNSQIP) Surgical Risk Calculator is the most comprehensive surgical risk assessment tool available. We performed this study to determine the validity of the ACSNSQIP calculator when used to predict surgical complications in a cohort of patients with head and neck cancer treated in an Indian tertiary care center. Retrospective data were collected for 150 patients with head and neck cancer who were operated on in the Department of Head and Neck Oncology, Amrita Institute of Medical Sciences, Kochi, in the year 2016. The predicted outcome data were compared with actual documented outcome data for the variables mentioned. Brier's score was used to estimate the predictive value of the risk assessment generated. Pearson's r coefficient was utilized to validate the prediction of length of hospital stay. Brier's score for the entire calculator was 0.32 (not significant). Additionally, when the score was determined for individual parameters (surgical site infection, pneumonia, etc.), none were significant. Pearson's r value for length of stay was also not significant (p = .632). The ACSNSQIP risk assessment tool did not accurately reflect surgical outcomes in our cohort of Indian patients. Although it is the most comprehensive tool available at present, modifications that may improve accuracy are allowing for input of multiple procedure codes, risk stratifying for previous radiation or surgery, and better risk assessment for microvascular flap reconstruction.
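For readers unfamiliar with the Brier score used above, it is the mean squared difference between each predicted probability and the 0/1 outcome that actually occurred, so lower is better and 0 is perfect. The sketch below is a generic illustration with a hypothetical function name and toy inputs; the abstract does not describe how the authors pooled the score across complications.

```python
def brier_score(predicted, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not predicted or len(predicted) != len(outcomes):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(predicted, outcomes)) / len(predicted)

# Toy data (hypothetical): an uninformative forecast of 0.5 for everyone.
coin_flip = brier_score([0.5, 0.5], [0, 1])  # 0.25
perfect = brier_score([1.0, 0.0], [1, 0])    # 0.0
```

A constant forecast of 0.5 always scores 0.25 on a binary outcome, which gives a rough yardstick for the 0.32 reported above.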
Predictive model of outcome of targeted nodal assessment in colorectal cancer.

PubMed

Nissan, Aviram; Protic, Mladjan; Bilchik, Anton; Eberhardt, John; Peoples, George E; Stojadinovic, Alexander

2010-02-01

Improvement in staging accuracy is the principal aim of targeted nodal assessment in colorectal carcinoma. Technical factors independently predictive of false negative (FN) sentinel lymph node (SLN) mapping should be identified to facilitate operative decision making. To define independent predictors of FN SLN mapping and to develop a predictive model that could support surgical decisions. Data was analyzed from 2 completed prospective clinical trials involving 278 patients with colorectal carcinoma undergoing SLN mapping. Clinical outcome of interest was FN SLN(s), defined as one(s) with no apparent tumor cells in the presence of non-SLN metastases. To assess the independent predictive effect of a covariate for a nominal response (FN SLN), a logistic regression model was constructed and parameters estimated using maximum likelihood. A probabilistic Bayesian model was also trained and cross validated using 10-fold train-and-test sets to predict FN SLN mapping. Area under the curve (AUC) from receiver operating characteristics curves of these predictions was calculated to determine the predictive value of the model. Number of SLNs (<3; P = 0.03) and tumor-replaced nodes (P < 0.01) independently predicted FN SLN. Cross validation of the model created with Bayesian Network Analysis effectively predicted FN SLN (area under the curve = 0.84-0.86). The positive and negative predictive values of the model are 83% and 97%, respectively. This study supports a minimum threshold of 3 nodes for targeted nodal assessment in colorectal cancer, and establishes sufficient basis to conclude that SLN mapping and biopsy cannot be justified in the presence of clinically apparent tumor-replaced nodes.
A Multimethod Multitrait Validity Assessment of Self-Construal in Japan, Korea, and the United States

ERIC Educational Resources Information Center

Bresnahan, Mary J.; Levine, Timothy R.; Shearman, Sachiyo Morinaga; Lee, Sun Young; Park, Cheong-Yi; Kiyomiya, Toru

2005-01-01

A large number of previous studies have used self-construal to predict communication outcomes. Recent evidence, however, suggests that validity problems may exist in self-construal measurement. The current study conducted a multimethod multitrait (Campbell & Fiske, 1959) validation study of self-construal measures with data (total N = 578)…
Assessment of Young English Language Learners in Arizona: Questioning the Validity of the State Measure of English Proficiency

ERIC Educational Resources Information Center

Garcia, Eugene E.; Lawton, Kerry; Diniz de Figueiredo, Eduardo H.

2010-01-01

This study analyzes the Arizona policy of utilizing a single assessment of English proficiency to determine if students should be exited from the ELL program, which is ostensibly designed to make it possible for them to succeed in the mainstream classroom without any further language support. The study examines the predictive validity of this…
Surrogate screening models for the low physical activity criterion of frailty.

PubMed

Eckel, Sandrah P; Bandeen-Roche, Karen; Chaves, Paulo H M; Fried, Linda P; Louis, Thomas A

2011-06-01

Low physical activity, one of five criteria in a validated clinical phenotype of frailty, is assessed by a standardized, semiquantitative questionnaire on up to 20 leisure time activities. Because of the time demanded to collect the interview data, it has been challenging to translate to studies other than the Cardiovascular Health Study (CHS), for which it was developed. Considering subsets of activities, we identified and evaluated streamlined surrogate assessment methods and compared them to one implemented in the Women's Health and Aging Study (WHAS). Using data on men and women ages 65 and older from the CHS, we applied logistic regression models to rank activities by "relative influence" in predicting low physical activity. We considered subsets of the most influential activities as inputs to potential surrogate models (logistic regressions). We evaluated predictive accuracy and predictive validity using the area under receiver operating characteristic curves and assessed criterion validity using proportional hazards models relating frailty status (defined using the surrogate) to mortality. Walking for exercise and moderately strenuous household chores were highly influential for both genders. Women required fewer activities than men for accurate classification. The WHAS model (8 CHS activities) was an effective surrogate, but a surrogate using 6 activities (walking, chores, gardening, general exercise, mowing and golfing) was also highly predictive. We recommend a 6 activity questionnaire to assess physical activity for men and women. If efficiency is essential and the study involves only women, fewer activities can be included.
Observational Assessment of Preschool Disruptive Behavior, Part II: validity of the Disruptive Behavior Diagnostic Observation Schedule (DB-DOS).

PubMed

Wakschlag, Lauren S; Briggs-Gowan, Margaret J; Hill, Carri; Danis, Barbara; Leventhal, Bennett L; Keenan, Kate; Egger, Helen L; Cicchetti, Domenic; Burns, James; Carter, Alice S

2008-06-01

To examine the validity of the Disruptive Behavior Diagnostic Observation Schedule (DB-DOS), a new observational method for assessing preschool disruptive behavior. A total of 327 behaviorally heterogeneous preschoolers from low-income environments comprised the validation sample. Parent and teacher reports were used to identify children with clinically significant disruptive behavior. The DB-DOS assessed observed disruptive behavior in two domains, problems in Behavioral Regulation and Anger Modulation, across three interactional contexts: Examiner Engaged, Examiner Busy, and Parent. Convergent and divergent validity of the DB-DOS were tested in relation to parent and teacher reports and independently observed behavior. Clinical validity was tested in terms of criterion and incremental validity of the DB-DOS for discriminating disruptive behavior status and impairment, concurrently and longitudinally. DB-DOS scores were significantly associated with reported and independently observed behavior in a theoretically meaningful fashion. Scores from both DB-DOS domains and each of the three DB-DOS contexts contributed uniquely to discrimination of disruptive behavior status, concurrently and predictively. Observed behavior on the DB-DOS also contributed incrementally to prediction of impairment over time, beyond variance explained by meeting DSM-IV disruptive behavior disorder symptom criteria based on parent/teacher report. The multidomain, multicontext approach of the DB-DOS is a valid method for direct assessment of preschool disruptive behavior. This approach shows promise for enhancing accurate identification of clinically significant disruptive behavior in young children and for characterizing subtypes in a manner that can directly inform etiological and intervention research.
Validation of Groundwater Models: Meaningful or Meaningless?

NASA Astrophysics Data System (ADS)

Konikow, L. F.

2003-12-01

Although numerical simulation models are valuable tools for analyzing groundwater systems, their predictive accuracy is limited. People who apply groundwater flow or solute-transport models, as well as those who make decisions based on model results, naturally want assurance that a model is "valid." To many people, model validation implies some authentication of the truth or accuracy of the model. History matching is often presented as the basis for model validation. Although such model calibration is a necessary modeling step, it is simply insufficient for model validation. Because of parameter uncertainty and solution non-uniqueness, declarations of validation (or verification) of a model are not meaningful. Post-audits represent a useful means to assess the predictive accuracy of a site-specific model, but they require the existence of long-term monitoring data. Model testing may yield invalidation, but that is an opportunity to learn and to improve the conceptual and numerical models. Examples of post-audits and of the application of a solute-transport model to a radioactive waste disposal site illustrate deficiencies in model calibration, prediction, and validation.
Validation of a single-stage fixed-rate step test for the prediction of maximal oxygen uptake in healthy adults.

PubMed

Hansen, Dominique; Jacobs, Nele; Thijs, Herbert; Dendale, Paul; Claes, Neree

2016-09-01

Healthcare professionals with limited access to ergospirometry remain in need of valid and simple submaximal exercise tests to predict maximal oxygen uptake (VO2max). Despite previous validation studies concerning fixed-rate step tests, accurate equations for the estimation of VO2max remain to be formulated from a large sample of healthy adults between age 18-75 years (n > 100). The aim of this study was to develop a valid equation to estimate VO2max from a fixed-rate step test in a larger sample of healthy adults. A maximal ergospirometry test, with assessment of cardiopulmonary parameters and VO2max, and a 5-min fixed-rate single-stage step test were executed in 112 healthy adults (age 18-75 years). During the step test and subsequent recovery, heart rate was monitored continuously. By linear regression analysis, an equation to predict VO2max from the step test was formulated. This equation was assessed for level of agreement by displaying Bland-Altman plots and calculation of intraclass correlations with measured VO2max. Validity was further assessed by employing a Jackknife procedure. The linear regression analysis generated the following equation to predict VO2max (l min⁻¹) from the step test: 0·054(BMI) + 0·612(gender) + 3·359(body height in m) + 0·019(fitness index) - 0·012(HRmax) - 0·011(age) - 3·475. This equation explained 78% of the variance in measured VO2max (F = 66·15, P<0·001). The level of agreement and intraclass correlation was high (ICC = 0·94, P<0·001) between measured and predicted VO2max. From this study, a valid fixed-rate single-stage step test equation has been developed to estimate VO2max in healthy adults. This tool could be employed by healthcare professionals with limited access to ergospirometry. © 2015 Scandinavian Society of Clinical Physiology and Nuclear Medicine. Published by John Wiley & Sons Ltd.
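The regression equation quoted above translates directly into a small function. This is a sketch under stated assumptions: the abstract does not say how gender is coded, how the fitness index is derived, or at what point HRmax is measured, so the function simply takes those values as already-coded numbers. The function name, parameter names and example inputs are hypothetical and chosen only to exercise the arithmetic.

```python
def predict_vo2max(bmi, gender, height_m, fitness_index, hr_max, age):
    """Predicted VO2max in l/min from the step-test equation in the abstract.

    Assumptions (not stated in the abstract): `gender` is the numeric code
    used by the authors, `fitness_index` is their step-test index, and
    `hr_max` is the heart-rate value their equation calls HRmax.
    """
    return (
        0.054 * bmi
        + 0.612 * gender
        + 3.359 * height_m
        + 0.019 * fitness_index
        - 0.012 * hr_max
        - 0.011 * age
        - 3.475
    )

# Made-up inputs, used only to show the calculation.
example = predict_vo2max(bmi=24, gender=1, height_m=1.80,
                         fitness_index=60, hr_max=150, age=40)
```

Anyone applying the equation in practice would need the full paper for the variable definitions and coding.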
Effectively Coping With Task Stress: A Study of the Validity of the Trait Emotional Intelligence Questionnaire-Short Form (TEIQue-SF).

PubMed

O'Connor, Peter; Nguyen, Jessica; Anglim, Jeromy

2017-01-01

In this study, we investigated the validity of the Trait Emotional Intelligence Questionnaire-Short Form (TEIQue-SF; Petrides, 2009) in the context of task-induced stress. We used a total sample of 225 volunteers to investigate (a) the incremental validity of the TEIQue-SF over other predictors of coping with task-induced stress, and (b) the construct validity of the TEIQue-SF by examining the mechanisms via which scores from the TEIQue-SF predict coping outcomes. Results demonstrated that the TEIQue-SF possessed incremental validity over the Big Five personality traits in the prediction of emotion-focused coping. Results also provided support for the construct validity of the TEIQue-SF by demonstrating that this measure predicted adaptive coping via emotion-focused channels. Specifically, results showed that, following a task stressor, the TEIQue-SF predicted low negative affect and high task performance via high levels of emotion-focused coping. Consistent with the purported theoretical nature of the trait emotional intelligence (EI) construct, trait EI as assessed by the TEIQue-SF primarily enhances affect and performance in stressful situations by regulating negative emotions.
Reliability and validity of Arabic Rapid Estimate of Adult Literacy in Dentistry (AREALD-30) in Saudi Arabia.

PubMed

Tadakamadla, Santosh Kumar; Quadri, Mir Faeq Ali; Pakpour, Amir H; Zailai, Abdulaziz M; Sayed, Mohammed E; Mashyakhy, Mohammed; Inamdar, Aadil S; Tadakamadla, Jyothi

2014-09-29

To evaluate the reliability and validity of Arabic Rapid Estimate of Adult Literacy in Dentistry (AREALD-30) in Saudi Arabia. A convenience sample of 200 subjects was approached, of whom 177 agreed to participate, giving a response rate of 88.5%. Rapid Estimate of Adult Literacy in Dentistry (REALD-99) was translated into Arabic to prepare the longer and shorter versions of Arabic Rapid Estimate of Adult Literacy in Dentistry (AREALD-99 and AREALD-30). Each participant was provided with AREALD-99, which also includes words from AREALD-30. A questionnaire containing socio-behavioral information and Arabic Oral Health Impact Profile (A-OHIP-14) was also administered. Reliability of the AREALD-30 was assessed by re-administering it to 20 subjects after two weeks. Convergent and predictive validity of AREALD-30 was evaluated by its correlations with AREALD-99 and self-perceived oral health status, dental visiting habits and A-OHIP-14 respectively. Discriminant validity was assessed in relation to the educational level while construct validity was evaluated by confirmatory factor analysis (CFA). Reliability of AREALD-30 was excellent with intraclass correlation coefficient of 0.99. It exhibited good convergent and discriminant validity but poor predictive validity. CFA showed presence of two factors and infit mean-square statistics for AREALD-30 were all within the desired range of 0.50 - 2.0 in Rasch analysis. AREALD-30 showed excellent reliability, good convergent and concurrent validity, but failed to predict the differences between the subjects categorized based on their oral health outcomes.
Development and validation of an instrument for rapidly assessing symptoms: the general symptom distress scale.

PubMed

Badger, Terry A; Segrin, Chris; Meek, Paula

2011-03-01

Symptom assessment has increasingly focused on the evaluation of total symptom distress or burden rather than assessing only individual symptoms. The challenge for clinicians and researchers alike is to assess symptoms, and to determine the symptom distress associated with the symptoms and the patient's ability for symptom management without a lengthy and burdensome assessment process. The objective of this article was to discuss the psychometric evaluation of a brief general symptom distress scale (GSDS) developed to assess specific symptoms and how they rank in relation to each other, the overall symptom distress associated with the symptom schema, and provide an assessment of how well or poorly that symptom schema is managed. Results from a pilot study about the initial development of the GSDS with 76 hospitalized patients are presented, followed by a more complete psychometric evaluation of the GSDS using three samples of cancer patients (n=190) and their social network members, called partners in these studies (n=94). Descriptive statistics were used to describe the GSDS symptoms, symptom distress, and symptom management. Point biserial correlations indexed the associations between dichotomous symptoms and continuous measures, and conditional probabilities were used to illustrate the substantial comorbidities of this sample. Internal consistency was examined using the KR-20 coefficient, and test-retest reliability was examined. Construct validity and predictive validity also were examined. The GSDS demonstrated satisfactory internal consistency and test-retest reliability, and good construct validity and predictive validity. The total score on the GSDS, symptom distress, and symptom management correlated significantly with related constructs of depression, positive and negative affect, and general health. 
The GSDS was able to demonstrate its ability to distinguish between those with or without chronic illness, and was able to significantly predict scores on criterion measures such as depression. Collectively, these results suggest that the GSDS is a straightforward and useful instrument for rapidly assessing symptoms that can disrupt health-related quality of life. Copyright © 2011 U.S. Cancer Pain Relief Committee. Published by Elsevier Inc. All rights reserved.
Memory Binding Test Predicts Incident Amnestic Mild Cognitive Impairment.

PubMed

Mowrey, Wenzhu B; Lipton, Richard B; Katz, Mindy J; Ramratan, Wendy S; Loewenstein, David A; Zimmerman, Molly E; Buschke, Herman

2016-07-14

The Memory Binding Test (MBT), previously known as Memory Capacity Test, has demonstrated discriminative validity for distinguishing persons with amnestic mild cognitive impairment (aMCI) and dementia from cognitively normal elderly. We aimed to assess the predictive validity of the MBT for incident aMCI. In a longitudinal, community-based study of adults aged 70+, we administered the MBT to 246 cognitively normal elderly adults at baseline and followed them annually. Based on previous work, a subtle reduction in memory binding at baseline was defined by a Total Items in the Paired (TIP) condition score of ≤22 on the MBT. Cox proportional hazards models were used to assess the predictive validity of the MBT for incident aMCI accounting for the effects of covariates. The hazard ratio of incident aMCI was also assessed for different prediction time windows ranging from 4 to 7 years of follow-up, separately. Among 246 controls who were cognitively normal at baseline, 48 developed incident aMCI during follow-up. A baseline MBT reduction was associated with an increased risk for developing incident aMCI (hazard ratio (HR) = 2.44, 95% confidence interval: 1.30-4.56, p = 0.005). When varying the prediction window from 4-7 years, the MBT reduction remained significant for predicting incident aMCI (HR range: 2.33-3.12, p: 0.0007-0.04). Persons with poor performance on the MBT are at significantly greater risk for developing incident aMCI. High hazard ratios up to seven years of follow-up suggest that the MBT is sensitive to early disease.
The assessment of fitness to drive in people with dementia.

PubMed

Lincoln, Nadina B; Radford, Kate A; Lee, Elizabeth; Reay, Alice C

2006-11-01

To determine whether cognitive tests predict fitness to drive in patients with dementia. Two group comparison of patients with dementia and healthy elderly volunteers, and comparison of patients with dementia who were found safe to drive and those found unsafe, followed by a validation study. Forty-two people with dementia and 33 healthy elderly volunteers with no known memory problems who were driving. Of the 42 people with dementia 37 were assessed on the road. A second sample of 17 people with dementia was also assessed on the road. Stroke Drivers Screening Assessment, Mini Mental State Examination, Salford Objective Recognition Test, Stroop Test, Test of Everyday Attention, Visual Object and Space Perception Battery, Behavioural Assessment of the Dysexecutive Syndrome, Adult Memory and Information Processing Battery. All healthy elderly volunteers were safe to drive but 10 of the 27 patients with dementia were unsafe. Discriminant function analysis identified a combination of tests, which correctly classified 92% of drivers with dementia as safe or unsafe. Validation of this prediction on an independent sample had 59% accuracy using a cut-off of 0 but 88% accuracy using a cut-off of 5. Safety to drive in people with dementia could be predicted from a combination of six cognitive tests. These correctly identified 67% of safe drivers in a validation sample. This assessment could be used to identify those who need evaluation of their safety on the road. Copyright (c) 2006 John Wiley & Sons, Ltd.
Validation of a multifactorial risk factor model used for predicting future caries risk with Nevada adolescents.

PubMed

Ditmyer, Marcia M; Dounis, Georgia; Howard, Katherine M; Mobley, Connie; Cappelli, David

2011-05-20

The objective of this study was to measure the validity and reliability of a multifactorial Risk Factor Model developed for use in predicting future caries risk in Nevada adolescents in a public health setting. This study examined retrospective data from an oral health surveillance initiative that screened over 51,000 students 13-18 years of age, attending public/private schools in Nevada across six academic years (2002/2003-2007/2008). The Risk Factor Model included ten demographic variables: exposure to fluoridation in the municipal water supply, environmental smoke exposure, race, age, locale (metropolitan vs. rural), tobacco use, Body Mass Index, insurance status, sex, and sealant application. Multiple regression was used in a previous study to establish which significantly contributed to caries risk. Follow-up logistic regression ascertained the weight of contribution and odds ratios of the ten variables. Researchers in this study computed sensitivity, specificity, positive predictive value (PVP), negative predictive value (PVN), and prevalence across all six years of screening to assess the validity of the Risk Factor Model. Subjects' overall mean caries prevalence across all six years was 66%. Average sensitivity across all six years was 79%; average specificity was 81%; average PVP was 89% and average PVN was 67%. Overall, the Risk Factor Model provided a relatively constant, valid measure of caries that could be used in conjunction with a comprehensive risk assessment in population-based screenings by school nurses/nurse practitioners, health educators, and physicians to guide them in assessing potential future caries risk for use in prevention and referral practices.
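The relationship among the reported figures can be illustrated with Bayes' rule. The sketch below is not the authors' computation; it assumes that the averaged sensitivity, specificity, and prevalence can be combined directly (averages of yearly predictive values need not equal predictive values computed from averages). The function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Averages quoted in the abstract: sensitivity 79%, specificity 81%, prevalence 66%.
ppv, npv = predictive_values(0.79, 0.81, 0.66)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

With these inputs the implied values round to the 89% and 67% quoted in the abstract. The high prevalence is what pushes the positive predictive value above both sensitivity and specificity while pulling the negative predictive value below them.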
Genome-based prediction of test cross performance in two subsequent breeding cycles.

PubMed

Hofheinz, Nina; Borchardt, Dietrich; Weissleder, Knuth; Frisch, Matthias

2012-12-01

Genome-based prediction of genetic values is expected to overcome shortcomings that limit the application of QTL mapping and marker-assisted selection in plant breeding. Our goal was to study the genome-based prediction of test cross performance with genetic effects that were estimated using genotypes from the preceding breeding cycle. In particular, our objectives were to employ a ridge regression approach that approximates best linear unbiased prediction of genetic effects, compare cross validation with validation using genetic material of the subsequent breeding cycle, and investigate the prospects of genome-based prediction in sugar beet breeding. We focused on the traits sugar content and standard molasses loss (ML) and used a set of 310 sugar beet lines to estimate genetic effects at 384 SNP markers. In cross validation, correlations >0.8 between observed and predicted test cross performance were observed for both traits. However, in validation with 56 lines from the next breeding cycle, a correlation of 0.8 was observed only for sugar content; for standard ML, the correlation dropped to 0.4. We found that ridge regression based on preliminary estimates of the heritability provided a very good approximation of best linear unbiased prediction and was not accompanied by a loss in prediction accuracy. We conclude that prediction accuracy assessed with cross validation within one cycle of a breeding program cannot be used as an indicator for the accuracy of predicting lines of the next cycle. Prediction of lines of the next cycle seems promising for traits with high heritabilities.
Predictive validity of the Biomedical Admissions Test: an evaluation and case study.

PubMed

McManus, I C; Ferguson, Eamonn; Wakeford, Richard; Powis, David; James, David

2011-01-01

There has been an increase in the use of pre-admission selection tests for medicine. Such tests need to show good psychometric properties. Here, we use a paper by Emery and Bell [2009. The predictive validity of the Biomedical Admissions Test for pre-clinical examination performance. Med Educ 43:557-564] as a case study to evaluate and comment on the reporting of psychometric data in the field of medical student selection (and the comments apply to many papers in the field). We highlight pitfalls when reliability data are not presented, how simple zero-order associations can lead to inaccurate conclusions about the predictive validity of a test, and how biases need to be explored and reported. We show with BMAT that it is the knowledge part of the test which does all the predictive work. We show that without evidence of incremental validity it is difficult to assess the value of any selection tests for medicine.
Predicting on-road assessment pass and fail outcomes in older drivers with cognitive impairment using a battery of computerized sensory-motor and cognitive tests.

PubMed

Hoggarth, Petra A; Innes, Carrie R H; Dalrymple-Alford, John C; Jones, Richard D

2013-12-01

To generate a robust model of computerized sensory-motor and cognitive test performance to predict on-road driving assessment outcomes in older persons with diagnosed or suspected cognitive impairment. A logistic regression model classified pass–fail outcomes of a blinded on-road driving assessment. Generalizability of the model was tested using leave-one-out cross-validation. Three specialist clinics in New Zealand. Drivers (n=279; mean age 78.4, 65% male) with diagnosed or suspected dementia, mild cognitive impairment, unspecified cognitive impairment, or memory problems referred for a medical driving assessment. A computerized battery of sensory-motor and cognitive tests and an on-road medical driving assessment. One hundred fifty-five participants (55.5%) received an on-road fail score. Binary logistic regression correctly classified 75.6% of the sample into on-road pass and fail groups. The cross-validation indicated accuracy of the model of 72.0% with sensitivity for detecting on-road fails of 73.5%, specificity of 70.2%, positive predictive value of 75.5%, and negative predictive value of 68%. The off-road assessment prediction model resulted in a substantial number of people who were assessed as likely to fail despite passing an on-road assessment and vice versa. Thus, despite a large multicenter sample, the use of off-road tests previously found to be useful in other older populations, and a carefully constructed and tested prediction model, off-road measures have yet to be found that are sufficiently accurate to allow acceptable determination of on-road driving safety of cognitively impaired older drivers. © 2013, Copyright the Authors Journal compilation © 2013, The American Geriatrics Society.
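The cross-validated figures above can be tied together with a single confusion matrix. The counts in the sketch below are not reported in the abstract; they are back-calculated from n = 279, 155 on-road fails, and the quoted sensitivity and specificity, rounded to whole participants, so they should be read as an illustrative reconstruction. "Positive" here means an on-road fail.

```python
def classification_summary(tp, fn, tn, fp):
    """Summarise a 2x2 confusion matrix (positive class = on-road fail)."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Reconstructed (assumed) counts: 155 fails split 114/41, 124 passes split 87/37.
summary = classification_summary(tp=114, fn=41, tn=87, fp=37)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, 41 drivers who failed on-road would have been predicted to pass and 37 who passed would have been predicted to fail, which is the misclassification burden the authors describe as too high for clinical use.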

The Reliability and Validity of the Thoracolumbar Injury Classification System in Pediatric Spine Trauma.

PubMed

Savage, Jason W; Moore, Timothy A; Arnold, Paul M; Thakur, Nikhil; Hsu, Wellington K; Patel, Alpesh A; McCarthy, Kathryn; Schroeder, Gregory D; Vaccaro, Alexander R; Dimar, John R; Anderson, Paul A

2015-09-15

The thoracolumbar injury classification system (TLICS) was evaluated in 20 consecutive pediatric spine trauma cases. The purpose of this study was to determine the reliability and validity of the TLICS in pediatric spine trauma. The TLICS was developed to improve the categorization and management of thoracolumbar trauma. TLICS has been shown to have good reliability and validity in the adult population. The clinical and radiographical findings of 20 pediatric thoracolumbar fractures were prospectively presented to 20 surgeons with disparate levels of training and experience with spinal trauma. These injuries were consecutively scored using the TLICS. Cohen unweighted κ coefficients and Spearman rank order correlation values were calculated for the key parameters (injury morphology, status of posterior ligamentous complex, neurological status, TLICS total score, and proposed management) to assess the inter-rater reliabilities. Five surgeons scored the same cases 3 months later to assess the intra-rater reliability. The actual management of each case was then compared with the treatment recommended by the TLICS algorithm to assess validity. The inter-rater κ statistics of all subgroups (injury morphology, status of the posterior ligamentous complex, neurological status, TLICS total score, and proposed treatment) were within the range of moderate to substantial reproducibility (0.524-0.958). All subgroups had excellent intra-rater reliability (0.748-1.000). The various indices for validity were calculated (80.3% correct, 0.836 sensitivity, 0.785 specificity, 0.676 positive predictive value, 0.899 negative predictive value). Overall, TLICS demonstrated good validity. The TLICS has good reliability and validity when used in the pediatric population. 
The inter-rater reliability of predicting management and indices for validity are lower than those in adults with thoracolumbar fractures, which is likely due to differences in the way children are treated for certain types of injuries. TLICS can be used to reliably categorize thoracolumbar injuries in the pediatric population; however, modifications may be needed to better guide treatment in this specific patient population. Level of Evidence: 4.
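For readers unfamiliar with the statistic, the following is a minimal sketch of Cohen's unweighted κ for two raters. It is not the study's analysis (which involved 20 surgeons, so pairwise values would have to be summarised in some way the abstract does not specify), and the ratings and category labels are invented for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's unweighted kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical morphology ratings for ten cases (not data from the study).
surgeon_1 = ["compression"] * 3 + ["burst"] * 3 + ["distraction"] * 4
surgeon_2 = ["compression", "compression", "burst", "burst", "burst",
             "distraction", "distraction", "distraction", "distraction",
             "compression"]
kappa = cohen_kappa(surgeon_1, surgeon_2)
print(f"kappa = {kappa:.3f}")
```

In this toy example the raters agree on 7 of 10 cases, but because 34% agreement would be expected by chance, κ is noticeably lower than the raw agreement rate.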
The Miller Assessment for Preschoolers: A Longitudinal and Predictive Study. Final Report.

ERIC Educational Resources Information Center

Foundation for Knowledge in Development, Littleton, CO.

The study reported here sought to establish the predictive validity of the Miller Assessment for Preschoolers (MAP), an instrument designed to identify preschool children at risk for school-related problems in the primary years. Children (N=338) in 11 states who were originally tested in 1980 as part of the MAP standardization project were given a…
Assessing the Incremental Value of KABC-II Luria Model Scores in Predicting Achievement: What Do They Tell Us beyond the MPI?

ERIC Educational Resources Information Center

McGill, Ryan J.; Spurgin, Angelia R.

2016-01-01

The current study examined the incremental validity of the Luria interpretive scheme for the Kaufman Assessment Battery for Children-Second Edition (KABC-II) for predicting scores on the Kaufman Test of Educational Achievement-Second Edition (KTEA-II). All participants were children and adolescents (N = 2,025) drawn from the nationally…
The Predictive Validity of Four Intelligence Tests for School Grades: A Small Sample Longitudinal Study

PubMed Central

Gygi, Jasmin T.; Hagmann-von Arx, Priska; Schweizer, Florine; Grob, Alexander

2017-01-01

Intelligence is considered the strongest single predictor of scholastic achievement. However, little is known regarding the predictive validity of well-established intelligence tests for school grades. We analyzed the predictive validity of four widely used intelligence tests in German-speaking countries: The Intelligence and Development Scales (IDS), the Reynolds Intellectual Assessment Scales (RIAS), the Snijders-Oomen Nonverbal Intelligence Test (SON-R 6-40), and the Wechsler Intelligence Scale for Children (WISC-IV), which were individually administered to 103 children (Mage = 9.17 years) enrolled in regular school. School grades were collected longitudinally after 3 years (averaged school grades, mathematics, and language) and were available for 54 children (Mage = 11.77 years). All four tests significantly predicted averaged school grades. Furthermore, the IDS and the RIAS predicted both mathematics and language, while the SON-R 6-40 predicted mathematics. The WISC-IV showed no significant association with longitudinal scholastic achievement when mathematics and language were analyzed separately. The results revealed the predictive validity of currently used intelligence tests for longitudinal scholastic achievement in German-speaking countries and support their use in psychological practice, in particular for predicting averaged school grades. However, this conclusion has to be considered as preliminary due to the small sample of children observed. PMID:28348543
A Quantitative Structure Activity Relationship for acute oral toxicity of pesticides on rats: Validation, domain of application and prediction.

PubMed

Hamadache, Mabrouk; Benkortbi, Othmane; Hanini, Salah; Amrane, Abdeltif; Khaouane, Latifa; Si Moussa, Cherif

2016-02-13

Quantitative Structure Activity Relationship (QSAR) models are expected to play an important role in the risk assessment of chemicals on humans and the environment. In this study, we developed a validated QSAR model to predict acute oral toxicity of 329 pesticides to rats because few QSAR models have been devoted to predicting the Lethal Dose 50 (LD50) of pesticides in rats. This QSAR model is based on 17 molecular descriptors, and is robust, externally predictive and characterized by a good applicability domain. The best results were obtained with a 17/9/1 Artificial Neural Network model trained with the Quasi Newton back propagation (BFGS) algorithm. The prediction accuracy for the external validation set was estimated by the Q²ext and the root mean square error (RMS), which are equal to 0.948 and 0.201, respectively. In total, 98.6% of the external validation set was correctly predicted, and the present model proved to be superior to models previously published. Accordingly, the model developed in this study provides excellent predictions and can be used to predict the acute oral toxicity of pesticides, particularly for those that have not been tested as well as new pesticides. Copyright © 2015 Elsevier B.V. All rights reserved.
Assessing Discriminative Performance at External Validation of Clinical Prediction Models

PubMed Central

Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W.

2016-01-01

Introduction: External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. Methods: We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. Results: The permutation test indicated that the validation and development set were homogeneous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. Conclusion: The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients. PMID:26881753
Assessing Discriminative Performance at External Validation of Clinical Prediction Models.

PubMed

Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W

2016-01-01

External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. The permutation test indicated that the validation and development set were homogeneous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients.
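The c-statistic discussed in this record has a simple pairwise definition for binary outcomes: the proportion of (event, non-event) pairs in which the event received the higher predicted risk, with ties counted as one half. The sketch below implements that definition directly; it is a generic illustration with invented data, not the permutation test or benchmark framework evaluated by the authors.

```python
def c_statistic(risks, outcomes):
    """Concordance (c) statistic for binary outcomes coded 1 (event) / 0 (non-event)."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for risk_event in events:
        for risk_non_event in non_events:
            if risk_event > risk_non_event:
                concordant += 1.0
            elif risk_event == risk_non_event:
                concordant += 0.5  # ties contribute half a concordant pair
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and observed outcomes.
risks = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
outcomes = [1, 1, 1, 0, 0, 0]
c = c_statistic(risks, outcomes)
print(f"c-statistic = {c:.3f}")
```

Because the statistic depends only on how well predicted risks separate events from non-events in the sample at hand, a population with a narrower spread of risks will tend to yield a lower value even when the regression coefficients are correct, which is the case-mix effect the abstract warns about.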
Do in-training evaluation reports deserve their bad reputations? A study of the reliability and predictive ability of ITER scores and narrative comments.

PubMed

Ginsburg, Shiphra; Eva, Kevin; Regehr, Glenn

2013-10-01

Although scores on in-training evaluation reports (ITERs) are often criticized for poor reliability and validity, ITER comments may yield valuable information. The authors assessed across-rotation reliability of ITER scores in one internal medicine program, ability of ITER scores and comments to predict postgraduate year three (PGY3) performance, and reliability and incremental predictive validity of attendings' analysis of written comments. Numeric and narrative data from the first two years of ITERs for one cohort of residents at the University of Toronto Faculty of Medicine (2009-2011) were assessed for reliability and predictive validity of third-year performance. Twenty-four faculty attendings rank-ordered comments (without scores) such that each resident was ranked by three faculty. Mean ITER scores and comment rankings were submitted to regression analyses; dependent variables were PGY3 ITER scores and program directors' rankings. Reliabilities of ITER scores across nine rotations for 63 residents were 0.53 for both postgraduate year one (PGY1) and postgraduate year two (PGY2). Interrater reliabilities across three attendings' rankings were 0.83 for PGY1 and 0.79 for PGY2. There were strong correlations between ITER scores and comments within each year (0.72 and 0.70). Regressions revealed that PGY1 and PGY2 ITER scores collectively explained 25% of variance in PGY3 scores and 46% of variance in PGY3 rankings. Comment rankings did not improve predictions. ITER scores across multiple rotations showed decent reliability and predictive validity. Comment ranks did not add to the predictive ability, but correlation analyses suggest that trainee performance can be measured through these comments.
Development and validation of anthropometric prediction equations for estimation of lean body mass and appendicular lean soft tissue in Indian men and women.

PubMed

Kulkarni, Bharati; Kuper, Hannah; Taylor, Amy; Wells, Jonathan C; Radhakrishna, K V; Kinra, Sanjay; Ben-Shlomo, Yoav; Smith, George Davey; Ebrahim, Shah; Byrne, Nuala M; Hills, Andrew P

2013-10-15

Lean body mass (LBM) and muscle mass remain difficult to quantify in large epidemiological studies due to the unavailability of inexpensive methods. We therefore developed anthropometric prediction equations to estimate the LBM and appendicular lean soft tissue (ALST) using dual-energy X-ray absorptiometry (DXA) as a reference method. Healthy volunteers (n = 2,220; 36% women; age 18-79 yr), representing a wide range of body mass index (14-44 kg/m²), participated in this study. Their LBM, including ALST, was assessed by DXA along with anthropometric measurements. The sample was divided into prediction (60%) and validation (40%) sets. In the prediction set, a number of prediction models were constructed using DXA-measured LBM and ALST estimates as dependent variables and a combination of anthropometric indices as independent variables. These equations were cross-validated in the validation set. Simple equations using age, height, and weight explained >90% variation in the LBM and ALST in both men and women. Additional variables (hip and limb circumferences and sum of skinfold thicknesses) increased the explained variation by 5-8% in the fully adjusted models predicting LBM and ALST. More complex equations using all of the above anthropometric variables could predict the DXA-measured LBM and ALST accurately, as indicated by low standard error of the estimate (LBM: 1.47 kg and 1.63 kg for men and women, respectively), as well as good agreement by Bland-Altman analyses (Bland JM, Altman D. Lancet 1: 307-310, 1986). These equations could be a valuable tool in large epidemiological studies assessing these body compartments in Indians and other population groups with similar body composition.
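The Bland-Altman agreement analysis cited above reduces to the mean difference between two methods (the bias) and limits of agreement at roughly 1.96 standard deviations either side of it. The sketch below shows that calculation on invented numbers; the variable names and values are hypothetical and do not come from the study.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) of agreement between two methods."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(differences)
    spread = statistics.stdev(differences)  # sample standard deviation
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical lean body mass values in kg: equation-predicted vs DXA-measured.
predicted_lbm = [50.0, 60.0, 55.0, 70.0]
dxa_lbm = [51.0, 59.0, 56.0, 69.0]
bias, lower, upper = bland_altman(predicted_lbm, dxa_lbm)
print(f"bias = {bias:.2f} kg, limits of agreement = ({lower:.2f}, {upper:.2f}) kg")
```

A bias near zero with narrow limits indicates that the prediction equation and the reference method can be used interchangeably for most practical purposes; how narrow is narrow enough is a judgment for the application.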
Development and validation of anthropometric prediction equations for estimation of lean body mass and appendicular lean soft tissue in Indian men and women

PubMed Central

Kulkarni, Bharati; Kuper, Hannah; Taylor, Amy; Wells, Jonathan C.; Radhakrishna, K. V.; Kinra, Sanjay; Ben-Shlomo, Yoav; Smith, George Davey; Ebrahim, Shah; Byrne, Nuala M.; Hills, Andrew P.

2013-01-01

Lean body mass (LBM) and muscle mass remain difficult to quantify in large epidemiological studies due to the unavailability of inexpensive methods. We therefore developed anthropometric prediction equations to estimate the LBM and appendicular lean soft tissue (ALST) using dual-energy X-ray absorptiometry (DXA) as a reference method. Healthy volunteers (n = 2,220; 36% women; age 18-79 yr), representing a wide range of body mass index (14–44 kg/m²), participated in this study. Their LBM, including ALST, was assessed by DXA along with anthropometric measurements. The sample was divided into prediction (60%) and validation (40%) sets. In the prediction set, a number of prediction models were constructed using DXA-measured LBM and ALST estimates as dependent variables and a combination of anthropometric indices as independent variables. These equations were cross-validated in the validation set. Simple equations using age, height, and weight explained >90% variation in the LBM and ALST in both men and women. Additional variables (hip and limb circumferences and sum of skinfold thicknesses) increased the explained variation by 5–8% in the fully adjusted models predicting LBM and ALST. More complex equations using all of the above anthropometric variables could predict the DXA-measured LBM and ALST accurately, as indicated by low standard error of the estimate (LBM: 1.47 kg and 1.63 kg for men and women, respectively), as well as good agreement by Bland-Altman analyses (Bland JM, Altman D. Lancet 1: 307–310, 1986). These equations could be a valuable tool in large epidemiological studies assessing these body compartments in Indians and other population groups with similar body composition. PMID:23950165
Developing prediction equations and a mobile phone application to identify infants at risk of obesity.

PubMed

Santorelli, Gillian; Petherick, Emily S; Wright, John; Wilson, Brad; Samiei, Haider; Cameron, Noël; Johnson, William

2013-01-01

Advancements in knowledge of obesity aetiology and mobile phone technology have created the opportunity to develop an electronic tool to predict an infant's risk of childhood obesity. The study aims were to develop and validate equations for the prediction of childhood obesity and integrate them into a mobile phone application (App). Anthropometry and childhood obesity risk data were obtained for 1868 UK-born White or South Asian infants in the Born in Bradford cohort. Logistic regression was used to develop prediction equations (at 6 ± 1.5, 9 ± 1.5 and 12 ± 1.5 months) for risk of childhood obesity (BMI at 2 years >91st centile and weight gain from 0-2 years >1 centile band) incorporating sex, birth weight, and weight gain as predictors. The discrimination accuracy of the equations was assessed by the area under the curve (AUC); internal validity by comparing area under the curve to those obtained in bootstrapped samples; and external validity by applying the equations to an external sample. An App was built to incorporate six final equations (two at each age, one of which included maternal BMI). The equations had good discrimination (AUCs 86-91%), with the addition of maternal BMI marginally improving prediction. The AUCs in the bootstrapped and external validation samples were similar to those obtained in the development sample. The App is user-friendly, requires a minimum amount of information, and provides a risk assessment of low, medium, or high accompanied by advice and website links to government recommendations. Prediction equations for risk of childhood obesity have been developed and incorporated into a novel App, thereby providing proof of concept that childhood obesity prediction research can be integrated with advancements in technology.
External validity of two nomograms for predicting distant brain failure after radiosurgery for brain metastases in a bi-institutional independent patient cohort.

PubMed

Prabhu, Roshan S; Press, Robert H; Boselli, Danielle M; Miller, Katherine R; Lankford, Scott P; McCammon, Robert J; Moeller, Benjamin J; Heinzerling, John H; Fasola, Carolina E; Patel, Kirtesh R; Asher, Anthony L; Sumrall, Ashley L; Curran, Walter J; Shu, Hui-Kuo G; Burri, Stuart H

2018-03-01

Patients treated with stereotactic radiosurgery (SRS) for brain metastases (BM) are at increased risk of distant brain failure (DBF). Two nomograms have been recently published to predict individualized risk of DBF after SRS. The goal of this study was to assess the external validity of these nomograms in an independent patient cohort. The records of consecutive patients with BM treated with SRS at Levine Cancer Institute and Emory University between 2005 and 2013 were reviewed. Three validation cohorts were generated based on the specific nomogram or recursive partitioning analysis (RPA) entry criteria: Wake Forest nomogram (n = 281), Canadian nomogram (n = 282), and Canadian RPA (n = 303) validation cohorts. Freedom from DBF at 1-year in the Wake Forest study was 30% compared with 50% in the validation cohort. The validation c-index for both the 6-month and 9-month freedom from DBF Wake Forest nomograms was 0.55, indicating poor discrimination ability, and the goodness-of-fit test for both nomograms was highly significant (p < 0.001), indicating poor calibration. The 1-year actuarial DBF in the Canadian nomogram study was 43.9% compared with 50.9% in the validation cohort. The validation c-index for the Canadian 1-year DBF nomogram was 0.56, and the goodness-of-fit test was also highly significant (p < 0.001). The validation accuracy and c-index of the Canadian RPA classification was 53% and 0.61, respectively. The Wake Forest and Canadian nomograms for predicting risk of DBF after SRS were found to have limited predictive ability in an independent bi-institutional validation cohort. These results reinforce the importance of validating predictive models in independent patient cohorts.
Self-reported personality disorder in the children in the community sample: convergent and prospective validity in late adolescence and adulthood.

PubMed

Crawford, Thomas N; Cohen, Patricia; Johnson, Jeffrey G; Kasen, Stephanie; First, Michael B; Gordon, Kathy; Brook, Judith S

2005-02-01

Approximately 800 youths from the Children in the Community Study (Cohen & Cohen, 1996) have been assessed prospectively for over 20 years to study personality disorders (PDs) in adolescents and young adults. In this article we evaluate the Children in the Community Self-Report (CIC-SR) Scales, which were designed to assess DSM-IV PDs using self-reported prospective data from this longitudinal sample. To evaluate convergent validity, we assessed concordance between the CIC-SR Scales and the Structured Clinical Interview for DSM-IV Personality Disorders (SCID-II; First, Gibbon, Spitzer, Williams, & Benjamin, 1995) in 644 participants at mean age 33. To assess predictive validity, we used CIC-SR Scales at mean age 22 to predict subsequent CIC-SR and SCID-II Personality Questionnaire scores at mean age 33. In these analyses the CIC-SR Scales matched or exceeded benchmarks established in previous comparisons between self-report instruments and structured clinical interviews. Unlike other self-report scales, the CIC-SR did not appear to overestimate diagnoses when compared with SCID-II clinical diagnoses.
A simplified approach to the pooled analysis of calibration of clinical prediction rules for systematic reviews of validation studies

PubMed Central

Dimitrov, Borislav D; Motterlini, Nicola; Fahey, Tom

2015-01-01

Objective Estimating calibration performance of clinical prediction rules (CPRs) in systematic reviews of validation studies is not possible when predicted values are not published, accessible, or sufficient, or when no individual participant or patient data are available. Our aims were to describe a simplified approach for outcomes prediction and calibration assessment and evaluate its functionality and validity. Study design and methods Methodological study of systematic reviews of validation studies of CPRs: a) ABCD2 rule for prediction of 7 day stroke; and b) CRB-65 rule for prediction of 30 day mortality. Predicted outcomes in a sample validation study were computed by CPR distribution patterns (“derivation model”). As confirmation, a logistic regression model (with derivation study coefficients) was applied to CPR-based dummy variables in the validation study. Meta-analysis of validation studies provided pooled estimates of “predicted:observed” risk ratios (RRs), 95% confidence intervals (CIs), and indexes of heterogeneity (I²) on forest plots (fixed and random effects models), with and without adjustment of intercepts. The above approach was also applied to the CRB-65 rule. Results Our simplified method, applied to ABCD2 rule in three risk strata (low, 0–3; intermediate, 4–5; high, 6–7 points), indicated that predictions are identical to those computed by univariate, CPR-based logistic regression model. Discrimination was good (c-statistics = 0.61–0.82); however, calibration in some studies was low. In such cases with miscalibration, the under-prediction (RRs = 0.73–0.91, 95% CIs 0.41–1.48) could be further corrected by intercept adjustment to account for incidence differences. An improvement of both heterogeneities and P-values (Hosmer-Lemeshow goodness-of-fit test) was observed. Better calibration and improved pooled RRs (0.90–1.06), with narrower 95% CIs (0.57–1.41) were achieved. Conclusion Our results have an immediate clinical implication in situations when predicted outcomes in CPR validation studies are lacking or deficient by describing how such predictions can be obtained by everyone using the derivation study alone, without any need for highly specialized knowledge or sophisticated statistics. PMID:25931829
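The calibration idea in this abstract can be illustrated with a minimal sketch. It assumes that "intercept adjustment" means a single shift applied to every stratum's risk on the logit scale (a common form of recalibration), which the abstract does not spell out. The strata counts, risks, and function names below are hypothetical, not data from the study.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def predicted_events(strata, shift=0.0):
    """Expected events when each stratum's derivation risk is shifted on the logit scale."""
    return sum(n * expit(logit(risk) + shift) for n, risk, _ in strata)

def po_ratio(strata, shift=0.0):
    """Predicted:observed ratio of total events; values below 1 indicate under-prediction."""
    observed = sum(events for _, _, events in strata)
    return predicted_events(strata, shift) / observed

def intercept_shift(strata, lo=-10.0, hi=10.0, tol=1e-10):
    """Bisection search for the logit shift that makes predicted events equal observed events."""
    observed = sum(events for _, _, events in strata)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if predicted_events(strata, mid) < observed:
            lo = mid  # still under-predicting: a larger shift is needed
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical validation cohort: (patients, derivation-study risk, observed events)
strata = [(500, 0.01, 8), (300, 0.04, 15), (100, 0.08, 10)]

raw = po_ratio(strata)             # ratio before adjustment
shift = intercept_shift(strata)    # positive here, because the rule under-predicts
adjusted = po_ratio(strata, shift) # ratio after adjustment, close to 1
```

Only the derivation-study risks per stratum and the stratum sizes of the validation study are needed, which matches the abstract's point that predictions can be obtained without individual participant data.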
Evaluation of the Validity and Reliability of the Waterlow Pressure Ulcer Risk Assessment Scale

PubMed Central

Charalambous, Charalambos; Koulori, Agoritsa; Vasilopoulos, Aristidis; Roupa, Zoe

2018-01-01

Introduction Prevention is the ideal strategy to tackle the problem of pressure ulcers. Pressure ulcer risk assessment scales are one of the most pivotal measures applied to tackle the problem, but much criticism has been raised regarding the validity and reliability of these scales. Objective To investigate the validity and reliability of the Waterlow pressure ulcer risk assessment scale. Method The methodology used is a narrative literature review. The bibliography was reviewed through Cinahl, Pubmed, EBSCO, Medline and Google Scholar, and 26 scientific articles were identified. The articles were chosen due to their direct correlation with the objective under study and their scientific relevance. Results The construct and face validity of the Waterlow appear adequate, but with regard to content validity, changes in the age and gender categories could be beneficial. The concurrent validity cannot be assessed. The predictive validity of the Waterlow is characterized by high specificity and low sensitivity. The inter-rater reliability has been demonstrated to be inadequate; this may be due to a lack of clear definitions within the categories and differing levels of knowledge among users. Conclusion Due to the limitations presented regarding the validity and reliability of the Waterlow pressure ulcer risk assessment scale, the scale should be used in conjunction with clinical assessment to provide optimum results. PMID:29736104
Evaluation of the Validity and Reliability of the Waterlow Pressure Ulcer Risk Assessment Scale.

PubMed

Charalambous, Charalambos; Koulori, Agoritsa; Vasilopoulos, Aristidis; Roupa, Zoe

2018-04-01

Prevention is the ideal strategy to tackle the problem of pressure ulcers. Pressure ulcer risk assessment scales are one of the most pivotal measures applied to tackle the problem, but much criticism has been raised regarding the validity and reliability of these scales. The objective was to investigate the validity and reliability of the Waterlow pressure ulcer risk assessment scale. The methodology used is a narrative literature review. The bibliography was reviewed through Cinahl, Pubmed, EBSCO, Medline and Google Scholar, and 26 scientific articles were identified. The articles were chosen due to their direct correlation with the objective under study and their scientific relevance. The construct and face validity of the Waterlow appear adequate, but with regard to content validity, changes in the age and gender categories could be beneficial. The concurrent validity cannot be assessed. The predictive validity of the Waterlow is characterized by high specificity and low sensitivity. The inter-rater reliability has been demonstrated to be inadequate; this may be due to a lack of clear definitions within the categories and differing levels of knowledge among users. Due to the limitations presented regarding the validity and reliability of the Waterlow pressure ulcer risk assessment scale, the scale should be used in conjunction with clinical assessment to provide optimum results.
Overview of the Aeroelastic Prediction Workshop

NASA Technical Reports Server (NTRS)

Heeg, Jennifer; Chwalowski, Pawel; Schuster, David M.; Dalenbring, Mats

2013-01-01

The AIAA Aeroelastic Prediction Workshop (AePW) was held in April, 2012, bringing together communities of aeroelasticians and computational fluid dynamicists. The objective in conducting this workshop on aeroelastic prediction was to assess state-of-the-art computational aeroelasticity methods as practical tools for the prediction of static and dynamic aeroelastic phenomena. No comprehensive aeroelastic benchmarking validation standard currently exists, greatly hindering validation and state-of-the-art assessment objectives. The workshop was a step towards assessing the state of the art in computational aeroelasticity. This was an opportunity to discuss and evaluate the effectiveness of existing computer codes and modeling techniques for unsteady flow, and to identify computational and experimental areas needing additional research and development. Three configurations served as the basis for the workshop, providing different levels of geometric and flow field complexity. All cases considered involved supercritical airfoils at transonic conditions. The flow fields contained oscillating shocks and in some cases, regions of separation. The computational tools principally employed Reynolds-Averaged Navier Stokes solutions. The successes and failures of the computations and the experiments are examined in this paper.
Preliminary Assessment of Turbomachinery Codes

NASA Technical Reports Server (NTRS)

Mazumder, Quamrul H.

2007-01-01

This report assesses different CFD codes developed and currently being used at Glenn Research Center to predict turbomachinery fluid flow and heat transfer behavior. This report will consider the following codes: APNASA, TURBO, GlennHT, H3D, and SWIFT. Each code will be described separately in the following section with its current modeling capabilities, level of validation, pre/post processing, and future development and validation requirements. This report addresses only previously published validations of the codes. However, the codes have since been further developed to extend their capabilities.
Predictive validity of the classroom strategies scale-observer form on statewide testing scores: an initial investigation.

PubMed

Reddy, Linda A; Fabiano, Gregory A; Dudek, Christopher M; Hsu, Louis

2013-12-01

The present study examined the validity of a teacher observation measure, the Classroom Strategies Scale-Observer Form (CSS), as a predictor of student performance on statewide tests of mathematics and English language arts. The CSS is a teacher practice observational measure that assesses evidence-based instructional and behavioral management practices in elementary school. A series of two-level hierarchical generalized linear models were fitted to data of a sample of 662 third- through fifth-grade students to assess whether CSS Part 2 Instructional Strategy and Behavioral Management Strategy scale discrepancy scores (i.e., ∑ |recommended frequency − frequency ratings|) predicted statewide mathematics and English language arts proficiency scores when percentage of minority students in schools was controlled. Results indicated that the Instructional Strategy scale discrepancy scores significantly predicted mathematics and English language arts proficiency scores: Relatively larger discrepancies on observer ratings of what teachers did versus what should have been done were associated with lower proficiency scores. Results offer initial evidence of the predictive validity of the CSS Part 2 Instructional Strategy discrepancy scores on student academic outcomes. PsycINFO Database Record (c) 2013 APA, all rights reserved.
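The discrepancy score defined in this abstract, the sum of absolute differences between recommended and rated frequencies, can be written out as a short sketch. The item values and the function name are hypothetical; the actual CSS items and rating scale are not given in the abstract.

```python
def discrepancy_score(recommended, rated):
    """Sum of absolute differences between recommended and rated frequencies, item by item."""
    if len(recommended) != len(rated):
        raise ValueError("both sequences must cover the same items")
    return sum(abs(rec - obs) for rec, obs in zip(recommended, rated))

# Hypothetical ratings for three strategy items on one observed lesson
recommended = [5, 4, 6]
rated = [3, 4, 7]
score = discrepancy_score(recommended, rated)  # |5-3| + |4-4| + |6-7|
```

Because absolute values are taken, doing a strategy too often and too rarely both raise the score, so a larger score always means a larger gap between observed and recommended practice.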
Prediction of cognitive and motor development in preterm children using exhaustive feature selection and cross-validation of near-term white matter microstructure.

PubMed

Schadl, Kornél; Vassar, Rachel; Cahill-Rowley, Katelyn; Yeom, Kristin W; Stevenson, David K; Rose, Jessica

2018-01-01

Advanced neuroimaging and computational methods offer opportunities for more accurate prognosis. We hypothesized that near-term regional white matter (WM) microstructure, assessed on diffusion tensor imaging (DTI), using exhaustive feature selection with cross-validation would predict neurodevelopment in preterm children. Near-term MRI and DTI obtained at 36.6 ± 1.8 weeks postmenstrual age in 66 very-low-birth-weight preterm neonates were assessed. 60/66 had follow-up neurodevelopmental evaluation with Bayley Scales of Infant-Toddler Development, 3rd-edition (BSID-III) at 18-22 months. Linear models with exhaustive feature selection and leave-one-out cross-validation computed based on DTI identified sets of three brain regions most predictive of cognitive and motor function; logistic regression models were computed to classify high-risk infants scoring one standard deviation below mean. Cognitive impairment was predicted (100% sensitivity, 100% specificity; AUC = 1) by near-term right middle-temporal gyrus MD, right cingulate-cingulum MD, left caudate MD. Motor impairment was predicted (90% sensitivity, 86% specificity; AUC = 0.912) by left precuneus FA, right superior occipital gyrus MD, right hippocampus FA. Cognitive score variance was explained (29.6%, cross-validated R² = 0.296) by left posterior-limb-of-internal-capsule MD, Genu RD, right fusiform gyrus AD. Motor score variance was explained (31.7%, cross-validated R² = 0.317) by left posterior-limb-of-internal-capsule MD, right parahippocampal gyrus AD, right middle-temporal gyrus AD. Search in large DTI feature space more accurately identified neonatal neuroimaging correlates of neurodevelopment.

Nutrition screening tools: does one size fit all? A systematic review of screening tools for the hospital setting.

PubMed

van Bokhorst-de van der Schueren, Marian A E; Guaitoli, Patrícia Realino; Jansma, Elise P; de Vet, Henrica C W

2014-02-01

Numerous nutrition screening tools for the hospital setting have been developed. The aim of this systematic review is to study construct or criterion validity and predictive validity of nutrition screening tools for the general hospital setting. A systematic review of English, French, German, Spanish, Portuguese and Dutch articles identified via MEDLINE, Cinahl and EMBASE (from inception to the 2nd of February 2012). Additional studies were identified by checking reference lists of identified manuscripts. Search terms included key words for malnutrition, screening or assessment instruments, and terms for hospital setting and adults. Data were extracted independently by 2 authors. Only studies expressing the (construct, criterion or predictive) validity of a tool were included. 83 studies (32 screening tools) were identified: 42 studies on construct or criterion validity versus a reference method and 51 studies on predictive validity on outcome (i.e. length of stay, mortality or complications). None of the tools performed consistently well to establish the patients' nutritional status. For the elderly, MNA performed fair to good, for the adults MUST performed fair to good. SGA, NRS-2002 and MUST performed well in predicting outcome in approximately half of the studies reviewed in adults, but not in older patients. Not one single screening or assessment tool is capable of adequate nutrition screening as well as predicting poor nutrition related outcome. Development of new tools seems redundant and will most probably not lead to new insights. New studies comparing different tools within one patient population are required. Copyright © 2013 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
Assessing the Culture of Residency Using the C - Change Resident Survey: Validity Evidence in 34 U.S. Residency Programs.

PubMed

Pololi, Linda H; Evans, Arthur T; Civian, Janet T; Shea, Sandy; Brennan, Robert T

2017-07-01

A practical instrument is needed to reliably measure the clinical learning environment and professionalism for residents. To develop and present evidence of validity of an instrument to assess the culture of residency programs and the clinical learning environment. During 2014-2015, we surveyed residents using the C - Change Resident Survey to assess residents' perceptions of the culture in their programs. Residents in all years of training in 34 programs in internal medicine, pediatrics, and general surgery in 14 geographically diverse public and private academic health systems. The C - Change Resident Survey assessed residents' perceptions of 13 dimensions of the culture: Vitality, Self-Efficacy, Institutional Support, Relationships/Inclusion, Values Alignment, Ethical/Moral Distress, Respect, Mentoring, Work-Life Integration, Gender Equity, Racial/Ethnic Minority Equity, and self-assessed Competencies. We measured the internal reliability of each of the 13 dimensions and evaluated response process, content validity, and construct-related evidence validity by assessing relationships predicted by our conceptual model and prior research. We also assessed whether the measurements were sensitive to differences in specialty and across institutions. A total of 1708 residents completed the survey [internal medicine: n = 956, pediatrics: n = 411, general surgery: n = 311 (51% women; 16% underrepresented in medicine minority)], with a response rate of 70% (range across programs, 51-87%). Internal consistency of each dimension was high (Cronbach α: 0.73-0.90). The instrument was able to detect significant differences in the learning environment across programs and sites. Evidence of validity was supported by a good response process and the demonstration of several relationships predicted by our conceptual model. The C - Change Resident Survey assesses the clinical learning environment for residents, and we encourage further study of validity in different contexts. Results could be used to facilitate and monitor improvements in the clinical learning environment and resident well-being.
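The internal-consistency statistic reported in this abstract, Cronbach's α, follows the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The sketch below is a minimal illustration using sample variances; the response matrix is hypothetical and does not come from the survey.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for rows of respondents and columns of items (sample variances)."""
    k = len(responses[0])                      # number of items in the dimension
    items = list(zip(*responses))              # transpose to one tuple per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical answers from five respondents to a three-item dimension
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(responses)
```

The function needs at least two items and two respondents, and the total scores must not all be equal, otherwise the variance terms are undefined or zero.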
Assessing psychological inflexibility: the psychometric properties of the Avoidance and Fusion Questionnaire for Youth in two adult samples.

PubMed

Fergus, Thomas A; Valentiner, David P; Gillen, Michael J; Hiraoka, Regina; Twohig, Michael P; Abramowitz, Jonathan S; McGrath, Patrick B

2012-06-01

The current study examined whether the Avoidance and Fusion Questionnaire for Youth (AFQ-Y; L. A. Greco, W. Lambert, & R. A. Baer, 2008), a self-report measure of psychological inflexibility for children and adolescents, might be useful for measuring psychological inflexibility for adults. The psychometric properties of the AFQ-Y were examined using data from a college student sample (N = 387) and a clinical sample of patients with anxiety disorders (N = 115). The AFQ-Y, but not the Acceptance and Action Questionnaire-II (AAQ-II; F. W. Bond et al., in press), demonstrated a reading level at or below the recommended 5th or 6th grade reading level. The AFQ-Y also demonstrated adequate reliability (internal consistency), factorial validity, convergent and discriminant validity, and concurrent validity predicting psychological symptoms. Moreover, the AFQ-Y showed incremental validity over the AAQ-II in predicting several psychological symptom domains. Implications for the assessment of psychological inflexibility are discussed. (c) 2012 APA, all rights reserved
The comparative capacity of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) and MMPI-2 Restructured Form (MMPI-2-RF) validity scales to detect suspected malingering in a disability claimant sample.

PubMed

Chmielewski, Michael; Zhu, Jiani; Burchett, Danielle; Bury, Alison S; Bagby, R Michael

2017-02-01

The current study expands on past research examining the comparative capacity of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher et al., 2001) and MMPI-2 Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2008/2011) overreporting validity scales to detect suspected malingering, as assessed by the Miller Forensic Assessment of Symptoms Test (M-FAST; Miller, 2001), in a sample of public insurance disability claimants (N = 742) who were considered to have potential incentives to malinger. Results provide support for the capacity of both the MMPI-2 and the MMPI-2-RF overreporting validity scales to predict suspected malingering of psychopathology. The MMPI-2-RF overreporting validity scales proved to be modestly better predictors of suspected psychopathology malingering, compared with the MMPI-2 overreporting scales, in dimensional predictive models and categorical classification accuracy analyses. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
The Alcohol Relapse Situation Appraisal Questionnaire: Development and Validation

PubMed Central

Martin, Rosemarie A.; MacKinnon, Selene M.; Johnson, Jennifer E.; Myers, Mark G.; Cook, Travis A. R.; Rohsenow, Damaris J.

2011-01-01

Background The role of cognitive appraisal of the threat of alcohol relapse has received little attention. A previous instrument, the Relapse Situation Appraisal Questionnaire (RSAQ), was developed to assess cocaine users’ primary appraisal of the threat of situations posing a high risk for cocaine relapse. The purpose of the present study was to modify the RSAQ in order to measure primary appraisal in situations involving a high risk for alcohol relapse. Methods The development and psychometric properties of this instrument, the Alcohol Relapse Situation Appraisal Questionnaire (A-RSAQ), were examined with two samples of abstinent adults with alcohol abuse or dependence. Factor structure and validity were examined in Study 1 (N=104). Confirmation of the factor structure and predictive validity were assessed in Study 2 (N=161). Results Results demonstrated construct, discriminant and predictive validity and reliability of the A-RSAQ. Discussion Results support the important role of primary appraisal of degree of risk in alcohol relapse situations. PMID:21237586
Psychometrics of A New Questionnaire to Assess Glaucoma Adherence: The Glaucoma Treatment Compliance Assessment Tool (An American Ophthalmological Society Thesis)

PubMed Central

Mansberger, Steven L.; Sheppler, Christina R.; McClure, Tina M.; VanAlstine, Cory L.; Swanson, Ingrid L.; Stoumbos, Zoey; Lambert, William E.

2013-01-01

Purpose: To report the psychometrics of the Glaucoma Treatment Compliance Assessment Tool (GTCAT), a new questionnaire designed to assess adherence with glaucoma therapy. Methods: We developed the questionnaire according to the constructs of the Health Belief Model. We evaluated the questionnaire using data from a cross-sectional study with focus groups (n = 20) and a prospective observational case series (n = 58). Principal components analysis provided assessment of construct validity. We repeated the questionnaire after 3 months for test-retest reliability. We evaluated predictive validity using an electronic dosing monitor as an objective measure of adherence. Results: Focus group participants provided 931 statements related to adherence, of which 88.7% (826/931) could be categorized into the constructs of the Health Belief Model. Perceived barriers accounted for 31% (288/931) of statements, cues-to-action 14% (131/931), susceptibility 12% (116/931), benefits 12% (115/931), severity 10% (91/931), and self-efficacy 9% (85/931). The principal components analysis explained 77% of the variance with five components representing Health Belief Model constructs. Reliability analyses showed acceptable Cronbach’s alphas (>.70) for four of the seven components (severity, susceptibility, barriers [eye drop administration], and barriers [discomfort]). Predictive validity was high, with several Health Belief Model questions significantly associated (P < .05) with adherence and a correlation coefficient (R²) of .40. Test-retest reliability was 90%. Conclusion: The GTCAT shows excellent repeatability, content, construct, and predictive validity for glaucoma adherence. A multisite trial is needed to determine whether the results can be generalized and whether the questionnaire accurately measures the effect of interventions to increase adherence. PMID:24072942
Development and external validation of a prostate health index-based nomogram for predicting prostate cancer

PubMed Central

Zhu, Yao; Han, Cheng-Tao; Zhang, Gui-Ming; Liu, Fang; Ding, Qiang; Xu, Jian-Feng; Vidal, Adriana C.; Freedland, Stephen J.; Ng, Chi-Fai; Ye, Ding-Wei

2015-01-01

To develop and externally validate a prostate health index (PHI)-based nomogram for predicting the presence of prostate cancer (PCa) at biopsy in Chinese men with prostate-specific antigen 4–10 ng/mL and normal digital rectal examination (DRE). 347 men were recruited from two hospitals between 2012 and 2014 to develop a PHI-based nomogram to predict PCa. To validate these results, we used a separate cohort of 230 men recruited at another center between 2008 and 2013. Receiver operator curves (ROC) were used to assess the ability to predict PCa. A nomogram was derived from the multivariable logistic regression model and its accuracy was assessed by the area under the ROC (AUC). PHI achieved the highest AUC of 0.839 in the development cohort compared to the other predictors (p < 0.001). Including age and prostate volume, a PHI-based nomogram was constructed and rendered an AUC of 0.877 (95% CI 0.813–0.938). The AUC of the nomogram in the validation cohort was 0.786 (95% CI 0.678–0.894). In clinical effectiveness analyses, the PHI-based nomogram reduced unnecessary biopsies from 42.6% to 27% using a 5% threshold risk of PCa to avoid biopsy with no increase in the number of missed cases relative to conventional biopsy decision. PMID:26471350
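The AUC used in this abstract to compare predictors has a simple interpretation: it is the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The scores are hypothetical and the function name is an illustration, not part of the study.

```python
def auc(scores_cases, scores_controls):
    """AUC as the share of case/control pairs where the case scores higher (ties count 0.5)."""
    wins = 0.0
    for case in scores_cases:
        for control in scores_controls:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))

# Hypothetical predicted risks for men with and without cancer at biopsy
with_cancer = [0.9, 0.8, 0.4]
without_cancer = [0.5, 0.3, 0.2]
value = auc(with_cancer, without_cancer)
```

The pairwise loop is quadratic in sample size, which is fine for cohorts of a few hundred; rank-based formulas give the same number faster for larger data.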
Development and validation of a predictive model for excessive postpartum blood loss: A retrospective, cohort study.

PubMed

Rubio-Álvarez, Ana; Molina-Alarcón, Milagros; Arias-Arias, Ángel; Hernández-Martínez, Antonio

2018-03-01

Postpartum haemorrhage is one of the leading causes of maternal morbidity and mortality worldwide. Despite the use of uterotonic agents as a preventive measure, it remains a challenge to identify those women who are at increased risk of postpartum bleeding. The objective was to develop and validate a predictive model to assess the risk of excessive bleeding in women with vaginal birth. This was a retrospective cohort study at "Mancha-Centro Hospital" (Spain). The predictive model was developed in a derivation cohort of 2336 women between 2009 and 2011. For validation purposes, a prospective cohort of 953 women between 2013 and 2014 was employed. Women with antenatal fetal demise, multiple pregnancies and gestations under 35 weeks were excluded. We used a multivariate analysis with binary logistic regression, Ridge Regression and areas under the Receiver Operating Characteristic curves to determine the predictive ability of the proposed model. There were 197 (8.43%) women with excessive bleeding in the derivation cohort and 63 (6.61%) women in the validation cohort. Predictive factors in the final model were: maternal age, primiparity, duration of the first and second stages of labour, neonatal birth weight and antepartum haemoglobin levels. Accordingly, the predictive ability of this model in the derivation cohort was 0.90 (95% CI: 0.85-0.93), while it remained 0.83 (95% CI: 0.74-0.92) in the validation cohort. This predictive model proved to have an excellent predictive ability in the derivation cohort, and its validation in a later population likewise shows a good ability for prediction. This model can be employed to identify women with a higher risk of postpartum haemorrhage. Copyright © 2017 Elsevier Ltd. All rights reserved.
Validity of Self-reported Sleep Bruxism among Myofascial Temporomandibular Disorder Patients and Controls

PubMed Central

Raphael, Karen G.; Janal, Malvin N.; Sirois, David A.; Dubrovsky, Boris; Klausner, Jack J.; Krieger, Ana C.; Lavigne, Gilles J.

2015-01-01

Sleep bruxism (SB), primarily involving rhythmic grinding of the teeth during sleep, has been advanced as a causal or maintenance factor for a variety of orofacial problems, including temporomandibular disorders (TMD). Since laboratory polysomnographic (PSG) assessment is extremely expensive and time-consuming, most research testing this belief has relied on patient self-report of SB. The current case-control study examined the accuracy of those self-reports relative to laboratory-based PSG assessment of SB in a large sample of women suffering from chronic myofascial TMD (n=124) and a demographically matched control group without TMD (n=46). A clinical research coordinator administered a structured questionnaire to assess self-reported SB. Participants then spent two consecutive nights in a sleep laboratory. Audiovisual and electromyographic data from the second night were scored to assess whether participants met criteria for presence of 2 or more (2+) rhythmic masticatory muscle activity episodes accompanied by grinding sounds, moderate SB, or severe SB, using previously validated research scoring standards. Contingency tables were constructed to assess positive and negative predictive values, sensitivity and specificity, and 95% confidence intervals surrounding the point estimates. Results showed that self-report significantly predicted 2+ grinding sounds during sleep for TMD cases. However, self-reported SB failed to significantly predict presence or absence of either moderate or severe SB as assessed by PSG, for both cases and controls. These data show that self-report of tooth grinding awareness is highly unlikely to be a valid indicator of true SB. Studies relying on self-report to assess SB must be viewed with extreme caution. PMID:26010126
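The contingency-table measures named in this abstract can be derived from four cell counts, as sketched below. The counts are hypothetical, and the Wilson score interval is one common choice for a 95% confidence interval around a proportion; the abstract does not say which interval method the authors used.

```python
import math

def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures for self-report (test) against the PSG reference standard."""
    return {
        "sensitivity": tp / (tp + fn),  # reference-positive participants who self-report SB
        "specificity": tn / (tn + fp),  # reference-negative participants who deny SB
        "ppv": tp / (tp + fp),          # self-reporters who are reference-positive
        "npv": tn / (tn + fn),          # deniers who are reference-negative
    }

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for roughly 95% coverage)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half

# Hypothetical 2x2 table: true positives, false positives, false negatives, true negatives
metrics = diagnostic_metrics(tp=30, fp=20, fn=10, tn=40)
sens_low, sens_high = wilson_interval(30, 30 + 10)  # interval around sensitivity
```

Predictive values depend on how common the condition is in the sample, so in a case-control design such as this one they do not transfer directly to other populations.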
External validation of the fatty liver index and lipid accumulation product indices, using 1H-magnetic resonance spectroscopy, to identify hepatic steatosis in healthy controls and obese, insulin-resistant individuals.

PubMed

Cuthbertson, Daniel J; Weickert, Martin O; Lythgoe, Daniel; Sprung, Victoria S; Dobson, Rebecca; Shoajee-Moradie, Fariba; Umpleby, Margot; Pfeiffer, Andreas F H; Thomas, E Louise; Bell, Jimmy D; Jones, Helen; Kemp, Graham J

2014-11-01

Simple clinical algorithms including the fatty liver index (FLI) and lipid accumulation product (LAP) have been developed as surrogate markers for non-alcoholic fatty liver disease (NAFLD), constructed using (semi-quantitative) ultrasonography. This study aimed to validate FLI and LAP as measures of hepatic steatosis, as determined quantitatively by proton magnetic resonance spectroscopy (1H-MRS). Data were collected from 168 patients with NAFLD and 168 controls who had undergone clinical, biochemical and anthropometric assessment. Values of FLI and LAP were determined and assessed both as predictors of the presence of hepatic steatosis (liver fat>5.5%) and of actual liver fat content, as measured by 1H-MRS. The discriminative ability of FLI and LAP was estimated using the area under the receiver operator characteristic curve (AUROC). As FLI can also be interpreted as a predictive probability of hepatic steatosis, we assessed how well calibrated it was in our cohort. Linear regression with prediction intervals was used to assess the ability of FLI and LAP to predict liver fat content. Further validation was provided in 54 patients with type 2 diabetes mellitus. FLI, LAP and alanine transferase discriminated between patients with and without steatosis with an AUROC of 0.79 (IQR=0.74, 0.84), 0.78 (IQR=0.72, 0.83) and 0.83 (IQR=0.79, 0.88) respectively although could not quantitatively predict liver fat. Additionally, the algorithms accurately matched the observed percentages of patients with hepatic steatosis in our cohort. FLI and LAP may be used to identify patients with hepatic steatosis clinically or for research purposes but could not predict liver fat content. © 2014 European Society of Endocrinology.
Predicting drug-induced liver injury in human with Naïve Bayes classifier approach.

PubMed

Zhang, Hui; Ding, Lan; Zou, Yi; Hu, Shui-Qing; Huang, Hai-Guo; Kong, Wei-Bao; Zhang, Ji

2016-10-01

Drug-induced liver injury (DILI) is one of the major safety concerns in drug development. Although various toxicological studies assessing DILI risk have been developed, these methods were not sufficient in predicting DILI in humans. Thus, developing new tools and approaches to better predict DILI risk in humans has become an important and urgent task. In this study, we aimed to develop a computational model for assessment of the DILI risk using a larger-scale human dataset and a Naïve Bayes classifier. The established Naïve Bayes prediction model was evaluated by 5-fold cross validation and an external test set. For the training set, the overall prediction accuracy of the 5-fold cross validation was 94.0 %. The sensitivity, specificity, positive predictive value and negative predictive value were 97.1, 89.2, 93.5 and 95.1 %, respectively. For the test set, the concordance was 72.6 %, with a sensitivity of 72.5 %, specificity of 72.7 %, positive predictive value of 80.4 % and negative predictive value of 63.2 %. Furthermore, some important molecular descriptors related to DILI risk and some toxic/non-toxic fragments were identified. Thus, we hope the prediction model established here would be employed for the assessment of human DILI risk, and the obtained molecular descriptors and substructures should be taken into consideration in the design of new candidate compounds to help medicinal chemists rationally select the chemicals with the best prospects to be effective and safe.
Prediction of Fitness to Drive in Patients with Alzheimer's Dementia

PubMed Central

Piersma, Dafne; Fuermaier, Anselm B. M.; de Waard, Dick; Davidse, Ragnhild J.; de Groot, Jolieke; Doumen, Michelle J. A.; Bredewoud, Ruud A.; Claesen, René; Lemstra, Afina W.; Vermeeren, Annemiek; Ponds, Rudolf; Verhey, Frans; Brouwer, Wiebo H.; Tucha, Oliver

2016-01-01

The number of patients with Alzheimer’s disease (AD) is increasing and so is the number of patients driving a car. To enable patients to retain their mobility while at the same time not endangering public safety, each patient should be assessed for fitness to drive. The aim of this study is to develop a method to assess fitness to drive in a clinical setting, using three types of assessments, i.e. clinical interviews, neuropsychological assessment and driving simulator rides. The goals are (1) to determine for each type of assessment which combination of measures is most predictive for on-road driving performance, (2) to compare the predictive value of clinical interviews, neuropsychological assessment and driving simulator evaluation and (3) to determine which combination of these assessments provides the best prediction of fitness to drive. Eighty-one patients with AD and 45 healthy individuals participated. All participated in a clinical interview, and were administered a neuropsychological test battery and a driving simulator ride (predictors). The criterion fitness to drive was determined in an on-road driving assessment by experts of the CBR Dutch driving test organisation according to their official protocol. The validity of the predictors to determine fitness to drive was explored by means of logistic regression analyses, discriminant function analyses, as well as receiver operating curve analyses. We found that all three types of assessments are predictive of on-road driving performance. Neuropsychological assessment had the highest classification accuracy followed by driving simulator rides and clinical interviews. However, combining all three types of assessments yielded the best prediction for fitness to drive in patients with AD with an overall accuracy of 92.7%, which makes this method highly valid for assessing fitness to drive in AD. This method may be used to advise patients with AD and their family members about fitness to drive. PMID:26910535
Validity of the SAT® for Predicting First-Year Grades: 2011 SAT Validity Sample. Statistical Report 2013-3

ERIC Educational Resources Information Center

Patterson, Brian F.; Mattern, Krista D.

2013-01-01

The continued accumulation of validity evidence for the intended uses of educational assessments is critical to ensure that proper inferences will be made for those purposes. To that end, the College Board has continued to collect college outcome data to evaluate the relationship between SAT® scores and college success. This report provides…
Validity of the SAT® for Predicting First-Year Grades: 2012 SAT Validity Sample. Statistical Report 2015-2

ERIC Educational Resources Information Center

Beard, Jonathan; Marini, Jessica P.

2015-01-01

The continued accumulation of validity evidence for the intended uses of educational assessment scores is critical to ensure that inferences made using the scores are sound. To that end, the College Board has continued to collect college outcome data to evaluate the relationship between SAT® scores and college success. This report provides updated…
Examining the Reliability and Validity of ADEPT and CELDT: Comparing Two Assessments of Oral Language Proficiency for English Language Learners

ERIC Educational Resources Information Center

Chavez, Gina

2013-01-01

Few classroom measures of English language proficiency have been evaluated for reliability and validity. This research examined the concurrent and predictive validity of an oral language test, titled A Developmental English Language Proficiency Test (ADEPT), and the relationship to the California English Language Development Test (CELDT) in the…
Development and validation of a cancer-specific swallowing assessment tool: MASA-C.

PubMed

Carnaby, Giselle D; Crary, Michael A

2014-03-01

We present data from a sample of patients receiving radiotherapy for head/neck cancer to define and measure the validity of a new clinical assessment measure for swallowing. Fifty-eight patients undergoing radiotherapy (±chemotherapy) for head/neck cancer (HNC) supported the development of a physiology-based assessment tool of swallowing (Mann Assessment of Swallowing Ability--Cancer: MASA-C) administered at two time points (baseline and following radiotherapy treatment). The new exam was evaluated for internal consistency of items using Cronbach's alpha. Reliability of measurement was evaluated with intraclass correlation (ICC) and the Kappa statistic between two independent raters. Concurrent validity was established through comparison with the original MASA examination and against the referent standard videofluoroscopic swallowing examination (VFE). Sensitivity, specificity, and likelihood ratios along with 95 % confidence intervals (CIs) were derived for comparison of the two evaluation forms (MASA vs. MASA-C). Accuracy of diagnostic precision was displayed using receiver operator characteristic curves. The new MASA-C tool demonstrated superior validity to the original MASA examination applied to a HNC population. In comparison to the VFE referent exam, the MASA-C revealed strong sensitivity and specificity (Se 83, Sp 96), predictive values (positive predictive value (PPV) 0.95, negative predictive value (NPV) 0.86), and likelihood ratios (21.6). In addition, it demonstrated good reliability (ICC = 0.96) between speech-language pathology raters. The MASA-C is a reliable and valid scale that is sensitive to differences in swallowing performance in HNC patients with and without dysphagia. Future longitudinal evaluation of this tool in larger samples is suggested. 
The development and refinement of this swallowing assessment tool for use in multidisciplinary HNC teams will facilitate earlier identification of patients with swallowing difficulties and enable more efficient allocation of resources to the management of dysphagia in this population. The MASA-C may also prove useful in future clinical HNC rehabilitation trials with this population.
Validation of the 10/66 Dementia Research Group diagnostic assessment for dementia in Arabic: a study in Lebanon

PubMed Central

Phung, Kieu T. T.; Chaaya, Monique; Waldemar, Gunhild; Atweh, Samir; Asmar, Khalil; Ghusn, Husam; Karam, Georges; Sawaya, Raja; Khoury, Rose Mary; Zeinaty, Ibrahim; Salman, Sandrine; Hammoud, Salem; Radwan, Wael; Bassil, Nazem; Prince, Martin

2014-01-01

Objectives In the North Africa and Middle East region, the illiteracy rates among older people are high, posing a great challenge to cognitive assessment. Validated diagnostic instruments for dementia in Arabic are lacking, hampering the development of dementia research in the region. The study aimed at validating the Arabic version of the 10/66 Dementia Research Group (DRG) diagnostic assessment for dementia to determine if it is suitable for case ascertainment in epidemiological research. Methods 244 participants older than 65 years were included, 100 with normal cognition and 144 with mild to moderate dementia. Dementia was diagnosed by clinicians according to DSM-IV criteria. Depression was diagnosed using the Geriatric Mental State. Trained interviewers blind to the cognitive status of the participants administered the 10/66 DRG diagnostic assessment to the participants and interviewed the caregivers. The discriminatory ability of the 10/66 DRG assessment and its subcomponents were evaluated against the clinical diagnoses. Results Half of the participants had no formal education and 49% of them were depressed. The 10/66 DRG diagnostic assessment showed excellent sensitivity (92.0%), specificity (95.1%), positive predictive value (PPV, 92.9%), and low false positive rates (FPR) among controls with no formal education (8.1%) and depression (5.6%). Each subcomponent of the 10/66 DRG diagnostic assessment independently predicted dementia diagnosis. The predictive ability of the 10/66 DRG assessment was superior to that of its subcomponents. Conclusion 10/66 DRG diagnostic assessment for dementia is well suited for case ascertainment in epidemiological studies among Arabic speaking older population with high prevalence of illiteracy. PMID:24771602
Incremental and Predictive Utility of Formative Assessment Methods of Reading Comprehension

ERIC Educational Resources Information Center

Marcotte, Amanda M.; Hintze, John M.

2009-01-01

Formative assessment measures are commonly used in schools to assess reading and to design instruction accordingly. The purpose of this research was to investigate the incremental and concurrent validity of formative assessment measures of reading comprehension. It was hypothesized that formative measures of reading comprehension would contribute…
Predictive Validity of ICD-11 PTSD as Measured by the Impact of Event Scale-Revised: A 15-Year Prospective Study of Political Prisoners.

PubMed

Hyland, Philip; Brewin, Chris R; Maercker, Andreas

2017-04-01

The 11th edition of the International Classification of Diseases (ICD-11; World Health Organization, 2017) proposes a model of posttraumatic stress disorder (PTSD) that includes 6 symptoms. This study assessed the ability of a classification-independent measure of posttraumatic stress symptoms, the Impact of Event Scale-Revised (Weiss & Marmar, 1996), to capture the ICD-11 model of PTSD. The current study also provided the first assessment of the predictive validity of ICD-11 PTSD. Former East German political prisoners were assessed in 1994 (N = 144) and in 2008-2009 (N = 88) on numerous psychological variables using self-report measures. Of the participants, 48.2% and 36.8% met probable diagnosis for ICD-11 PTSD at the first and second assessments, respectively. Confirmatory factor analysis supported the factorial validity of the 3-factor ICD-11 model of PTSD, as represented by items selected from the Impact of Event Scale-Revised. Hierarchical multiple regression analysis demonstrated that, controlling for sex, the symptom clusters of ICD-11 PTSD (reexperiencing, avoidance, and sense of threat) significantly contributed to the explanation of depression (R² = .17), quality of life (R² = .21), internalized anger (R² = .10), externalized anger (R² = .12), hatred of perpetrators (R² = .15), dysfunctional disclosure (R² = .27), and social acknowledgment as a victim (R² = .12) across the 15-year study period. Current findings add support for the factorial and predictive validity of ICD-11 PTSD within a unique cohort of political prisoners. Copyright © 2017 International Society for Traumatic Stress Studies.
Estimating energy expenditure from heart rate in older adults: a case for calibration.

PubMed

Schrack, Jennifer A; Zipunnikov, Vadim; Goldsmith, Jeff; Bandeen-Roche, Karen; Crainiceanu, Ciprian M; Ferrucci, Luigi

2014-01-01

Accurate measurement of free-living energy expenditure is vital to understanding changes in energy metabolism with aging. The efficacy of heart rate as a surrogate for energy expenditure is rooted in the assumption of a linear function between heart rate and energy expenditure, but its validity and reliability in older adults remains unclear. The objective was to assess the validity and reliability of the linear function between heart rate and energy expenditure in older adults using different levels of calibration. Heart rate and energy expenditure were assessed across five levels of exertion in 290 adults participating in the Baltimore Longitudinal Study of Aging. Correlation and random effects regression analyses assessed the linearity of the relationship between heart rate and energy expenditure and cross-validation models assessed predictive performance. Heart rate and energy expenditure were highly correlated (r=0.98) and linear regardless of age or sex. Intra-person variability was low but inter-person variability was high, with substantial heterogeneity of the random intercept (s.d. =0.372) despite similar slopes. Cross-validation models indicated individual calibration data substantially improves accuracy predictions of energy expenditure from heart rate, reducing the potential for considerable measurement bias. Although using five calibration measures provided the greatest reduction in the standard deviation of prediction errors (1.08 kcals/min), substantial improvement was also noted with two (0.75 kcals/min). These findings indicate standard regression equations may be used to make population-level inferences when estimating energy expenditure from heart rate in older adults but caution should be exercised when making inferences at the individual level without proper calibration.

Assessment of tinnitus-related impairments and disabilities using the German THI-12: sensitivity and stability of the scale over time.

PubMed

Görtelmeyer, Roman; Schmidt, Jürgen; Suckfüll, Markus; Jastreboff, Pawel; Gebauer, Alexander; Krüger, Hagen; Wittmann, Werner

2011-08-01

To evaluate the reliability, dimensionality, predictive validity, construct validity, and sensitivity to change of the THI-12 total and sub-scales as diagnostic aids to describe and quantify tinnitus-evoked reactions and evaluate treatment efficacy. Explorative analysis of the German tinnitus handicap inventory (THI-12) to assess potential sensitivity to tinnitus therapy in placebo-controlled randomized studies. Correlation analysis, including Cronbach's coefficient α and explorative common factor analysis (EFA), was conducted within and between assessments to demonstrate the construct validity, dimensionality, and factorial structure of the THI-12. N = 618 patients suffering from subjective tinnitus who were to be screened to participate in a randomized, placebo-controlled, 16-week, longitudinal study. The THI-12 can reliably diagnose tinnitus-related impairments and disabilities and assess changes over time. The test-retest coefficient for neighboured visits was r > 0.69, the internal consistency of the THI-12 total score was α ≤ 0.79 and α ≤ 0.89 at subsequent visits. Predictability of THI-12 total score and overall variance increased with successive measurements. The three-factorial structure allowed for evaluation of factors that affect aspects of patients' health-related quality of life. The THI-12, with its three-factorial structure, is a simple, reliable, and valid instrument for the diagnosis and assessment of tinnitus and associated impairment over time.
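The internal-consistency figures quoted above are Cronbach's α values. As a rough illustration of how that coefficient is computed, here is a minimal sketch using the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the scores are hypothetical and are not THI-12 data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of scores
    over the same respondents (sample variances are used throughout)."""
    k = len(items)
    # Total score per respondent: sum across items for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores: 3 items answered by 5 respondents (illustration only).
items = [
    [0, 1, 2, 2, 1],
    [0, 2, 2, 1, 1],
    [1, 1, 2, 2, 0],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}")
```

Items that move together across respondents push α toward 1; unrelated items push it toward 0.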
A biomarker-based risk score to predict death in patients with atrial fibrillation: the ABC (age, biomarkers, clinical history) death risk score

PubMed Central

Hijazi, Ziad; Oldgren, Jonas; Lindbäck, Johan; Alexander, John H; Connolly, Stuart J; Eikelboom, John W; Ezekowitz, Michael D; Held, Claes; Hylek, Elaine M; Lopes, Renato D; Yusuf, Salim; Granger, Christopher B; Siegbahn, Agneta; Wallentin, Lars

2018-01-01

Abstract Aims In atrial fibrillation (AF), mortality remains high despite effective anticoagulation. A model predicting the risk of death in these patients is currently not available. We developed and validated a risk score for death in anticoagulated patients with AF including both clinical information and biomarkers. Methods and results The new risk score was developed and internally validated in 14 611 patients with AF randomized to apixaban vs. warfarin for a median of 1.9 years. External validation was performed in 8548 patients with AF randomized to dabigatran vs. warfarin for 2.0 years. Biomarker samples were obtained at study entry. Variables significantly contributing to the prediction of all-cause mortality were assessed by Cox-regression. Each variable obtained a weight proportional to the model coefficients. There were 1047 all-cause deaths in the derivation and 594 in the validation cohort. The most important predictors of death were N-terminal pro B-type natriuretic peptide, troponin-T, growth differentiation factor-15, age, and heart failure, and these were included in the ABC (Age, Biomarkers, Clinical history)-death risk score. The score was well-calibrated and yielded higher c-indices than a model based on all clinical variables in both the derivation (0.74 vs. 0.68) and validation cohorts (0.74 vs. 0.67). The reduction in mortality with apixaban was most pronounced in patients with a high ABC-death score. Conclusion A new biomarker-based score for predicting risk of death in anticoagulated AF patients was developed, internally and externally validated, and well-calibrated in two large cohorts. The ABC-death risk score performed well and may contribute to overall risk assessment in AF. ClinicalTrials.gov identifier NCT00412984 and NCT00262600 PMID:29069359
Absolute fracture risk assessment using lumbar spine and femoral neck bone density measurements: derivation and validation of a hybrid system.

PubMed

Leslie, William D; Lix, Lisa M

2011-03-01

The World Health Organization (WHO) Fracture Risk Assessment Tool (FRAX) computes 10-year probability of major osteoporotic fracture from multiple risk factors, including femoral neck (FN) T-scores. Lumbar spine (LS) measurements are not currently part of the FRAX formulation but are used widely in clinical practice, and this creates confusion when there is spine-hip discordance. Our objective was to develop a hybrid 10-year absolute fracture risk assessment system in which nonvertebral (NV) fracture risk was assessed from the FN and clinical vertebral (V) fracture risk was assessed from the LS. We identified 37,032 women age 45 years and older undergoing baseline FN and LS dual-energy X-ray absorptiometry (DXA; 1990-2005) from a population database that contains all clinical DXA results for the Province of Manitoba, Canada. Results were linked to longitudinal health service records for physician billings and hospitalizations to identify nontrauma vertebral and nonvertebral fracture codes after bone mineral density (BMD) testing. The population was randomly divided into equal-sized derivation and validation cohorts. Using the derivation cohort, three fracture risk prediction systems were created from Cox proportional hazards models (adjusted for age and multiple FRAX risk factors): FN to predict combined all fractures, FN to predict nonvertebral fractures, and LS to predict vertebral (without nonvertebral) fractures. The hybrid system was the sum of nonvertebral risk from the FN model and vertebral risk from the LS model. The FN and hybrid systems were both strongly predictive of overall fracture risk (p < .001). In the validation cohort, ROC analysis showed marginally better performance of the hybrid system versus the FN system for overall fracture prediction (p = .24) and significantly better performance for vertebral fracture prediction (p < .001). 
In a discordance subgroup with FN and LS T-score differences greater than 1 SD, there was a significant improvement in overall fracture prediction with the hybrid method (p = .025). Risk reclassification under the hybrid system showed better alignment with observed fracture risk, with 6.4% of the women reclassified to a different risk category. In conclusion, a hybrid 10-year absolute fracture risk assessment system based on combining FN and LS information is feasible. The improvement in fracture risk prediction is small but supports clinical interest in a system that integrates LS in fracture risk assessment. Copyright © 2011 American Society for Bone and Mineral Research.
Modern modeling techniques had limited external validity in predicting mortality from traumatic brain injury.

PubMed

van der Ploeg, Tjeerd; Nieboer, Daan; Steyerberg, Ewout W

2016-10-01

Prediction of medical outcomes may potentially benefit from using modern statistical modeling techniques. We aimed to externally validate modeling strategies for prediction of 6-month mortality of patients suffering from traumatic brain injury (TBI) with predictor sets of increasing complexity. We analyzed individual patient data from 15 different studies including 11,026 TBI patients. We consecutively considered a core set of predictors (age, motor score, and pupillary reactivity), an extended set with computed tomography scan characteristics, and a further extension with two laboratory measurements (glucose and hemoglobin). With each of these sets, we predicted 6-month mortality using default settings with five statistical modeling techniques: logistic regression (LR), classification and regression trees, random forests (RFs), support vector machines (SVM) and neural nets. For external validation, a model developed on one of the 15 data sets was applied to each of the 14 remaining sets. This process was repeated 15 times for a total of 630 validations. The area under the receiver operating characteristic curve (AUC) was used to assess the discriminative ability of the models. For the most complex predictor set, the LR models performed best (median validated AUC value, 0.757), followed by RF and support vector machine models (median validated AUC value, 0.735 and 0.732, respectively). With each predictor set, the classification and regression trees models showed poor performance (median validated AUC value, <0.7). The variability in performance across the studies was smallest for the RF- and LR-based models (inter quartile range for validated AUC values from 0.07 to 0.10). In the area of predicting mortality from TBI, nonlinear and nonadditive effects are not pronounced enough to make modern prediction methods beneficial. Copyright © 2016 Elsevier Inc. All rights reserved.
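The cross-study validation design described above can be sketched as follows. The count of 630 matches 15 × 14 ordered development/validation pairs multiplied by the three predictor sets; that reading of the abstract is an assumption. The `auc` helper is a plain pairwise (Mann-Whitney) implementation written for illustration, not the authors' code.

```python
from itertools import permutations

N_STUDIES = 15          # number of data sets, from the abstract
N_PREDICTOR_SETS = 3    # core, extended (CT), extended + laboratory

# Every ordered (development study, validation study) pair of distinct studies.
pairs = list(permutations(range(N_STUDIES), 2))
total_validations = len(pairs) * N_PREDICTOR_SETS
print(len(pairs), total_validations)


def auc(labels, scores):
    """Area under the ROC curve as the fraction of (event, non-event)
    pairs in which the event has the higher score; ties count one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy check on four hypothetical patients.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In a full analysis, each pair would fit a model on the development study and report `auc` on the validation study; the median and interquartile range of those values are what the abstract summarizes.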
Validation of the Danish version of the constipation risk assessment scale (CRAS).

PubMed

Trads, Mette; Håkonson, Sasja J; Pedersen, Preben U

2017-11-01

The Constipation Risk Assessment Scale (CRAS) was developed in order to enable the prediction of the risk of developing constipation. The scale needs validation in acute and elective patients with common disorders. Two hundred and six acute patients with hip fracture and 200 elective patients with total knee or hip replacement were included. They were assessed with CRAS before surgery and their defecation pattern, stool consistency and degree of straining were measured at admission and 30 days after surgery. The prevalence of constipation was 0.49 for the acute patients and 0.34 for the elective patients. Sensitivity was 0.67 and 0.57. Specificity was 0.54 and 0.52. Positive predictive value was 0.59 and 0.38, whereas the negative predictive value was 0.63 and 0.7. When used in an orthopaedic ward, the prognostic accuracy of CRAS is poor and it cannot be recommended as a screening tool. Copyright © 2016 Elsevier Ltd. All rights reserved.
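The predictive values reported above follow from sensitivity, specificity, and prevalence through Bayes' rule. The sketch below recomputes them from the rounded figures in the abstract; the function name `predictive_values` is illustrative, and small differences from the published values are expected because the inputs are rounded to two decimals.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) computed from sensitivity, specificity and
    prevalence using Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded figures quoted in the abstract: acute patients, then elective patients.
acute_ppv, acute_npv = predictive_values(0.67, 0.54, 0.49)
elective_ppv, elective_npv = predictive_values(0.57, 0.52, 0.34)
print(f"acute:    PPV={acute_ppv:.2f}  NPV={acute_npv:.2f}")
print(f"elective: PPV={elective_ppv:.2f}  NPV={elective_npv:.2f}")
```

The same sensitivity and specificity give a much lower PPV in the elective group largely because constipation is less prevalent there, which is why predictive values do not transfer between populations.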
Utility of the Static-99 and Static-99R With Latino Sex Offenders.

PubMed

Leguízamo, Alejandro; Lee, Seung C; Jeglic, Elizabeth L; Calkins, Cynthia

2017-12-01

The predictive validity of the Static-99 measures with ethnic minorities in the United States has only recently been assessed with mixed results. We assessed the predictive validity of the Static-99 and Static-99R with a sample of Latino sex offenders (N = 483) as well as with two subsamples (U.S.-born, including Puerto Rico, and non-U.S.-born). The overall sexual recidivism rate was very low (1.9%). Both the Static-99 measures were able to predict sexual recidivism for offenders born in the United States and Puerto Rico, but neither was effective in doing so for other Latino immigrants. Calibration analyses (N = 303) of the Static-99R were consistent with the literature and provided support for the potential use of the measure with Latinos born in the United States and Puerto Rico. These findings and their implications are discussed as they pertain to the assessment of Latino sex offenders.
Assessment scale of risk for surgical positioning injuries

PubMed Central

Lopes, Camila Mendonça de Moraes; Haas, Vanderlei José; Dantas, Rosana Aparecida Spadoti; de Oliveira, Cheila Gonçalves; Galvão, Cristina Maria

2016-01-01

ABSTRACT Objective: to build and validate a scale to assess the risk of surgical positioning injuries in adult patients. Method: methodological research, conducted in two phases: construction and face and content validation of the scale and field research, involving 115 patients. Results: the Risk Assessment Scale for the Development of Injuries due to Surgical Positioning contains seven items, each of which presents five subitems. The scale score ranges between seven and 35 points in which, the higher the score, the higher the patient's risk. The Content Validity Index of the scale corresponded to 0.88. The application of Student's t-test for equality of means revealed the concurrent criterion validity between the scores on the Braden scale and the constructed scale. To assess the predictive criterion validity, the association was tested between the presence of pain deriving from surgical positioning and the development of pressure ulcer, using the score on the Risk Assessment Scale for the Development of Injuries due to Surgical Positioning (p<0.001). The interrater reliability was verified using the intraclass correlation coefficient, equal to 0.99 (p<0.001). Conclusion: the scale is a valid and reliable tool, but further research is needed to assess its use in clinical practice. PMID:27579925
Multivariate statistical assessment of predictors of firefighters' muscular and aerobic work capacity.

PubMed

Lindberg, Ann-Sofie; Oksa, Juha; Antti, Henrik; Malm, Christer

2015-01-01

Physical capacity has previously been deemed important for firefighters' physical work capacity, and aerobic fitness, muscular strength, and muscular endurance are the most frequently investigated parameters of importance. Traditionally, bivariate and multivariate linear regression statistics have been used to study relationships between physical capacities and work capacities among firefighters. An alternative way to handle datasets consisting of numerous correlated variables is to use multivariate projection analyses, such as Orthogonal Projection to Latent Structures. The first aim of the present study was to evaluate the prediction and predictive power of field and laboratory tests, respectively, on firefighters' physical work capacity on selected work tasks, and to study whether valid predictions could be achieved without anthropometric data. The second aim was to externally validate selected models. The third aim was to validate selected models on firefighters and on civilians. A total of 38 (26 men and 12 women) + 90 (38 men and 52 women) subjects were included in the models and the external validation, respectively. The best prediction (R²) and predictive power (Q²) of Stairs, Pulling, Demolition, Terrain, and Rescue work capacities included field tests (R² = 0.73 to 0.84, Q² = 0.68 to 0.82). The best external validation was for Stairs work capacity (R² = 0.80) and worst for Demolition work capacity (R² = 0.40). In conclusion, field and laboratory tests could equally well predict physical work capacities for firefighting work tasks, and models excluding anthropometric data were valid. The predictive power was satisfactory for all included work tasks except Demolition.
A whole blood gene expression-based signature for smoking status

PubMed Central

2012-01-01

Background Smoking is the leading cause of preventable death worldwide and has been shown to increase the risk of multiple diseases including coronary artery disease (CAD). We sought to identify genes whose levels of expression in whole blood correlate with self-reported smoking status. Methods Microarrays were used to identify gene expression changes in whole blood which correlated with self-reported smoking status; a set of significant genes from the microarray analysis were validated by qRT-PCR in an independent set of subjects. Stepwise forward logistic regression was performed using the qRT-PCR data to create a predictive model whose performance was validated in an independent set of subjects and compared to cotinine, a nicotine metabolite. Results Microarray analysis of whole blood RNA from 209 PREDICT subjects (41 current smokers, 4 quit ≤ 2 months, 64 quit > 2 months, 100 never smoked; NCT00500617) identified 4214 genes significantly correlated with self-reported smoking status. qRT-PCR was performed on 1,071 PREDICT subjects across 256 microarray genes significantly correlated with smoking or CAD. A five gene (CLDND1, LRRN3, MUC1, GOPC, LEF1) predictive model, derived from the qRT-PCR data using stepwise forward logistic regression, had a cross-validated mean AUC of 0.93 (sensitivity=0.78; specificity=0.95), and was validated using 180 independent PREDICT subjects (AUC=0.82, CI 0.69-0.94; sensitivity=0.63; specificity=0.94). Plasma from the 180 validation subjects was used to assess levels of cotinine; a model using a threshold of 10 ng/ml cotinine resulted in an AUC of 0.89 (CI 0.81-0.97; sensitivity=0.81; specificity=0.97; kappa with expression model = 0.53). Conclusion We have constructed and validated a whole blood gene expression score for the evaluation of smoking status, demonstrating that clinical and environmental factors contributing to cardiovascular disease risk can be assessed by gene expression. PMID:23210427
A calibration hierarchy for risk models was defined: from utopia to empirical data.

PubMed

Van Calster, Ben; Nieboer, Daan; Vergouwe, Yvonne; De Cock, Bavo; Pencina, Michael J; Steyerberg, Ewout W

2016-06-01

Calibrated risk models are vital for valid decision support. We define four levels of calibration and describe implications for model development and external validation of predictions. We present results based on simulated data sets. A common definition of calibration is "having an event rate of R% among patients with a predicted risk of R%," which we refer to as "moderate calibration." Weaker forms of calibration only require the average predicted risk (mean calibration) or the average prediction effects (weak calibration) to be correct. "Strong calibration" requires that the event rate equals the predicted risk for every covariate pattern. This implies that the model is fully correct for the validation setting. We argue that this is unrealistic: the model type may be incorrect, the linear predictor is only asymptotically unbiased, and all nonlinear and interaction effects should be correctly modeled. In addition, we prove that moderate calibration guarantees nonharmful decision making. Finally, results indicate that a flexible assessment of calibration in small validation data sets is problematic. Strong calibration is desirable for individualized decision support but unrealistic and counter productive by stimulating the development of overly complex models. Model development and external validation should focus on moderate calibration. Copyright © 2016 Elsevier Inc. All rights reserved.
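To make the distinction between the calibration levels concrete, the sketch below checks mean calibration (observed event rate versus average predicted risk) and a grouped approximation of moderate calibration (event rate versus predicted risk within risk groups) on simulated data. The function names, the simulated "overconfident" model, and the choice of five groups are all assumptions for illustration; weak calibration (calibration intercept and slope) is not covered.

```python
import random


def mean_calibration(y, p):
    """Observed event rate minus average predicted risk (0 = mean calibration)."""
    return sum(y) / len(y) - sum(p) / len(p)


def calibration_table(y, p, n_bins=5):
    """Sort patients by predicted risk, split into n_bins groups, and return
    (mean predicted risk, observed event rate) for each group."""
    order = sorted(range(len(p)), key=lambda i: p[i])
    size = len(order) // n_bins
    rows = []
    for b in range(n_bins):
        # The last group absorbs any remainder.
        idx = order[b * size:] if b == n_bins - 1 else order[b * size:(b + 1) * size]
        mean_pred = sum(p[i] for i in idx) / len(idx)
        event_rate = sum(y[i] for i in idx) / len(idx)
        rows.append((mean_pred, event_rate))
    return rows


random.seed(0)
true_risk = [random.uniform(0.05, 0.95) for _ in range(20000)]
outcome = [1 if random.random() < r else 0 for r in true_risk]
# Hypothetical overconfident model: risks pushed away from 0.5, then clipped.
overconfident = [min(max(0.5 + 1.5 * (r - 0.5), 0.01), 0.99) for r in true_risk]

for name, pred in (("true risk", true_risk), ("overconfident", overconfident)):
    print(name, round(mean_calibration(outcome, pred), 3))
    for mean_pred, event_rate in calibration_table(outcome, pred):
        print(f"  predicted {mean_pred:.2f}  observed {event_rate:.2f}")
```

In this simulation the overconfident model is close to correct on average, so it passes the mean-calibration check, yet its lowest and highest risk groups show clear gaps between predicted and observed rates. This is the sense in which mean calibration is a weaker requirement than moderate calibration.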
Evaluation of the validity of osteoporosis and fracture risk assessment tools (IOF One Minute Test, SCORE, and FRAX) in postmenopausal Palestinian women.

PubMed

Kharroubi, Akram; Saba, Elias; Ghannam, Ibrahim; Darwish, Hisham

2017-12-01

Simple self-assessment tools are needed to identify women at high risk for developing osteoporosis. In this study, tools like the IOF One Minute Test, Fracture Risk Assessment Tool (FRAX), and Simple Calculated Osteoporosis Risk Estimation (SCORE) were found to be valid for Palestinian women. The threshold for predicting women at risk for each tool was estimated. The purpose of this study is to evaluate the validity of the updated IOF (International Osteoporosis Foundation) One Minute Osteoporosis Risk Assessment Test, FRAX, SCORE as well as age alone to detect the risk of developing osteoporosis in postmenopausal Palestinian women. Three hundred eighty-two women 45 years and older were recruited, including 131 women with osteoporosis and 251 controls following bone mineral density (BMD) measurement; 287 completed questionnaires of the different risk assessment tools. Receiver operating characteristic (ROC) curves were evaluated for each tool using BMD as the gold standard for osteoporosis. The area under the ROC curve (AUC) was the highest for FRAX calculated with BMD for predicting hip fractures (0.897) followed by FRAX for major fractures (0.826) with cut-off values >1.5 and >7.8%, respectively. The IOF One Minute Test AUC (0.629) was the lowest compared to other tested tools but with sufficient accuracy for predicting the risk of developing osteoporosis with a cut-off value >4 total yes questions out of 18. SCORE test and age alone were also good predictors of risk for developing osteoporosis. According to the ROC curve for age, women ≥64 years had a higher risk of developing osteoporosis. Higher percentage of women with low BMD (T-score ≤-1.5) or osteoporosis (T-score ≤-2.5) was found among women who were not exposed to the sun, who had menopause before the age of 45 years, or had lower body mass index (BMI) compared to controls.
Women who often fall had lower BMI and approximately 27% of the recruited postmenopausal Palestinian women had accidents that caused fractures. Simple self-assessment tools like FRAX without BMD, SCORE, and the IOF One Minute Tests were valid for predicting Palestinian postmenopausal women at high risk of developing osteoporosis.
Prediction of Primary Care Depression Outcomes at Six Months: Validation of DOC-6 ©.

PubMed

Angstman, Kurt B; Garrison, Gregory M; Gonzalez, Cesar A; Cozine, Daniel W; Cozine, Elizabeth W; Katzelnick, David J

2017-01-01

The goal of this study was to develop and validate an assessment tool for adult primary care patients diagnosed with depression to determine predictive probability of clinical outcomes at 6 months. We retrospectively reviewed 3096 adult patients enrolled in collaborative care management (CCM) for depression. Patients enrolled on or before December 31, 2013, served as the training set (n = 2525), whereas those enrolled after that date served as the preliminary validation set (n = 571). Six variables (2 demographic and 4 clinical) were statistically significant in determining clinical outcomes. Using the validation data set, the remission classifier produced the receiver operating characteristics (ROC) curve with a c-statistic or area under the curve (AUC) of 0.62 with predicted probabilities that ranged from 14.5% to 79.1%, with a median of 50.6%. The persistent depressive symptoms (PDS) classifier produced an ROC curve with a c-statistic or AUC of 0.67 and predicted probabilities that ranged from 5.5% to 73.1%, with a median of 23.5%. We were able to identify readily available variables and then validated these in the prediction of depression remission and PDS at 6 months. The DOC-6 tool may be used to predict which patients may be at risk for worse outcomes. © Copyright 2017 by the American Board of Family Medicine.
In silico toxicity prediction by support vector machine and SMILES representation-based string kernel.

PubMed

Cao, D-S; Zhao, J-C; Yang, Y-N; Zhao, C-X; Yan, J; Liu, S; Hu, Q-N; Xu, Q-S; Liang, Y-Z

2012-01-01

There is a great need to assess the harmful effects or toxicities of chemicals to which man is exposed. In the present paper, the simplified molecular input line entry specification (SMILES) representation-based string kernel, together with the state-of-the-art support vector machine (SVM) algorithm, were used to classify the toxicity of chemicals from the US Environmental Protection Agency Distributed Structure-Searchable Toxicity (DSSTox) database network. In this method, the molecular structure can be directly encoded by a series of SMILES substrings that represent the presence of some chemical elements and different kinds of chemical bonds (double, triple and stereochemistry) in the molecules. Thus, SMILES string kernel can accurately and directly measure the similarities of molecules by a series of local information hidden in the molecules. Two model validation approaches, five-fold cross-validation and independent validation set, were used for assessing the predictive capability of our developed models. The results obtained indicate that SVM based on the SMILES string kernel can be regarded as a very promising and alternative modelling approach for potential toxicity prediction of chemicals.
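As a rough illustration of the idea of comparing molecules through shared SMILES substrings, the sketch below implements a simple normalised substring-count (spectrum-style) kernel. It is an assumption-laden stand-in, not the authors' kernel: the maximum substring length, the cosine normalisation, and the function names are all chosen here for illustration.

```python
from collections import Counter
from math import sqrt

def substring_counts(smiles: str, max_len: int = 3) -> Counter:
    """Count every contiguous substring of length 1..max_len in a SMILES string."""
    counts = Counter()
    for k in range(1, max_len + 1):
        for i in range(len(smiles) - k + 1):
            counts[smiles[i:i + k]] += 1
    return counts

def string_kernel(a: str, b: str, max_len: int = 3) -> float:
    """Cosine-normalised dot product of the two substring-count vectors."""
    ca, cb = substring_counts(a, max_len), substring_counts(b, max_len)
    dot = sum(ca[s] * cb[s] for s in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values()) * sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

# Example molecules: ethanol, 1-propanol, benzene (aromatic SMILES)
molecules = ["CCO", "CCCO", "c1ccccc1"]
gram = [[string_kernel(x, y) for y in molecules] for x in molecules]
```

A Gram matrix like `gram` is the kind of input a kernel SVM accepts in place of explicit feature vectors (for example, scikit-learn's `SVC` with `kernel="precomputed"`); whether the study used such a setup is not stated in the abstract.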
A new test set for validating predictions of protein-ligand interaction.

PubMed

Nissink, J Willem M; Murray, Chris; Hartshorn, Mike; Verdonk, Marcel L; Cole, Jason C; Taylor, Robin

2002-12-01

We present a large test set of protein-ligand complexes for the purpose of validating algorithms that rely on the prediction of protein-ligand interactions. The set consists of 305 complexes with protonation states assigned by manual inspection. The following checks have been carried out to identify unsuitable entries in this set: (1) assessing the involvement of crystallographically related protein units in ligand binding; (2) identification of bad clashes between protein side chains and ligand; and (3) assessment of structural errors, and/or inconsistency of ligand placement with crystal structure electron density. In addition, the set has been pruned to assure diversity in terms of protein-ligand structures, and subsets are supplied for different protein-structure resolution ranges. A classification of the set by protein type is available. As an illustration, validation results are shown for GOLD and SuperStar. GOLD is a program that performs flexible protein-ligand docking, and SuperStar is used for the prediction of favorable interaction sites in proteins. The new CCDC/Astex test set is freely available to the scientific community (http://www.ccdc.cam.ac.uk). Copyright 2002 Wiley-Liss, Inc.
A predictive score to identify hospitalized patients' risk of discharge to a post-acute care facility

PubMed Central

Louis Simonet, Martine; Kossovsky, Michel P; Chopard, Pierre; Sigaud, Philippe; Perneger, Thomas V; Gaspoz, Jean-Michel

2008-01-01

Background Early identification of patients who need post-acute care (PAC) may improve discharge planning. The purposes of the study were to develop and validate a score predicting discharge to a post-acute care (PAC) facility and to determine its best assessment time. Methods We conducted a prospective study including 349 (derivation cohort) and 161 (validation cohort) consecutive patients in a general internal medicine service of a teaching hospital. We developed logistic regression models predicting discharge to a PAC facility, based on patient variables measured on admission (day 1) and on day 3. The value of each model was assessed by its area under the receiver operating characteristics curve (AUC). A simple numerical score was derived from the best model, and was validated in a separate cohort. Results Prediction of discharge to a PAC facility was as accurate on day 1 (AUC: 0.81) as on day 3 (AUC: 0.82). The day-3 model was more parsimonious, with 5 variables: patient's partner inability to provide home help (4 pts); inability to self-manage drug regimen (4 pts); number of active medical problems on admission (1 pt per problem); dependency in bathing (4 pts) and in transfers from bed to chair (4 pts) on day 3. A score ≥ 8 points predicted discharge to a PAC facility with a sensitivity of 87% and a specificity of 63%, and was significantly associated with inappropriate hospital days due to discharge delays. Internal and external validations confirmed these results. Conclusion A simple score computed on the 3rd hospital day predicted discharge to a PAC facility with good accuracy. A score ≥ 8 points should prompt early discharge planning. PMID:18647410
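The day-3 score in this abstract is a simple sum of points, which the following sketch reproduces. The point values and the threshold of 8 come from the abstract; the function and parameter names are hypothetical, and the sketch is not a clinical tool.

```python
PAC_THRESHOLD = 8  # score at or above which the abstract reports discharge to PAC is predicted

def pac_score(partner_cannot_help: bool,
              cannot_self_manage_drugs: bool,
              n_active_problems: int,
              dependent_bathing: bool,
              dependent_transfers: bool) -> int:
    """Sum the day-3 points: 4 per yes/no item, 1 per active medical problem."""
    return (4 * partner_cannot_help
            + 4 * cannot_self_manage_drugs
            + n_active_problems
            + 4 * dependent_bathing
            + 4 * dependent_transfers)

def needs_early_discharge_planning(score: int) -> bool:
    """Apply the threshold reported in the abstract."""
    return score >= PAC_THRESHOLD

# Hypothetical patient: partner cannot help, 2 active problems, dependent in bathing
example = pac_score(True, False, 2, True, False)
```

For the hypothetical patient above the items contribute 4 + 0 + 2 + 4 + 0 points, so the score reaches the threshold.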
ASTRAL-R score predicts non-recanalisation after intravenous thrombolysis in acute ischaemic stroke.

PubMed

Vanacker, Peter; Heldner, Mirjam R; Seiffge, David; Mueller, Hubertus; Eskandari, Ashraf; Traenka, Christopher; Ntaios, George; Mosimann, Pascal J; Sztajzel, Roman; Mendes Pereira, Vitor; Cras, Patrick; Engelter, Stefan; Lyrer, Philippe; Fischer, Urs; Lambrou, Dimitris; Arnold, Marcel; Michel, Patrik

2015-05-01

Intravenous thrombolysis (IVT) as treatment in acute ischaemic strokes may be insufficient to achieve recanalisation in certain patients. Predicting probability of non-recanalisation after IVT may have the potential to influence patient selection to more aggressive management strategies. We aimed at deriving and internally validating a predictive score for post-thrombolytic non-recanalisation, using clinical and radiological variables. In thrombolysis registries from four Swiss academic stroke centres (Lausanne, Bern, Basel and Geneva), patients were selected with large arterial occlusion on acute imaging and with repeated arterial assessment at 24 hours. Based on a logistic regression analysis, an integer-based score for each covariate of the fitted multivariate model was generated. Performance of integer-based predictive model was assessed by bootstrapping available data and cross validation (delete-d method). In 599 thrombolysed strokes, five variables were identified as independent predictors of absence of recanalisation: Acute glucose > 7 mmol/l (A), significant extracranial vessel STenosis (ST), decreased Range of visual fields (R), large Arterial occlusion (A) and decreased Level of consciousness (L). All variables were weighted 1, except for (L) which obtained 2 points based on β-coefficients on the logistic scale. ASTRAL-R scores 0, 3 and 6 corresponded to non-recanalisation probabilities of 18, 44 and 74 % respectively. Predictive ability showed AUC of 0.66 (95 %CI, 0.61-0.70) when using bootstrap and 0.66 (0.63-0.68) when using delete-d cross validation. In conclusion, the 5-item ASTRAL-R score moderately predicts non-recanalisation at 24 hours in thrombolysed ischaemic strokes. If its performance can be confirmed by external validation and its clinical usefulness can be proven, the score may influence patient selection for more aggressive revascularisation strategies in routine clinical practice.
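The integer scoring described above (one point per item, two for decreased level of consciousness) can be written out as a short sketch. The weights, the glucose cut-off, and the three reported probabilities come from the abstract; the function and argument names are hypothetical, and probabilities for scores other than 0, 3 and 6 are deliberately left undefined because the abstract does not report them.

```python
# Non-recanalisation probabilities reported in the abstract for three scores only.
REPORTED_PROBABILITY = {0: 0.18, 3: 0.44, 6: 0.74}

def astral_r_score(glucose_mmol_l: float,
                   extracranial_stenosis: bool,
                   reduced_visual_fields: bool,
                   large_artery_occlusion: bool,
                   decreased_consciousness: bool) -> int:
    """Sum the five ASTRAL-R items; every item scores 1 except consciousness (2)."""
    score = 0
    score += 1 if glucose_mmol_l > 7 else 0        # A: acute glucose > 7 mmol/l
    score += 1 if extracranial_stenosis else 0     # ST: extracranial vessel stenosis
    score += 1 if reduced_visual_fields else 0     # R: decreased range of visual fields
    score += 1 if large_artery_occlusion else 0    # A: large arterial occlusion
    score += 2 if decreased_consciousness else 0   # L: decreased level of consciousness
    return score

# Hypothetical patient with every item present
max_score = astral_r_score(8.2, True, True, True, True)
reported = REPORTED_PROBABILITY.get(max_score)  # None when the abstract gives no figure
```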
Machine Learning Meta-analysis of Large Metagenomic Datasets: Tools and Biological Insights.

PubMed

Pasolli, Edoardo; Truong, Duy Tin; Malik, Faizan; Waldron, Levi; Segata, Nicola

2016-07-01

Shotgun metagenomic analysis of the human associated microbiome provides a rich set of microbial features for prediction and biomarker discovery in the context of human diseases and health conditions. However, the use of such high-resolution microbial features presents new challenges, and validated computational tools for learning tasks are lacking. Moreover, classification rules have scarcely been validated in independent studies, posing questions about the generality and generalization of disease-predictive models across cohorts. In this paper, we comprehensively assess approaches to metagenomics-based prediction tasks and for quantitative assessment of the strength of potential microbiome-phenotype associations. We develop a computational framework for prediction tasks using quantitative microbiome profiles, including species-level relative abundances and presence of strain-specific markers. A comprehensive meta-analysis, with particular emphasis on generalization across cohorts, was performed in a collection of 2424 publicly available metagenomic samples from eight large-scale studies. Cross-validation revealed good disease-prediction capabilities, which were in general improved by feature selection and use of strain-specific markers instead of species-level taxonomic abundance. In cross-study analysis, models transferred between studies were in some cases less accurate than models tested by within-study cross-validation. Interestingly, the addition of healthy (control) samples from other studies to training sets improved disease prediction capabilities. Some microbial species (most notably Streptococcus anginosus) seem to characterize general dysbiotic states of the microbiome rather than connections with a specific disease. Our results in modelling features of the "healthy" microbiome can be considered a first step toward defining general microbial dysbiosis. The software framework, microbiome profiles, and metadata for thousands of samples are publicly available at http://segatalab.cibio.unitn.it/tools/metaml.
An examination of the predictive validity of the Risk Matrix 2000 in England and Wales.

PubMed

Barnett, Georgia D; Wakeling, Helen C; Howard, Philip D

2010-12-01

This study examined the predictive validity of an actuarial risk-assessment tool with convicted sexual offenders in England and Wales. A modified version of the RM2000/s scale and the RM2000 v and c scales (Thornton et al., 2003) were examined for accuracy in predicting proven sexual violent, nonsexual violent, and combined sexual and/or nonsexual violent reoffending in a sample of sexual offenders who had either started a community sentence or been released from prison into the community by March 2007. Rates of proven reoffending were examined at 2 years for the majority of the sample (n = 4,946), and 4 years (n = 578) for those for whom these data were available. The predictive validity of the RM2000 scales was also explored for different subgroups of sexual offenders to assess the robustness of the tool. Both the modified RM2000/s and the complete v and c scales effectively classified offenders into distinct risk categories that differed significantly in rates of proven sexual and/or nonsexual violent reoffending. Survival analyses on the RM2000/s and v scales (N = 9,284) indicated that the higher risk groups offended more quickly and at a higher rate than lower risk groups. The relative predictive validity of the RM2000/s, v, and c, as calculated using Receiver Operating Characteristics (ROC) analyses, were moderate (.68) for RM2000/s and large for both the RM2000/c (.73) and RM2000/v (.80), at the 2-year follow-up. RM2000/s was moderately accurate in predicting relative risk of proven sexual reoffending for a variety of subgroups of sexual offenders.
Neurological Outcome Scale for Traumatic Brain Injury: III. Criterion-Related Validity and Sensitivity to Change in the NABIS Hypothermia-II Clinical Trial

PubMed Central

Wilde, Elisabeth A.; Moretti, Paolo; MacLeod, Marianne C.; Pedroza, Claudia; Drever, Pamala; Fourwinds, Sierra; Frisby, Melisa L.; Beers, Sue R.; Scott, James N.; Hunter, Jill V.; Traipe, Elfrides; Valadka, Alex B.; Okonkwo, David O.; Zygun, David A.; Puccio, Ava M.; Clifton, Guy L.

2013-01-01

Abstract The Neurological Outcome Scale for Traumatic Brain Injury (NOS-TBI) is a measure assessing neurological functioning in patients with TBI. We hypothesized that the NOS-TBI would exhibit adequate concurrent and predictive validity and demonstrate more sensitivity to change, compared with other well-established outcome measures. We analyzed data from the National Acute Brain Injury Study: Hypothermia-II clinical trial. Participants were 16–45 years of age with severe TBI assessed at 1, 3, 6, and 12 months postinjury. For analysis of criterion-related validity (concurrent and predictive), Spearman's rank-order correlations were calculated between the NOS-TBI and the Glasgow Outcome Scale (GOS), GOS-Extended (GOS-E), Disability Rating Scale (DRS), and Neurobehavioral Rating Scale-Revised (NRS-R). Concurrent validity was demonstrated through significant correlations between the NOS-TBI and GOS, GOS-E, DRS, and NRS-R measured contemporaneously at 3, 6, and 12 months postinjury (all p<0.0013). For prediction analyses, the multiplicity-adjusted p value using the false discovery rate was <0.015. The 1-month NOS-TBI score was a significant predictor of outcome in the GOS, GOS-E, and DRS at 3 and 6 months postinjury (all p<0.015). The 3-month NOS-TBI significantly predicted GOS, GOS-E, DRS, and NRS-R outcomes at 6 and 12 months postinjury (all p<0.0015). Sensitivity to change was analyzed using Wilcoxon's signed rank-sum test of subsamples demonstrating no change in the GOS or GOS-E between 3 and 6 months. The NOS-TBI demonstrated higher sensitivity to change, compared with the GOS (p<0.038) and GOS-E (p<0.016). In summary, the NOS-TBI demonstrated adequate concurrent and predictive validity as well as sensitivity to change, compared with gold-standard outcome measures. The NOS-TBI may enhance prediction of outcome in clinical practice and measurement of outcome in TBI research. PMID:23617608
An empirical study of the predictive validity of number grades in medical school using 3 decades of longitudinal data: implications for a grading system.

PubMed

Gonnella, Joseph S; Erdmann, James B; Hojat, Mohammadreza

2004-04-01

Context It is important to establish the predictive validity of medical school grades. The strength of predictive validity and the ability to identify at-risk students in medical schools depends upon assessment systems such as number grades, pass/fail (P/F) or honours/pass/fail (H/P/F) systems. Objective This study was designed to examine the predictive validity of number grades in medical school, and to determine whether any important information is lost in a shift from number to P/F and H/P/F grading systems. Subjects The participants in this prospective, longitudinal study were 6656 medical students who studied at Jefferson Medical College over 3 decades. They were grouped into 10 deciles based on their number grades in Year 1 of medical school. Methods Participants were compared on academic accomplishments in Years 2 and 3 of medical school, medical school class rank, delayed graduation and attrition, performance on medical licensing examinations and clinical competence ratings in the first postgraduate year. Results Results supported the short- and long-term predictive validity of the number grades. Ratings of clinical competence beyond medical school were predicted by number grades in medical school. We demonstrated that small differences in number grades are statistically meaningful, and that important information for identifying students in need of remedial education is lost when students who narrowly meet faculty's expectations are included with the rest of the class in a broad 'pass' category. Conclusions The findings refute the argument that knowledge of sciences basic to medicine is not critical to subsequent performance in medical school and beyond if an appropriate evaluation system is used. Furthermore, the results of this study raise questions about abandoning number grades in favour of a pass/fail system. Consideration of these findings in policy decisions regarding assessment systems of medical students is recommended.

Predictive validity of the UK clinical aptitude test in the final years of medical school: a prospective cohort study.

PubMed

Husbands, Adrian; Mathieson, Alistair; Dowell, Jonathan; Cleland, Jennifer; MacKenzie, Rhoda

2014-04-23

The UK Clinical Aptitude Test (UKCAT) was designed to address issues identified with traditional methods of selection. This study aims to examine the predictive validity of the UKCAT and compare this to traditional selection methods in the senior years of medical school. This was a follow-up study of two cohorts of students from two medical schools who had previously taken part in a study examining the predictive validity of the UKCAT in first year. The sample consisted of 4th and 5th Year students who commenced their studies at the University of Aberdeen or University of Dundee medical schools in 2007. Data collected were: demographics (gender and age group), UKCAT scores; Universities and Colleges Admissions Service (UCAS) form scores; admission interview scores; Year 4 and 5 degree examination scores. Pearson's correlations were used to examine the relationships between admissions variables, examination scores, gender and age group, and to select variables for multiple linear regression analysis to predict examination scores. Ninety-nine and 89 students at Aberdeen medical school from Years 4 and 5 respectively, and 51 Year 4 students in Dundee, were included in the analysis. Neither UCAS form nor interview scores were statistically significant predictors of examination performance. Conversely, the UKCAT yielded statistically significant validity coefficients between .24 and .36 in four of five assessments investigated. Multiple regression analysis showed the UKCAT made a statistically significant unique contribution to variance in examination performance in the senior years. Results suggest the UKCAT appears to predict performance better in the later years of medical school compared to earlier years and provides modest supportive evidence for the UKCAT's role in student selection within these institutions. Further research is needed to assess the predictive validity of the UKCAT against professional and behavioural outcomes as the cohort commences working life.
Predictive validity of the UK clinical aptitude test in the final years of medical school: a prospective cohort study

PubMed Central

2014-01-01

Background The UK Clinical Aptitude Test (UKCAT) was designed to address issues identified with traditional methods of selection. This study aims to examine the predictive validity of the UKCAT and compare this to traditional selection methods in the senior years of medical school. This was a follow-up study of two cohorts of students from two medical schools who had previously taken part in a study examining the predictive validity of the UKCAT in first year. Methods The sample consisted of 4th and 5th Year students who commenced their studies at the University of Aberdeen or University of Dundee medical schools in 2007. Data collected were: demographics (gender and age group), UKCAT scores; Universities and Colleges Admissions Service (UCAS) form scores; admission interview scores; Year 4 and 5 degree examination scores. Pearson’s correlations were used to examine the relationships between admissions variables, examination scores, gender and age group, and to select variables for multiple linear regression analysis to predict examination scores. Results Ninety-nine and 89 students at Aberdeen medical school from Years 4 and 5 respectively, and 51 Year 4 students in Dundee, were included in the analysis. Neither UCAS form nor interview scores were statistically significant predictors of examination performance. Conversely, the UKCAT yielded statistically significant validity coefficients between .24 and .36 in four of five assessments investigated. Multiple regression analysis showed the UKCAT made a statistically significant unique contribution to variance in examination performance in the senior years. Conclusions Results suggest the UKCAT appears to predict performance better in the later years of medical school compared to earlier years and provides modest supportive evidence for the UKCAT’s role in student selection within these institutions. Further research is needed to assess the predictive validity of the UKCAT against professional and behavioural outcomes as the cohort commences working life. PMID:24762134
MiRNA Expression Analysis of Pretreatment Biopsies Predicts the Pathological Response of Esophageal Squamous Cell Carcinomas to Neoadjuvant Chemoradiotherapy.

PubMed

Wen, Jing; Luo, Kongjia; Liu, Hui; Liu, Shiliang; Lin, Guangrong; Hu, Yi; Zhang, Xu; Wang, Geng; Chen, Yuping; Chen, Zhijian; Li, Yi; Lin, Ting; Xie, Xiuying; Liu, Mengzhong; Wang, Huiyun; Yang, Hong; Fu, Jianhua

2016-05-01

To identify miRNA markers useful for esophageal squamous cell carcinoma (ESCC) neoadjuvant chemoradiotherapy (neo-CRT) response prediction. Neo-CRT followed by surgery improves ESCC patients' survival compared with surgery alone. However, CRT outcomes are heterogeneous, and no current methods can predict CRT responses. Differentially expressed miRNAs between ESCC pathological responders and nonresponders after neo-CRT were identified by miRNA profiling and verified by real-time quantitative polymerase chain reaction (qPCR) of 27 ESCCs in the training set. Several class prediction algorithms were used to build the response-classifying models with the qPCR data. Predictive powers of the models were further assessed with a second set of 79 ESCCs. Ten miRNAs with greater than a 1.5-fold change between pathological responders and nonresponders were identified and verified, respectively. A support vector machine (SVM) prediction model, composed of 4 miRNAs (miR-145-5p, miR-152, miR-193b-3p, and miR-376a-3p), was developed. It provided overall accuracies of 100% and 87.3% for discriminating pathological responders and nonresponders in the training and external validation sets, respectively. In multivariate analysis, the subgroup determined by the SVM model was the only independent factor significantly associated with neo-CRT response in the external validation sets. Combined qPCR of the 4 miRNAs provides the possibility of ESCC neo-CRT response prediction, which may facilitate individualized ESCC treatment. Further prospective validation in larger independent cohorts is necessary to fully assess its predictive power.
Empirically Examining the Risk of Intimate Partner Violence: The Revised Domestic Violence Screening Instrument (DVSI-R)

PubMed Central

Grant, Stephen R

2006-01-01

SYNOPSIS Objective This study extends recent research on assessing the risk of intimate partner violence by determining the concurrent and predictive validity of a revised version of the Domestic Violence Screening Instrument (DVSI-R) and whether evidence of such validity is sustained independent of perpetrator demographic characteristics and forms of intimate violence. The analyses highlight violent incidents involving multiple victims as an indicator of “severe” violence. Previous research did not address these issues. Methods Data were analyzed on 14,970 assessments conducted in the State of Connecticut from September 1, 2004 through May 2, 2005. Hierarchical regression and receiver operating characteristic analyses were used to address the objectives of this research. Results The empirical findings support the concurrent and predictive validity of the DVSI-R and show that it is robust in its applicability. The findings further show that incidents involving multiple victims are highly associated with DVSI-R risk scores and recidivistic violence. Conclusion Validating and demonstrating the robustness of a risk assessment instrument is only a first step in preventing violence involving intimate partners or others in family or family-like relationships. The challenge is to train professionals responsible for addressing the problem of such violence to link valid risk assessments to well-crafted strategies of supervision and treatment so that the victimized or other potential victims are protected and perpetrators are held accountable for their actions. PMID:16827441
External validation of anti-Müllerian hormone based prediction of live birth in assisted conception

PubMed Central

2013-01-01

Background Chronological age and oocyte yield are independent determinants of live birth in assisted conception. Anti-Müllerian hormone (AMH) is strongly associated with oocyte yield after controlled ovarian stimulation. We have previously assessed the ability of AMH and age to independently predict live birth in an Italian assisted conception cohort. Herein we report the external validation of the nomogram in 822 UK first in vitro fertilization (IVF) cycles. Methods Retrospective cohort consisting of 822 patients undergoing their first IVF treatment cycle at Glasgow Centre for Reproductive Medicine. Analyses were restricted to women aged between 25 and 42 years of age. All women had an AMH measured prior to commencing their first IVF cycle. The performance of the model was assessed; discrimination by the area under the receiver operator curve (ROCAUC) and model calibration by the predicted probability versus observed probability. Results Live births occurred in 29.4% of the cohort. The observed and predicted outcomes showed no evidence of miscalibration (p = 0.188). The ROCAUC was 0.64 (95% CI: 0.60, 0.68), suggesting moderate and similar discrimination to the original model. The ROCAUC for a continuous model of age and AMH was 0.65 (95% CI 0.61, 0.69), suggesting that the original categories of AMH were appropriate. Conclusions We confirm by external validation that AMH and age are independent predictors of live birth. Although the confidence intervals for each category are wide, our results support the assessment of AMH in larger cohorts with detailed baseline phenotyping for live birth prediction. PMID:23294733
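Discrimination in this abstract is summarised by the area under the ROC curve. As a generic illustration (not the authors' code), the AUC can be computed as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The scores below are made-up numbers.

```python
def auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison of predicted scores."""
    if not scores_positive or not scores_negative:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0      # positive case ranked above negative case
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(scores_positive) * len(scores_negative))

# Hypothetical predicted probabilities of live birth
live_birth = [0.8, 0.6, 0.4]
no_live_birth = [0.7, 0.3]
example_auc = auc(live_birth, no_live_birth)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why values such as the 0.64 reported above are described as moderate.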
Investigating the Validity of Two Widely Used Quantitative Text Tools

ERIC Educational Resources Information Center

Cunningham, James W.; Hiebert, Elfrieda H.; Mesmer, Heidi Anne

2018-01-01

In recent years, readability formulas have gained new prominence as a basis for selecting texts for learning and assessment. Variables that quantitative tools count (e.g., word frequency, sentence length) provide valid measures of text complexity insofar as they accurately predict representative and high-quality criteria. The longstanding…
Current Developments in Measuring Academic Behavioural Confidence

ERIC Educational Resources Information Center

Sander, Paul

2009-01-01

Using published findings and by further analyses of existing data, the structure, validity and utility of the Academic Behavioural Confidence scale (ABC) is critically considered. Validity is primarily assessed through the scale's relationship with other existing scales as well as by looking for predicted differences. The utility of the ABC scale…
Intrasubject Predictions of Vocational Preference: Convergent Validation via the Decision Theoretic Paradigm.

ERIC Educational Resources Information Center

Monahan, Carlyn J.; Muchinsky, Paul M.

1985-01-01

The degree of convergent validity among four methods of identifying vocational preferences is assessed via the decision theoretic paradigm. Vocational preferences identified by Holland's Vocational Preference Inventory (VPI), a rating procedure, and ranking were compared with preferences identified from a policy-capturing model developed from an…
AERONET Version 3 Release: Providing Significant Improvements for Multi-Decadal Global Aerosol Database and Near Real-Time Validation

NASA Technical Reports Server (NTRS)

Holben, Brent; Slutsker, Ilya; Giles, David; Eck, Thomas; Smirnov, Alexander; Sinyuk, Aliaksandr; Schafer, Joel; Sorokin, Mikhail; Rodriguez, Jon; Kraft, Jason;

2016-01-01

Aerosols are highly variable in space, time and properties. Global assessment from satellite platforms and model predictions rely on validation from AERONET, a highly accurate ground-based network. Ver. 3 represents a significant improvement in accuracy and quality.

Measuring Offence-Specific Forgiveness in Marriage: The Marital Offence-Specific Forgiveness Scale (MOFS)

ERIC Educational Resources Information Center

Paleari, F. Giorgia; Regalia, Camillo; Fincham, Frank D.

2009-01-01

Three studies involving 328 married couples were conducted to validate the Marital Offence-Specific Forgiveness Scale, a new measure assessing offence-specific forgiveness for marital transgressions. The studies examined the dimensionality; internal consistency; and discriminant, concurrent, and predictive validity of the new measure. The final…
An Action Assembly Approach to Predicting Emotional Responses to Frightening Mass Media.

ERIC Educational Resources Information Center

Sparks, Glenn G.

1986-01-01

Assesses the validity of a 20-item scale that purportedly measures long term memory records--in this case, frightening mass media. Evidence for validity emerged in that subjects' scale scores were related to negative emotion, negative cognitions, and skin conductance during film clips of scary movies. (NKA)
Criterion and Construct Validity of an Isometric Midthigh-Pull Dynamometer for Assessing Whole-Body Strength in Professional Rugby League Players.

PubMed

Dobbin, Nick; Hunwicks, Richard; Jones, Ben; Till, Kevin; Highton, Jamie; Twist, Craig

2018-02-01

To examine the criterion and construct validity of an isometric midthigh-pull dynamometer to assess whole-body strength in professional rugby league players. Fifty-six male rugby league players (33 senior and 23 youth players) performed 4 isometric midthigh-pull efforts (ie, 2 on the dynamometer and 2 on the force platform) in a randomized and counterbalanced order. Isometric peak force was underestimated (P < .05) using the dynamometer compared with the force platform (95% LoA: -213.5 ± 342.6 N). Linear regression showed that peak force derived from the dynamometer explained 85% (adjusted R² = .85, SEE = 173 N) of the variance in the dependent variable, with the following prediction equation derived: predicted peak force = [1.046 × dynamometer peak force] + 117.594. Cross-validation revealed a nonsignificant bias (P > .05) between the predicted and peak force from the force platform and an adjusted R² (79.6%) that represented shrinkage of 0.4% relative to the cross-validation model (80%). Peak force was greater for the senior than the youth professionals using the dynamometer (2261.2 ± 222 cf 1725.1 ± 298.0 N, respectively; P < .05). The isometric midthigh pull assessed using a dynamometer underestimates criterion peak force but is capable of distinguishing muscle-function characteristics between professional rugby league players of different standards.
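The prediction equation quoted in this abstract is a one-line linear correction, sketched below. The slope and intercept are taken from the abstract; the function name is hypothetical and the input value is made up for illustration.

```python
SLOPE = 1.046        # coefficient reported in the abstract
INTERCEPT = 117.594  # intercept reported in the abstract, in newtons

def predicted_peak_force(dynamometer_peak_force_n: float) -> float:
    """Estimate force-platform peak force (N) from dynamometer peak force (N)."""
    return SLOPE * dynamometer_peak_force_n + INTERCEPT

# Hypothetical dynamometer reading of 2000 N
estimate = predicted_peak_force(2000.0)
```

Because the dynamometer underestimates criterion peak force, the corrected estimate is higher than the raw reading.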
Novel Risk Engine for Diabetes Progression and Mortality in USA: Building, Relating, Assessing, and Validating Outcomes (BRAVO).

PubMed

Shao, Hui; Fonseca, Vivian; Stoecker, Charles; Liu, Shuqian; Shi, Lizheng

2018-05-03

There is an urgent need to update diabetes prediction, which has relied on the United Kingdom Prospective Diabetes Study (UKPDS) that dates back to 1970s European populations. The objective of this study was to develop a risk engine with multiple risk equations using a recent patient cohort with type 2 diabetes mellitus reflective of the US population. A total of 17 risk equations for predicting diabetes-related microvascular and macrovascular events, hypoglycemia, mortality, and progression of diabetes risk factors were estimated using the data from the Action to Control Cardiovascular Risk in Diabetes (ACCORD) trial (n = 10,251). Internal and external validation processes were used to assess performance of the Building, Relating, Assessing, and Validating Outcomes (BRAVO) risk engine. One-way sensitivity analysis was conducted to examine the impact of risk factors on mortality at the population level. The BRAVO risk engine added several risk factors including severe hypoglycemia and common US racial/ethnicity categories compared with the UKPDS risk engine. The BRAVO risk engine also modeled mortality escalation associated with intensive glycemic control (i.e., glycosylated hemoglobin < 6.5%). External validation showed a good prediction power on 28 endpoints observed from other clinical trials (slope = 1.071, R² = 0.86). The BRAVO risk engine for the US diabetes cohort provides an alternative to the UKPDS risk engine. It can be applied to assist clinical and policy decision making such as cost-effective resource allocation in USA.
Implementing Lumberjacks and Black Swans Into Model-Based Tools to Support Human-Automation Interaction.

PubMed

Sebok, Angelia; Wickens, Christopher D

2017-03-01

The objectives were to (a) implement theoretical perspectives regarding human-automation interaction (HAI) into model-based tools to assist designers in developing systems that support effective performance and (b) conduct validations to assess the ability of the models to predict operator performance. Two key concepts in HAI, the lumberjack analogy and black swan events, have been studied extensively. The lumberjack analogy describes the effects of imperfect automation on operator performance. In routine operations, an increased degree of automation supports performance, but in failure conditions, increased automation results in more significantly impaired performance. Black swans are the rare and unexpected failures of imperfect automation. The lumberjack analogy and black swan concepts have been implemented into three model-based tools that predict operator performance in different systems. These tools include a flight management system, a remotely controlled robotic arm, and an environmental process control system. Each modeling effort included a corresponding validation. In one validation, the software tool was used to compare three flight management system designs, which were ranked in the same order as predicted by subject matter experts. The second validation compared model-predicted operator complacency with empirical performance in the same conditions. The third validation compared model-predicted and empirically determined time to detect and repair faults in four automation conditions. The three model-based tools offer useful ways to predict operator performance in complex systems. The three tools offer ways to predict the effects of different automation designs on operator performance.
Derivation and Validation of a Biomarker-Based Clinical Algorithm to Rule Out Sepsis From Noninfectious Systemic Inflammatory Response Syndrome at Emergency Department Admission: A Multicenter Prospective Study.

PubMed

Mearelli, Filippo; Fiotti, Nicola; Giansante, Carlo; Casarsa, Chiara; Orso, Daniele; De Helmersen, Marco; Altamura, Nicola; Ruscio, Maurizio; Castello, Luigi Mario; Colonetti, Efrem; Marino, Rossella; Barbati, Giulia; Bregnocchi, Andrea; Ronco, Claudio; Lupia, Enrico; Montrucchio, Giuseppe; Muiesan, Maria Lorenza; Di Somma, Salvatore; Avanzi, Gian Carlo; Biolo, Gianni

2018-05-07

To derive and validate a predictive algorithm integrating a nomogram-based prediction of the pretest probability of infection with a panel of serum biomarkers, which could robustly differentiate sepsis/septic shock from noninfectious systemic inflammatory response syndrome. Multicenter prospective study. At emergency department admission in five University hospitals. Nine-hundred forty-seven adults in inception cohort and 185 adults in validation cohort. None. A nomogram, including age, Sequential Organ Failure Assessment score, recent antimicrobial therapy, hyperthermia, leukocytosis, and high C-reactive protein values, was built in order to take data from 716 infected patients and 120 patients with noninfectious systemic inflammatory response syndrome to predict pretest probability of infection. Then, the best combination of procalcitonin, soluble phospholipase A2 group IIA, presepsin, soluble interleukin-2 receptor α, and soluble triggering receptor expressed on myeloid cell-1 was applied in order to categorize patients as "likely" or "unlikely" to be infected. The predictive algorithm required only procalcitonin backed up with soluble phospholipase A2 group IIA determined in 29% of the patients to rule out sepsis/septic shock with a negative predictive value of 93%. In a validation cohort of 158 patients, predictive algorithm reached 100% of negative predictive value requiring biomarker measurements in 18% of the population. We have developed and validated a high-performing, reproducible, and parsimonious algorithm to assist emergency department physicians in distinguishing sepsis/septic shock from noninfectious systemic inflammatory response syndrome.
Using Admission Assessments to Predict Final Grades in a College Music Program

ERIC Educational Resources Information Center

Lehmann, Andreas C.

2014-01-01

Entrance examinations and auditions are common admission procedures for college music programs, yet few researchers have attempted to look at the long-term predictive validity of such selection processes. In this study, archival data from 93 student records of a German music academy were used to predict development of musicianship skills over the…
Validation of the Chinese SAD PERSONS Scale to predict repeated self-harm in emergency attendees in Taiwan

PubMed Central

2014-01-01

Background: Past and repeated self-harm are long-term risks to completed suicide. A brief rating scale to assess repetition risk of self-harm is important for high-risk identification and early interventions in suicide prevention. The study aimed to examine the validity of the Chinese SAD PERSONS Scale (CSPS) and to evaluate its feasibility in clinical settings. Methods: One hundred and forty-seven patients with self-harm were recruited from the Emergency Department and assessed at baseline and the sixth month. The controls, 284 people without self-harm from the Family Medicine Department in the same hospital, were recruited and assessed concurrently. The psychometric properties of the CSPS were examined using baseline and follow-up measurements that assessed a variety of suicide risk factors. Clinical feasibility and applicability of the CSPS were further evaluated by a group of general nurses who used a case vignette approach in CSPS risk assessment in clinical settings. Responses to an open-ended question asking for their opinions on adapting the scale to hospital inpatient assessment of suicide risk were also analyzed using content analysis. Results: The CSPS was significantly correlated with other scales measuring depression, hopelessness and suicide ideation. A cut-off point of the scale was at 4/5 in predicting 6-month self-harm repetition with the sensitivity and specificity being 65.4% and 58.1%, respectively. Based on the areas under the Receiver Operating Characteristic curves, the predictive validity of the scale showed a better performance than the other scales. Fifty-four nurses who evaluated the scale using case vignettes found it a useful tool for raising awareness of suicide risk and one worth considering for adoption into nursing care. Conclusions: The Chinese SAD PERSONS Scale is a brief instrument with acceptable psychometric properties for self-harm prediction. However, attention should be paid to the level of therapeutic relationships during assessment, staff workload, and adequate training for wider clinical applications. PMID:24533537
Validation of the Chinese SAD PERSONS Scale to predict repeated self-harm in emergency attendees in Taiwan.

PubMed

Wu, Chia-Yi; Huang, Hui-Chun; Wu, Shu-I; Sun, Fang-Ju; Huang, Chiu-Ron; Liu, Shen-Ing

2014-02-17

Past and repeated self-harm are long-term risks to completed suicide. A brief rating scale to assess repetition risk of self-harm is important for high-risk identification and early interventions in suicide prevention. The study aimed to examine the validity of the Chinese SAD PERSONS Scale (CSPS) and to evaluate its feasibility in clinical settings. One hundred and forty-seven patients with self-harm were recruited from the Emergency Department and assessed at baseline and the sixth month. The controls, 284 people without self-harm from the Family Medicine Department in the same hospital, were recruited and assessed concurrently. The psychometric properties of the CSPS were examined using baseline and follow-up measurements that assessed a variety of suicide risk factors. Clinical feasibility and applicability of the CSPS were further evaluated by a group of general nurses who used a case vignette approach in CSPS risk assessment in clinical settings. Responses to an open-ended question asking for their opinions on adapting the scale to hospital inpatient assessment of suicide risk were also analyzed using content analysis. The CSPS was significantly correlated with other scales measuring depression, hopelessness and suicide ideation. A cut-off point of the scale was at 4/5 in predicting 6-month self-harm repetition with the sensitivity and specificity being 65.4% and 58.1%, respectively. Based on the areas under the Receiver Operating Characteristic curves, the predictive validity of the scale showed a better performance than the other scales. Fifty-four nurses who evaluated the scale using case vignettes found it a useful tool for raising awareness of suicide risk and one worth considering for adoption into nursing care. The Chinese SAD PERSONS Scale is a brief instrument with acceptable psychometric properties for self-harm prediction. However, attention should be paid to the level of therapeutic relationships during assessment, staff workload, and adequate training for wider clinical applications.
Validation of a physically based catchment model for application in post-closure radiological safety assessments of deep geological repositories for solid radioactive wastes.

PubMed

Thorne, M C; Degnan, P; Ewen, J; Parkin, G

2000-12-01

The physically based river catchment modelling system SHETRAN incorporates components representing water flow, sediment transport and radionuclide transport both in solution and bound to sediments. The system has been applied to simulate hypothetical future catchments in the context of post-closure radiological safety assessments of a potential site for a deep geological disposal facility for intermediate and certain low-level radioactive wastes at Sellafield, west Cumbria. In order to have confidence in the application of SHETRAN for this purpose, various blind validation studies have been undertaken. In earlier studies, the validation was undertaken against uncertainty bounds in model output predictions set by the modelling team on the basis of how well they expected the model to perform. However, validation can also be carried out with bounds set on the basis of how well the model is required to perform in order to constitute a useful assessment tool. Herein, such an assessment-based validation exercise is reported. This exercise related to a field plot experiment conducted at Calder Hollow, west Cumbria, in which the migration of strontium and lanthanum in subsurface Quaternary deposits was studied on a length scale of a few metres. Blind predictions of tracer migration were compared with experimental results using bounds set by a small group of assessment experts independent of the modelling team. Overall, the SHETRAN system performed well, failing only two out of seven of the imposed tests. Furthermore, of the five tests that were not failed, three were positively passed even when a pessimistic view was taken as to how measurement errors should be taken into account. It is concluded that the SHETRAN system, which is still being developed further, is a powerful tool for application in post-closure radiological safety assessments.
Validation of risk stratification for children with febrile neutropenia in a pediatric oncology unit in India.

PubMed

Das, Anirban; Trehan, Amita; Oberoi, Sapna; Bansal, Deepak

2017-06-01

The study aims to validate a score predicting risk of complications in pediatric patients with chemotherapy-related febrile neutropenia (FN) and evaluate the performance of previously published models for risk stratification. Children diagnosed with cancer and presenting with FN were evaluated in a prospective single-center study. A score predicting the risk of complications, previously derived in the unit, was validated on a prospective cohort. Performance of six predictive models published from geographically distinct settings was assessed on the same cohort. Complications were observed in 109 (26.3%) of 414 episodes of FN over 15 months. A risk score based on undernutrition (two points), time from last chemotherapy (<7 days = two points), presence of a nonupper respiratory focus of infection (two points), C-reactive protein (>60 mg/l = five points), and absolute neutrophil count (<100 per μl = two points) was used to stratify patients into "low risk" (score <7, n = 208) and assessed using the following parameters: overall performance (Nagelkerke R² = 34.4%), calibration (calibration slope = 0.39; P = 0.25 in Hosmer-Lemeshow test), discrimination (c-statistic = 0.81), overall sensitivity (86%), negative predictive value (93%), and clinical net benefit (0.43). Six previously published rules demonstrated inferior performance in this cohort. An indigenous decision rule using five simple predefined variables was successful in identifying children at risk for complications. Prediction models derived in developed nations may not be appropriate for low-middle-income settings and need to be validated before use. © 2016 Wiley Periodicals, Inc.
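The point-based score described in this abstract is a simple additive rule. The following Python sketch shows how such a score could be computed, using the point values and the "low risk" threshold (score < 7) given in the abstract. The class and field names are hypothetical, and the abstract does not say how undernutrition or the infection focus were defined, so those are plain boolean inputs here. It is an illustration only, not a clinical tool.

```python
from dataclasses import dataclass


@dataclass
class FNEpisode:
    """One febrile neutropenia episode (field names are illustrative)."""
    undernourished: bool
    days_since_last_chemotherapy: float
    non_upper_respiratory_focus: bool
    crp_mg_per_l: float
    anc_per_microlitre: float


def fn_risk_score(episode: FNEpisode) -> int:
    """Sum the points listed in the abstract for each risk factor present."""
    score = 0
    if episode.undernourished:
        score += 2
    if episode.days_since_last_chemotherapy < 7:
        score += 2
    if episode.non_upper_respiratory_focus:
        score += 2
    if episode.crp_mg_per_l > 60:
        score += 5
    if episode.anc_per_microlitre < 100:
        score += 2
    return score


def is_low_risk(episode: FNEpisode) -> bool:
    """Apply the abstract's threshold: a score below 7 is 'low risk'."""
    return fn_risk_score(episode) < 7
```

With these weights, an elevated C-reactive protein plus any one other factor is enough to leave the low-risk group, since 5 + 2 = 7.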

Tinnitus Screener: Results From the First 100 Participants in an Epidemiology Study.

PubMed

Henry, James A; Griest, Susan; Austin, Don; Helt, Wendy; Gordon, Jane; Thielman, Emily; Theodoroff, Sarah M; Lewis, M Samantha; Blankenship, Cody; Zaugg, Tara L; Carlson, Kathleen

2016-06-01

In the Noise Outcomes in Servicemembers Epidemiology Study, Veterans recently separated from the military undergo comprehensive assessments to initiate long-term monitoring of their auditory function. We developed the Tinnitus Screener, a four-item algorithmic instrument that determines whether tinnitus is present and, if so, whether it is constant or intermittent, or whether only temporary tinnitus has been experienced. Predictive validity data are presented for the first 100 Noise Outcomes in Servicemembers Epidemiology Study participants. The Tinnitus Screener was administered to participants by telephone. In lieu of a gold standard for determining tinnitus presence, the predictive validity of the tinnitus category assigned to participants on the basis of the Screener results was assessed when the participants attended audiologic testing. Of the 100 participants, 67 screened positive for intermittent or constant tinnitus. Three were categorized as "temporary" tinnitus only, and 30 were categorized as "no tinnitus." Tinnitus categorization was predictively valid with 96 of the 100 participants. These results provide preliminary evidence that the Screener may be suitable for quickly determining essential parameters of reported tinnitus. We have since revised the instrument to differentiate acute from chronic tinnitus and to identify occasional tinnitus. We are also obtaining measures that will enable assessment of its test-retest reliability.
Computer-based analysis of general movements reveals stereotypies predicting cerebral palsy.

PubMed

Philippi, Heike; Karch, Dominik; Kang, Keun-Sun; Wochner, Katarzyna; Pietz, Joachim; Dickhaus, Hartmut; Hadders-Algra, Mijna

2014-10-01

To evaluate a kinematic paradigm of automatic general movements analysis in comparison to clinical assessment in 3-month-old infants and its prediction for neurodevelopmental outcome. Preterm infants at high risk (n=49; 26 males, 23 females) and term infants at low risk (n=18; eight males, 10 females) of developmental impairment were recruited from hospitals around Heidelberg, Germany. Kinematic analysis of general movements by magnet tracking and clinical video-based assessment of general movements were performed at 3 months of age. Neurodevelopmental outcome was evaluated at 2 years. By comparing the general movements of small samples of children with and without cerebral palsy (CP), we developed a kinematic paradigm typical for infants at risk of developing CP. We tested the validity of this paradigm as a tool to predict CP and neurodevelopmental impairment. Clinical assessment correctly identified almost all infants with neurodevelopmental impairment including CP, but did not predict if the infant would be affected by CP or not. The kinematic analysis, in particular the stereotypy score of arm movements, was an excellent predictor of CP, whereas stereotyped repetitive movements of the legs predicted any neurodevelopmental impairment. The automatic assessment of the stereotypy score by magnet tracking in 3-month-old spontaneously moving infants at high risk of developmental abnormalities allowed a valid detection of infants affected and unaffected by CP. © 2014 Mac Keith Press.
Prognostic and Prediction Tools in Bladder Cancer: A Comprehensive Review of the Literature.

PubMed

Kluth, Luis A; Black, Peter C; Bochner, Bernard H; Catto, James; Lerner, Seth P; Stenzl, Arnulf; Sylvester, Richard; Vickers, Andrew J; Xylinas, Evanguelos; Shariat, Shahrokh F

2015-08-01

This review focuses on risk assessment and prediction tools for bladder cancer (BCa). To review the current knowledge on risk assessment and prediction tools to enhance clinical decision making and counseling of patients with BCa. A literature search in English was performed using PubMed in July 2013. Relevant risk assessment and prediction tools for BCa were selected. More than 1600 publications were retrieved. Special attention was given to studies that investigated the clinical benefit of a prediction tool. Most prediction tools for BCa focus on the prediction of disease recurrence and progression in non-muscle-invasive bladder cancer or disease recurrence and survival after radical cystectomy. Although these tools are helpful, recent prediction tools aim to address a specific clinical problem, such as the prediction of organ-confined disease and lymph node metastasis to help identify patients who might benefit from neoadjuvant chemotherapy. Although a large number of prediction tools have been reported in recent years, many of them lack external validation. Few studies have investigated the clinical utility of any given model as measured by its ability to improve clinical decision making. There is a need for novel biomarkers to improve the accuracy and utility of prediction tools for BCa. Decision tools hold the promise of facilitating the shared decision process, potentially improving clinical outcomes for BCa patients. Prediction models need external validation and assessment of clinical utility before they can be incorporated into routine clinical care. We looked at models that aim to predict outcomes for patients with bladder cancer (BCa). We found a large number of prediction models that hold the promise of facilitating treatment decisions for patients with BCa. 
However, many models are missing confirmation in a different patient cohort, and only a few studies have tested the clinical utility of any given model as measured by its ability to improve clinical decision making. Copyright © 2015 European Association of Urology. Published by Elsevier B.V. All rights reserved.
The validity of DSM-5 severity specifiers for anorexia nervosa, bulimia nervosa, and binge-eating disorder.

PubMed

Smith, Kathryn E; Ellison, Jo M; Crosby, Ross D; Engel, Scott G; Mitchell, James E; Crow, Scott J; Peterson, Carol B; Le Grange, Daniel; Wonderlich, Stephen A

2017-09-01

The DSM-5 includes severity specifiers (i.e., mild, moderate, severe, extreme) for anorexia nervosa (AN), bulimia nervosa (BN), and binge-eating disorder (BED), which are determined by weight status (AN) and frequencies of binge-eating episodes (BED) or inappropriate compensatory behaviors (BN). Given limited data regarding the validity of eating disorder (ED) severity specifiers, this study examined the concurrent and predictive validity of severity specifiers in AN, BN, and BED. Adults with AN (n = 109), BN (n = 76), and BED (n = 216) were identified from previous datasets. Concurrent validity was assessed by measures of ED psychopathology, depression, anxiety, quality of life, and physical health. Predictive validity was assessed by ED symptoms at the end of the treatment in BN and BED. Severity categories did not differ in baseline validators, though the mild AN group evidenced greater ED symptoms compared to the severe group. In BN, greater severity was related to greater end of treatment binge-eating and compensatory behaviors, and lower likelihood of abstinence; however, in BED, greater severity was related to lower ED symptoms at the end of the treatment. Results demonstrated limited support for the validity of DSM-5 severity specifiers. Future research is warranted to explore additional validators and possible alternative indicators of severity in EDs. © 2017 Wiley Periodicals, Inc.
A Cross-Modal Assessment of Reading Achievement in Children.

ERIC Educational Resources Information Center

Webb, Kathryn; And Others

1982-01-01

This study examined the ability of the Listen and Look (LL) test of cross-modal perception and the Metropolitan Readiness Test (MRT) to predict reading achievement. Data from 79 first-grade pupils were analyzed. Both the LL and MRT demonstrated predictive validity. (Author/BW)
Limited predictive ability of surrogate indices of insulin sensitivity/resistance in Asian-Indian men.

PubMed

Muniyappa, Ranganath; Irving, Brian A; Unni, Uma S; Briggs, William M; Nair, K Sreekumaran; Quon, Michael J; Kurpad, Anura V

2010-12-01

Insulin resistance is highly prevalent in Asian Indians and contributes to worldwide public health problems, including diabetes and related disorders. Surrogate measurements of insulin sensitivity/resistance are used frequently to study Asian Indians, but these are not formally validated in this population. In this study, we compared the ability of simple surrogate indices to accurately predict insulin sensitivity as determined by the reference glucose clamp method. In this cross-sectional study of Asian-Indian men (n = 70), we used a calibration model to assess the ability of simple surrogate indices for insulin sensitivity [quantitative insulin sensitivity check index (QUICKI), homeostasis model assessment (HOMA2-IR), fasting insulin-to-glucose ratio (FIGR), and fasting insulin (FI)] to predict an insulin sensitivity index derived from the reference glucose clamp method (SI(Clamp)). Predictive accuracy was assessed by both root mean squared error (RMSE) of prediction as well as leave-one-out cross-validation-type RMSE of prediction (CVPE). QUICKI, FIGR, and FI, but not HOMA2-IR, had modest linear correlations with SI(Clamp) (QUICKI: r = 0.36; FIGR: r = -0.36; FI: r = -0.27; P < 0.05). No significant differences were noted among CVPE or RMSE from any of the surrogate indices when compared with QUICKI. Surrogate measurements of insulin sensitivity/resistance such as QUICKI, FIGR, and FI are easily obtainable in large clinical studies, but these may only be useful as secondary outcome measurements in assessing insulin sensitivity/resistance in clinical studies of Asian Indians.
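The two accuracy measures named in this abstract, in-sample RMSE and a leave-one-out cross-validation RMSE, can be illustrated with a one-predictor linear fit. The sketch below is a simplified stand-in in pure Python: the study's actual calibration model is not described in the abstract and may differ, and all function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(xs, ys):
    """In-sample RMSE: fit on all points, then average squared residuals."""
    slope, intercept = fit_line(xs, ys)
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


def loocv_rmse(xs, ys):
    """Leave-one-out RMSE: each point is predicted by a fit that excludes it."""
    xs, ys = list(xs), list(ys)
    squared = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        squared.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return math.sqrt(sum(squared) / len(squared))
```

For a least-squares fit, the leave-one-out error is never smaller than the in-sample error, which is why it is the more conservative estimate of how a surrogate index would predict the clamp-derived value in new subjects.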
A Rat α-Fetoprotein Binding Activity Prediction Model to Facilitate Assessment of the Endocrine Disruption Potential of Environmental Chemicals.

PubMed

Hong, Huixiao; Shen, Jie; Ng, Hui Wen; Sakkiah, Sugunadevi; Ye, Hao; Ge, Weigong; Gong, Ping; Xiao, Wenming; Tong, Weida

2016-03-25

Endocrine disruptors such as polychlorinated biphenyls (PCBs), diethylstilbestrol (DES) and dichlorodiphenyltrichloroethane (DDT) are agents that interfere with the endocrine system and cause adverse health effects. Huge public health concern about endocrine disruptors has arisen. One of the mechanisms of endocrine disruption is through binding of endocrine disruptors with the hormone receptors in the target cells. Entrance of endocrine disruptors into target cells is the precondition of endocrine disruption. The binding capability of a chemical with proteins in the blood affects its entrance into the target cells and, thus, is very informative for the assessment of potential endocrine disruption of chemicals. α-fetoprotein is one of the major serum proteins that binds to a variety of chemicals such as estrogens. To better facilitate assessment of endocrine disruption of environmental chemicals, we developed a model for α-fetoprotein binding activity prediction using the novel pattern recognition method (Decision Forest) and the molecular descriptors calculated from two-dimensional structures by Mold² software. The predictive capability of the model has been evaluated through internal validation using 125 training chemicals (average balanced accuracy of 69%) and external validations using 22 chemicals (balanced accuracy of 71%). Prediction confidence analysis revealed the model performed much better at high prediction confidence. Our results indicate that the model is useful (when predictions are in high confidence) in endocrine disruption risk assessment of environmental chemicals though improvement by increasing number of training chemicals is needed.
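Balanced accuracy, the metric reported in this abstract, is the mean of sensitivity and specificity. A minimal Python sketch follows; the function names are hypothetical, and the counts in the usage note are invented for illustration rather than taken from the study.

```python
def balanced_accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Mean of sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Plain accuracy, shown only for comparison."""
    return (tp + tn) / (tp + fp + fn + tn)
```

With an invented, imbalanced set of 10 binders and 90 non-binders in which only 1 binder is detected and no non-binder is misclassified (tp=1, fn=9, fp=0, tn=90), plain accuracy is 0.91 while balanced accuracy is 0.55. That gap is the usual reason for preferring the balanced form when classes are unequal in size.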
Development of novel in silico model for developmental toxicity assessment by using naïve Bayes classifier method.

PubMed

Zhang, Hui; Ren, Ji-Xia; Kang, Yan-Li; Bo, Peng; Liang, Jun-Yu; Ding, Lan; Kong, Wei-Bao; Zhang, Ji

2017-08-01

Toxicological testing associated with developmental toxicity endpoints is very expensive, time consuming and labor intensive. Thus, developing alternative approaches for developmental toxicity testing is an important and urgent task in the drug development field. In this investigation, the naïve Bayes classifier was applied to develop a novel prediction model for developmental toxicity. The established prediction model was evaluated by the internal 5-fold cross validation and external test set. The overall prediction results for the internal 5-fold cross validation of the training set and external test set were 96.6% and 82.8%, respectively. In addition, four simple descriptors and some representative substructures of developmental toxicants were identified. Thus, we hope the established in silico prediction model could be used as an alternative method for toxicological assessment. The molecular information obtained could also afford a deeper understanding of developmental toxicants and provide guidance for medicinal chemists working in drug discovery and lead optimization. Copyright © 2017 Elsevier Inc. All rights reserved.
Translation, adaptation, and validation of the Sunderland Scale and the Cubbin & Jackson Revised Scale in Portuguese

PubMed Central

Sousa, Bruno

2013-01-01

Objective: To translate into Portuguese and evaluate the measuring properties of the Sunderland Scale and the Cubbin & Jackson Revised Scale, which are instruments for evaluating the risk of developing pressure ulcers during intensive care. Methods: This study included the process of translation and adaptation of the scales to the Portuguese language, as well as the validation of these tools. To assess the reliability, Cronbach alpha values of 0.702 and 0.708 were identified for the Sunderland Scale and the Cubbin & Jackson Revised Scale, respectively. The validation criteria (predictive) were performed comparatively with the Braden Scale (gold standard), and the main measurements evaluated were sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve, which were calculated based on cutoff points. Results: The Sunderland Scale exhibited 60% sensitivity, 86.7% specificity, 47.4% positive predictive value, 91.5% negative predictive value, and 0.86 for the area under the curve. The Cubbin & Jackson Revised Scale exhibited 73.3% sensitivity, 86.7% specificity, 52.4% positive predictive value, 94.2% negative predictive value, and 0.91 for the area under the curve. The Braden scale exhibited 100% sensitivity, 5.3% specificity, 17.4% positive predictive value, 100% negative predictive value, and 0.72 for the area under the curve. Conclusions: Both tools demonstrated reliability and validity for this sample. The Cubbin & Jackson Revised Scale yielded better predictive values for the development of pressure ulcers during intensive care. PMID:23917975
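The four measures reported for each scale all derive from one 2×2 table of predicted versus observed outcomes. The Python sketch below shows the arithmetic. The abstract does not report the underlying counts, so the counts used here are an assumption: they were back-calculated to be consistent with the Sunderland Scale percentages and should not be read as the study's data.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts.

    No guard against empty rows or columns: a zero denominator raises
    ZeroDivisionError.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts (not reported in the abstract), chosen so that the
# resulting percentages agree with those quoted for the Sunderland Scale.
sunderland = diagnostic_metrics(tp=9, fp=10, fn=6, tn=65)

for name, value in sunderland.items():
    print(f"{name}: {value:.1%}")
```

With these assumed counts the four values round to 60.0%, 86.7%, 47.4% and 91.5%. The example also shows why a low positive predictive value can coexist with high specificity when the outcome is uncommon: 10 false positives among 75 unaffected patients are enough to outnumber the 9 true positives.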
Assessing Predictive Validity of Pressure Ulcer Risk Scales- A Systematic Review and Meta-Analysis

PubMed Central

PARK, Seong-Hi; LEE, Hea Shoon

2016-01-01

Background: The purpose of this study was to present a scientific reason for pressure ulcer risk scales: Cubbin & Jackson modified Braden, Norton, and Waterlow, as a nursing diagnosis tool by utilizing predictive validity of pressure sores. Methods: Articles published between 1966 and 2013 from periodicals indexed in the Ovid Medline, Embase, CINAHL, KoreaMed, NDSL, and other databases were selected using the key word “pressure ulcer”. QUADAS-II was applied for assessment for internal validity of the diagnostic studies. Selected studies were analyzed using meta-analysis with MetaDisc 1.4. Results: Seventeen diagnostic studies with high methodological quality, involving 5,185 patients, were included. In the results of the meta-analysis, sROC AUC of Braden, Norton, and Waterlow scale was over 0.7, showing moderate predictive validity, but they have limited interpretation due to significant differences between studies. In addition, Waterlow scale is insufficient as a screening tool owing to low sensitivity compared with other scales. Conclusion: The contemporary pressure ulcer risk scale is not suitable for uniform practice on patients under standardized criteria. Therefore, in order to provide more effective nursing care for bedsores, a new or modified pressure ulcer risk scale should be developed upon strengths and weaknesses of existing tools. PMID:27114977
Development and validation of a patient-reported questionnaire assessing systemic therapy induced diarrhea in oncology patients.

PubMed

Lui, Michelle; Gallo-Hershberg, Daniela; DeAngelis, Carlo

2017-12-22

Systemic therapy-induced diarrhea (STID) is a common side effect experienced by more than half of cancer patients. Despite STID-associated complications and poorer quality of life (QoL), no validated assessment tools exist to accurately assess STID occurrence and severity to guide clinical management. Therefore, we developed and validated a patient-reported questionnaire (STIDAT). The STIDAT was developed using the FDA iterative process for patient-reported outcomes. A literature search uncovered potential items and questions for questionnaire construction used by oncology clinicians to develop questions for the preliminary instrument. The instrument was evaluated on its face validity and content validity by patient interviews. Repetitive, similar and different themes uncovered from patient interviews were implemented to revise the instrument to the version used for validation. Patients starting high-risk STID treatments were monitored using the STIDAT, bowel diaries and EORTC QLQ-C30. The STIDAT was evaluated for construct validity using exploratory factor analysis (EFA) using minimal residual method with Promax rotation, reliability and consistency. A weighted scoring system was developed and a receiver-operating characteristic (ROC) curve evaluated the tool's ability to detect STID occurrence. Median scores and variability were analysed to determine how well it differentiates between diarrhea severities. A post-hoc analysis determined how diarrhea severity impacted QoL of cancer patients. Patients defined diarrhea based on presence of watery stool. The STIDAT assessed patient's perception of having diarrhea, daily number of bowel movements, daily number of diarrhea episodes, antidiarrheal medication use, the presence of urgency, abdominal pain, abdominal spasms or fecal incontinence, patient's perception of diarrhea severity, and QoL. 
These dimensions were sorted into four clusters using EFA: patient's perception of diarrhea, frequency of diarrhea, fecal incontinence and abdominal symptoms. Cronbach's alpha was 0.78; kappa ranged from 0.934 to 0.952, except for abdominal spasms (κ = 0.0455). The positive predictive value was 96.4%, with the minimum score of 1.35 predicting a positive STID occurrence. Patients with moderate or severe diarrhea experience significant decreases in QoL compared to those with no diarrhea. This is the first patient-reported questionnaire that accurately predicts the occurrence and severity of diarrhea in oncology patients via assessing several bowel habit dimensions.
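Cronbach's alpha, the internal-consistency statistic quoted in this abstract, compares the sum of the item variances with the variance of the total score. The following is a minimal Python sketch of the standard formula, assuming complete data with one row of item scores per respondent; the function name is hypothetical and nothing here reproduces the study's data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item scores.

    rows: one sequence per respondent, each with the same k >= 2 items.
    Uses sample variances; requires at least two respondents and a
    non-zero variance of the total score.
    """
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
```

When every item gives identical scores for each respondent, the formula returns 1; as items become less correlated, the total-score variance shrinks toward the sum of the item variances and alpha falls toward 0.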
Youth Offender Care Needs Assessment Tool (YO-CNAT): an actuarial risk assessment tool for predicting problematic child-rearing situations in juvenile offenders on the basis of police records.

PubMed

van der Put, Claudia E; Stams, Geert Jan J M

2013-12-01

In the juvenile justice system, much attention is paid to estimating the risk for recidivism among juvenile offenders. However, it is also important to estimate the risk for problematic child-rearing situations (care needs) in juvenile offenders, because these problems are not always related to recidivism. In the present study, an actuarial care needs assessment tool for juvenile offenders, the Youth Offender Care Needs Assessment Tool (YO-CNAT), was developed to predict the probability of (a) a future supervision order imposed by the child welfare agency, (b) a future entitlement to care indicated by the youth care agency, and (c) future incidents involving child abuse, domestic violence, and/or sexual norm trespassing behavior at the juvenile's address. The YO-CNAT has been developed for use by the police and is based solely on information available in police registration systems. It is designed to assist a police officer without clinical expertise in making a quick assessment of the risk for problematic child-rearing situations. The YO-CNAT was developed on a sample of 1,955 juvenile offenders and was validated on another sample of 2,045 juvenile offenders. The predictive validity (area under the receiver-operating-characteristic curve) scores ranged between .70 (for predicting future entitlement to care) and .75 (for predicting future worrisome incidents at the juvenile's address); therefore, the predictive accuracy of the test scores of the YO-CNAT was sufficient to justify its use as a screening instrument for the police in deciding to refer a juvenile offender to the youth care agency for further assessment into care needs.
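The area under the ROC curve used above as the predictive-validity measure can be read as the probability that a randomly chosen case with the outcome scores higher than a randomly chosen case without it. The sketch below computes it by direct pairwise comparison; the function name `auc` and the example scores are hypothetical and do not come from the YO-CNAT data.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical risk scores for juveniles with and without the later outcome.
with_outcome = [2, 3, 5]
without_outcome = [1, 3, 4]
example_auc = auc(with_outcome, without_outcome)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.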
The Smoking Consequences Questionnaire: Factor structure and predictive validity among Spanish-speaking Latino smokers in the United States.

PubMed

Vidrine, Jennifer Irvin; Vidrine, Damon J; Costello, Tracy J; Mazas, Carlos; Cofta-Woerpel, Ludmila; Mejia, Luz Maria; Wetter, David W

2009-11-01

Much of the existing research on smoking outcome expectancies has been guided by the Smoking Consequences Questionnaire (SCQ). Although the original version of the SCQ has been modified over time for use in different populations, none of the existing versions have been evaluated for use among Spanish-speaking Latino smokers in the United States. The present study evaluated the factor structure and predictive validity of the 3 previously validated versions of the SCQ (the original, the SCQ-Adult, and the SCQ-Spanish, which was developed with Spanish-speaking smokers in Spain) among Spanish-speaking Latino smokers in Texas. The SCQ-Spanish represented the least complex solution. Each of the SCQ-Spanish scales had good internal consistency, and the predictive validity of the SCQ-Spanish was partially supported. Nearly all the SCQ-Spanish scales predicted withdrawal severity even after controlling for demographics and dependence. Boredom Reduction predicted smoking relapse across the 5- and 12-week follow-up assessments in a multivariate model that also controlled for demographics and dependence. Our results support use of the SCQ-Spanish with Spanish-speaking Latino smokers in the United States.
Examining the Predictive Validity of a Dynamic Assessment of Decoding to Forecast Response to Tier 2 Intervention

ERIC Educational Resources Information Center

Cho, Eunsoo; Compton, Donald L.; Fuchs, Douglas; Fuchs, Lynn S.; Bouton, Bobette

2014-01-01

The purpose of this study was to examine the role of a dynamic assessment (DA) of decoding in predicting responsiveness to Tier 2 small-group tutoring in a response-to-intervention model. First grade students (n = 134) who did not show adequate progress in Tier 1 based on 6 weeks of progress monitoring received Tier 2 small-group tutoring in…
Validation metrics for turbulent plasma transport

DOE PAGES

Holland, C.

2016-06-22

Developing accurate models of plasma dynamics is essential for confident predictive modeling of current and future fusion devices. In modern computer science and engineering, formal verification and validation processes are used to assess model accuracy and establish confidence in the predictive capabilities of a given model. This paper provides an overview of the key guiding principles and best practices for the development of validation metrics, illustrated using examples from investigations of turbulent transport in magnetically confined plasmas. Particular emphasis is given to the importance of uncertainty quantification and its inclusion within the metrics, and the need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between simulation and experiment. As a starting point, the structure of commonly used global transport model metrics and their limitations is reviewed. An alternate approach is then presented, which focuses upon comparisons of predicted local fluxes, fluctuations, and equilibrium gradients against observation. Furthermore, the utility of metrics based upon these comparisons is demonstrated by applying them to gyrokinetic predictions of turbulent transport in a variety of discharges performed on the DIII-D tokamak, as part of a multi-year transport model validation activity.
One-year temporal stability and predictive and incremental validity of the body, eating, and exercise comparison orientation measure (BEECOM) among college women.

PubMed

Fitzsimmons-Craft, Ellen E; Bardone-Cone, Anna M

2014-01-01

This study examined the one-year temporal stability and the predictive and incremental validity of the Body, Eating, and Exercise Comparison Orientation Measure (BEECOM) in a sample of 237 college women who completed study measures at two time points about one year apart. One-year temporal stability was high for the BEECOM total and subscale (i.e., Body, Eating, and Exercise Comparison Orientation) scores. Additionally, the BEECOM exhibited predictive validity in that it accounted for variance in body dissatisfaction and eating disorder symptomatology one year later. These findings held even after controlling for body mass index and existing measures of social comparison orientation. However, results regarding the incremental validity of the BEECOM, or its ability to predict change in these constructs over time, were more mixed. Overall, this study demonstrated additional psychometric properties of the BEECOM among college women, further establishing the usefulness of this measure for more comprehensively assessing eating disorder-related social comparison. Copyright © 2013 Elsevier Ltd. All rights reserved.
Validation metrics for turbulent plasma transport

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holland, C.

Developing accurate models of plasma dynamics is essential for confident predictive modeling of current and future fusion devices. In modern computer science and engineering, formal verification and validation processes are used to assess model accuracy and establish confidence in the predictive capabilities of a given model. This paper provides an overview of the key guiding principles and best practices for the development of validation metrics, illustrated using examples from investigations of turbulent transport in magnetically confined plasmas. Particular emphasis is given to the importance of uncertainty quantification and its inclusion within the metrics, and the need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between simulation and experiment. As a starting point, the structure of commonly used global transport model metrics and their limitations is reviewed. An alternate approach is then presented, which focuses upon comparisons of predicted local fluxes, fluctuations, and equilibrium gradients against observation. Furthermore, the utility of metrics based upon these comparisons is demonstrated by applying them to gyrokinetic predictions of turbulent transport in a variety of discharges performed on the DIII-D tokamak, as part of a multi-year transport model validation activity.
External validation and comparison of two nomograms predicting the probability of Gleason sum upgrading between biopsy and radical prostatectomy pathology in two patient populations: a retrospective cohort study.

PubMed

Utsumi, Takanobu; Oka, Ryo; Endo, Takumi; Yano, Masashi; Kamijima, Shuichi; Kamiya, Naoto; Fujimura, Masaaki; Sekita, Nobuyuki; Mikami, Kazuo; Hiruta, Nobuyuki; Suzuki, Hiroyoshi

2015-11-01

The aim of this study is to validate and compare the predictive accuracy of two nomograms predicting the probability of Gleason sum upgrading between biopsy and radical prostatectomy pathology among representative patients with prostate cancer. We previously developed a nomogram, as did Chun et al. In this validation study, patients originated from two centers: Toho University Sakura Medical Center (n = 214) and Chibaken Saiseikai Narashino Hospital (n = 216). We assessed predictive accuracy using area under the curve values and constructed calibration plots to grasp the tendency for each institution. Both nomograms showed a high predictive accuracy in each institution, although the constructed calibration plots of the two nomograms underestimated the actual probability in Toho University Sakura Medical Center. Clinicians need to use calibration plots for each institution to correctly understand the tendency of each nomogram for their patients, even if each nomogram has a good predictive accuracy. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
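The calibration check described above compares predicted probabilities with observed event rates. The sketch below groups cases into equal-width probability bins and reports both quantities per bin; the function name `calibration_table`, the binning scheme, and the toy inputs are assumptions made for illustration, not the procedure used by the authors.

```python
def calibration_table(probs, outcomes, n_bins=5):
    """Group predictions into equal-width probability bins.

    Returns one (mean predicted probability, observed event rate, count)
    tuple per non-empty bin. An observed rate above the mean prediction
    means the model underestimates risk in that bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[b].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((mean_pred, observed, len(members)))
    return rows


# Hypothetical predicted upgrading probabilities and observed outcomes (1 = upgraded).
table = calibration_table([0.1, 0.2, 0.8, 0.9], [0, 1, 1, 1], n_bins=2)
```

In the toy table both bins show an observed rate above the mean prediction, which is the pattern of underestimation that a calibration plot would reveal even when discrimination is good.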
New equations improve NIR prediction of body fat among high school wrestlers.

PubMed

Oppliger, R A; Clark, R R; Nielsen, D H

2000-09-01

Methodologic study to derive prediction equations for percent body fat (%BF). To develop valid regression equations using NIR to assess body composition among high school wrestlers. Clinicians need a portable, fast, and simple field method for assessing body composition among wrestlers. Near-infrared photospectrometry (NIR) meets these criteria, but its efficacy has been challenged. Subjects were 150 high school wrestlers from 2 Midwestern states with mean ± SD age of 16.3 ± 1.1 yrs, weight of 69.5 ± 11.7 kg, and height of 174.4 ± 7.0 cm. Relative body fatness (%BF) determined from hydrostatic weighing was the criterion measure, and NIR optical density (OD) measurements at multiple sites, plus height, weight, and body mass index (BMI) were the predictor variables. Four equations were developed with multiple R² values that varied from .530 to .693, root mean squared errors that varied from 2.8% BF to 3.4% BF, and prediction errors that varied from 2.9% BF to 3.1% BF. The best equation used OD measurements at the biceps, triceps, and thigh sites, BMI, and age. The root mean squared error and prediction error for all 4 equations were equal to or smaller than for a skinfold equation commonly used with wrestlers. The results substantiate the validity of NIR for predicting %BF among high school wrestlers. Cross-validation of these equations is warranted.
A new self-report inventory of dyslexia for students: criterion and construct validity.

PubMed

Tamboer, Peter; Vorst, Harrie C M

2015-02-01

The validity of a Dutch self-report inventory of dyslexia was ascertained in two samples of students. Six biographical questions, 20 general language statements and 56 specific language statements were based on dyslexia as a multi-dimensional deficit. Dyslexia and non-dyslexia were assessed with two criteria: identification with test results (Sample 1) and classification using biographical information (both samples). Using discriminant analyses, these criteria were predicted with various groups of statements. Altogether, 11 discriminant functions were used to estimate classification accuracy of the inventory. In Sample 1, 15 statements predicted the test criterion with classification accuracy of 98%, and 18 statements predicted the biographical criterion with classification accuracy of 97%. In Sample 2, 16 statements predicted the biographical criterion with classification accuracy of 94%. Estimations of positive and negative predictive value were 89% and 99%. Items of various discriminant functions were factor analysed to find characteristic difficulties of students with dyslexia, resulting in a five-factor structure in Sample 1 and a four-factor structure in Sample 2. Answer bias was investigated with measures of internal consistency reliability. Fewer than 20 self-report items are sufficient to accurately classify students with and without dyslexia. This supports the usefulness of self-assessment of dyslexia as a valid alternative to diagnostic test batteries. Copyright © 2015 John Wiley & Sons, Ltd.

[Population pharmacokinetics applied to optimising cisplatin doses in cancer patients].

PubMed

Ramón-López, A; Escudero-Ortiz, V; Carbonell, V; Pérez-Ruixo, J J; Valenzuela, B

2012-01-01

To develop and internally validate a population pharmacokinetics model for cisplatin and assess its prediction capacity for personalising doses in cancer patients. Cisplatin plasma concentrations in forty-six cancer patients were used to determine the pharmacokinetic parameters of a two-compartment pharmacokinetic model implemented in NONMEM VI software. Pharmacokinetic parameter identification capacity was assessed using the parametric bootstrap method and the model was validated using the nonparametric bootstrap method and standardised visual and numerical predictive checks. The final model's prediction capacity was evaluated in terms of accuracy and precision during the first (a priori) and second (a posteriori) chemotherapy cycles. Mean population cisplatin clearance was 1.03 L/h with an interpatient variability of 78.0%. Estimated distribution volume at steady state was 48.3 L, with inter- and intrapatient variabilities of 31.3% and 11.7%, respectively. Internal validation confirmed that the population pharmacokinetics model is appropriate to describe changes over time in cisplatin plasma concentrations, as well as its variability in the study population. The accuracy and precision of a posteriori prediction of cisplatin concentrations improved by 21% and 54% compared to a priori prediction. The population pharmacokinetic model developed adequately described the changes in cisplatin plasma concentrations in cancer patients and can be used to optimise cisplatin dosing regimes accurately and precisely. Copyright © 2011 SEFH. Published by Elsevier Espana. All rights reserved.
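The nonparametric bootstrap mentioned above resamples patients with replacement and recomputes the statistic of interest on each resample. The sketch below applies the idea to a simple mean rather than to a refitted pharmacokinetic model, which is what the study itself would have required; the function name `bootstrap_ci`, the percentile interval, and the clearance values are hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean.

    Each resample draws len(values) observations with replacement; the
    interval is read off the sorted resampled means.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    stats = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Hypothetical individual clearance estimates in L/h (not study data).
clearances = [0.4, 0.7, 0.9, 1.0, 1.1, 1.3, 1.6, 2.2]
low, high = bootstrap_ci(clearances)
```

In a full population analysis each resample would be a set of patients on which the whole model is re-estimated, so the cost per resample is far higher than in this toy version.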
Predictive validity of the structured assessment of violence risk in youth: A 4-year follow-up.

PubMed

Gammelgård, Monica; Koivisto, Anna-Maija; Eronen, Markku; Kaltiala-Heino, Riittakerttu

2015-07-01

Structured violence risk assessment is an essential part of treatment planning for violent young people. The Structured Assessment of Violence Risk in Youth (SAVRY) has been shown to have good reliability and validity in a range of settings but has hardly been studied in adolescent mental health services. This study aimed to evaluate the long-term predictive validity of the SAVRY in adolescent psychiatry settings. In a prospective study, 200 SAVRY assessments of adolescents were acquired from psychiatric, forensic and correctional settings. Re-offending records from the Finnish National Crime Register were collected. Receiver operating curve statistics were applied. High SAVRY total and individual subscale scores and low values on the protective factor subscale were significantly associated with subsequent adverse outcomes, but the predictive value of the total score was weak. At the risk item level, those indicating antisocial lifestyle, absence of social support and pro-social involvement were strong indicators of subsequent criminal convictions, with or without violence. The SAVRY summary risk rating was the best indicator of likelihood of being convicted of a violent crime. After allowing for sex, age, psychiatric diagnosis and treatment setting, for example, conviction for a violent crime was over nine times more likely among those young people given high SAVRY summary risk ratings. The SAVRY is a valid and useful method for assessing both short-term and long-term risks of violent and non-violent crime by young people in psychiatric as well as criminal justice settings, adding to a traditional risk-centred assessment approach by also indicating where future preventive treatment efforts should be targeted. The next steps should be to evaluate its role in everyday clinical practice when using the knowledge generated to inform and monitor management and treatment strategies. Copyright © 2014 John Wiley & Sons, Ltd.
Validity and sensitivity of a model for assessment of impacts of river floodplain reconstruction on protected and endangered species

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nooij, R.J.W. de; Lotterman, K.M.; Sande, P.H.J. van de

Environmental Impact Assessment (EIA) must account for legally protected and endangered species. Uncertainties relating to the validity and sensitivity of EIA arise from predictions and valuation of effects on these species. This paper presents a validity and sensitivity analysis of a model (BIO-SAFE) for assessment of impacts of land use changes and physical reconstruction measures on legally protected and endangered river species. The assessment is based on links between species (higher plants, birds, mammals, reptiles and amphibians, butterflies and dragon- and damselflies) and ecotopes (landscape ecological units, e.g., river dune, soft wood alluvial forests), and on value assignment to protected and endangered species using different valuation criteria (i.e., EU Habitats and Birds directive, Conventions of Bern and Bonn and Red Lists). The validity of BIO-SAFE has been tested by comparing predicted effects of landscape changes on the diversity of protected and endangered species with observed changes in biodiversity in five reconstructed floodplains. The sensitivity of BIO-SAFE to value assignment has been analysed using data of a Strategic Environmental Assessment concerning the Spatial Planning Key Decision for reconstruction of the Dutch floodplains of the river Rhine, aimed at flood defence and ecological rehabilitation. The weights given to the valuation criteria for protected and endangered species were varied and the effects on ranking of alternatives were quantified. A statistically significant correlation (p < 0.01) between predicted and observed values for protected and endangered species was found. The sensitivity of the model to value assignment proved to be low. Comparison of five realistic valuation options showed that different rankings of scenarios predominantly occur when valuation criteria are left out of the assessment. Based on these results we conclude that linking species to ecotopes can be used for adequate impact assessments. Quantification of sensitivity of impact assessment to value assignment shows that a model like BIO-SAFE is relatively insensitive to assignment of values to different policy and legislation based criteria. Arbitrariness of the value assignment therefore has a very limited effect on assessment outcomes. However, the decision to include valuation criteria or not is very important.
Short communication: Variations in major mineral contents of Mediterranean buffalo milk and application of Fourier-transform infrared spectroscopy for their prediction.

PubMed

Stocco, G; Cipolat-Gotet, C; Bonfatti, V; Schiavon, S; Bittante, G; Cecchinato, A

2016-11-01

The aims of this study were (1) to assess variability in the major mineral components of buffalo milk, (2) to estimate the effect of certain environmental sources of variation on the major minerals during lactation, and (3) to investigate the possibility of using Fourier-transform infrared (FTIR) spectroscopy as an indirect, noninvasive tool for routine prediction of the mineral content of buffalo milk. A total of 173 buffaloes reared in 5 herds were sampled once during the morning milking. Milk samples were analyzed for Ca, P, K, and Mg contents within 3 h of sample collection using inductively coupled plasma optical emission spectrometry. A Milkoscan FT2 (Foss, Hillerød, Denmark) was used to acquire milk spectra over the spectral range from 5,000 to 900 cm⁻¹. Prediction models were built using a partial least square approach, and cross-validation was used to assess the prediction accuracy of FTIR. Prediction models were validated using a 4-fold random cross-validation, thus dividing the calibration-test set in 4 folds, using one of them to check the results (prediction models) and the remaining 3 to develop the calibration models. Buffalo milk minerals averaged 162, 117, 86, and 14.4 mg/dL of milk for Ca, P, K, and Mg, respectively. Herd and days in milk were the most important sources of variation in the traits investigated. Parity slightly affected only Ca content. Coefficients of determination of cross-validation between the FTIR-predicted and the measured values were 0.71, 0.70, and 0.72 for Ca, Mg, and P, respectively, whereas prediction accuracy was lower for K (0.55). Our findings reveal FTIR to be an unsuitable tool when milk mineral content needs to be predicted with high accuracy. Predictions may play a role as indicator traits in selective breeding (if the additive genetic correlation between FTIR predictions and measures of milk minerals is high enough) or in monitoring the milk of buffalo populations for dairy industry purposes. 
Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
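The 4-fold random cross-validation described in the preceding abstract can be sketched as follows. To keep the example self-contained, a one-predictor least-squares line stands in for the partial least squares model used in the study, and the coefficient of determination is computed as 1 - SS_res / SS_tot on the held-out predictions; both choices, along with all function names and the toy data, are assumptions of this sketch.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Randomly partition range(n) into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def cross_validated_r2(xs, ys, k=4, seed=0):
    """R^2 between measured values and predictions made on held-out folds."""
    preds = [0.0] * len(xs)
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Calibrate on the other k - 1 folds, then predict the held-out fold.
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = intercept + slope * xs[i]
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical, perfectly linear toy data (not spectra or mineral contents).
toy_x = [float(i) for i in range(12)]
toy_y = [2 * x + 1 for x in toy_x]
toy_r2 = cross_validated_r2(toy_x, toy_y)
```

Because every sample is predicted by a model that never saw it, the resulting R² is a more honest estimate of accuracy on new milk samples than the fit on the calibration data alone.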
Validity of self-reported sleep bruxism among myofascial temporomandibular disorder patients and controls.

PubMed

Raphael, K G; Janal, M N; Sirois, D A; Dubrovsky, B; Klausner, J J; Krieger, A C; Lavigne, G J

2015-10-01

Sleep bruxism (SB), primarily involving rhythmic grinding of the teeth during sleep, has been advanced as a causal or maintenance factor for a variety of oro-facial problems, including temporomandibular disorders (TMD). As laboratory polysomnographic (PSG) assessment is extremely expensive and time-consuming, most research testing this belief has relied on patient self-report of SB. The current case-control study examined the accuracy of those self-reports relative to laboratory-based PSG assessment of SB in a large sample of women suffering from chronic myofascial TMD (n = 124) and a demographically matched control group without TMD (n = 46). A clinical research coordinator administered a structured questionnaire to assess self-reported SB. Participants then spent two consecutive nights in a sleep laboratory. Audiovisual and electromyographic data from the second night were scored to assess whether participants met criteria for the presence of 2 or more (2+) rhythmic masticatory muscle activity episodes accompanied by grinding sounds, moderate SB, or severe SB, using previously validated research scoring standards. Contingency tables were constructed to assess positive and negative predictive values, sensitivity and specificity, and 95% confidence intervals surrounding the point estimates. Results showed that self-report significantly predicted 2+ grinding sounds during sleep for TMD cases. However, self-reported SB failed to significantly predict the presence or absence of either moderate or severe SB as assessed by PSG, for both cases and controls. These data show that self-report of tooth grinding awareness is highly unlikely to be a valid indicator of true SB. Studies relying on self-report to assess SB must be viewed with extreme caution. © 2015 John Wiley & Sons Ltd.
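The contingency-table statistics named above (sensitivity, specificity, positive and negative predictive values, each with a 95% confidence interval) can be sketched from the four cell counts of a 2×2 table. The abstract does not say which interval method was used, so the Wilson score interval here is an assumption, as are the function names and the counts.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def diagnostic_summary(tp, fp, fn, tn):
    """Point estimates and Wilson intervals from 2x2 contingency counts.

    tp/fn: reference-positive cases with a positive/negative self-report;
    fp/tn: reference-negative cases with a positive/negative self-report.
    """
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
        "ppv": (tp / (tp + fp), wilson_interval(tp, tp + fp)),
        "npv": (tn / (tn + fn), wilson_interval(tn, tn + fn)),
    }


# Hypothetical counts (not the study's data): self-report versus PSG reference.
summary = diagnostic_summary(tp=40, fp=10, fn=20, tn=30)
```

A predictive value whose interval spans the base rate of the condition gives little reason to trust the self-report, which is the kind of result the abstract describes for moderate and severe SB.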
Brief screening for co-occurring disorders among women entering substance abuse treatment.

PubMed

Lincoln, Alisa K; Liebschutz, Jane M; Chernoff, Miriam; Nguyen, Dana; Amaro, Hortensia

2006-09-07

Despite the importance of identifying co-occurring psychiatric disorders in substance abuse treatment programs, there are few appropriate and validated instruments available to substance abuse treatment staff to conduct brief screening for these conditions. This paper describes the development, implementation and validation of a brief screening instrument for mental health diagnoses and trauma among a diverse sample of Black, Hispanic and White women in substance abuse treatment. With input from clinicians and consumers, we adapted longer existing validated instruments into a 14-question screen covering demographics, mental health symptoms and physical and sexual violence exposure. All women entering treatment (methadone, residential and out-patient) at five treatment sites were screened at intake (N = 374). Eighty-nine percent reported a history of interpersonal violence, and 70% reported a history of sexual assault. Eighty-eight percent reported mental health symptoms in the last 30 days. The screening questions administered to 88 female clients were validated against in-depth psychiatric diagnostic assessments by trained mental health clinicians. We estimated measures of predictive validity, including sensitivity, specificity and positive and negative predictive values. Screening items were examined multiple ways to assess utility. The screen is a useful and valid proxy for PTSD but not for other mental illness. Substance abuse treatment programs should incorporate violence exposure questions into clinical use as a matter of policy. More work is needed to develop brief screening tools for front-line treatment staff to accurately assess other mental health needs of women entering substance abuse treatment.
Pediatric Heart Donor Assessment Tool (PH-DAT): A novel donor risk scoring system to predict 1-year mortality in pediatric heart transplantation.

PubMed

Zafar, Farhan; Jaquiss, Robert D; Almond, Christopher S; Lorts, Angela; Chin, Clifford; Rizwan, Raheel; Bryant, Roosevelt; Tweddell, James S; Morales, David L S

2018-03-01

In this study we sought to quantify hazards associated with various donor factors into a cumulative risk scoring system (the Pediatric Heart Donor Assessment Tool, or PH-DAT) to predict 1-year mortality after pediatric heart transplantation (PHT). PHT data with complete donor information (5,732) were randomly divided into a derivation cohort and a validation cohort (3:1). From the derivation cohort, donor-specific variables associated with 1-year mortality (exploratory p-value < 0.2) were incorporated into a multivariate logistic regression model. Scores were assigned to independent predictors (p < 0.05) based on relative odds ratios (ORs). The final model had an acceptable predictive value (c-statistic = 0.62). The significant 5 variables (ischemic time, stroke as the cause of death, donor-to-recipient height ratio, donor left ventricular ejection fraction, glomerular filtration rate) were used for the scoring system. The validation cohort demonstrated a strong correlation between the observed and expected rates of 1-year mortality (r = 0.87). The risk of 1-year mortality increases by 11% (OR 1.11 [1.08 to 1.14]; p < 0.001) in the derivation cohort and 9% (OR 1.09 [1.04 to 1.14]; p = 0.001) in the validation cohort with an increase of 1-point in score. Mortality risk increased 5 times from the lowest to the highest donor score in this cohort. Based on this model, a donor score range of 10 to 28 predicted 1-year recipient mortality of 11% to 31%. This novel pediatric-specific, donor risk scoring system appears capable of predicting post-transplant mortality. Although the PH-DAT may benefit organ allocation and assessment of recipient risk while controlling for donor risk, prospective validation of this model is warranted. Copyright © 2018 International Society for the Heart and Lung Transplantation. Published by Elsevier Inc. All rights reserved.
Predicting the need for massive transfusion in trauma patients: the Traumatic Bleeding Severity Score.

PubMed

Ogura, Takayuki; Nakamura, Yoshihiko; Nakano, Minoru; Izawa, Yoshimitsu; Nakamura, Mitsunobu; Fujizuka, Kenji; Suzukawa, Masayuki; Lefor, Alan T

2014-05-01

The ability to easily predict the need for massive transfusion may improve the process of care, allowing early mobilization of resources. There are currently no clear criteria to activate massive transfusion in severely injured trauma patients. The aims of this study were to create a scoring system to predict the need for massive transfusion and then to validate this scoring system. We reviewed the records of 119 severely injured trauma patients and identified massive transfusion predictors using statistical methods. Each predictor was converted into a simple score based on the odds ratio in a multivariate logistic regression analysis. The Traumatic Bleeding Severity Score (TBSS) was defined as the sum of the component scores. The predictive value of the TBSS for massive transfusion was then validated, using data from 113 severely injured trauma patients. Receiver operating characteristic curve analysis was performed to compare the results of TBSS with the Trauma-Associated Severe Hemorrhage score and the Assessment of Blood Consumption score. In the development phase, five predictors of massive transfusion were identified, including age, systolic blood pressure, the Focused Assessment with Sonography for Trauma scan, severity of pelvic fracture, and lactate level. The maximum TBSS is 57 points. In the validation study, the average TBSS in patients who received massive transfusion was significantly greater (24.2 [6.7]) than the score of patients who did not (6.2 [4.7]) (p < 0.01). The area under the receiver operating characteristic curve, sensitivity, and specificity for a TBSS greater than 15 points was 0.985 (significantly higher than the other scoring systems evaluated at 0.892 and 0.813, respectively), 97.4%, and 96.2%, respectively. The TBSS is simple to calculate using an available iOS application and is accurate in predicting the need for massive transfusion. 
Additional multicenter studies are needed to further validate this scoring system and further assess its utility. Prognostic study, level III.
A mathematical prediction model incorporating molecular subtype for risk of non-sentinel lymph node metastasis in sentinel lymph node-positive breast cancer patients: a retrospective analysis and nomogram development.

PubMed

Wang, Na-Na; Yang, Zheng-Jun; Wang, Xue; Chen, Li-Xuan; Zhao, Hong-Meng; Cao, Wen-Feng; Zhang, Bin

2018-04-25

Molecular subtype of breast cancer is associated with sentinel lymph node status. We sought to establish a mathematical prediction model that included breast cancer molecular subtype for risk of positive non-sentinel lymph nodes in breast cancer patients with sentinel lymph node metastasis and further validate the model in a separate validation cohort. We reviewed the clinicopathologic data of breast cancer patients with sentinel lymph node metastasis who underwent axillary lymph node dissection between June 16, 2014 and November 16, 2017 at our hospital. Sentinel lymph node biopsy was performed and patients with pathologically proven sentinel lymph node metastasis underwent axillary lymph node dissection. Independent risk factors for non-sentinel lymph node metastasis were assessed in a training cohort by multivariate analysis and incorporated into a mathematical prediction model. The model was further validated in a separate validation cohort, and a nomogram was developed and evaluated for diagnostic performance in predicting the risk of non-sentinel lymph node metastasis. Moreover, we assessed the performance of five different models in predicting non-sentinel lymph node metastasis in the training cohort. In total, 495 cases were eligible for the study, including 291 patients in the training cohort and 204 in the validation cohort. Non-sentinel lymph node metastasis was observed in 33.3% (97/291) of patients in the training cohort. The AUCs of the MSKCC, Tenon, MDA, Ljubljana, and Louisville models in the training cohort were 0.7613, 0.7142, 0.7076, 0.7483, and 0.671, respectively. Multivariate regression analysis indicated that tumor size (OR = 1.439; 95% CI 1.025-2.021; P = 0.036), sentinel lymph node macro-metastasis versus micro-metastasis (OR = 5.063; 95% CI 1.111-23.074; P = 0.036), the number of positive sentinel lymph nodes (OR = 2.583, 95% CI 1.714-3.892; P < 0.001), and the number of negative sentinel lymph nodes (OR = 0.686, 95% CI 0.575-0.817; P < 0.001) were independent statistically significant predictors of non-sentinel lymph node metastasis. Furthermore, luminal B (OR = 3.311, 95% CI 1.593-6.884; P = 0.001) and HER2 overexpression (OR = 4.308, 95% CI 1.097-16.912; P = 0.036) were independent and statistically significant predictors of non-sentinel lymph node metastasis versus luminal A. A regression model based on the results of multivariate analysis was established to predict the risk of non-sentinel lymph node metastasis, which had an AUC of 0.8188. The model was validated in the validation cohort and showed excellent diagnostic performance. The mathematical prediction model that incorporates five variables including breast cancer molecular subtype demonstrates excellent diagnostic performance in assessing the risk of non-sentinel lymph node metastasis in sentinel lymph node-positive patients. The prediction model could help surgeons evaluate the risk of non-sentinel lymph node involvement for breast cancer patients; however, the model requires further validation in prospective studies.
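For readers unfamiliar with the AUC values quoted in this record, the following is a minimal sketch of one common way to compute an AUC: as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case. The function name `auc` and the risk values are hypothetical illustrations, not data or code from the study.

```python
def auc(scores_pos, scores_neg):
    """Return the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted risks (assumed values, not taken from the abstract).
with_metastasis = [0.9, 0.7, 0.6, 0.4]
without_metastasis = [0.5, 0.3, 0.2, 0.1]
print(auc(with_metastasis, without_metastasis))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of roughly 0.67 to 0.82 can be read.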
Clinical and Psychometric Evaluations of the Cerebral Vision Screening Questionnaire in 461 Nonaphasic Individuals Poststroke.

PubMed

Neumann, Guenter; Schaadt, Anna-Katharina; Reinhart, Stefan; Kerkhoff, Georg

2016-03-01

Cerebral vision disorders (CVDs) are frequent after brain damage and impair the patient's outcome. Yet clinically and psychometrically validated procedures for the anamnesis of CVD are lacking. To evaluate the clinical validity and psychometric qualities of the Cerebral Vision Screening Questionnaire (CVSQ) for the anamnesis of CVD in individuals poststroke. Analysis of the patients' subjective visual complaints in the 10-item CVSQ in relation to objective visual perimetry, tests of reading, visual scanning, visual acuity, spatial contrast sensitivity, light/dark adaptation, and visual depth judgments. Psychometric analyses of concurrent validity, specificity, sensitivity, positive/negative predictive value, and interrater reliability were also done. Four hundred sixty-one patients with unilateral (39.5% left, 47.5% right) or bilateral stroke (13.0%) were included. Most patients were assessed in the chronic stage, on average 36.7 (range = 1-620) weeks poststroke. The majority of all patients (96.4%) recognized their visual symptoms within 1 week poststroke when specifically asked. Mean concurrent validity of the CVSQ with objective tests was 0.64 (0.54-0.79, P < .05). The mean positive predictive value was 80.1%, mean negative predictive value 82.9%, mean specificity 81.7%, and mean sensitivity 79.8%. The mean interrater reliability was 0.76 for a 1-week interval between both assessments (all P < .05). The CVSQ is suitable for the anamnesis of CVD poststroke because of its brevity (10 minutes), clinical validity, and good psychometric qualities. It, thus, improves neurovisual diagnosis and guides the clinician in the selection of necessary assessments and appropriate neurovisual therapies for the patient. © The Author(s) 2015.
Clinical assessment of the physical activity pattern of chronic fatigue syndrome patients: a validation of three methods.

PubMed

Scheeres, Korine; Knoop, Hans; Meer, van der Jos; Bleijenberg, Gijs

2009-04-01

Effective treatment of chronic fatigue syndrome (CFS) with cognitive behavioural therapy (CBT) relies on a correct classification of so-called 'fluctuating active' versus 'passive' patients. For successful treatment with CBT it is especially important to recognise the passive patients and give them a tailored treatment protocol. The present study evaluated whether CFS patients' physical activity pattern can be assessed most accurately with the 'Activity Pattern Interview' (API), the International Physical Activity Questionnaire (IPAQ) or the CFS-Activity Questionnaire (CFS-AQ). The three instruments were validated against actometers. Actometers are until now the best and most objective instrument to measure physical activity, but they are too expensive and time consuming for most clinical practice settings. In total, 226 CFS patients enrolled for CBT answered the API at intake and filled in the two questionnaires. Directly after intake they wore the actometer for two weeks. Based on receiver operating characteristic (ROC) curves, the validity of the three methods was assessed and compared. Both the API and the two questionnaires had an acceptable validity (0.64 to 0.71). None of the three instruments was significantly better than the others. The proportion of false predictions was rather high for all three instruments. The IPAQ had the highest proportion of correct passive predictions (sensitivity 70.1%). The validity of all three instruments appeared to be fair, and all showed rather high proportions of false classifications. Hence none of the tested instruments could really be called satisfactory. Because the IPAQ proved best at correctly predicting 'passive' CFS patients, which is most essentially related to treatment results, it was concluded that the IPAQ is the preferable alternative to an actometer when treating CFS patients in clinical practice.
Overcoming redundancies in bedside nursing assessments by validating a parsimonious meta-tool: findings from a methodological exercise study.

PubMed

Palese, Alvisa; Marini, Eva; Guarnier, Annamaria; Barelli, Paolo; Zambiasi, Paola; Allegrini, Elisabetta; Bazoli, Letizia; Casson, Paola; Marin, Meri; Padovan, Marisa; Picogna, Michele; Taddia, Patrizia; Chiari, Paolo; Salmaso, Daniele; Marognolli, Oliva; Canzan, Federica; Ambrosi, Elisa; Saiani, Luisa; Grassetti, Luca

2016-10-01

There is growing interest in validating tools aimed at supporting the clinical decision-making process and research. However, an increased bureaucratization of clinical practice and redundancies in the measures collected have been reported by clinicians. Redundancies in clinical assessments negatively affect both patients and nurses. To validate a meta-tool measuring the risks/problems currently estimated by multiple tools used in daily practice. A secondary analysis of a database was performed, using cross-validation and longitudinal study designs. In total, 1464 patients admitted to 12 medical units in 2012 were assessed at admission with the Brass, Barthel, Conley and Braden tools. Pertinent outcomes such as the occurrence of post-discharge need for resources and functional decline at discharge, as well as falls and pressure sores, were measured. Explorative factor analysis of each tool, inter-tool correlations and a conceptual evaluation of the redundant/similar items across tools were performed. Subsequently, the validation of the meta-tool was performed through explorative factor analysis, confirmatory factor analysis and the structural equation model to establish the ability of the meta-tool to predict the outcomes estimated by the original tools. High correlations between the tools emerged (r from 0.428 to 0.867), with a common variance from 18.3% to 75.1%. Through a conceptual evaluation and explorative factor analysis, the items were reduced from 42 to 20, and the three factors that emerged were confirmed by confirmatory factor analysis. According to the structural equation model results, two of the three factors that emerged predicted the outcomes. Reduced from the initial 42 items, the meta-tool is composed of 20 items capable of predicting the outcomes as with the original tools. © 2016 John Wiley & Sons, Ltd.
Incremental predictive validity of the Addiction Severity Index psychiatric composite score in a consecutive cohort of patients in residential treatment for drug use disorders.

PubMed

Thylstrup, Birgitte; Bloomfield, Kim; Hesse, Morten

2018-01-01

The Addiction Severity Index (ASI) is a widely used assessment instrument for substance abuse treatment that includes scales reflecting current status in seven potential problem areas, including psychiatric severity. The aim of this study was to assess the ability of the psychiatric composite score to predict suicide and psychiatric care after residential treatment for drug use disorders after adjusting for history of psychiatric care. All patients treated for drug use disorders in residential treatment centers in Denmark during the years 2000-2010 with complete ASI data were followed through national registers of psychiatric care and causes of death (N=5825). Competing risks regression analyses were used to assess the incremental predictive validity of the psychiatric composite score, controlling for previous psychiatric care, length of intake, and other ASI composite scores, up to 12 years after discharge. A total of 1769 patients received psychiatric care after being discharged from residential treatment (30.3%), and 27 (0.5%) committed suicide. After adjusting for all covariates, psychiatric composite score was associated with a higher risk of receiving psychiatric care after residential treatment (subhazard ratio [SHR]=3.44, p<0.001), and of committing suicide (SHR=11.45, p<0.001). The ASI psychiatric composite score has significant predictive validity and promises to be useful in identifying patients with drug use disorders who could benefit from additional mental health treatment. Copyright © 2017 Elsevier Ltd. All rights reserved.
Using Dynamic Assessment to Evaluate the Expressive Syntax of Children who use Augmentative and Alternative Communication

PubMed Central

King, Marika R.; Binger, Cathy; Kent-Walsh, Jennifer

2015-01-01

The developmental readiness of four 5-year-old children to produce basic sentences using graphic symbols on an augmentative and alternative communication (AAC) device during a dynamic assessment (DA) task was examined. Additionally, the ability of the DA task to predict performance on a subsequent experimental task was evaluated. A graduated prompting framework was used during DA. Measures included amount of support required to produce the targets, modifiability (change in participant performance) within a DA session, and predictive validity of DA. Participants accurately produced target structures with varying amounts of support. Modifiability within DA sessions was evident for some participants, and partial support was provided for the measures of predictive validity. These initial results indicate that DA may be a viable way to measure young children’s developmental readiness to learn how to sequence simple, rule-based messages via aided AAC. PMID:25621928
[Prediction equations for fat percentage from body circumferences in prepubescent children].

PubMed

Gómez Campos, Rossana; De Marco, Ademir; de Arruda, Miguel; Martínez Salazar, Cristian; Margarita Salazar, Ciria; Valgas, Carmen; Fuentes, José Damián; Cossio-Bolaños, Marco Antonio

2013-01-01

The analysis of body composition through direct and indirect methods allows the study of the various components of the human body and has become central to assessing nutritional status. The objective of the study was to develop equations for predicting body fat percentage from arm, waist and calf circumferences and to propose percentiles for diagnosing the nutritional status of school children of both sexes aged 4-10 years. We intentionally (non-probabilistically) selected 515 children, 261 boys and 254 girls, belonging to the program for interaction and development of children and adolescents at the State University of Campinas (Sao Paulo, Brazil). The anthropometric variables evaluated were weight, height, triceps and subscapular skinfolds, and body circumferences of the arm, waist and calf, and the percentage of fat was determined by the equation proposed by Boileau, Lohman and Slaughter (1985). Two equations to predict the percentage of fat from the body circumferences were generated by regression, and equations 1 and 2 were validated by cross-validation. The equations showed high predictive values, with R² ranging from 64% to 69%. In cross-validation, there was no significant difference between the criterion and the proposed regression equations (p > 0.05), and there was a high level of agreement at a 95% CI. It is concluded that the proposed equations are validated and offer an alternative for assessing the percentage of fat in school children of both sexes aged 4-10 years in the region of Campinas, SP (Brazil). Copyright © AULA MEDICA EDICIONES 2013. Published by AULA MEDICA. All rights reserved.
Predicting MCAT Examination Scores.

ERIC Educational Resources Information Center

Dawson-Saunders, Beth; And Others

Acceptable performance on the Medical College Admissions Test (MCAT) is necessary for acceptance into medical school; therefore, students planning a career in medicine and their advisors would benefit by having information useful in predicting performance on this examination. The present study examined the validity of the ACT Assessment as such a…
Validation of Field Methods to Assess Body Fat Percentage in Elite Youth Soccer Players.

PubMed

Munguia-Izquierdo, Diego; Suarez-Arrones, Luis; Di Salvo, Valter; Paredes-Hernandez, Victor; Alcazar, Julian; Ara, Ignacio; Kreider, Richard; Mendez-Villanueva, Alberto

2018-05-01

This study determined the most effective field method for quantifying body fat percentage in male elite youth soccer players and developed prediction equations based on anthropometric variables. Forty-four male elite-standard youth soccer players aged 16.3-18.0 years underwent body fat percentage assessments, including bioelectrical impedance analysis and the calculation of various skinfold-based prediction equations. Dual X-ray absorptiometry provided a criterion measure of body fat percentage. Correlation coefficients, bias, limits of agreement, and differences were used as validity measures, and regression analyses were used to develop soccer-specific prediction equations. The equations from Sarria et al. (1998) and Durnin & Rahaman (1967) reached very large correlations and the lowest biases, and they reached neither the practically worthwhile difference nor the substantial difference between methods. The new youth soccer-specific skinfold equation included a combination of triceps and supraspinale skinfolds. None of the practical methods compared in this study are adequate for estimating body fat percentage in male elite youth soccer players, except for the equations from Sarria et al. (1998) and Durnin & Rahaman (1967). The new youth soccer-specific equation calculated in this investigation is the only field method specifically developed and validated in elite male players, and it shows potentially good predictive power. © Georg Thieme Verlag KG Stuttgart · New York.
Predictive and concurrent validity of the Braden scale in long-term care: a meta-analysis.

PubMed

Wilchesky, Machelle; Lungu, Ovidiu

2015-01-01

Pressure ulcer prevention is an important long-term care (LTC) quality indicator. While the Braden Scale is a recommended risk assessment tool, there is a paucity of information specifically pertaining to its validity within the LTC setting. We, therefore, undertook a systematic review and meta-analysis comparing Braden Scale predictive and concurrent validity within this context. We searched the Medline, EMBASE, PsychINFO and PubMed databases from 1985-2014 for studies containing the requisite information to analyze tool validity. Our initial search yielded 3,773 articles. Eleven datasets emanating from nine published studies describing 40,361 residents met all meta-analysis inclusion criteria and were analyzed using random effects models. Pooled sensitivity, specificity, positive predictive value (PPV), and negative predictive values were 86%, 38%, 28%, and 93%, respectively. Specificity was poorer in concurrent samples as compared with predictive samples (38% vs. 72%), while PPV was low in both sample types (25 and 37%). Though random effects model results showed that the Scale had good overall predictive ability [RR, 4.33; 95% CI, 3.28-5.72], none of the concurrent samples were found to have "optimal" sensitivity and specificity. In conclusion, the appropriateness of the Braden Scale in LTC is questionable given its low specificity and PPV, in particular in concurrent validity studies. Future studies should further explore the extent to which the apparent low validity of the Scale in LTC is due to the choice of cutoff point and/or preventive strategies implemented by LTC staff as a matter of course. © 2015 by the Wound Healing Society.
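The gap between a high sensitivity and a low PPV in this record follows from how predictive values depend on prevalence. The sketch below applies Bayes' rule to the pooled sensitivity (86%) and specificity (38%) quoted above; the prevalence values are assumed for illustration only, the function name `predictive_values` is hypothetical, and pooled meta-analytic estimates need not correspond exactly to any single prevalence.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity are the pooled values from the abstract;
# the prevalences are assumptions chosen only to show the trend.
for prevalence in (0.10, 0.20, 0.30):
    ppv, npv = predictive_values(0.86, 0.38, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With specificity this low, most positive screens are false positives unless pressure ulcers are common in the population, which is consistent with the authors' concern about the Scale's PPV.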
Validation of the use of synthetic imagery for camouflage effectiveness assessment

NASA Astrophysics Data System (ADS)

Newman, Sarah; Gilmore, Marilyn A.; Moorhead, Ian R.; Filbee, David R.

2002-08-01

CAMEO-SIM was developed as a laboratory method to assess the effectiveness of aircraft camouflage schemes. It is a physically accurate synthetic image generator, rendering in any waveband between 0.4 and 14 microns. Camouflage schemes are assessed by displaying imagery to observers under controlled laboratory conditions or by analyzing the digital image and calculating the contrast statistics between the target and background. Code verification has taken place during development. However, validation of CAMEO-SIM is essential to ensure that the imagery produced is suitable to be used for camouflage effectiveness assessment. Real world characteristics are inherently variable, so exact pixel to pixel correlation is unnecessary. For camouflage effectiveness assessment it is more important to be confident that the comparative effects of different schemes are correct, but prediction of detection ranges is also desirable. Several different tests have been undertaken to validate CAMEO-SIM for the purpose of assessing camouflage effectiveness. Simple scenes have been modeled and measured. Thermal and visual properties of the synthetic and real scenes have been compared. This paper describes the validation tests and discusses the suitability of CAMEO-SIM for camouflage assessment.
Disentangling the risk assessment and intimate partner violence relation: Estimating mediating and moderating effects.

PubMed

Williams, Kirk R; Stansfield, Richard

2017-08-01

To manage intimate partner violence (IPV), the criminal justice system has turned to risk assessment instruments to predict if a perpetrator will reoffend. Empirically determining whether offenders assessed as high risk are those who recidivate is critical for establishing the predictive validity of IPV risk assessment instruments and for guiding the supervision of perpetrators. But by focusing solely on the relation between calculated risk scores and subsequent IPV recidivism, previous studies of the predictive validity of risk assessment instruments omitted mediating factors intended to mitigate the risk of this behavioral recidivism. The purpose of this study was to examine the mediating effects of such factors and the moderating effects of risk assessment on the relation between assessed risk (using the Domestic Violence Screening Instrument-Revised [DVSI-R]) and recidivistic IPV. Using a sample of 2,520 perpetrators of IPV, results revealed that time sentenced to jail and time sentenced to probation each significantly mediated the relation between DVSI-R risk level and frequency of reoffending. The results also revealed that assessed risk moderated the relation between these mediating factors and IPV recidivism, with reduced recidivism (negative estimated effects) for high-risk perpetrators but increased recidivism (positive estimated effects) for low-risk perpetrators. The implication is to assign interventions to the level of risk so that no harm is done. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

Field Validation of the Los Angeles Motor Scale as a Tool for Paramedic Assessment of Stroke Severity.

PubMed

Kim, Joon-Tae; Chung, Pil-Wook; Starkman, Sidney; Sanossian, Nerses; Stratton, Samuel J; Eckstein, Marc; Pratt, Frank D; Conwit, Robin; Liebeskind, David S; Sharma, Latisha; Restrepo, Lucas; Tenser, May-Kim; Valdes-Sueiras, Miguel; Gornbein, Jeffrey; Hamilton, Scott; Saver, Jeffrey L

2017-02-01

The Los Angeles Motor Scale (LAMS) is a 3-item, 0- to 10-point motor stroke-deficit scale developed for prehospital use. We assessed the convergent, divergent, and predictive validity of the LAMS when performed by paramedics in the field at multiple sites in a large and diverse geographic region. We analyzed early assessment and outcome data prospectively gathered in the FAST-MAG trial (Field Administration of Stroke Therapy-Magnesium phase 3) among patients with acute cerebrovascular disease (cerebral ischemia and intracranial hemorrhage) within 2 hours of onset, transported by 315 ambulances to 60 receiving hospitals. Among 1632 acute cerebrovascular disease patients (age 70±13 years, male 57.5%), time from onset to prehospital LAMS was median 30 minutes (interquartile range 20-50), onset to early postarrival (EPA) LAMS was 145 minutes (interquartile range 119-180), and onset to EPA National Institutes of Health Stroke Scale was 150 minutes (interquartile range 120-180). Between the prehospital and EPA assessments, LAMS scores were stable in 40.5%, improved in 37.6%, and worsened in 21.9%. In tests of convergent validity, against the EPA National Institutes of Health Stroke Scale, correlations were r=0.49 for the prehospital LAMS and r=0.89 for the EPA LAMS. Prehospital LAMS scores did diverge from the prehospital Glasgow Coma Scale, r=-0.22. Predictive accuracy (adjusted C statistics) for nondisabled 3-month outcome was as follows: prehospital LAMS, 0.76 (95% confidence interval 0.74-0.78); EPA LAMS, 0.85 (95% confidence interval 0.83-0.87); and EPA National Institutes of Health Stroke Scale, 0.87 (95% confidence interval 0.85-0.88). In this multicenter, prospective, prehospital study, the LAMS showed good to excellent convergent, divergent, and predictive validity, further establishing it as a validated instrument to characterize stroke severity in the field. © 2017 American Heart Association, Inc.
Cross-cultural Adaptation and Validation of the Medication Regimen Complexity Index Adapted to Spanish.

PubMed

Saez de la Fuente, Javier; Such Diaz, Ana; Cañamares-Orbis, Irene; Ramila, Estela; Izquierdo-Garcia, Elsa; Esteban, Concepcion; Escobar-Rodríguez, Ismael

2016-11-01

The most widely used validated instrument to assess the complexity of medication regimens is the Medication Regimen Complexity Index (MRCI). This study aimed to translate, adapt, and validate a reliable version of the MRCI adapted to Spanish (MRCI-E). The cross-cultural adaptation process consisted of an independent translation by 3 clinical pharmacists and a backtranslation by 2 native English speakers. A reliability analysis was conducted on 20 randomly selected elderly patients. Two clinical pharmacists calculated the MRCI-E from discharge treatments and 2 months later. For the validity analysis, the sample was augmented to 60 patients. Convergent validity was assessed by analyzing the correlation with the number of medications; discriminant validity was stratified by gender; and predictive validity was determined by analyzing the ability to predict readmission and mortality at 3 and 6 months. The MRCI-E retained the original structure of 3 sections. The reliability analysis demonstrated an excellent internal consistency (Cronbach's α=0.83), and the intraclass correlation coefficient exceeded 0.9 in all cases. The correlation coefficient with the number of medications was 0.883 (P<0.001). No significant differences were found when stratified by gender (3.6; 95%CI=-2.9 to 10.2; P=0.27). Patients who were readmitted at 3 months had a higher MRCI-E score (10.7; 95%CI=4.4 to 17.2; P=0.001). The differences remained significant in patients readmitted at 6 months, but differences in mortality were not detected. The MRCI-E retains the reliability and validity of the original index and provides a suitable tool to assess the complexity of medication regimens in Spanish.
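Internal consistency in this record is reported as Cronbach's α. As a minimal sketch, α can be computed from item-level scores with the standard formula α = k/(k-1) × (1 - Σ item variances / variance of total scores). The function name `cronbach_alpha` and the scores are hypothetical; the abstract does not say how the authors structured their items.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of k lists, each holding one item's scores for the same respondents."""
    k = len(items)
    # Total score per respondent, summing across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores for 3 items rated on 5 respondents (assumed data).
example_items = [
    [2, 4, 3, 5, 1],
    [3, 4, 3, 5, 2],
    [2, 5, 4, 4, 1],
]
print(round(cronbach_alpha(example_items), 3))
```

Values near 1 indicate that the items vary together across respondents; values near 0 indicate that they are largely unrelated.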
Reliability and validity of the Tilburg Frailty Indicator (TFI) among Chinese community-dwelling older people.

PubMed

Dong, Lijuan; Liu, Na; Tian, Xiaoyu; Qiao, Xiaoxia; Gobbens, Robbert J J; Kane, Robert L; Wang, Cuili

2017-11-01

To translate the Tilburg Frailty Indicator (TFI) into Chinese and assess its reliability and validity. A sample of 917 community-dwelling older people, aged ≥60 years, in a Chinese city was included between August 2015 and March 2016. Construct validity was assessed using alternative measures corresponding to the TFI items, including self-rated health status (SRH), unintentional weight loss, walking speed, timed-up-and-go tests (TUGT), making telephone calls, grip strength, exhaustion, Short Portable Mental Status Questionnaire (SPMSQ), Geriatric Depression scale (GDS-15), emotional role, Adaptability Partnership Growth Affection and Resolve scale (APGAR) and Social Support Rating Scale (SSRS). Fried's phenotype and frailty index were measured to evaluate criterion validity. Adverse health outcomes (ADL and IADL disability, healthcare utilization, GDS-15, SSRS) were used to assess predictive (concurrent) validity. The internal consistency reliability was good (Cronbach's α=0.71). The test-retest reliability was strong (r=0.88). Kappa coefficients showed agreements between the TFI items and corresponding alternative measures. Alternative measures correlated as expected with the three domains of TFI, with the exception that alternative psychological measures had similar correlations with psychological and physical domains of the TFI. The Chinese TFI had excellent criterion validity with the AUCs regarding physical phenotype and frailty index of 0.87 and 0.86, respectively. The predictive (concurrent) validities of the adverse health outcomes and healthcare utilization were acceptable (AUCs: 0.65-0.83). The Chinese TFI has good validity and reliability as an integral instrument to measure frailty of older people living in the community in China. Copyright © 2017 Elsevier B.V. All rights reserved.
Assessing the external validity of algorithms to estimate EQ-5D-3L from the WOMAC.

PubMed

Kiadaliri, Aliasghar A; Englund, Martin

2016-10-04

The use of mapping algorithms has been suggested as a solution to predict health utilities when no preference-based measure is included in the study. However, the validity and predictive performance of these algorithms are highly variable, and hence assessing the accuracy and validity of algorithms before using them in a new setting is important. The aim of the current study was to assess the predictive accuracy of three mapping algorithms to estimate the EQ-5D-3L from the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) among Swedish people with knee disorders. Two of these algorithms were developed using ordinary least squares (OLS) models and one was developed using a mixture model. The data from 1078 subjects, mean (SD) age 69.4 (7.2) years, with frequent knee pain and/or knee osteoarthritis from the Malmö Osteoarthritis study in Sweden were used. The algorithms' performance was assessed using mean error, mean absolute error, and root mean squared error. Two types of prediction were estimated for the mixture model: weighted average (WA), and conditional on estimated component (CEC). The overall mean was overpredicted by an OLS model and underpredicted by the two other algorithms (P < 0.001). All predictions but the CEC predictions of the mixture model had a narrower range than the observed scores (22 to 90%). All algorithms suffered from overprediction for severe health states and underprediction for mild health states, to a lesser extent for the mixture model. While the mixture model outperformed the OLS models at the extremes of the EQ-5D-3L distribution, it underperformed around the center of the distribution. While the algorithm based on the mixture model reflected the distribution of EQ-5D-3L data more accurately than the OLS models, all algorithms suffered from systematic bias. This calls for caution in applying these mapping algorithms in a new setting, particularly in samples with milder knee problems than the original sample. Assessing the impact of the choice of these algorithms on cost-effectiveness studies through sensitivity analysis is recommended.
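The three accuracy measures named in this record (mean error, mean absolute error, and root mean squared error) can be sketched as follows. The sign convention (predicted minus observed, so a positive mean error means overprediction) is an assumption, as the abstract does not state one, and the utility values are hypothetical.

```python
import math


def error_metrics(observed, predicted):
    """Return (mean error, mean absolute error, root mean squared error)."""
    # Assumed convention: error = predicted - observed.
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    mean_error = sum(errors) / n
    mean_abs_error = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return mean_error, mean_abs_error, rmse


# Hypothetical EQ-5D-3L utilities (assumed values, not study data).
observed = [0.20, 0.55, 0.70, 0.85, 1.00]
predicted = [0.35, 0.60, 0.68, 0.80, 0.90]
me, mae, rmse = error_metrics(observed, predicted)
print(f"ME={me:.3f}  MAE={mae:.3f}  RMSE={rmse:.3f}")
```

Mean error can be close to zero even when individual predictions are poor, because overprediction in severe states and underprediction in mild states cancel out; that is why the absolute and squared measures are reported alongside it.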
Validation of a prediction model that allows direct comparison of the Oxford Knee Score and American Knee Society clinical rating system.

PubMed

Maempel, J F; Clement, N D; Brenkel, I J; Walmsley, P J

2015-04-01

This study demonstrates a significant correlation between the American Knee Society (AKS) Clinical Rating System and the Oxford Knee Score (OKS) and provides a validated prediction tool to estimate score conversion. A total of 1022 patients were prospectively clinically assessed five years after TKR and completed AKS assessments and an OKS questionnaire. Multivariate regression analysis demonstrated significant correlations between OKS and the AKS knee and function scores but a stronger correlation (r = 0.68, p < 0.001) when using the sum of the AKS knee and function scores. Addition of body mass index and age (other statistically significant predictors of OKS) to the algorithm did not significantly increase the predictive value. The simple regression model was used to predict the OKS in a group of 236 patients who were clinically assessed nine to ten years after TKR using the AKS system. The predicted OKS was compared with actual OKS in the second group. Intra-class correlation demonstrated excellent reliability (r = 0.81, 95% confidence intervals 0.75 to 0.85) for the combined knee and function score when used to predict OKS. Our findings will facilitate comparison of outcome data from studies and registries using either the OKS or the AKS scores and may also be of value for those undertaking meta-analyses and systematic reviews. ©2015 The British Editorial Society of Bone & Joint Surgery.
The Use of a Dynamic Screening of Phonological Awareness to Predict Risk for Reading Disabilities in Kindergarten Children

PubMed Central

Bridges, Mindy Sittner; Catts, Hugh W.

2013-01-01

This study examined the usefulness and predictive validity of a dynamic screening of phonological awareness in two samples of kindergarten children. In one sample (n = 90), the predictive validity of the dynamic assessment was compared to a static version of the same screening measure. In the second sample (n = 96), the dynamic screening measure was compared to a commonly used screening tool, Dynamic Indicators of Basic Early Literacy Skills Initial Sound Fluency. Results showed that the dynamic screening measure uniquely predicted end-of-year reading achievement and outcomes in both samples. These results provide preliminary support for the usefulness of a dynamic screening measure of phonological awareness for kindergarten students. PMID:21571700
Uncertainty quantification's role in modeling and simulation planning, and credibility assessment through the predictive capability maturity model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rider, William J.; Witkowski, Walter R.; Mousseau, Vincent Andrew

2016-04-13

The importance of credible, trustworthy numerical simulations is obvious especially when using the results for making high-consequence decisions. Determining the credibility of such numerical predictions is much more difficult and requires a systematic approach to assessing predictive capability, associated uncertainties and overall confidence in the computational simulation process for the intended use of the model. This process begins with an evaluation of the computational modeling of the identified, important physics of the simulation for its intended use. This is commonly done through a Phenomena Identification Ranking Table (PIRT). Then an assessment of the evidence basis supporting the ability to computationally simulate these physics can be performed using various frameworks such as the Predictive Capability Maturity Model (PCMM). There are several critical activities that follow in the areas of code and solution verification, validation and uncertainty quantification, which will be described in detail in the following sections. Here, we introduce the subject matter for general applications but specifics are given for the failure prediction project. In addition, the first task that must be completed in the verification & validation procedure is to perform a credibility assessment to fully understand the requirements and limitations of the current computational simulation capability for the specific application intended use. The PIRT and PCMM are tools used at Sandia National Laboratories (SNL) to provide a consistent manner to perform such an assessment. Ideally, all stakeholders should be represented and contribute to perform an accurate credibility assessment. PIRTs and PCMMs are both described in brief detail below and the resulting assessments for an example project are given.
Development of an Itemwise Efficiency Scoring Method: Concurrent, Convergent, Discriminant, and Neuroimaging-Based Predictive Validity Assessed in a Large Community Sample

PubMed Central

Moore, Tyler M.; Reise, Steven P.; Roalf, David R.; Satterthwaite, Theodore D.; Davatzikos, Christos; Bilker, Warren B.; Port, Allison M.; Jackson, Chad T.; Ruparel, Kosha; Savitt, Adam P.; Baron, Robert B.; Gur, Raquel E.; Gur, Ruben C.

2016-01-01

Traditional “paper-and-pencil” testing is imprecise in measuring speed and hence limited in assessing performance efficiency, but computerized testing permits precision in measuring itemwise response time. We present a method of scoring performance efficiency (combining information from accuracy and speed) at the item level. Using a community sample of 9,498 youths age 8-21, we calculated item-level efficiency scores on four neurocognitive tests, and compared the concurrent, convergent, discriminant, and predictive validity of these scores to simple averaging of standardized speed and accuracy-summed scores. Concurrent validity was measured by the scores' abilities to distinguish men from women and their correlations with age; convergent and discriminant validity were measured by correlations with other scores inside and outside of their neurocognitive domains; predictive validity was measured by correlations with brain volume in regions associated with the specific neurocognitive abilities. Results provide support for the ability of itemwise efficiency scoring to detect signals as strong as those detected by standard efficiency scoring methods. We find no evidence of superior validity of the itemwise scores over traditional scores, but point out several advantages of the former. The itemwise efficiency scoring method shows promise as an alternative to standard efficiency scoring methods, with overall moderate support from tests of four different types of validity. This method allows the use of existing item analysis methods and provides the convenient ability to adjust the overall emphasis of accuracy versus speed in the efficiency score, thus adjusting the scoring to the real-world demands the test is aiming to fulfill. PMID:26866796
TESTING BALANCE AND FALL RISK IN PERSONS WITH PARKINSON DISEASE, AN ARGUMENT FOR ECOLOGICALLY VALID TESTING

PubMed Central

Foreman, K. Bo; Addison, Odessa; Kim, Han S.; Dibble, Leland E.

2010-01-01

Introduction: Despite clear deficits in postural control, most clinical examination tools lack accuracy in identifying persons with Parkinson disease (PD) who have fallen or are at risk for falls. We assert that this is in part due to the lack of ecological validity of the testing. Methods: To test this assertion, we examined the responsiveness and predictive validity of the Functional Gait Assessment (FGA), the Pull test, and the Timed Up and Go (TUG) during clinically defined ON and OFF medication states. To address responsiveness, ON/OFF medication performance was compared. To address predictive validity, areas under the curve (AUC) of receiver operating characteristic (ROC) curves were compared. Comparisons were made using separate non-parametric tests. Results: Thirty-six persons (24 male, 12 female) with PD (22 fallers, 14 non-fallers) participated. Only the FGA was able to detect differences between fallers and non-fallers for both ON/OFF medication testing. The predictive validity of the FGA and the TUG for fall identification was higher during OFF medication compared to ON medication testing. The predictive validity of the FGA was higher than the TUG and the Pull test during ON and OFF medication testing. Discussion: In order to most accurately identify fallers, clinicians should test persons with PD in ecologically relevant conditions and tasks. In this study, interpretation of the OFF medication performance and use of the FGA provided more accurate prediction of those who would fall. PMID:21215674
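The AUC comparison described in this record can be illustrated with a minimal sketch. It is not the authors' analysis: the function name `roc_auc` and the FGA totals below are hypothetical, and the sketch assumes that lower FGA totals indicate higher fall risk, so scores are negated before ranking. AUC is computed as the probability that a randomly chosen faller is ranked as higher risk than a randomly chosen non-faller, with ties counted as one half.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as P(positive outranks negative); ties contribute 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical FGA totals, invented for illustration (not data from the study).
fga_fallers = [12, 15, 18, 20]
fga_non_fallers = [19, 22, 25, 27]

# Assumption: a lower FGA total means higher fall risk, so negate the scores
# to turn them into a "higher = riskier" ranking before computing AUC.
auc = roc_auc([-s for s in fga_fallers], [-s for s in fga_non_fallers])
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 would correspond to chance-level identification of fallers and 1.0 to perfect separation; comparing this value across tests or medication states is the kind of comparison the abstract describes.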
Getting precise and pragmatic about the assessment of bullying: the development of the California Bullying Victimization Scale.

PubMed

Felix, Erika D; Sharkey, Jill D; Green, Jennifer Greif; Furlong, Michael J; Tanigawa, Diane

2011-01-01

Accurate assessment of bullying is essential to intervention planning and evaluation. Limitations to many currently available self-report measures of bullying victimization include a lack of psychometric information, use of the emotionally laden term "bullying" in definition-first approaches to self-report surveys, and not assessing all components of the definition of bullying (chronicity, intentionality, and imbalance of power) in behavioral-based self-report methods. To address these limitations, we developed the California Bullying Victimization Scale (CBVS), which is a self-report scale that measures the three-part definition of bullying without the use of the term bully. We examined test-retest reliability and the concurrent and predictive validity of the CBVS across students in Grades 5-12 in four central California schools. Concurrent validity was assessed by comparing the CBVS with a common, definition-based bullying victimization measure. Predictive validity was examined through the co-administration of measures of psychological well-being. Analysis by grade and gender are included. Results support the test-retest reliability of the CBVS over a 2-week period. The CBVS was significantly, positively correlated with another bullying assessment and was related in expected directions to measures of well-being. Implications for differentiating peer victimization and bullying victimization via self-report measures are discussed. © 2011 Wiley-Liss, Inc.
Reading the Road Signs: The Utility of the MMPI-2 Restructured Form Validity Scales in Prediction of Premature Termination.

PubMed

Anestis, Joye C; Finn, Jacob A; Gottfried, Emily; Arbisi, Paul A; Joiner, Thomas E

2015-06-01

This study examined the utility of the Minnesota Multiphasic Personality Inventory-2 Restructured Form (MMPI-2-RF) Validity Scales in prediction of premature termination in a sample of 511 individuals seeking services from a university-based psychology clinic. Higher scores on True Response Inconsistency-Revised and Infrequent Psychopathology Responses increased the risk of premature termination, whereas higher scores on Adjustment Validity lowered the risk of premature termination. Additionally, when compared with individuals who did not prematurely terminate, individuals who prematurely terminated treatment had lower Global Assessment of Functioning scores at both intake and termination and made fewer improvements. Implications of these findings for the use of the MMPI-2-RF Validity Scales in promoting treatment compliance are discussed. © The Author(s) 2014.
Factor structure, validity and reliability of the Cambridge Worry Scale in a pregnant population.

PubMed

Green, Josephine M; Kafetsios, Konstantinos; Statham, Helen E; Snowdon, Claire M

2003-11-01

This article presents the Cambridge Worry Scale (CWS), a content-based measure for assessing worries, and discusses its psychometric properties based on a longitudinal study of 1,207 pregnant women. Principal components analysis revealed a four-factor structure of women's concerns during pregnancy: socio-medical, own health, socio-economic and relational. The measure demonstrated good reliability and validity. Total CWS scores were strongly associated with state and trait anxiety (convergent validity) but also had significant and unique predictive value for mood outcomes (discriminant validity). The CWS discriminated better between women with different reproductive histories than measures of state and trait anxiety. We conclude that the CWS is a reliable and valid tool for assessing the extent and content of worries in specific situations.
Development of the Affordances in the Home Environment for Motor Development-Infant Scale.

PubMed

Caçola, Priscila; Gabbard, Carl; Santos, Denise C C; Batistela, Ana Carolina T

2011-12-01

The present study reports the development and application of the Affordances in the Home Environment for Motor Development-Infant Scale (AHEMD-IS), a parental self-report designed to assess the quantity and quality of affordances in the home environment that are conducive to motor development for infants aged 3-18 months. Steps in its development included use of expert feedback, establishment of construct validity, interrater and intrarater reliability, and predictive validity. With all phases of the project, 113 homes were involved. Intraclass correlation coefficients for interrater and intrarater reliability for the total score were 1 and 0.94, respectively. In addition, results indicate that the test has the characteristic of differentiating a wide range of scores. Regression analysis for the AHEMD-IS and motor development using the Alberta Infant Motor Scale supports preliminary evidence for predictive validity. Our findings suggest that the AHEMD-IS has sufficient reliability and validity as an instrument for assessing affordances in the home environment, with clinical and research applications. © 2011 The Authors. Pediatrics International © 2011 Japan Pediatric Society.
Development and practical implications of the Exercise Resourcefulness Inventory.

PubMed

Fast, Hilary V; Kennett, Deborah J

2015-05-01

To determine the validity and reliability of the Exercise Resourcefulness Inventory (ERI) designed to assess the self-regulatory strategies used to promote regular exercise. In Study 1, the inventory's relationship with other established scales in the exercise behavior change field was examined. In Study 2, the test-retest reliability and predictive validity of the ERI was established by having participants from Study 1 complete the inventory a second time. Internal consistency, and convergent, discriminant, and concurrent validity were supported in both studies. The test-retest correlation of the ERI was .80. As well, participants scoring higher on the ERI in Study 1 were more likely to be at a higher stage of change in Study 2, and greater increases in exercise resourcefulness over time were predictive of advancement to higher stages of change. ERI is a reliable and valid measure to assess the self-regulatory strategies used to promote regular exercise. Facilitators may want to tailor exercise programs for individuals scoring lower in resourcefulness to prevent them from relapsing. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Calibration power of the Braden scale in predicting pressure ulcer development.

PubMed

Chen, Hong-Lin; Cao, Ying-Juan; Wang, Jing; Huai, Bao-Sha

2016-11-02

Calibration is the degree of correspondence between the estimated probability produced by a model and the actual observed probability. The aim of this study was to investigate the calibration power of the Braden scale in predicting pressure ulcer (PU) development. A retrospective analysis was performed among consecutive patients in 2013. The patients were separated into a training group and a validation group. The predicted incidence was calculated using a logistic regression model in the training group and the Hosmer-Lemeshow test was used for assessing the goodness of fit. In the validation cohort, the observed and the predicted incidence were compared by the Chi-square (χ²) goodness of fit test for calibration power. We included 2585 patients in the study; of these, 78 patients (3.0%) developed a PU. Differences in patient characteristics between the training and validation groups were non-significant (p>0.05). In the training group, the logistic regression model for predicting pressure ulcer was Logit(P) = -0.433*Braden score+2.616. The Hosmer-Lemeshow test showed poor goodness of fit (χ²=13.472; p=0.019). In the validation group, the predicted pressure ulcer incidence also did not fit well with the observed incidence (χ²=42.154, p=0.000 by Braden scores; and χ²=17.223, p=0.001 by Braden scale risk classification). The Braden scale has low calibration power in predicting PU formation.
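The fitted model reported in this record, Logit(P) = -0.433*Braden score + 2.616, can be turned into a predicted probability by inverting the logit. The sketch below uses only the two coefficients quoted in the abstract; the function name `predicted_pu_risk` and the example scores are illustrative assumptions, and the output is shown only to make the formula concrete (the abstract itself reports that the model calibrates poorly).

```python
import math

# Coefficients as quoted in the abstract: Logit(P) = -0.433 * Braden + 2.616
SLOPE = -0.433
INTERCEPT = 2.616

def predicted_pu_risk(braden_score):
    """Invert the logit to get the model's predicted pressure ulcer probability."""
    logit_p = SLOPE * braden_score + INTERCEPT
    return 1.0 / (1.0 + math.exp(-logit_p))

# Illustrative Braden totals (the scale runs from 6 to 23; lower = higher risk).
for score in (9, 12, 15, 18):
    print(score, round(predicted_pu_risk(score), 4))
```

A calibration check of the kind the abstract describes would then compare these predicted probabilities, summed within score groups, against the observed number of pressure ulcers in each group.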
External validation of the diffuse intrinsic pontine glioma survival prediction model: a collaborative report from the International DIPG Registry and the SIOPE DIPG Registry.

PubMed

Veldhuijzen van Zanten, Sophie E M; Lane, Adam; Heymans, Martijn W; Baugh, Joshua; Chaney, Brooklyn; Hoffman, Lindsey M; Doughman, Renee; Jansen, Marc H A; Sanchez, Esther; Vandertop, William P; Kaspers, Gertjan J L; van Vuurden, Dannis G; Fouladi, Maryam; Jones, Blaise V; Leach, James

2017-08-01

We aimed to perform external validation of the recently developed survival prediction model for diffuse intrinsic pontine glioma (DIPG), and discuss its utility. The DIPG survival prediction model was developed in a cohort of patients from the Netherlands, United Kingdom and Germany, registered in the SIOPE DIPG Registry, and includes age <3 years, longer symptom duration and receipt of chemotherapy as favorable predictors, and presence of ring-enhancement on MRI as unfavorable predictor. Model performance was evaluated by analyzing the discrimination and calibration abilities. External validation was performed using an unselected cohort from the International DIPG Registry, including patients from United States, Canada, Australia and New Zealand. Basic comparison with the results of the original study was performed using descriptive statistics, and univariate- and multivariable regression analyses in the validation cohort. External validation was assessed following a variety of analyses described previously. Baseline patient characteristics and results from the regression analyses were largely comparable. Kaplan-Meier curves of the validation cohort reproduced separated groups of standard (n = 39), intermediate (n = 125), and high-risk (n = 78) patients. This discriminative ability was confirmed by similar values for the hazard ratios across these risk groups. The calibration curve in the validation cohort showed a symmetric underestimation of the predicted survival probabilities. In this external validation study, we demonstrate that the DIPG survival prediction model has acceptable cross-cohort calibration and is able to discriminate patients with short, average, and increased survival. We discuss how this clinico-radiological model may serve a useful role in current clinical practice.
Reliability, validity and administrative burden of the community reintegration of injured service members computer adaptive test (CRIS-CAT).

PubMed

Resnik, Linda; Borgia, Matthew; Ni, Pensheng; Pirraglia, Paul A; Jette, Alan

2012-09-17

The Computer Adaptive Test version of the Community Reintegration of Injured Service Members measure (CRIS-CAT) consists of three scales measuring Extent of, Perceived Limitations in, and Satisfaction with community integration. The CRIS-CAT was developed using item response theory methods. The purposes of this study were to assess the reliability, concurrent, known group and predictive validity and respondent burden of the CRIS-CAT. This was a three-part study that included 1) a cross-sectional field study of 517 homeless, employed, and Operation Enduring Freedom/Operation Iraqi Freedom (OEF/OIF) Veterans who completed all items in the CRIS item set, 2) a cohort study with one-year follow-up of 135 OEF/OIF Veterans, and 3) a 50-person study of CRIS-CAT administration. Conditional reliability of simulated CAT scores was calculated from the field study data, and concurrent validity and known group validity were examined using Pearson product correlations and ANOVAs. Data from the cohort were used to examine the ability of the CRIS-CAT to predict key one-year outcomes. Data from the CRIS-CAT administration study were used to calculate ICC (2,1), minimum detectable change (MDC), and average number of items used during CAT administration. Reliability scores for all scales were above 0.75, but decreased at both ends of the score continuum. CRIS-CAT scores were correlated with concurrent validity indicators and differed significantly between the three Veteran groups (P < .001). The odds of having any Emergency Room visits were reduced for Veterans with better CRIS-CAT scores (Extent, Perceived, and Satisfaction, respectively: OR = 0.94, 0.93, 0.95; P < .05).
CRIS-CAT scores were predictive of SF-12 physical and mental health related quality of life scores at the 1-year follow-up. Scales had ICCs >0.9. MDCs were 5.9, 6.2, and 3.6, respectively, for Extent, Perceived and Satisfaction subscales. Number of items (mean, SD) administered at Visit 1 were 14.6 (3.8), 10.9 (2.7), and 10.4 (1.7), respectively, for Extent, Perceived and Satisfaction subscales. The CRIS-CAT demonstrated sound measurement properties including reliability, construct, known group and predictive validity, and it was administered with minimal respondent burden. These findings support the use of this measure in assessing community reintegration.
Validation of the Registry to Evaluate Early and Long-Term Pulmonary Arterial Hypertension Disease Management (REVEAL) pulmonary hypertension prediction model in a unique population and utility in the prediction of long-term survival.

PubMed

Cogswell, Rebecca; Kobashigawa, Erin; McGlothlin, Dana; Shaw, Robin; De Marco, Teresa

2012-11-01

The Registry to Evaluate Early and Long-Term Pulmonary Arterial Hypertension (PAH) Disease Management (REVEAL) model was designed to predict 1-year survival in patients with PAH. Multivariate prediction models need to be evaluated in cohorts distinct from the derivation set to determine external validity. In addition, limited data exist on the utility of this model in the prediction of long-term survival. REVEAL model performance was assessed to predict 1-year and 5-year outcomes, defined as survival or composite survival or freedom from lung transplant, in 140 patients with PAH. The validation cohort had a higher proportion of human immunodeficiency virus (7.9% vs 1.9%, p < 0.0001), methamphetamine use (19.3% vs 4.9%, p < 0.0001), and portal hypertension PAH (16.4% vs 5.1%, p < 0.0001) compared with the development cohort. The C-index of the model to predict survival was 0.765 at 1 year and 0.712 at 5 years of follow-up. The C-index of the model to predict composite survival or freedom from lung transplant was 0.805 and 0.724 at 1 and 5 years of follow-up, respectively. Prediction by the model, however, was weakest among patients with intermediate-risk predicted survival. The REVEAL model had adequate discrimination to predict 1-year survival in this small but clinically distinct validation cohort. Although the model also had predictive ability out to 5 years, prediction was limited among patients of intermediate risk, suggesting our prediction methods can still be improved. Copyright © 2012. Published by Elsevier Inc.
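The C-index reported in this record measures discrimination for censored survival data. As a rough sketch of one common definition (Harrell's concordance index), and not the authors' code, the function below counts, among comparable patient pairs, how often the patient who died earlier had the higher predicted risk. The name `harrell_c` and the four-patient data set are hypothetical.

```python
def harrell_c(times, events, risk):
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when patient i had the event and did so
    strictly before patient j's last observed time. The pair is concordant
    when i, who failed earlier, also has the higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable

# Hypothetical follow-up data (years), event flags (1 = died, 0 = censored)
# and model risk scores; invented for illustration only.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]

c_index = harrell_c(times, events, risk)
print(f"C-index = {c_index:.2f}")
```

A value of 0.5 indicates no discrimination and 1.0 perfect ranking, which puts the reported values of 0.765 at 1 year and 0.712 at 5 years in context.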
Risk Assessments by Female Victims of Intimate Partner Violence: Predictors of Risk Perceptions and Comparison to an Actuarial Measure

ERIC Educational Resources Information Center

Connor-Smith, Jennifer K.; Henning, Kris; Moore, Stephanie; Holdford, Robert

2011-01-01

Recent studies support the validity of both structured risk assessment tools and victim perceptions as predictors of risk for repeat intimate partner violence (IPV). Combining structured risk assessments and victim risk assessments leads to better predictions of repeat violence than either alone, suggesting that the two forms of assessment provide…
Calibration and validation of toxicokinetic-toxicodynamic models for three neonicotinoids and some aquatic macroinvertebrates.

PubMed

Focks, Andreas; Belgers, Dick; Boerwinkel, Marie-Claire; Buijse, Laura; Roessink, Ivo; Van den Brink, Paul J

2018-05-01

Exposure patterns in ecotoxicological experiments often do not match the exposure profiles for which a risk assessment needs to be performed. This limitation can be overcome by using toxicokinetic-toxicodynamic (TKTD) models for the prediction of effects under time-variable exposure. For the use of TKTD models in the environmental risk assessment of chemicals, it is required to calibrate and validate the model for specific compound-species combinations. In this study, the survival of macroinvertebrates after exposure to the neonicotinoid insecticide was modelled using TKTD models from the General Unified Threshold models of Survival (GUTS) framework. The models were calibrated on existing survival data from acute or chronic tests under static exposure regime. Validation experiments were performed for two sets of species-compound combinations: one set focussed on multiple species sensitivity to a single compound: imidacloprid, and the other set on the effects of multiple compounds for a single species, i.e., the three neonicotinoid compounds imidacloprid, thiacloprid and thiamethoxam, on the survival of the mayfly Cloeon dipterum. The calibrated models were used to predict survival over time, including uncertainty ranges, for the different time-variable exposure profiles used in the validation experiments. From the comparison between observed and predicted survival, it appeared that the accuracy of the model predictions was acceptable for four of five tested species in the multiple species data set. For compounds such as neonicotinoids, which are known to have the potential to show increased toxicity under prolonged exposure, the calibration and validation of TKTD models for survival needs to be performed ideally by considering calibration data from both acute and chronic tests.

Validation of the CRASH model in the prediction of 18-month mortality and unfavorable outcome in severe traumatic brain injury requiring decompressive craniectomy.

PubMed

Honeybul, Stephen; Ho, Kwok M; Lind, Christopher R P; Gillett, Grant R

2014-05-01

The goal in this study was to assess the validity of the corticosteroid randomization after significant head injury (CRASH) collaborators prediction model in predicting mortality and unfavorable outcome at 18 months in patients with severe traumatic brain injury (TBI) requiring decompressive craniectomy. In addition, the authors aimed to assess whether this model was well calibrated in predicting outcome across a wide spectrum of severity of TBI requiring decompressive craniectomy. This prospective observational cohort study included all patients who underwent a decompressive craniectomy following severe TBI at the two major trauma hospitals in Western Australia between 2004 and 2012 and for whom 18-month follow-up data were available. Clinical and radiological data on initial presentation were entered into the Web-based model and the predicted outcome was compared with the observed outcome. In validating the CRASH model, the authors used area under the receiver operating characteristic curve to assess the ability of the CRASH model to differentiate between favorable and unfavorable outcomes. The ability of the CRASH 6-month unfavorable prediction model to differentiate between unfavorable and favorable outcomes at 18 months after decompressive craniectomy was good (area under the receiver operating characteristic curve 0.85, 95% CI 0.80-0.90). However, the model's calibration was not perfect. The slope and the intercept of the calibration curve were 1.66 (SE 0.21) and -1.11 (SE 0.14), respectively, suggesting that the predicted risks of unfavorable outcomes were not sufficiently extreme or different across different risk strata and were systematically too high (or overly pessimistic), respectively. The CRASH collaborators prediction model can be used as a surrogate index of injury severity to stratify patients according to injury severity. 
However, clinical decisions should not be based solely on the predicted risks derived from the model, because the number of patients in each predicted risk stratum was still relatively small and hence the results were relatively imprecise. Notwithstanding these limitations, the model may add to a clinician's ability to have better-informed conversations with colleagues and patients' relatives about prognosis.
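The calibration slope and intercept quoted in this record can be illustrated with a logistic recalibration sketch: observed outcomes are regressed on the logit of the predicted risk, so a slope of 1 and intercept of 0 indicate ideal calibration. This is one common convention and an assumption here, since the abstract does not say how the authors estimated the curve. The function `fit_calibration`, the Newton-Raphson fitting loop, and the data are all hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def fit_calibration(pred_probs, outcomes, iters=25):
    """Fit logit(P(y=1)) = a + b * logit(pred) by Newton-Raphson.

    Returns (a, b): the calibration intercept and slope.
    """
    x = [logit(p) for p in pred_probs]
    a, b = 0.0, 1.0  # start from "perfectly calibrated"
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcomes):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            r = yi - mu
            g0 += r            # gradient w.r.t. intercept
            g1 += r * xi       # gradient w.r.t. slope
            h00 += w           # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Hypothetical predicted risks and observed outcomes (1 = unfavorable).
preds = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.3, 0.7, 0.5]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1]

intercept, slope = fit_calibration(preds, outcomes)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

Under this convention, a slope above 1 means the predicted risks are not extreme enough and a negative intercept means they are systematically too high, which matches the interpretation given in the abstract for the reported values of 1.66 and -1.11.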
A method to develop vocabulary checklists in new languages and their validity to assess early language development.

PubMed

Prado, Elizabeth L; Phuka, John; Ocansey, Eugenia; Maleta, Kenneth; Ashorn, Per; Ashorn, Ulla; Adu-Afarwuah, Seth; Oaks, Brietta M; Lartey, Anna; Dewey, Kathryn G

2018-05-11

Since the adoption of United Nations' Sustainable Development Goal 4.2 to ensure that all children have access to quality early child development (ECD) so that they are ready for primary education, the demand for valid ECD assessments has increased in contexts where they do not yet exist. The development of early language ability is important for school readiness. Our objective was to evaluate the validity of a method to develop vocabulary checklists in new languages to assess early language development, based on the MacArthur-Bates Communicative Development Inventories. Through asking mothers of young children what words their children say and through pilot testing, we developed 100-word vocabulary checklists in multilingual contexts in Malawi and Ghana. In Malawi, we evaluated the validity of the vocabulary checklist among 29 children age 17-25 months compared to three language measures assessed concurrently: Developmental Milestones Checklist-II (DMC-II) language scale, Malawi Developmental Assessment Tool (MDAT) language scale, and the number of different words (NDW) in 30-min recordings of spontaneous speech. In Ghana, we assessed the predictive validity of the vocabulary checklist at age 18 months to forecast language, pre-academic, and other skills at age 4-6 years among 869 children. We also compared the predictive validity of the vocabulary checklist scores to that of other developmental assessments administered at age 18 months. In Malawi, the Spearman's correlation of the vocabulary checklist score with DMC-II language was 0.46 (p = 0.049), with MDAT language was 0.66 (p = 0.016) and with NDW was 0.50 (p = 0.033). In Ghana, the 18-month vocabulary checklist score showed the strongest (rho = 0.12-0.26) and most consistent (8/12) associations with preschool scores, compared to the other 18-month assessments.
The largest coefficients were the correlations of the 18-month vocabulary score with the preschool cognitive factor score (rho = 0.26), language score (0.25), and pre-academic score (0.24). We have demonstrated the validity of a method to develop vocabulary checklists in new languages, which can be used in multilingual contexts, using a feasible adaptation process requiring about 2 weeks. This is a promising method to assess early language development, which is associated with later preschool language, cognitive, and pre-academic skills.
Predictive models and prognostic factors for upper tract urothelial carcinoma: a comprehensive review of the literature.

PubMed

Mbeutcha, Aurélie; Mathieu, Romain; Rouprêt, Morgan; Gust, Kilian M; Briganti, Alberto; Karakiewicz, Pierre I; Shariat, Shahrokh F

2016-10-01

In the context of customized patient care for upper tract urothelial carcinoma (UTUC), decision-making could be facilitated by risk assessment and prediction tools. The aim of this study was to provide a critical overview of existing predictive models and to review emerging promising prognostic factors for UTUC. A literature search of articles published in English from January 2000 to June 2016 was performed using PubMed. Studies on risk group stratification models and predictive tools in UTUC were selected, together with studies on predictive factors and biomarkers associated with advanced-stage UTUC and oncological outcomes after surgery. Various predictive tools have been described for advanced-stage UTUC assessment, disease recurrence and cancer-specific survival (CSS). Most of these models are based on well-established prognostic factors such as tumor stage, grade and lymph node (LN) metastasis, but some also integrate newly described prognostic factors and biomarkers. These new prediction tools seem to reach a high level of accuracy, but they lack external validation and decision-making analysis. The combinations of patient-, pathology- and surgery-related factors together with novel biomarkers have led to promising predictive tools for oncological outcomes in UTUC. However, external validation of these predictive models is a prerequisite before their introduction into daily practice. New models predicting response to therapy are urgently needed to allow accurate and safe individualized management in this heterogeneous disease.
[Open narcissism, covered narcissism and personality disorders as predictive factors of treatment response in an out-patient Drug Addiction Unit].

PubMed

Salazar-Fraile, José; Ripoll-Alandes, Carmen; Bobes, Julio

2010-01-01

Although a high prevalence of personality disorders has been reported in substance users, the literature on their value for predicting treatment response is controversial. On the other hand, while the predictive validity of personality traits as predictors of response to drug abuse or dependence has been studied, research on the validity of narcissistic personality traits is scarce. To study the predictive value of personality disorders, narcissistic personality traits and self-esteem for predicting treatment response. We assessed 78 patients attended at an addiction treatment unit using personality disorder diagnoses and measures of self-esteem, narcissism and covert (hypersensitive) narcissism. These variables were used in a Cox survival model as predictive variables of time to relapse into drug use. Hypersensitive (covert) narcissism and borderline and passive-aggressive personality disorders were risk factors for relapse into drug use, while open narcissism was a protective factor. Self-esteem did not show predictive validity. Personality disorders characterized by impulsivity-instability and passivity-resentfulness show higher risk of relapse into drug abuse. Personality traits characterized by high sensitivity to humiliation increase the risk of relapse, whereas pride and self-confidence are protective factors.
Accuracy of the DIBELS Oral Reading Fluency Measure for Predicting Third Grade Reading Comprehension Outcomes

ERIC Educational Resources Information Center

Roehrig, Alysia D.; Petscher, Yaacov; Nettles, Stephen M.; Hudson, Roxanne F.; Torgesen, Joseph K.

2008-01-01

We evaluated the validity of DIBELS ("Dynamic Indicators of Basic Early Literacy Skills") ORF ("Oral Reading Fluency") for predicting performance on the "Florida Comprehensive Assessment Test" (FCAT-SSS) and "Stanford Achievement Test" (SAT-10) reading comprehension measures. The usefulness of previously…
The Effectiveness of Academic Interest Scales in Predicting College Achievement.

ERIC Educational Resources Information Center

Johnson, Richard W.

The predictive validities of various SVIB academic interest scales were assessed with first semester freshman males at the University of Massachusetts. Both the Rust and Ryan and the Campbell and Johansson scales contributed significantly, albeit modestly, to a multiple correlation coefficient consisting of high school rank and scholastic aptitude…
Evaluation of Urinary Tract Dilation Classification System for Grading Postnatal Hydronephrosis.

PubMed

Hodhod, Amr; Capolicchio, John-Paul; Jednak, Roman; El-Sherif, Eid; El-Doray, Abd El-Alim; El-Sherbiny, Mohamed

2016-03-01

We assessed the reliability and validity of the Urinary Tract Dilation classification system as a new grading system for postnatal hydronephrosis. We retrospectively reviewed charts of patients who presented with hydronephrosis from 2008 to 2013. We included patients diagnosed prenatally and those with hydronephrosis discovered incidentally during the first year of life. We excluded cases involving urinary tract infection, neurogenic bladder and chromosomal anomalies, those associated with extraurinary congenital malformations and those with followup of less than 24 months without resolution. Hydronephrosis was graded postnatally using the Society for Fetal Urology system, and then the management protocol was chosen. All units were regraded using the Urinary Tract Dilation classification system and compared to the Society for Fetal Urology system to assess reliability. Univariate and multivariate analyses were performed to assess the validity of the Urinary Tract Dilation classification system in predicting hydronephrosis resolution and surgical intervention. A total of 490 patients (730 renal units) were eligible to participate. The Urinary Tract Dilation classification system was reliable in the assessment of hydronephrosis (parallel forms 0.92). Hydronephrosis resolved in 357 units (49%), and 86 units (12%) were managed by surgical intervention. The remainder of renal units demonstrated stable or improved hydronephrosis. Multivariate analysis revealed that the likelihood of surgical intervention was predicted independently by Urinary Tract Dilation classification system risk group, while Society for Fetal Urology grades were predictive of likelihood of resolution. The Urinary Tract Dilation classification system is reliable for evaluation of postnatal hydronephrosis and is valid in predicting surgical intervention. Copyright © 2016 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
Australian validation of the Cancer of the Prostate Risk Assessment Post-Surgical score to predict biochemical recurrence after radical prostatectomy.

PubMed

Beckmann, Kerri; O'Callaghan, Michael; Vincent, Andrew; Roder, David; Millar, Jeremy; Evans, Sue; McNeil, John; Moretti, Kim

2018-03-01

The Cancer of the Prostate Risk Assessment Post-Surgical (CAPRA-S) score is a simple post-operative risk assessment tool predicting disease recurrence after radical prostatectomy, which is easily calculated using available clinical data. To be widely useful, risk tools require multiple external validations. We aimed to validate the CAPRA-S score in an Australian multi-institutional population, including private and public settings and reflecting community practice. The study population were all men on the South Australian Prostate Cancer Clinical Outcomes Collaborative Database with localized prostate cancer diagnosed during 1998-2013, who underwent radical prostatectomy without adjuvant therapy (n = 1664). Predictive performance was assessed via Kaplan-Meier and Cox proportional regression analyses, Harrell's Concordance index, calibration plots and decision curve analysis. Biochemical recurrence occurred in 342 (21%) cases. Five-year recurrence-free probabilities for CAPRA-S scores indicating low (0-2), intermediate (3-5) and high risk were 95, 79 and 46%, respectively. The hazard ratio for CAPRA-S score increments was 1.56 (95% confidence interval 1.49-1.64). The Concordance index for 5-year recurrence-free survival was 0.77. The calibration plot showed good correlation between predicted and observed recurrence-free survival across scores. Limitations include the retrospective nature and small numbers with higher CAPRA-S scores. The CAPRA-S score is an accurate predictor of recurrence after radical prostatectomy in our cohort, supporting its utility in the Australian setting. This simple tool can assist in post-surgical selection of patients who would benefit from adjuvant therapy while avoiding morbidity among those less likely to benefit. © 2017 Royal Australasian College of Surgeons.
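The abstract above reports Harrell's Concordance index as its discrimination measure. As a rough illustration only, the index can be sketched in plain Python as below. The function name `harrell_c_index` and the toy inputs are assumptions made for this example; they are not the study's code or data.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C for right-censored data (illustrative sketch).

    A pair (i, j) is comparable when subject i has the shorter follow-up
    time and actually had the event. The pair is concordant when subject i
    also has the higher risk score; tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up times, event flags (1 = recurrence), scores.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_scores = [4, 1, 2, 3]
c_index = harrell_c_index(toy_times, toy_events, toy_scores)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is how a reported value such as 0.77 is usually read.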
Assessing Infant Feeding Attitudes of Expectant Women in a Provincial Population in Canada: Validation of the Iowa Infant Feeding Attitude Scale.

PubMed

Twells, Laurie K; Midodzi, William K; Ludlow, Valerie; Murphy-Goodridge, Janet; Burrage, Lorraine; Gill, Nicole; Halfyard, Beth; Schiff, Rebecca; Newhook, Leigh Anne

2016-08-01

Maternal attitudes to infant feeding are predictive of intent and initiation of breastfeeding. The Iowa Infant Feeding Attitude Scale (IIFAS) has not been validated in the Canadian population. This study was conducted in Newfoundland and Labrador, a Canadian province with low breastfeeding rates. Objectives were to assess the reliability and validity of the IIFAS in expectant mothers; to compare attitudes to infant feeding in urban and rural areas; and to examine whether attitudes are associated with intent to breastfeed. The IIFAS assessment tool was administered to 793 pregnant women. Differences in the total IIFAS scores were compared between urban and rural areas. Reliability and validity analysis was conducted on the IIFAS. The receiver operating characteristic (ROC) of the IIFAS was assessed against mother's intent to breastfeed. The mean ± SD of the total IIFAS score of the overall sample was 64.0 ± 10.4. There were no significant differences in attitudes between urban (63.9 ± 10.5) and rural (64.4 ± 9.9) populations. There were significant differences in total IIFAS scores between women who intend to breastfeed (67.3 ± 8.3) and those who do not (51.6 ± 7.7), regardless of population region. The high value of the area under the curve (AUC) of the ROC (AUC = 0.92) demonstrates excellent ability of the IIFAS to predict intent to breastfeed. The internal consistency of the IIFAS was strong, with a Cronbach's alpha greater than .80 in the overall sample. The IIFAS examined in this provincial population provides a valid and reliable assessment of maternal attitudes toward infant feeding. This tool could be used to identify mothers less likely to breastfeed and to inform health promotion programs. © The Author(s) 2014.
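Internal consistency in the abstract above is summarized with Cronbach's alpha. A minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), follows. The function name `cronbach_alpha` and the toy responses are assumptions for illustration, not IIFAS data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_columns = list(zip(*responses))            # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical toy responses: three respondents, two items each.
toy_responses = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy_responses)
```

The sketch uses sample variances (n - 1 denominator) throughout; the ratio inside the formula is the same as long as item and total variances use the same denominator.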
Work-related stress assessed by a text message single-item stress question.

PubMed

Arapovic-Johansson, B; Wåhlin, C; Kwak, L; Björklund, C; Jensen, I

2017-12-02

Given the prevalence of work stress-related ill-health in the Western world, it is important to find cost-effective, easy-to-use and valid measures which can be used both in research and in practice. To examine the validity and reliability of the single-item stress question (SISQ), distributed weekly by short message service (SMS) and used for measurement of work-related stress. The convergent validity was assessed through associations between the SISQ and subscales of the Job Demand-Control-Support model, the Effort-Reward Imbalance model and scales measuring depression, exhaustion and sleep. The predictive validity was assessed using SISQ data collected through SMS. The reliability was analysed by the test-retest procedure. Correlations between the SISQ and all the subscales except for job strain and esteem reward were significant, ranging from -0.186 to 0.627. The SISQ could also predict sick leave, depression and exhaustion at 12-month follow-up. The analysis on reliability revealed a satisfactory stability with a weighted kappa between 0.804 and 0.868. The SISQ, administered through SMS, can be used for the screening of stress levels in a working population. © The Author 2017. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Development and validation of classifiers and variable subsets for predicting nursing home admission.

PubMed

Nuutinen, Mikko; Leskelä, Riikka-Leena; Suojalehto, Ella; Tirronen, Anniina; Komssi, Vesa

2017-04-13

In previous years a substantial number of studies have identified statistically important predictors of nursing home admission (NHA). However, as far as we know, the analyses have been done at the population level. No prior research has analysed the prediction accuracy of an NHA model for individuals. This study is an analysis of 3056 longer-term home care customers in the city of Tampere, Finland. Data were collected from the records of social and health service usage and the RAI-HC (Resident Assessment Instrument - Home Care) assessment system between January 2011 and September 2015. The aim was to find the most efficient variable subsets to predict NHA for individuals and to validate their accuracy. Variable subsets for predicting NHA were searched using the sequential forward selection (SFS) method, a variable ranking metric and the classifiers of logistic regression (LR), support vector machine (SVM) and Gaussian naive Bayes (GNB). The results were validated using randomly balanced data sets and cross-validation. The primary performance metrics for the classifiers were the prediction accuracy and the average area under the curve (AUC). The LR and GNB classifiers achieved 78% accuracy for predicting NHA. The most important variables were RAI MAPLE (Method for Assigning Priority Levels), functional impairment (RAI IADL, Instrumental Activities of Daily Living), cognitive impairment (RAI CPS, Cognitive Performance Scale), memory disorders (diagnoses G30-G32 and F00-F03), the use of community-based health services and prior hospital use (emergency visits and periods of care). The accuracy of the classifier for individuals was high enough to convince the officials of the city of Tampere to integrate the predictive model based on the findings of this study as a part of the home care information system. Further work needs to be done to evaluate variables that are modifiable and responsive to interventions.
LiDAR based prediction of forest biomass using hierarchical models with spatially varying coefficients

USGS Publications Warehouse

Babcock, Chad; Finley, Andrew O.; Bradford, John B.; Kolka, Randall K.; Birdsey, Richard A.; Ryan, Michael G.

2015-01-01

Many studies and production inventory systems have shown the utility of coupling covariates derived from Light Detection and Ranging (LiDAR) data with forest variables measured on georeferenced inventory plots through regression models. The objective of this study was to propose and assess the use of a Bayesian hierarchical modeling framework that accommodates both residual spatial dependence and non-stationarity of model covariates through the introduction of spatial random effects. We explored this objective using four forest inventory datasets that are part of the North American Carbon Program, each comprising point-referenced measures of above-ground forest biomass and discrete LiDAR. For each dataset, we considered at least five regression model specifications of varying complexity. Models were assessed based on goodness of fit criteria and predictive performance using a 10-fold cross-validation procedure. Results showed that the addition of spatial random effects to the regression model intercept improved fit and predictive performance in the presence of substantial residual spatial dependence. Additionally, in some cases, allowing either some or all regression slope parameters to vary spatially, via the addition of spatial random effects, further improved model fit and predictive performance. In other instances, models showed improved fit but decreased predictive performance—indicating over-fitting and underscoring the need for cross-validation to assess predictive ability. The proposed Bayesian modeling framework provided access to pixel-level posterior predictive distributions that were useful for uncertainty mapping, diagnosing spatial extrapolation issues, revealing missing model covariates, and discovering locally significant parameters.
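The abstract above relies on 10-fold cross-validation to separate in-sample fit from out-of-sample predictive performance. A minimal sketch of how the fold indices might be generated is shown below. The generator name `k_fold_indices`, the seed and the sample size are assumptions for illustration; the study's own implementation is not described in the abstract.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)            # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]       # k nearly equal-sized folds
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# Hypothetical use: 25 plots split into 10 folds. A model would be fitted on
# each training set and scored only on the matching held-out test set.
splits = list(k_fold_indices(25, k=10))
```

Because every observation is held out exactly once, a model that improves fit but predicts worse on the held-out folds is flagged as over-fitted, which is the pattern the abstract describes. Note that random fold assignment ignores spatial structure; spatially blocked folds are a common alternative when observations are spatially dependent.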
Linguistic adaptation and psychometric evaluation of original Oral Health Literacy-Adult Questionnaire (OHL-AQ).

PubMed

Vyas, Shaleen; Nagarajappa, Sandesh; Dasar, Pralhad L; Mishra, Prashant

2016-10-01

Linguistically adapted oral health literacy tools help assess oral health literacy in local populations clearly and understandably. The original Oral Health Literacy Adult Questionnaire (OHL-AQ) was published in English (2013) and consists of 17 items under 4 domains. The present study set out to culturally adapt and validate the Oral Health Literacy Adult Questionnaire in Hindi. We therefore aimed to translate the Oral Health Literacy Adult Questionnaire into Hindi and test its psychometric properties, such as reliability and validity, among primary school teachers. The Oral Health Literacy Adult Questionnaire was translated into the Oral Health Literacy Adult Questionnaire - Hindi Version (OHL-AQ-H) using the World Health Organization recommended translation back-translation protocol. During pre-testing, an expert panel assessed content validity of the questionnaire. Face validity was assessed on a small sample of 10 individuals. A cross-sectional study was conducted (June-July 2015) and the OHL-AQ-H was administered to a convenience sample of 170 primary school teachers. Internal consistency and test-retest reliability were assessed using Cronbach's alpha and the intra-class correlation coefficient (ICC), respectively, with a 2-week interval to ascertain the consistency of questionnaire responses. Predictive validity was tested by comparing OHL-AQ-H scores with clinical indicators such as oral hygiene scores and dental caries scores. Concurrent and discriminant validity were assessed through self-reported oral health and through negative association with sociodemographic variables. The data were analyzed using descriptive statistics, chi-square tests and bivariate logistic regression in SPSS software, version 20, with p<0.05 taken as the significance level. The mean OHL-AQ-H score was 13.58±2.82. ICC and Cronbach's alpha for the Oral Health Literacy Adult Questionnaire - Hindi Version were 0.94 and 0.70, respectively. Comparisons of varying levels of oral health literacy with self-reported oral health established significant concurrent validity (p=0.01). Significant predictive validity was observed between OHL-AQ-H scores and clinical parameters such as oral hygiene status (p=0.005) and dentition status (p=0.001). The translated and culturally adapted Oral Health Literacy Adult Questionnaire - Hindi Version showed good reliability and validity among primary school teachers for assessing oral health literacy in the Hindi-speaking population. Hence, improving OHL levels and implementing education-oriented policies can improve the quality of life.
Memory Binding Test Predicts Incident Dementia: Results from the Einstein Aging Study.

PubMed

Mowrey, Wenzhu B; Lipton, Richard B; Katz, Mindy J; Ramratan, Wendy S; Loewenstein, David A; Zimmerman, Molly E; Buschke, Herman

2018-01-01

The Memory Binding Test (MBT) demonstrated good cross-sectional discriminative validity and predicted incident aMCI. To assess whether the MBT predicts incident dementia better than a conventional list learning test in a longitudinal community-based study. As a sub-study in the Einstein Aging Study, 309 participants age≥70 initially free of dementia were administered the MBT and followed annually for incident dementia for up to 13 years. Based on previous work, poor memory binding was defined using an optimal empirical cut-score of≤17 on the binding measure of the MBT, Total Items in the Paired condition (TIP). Cox proportional hazards models were used to assess predictive validity adjusting for covariates. We compared the predictive validity of MBT TIP to that of the free and cued selective reminding test free recall score (FCSRT-FR; cut-score:≤24) and the single list recall measure of the MBT, Cued Recalled from List 1 (CR-L1; cut-score:≤12). Thirty-five of 309 participants developed incident dementia. When assessing each test alone, the hazard ratio (HR) for dementia was significant for MBT TIP (HR = 8.58, 95% CI: (3.58, 20.58), p < 0.0001), FCSRT-FR (HR = 4.19, 95% CI: (1.94, 9.04), p = 0.0003) and MBT CR-L1 (HR = 2.91, 95% CI: (1.37, 6.18), p = 0.006). MBT TIP remained a significant predictor of dementia (p = 0.0002) when adjusting for FCSRT-FR or CR-L1. Older adults with poor memory binding as measured by the MBT TIP were at increased risk for incident dementia. This measure outperforms conventional episodic memory measures of free and cued recall, supporting the memory binding hypothesis.
Applicability of "MEGA♪" to Sexually Abusive Youth with Low Intellectual Functioning

ERIC Educational Resources Information Center

Miccio-Fonseca, L. C.; Rasmussen, Lucinda A.

2013-01-01

The study explored the predictive validity of "Multiplex Empirically Guided Inventory of Ecological Aggregates for Assessing Sexually Abusive Children and Adolescents (Ages 4 to 19)" ("MEGA♪"; Miccio-Fonseca, 2006b), a comprehensive developmentally sensitive risk assessment outcome tool. "MEGA♪" assesses risk for coarse…
Ambulatory Assessment

PubMed Central

Trull, Timothy J.; Ebner-Priemer, Ulrich

2014-01-01

Ambulatory assessment (AA) covers a wide range of assessment methods to study people in their natural environment, including self-report, observational, and biological/physiological/behavioral. AA methods minimize retrospective biases while gathering ecologically valid data from patients’ everyday life in real time or near real time. Here, we report on the major characteristics of AA, and we provide examples of applications of AA in clinical psychology (a) to investigate mechanisms and dynamics of symptoms, (b) to predict the future recurrence or onset of symptoms, (c) to monitor treatment effects, (d) to predict treatment success, (e) to prevent relapse, and (f) as interventions. In addition, we present and discuss the most pressing and compelling future AA applications: technological developments (the smartphone), improved ecological validity of laboratory results by combined lab-field studies, and investigating gene-environment interactions. We conclude with a discussion of acceptability, compliance, privacy, and ethical issues. PMID:23157450
Predicting risk and outcomes for frail older adults: an umbrella review of frailty screening tools

PubMed Central

Apóstolo, João; Cooke, Richard; Bobrowicz-Campos, Elzbieta; Santana, Silvina; Marcucci, Maura; Cano, Antonio; Vollenbroek-Hutten, Miriam; Germini, Federico; Holland, Carol

2017-01-01

EXECUTIVE SUMMARY. Background: A scoping search identified systematic reviews on diagnostic accuracy and predictive ability of frailty measures in older adults. In most cases, research was confined to specific assessment measures related to a specific clinical model. Objectives: To summarize the best available evidence from systematic reviews in relation to reliability, validity, diagnostic accuracy and predictive ability of frailty measures in older adults. Inclusion criteria: Population: Older adults aged 60 years or older recruited from community, primary care, long-term residential care and hospitals. Index test: Available frailty measures in older adults. Reference test: Cardiovascular Health Study phenotype model, the Canadian Study of Health and Aging cumulative deficit model, Comprehensive Geriatric Assessment or other reference tests. Diagnosis of interest: Frailty defined as an age-related state of decreased physiological reserves characterized by an increased risk of poor clinical outcomes. Types of studies: Quantitative systematic reviews. Search strategy: A three-step search strategy was utilized to find systematic reviews, available in English, published between January 2001 and October 2015. Methodological quality: Assessed by two independent reviewers using the Joanna Briggs Institute critical appraisal checklist for systematic reviews and research synthesis. Data extraction: Two independent reviewers extracted data using the standardized data extraction tool designed for umbrella reviews. Data synthesis: Data were only presented in a narrative form due to the heterogeneity of included reviews. Results: Five reviews with a total of 227,381 participants were included in this umbrella review. Two reviews focused on reliability, validity and diagnostic accuracy; two examined predictive ability for adverse health outcomes; and one investigated validity, diagnostic accuracy and predictive ability. In total, 26 questionnaires and brief assessments and eight frailty indicators were analyzed, most of which were applied to community-dwelling older people. The Frailty Index was examined in almost all these dimensions, with the exception of reliability, and its diagnostic and predictive characteristics were shown to be satisfactory. Gait speed showed high sensitivity, but only moderate specificity, and excellent predictive ability for future disability in activities of daily living. The Tilburg Frailty Indicator was shown to be a reliable and valid measure for frailty screening, but its diagnostic accuracy was not evaluated. Screening Letter, Timed-up-and-go test and PRISMA 7 (Program of Research to Integrate Services for the Maintenance of Autonomy) demonstrated high sensitivity and moderate specificity for identifying frailty. In general, low physical activity, variously measured, was one of the most powerful predictors of future decline in activities of daily living. Conclusion: Only a few frailty measures seem to be demonstrably valid, reliable and diagnostically accurate, and have good predictive ability. Among them, the Frailty Index and gait speed emerged as the most useful in routine care and community settings. However, none of the included systematic reviews provided responses that met all of our research questions on their own and there is a need for studies that could fill this gap, covering all these issues within the same study. Nevertheless, it was clear that no suitable tool for assessing frailty appropriately in emergency departments was identified. PMID:28398987
A diagnostic model for the detection of sensitization to wheat allergens was developed and validated in bakery workers.

PubMed

Suarthana, Eva; Vergouwe, Yvonne; Moons, Karel G; de Monchy, Jan; Grobbee, Diederick; Heederik, Dick; Meijer, Evert

2010-09-01

To develop and validate a prediction model to detect sensitization to wheat allergens in bakery workers. The prediction model was developed in 867 Dutch bakery workers (development set, prevalence of sensitization 13%) and included questionnaire items (candidate predictors). First, principal component analysis was used to reduce the number of candidate predictors. Then, multivariable logistic regression analysis was used to develop the model. Internal validation and extent of optimism were assessed with bootstrapping. External validation was studied in 390 independent Dutch bakery workers (validation set, prevalence of sensitization 20%). The prediction model contained the predictors nasoconjunctival symptoms, asthma symptoms, shortness of breath and wheeze, work-related upper and lower respiratory symptoms, and traditional bakery. The model showed good discrimination with an area under the receiver operating characteristic (ROC) curve of 0.76 (and 0.75 after internal validation). Application of the model in the validation set gave a reasonable discrimination (ROC area=0.69) and good calibration after a small adjustment of the model intercept. A simple model with questionnaire items only can be used to stratify bakers according to their risk of sensitization to wheat allergens. Its use may increase the cost-effectiveness of (subsequent) medical surveillance.
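Discrimination in the abstract above is expressed as the area under the ROC curve. One way to read that number is as the probability that a randomly chosen sensitized worker receives a higher predicted risk than a randomly chosen non-sensitized worker. The sketch below computes the area directly from that pairwise definition. The function name `roc_auc` and the toy scores are assumptions for illustration, not study data.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison (illustrative sketch).

    Counts the share of (positive, negative) pairs in which the positive
    case has the higher score; tied scores contribute 0.5.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predicted risks for sensitized and non-sensitized workers.
toy_positive = [0.9, 0.8, 0.4]
toy_negative = [0.7, 0.3, 0.2]
auc = roc_auc(toy_positive, toy_negative)
```

The pairwise form is quadratic in sample size, which is fine for a sketch; rank-based formulas give the same value more efficiently on large data sets.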
British Isles Lupus Assessment Group 2004 index is valid for assessment of disease activity in systemic lupus erythematosus

PubMed Central

Yee, Chee-Seng; Farewell, Vernon; Isenberg, David A; Rahman, Anisur; Teh, Lee-Suan; Griffiths, Bridget; Bruce, Ian N; Ahmad, Yasmeen; Prabu, Athiveeraramapandian; Akil, Mohammed; McHugh, Neil; D'Cruz, David; Khamashta, Munther A; Maddison, Peter; Gordon, Caroline

2007-01-01

Objective: To determine the construct and criterion validity of the British Isles Lupus Assessment Group 2004 (BILAG-2004) index for assessing disease activity in systemic lupus erythematosus (SLE). Methods: Patients with SLE were recruited into a multicenter cross-sectional study. Data on SLE disease activity (scores on the BILAG-2004 index, Classic BILAG index, and Systemic Lupus Erythematosus Disease Activity Index 2000 [SLEDAI-2K]), investigations, and therapy were collected. Overall BILAG-2004 and overall Classic BILAG scores were determined by the highest score achieved in any of the individual systems in the respective index. Erythrocyte sedimentation rates (ESRs), C3 levels, C4 levels, anti–double-stranded DNA (anti-dsDNA) levels, and SLEDAI-2K scores were used in the analysis of construct validity, and increase in therapy was used as the criterion for active disease in the analysis of criterion validity. Statistical analyses were performed using ordinal logistic regression for construct validity and logistic regression for criterion validity. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Results: Of the 369 patients with SLE, 92.7% were women, 59.9% were white, 18.4% were Afro-Caribbean and 18.4% were South Asian. Their mean ± SD age was 41.6 ± 13.2 years and mean disease duration was 8.8 ± 7.7 years. More than 1 assessment was obtained on 88.6% of the patients, and a total of 1,510 assessments were obtained. Increasing overall scores on the BILAG-2004 index were associated with increasing ESRs, decreasing C3 levels, decreasing C4 levels, elevated anti-dsDNA levels, and increasing SLEDAI-2K scores (all P < 0.01). Increase in therapy was observed more frequently in patients with overall BILAG-2004 scores reflecting higher disease activity. Scores indicating active disease (overall BILAG-2004 scores of A and B) were significantly associated with increase in therapy (odds ratio [OR] 19.3, P < 0.01). The BILAG-2004 and Classic BILAG indices had comparable sensitivity, specificity, PPV, and NPV. Conclusion: These findings show that the BILAG-2004 index has construct and criterion validity. PMID:18050213
Identifying a predictive model for response to atypical antipsychotic monotherapy treatment in south Indian schizophrenia patients.

PubMed

Gupta, Meenal; Moily, Nagaraj S; Kaur, Harpreet; Jajodia, Ajay; Jain, Sanjeev; Kukreti, Ritushree

2013-08-01

Atypical antipsychotic (AAP) drugs are the preferred choice of treatment for schizophrenia patients. Patients who do not show favorable response to AAP monotherapy are subjected to random prolonged therapeutic treatment with AAP multitherapy, typical antipsychotics or a combination of both. Therefore, prior identification of patients' response to drugs can be an important step in providing efficacious and safe therapeutic treatment. We thus attempted to elucidate a genetic signature which could predict patients' response to AAP monotherapy. Our logistic regression analyses indicated the probability that 76% patients carrying combination of four SNPs will not show favorable response to AAP therapy. The robustness of this prediction model was assessed using repeated 10-fold cross validation method, and the results across n-fold cross-validations (mean accuracy=71.91%; 95%CI=71.47-72.35) suggest high accuracy and reliability of the prediction model. Further validations of these results in large sample sets are likely to establish their clinical applicability. Copyright © 2013 Elsevier Inc. All rights reserved.

[Validity and concordance of electronic health records in primary care (AP-Madrid) for surveillance of diabetes mellitus. PREDIMERC study].

PubMed

Gil Montalbán, Elisa; Ortiz Marrón, Honorato; López-Gay Lucio-Villegas, Dulce; Zorrilla Torrás, Belén; Arrieta Blanco, Francisco; Nogales Aguado, Pedro

2014-01-01

To assess the validity and concordance of diabetes data in the electronic health records of primary care (Madrid-PC) by comparing with those from the PREDIMERC study. The sensitivity, specificity, positive predictive value, negative predictive value and kappa index of diabetes cases recorded in the health records of Madrid-PC were calculated by using data from PREDIMERC as the gold standard. The prevalence of diabetes was also determined according to each data source. The sensitivity of diabetes recorded in Madrid-PC was 74%, the specificity was 98.8%, the positive predictive value was 87.9%, the negative predictive value was 97.3%, and the kappa index was 0.78. The prevalence of diabetes recorded in Madrid-PC was 6.7% versus 8.1% by PREDIMERC, where known diabetes was 6.3%. The electronic health records of primary care are a valid source for epidemiological surveillance of diabetes in Madrid. Copyright © 2013 SESPAS. Published by Elsevier Espana. All rights reserved.
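The validity measures named in the abstract above (sensitivity, specificity, positive and negative predictive value, and the kappa index) all derive from a 2x2 table that cross-classifies the record-based diagnosis against the gold standard. The sketch below shows the arithmetic. The function name `two_by_two_metrics` and the counts are hypothetical and chosen only to make the calculation easy to follow; they are not the Madrid-PC or PREDIMERC figures.

```python
def two_by_two_metrics(tp, fp, fn, tn):
    """Validity and agreement measures from a 2x2 table (illustrative sketch).

    tp: recorded as diabetic and diabetic by the gold standard
    fp: recorded as diabetic but not diabetic by the gold standard
    fn: not recorded but diabetic by the gold standard
    tn: not recorded and not diabetic by the gold standard
    """
    n = tp + fp + fn + tn
    observed_agreement = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected_agreement = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed_agreement - expected_agreement) / (1 - expected_agreement),
    }


# Hypothetical counts, not taken from the study.
metrics = two_by_two_metrics(tp=40, fp=10, fn=10, tn=40)
```

Sensitivity and specificity describe the records relative to the gold standard, while the predictive values also depend on prevalence, which is why a study can report high specificity and NPV alongside a more modest sensitivity.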
Brief comprehensive quality of life assessment after stroke: the Assessment of Quality of Life instrument in the North East Melbourne Stroke Incidence Study (NEMESIS).

PubMed

Sturm, Jonathan W; Osborne, Richard H; Dewey, Helen M; Donnan, Geoffrey A; Macdonell, Richard A L; Thrift, Amanda G

2002-12-01

Generic utility health-related quality of life instruments are useful in assessing stroke outcome because they facilitate a broader description of the disease and outcomes, allow comparisons between diseases, and can be used in cost-benefit analysis. The aim of this study was to validate the Assessment of Quality of Life (AQoL) instrument in a stroke population. Ninety-three patients recruited from the community-based North East Melbourne Stroke Incidence Study between July 13, 1996, and April 30, 1997, were interviewed 3 months after stroke. Validity of the AQoL was assessed by examining associations between the AQoL and comparator instruments: the Medical Outcomes Short-Form Health Survey (SF-36); London Handicap Scale; Barthel Index; National Institutes of Health Stroke Scale; and Irritability, Depression, Anxiety scale. Sensitivity of the AQoL was assessed by comparing AQoL scores from groups of patients categorized by severity of impairment and disability and with total anterior circulation syndrome (TACS) versus non-TACS. Predictive validity was assessed by examining the association between 3-month AQoL scores and outcomes of death or institutionalization 12 months after stroke. Overall AQoL utility scores and individual dimension scores were most highly correlated with relevant scales on the comparator instruments. AQoL scores clearly differentiated between patients in categories of severity of impairment and disability and between patients with TACS and non-TACS. AQoL scores at 3 months after stroke predicted death and institutionalization at 12 months. The AQoL demonstrated strong psychometric properties and appears to be a valid and sensitive measure of health-related QoL after stroke.
The in-training examination: an analysis of its predictive value on performance on the general pediatrics certification examination.

PubMed

Althouse, Linda A; McGuinness, Gail A

2008-09-01

This study investigates the predictive validity of the In-Training Examination (ITE). Although studies have confirmed the predictive validity of ITEs in other medical specialties, no study has been done for general pediatrics. Each year, residents in accredited pediatric training programs take the ITE as a self-assessment instrument. The ITE is similar to the American Board of Pediatrics General Pediatrics Certifying Examination. First-time takers of the certifying examination over a 5-year period who took at least 1 ITE examination were included in the sample. Regression models analyzed the predictive value of the ITE. The predictive power of the ITE in the first training year is minimal. However, the predictive power of the ITE increases each year, providing the greatest power in the third year of training. Even though ITE scores provide information regarding the likelihood of passing the certification examination, the data should be used with caution, particularly in the first training year. Other factors also must be considered when predicting performance on the certification examination. This study continues to support the ITE as an assessment tool for program directors, as well as a means of providing residents with feedback regarding their acquisition of pediatric knowledge.
Predicting acute contact toxicity of pesticides in honeybees (Apis mellifera) through a k-nearest neighbor model.

PubMed

Como, F; Carnesecchi, E; Volani, S; Dorne, J L; Richardson, J; Bassan, A; Pavan, M; Benfenati, E

2017-01-01

Ecological risk assessment of plant protection products (PPPs) requires an understanding of both the toxicity and the extent of exposure to assess risks for a range of taxa of ecological importance including target and non-target species. Non-target species such as honey bees (Apis mellifera), solitary bees and bumble bees are of utmost importance because of their vital ecological services as pollinators of wild plants and crops. To improve risk assessment of PPPs in bee species, computational models predicting the acute and chronic toxicity of a range of PPPs and contaminants can play a major role in providing structural and physico-chemical properties for the prioritisation of compounds of concern and future risk assessments. Over the last three decades, scientific advisory bodies and the research community have developed toxicological databases and quantitative structure-activity relationship (QSAR) models that are proving invaluable to predict toxicity using historical data and reduce animal testing. This paper describes the development and validation of a k-Nearest Neighbor (k-NN) model using in-house software for the prediction of acute contact toxicity of pesticides on honey bees. Acute contact toxicity data were collected from different sources for 256 pesticides, which were divided into training and test sets. The k-NN models were validated with good prediction, with an accuracy of 70% for all compounds and of 65% for highly toxic compounds, suggesting that they might reliably predict the toxicity of structurally diverse pesticides and could be used to screen and prioritise new pesticides. Copyright © 2016 Elsevier Ltd. All rights reserved.
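The abstract above describes a k-nearest neighbor classifier built with in-house software, whose similarity measure and descriptors are not given here. Purely as a generic illustration of the technique, the sketch below classifies a query by majority vote among its k closest training points under Euclidean distance. The function name `knn_predict`, the two-dimensional descriptors and the class labels are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    if k > len(train_points):
        raise ValueError("k cannot exceed the number of training points")
    # Pair each training point's distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical descriptors (two numeric features per compound) and toxicity classes.
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
toy_labels = ["low", "low", "high", "high", "high"]
prediction = knn_predict(toy_points, toy_labels, query=(0.2, 0.1), k=3)
```

In QSAR work the distance is often a chemical similarity measure over structural descriptors rather than plain Euclidean distance, so the choice made here is an assumption of the sketch.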
In silico prediction of Tetrahymena pyriformis toxicity for diverse industrial chemicals with substructure pattern recognition and machine learning methods.

PubMed

Cheng, Feixiong; Shen, Jie; Yu, Yue; Li, Weihua; Liu, Guixia; Lee, Philip W; Tang, Yun

2011-03-01

There is an increasing need for the rapid safety assessment of chemicals by both industries and regulatory agencies throughout the world. In silico techniques are practical alternatives in environmental hazard assessment, especially for addressing the persistence, bioaccumulation and toxicity potential of organic chemicals. Tetrahymena pyriformis toxicity is often used as a toxic endpoint. In this study, 1571 diverse unique chemicals were collected from the literature, forming the largest diverse data set for T. pyriformis toxicity. Classification predictive models of T. pyriformis toxicity were developed by substructure pattern recognition and different machine learning methods, including support vector machine (SVM), C4.5 decision tree, k-nearest neighbors and random forest. The results of a 5-fold cross-validation showed that the SVM method performed better than the other algorithms. The overall predictive accuracy of the SVM classification model with a radial basis function kernel was 92.2% for the 5-fold cross-validation and 92.6% for the external validation set. Furthermore, several representative substructure patterns for characterizing T. pyriformis toxicity were also identified via information gain analysis. Copyright © 2010 Elsevier Ltd. All rights reserved.
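The 5-fold cross-validation mentioned above can be sketched in a few lines. This is a minimal, standard-library illustration only: the helper names `k_fold_indices` and `cross_val_accuracy`, the toy one-dimensional data, and the placeholder `midpoint_classifier` are all assumptions, and the placeholder stands in for the SVM used in the study rather than reproducing it.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Split range(n) into k disjoint, near-equal test folds after shuffling."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(X, y, fit_predict, k=5, seed=0):
    """Average held-out accuracy over k folds.

    fit_predict(train_X, train_y, test_X) must return predicted labels.
    """
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return sum(scores) / k

# Placeholder classifier (hypothetical): threshold at the midpoint of class means.
def midpoint_classifier(train_X, train_y, test_X):
    toxic = [x for x, label in zip(train_X, train_y) if label == 1]
    safe = [x for x, label in zip(train_X, train_y) if label == 0]
    cut = (sum(toxic) / len(toxic) + sum(safe) / len(safe)) / 2
    return [1 if x > cut else 0 for x in test_X]

# Toy data, not from the study.
X = [0.1, 0.2, 0.3, 0.4, 0.5, 1.1, 1.2, 1.3, 1.4, 1.5]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(cross_val_accuracy(X, y, midpoint_classifier, k=5))
```

Each observation is held out exactly once, so the averaged score estimates performance on data the model did not see during fitting. The external validation set reported in the abstract is a separate check on compounds excluded from all folds.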
A Validation Study of the School Attitude Assessment Survey.

ERIC Educational Resources Information Center

McCoach, D. Betsy

2002-01-01

This article describes the development of the School Attitude Assessment Survey (SAAS), an instrument that measures self-concept, self-motivation and self-regulation, attitude toward school, and peer attitudes to predict the academic achievement of adolescents. (Contains 43 references and 5 tables.) (Author)
Assessing the accuracy and stability of variable selection methods for random forest modeling in ecology.

PubMed

Fox, Eric W; Hill, Ryan A; Leibowitz, Scott G; Olsen, Anthony R; Thornbrugh, Darren J; Weber, Marc H

2017-07-01

Random forest (RF) modeling has emerged as an important statistical learning method in ecology due to its exceptional predictive performance. However, for large and complex ecological data sets, there is limited guidance on variable selection methods for RF modeling. Typically, either a preselected set of predictor variables are used or stepwise procedures are employed which iteratively remove variables according to their importance measures. This paper investigates the application of variable selection methods to RF models for predicting probable biological stream condition. Our motivating data set consists of the good/poor condition of n = 1365 stream survey sites from the 2008/2009 National Rivers and Stream Assessment, and a large set (p = 212) of landscape features from the StreamCat data set as potential predictors. We compare two types of RF models: a full variable set model with all 212 predictors and a reduced variable set model selected using a backward elimination approach. We assess model accuracy using RF's internal out-of-bag estimate, and a cross-validation procedure with validation folds external to the variable selection process. We also assess the stability of the spatial predictions generated by the RF models to changes in the number of predictors and argue that model selection needs to consider both accuracy and stability. The results suggest that RF modeling is robust to the inclusion of many variables of moderate to low importance. We found no substantial improvement in cross-validated accuracy as a result of variable reduction. Moreover, the backward elimination procedure tended to select too few variables and exhibited numerous issues such as upwardly biased out-of-bag accuracy estimates and instabilities in the spatial predictions. We use simulations to further support and generalize results from the analysis of real data. 
A main purpose of this work is to elucidate issues of model selection bias and instability to ecologists interested in using RF to develop predictive models with large environmental data sets.
The 6-min push test is reliable and predicts low fitness in spinal cord injury.

PubMed

Cowan, Rachel E; Callahan, Morgan K; Nash, Mark S

2012-10-01

The objective of this study is to assess 6-min push test (6MPT) reliability, determine whether the 6MPT is sensitive to fitness differences, and assess if 6MPT distance predicts fitness level in persons with spinal cord injury (SCI) or disease. Forty individuals with SCI who could self-propel a manual wheelchair completed an incremental arm crank peak oxygen consumption assessment and two 6MPTs across 3 d (37% tetraplegia (TP), 63% paraplegia (PP), 85% men, 70% white, 63% Hispanic, mean age = 34 ± 10 yr, mean duration of injury = 13 ± 10 yr, and mean body mass index = 24 ± 5 kg/m²). Intraclass correlation and Bland-Altman plots assessed 6MPT distance (m) reliability. Mann-Whitney U test compared 6MPT distance (m) of high and low fitness groups for TP and PP. The fitness status prediction was developed using N = 30 and validated in N = 10 (validation group (VG)). A nonstatistical prediction approach, below or above a threshold distance (TP = 445 m and PP = 604 m), was validated statistically by binomial logistic regression. Accuracy, sensitivity, and specificity were computed to evaluate the threshold approach. Intraclass correlation coefficients exceeded 0.90 for the whole sample and the TP/PP subsets. High fitness persons propelled farther than low fitness persons for both TP/PP (both P < 0.05). Binomial logistic regression (P < 0.008) predicted the same fitness levels in the VG as the threshold approach. In the VG, overall accuracy was 70%. Eighty-six percent of low fitness persons were correctly identified (sensitivity), and 33% of high fitness persons were correctly identified (specificity). The 6MPT may be a useful tool for SCI clinicians and researchers. 6MPT distance demonstrates excellent reliability and is sensitive to differences in fitness level. 6MPT distances less than a threshold distance may be an effective approach to identify low fitness in persons with SCI.
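The threshold rule and the three summary statistics reported above can be written out as a short sketch. The thresholds (445 m for TP, 604 m for PP) come from the abstract; the function names and the confusion-matrix counts are assumptions. In particular, the counts 6/7 and 1/3 are merely one split of the 10-person validation group that is consistent with the reported 86%, 33%, and 70%; the abstract does not state them.

```python
def screening_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # low-fitness persons correctly flagged
        "specificity": tn / (tn + fp),   # high-fitness persons correctly cleared
    }

def classify_low_fitness(distance_m, injury_level):
    """Flag low fitness when 6MPT distance falls below the reported threshold."""
    thresholds = {"TP": 445, "PP": 604}  # metres, as reported in the abstract
    return distance_m < thresholds[injury_level]

# Assumed counts for the validation group: 6 of 7 low-fitness and 1 of 3
# high-fitness persons classified correctly. Inferred from the percentages only.
m = screening_metrics(tp=6, fn=1, tn=1, fp=2)
print({name: round(value, 2) for name, value in m.items()})
print(classify_low_fitness(400, "TP"), classify_low_fitness(650, "PP"))
```

With a validation group this small, each misclassified person shifts specificity by a third, which is one reason the abstract's "may be" wording is appropriate.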
The Depression, Anxiety and Stress Scale (DASS-21) as a Screener for Depression in Substance Use Disorder Inpatients: A Pilot Study.

PubMed

Beaufort, Ilse N; De Weert-Van Oene, Gerdien H; Buwalda, Victor A J; de Leeuw, J Rob J; Goudriaan, Anna E

2017-01-01

Depression is a common co-morbid disorder in substance use disorder (SUD) patients. Hence, valid instruments are needed to screen for depression in this subpopulation. In this study, the predictive validity of the Depression, Anxiety and Stress Scale (DASS-21) for the presence of a depressive disorder was investigated in SUD inpatients. Furthermore, differences between DASS-21 scores at intake and those recorded one week after inpatient detoxification were assessed in order to determine the measurement point of the assessment of the DASS-21 leading to the best predictive validity. The DASS-21 was administered to 47 patients at intake and shortly after inpatient detoxification. The results of the DASS-21 were compared to the Mini International Neuropsychiatric Interview (MINI), which served as the gold standard. Levels of sensitivity and specificity of 78-89% and 71-76% were found for the DASS-21 assessed after detoxification, satisfactorily predicting depression as diagnosed with the MINI. Total DASS-21 scores as well as the DASS subscale for depression were significantly reduced at the second measurement, compared to the DASS at intake. We conclude that the DASS-21 may be a suitable instrument to screen for depressive disorders in SUD patients when administered (shortly) after detoxification. Future research is needed to support this conclusion. © 2017 The Author(s) Published by S. Karger AG, Basel.
Validity of the Brazilian version of the Godin-Shephard Leisure-Time Physical Activity Questionnaire.

PubMed

João, Thaís Moreira São; Rodrigues, Roberta Cunha Matheus; Gallani, Maria Cecília Bueno Jayme; Miura, Cinthya Tamie Passos; Domingues, Gabriela de Barros Leite; Amireault, Steve; Godin, Gaston

2015-09-01

This study provides evidence of construct validity for the Brazilian version of the Godin-Shephard Leisure-Time Physical Activity Questionnaire (GSLTPAQ), a 1-item instrument used among 236 participants referred for cardiopulmonary exercise testing. The Baecke Habitual Physical Activity Questionnaire (Baecke-HPA) was used to evaluate convergent and divergent validity. The self-reported measure of walking (QCAF) evaluated the convergent validity. Cardiorespiratory fitness assessed convergent validity by the Veterans Specific Activity Questionnaire (VSAQ), peak measured (VO2peak) and maximum predicted (VO2pred) oxygen uptake. Partial adjusted correlation coefficients between the GSLTPAQ, Baecke-HPA, QCAF, VO2pred and VSAQ provided evidence for convergent validity; while divergent validity was supported by the absence of correlations between the GSLTPAQ and the Occupational Physical Activity domain (Baecke-HPA). The GSLTPAQ presents level 3 of evidence of construct validity and may be useful to assess leisure-time physical activity among patients with cardiovascular disease and healthy individuals.
Development and validation of a reading-related assessment battery in Malay for the purpose of dyslexia assessment.

PubMed

Lee, Lay Wah

2008-06-01

Malay is an alphabetic language with transparent orthography. A Malay reading-related assessment battery which was conceptualised based on the International Dyslexia Association definition of dyslexia was developed and validated for the purpose of dyslexia assessment. The battery consisted of ten tests: Letter Naming, Word Reading, Non-word Reading, Spelling, Passage Reading, Reading Comprehension, Listening Comprehension, Elision, Rapid Letter Naming and Digit Span. Content validity was established by expert judgment. Concurrent validity was obtained using the schools' language tests as criterion. Evidence of predictive and construct validity was obtained through regression analyses and factor analyses. Phonological awareness was the most significant predictor of word-level literacy skills in Malay, with rapid naming making independent secondary contributions. Decoding and listening comprehension made separate contributions to reading comprehension, with decoding as the more prominent predictor. Factor analysis revealed four factors: phonological decoding, phonological naming, comprehension and verbal short-term memory. In conclusion, despite differences in orthography, there are striking similarities in the theoretical constructs of reading-related tasks in Malay and in English.
Examining the Validity of Self-Report: Middle-Level Singers' Ability to Predict and Assess Their Sight-Singing Skills

ERIC Educational Resources Information Center

Darrow, Alice-Ann; Marsh, Kerry

2006-01-01

The purpose of the present study was to determine choral students' ability to predict and evaluate their sight-singing skills. Participants were asked to assign a rating based on how well they predicted they would sight-sing five musical examples. Following the singing of each example, participants were asked to evaluate their sight-singing…
Review and assessment of turbulence models for hypersonic flows

NASA Astrophysics Data System (ADS)

Roy, Christopher J.; Blottner, Frederick G.

2006-10-01

Accurate aerodynamic prediction is critical for the design and optimization of hypersonic vehicles. Turbulence modeling remains a major source of uncertainty in the computational prediction of aerodynamic forces and heating for these systems. The first goal of this article is to update the previous comprehensive review of hypersonic shock/turbulent boundary-layer interaction experiments published in 1991 by Settles and Dodson (Hypersonic shock/boundary-layer interaction database. NASA CR 177577, 1991). In their review, Settles and Dodson developed a methodology for assessing experiments appropriate for turbulence model validation and critically surveyed the existing hypersonic experiments. We limit the scope of our current effort by considering only two-dimensional (2D)/axisymmetric flows in the hypersonic flow regime where calorically perfect gas models are appropriate. We extend the prior database of recommended hypersonic experiments (on four 2D and two 3D shock-interaction geometries) by adding three new geometries. The first two geometries, the flat plate/cylinder and the sharp cone, are canonical, zero-pressure gradient flows which are amenable to theory-based correlations, and these correlations are discussed in detail. The third geometry added is the 2D shock impinging on a turbulent flat plate boundary layer. The current 2D hypersonic database for shock-interaction flows thus consists of nine experiments on five different geometries. The second goal of this study is to review and assess the validation usage of various turbulence models on the existing experimental database. Here we limit the scope to one- and two-equation turbulence models where integration to the wall is used (i.e., we omit studies involving wall functions). A methodology for validating turbulence models is given, followed by an extensive evaluation of the turbulence models on the current hypersonic experimental database. 
A total of 18 one- and two-equation turbulence models are reviewed, and results of turbulence model assessments for the six models that have been extensively applied to the hypersonic validation database are compiled and presented in graphical form. While some of the turbulence models do provide reasonable predictions for the surface pressure, the predictions for surface heat flux are generally poor, and often in error by a factor of four or more. In the vast majority of the turbulence model validation studies we review, the authors fail to adequately address the numerical accuracy of the simulations (i.e., discretization and iterative error) and the sensitivities of the model predictions to freestream turbulence quantities or near-wall y+ mesh spacing. We recommend new hypersonic experiments be conducted which (1) measure not only surface quantities but also mean and fluctuating quantities in the interaction region and (2) provide careful estimates of both random experimental uncertainties and correlated bias errors for the measured quantities and freestream conditions. For the turbulence models, we recommend that a wide-range of turbulence models (including newer models) be re-examined on the current hypersonic experimental database, including the more recent experiments. Any future turbulence model validation efforts should carefully assess the numerical accuracy and model sensitivities. In addition, model corrections (e.g., compressibility corrections) should be carefully examined for their effects on a standard, low-speed validation database. Finally, as new experiments or direct numerical simulation data become available with information on mean and fluctuating quantities, they should be used to improve the turbulence models and thus increase their predictive capability.
Regression models for predicting peak and continuous three-dimensional spinal loads during symmetric and asymmetric lifting tasks.

PubMed

Fathallah, F A; Marras, W S; Parnianpour, M

1999-09-01

Most biomechanical assessments of spinal loading during industrial work have focused on estimating peak spinal compressive forces under static and sagittally symmetric conditions. The main objective of this study was to explore the potential of feasibly predicting three-dimensional (3D) spinal loading in industry from various combinations of trunk kinematics, kinetics, and subject-load characteristics. The study used spinal loading, predicted by a validated electromyography-assisted model, from 11 male participants who performed a series of symmetric and asymmetric lifts. Three classes of models were developed: (a) models using workplace, subject, and trunk motion parameters as independent variables (kinematic models); (b) models using workplace, subject, and measured moments variables (kinetic models); and (c) models incorporating workplace, subject, trunk motion, and measured moments variables (combined models). The results showed that peak 3D spinal loading during symmetric and asymmetric lifting were predicted equally well using all three types of regression models. Continuous 3D loading was predicted best using the combined models. When the use of such models is infeasible, the kinematic models can provide adequate predictions. Finally, lateral shear forces (peak and continuous) were consistently underestimated using all three types of models. The study demonstrated the feasibility of predicting 3D loads on the spine under specific symmetric and asymmetric lifting tasks without the need for collecting EMG information. However, further validation and development of the models should be conducted to assess and extend their applicability to lifting conditions other than those presented in this study. Actual or potential applications of this research include exposure assessment in epidemiological studies, ergonomic intervention, and laboratory task assessment.
Statistical Methods for Rapid Aerothermal Analysis and Design Technology: Validation

NASA Technical Reports Server (NTRS)

DePriest, Douglas; Morgan, Carolyn

2003-01-01

The cost and safety goals for NASA's next generation of reusable launch vehicle (RLV) will require that rapid high-fidelity aerothermodynamic design tools be used early in the design cycle. To meet these requirements, it is desirable to identify adequate statistical models that quantify and improve the accuracy, extend the applicability, and enable combined analyses using existing prediction tools. The initial research work focused on establishing suitable candidate models for these purposes. The second phase is focused on assessing the performance of these models to accurately predict the heat rate for a given candidate data set. This validation work compared models and methods that may be useful in predicting the heat rate.
External validation of the ability of the DRAGON score to predict outcome after thrombolysis treatment.

PubMed

Ovesen, C; Christensen, A; Nielsen, J K; Christensen, H

2013-11-01

Easy-to-perform and valid assessment scales for the effect of thrombolysis are essential in hyperacute stroke settings. Because of this we performed an external validation of the DRAGON scale proposed by Strbian et al. in a Danish cohort. All patients treated with intravenous recombinant plasminogen activator between 2009 and 2011 were included. Upon admission all patients underwent physical and neurological examination using the National Institutes of Health Stroke Scale along with non-contrast CT scans and CT angiography. Patients were followed up through the Outpatient Clinic and their modified Rankin Scale (mRS) was assessed after 3 months. Three hundred and three patients were included in the analysis. The DRAGON scale proved to have a good discriminative ability for predicting highly unfavourable outcome (mRS 5-6) (area under the curve-receiver operating characteristic [AUC-ROC]: 0.89; 95% confidence interval [CI] 0.81-0.96; p<0.001) and good outcome (mRS 0-2) (AUC-ROC: 0.79; 95% CI 0.73-0.85; p<0.001). When only patients with M1 occlusions were selected the DRAGON scale provided good discriminative capability (AUC-ROC: 0.89; 95% CI 0.78-1.0; p=0.003) for highly unfavourable outcome. We confirmed the validity of the DRAGON scale in predicting outcome after thrombolysis treatment. Copyright © 2013 Elsevier Ltd. All rights reserved.
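The AUC-ROC values quoted above measure discrimination: the probability that a randomly chosen patient with the outcome has a higher score than a randomly chosen patient without it. A minimal sketch of that calculation follows; the function name `auc_roc` and the example scores and outcomes are hypothetical and are not data from the Danish cohort.

```python
def auc_roc(scores, outcomes):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half; this is the Mann-Whitney formulation of the ROC area.
    """
    pos = [s for s, o in zip(scores, outcomes) if o == 1]
    neg = [s for s, o in zip(scores, outcomes) if o == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical integer scores and outcomes (1 = unfavourable); not study data.
scores = [2, 3, 3, 5, 6, 7, 8, 9]
outcomes = [0, 0, 0, 0, 1, 0, 1, 1]
print(auc_roc(scores, outcomes))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, which is acceptable for an illustration but not for large cohorts.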
Innovative Approach to Validation of Ultraviolet (UV) Reactors ...

EPA Pesticide Factsheets

Slide presentation at Conference: ASCE 7th Civil Engineering Conference in the Asian Region. USEPA in partnership with the Cadmus Group, Carollo Engineers, and other State & Industry collaborators, are evaluating new approaches for validating UV reactors to meet groundwater & surface water pathogen inactivation including viruses for low-pressure and medium-pressure UV systems. Evaluation objectives of the study: Practical approach for validating LP and MP UV reactors for virus & cryptosporidium inactivation using various test microbes, i.e., MS2, B. pumilus, AD2, T1; Apply UV dose algorithms based on theory vs empirical that predict log-I and RED as a function of the UV sensitivity of the microbe (combined variable criteria), flow, lamp-sensor output, DL-ASCFs, w/wo UVT; Assess capabilities of test microbe for predicting target pathogen, assess credibility with second test microbe vs bracketing; Evaluate UV lamp sensor technology that accounts for germicidal contributions of low-and high-wavelength UV light within MP reactors; Address approaches for propagating and assaying AD2, B. pumilus, MS2, and methods for determining low and high wavelength ASCFs using collimated beam LP & MP UV lamps; Determine & apply low and high wavelength ASCFs to predict cryptosporidium and adenovirus credit using MS2, or B. pumilus, T1 test data; Simplify Validation-Factor (VF) analysis of uncertainties/biases; Develop recommendations document from recent lessons learned applicabl
Assessing Internalizing, Externalizing, and Attention Problems in Young Children: Validation of the MacArthur HBQ

ERIC Educational Resources Information Center

Lemery-Chalfant, Kathryn; Schreiber, Jane E.; Schmidt, Nicole L.; Van Hulle, Carol A.; Essex, Marilyn J.; Goldsmith, H. H.

2007-01-01

Objective: To test the validity of the MacArthur Health and Behavior Questionnaire (HBQ) using receiver operating characteristic (ROC) analysis to determine optimal thresholds for the HBQ in predicting Diagnostic Interview Schedule for Children Version-IV (DISC-IV) diagnoses. The roles of child sex, level of impairment, and physical health in…
Cross-Validation of a PACER Prediction Equation for Assessing Aerobic Capacity in Hungarian Youth

ERIC Educational Resources Information Center

Saint-Maurice, Pedro F.; Welk, Gregory J.; Finn, Kevin J.; Kaj, Mónika

2015-01-01

Purpose: The purpose of this article was to evaluate the validity of the Progressive Aerobic Cardiovascular and Endurance Run (PACER) test in a sample of Hungarian youth. Method: Approximately 500 participants (aged 10-18 years old) were randomly selected across Hungary to complete both laboratory (maximal treadmill protocol) and field assessments…
Validation of the Dutch Eating Behaviour Questionnaire (DEBQ) among Maltese women.

PubMed

Dutton, Elaine; Dovey, Terence M

2016-12-01

The main aim of this study was to assess the dimensional structure of the Maltese version of the Dutch Eating Behaviour Questionnaire (DEBQ) and evaluate the instrument's validity and reliability among Maltese women (N = 586). Exploratory factor analysis reflected the theoretical structure of three factors; emotional, restrained and external eating which was supported by a Confirmatory Factor analysis. Minor issues with specific items in the Emotional and External eating scale were identified and discussed. Criterion-related validity was ascertained through correlations with the EAT-26. The study also assessed the DEBQ's predictive value in differentiating between BMI groups and between dieters and weight maintainers. The results suggest that the Maltese DEBQ is a psychometrically valid and reliable instrument for assessing eating behaviours with women in the Maltese community. The study also highlights the critical role of Emotional and Restrained eating in dieting and overweight Maltese women. Copyright © 2016 Elsevier Ltd. All rights reserved.

Validation in Support of Internationally Harmonised OECD Test Guidelines for Assessing the Safety of Chemicals.

PubMed

Gourmelon, Anne; Delrue, Nathalie

Ten years have elapsed since the OECD published the Guidance document on the validation and international regulatory acceptance of test methods for hazard assessment. Much experience has been gained since then in validation centres, in countries and at the OECD on a variety of test methods that were subjected to validation studies. This chapter reviews validation principles and highlights common features that appear to be important for further regulatory acceptance across studies. Existing OECD-agreed validation principles will most likely remain relevant and applicable to the challenges associated with the validation of future test methods. Some adaptations may be needed to take into account the level of technique introduced in test systems, but demonstration of relevance and reliability will continue to play a central role as a prerequisite for regulatory acceptance. Demonstration of relevance will become more challenging for test methods that form part of a set of predictive tools and methods, and that do not stand alone. The OECD is keen to ensure that, while these concepts evolve, countries can continue to rely on valid methods and harmonised approaches for efficient testing and assessment of chemicals.
Assessment of Clinical Criteria for Sepsis

PubMed Central

Seymour, Christopher W.; Liu, Vincent X.; Iwashyna, Theodore J.; Brunkhorst, Frank M.; Rea, Thomas D.; Scherag, André; Rubenfeld, Gordon; Kahn, Jeremy M.; Shankar-Hari, Manu; Singer, Mervyn; Deutschman, Clifford S.; Escobar, Gabriel J.; Angus, Derek C.

2016-01-01

IMPORTANCE The Third International Consensus Definitions Task Force defined sepsis as “life-threatening organ dysfunction due to a dysregulated host response to infection.” The performance of clinical criteria for this sepsis definition is unknown. OBJECTIVE To evaluate the validity of clinical criteria to identify patients with suspected infection who are at risk of sepsis. DESIGN, SETTINGS, AND POPULATION Among 1.3 million electronic health record encounters from January 1, 2010, to December 31, 2012, at 12 hospitals in southwestern Pennsylvania, we identified those with suspected infection in whom to compare criteria. Confirmatory analyses were performed in 4 data sets of 706 399 out-of-hospital and hospital encounters at 165 US and non-US hospitals ranging from January 1, 2008, until December 31, 2013. EXPOSURES Sequential [Sepsis-related] Organ Failure Assessment (SOFA) score, systemic inflammatory response syndrome (SIRS) criteria, Logistic Organ Dysfunction System (LODS) score, and a new model derived using multivariable logistic regression in a split sample, the quick Sequential [Sepsis-related] Organ Failure Assessment (qSOFA) score (range, 0–3 points, with 1 point each for systolic hypotension [≤100 mm Hg], tachypnea [≥22/min], or altered mentation). MAIN OUTCOMES AND MEASURES For construct validity, pairwise agreement was assessed. For predictive validity, the discrimination for outcomes (primary: in-hospital mortality; secondary: in-hospital mortality or intensive care unit [ICU] length of stay ≥3 days) more common in sepsis than uncomplicated infection was determined. Results were expressed as the fold change in outcome over deciles of baseline risk of death and area under the receiver operating characteristic curve (AUROC). RESULTS In the primary cohort, 148 907 encounters had suspected infection (n = 74 453 derivation; n = 74 454 validation), of whom 6347 (4%) died. 
Among ICU encounters in the validation cohort (n = 7932 with suspected infection, of whom 1289 [16%] died), the predictive validity for in-hospital mortality was lower for SIRS (AUROC = 0.64; 95% CI, 0.62–0.66) and qSOFA (AUROC = 0.66; 95% CI, 0.64–0.68) vs SOFA (AUROC = 0.74; 95% CI, 0.73–0.76; P < .001 for both) or LODS (AUROC = 0.75; 95% CI, 0.73–0.76; P < .001 for both). Among non-ICU encounters in the validation cohort (n = 66 522 with suspected infection, of whom 1886 [3%] died), qSOFA had predictive validity (AUROC = 0.81; 95% CI, 0.80–0.82) that was greater than SOFA (AUROC = 0.79; 95% CI, 0.78–0.80; P < .001) and SIRS (AUROC = 0.76; 95% CI, 0.75–0.77; P < .001). Relative to qSOFA scores lower than 2, encounters with qSOFA scores of 2 or higher had a 3- to 14-fold increase in hospital mortality across baseline risk deciles. Findings were similar in external data sets and for the secondary outcome. CONCLUSIONS AND RELEVANCE Among ICU encounters with suspected infection, the predictive validity for in-hospital mortality of SOFA was not significantly different than the more complex LODS but was statistically greater than SIRS and qSOFA, supporting its use in clinical criteria for sepsis. Among encounters with suspected infection outside of the ICU, the predictive validity for in-hospital mortality of qSOFA was statistically greater than SOFA and SIRS, supporting its use as a prompt to consider possible sepsis. PMID:26903335
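The qSOFA score defined in the abstract above is simple enough to express directly: one point each for systolic hypotension (≤100 mm Hg), tachypnea (≥22/min), and altered mentation. The sketch below follows those stated cut-offs; the function and parameter names are assumptions, and how altered mentation is determined clinically is outside what the abstract specifies, so it is passed in as a boolean.

```python
def qsofa(systolic_bp_mmhg, resp_rate_per_min, altered_mentation):
    """Quick SOFA score (0-3), one point per criterion listed in the abstract."""
    return (
        int(systolic_bp_mmhg <= 100)     # systolic hypotension
        + int(resp_rate_per_min >= 22)   # tachypnea
        + int(bool(altered_mentation))   # altered mentation
    )

def qsofa_flag(score):
    """True for scores of 2 or higher, the group the abstract compares with <2."""
    return score >= 2

# Hypothetical encounter values, for illustration only.
score = qsofa(systolic_bp_mmhg=95, resp_rate_per_min=24, altered_mentation=False)
print(score, qsofa_flag(score))
```

The flag mirrors the comparison reported in the results (scores of 2 or higher versus lower than 2) and is meant as a prompt to consider possible sepsis, as the abstract's conclusion puts it, not as a diagnosis.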
Multimethod Investigation of Interpersonal Functioning in Borderline Personality Disorder

PubMed Central

Stepp, Stephanie D.; Hallquist, Michael N.; Morse, Jennifer Q.; Pilkonis, Paul A.

2011-01-01

Even though interpersonal functioning is of great clinical importance for patients with borderline personality disorder (BPD), the comparative validity of different assessment methods for interpersonal dysfunction has not yet been tested. This study examined multiple methods of assessing interpersonal functioning, including self- and other-reports, clinical ratings, electronic diaries, and social cognitions in three groups of psychiatric patients (N=138): patients with (1) BPD, (2) another personality disorder, and (3) Axis I psychopathology only. Using dominance analysis, we examined the predictive validity of each method in detecting changes in symptom distress and social functioning six months later. Across multiple methods, the BPD group often reported higher interpersonal dysfunction scores compared to other groups. Predictive validity results demonstrated that self-report and electronic diary ratings were the most important predictors of distress and social functioning. Our findings suggest that self-report scores and electronic diary ratings have high clinical utility, as these methods appear most sensitive to change. PMID:21808661
Identifying critical success factors for designing selection processes into postgraduate specialty training: the case of UK general practice.

PubMed

Plint, Simon; Patterson, Fiona

2010-06-01

The UK national recruitment process into general practice training has been developed over several years, with incremental introduction of stages which have been piloted and validated. Previously independent processes, which encouraged multiple applications and produced inconsistent outcomes, have been replaced by a robust national process which has high reliability and predictive validity, and is perceived to be fair by candidates and allocates applicants equitably across the country. Best selection practice involves a job analysis which identifies required competencies, then designs reliable assessment methods to measure them, and over the long term ensures that the process has predictive validity against future performance. The general practitioner recruitment process introduced machine markable short listing assessments for the first time in the UK postgraduate recruitment context, and also adopted selection centre workplace simulations. The key success factors have been identified as corporate commitment to the goal of a national process, with gradual convergence maintaining locus of control rather than the imposition of change without perceived legitimate authority.
On vital aid: the why, what and how of validation

PubMed Central

Kleywegt, Gerard J.

2009-01-01

Limitations to the data and subjectivity in the structure-determination process may cause errors in macromolecular crystal structures. Appropriate validation techniques may be used to reveal problems in structures, ideally before they are analysed, published or deposited. Additionally, such techniques may be used a posteriori to assess the (relative) merits of a model by potential users. Weak validation methods and statistics assess how well a model reproduces the information that was used in its construction (i.e. experimental data and prior knowledge). Strong methods and statistics, on the other hand, test how well a model predicts data or information that were not used in the structure-determination process. These may be data that were excluded from the process on purpose, general knowledge about macromolecular structure, information about the biological role and biochemical activity of the molecule under study or its mutants or complexes and predictions that are based on the model and that can be tested experimentally. PMID:19171968
Predicting the risk of toxic blooms of golden alga from cell abundance and environmental covariates

USGS Publications Warehouse

Patino, Reynaldo; VanLandeghem, Matthew M.; Denny, Shawn

2016-01-01

Golden alga (Prymnesium parvum) is a toxic haptophyte that has caused considerable ecological damage to marine and inland aquatic ecosystems worldwide. Studies focused primarily on laboratory cultures have indicated that toxicity is poorly correlated with the abundance of golden alga cells. This relationship, however, has not been rigorously evaluated in the field where environmental conditions are much different. The ability to predict toxicity using readily measured environmental variables and golden alga abundance would allow managers rapid assessments of ichthyotoxicity potential without laboratory bioassay confirmation, which requires additional resources to accomplish. To assess the potential utility of these relationships, several a priori models relating lethal levels of golden alga ichthyotoxicity to golden alga abundance and environmental covariates were constructed. Model parameters were estimated using archived data from four river basins in Texas and New Mexico (Colorado, Brazos, Red, Pecos). Model predictive ability was quantified using cross-validation, sensitivity, and specificity, and the relative ranking of environmental covariate models was determined by Akaike Information Criterion values and Akaike weights. Overall, abundance was a generally good predictor of ichthyotoxicity as cross validation of golden alga abundance-only models ranged from ∼ 80% to ∼ 90% (leave-one-out cross-validation). Environmental covariates improved predictions, especially the ability to predict lethally toxic events (i.e., increased sensitivity), and top-ranked environmental covariate models differed among the four basins. These associations may be useful for monitoring as well as understanding the abiotic factors that influence toxicity during blooms.
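The model-ranking and accuracy measures named in this abstract (Akaike weights, sensitivity, specificity) can be sketched briefly. The snippet below is only an illustration under stated assumptions: the function names, AIC values and observed/predicted labels are hypothetical and are not taken from the study.

```python
import math


def akaike_weights(aic_values):
    """Turn a list of AIC scores into Akaike weights (relative support per model)."""
    best = min(aic_values)
    # Relative likelihood of each model given its distance from the lowest AIC.
    rel = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


def sensitivity_specificity(observed, predicted):
    """Both arguments are sequences of booleans (True = lethally toxic event).

    Assumes at least one positive and one negative observation,
    otherwise a denominator would be zero.
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)
    fn = sum(1 for o, p in pairs if o and not p)
    tn = sum(1 for o, p in pairs if not o and not p)
    fp = sum(1 for o, p in pairs if not o and p)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical AIC scores for three candidate covariate models.
weights = akaike_weights([100.0, 102.0, 110.0])

# Hypothetical field observations versus model predictions.
observed = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, False, False, False, True, False]
sens, spec = sensitivity_specificity(observed, predicted)
```

A higher sensitivity here corresponds to catching more of the lethally toxic events, which is the improvement the abstract attributes to the environmental covariates.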
(Very) Early technology assessment and translation of predictive biomarkers in breast cancer.

PubMed

Miquel-Cases, Anna; Schouten, Philip C; Steuten, Lotte M G; Retèl, Valesca P; Linn, Sabine C; van Harten, Wim H

2017-01-01

Predictive biomarkers can guide treatment decisions in breast cancer. Many studies are undertaken to discover and translate these biomarkers, yet few biomarkers make it to practice. Before use in clinical decision making, predictive biomarkers need to demonstrate analytical validity, clinical validity and clinical utility. While attaining analytical and clinical validity is relatively straightforward, by following methodological recommendations, the achievement of clinical utility is extremely challenging. It requires demonstrating three associations: the biomarker with the outcome (prognostic association), the effect of treatment independent of the biomarker, and the differential treatment effect between the prognostic and the predictive biomarker (predictive association). In addition, economical, ethical, regulatory, organizational and patient/doctor-related aspects are hampering the translational process. Traditionally, these aspects do not receive much attention until formal approval or reimbursement of a biomarker test (informed by Health Technology Assessment (HTA)) is at stake, at which point the clinical utility and sometimes price of the test can hardly be influenced anymore. When HTA analyses are performed earlier, during biomarker research and development, they may prevent further development of those biomarkers unlikely to ever provide sufficient added value to society, and rather facilitate translation of the promising ones. Early HTA is particularly relevant for the predictive biomarker field, as expensive medicines are under pressure and the need for biomarkers to guide their appropriate use is huge. Closer interaction between clinical researchers and HTA experts throughout the translational research process will ensure that available data and methodologies will be used most efficiently to facilitate biomarker translation. Copyright © 2016 The Author(s). Published by Elsevier Ltd.. All rights reserved.
Validity of equations using knee height to predict overall height among older people in Benin.

PubMed

Jésus, Pierre; Mizéhoun-Adissoda, Carmelle; Houinato, Dismand; Preux, Pierre-Marie; Fayemendy, Philippe; Desport, Jean-Claude

2017-10-01

Chumlea's formulas are a validated means of predicting overall height from knee height (KH) among people >60 y of age, but, to our knowledge, no formula is validated for use in African countries, including Benin. The aim of this study was to compare height provided by predictive formulas using KH to measured height in an elderly population in Benin. Individuals >60 y of age in Benin underwent nutritional assessment with determination of weight, body mass index (BMI), height, and KH. A Bland-Altman analysis was carried out by sex and age. The percentage of predictions accurate to ±5 cm compared with the measured height was calculated. The tested formulas were Chumlea's formulas for non-Hispanic Black people (CBP) and two formulas for use among Caucasians. Data from 396 individuals (81.1% male) were analyzed. The three formulas achieved 98% accuracy, but with 4.6% risk for error (±2 SD: -6 to +9 cm), which appeared to make them unfit for the whole population. Nevertheless, if a level of prediction ±5 cm is considered acceptable in clinical practice, the CBP formula achieved 83.1% accuracy. Moreover, there was no significant difference in BMI calculated with the measured and the predicted height, and the nutritional status based on BMI did not differ. CBP formulas seem applicable in 83% of cases (±5 cm) to assess the height with KH of older people in Benin and do not overestimate the prevalence of malnutrition. Copyright © 2017 Elsevier Inc. All rights reserved.
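A Bland-Altman comparison of the kind described above reduces to the mean difference (bias), limits of agreement around it, and the share of predictions inside a tolerance. The following is a minimal sketch, assuming a ±2 SD limit and a ±5 cm tolerance as in the abstract; the function name and the height values are hypothetical.

```python
import statistics


def bland_altman(measured, predicted, k=2.0, tolerance=5.0):
    """Return bias, (lower, upper) limits of agreement, and fraction within tolerance.

    k is the SD multiplier for the limits of agreement (2.0 mirrors the
    "+/-2 SD" wording of the abstract; 1.96 is also commonly used).
    """
    diffs = [p - m for m, p in zip(measured, predicted)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    limits = (bias - k * sd, bias + k * sd)
    within = sum(1 for d in diffs if abs(d) <= tolerance) / len(diffs)
    return bias, limits, within


# Hypothetical measured heights and knee-height-based predictions, in cm.
measured = [160.0, 165.0, 170.0, 155.0, 172.0]
predicted = [162.0, 164.0, 176.0, 157.0, 171.0]
bias, limits, within = bland_altman(measured, predicted)
```

The `within` fraction plays the role of the "accurate to ±5 cm" percentage reported in the abstract.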
Predicting the need for muscle flap salvage after open groin vascular procedures: a clinical assessment tool.

PubMed

Fischer, John P; Nelson, Jonas A; Shang, Eric K; Wink, Jason D; Wingate, Nicholas A; Woo, Edward Y; Jackson, Benjamin M; Kovach, Stephen J; Kanchwala, Suhail

2014-12-01

Groin wound complications after open vascular surgery procedures are common, morbid, and costly. The purpose of this study was to generate a simple, validated, clinically usable risk assessment tool for predicting groin wound morbidity after infra-inguinal vascular surgery. A retrospective review of consecutive patients undergoing groin cutdowns for femoral access between 2005-2011 was performed. Patients necessitating salvage flaps were compared to those who did not, and a stepwise logistic regression was performed and validated using a bootstrap technique. Utilising this analysis, a simplified risk score was developed to predict the risk of developing a wound which would necessitate salvage. A total of 925 patients were included in the study. The salvage flap rate was 11.2% (n = 104). Predictors determined by logistic regression included prior groin surgery (OR = 4.0, p < 0.001), prosthetic graft (OR = 2.7, p < 0.001), coronary artery disease (OR = 1.8, p = 0.019), peripheral arterial disease (OR = 5.0, p < 0.001), and obesity (OR = 1.7, p = 0.039). Based upon the respective logistic coefficients, a simplified scoring system was developed to enable the preoperative risk stratification regarding the likelihood of a significant complication which would require a salvage muscle flap. The c-statistic for the regression demonstrated excellent discrimination at 0.89. This study presents a simple, internally validated risk assessment tool that accurately predicts wound morbidity requiring flap salvage in open groin vascular surgery patients. The preoperatively high-risk patient can be identified and selectively targeted as a candidate for a prophylactic muscle flap.
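The c-statistic quoted above is the probability that a randomly chosen patient with the outcome receives a higher risk score than a randomly chosen patient without it. A minimal sketch of that pairwise definition follows; the function name, scores and outcomes are hypothetical and do not reproduce the study's scoring system.

```python
def c_statistic(scores, outcomes):
    """Concordance (c) statistic: share of case/non-case pairs ranked correctly.

    Ties in score count as half-concordant. Assumes at least one case
    and one non-case are present.
    """
    cases = [s for s, y in zip(scores, outcomes) if y]
    non_cases = [s for s, y in zip(scores, outcomes) if not y]
    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Hypothetical risk scores and outcomes (1 = salvage flap needed).
scores = [5, 3, 4, 1, 2, 3]
outcomes = [1, 1, 0, 0, 0, 0]
c = c_statistic(scores, outcomes)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.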
Mean Flow and Noise Prediction for a Separate Flow Jet With Chevron Mixers

NASA Technical Reports Server (NTRS)

Koch, L. Danielle; Bridges, James; Khavaran, Abbas

2004-01-01

Experimental and numerical results are presented here for a separate flow nozzle employing chevrons arranged in an alternating pattern on the core nozzle. Comparisons of these results demonstrate that the combination of the WIND/MGBK suite of codes can predict the noise reduction trends measured between separate flow jets with and without chevrons on the core nozzle. Mean flow predictions were validated against Particle Image Velocimetry (PIV), pressure, and temperature data, and noise predictions were validated against acoustic measurements recorded in the NASA Glenn Aeroacoustic Propulsion Lab. Comparisons are also made to results from the CRAFT code. The work presented here is part of an on-going assessment of the WIND/MGBK suite for use in designing the next generation of quiet nozzles for turbofan engines.
Validation of the FACT-B+4-UL questionnaire and exploration of its predictive value in women submitted to surgery for breast cancer.

PubMed

Andrade Ortega, Juan Alfonso; Millán Gómez, Ana Pilar; Ribeiro González, Marisa; Martínez Piró, Pilar; Jiménez Anula, Juan; Sánchez Andújar, María Belén

2017-06-21

The early detection of upper limb complications is important in women operated on for breast cancer. The "FACT-B+4-UL" questionnaire, a specific variant of the Functional Assessment of Cancer Therapy-Breast (FACT-B), is one of the instruments available to measure upper limb function. The Spanish version of the upper limb subscale of the FACT-B+4 was validated in a prospective cohort of 201 women operated on for breast cancer (factor analysis, internal consistency, test-retest reliability, construct validity and sensitivity to change were determined). Its capacity to predict subsequent lymphoedema and other complications in the upper limb was explored using logistic regression. The subscale is unifactorial and has high internal consistency (Cronbach's alpha: 0.87); its test-retest reliability and construct validity are strong (intraclass correlation coefficient: 0.986; Pearson's R with "Quick DASH": 0.81), as is its sensitivity to change. It did not predict the onset of lymphoedema, and its predictive capacity for other upper limb complications is low. FACT-B+4-UL is useful in measuring upper limb disability in women surgically treated for breast cancer, but it does not predict the onset of lymphoedema and its predictive capacity for other complications in the upper limb is low. Copyright © 2017 Elsevier España, S.L.U. All rights reserved.
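Cronbach's alpha, the internal-consistency statistic reported in this record, compares the sum of the item variances with the variance of the total score. The snippet below is a sketch of the standard formula only; the function name and the response data are hypothetical.

```python
import statistics


def cronbach_alpha(responses):
    """responses: list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(responses[0])
    item_variances = [
        statistics.variance([row[i] for row in responses]) for i in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from four respondents to a two-item scale.
responses = [[2, 3], [4, 4], [3, 5], [5, 5]]
alpha = cronbach_alpha(responses)
```

Items that rise and fall together push alpha toward 1, while unrelated items push it toward 0.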
Incremental Validity of the Trait Emotional Intelligence Questionnaire-Short Form (TEIQue-SF).

PubMed

Siegling, A B; Vesely, Ashley K; Petrides, K V; Saklofske, Donald H

2015-01-01

This study examined the incremental validity of the adult short form of the Trait Emotional Intelligence Questionnaire (TEIQue-SF) in predicting 7 construct-relevant criteria beyond the variance explained by the Five-factor model and coping strategies. Additionally, the relative contributions of the questionnaire's 4 subscales were assessed. Two samples of Canadian university students completed the TEIQue-SF, along with measures of the Big Five, coping strategies (Sample 1 only), and emotion-laden criteria. The TEIQue-SF showed consistent incremental effects beyond the Big Five or the Big Five and coping strategies, predicting all 7 criteria examined across the 2 samples. Furthermore, 2 of the 4 TEIQue-SF subscales accounted for the measure's incremental validity. Although the findings provide good support for the validity and utility of the TEIQue-SF, directions for further research are emphasized.
The Quantitative Reasoning for College Science (QuaRCS) Assessment: Emerging Themes from 5 Years of Data

NASA Astrophysics Data System (ADS)

Follette, Katherine; Dokter, Erin; Buxner, Sanlyn

2018-01-01

The Quantitative Reasoning for College Science (QuaRCS) Assessment is a validated assessment instrument that was designed to measure changes in students' quantitative reasoning skills, attitudes toward mathematics, and ability to accurately assess their own quantitative abilities. It has been administered to more than 5,000 students at a variety of institutions at the start and end of a semester of general education college science instruction. I will begin by briefly summarizing our published work surrounding validation of the instrument and identification of underlying attitudinal factors (composite variables identified via factor analysis) that predict 50% of the variation in students' scores on the assessment. I will then discuss more recent unpublished work, including: (1) Development and validation of an abbreviated version of the assessment (The QuaRCS Light), which results in marked improvements in students' ability to maintain a high effort level throughout the assessment and has broad implications for quantitative reasoning assessments in general, and (2) Our efforts to revise the attitudinal portion of the assessment to better assess math anxiety level, another key factor in student performance on numerical assessments.
Validation Assessment of a Glass-to-Metal Seal Finite-Element Model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jamison, Ryan Dale; Buchheit, Thomas E.; Emery, John M

Sealing glasses are ubiquitous in high pressure and temperature engineering applications, such as hermetic feed-through electrical connectors. A common connector technology is the glass-to-metal seal, in which a metal shell compresses a sealing glass to create a hermetic seal. Though finite-element analysis has been used to understand and design glass-to-metal seals for many years, there has been little validation of these models. An indentation technique was employed to measure the residual stress on the surface of a simple glass-to-metal seal. Recently developed rate-dependent material models of both Schott 8061 and 304L VAR stainless steel have been applied to a finite-element model of the simple glass-to-metal seal. Model predictions of residual stress based on the evolution of material models are shown. These model predictions are compared to measured data. Validity of the finite-element predictions is discussed. It will be shown that the finite-element model of the glass-to-metal seal accurately predicts the mean residual stress in the glass near the glass-to-metal interface and is valid for this quantity of interest.
Validation metrics for turbulent plasma transport

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holland, C., E-mail: chholland@ucsd.edu

Developing accurate models of plasma dynamics is essential for confident predictive modeling of current and future fusion devices. In modern computer science and engineering, formal verification and validation processes are used to assess model accuracy and establish confidence in the predictive capabilities of a given model. This paper provides an overview of the key guiding principles and best practices for the development of validation metrics, illustrated using examples from investigations of turbulent transport in magnetically confined plasmas. Particular emphasis is given to the importance of uncertainty quantification and its inclusion within the metrics, and the need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between simulation and experiment. As a starting point, the structure of commonly used global transport model metrics and their limitations is reviewed. An alternate approach is then presented, which focuses upon comparisons of predicted local fluxes, fluctuations, and equilibrium gradients against observation. The utility of metrics based upon these comparisons is demonstrated by applying them to gyrokinetic predictions of turbulent transport in a variety of discharges performed on the DIII-D tokamak [J. L. Luxon, Nucl. Fusion 42, 614 (2002)], as part of a multi-year transport model validation activity.
Development and validation of a measure of display rule knowledge: the display rule assessment inventory.

PubMed

Matsumoto, David; Yoo, Seung Hee; Hirayama, Satoko; Petrova, Galina

2005-03-01

As one component of emotion regulation, display rules, which reflect the regulation of expressive behavior, have been the topic of many studies. Despite their theoretical and empirical importance, however, to date there is no measure of display rules that assesses a full range of behavioral responses that are theoretically possible when emotion is elicited. This article reports the development of a new measure of display rules that surveys 5 expressive modes: expression, deamplification, amplification, qualification, and masking. Two studies provide evidence for its internal and temporal reliability and for its content, convergent, discriminant, external, and concurrent predictive validity. Additionally, Study 1, involving American, Russian, and Japanese participants, demonstrated predictable cultural differences on each of the expressive modes. Copyright 2005 APA, all rights reserved.
Predicting First-Quarter Test Scores from the New Medical College Admission Test.

ERIC Educational Resources Information Center

Cullen, Thomas J.; And Others

1980-01-01

The predictive validity of the new Medical College Admission Test as it relates to end-of-quarter examinations in anatomy, histology, physiology, biochemistry, and "ages of man" is presented. Results indicate that the Science Knowledge assessment areas of chemistry and physics and the Science Problems subtest were most useful in…
On the Factorial Structure of the SAT and Implications for Next-Generation College Readiness Assessments

ERIC Educational Resources Information Center

Wiley, Edward W.; Shavelson, Richard J.; Kurpius, Amy A.

2014-01-01

The name "SAT" has become synonymous with college admissions testing; it has been dubbed "the gold standard." Numerous studies on its reliability and predictive validity show that the SAT predicts college performance beyond high school grade point average. Surprisingly, studies of the factorial structure of the current version…
Examining the Validity of Behavioral Self-Regulation Tools in Predicting Preschoolers' Academic Achievement

ERIC Educational Resources Information Center

Schmitt, Sara A.; Pratt, Megan E.; McClelland, Megan M.

2014-01-01

The current study investigated the predictive utility among teacher-rated, observed, and directly assessed behavioral self-regulation skills to academic achievement in preschoolers. Specifically, this study compared how a teacher report, the Child Behavior Rating Scale, an observer report, the Observed Child Engagement Scale, and a direct…
Examining the Validity of Behavioral Self-Regulation Tools in Predicting Preschoolers' Academic Achievement

ERIC Educational Resources Information Center

Schmitt, Sara A.; Pratt, Megan E.; McClelland, Megan M.

2014-01-01

Research Findings: The current study investigated the predictive utility of teacher-rated, observed, and directly assessed behavioral self-regulation skills to academic achievement in preschoolers. Specifically, this study compared how a teacher report (the Child Behavior Rating Scale), an observer report (the Observed Child Engagement Scale), and…

Short-Term Memory as an Additional Predictor of School Achievement for Immigrant Children?

ERIC Educational Resources Information Center

te Nijenhuis, Jan; Resing, Wilma; Tolboom, Elsbeth; Bleichrodt, Nico

2004-01-01

The predictive validity and utility of assessment procedures can be increased by adding predictors to the prediction supplied by general ability tests. Of Jensen's early work comes the suggestion of focusing on the cognitive ability short-term memory (STM), especially for low-"g" Black children. Meta-analysis convincingly shows high…
Isolated Open Rotor Noise Prediction Assessment Using the F31A31 Historical Blade Set

NASA Technical Reports Server (NTRS)

Nark, Douglas M.; Jones, William T.; Boyd, D. Douglas, Jr.; Zawodny, Nikolas S.

2016-01-01

In an effort to mitigate next-generation fuel efficiency and environmental impact concerns for aviation, open rotor propulsion systems have received renewed interest. However, maintaining the high propulsive efficiency while simultaneously meeting noise goals has been one of the challenges in making open rotor propulsion a viable option. Improvements in prediction tools and design methodologies have opened the design space for next generation open rotor designs that satisfy these challenging objectives. As such, validation of aerodynamic and acoustic prediction tools has been an important aspect of open rotor research efforts. This paper describes validation efforts of a combined computational fluid dynamics and Ffowcs Williams and Hawkings equation methodology for open rotor aeroacoustic modeling. Performance and acoustic predictions were made for a benchmark open rotor blade set and compared with measurements over a range of rotor speeds and observer angles. Overall, the results indicate that the computational approach is acceptable for assessing low-noise open rotor designs. Additionally, this approach may be used to provide realistic incident source fields for acoustic shielding/scattering studies on various aircraft configurations.
Assessing young children's intention-reading in authentic communicative contexts: preliminary evidence and clinical utility.

PubMed

Greenslade, Kathryn J; Coggins, Truman E

2014-01-01

Identifying what a communication partner is looking at (referential intention) and why (social intention) is essential to successful social communication, and may be challenging for children with social communication deficits. This study explores a clinical task that assesses these intention-reading abilities within an authentic context. To gather evidence of the task's reliability and validity, and to discuss its clinical utility. The intention-reading task was administered to twenty 4-7-year-olds with typical development (TD) and ten with autism spectrum disorder (ASD). Task items were embedded in an authentic activity, and they targeted the child's ability to identify the examiner's referential and social intentions, which were communicated through joint attention behaviours. Reliability and construct validity evidence were addressed using established psychometric methods. Reliability and validity evidence supported the use of task scores for identifying children whose intention-reading warranted concern. Evidence supported the reliability of task administration and coding, and item-level codes were highly consistent with overall task performance. Supporting task validity, group differences aligned with predictions, with children with ASD exhibiting poorer and more variable task scores than children with TD. Also, as predicted, task scores correlated significantly with verbal mental age and ratings of parental concerns regarding social communication abilities. The evidence provides preliminary support for the reliability and validity of the clinical task's scores in assessing young children's real-time intention-reading abilities, which are essential for successful interactions in school and beyond. © 2014 Royal College of Speech and Language Therapists.
[Design and validation of an instrument to assess families at risk for health problems].

PubMed

Puschel, Klaus; Repetto, Paula; Solar, María Olga; Soto, Gabriela; González, Karla

2012-04-01

There is a paucity of screening instruments with a high clinical predictive value to identify families at risk and, therefore, to develop focused interventions in primary care. The aim was to develop an easy-to-apply screening instrument with a high clinical predictive value to identify families with a higher health vulnerability. In the first stage of the study, an instrument with a high content validity was designed through a review of existing instruments, qualitative interviews with families and expert opinions following a Delphi approach of three rounds. In the second stage, concurrent validity was tested through a comparative analysis between the pilot instrument and a family clinical interview conducted with 300 families randomly selected from a population registered at a primary care clinic in Santiago. The sampling was blocked based on the presence of diabetes, depression, child asthma, behavioral disorders, presence of an older person or the lack of previous conditions among family members. The third stage tested the clinical predictive validity of the instrument by comparing the baseline vulnerability obtained by the instrument with the change in clinical status and health-related quality of life perceptions of the family members after nine months of follow-up. The final SALUFAM instrument included 13 items and had a high internal consistency (Cronbach's alpha: 0.821), high test-retest reproducibility (Pearson correlation: 0.84) and a high clinical predictive value for clinical deterioration (Odds ratio: 1.826; 95% confidence intervals: 1.101-3.029). The SALUFAM instrument is applicable and replicable, and has high content validity, concurrent validity and clinical predictive value.
Positive and negative emotional eating have different associations with overeating and binge eating: Construction and validation of the Positive-Negative Emotional Eating Scale.

PubMed

Sultson, Hedvig; Kukk, Katrin; Akkermann, Kirsti

2017-09-01

Research on emotional eating mostly focuses on negative emotions. Much less is known about how positive emotions relate to overeating and binge eating (BE). The aim of the current study was to construct a scale for positive and negative emotional eating and to assess its predictive validity. In study 1, the Positive-Negative Emotional Eating Scale (PNEES) was constructed and tested on 531 women, who also completed Eating Disorders Assessment Scale (EDAS). Results showed that a two-factor model constituting Positive emotional eating (PNEES-P) and Negative emotional eating (PNEES-N) fit the data well. PNEES-N also showed good convergent validity in assessing binge eating, correlating highly with EDAS subscale Binge eating. Further, a path analysis showed that after controlling for the mediating effect of PNEES-N, PNEES-P continued to significantly predict binge eating. In study 2 (N = 60), experience sampling method was used to assess overeating and BE in the natural environment. Palmtop computers were given to participants for a three-day study period that prompted them with questions regarding emotional experience, overeating, and BE. Results indicated that PNEES-P significantly predicted overeating, whereas PNEES-N predicted overeating and BE episodes only in a subsample of women who had experienced at least one overeating or BE episode. Thus, positive and negative emotional eating might have different relations with overeating and BE, with the latter being more characteristic of the severity/frequency of overeating and BE. New assessment tools that in addition to negative emotional eating also address positive emotional eating could be of potential help in planning intervention. Further, the tendency to overeat in response to positive emotions could be integrated into current models of eating disorders, especially when addressing relapse prevention. Copyright © 2017 Elsevier Ltd. All rights reserved.
Implicit but not explicit self-esteem predicts future depressive symptomatology.

PubMed

Franck, Erik; De Raedt, Rudi; De Houwer, Jan

2007-10-01

To date, research on the predictive validity of implicit self-esteem for depressive relapse is very sparse. In the present study, we assessed implicit self-esteem using the Name Letter Preference Task and explicit self-esteem using the Rosenberg self-esteem scale in a group of currently depressed patients, formerly depressed individuals, and never depressed controls. In addition, we examined the predictive validity of explicit, implicit, and the interaction of explicit and implicit self-esteem in predicting future symptoms of depression in formerly depressed individuals and never depressed controls. The results showed that currently depressed individuals reported a lower explicit self-esteem as compared to formerly depressed individuals and never depressed controls. In line with previous research, all groups showed a positive implicit self-esteem not different from each other. Furthermore, after controlling for initial depressive symptomatology, implicit but not explicit self-esteem significantly predicted depressive symptoms at six months follow-up. Although implicit self-esteem assessed with the Name Letter Preference Test was not different between formerly depressed individuals and never depressed controls, the findings suggest it is an interesting variable in the study of vulnerability for depression relapse.
Assessing the Learning Environment for Medical Students: An Evaluation of a Novel Survey Instrument in Four Medical Schools.

PubMed

Pololi, Linda H; Evans, Arthur T; Nickell, Leslie; Reboli, Annette C; Coplit, Lisa D; Stuber, Margaret L; Vasiliou, Vasilia; Civian, Janet T; Brennan, Robert T

2017-06-01

A practical, reliable, and valid instrument is needed to measure the impact of the learning environment on medical students' well-being and educational experience and to meet medical school accreditation requirements. From 2012 to 2015, medical students were surveyed at the end of their first, second, and third year of studies at four medical schools. The survey assessed students' perceptions of the following nine dimensions of the school culture: vitality, self-efficacy, institutional support, relationships/inclusion, values alignment, ethical/moral distress, work-life integration, gender equity, and ethnic minority equity. The internal reliability of each of the nine dimensions was measured. Construct validity was evaluated by assessing relationships predicted by our conceptual model and prior research. Assessment was made of whether the measurements were sensitive to differences over time and across institutions. Six hundred and eighty-six students completed the survey (49 % women; 9 % underrepresented minorities), with a response rate of 89 % (range over the student cohorts 72-100 %). Internal consistency of each dimension was high (Cronbach's α 0.71-0.86). The instrument was able to detect significant differences in the learning environment across institutions and over time. Construct validity was supported by demonstrating several relationships predicted by our conceptual model. The C-Change Medical Student Survey is a practical, reliable, and valid instrument for assessing the learning environment of medical students. Because it is sensitive to changes over time and differences across institution, results could potentially be used to facilitate and monitor improvements in the learning environment of medical students.
The predictive value of early behavioural assessments in pet dogs--a longitudinal study from neonates to adults.

PubMed

Riemer, Stefanie; Müller, Corsin; Virányi, Zsófia; Huber, Ludwig; Range, Friederike

2014-01-01

Studies on behavioural development in domestic dogs are of relevance for matching puppies with the right families, identifying predispositions for behavioural problems at an early stage, and predicting suitability for service dog work, police or military service. The literature is, however, inconsistent regarding the predictive value of tests performed during the socialisation period. Additionally, some practitioners use tests with neonates to complement later assessments for selecting puppies as working dogs, but these have not been validated. We here present longitudinal data on a cohort of Border collies, followed up from neonate age until adulthood. A neonate test was conducted with 99 Border collie puppies aged 2-10 days to assess activity, vocalisations when isolated and sucking force. At the age of 40-50 days, 134 puppies (including 93 tested as neonates) were tested in a puppy test at their breeders' homes. All dogs were adopted as pet dogs and 50 of them participated in a behavioural test at the age of 1.5 to 2 years with their owners. Linear mixed models found little correspondence between individuals' behaviour in the neonate, puppy and adult test. Exploratory activity was the only behaviour that was significantly correlated between the puppy and the adult test. We conclude that the predictive validity of early tests for predicting specific behavioural traits in adult pet dogs is limited.
The Predictive Value of Early Behavioural Assessments in Pet Dogs – A Longitudinal Study from Neonates to Adults

PubMed Central

Riemer, Stefanie; Müller, Corsin; Virányi, Zsófia; Huber, Ludwig; Range, Friederike

2014-01-01

Studies on behavioural development in domestic dogs are of relevance for matching puppies with the right families, identifying predispositions for behavioural problems at an early stage, and predicting suitability for service dog work, police or military service. The literature is, however, inconsistent regarding the predictive value of tests performed during the socialisation period. Additionally, some practitioners use tests with neonates to complement later assessments for selecting puppies as working dogs, but these have not been validated. We here present longitudinal data on a cohort of Border collies, followed up from neonate age until adulthood. A neonate test was conducted with 99 Border collie puppies aged 2–10 days to assess activity, vocalisations when isolated and sucking force. At the age of 40–50 days, 134 puppies (including 93 tested as neonates) were tested in a puppy test at their breeders' homes. All dogs were adopted as pet dogs and 50 of them participated in a behavioural test at the age of 1.5 to 2 years with their owners. Linear mixed models found little correspondence between individuals' behaviour in the neonate, puppy and adult test. Exploratory activity was the only behaviour that was significantly correlated between the puppy and the adult test. We conclude that the predictive validity of early tests for predicting specific behavioural traits in adult pet dogs is limited. PMID:25003341
Correlates of mammographic density in B-mode ultrasound and real time elastography.

PubMed

Jud, Sebastian Michael; Häberle, Lothar; Fasching, Peter A; Heusinger, Katharina; Hack, Carolin; Faschingbauer, Florian; Uder, Michael; Wittenberg, Thomas; Wagner, Florian; Meier-Meitinger, Martina; Schulz-Wendtland, Rüdiger; Beckmann, Matthias W; Adamietz, Boris R

2012-07-01

The aim of our study involved the assessment of B-mode imaging and elastography with regard to their ability to predict mammographic density (MD) without X-rays. Women, who underwent routine mammography, were prospectively examined with additional B-mode ultrasound and elastography. MD was assessed quantitatively with a computer-assisted method (Madena). The B-mode and elastography images were assessed by histograms with equally sized gray-level intervals. Regression models were built and cross validated to examine the ability to predict MD. The results of this study showed that B-mode imaging and elastography were able to predict MD. B-mode seemed to give a more accurate prediction. R for B-mode image and elastography were 0.67 and 0.44, respectively. Areas in the B-mode images that correlated with mammographic dense areas were either dark gray or of intermediate gray levels. Concerning elastography only the gray levels that represent extremely stiff tissue correlated positively with MD. In conclusion, ultrasound seems to be able to predict MD. Easy and cheap utilization of regular breast ultrasound machines encourages the use of ultrasound in larger case-control studies to validate this method as a breast cancer risk predictor. Furthermore, the application of ultrasound for breast tissue characterization could enable comprehensive research concerning breast cancer risk and breast density in young and pregnant women.
Validity of the Montreal Cognitive Assessment Screener in Adolescents and Young Adults With and Without Congenital Heart Disease.

PubMed

Pike, Nancy A; Poulsen, Marie K; Woo, Mary A

Cognitive deficits are common, long-term sequelae in children and adolescents with congenital heart disease (CHD) who have undergone surgical palliation. However, there is a lack of a validated brief cognitive screening tool appropriate for the outpatient setting for adolescents with CHD. One candidate instrument is the Montreal Cognitive Assessment (MoCA) questionnaire. The purpose of the research was to validate scores from the MoCA against the General Memory Index (GMI) of the Wide Range Assessment of Memory and Learning, 2nd Edition (WRAML2), a widely accepted measure of cognition/memory, in adolescents and young adults with CHD. We administered the MoCA and the WRAML2 to 156 adolescents and young adults ages 14-21 (80 youth with CHD and 76 healthy controls who were gender and age matched). Spearman's rank order correlations were used to assess concurrent validity. To assess construct validity, the Mann-Whitney U test was used to compare differences in scores in youth with CHD and the healthy control group. Receiver operating characteristic curves were created and area under the curve, sensitivity, specificity, positive predictive value, and negative predictive value were also calculated. The MoCA median scores in the CHD versus healthy controls were (23, range 15-29 vs. 28, range 22-30; p < .001), respectively. With the screening cutoff scores at <26 points for the MoCA and 85 for GMI (<1 SD, M = 100, SD = 15), the CHD versus healthy control groups showed sensitivity of .96 and specificity of .67 versus sensitivity of .75 and specificity of .90, respectively, in the detection of cognitive deficits. A cutoff score of 26 on the MoCA was optimal in the CHD group; a cutoff of 25 had similar properties except for a lower negative predictive value. The area under the receiver operating characteristic curve (95% CI) for the MoCA was 0.84 (95% CI [0.75, 0.93], p < .001) and 0.84 (95% CI [0.62, 1.00], p = .02) for the CHD and controls, respectively. 
Scores on the MoCA were valid for screening to detect cognitive deficits in adolescents and young adults aged 14-21 with CHD when a cutoff score of 26 is used to differentiate youth with and without significant cognitive impairment. Future studies are needed in other adolescent disease groups with known cognitive deficits and healthy populations to explore the generalizability of validity of MoCA scores in adolescents and young adults.
The use of neuropsychological tests to assess intelligence.

PubMed

Gansler, David A; Varvaris, Mark; Schretlen, David J

We sought to derive a 'neuropsychological intelligence quotient' (NIQ) to replace IQ testing in some routine assessments. We administered neuropsychological testing and a seven-subtest short form of the Wechsler Adult Intelligence Scale to a community sample of 394 adults aged 18-96 years. We regressed Wechsler Full Scale IQs (W-FSIQ) on 23 neuropsychological scores and derived an NIQ from 9 measures that explained significant variance in W-FSIQ. We then compared subgroups of 284 healthy and 108 unhealthy participants in NIQ and W-FSIQ to assess criterion validity, correlated NIQ and W-FSIQ scores with education level and independence for activities of daily living to assess convergent validity, and compared validity coefficients for the NIQ with those of 'hold' and 'no-hold' indices. By design, NIQ and W-FSIQ scores correlated highly (r = .84), and both were higher in healthy participants. The difference was larger for NIQ, which accounted for more variability in activities of daily living. The NIQ and 'no-hold' index were better predicted by health status and less predicted by educational status than the 'hold' index. We constructed an NIQ that correlates highly with Wechsler FSIQ. Tests required to obtain NIQ are commonly used and can be administered in about 45 min. Validity properties of NIQ and W-FSIQ are similar. The NIQ bore greater resemblance to a 'no-hold' than 'hold' index. One can obtain a reasonably accurate estimate of current Full Scale IQ without formal intelligence testing from a brief neuropsychological battery.
Cross-Cultural Psychometric Assessment of the Leeds Assessment of Neuropathic Symptoms and Signs (LANSS) Pain Scale in the Portuguese Population.

PubMed

Barbosa, Margarida; Bennett, Michael I; Verissimo, Ramiro; Carvalho, Davide

2014-09-01

Chronic pain is a well-known phenomenon. The differential diagnosis between neuropathic and nociceptive pain syndromes is a challenge. Consequently, assessment instruments that can distinguish between these conditions in a standardized way are of the utmost importance. The Leeds Assessment of Neuropathic Symptoms and Signs (LANSS) is a screening tool developed to identify chronic neuropathic pain. The aim of this study was the Portuguese language translation and linguistic adaptation of the LANSS pain scale, and the evaluation of its semantic validation, internal consistency, temporal stability, as well as its validity and discriminative power. The Portuguese version of the LANSS scale was applied to 165 consecutive patients attending the pain clinic: 103 fulfilled the clinical criteria for the diagnosis of pain of neuropathic origin and the remaining 62 fulfilled the criteria for nociceptive pain. The scale proved to be an internally consistent (Cronbach's alpha = 0.78) and reliable instrument with good test-retest stability (r = 0.7; P < 0.001). With a cutoff point of ≥ 12 for differentiating patients with neuropathic pain from those with non-neuropathic pain, the scale had 89% sensitivity, 74% specificity, a positive predictive value of 85%, and a negative predictive value of 81%. The properties of the Portuguese version of the LANSS pain scale lead us to the conclusion that such a cross-cultural version is a reliable and valid instrument for the differentiation of this type of pain. Its usage is recommended. © 2013 World Institute of Pain.
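The accuracy figures in this record can be cross-checked against the reported group sizes. The sketch below assumes the 103 neuropathic patients are the reference positives and the 62 nociceptive patients the reference negatives. It rebuilds approximate cell counts by rounding, because the abstract does not give the counts themselves. The function names are hypothetical.

```python
def reconstruct_counts(n_pos, n_neg, sensitivity, specificity):
    """Approximate 2x2 cell counts from group sizes and reported rates."""
    tp = round(n_pos * sensitivity)   # true positives
    tn = round(n_neg * specificity)   # true negatives
    fn = n_pos - tp                   # false negatives
    fp = n_neg - tn                   # false positives
    return tp, fp, fn, tn

def predictive_values(tp, fp, fn, tn):
    """Positive and negative predictive value from cell counts."""
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv

# Group sizes and rates as reported in the abstract; counts are reconstructed.
tp, fp, fn, tn = reconstruct_counts(103, 62, 0.89, 0.74)
ppv, npv = predictive_values(tp, fp, fn, tn)
```

Under these assumptions the reconstructed predictive values round to the 85% and 81% given in the abstract, which suggests the four reported figures are mutually consistent.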
Tools for assessing fall risk in the elderly: a systematic review and meta-analysis.

PubMed

Park, Seong-Hi

2018-01-01

The prevention of falls among the elderly is arguably one of the most important public health issues in today's aging society. The aim of this study was to assess which tools best predict the risk of falls in the elderly. Electronic searches were performed using Medline, EMBASE, the Cochrane Library, CINAHL, etc., using the following keywords: "fall risk assessment", "elderly fall screening", and "elderly mobility scale". The QUADAS-2 was applied to assess the internal validity of the diagnostic studies. Selected studies were meta-analyzed with MetaDisc 1.4. A total of 33 studies were eligible out of the 2,321 studies retrieved from selected databases. Twenty-six assessment tools for fall risk were used in the selected articles, and they tended to vary based on the setting. The fall risk assessment tools currently used for the elderly did not show sufficiently high predictive validity for differentiating high and low fall risks. The Berg Balance scale and Mobility Interaction Fall chart showed stable and high specificity, while the Downton Fall Risk Index, Hendrich II Fall Risk Model, St. Thomas's Risk Assessment Tool in Falling elderly inpatients, Timed Up and Go test, and Tinetti Balance scale showed the opposite results. We concluded that rather than a single measure, two assessment tools used together would better evaluate the characteristics of falls by the elderly that can occur due to a multitude of factors and maximize the advantages of each for predicting the occurrence of falls.
The Predictive Validity of a Gender-Responsive Needs Assessment: An Exploratory Study

ERIC Educational Resources Information Center

Salisbury, Emily J.; Van Voorhis, Patricia; Spiropoulos, Georgia V.

2009-01-01

Risk assessment and classification systems for women have been largely derived from male-based systems. As a result, many of the needs unique to women are not formally assessed or treated. Emerging research advocating a gender-responsive approach to the supervision and treatment of women offenders suggests that needs such as abuse, mental health,…
Can a Two-Question Test Be Reliable and Valid for Predicting Academic Outcomes?

ERIC Educational Resources Information Center

Bridgeman, Brent

2016-01-01

Scores on essay-based assessments that are part of standardized admissions tests are typically given relatively little weight in admissions decisions compared to the weight given to scores from multiple-choice assessments. Evidence is presented to suggest that more weight should be given to these assessments. The reliability of the writing scores…
Predicting survival of de novo metastatic breast cancer in Asian women: systematic review and validation study.

PubMed

Miao, Hui; Hartman, Mikael; Bhoo-Pathy, Nirmala; Lee, Soo-Chin; Taib, Nur Aishah; Tan, Ern-Yu; Chan, Patrick; Moons, Karel G M; Wong, Hoong-Seam; Goh, Jeremy; Rahim, Siti Mastura; Yip, Cheng-Har; Verkooijen, Helena M

2014-01-01

In Asia, up to 25% of breast cancer patients present with distant metastases at diagnosis. Given the heterogeneous survival probabilities of de novo metastatic breast cancer, individual outcome prediction is challenging. The aim of the study is to identify existing prognostic models for patients with de novo metastatic breast cancer and validate them in Asia. We performed a systematic review to identify prediction models for metastatic breast cancer. Models were validated in 642 women with de novo metastatic breast cancer registered between 2000 and 2010 in the Singapore Malaysia Hospital Based Breast Cancer Registry. Survival curves for low, intermediate and high-risk groups according to each prognostic score were compared by log-rank test and discrimination of the models was assessed by concordance statistic (C-statistic). We identified 16 prediction models, seven of which were for patients with brain metastases only. Performance status, estrogen receptor status, metastatic site(s) and disease-free interval were the most common predictors. We were able to validate nine prediction models. The capacity of the models to discriminate between poor and good survivors varied from poor to fair with C-statistics ranging from 0.50 (95% CI, 0.48-0.53) to 0.63 (95% CI, 0.60-0.66). The discriminatory performance of existing prediction models for de novo metastatic breast cancer in Asia is modest. Development of an Asian-specific prediction model is needed to improve prognostication and guide decision making.
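The C-statistic used here to measure discrimination is the fraction of comparable patient pairs in which the higher-risk patient has the shorter survival. The following is a simplified sketch of a Harrell-type concordance index. It is not the study's actual procedure, which the abstract does not specify. It ignores tied survival times, and the name `concordance_index` and the toy inputs are assumptions for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Simplified Harrell-type C-statistic.

    times: follow-up time per patient
    events: 1 if death was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable

# Toy example: risk ordering perfectly matches survival ordering.
c_perfect = concordance_index([1, 2, 3], [1, 1, 1], [3, 2, 1])
# Toy example: identical risk for everyone carries no information.
c_uninformative = concordance_index([1, 2, 3], [1, 1, 1], [1, 1, 1])
```

A value of 0.5 corresponds to chance-level discrimination, which is why the reported range of 0.50 to 0.63 is described as poor to fair.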
Validation of a Method To Screen for Pulmonary Hypertension in Advanced Idiopathic Pulmonary Fibrosis*

PubMed Central

Zisman, David A.; Karlamangla, Arun S.; Kawut, Steven M.; Shlobin, Oksana A.; Saggar, Rajeev; Ross, David J.; Schwarz, Marvin I.; Belperio, John A.; Ardehali, Abbas; Lynch, Joseph P.; Nathan, Steven D.

2008-01-01

Background We have developed a method to screen for pulmonary hypertension (PH) in idiopathic pulmonary fibrosis (IPF) patients, based on a formula to predict mean pulmonary artery pressure (MPAP) from standard lung function measurements. The objective of this study was to validate this method in a separate group of IPF patients. Methods Cross-sectional study of 60 IPF patients from two institutions. The accuracy of the MPAP estimation was assessed by examining the correlation between the predicted and measured MPAPs and the magnitude of the estimation error. The discriminatory ability of the method for PH was assessed using the area under the receiver operating characteristic curve (AUC). Results There was strong correlation in the expected direction between the predicted and measured MPAPs (r = 0.72; p < 0.0001). The estimated MPAP was within 5 mm Hg of the measured MPAP 72% of the time. The AUC for predicting PH was 0.85, and did not differ by institution. A formula-predicted MPAP > 21 mm Hg was associated with a sensitivity, specificity, positive predictive value, and negative predictive value of 95%, 58%, 51%, and 96%, respectively, for PH defined as MPAP from right-heart catheterization > 25 mm Hg. Conclusions A prediction formula for MPAP using standard lung function measurements can be used to screen for PH in IPF patients. PMID:18198245
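The predictive values in this record depend on how common PH was in the sample, which the abstract does not state. As a worked sketch, Bayes' theorem lets the implied prevalence \(\pi\) be backed out from the reported sensitivity (Se = 0.95), specificity (Sp = 0.58), and positive predictive value (0.51). The result is an inference from rounded percentages, not a figure from the study.

```latex
\[
\mathrm{PPV} = \frac{Se\,\pi}{Se\,\pi + (1 - Sp)(1 - \pi)}
\quad\Longrightarrow\quad
\pi = \frac{\mathrm{PPV}\,(1 - Sp)}{Se\,(1 - \mathrm{PPV}) + \mathrm{PPV}\,(1 - Sp)}
\]
\[
\pi \approx \frac{0.51 \times 0.42}{0.95 \times 0.49 + 0.51 \times 0.42}
= \frac{0.2142}{0.6797} \approx 0.32
\]
\[
\mathrm{NPV} = \frac{Sp\,(1 - \pi)}{Sp\,(1 - \pi) + (1 - Se)\,\pi}
\approx \frac{0.58 \times 0.68}{0.58 \times 0.68 + 0.05 \times 0.32}
\approx 0.96
\]
```

An implied prevalence of about 32% would correspond to roughly 19 of the 60 patients, and the negative predictive value recomputed from it agrees with the reported 96%.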
Validity and reliability of a questionnaire to assess social skills in traumatic brain injury: A preliminary study.

PubMed

Francis, Heather M; Osborne-Crowley, Katherine; McDonald, Skye

2017-01-01

To describe the reliability and validity of a new measure, the Social Skills Questionnaire for Traumatic Brain Injury (SSQ-TBI). Fifty-one adults with severe TBI completed the SSQ-TBI questionnaire. Scores were compared to informant- and self-report on questionnaires addressing frontal lobe mediated behaviour, as well as performance on an objective measure of social cognition and neuropsychological tasks, in order to provide evidence of concurrent, divergent and predictive validity. Internal consistency was excellent at α = 0.90. Convergent validity was good, with informant ratings on the SSQ-TBI significantly correlated with Neuropsychiatric Inventory Disinhibition sub-scales (r = 0.50-0.63), the Current Behaviour Scale (r = 0.39-0.48) and Frontal Systems Behaviour Scale (r = 0.60-0.83). However, no relationship was seen with an objective measure of social skills or neuropsychological tasks of disinhibition. There was a significant relationship with real-world psychosocial outcomes on the Sydney Psychosocial Reintegration Scale-2 (r = -0.38 to -0.69). Conclusions: This study provides preliminary findings of good internal consistency and convergent and predictive validity of a social skills questionnaire adapted to be appropriate for individuals with TBI. Further assessment of psychometric properties such as test-retest reliability and factor structure is warranted.
Bayesian cross-entropy methodology for optimal design of validation experiments

NASA Astrophysics Data System (ADS)

Jiang, X.; Mahadevan, S.

2006-07-01

An important concern in the design of validation experiments is how to incorporate the mathematical model in the design in order to allow conclusive comparisons of model prediction with experimental output in model assessment. The classical experimental design methods are more suitable for phenomena discovery and may result in a subjective, expensive, time-consuming and ineffective design that may adversely impact these comparisons. In this paper, an integrated Bayesian cross-entropy methodology is proposed to perform the optimal design of validation experiments incorporating the computational model. The expected cross entropy, an information-theoretic distance between the distributions of model prediction and experimental observation, is defined as a utility function to measure the similarity of two distributions. A simulated annealing algorithm is used to find optimal values of input variables through minimizing or maximizing the expected cross entropy. The measured data after testing with the optimum input values are used to update the distribution of the experimental output using Bayes theorem. The procedure is repeated to adaptively design the required number of experiments for model assessment, each time ensuring that the experiment provides effective comparison for validation. The methodology is illustrated for the optimal design of validation experiments for a three-leg bolted joint structure and a composite helicopter rotor hub component.

Long-Term Survival Prediction for Coronary Artery Bypass Grafting: Validation of the ASCERT Model Compared With The Society of Thoracic Surgeons Predicted Risk of Mortality.

PubMed

Lancaster, Timothy S; Schill, Matthew R; Greenberg, Jason W; Ruaengsri, Chawannuch; Schuessler, Richard B; Lawton, Jennifer S; Maniar, Hersh S; Pasque, Michael K; Moon, Marc R; Damiano, Ralph J; Melby, Spencer J

2018-05-01

The recently developed American College of Cardiology Foundation-Society of Thoracic Surgeons (STS) Collaboration on the Comparative Effectiveness of Revascularization Strategy (ASCERT) Long-Term Survival Probability Calculator is a valuable addition to existing short-term risk-prediction tools for cardiac surgical procedures but has yet to be externally validated. Institutional data of 654 patients aged 65 years or older undergoing isolated coronary artery bypass grafting between 2005 and 2010 were reviewed. Predicted survival probabilities were calculated using the ASCERT model. Survival data were collected using the Social Security Death Index and institutional medical records. Model calibration and discrimination were assessed for the overall sample and for risk-stratified subgroups based on (1) ASCERT 7-year survival probability and (2) the predicted risk of mortality (PROM) from the STS Short-Term Risk Calculator. Logistic regression analysis was performed to evaluate additional perioperative variables contributing to death. Overall survival was 92.1% (569 of 597) at 1 year and 50.5% (164 of 325) at 7 years. Calibration assessment found no significant differences between predicted and actual survival curves for the overall sample or for the risk-stratified subgroups, whether stratified by predicted 7-year survival or by PROM. Discriminative performance was comparable between the ASCERT and PROM models for 7-year survival prediction (p < 0.001 for both; C-statistic = 0.815 for ASCERT and 0.781 for PROM). Prolonged ventilation, stroke, and hospital length of stay were also predictive of long-term death. The ASCERT survival probability calculator was externally validated for prediction of long-term survival after coronary artery bypass grafting in all risk groups. The widely used STS PROM performed comparably as a predictor of long-term survival. 
Both tools provide important information for preoperative decision making and patient counseling about potential outcomes after coronary artery bypass grafting. Copyright © 2018 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
SELF-RATED EXPECTATIONS OF SUICIDAL BEHAVIOR PREDICT FUTURE SUICIDE ATTEMPTS AMONG ADOLESCENT AND YOUNG ADULT PSYCHIATRIC EMERGENCY PATIENTS.

PubMed

Czyz, Ewa K; Horwitz, Adam G; King, Cheryl A

2016-06-01

This study's purpose was to examine the predictive validity and clinical utility of a brief measure assessing youths' own expectations of their future risk of suicidal behavior, administered in a psychiatric emergency (PE) department; and determine if youths' ratings improve upon a clinician-administered assessment of suicidal ideation severity. The outcome was suicide attempts up to 18 months later. In this medical record review study, 340 consecutively presenting youths (ages 13-24) seeking PE services over a 7-month period were included. Subsequent PE visits and suicide attempts were retrospectively tracked for up to 18 months. The 3-item scale assessing patients' perception of their own suicidal behavior risk and the clinician-administered ideation severity scale were used routinely at the study site. Cox regression results showed that youths' expectations of suicidal behavior were independently associated with increased risk of suicide attempts, even after adjusting for key covariates. Results were not moderated by sex, suicide attempt history, or age. Receiver-operating characteristic (ROC) analyses indicated that self-assessed expectations of risk improved the predictive accuracy of the clinician-administered suicidal ideation measure. Youths' ratings indicative of lower confidence in maintaining safety uniquely predicted follow-up attempts and provided incremental validity over and above the clinician-administered assessment and improved its accuracy, suggesting their potential for augmenting suicide risk formulation. Assessing youths' own perceptions of suicide risk appears to be clinically useful, feasible to implement in PE settings, and, if replicated, promising for improving identification of youth at risk for suicidal behavior. © 2016 Wiley Periodicals, Inc.
Developing and validating risk prediction models in an individual participant data meta-analysis

PubMed Central

2014-01-01

Background Risk prediction models estimate the risk of developing future outcomes for individuals based on one or more underlying characteristics (predictors). We review how researchers develop and validate risk prediction models within an individual participant data (IPD) meta-analysis, in order to assess the feasibility and conduct of the approach. Methods A qualitative review of the aims, methodology, and reporting in 15 articles that developed a risk prediction model using IPD from multiple studies. Results The IPD approach offers many opportunities but methodological challenges exist, including: unavailability of requested IPD, missing patient data and predictors, and between-study heterogeneity in methods of measurement, outcome definitions and predictor effects. Most articles develop their model using IPD from all available studies and perform only an internal validation (on the same set of data). Ten of the 15 articles did not allow for any study differences in baseline risk (intercepts), potentially limiting their model’s applicability and performance in some populations. Only two articles used external validation (on different data), including a novel method which develops the model on all but one of the IPD studies, tests performance in the excluded study, and repeats by rotating the omitted study. Conclusions An IPD meta-analysis offers unique opportunities for risk prediction research. Researchers can make more of this by allowing separate model intercept terms for each study (population) to improve generalisability, and by using ‘internal-external cross-validation’ to simultaneously develop and validate their model. Methodological challenges can be reduced by prospectively planned collaborations that share IPD for risk prediction. PMID:24397587
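The 'internal-external cross-validation' described in this record (develop on all but one study, test on the omitted study, then rotate) can be sketched as a simple loop. The function and argument names below are hypothetical, and the toy "model" is just a mean predictor chosen to keep the example self-contained. It is not a risk model from any of the reviewed articles.

```python
def internal_external_cv(studies, fit, evaluate):
    """Leave-one-study-out validation.

    studies: dict mapping study name -> list of (x, y) records
    fit: callable taking a list of records and returning a model
    evaluate: callable taking (model, records) and returning a score
    Returns a dict of study name -> score when that study was held out.
    """
    results = {}
    for held_out in studies:
        # Develop the model on every study except the held-out one.
        training = [rec for name, recs in studies.items()
                    if name != held_out for rec in recs]
        model = fit(training)
        # Validate on the study that played no part in development.
        results[held_out] = evaluate(model, studies[held_out])
    return results

def fit_mean(records):
    """Toy model: predict the mean outcome of the training records."""
    return sum(y for _, y in records) / len(records)

def mean_abs_error(model, records):
    """Mean absolute error of the constant prediction."""
    return sum(abs(y - model) for _, y in records) / len(records)

# Made-up data for three hypothetical studies.
toy = {"A": [(0, 1.0), (0, 3.0)], "B": [(0, 2.0)], "C": [(0, 4.0), (0, 6.0)]}
scores = internal_external_cv(toy, fit_mean, mean_abs_error)
```

Passing `fit` and `evaluate` as callables keeps the rotation logic separate from the modelling choice, so the same loop could wrap a regression model with study-specific intercepts.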
Nomograms to predict the pathological stage of clinically localized prostate cancer in Korean men: comparison with western predictive tools using decision curve analysis.

PubMed

Jeong, Chang Wook; Jeong, Seong Jin; Hong, Sung Kyu; Lee, Seung Bae; Ku, Ja Hyeon; Byun, Seok-Soo; Jeong, Hyeon; Kwak, Cheol; Kim, Hyeon Hoe; Lee, Eunsik; Lee, Sang Eun

2012-09-01

To develop and evaluate nomograms to predict the pathological stage of clinically localized prostate cancer after radical prostatectomy in Korean men. We reviewed the medical records of 2041 patients who had clinical stages T1c-T3a prostate cancer and were treated solely with radical prostatectomy at two hospitals. Logistic regressions were carried out to predict organ-confined disease, extraprostatic extension, seminal vesicle invasion, and lymph node metastasis using preoperative variables and resulting nomograms. Internal validations were assessed using the area under the receiver operating characteristic curve and calibration plot, and then external validations were carried out on 129 patients from another hospital. Head-to-head comparisons with 2007 Partin tables and Cancer of the Prostate Risk Assessment score were carried out using the area under the curve and decision curve analysis. The significant predictors for organ-confined disease and extraprostatic extension were clinical stage, prostate-specific antigen, Gleason score and a percent positive core of biopsy. Significant predictors for seminal vesicle invasion were prostate-specific antigen, Gleason score and percent positive core, and those for lymph node metastasis were prostate-specific antigen and percent positive core. The area under the curve of established nomograms for organ-confined disease, extraprostatic extension, seminal vesicle invasion and lymph node metastasis were 0.809, 0.804, 0.889 and 0.838, respectively. The nomograms were well calibrated and externally validated. These nomograms showed significantly higher accuracies and net benefits than two Western tools in Korean men. This is the first study to have developed and fully validated nomograms to predict the pathological stage of prostate cancer in an Asian population. These nomograms might be more accurate and useful for Korean men than other predictive models developed using Western populations. 
© 2012 The Japanese Urological Association.
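Decision curve analysis, used in this record to compare the nomograms with Western tools, rests on the net benefit of a model at a chosen threshold probability. A minimal sketch of the standard net-benefit formula follows. The function names and toy data are assumptions for illustration and are unrelated to the study's patients.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * (threshold / (1 - threshold))

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat everyone regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * (threshold / (1 - threshold))

# Made-up outcomes and predicted probabilities for four patients.
outcomes = [1, 1, 0, 0]
predicted = [0.9, 0.6, 0.4, 0.1]
nb_model = net_benefit(outcomes, predicted, 0.5)
nb_all = net_benefit_treat_all(outcomes, 0.5)
```

A decision curve plots these quantities over a range of thresholds. A model is considered useful where its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.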
Predicting stress urinary incontinence during pregnancy: combination of pelvic floor ultrasound parameters and clinical factors.

PubMed

Chen, Ling; Luo, Dan; Yu, Xiajuan; Jin, Mei; Cai, Wenzhi

2018-05-12

The aim of this study was to develop and validate a predictive tool combining pelvic floor ultrasound parameters and clinical factors for stress urinary incontinence during pregnancy. A total of 535 women in the first or second trimester from two hospitals were included for an interview and transperineal ultrasound assessment. Imaging data sets were analyzed offline to assess for bladder neck vertical position, urethra angles (α, β, and γ angles), hiatal area and bladder neck funneling. All significant continuous variables at univariable analysis were analyzed by receiver-operating characteristics. Three multivariable logistic models were built on clinical factors, and combined with ultrasound parameters. The final predictive model with best performance and fewest variables was selected to establish a nomogram. Internal and external validation of the nomogram were performed by both discrimination represented by C-index and calibration measured by Hosmer-Lemeshow test. A decision curve analysis was conducted to determine the clinical utility of the nomogram. After excluding 14 women with invalid data, 521 women were analyzed. β angle, γ angle and hiatal area had limited predictive value for stress urinary incontinence during pregnancy, with areas under the curve of 0.558-0.648. The final predictive model included body mass index gain since pregnancy, constipation, previous delivery mode, β angle at rest, and bladder neck funneling. The nomogram based on the final model showed good discrimination with a C-index of 0.789 and satisfactory calibration (P=0.828), both of which were supported by external validation. Decision curve analysis showed that the nomogram was clinically useful. The nomogram incorporating both the pelvic floor ultrasound parameters and clinical factors has been validated to show good discrimination and calibration, and could be an important tool for stress urinary incontinence risk prediction at an early stage of pregnancy.
Validation of the Social Appearance Anxiety Scale: factor, convergent, and divergent validity.

PubMed

Levinson, Cheri A; Rodebaugh, Thomas L

2011-09-01

The Social Appearance Anxiety Scale (SAAS) was created to assess fear of overall appearance evaluation. Initial psychometric work indicated that the measure had a single-factor structure and exhibited excellent internal consistency, test-retest reliability, and convergent validity. In the current study, the authors further examined the factor, convergent, and divergent validity of the SAAS in two samples of undergraduates. In Study 1 (N = 323), the authors tested the factor structure, convergent, and divergent validity of the SAAS with measures of the Big Five personality traits, negative affect, fear of negative evaluation, and social interaction anxiety. In Study 2 (N = 118), participants completed a body evaluation that included measurements of height, weight, and body fat content. The SAAS exhibited excellent convergent and divergent validity with self-report measures (i.e., self-esteem, trait anxiety, ethnic identity, and sympathy), predicted state anxiety experienced during the body evaluation, and predicted body fat content. In both studies, results confirmed a single-factor structure as the best fit to the data. These results lend additional support for the use of the SAAS as a valid measure of social appearance anxiety.
VDA, a Method of Choosing a Better Algorithm with Fewer Validations

PubMed Central

Kluger, Yuval

2011-01-01

The multitude of bioinformatics algorithms designed for performing a particular computational task presents end-users with the problem of selecting the most appropriate computational tool for analyzing their biological data. The choice of the best available method is often based on expensive experimental validation of the results. We propose an approach to design validation sets for method comparison and performance assessment that are effective in terms of cost and discrimination power. Validation Discriminant Analysis (VDA) is a method for designing a minimal validation dataset to allow reliable comparisons between the performances of different algorithms. Implementation of our VDA approach achieves this reduction by selecting predictions that maximize the minimum Hamming distance between algorithmic predictions in the validation set. We show that VDA can be used to correctly rank algorithms according to their performances. These results are further supported by simulations and by realistic algorithmic comparisons in silico. VDA is a novel, cost-efficient method for minimizing the number of validation experiments necessary for reliable performance estimation and fair comparison between algorithms. Our VDA software is available at http://sourceforge.net/projects/klugerlab/files/VDA/ PMID:22046256
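The selection criterion described in this record is to pick validation items that maximize the minimum Hamming distance between the algorithms' predictions. It can be illustrated with a brute-force search over small inputs. This is only a sketch of the stated criterion under assumed binary predictions. It is not the authors' VDA software, and the function names are hypothetical.

```python
from itertools import combinations

def min_pairwise_hamming(predictions, chosen):
    """Smallest Hamming distance between any two algorithms' prediction
    vectors, restricted to the chosen item indices."""
    return min(
        sum(a[i] != b[i] for i in chosen)
        for a, b in combinations(predictions, 2)
    )

def best_validation_set(predictions, size):
    """Exhaustively pick `size` items maximizing the minimum pairwise
    Hamming distance (feasible only for small candidate pools)."""
    n_items = len(predictions[0])
    return max(
        combinations(range(n_items), size),
        key=lambda chosen: min_pairwise_hamming(predictions, chosen),
    )

# Made-up binary predictions of three algorithms on four candidate items.
toy_predictions = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]
chosen_items = best_validation_set(toy_predictions, 3)
```

In the toy data all three algorithms agree on item 0, so validating it cannot separate them. The search therefore prefers the items on which the algorithms disagree.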
The Shutdown Dissociation Scale (Shut-D)

PubMed Central

Schalinski, Inga; Schauer, Maggie; Elbert, Thomas

2015-01-01

The evolutionary model of the defense cascade by Schauer and Elbert (2010) provides a theoretical frame for a short interview to assess problems underlying and leading to the dissociative subtype of posttraumatic stress disorder. Based on known characteristics of the defense stages “fright,” “flag,” and “faint,” we designed a structured interview to assess the vulnerability for the respective types of dissociation. Most of the scales that assess dissociative phenomena are designed as self-report questionnaires. Their items are usually selected based on heuristic considerations rather than a theoretical model and thus include anything from minor dissociative experiences to major pathological dissociation. The shutdown dissociation scale (Shut-D) was applied in several studies in patients with a history of multiple traumatic events and different disorders that have been shown previously to be prone to symptoms of dissociation. The goal of the present investigation was to obtain psychometric characteristics of the Shut-D (including factor structure, internal consistency, retest reliability, predictive, convergent and criterion-related concurrent validity). A total of 225 patients and 68 healthy controls were assessed. Shut-D appears to have sufficient internal reliability, excellent retest reliability, high convergent validity, and satisfactory predictive validity, while the summed score of the scale reliably separates patients with exposure to trauma (in different diagnostic groups) from healthy controls. The Shut-D is a brief structured interview for assessing the vulnerability to dissociate as a consequence of exposure to traumatic stressors. The scale demonstrates high-quality psychometric properties and may be useful for researchers and clinicians in assessing shutdown dissociation as well as in predicting the risk of dissociative responding. PMID:25976478
Depression among Parents Two to Six Years Following the Loss of a Child by Suicide: A Novel Prediction Model.

PubMed

Nyberg, Tommy; Hed Myrberg, Ida; Omerov, Pernilla; Steineck, Gunnar; Nyberg, Ullakarin

2016-01-01

Parents who lose a child by suicide have elevated risks of depression. No clinical prediction tools exist to identify which suicide-bereaved parents will be particularly vulnerable; we aimed to create a prediction model for long-term depression for this purpose. During 2009 and 2010 we collected data using a nationwide study-specific questionnaire among parents in Sweden who had lost a child aged 15-30 by suicide in years 2004-2007. Current depression was assessed with the Patient Health Questionnaire (PHQ-9) and a single question on antidepressant use. We considered 26 potential predictors assumed clinically assessable at the time of loss, including socio-economics, relationship status, history of psychological stress and morbidity, and suicide-related circumstances. We developed a novel prediction model using logistic regression with all subsets selection and stratified cross-validation. The model was assessed for classification performance and calibration, overall and stratified by time since loss. In total 666/915 (73%) participated. The model showed acceptable classification performance (adjusted area under the curve [AUC] = 0.720, 95% confidence interval [CI] 0.673-0.766), but classified best for those with the shortest time since loss. Agreement between model-predicted and observed risks was fair, but with a tendency for underestimation and overestimation for individuals with shortest and longest time since loss, respectively. The identified predictors include female sex (odds ratio [OR] = 1.84); sick-leave (OR = 2.81) or unemployment (OR = 1.64); psychological premorbidity debuting during the last 10 years before loss (OR = 3.64) or more than 10 years ago (OR = 4.96); and suicide in biological relatives (OR = 1.54). Non-legal guardianship during the child's upbringing (OR = 0.48) and non-biological parenthood (OR = 0.22) were found to be protective.
Our prediction model shows promising internal validity, but should be externally validated before application. Psychological premorbidity seems to be a prominent predictor of long-term depression among suicide-bereaved parents, and thus important for healthcare providers to assess.
Using SMS Text Messaging to Assess Moderators of Smoking Reduction: Validating a New Tool for Ecological Measurement of Health Behaviors

PubMed Central

Berkman, Elliot T.; Dickenson, Janna; Falk, Emily B.; Lieberman, Matthew D.

2011-01-01

Objective: Understanding the psychological processes that contribute to smoking reduction will yield population health benefits. Negative mood may moderate smoking lapse during cessation, but this relationship has been difficult to measure in ongoing daily experience. We used a novel form of ecological momentary assessment to test a self-control model of negative mood and craving leading to smoking lapse. Design: We validated short message service (SMS) text as a user-friendly and low-cost option for ecologically measuring real-time health behaviors. We sent text messages to cigarette smokers attempting to quit eight times daily for the first 21 days of cessation (N-obs = 3,811). Main outcome measures: Approximately every two hours, we assessed cigarette count, mood, and cravings, and examined between- and within-day patterns and time-lagged relationships among these variables. Exhaled carbon monoxide was assessed pre- and posttreatment. Results: Negative mood and craving predicted smoking two hours later, but craving mediated the mood–smoking relationship. Also, this mediation relationship predicted smoking over the next two, but not four, hours. Conclusion: Results clarify conflicting previous findings on the relation between affect and smoking, validate a new low-cost and user-friendly method for collecting fine-grained health behavior assessments, and emphasize the importance of rapid, real-time measurement of smoking moderators. PMID:21401252
Evaluation of the DAVROS (Development And Validation of Risk-adjusted Outcomes for Systems of emergency care) risk-adjustment model as a quality indicator for healthcare

PubMed Central

Wilson, Richard; Goodacre, Steve W; Klingbajl, Marcin; Kelly, Anne-Maree; Rainer, Tim; Coats, Tim; Holloway, Vikki; Townend, Will; Crane, Steve

2014-01-01

Background and objective: Risk-adjusted mortality rates can be used as a quality indicator if it is assumed that the discrepancy between predicted and actual mortality can be attributed to the quality of healthcare (ie, the model has attributional validity). The Development And Validation of Risk-adjusted Outcomes for Systems of emergency care (DAVROS) model predicts 7-day mortality in emergency medical admissions. We aimed to test this assumption by evaluating the attributional validity of the DAVROS risk-adjustment model. Methods: We selected cases that had the greatest discrepancy between observed mortality and predicted probability of mortality from seven hospitals involved in validation of the DAVROS risk-adjustment model. Reviewers at each hospital assessed hospital records to determine whether the discrepancy between predicted and actual mortality could be explained by the healthcare provided. Results: We received 232/280 (83%) completed review forms relating to 179 unexpected deaths and 53 unexpected survivors. The healthcare system was judged to have potentially contributed to 10/179 (8%) of the unexpected deaths and 26/53 (49%) of the unexpected survivors. Failure of the model to appropriately predict risk was judged to be responsible for 135/179 (75%) of the unexpected deaths and 2/53 (4%) of the unexpected survivors. Some 10/53 (19%) of the unexpected survivors died within a few months of the 7-day period of model prediction. Conclusions: We found little evidence that deaths occurring in patients with a low predicted mortality from risk-adjustment could be attributed to the quality of healthcare provided. PMID:23605036
Systematic review of prediction models for delirium in the older adult inpatient.

PubMed

Lindroth, Heidi; Bratzke, Lisa; Purvis, Suzanne; Brown, Roger; Coburn, Mark; Mrkobrada, Marko; Chan, Matthew T V; Davis, Daniel H J; Pandharipande, Pratik; Carlsson, Cynthia M; Sanders, Robert D

2018-04-28

To identify existing prognostic delirium prediction models and evaluate their validity and statistical methodology in the older adult (≥60 years) acute hospital population. Systematic review. PubMed, CINAHL, PsychINFO, SocINFO, Cochrane, Web of Science and Embase were searched from 1 January 1990 to 31 December 2016. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses and CHARMS Statement guided protocol development. Inclusion criteria: age >60 years, inpatient, developed/validated a prognostic delirium prediction model. Exclusion criteria: alcohol-related delirium, sample size ≤50. The primary performance measures were calibration and discrimination statistics. Two authors independently conducted the search and extracted data. The synthesis of data was done by the first author. Disagreement was resolved by the mentoring author. The initial search resulted in 7,502 studies. Following full-text review of 192 studies, 33 were excluded based on age criteria (<60 years) and 27 met the defined criteria. Twenty-three delirium prediction models were identified, 14 were externally validated and 3 were internally validated. The following populations were represented: 11 medical, 3 medical/surgical and 13 surgical. The assessment of delirium was often non-systematic, resulting in varied incidence. Fourteen models were externally validated with an area under the receiver operating characteristic curve ranging from 0.52 to 0.94. Limitations in design, data collection methods and model metric reporting statistics were identified. Delirium prediction models for older adults show variable and typically inadequate predictive capabilities. Our review highlights the need for development of robust models to predict delirium in older inpatients. We provide recommendations for the development of such models.
Environmental fate model for ultra-low-volume insecticide applications used for adult mosquito management

USGS Publications Warehouse

Schleier, Jerome J.; Peterson, Robert K.D.; Irvine, Kathryn M.; Marshall, Lucy M.; Weaver, David K.; Preftakes, Collin J.

2012-01-01

One of the more effective ways of managing high densities of adult mosquitoes that vector human and animal pathogens is ultra-low-volume (ULV) aerosol applications of insecticides. The U.S. Environmental Protection Agency uses models that are not validated for ULV insecticide applications and exposure assumptions to perform their human and ecological risk assessments. Currently, there is no validated model that can accurately predict deposition of insecticides applied using ULV technology for adult mosquito management. In addition, little is known about the deposition and drift of small droplets like those used under conditions encountered during ULV applications. The objective of this study was to perform field studies to measure environmental concentrations of insecticides and to develop a validated model to predict the deposition of ULV insecticides. The final regression model was selected by minimizing the Bayesian Information Criterion and its prediction performance was evaluated using k-fold cross validation. Density of the formulation and the density and CMD interaction coefficients were the largest in the model. The results showed that as density of the formulation decreases, deposition increases. The interaction of density and CMD showed that higher density formulations and larger droplets resulted in greater deposition. These results are supported by the aerosol physics literature. A k-fold cross validation demonstrated that the mean square error of the selected regression model is not biased, and the mean square error and mean square prediction error indicated good predictive ability.
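The k-fold cross-validation step mentioned above can be sketched as follows. This is a minimal illustration, not the authors' model: it fits a one-predictor least-squares line rather than their multi-term regression, uses interleaved folds without shuffling, and the names `fit_line` and `kfold_mspe` are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def kfold_mspe(xs, ys, k=5):
    """Mean square prediction error estimated by k-fold cross-validation.

    Each fold is held out once; the line is fitted on the remaining points
    and evaluated only on the held-out points.
    """
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        held_out = set(range(fold, n, k))  # interleaved fold, no shuffling
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        slope, intercept = fit_line(train_x, train_y)
        for i in held_out:
            squared_errors.append((ys[i] - (intercept + slope * xs[i])) ** 2)
    return mean(squared_errors)


# Made-up data: an exact line should be predicted with near-zero error,
# while perturbed observations give a strictly positive error.
xs = [float(i) for i in range(10)]
exact = [2.0 * x + 1.0 for x in xs]
noisy = [y + (0.5 if i % 2 else -0.5) for i, y in enumerate(exact)]
```

Comparing the cross-validated error with the in-sample mean square error is one common way to check whether the in-sample figure is optimistic.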
A comparative study on the predictive ability of the decision tree, support vector machine and neuro-fuzzy models in landslide susceptibility mapping using GIS

NASA Astrophysics Data System (ADS)

Pradhan, Biswajeet

2013-02-01

The purpose of the present study is to compare the prediction performances of three approaches, decision tree (DT), support vector machine (SVM) and adaptive neuro-fuzzy inference system (ANFIS), for landslide susceptibility mapping at the Penang Hill area, Malaysia. The necessary input parameters for the landslide susceptibility assessments were obtained from various sources. First, landslide locations were identified from aerial photographs and field surveys, and a total of 113 landslide locations were mapped. The study area contains 340,608 pixels, of which 8403 pixels include landslides. The landslide inventory was randomly partitioned into two subsets: (1) part 1, containing 50% (4000 landslide grid cells), was used in the training phase of the models; (2) part 2, a validation dataset containing 50% (4000 landslide grid cells), was used to validate the three models and confirm their accuracy. The digitally processed images of input parameters were combined in GIS. Finally, landslide susceptibility maps were produced, and the performances were assessed and discussed. A total of fifteen landslide susceptibility maps were produced using DT, SVM and ANFIS based models, and the resultant maps were validated using the landslide locations. Prediction performances of these maps were checked by receiver operating characteristics (ROC), using both the success rate curve and the prediction rate curve. The validation results showed that the area under the ROC curve for the fifteen models produced using DT, SVM and ANFIS varied from 0.8204 to 0.9421 for the success rate curve and from 0.7580 to 0.8307 for the prediction rate curve. Moreover, the prediction rate curves revealed that model 5 of DT has slightly higher prediction performance (83.07%), whereas the success rate curves showed that model 5 of ANFIS has the best prediction capability (94.21%) among all models. 
The results of this study showed that landslide susceptibility mapping in the Penang Hill area using the three approaches (DT, SVM and ANFIS) is viable. As far as the performance of the models is concerned, the results appeared to be quite satisfactory; that is, the zones determined on the map are zones of relative susceptibility.
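The area under the ROC curve used above to compare models can be computed directly from labels and scores. The sketch below uses the rank-based (Mann-Whitney) definition: the AUC is the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. The function name `roc_auc` and the sample values are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` holds 1 for positive cases (e.g. landslide cells) and 0 for
    negative cases; `scores` holds the model's susceptibility scores.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Made-up example: three of the four positive/negative pairs are ordered correctly.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of cases; for grids with hundreds of thousands of pixels a sort-based implementation would be preferable.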
Screening for frailty in community-dwelling elderly subjects: Predictive validity of the modified SEGA instrument.

PubMed

Oubaya, N; Dramé, M; Novella, J-L; Quignard, E; Cunin, C; Jolly, D; Mahmoudi, R

2017-11-01

To study the capacity of the SEGAm instrument to predict loss of independence among elderly community-dwelling subjects. The study was performed in four French departments (Ardennes, Marne, Meurthe-et-Moselle, Meuse). Subjects aged 65 years or more, living at home, who could read and understand French, with a degree of autonomy corresponding to groups 5 or 6 in the AGGIR autonomy evaluation scale were included. Assessment included demographic characteristics, comprehensive geriatric assessment, and the SEGAm instrument at baseline. Subjects had follow-up visits at home at 6 and 12 months. During follow-up, vital status and level of independence were recorded. Logistic regression was used to study predictive validity of the SEGAm instrument. Among the 116 subjects with complete follow-up, 84 (72.4%) were classed as not very frail at baseline, 23 (19.8%) as frail, and 9 (7.8%) as very frail; 63 (54.3%) suffered loss of at least one ADL or IADL at 12 months. By multivariable analysis, frailty status at baseline was significantly associated with loss of independence during the 12 months of follow-up (OR=4.52, 95% CI=1.40-14.68; p=0.01). We previously validated the SEGAm instrument in terms of feasibility, acceptability, internal structure validity, reliability, and discriminant validity. This instrument appears to be a suitable tool for screening frailty among community-dwelling elderly subjects, and could be used as a basis to plan early targeted interventions for subjects at risk of adverse outcome.
When Significant Others Suffer: German Validation of the Burden Assessment Scale (BAS)

PubMed Central

Hunger, Christina; Krause, Lena; Hilzinger, Rebecca; Ditzen, Beate; Schweitzer, Jochen

2016-01-01

There is a need for an economical, reliable, and valid instrument in the German-speaking countries to measure the burden of relatives who care for mentally ill persons. We translated the Burden Assessment Scale (BAS) and conducted a study investigating factor structure, psychometric quality and predictive validity. We used confirmatory factor analyses (CFA, maximum-likelihood method) to examine the dimensionality of the German BAS in a sample of 215 relatives (72% women; M = 32 years, SD = 14, range: 18 to 77; 39% employed) of mentally ill persons (50% (ex-)partner or (best) friend; M = 32 years, SD = 13, range 8 to 64; main complaints were depression and/or anxiety). Cronbach’s α determined the internal consistency. We examined predictive validity using regression analyses including the BAS and validated scales of social systems functioning (Experience In Social Systems Questionnaire, EXIS.pers, EXIS.org) and psychopathology (Brief Symptom Inventory, BSI). Variables that might have influenced the dependent variables (e.g. age, gender, education, employment and civil status) were controlled by their introduction in the first step, and the BAS in the second step of the regression analyses. A model with four correlated factors (Disrupted Activities, Personal Distress, Time Perspective, Guilt) showed the best fit. With respect to the number of items included, the internal consistency was very good. The modified German BAS predicted relatives’ social systems functioning and psychopathology. The economical design makes the 19-item BAS promising for practice-oriented research, and for studies under time constraints. Strengths, limitations and future directions are discussed. PMID:27764109
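For readers unfamiliar with the internal consistency statistic named above, Cronbach's α for a scale with k items is k/(k-1) × (1 - Σ item variances / variance of the total score). The sketch below applies that standard formula to made-up responses; the function name `cronbach_alpha` and the data are illustrative assumptions and have no connection to the BAS data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items table of scores.

    `rows` is a list of respondents, each a list with one score per item.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Two items that always agree give alpha = 1; partial agreement gives less.
consistent = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
partial = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

Because α tends to rise with the number of items, a reported value is usually interpreted relative to scale length, which is the point of the abstract's remark about "the number of items included".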
Machine Learning Algorithms Outperform Conventional Regression Models in Predicting Development of Hepatocellular Carcinoma

PubMed Central

Singal, Amit G.; Mukherjee, Ashin; Elmunzer, B. Joseph; Higgins, Peter DR; Lok, Anna S.; Zhu, Ji; Marrero, Jorge A; Waljee, Akbar K

2015-01-01

Background: Predictive models for hepatocellular carcinoma (HCC) have been limited by modest accuracy and lack of validation. Machine learning algorithms offer a novel methodology, which may improve HCC risk prognostication among patients with cirrhosis. Our study's aim was to develop and compare predictive models for HCC development among cirrhotic patients, using conventional regression analysis and machine learning algorithms. Methods: We enrolled 442 patients with Child A or B cirrhosis at the University of Michigan between January 2004 and September 2006 (UM cohort) and prospectively followed them until HCC development, liver transplantation, death, or study termination. Regression analysis and machine learning algorithms were used to construct predictive models for HCC development, which were tested on an independent validation cohort from the Hepatitis C Antiviral Long-term Treatment against Cirrhosis (HALT-C) Trial. Both models were also compared to the previously published HALT-C model. Discrimination was assessed using receiver operating characteristic curve analysis and diagnostic accuracy was assessed with net reclassification improvement and integrated discrimination improvement statistics. Results: After a median follow-up of 3.5 years, 41 patients developed HCC. The UM regression model had a c-statistic of 0.61 (95%CI 0.56-0.67), whereas the machine learning algorithm had a c-statistic of 0.64 (95%CI 0.60–0.69) in the validation cohort. The machine learning algorithm had significantly better diagnostic accuracy as assessed by net reclassification improvement (p<0.001) and integrated discrimination improvement (p=0.04). The HALT-C model had a c-statistic of 0.60 (95%CI 0.50-0.70) in the validation cohort and was outperformed by the machine learning algorithm (p=0.047). Conclusion: Machine learning algorithms improve the accuracy of risk stratifying patients with cirrhosis and can be used to accurately identify patients at high-risk for developing HCC. PMID:24169273
How to quantify exposure to traumatic stress? Reliability and predictive validity of measures for cumulative trauma exposure in a post-conflict population.

PubMed

Wilker, Sarah; Pfeiffer, Anett; Kolassa, Stephan; Koslowski, Daniela; Elbert, Thomas; Kolassa, Iris-Tatjana

2015-01-01

While studies with survivors of single traumatic experiences highlight individual response variation following trauma, research from conflict regions shows that almost everyone develops posttraumatic stress disorder (PTSD) if trauma exposure reaches extreme levels. Therefore, evaluating the effects of cumulative trauma exposure is of utmost importance in studies investigating risk factors for PTSD. Yet, little research has been devoted to evaluate how this important environmental risk factor can be best quantified. We investigated the retest reliability and predictive validity of different trauma measures in a sample of 227 Ugandan rebel war survivors. Trauma exposure was modeled as the number of traumatic event types experienced or as a score considering traumatic event frequencies. In addition, we investigated whether age at trauma exposure can be reliably measured and improves PTSD risk prediction. All trauma measures showed good reliability. While prediction of lifetime PTSD was most accurate from the number of different traumatic event types experienced, inclusion of event frequencies slightly improved the prediction of current PTSD. As assessing the number of traumatic events experienced is the least stressful and time-consuming assessment and leads to the best prediction of lifetime PTSD, we recommend this measure for research on PTSD etiology.
Binary Decision Trees for Preoperative Periapical Cyst Screening Using Cone-beam Computed Tomography.

PubMed

Pitcher, Brandon; Alaqla, Ali; Noujeim, Marcel; Wealleans, James A; Kotsakis, Georgios; Chrepa, Vanessa

2017-03-01

Cone-beam computed tomographic (CBCT) analysis allows for 3-dimensional assessment of periradicular lesions and may facilitate preoperative periapical cyst screening. The purpose of this study was to develop and assess the predictive validity of a cyst screening method based on CBCT volumetric analysis alone or combined with designated radiologic criteria. Three independent examiners evaluated 118 presurgical CBCT scans from cases that underwent apicoectomies and had an accompanying gold standard histopathological diagnosis of either a cyst or granuloma. Lesion volume, density, and specific radiologic characteristics were assessed using specialized software. Logistic regression models with histopathological diagnosis as the dependent variable were constructed for cyst prediction, and receiver operating characteristic curves were used to assess the predictive validity of the models. A conditional inference binary decision tree based on a recursive partitioning algorithm was constructed to facilitate preoperative screening. Interobserver agreement was excellent for volume and density, but it varied from poor to good for the radiologic criteria. Volume and root displacement were strong predictors for cyst screening in all analyses. The binary decision tree classifier determined that if the volume of the lesion was >247 mm³, there was 80% probability of a cyst. If volume was <247 mm³ and root displacement was present, cyst probability was 60% (78% accuracy). The good accuracy and high specificity of the decision tree classifier renders it a useful preoperative cyst screening tool that can aid in clinical decision making but not a substitute for definitive histopathological diagnosis after biopsy. Confirmatory studies are required to validate the present findings.
Reliability, validity and minimal detectable change of computerized respiratory sounds in patients with chronic obstructive pulmonary disease.

PubMed

Oliveira, Ana; Lage, Susan; Rodrigues, João; Marques, Alda

2017-11-17

Computerized respiratory sounds (CRS) are closely related to the movement of air within the tracheobronchial tree and are promising outcome measures in patients with chronic obstructive pulmonary disease (COPD). However, CRS measurement properties have been poorly tested. The aim of this study was to assess the reliability, validity and the minimal detectable changes (MDC) of CRS in patients with stable COPD. Fifty patients (36♂, 67.26 ± 9.31y, FEV1 49.52 ± 19.67% predicted) were enrolled. CRS were recorded simultaneously at seven anatomic locations (trachea; right and left anterior, lateral and posterior chest). The number of crackles, wheeze occupation rate, median frequency (F50) and maximum intensity (Imax) were processed using validated algorithms. Within-day and between-days reliability, criterion and construct validity, validity to predict exacerbations and MDC were established. CRS presented moderate-to-excellent within-day reliability (ICC(1,3) ≥ 0.51; P < .05) and moderate-to-good between-days reliability (ICC(1,2) ≥ 0.47; P < .05) for most locations. Negligible-to-moderate correlations with FEV1 % predicted were found (-0.53 < rs < -0.28; P < .05), and the inspiratory number of crackles was the best discriminator between mild-to-moderate and severe-to-very severe airflow limitations (area under the curve >0.78). CRS correlated poorly with patient-reported outcomes (rs < 0.48; P < .05) and did not predict exacerbations. Inspiratory number of crackles at posterior right chest, inspiratory F50 at trachea and anterior left chest and expiratory Imax at anterior right chest were simultaneously reliable and valid, and their MDC were 2.41, 55.27, 29.55 and 3.98, respectively. CRS are reliable and valid. Their use, integrated with other clinical and patient-reported measures, may fill the gap of assessing small airways and contribute toward a patient's comprehensive evaluation.
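The abstract does not state how the MDC values were derived. A commonly used convention, assumed here, computes the standard error of measurement as SEM = SD × √(1 - ICC) and the minimal detectable change at the 95% level as MDC95 = 1.96 × √2 × SEM. The sketch below applies that convention to made-up numbers; the function names and inputs are hypothetical and do not reproduce the study's figures.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient (ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd, icc):
    """Minimal detectable change at 95% confidence (assumed convention).

    The factor sqrt(2) accounts for measurement error in both of the two
    measurements being compared.
    """
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Illustrative inputs only: SD = 10 units, ICC = 0.75.
example_mdc = mdc95(10.0, 0.75)
```

Under this convention, a change smaller than the MDC cannot be distinguished from measurement error at the chosen confidence level.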
